Polymorphism in C
C is not known for being an object-oriented language -- and while it may not have true objects in the form of classes, it does have compound datatypes and function pointers. Personally, this is enough for me. I don't often need operator overloading, or even templates, as the type genericity provided by void* often works out just fine for most of my use cases. Last year I wanted to work on an emulator for the original PlayStation hardware. Unfortunately, I was forced to abandon it before it was even close to being finished, as my free time was completely eaten up by work. But before that happened, I was able to implement some things that I consider to be pretty nifty -- so today I want to take a bit of a look at those.
The problem
In an emulator, there can be a lot of components that need to work together, and a number of them perform similar, albeit not identical, tasks. For example, the CPU and GPU both need read/write access to memory, and memory and the BIOS both need to be able to provide data/instructions to the CPU and GPU. All of this communication happens over a bus, which serves as an intermediary between all of these devices.
Since they perform similar functions, it would stand to reason that polymorphism is applicable here; and indeed it is, even in C. Implementing it takes a bit of boilerplate, but it is my opinion that writing boilerplate is better than writing unnecessary loops, selection statements, or cascading blocks of conditionals -- though that may just be the ADHD talking.
The solution
My solution was to break these components into two categories: devices, and the bus. For devices, we have the following definition(s):
#include <stdint.h>

struct device_vt;
struct device {
  struct device_vt* vt;
};
struct device_vt {
  struct device_vt* parent_vt;
  uint32_t (*read32)(struct device* self, uint32_t addr);
  uint32_t (*write32)(struct device* self, uint32_t addr, uint32_t value);
  /* Other overrideable functions below */
};
struct device* device_init(struct device* self);
void device_term(struct device* self);
And for the bus:
#include <stdint.h>

/* forward declaration of device, because we don't need all the data from the header */
struct device;
struct bus_device_entry {
  struct device* device;
  uint32_t address;
  uint32_t range;
};
struct bus {
  struct bus_device_entry* devices;
  uint32_t num_devices;
};
struct bus* bus_init(struct bus* self);
void bus_term(struct bus* self);
void bus_attach(struct bus* self, struct device* device, uint32_t addr, uint32_t range);
uint32_t bus_read32(struct bus* self, uint32_t addr);
void bus_write32(struct bus* self, uint32_t addr, uint32_t value);
/* Other functions for reading/writing */
You may notice a peculiarity in that *_init returns the same type as its first argument. This is a personal preference of mine, as it allows me to use the initialization procedures with either automatic variables (by reference) or heap-allocated memory, without extra effort in the calling function: a NULL from a failed malloc simply passes through. That is, I can do:
struct bus b;
bus_init(&b);
/* or */
struct bus* bp = bus_init(malloc(sizeof(struct bus)));
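To make the NULL handling concrete, here is a minimal, self-contained sketch of the same pattern. The struct counter type and the counter_* names are hypothetical stand-ins, not part of the emulator; the only point is that the init function hands back whatever it was given, including a NULL from a failed malloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in type; not part of the emulator. */
struct counter {
  unsigned value;
};

/* Returns its argument so that calls can be nested; NULL passes through. */
struct counter* counter_init(struct counter* self)
{
  if(self == NULL) return NULL;

  self->value = 0;
  return self;
}

/* Automatic storage: the caller owns the memory, nothing to free. */
unsigned counter_demo_stack(void)
{
  struct counter c;
  struct counter* p = counter_init(&c);

  assert(p == &c);
  p->value += 2;
  return c.value;
}

/* Heap storage: one expression allocates and initializes. */
unsigned counter_demo_heap(void)
{
  struct counter* cp = counter_init(malloc(sizeof *cp));
  unsigned v;

  if(cp == NULL) return 0;   /* malloc failed and init passed the NULL on */
  cp->value += 3;
  v = cp->value;
  free(cp);
  return v;
}
```

Either way the caller ends up with one pointer to check, and the init function itself never needs to know where the memory came from.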
And on that note, let's take a look at device_init() and company:
static struct device_vt _vt = {
  NULL, /* parent_vt: the base class has no parent */
};
struct device* device_init(struct device* self)
{
  if(self == NULL) return NULL;

  self->vt = &_vt;
  return self;
}
Pretty simple, right? device here is filling the role of an abstract base-class, so it makes sense for it to not really do a whole lot. We should never be calling the base-class functions directly. But let's look at something more interesting now. How about one of its subclasses, like bios?
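/* bios.h */
#include <stdint.h>
#include "device.h" /* header name assumed; provides struct device and struct device_vt */

struct bios_vt;
struct bios {
  struct device base;
  struct bios_vt* vt;
  uint8_t* data;
};
struct bios_vt {
  struct device_vt* parent_vt;
  uint32_t (*read32)(struct bios* self, uint32_t addr);
  /* and company ... */
};
/* bios.c */
#include <stddef.h>
#include "bios.h"

static uint32_t _bios_read32(struct bios* self, uint32_t addr);
static struct bios_vt _vt = {
  NULL,         /* parent_vt: filled in by bios_init() */
  _bios_read32, /* read32 */
};
struct bios* bios_init(struct bios* self)
{
  if(self == NULL) return NULL;
  device_init(&self->base);

  self->vt = &_vt;
  self->vt->parent_vt = self->base.vt;
  /* the subclass table starts with the same members as struct device_vt */
  self->base.vt = (struct device_vt*)self->vt;

  return self;
}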
Here we initialize the base-class (device) first, and then we set up the virtual method table (VMT) for the subclass. There's one peculiarity here: when we set up the VMT for the subclass, we copy the reference to the base-class VMT into parent_vt, and then set the reference in self->base to point to the subclassed VMT. This is what allows us to treat all of the devices in a uniform manner: a call made through a plain struct device* always goes through this dispatch table, and so always reaches the correct implementation. As for why we keep that reference to the original VMT -- well, that is because it may be beneficial to be able to call those functions in the future (e.g., if there are multiple layers of inheritance, instead of just an abstract base-class like I have here).
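The table swap can be sketched in a single self-contained file. This is a simplified variant rather than the emulator's actual code: the subclass reuses struct device_vt directly and casts self inside the override, so every call goes through a pointer of the matching function type. The names device_read32, bios_read32_impl, bios_demo, the extra data argument to bios_init, the 4-byte ROM contents, and the little-endian byte order are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct device;
struct device_vt {
  struct device_vt* parent_vt;
  uint32_t (*read32)(struct device* self, uint32_t addr);
};
struct device {
  struct device_vt* vt;
};

/* Base-class table: no parent and no read32 implementation. */
static struct device_vt device_vt_table = { NULL, NULL };

struct device* device_init(struct device* self)
{
  if(self == NULL) return NULL;
  self->vt = &device_vt_table;
  return self;
}

struct bios {
  struct device base;   /* first member, so the cast in the override is valid */
  const uint8_t* data;
};

/* Override: assembles a 32-bit word from four bytes (little-endian assumed). */
static uint32_t bios_read32_impl(struct device* self, uint32_t addr)
{
  struct bios* b = (struct bios*)self;
  const uint8_t* p = b->data + addr;

  return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
         (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static struct device_vt bios_vt_table = { NULL, bios_read32_impl };

struct bios* bios_init(struct bios* self, const uint8_t* data)
{
  if(self == NULL) return NULL;
  device_init(&self->base);

  bios_vt_table.parent_vt = self->base.vt;   /* remember the base-class table */
  self->base.vt = &bios_vt_table;            /* dispatch now reaches the override */
  self->data = data;
  return self;
}

/* Generic caller: it only ever sees a struct device*. */
uint32_t device_read32(struct device* dev, uint32_t addr)
{
  assert(dev != NULL && dev->vt->read32 != NULL);
  return dev->vt->read32(dev, addr);
}

uint32_t bios_demo(void)
{
  static const uint8_t rom[4] = { 0x78, 0x56, 0x34, 0x12 };
  struct bios b;

  bios_init(&b, rom);
  assert(b.base.vt->parent_vt == &device_vt_table);
  return device_read32(&b.base, 0);   /* 0x12345678 */
}
```

device_read32 knows nothing about struct bios, yet the call lands in bios_read32_impl because the table pointer stored in base was replaced during bios_init. The saved parent_vt is what a deeper subclass would use to reach the implementation it overrides.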
Lastly, let's take a look at three parts of the bus:
#include <stdlib.h> /* realloc */

/* Attach a device to the bus */
void bus_attach(struct bus* self, struct device* device, uint32_t addr, uint32_t range)
{
  struct bus_device_entry* pe;

  /* grow the array by one entry; on failure the bus is left unchanged */
  pe = realloc(self->devices, (self->num_devices + 1) * sizeof(struct bus_device_entry));
  if(pe != NULL) {
    self->devices = pe;
    self->devices[self->num_devices] = (struct bus_device_entry){device, addr, range};
    self->num_devices++;
  }
}
/* Find a device that is mapped to a given range of memory */
static struct bus_device_entry* _bus_find_device(struct bus* self, uint32_t addr, uint32_t* rel)
{
  uint32_t i;
  struct bus_device_entry* e;

  for(i = 0, e = self->devices; i < self->num_devices; i++, e++) {
    /* written as a subtraction so that address + range cannot overflow */
    if(addr >= e->address && addr - e->address < e->range) {
      /* relative offset, if requested */
      if(rel != NULL) *rel = addr - e->address;
      return e;
    }
  }

  /* nothing is mapped at this address */
  return NULL;
}
uint32_t bus_read32(struct bus* self, uint32_t addr)
{
  uint32_t value = 0; /* placeholder result when no device is mapped at addr */
  uint32_t rel_addr;
  struct bus_device_entry* e;

  e = _bus_find_device(self, addr, &rel_addr);
  if(e != NULL) {
    /* If we've hit this point, then the
     * device *should* implement read32. If it doesn't,
     * either the wrong address is being requested, or the
     * device is not implementing something it needs to. */
    value = e->device->vt->read32(e->device, rel_addr);
  }

  return value;
}
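To see the routing end to end, the following is a rough, self-contained sketch of a bus with two devices attached. It deviates from the code above in a few labeled ways: the device array has a fixed capacity (BUS_MAX_DEVICES) instead of growing with realloc, bus_attach reports failure through its return value, and bus_read32 takes an extra found out-parameter. The echo_device type, bus_demo, and the mapped addresses are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct device;
struct device_vt {
  uint32_t (*read32)(struct device* self, uint32_t addr);
};
struct device {
  const struct device_vt* vt;
};

/* Hypothetical test device: returns its tag in the top byte and the
 * relative address it was asked for in the low 24 bits. */
struct echo_device {
  struct device base;
  uint32_t tag;
};

static uint32_t echo_read32(struct device* self, uint32_t addr)
{
  struct echo_device* d = (struct echo_device*)self;
  return d->tag << 24 | (addr & 0x00FFFFFFu);
}

static const struct device_vt echo_vt = { echo_read32 };

struct bus_device_entry {
  struct device* device;
  uint32_t address;
  uint32_t range;
};

#define BUS_MAX_DEVICES 4   /* fixed capacity keeps the sketch short */

struct bus {
  struct bus_device_entry devices[BUS_MAX_DEVICES];
  uint32_t num_devices;
};

int bus_attach(struct bus* self, struct device* device, uint32_t addr, uint32_t range)
{
  if(self->num_devices == BUS_MAX_DEVICES) return -1;
  self->devices[self->num_devices++] = (struct bus_device_entry){ device, addr, range };
  return 0;
}

/* Sets *found to 1 and returns the value on a hit; sets it to 0 on a miss. */
uint32_t bus_read32(struct bus* self, uint32_t addr, int* found)
{
  uint32_t i;

  assert(self != NULL && found != NULL);
  for(i = 0; i < self->num_devices; i++) {
    struct bus_device_entry* e = &self->devices[i];

    if(addr >= e->address && addr - e->address < e->range) {
      *found = 1;
      return e->device->vt->read32(e->device, addr - e->address);
    }
  }
  *found = 0;
  return 0;
}

/* Two devices mapped at made-up addresses; returns 1 if routing is correct. */
int bus_demo(void)
{
  struct echo_device a = { { &echo_vt }, 0xA1 };
  struct echo_device b = { { &echo_vt }, 0xB2 };
  struct bus bus = { .num_devices = 0 };
  int found;

  bus_attach(&bus, &a.base, 0x00000000u, 0x1000u);
  bus_attach(&bus, &b.base, 0x1FC00000u, 0x1000u);

  if(bus_read32(&bus, 0x00000010u, &found) != 0xA1000010u || !found) return 0;
  if(bus_read32(&bus, 0x1FC00004u, &found) != 0xB2000004u || !found) return 0;
  bus_read32(&bus, 0x00001000u, &found);   /* one past the first range */
  return !found;
}
```

Each device only ever sees an offset relative to its own base address, so the same device code works no matter where the bus maps it.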
Now all that the CPU (or any other component) needs to do is something like this:
union instruction _cpu_fetch(struct cpu* self)
{
  uint32_t instr;

  instr = bus_read32(self->bus, self->pc);
  return (union instruction){ .value = instr };
}
So, it's a decent amount of boilerplate, but it makes my _cpu_fetch() code far simpler, and the bus is completely decoupled from every device that connects to it. All that is required is that those devices implement the struct device interface.
Hopefully this is helpful to someone -- I know that there are probably dozens of OOP-C posts out there, but I just thought I'd share my specific use case. In the future, I may write a post about augmenting this with some very basic RTTI by abusing the preprocessor, if I can find the code I wrote that did it.
Thanks for reading.

