Tuesday, March 31, 2009

Polymorphic Excision

Usually when I'm getting back into a codebase, I'll pick a small project and latch onto it until it's completed. This is a great way to re-orient yourself in a once-familiar codebase that you've become rusty with. Lately I've jumped back into my millennia-old Star Trader project and decided to do just this.

A while back there was a great presentation at one of the Microsoft Gamefest events titled "Cross-Platform Graphics Engine Development". The premise of this talk was that excessive use of OOP features (like polymorphism) is harmful when building a system that relies on maximum performance, and that such a system should instead take a more direct approach using type definitions (as opposed to polymorphic type abstraction). What this means in the end is that instead of keeping a hierarchy of inherited interfaces to something like your renderer, you just create a new type dependent on your platform (and API of choice) and link it in. This is precisely what I decided to do.

Now I'm not the type that buys into the argument that C++ is incredibly slower than C, but in general, using a feature needlessly while paying the cost is just, well, dumb. My code had a three-tier inheritance hierarchy with the interface at the bottom, the shared functionality (e.g. state tracking) in the middle, and the API implementation (D3D9, OpenGL, ...) at the top. This allowed me to derive any number of API implementations from the common interface and link them in as a DLL. This works pretty well, but I'm nearly certain I will never be swapping out my renderer on the same platform (Win32, Mac OS X, PS3...). I was paying for a feature I will never use.

To resolve this, I first compiled the renderer as a static library. After this, I came up with a public interface to the renderer that, based on platform and project settings, exposes only the primary accessor to the renderer. In addition I broke off the common behavior into its own object that the API-specific code calls as necessary. This was all pretty straightforward, and the main executable now links directly to the API-specific renderer. When I create additional platform/API implementations, the interface can still be enforced with a conditional inheritance of the base-level interface to ensure that everything was implemented properly (a nice trick). Just to be thorough, I did go ahead and implement an additional option that allows a DLL to be loaded with an inherited renderer (as before), so I could do something like release a Direct3D 11 renderer for those that supported it down the line. Personally I don't like doing stuff like that (you're basically bandaging on functionality and resources at that point instead of building it into the initial overarching plan), but I figured what the heck.
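For illustration, here is a rough sketch of how that arrangement might look. It is not the actual Star Trader code: every name in it (IRenderer, CRenderStateCache, CRendererD3D9, RENDERER_ENFORCE_INTERFACE and so on) is made up for the example, and the real API calls are left out. It shows the shared state tracking as a plain helper object, the conditional inheritance used to check the interface, and a single typedef that the rest of the engine refers to.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical base-level interface (names are illustrative only).
struct IRenderer {
    virtual ~IRenderer() {}
    virtual void BeginFrame() = 0;
    virtual void SetBlendEnabled(bool enabled) = 0;
};

// Shared behavior (here, redundant-state filtering) as a plain helper object
// that the API-specific code calls when it needs to.
class CRenderStateCache {
public:
    // Returns true only when the value changes and must reach the API.
    bool SetBlendEnabled(bool enabled) {
        if (enabled == m_blendEnabled) return false;
        m_blendEnabled = enabled;
        return true;
    }
private:
    bool m_blendEnabled = false;
};

// Conditional inheritance: a checking build derives from the interface so the
// compiler verifies every method exists; a normal build derives from an empty
// base and carries no virtual functions.
struct CNoInterface {};
#if defined(RENDERER_ENFORCE_INTERFACE)
typedef IRenderer RendererBase;
#define RENDERER_OVERRIDE override
#else
typedef CNoInterface RendererBase;
#define RENDERER_OVERRIDE
#endif

// Stand-in for an API-specific renderer; the real API calls are omitted.
class CRendererD3D9 : public RendererBase {
public:
    void BeginFrame() RENDERER_OVERRIDE { m_inFrame = true; }
    void SetBlendEnabled(bool enabled) RENDERER_OVERRIDE {
        assert(m_inFrame && "state set outside of a frame");
        if (m_states.SetBlendEnabled(enabled)) ++m_apiCalls; // API call would go here
    }
    int ApiCalls() const { return m_apiCalls; }
private:
    CRenderStateCache m_states;
    bool m_inFrame = false;
    int  m_apiCalls = 0;
};

// The rest of the engine only ever names CRenderer. A real project would pick
// the typedef with #if on platform and project settings.
typedef CRendererD3D9 CRenderer;

#if !defined(RENDERER_ENFORCE_INTERFACE)
static_assert(!std::is_polymorphic<CRenderer>::value,
              "normal builds should carry no v-table");
#endif
```

Building once with RENDERER_ENFORCE_INTERFACE defined turns a missing or mismatched method into a compile error, while the default build calls straight into the concrete type.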

I was quite surprised at the ease of this switch. That is of course until I started the game and my resource manager puked on me. :-)

I really enjoy using pluggable software factories. I enjoy them so much I use them exclusively for declaring resource type allocators to my resource manager. In a one-line macro, I can completely register a resource class and know that I did it right. It's fantastic, and despite the use of macros and globals, I think it's a very elegant solution. Coupling is reduced to nearly nothing since the definition, implementation and registration can all happen in the same .cpp file. I'll spare you the details but here is a link to a good explanation of pluggable factories.

Since it's a solution that exists in global scope, it does come with its caveats. The order of initialization in global scope across source files is seemingly random, so how do you control it so that the list of resource allocators is not re-initialized to empty AFTER elements have already been added to it? The solution to this is actually quite simple, though you won't find it in too many programming books. Using lazy initialization (also known as Construct On First Use), I ensure the list is initialized only on its first use. This is really easy to implement and merely consists of using a static variable (the list) within the scope of a function. The first time the function is called to get the list is when the list is initialized. It would look something like this:
CResourceAllocatorList *GetResourceAllocatorList()
{
    static CResourceAllocatorList s_ResourceAllocatorList;
    return &s_ResourceAllocatorList;
}

To use it:
GetResourceAllocatorList()->Append( pNewAllocator );
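Putting the two pieces together, a one-line registration macro built on top of that accessor might look roughly like the sketch below. This is only a guess at the shape of such a system, not my actual resource manager: CResource, CResourceAllocator, Find, REGISTER_RESOURCE_TYPE and the "texture" type string are all invented for the example. The allocators are chained as an intrusive linked list so registration never allocates.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical resource base class.
class CResource {
public:
    virtual ~CResource() {}
};

class CResourceAllocator;

// Intrusive singly linked list of allocators.
class CResourceAllocatorList {
public:
    void Append(CResourceAllocator *pAllocator);
    CResourceAllocator *Find(const char *typeName) const;
private:
    CResourceAllocator *m_pHead = nullptr;
};

// Construct On First Use: the list is built the first time anyone asks for it,
// so it is always ready before the first Append, whatever the global
// initialization order turns out to be.
CResourceAllocatorList *GetResourceAllocatorList()
{
    static CResourceAllocatorList s_ResourceAllocatorList;
    return &s_ResourceAllocatorList;
}

// Base allocator: registers itself with the list when constructed.
class CResourceAllocator {
public:
    explicit CResourceAllocator(const char *typeName)
        : m_typeName(typeName), m_pNext(nullptr)
    {
        GetResourceAllocatorList()->Append(this);
    }
    virtual ~CResourceAllocator() {}
    virtual CResource *Create() const = 0;

    const char *m_typeName;
    CResourceAllocator *m_pNext;
};

void CResourceAllocatorList::Append(CResourceAllocator *pAllocator)
{
    assert(pAllocator != nullptr);
    CResourceAllocator **ppLink = &m_pHead;
    while (*ppLink) ppLink = &(*ppLink)->m_pNext;   // walk to the tail
    *ppLink = pAllocator;
}

CResourceAllocator *CResourceAllocatorList::Find(const char *typeName) const
{
    for (CResourceAllocator *p = m_pHead; p; p = p->m_pNext)
        if (std::strcmp(p->m_typeName, typeName) == 0) return p;
    return nullptr;
}

// The one-liner: defines an allocator class for ClassName and a file-scope
// instance whose constructor performs the registration.
#define REGISTER_RESOURCE_TYPE(ClassName, TypeName)                       \
    class ClassName##Allocator : public CResourceAllocator {              \
    public:                                                               \
        ClassName##Allocator() : CResourceAllocator(TypeName) {}          \
        CResource *Create() const override { return new ClassName; }      \
    };                                                                    \
    static ClassName##Allocator s_##ClassName##Allocator

// Definition and registration side by side, as they would sit in one .cpp file.
class CTexture : public CResource {};
REGISTER_RESOURCE_TYPE(CTexture, "texture");
```

The resource manager can then look up an allocator by type name and call Create() without ever including the header of the concrete resource class.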

This works great in practice and, like I said, is super convenient. When these are defined in a DLL, they sit there waiting until the DLL is loaded, and as soon as that happens they are initialized and registered. Eventually they do have to be collected (since the DLL has its own heap), but this is easy enough, and if you use a linked list it's as easy as linking the allocator in as another node. But what happens when you link these in via a static library? Nothing... nothing at all. If a factory is placed in a file that has no external references, the linker will leave that compiled object file out when linking the executable. This effectively means that your factories will NEVER be registered. That's, well, really bad! While Visual Studio does have an option to never remove unreferenced data (which should probably never be used), it doesn't actually work for data removed due to an .obj being unreferenced (this is supposedly intended behavior).

Resolving this issue is naturally a pain in the ass. One way is to look at your list of generated symbols, find the decorated name for your factory and include a pragma line that forces the linker to include that symbol, e.g.
#pragma comment( linker, "/include:?blah@@" )

This is incredibly annoying and completely destroys the convenience of using a factory in the first place. Another way is to include some kind of reference to that file or your factory in code that is definitely executed (like main() or your primary initialization routine). This is what I ended up doing, but in a semi-automatic way. I made a macro to which you pass the resource type name (RESTYPE_TEXTURE, RESTYPE_MODEL, etc.) and a reference is automatically generated for you (a call to a dummy function through an automatically generated pointer to the base factory type, which is initialized when the factory is created). This effectively adds an additional step to the process, but at least it's somewhat straightforward, as opposed to the other option.
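A stripped-down sketch of that kind of macro pair is below. The names (CFactoryBase, ForceLink, DEFINE_FACTORY_LINK, REFERENCE_FACTORY, g_pFactoryLink_...) are invented for the example and my real macros differ in the details. Everything is shown in one file so it compiles on its own; in practice the "library side" would live in the factory's .cpp inside the static library and the "executable side" in the main initialization routine.

```cpp
#include <cassert>

// Hypothetical base factory type.
class CFactoryBase {
public:
    virtual ~CFactoryBase() {}
    // Dummy function. It is virtual so the compiler cannot fold the call away,
    // which keeps the reference to the pointer symbol alive. The counter only
    // exists to give this sketch something observable.
    virtual void ForceLink() { ++m_linkRefs; }
    int LinkRefs() const { return m_linkRefs; }
private:
    int m_linkRefs = 0;
};

// Library side: publish an externally visible pointer to the factory. The
// symbol name is generated from the resource type name.
#define DEFINE_FACTORY_LINK(ResType, FactoryObject) \
    CFactoryBase *g_pFactoryLink_##ResType = &(FactoryObject)

// Executable side: referencing that symbol is what should make the linker
// pull the factory's object file out of the static library.
#define REFERENCE_FACTORY(ResType)                         \
    do {                                                   \
        extern CFactoryBase *g_pFactoryLink_##ResType;     \
        g_pFactoryLink_##ResType->ForceLink();             \
    } while (0)

// --- would be Texture.cpp in the static library ---
class CTextureFactory : public CFactoryBase {};
static CTextureFactory s_TextureFactory;
DEFINE_FACTORY_LINK(RESTYPE_TEXTURE, s_TextureFactory);

// --- would be Model.cpp in the static library ---
class CModelFactory : public CFactoryBase {};
static CModelFactory s_ModelFactory;
DEFINE_FACTORY_LINK(RESTYPE_MODEL, s_ModelFactory);

// --- would be the primary initialization routine in the executable ---
void ReferenceResourceFactories()
{
    REFERENCE_FACTORY(RESTYPE_TEXTURE);
    REFERENCE_FACTORY(RESTYPE_MODEL);
}
```

The extra step is the list inside ReferenceResourceFactories(): each new resource type needs one more line there, which is the cost of the approach compared to fully automatic registration.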

After that everything was working perfectly! I still have to do some performance tests to double check that talk's premise (of sacrificing flexibility for performance), but at first observation the program does appear to have a smaller memory footprint (likely due to the additional optimization the compiler/linker does thanks to the file being directly linked in and the exclusion of some v-tables).

All in all it was a nice little exercise that definitely got me re-acquainted with my old codebase. Next up on my radar is finishing up my new shader system. I'll talk more about it some other time.

1 comment:

Demiurge said...

Really interesting!
Have you used a single renderer and then added methods based on macros that are placed conditionally based on the current platform???

Thanks!