A Nearly-Automatic C++ Run-Time Type System

I don’t know how many active programmers read this blog. Non-programmers can safely skip this entire entry: it’s very technical and will probably bore anyone but hard-core software architects.

In developing my bigger overall project, I’ve needed a number of small utilities that I’d rather not have to create myself. One of those is a robust C++ runtime type system, useful for serialization but also for dynamic, morphable code. The built-in C++ RTTI generally falls short for a number of reasons, including a wasteful 4+ bytes per object (that’s in addition to any virtual function table). The system in Visual Studio also doesn’t seem to understand structures and arrays very well, which makes it not at all useful to me. Other systems out there either add their own per-object storage, pre-process the header files (which is cool, but complex and painful to set up, and hard to limit to only certain objects and members), or have horrendous macros to contend with.

So I wrote a quick drop-in replacement, which set me back two weeks on my master schedule. But it was worth it. It has zero per-object storage, no base class to inherit from, and no table lookups, and the amount of typing for each new class I make is extremely low (zero, if I don’t want the class or member to have any type information). Plus, the system functions just as well for built-in types such as int and float, and for all pointers: everything I’ve thrown at it so far, including the rest of my vast new project.

Here’s an example class:

struct Foo
{
    int value;
    int* test[10];
    Foo* ptr;

    Foo() { }

    // This special constructor, plus a public default constructor and destructor, is all you need per typed class.
    Foo(TypePtr& t)
    {
        INIT_STRUCT(Foo);
        // use INIT_DERIVED(Foo,Parent) for all inherited members to be brought in
        // for templated classes, use INIT_TEMPLATE(Foo,mytype) to result in a type named "Foo<mytype>" vs. "Foo<T>"

        INIT_MEMBER(Foo, value);
        INIT_MEMBER(Foo, test);
        INIT_MEMBER(Foo, ptr);
    }

    Foo(int _value)
    {
        value = _value;
        ptr = this;
    }
};

If I call cout << TYPE(Foo), I get the following output printed to the screen:

struct Foo
{
    int value;
    int* test[10]; //internally: int**
    Foo* ptr;
};

This is all stored as a nested table of static Type objects, each about 48 bytes, not counting any string data for names, which I’ll probably make strippable for a release version.
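The post does not show what the macros expand to, so the following is only a minimal sketch of one way INIT_STRUCT and INIT_MEMBER could record a layout. The Member struct, the fields of Type, and both macro bodies are assumptions made for illustration (the real Type is a nested table of Type objects, which this flat version does not reproduce).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical descriptor types; the real Type layout is not shown in the post.
struct Member {
    std::string name;    // member name as written in the source
    std::size_t offset;  // byte offset inside the owning struct
    std::size_t size;    // sizeof the member
};

struct Type {
    std::string name;
    std::size_t size = 0;
    std::vector<Member> members;
};
using TypePtr = Type*;

// Guessed expansions: both macros expect a TypePtr named `t` in scope.
#define INIT_STRUCT(T) \
    do { t->name = #T; t->size = sizeof(T); } while (0)
#define INIT_MEMBER(T, m) \
    t->members.push_back(Member{#m, offsetof(T, m), sizeof(T::m)})

struct Foo {
    int value;
    int* test[10];
    Foo* ptr;

    Foo() {}

    // The "special constructor" only describes the layout; it builds no usable Foo.
    Foo(TypePtr& t) {
        INIT_STRUCT(Foo);
        INIT_MEMBER(Foo, value);
        INIT_MEMBER(Foo, test);
        INIT_MEMBER(Foo, ptr);
    }
};

// Usage: run the special constructor once against an empty descriptor.
void demo() {
    Type desc;
    TypePtr t = &desc;
    Foo scratch(t);
    (void)scratch;
    assert(desc.name == "Foo" && desc.size == sizeof(Foo));
    assert(desc.members.size() == 3);
    assert(desc.members[1].name == "test");
    assert(desc.members[1].size == sizeof(int*) * 10);
}
```

Recording an offset and a size per member is enough for a generic walker to reach each field later without knowing the struct at compile time.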

The Type object for any given typename can be found by simply calling Typed<Foo>::GetType() (or via the equivalent macro, TYPE(Foo)). For an instantiated object Foo f, call ::GetType(f). These calls are inlined and resolved at compile time, so they incur almost zero overhead: not even a table lookup, just a check to see whether the type has been defined yet. The biggest cost occurs once, on first use, when the type is initialized via that special constructor (I gave up on using C++ static initializers, as the invocation order is hard to control).
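The initialize-on-first-use lookup described above might look roughly like this. It is a sketch under assumptions: the defined flag, the initCount counter (there only so the demo can check that initialization ran once), and the function-local static are my choices, not necessarily how the real system stores its descriptors.

```cpp
#include <cassert>
#include <string>

// Hypothetical descriptor; `defined` is the "not yet defined" check.
struct Type {
    std::string name;
    bool defined = false;
    int initCount = 0;   // bookkeeping for this demo only
};
using TypePtr = Type*;

template <typename T>
struct Typed {
    static TypePtr GetType() {
        static Type type;              // one descriptor per T
        if (!type.defined) {           // the only per-call cost
            type.defined = true;       // set first so self-referential types terminate
            TypePtr t = &type;
            T scratch(t);              // special constructor fills the descriptor
            (void)scratch;
        }
        return &type;
    }
};

// Deduce the static type from an object.
template <typename T>
TypePtr GetType(const T&) { return Typed<T>::GetType(); }

#define TYPE(T) (Typed<T>::GetType())

struct Foo {
    int value = 0;
    Foo() {}
    Foo(TypePtr& t) { t->name = "Foo"; ++t->initCount; }
};

void demo() {
    Foo f;
    TypePtr a = TYPE(Foo);
    TypePtr b = GetType(f);
    assert(a == b);               // same descriptor either way
    assert(a->name == "Foo");
    assert(a->initCount == 1);    // special constructor ran exactly once
}
```

Because initialization happens inside GetType(), it does not depend on the order in which static initializers in different translation units run.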

For dynamic runtime typing, this requires one last small step: adding a virtual TypePtr GetType() { return ::GetType(*this); } (or macro ENABLE_DYNAMIC_TYPING ) to each derived class, which allows one to call f->GetType(); on any runtime object. It piggy-backs on the existing vtbl, so there’s no added overhead. Without that extra line of code, the Type would appear as whatever the compile-time pointer/reference thinks it is (e.g., a Foo* pointer will think the object is of type Foo, even if it is derived from Foo.)
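To make the static-versus-dynamic distinction concrete, here is a small sketch. The macro body follows the one-liner quoted above; the Shape, Circle, and Square classes and the static Name() hook are hypothetical stand-ins so the example fits in one file.

```cpp
#include <cassert>
#include <cstring>

struct Type { const char* name; };
using TypePtr = const Type*;

// Stand-in for the real registry: the name comes from a hypothetical static Name().
template <typename T>
struct Typed {
    static TypePtr GetType() {
        static const Type type{T::Name()};
        return &type;
    }
};

template <typename T>
TypePtr GetType(const T&) { return Typed<T>::GetType(); }

// `::` is required: inside the class, the member GetType hides the free function.
#define ENABLE_DYNAMIC_TYPING \
    virtual TypePtr GetType() { return ::GetType(*this); }

struct Shape {
    static const char* Name() { return "Shape"; }
    virtual ~Shape() = default;
    ENABLE_DYNAMIC_TYPING
};

struct Circle : Shape {
    static const char* Name() { return "Circle"; }
    ENABLE_DYNAMIC_TYPING
};

struct Square : Shape {   // macro deliberately omitted
    static const char* Name() { return "Square"; }
};

void demo() {
    Circle c;
    Square s;
    Shape* pc = &c;
    Shape* ps = &s;
    assert(std::strcmp(::GetType(*pc)->name, "Shape") == 0);  // static view
    assert(std::strcmp(pc->GetType()->name, "Circle") == 0);  // dynamic view
    assert(std::strcmp(ps->GetType()->name, "Shape") == 0);   // falls back to base
}
```

Square shows the failure mode from the paragraph above: without the extra line, a Shape* to a Square reports Shape.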

That’s all the programmer has to worry about. Everything else is automatic. All basic types are defined, and pointer types are automatically created to match and point to (via a Type::GetDeref() call) their dereferenced types. I’ve experimented with alternative Cast(p) and Create() functionality as well, though I haven’t needed it. And I’ll add automatic serialization when I need it. Note: one need not include all members in a type definition. So it’s easy to make serializers that ignore runtime or derivative data, or even automatic pointer serializers that handle new memory layouts.
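Since serialization is described as future work, the following is purely my own sketch of how a serializer could skip unregistered members. The Member and Type layouts, Particle, DescribeParticle, Serialize, and Deserialize are all hypothetical, and the raw byte copy assumes trivially copyable members and a buffer produced by the matching Serialize call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

struct Member { std::string name; std::size_t offset; std::size_t size; };
struct Type   { std::vector<Member> members; };

struct Particle {
    float x, y;        // persistent state
    float cachedLen;   // derived at run time, deliberately not registered
};

// Hand-built stand-in for what the special constructor would register.
Type DescribeParticle() {
    Type t;
    t.members.push_back({"x", offsetof(Particle, x), sizeof(float)});
    t.members.push_back({"y", offsetof(Particle, y), sizeof(float)});
    return t;
}

// Copy only the registered members into a flat byte buffer.
std::vector<unsigned char> Serialize(const Type& type, const void* object) {
    std::vector<unsigned char> out;
    const unsigned char* base = static_cast<const unsigned char*>(object);
    for (const Member& m : type.members)
        out.insert(out.end(), base + m.offset, base + m.offset + m.size);
    return out;
}

// Inverse of Serialize: unregistered members are left untouched.
void Deserialize(const Type& type, const std::vector<unsigned char>& in, void* object) {
    unsigned char* base = static_cast<unsigned char*>(object);
    std::size_t pos = 0;
    for (const Member& m : type.members) {
        std::memcpy(base + m.offset, in.data() + pos, m.size);
        pos += m.size;
    }
}

void demo() {
    const Type type = DescribeParticle();
    Particle a{3.0f, 4.0f, 5.0f};
    Particle b{0.0f, 0.0f, -1.0f};
    const std::vector<unsigned char> bytes = Serialize(type, &a);
    assert(bytes.size() == 2 * sizeof(float));   // cachedLen was skipped
    Deserialize(type, bytes, &b);
    assert(b.x == 3.0f && b.y == 4.0f && b.cachedLen == -1.0f);
}
```

Pointer members would need more than a byte copy (for example, translating addresses to indices), which this sketch does not attempt.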

To explain how this is all done, one needs to understand templates pretty well, and partial specialization in particular. Templates allow a Typed<T> class to be created automatically for each type T that I invoke via TYPE(T) or a related call; it doesn’t exist unless you ask for it. Partial specialization then allows the system to define static initializers for built-in or opaque types that don’t require those special constructors, while still using generic class initializers for everything else.
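A rough illustration of that mechanism, under assumptions: the fields of Type, the static TypeName() hook for user classes, and the naming scheme for pointer types are invented here. The sketch uses a full specialization for int and a partial specialization for T*, which is one plausible way to get built-in and pointer types without special constructors.

```cpp
#include <cassert>
#include <string>

struct Type {
    std::string name;
    const Type* deref;                         // pointee type, or null
    const Type* GetDeref() const { return deref; }
};

// Primary template: user types describe themselves
// (here via a hypothetical static TypeName()).
template <typename T>
struct Typed {
    static const Type* GetType() {
        static const Type type{T::TypeName(), nullptr};
        return &type;
    }
};

// Full specialization: a built-in type needs no special constructor.
template <>
struct Typed<int> {
    static const Type* GetType() {
        static const Type type{"int", nullptr};
        return &type;
    }
};

// Partial specialization: any T* gets a type generated from T's type.
template <typename T>
struct Typed<T*> {
    static const Type* GetType() {
        static const Type type{Typed<T>::GetType()->name + "*",
                               Typed<T>::GetType()};
        return &type;
    }
};

struct Foo {
    static std::string TypeName() { return "Foo"; }
};

void demo() {
    assert(Typed<int>::GetType()->name == "int");
    assert(Typed<int**>::GetType()->name == "int**");
    assert(Typed<int**>::GetType()->GetDeref() == Typed<int*>::GetType());
    assert(Typed<Foo*>::GetType()->GetDeref()->name == "Foo");
}
```

Nothing is generated for a type until some code names Typed<T>, which matches the "doesn’t exist unless you ask for it" behavior.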

The way it avoids any per-object storage is to use class statics in the Typed<T> classes, not in the many instantiations of each type. Since the basic underlying Type object is the same for all objects of that type (and, indeed, the same class for all types; the Typed<T> is just type-dressing), it only makes sense to store it there. And that allows a Typed object to exist without changing anything about the built-in int definition. It should work for opaque/finished types as well, but only for publicly accessible members (or protected members for derived classes). Finally, I’ve also played around with having a static-per-type "identity" object hang around, which can be used for things like vectors and matrices, or anything else.
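The storage argument can be checked directly. In this sketch (the Type fields and Vec3 are made up for the example) the descriptor is a static data member of Typed<T>, so the described objects keep their original size and int needs no modification.

```cpp
#include <cassert>
#include <cstddef>

struct Type { const char* name; std::size_t size; };

// The descriptor lives in a static data member of Typed<T>, never inside a T.
template <typename T>
struct Typed {
    static Type type;
};

// One definition per instantiated T; the initializer is a compile-time constant.
template <typename T>
Type Typed<T>::type = {"(unnamed)", sizeof(T)};

struct Vec3 { float x, y, z; };

// Describing Vec3 adds nothing to the objects themselves
// (assumes no padding between floats, true on mainstream compilers).
static_assert(sizeof(Vec3) == 3 * sizeof(float), "no hidden per-object field");

void demo() {
    Vec3 a{1.0f, 2.0f, 3.0f}, b{4.0f, 5.0f, 6.0f};
    (void)a; (void)b;
    assert(&Typed<Vec3>::type != &Typed<int>::type);   // one descriptor per type
    assert(Typed<Vec3>::type.size == sizeof(Vec3));
    assert(Typed<int>::type.size == sizeof(int));      // int itself is untouched
}
```

However many Vec3 objects exist, there is still exactly one Typed<Vec3>::type.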

Anyway, this sort of thing is exactly why I like programming for myself. I can release the code, or keep it to myself. But most importantly, I don’t lose it when I change jobs or clients.

If anyone wants to use or improve on this system, please let me know. I’d personally like to get rid of the requirement for public default constructors and destructors, as some of my base classes like to keep those private. You can comment below if you’re interested. I won’t post it unless there’s sufficient interest, as it’ll take me a few hours to isolate it and package it up.

6 Responses to “A Nearly-Automatic C++ Run-Time Type System”

  1. I needed something similar, and wound up with something along the lines of:

    class Foo
    {
    REFLECT // hide hash table
    public:
    Foo();
    Attrib bar;
    };

    Foo::Foo()
    {
    attribs[bar.Name()] = &bar;
    }

    where attribs is a hash table hiding inside the reflectable class.

    I ended up needing to wrap the attribute in a template in order to be able to control access (so that if something external to the class modified the variable, the change wouldn’t be unnoticed).

    I did a similar thing as you for type identification, and compare pointers to strings containing a typename.

    Coupled with a bit of partial specialization

    template
    std::string ConvertValueToString(const T&);
    template
    T StringAsValue(const std::string&);

    I got nice automatic reflection (and script language bindings) due to the attribs map sitting in every serialized class.

    The thing I don’t like about my system is that it has a portion that must be dynamically initialized. The thing I do like about it is that since the attributes themselves are templated, I’ve been able to swap in a lot of different algorithms and approaches to reflection without modifying the code that uses the reflection at all.

    Thanks for a thought-provoking post.

  2. BTW, my C++ got all mangled in the reply.

    It was supposed to say

    Attrib ANGLE float ANGLE bar;

    (Similarly, template angle brackets are missing everywhere in the post)

    also, in the constructor, I meant to mention that the code initializing the hash table is hidden in a boring

    INITREFLECT(bar);

    macro, but I manually expanded it for explanation.

  3. I came across this article via a Google search. I’m a student working on a semester-long project with the goal of implementing something similar to WinFS. I was wondering if there was any way I could get a copy of the code, or even a more technical description of how you implemented this. It’s something I’ve been trying to accomplish for almost a month now, but I’m not a very experienced C++ programmer (I only recently decided to make the transition from C to C++). I don’t intend to use your code (part of the project is that I must implement anything non-trivial on my own); I just need more than the description given to be able to “get it” and figure out how to implement my own system. Thanks.

  4. This sounds like a very nice system, avi. I have a (sort of) related custom Type allocation/management system for a project I’ve been working on, and this would be very interesting to me if you decide to GPL and share your code.

    I’d be interested in taking a shot at some of the improvements you mentioned too.

    Send me an email, avi, if you are interested. What I’d really like to do is wrap up a project like yours with a custom performance Type allocation and management system similar to the one I’ve had to develop, GPL it, and throw it up on sourceforge.

  5. Did you ever decide to publish this? It sounds like just the thing for a little project of mine where I have to serialize certain member variables of a large number of classes to a database.

    If you’ve published it, I’d be very interested in looking at it and providing feedback and improvements.

  6. I’ll try to get it posted soon. Sorry for the delay.
