Sunday, October 02, 2005

C++: Templates and the ABI

In the previous post, we saw why inlined code is dangerous in a public API. Unfortunately, I carelessly put templates in the same bag in an attempt to generalize the idea too much, giving an incorrect impression of how they work. A reader (fellow_traveler) spotted the mistake and here is this new post to clarify what's up with templates and an ABI... if one can relate the two concepts at all.

C++ makes intensive use (or abuse?) of templates to achieve genericity; one can quickly notice this in the STL, where each container is parametrized on one or more types. Other libraries, such as Boost, go further and use templates in many situations where one would never have imagined them.

In order to understand what goes on with templates, let's review how they work. A template defines a piece of code (be it a class or a function) that is parametrized by a type given by the developer. The template itself does not exist as binary code because, simply put, that is impossible: until the type is known, the compiler cannot tell how big the objects are or which operations to call on them, so there is nothing it can compile.

Here is a trivial function that returns the sum of two objects whose type is defined by Type; we will use it to illustrate some examples below. Also, and to focus on the ABI, let's assume that this code is part of a public library; therefore, it must be placed in a header file (e.g., foo.hpp) to be useful to other users (if it were in a foo.cpp file, it'd simply be private and not usable outside that file).

template <class Type>
Type
add(const Type& p1, const Type& p2)
{
    return p1 + p2;
}

When the template is used in someone else's code (in other words, when it is instantiated), the compiler takes the template's source code, fills in the parametrized gaps with the type given by the developer and generates the final object code. For example, given:

int foo = add<int>(2, 3);
float bar = add<float>(2.4, 3.5);

The compiler takes the add function's code verbatim from the header file, replaces Type with int, generates the object code for the resulting function and stores it in the user's binary. The same happens with the float instance. Notice how the binary code is not in the library where the template came from, and also notice that the user's binary has gained two new functions, one for each instantiation.
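To make the "two new functions" point concrete, here is a small sketch. The pointer names add_int and add_float are mine, purely for illustration. Taking the address of each instantiation forces the compiler to emit it, and the two pointers have different, unrelated types:

```cpp
#include <cassert>

template <class Type>
Type
add(const Type& p1, const Type& p2)
{
    return p1 + p2;
}

// Hypothetical names, for illustration only. Each line forces one
// instantiation of add and stores the address of the generated function.
int (*add_int)(const int&, const int&) = &add<int>;
float (*add_float)(const float&, const float&) = &add<float>;
```

If you build this with GCC or Clang, I'd expect running nm -C on the resulting object file to list both add<int> and add<float> as symbols of that object file, not of any library; treat the exact output as toolchain-dependent.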

So what happens? Templates cannot take advantage of (binary) shared libraries. Whenever the code in a template changes, the template's user is forced to rebuild his code (if he wants to get the new changes, of course). Imagine that there was a security bug (or any other serious bug) in the template's code: you'd need to make sure to rebuild every program that uses it to fix the issue, a problem well known to users of statically linked binaries.

Of course, the library developer could explicitly instantiate the template for some common types in his library's binary so that the user needn't duplicate the code. This could work in some cases, but as he cannot predict which types the library's users will need, this is not a complete solution.
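A minimal sketch of what that explicit instantiation could look like, squeezed into a single file so it compiles on its own. The split into foo.hpp and foo.cpp is only marked in comments, and the extern template lines assume a C++11 (or later) compiler:

```cpp
#include <cassert>

// --- what would live in foo.hpp ---
template <class Type>
Type
add(const Type& p1, const Type& p2)
{
    return p1 + p2;
}

// Explicit instantiation declarations (C++11): ask the user's compiler
// not to instantiate these two implicitly, because the library has them.
extern template int add<int>(const int&, const int&);
extern template float add<float>(const float&, const float&);

// --- what would live in foo.cpp, compiled into the library ---
// Explicit instantiation definitions: the object code is emitted here.
template int add<int>(const int&, const int&);
template float add<float>(const float&, const float&);
```

Any other type, say add<double>, is still instantiated in the user's binary as before. Also note that the body remains visible in the header, so a compiler may still inline it into the caller; this reduces duplication but does not turn the template into something with a stable ABI.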

Other developers create templates in a two-layered design. The public template is a very thin wrapper over a private, non-template class that achieves genericity by using void * types. This way, the public template is unlikely to change, and the developers can safely change their internal code without requiring external rebuilds. I think I saw this in the STL itself, or maybe in Qt; I cannot remember.
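Here is a rough sketch of that two-layered idea. The class names raw_stack and ptr_stack are made up for this post and are not taken from any real library; everything is in one file so it compiles on its own, with comments marking what would go where:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- private layer: declared in a header, defined in the library ---
// Stores untyped pointers; knows nothing about the user's types.
class raw_stack {
public:
    void push(void* p);
    void* pop();              // precondition: the stack is not empty
    std::size_t size() const;
private:
    std::vector<void*> items_;
};

// --- would live in foo.cpp, i.e. inside the shared library ---
void raw_stack::push(void* p) { items_.push_back(p); }

void* raw_stack::pop()
{
    void* p = items_.back();
    items_.pop_back();
    return p;
}

std::size_t raw_stack::size() const { return items_.size(); }

// --- public layer: a thin template that only adds the casts ---
template <class Type>
class ptr_stack {
public:
    void push(Type* p) { impl_.push(p); }
    Type* pop() { return static_cast<Type*>(impl_.pop()); }
    std::size_t size() const { return impl_.size(); }
private:
    raw_stack impl_;
};
```

The wrapper is so small that it has little reason to change, while the bodies of push and pop can be fixed inside the library without rebuilding users. One caveat of this particular sketch: raw_stack's data members are visible in the header, so its layout would still be part of the ABI; hiding them behind a pointer to an implementation class would be the next step.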

Summarizing: as my reader said, it makes no sense to talk about templates and ABIs, because a template never has an ABI. It is only an API that, once compiled in third-party code, becomes part of it. I'm now wondering how Java 1.5's generics (its counterpart to templates) work or if they suffer from these issues too...