Compiling Generics ― Andreas Zwinkau

Generic programming is a good idea in practically all cases. When such code is compiled to lower-level code, the generics must be lowered. There are two basic approaches with different trade-offs. Let's consider the following code, which shows the generic max function.

T max<T>(T a, T b) {
  if (a < b) return b;
  else return a;
}

void foo() {
  int    x = max(5  , 11  );
  double y = max(5.0, 11.0);
}

Instantiation (C++ style)

Whenever we use a generic piece of code, the compiler will create concrete instances for the type parameters.

In this example, the compiler must instantiate and generate two max methods. Effectively, it is equivalent to compile the following code:

int max(int a, int b) {
  if (a < b) return b;
  else return a;
}

double max(double a, double b) {
  if (a < b) return b;
  else return a;
}

void foo() {
  int    x = max(5  , 11  );
  double y = max(5.0, 11.0);
}

General instance (Java style)

To avoid duplicate instantiations, we can restrict this to one instance of the most general type, which (in Java) is Object. Essentially, we always use the general instance max<Object>, but we can do a better type check. Since we just use a non-generic max, this is called type erasure. Effectively, it is equivalent to the following code:

Object max(Object a, Object b) {
  if (a < b) return b;
  else return a;
}

void foo() {
  Integer x = max(new Integer(5), new Integer(11));
  Double  y = max(new Double(5.0), new Double(11.0));
}

Note the boxing in foo, which is necessary for primitive types, since we cast them to Object.

Trade-Offs

Basically, if performance is important then template instantiation is the way to go, otherwise type erasure is to be prefered. The reason for the better performance of templates is not so much about the dynamic dispatch of generics (dynamic dispatch performance with vtables is ok), but that dynamic dispatch prevents further optimizations. For our example above, the two different max calls could be inlined to save the call overhead. Additionally, for integers max can be optimized further.

Code size will be better with type erasure, since we only have one instance, compared to all the template variants. What type erasure provides is better error messages, since the instantiated code usually loses the information about the instantiation. However, this might be fixable with sufficient engineering. For instantiation the full code must be available, while type erasure only needs the interface. This means compilation is more modular and faster.

Hybrid?

We can understand instantiation as an optimization. For example, we could be to use type erasure for reference types (classes) and instantiation for value types (structs). Classes use vtables anyways, so the performance hit is negligible. High-performance code usually relies on value types in the inner loops, so these parts get optimized well. In general, the compiler could decide case by case whether to instantiate a type parameter Nevertheless, I do not know any compiler who does this.

Conclusion

Template instantiation for performance, otherwise type erasure.

Actually, there are more possibilities, but the two discussed are the most prominent ones.