Abusing “Phantom Types” to Encode List Lengths Into Their Type

posted by Craig Gidney on September 10, 2012

Yesterday there was a post on Reddit about “phantom types” in Java. Phantom types are a technique to expose runtime state to the compiler (without lots of code duplication). The idea is to take a class, add a generic parameter, and use empty interfaces as sigils that indicate particular states. The example from the post splits a Plane class into Plane<Landed>, Plane<Landing> and Plane<Flying>. This is useful because, for example, the compiler will prevent you from mistakenly passing a flying plane into a “taxi to runway” method.

I don’t know exactly how I feel about phantom types as a practical coding technique, but I do know it’s possible to abuse them a lot more than you might initially expect. In particular, your sigil types can themselves be generic and this allows you to do things like counting. I will demonstrate this fact with a rudimentary example: a list with known compile-time size in C#.

The basic technique mirrors the Peano Axioms that define the natural numbers. You have a ‘zero’ interface and a ‘next number’ interface that you nest around zero many times. Numbers are encoded into the nesting depth:

interface ILength {}
interface IZero : ILength {}
interface IOnePlus<TLen> : ILength where TLen : ILength {}
//0 == IZero, 1 == IOnePlus<IZero>, 2 == IOnePlus<IOnePlus<IZero>>, ...

Now all we need is a list type that includes a generic parameter for its length:

class SizedList<TItem, TLen> : IEnumerable<TItem> where TLen : ILength {
    public readonly Link<TItem> Head;
    public SizedList(Link<TItem> head) {
        this.Head = head;
    }
    public IEnumerator<TItem> GetEnumerator() {
        for (var e = this.Head; e != null; e = e.Next)
            yield return e.Item;
    }
    IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator();
    }
}
class Link<T> {
    public readonly T Item;
    public readonly Link<T> Next;
    public Link(T item, Link<T> next = null) {
        this.Item = item;
        this.Next = next;
    }
}

and some methods that maintain the invariant that the nesting depth matches the list length:

static class SizedList {
    public static SizedList<T, IZero> Empty<T>() {
        return new SizedList<T, IZero>(null);
    }
    public SizedList<TItem, IOnePlus<TLen>> WithAppend<TItem, TLen>(
            this SizedList<TItem, TLen> list,
            TItem item) where TLen : ILength {
        return new SizedList<TItem, IOnePlus<TLen>>(new Link<TItem>(item, list.Head));
    }
    public static SizedList<TItem, TLen> WithPop<TItem, TLen>(
            this SizedList<TItem, IOnePlus<TLen>> items) where TLen : ILength {
        return new SizedList<TItem, TLen>(items.Head.Next);
    }
}

The benefits of this abuse? Compile-time errors when the size of the list is wrong:

static class Program {
    static Tuple<int, int> SafeToPair(this SizedList<int, IOnePlus<IOnePlus<IZero>>> list) {
        return Tuple.Create(list.Head.Item, list.Head.Next.Item);
    }
    static TItem SafeMax<TItem, TLen>(this SizedList<TItem, IOnePlus<TLen>> list) where TLen:ILength {
        return list.Max();
    }

    static void Main(string[] args) {
        var list0 = SizedList.Empty<int>();
        var list1 = list0.WithAppend(5);
        var list2 = list1.WithAppend(6);
        var list3 = list2.WithAppend(7);
        var list2B = list3.WithPop();
        
        // SafeMax only works on non-empty lists (compiler errors on others)
        var compileError1 = list0.SafeMax(); //<-- not allowed
        var max1 = list1.SafeMax();
        var max2 = list2.SafeMax();
        var max2B = list2B.SafeMax();
        var max3 = list3.SafeMax();

        // SafeToPair only works on size 2 lists (compiler errors on others)
        var compileError2 = list0.SafeToPair(); //<-- not allowed
        var compileError3 = list1.SafeToPair(); //<-- not allowed
        var pair1 = list2.SafeToPair();
        var pair2 = list2B.SafeToPair();
        var compileError4 = list3.SafeToPair(); //<-- not allowed
    }
}

This counting-with-types technique is clearly useful, but I don't recommend jumping through the necessary hoops in practice. Things can get really ugly when you try to, for example, implement SelectMany over sized lists. I'm not even sure if compile times won't go exponential under this sort of abuse.


Twisted Oak Studios offers consulting and development on high-tech interactive projects. Check out our portfolio, or Give us a shout if you have anything you think some really rad engineers should help you with.

Archive

More interesting posts (23 of 33 articles)

Or check out our Portfolio.