Swift’s value types are *almost* able to represent algebraic data types. Unfortunately, they fall short of the mark when it comes to recursion, and while they’ve announced that their solution, indirect `case`

s, will ship in a later build of Swift 2, there’s still reason to want them today.

The standard solution is to use `Box<T>`

, a function, or some other reference type to manually force an indirection for the recursive cases:

Unfortunately, this has a few significant warts:

- Clients of the API have to know about
`Box<T>`

; it can’t be a private implementation detail. This can in turn lead to ambiguities if APIs aren’t using a common dependency to provide the`Box<T>`

type. Further, they have to box and unbox the values themselves. - Pattern matching cannot be performed recursively in a single step.

Indirect cases will (I believe) resolve both of these issues, but there’s another solution we can apply today which solves both *and* provides a significant increase in the expressiveness of the type, at the expense of introducing a (useful) intermediary type.

To begin with, note that in Swift 2, it’s no longer necessary to box elements of parameterized types in enum cases. This suggests a straightforward refactoring: replace `Expression`

’s recursive instances with elements of a type parameter:

Now we’ve got an `Expression`

type that can be instantiated with a given type parameter to recur. But if we try to describe the type of a recursive instance of it, we immediately run into a wall:

It would seem we’ve simply moved the problem from the `case`

s to the type, and can now see more clearly why Swift doesn’t allow `case`

s to recur directly: it amounts to an infinite type. Some indirection is required, *somewhere*, and by allowing the programmer to place it (whether by explicit boxing or an `indirect`

keyword), the performance implications are somewhat under their control, rather than the compiler’s.

We need some way to tie `Expression`

into a knot (as it were), looping back around into itself, but without requiring us to write out an infinite list of nested type parameters. If we were writing a function instead of a type, we could use the `fix`

function, which computes the least fixed point of a function, to lift a nonrecursive function into a recursive one:

Instead of making a recursive function, we make a nonrecursive function taking a function as a parameter, and return an inner function which calls through it in order to recur. `fix`

calls the outer function with a closure which calls back into `fix`

, tying the knot. Is there an analogous fixed point for types? If there were, we would expect it to have the same overall shape: it would apply a type constructor like `Expression<T>`

to a type which itself provides the connection back to `Expression<T>`

.

I’ll let you in on a secret: types are functions, too. `Expression<T>`

is actually a function, abstracted over a parameter `T`

to a concrete instance of `Expression`

with `Recur`

instantiated to `T`

. And it turns out that, like other functions, types also have fixed points.

In Haskell (the inevitable destination of any discussion of fixed points in programming languages), we could write this `Fix`

type of a parameter type `f`

like so:

This is Haskell notation approaching its densest form, so let’s compare it with how `fix`

(the least fixed point of functions) is defined in Swift:

The `fix`

function applies `f`

, the function parameter passed to `fix`

, to the result of applying `fix`

recursively to `f`

again. It wraps this up in another closure to avoid infinite looping.

Analogously, the `Fix`

type applies `f`

, the type parameter passed to `fix`

, to the result of applying `Fix`

recursively to `f`

again. Haskell is lazily-evaluated, so it doesn’t need to wrap the lot of it up again.

Let’s try writing `Fix`

in Swift. It only has one `case`

, so it can be a struct instead of an `enum`

.

Now it needs to have a type parameter, `F`

.

So far so good. Now we need to apply `F`

to itself, recursively. But doesn’t that cause the infinite sequence of nested types again? `Fix<F<Fix<F<…>>>>`

is no improvement on `Expression<Expression<Expression<…>>>`

.

Fortunately, Swift allows you to refer to a type without reference to its type parameters in its body:

Unfortunately, while `Fix`

is a complete reference inside the body of this type, Swift doesn’t know that `F`

can accept type parameters, and thus rejects this. We can be sneaky and use a `protocol`

with a `typealias`

to work around this:

But now when we add the constraint to tie `F`

into a knot, we run into a new issue: `swiftc`

crashes. (rdar://20000145).

Fortunately, while Swift can’t express a *generic* `Fix`

over any arbitrary fixable type, it *can* express a fixed point of `Expression`

*specifically*. Let’s call this new type `Term`

. Once again, it’s a `struct`

, and its body holds an `Expression`

instantiated to itself. This one errors out, but it’s clear we’re getting closer:

`Term`

is recursive because it holds an `Expression`

which in turn holds (in some of its cases) a `Recur`

, which we’ve instantiated to `Term`

. We need to reintroduce an indirection via a reference type like `Box<T>`

or a function.

Haven’t we just moved the problem around again? Well, sort of. Certainly we still need to box the values, but now we can do it in one and only one place—`Term`

—and additionally we can make it `private`

, avoiding exposing our implementation details to our consumers. Our constructor and getter can handle the boxing and unboxing for us:

That’s a pretty decent reason to use this approach right now (if you can’t wait for indirect cases). But it only solves one of the problems we mentioned initially; we still can’t pattern match recursively. For example, if we wanted to evaluate application expressions, we would want to write something like this:

But because of the `Term`

and `Box`

, neither of which can be matched through, we would have to write this instead:

If we could flatten out the type, we could pattern match. Flattening out the type would put us straight back into the infinite sequence of `Expression<…>`

s; but maybe we can only *partially* flatten it?

We don’t need to pattern match against arbitrarily-nested terms for this example; we just want to match against a single nested layer. Therefore, we really only need to flatten out a single step of the recursive type. We’d need to apply this for each appearance of `Recur`

in `Expression`

, replacing it with `Expression<Recur>`

.

Replacing each instance of a type parameter with an instance of another type parameter sounds like a job for a `map`

function. In Haskell, this function is known as `fmap`

, for functor map, where functors are a kind of mathematical object with some specific shape, and where map preserves this shape. For example, the `Array.map`

method, given some function `transform`

, produces a new array with the same number of elements and in the same order (i.e. preserving the structure of the array), but with each element replaced by applying `transform`

. Array, then, is a functor; and it turns out, so is our `Expression`

tree.

In our case, `map`

should replace the `Recur`

instances with the result of applying some function to them. There are no instances of `Recur`

in `Variable`

cases, so it should just re-wrap the variable name in the resulting type; the `Abstraction`

and `Application`

cases will apply `transform`

:

We can use this to implement recursion schemes, improving our confidence in recursive functions over the type, but for now we’ll limit ourselves to enabling pattern matching. Given an `Expression<Recur>`

, we want to replace each `Recur`

with its recursive instantiation, `Expression<Recur>`

. Otherwise put, we need a function of type `Expression<Recur> -> Expression<Expression<Recur>>`

. Let’s implement this as a method, and call it destructure (since it decomposes the structure of the type):

…but we can’t! To implement a function of type `Expression<Recur> -> Expression<Expression<Recur>>`

using `map`

, we’d need a function of type `Recur -> Expression<Recur>`

to pass to it. There is no useful function that can do this; without knowing a *specific* (and actually recursive) type for `Recur`

, we have no way to recover the `Expression<Recur>`

that we want to return.

Instead, let’s use a constrained extension to limit ourselves to `Expression<Term>`

. Unfortunately it’s not *quite* that simple, because Swift, for reasons beyond my knowledge (rdar://21512469), forbids the obvious thing:

We’ll work around this using a protocol, `FixpointType`

:

Now we can constrain the extension to `FixpointType`

like we want:

There are two problems remaining with this implementation:

- We still don’t have a way to get an
`Expression<Recur>`

from a`Recur`

. `swiftc`

crashes. (rdar://21328632)

Fortunately, we can resolve the former by adding a property to the protocol:

With that out of the way, we can work around the crash by loosening the constraints slightly; we don’t actually require that `Recur.Fixed`

be recursive; we just need to be able to name it. Now we can give the return type of `destructure`

as `Expression<Recur.Fixed>`

, and implement it in the obvious way, mapping each term to its body:

Now we can use `destructure`

to implement evaluation of well-formed `.Application`

expressions, using exactly the pattern matching we wanted in the first place: