While it’s true that anything which can be computed, period, can be computed in the lambda calculus, you might not want to: it’s austere, to say the least, and was not designed with modern sensibilities regarding readability in mind. We developed all those languages and features for a reason! Still, Church demonstrated not just that it was possible to compute anything computable with the lambda calculus, but also how one might do so.
In this series, we’ll examine some ways to express common programming language features using the minimalistic tools of the lambda calculus. We begin with perhaps the most ubiquitous type: booleans.
The lambda calculus’s austerity is extreme: you don’t even have booleans. All you have are:
Lambda abstractions;
Applications; and
Variables.
We’ll now review these in some detail; feel free to skip this section if you’re already familiar with the lambda calculus.
Lambda abstractions (“lambdas,” “abstractions,” and “functions” will also be used interchangeably) introduce a function of a single variable.
Abstractions are written `λ x . y`, for variable `x` and expression `y`, where `x` is now available as a bound variable in the body, and any enclosing definition of `x` is shadowed (i.e. `λ x . λ x . x` = `λ x . λ y . y` ≠ `λ x . λ y . x`). (We shall assume strictly lexical scoping for the time being.)
In Haskell, we would write `\x -> y` instead; in JavaScript, `function (x) { return y }` or `(x) => y`.
Applications (“function application” and “function call” will be used interchangeably) apply the result of the expression on the left to the expression on the right.
Applications are written as `x y`, for expressions `x` and `y`, and left-associated, i.e. `a b c` = `(a b) c` ≠ `a (b c)`. Function application binds tighter than lambda abstraction, i.e. `λ x . λ y . y x` = `λ x . λ y . (y x)` ≠ `λ x . (λ y . y) x`.
The syntax is the same in Haskell; in JavaScript, we would write `x(y)` or `a(b, c)`. Note however that since lambda calculus functions are all single-argument functions, a more direct (though less idiomatic) equivalent for the latter would be `a(b)(c)`.
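To make that concrete, here's a quick sketch in JavaScript (the `add` function is invented for illustration):

```javascript
// A curried "two-argument" function: really two nested single-argument functions.
const add = (x) => (y) => x + y;

// Application peels off one argument at a time, just like a(b)(c).
const addTwo = add(2); // partially applied: a function awaiting y
console.log(addTwo(3)); // 5
console.log(add(2)(3)); // 5, i.e. (add(2))(3), left-associated
```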
Variables are introduced by enclosing lambdas.
Variables are written as more or less arbitrary names, typically alphanumeric (e.g. `x` or `y0` or `thing`); however, we will feel free to include non-alphanumeric characters in names as we see fit, since the paucity of syntax means there's little risk of ambiguity.
Since the only available variables are those bound by enclosing lambdas, we can also infer that there are no `let` bindings for local variables, and no globals of any sort; the lambda calculus doesn't come with a standard library.
In quasi-BNF, the grammar for the lambda calculus is extremely minimal:
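A sketch of that grammar, assembled from the three constructs enumerated above:

```bnf
expression ::= "λ" variable "." expression    (abstraction)
             | expression expression          (application)
             | variable                       (variable)
```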
And finally, this table gives a side-by-side comparison of the syntax of the lambda calculus with the corresponding syntax in Haskell & JavaScript:
|  | Lambda calculus | Haskell | JavaScript |
|---|---|---|---|
| Abstraction | `λ x . y` | `\x -> y` | `(x) => y` |
| Application | `f x` | `f x` | `f(x)` |
| Variable | `x` | `x` | `x` |
Lambdas are the only way to introduce values—they’re the only “literal” syntax in the language. We can therefore infer that the only kinds of runtime values must be closures. In an interpreter for the lambda calculus, closures might consist of the name of the introduced variable, the body of the lambda, & a map relating the names and values of any variables it closed over when constructed (again, we assume strict lexical scoping). There are no bits, bytes, words, pointers, or objects in the language’s semantics; only this runtime representation of lambdas.
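A minimal evaluator sketch in JavaScript makes this concrete (the representation here, tagged objects and an `env` map, is my own choice, not canonical):

```javascript
// Closures are the only runtime values: a parameter name, a body, and the
// captured environment (strict lexical scoping, as assumed above).
const evaluate = (expr, env = {}) => {
  switch (expr.tag) {
    case "variable":
      return env[expr.name];
    case "abstraction":
      return { param: expr.param, body: expr.body, env }; // a closure
    case "application": {
      const fn = evaluate(expr.fn, env);
      const arg = evaluate(expr.arg, env);
      // Extend the closure's captured environment with the bound argument.
      return evaluate(fn.body, { ...fn.env, [fn.param]: arg });
    }
  }
};

// λ x . x applied to λ y . y evaluates to the closure for λ y . y.
const identity = { tag: "abstraction", param: "x", body: { tag: "variable", name: "x" } };
const idY = { tag: "abstraction", param: "y", body: { tag: "variable", name: "y" } };
const result = evaluate({ tag: "application", fn: identity, arg: idY });
```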
Likewise, lambdas are also the only way to introduce variables—there’s no standard library, builtins, primitives, prelude, or global environment to provide common definitions. We’re truly baking the apple pie from scratch.
All of this raises the question: how do you do anything when you don't even have `true` and `false`? Lambdas and variables don't do, they merely are, so that leaves application. When all you have is application, everything looks like a lambda abstraction, so we'll represent booleans using lambdas.
Of course, it's not just booleans we're after; `true` and `false` aren't much use without `and`, `or`, `not`, `if`, and all the rest. To be useful, our representation of booleans should therefore suffice to define these, as well. But how do you define `if` without using `if`? In a lazy language like Haskell, we might define `if` as a function something like so:
if_ :: Bool -> a -> a -> a
if_ cond then_ else_ = if cond then then_ else else_
In a strict language like JavaScript, we’d instead take functions for the alternatives:
function if_(cond, then_, else_) {
  if (cond) {
    return then_();
  } else {
    return else_();
  }
}
Both these definitions use the language's native booleans and `if` syntax (a tactic for implementing embedded DSLs known as "metacircularity"), and thus aren't viable in the lambda calculus. However, they do give us a hint: in both cases we have a function taking a condition, consequence, and alternative, and using the first to select one of the latter two. In the lambda calculus, we might start by writing:
if = λ cond then else . ?
(Note: there aren't any keywords in the lambda calculus, so there's nothing stopping me from naming variables things like `if`, a fact which I will take free advantage of.)
We've introduced a definition for `if`, as a function of three parameters; now what do we do with them? The lambda calculus's stark palette makes it easy to enumerate all the things we can do with some variable `a`:
Ignore it, whether by simply not mentioning it at all (as in `λ a . λ b . b`), or by shadowing it with another lambda which binds the same name (as in `λ a . λ a . a`).
Mention it, whether on its own in the body of a lambda (as in `λ a . a` or `λ a . λ b . a`), somewhere within either side of an application (as in `λ a . λ b . a b` or `λ a . λ b . b a`), or some combination of both (as in `λ a . (λ b . a) a`).
We could for example simply return `then` or `else`:
if = λ cond then else . then
if = λ cond then else . else
But in that case the conditional isn't conditional at all—the value in no way depends on `cond`. Clearly the body must make use of all three variables if we want it to behave like the `if`s we know and love from other languages.
Taking a step back for a moment, let's examine the roles of `if`'s arguments. `then` and `else` are passive; we only want to use or evaluate one or the other depending on the value of `cond`. `cond`, then, is the key: it takes the active role.
Thus, in the same way that our `if_` functions in Haskell & JavaScript employed those languages' features to implement it, we're going to define `if cond then else` as the application of the condition to the other two parameters:
if = λ cond then else . cond then else
This feels strangely like cheating: surely we've only moved the problem around. Now instead of `if` making the decision about which argument to return, we've deferred it to `cond`. But `if` and `cond` aren't the same, semantically; `if` takes a boolean and two other arguments and returns one of the latter, while `cond` is a boolean—albeit evidently a boolean represented as a function. Let's make that precise by writing down `if`'s type:
if : Bool -> a -> a -> a
Notwithstanding our use of the yet-to-be-defined name `Bool` for the type of the condition, this is the same type as we gave `if_` in Haskell; that's a good sign that we're on the right track! It takes a `Bool` and two arguments of type `a`, and it must return one of those arguments, because that's the only way for it to come up with the `a` that it returns. But what is `Bool`?
Working backwards from the type and definition of `if`, we see that `cond` is applied to two arguments, and therefore must be a function of two parameters. Further, these are both of type `a`, and the value it returns must also be of type `a` for `if`'s type to hold. Thus, we can define the type `Bool` like so:
Bool = ∀ a . a -> a -> a
If a given `Bool` is a function of two arguments of arbitrary type, returning the same type, it must therefore select one of its arguments to return. There are only two distinguishable inhabitants of `Bool`, `true` and `false`, so we can therefore deduce that since `if` defers the selection of the result to the `Bool`, for `true` and `false` to actually differ they must make opposite selections. In other words, `true` must return the `then` parameter, while `false` must return the `else` one:
true, false : Bool
true = λ then else . then
false = λ then else . else
We didn't move the problem around after all; we solved it. What we noticed was a deeper insight: this encoding of booleans makes `if` redundant, since if we can apply `if` to a `Bool` and two arguments, we could equally apply the `Bool` to those arguments directly.
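Transcribed into JavaScript (with trailing underscores, since `true`, `false`, and `if` are reserved words; a sketch of the encoding, not a library):

```javascript
// A Church boolean is a function that selects one of two arguments.
const true_  = (then_) => (else_) => then_;
const false_ = (then_) => (else_) => else_;

// if is nothing more than application of the condition…
const if_ = (cond) => (then_) => (else_) => cond(then_)(else_);

console.log(if_(true_)("yes")("no")); // "yes"
console.log(false_("yes")("no"));     // "no", applying the Bool directly
```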
It's frequently convenient to conflate booleans with bits, their minimal representation, but in truth they're not the same at all. Practically, some programming languages define booleans as a byte in memory, perhaps clamping its values to 0 and 1; others define them as instances of some boolean class, or constructors of an algebraic datatype. Some provide no formal relationship between `true` and `false` at all, save for a common interface—duck typing.
Mathematically, booleans are the values in propositional logic; the upper and lower bounds of a lattice; the zero and one of a semiring; the members of the set with cardinality 2; and many other things in many different contexts.
Operationally, booleans represent choice, and this is a pattern that we’ll see repeated: encoding a datatype with lambdas means representing the datatype as functions supporting all of its operations. All operations on booleans can be defined by selecting between two alternatives, which is precisely what our encoding does.
We can demonstrate this by defining some other operations on booleans, e.g. logical operators, using the encoding we’ve built thus far.
`not` takes a single `Bool` and returns another:
not : Bool -> Bool
not = λ x . ?
As when defining `if`, all we can do with a `Bool` is branch on it:
not = λ x . if x ? ?
But which arguments should we pass if we wish to return a `Bool` with the opposite value? Recall the definition of `Bool` from above:
Bool = ∀ a . a -> a -> a
To return a `Bool`, therefore, each argument must likewise be a `Bool`. The first argument will be selected if `x` is `true`, the second if `x` is `false`, so if we want the opposite value from `x` we can simply apply it to the opposite values in either position:
not = λ x . if x false true
`not x` will therefore return `false` if `x` is `true`, and `true` if `x` is `false`; equationally:
not true = false
not false = true
Which is precisely the meaning we intended `not` to have.
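In the same JavaScript transcription (underscored names again, dodging reserved words):

```javascript
const true_  = (then_) => (else_) => then_;
const false_ = (then_) => (else_) => else_;

// not branches on x, passing the opposite boolean in either position.
const not = (x) => x(false_)(true_);

console.log(not(true_)("yes")("no"));  // "no"
console.log(not(false_)("yes")("no")); // "yes"
```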
`or` and `and` are closely related to one another, so we'll define them simultaneously. Both take two `Bool`s and return a `Bool`:
or, and : Bool -> Bool -> Bool
or = λ x y . ?
and = λ x y . ?
As with `not`, all we can do with `Bool`s is branch:
or = λ x y . if x ? ?
and = λ x y . if x ? ?
For `or`, if `x` is `true`, we can return `true` immediately ("short-circuiting"). For `and`, it's the opposite:
or = λ x y . if x true ?
and = λ x y . if x ? false
If `x` is `false`, `or` needs to test whether `y` is `true`; likewise, if `x` is `true`, `and` needs to test whether `y` is also `true`. Once more, all we can do with `Bool`s is branch:
or = λ x y . if x true (if y ? ?)
and = λ x y . if x (if y ? ?) false
And since we must return a `Bool`, we can use `true` and `false`:
or = λ x y . if x true (if y true false)
and = λ x y . if x (if y true false) false
Pleasantly, `if y true false` (and likewise `y true false`) is operationally equivalent to `y`. Using that equivalence, we can simplify these definitions, leaving us with:
or = λ x y . if x true y
and = λ x y . if x y false
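The simplified definitions, once more in JavaScript (a sketch built on the Church booleans from earlier):

```javascript
const true_  = (then_) => (else_) => then_;
const false_ = (then_) => (else_) => else_;

// or  = λ x y . if x true y  — short-circuits when x is true.
// and = λ x y . if x y false — short-circuits when x is false.
const or  = (x) => (y) => x(true_)(y);
const and = (x) => (y) => x(y)(false_);

console.log(or(false_)(true_)("yes")("no"));  // "yes"
console.log(and(true_)(false_)("yes")("no")); // "no"
```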
In this post, we’ve explored defining a ubiquitous programming language feature—booleans—using nothing more than the spartan trappings of the lambda calculus. We’ve emerged with a language which can express not merely functions and their applications, but also fundamental metaphysical concepts such as truth.
In the next post, we'll look at lambda-encodings of beauty: ML/Haskell-style algebraic datatypes.
While recursive enums, via indirect `case`s, will ship in a later build of Swift 2, there's still reason to want them today.
The standard solution is to use `Box<T>`, a function, or some other reference type to manually force an indirection for the recursive cases:
enum Expression {
    case Variable(String)
    case Abstraction(String, Box<Expression>)
    case Application(Box<Expression>, Box<Expression>)
}
Unfortunately, this has a few significant warts:

- Consumers of the type have to know about `Box<T>`; it can't be a private implementation detail. This can in turn lead to ambiguities if APIs aren't using a common dependency to provide the `Box<T>` type.
- Further, they have to box and unbox the values themselves.

Indirect cases will (I believe) resolve both of these issues, but there's another solution we can apply today which solves both and provides a significant increase in the expressiveness of the type, at the expense of introducing a (useful) intermediary type.
To begin with, note that in Swift 2, it's no longer necessary to box elements of parameterized types in enum cases. This suggests a straightforward refactoring: replace `Expression`'s recursive instances with elements of a type parameter:
enum Expression<Recur> {
    case Variable(String)
    case Abstraction(String, Recur)
    case Application(Recur, Recur)
}
Now we've got an `Expression` type that can be instantiated with a given type parameter to recur. But if we try to describe the type of a recursive instance of it, we immediately run into a wall:
let expression: Expression<Expression<Expression<…>>>
It would seem we've simply moved the problem from the `case`s to the type, and can now see more clearly why Swift doesn't allow `case`s to recur directly: it amounts to an infinite type. Some indirection is required, somewhere, and by allowing the programmer to place it (whether by explicit boxing or an `indirect` keyword), the performance implications are somewhat under their control, rather than the compiler's.
We need some way to tie `Expression` into a knot (as it were), looping back around into itself, but without requiring us to write out an infinite list of nested type parameters. If we were writing a function instead of a type, we could use the `fix` function, which computes the least fixed point of a function, to lift a non-recursive function into a recursive one:
let factorial = fix { recur in
    { n in n > 0 ? n * recur(n - 1) : 1 }
}
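For comparison, the same trick in JavaScript (a sketch; the eta-expansion defers the recursive call so strict evaluation doesn't loop forever):

```javascript
// Least fixed point of a function: fix(f) behaves like f(fix(f)).
const fix = (f) => (x) => f(fix(f))(x);

// A non-recursive "factorial step", tied into a knot by fix.
const factorial = fix((recur) => (n) => n > 0 ? n * recur(n - 1) : 1);

console.log(factorial(5)); // 120
```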
Instead of making a recursive function, we make a non-recursive function taking a function as a parameter, and return an inner function which calls through it in order to recur. `fix` calls the outer function with a closure which calls back into `fix`, tying the knot. Is there an analogous fixed point for types? If there were, we would expect it to have the same overall shape: it would apply a type constructor like `Expression<T>` to a type which itself provides the connection back to `Expression<T>`.
I'll let you in on a secret: types are functions, too. `Expression<T>` is actually a function, abstracted over a parameter `T`, to a concrete instance of `Expression` with `Recur` instantiated to `T`. And it turns out that, like other functions, types also have fixed points.
In Haskell (the inevitable destination of any discussion of fixed points in programming languages), we could write this `Fix` type of a parameter type `f` like so:
data Fix f = Fix (f (Fix f))
This is Haskell notation approaching its densest form, so let's compare it with how `fix` (the least fixed point of functions) is defined in Swift:
public func fix<A, B>(f: (A -> B) -> A -> B) -> A -> B {
    return { f(fix(f))($0) }
}
The `fix` function applies `f`, the function parameter passed to `fix`, to the result of applying `fix` recursively to `f` again. It wraps this up in another closure to avoid infinite looping.
Analogously, the `Fix` type applies `f`, the type parameter passed to `Fix`, to the result of applying `Fix` recursively to `f` again. Haskell is lazily-evaluated, so it doesn't need to wrap the lot of it up again.
Let's try writing `Fix` in Swift. It only has one `case`, so it can be a `struct` instead of an `enum`:
struct Fix {}
Now it needs to have a type parameter, `F`:
struct Fix<F> {}
So far so good. Now we need to apply `F` to itself, recursively. But doesn't that cause the infinite sequence of nested types again? `Fix<F<Fix<F<…>>>>` is no improvement on `Expression<Expression<Expression<…>>>`.
Fortunately, Swift allows you to refer to a type without reference to its type parameters in its body:
struct Fix<F> {
    let body: F<Fix>
}
Unfortunately, while `Fix` is a complete reference inside the body of this type, Swift doesn't know that `F` can accept type parameters, and thus rejects this. We can be sneaky and use a `protocol` with a `typealias` to work around this:
protocol Fixable {
    typealias Recur
}

struct Fix<F: Fixable> {
    let body: F
}
But now when we add the constraint to tie `F` into a knot, we run into a new issue: `swiftc` crashes (rdar://20000145):
protocol Fixable {
    typealias Recur
}

struct Fix<F: Fixable where F.Recur == Fix> {
    let body: F
}

// => fish: Job 1, 'swift boom.swift' terminated by signal SIGSEGV (Address boundary error)
Fortunately, while Swift can't express a generic `Fix` over any arbitrary fixable type, it can express a fixed point of `Expression` specifically. Let's call this new type `Term`. Once again, it's a `struct`, and its body holds an `Expression` instantiated to itself. This one errors out, but it's clear we're getting closer:
struct Term {
    let body: Expression<Term>
}

// => error: recursive value type 'Term' is not allowed
`Term` is recursive because it holds an `Expression` which in turn holds (in some of its cases) a `Recur`, which we've instantiated to `Term`. We need to reintroduce an indirection via a reference type like `Box<T>` or a function.
Haven't we just moved the problem around again? Well, sort of. Certainly we still need to box the values, but now we can do it in one and only one place—`Term`—and additionally we can make it `private`, avoiding exposing our implementation details to our consumers. Our constructor and getter can handle the boxing and unboxing for us:
struct Term {
    init(body: Expression<Term>) {
        boxedBody = Box(body)
    }

    var body: Expression<Term> {
        return boxedBody.value
    }

    private let boxedBody: Box<Expression<Term>>
}
That’s a pretty decent reason to use this approach right now (if you can’t wait for indirect cases). But it only solves one of the problems we mentioned initially; we still can’t pattern match recursively. For example, if we wanted to evaluate application expressions, we would want to write something like this:
switch expression {
case let .Application(.Abstraction(variable, body), argument):
    // substitute argument for variable in body
default:
    // throw an error
}
But because of the `Term` and `Box`, neither of which can be matched through, we would have to write this instead:
switch expression {
case let .Application(abstraction, argument):
    switch abstraction.body {
    case let .Abstraction(variable, body):
        // substitute argument for variable in body
    default:
        break
    }
    fallthrough
default:
    // throw an error
}
If we could flatten out the type, we could pattern match. Flattening out the type would put us straight back into the infinite sequence of `Expression<…>`s; but maybe we can only partially flatten it?
We don't need to pattern match against arbitrarily-nested terms for this example; we just want to match against a single nested layer. Therefore, we really only need to flatten out a single step of the recursive type. We'd need to apply this for each appearance of `Recur` in `Expression`, replacing it with `Expression<Recur>`.
Replacing each instance of a type parameter with an instance of another type parameter sounds like a job for a `map` function. In Haskell, this function is known as `fmap`, for functor map, where functors are a kind of mathematical object with some specific shape, and where map preserves this shape. For example, the `Array.map` method, given some function `transform`, produces a new array with the same number of elements and in the same order (i.e. preserving the structure of the array), but with each element replaced by applying `transform`. `Array`, then, is a functor; and it turns out, so is our `Expression` tree.
In our case, `map` should replace the `Recur` instances with the result of applying some function to them. There are no instances of `Recur` in `Variable` cases, so it should just rewrap the variable name in the resulting type; the `Abstraction` and `Application` cases will apply `transform`:
enum Expression<Recur> {
    …
    func map<Other>(transform: Recur -> Other) -> Expression<Other> {
        switch self {
        case let .Variable(x):
            return .Variable(x)
        case let .Abstraction(x, body):
            return .Abstraction(x, transform(body))
        case let .Application(a, b):
            return .Application(transform(a), transform(b))
        }
    }
}
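The same shape-preserving map can be sketched in JavaScript over a tagged-object encoding of `Expression` (the constructors and representation here are my own invention, not part of the Swift code):

```javascript
// Constructors for the three cases of Expression<Recur>.
const Variable    = (name)        => ({ tag: "Variable", name });
const Abstraction = (param, body) => ({ tag: "Abstraction", param, body });
const Application = (fn, arg)     => ({ tag: "Application", fn, arg });

// map applies transform at each Recur position, preserving the case's shape.
const map = (expr, transform) => {
  switch (expr.tag) {
    case "Variable":    return Variable(expr.name);
    case "Abstraction": return Abstraction(expr.param, transform(expr.body));
    case "Application": return Application(transform(expr.fn), transform(expr.arg));
  }
};

const app = Application("f", "x");
const upper = map(app, (s) => s.toUpperCase());
console.log(upper.fn, upper.arg); // "F" "X"
```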
We can use this to implement recursion schemes, improving our confidence in recursive functions over the type, but for now we'll limit ourselves to enabling pattern matching. Given an `Expression<Recur>`, we want to replace each `Recur` with its recursive instantiation, `Expression<Recur>`. Otherwise put, we need a function of type `Expression<Recur> -> Expression<Expression<Recur>>`. Let's implement this as a method, and call it `destructure` (since it decomposes the structure of the type):
enum Expression<Recur> {
    …
    func destructure() -> Expression<Expression<Recur>> {
        return map {
            // what do we do here?
        }
    }
}
…but we can't! To implement a function of type `Expression<Recur> -> Expression<Expression<Recur>>` using `map`, we'd need a function of type `Recur -> Expression<Recur>` to pass to it. There is no useful function that can do this; without knowing a specific (and actually recursive) type for `Recur`, we have no way to recover the `Expression<Recur>` that we want to return.
Instead, let's use a constrained extension to limit ourselves to `Expression<Term>`. Unfortunately it's not quite that simple, because Swift, for reasons beyond my knowledge (rdar://21512469), forbids the obvious thing:
extension Expression where Recur == Term { … }
// => error: same-type requirement makes generic parameter 'Recur' non-generic
We'll work around this using a protocol, `FixpointType`:
protocol FixpointType {
    typealias Fixed
}

extension Term: FixpointType {
    typealias Fixed = Expression<Term>
}
Now we can constrain the extension to `FixpointType` like we want:
extension Expression where Recur : FixpointType, Recur.Fixed == Expression<Recur> {
    func destructure() -> Expression<Expression<Recur>> {
        return map {
            // what do we do here?
        }
    }
}
There are two problems remaining with this implementation:

- We still can't construct an `Expression<Recur>` from a `Recur`.
- `swiftc` crashes. (rdar://21328632)

Fortunately, we can resolve the former by adding a property to the protocol:
protocol FixpointType {
    typealias Fixed
    var body: Fixed { get }
}
With that out of the way, we can work around the crash by loosening the constraints slightly; we don't actually require that `Recur.Fixed` be recursive; we just need to be able to name it. Now we can give the return type of `destructure` as `Expression<Recur.Fixed>`, and implement it in the obvious way, mapping each term to its body:
extension Expression where Recur : FixpointType {
    func destructure() -> Expression<Recur.Fixed> {
        return map { term in term.body }
    }
}
Now we can use `destructure` to implement evaluation of well-formed `.Application` expressions, using exactly the pattern matching we wanted in the first place:
switch expression.destructure() {
case let .Application(.Abstraction(variable, body), argument):
    // substitute argument for variable in body
default:
    // throw an error
}
Now, take a pair of integers as x and y coordinates across this square. Their size in bits determines the resolution at which they can measure it.
An integer of n bits can hold any of 2ⁿ distinct values. 32-bit integers, therefore, would divide the square into a grid of 2³² points on each side.
At 32 bits of resolution, adjacent coordinates, e.g. `…0101` and `…0110`, are about a kilometre apart on our square.
If we double the size of our integers, we now divide the square into a grid of 2⁶⁴ points on each side.
At 64 bits of resolution, still covering the entire span of the orbit of Neptune, adjacent coordinates are about 0.24 µm apart, or about 1% of the width of an average human hair.
And famously, populating a 128-bit address space would require us to boil the oceans.
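The arithmetic behind these figures can be checked directly; here's a sketch, assuming the square is about 4.5 billion kilometres on a side (roughly the radius of Neptune's orbit; the exact span is my assumption):

```javascript
const side = 4.5e12; // metres; assumed span of the square

// Spacing between adjacent grid points at each resolution.
const spacing32 = side / 2 ** 32; // ≈ 1048 m: about a kilometre
const spacing64 = side / 2 ** 64; // ≈ 2.4e-7 m: about 0.24 µm

console.log(spacing32, spacing64);
```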