Antitypical

Sequent calculus cheat sheet

2021-08-14T23:01:02Z

Rules for sequent calculus connectives, formatted as a cheat sheet.

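Rules read "from premises infer conclusion"; a rule marked (axiom) has no premises. Square brackets mark the formula in focus. Each group lists the negative (−) connective first, then the positive (+) one.

Additive

&⊢₁: from [A] Γ ⊢ Δ infer [A & B] Γ ⊢ Δ
&⊢₂: from [B] Γ ⊢ Δ infer [A & B] Γ ⊢ Δ
⊢&: from Γ ⊢ Δ, A and Γ ⊢ Δ, B infer Γ ⊢ Δ, A & B
⊕⊢: from A, Γ ⊢ Δ and B, Γ ⊢ Δ infer A ⊕ B, Γ ⊢ Δ
⊢⊕₁: from Γ ⊢ Δ [A] infer Γ ⊢ Δ [A ⊕ B]
⊢⊕₂: from Γ ⊢ Δ [B] infer Γ ⊢ Δ [A ⊕ B]
~(A & B) ≈ ~A ⊕ ~B
¬(A ⊕ B) ≈ ¬A & ¬B

no left rule for ⊤
⊢⊤: Γ ⊢ Δ, ⊤ (axiom)
0⊢: 0, Γ ⊢ Δ (axiom)
no right rule for 0
~⊤ ≈ 0
¬0 ≈ ⊤

Multiplicative

⅋⊢: from [A] Γ ⊢ Δ and [B] Γ ⊢ Δ infer [A ⅋ B] Γ ⊢ Δ
⊢⅋: from Γ ⊢ Δ, A, B infer Γ ⊢ Δ, A ⅋ B
⊗⊢: from A, B, Γ ⊢ Δ infer A ⊗ B, Γ ⊢ Δ
⊢⊗: from Γ ⊢ Δ [A] and Γ ⊢ Δ [B] infer Γ ⊢ Δ [A ⊗ B]
~(A ⅋ B) ≈ ~A ⊗ ~B
¬(A ⊗ B) ≈ ¬A ⅋ ¬B

⊥_R⊢: [⊥_R] Γ ⊢_R Δ (axiom)
⊢⊥_R: from Γ ⊢_R Δ infer Γ ⊢_R Δ, ⊥_R
1_E⊢: from Γ ⊢_E Δ infer 1_E, Γ ⊢_E Δ
⊢1_E: Γ ⊢_E Δ [1_E] (axiom)
~⊥_R ≈ 1_E
¬1_E ≈ ⊥_R

Implicative

→⊢: from Γ ⊢ Δ [A] and [B] Γ ⊢ Δ infer [A → B] Γ ⊢ Δ
⊢→: from A, Γ ⊢ Δ, B infer Γ ⊢ Δ, A → B
−⊢: from A, Γ ⊢ Δ, B infer A − B, Γ ⊢ Δ
⊢−: from Γ ⊢ Δ [A] and [B] Γ ⊢ Δ infer Γ ⊢ Δ [A − B]
~(A → B) ≈ A − B
¬(A − B) ≈ A → B
A → B ≈ ¬A ⅋ B
A − B ≈ A ⊗ ~B

Negation

¬⊢: from Γ ⊢ Δ [A] infer [¬A] Γ ⊢ Δ
⊢¬: from A, Γ ⊢ Δ infer Γ ⊢ Δ, ¬A
~⊢: from Γ ⊢ Δ, A infer ~A, Γ ⊢ Δ
⊢~: from [A] Γ ⊢ Δ infer Γ ⊢ Δ [~A]
¬~A ≈ A
¬A ≈ A → ⊥_R
~¬A ≈ A
~A ≈ 1_E − A

Assertion

¬̷⊢: from [A] Γ ⊢ Δ infer [¬̷A] Γ ⊢ Δ
⊢¬̷: from Γ ⊢ Δ, A infer Γ ⊢ Δ, ¬̷A
✓⊢: from A, Γ ⊢ Δ infer ✓A, Γ ⊢ Δ
⊢✓: from Γ ⊢ Δ [A] infer Γ ⊢ Δ [✓A]
¬̷A ≈ 1_E → A
✓A ≈ A − ⊥_R

Shifts

↑⊢: from A, Γ ⊢ Δ infer [↑A] Γ ⊢ Δ
⊢↑: from Γ ⊢ Δ [A] infer Γ ⊢ Δ, ↑A
↓⊢: from [A] Γ ⊢ Δ infer ↓A, Γ ⊢ Δ
⊢↓: from Γ ⊢ Δ, A infer Γ ⊢ Δ [↓A]
(A → B)^N = ↓(A^N) → B
(A → B)^V = ↓(A^V → ↑(B^V))

Quantification

∀⊢: from [A{B/X}] Γ ⊢ Δ infer [∀X.A] Γ ⊢ Δ
⊢∀: from Γ ⊢ Δ, A with X ∉ fv(Γ, Δ) infer Γ ⊢ Δ, ∀X.A
∃⊢: from A, Γ ⊢ Δ with X ∉ fv(Γ, Δ) infer ∃X.A, Γ ⊢ Δ
⊢∃: from Γ ⊢ Δ [A{B/X}] infer Γ ⊢ Δ [∃X.A]
~(∀X.A) ≈ ∃X.~A
¬(∃X.A) ≈ ∀X.¬A

Core

init: A ⊢ A (axiom)
cut: from Γ ⊢ Δ, A and A, Γ ⊢ Δ infer Γ ⊢ Δ

Structural

weaken⊢: from Γ ⊢ Δ infer A, Γ ⊢ Δ
⊢weaken: from Γ ⊢ Δ infer Γ ⊢ Δ, A
contract⊢: from A, A, Γ ⊢ Δ infer A, Γ ⊢ Δ
⊢contract: from Γ ⊢ Δ, A, A infer Γ ⊢ Δ, A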

When Howard Met Curry

2021-07-28T20:31:49Z

The Curry-Howard correspondence is a map for moving between logic and type theory, relating propositions with types and proofs with programs. It describes a two-way street, and we can freely move between the two worlds, or perhaps merely two perspectives, so long as we follow the map.

Sometimes the road takes us to unexpected places. Here’s a trip I’ve been on recently.

Double negation

I’ve been working on a language named sequoia, which embeds polarized classical logic in Haskell. Of course, Haskell corresponds to an intuitionistic logic, so we can’t just bring arbitrary classical proofs over directly. We have to use a double-negation translation, importing classical propositions A as intuitionistic propositions ¬¬A. There are several such translations named (and infinitely more possible), differing in how many negations are placed where, but they all get the job done.

Curry-Howard tells us a translation for negations, but we can work this one out ourselves with a little logical knowledge: a negation ¬A can also be encoded as the implication A → ⊥. It’s straightforward enough: “A implies falsehood” means the same thing as “not A.”

Implications translate to functions, but what about ⊥? That simplest, yet perhaps initially baffling, of algebraic datatypes: the empty type. We can define one for ourselves, but there’s a standard definition in the Data.Void module:

data Void

Having no constructors, Void also has no inhabitants—no proof terms—just like ⊥. So Void indeed corresponds to ⊥. So what kind of thing is a -> Void? A function returning Void is a function that cannot return; it can only pass control along to another function returning Void. In other words, a -> Void is just what Curry-Howard tells us: a continuation.

Thus, the double negation ¬¬A becomes a continuation from a continuation:

type DoubleNegation a = (a -> Void) -> Void

which is a shape also known as continuation-passing style. Classical languages embed into intuitionistic ones via CPS.
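As a rough sketch of what this embedding buys us (the names `intro`, `mapDN`, and `lem` are mine for illustration, not sequoia's), every ordinary value embeds under the double negation, functions lift to act under it, and the classically valid excluded middle becomes provable once double-negated:

```haskell
import Data.Void (Void)

type DoubleNegation a = (a -> Void) -> Void

-- Any intuitionistic proof embeds: hand the value to the continuation.
intro :: a -> DoubleNegation a
intro a k = k a

-- Functions act under the double negation by precomposing the continuation.
mapDN :: (a -> b) -> DoubleNegation a -> DoubleNegation b
mapDN f m k = m (k . f)

-- Excluded middle, double-negated: answer "not A" first, and if that
-- refutation is ever applied to an A, jump back and answer "A" instead.
lem :: DoubleNegation (Either a (a -> Void))
lem k = k (Right (\a -> k (Left a)))
```

Nothing here can actually be run to a result, since no continuation into Void can ever be completed; the definitions only demonstrate that the types are inhabited.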

As discussed, modelling ⊥ with Void extends to modelling negations ¬A (encoded as A → ⊥) with continuations. It further opens the door to using logical reasoning principles relating to ⊥. For example, we can use the function absurd, defined as:

absurd :: Void -> a
absurd v = case v of {}

to supply proofs using the principle of explosion, or ex falso quodlibet. And Void is an appropriately abortive substitute for ⊥, since there’s no way for control to pass through a type with no inhabitants.

However good a fit Void might be initially, it’s quite inconvenient when embedding a language within another, whether by encoding or interpretation. You typically want control to return, and to run code afterwards, and to collect results, to continue with the next iteration of a REPL, or to print, save, or transmit computed values, clean up acquired resources, and so on. Absent tricks like throwing values up the stack with exception handlers, you simply can’t: nothing returns from the Void.

We don’t necessarily have to use Void itself, however, but merely something which gives us the same properties w.r.t. control. In particular, we can substitute Void out for another type, often a parameter, in our definition of DoubleNegation. I mentioned earlier that DoubleNegation is continuation-passing style, so I’m going to go ahead and spoil the punchline by renaming the type to Cont at the same time:

newtype Cont r a = Cont { runCont :: (a -> r) -> r }

Control-wise, the main feature of Void is abortive control—it can never return, so it can only pass control to something else with the same property. Cont offers us the same behaviour, in that normal control flow is represented as applications of the continuation, passing control along to the next computation, while abortive control is possible by returning r directly.

By replacing Void with a type parameter r (the traditional name, perhaps for “result,” or “return”), particularly one visible at the type level (vs. universally quantified over as with Codensity), we’ve also opened the door to more interesting control tricks like multi-prompt delimited control, and therefore to arbitrary effects. Handy tricks to have up your sleeve when implementing a language, to say the least.
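To make the two kinds of control flow concrete, here is a small self-contained sketch. The Monad instance is the standard one for this shape; `abort`, `safeDiv`, and `compute` are hypothetical names chosen for the example. Normal flow applies the continuation, while abortive flow returns an r directly and drops the continuation on the floor:

```haskell
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont m) = Cont (\k -> m (k . f))

instance Applicative (Cont r) where
  pure a = Cont (\k -> k a)
  Cont mf <*> Cont ma = Cont (\k -> mf (\f -> ma (k . f)))

instance Monad (Cont r) where
  Cont m >>= f = Cont (\k -> m (\a -> runCont (f a) k))

-- Abortive control: ignore the continuation and return an r directly.
abort :: r -> Cont r a
abort r = Cont (\_ -> r)

-- Divide, aborting the whole computation with a message on zero.
safeDiv :: Int -> Int -> Cont (Either String Int) Int
safeDiv _ 0 = abort (Left "divide by zero")
safeDiv x y = pure (x `div` y)

-- Two divisions in sequence; a normal result is wrapped in Right.
compute :: Int -> Int -> Int -> Either String Int
compute a b c = runCont (safeDiv a b >>= \q -> safeDiv q c) Right
```

With these definitions, `compute 100 5 2` is `Right 10`, while `compute 100 0 2` is `Left "divide by zero"`: the second division never runs. Unlike with Void, control does come back to the caller.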

Beginnings and endings

I recently wrote a post about what I termed environment-passing style. A series of observations arrives at (in some sense) dual translations of a -> b as (b -> r) -> (a -> r) and (e -> a) -> (e -> b) (shapes that you start seeing everywhere once you learn to think of them this way, e.g. in folds and unfolds, in Mendler-style algebras and coalgebras, in lambda encodings, etc.). Here, just as above, r substitutes for—indeed, it abstracts—⊥. What about e?

The common theme running throughout the sequent calculus is duality. r abstracts Void, Void corresponds to ⊥, ⊥ (negative falsity) dualizes to 1 (positive truth), 1 corresponds to (), () is abstracted to e. (Back and forth ’cross the Curry-Howard bridge!)

Earlier, we judged r a fitting substitute for Void because it behaved compatibly with respect to control. In contrast, () doesn’t behave in any way at all.

r’s abstraction of Void is determined entirely by Curry-Howard: r abstracts Void insofar as Void corresponds to ⊥ and its logical rules. The left rule has this sequent as an axiom:

⊥, Γ ⊢ Δ

This gives us ex falso quodlibet: ⊥ on the left suffices to prove any sequent. As we saw, Void provides this via absurd. For r, imagine a type representing sequents implemented using Cont. If your arguments contain an r, there’s no need to consider any other arguments or what values you could compute and return; in fact, you don’t need the continuation at all. Just return the argument of type r and you’re done.

The right rule is instead:

from Γ ⊢ Δ infer Γ ⊢ Δ, ⊥

This instead says that you can construct it along any sequent which was provable anyway; or, working bottom-up, ⊥ adds no information to a proof, and so can be discarded at any time. This one is a little stranger; we can’t construct a Void, period; but if we could, it certainly wouldn’t add any information. On the other hand, r is inhabited, so we can certainly follow the analogy: we can construct an r if we can return without it anyway; in fact, we do, by applying the return continuation.

For e we instead need to trace its relationships with 1. 1’s rules are dual, mirror images of the ones for ⊥. Again, we start with the left rule, which corresponds closely to the right rule for ⊥:

from Γ ⊢ Δ infer 1, Γ ⊢ Δ

Left rules can be read by recalling that a sequent is sort of like a function: we can build a function to eliminate 1 and Γ into Δ by means of a function eliminating Γ into Δ. Put another way, 1 doesn’t give us any power to prove things that we didn’t have without it, and so we can always introduce it as a hypothetical. () works much the same way: adding an argument of type () won’t help you construct anything, since you could have just introduced it as a literal. e therefore must work the same way, but we’re going to weaken our informal restatement of this down to: you are free to ignore an argument of type e. Of course, we aren’t in, e.g., a linear setting, so we’re free to ignore every argument. But even if we were in a linear setting, e should still be discardable.

On the right:

Γ ⊢ Δ, 1

You can always make a 1. Ditto (); it’s what makes it discardable. (1, a positive connective, is indeed defined by its right rule.) e, then, must also be freely introduced, ambiently, ubiquitously.

Considering that we started with the humble ⊥ and 1, the combined consequences of these rules and equivalences for our purposes are surprisingly useful. Expressed as derived rules, on the ⊥ side, if we have an A, we can use it to eliminate an A → ⊥_R—a continuation—at any time; likewise, we can introduce a continuation from A to satisfy a demand for an A → ⊥_R:

from Γ ⊢_R Δ, A infer A → ⊥_R, Γ ⊢_R Δ
from A, Γ ⊢_R Δ infer Γ ⊢_R Δ, A → ⊥_R

Dually, demand for 1_E → A is satisfied by demand for A; and we can always turn A into 1_E → A:

from Γ ⊢_E Δ, A infer Γ ⊢_E Δ, 1_E → A
from A, Γ ⊢_E Δ infer 1_E → A, Γ ⊢_E Δ

Assertive negativity

R represents the ubiquitously available ability to jump to the end of the program and abort. E, on the other hand, represents the ubiquitously available ability to summon (already globally-available) information out of thin air: the environment (hence the name E). Neither of these give us anything really new—after all, we could always pass information inwards, threading it through all the intervening calls, or abort and return outwards by means of Maybe or the like. But doing either in a single step without changing the rest of the code base is pretty handy.

Further, Cont r a gives us some tools that A alone does not, including delimited continuations. Delimited continuations allow us to jump not only to the end of the program, but to some designated intermediate position(s)—often called prompts—introduced by reset, and even to resume control at the point at which we jumped afterwards. This in turn allows us to encode arbitrary effects and handlers.
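For a concrete taste of delimited control, here is a brief sketch using the `Cont`, `reset`, and `shift` that ship in the transformers package's Control.Monad.Trans.Cont (not sequoia's own definitions; `twice` and `result` are names made up for this example). `reset` marks the prompt, and `shift` captures the continuation up to that prompt as an ordinary function that can be resumed any number of times:

```haskell
import Control.Monad.Trans.Cont (Cont, evalCont, reset, shift)

-- Inside the prompt, the captured continuation k is "add one, then
-- return to the reset". Here it is resumed twice.
twice :: Cont r Int
twice = reset $ do
  x <- shift (\k -> pure (k (k 10)))
  pure (x + 1)

result :: Int
result = evalCont twice
```

Here `result` is 12: the continuation `\x -> x + 1` runs once on 10 and once more on 11. Ignoring k altogether would instead jump straight to the prompt, which is the abortive behaviour described earlier.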

In much the same way, the dual structure—probably a comonad—gives us local environments, sandboxing, and coeffects.

If Cont r a is ¬¬A, then what is this dual structure? Following the thread backwards, Cont r a is ¬¬A because Cont r a is (A → ⊥_R) → ⊥_R, which is an encoding of ¬¬A. Our encoding of 1_E → A, on the other hand, doesn’t correspond to any connective—yet. So let’s introduce one: ¬̷, pronounced “not untrue,” is an assertion (some relation to the logical notion, no relation to the computational one), dual to a negation, and works just like our encoding above:

from A, Γ ⊢ Δ infer ¬̷A, Γ ⊢ Δ
from Γ ⊢ Δ, A infer Γ ⊢ Δ, ¬̷A

Unlike ¬, the composition of ¬̷ on itself is surprisingly boring. If ¬̷A encodes as 1_E → A, then all we’ve got is 1_E → 1_E → A, which gives us nothing the single instance didn’t. I thought these things were supposed to be dual; what gives?

Polarization

I mentioned before that sequoia embeds polarized classical logic. Thus, the above tells only half the story, because we have two different negations: the negative negation ¬ (“not”), and the positive negation ~ (“negate”). They further accept propositions of the opposite polarity, i.e. ¬ takes a positive proposition and ~ takes a negative one, and are involutive, cancelling each other out. ¬~A^- ≈ A^-, and ~¬A⁺ ≈ A⁺.

Likewise, there are actually two different assertions. ¬̷, which we saw above, is the negative one, while the positive one is stranger still. We arrived at the negative assertion by considering the negative negation; maybe we can find the positive one by a similar route.

The encoding of ¬A as A → ⊥_R which we saw earlier wouldn’t be well-polarized for the positive negation ~A. Instead, ~A is encoded as 1 − A, where A − B (“A without B”) is (categorically) a coexponential, (logically) a coimplication or subtraction, and (computationally) a calling context. (Downen has called it a call stack, but I dislike that name, as it’s more like a single frame than an entire stack.)

While the logical rules for its introduction and elimination offer some insight into what its representation must hold, it’s perhaps clearest under a different set of encodings: this time, encoding → and − in terms of disjunction/conjunction and negations. Classically, A → B can be encoded as ¬A ∨ B, while A − B can be encoded as A ∧ ¬B (i.e. “A and not B,” hence the pronunciation of − as “without”). If A − B could be encoded as a conjunction of A and the negation of B, then what does Curry-Howard have to say about that? Conjunctions are product types; negations are still continuations; A − B is isomorphic to a pair of an A and a continuation from B.

We can see now that A → B and A - B are dual: A − B holds both the argument to and continuation from A → B. I sometimes imagine A − B as a pipeline, with a section missing; A → B is precisely the missing section that fits and completes it.
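The pipeline picture can be sketched in a few lines of Haskell. This is an illustrative encoding following the A ∧ ¬B reading above, with r standing in for ⊥ as it does in Cont; `Sub` and `fill` are hypothetical names, not necessarily sequoia's representation:

```haskell
-- A − B: an argument of type a paired with a continuation from b.
data Sub r a b = Sub a (b -> r)

-- A function a -> b is exactly the missing section of the pipeline:
-- feed it the argument, and hand its result to the continuation.
fill :: (a -> b) -> Sub r a b -> r
fill f (Sub a k) = k (f a)
```

For instance, `fill length (Sub "pipeline" show)` evaluates to `"8"`: the subtraction supplies the input string and the consumer `show`, and `length` completes the circuit between them.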

Thus far our encodings of the two negations and our single assertion are:

¬A ≈ A → ⊥_R
~A ≈ 1 − A
¬̷A = 1_E → A

We can further organize these by polarity and purpose:

	negative	positive
negation	¬A ≈ A → ⊥_R	~A ≈ 1 − A
assertion	¬̷A = 1_E → A	…?

The negations both invert polarity, whereas ¬̷ maintains it. Further, the negative connectives both employ →, whereas ~ uses −. Putting it together, we can expect the positive assertion to encode as −, and to maintain polarity, and that gives us:

	negative	positive
negation	¬A ≈ A → ⊥_R	~A ≈ 1 − A
assertion	¬̷A = 1_E → A	✓A ≈ A − ⊥

✓, pronounced “true,” is truly dual to ¬, and ¬̷ is dual to ~. ✓’s encoding gives us the following rules:

from A, Γ ⊢ Δ infer ✓A, Γ ⊢ Δ
from Γ ⊢ Δ, A infer Γ ⊢ Δ, ✓A

So far, so… disappointing. These are precisely the same rules as we found for ¬̷; only the symbols and the polarities have been swapped. And what’s worse, the same was already true of the rules for the negations.

Speaking of which, the encoding for ~ seems circular: the positive continuation ~A can be encoded as 1 − A, itself represented as a pair of a unit value and … a continuation from A? But no, it’s the positive negation ~A that can be encoded thus. Just the same, that distinction alone isn’t satisfying: one wonders what the point of ~ is, polarity aside. We’ve already got a continuation connective in ¬; what do we need another one for?

It was in precisely such a mood that I happened to open Paul Downen & Zena Ariola’s recent paper Compiling with Classical Connectives to where I’d last left off, on a page starting with this paragraph:

The two negation types can be thought of as two dual way for representing first-class continuations in a programming language. One way to formulate a continuation is by capturing the context as a first class value. This corresponds to the data type ⊖A which packages up a covalue F as the value ⊖F, which can be later unpacked by pattern-matching in λ⊖α.c. Another way to formulate continuations is through functions that never return. This corresponds to the codata type ¬A which has values of the form λ¬x.c, which is analogous to a function abstraction that does not bind a return pointer, and covalues of the form ¬W, which is analogous to throwing a value to a continuation without waiting for a returned result.

Compiling with Classical Connectives, Paul Downen, Zena M. Ariola

In short, you don’t use the same representation of continuations under two different names; you use two different representations, each with their own strengths. I was delighted to read this, because it reflects something about ~ that I’d only just noticed a day or two prior: the reason representing ¬A with continuations a -> r works is that we’re consistent about it. Shouldn’t we be consistent about our treatment of 1, too? In which case, we should revisit our table:

	negative	positive
negation	¬A ≈ A → ⊥_R	~A ≈ 1_E − A
assertion	¬̷A = 1_E → A	✓A ≈ A − ⊥_R

In other words, ~A is defined to hold the environment in which it was constructed. It’s a lot like a closure, and precisely the kind of thing Downen & Ariola describe.

This also sheds some light on the contrast between the two assertions. ✓A holds an A, along with a continuation from R. (I’m not exactly certain what that means, just yet.) ¬̷A, on the other hand, may close over an A, or it may construct one from E. In other words, ¬̷A models dynamically-scoped variables, and ✓A models lexically-scoped ones. And while ¬̷A may look partial at first blush, it’s important to remember that while the logical rules don’t (currently) encode this fact, E needn’t be the same for every sequent any more than R need be. (reset, for example, allows the conclusion’s R to vary from the premise’s.)
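One way to read the revised table as Haskell types is sketched below. These are illustrative guesses at representations, reading each subtraction as a pair and each implication as a function, with type parameters r and e standing for R and E; the names `Yes`, `local`, and `throw` are invented here and none of this is claimed to be sequoia's actual API:

```haskell
-- ¬A ≈ A → ⊥_R: a continuation from A.
newtype Not r a = Not { runNot :: a -> r }

-- ~A ≈ 1_E − A: an environment paired with a continuation from A.
data Negate e r a = Negate e (a -> r)

-- ¬̷A = 1_E → A: an A computed from the environment.
newtype NotUntrue e a = NotUntrue { runNotUntrue :: e -> a }

-- ✓A ≈ A − ⊥_R: an A paired with a continuation from R.
data Yes r a = Yes a (r -> r)

-- Run a ¬̷A under a locally modified environment (dynamic scoping).
local :: (e -> e') -> NotUntrue e' a -> NotUntrue e a
local f (NotUntrue g) = NotUntrue (g . f)

-- Throw a value to the continuation held by a ~A, ignoring the
-- environment it was constructed in.
throw :: Negate e r a -> a -> r
throw (Negate _ k) = k
```

Under this reading, `runNotUntrue (local (+ 1) (NotUntrue (* 2))) 3` is 8: the environment 3 is adjusted to 4 before the inner computation looks it up, which is the dynamically-scoped behaviour described above.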

Endings and beginnings

I set out to understand continuation-passing style better, and whatever its opposite might be, and ended up coming up with two new connectives modelling lexical & dynamic scoping, a way to integrate coeffects with the same deep level of access as effects and CPS already enjoy, and a deeper appreciation of the positive negation ~ and its role in contrast to ¬. Given all that, it’s hard to be too sad that I still don’t know much about what I was calling environment-passing style. I intend to carry the even-handed treatment of ~ further still, and try to understand the encodings of → and −. I also intend to write about the modular, composable representation of n-input, n-output computations I’ve come up with based on the above findings. That will have to wait.

For now, I mostly just wanted to share how wonderful it is what sort of things we can discover as we follow the road back and forth across the Curry-Howard correspondence.

Sequent Calculi and Metacircularity

2021-07-16T16:06:12Z

Sequent calculi are powerful and flexible tools for studying logic and, via Curry-Howard, computation. But why, and how? Where does this power come from?

We enjoy a variety of idioms to describe the relationship between problems and solutions. For example: “use the right tool for the job,” and “a good impedance match.” Where sequent calculi offer a good impedance match, it may in part be because of how they model the logical primitives and principles we build atop them.

A sequent is, in general, a pair of contexts (collections of propositions), often represented by the variables Γ (the antecedents, or hypotheses; computationally, the inputs) and Δ (the succedents, or consequents; computationally, the outputs), separated by the ⊢ symbol (called “the turnstile”, but pronounced “proves,” “entails,” etc.; computationally, “produces,” “returns,” etc.), where Γ is treated conjunctively and Δ disjunctively. Thus, the general sequent form:

Γ ⊢ Δ

can be read as “all of Γ prove some of Δ,” or in computational terms, “all of Γ produce some of Δ.”

Specific configurations of sequent have more precise interpretations as a consequence of these general rules, and these give us examples of how we use the sequent calculus to build systems for logic and computation.

Starting simply, we have truth:

· ⊢ A

(NB: · on either side of the turnstile means an empty context.) Reading this literally, “nothing proves A,” but “nothing is required to prove A” or “A is provable without any extra information” are clearer. Or, simply, “A is true.” Dually, falsehood:

A ⊢ ·

Literally, “A proves nothing,” or “nothing is derivable from A;” more simply, “A is false.”

These two examples show us the edge cases of the two contexts: since we interpret Γ conjunctively, the empty case is truth; and since we interpret Δ disjunctively, the empty case is falsehood. The corner case, where both are empty, is interesting too:

· ⊢ ·

Interpreting as we did for truth and falsehood, “truth proves falsity”—a contradiction. This is a surprisingly useful tool to have logically, and perhaps even more so computationally, tho that will have to wait for a future post.

Truth and falsehood are the units of conjunction and disjunction, respectively, and since we treat the contexts in such manner it should be no surprise that we get those behaviours directly:

A, B ⊢ ·

Straightforwardly, “A and B are false.” Note the “and” there: conjunction! On the other side of the turnstile:

· ⊢ A, B

“A or B is true.” This is how we get disjunction.

This time, the opposite corner case isn’t much more interesting than the edges it intersects:

A, B ⊢ C, D

“A and B prove C or D” is a perfectly cromulent sequent to have kicking around, but we don’t learn a lot more from its treatment of the contexts. But observe that the turnstile plays a role here as well. It’s not just punctuation, or rather, it is, but the space it punctuates is important. “A proves B” is (more or less) another way of saying “A implies B,” and implication is the last piece missing. (Well, there’s negation, but you can compose that from the pieces we have so far.)

To recap, we’ve seen how to model ⊤ (truth), ⊥ (falsity), ∨ (disjunction), ∧ (conjunction), → (implication), and finally, contradiction. All of that just from examining individual sequents, not even considering how they’re composed together! Furthermore, you can make a case that the rules for how we treat occurrences of variables in sequents directly allow the encoding of universal and existential quantification.
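The computational reading can be sketched directly in Haskell types. This is a simplified direct-style illustration with made-up names (`Truth`, `Falsehood`, `Sequent`, `safeDivide`), assuming pairs for the conjunctive left context, Either for the disjunctive right context, and () and Void for the empty contexts on each side:

```haskell
import Data.Void (Void)

type Truth a = () -> a        -- · ⊢ A: nothing is required to produce an A
type Falsehood a = a -> Void  -- A ⊢ ·: nothing can be produced from an A

-- A, B ⊢ C, D: all of the inputs produce some of the outputs.
type Sequent a b c d = (a, b) -> Either c d

-- A sequent with two inputs and two alternative outputs.
safeDivide :: Sequent Int Int String Int
safeDivide (_, 0) = Left "divide by zero"
safeDivide (x, y) = Right (x `div` y)
```

Both inputs are consumed together (conjunction), and exactly one of the two outputs comes back (disjunction): `safeDivide (7, 2)` is `Right 3`, and `safeDivide (1, 0)` is `Left "divide by zero"`.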

This is perhaps my favourite example of metacircular interpretation; for another example, the ideal language to implement a lisp in turns out to be a lisp, because it’s already got everything a lisp needs.

Environment-Passing Style

2021-07-08T10:08:54Z

Functions of type A → B can be translated into corresponding functions of type ¬B → ¬A in continuation-passing style (CPS), where ¬A is logically negation but computationally a continuation from A.

This widens the view of functions as value transformers, taking values of type A to values of type B, to include an alternative perspective of them as continuation transformers (as noted by Andrzej Filinski in Declarative Continuations and Categorical Duality).

Logic and computation are rife with dualities, leading one to wonder: what’s the dual of CPS?

Logically, ¬A is often regarded as equivalent to, or even synonymous with, A → ⊥. Computationally, ⊥ corresponds under Curry-Howard to the empty type (sometimes written 0; named Void in GHC Haskell). This is a perfectly reasonable choice when the entire program will be under this translation, but we often want only part of the program to undergo translation and thus a type more tractable than Void would be welcome.

Fortunately, it’s entirely acceptable to choose some other type to substitute for ⊥, so long as we’re consistent. Somewhat arbitrarily, we’ll call R the result type. Now our CPS translation yields:

(B → R) → (A → R)

As a quick aside: this representation is just a flip away from being the composition of the ReaderT monad transformer on ContT.

As Filinski notes, functions-as-continuation-transformers map continuations contravariantly from the return type to the argument type. Otherwise put, applying a function f : A -> B to a continuation k : ¬B (an operation which Filinski writes as k ↓ f) yields a continuation of type ¬A. Thinking of continuations as functions to some specially-chosen return type makes this look rather more like composition than application, but Filinski further stresses the importance of understanding continuations as their own kind of thing.

That said, it raises another question: why are continuations written with a type constructor, while values just sort of are? Why the asymmetry?

As noted before, ¬B → ¬A is equivalent to (B → ⊥) → (A → ⊥). One way to approach the question of duality is to dualize the logical propositions. We’re going to use the polarized logic from the previous post, but we won’t worry too much about the polarities, simply acknowledging that some quantity of shifts will be necessary.

Implication dualizes to subtraction: (A → B)^⊥ = A - B. Note that this is dual all by itself—A and B aren’t negated. So one answer could be: the dual of ¬B → ¬A is ¬B - ¬A. This is true; the latter represents precisely the negative space (as it were) around the former. But it’s somewhat unsatisfying just the same; that’s the dual of the function type, not the dual of CPS as a mechanism.

We can also negate the argument and return types, in which case the dual of ¬B → ¬A could be taken to be either B → A (classically) or ¬¬B → ¬¬A. And since A → B is equivalent to ¬B → ¬A, that means that we can further replace ¬¬B → ¬¬A with ¬¬¬A → ¬¬¬B, and then apply triple negation elimination (both classically and intuitionistically valid) to obtain (¬B → ¬A)^⊥ = ¬A → ¬B. Also true; also unsatisfying. What’s missing?
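Triple negation elimination is short enough to write out in full. In this sketch `Not r a` abbreviates a -> r, with r substituted for ⊥ as discussed earlier; the names `dni` and `tne` are chosen here for illustration:

```haskell
type Not r a = a -> r

-- Double negation introduction: A → ¬¬A.
dni :: a -> Not r (Not r a)
dni a k = k a

-- Triple negation elimination: ¬¬¬A → ¬A.
-- Given an A, introduce its double negation and refute that.
tne :: Not r (Not r (Not r a)) -> Not r a
tne k = k . dni
```

Note that nothing classical is needed: `tne` uses only application and composition, which is why the step is intuitionistically valid. The converse direction for double negations, ¬¬A → A, has no such definition for an arbitrary r.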

One of the great benefits of programmer-accessible CPS is the availability of delimited continuations, which are a generalization of regular, no-return continuations to return to one or more chosen places in a computation (typically in an outer scope). They furthermore allow the code there to return back into the inner scope, and thus enable inner scopes to communicate with outer ones—exactly what’s needed for effect handlers. ¬A represents a continuation from A, in that we interpret it as one, but if we want the dual of CPS we need to dualize continuations, too.

As one further bit of inspiration, just as ¬A, the negative negation of a positive proposition A, is equivalent to A → ⊥, ~A, the positive negation of a negative proposition A, is equivalent to 1 - A. The data of a subtraction consists of an argument on the left and a continuation from the proposition on the right—the argument to and continuation from the result of the dual function type, precisely—so using subtractions themselves would be moving the problem around, to some degree. But note that 1 and ⊥ are de Morgan duals; so clearly there’s something here.

Thus, modulo a bit of hand-waving, we arrive at:

((B → ⊥) → (A → ⊥))^⊥ = (1 → A) → (1 → B)

And just as we were justified in replacing ⊥ with R, we can now feel justified in replacing 1 with S, yielding:

((B → R) → (A → R))^⊥ = (S → A) → (S → B)

(The justification being that S abstracts 1 just as R abstracts ⊥.)
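Both translations fit in a few lines of Haskell. The following is a minimal sketch with names of my own choosing (`cps`, `eps`, `uncps`, `uneps`), writing r for R and s for S; the first transforms continuations contravariantly, the second transforms environment-dependent values covariantly:

```haskell
-- CPS: A → B becomes (B → R) → (A → R).
cps :: (a -> b) -> (b -> r) -> (a -> r)
cps f k = k . f

-- Environment-passing style: A → B becomes (S → A) → (S → B).
eps :: (a -> b) -> (s -> a) -> (s -> b)
eps f v = f . v

-- Each translation is undone by choosing the abstracted type so that
-- the identity function can be supplied: R := B, or S := A.
uncps :: ((b -> b) -> (a -> b)) -> (a -> b)
uncps t = t id

uneps :: ((a -> a) -> (a -> b)) -> (a -> b)
uneps t = t id
```

For example, `cps (+ 1) show 41` is `"42"` (the continuation `show` runs after the function), while `eps (+ 1) length "abc"` is 4 (the value is first read out of the environment `"abc"`, then transformed). Seen this way, the two styles are precomposition and postcomposition respectively.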

I’m describing this pattern of function as environment-passing style (tho I’m hoping someone has already described this in detail under a rather better name). I don’t have good idioms for its use, no handy APIs, nor best practices, but I am hopeful that it will prove useful in exploring the relationships between profunctor optics and CPS, and in describing a program’s context-awareness à la coeffects, just like continuations have done for effects. Programmable continuations have been marvellously handy; perhaps programmable values will be as well. And the composition(s) of the two styles is especially intriguing given that values and continuations can each wrap the other.

Finally, a vivid analogy keeps popping into my head, of a piece of twine or wire with 1 at one end and ⊥ at the other, and a program’s interpreter either pulling itself along generating values, or pushing itself along satisfying continuations. Good luck, little program.

Duality

2021-06-07T19:49:02Z

The rules for a variety of polarized classical connectives, in a focused sequent calculus presentation to reflect a variety of dualities, and interpreted via Curry-Howard.

Additive

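&⊢₁: from [A] Γ ⊢ Δ infer [A & B] Γ ⊢ Δ

&⊢₂: from [B] Γ ⊢ Δ infer [A & B] Γ ⊢ Δ

⊢&: from Γ ⊢ Δ, A and Γ ⊢ Δ, B infer Γ ⊢ Δ, A & B

& (“with”): negative conjunction ≈ lazy pair

⊕⊢: from A, Γ ⊢ Δ and B, Γ ⊢ Δ infer A ⊕ B, Γ ⊢ Δ

⊢⊕₁: from Γ ⊢ Δ [A] infer Γ ⊢ Δ [A ⊕ B]

⊢⊕₂: from Γ ⊢ Δ [B] infer Γ ⊢ Δ [A ⊕ B]

⊕ (“sum”): positive disjunction ≈ either

~(A & B) ≈ ~A ⊕ ~B

¬(A ⊕ B) ≈ ¬A & ¬B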

The additive connectives, & (pronounced “with”) and ⊕ (“sum”) are unfamiliar symbols for familiar connectives: & is conjunction (∧, “and”), while ⊕ is disjunction (∨, “or”).

& is negative, and thus focused on the left (indicated by the square brackets). In a polarized calculus, negative connectives are defined by their left rules, which are rules for how to use them. (Γ contains inputs, so left rules consume things.) & has two left rules, which use it by projecting out the left and right parts, respectively. Its right rule builds it from its components: think pair/tuple literals.

⊕ is positive, and thus focused on the right. Positive connectives are defined by their right rules, which are rules for how to make them. (Δ contains outputs, so right rules produce things.) Its left rule consumes the sum by handling each alternative separately; think case expressions over an Either.
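Read through Curry-Howard, these rules are familiar Haskell. In the sketch below (my own illustrative names, with a single input type g standing in for the whole context Γ), the left rules for & are the projections fst and snd, its right rule builds a pair from two derivations sharing the same inputs, and the left rule for ⊕ is case analysis:

```haskell
-- ⊢&: build a pair from two derivations that share the same inputs.
withIntro :: (g -> a) -> (g -> b) -> g -> (a, b)
withIntro f h x = (f x, h x)

-- ⊕⊢: consume a sum by handling each alternative separately.
sumElim :: (a -> d) -> (b -> d) -> Either a b -> d
sumElim = either
```

The "lazy pair" reading shows up in the fact that `fst (withIntro (+ 1) undefined 1)` evaluates to 2 without ever touching the second component. The right rules for ⊕ are simply the constructors Left and Right.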

The additive units, ⊤ (“top”) and 0, are a bit of a puzzle. Note that neither has a focusing rule: you can’t use a ⊤ once you have it, and you can’t get a 0 at all. On the other hand, both their invertible rules are axioms (i.e. have no premises).

A sequent Γ ⊢ Δ can be read as “all of Γ prove any of Δ;” under that interpretation we can see that ⊤, which corresponds to logical truth, indeed satisfies that claim: once at least one thing on the right is proven, we’re done.

Likewise, the left rule for 0, falsity, is recognizable as the principle of ex falso quodlibet: you can prove anything at all from a contradiction.

no left rule for ⊤

⊢⊤: Γ ⊢ Δ, ⊤ (axiom)

⊤ (“top”): negative truth ≈ unit

0⊢: 0, Γ ⊢ Δ (axiom)

no right rule for 0

0: positive falsity ≈ void

~⊤ ≈ 0

Multiplicative

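⅋⊢: from [A] Γ ⊢ Δ and [B] Γ ⊢ Δ infer [A ⅋ B] Γ ⊢ Δ

⊢⅋: from Γ ⊢ Δ, A, B infer Γ ⊢ Δ, A ⅋ B

⅋ (“par”): negative disjunction ≈ parallel/nondeterministic choice

⊗⊢: from A, B, Γ ⊢ Δ infer A ⊗ B, Γ ⊢ Δ

⊢⊗: from Γ ⊢ Δ [A] and Γ ⊢ Δ [B] infer Γ ⊢ Δ [A ⊗ B]

⊗ (“tensor”): positive conjunction ≈ strict pair

~(A ⅋ B) ≈ ~A ⊗ ~B

¬(A ⊗ B) ≈ ¬A ⅋ ¬B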

In addition to & and ⊕, the additive conjunction and disjunction, ⅋ (“par”) and ⊗ (“tensor”) are multiplicative disjunction and conjunction. The difference is more obvious in a linear logic, but for us the relevant distinction between & and ⊗ is that we identify negative (resp. positive) types like & (resp. ⊗) with call-by-name (resp. call-by-value).

⅋, however, is stranger. It’s a disjunction, eliminated by using both halves (just like ⊕), but introduced using two things. Linear logic again offers one perspective, but another reading of Γ ⊢ Δ (via Curry-Howard) as “consume Γ to produce Δ” is salient: parallelism (hence “par”), or nondeterminism. The latter explains why ⅋ is disjunctive despite (at least notionally) containing two things; think of <|>. For Maybe, it selects the leftmost Just, acting as a choice; for [], it instead selects all alternatives, acting as a union.
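The two behaviours of <|> mentioned above are easy to see side by side. In this small example (`firstJust` and `allOf` are just names for the demonstration), both alternatives are supplied in each case, as in the right rule for ⅋, and the choice of Alternative instance decides what "some of them" means:

```haskell
import Control.Applicative ((<|>))

-- For Maybe, <|> acts as a choice: the leftmost Just wins.
firstJust :: Maybe Int
firstJust = Nothing <|> Just 1 <|> Just 2

-- For lists, <|> acts as a union: every alternative is kept.
allOf :: [Int]
allOf = [1, 2] <|> [3]
```

Here `firstJust` is `Just 1` and `allOf` is `[1,2,3]`.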

Alongside multiplicative disjunction and conjunction are ⊥ (“bottom”), a negative falsity, and 1, a positive truth.

⊥, like positive falsity (0), proves anything if you’ve already got one (⊥⊢). Surprisingly, we can introduce one when we can prove the rest of the sequent (⊢⊥). Logically, falsity can’t change Δ’s provability, so this rule is justified.

Dually, we can’t learn anything from vacuous truth, so 1 can be eliminated freely (1⊢). Introducing it is likewise trivial (⊢1), just as for ⊤.

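⊥⊢: [⊥] Γ ⊢ Δ (axiom)

⊢⊥: from Γ ⊢ Δ infer Γ ⊢ Δ, ⊥

⊥ (“bottom”): negative falsity ≈ void

1⊢: from Γ ⊢ Δ infer 1, Γ ⊢ Δ

⊢1: Γ ⊢ Δ [1] (axiom)

1: positive truth ≈ unit

~⊥ ≈ 1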

Implicative

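→⊢: from Γ ⊢ Δ [A] and [B] Γ ⊢ Δ infer [A → B] Γ ⊢ Δ

⊢→: from A, Γ ⊢ Δ, B infer Γ ⊢ Δ, A → B

→: negative implication ≈ function

–⊢: from A, Γ ⊢ Δ, B infer A – B, Γ ⊢ Δ

⊢–: from Γ ⊢ Δ [A] and [B] Γ ⊢ Δ infer Γ ⊢ Δ [A – B]

– (“subtraction”, sometimes ⤙, “co-implication”): positive implication ≈ calling context

~(A → B) ≈ A – B

¬(A – B) ≈ A → B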

Just as implication resembles the turnstile (as is evident in the premise of ⊢→), our computational reading of Γ ⊢ Δ as “consume Γ to produce Δ” indicates that functions do as well.

A – B can be read as “A without B” (both arithmetically and logically), but insight into its computational interpretation can be gained by its relationship to functions.

The premises for →⊢ and ⊢– are the same, as are the ones for ⊢→ and –⊢: duality means “each eliminates the other.” → eliminations are function calls, so – has an argument: the A in the first premise of ⊢–. B is a hypothesis, meaning that the derivation must provide it. Therefore, it also has a continuation from B—and thus, everything you need to call a function: both input, and output destination.

This gives a vivid picture of the nature of duality: a subtraction is the negative space around an implication, and an implication is therefore also the negative space around a subtraction. A complete view of either requires both.

Negating

Like → and –, negations have the interesting behaviour of moving propositions across the turnstile: from antecedents to succedents. They are constructed (⊢¬ and ⊢∼) from a premise with a hypothesis: to make a ¬A or ∼A, we must have a premise which uses an A—rather like the premise of ⊢→. Dually, their elimination (¬⊢ and ∼⊢) requires an A to use as input. Thus, negations have long been given computational meaning as continuations, a representation of “the rest of the program” from some program point.

Intuitionistic logics therefore often encode negation as implication of falsity, i.e. ¬A ≈ A → ⊥. Intuitively, “not A” and “A implies falsity” align. This has the precise shape of a continuation. Further, double-negation like ¬¬A becomes (A → ⊥) → ⊥, which has the shape of continuation-passing style.
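As a rough illustration of that shape, here is a minimal JavaScript sketch. The names dni, addCPS, bind, and run are hypothetical, and since JavaScript has no empty type, ⊥ is only approximated by the convention that nobody relies on what a continuation returns.

```javascript
// A continuation for A ("not A", i.e. A → ⊥) is a function consuming an A.

// Double-negation introduction, A → ¬¬A: given a value, build a
// continuation-passing computation that hands it to its continuation.
const dni = (a) => (k) => k(a);

// A CPS-style addition: instead of returning, it calls its continuation.
const addCPS = (x, y) => (k) => k(x + y);

// Sequencing two CPS computations: run m, feed its result to f, and pass
// the final continuation along.
const bind = (m, f) => (k) => m((a) => f(a)(k));

// Leaving CPS: supply a continuation that records the answer.
const run = (m) => {
  let result;
  m((a) => {
    result = a;
  });
  return result;
};

const program = bind(dni(20), (n) => addCPS(n, 22));
console.log(run(program)); // 42
```

Every computation here has the type-level shape (A → ⊥) → ⊥: it takes “the rest of the program” as an argument and never returns anything of interest itself.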

Our negations are involutive, i.e. (mutual) self-inverses, due both to polarization (flipped from their argument) and focusing behaviour (remaining in the current phase).

[inference rules: ¬⊢, ⊢¬]

¬ (“not”): positive-within-negative negation ≈ lazy continuation

[inference rules: ∼⊢, ⊢∼]

∼ (“negate”): negative-within-positive negation ≈ strict continuation

¬∼A ≈ A

∼¬A ≈ A

Shifts

[inference rules: ↑⊢, ⊢↑]

↑ (“up shift,” call-by-push-value’s F): positive-within-negative shift ≈ return

[inference rules: ↓⊢, ⊢↓]

↓ (“down shift”, call-by-push-value’s U): negative-within-positive shift ≈ thunk

(A → B)^N = ↓(A^N) → B^N

(A → B)^V = ↓(A^V → ↑(B^V))

translations of CBN and CBV function types into polarized calculi

Shifts serve two pragmatic purposes: like negations, they have the opposite polarity of whatever they shift; and they end the focusing phase of proof search and begin the inversion phase.

Up shifts may be unfamiliar in terms of types, but at the term level are something like a return, particularly in monadic code: they embed a value (positive) within a computation (negative). In CBPV (call-by-push-value), this is the left adjoint, F.

Down shifts, meanwhile, may be more familiar as thunks: they wrap a computation (negative) up into a value (positive). In CBPV, this is the right adjoint, U.

The translations of function types from CBN and CBV into polarized calculi (above) model evaluation orders using shifts. The CBN translation treats arguments as thunks around lazy computations, while the CBV translation explicitly wraps the result in a computation.
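The operational difference can be sketched in JavaScript, which is strict and so makes the thunks explicit. This is an informal analogy rather than a faithful model of CBPV; thunk, ret, firstV, and firstN are hypothetical names introduced for the example.

```javascript
// Down shift (↓, CBPV's U): wrap a computation up as a value, i.e. a thunk.
const thunk = (computation) => ({ force: computation });
// Up shift (↑, CBPV's F): a computation that does nothing but return a value.
const ret = (value) => () => value;

let evaluations = 0;
const expensive = () => {
  evaluations += 1;
  return 42;
};

// CBV-flavoured: arguments are already values, and the result is
// explicitly wrapped in a computation that the caller must run.
const firstV = (x, _y) => ret(x);

// CBN-flavoured: arguments arrive as thunks and are forced only if used.
const firstN = (x, _y) => x.force();

const v = firstV(1, expensive())(); // expensive() runs before the call
const afterV = evaluations;
const n = firstN(thunk(() => 1), thunk(expensive)); // second thunk never forced
const afterN = evaluations;

console.log(v, afterV, n, afterN); // 1 1 1 1
```

The counter moves only in the CBV call: the CBN version receives its unused argument as a thunk and simply never forces it.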

Quantifiers

The rules for quantification are slightly unusual in this presentation in their use of both side conditions (checking that the variable does not occur free elsewhere within the sequent) and (capture-avoiding) substitution (A{B/X} substitutes the proposition B for the variable X within A, e.g. (X ⊕ Y){Z/X} = Z ⊕ Y).

Quantified propositions—and thus, types—offer yet more dualities: both can be used to hide information, but from different perspectives. ∀ constrains its construction, but is used freely; ∃ constrains its uses, but is constructed freely. Universally-quantified types are used when folding polymorphically recursive data and a variety of encodings using functions. Existentials model abstraction in abstract data types, ML modules, open universes of exceptions, and GADTs.

Here, the identification of negative = computation, positive = value is stretched. If computation in the CBPV sense is “something resident in code pages” then ∀ seems misplaced—why should ∀ be represented at runtime (absent dependent types)? On the other hand, the identification of positive = data, negative = codata fits: ∀ is defined by how it is used, and ∃ by how it is made.

[inference rules: ∀⊢ (premise uses A{B/X}) and ⊢∀ (side condition: X is not free elsewhere in the sequent)]

∀ (“for all”): universal quantification ≈ polymorphism

[inference rules: ∃⊢ (side condition: X is not free elsewhere in the sequent) and ⊢∃ (premise uses A{B/X})]

∃ (“there exists”): existential quantification ≈ abstract data type

[duality equation relating ∀ and ∃]

Core

init
─────
A ⊢ A

cut
Γ ⊢ Δ, A    A, Γ ⊢ Δ
────────────────────
Γ ⊢ Δ

The init and cut rules are quite common to sequent (and other) calculi. This presentation clarifies a couple of useful properties.

init allows us to begin a proof (or end one, working bottom-up as we often do, from goal to premise) with a proposition given to us. Under Curry-Howard’s interpretation of proofs as programs, we can understand this as the polymorphic identity function, id. (Sequent calculus proofs in particular correspond to higher-order programs, e.g. functions taking functions to functions.)

cut instead allows us to prove a sequent by introducing and immediately eliminating a proposition in its premises. Again taking sequents as analogues of function types, we can think of it as the composition of proofs with compatible consequents and hypotheses.

Furthermore, cut and init form a category, where the morphisms are sequents between contexts, joined by cuts. The Curry-Howard correspondence holds up here, too: id and function composition give a category of functions between types.
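On the programming side of the correspondence, the category laws can at least be spot-checked. The following JavaScript sketch does so on a handful of sample inputs; that is evidence rather than proof, and the helper names (compose, agree, and the sample functions) are mine.

```javascript
// init ≈ the identity function: from an A, produce that same A.
const id = (a) => a;
// cut ≈ composition: the output of f is consumed as the input of g.
const compose = (g, f) => (a) => g(f(a));

const increment = (n) => n + 1;
const double = (n) => n * 2;
const square = (n) => n * n;

// Compare two functions pointwise on a few sample inputs.
const samples = [0, 1, 5, -3];
const agree = (p, q) => samples.every((n) => p(n) === q(n));

const leftIdentity = agree(compose(id, double), double);
const rightIdentity = agree(compose(double, id), double);
const associativity = agree(
  compose(square, compose(double, increment)),
  compose(compose(square, double), increment)
);

console.log(leftIdentity, rightIdentity, associativity); // true true true
```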

Structural

Structural rules govern our use of the context. Without these, our logic would be classical linear logic (and we would have to adjust the multiplicative and cut rules to split the context between premises). The rules are given their traditional names here, but since we mostly operate bottom-up, starting with the goal given by the syntax of a program, they actually have the exact opposite sense for us.

Top-down, weaken introduces an arbitrary hypothesis or consequent. An extra hypothesis weakens the proof because more information must be provided to prove the same conclusion. An extra conclusion instead weakens what we can learn from it. Contraction, meanwhile, eliminates redundant information from the context.

Bottom-up, weakening instead strengthens a premise, tightening its requirements and guarantees. In reality, however, this mostly amounts to bookkeeping, removing propositions that are in the way of some goal. Contraction, meanwhile, allows us to duplicate information we already know, again primarily as a way of getting one’s ducks in a row for the benefit of proof search. For example, init proves only a minimalistic A ⊢ A, so weakening is often required to whittle sequents down until init applies.

Note that the customary exchange rule is not required as this presentation’s contexts are unordered.

weaken
Γ ⊢ Δ
────────
A, Γ ⊢ Δ

weaken
Γ ⊢ Δ
────────
Γ ⊢ Δ, A

contract
A, A, Γ ⊢ Δ
───────────
A, Γ ⊢ Δ

contract
Γ ⊢ Δ, A, A
───────────
Γ ⊢ Δ, A

All you need is λ, part one: booleans

2020-03-29T20:17:26Z

Nearly a century ago, Alonzo Church invented the simple, elegant, and yet elusive lambda calculus. Together with Alan Turing’s work on his machines, it gave rise to the Church-Turing thesis, and the two models were proved equivalent: anything computable with a Turing machine can also be computed in the lambda calculus. However, nearly as soon as we had digital computers, we started inventing programming languages, and with them a vast treasure of features, beautiful and terrible, many of which seem very hard to relate to the fundamental nature of computability, let alone the lambda calculus specifically.

While it’s true that anything which can be computed, period, can be computed in the lambda calculus, you might not want to: it’s austere, to say the least, and was not designed with modern sensibilities regarding readability in mind. We developed all those languages and features for a reason! Still, Church demonstrated not just that it was possible to compute anything computable with the lambda calculus, but also how one might do so.

In this series, we’ll examine some ways to express common programming language features using the minimalistic tools of the lambda calculus. We begin with perhaps the most ubiquitous type: booleans.

The lambda calculus’s austerity is extreme: you don’t even have booleans. All you have are:

Lambda abstractions;
Applications; and
Variables.

We’ll now review these in some detail; feel free to skip this section if you’re already familiar with the lambda calculus.

Lambda abstractions

Lambda abstractions (“lambdas,” “abstractions,” and “functions” will also be used interchangeably) introduce a function of a single variable.

Abstractions are written λ x . y, for variable x and expression y, where x is now available as a bound variable in the body, and any enclosing definition of x is shadowed (i.e. λ x . λ x . x = λ x . λ y . y ≠ λ x . λ y . x). (We shall assume strictly lexical scoping for the time being.)

In Haskell, we would write \ x -> y instead; in JavaScript, function (x) { return y } or (x) => y.

Applications

Applications (“function application” and “function call” will be used interchangeably) apply the result of the expression on the left to the expression on the right.

Applications are written as x y, for expressions x and y, and left-associated, i.e. a b c = (a b) c ≠ a (b c). Function application binds tighter than lambda abstraction, i.e. λ x . λ y . y x = λ x . λ y . (y x) ≠ λ x . (λ y . y) x.

The syntax is the same in Haskell; in JavaScript, we would write x(y) or a(b, c). Note however that since lambda calculus functions are all single-argument functions, a more direct (though less idiomatic) equivalent for the latter would be a(b)(c).

Variables

Variables are introduced by enclosing lambdas.

Variables are written as more or less arbitrary names, typically alphanumeric (e.g. x or y0 or thing); however, we will feel free to include non-alphanumeric characters in names as we see fit, since the paucity of syntax means there’s little risk of ambiguity.

Since the only available variables are those bound by enclosing lambdas, we can also infer that there are no let bindings for local variables, and no globals of any sort; the lambda calculus doesn’t come with a standard library.

Summary

In quasi-BNF, the grammar for the lambda calculus is extremely minimal:

e := λ x . e | e e | x | (e)

And finally, this table gives a side-by-side comparison of the syntax of the lambda calculus with the corresponding syntax in Haskell & JavaScript:

Syntax of the lambda calculus, Haskell, & JavaScript
	Lambda calculus	Haskell	JavaScript
Abstraction	`λ x . y`	`\ x -> y`	`(x) => y`
Application	`f x`	`f x`	`f(x)`
Variable	`x`	`x`	`x`
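To see the scoping and associativity rules from the previous sections in action, here are a few of the example terms transcribed into curried JavaScript. The names konst, shadow, and flipApply are hypothetical labels; the lambda terms themselves are the ones used above.

```javascript
// λ x . λ y . x : ignores its second argument
const konst = (x) => (y) => x;

// λ x . λ x . x : the inner x shadows the outer one,
// so this behaves like λ x . λ y . y
const shadow = (x) => (x) => x;

// λ x . λ y . y x : the body is the application (y x)
const flipApply = (x) => (y) => y(x);

// Application associates to the left: a b c is (a b) c, written a(b)(c).
console.log(konst(1)(2)); // 1
console.log(shadow(1)(2)); // 2
console.log(flipApply(3)((n) => n + 1)); // 4
```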

Due to the lambda calculus’s terseness, I will be making free use of several notational conveniences:

writing λ x y . z as an abbreviation of λ x . λ y . z.
writing ? to stand for bits we don’t know yet, as though we had an environment supporting holes.
writing definitions as though we had a metalanguage.
referencing definitions elsewhere as though we had globals.
writing type signatures as though we had a type system, and even a typechecker, with as much polymorphism and inference as is convenient at any particular moment.
using syntactic recursion as though it existed.
using general recursion as though it made sense.
ignoring application order, normalization, reduction, substitution, values, references, allocation, copying, space, time, entropy, and any and all other such details whenever I feel like it.

By convention, I will name types in TitleCase and both term and (local) type variables in camelCase.

I will try to avoid pulling rabbits from hats too wantonly, but for now, I’ll ask you to suspend disbelief; I hope to revisit and justify some of these in later posts.

Unconditional λ

Lambdas are the only way to introduce values—they’re the only “literal” syntax in the language. We can therefore infer that the only kinds of runtime values must be closures. In an interpreter for the lambda calculus, closures might consist of the name of the introduced variable, the body of the lambda, & a map relating the names and values of any variables it closed over when constructed (again, we assume strict lexical scoping). There are no bits, bytes, words, pointers, or objects in the language’s semantics; only this runtime representation of lambdas.
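To make that runtime representation tangible, here is a minimal sketch of such an interpreter in JavaScript. Everything about it is an assumption made for illustration: the term constructors Var, Lam, and App, the choice of strict evaluation, and the use of a Map for environments.

```javascript
// Terms: hypothetical constructors for the three syntactic forms.
const Var = (name) => ({ tag: "var", name });
const Lam = (param, body) => ({ tag: "lam", param, body });
const App = (fn, arg) => ({ tag: "app", fn, arg });

// Evaluate a term in an environment (a Map from names to closures).
const evaluate = (term, env = new Map()) => {
  switch (term.tag) {
    case "var": {
      if (!env.has(term.name)) throw new Error(`unbound variable: ${term.name}`);
      return env.get(term.name);
    }
    case "lam":
      // A closure: the parameter name, the body, and the captured environment.
      return { param: term.param, body: term.body, env };
    case "app": {
      const closure = evaluate(term.fn, env);
      const argument = evaluate(term.arg, env);
      // Extend the captured environment, not the caller's: lexical scoping.
      const extended = new Map(closure.env).set(closure.param, argument);
      return evaluate(closure.body, extended);
    }
    default:
      throw new Error(`unknown term: ${term.tag}`);
  }
};

// (λ x . λ y . x) (λ a . a) (λ b . b) evaluates to the closure for λ a . a.
const konst = Lam("x", Lam("y", Var("x")));
const result = evaluate(App(App(konst, Lam("a", Var("a"))), Lam("b", Var("b"))));
console.log(result.param); // a
```

The only values this interpreter ever produces are the closure objects built in the "lam" case, which is the point: there is nothing else to produce.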

Likewise, lambdas are also the only way to introduce variables—there’s no standard library, built-ins, primitives, prelude, or global environment to provide common definitions. We’re truly baking the apple pie from scratch.

All of this raises the question: how do you do anything when you don’t even have true and false? Lambdas and variables don’t do, they merely are, so that leaves application. When all you have is application, everything looks like a lambda abstraction, so we’ll represent booleans using lambdas.

Of course, it’s not just booleans we’re after; true and false aren’t much use without and, or, not, if, and all the rest. To be useful, our representation of booleans should therefore suffice to define these, as well. But how do you define if without using if? In a lazy language like Haskell, we might define if as a function something like so:

if_ :: Bool -> a -> a -> a
if_ cond then_ else_ = if cond then then_ else else_

In a strict language like JavaScript, we’d instead take functions for the alternatives:

function if_(cond, then_, else_) {
  if (cond) {
    return then_();
  } else {
    return else_();
  }
}

Both these definitions use the language’s native booleans and if syntax (a tactic for implementing embedded DSLs known as “meta-circularity”), and thus aren’t viable in the lambda calculus. However, they do give us a hint: in both cases we have a function taking a condition, consequence, and alternative, and using the first to select one of the latter two. In the lambda calculus, we might start by writing:

if = λ cond then else . ?

(Note: there aren’t any keywords in the lambda calculus, so there’s nothing stopping me from naming variables things like if, a fact which I will take free advantage of.)

We’ve introduced a definition for if, as a function of three parameters; now what do we do with them? The lambda calculus’s stark palette makes it easy to enumerate all the things we can do with some variable a:

Ignore it, whether by simply not mentioning it at all (as in λ a . λ b . b), or by shadowing it with another lambda which binds the same name (as in λ a . λ a . a).
Mention it, whether on its own in the body of a lambda (as in λ a . a or λ a . λ b . a), somewhere within either side of an application (as in λ a . λ b . a b or λ a . λ b . b a), or some combination of both (as in λ a . (λ b . a) a).

We could for example simply return then or else:

if = λ cond then else . then
if = λ cond then else . else

But in that case the conditional isn’t conditional at all—the value in no way depends on cond. Clearly the body must make use of all three variables if we want it to behave like the ifs we know and love from other languages.

Taking a step back for a moment, let’s examine the roles of if’s arguments. then and else are passive; we only want to use or evaluate one or the other depending on the value of cond. cond, then, is the key: it takes the active role.

Thus, in the same way that our if_ functions in Haskell & JavaScript employed those languages’ features in their implementations, we’re going to define if cond then else as the application of the condition to the other two parameters:

if = λ cond then else . cond then else

This feels strangely like cheating: surely we’ve only moved the problem around. Now instead of if making the decision about which argument to return, we’ve deferred it to cond. But if and cond aren’t the same, semantically; if takes a boolean and two other arguments and returns one of the latter, while cond is a boolean—albeit evidently a boolean represented as a function. Let’s make that precise by writing down if’s type:

if : Bool -> a -> a -> a

Notwithstanding our use of the yet-to-be-defined name Bool for the type of the condition, this is the same type as we gave if_ in Haskell; that’s a good sign that we’re on the right track! It takes a Bool and two arguments of type a, and it must return one of those because that’s the only way for it to come up with the a that it returns. But what is Bool?

Working backwards from the type and definition of if, we see that cond is applied to two arguments, and therefore must be a function of two parameters. Further, these are both of type a, and the value it returns must also be of type a for if’s type to hold. Thus, we can define the type Bool like so:

Bool = ∀ a . a -> a -> a

I’m making explicit use of the for-all quantifier here to drive home the point that any particular Bool value must be able to be applied to then and else values of any arbitrary type a, defined now or in the future.

By the same token, we could have written if’s type more explicitly as:

if : ∀ a . Bool -> a -> a -> a

Here and in future, local type variables can be assumed to be implicitly generalized in the same manner as Haskell if not otherwise quantified.

If a given Bool is a function of two arguments of arbitrary type, returning the same type, it must therefore select one of its arguments to return. There are only two distinguishable inhabitants of Bool, true and false, so we can therefore deduce that since if defers the selection of the result to the Bool, for true and false to actually differ they must make opposite selections. In other words, true must return the then parameter, while false must return the else one:

true, false : Bool
true  = λ then else . then
false = λ then else . else
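These definitions transcribe almost directly into curried JavaScript. In the sketch below, tru, fls, and if_ are stand-in names (true, false, and if are reserved words in JavaScript), and toNative is a hypothetical helper for observing the result.

```javascript
const tru = (then_) => (else_) => then_; // λ then else . then
const fls = (then_) => (else_) => else_; // λ then else . else

// if = λ cond then else . cond then else
const if_ = (cond) => (then_) => (else_) => cond(then_)(else_);

// Observe an encoded boolean by asking it to choose between native ones.
const toNative = (b) => b(true)(false);

console.log(if_(tru)("yes")("no")); // yes
console.log(if_(fls)("yes")("no")); // no
console.log(tru("yes")("no")); // yes
console.log(toNative(tru), toNative(fls)); // true false
```

Since JavaScript is strict, both alternatives are evaluated before the boolean chooses between them; passing functions for the alternatives, as in the JavaScript if_ earlier, would defer that work.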

We didn’t move the problem around after all; we solved it. What we noticed was a deeper insight: this encoding of booleans makes if redundant, since if we can apply if to a Bool and two arguments, we could equally apply the Bool to those arguments directly.

We chose to define if as applying the Bool to the other arguments in the same order it received them, but we could just as easily have swapped them:

if = λ cond then else . cond else then

In this case, if would be more useful since it would preserve our familiar argument ordering. As an exercise for the reader, consider what other effects this difference would have. What are the tradeoffs, syntactically and semantically? When would one or the other definition be more or less convenient?

It’s frequently convenient to conflate booleans with bits, their minimal representation, but in truth they’re not the same at all. Practically, some programming languages define booleans as a byte in memory, perhaps clamping its values to 0 and 1; others define them as instances of some boolean class, or constructors of an algebraic datatype. Some provide no formal relationship between true and false at all, save for a common interface—duck typing.

Mathematically, booleans are the values in propositional logic; the upper and lower bounds of a lattice; the zero and one of a semiring; the members of the set with cardinality 2; and many other things in many different contexts.

Operationally, booleans represent choice, and this is a pattern that we’ll see repeated: encoding a datatype with lambdas means representing the datatype as functions supporting all of its operations. All operations on booleans can be defined by selecting between two alternatives, which is precisely what our encoding does.

We can demonstrate this by defining some other operations on booleans, e.g. logical operators, using the encoding we’ve built thus far.

not takes a single Bool and returns another:

not : Bool -> Bool
not = λ x . ?

As when defining if, all we can do with a Bool is branch on it:

not = λ x . if x ? ?

But which arguments should we pass if we wish to return a Bool with the opposite value? Recall the definition of Bool from above:

Bool = ∀ a . a -> a -> a

To return a Bool, therefore, each argument must likewise be a Bool. The first argument will be selected if x is true, the second if x is false, so if we want the opposite value from x we can simply apply it to the opposite values in either position:

not = λ x . if x false true

not x will therefore return false if x is true, and true if x is false; equationally:

not true  = false
not false = true

Which is precisely the meaning we intended not to have.

Note that this is not the only way that we could have implemented not.

not’s type is Bool -> Bool, which is equivalent to (∀ a . a -> a -> a) -> ∀ a . a -> a -> a. Thus, we could also define not by taking the extra arguments that the result Bool will be applied to, and using them directly, though in the opposite order:

not = λ x then else . if x else then

Or equivalently, but perhaps slightly more familiar:

not = λ x . λ then else . if x else then

This style of definition can be surprising if you’re not used to so-called “curried functions” as commonly used in e.g. Haskell, but it’s operationally equivalent to the definition developed above. As an exercise, try to work out why that equivalence holds.
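Without giving away the exercise, the claim can at least be checked experimentally. This JavaScript sketch defines both versions over the same encoding and compares them on both inputs; notA and notB are hypothetical names for the two definitions, and tru, fls, if_, and toNative are stand-ins as before.

```javascript
const tru = (then_) => (else_) => then_;
const fls = (then_) => (else_) => else_;
const if_ = (cond) => (then_) => (else_) => cond(then_)(else_);
const toNative = (b) => b(true)(false);

// not = λ x . if x false true
const notA = (x) => if_(x)(fls)(tru);

// not = λ x then else . if x else then
const notB = (x) => (then_) => (else_) => if_(x)(else_)(then_);

console.log(toNative(notA(tru)), toNative(notA(fls))); // false true
console.log(toNative(notB(tru)), toNative(notB(fls))); // false true
```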

or and and are closely related to one another, so we’ll define them simultaneously. Both take two Bools and return a Bool:

or, and : Bool -> Bool -> Bool
or  = λ x y . ?
and = λ x y . ?

As with not, all we can do with Bools is branch:

or  = λ x y . if x ? ?
and = λ x y . if x ? ?

For or, if x is true, we can return true immediately (“short-circuiting”). For and, it’s the opposite:

or  = λ x y . if x true ?
and = λ x y . if x ?    false

If x is false, or needs to test whether y is true; likewise, if x is true, and needs to test whether y is also true. Once more, all we can do with Bools is branch:

or  = λ x y . if x true       (if y ? ?)
and = λ x y . if x (if y ? ?) false

And since we must return a Bool, we can use true and false:

or  = λ x y . if x true              (if y true false)
and = λ x y . if x (if y true false) false

Pleasantly, if y true false (and likewise y true false) is operationally equivalent to y. Using that equivalence, we can simplify these definitions, leaving us with:

or  = λ x y . if x true y
and = λ x y . if x y    false
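As a sanity check on the simplified definitions, here they are in curried JavaScript, printed as a truth table. The names or_ and and_ are mine (chosen to sit alongside if_), and toNative is a hypothetical helper for reading results back out.

```javascript
const tru = (then_) => (else_) => then_;
const fls = (then_) => (else_) => else_;
const if_ = (cond) => (then_) => (else_) => cond(then_)(else_);
const toNative = (b) => b(true)(false);

const or_ = (x) => (y) => if_(x)(tru)(y); // or  = λ x y . if x true y
const and_ = (x) => (y) => if_(x)(y)(fls); // and = λ x y . if x y false

// One row per pair of inputs: [x, y, x or y, x and y], as native booleans.
const table = [tru, fls].flatMap((x) =>
  [tru, fls].map((y) => [x, y, or_(x)(y), and_(x)(y)].map(toNative))
);

console.log(table);
```

Each row should agree with JavaScript’s own || and && applied to the first two columns.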

Conclusion

In this post, we’ve explored defining a ubiquitous programming language feature—booleans—using nothing more than the spartan trappings of the lambda calculus. We’ve emerged with a language which can express not merely functions and their applications, but also fundamental metaphysical concepts such as truth.

In the next post, we’ll look at lambda-encodings of beauty: ML/Haskell-style algebraic datatypes.

Pattern matching over recursive values in Swift

2015-07-01T02:29:44Z

Swift’s value types are almost able to represent algebraic data types. Unfortunately, they fall short of the mark when it comes to recursion, and while Apple has announced that its solution, indirect cases, will ship in a later build of Swift 2, there’s still reason to want a solution today.

The standard solution is to use Box, a function, or some other reference type to manually force an indirection for the recursive cases:

enum Expression {
	case Variable(String)
	case Abstraction(String, Box<Expression>)
	case Application(Box<Expression>, Box<Expression>)
}

Unfortunately, this has a few significant warts:

Clients of the API have to know about Box; it can’t be a private implementation detail. This can in turn lead to ambiguities if APIs aren’t using a common dependency to provide the Box type. Further, they have to box and unbox the values themselves.
Pattern matching cannot be performed recursively in a single step.

Indirect cases will (I believe) resolve both of these issues, but there’s another solution we can apply today which solves both and provides a significant increase in the expressiveness of the type, at the expense of introducing a (useful) intermediary type.

To begin with, note that in Swift 2, it’s no longer necessary to box elements of parameterized types in enum cases. This suggests a straightforward refactoring: replace Expression’s recursive instances with elements of a type parameter:

enum Expression<Recur> {
	case Variable(String)
	case Abstraction(String, Recur)
	case Application(Recur, Recur)
}

Now we’ve got an Expression type that can be instantiated with a given type parameter to recur. But if we try to describe the type of a recursive instance of it, we immediately run into a wall:

let expression: Expression<Expression<Expression<…>>>

It would seem we’ve simply moved the problem from the cases to the type, and can now see more clearly why Swift doesn’t allow cases to recur directly: it amounts to an infinite type. Some indirection is required, somewhere, and by allowing the programmer to place it (whether by explicit boxing or an indirect keyword), the performance implications are somewhat under their control, rather than the compiler’s.

We need some way to tie Expression into a knot (as it were), looping back around into itself, but without requiring us to write out an infinite list of nested type parameters. If we were writing a function instead of a type, we could use the fix function, which computes the least fixed point of a function, to lift a nonrecursive function into a recursive one:

let factorial = fix { recur in
    { n in n > 0 ? n * recur(n - 1) : 1 }
}

Instead of making a recursive function, we make a nonrecursive function taking a function as a parameter, and return an inner function which calls through it in order to recur. fix calls the outer function with a closure which calls back into fix, tying the knot. Is there an analogous fixed point for types? If there were, we would expect it to have the same overall shape: it would apply a type constructor like Expression to a type which itself provides the connection back to Expression.

I’ll let you in on a secret: types are functions, too. Expression<T> is actually a function, abstracted over a parameter T to a concrete instance of Expression with Recur instantiated to T. And it turns out that, like other functions, types also have fixed points.

In Haskell (the inevitable destination of any discussion of fixed points in programming languages), we could write this Fix type of a parameter type f like so:

data Fix f = Fix (f (Fix f))

This is Haskell notation approaching its densest form, so let’s compare it with how fix (the least fixed point of functions) is defined in Swift:

public func fix<A, B>(f: (A -> B) -> A -> B) -> A -> B {
	return { f(fix(f))($0) }
}

The fix function applies f, the function parameter passed to fix, to the result of applying fix recursively to f again. It wraps this up in another closure to avoid infinite looping.
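The same knot-tying can be sketched in JavaScript, which is also strict and so needs the same extra closure. This mirrors the Swift definition above rather than adding anything new.

```javascript
// fix(f) returns a function which, when called, applies f to fix(f) again.
// The wrapping closure delays that recursive call until an argument
// arrives; without it, fix(f) would loop forever.
const fix = (f) => (x) => f(fix(f))(x);

// A nonrecursive function that receives "itself" as the recur parameter.
const factorial = fix((recur) => (n) => (n > 0 ? n * recur(n - 1) : 1));

console.log(factorial(5)); // 120
```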

Analogously, the Fix type applies f, the type parameter passed to Fix, to the result of applying Fix recursively to f again. Haskell is lazily-evaluated, so it doesn’t need to wrap the lot of it up again.

Let’s try writing Fix in Swift. It only has one case, so it can be a struct instead of an enum.

struct Fix {}

Now it needs to have a type parameter, F.

struct Fix<F> {}

So far so good. Now we need to apply F to itself, recursively. But doesn’t that cause the infinite sequence of nested types again? Fix<F<Fix<F<…>>>> is no improvement on Expression<Expression<Expression<…>>>.

Fortunately, Swift allows you to refer to a type without reference to its type parameters in its body:

struct Fix<F> {
	let body: F<Fix>
}

Unfortunately, while Fix is a complete reference inside the body of this type, Swift doesn’t know that F can accept type parameters, and thus rejects this. We can be sneaky and use a protocol with a typealias to work around this:

protocol Fixable {
	typealias Recur
}

struct Fix<F: Fixable> {
	let body: F
}

But now when we add the constraint to tie F into a knot, we run into a new issue: swiftc crashes. (rdar://20000145).

protocol Fixable {
	typealias Recur
}

struct Fix<F: Fixable where F.Recur == Fix> {
	let body: F
}
// => fish: Job 1, 'swift boom.swift' terminated by signal SIGSEGV (Address boundary error)

Fortunately, while Swift can’t express a generic Fix over any arbitrary fixable type, it can express a fixed point of Expression specifically. Let’s call this new type Term. Once again, it’s a struct, and its body holds an Expression instantiated to itself. This one errors out, but it’s clear we’re getting closer:

struct Term {
	let body: Expression<Term>
}
// => error: recursive value type 'Term' is not allowed

Term is recursive because it holds an Expression which in turn holds (in some of its cases) a Recur, which we’ve instantiated to Term. We need to reintroduce an indirection via a reference type like Box or a function.

Haven’t we just moved the problem around again? Well, sort of. Certainly we still need to box the values, but now we can do it in one and only one place—Term—and additionally we can make it private, avoiding exposing our implementation details to our consumers. Our constructor and getter can handle the boxing and unboxing for us:

struct Term {
	init(body: Expression<Term>) {
		boxedBody = Box(body)
	}

	var body: Expression<Term> {
		return boxedBody.value
	}

	private let boxedBody: Box<Expression<Term>>
}

That’s a pretty decent reason to use this approach right now (if you can’t wait for indirect cases). But it only solves one of the problems we mentioned initially; we still can’t pattern match recursively. For example, if we wanted to evaluate application expressions, we would want to write something like this:

switch expression {
case let .Application(.Abstraction(variable, body), argument):
	// substitute argument for variable in body
default:
	// throw an error
}

But because of the Term and Box, neither of which can be matched through, we would have to write this instead:

switch expression {
case let .Application(abstraction, argument):
	switch abstraction.body {
	case let .Abstraction(variable, body):
		// substitute argument for variable in body
	default:
		break
	}
	fallthrough
default:
	// throw an error
}

If we could flatten out the type, we could pattern match. Flattening out the type would put us straight back into the infinite sequence of Expression<…>s; but maybe we can only partially flatten it?

We don’t need to pattern match against arbitrarily-nested terms for this example; we just want to match against a single nested layer. Therefore, we really only need to flatten out a single step of the recursive type. We’d need to apply this for each appearance of Recur in Expression<Recur>, replacing it with Expression<Recur>.

Replacing each instance of a type parameter with an instance of another type parameter sounds like a job for a map function. In Haskell, this function is known as fmap, for functor map, where functors are a kind of mathematical object with some specific shape, and where map preserves this shape. For example, the Array.map method, given some function transform, produces a new array with the same number of elements and in the same order (i.e. preserving the structure of the array), but with each element replaced by applying transform. Array, then, is a functor; and it turns out, so is our Expression tree.

In our case, map should replace the Recur instances with the result of applying some function to them. There are no instances of Recur in Variable cases, so it should just re-wrap the variable name in the resulting type; the Abstraction and Application cases will apply transform:

enum Expression<Recur> {
	…
	func map<Other>(transform: Recur -> Other) -> Expression<Other> {
		switch self {
		case let .Variable(x):
			return .Variable(x)
		case let .Abstraction(x, body):
			return .Abstraction(x, transform(body))
		case let .Application(a, b):
			return .Application(transform(a), transform(b))
		}
	}
}
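For readers who find the shape easier to see without the type annotations, here is the same map sketched in JavaScript. Tagged objects stand in for the enum cases, and the constructor and field names are assumptions of the sketch rather than part of the Swift code.

```javascript
// Hypothetical tagged-object encodings of the three enum cases.
const Variable = (name) => ({ tag: "Variable", name });
const Abstraction = (param, body) => ({ tag: "Abstraction", param, body });
const Application = (fn, arg) => ({ tag: "Application", fn, arg });

// Rebuild the same case, applying transform to each recursive position.
const mapExpression = (expression, transform) => {
  switch (expression.tag) {
    case "Variable":
      // No recursive positions: just re-wrap the name.
      return Variable(expression.name);
    case "Abstraction":
      return Abstraction(expression.param, transform(expression.body));
    case "Application":
      return Application(transform(expression.fn), transform(expression.arg));
    default:
      throw new Error(`unknown case: ${expression.tag}`);
  }
};

// The recursive positions here hold strings; map them to their lengths.
const sizes = mapExpression(Application("f", "xyz"), (s) => s.length);
console.log(sizes); // { tag: 'Application', fn: 1, arg: 3 }
```

The case and the number of children are untouched; only what sits in the recursive positions changes, which is what “preserving the structure” amounts to.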

We can use this to implement recursion schemes, improving our confidence in recursive functions over the type, but for now we’ll limit ourselves to enabling pattern matching. Given an Expression<Recur>, we want to replace each Recur with its recursive instantiation, Expression<Recur>. Otherwise put, we need a function of type Expression<Recur> -> Expression<Expression<Recur>>. Let’s implement this as a method, and call it destructure (since it decomposes the structure of the type):

enum Expression<Recur> {
	…
	func destructure() -> Expression<Expression<Recur>> {
		return map {
			// what do we do here?
		}
	}
}

…but we can’t! To implement a function of type Expression<Recur> -> Expression<Expression<Recur>> using map, we’d need a function of type Recur -> Expression<Recur> to pass to it. There is no useful function that can do this; without knowing a specific (and actually recursive) type for Recur, we have no way to recover the Expression<Recur> that we want to return.

Instead, let’s use a constrained extension to limit ourselves to Expression<Term>. Unfortunately it’s not quite that simple, because Swift, for reasons beyond my knowledge (rdar://21512469), forbids the obvious thing:

extension Expression where Recur == Term { … }
// => error: same-type requirement makes generic parameter 'Recur' non-generic

We’ll work around this using a protocol, FixpointType:

protocol FixpointType {
	typealias Fixed
}

extension Term: FixpointType {
	typealias Fixed = Expression<Term>
}

Now we can constrain the extension to FixpointType like we want:

extension Expression where Recur : FixpointType, Recur.Fixed == Expression<Recur> {
	func destructure() -> Expression<Expression<Recur>> {
		return map {
			// what do we do here?
		}
	}
}

There are two problems remaining with this implementation:

We still don’t have a way to get an Expression<Recur> from a Recur.
swiftc crashes. (rdar://21328632)

Fortunately, we can resolve the former by adding a property to the protocol:

protocol FixpointType {
	typealias Fixed
	var body: Fixed { get }
}

With that out of the way, we can work around the crash by loosening the constraints slightly: we don’t actually require that Recur.Fixed be recursive, we just need to be able to name it. Now we can give the return type of destructure as Expression<Recur.Fixed>, and implement it in the obvious way, mapping each term to its body:

extension Expression where Recur : FixpointType {
	func destructure() -> Expression<Recur.Fixed> {
		return map { term in term.body }
	}
}

Now we can use destructure to implement evaluation of well-formed .Application expressions, using exactly the pattern matching we wanted in the first place:

switch expression.destructure() {
case let .Application(.Abstraction(variable, body), argument):
	// substitute argument for variable in body
default:
	// throw an error
}
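
For readers on a current toolchain, here is a minimal, self-contained sketch of the same idea in modern Swift (5.x), where associatedtype replaces the protocol typealias used above. It is an approximation under stated assumptions rather than the full listing: Term is declared as a final class purely to keep the recursive definition simple (its definition earlier in the post may differ), and boundVariableOfRedex is a hypothetical helper that only inspects the shape of the expression instead of performing real substitution.

```swift
// Sketch in modern Swift; the post itself targets Swift 2.
enum Expression<Recur> {
	case Variable(String)
	case Abstraction(String, Recur)
	case Application(Recur, Recur)

	// Apply `transform` to every recursive position, leaving the shape intact.
	func map<T>(_ transform: (Recur) -> T) -> Expression<T> {
		switch self {
		case let .Variable(x):
			return .Variable(x)
		case let .Abstraction(x, body):
			return .Abstraction(x, transform(body))
		case let .Application(a, b):
			return .Application(transform(a), transform(b))
		}
	}
}

protocol FixpointType {
	associatedtype Fixed
	var body: Fixed { get }
}

// Assumption: a final class, so the recursive type needs no special layout.
final class Term: FixpointType {
	typealias Fixed = Expression<Term>
	let body: Expression<Term>
	init(_ body: Expression<Term>) { self.body = body }
}

extension Expression where Recur: FixpointType {
	// Unroll one layer: each recursive position becomes its own body.
	func destructure() -> Expression<Recur.Fixed> {
		return map { term in term.body }
	}
}

// Hypothetical helper: the bound variable if `expression` applies an
// abstraction to an argument, nil otherwise.
func boundVariableOfRedex(_ expression: Expression<Term>) -> String? {
	switch expression.destructure() {
	case let .Application(.Abstraction(variable, _), _):
		return variable
	default:
		return nil
	}
}

let identity = Term(.Abstraction("x", Term(.Variable("x"))))
let redex: Expression<Term> = .Application(identity, Term(.Variable("y")))
print(boundVariableOfRedex(redex) ?? "not a redex") // prints "x"
```

The nested pattern works because destructure exposes exactly one extra layer of Expression. Matching deeper than that would mean destructuring again at the inner positions.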

Full code listing.

On the Order of Neptune

2014-04-20T04:20:00Z

Inscribe the orbit of Neptune in a square.

Now, take a pair of integers as x and y coordinates across this square. Their size in bits determines the resolution at which they can measure this square.

An integer of n bits can hold any of 2ⁿ distinct values. A pair of 32-bit integers, therefore, would divide the square into a grid 2³² points on a side.

At 32 bits of resolution, adjacent coordinates, e.g. …0101 and …0110, are about a kilometre apart on our square.

If we double the size of our integers, we now divide the square into a grid 2⁶⁴ points on a side.

At 64 bits of resolution, still covering the entire span of the orbit of Neptune, adjacent coordinates are about 0.24µm apart, or about 1% of the width of an average human hair.
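
The arithmetic behind these figures can be sketched in a few lines of Swift. The span below is an assumption, not a number stated above: about 4.5 × 10¹² m, roughly Neptune's mean distance from the Sun (30 AU), chosen because it reproduces the spacings quoted here. Measuring the full diameter of the orbit instead would double both results. The names assumedSpan and spacing are purely illustrative.

```swift
// Assumed span of the square in metres (about 30 AU); see the note above.
let assumedSpan = 4.5e12

/// Spacing between adjacent coordinates when `bits`-bit integers cover `span` metres.
func spacing(bits: Int, span: Double) -> Double {
	// 2^bits as a Double, built without importing Foundation.
	let steps = Double(sign: .plus, exponent: bits, significand: 1)
	return span / steps
}

let at32 = spacing(bits: 32, span: assumedSpan)        // metres
let at64 = spacing(bits: 64, span: assumedSpan) * 1e6  // micrometres
print("32-bit: \(at32) m, 64-bit: \(at64) µm")
```

Under that assumption the two spacings come out at roughly 1 km and 0.24 µm respectively.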

And famously, populating a 128-bit address space would require us to boil the oceans.
