Variants are monads
12 Nov 2021
Recently I’ve been working on a statically typed, pure functional programming language called Lune. Lune’s syntax is similar to ML-family languages such as OCaml and Haskell. But Lune does not have ML-style algebraic data types. Instead, it uses a form of row polymorphism based on these two papers:
A row is a list of label-type pairs. Rows are implemented in Lune with two type constructors: nil :: Row
, and (:=) :: Label -> Type -> Row
. The standard Prelude module also includes a type operator (;)
, defined as (;) f t = f t
. The labels in a row can be any identifier starting with a capital letter. Here are some examples:
nil
X := float; nil
X := float; Y := float; nil
Name := string; Age := int; nil
Name := string; Name := string; Age := int; nil
As you can see in the last example, the labels in a row do not have to be unique. There are very few cases where duplicate labels are useful, but allowing duplicate labels greatly simplifies the type system. Rows also have a special unification rule that allows labels to be swapped. Thus N := int; S := string; nil
is equivalent to S := string; N := int; nil
.
Because rows have the kind Row
rather than the kind Type
, it is impossible to construct a value with a row as its type. There are two ways of transforming rows into normal types: by surrounding them in curly brackets, or by surrounding them in square brackets. Curly brackets are used to define record types, and square brackets are used to define variant types. In this post I will focus on the latter.
Variants represent a choice between multiple types, each with a different label attached to them. For example, suppose there is a function that returns an integer labeled with Success
if it succeeds, or an error message labeled with Error
if it fails. The return type of this function would be [Success := int; Error := string; nil]
.
In Lune, labels are first-class values. This means they can be passed as arguments and returned from functions. The standard Prelude defines a type constructor label :: Label -> Type
. The type of a term-level label x
is label x
. For example, Name
has the type label Name
. This is what makes labels so powerful; unlike strings, they can be differentiated at the type level.
There are three basic operations on variants: injection, embedding, and deconstruction. These are represented in Lune with the functions (^)
, embed
, and match
respectively.
-
The
(^)
operator has the typeany s a r. label s -> a -> [s := a; r]
. For example,Success ^ 5
has the type[Success := int; r]
, wherer
can be any row. Basically, what you are doing when you writeSuccess ^ 5
is labelling the value5
with the labelSuccess
. -
The existence of variants allows for subtypes. For example,
[Message := string; nil]
is a subtype of[Message := string; Number := int; nil]
. Every value of the type[Message := string; nil]
, such asMessage ^ "hello"
andMessage ^ "abc"
, is also a value of the type[Message := string; Number := int; nil]
. However, the Lune type system, like every other ML type system, does not implicitly coerce between values of different monotypes. This means that if you have a functionf
that returns a value of type[Message := string; nil]
, and a functiong
that expects a value of type[Message := string; Number := int; nil]
, you cannot legally composef
withg
. This is where label embedding comes in. The standard Prelude defines a function calledembed
with the typeany s a r. label s -> [r] -> [s := a; r]
. This allows you to explicitly coerce a value of any variant type to any of its supertypes. Instead of writingg (f x)
, which does not type-check, you can writeg (embed Number (f x))
. -
The
match
function allows you to extract a value from a label. It has the typeany s a b r. label s -> (a -> b) -> ([r] -> b) -> [s := a; r] -> b
. For example, suppose you want to convert a value of the type[Success := int; Error := string; nil]
into an integer. If the value isSuccess ^ x
, you want to returnx
, but if the value is an error, you want to return 0. Such a function can be defined withmatch
by writingmatch Success (do x. x) (do x. 0)
. (In Lune, anonymous functions are created with thedo
keyword.) The standard Prelude exports a function calledelse
, which is defined aselse x y = x
. This allows you to make pattern-matching expressions more readable. For instance, the above function could be writtenmatch Success (do x. x) (else 0)
. This is roughly equivalent to the Haskell expression\case Success x -> x; _ -> 0
.
The promotion of labels to first class allows you to define functions that work on any variant type. This is an enormous improvement on other ML languages. For example, you can define a function called case
that does the same thing as match
, but ignores the value inside the label.
val case :: any s a b r. label s -> b -> ([r] -> b) -> [s := a; r] -> b
let case s x = match s (do y. x)
But this is only the tip of the iceberg. It turns out that [s := a; r]
is a monad for any label s
and any row r
. And with first-class labels, it is possible to define a monadic operation that works on all variant types.
val map :: any s a b r. label s -> (a -> b) -> [s := a; r] -> [s := b; r]
let map s f = match s (do x. s ^ f x) (embed s)
val apply :: any s a b r. label s -> [s := (a -> b); r] -> [s := a; r] -> [s := b; r]
let apply s vf vx = vf # match s (do f. map s f vx) (embed s)
val bind :: any s a b r. label s -> (a -> [s := b; r]) -> [s := a; r] -> [s := b; r]
let bind s f = match s f (embed s)
These higher-order functions are equivalent to Haskell’s fmap
, (<*>)
, and (=<<)
, except that they each take a label as a parameter; the label to map the given function over. In Haskell terminology, I have defined a Monad
instance for every single sum type at once. This is not remotely possible in any ML language. In my opinion, Lune’s combination of row polymorphism and first-class labels almost makes up for its lack of typeclasses.