The Lune type system
8 Sep 2021
At the moment, I’m working on a compiler for a purely functional language called Lune. The parser and code generator are finished, but the type checker is still incomplete. The type system I plan to implement in Lune is very different from that of other functional languages. Most notably, it allows for subtypes. As a result, every expression can be assigned multiple types, with some more general than others.
There are a few built-in types:
numberis the set of real numbers. In practice, these are stored as double-precision floating-point values.
intis the set of integers. It is a subset of
stringis the set of strings. String literals are written with double quotes.
charis the set of single-character strings. It is a subset of
atomis the set of atoms. An atom is any word beginning with a capital letter, e.g.
Hello. (All identifiers must start with a lowercase letter in Lune, so this doesn’t create any ambiguity.) Atoms are not strictly necessary—you could use strings instead—but the use case for atoms is completely different from that of strings, so I think they deserve a separate type.
- There are an infinite number of “literal types” inhabited by a single value. For example,
3is a set containing just the number three,
"owo"is a set containing just the string
Nullis a set containing just the atom
anyis the supertype of all types. Every expression can be given the type
voidis an uninhabited “bottom type”. Since it contains no values, it is a subtype of all types.
There are also a few operators that act on types:
a -> bis the type of a function with domain
b. For example, the
roundfunction would have the type
number -> int.
a | bdenotes the union of sets
a, bdenotes the set of ordered pairs of elements of
aand elements of
Universal quantification is denoted with square brackets. For example, the type of the identity function is
[a] a -> a. (You could also annotate it as
any -> any, but in that case, you’d hardly be able to do anything with the returned value, because its type would be way too general.) I’m probably going to disallow higher-ranked types for the sake of simplicity.
Type synonyms are created with the
type keyword. For example,
type bit = 0 | 1 assigns the synonym
bit to the set with two elements: 0 and 1. You can also use the
type keyword to create type-level functions which take sets as arguments. For example, a
maybe type could be defined as
type maybe a = Null | Just, a. This means that a value of type
maybe int can be either the atom
Null or the atom
Just combined with an integer. Type synonyms can also be recursive, allowing you to define things such as linked lists and Peano-style natural numbers.
As I said earlier, the existence of subtypes implies that many values can be assigned more than one type. In fact, every value has an infinite number of types associated with it. For example, the string
"x" is a member of the following sets:
"x" | a
char | a
string | a
a can be any type.
I haven’t figured out all the details of the type-checking algorithm yet. Type inference is going to be a major issue, because every function can have a wide variety of possible types. I’m going to have to create an algorithm which infers the “most reasonable” type for every function, but what exactly is the definition of “most reasonable”? I’ll post the solution here once I figure it all out.