The Lune type system
8 Sep 2021
At the moment, I’m working on a compiler for a purely functional language called Lune. The parser and code generator are finished, but the type checker is still incomplete. The type system I plan to implement in Lune is very different from that of other functional languages. Most notably, it allows for subtypes. As a result, every expression can be assigned multiple types, with some more general than others.
There are a few built-in types:
numberis the set of real numbers. In practice, these are stored as double-precision floating-point values.intis the set of integers. It is a subset ofnumber.stringis the set of strings. String literals are written with double quotes.charis the set of single-character strings. It is a subset ofstring.atomis the set of atoms. An atom is any word beginning with a capital letter, e.g.Hello. (All identifiers must start with a lowercase letter in Lune, so this doesn’t create any ambiguity.) Atoms are not strictly necessary—you could use strings instead—but the use case for atoms is completely different from that of strings, so I think they deserve a separate type.- There are an infinite number of “literal types” inhabited by a single value. For example,
3is a set containing just the number three,"owo"is a set containing just the string"owo", andNullis a set containing just the atomNull. anyis the supertype of all types. Every expression can be given the typeany.voidis an uninhabited “bottom type”. Since it contains no values, it is a subtype of all types.
There are also a few operators that act on types:
a -> bis the type of a function with domainaand rangeb. For example, theroundfunction would have the typenumber -> int.a | bdenotes the union of setsaandb.a, bdenotes the set of ordered pairs of elements ofaand elements ofb.
Universal quantification is denoted with square brackets. For example, the type of the identity function is [a] a -> a. (You could also annotate it as any -> any, but in that case, you’d hardly be able to do anything with the returned value, because its type would be way too general.) I’m probably going to disallow higher-ranked types for the sake of simplicity.
Type synonyms are created with the type keyword. For example, type bit = 0 | 1 assigns the synonym bit to the set with two elements: 0 and 1. You can also use the type keyword to create type-level functions which take sets as arguments. For example, a maybe type could be defined as type maybe a = Null | Just, a. This means that a value of type maybe int can be either the atom Null or the atom Just combined with an integer. Type synonyms can also be recursive, allowing you to define things such as linked lists and Peano-style natural numbers.
As I said earlier, the existence of subtypes implies that many values can be assigned more than one type. In fact, every value has an infinite number of types associated with it. For example, the string "x" is a member of the following sets:
"x"charstringany"x" | achar | astring | a
where a can be any type.
I haven’t figured out all the details of the type-checking algorithm yet. Type inference is going to be a major issue, because every function can have a wide variety of possible types. I’m going to have to create an algorithm which infers the “most reasonable” type for every function, but what exactly is the definition of “most reasonable”? I’ll post the solution here once I figure it all out.