Mythryl Types: Hindley-Milner Type Inference

5.4.13 Mythryl Types: Hindley-Milner Type Inference

The lack of explicit type declarations makes a Mythryl function definition like

    fun next(arg) = arg+1;

look a lot more like a similar declaration in an untyped scripting language like Python or Ruby than it does like the equivalent declaration in C, festooned with type declarations:

    int next (int arg) { return a+1; }

But Mythryl is in fact a strongly typed language — more so than C, in fact — and Mythryl syntax lets us, if we wish, festoon our function declarations with as many types as any C program. The result even looks something like the above C case when we do so:

    fun sum (arg: Int): Int = { a+1; };

Appearances to the contrary, from a typing point of view Mythryl code is in fact much more like C than like untyped scripting languages.

Like the C compiler, the Mythryl compiler statically computes a precise type for every variable, value, function and expression.

Like the C compiler, the Mythryl compiler takes advantage of type information for such things as deciding when to generate floating-point arithmetic instructions and when to generate integer arithmetic instructions.

The critical difference is that C has a very simple, ad hoc type system design dating from the 1960s, whereas Mythryl uses a sophisticated modern type system designed around Hindley-Milner type inference.

Hindley-Milner type inference (also known as Damas-Milner type inference) is based upon the unification operation popularized by Prolog. Consequently writing type declarations in Mythryl is a bit like writing Prolog code; as we shall see subsequently, it is possible to write pages of useful Mythryl code entirely as type definitions.

The more immediately interesting aspect of Mythryl type inference is that the compiler freely propagates type inferences outward through the source code from every source of information.

The easiest way to explore Mythryl typing is to use its interactive mode with result type display turned on:

    linux$ my
    eval:  set_control  "mythryl_parser::show_interactive_result_types" "TRUE";

    eval:  2+2;

    4 : Int

    eval:  fun sum(a,b) = a+b;

    \\ : (Int, Int) -> Int

    eval:  fun swap(a,b) = (b,a);

    \\ : (X, Y) -> (Y, X)

The remainder of this section presumes that you have turned on result type display as shown above.

Suppose for example that you enter

    eval:  fun next (arg) = arg + 1.0;

    \\ : Float -> Float

The Mythryl declaration of the overloaded addition operator in src/lib/core/init/pervasive.pkg declares that it combines two values of the same type to produce another value of the same type:

    overloaded my + :   ((X, X) -> X)

The Mythryl compiler knows that constant 1.0 is of type Float, hence it can deduce that arg must also be a Float, and so must the result of the addition and consequently of the function, so that function next must necessarily take a Float argument and return a Float result, giving it a type of Float -> Float.

If we instead enter

    eval:  fun next(arg) = arg + "1";

    \\ : String -> String

exactly the same chain of reasoning leads the compiler to deduce a type of String -> String for our function.

It is due to the power of this style of type inference that Mythryl code can be written largely without explicit types. The major exception is api definitions. Api definitions represent interfaces to unknown external code so one needs to explicitly specify all types in an .api file. (It is in any event good documentation to do so.)

Another place where type inference often fails is when setting a variable to an empty list:

    result_list = [];

In such cases, the Mythryl compiler often has no idea whether you intend result_list to be a list of integers, floats, strings, or maybe something exotic like complete symbol tables. Consequently, you will often see Mythryl code giving the type explicitly in such cases:

    result_list = ([]: List(String));

As a general rule, if the Mythryl compiler cannot deduce the type of a variable, the human reader of the code probably cannot either, so such type declarations are in any case welcome documention.

The Mythryl compiler constructs a global dependency tree of all api declarations and package definitions and then compiles rootward from the leafs. Consequently, when it compiles a package, it has in hand full type information about all other modules referenced by that package. (Also as a consequence, the Mythryl compiler gets very upset if you have cyclic dependencies between packages: It then has no idea how to get started compiling. This is the “recursive modules” problem, which has received a great deal of attention from researchers.)

Comments and suggestions to: bugs@mythryl.org