The Mythryl codebase contains a lot of historical artifacts, so some familiarity with its history is helpful in understanding the code.
Mythryl is a fork of the SML/NJ [4] codebase. SML/NJ was the first compiler for SML, and remains the de facto standard reference SML compiler.
To understand its significance, some of the history of the SML language must be given as context:
The original ML language was defined in the late 1970s by Robin Milner as a metalanguage (hence "ML") for the Edinburgh LCF (Logic for Computable Functions) theorem prover.
The SML/NJ compiler was written as a cooperative effort between Bell Labs and Princeton. It began about 1985 as primarily a two-person effort by David MacQueen and Andrew Appel, with MacQueen serving as language expert and Appel as compiler expert. A long succession of PhD students also contributed, and in fact continue to contribute.
Appel is a fan of Fate-Passing Style (FPS, known elsewhere as continuation-passing style), and author of a series of books on compiler implementation using it. (Since he was chief architect of the compiler, these books provide useful insight into the SML/NJ design and implementation philosophy. [8]) Consequently, the initial SML/NJ compiler consisted of a front end from which the current front end is directly descended, and an FPS-based back end with a handcrafted code generator.
In 1990, Standard ML was defined by the publication of The Definition of Standard ML by Robin Milner, Mads Tofte, Robert Harper and David MacQueen. In particular, this incorporated MacQueen’s module system design [6], a huge step forward whose repercussions are still being felt. This slim volume was the first to formally define not only the syntax for a practical programming language, but also its semantics.
A 1991 snapshot of the five-year-old SML/NJ compiler is provided by Appel and MacQueen’s "Standard ML of New Jersey" [7].
Zhong Shao’s 1994 Princeton PhD thesis [5] provides a good snapshot of the SML/NJ compiler as of that year.
About 1992, Yale launched a FLINT ("Functional Language INTermediate code representation"?) project [1] to improve the optimization of functional languages.
The code developed by this initially separate project was later merged into the SML/NJ compiler, providing the lambdacode and anormcode form passes which now sit between the front end and the original FPS optimizer. Essentially, the FLINT-derived code now forms the front half of the Mythryl highcode module, while the original FPS optimizer forms the back half. The separate heritage of these two parts lives on in the form of a lack of integration, coordination and shared nomenclature between them.
Stefan Monnier’s 2003 thesis [3] describes both the tension and the synergy between the FLINT-derived and FPS-based parts of the highcode module.
Also about 1992, the MLRISC project [9] was launched to implement an optimizing, portable, retargetable, language-neutral back end. A snapshot of this project as of 1994 is provided in [10].
The Definition of Standard ML was updated and republished in 1997. The changes were mostly minor and, on the whole, served to simplify the language by removing unproductive elements of the original definition.
About 2000, MLRISC replaced the original SML/NJ compiler back end, although integration between the new back end and the rest of the compiler remains marginal. (This part of the compiler is renamed "lowhalf" in the Mythryl codebase.)
Also about 2000, Bell Labs (by then renamed Lucent and spun off as a separate company) was tanking in the stock market and stopped funding development of SML/NJ. As a result, the principal contributors were forced to seek new positions, and development of the SML/NJ codebase slowed to a glacial crawl for the 2000-2005 period, with no new end-user releases of the compiler at all.