Multi-core programming means multiprocessing — having more than one program counter active at the same instant — and multiprocessing means multiprogramming — having more than one thread conceptually active at any given time — so the rise of commodity multi-core CPU chips means that we are all now willy-nilly multiprogramming practitioners.
One of the prettiest design choices made early on by the SML/NJ design team — probably Andrew Appel — was to dispense entirely with the notion of a call stack and instead allocate callframes directly upon the heap.
This idea has a long and mixed history; a number of systems like Smalltalk started out doing this for its elegance and simplicity, but had to give it up for performance reasons.
But SML/NJ had the advantage of having, from the outset, a high-performance multi-generation copying garbage collector (classic Smalltalk relied on simple reference counting). It consequently wound up in a sweet spot: on the one hand, the garbage collector allowed simple and elegant callframe allocation; on the other hand, the demands of on-heap callframe allocation kept the garbage collector implementors on their toes, resulting in no-compromise performance levels which benefit the rest of the system.
From a multiprogramming point of view, the result is that in SML/NJ, and thus ultimately in Mythryl, the fundamental thread-switch primitive, call/cc (call-with-current-continuation), is just as fast as a vanilla function call, because in fact it is just a function call. In most contemporary stack-based systems, by contrast, a thread switch involves an actual switch of stacks, with hundreds of instructions of context save and restore, and consequently takes hundreds of times longer.
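To make the "a thread switch is just a function call" idea concrete, here is a minimal sketch in Python rather than Mythryl or SML. It is an illustration under stated assumptions, not how SML/NJ or Mythryl actually implement threads. Each suspended thread is represented by an ordinary heap-allocated closure (its continuation), so switching threads amounts to saving one closure and calling another. All names (`ready`, `spawn`, `yield_`, `run`, `worker`, `demo`) are hypothetical, and because Python does not eliminate tail calls, a small loop stands in for the direct jump that a heap-frame system would make.

```python
from collections import deque

# Hypothetical names throughout; this is not Mythryl or SML/NJ API.
ready = deque()  # suspended threads, each stored as a zero-argument closure
log = []         # records which thread ran which step, for inspection


def spawn(thunk):
    """Make a new thread runnable by queueing its starting closure."""
    ready.append(thunk)


def yield_(k):
    """Suspend the current thread by queueing its continuation k.

    No stack is copied or swapped: k is just a closure on the heap.
    """
    ready.append(k)


def run():
    """Run queued threads until none remain.

    Assumption: this loop replaces the tail call a heap-frame system
    would use, since Python has no tail-call elimination.
    """
    while ready:
        k = ready.popleft()
        k()  # the "thread switch" is this one ordinary call


def worker(name, n, i=0):
    """Return a closure meaning 'the rest of this worker from step i'."""
    def step():
        if i < n:
            log.append((name, i))
            # Hand control back, passing our continuation to the scheduler.
            yield_(worker(name, n, i + 1))
    return step


def demo():
    """Interleave two workers and return the order in which steps ran."""
    log.clear()
    spawn(worker("a", 2))
    spawn(worker("b", 2))
    run()
    return list(log)
```

Calling `demo()` interleaves the two workers one step at a time. The point of the sketch is structural only: nothing in it saves or restores a machine stack, which is the property the paragraph above attributes to heap-allocated callframes. It says nothing about actual performance.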
The bottom line for today’s programmer: as we head into the era of serious multiprogramming and multiprocessing, the Mythryl programmer enjoys a close to optimally efficient infrastructure on which to build, whereas most other programmers are headed for ticklish performance problems.