UpNext

Overview

A common programming operation is constructing a mapping from some set of keys to some set of corresponding values. We might be mapping file names to file lengths or employee numbers to employee records or program variables to their types. Abstractly, we are constructing a function which is defined by exhaustive enumeration rather than by any concise rule.

In this situation a Perl programmer would automatically reach for a hashtable.

A Mythryl programmer, however, will usually reach for a balanced binary tree.

Niklaus Wirth pointed out some years back that balanced binary trees are rarely the best-performing algorithm by a given measure, but they are usually in the top three or so by any given measure. By contrast, the algorithm which places first by one measure will often place dead last by another.

For example, hashtables have an average access time of O(1) with a very low proportionality constant; they win hands-down by this measure. But their worst case is a disastrous O(N)! You would not want to use a hashtable in software controlling something like an airliner or nuclear reactor; it might appear to work fine for years and then out of the blue stop dead due to an improbable series of hash bucket collisions.

For balanced binary trees, by contrast, the worst case O(log(N), just the same as the best case. Balanced binary trees are a tad slower than hashtables but rock-solid dependable.

Consequently using balanced binary trees is a very safe and sane habit; they will never let you down. Using hashtables, by contrast, is the kind of habit that is likely to get you killed some fine morning when you least expect it.

However, the Mythryl programmer’s fondness for balanced binary trees goes much deeper than just their being a nice safe and sane datastructure.

A pervasive theme in Mythryl programming is avoiding the needless use of side-effects. There is no practical way to update a hashtable without side effects: The entire table would have to be copied at each update, at prohibitive O(N) cost. It is however perfectly practical to update balanced binary trees without side effects: By doing path copying when we update a balanced binary tree we can leave the original tree intact, simply building a new tree to replace it. Sharing common parts between the old and new tree lets us do this quite efficiently, taking only O(log(N)) time and space.


Comments and suggestions to: bugs@mythryl.org

UpNext