
5.2.11  Regular Expressions

Mythryl regular expressions are patterned after those of Perl, the de facto standard. Unlike Perl, Mythryl has no special compiler support for regular expressions. On the downside, this means that Mythryl regular expression syntax is not quite as compact as Perl's. On the upside, since Mythryl regular expression support is implemented entirely in library code, you can easily write or use an alternate regular expression library if you do not like the stock one. In fact, Mythryl ships with several. (Some might say several too many.)

To match a string against a regular expression, use the =~ operator:

    #!/usr/bin/mythryl

    fun has_vowel( string ) = {
         #
         if (string =~ ./(a|e|i|o|u|y)/)  printf "'%s' contains a vowel.\n"          string;
         else                             printf "'%s' does not contain a vowel.\n"  string;
         fi;
    };

    has_vowel("mythryl");
    has_vowel("crwth");

When run, the above prints out

    linux$ ./my-script
    'mythryl' contains a vowel.
    'crwth' does not contain a vowel.
    linux$

Unlike Perl, Mythryl does not hardwire the meaning of the =~ operator. We will cover defining such operators later.

After matching a string against a regular expression, the most frequently used regular expression operation is replacing the matched substrings.

Here is how to replace every substring matching a given regular expression with a constant replacement string:

    linux$ my

    eval:  regex::replace_all ./f.t/ "FAT" "the fat father futzed";
    "the FAT FATher FATzed"
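
The same call can also be made from a script. The following is only a minimal sketch: it assumes that regex::replace_all is available in a script exactly as it is at the interactive prompt, and the function name fatten is made up for illustration:

    #!/usr/bin/mythryl

    # Hypothetical helper: replace every match of ./f.t/ with "FAT" and print the result.
    # Assumption: regex::replace_all may be called here just as at the eval prompt.
    fun fatten( string ) = {
         #
         printf "'%s'\n"  (regex::replace_all ./f.t/ "FAT" string);
    };

    fatten("the fat father futzed");

Going by the interactive result above, this should print the same replaced string, wrapped in single quotes.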

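Important detail: If you need to include a / within a regular expression delimited by slashes, you cannot do so by backslashing it. You must instead double it, or else sidestep the problem by using a plain string constant or a different pair of delimiter characters: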

    fun has_slash( string ) =   string =~ ./\//;      # Do NOT do this!  It will not work!
    fun has_slash( string ) =   string =~ .////;      # Do this instead.
    fun has_slash( string ) =   string =~ "/";        # Or this -- a regex is just a string, so string constants work fine.
    fun has_slash( string ) =   string =~ .</>;       # Or this -- .<foo> is just like ./foo/ except for the delimiter chars.
    fun has_slash( string ) =   string =~ .|/|;       # Or this -- .|foo| is just like ./foo/ except for the delimiter chars.
    fun has_slash( string ) =   string =~ .#/#;       # Or this -- .#foo# is just like ./foo/ except for the delimiter chars.
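
To see one of these forms in use, here is a sketch modeled directly on the has_vowel script above, with the .|/| form substituted for the regular expression. The function name show_slash and the test strings are made up for illustration:

    #!/usr/bin/mythryl

    # Hypothetical helper: same shape as has_vowel, but the regex is written
    # with | delimiters so that the / needs no doubling.
    fun show_slash( string ) = {
         #
         if (string =~ .|/|)  printf "'%s' contains a slash.\n"          string;
         else                 printf "'%s' does not contain a slash.\n"  string;
         fi;
    };

    show_slash("usr/local/bin");
    show_slash("mythryl");

If the delimiter rule works as described, the first call should report a slash and the second should not.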

The above discussion is far from exhausting the topic of regular expressions, but it is enough for the first go-around; we will return to regular expressions later.


Comments and suggestions to: bugs@mythryl.org
