Roman Numeral Fun

In my waiting-for-people-to-leave-so-I-can-get-into-classrooms time at work I wrote a Roman numeral translator in Factor. It’s a bit different from your normal implementation, as Factor’s parser actually does almost all the work:

USING: strings parser kernel words sequences math ;

: NUMERAL: CREATE dup reset-generic dup t "parsing" set-word-prop parse-definition \ parsed add define-compound ; parsing

NUMERAL: I 1 ;
NUMERAL: IV 4 ;
NUMERAL: V 5 ;
NUMERAL: IX 9 ;
NUMERAL: X 10 ;
NUMERAL: XL 40 ;
NUMERAL: L 50 ;
NUMERAL: XC 90 ;
NUMERAL: C 100 ;
NUMERAL: CD 400 ;
NUMERAL: D 500 ;
NUMERAL: CM 900 ;
NUMERAL: M 1000 ;

: separate ( str -- str )
    "" swap [ " " append ] [ add ] interleave ;

: join-special ( str str -- str )
    dup >r split1 [ 1 r> remove-nth swap 3append ] [ r> drop ] if* ;

: merge-specials ( str -- str )
    [ "I V" "I X" "X L" "X C" "C D" "C M" ] [ join-special ] each ;

: convert-numerals ( string -- arr )
    separate merge-specials parse ;

: all-numerals? ( str -- ? )
    [ "IVXLCDM" member? ] all? ;

: roman>number ( roman -- number )
    >upper dup all-numerals? [ convert-numerals sum ] [ drop "Not a roman numeral" ] if ;

Instead of grabbing characters and keeping a running tally, I defined a bunch of parsing words using NUMERAL: to hold the values. I then took the string and split it into individual characters (“XIV” becomes “X I V”). The 4s and 9s are then rejoined (so we get “X IV”). I then simply parse the string, which gives a list of numbers, and sum that up. It’s not perfect, as it allows any pattern of numerals and only rejoins the first occurrence of each pair (“IVIVIVIV” parsed to 22), but it’s good enough.
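For anyone who doesn't read Factor, the same pipeline can be sketched in Python. This is only an illustration, not a port: the `NUMERALS` dictionary stands in for the parsing words, a dictionary lookup stands in for Factor's parser, and the function names are simply borrowed from the Factor words above.

```python
# Illustrative Python sketch of the separate / merge / parse / sum pipeline.
# NUMERALS replaces the NUMERAL: parsing words; this is an assumption made
# for the sketch, not how the Factor version stores the values.
NUMERALS = {
    "I": 1, "IV": 4, "V": 5, "IX": 9, "X": 10, "XL": 40, "L": 50,
    "XC": 90, "C": 100, "CD": 400, "D": 500, "CM": 900, "M": 1000,
}
SPECIALS = ["I V", "I X", "X L", "X C", "C D", "C M"]


def separate(s):
    """Put a space between every character: "XIV" -> "X I V"."""
    return " ".join(s)


def join_special(s, special):
    """Rejoin only the first occurrence of a spaced pair, like split1 does."""
    return s.replace(special, special.replace(" ", ""), 1)


def merge_specials(s):
    for special in SPECIALS:
        s = join_special(s, special)
    return s


def convert_numerals(s):
    """Turn the numeral string into a list of numbers."""
    return [NUMERALS[token] for token in merge_specials(separate(s)).split()]


def roman_to_number(roman):
    roman = roman.upper()
    if not all(ch in "IVXLCDM" for ch in roman):
        return "Not a roman numeral"
    return sum(convert_numerals(roman))


print(roman_to_number("XIV"))       # 14
print(roman_to_number("IVIVIVIV"))  # 22
```

The second call shows where the odd 22 comes from: only the first “I V” is rejoined, so the list is 4 followed by three rounds of 1 and 5.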

Checklist

The code for generating the checklist is written in Factor. Factor is rather outside the mainstream, but it works beautifully for my purposes.

I store the checklist data in a text file that looks like the following:

CHECKLIST:
    FAMILY: Wrens
        SPECIES: Carolina Wren
            STATUS: Fairly common year-round. Can be found almost anywhere. There's too many around to not be breeding, but I have yet to find a nest.
            PB
        END

        SPECIES: House Wren
            STATUS: Regular in spring and summer. Possible almost anywhere. Young at Paine Estate imply breeding.
            CB
        END

        SPECIES: Winter Wren
            STATUS: Probably a rare visitor, possible at all seasons?
            RECORD: M. Emmons, 5/14/1997, ? , 2
            RECORD: R. Haaseth/D. Finch, 2/6/2004, near Prospect Hill, 1
            RECORD: R. Haaseth/D. Finch, 10/2005, ?, 1
            PERSONAL: 11/25/2006, Met State, 1
        END

        SPECIES: Marsh Wren
            STATUS: Likely a regular migrant and possibly a breeder at Waverly Oaks Marsh.
            RECORD: M. Rines, 4/20/2006, Waverly Oaks Marsh, 1
        END
    END

    HYPOTHETICAL: White-winged Crossbill | irruptive species
    HYPOTHETICAL: Hoary Redpoll | irruptive species

    HISTORICAL: Black Vulture | coming soon
    HISTORICAL: Boreal Chickadee | coming soon
    HISTORICAL: Louisiana Waterthrush | coming soon

This is a nice clean structure that is easy to parse. In fact, the code to parse it is

"USE: checklist\n" swap dup file-length swap <file-reader> stream-read append parse call ;

Yep, the checklist is actually code that parses itself.

To do this, I defined a few tuples:

  • TUPLE: checklist confirmed hypotheticals historicals
  • TUPLE: family name species
  • TUPLE: species name status breeding records
  • TUPLE: record observer date place quantity
  • TUPLE: historical name details
  • TUPLE: hypothetical name reason
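For readers more at home in a mainstream language, these tuples map roughly onto Python dataclasses. This is an illustrative sketch rather than the Factor code: the field names follow the tuple slots, while the types, the `None` defaults, and the empty-list defaults are assumptions (the Factor tuples are untyped).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    observer: str
    date: str
    place: str
    quantity: str


@dataclass
class Species:
    name: str
    status: Optional[str] = None
    breeding: Optional[str] = None  # "PB" or "CB" in the data file
    records: List[Record] = field(default_factory=list)


@dataclass
class Family:
    name: str
    species: List[Species] = field(default_factory=list)


@dataclass
class Historical:
    name: str
    details: str


@dataclass
class Hypothetical:
    name: str
    reason: str


@dataclass
class Checklist:
    confirmed: List[Family] = field(default_factory=list)
    hypotheticals: List[Hypothetical] = field(default_factory=list)
    historicals: List[Historical] = field(default_factory=list)


# Building the Marsh Wren entry from the data file by hand.
marsh = Species(
    "Marsh Wren",
    status="Likely a regular migrant and possibly a breeder at Waverly Oaks Marsh.",
)
marsh.records.append(Record("M. Rines", "4/20/2006", "Waverly Oaks Marsh", "1"))
wrens = Family("Wrens", species=[marsh])
checklist = Checklist(confirmed=[wrens])
```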

Each one simply contains the data and stores anything below it in a vector. I then defined a bunch of words that handle the parsing.

  • CHECKLIST: creates a new checklist
  • FAMILY: creates a new family and gives it the name of whatever is on the line
  • SPECIES: does the same for species
  • STATUS: stores the status
  • PB and CB store the breeding
  • RECORD: and PERSONAL: create and store records (PERSONAL: creates a record with observer “me”)
  • END is a generic word that adds the species or family to the family or checklist respectively (I’d like to define it for the checklist as well, but the stack effects don’t match so it doesn’t work)
  • HISTORICAL: and HYPOTHETICAL: do the obvious
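To make the division of labour concrete, here is a rough Python sketch of what the words in the list do, written as an ordinary line-by-line parser with an explicit stack. It is not how the Factor version works (there the file is parsed and called as code), and the dictionary layout, the `parse_checklist` name, and the shortened `SAMPLE` text are all assumptions for illustration.

```python
def parse_checklist(text):
    """Parse the checklist format into nested dicts (illustrative only)."""
    checklist = {"confirmed": [], "hypotheticals": [], "historicals": []}
    stack = [checklist]  # the innermost open object is stack[-1]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "CHECKLIST:":
            continue
        keyword, _, rest = line.partition(" ")
        rest = rest.strip()
        if keyword == "FAMILY:":
            stack.append({"name": rest, "species": []})
        elif keyword == "SPECIES:":
            stack.append({"name": rest, "status": None,
                          "breeding": None, "records": []})
        elif keyword == "STATUS:":
            stack[-1]["status"] = rest
        elif keyword in ("PB", "CB"):
            stack[-1]["breeding"] = keyword
        elif keyword in ("RECORD:", "PERSONAL:"):
            # Naive split: a place name containing a comma would break this.
            fields = [f.strip() for f in rest.split(",")]
            if keyword == "PERSONAL:":
                fields.insert(0, "me")
            observer, date, place, quantity = fields
            stack[-1]["records"].append({"observer": observer, "date": date,
                                         "place": place, "quantity": quantity})
        elif keyword == "END":
            # A species goes into its family, a family into the checklist.
            finished = stack.pop()
            parent = stack[-1]
            key = "species" if "species" in parent else "confirmed"
            parent[key].append(finished)
        elif keyword == "HYPOTHETICAL:":
            name, _, reason = rest.partition("|")
            checklist["hypotheticals"].append(
                {"name": name.strip(), "reason": reason.strip()})
        elif keyword == "HISTORICAL:":
            name, _, details = rest.partition("|")
            checklist["historicals"].append(
                {"name": name.strip(), "details": details.strip()})
        else:
            raise ValueError(f"unknown keyword: {keyword!r}")
    return checklist


SAMPLE = """
CHECKLIST:
    FAMILY: Wrens
        SPECIES: House Wren
            STATUS: Regular in spring and summer.
            CB
        END
        SPECIES: Winter Wren
            PERSONAL: 11/25/2006, Met State, 1
        END
    END
    HYPOTHETICAL: Hoary Redpoll | irruptive species
"""

result = parse_checklist(SAMPLE)
print(result["confirmed"][0]["species"][1]["records"][0]["observer"])  # me
```

As in the Factor version, the checklist itself has no closing END, so it simply stays at the bottom of the stack.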

For the most part, defining those was straightforward. The trickiest part was dealing with status, as I had to be able to call a word for the next object on the stack after reading the status in. I ended up with the ugly

: STATUS: rest-of-line swap ?push \ swap swap ?push \ set-species-status V{ } clone [ push ] keep >quotation swap ?push \ keep swap ?push ; parsing

mess. I’m sure there’s a better way but that took a couple days to get and I haven’t fiddled with it since.
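The underlying problem is that STATUS: runs at parse time, before the species object exists on the stack, so it can only emit code to be run later. The Python sketch below mimics that with a toy stack machine. It assumes the word emits four things in order (the status string, swap, a quotation holding set-species-status, and keep), which is how the definition above appears to read; the helper names and the dictionary used for the species are made up for the sketch.

```python
def push(value):
    """Return an operation that pushes a literal when the program runs."""
    return lambda stack: stack.append(value)


def swap(stack):
    stack[-1], stack[-2] = stack[-2], stack[-1]


def set_species_status(stack):
    # Mirrors a ( status species -- ) setter: consumes both values.
    species = stack.pop()
    status = stack.pop()
    species["status"] = status


def keep(stack):
    # ( x quot -- x ): run quot with x on the stack, then put x back.
    quotation = stack.pop()
    saved = stack[-1]
    quotation(stack)
    stack.append(saved)


def parse_status(rest_of_line):
    """Parse time: build the deferred operations, do not run them yet."""
    return [push(rest_of_line), swap, push(set_species_status), keep]


def run(program, stack):
    """Run time: the species is now on the stack, so the setter can fire."""
    for op in program:
        op(stack)
    return stack


species = {"name": "Carolina Wren", "status": None}
program = parse_status("Fairly common year-round.")
stack = run(program, [species])
print(species["status"])  # Fairly common year-round.
```

After the program runs, the species is still on top of the stack, ready for the next word (PB, RECORD:, END and so on) to use.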

Getting from the checklist tuple to the output was pretty easy, just doing lots of formatting with make and working down.
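Factor's make collects pieces into a new sequence as a quotation runs; the nearest everyday Python equivalent is appending to a list and joining at the end. The sketch below works down a nested checklist that way. The plain-text layout, the dictionary structure, and the function names are all assumptions for illustration, not the format the real checklist is rendered in.

```python
def format_record(record):
    return "    {observer}, {date}, {place} ({quantity})".format(**record)


def format_species(species):
    lines = ["  " + species["name"]]
    if species.get("status"):
        lines.append("    " + species["status"])
    lines.extend(format_record(r) for r in species.get("records", []))
    return lines


def format_family(family):
    lines = [family["name"]]
    for species in family["species"]:
        lines.extend(format_species(species))
    return lines


def format_checklist(checklist):
    """Work down from checklist to family to species to record."""
    lines = []
    for family in checklist["confirmed"]:
        lines.extend(format_family(family))
    return "\n".join(lines)


sample = {
    "confirmed": [{
        "name": "Wrens",
        "species": [{
            "name": "Marsh Wren",
            "status": "Likely a regular migrant and possibly a breeder at Waverly Oaks Marsh.",
            "records": [{"observer": "M. Rines", "date": "4/20/2006",
                         "place": "Waverly Oaks Marsh", "quantity": "1"}],
        }],
    }],
}

print(format_checklist(sample))
```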