Factor: a stack-based programming language

January 18th, 2007 by Samuel Tardieu

As you may already know, I’m a big fan of stack-based languages such as Forth, functional languages such as Haskell and reflexive languages such as Smalltalk. You can imagine how happy I was when I discovered Factor a few days ago: it combines all those aspects.

Today, a friend sent me someone email signature and asked me if I could decipher it:

01101001001000000110011001110101011000110110101101100101011001000010
00000111100101101111011101010111001000100000011011010110111101101101
00001101000010100110000101101110011001000010000001101110011011110111
01110010000001100110011011110111001000100000011110010110111101110101

As any programmer on the Earth would have, I immediately assumed that those were ASCII codes printed in binary format. I had a Factor interactive shell opened in one of my windows, so I cut and pasted the whole string (surrounded by quotes) and entered:

8 group [ 2 base> ] map >string print

and the cleartext version appeared instantly. All in all, it took me around 20 seconds to uncover the original text using Factor.

How does this work? Factor is a stack-based language, meaning that data are put onto a stack and words (equivalent of functions in other languages) use the data on the stack and put results there. Factor is a (dynamically) typed language: complex data can be pushed onto the stack, while untyped languages such as Forth can only push numbers there.

Writing the string pushes it on the stack. Using

8 group

takes the string on the top of the stack, considers it as a sequence (a succession of characters), group them by eight and returns an array of strings of length 8. At this point, there is only one element on the stack: an array of eight-characters-long strings.

Then

[ 2 base> ]

pushes another element on the top of the stack: a quotation (the equivalent of a lambda expression in functional languages), which is a block containing code. The base> word consumes two elements from the stack, a string S and a number B, and pushes back a number which is the value represented by S in base B. For example, the expression

"01101001" 2 base>

will let the value 105 on the stack, as 01101001 in binary represents 105 in decimal.

map

takes a sequence and a quotation. It represents the classical map operation in functional languages: it applies the quotation to every element of the sequence and gathers the results in a new sequence. As a consequence, each eight-characters-long string gets transformed into its decimal representation. At the end, we end up with a single element on the stack, which is an array containing all the ASCII codes of the sentence.

>string

transforms a sequence of ASCII codes into the corresponding string. Then

print

is similar to C’s puts and prints the string on the standard output while recognizing special sequences such as \r and \n.

The content of the unencrypted text itself is not important; my point is that Factor is very compact and its stack-oriented nature helps writing concise and clear programs. For example, here is one of the many possible implementations of the reverse functionality: it takes a string from the stack and lets its ASCII representation in binary onto the stack.

"" [ 2 >base 8 CHAR: 0 pad-left append ] reduce

For example, I’m sure that you’re thrilled to know that “Hello, world!” encodes as

0100100001100101011011000110110001101111001011000010
0000011101110110111101110010011011000110010000100001

8 Responses to “Factor: a stack-based programming language”

  1. Thomas Says:

    In the same family you should not forget Postscript!

    Here’s an example I like of a Postscript program that renders its own source text, using the stack to create a function and pass it as argument to itself for the purpose of displaying it…

    %!PS-Adobe-2.0
    { ($Id: //depot/src/self.ps#5 $) pop
    20 616 moveto /Helvetica-Bold findfont 12 scalefont setfont
    (%!PS-Adobe-2.0) show
    20 600 moveto
    /tshow { currentpoint pop dup 550 ge { neg 20 add -16 rmoveto } { pop }
    ifelse show }
    bind def
    4 dict dup /cdict exch def
    dup 40 (\\\() put
    dup 41 (\\\)) put
    92 (\\\\) put
    /cshow { dup cdict exch known { dup cdict exch get }
    { 1 string dup 0 4 3 roll put } ifelse show } bind def
    /xshow {
    dup type /arraytype eq { ({ ) tshow { xshow } forall (} ) tshow }
    { dup type /nametype eq { dup xcheck not { (/) tshow } if } if
    dup type /stringtype eq { (\() tshow { cshow } forall (\) ) tshow }
    { 20 string cvs tshow ( ) tshow } ifelse } ifelse } def
    xshow
    { dup exec showpage } { xshow } forall } dup exec showpage

  2. Samuel Tardieu Says:

    Thomas: yeah, I like quines too.

  3. Samuel Tardieu Says:

    Thomas: here is a possible Factor quine.

    [ pprint " dup call" print ] dup call
    
  4. Daniel Ehrenberg Says:

    This is a great example. There’s just a little thing about your ascii encoding that could be improved. "" [ 2 >base 8 CHAR: 0 pad-left append ] reduce is O(n^2), because it appends to “” n times. This might be OK for a short string, but for longer strings, it could take a really long time. A more efficient way to do it is using make. I happen to have just written a summary of how to use make here. The (untested) way to rewrite your code using make is: [ [ 2 >base 8 CHAR: 0 pad-left % ] each ] "" make. The way that is implemented, it takes a vector and repeatedly appends on to it in place before converting the result to a string. This way, you don’t have to copy the string so many times.

  5. Samuel Tardieu Says:

    Daniel: I was not comfortable in introducing yet another concept here (make), while I assumed everyone familiar with functional languages would recognize reduce. But thanks for your input, it looks just like my own implementation indeed.

    And by the way I read your post earlier today already :)

  6. Christian Cornelssen Says:

    perl -nle ‘print pack(”B*”,$_)’

    does the same job, but yes, the factor code is elegant.

  7. anonymous Says:

    Now, that’s one nasty sig :)

  8. simon Says:

    >In the same family you should not forget Postscript!

    C’est vrai ça (hey, why English? Don’t we all speak French here? And I’m sure those who don’t will be more than pleased to learn our language, as we did for theirs :-))… J’ai toujours trouvé le postscript fantastique.. Un peu lent, peut-être :-) Mais vraiment intéressant.
    Mais “factor” a l’air épatant. Merci !

    P.S : j’espère que tout va bien !

Leave a Reply


Creative Commons License