As you may already know, I’m a big fan of stack-based languages such as Forth, functional languages such as Haskell and reflexive languages such as Smalltalk. You can imagine how happy I was when I discovered Factor a few days ago: it combines all those aspects.
Today, a friend sent me someone email signature and asked me if I could decipher it:
01101001001000000110011001110101011000110110101101100101011001000010 00000111100101101111011101010111001000100000011011010110111101101101 00001101000010100110000101101110011001000010000001101110011011110111 01110010000001100110011011110111001000100000011110010110111101110101
As any programmer on the Earth would have, I immediately assumed that those were ASCII codes printed in binary format. I had a Factor interactive shell opened in one of my windows, so I cut and pasted the whole string (surrounded by quotes) and entered:
8 group [ 2 base> ] map >string print
and the cleartext version appeared instantly. All in all, it took me around 20 seconds to uncover the original text using Factor.
How does this work? Factor is a stack-based language, meaning that data are put onto a stack and words (equivalent of functions in other languages) use the data on the stack and put results there. Factor is a (dynamically) typed language: complex data can be pushed onto the stack, while untyped languages such as Forth can only push numbers there.
Writing the string pushes it on the stack. Using
takes the string on the top of the stack, considers it as a sequence (a succession of characters), group them by eight and returns an array of strings of length 8. At this point, there is only one element on the stack: an array of eight-characters-long strings.
[ 2 base> ]
pushes another element on the top of the stack: a quotation (the equivalent of a lambda expression in functional languages), which is a block containing code. The
base> word consumes two elements from the stack, a string
S and a number
B, and pushes back a number which is the value represented by
S in base
B. For example, the expression
"01101001" 2 base>
will let the value
105 on the stack, as
01101001 in binary represents
105 in decimal.
takes a sequence and a quotation. It represents the classical
map operation in functional languages: it applies the quotation to every element of the sequence and gathers the results in a new sequence. As a consequence, each eight-characters-long string gets transformed into its decimal representation. At the end, we end up with a single element on the stack, which is an array containing all the ASCII codes of the sentence.
transforms a sequence of ASCII codes into the corresponding string. Then
is similar to C’s
puts and prints the string on the standard output while recognizing special sequences such as
The content of the unencrypted text itself is not important; my point is that Factor is very compact and its stack-oriented nature helps writing concise and clear programs. For example, here is one of the many possible implementations of the reverse functionality: it takes a string from the stack and lets its ASCII representation in binary onto the stack.
[ 2 >base 8 CHAR: 0 pad-head ] [ append ] map-reduce
I’m sure that you’re thrilled to know that “Hello, world!” encodes as
(edited on 2009-06-13 to use