An avian carrier's blog – Factor Atom feed

Factor programming language
  1. Accessing serial ports the easy way (2011-12-01)

    Every once in a while, I see people having a hard time accessing a RS232 or USB serial port from Java. There exist several solutions to do this in Java:

    • The Java Communications 3.0 API looks awfully old and unmaintained. It is available for Solaris SPARC, Solaris x86, and Linux x86.

    • RXTX is a mix between Java code and C code accessed through the Java native interface. It is hosted on a CVS server and it looks like the 2.2 release will never go out since it got stuck on version 2.2pre2 released in 2009. The last stable version is 2.1.7 from 2006.

    • PureJavaComm is a drop-in replacement for those two libraries, written in Java and using JNA to interface with the system. It is simpler to setup than RXTX, is hosted on GitHub and is actively maintained.

    However, there exist at least one other solution which does not require the use of any external library. This is what I chose to interface a Scala program with a XBee Pro module through a serial interface to interact with my students devices. I also use it to interface the Factor programming language with the same XBee module.

    Every language has a well-defined and well-maintained sockets library, right? So why not simply use socat, a multipurpose relay which is able to bridge many protocols and interfaces such as, in our case, TCP and a serial port?

    I launch socat as

    % socat TCP-LISTEN:4161,fork,reuseaddr FILE:/dev/ttyUSB0,b57600,raw
    

    and what I immediately get is a TCP server listening onto port 4161 and ready to relay any incoming connection to the /dev/ttyUSB0 serial port with a 57600 baud rate. And not only do I have no more concern about accessing the serial port properly, but also I can access a port located on a remote computer as easily by launching socat there instead of locally.

    But what if I want to spy on the TCP/serial relay to see that I send the right codes to the XBee module? socat offers you a choice of command-line options to dump the data in various formats.

    What does the Scala interface look like? I have a XBee abstract class lacking an InputStream to receive the input from the XBee module and an OutputStream to send the output to it. This class is extended into a concrete class using simply:

    import java.net.Socket
    
    class TCPXBee(host: String, port: Int) extends XBee {
    
      private val socket: Socket = new Socket(host, port)
    
      override protected val inStream = socket.getInputStream
      override protected val outStream = socket.getOutputStream
    
      init()
    
    }
    

    socat makes my life easy. It is probably already packaged for your operating system, go and get it! Oh, and did I mention that it works with IPv6 too?

  2. Positional factoring (2011-01-06)

    Last summer, someone asked on stackoverflow how to factor a positive integer into prime numbers, and give the output on the form of powers at the right position in a sequence. Let us take a simple example: 825 is 3 × 5 × 5 × 11. The first prime numbers are 2, 3, 5, 7, 11. So 825 can be represented uniquely as the sequence (0 1 2 0 1) because it is equal to 20 × 31 × 52 × 70 × 111.

    Like a lot of people there, I could not resist giving an implementation of the requested functionality in one of my favorite programming languages to show how easy it was. In Factor the math.primes and math.primes.factors vocabularies make it a very easy task. Let us describe it step by step.

    The Factor implementation

    group-factors takes a number to factor and returns a sequence of pairs, each containing a prime factor and the number of times this factor appears, starting with the lowest prime factor and ending with the highest one.

    ( scratchpad ) 825 group-factors .
    { { 3 1 } { 5 2 } { 11 1 } }
    

    Next, we need the first prime numbers up-to the highest prime factor (11 in our case). Starting from our list, we can extract the first element of the last pair, which is the highest prime factor, and ask for the primes-numbers up to this number using primes-upto:

    ( scratchpad ) { { 3 1 } { 5 2 } { 11 1 } } last first primes-upto .
    V{ 2 3 5 7 11 }
    

    Then, starting again from the original factors list and this new primes number list, we can look up for each prime its corresponding power, or use 0 if it is not present. Easy, no?

    Here is the complete implementation with the necessary plumbing added:

    USING: assocs kernel math.primes math.primes.factors sequences ;
    : count-factors ( n -- seq )
        group-factors [ last first primes-upto ] keep
        [ at 0 or ] curry map ;
    

    Testing it gives, as expected:

    ( scratchpad ) 825 count-factors .
    V{ 0 1 2 0 1 }
    

    Doing it in J

    Incidentally, while browsing stackoverflow today, I found this question again and realized that no one had given the obvious answer in J: there exists a built-in operator doing exactly the requested operation.

    _ q:825
    0 1 2 0 1
    

    The reverse functionality is easy to use as well:

    _(q:^:_1) 0 1 2 0 1
    825
    

    However, despite the conciseness of the J solution, I will rather use Factor for real work, as I usually expect to be able to decipher the code I wrote the day before.

  3. Send yourself a greetings email from me (2011-01-01)

    Every new year, on January 1st, a good friend of mine sends automated greeting emails at midnight sharp. However, this year, he forgot to set the subject correctly. While the message body appropriately contains "Meilleurs vœux à toutes et à tous pour 2011 !" (Best wishes to all for 2011!), the mail subject reads "Bonne année deux mille dix !" (Happy new year two thousand and ten!).

    Fortunately, Factor will allow me to send similar greeting emails without making the same mistake. It is easy to translate an integer value such as "2011" into a French string ("deux mille onze") by using the math.text.french vocabulary that I wrote in 2009, based on math.text.english from Aaron Schaefer and a lot of complicated French language rules.

    The following code lets me send a mail to a targeted recipient containing the correct year both in the subject and in the body. I hope my friend does not mind that I stole the text from his email.

    USING: accessors arrays calendar combinators io.encodings.utf8
    io.sockets kernel make math.text.french math.parser namespaces smtp ;
    IN: newyear
    
    : subject ( -- string )
        [ "Bonne année " % now year>> number>text % " !" % ] "" make ;
    
    : body ( -- string )
        [
           "Meilleurs vœux à toutes et à tous pour " %
            now year>> number>string %
            " ! Que cette année soit pour\n" %
            "tout le monde riche d'heureuses " %
            "surprises en toutes choses !\n\n" %
            "Amitiés,\nSamuel" %
        ] "" make ;
    
    : new-email ( recipients sender -- email )
        <email>
          [ from<< ] keep
          [ to<< ] keep
          subject >>subject
          body >>body ;
    
    : send-greetings-from-sam ( email-address smtp-host -- )
        25 <inet> smtp-server set
        [
            1array "Samuel Tardieu <sam@rfc1149.net>" new-email
            send-email
        ] with-smtp-connection ;
    

    If you want to try it, you can copy-paste it into a Factor listener and run

    "Your name <your-email-address>" "your-smtp-host"
    send-greetings-from-sam
    

    to receive greetings from me.

    If everything goes well, you should receive something like this in your mailbox:

    From: Samuel Tardieu <sam@rfc1149.net>
    To: Your name <your-email-address>
    Subject: Bonne année deux mille onze !
    Date: Sat, 1 Jan 2011 01:24:49 +0100
    Message-Id: <15391953945985082941-1293841489599214@spyke>
    MIME-Version: 1.0
    Content-Type: text/plain; charset=UTF-8
    
    Meilleurs vœux à toutes et à tous pour 2011 ! Que cette année soit pour
    tout le monde riche d'heureuses surprises en toutes choses !
    
    Amitiés,
    Samuel
    

    Do not hesitate to run this every year. I mean it.

  4. Feed and relative links (2010-12-27)

    Yesterday, the Factor section of this blog was added to Planet Factor. Soon after, Jon Harper noticed that some links in one of my posts were incorrectly directed onto the Planet Factor site.

    In fact, it is perfectly allowed to have relative links in an Atom feed, both in the structured part and in the HTML one. The URL resolution mechanism starts from the feed address, unless it is overriden by one or more xml:base elements in the feed itself, according to the XML Base specification. Unfortunately, as of today the Factor feed parser does not handle relative URLs at all and let them unchanged.

    It needs fixing of course, but since it is unlikely that Factor feed parser is the only parser with such a bug, I tweaked this site feeds generation. Each post goes through the following filter before getting into the Atom feed:

    require 'jekyll'
    require 'rexml/document'
    require 'uri'
    
    module AbsoluteLinks
    
      BASE = URI.parse(Jekyll.configuration({})['url'])
    
      # The complete list should be cite, classid,
      # codebase, data, href, longdesc, src, and usemap
      # but we only use a few of them.
      TOFIX = ['cite', 'href', 'src']
    
      def fix_link(post, attr)
        post.each_element("//[@#{attr}]") { |e|
          origin = e.attributes[attr]
          e.attributes[attr] = BASE.merge(origin)
        }
      end
    
      def absolute_links(input)
        post = REXML::Document.new("<post>#{input}</post>").root
        TOFIX.each {|attr| fix_link(post, attr)}
        post.to_s[6..-8]
      end
    
    end
    

    Only cite, href and src are handled here instead of the whole list given in comments. REXML (Ruby XML library) is slow enough to avoid looking for all the tags. A SAX-based parser may be more appropriate here since it would require only one tree traversal.

    Also, I used an ugly hack to have REXML parse the post content as one element where there are several paragraphs and sections. The content gets encapsulated into a <post/> XML tag which gets brutally removed at the end by a crude string manipulation.

    Now, someone should fix Factor feed parser and let it properly handle relative URLs. This is more complicated than it sounds, as it requires parsing and changing the (possibly semi-valid) HTML content from the feed entries.

  5. Something nice about every language I use (2010-12-09)

    I'll follow Dave Ray and will try to say something nice about a bunch of programming languages I use or have used seriously:

    • Ada – The only language I would trust my life to.

    • C – It gets things done easily in a controlled space when resources are scarce. I use it in many embedded situations, often with FreeRTOS.

    • C++ – Its templating system with specialization beats everything I know. When I worked on Urbi at Gostai, I had a lot of pleasure using it.

    • Erlang – The language to use to develop distributable parallel applications. I wrote many programs for research projects with it.

    • Factor – One of the languages I feel the most comfortable with. I really like the reverse polish notation and the powerful combinators. I use it for many personal and teaching projects.

    • Forth – Forth is one of the languages that I have been liking since the first time I heard about it. Its conciseness, simplicity, grammar and ease of implementation beats almost everything when it comes to size on very small embedded systems. I used it to write a Forth compiler targetting the Microchip PIC16Fxxx microcontrollers family.

    • Haskell – I started using it when I had to send patches for Darcs. I really love monads, and I also love explaining them in class. My window manager configuration is also written in Haskell.

    • J – It is unbeatable if you have RSI and need to type as little characters as possible for a task that can be applied to a whole array. I use it mostly to solve Project Euler problems.

    • Java – Well, everyone knows it so it may be used to explain a simple concept. Is that nice enough?

    • Javascript – Javascript lets us do things in the browser I would not have imagined five years ago. For example, this web page is static but includes Twitter updates and comments, thanks to Javascript. On the server side, I use it within a CouchDB database where I store a whole web application; it dynamically generates iCalendar views for multiple people from data gathered at TVrage.com using their XML API.

    • Python – I can hack anything in a few minutes and still be able to read it later. I wrote a Forth compiler for the Microchip PIC18Fxxx microcontrollers family with it.

    • Ruby – Feels like Python, only more functional and cleaner. I would use it more if I had not been bitten by threading unstability on Sparc64Linux in the past (for the will-spam-for-food.eu.org service we ran with Pierre Beyssac and Thomas Quinot). Ruby helps me run this blog.

    • Scala – There comes a useful, powerful and pleasant to use language targetting the Java virtual machine. I used it to write my HarassMe Android application.

    I probably forgot some languages in the list. However, if I use them, I am sure I can tell something nice about them.

  6. Recursion and while loops in Factor (2007-10-11)

    Tonight, I was willing to create a while construct in Factor taking two items on the stack:

    1. a quotation to execute when the test is true
    2. a test quotation

    For example, I wanted to be able to find the smallest power of two greater than or equal to an arbitrary number (here 34098) by doing:

    1 [ 2 * ] [ dup 34098 < ] while
    

    (of course there are much better ways of doing that, such as 34098 2 * 1 + 2 /i but that's not the point here)

    This would be much easier if we could recurse from within a block (by using a recurse word). Let's start with that:

    SYMBOL: recursive-block
    
    : set-block ( quot -- )
      recursive-block set ;
    
    : recurse ( quot -- )
      recursive-block get call ;
    
    : with-recurse ( quot -- )
      recursive-block get >r
      dup set-block call
      r> set-block ;
    

    A quote passed through with-recurse can use the recurse word and re-execute itself.

    Now that we have recursion, it is easy to implement while using currying:

    : while ( quot quot -- )
      swap [ call recurse ] curry [ slip when ] 2curry
      with-recurse ; inline
    

    Note that inline is used here to give the optimizing compiler a chance to build the complete quotation at compile-time if both quotations given to while are statically known.

    To create a vocabulary recursion containing this code, one can create a file extra/recursion/recursion.factor containing:

    USING: kernel namespaces ;
    IN: recursion
    
    <PRIVATE
    
    SYMBOL: recursive-block
    
    : set-block ( quot -- )
      recursive-block set ;
    
    : init-block ( -- )
      [ "recurse called without with-recurse" throw ] set-block ;
    
    PRIVATE>
    
    : recurse ( quot -- )
      recursive-block get call ;
    
    : with-recurse ( quot -- )
      recursive-block get >r
      dup set-block call
      r> set-block ;
    
    : while ( quot quot -- )
      swap [ call recurse ] curry [ slip when ] 2curry
      with-recurse ; inline
    
    MAIN: init-block
    

    Loading this library can be done by issuing the following command in the listener: "recursion" run.

    That's all folks.

  7. Factor: a stack-based programming language (2007-01-18)

    As you may already know, I'm a big fan of stack-based languages such as Forth, functional languages such as Haskell and reflexive languages such as Smalltalk. You can imagine how happy I was when I discovered Factor a few days ago: it combines all those aspects.

    Today, a friend sent me someone email signature and asked me if I could decipher it:

    01101001001000000110011001110101011000110110101101100101011001000010
    00000111100101101111011101010111001000100000011011010110111101101101
    00001101000010100110000101101110011001000010000001101110011011110111
    01110010000001100110011011110111001000100000011110010110111101110101
    

    As any programmer on the Earth would have, I immediately assumed that those were ASCII codes printed in binary format. I had a Factor interactive shell opened in one of my windows, so I cut and pasted the whole string (surrounded by quotes) and entered:

    8 group [ 2 base> ] map >string print
    

    and the cleartext version appeared instantly. All in all, it took me around 20 seconds to uncover the original text using Factor.

    How does this work? Factor is a stack-based language, meaning that data are put onto a stack and words (equivalent of functions in other languages) use the data on the stack and put results there. Factor is a (dynamically) typed language: complex data can be pushed onto the stack, while untyped languages such as Forth can only push numbers there.

    Writing the string pushes it on the stack. Using

    8 group
    

    takes the string on the top of the stack, considers it as a sequence (a succession of characters), group them by eight and returns an array of strings of length 8. At this point, there is only one element on the stack: an array of eight-characters-long strings.

    Then

    [ 2 base> ]
    

    pushes another element on the top of the stack: a quotation (the equivalent of a lambda expression in functional languages), which is a block containing code. The base> word consumes two elements from the stack, a string S and a number B, and pushes back a number which is the value represented by S in base B. For example, the expression

    "01101001" 2 base>
    

    will let the value 105 on the stack, as 01101001 in binary represents 105 in decimal.

    map
    

    takes a sequence and a quotation. It represents the classical map operation in functional languages: it applies the quotation to every element of the sequence and gathers the results in a new sequence. As a consequence, each eight-characters-long string gets transformed into its decimal representation. At the end, we end up with a single element on the stack, which is an array containing all the ASCII codes of the sentence.

    >string
    

    transforms a sequence of ASCII codes into the corresponding string. Then

    print
    

    is similar to C's puts and prints the string on the standard output while recognizing special sequences such as \r and \n.

    The content of the unencrypted text itself is not important; my point is that Factor is very compact and its stack-oriented nature helps writing concise and clear programs. For example, here is one of the many possible implementations of the reverse functionality: it takes a string from the stack and lets its ASCII representation in binary onto the stack.

    [ 2 >base 8 CHAR: 0 pad-head ] [ append ] map-reduce
    

    I'm sure that you're thrilled to know that "Hello, world!" encodes as

    0100100001100101011011000110110001101111001011000010
    0000011101110110111101110010011011000110010000100001
    

    (edited on 2009-06-13 to use pad-head and map-reduce)