-*- outline -*-

--- README file for PicForth $ReleaseVersion: 0.19 $

* What is that?

This program is a Forth compiler for the Microchip PIC 16F87x family.

* Why this project?

I needed to write some code on a PIC to control a digital model railroad
system using the DCC (Digital Command Control) protocol. However, writing
it in assembly is error-prone, and writing it in C is no fun, as compiled C
code typically needs a lot of space.

So I wrote this compiler, not for the purpose of writing a compiler, but as
a tool to write my DCC engine.

* State of the compiler

The compiler does not aim to be ANS Forth compliant. It has quite a few words
already implemented, and I will implement more of them as needed. Of course,
you are welcome to contribute some (see below for license information).

At this time, many standard Forth words are missing. For example, there is
no multiply operation, as I have no use for it at this time and won't spend
time implementing things I don't need (remember, Forth is a tool before
anything else).

* License

The compiler is released at the moment under the GNU General Public
License version 2 (I intend to use the less restrictive BSD license in
the future, but as it is based on gforth, I have to sort out those
issues with gforth copyright holders).

However, the code produced by using this compiler is not tainted by the
GPL at all. You can do whatever you want with it, and I claim absolutely
no right on the input or output of this compiler. I encourage you to use
it for whatever you want.

Note that I would really like people to send me their modifications
(be they bug fixes or new features) so that I can incorporate them in
the next release.

* Why not use Mary?

Mary was a great source of inspiration, and I even kept some of its names.
However, no code has been reused, as the two Forths do not have the same goal.

* Organisation

The stack is indexed by the only indirect register, fsr. The indf register
therefore always accesses the top of stack.

The w register is used as a scratch register. Attempts to use it to cache
the top of stack proved to be inefficient, as a scratch register is often
needed.

* Compiling

The compiler is hosted on gforth, a free software Forth system for Unix
systems. The command line to compile file foo.fs into foo.hex and to get a
usable map into foo.map is:

  gforth picforth.fs -e 'include foo.fs final-dump foo.hex map bye' | \
     sort -o foo.map

Of course, you should automate this in a Makefile, such as the one provided
with the compiler.

If you install the GNU PIC utils (from http://gputils.sourceforge.net/),
then you can disassemble the generated code by using "gpdasm".

* Interactive mode

By executing

  gforth picforth.fs -e 'host picquit'

(or "make interactive" from a Unix shell), you are dropped into an
interactive mode, where you can use the following words to check your code:

  see ( "name" -- )    Disassemble a word
  map ( -- )           Print code memory map
  dis ( -- )           Disassemble the whole code section

* Literals

Hexadecimal literals should be prefixed by a dollar sign "$" to avoid
confusion with existing constants (such as "c" for the carry bit). This is
strongly advised.
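
As an illustration, the sketch below stores the value 0Ch into the tmr0
register (a register name used elsewhere in this document). The comments
describe the assumed reading of each literal:

  $c tmr0 !      \ "$c" is unambiguously the hexadecimal literal 0Ch
  \ Without the prefix, "c" would designate the carry bit constant instead.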

* Default base

The default base is hexadecimal. Do not change it before including libraries
bundled with the compiler, as they do expect hexadecimal mode.

* Stack size

The default stack size is 16. If you use the multitasker included in
"multitasker.fs" (see below), each task gets an additional 8 bytes of
task-specific stack.

* Shifting

"rlf-tos" and "rrf-tos" respectively shift the top-of-stack left or right,
with the carry entering the byte and the outgoing bit entering the carry.

"lshift" and "rshift" used with a constant shift, as well as "2*" and "2/",
leave the last bit shifted out in the carry.

"swapf-tos" will swap the upper and lower nibble of the top-of-stack.

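For instance, a word extracting the upper nibble of a byte could be sketched
as follows (the name "high-nibble" is made up for this example):

  : high-nibble ( b -- n ) swapf-tos $0f and ;   \ swap nibbles, keep the low one
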
* Looping

There exists a "v-for"/"v-next" structure (v stands for variable):

  v-for ( n addr -- )
    Initialize addr content with n.

  v-next ( addr -- )
    Decrement addr content. If content is not zero, jump to v-for location.

Also, the words "begin", "again", "while", "until" and "repeat" are
implemented.
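
For example, assuming that "variable" is available to reserve a byte, the
following sketch calls a word five times. The names "counter" and
"five-times" are chosen for this example only, and "action" stands for any
previously defined word:

  variable counter
  : five-times ( -- ) 5 counter v-for action counter v-next ;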

* Variable

Variables are not automatically initialized to zero, as this would waste
too much code if it is not needed. If you want a variable explicitly
initialized, use "create" and "," such as in:

  create attempts 3 ,

* Tables

A table starts with the "table" word, which takes the top of stack and
executes the first entry for 0, the second for 1, and so on. Control returns
from the current word after executing the requested action. Each entry is
written "tc: word", which executes word.

  table ( n -- )
  tc: ( "name" -- )

Example:

  : print-number ( n -- )   \ n <= 3, print number
    table
      tc: zero tc: one tc: two tc: three
  ;

The table structure does not consume return stack slots.

* Main program

The "main" word indicates that the next address is the main program. For
example:

  main : main-program ( -- )
    (do initialisations)
    (call mainloop)
  ;

* Macros

You can switch to macro mode by using the "macro" word. You get back to
target mode by using the "target" word.

* Interrupts

If you want to use interrupts, use

  include picisr.fs

Two words respectively save and restore the context around interrupt
handling code:

  isr-save ( -- )
  isr-restore-return ( -- )

Also, the word "isr" is provided to indicate that the next address is the
interrupt handler.

For example, you can write an interrupt handler with:

  isr : interrupt-handler ( -- )
    isr-save
       (interrupt handling code here)
    isr-restore-return
  ;

Do not forget that the return stack is only eight levels deep. An interrupt
can occur at any time unless you mask it or clear the GIE bit.

Two convenience words that manipulate GIE are also provided:

  enable-interrupts ( -- )
  disable-interrupts ( -- )

You have to dispatch the interrupts and clear the interrupt bits manually
before you return from the handler.

Versions of these two words that do nothing are provided in the default
compiler. Useful versions are defined when picisr.fs is included.

Because of this, include picisr.fs as early as possible, before other files
and before using enable-interrupts and disable-interrupts. Other included
files may fail to act properly if you don't.
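
Putting the pieces of this section together, a handler that dispatches and
acknowledges a single interrupt source could be sketched as follows. It
counts timer 0 overflows and assumes that the INTCON register is known to
the compiler under the name "intcon" (T0IF is bit 2 of INTCON on the 16F87x)
and that "variable" is available. The names "ticks" and "take-tick" are
chosen for this example:

  include picisr.fs

  variable ticks                \ not initialized, see the "Variable" section

  isr : interrupt-handler ( -- )
    isr-save
    intcon 2 bit-set? if        \ timer 0 overflow pending?
      intcon 2 bit-clr          \ clear the interrupt bit manually
      ticks @ 1+ ticks !
    then
    isr-restore-return
  ;

  \ Decrement the counter from the main program without being interrupted
  \ in the middle of the read-modify-write sequence.
  : take-tick ( -- )
    disable-interrupts ticks @ 1- ticks ! enable-interrupts ;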

* Argument passing

In Forth, argument passing is done on the stack. However, if you want to
transmit the top-of-stack argument in the w register (for example if a word
typically takes a constant which is put on the stack just before calling it),
you can use the defining word "::" instead of ":". All calls will
automatically use this convention.

Note that you cannot use words defined with "::" in a table (see the
"Tables" section).

If you want to return a value in the w register, you can use the word ">w"
which loads the top-of-stack into the w register before every exit point.
After calling a word which returns its result in the w register, you can
call "w>" to put the w register value onto the stack.
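
For example, the following sketch defines a word taking its argument in the
w register and calls it with a constant (both word names are made up for
this example):

  :: load-timer ( n -- ) invert tmr0 ! ;      \ n arrives in w
  : short-wait ( -- ) $20 load-timer ;        \ the call site passes $20 in w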

* Bit manipulation

To ease bit manipulation, the following words are defined for port p:

  and!       ( n p -- )    logical and with n
  /and!      ( n p -- )    logical and with ~n
  /and       ( a b -- c )  logical and of a and ~b
  or!        ( n p -- )    logical or with n
  xor!       ( n p -- )    logical xor with n
  invert!    ( p -- )      invert content
  bit-set    ( p b -- )    set bit b of p (both have to be constants)
  bit-clr    ( p b -- )    clear bit b of p (both have to be constants)
  bit-toggle ( p b -- )    toggle bit b of p (both have to be constants)
  bit-mask   ( p b -- m )  put 1<<b on stack
  bit-set?   ( p b -- m )  put bit-mask (non-zero) on stack if bit b of p is
                           set, zero otherwise
  bit-clr?   ( p b -- f )  true if bit b of p is clear

Six words help designate bit or port pins:

  bit    ( n addr "name" -- )    ( Runtime: -- addr n )
  pin-a  ( n "name" -- )         ( Runtime: -- porta n )
  pin-b  ( n "name" -- )         ( Runtime: -- portb n )
  pin-c  ( n "name" -- )         ( Runtime: -- portc n )
  pin-d  ( n "name" -- )         ( Runtime: -- portd n )
  pin-e  ( n "name" -- )         ( Runtime: -- porte n )

For example, you can create a pin designating an error LED and manipulate
it using:

  3 pin-b error-led                   \ Error LED is on port B3
  : error error-led bit-set ;         \ Signal error
  : no-error error-led bit-clr ;      \ Clear error

To ease reading, the words "high", "low", "high?", "low?" and "toggle"
are aliases for, respectively, "bit-set", "bit-clr", "bit-set?",
"bit-clr?" and "bit-toggle".

You can change the direction of a pin by using ">input" or ">output" after
a pin defined with "pin-x". For example, to set the error LED pin as an
output, use:

  error-led >output
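
As a further sketch, the readable aliases can be combined with a flag bit
kept in a variable. The names "status-led", "button", "flags", "pressed",
"init-pins" and "poll-button" are chosen for this example, and "variable" is
assumed to be available:

  4 pin-b status-led              \ status LED on port B4
  2 pin-a button                  \ push button on port A2
  variable flags
  0 flags bit pressed             \ bit 0 of flags

  : init-pins ( -- ) button >input status-led >output ;
  : poll-button ( -- )
    button low? if pressed high status-led toggle then ;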

* Watchdog timer

The word "clrwdt" is available from Forth to clear the watchdog timer.

* Reading from or writing to EEPROM

By using

  include piceeprom.fs

you have access to new words allowing you to access the PIC EEPROM:

  ee@          ( a -- b )     read the content of a and return it
  ee!          ( b a -- )     write b into a

Also, in any case, you can store data in EEPROM using these words:

  eecreate     ( "name" -- )            similar to create but in EEPROM space
  ee,          ( b -- )                 store byte in EEPROM
  s"           ( <ccc>" -- eaddr n )    store string in EEPROM
  l"           ( <ccc>" -- eaddr n )    store string + character 13 in EEPROM
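
For example, a boot counter kept in EEPROM could be sketched as follows,
assuming that a name defined with "eecreate" leaves its EEPROM address on
the stack, as "create" does for RAM ("boot-count" and "count-boot" are names
chosen for this example):

  include piceeprom.fs

  eecreate boot-count 0 ee,       \ one EEPROM byte, initial value 0
  : count-boot ( -- ) boot-count ee@ 1+ boot-count ee! ;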

* Reading from or writing to flash memory

Two words allow reading from and writing to the flash memory when the file
"picflash.fs" is included with

  include picflash.fs

These words manipulate a 14-bit program memory cell whose 13-bit address
is in EEADRH:EEADR. The data is read from or stored to EEDATH:EEDATA.

  flash-read ( -- )
  flash-write ( -- )

If "picisr.fs" has been included before this file, interrupts will be properly
disabled around flash writes.
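
A minimal sketch of reading the program memory cell at address 0 follows. It
assumes that the address registers are known to the compiler as "eeadrh" and
"eeadr" ("eeadr" appears elsewhere in this document); "read-cell-0" is a
name chosen for this example:

  include picflash.fs

  : read-cell-0 ( -- )
    0 eeadrh ! 0 eeadr !          \ 13-bit address 0000h
    flash-read ;                  \ result is now in EEDATH:EEDATA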

* Map and disassembler code

A map can be generated in interactive mode using the "map" word.

* Multitasking

Two multitaskers have been implemented.

** Priority-based multitasker

A basic priority-based cooperative multitasker allows you to
concurrently run several independent tasks. Each task should execute
in a short time and will be called again next time (the entry point
does not change). This looks like a state machine.

To use this multitasker, use "include priotasker.fs" in your program.

The following words can be used to define tasks (the entry point for the
task is the next defined word):

  task ( prio "name" -- )
                 Define a new task with priority prio. By default, this task
                 will be active. You can use the "start" and "stop" words
                 to control it. Those words can be used from an interrupt
                 handler.

  task-cond ( prio "name" -- )
                 Define a new task with priority prio. By default, this task
                 is inactive. You can enable it by using the "signal" word
                 on it. If you use "signal" N times, then the task will be
                 run exactly N times. "signal" can be used from an interrupt
                 handler.

  task-idle ( -- )
                 Define a new task which will be executed unconditionally
                 when there is nothing else to do. Such a task cannot be
                 stopped.

  task-set ( bit port prio -- )
                 Define a new task with priority prio that will be run when
                 bit bit of port port is set.

  task-clr ( bit port prio -- )
                 Define a new task with priority prio that will be run when
                 bit bit of port port is clear.

Priority 0 is the highest one, while priority 255 corresponds to the lowest
(idle) priority. You should use priorities in the range 0-254 for your own
tasks.

The multitasker is run by using the word "multitasker". This word takes care
of scheduling the highest priority tasks first. It also clears the watchdog
once per round.

The multitasker looks for all tasks of priority 0 ready to execute. If it
finds some, it executes them and starts over. If it doesn't, it looks for
priority 1 tasks ready to execute. If it finds some, it executes them and
starts over. If it doesn't, it moves on to the next priority, and so on up
to priority 255.

Since each word is called each time from the beginning, there is no
need to maintain task-specific stacks, as the stack has to be
considered empty.
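
As an illustration, a program with one conditional task and an idle task
might be laid out as in this sketch. All task, pin and word names are made
up for this example, and the calling convention "alarm signal" (task name
followed by "signal") is an assumption:

  include priotasker.fs

  4 pin-b status-led
  2 pin-a button

  0 task-cond alarm                   \ priority 0, inactive until signalled
  : handle-alarm ( -- ) status-led high ;

  task-idle
  : idle ( -- ) button low? if alarm signal then ;

  main : main-program ( -- )
    button >input status-led >output
    multitasker
  ;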

** Basic cooperative multitasker

The basic cooperative multitasker is much simpler. It allows you to
relinquish the CPU whenever you want, provided that you are not in the
middle of a call (context-switch only occurs during top-level calls).

To use this multitasker, use "include multitasker.fs" at the top of
your program. The following words are defined:

  task ( -- )
      Create a new task with its own data stack. The task entry point
      will be the next defined word.

  yield ( -- )
      Relinquish control so that another task gets a chance to
      execute.

  multitasker ( -- )
      Code for the multitasker program. This word never returns.

This multitasker makes no use of the return stack at all. However,
each task takes four to six program words for initialization and five program
words to resume the task, plus three or four program words per yield
instruction. Context-switching takes at most 18 instruction cycles
(3.6µs max on a 20MHz PIC, 18µs on a 4MHz PIC), and typically
14. Also, the multitasker takes care of clearing the watchdog timer at
each round.

Each task needs 3 bytes in RAM to save its context and 8 bytes for its
data stack.
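
A two-task program could be sketched as follows (all names are made up for
this example). Note that "yield" is called directly from each task entry
word, as context switches only occur during top-level calls:

  include multitasker.fs

  4 pin-b status-led
  2 pin-a button

  task : blinker ( -- ) begin status-led toggle yield again ;

  task : watcher ( -- )
    begin button low? if status-led high then yield again ;

  main : main-program ( -- )
    button >input status-led >output
    multitasker
  ;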

* Libraries

Some libraries can be used to enhance your application:

  - "libnibble.fs": nibble and character conversion
  - "libcmove.fs": implementation of ANS Forth "CMOVE" word

* Optimizations

The following optimizations are implemented:

** Tail recursion

Tail recursion is implemented at "exit" and ";" points.

  : x y z ;

generates the following code for word x:

  call    y
  goto    z

The sequence "recurse exit" also benefits from tail recursion.

** Redundant pop/push are removed

For example, the (particularly stupid and useless)

  dup dup drop

sequence generates

  movf     0x00,w
  decf     0x04,f
  movwf    0x00

which in fact corresponds to a single "dup".

Also, the following sequence

  drop 3

generates

  movlw    0x03
  movwf    0x00

** Most operations use direct-access and literal variants when possible

The following sequence

  9 and

generates

  movlw    0x09
  andwf    0x00,f

Also, combined with the redundant push/pop eliminations, the following code

  dup 9 and if ...

generates

  movf    0x00,w
  andlw   0x09
  btfsc   0x03,2

** Load, store and operations are mixed

The following sequence (with "current" and "next" being variables)

  current @ 1+ 7 and next !

generates

  movf    0x3B,w
  addlw   0x01
  andlw   0x07
  movwf   0x3C

** Condition inversions

Short (one instruction) "if" actions are transformed into reversed
conditions. For example, the following word:

  \ This word clears port a0 if port c2 is high, and sets port b1 in any case.
  : z portc 2 high? if porta 0 low then portb 1 high ;

generates the following code:

  btfsc    0x07,2  ; skip next instruction if port c2 is low
  bcf      0x05,0  ; set port a0 low
  bsf      0x06,1  ; set port b1 high
  return           ; return from word

** Bank switch optimizations

The compiler tries to remove useless bank manipulations. The following word

  :: ee@ ( addr -- n ) eeadr ! eepgd bit-set rd bit-set eedata @ ;

generates:

  bsf      0x03,6     ; select bank 2
  movwf    0x0d       ; write into eeadr (in bank 2)
  bsf      0x03,5     ; select bank 3
  bsf      0x0c,7     ; set bit eepgd of eecon1 (in bank 3)
  bsf      0x0c,0     ; set bit rd of eecon1 (in bank 3)
  bcf      0x03,5     ; select bank 2
  movf     0x0c,w     ; read eedata (in bank 2)
  bcf      0x03,6     ; select bank 0
  decf     0x04,f     ; decrement stack pointer
  movwf    0x00       ; place read value on top of stack
  return

** Operation retarget

If an operation result is stored on the stack then popped into w, the
operation is modified to target w directly.

For example, the following word:

  : timer ( n -- ) invert tmr0 ! ;

generates

  comf     0x00,w
  incf     0x04,f
  movwf    0x01
  return

** Bit test operations

If an "and" operation before a test can be rewritten using a bit test
operation, it will be.

For example, the code:

  checksum @ 1 and if parity-error exit then ...

will be compiled as:

  btfsc    0x33,0
  goto     0x037      ; parity-error
  ...

Using an explicit bit test yields the same result:

  porta 3 high? if exit then

will be compiled as:

  btfsc   0x05,3
  return

** Useless loads removed when testing

Before a test, if the z status bit already holds the right result, no extra
test will be generated.

  9 and dup if 1+ then

will be compiled as:

  movlw    0x09
  andwf    0x00,f
  btfss    0x03,2
  incf     0x00,f

Also, the compiler detects operations which modify neither w nor the
top of stack. For example,

  dup checksum xor! dcc-high !

will be compiled as

  movf    0x00,w
  xorwf   0x6c,f
  incf    0x04,f
  movwf   0x5b

** Increment/decrement and skip if zero used when possible

The following word:

  : action-times ( n -- ) begin action 1- dup while repeat drop ;

will be compiled as:

  call    0x022          ; call action
  decfsz  0x00,f
  goto    0x027          ; jump to "call action" above
  incf    0x04,f
  return

** Values are not normalized when this is not necessary

The word:

  :: x ( n -- ) 3 < if a then ;

generates

  addlw   0xFD
  btfss   0x03,0
  call    a
  return

The "<" test did not cause the value to be normalized to 0 or -1, as it is
not needed.

* Configuration word

The configuration word can be set with the following words:

  set-fosc   ( n -- )       Choose oscillator mode (default: fosc-rc)
     fosc-lp   Low power
     fosc-xt   External oscillator
     fosc-hs   High-speed oscillator
     fosc-rc   RC circuit
  set-wdte   ( flag -- )    Watchdog timer enable (default: true)
  set-/pwrte ( flag -- )    Power-up timer disable (default: true)
  set-boden  ( flag -- )    Brown-out detect enable (default: true)
  set-lvp    ( flag -- )    Low voltage programming (default: true)
  set-cpd    ( flag -- )    EEPROM protection disable (default: true)
  set-wrt    ( flag -- )    FLASH protection disable (default: true)
  set-debug  ( flag -- )    In-circuit debugger disable (default: true)
  set-cp     ( n -- )       Code protection (default: no-cp)
     no-cp     No protection
     full-cp   Full protection
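
For example, a program running from a high-speed oscillator, with the
watchdog timer and low voltage programming disabled, might start with the
following sketch (it assumes that "false" is available as a flag):

  fosc-hs set-fosc
  false set-wdte
  false set-lvp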

* Examples

Some files are included as examples with a Makefile. For example, to build
"booster.hex", run "make booster.hex":

  - booster.fs: code for a booster which handles overload and overheat
    signals. This also serves as an example for the priority-based
    multitasker.

  - generator.fs: code for a DCC signal generator based on serial commands
    (work in progress, not functional yet)

  - silver.fs: code that runs on a silver card (a smartcard with a 16f876
    and a 24c64 serial eeprom)

  - taskexample.fs: example of multitasking code using the basic multitasker

  - controller.fs: another multitasking example, used to control multiple
    peripherals and inputs using a serial link

  - i2cloader.fs: a flash and eeprom loader using an I2C bus to reprogram
    the PIC

* Caveats and limitations

This compiler release suffers from the following known limitations. Note
that most of them (if not all) will disappear in subsequent releases.

** Code space

At this time, the PCLATH register is not used, so the code area is limited
to 2048 (800h) instructions. The compiler will abort if an attempt is made
to set the code pointer outside of this area.

** Memory space

The memory space is limited to only the first bank, from 20h to 7fh (96
bytes). Attempts to write outside of this area will cause the compiler to
abort.

** No interactivity

There is no link between the compiler and the target.

* Credits

I would like to thank the following people:

  - Keith Wootten for his valuable examples of how he uses a Forth-ish
    assembler for the PIC and his inspiration for some control structures

  - Francisco Rodrigo Escobedo Robles for his Mary PIC Forth compiler

  - Herman Tamas for his suggestions for some word names

  - Daniel Serpell for his superoptimizer (a program looking for the shortest
    possible sequences doing a particular job)
