Software - Token Threading

Introduction and 65C02 code

Token threading is a technique of representing code (or code sequences) in a compact form, at the expense of some speed. Bytes are (usually) read sequentially, with each byte representing some sort of action (e.g. multiply) and in some cases followed by data. The tokenized code is often position independent so that it can be loaded at, and executed from, any address. Token threading (or something very similar to it) has been used to implement interpreters for languages such as BASIC.

The 6502 cannot, of course, execute the tokens directly, so it is necessary to write a routine which interprets the token. On a 65C02, the JMP (abs,X) instruction is helpful for this.

INTERPRET
.1 JSR .2
   JMP .1
.2 JSR GET
   ASL
   TAX
   BCS .3 ; if token >= 128, use the second half of the table
   JMP (JUMP_TABLE,X)
.3 JMP (JUMP_TABLE+256,X)

JUMP_TABLE
   DW TOKEN_00_HANDLER
   DW TOKEN_01_HANDLER
;etc.
   DW TOKEN_FE_HANDLER
   DW TOKEN_FF_HANDLER

You do not need to implement all 256 tokens, of course.

One thing to note is the fact that the JSR .2 is the first instruction. (In many interpreters the JSR is right before the JMP (abs,X) rather than at the beginning.) This allows a token handler to return with RTS. (It's often useful for one token handler to JSR to another; e.g. a "calculate offset of 2-D array index" token handler might wish to call a multiply token handler.) It also allows token handlers to "return" with a JMP to INTERPRET.2 (note INTERPRET.2 not INTERPRET) in those cases where a little extra speed is desired.

GET simply gets the next token, advancing the Interpretation Pointer (IP) as it goes. The (zp) addressing mode available on the 65C02 is helpful for this.

GET
   INC IP
   BNE .1
   INC IP+1
.1 LDA (IP)
   RTS

Notice that IP is incremented before the LDA. This is because it is often useful to return with the N and Z flags reflecting the value of the byte just read. If the LDA were before the increment, the INC instruction(s) would overwrite the N and Z flags. (However, an ORA #0 instruction could be placed before the RTS to update the N and Z flags.)

Also worth noting is that you are not limited to 256 tokens. One way to do this is for one of the token handlers (e.g. token $00) to simply get the next byte and treat that as token, giving you 256 more tokens. The token $00 handler would look like INTERPRET, but with it's own, separate, jump table. Token $00 can be thought of as a prefix byte, or an escape byte, in this instance.

Another way is to get the next two bytes and simply jump to that address, like so:

TOKEN_00_HANDLER
   JSR GET ;get hi byte
   PHA
   JSR GET ;get lo byte
   PHA
   RTS ;"jump" to address+1 since we're using RTS rather than JMP

Of course, the most commonly used tokens should be single byte tokens, as these will be faster and take less space.

If 128 or fewer tokens are needed, INTERPRET can be simplified by using only even numbered tokens (which eliminates the ASL and the BCS instructions).

INTERPRET
.1 JSR .2
   JMP .1
.2 JSR GET
   TAX
   JMP (TABLE,X)

It's also possible to gain a little extra speed by inlining GET in INTERPRET rather than calling it via JSR.

NMOS 6502 code

On the NMOS 6502, GET can use (zp),Y addressing, like so:

GET
   LDY #0
   INC IP
   BNE .1
   INC IP+1
.1 LDA (IP),Y
   RTS

Without JMP (abs,X) available, a common way to make a jump table is to use a pair of PHA instructions with an RTS. In this case, the address in the table must be one less than the address to jump to, since that is how RTS works. It's also worthwhile to split the table into two pieces so that the ASL and BCS instructions are not necessary. Also, the Y register is used here, since it was used in GET and that way the X register is unused.

(Assembly syntax note: > is used as the "hi byte of" operator)

INTERPRET
.1 JSR .2
   JMP .1
.2 JSR GET
   TAY
   LDA JUMP_TABLE_HI,Y
   PHA
   LDA JUMP_TABLE_LO,Y
   PHA
   RTS
TABLE_HI
   DB >TOKEN_00_HANDLER-1
   DB >TOKEN_01_HANDLER-1
;etc.
   DB >TOKEN_FE_HANDLER-1
   DB >TOKEN_FF_HANDLER-1
TABLE_LO
   DB TOKEN_00_HANDLER-1
   DB TOKEN_01_HANDLER-1
;etc.
   DB TOKEN_FE_HANDLER-1
   DB TOKEN_FF_HANDLER-1

Self-modifying code

The 6502 implementation can be sped up by using self-modifying code. In this case, the jump table must be aligned on a page boundary. Note the similarity to the 65C02 code.

INTERPRET
.1 JSR .2
   JMP .1
.2 JSR GET
   ASL
   BCS .4
   STA .3+1
.3 JMP (JUMP_TABLE)
.4 STA .5+1
.5 JMP (JUMP_TABLE+256)

If 128 or fewer tokens are needed, then the same sort of optimization that was made for the 65C02 code can be made here also, again by using only even numbered tokens. However, the jump table must still be aligned on a page boundary.

INTERPRET
.1 JSR .2
   JMP .1
.2 JSR GET
   STA .3+1
.3 JMP (JUMP_TABLE)

To avoid using the Y register, GET can also use self-modifying code. Since both of the self-modifying versions of INTERPRET use only the accumulator, this has the benefit of using neither the X nor the Y register.

GET
   INC IP+1
   BNE .1
   INC IP+2
.1
IP = *+1
   LDA $FFFF
   RTS
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License