Shi
Dictionary

All definitions in Forth must share a certain header which contains

  • a link to the previous or next definition
  • some flags which tell the interpreter how to treat the word and
  • a name for actual look-up
Field      Description                                        Size [b]
Link       Link to previous or next definition                4
Flags      Properties of word (e.g. immediate, inline, ...)   1
Name       Counted string                                     1 length + length chars
Code/data  Code or data                                       user-defined

Shi is what's called a direct threaded Forth, which means that the execution token of a word equals the address of its very first assembly instruction. Since the ARMv7-M architecture has certain alignment restrictions and can only execute code from 2-byte aligned addresses, there might be a padding byte right after the definition's name.
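The effect of the padding byte can be illustrated with a small C sketch. It only restates the header table above (4-byte link, 1-byte flags, 1-byte length, then the name characters). The helper name code_address is hypothetical and not part of Shi, and the header is assumed to start at a 2-byte aligned address.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed part of the header as listed in the table above:
 * 4-byte link + 1-byte flags + 1-byte name length. */
enum { HEADER_FIXED_SIZE = 4 + 1 + 1 };

/* Hypothetical helper (not part of Shi): given the address of a header and
 * the length of the name, return the address of the first instruction. */
static uintptr_t code_address(uintptr_t header, uint8_t name_length)
{
    assert((header & 1u) == 0);  /* assumption: the link itself is 2-byte aligned */
    uintptr_t unaligned = header + HEADER_FIXED_SIZE + name_length;
    return (unaligned + 1u) & ~(uintptr_t)1u;  /* round up to the next even address */
}
```

With this layout a name of odd length (such as "+") is followed by one padding byte, while a name of even length needs none.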

Dictionary types
Although in practice there is just a single type of dictionary entry, it's useful to distinguish between the data-space a definition resides in and whether it's part of the core dictionary or not. Using these properties we can divide Shi's dictionary into:

  1. Core dictionary - part of Shi itself
  2. User dictionary in data - extended by the user and compiled into data
  3. User dictionary in text - extended by the user and compiled into text

Creating the core dictionary
The core dictionary is created by some macro magic heavily inspired by Mecrisp-Stellaris. The macro WORD can be used to automatically create a linked list of assembler functions including flags and a counted string name. The macro parameter label is optional and only necessary if the name of the word contains special characters which are not allowed as assembly labels, otherwise name is also used as label. Since WORD uses the numeric labels 7, 8 and 9 the actual definition may only use 1-6 for its own branches.

.macro WORD flags, name, label
    .p2align 1                  @ Align before link
link\@\():                      @ Label the link
9:  .word 9f                    @ Link (4 byte)
    .byte \flags                @ Flags (1 byte)
    .byte 8f - 7f               @ Length (1 byte)
7:  .ascii "\name"              @ Name (cstring)
8:  .p2align 1                  @ Align before code
    .thumb_func
.ifnb \label                    @ Label for code (use name if label wasn't defined)
\label\():
.else
\name\():
.endif
.endm

Here is an example of the definition of + created with WORD.

WORD FLAG_INTERPRET_COMPILE & FLAG_INLINE & FOLDS_2, "+", plus
ldmia dsp!, {r0}
adds tos, r0
bx lr

Flags
The standard distinguishes between three properties a word can have: it might have interpretation semantics, it might have compilation semantics, and it might be immediate. In theory any combination of those three can occur, although some, like interpretation semantics combined with immediate, might not make much sense. Shi comes with additional flags for its optimizations and for its ability to compile to flash. The latter is especially tricky, since variables compiled to flash still need a cell of ram somewhere. For that reason such a definition gets marked with the RESERVE_x flag, which lets Shi reserve memory cells at the end of data-space at initialization.

Flag            Description                                                        Value
FLAG_SKIP       Definition ignored                                                 0b1111'1111
FLAG_INTERPRET  Definition has interpretation semantics                            0b0111'1111
FLAG_COMPILE    Definition has compilation semantics                               0b1011'1111
FLAG_IMMEDIATE  Definition is immediate (and executes during compilation)          0b1101'1111
FLAG_INLINE     Definition is short enough to get inlined instead of called        0b1110'1111
RESERVE_x       Definition needs to reserve cells of data-space                    0b1111'xx11
FOLDS_x         Definition allows constant folding (e.g. 3 4 + is replaced by 7)   0b1111'11xx
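The values in the table are active-low masks: each property clears one bit, which is why the + example combines flags with & rather than |. The following C sketch shows one way such a flag byte could be tested. The constants are copied from the table. The helper has_flag is hypothetical, and FLAG_INTERPRET_COMPILE is assumed to be the AND of FLAG_INTERPRET and FLAG_COMPILE, since its definition is not shown here. The two-bit RESERVE_x and FOLDS_x fields are left out because their encoding is not given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the flag table above. */
#define FLAG_SKIP      0xFFu  /* 0b1111'1111 */
#define FLAG_INTERPRET 0x7Fu  /* 0b0111'1111 */
#define FLAG_COMPILE   0xBFu  /* 0b1011'1111 */
#define FLAG_IMMEDIATE 0xDFu  /* 0b1101'1111 */
#define FLAG_INLINE    0xEFu  /* 0b1110'1111 */

/* Assumption: the combined flag used in the "+" example is simply both masks ANDed. */
#define FLAG_INTERPRET_COMPILE (FLAG_INTERPRET & FLAG_COMPILE)

/* Hypothetical helper: a property is present when the single bit that its
 * mask clears is also clear in the flag byte. */
static bool has_flag(uint8_t flags, uint8_t mask)
{
    uint8_t bit = (uint8_t)~mask;                 /* the one bit the mask clears */
    assert(bit != 0 && (bit & (bit - 1u)) == 0);  /* only valid for single-bit flags */
    return (flags & bit) == 0;
}
```

Under this reading a flag byte of 0xFF has no property at all, which matches FLAG_SKIP and the erased state of flash memory.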

Search order
As mentioned on the Variables page, the link to the latest definition in ram is always stored. If there is no definition in ram yet, it still holds its initial value, which is the start of the core dictionary. In either case, link provides the start of a singly linked list which iterates through the dictionary in the following order:

  1. User dictionary in data
  2. Core dictionary in text
  3. User dictionary in text

In case the user hasn't extended the dictionary so far it looks like this:

[Figure dot_inline_dotgraph_1.png: dictionary before any user definitions]

The light gray entry is special because it's the very last entry of the core dictionary. It is also the only definition of the core which resides in data and not in text. This is a necessity to allow the very last link of the core dictionary to point to the first entry of the user dictionary in text without recompiling Shi. The address of the user dictionary is simply not known until runtime when it's passed as parameter to the initialization function.

Once the user starts adding definitions in both data-spaces, the dictionary might change its appearance to something like this:

[Figure dot_inline_dotgraph_2.png: dictionary after user definitions in both data-spaces]

An implication of this search order is that definitions in data are found faster than those in text.
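The traversal can be modelled on a host machine with a few lines of C. This is only a sketch of a first-match search over a singly linked list, not Shi's actual lookup code: struct entry and find are hypothetical, the link is stored as an ordinary pointer instead of a 4-byte address, and a NULL link is assumed to terminate the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side model of a dictionary entry. */
struct entry {
    const struct entry *link;  /* next entry to search; NULL ends the list (assumption) */
    uint8_t flags;             /* flag byte as described in the Flags section */
    uint8_t length;            /* length of the counted-string name */
    const char *name;          /* name characters; only the first 'length' bytes are used */
};

/* Walk the list starting at the latest definition and return the first
 * entry whose counted-string name matches, or NULL if there is none. */
static const struct entry *find(const struct entry *latest,
                                const char *name, size_t length)
{
    assert(name != NULL);
    for (const struct entry *e = latest; e != NULL; e = e->link) {
        if (e->length == length && memcmp(e->name, name, length) == 0)
            return e;
    }
    return NULL;
}
```

Because the search stops at the first match, an entry in the user dictionary in data would, in this model, be found before an entry of the same name further down the list.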

Initialization
To initialize Shi the functions shi::init and shi_init can be used. Both functions take a struct which contains the begin and end addresses of the data-spaces as well as the necessary text alignment for compilation to flash. Passing addresses and an alignment for text is optional; these fields can simply be set to 0 if compilation to flash is not needed.

Besides applying the passed addresses there are three things happening during initialization.

  1. Sweep text
    The whole dictionary is searched for definitions which need to reserve ram. The necessary amount of ram memory is taken from the end of data-space. Afterwards the last found link is saved as beginning of the text data-space. At this point the last link might either be the last core dictionary entry or the last user dictionary entry depending on whether the user has already extended the dictionary or not.
  2. Fill data
    Shi does not rely on the data-space passed in being zero-initialized. In any case it gets overwritten with the value defined by SHI_ERASED_WORD. By default that's 0xFFFFFFFF, which mimics the erased state of most flash devices.
  3. Set context
    The Shi context is initialized, which means initializing the following registers:
    tos .req r6 @ Top of stack
    dsp .req r7 @ Data-stack pointer
    lfp .req r8 @ Literal-folding pointer
    tos gets initialized with '*', dsp with the stack end and lfp with 0.
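The "Sweep text" and "Fill data" steps can be pictured with a short C sketch. It is not Shi's implementation (which is written in assembly). The helpers reserve_cells and fill_data are hypothetical, and only the SHI_ERASED_WORD name and its default value come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default erased value as stated above. */
#define SHI_ERASED_WORD 0xFFFFFFFFu

/* Hypothetical helper: take 'cells' cells from the end of data-space and
 * return the new end, mirroring "taken from the end of data-space". */
static uint32_t *reserve_cells(uint32_t *begin, uint32_t *end, size_t cells)
{
    assert(begin <= end && cells <= (size_t)(end - begin));
    return end - cells;
}

/* Hypothetical helper: overwrite [begin, end) with the erased value so that
 * the previous content of the data-space does not matter. */
static void fill_data(uint32_t *begin, uint32_t *end)
{
    assert(begin <= end);
    for (uint32_t *p = begin; p != end; ++p)
        *p = SHI_ERASED_WORD;
}
```

In this sketch a caller would first shrink the data-space by the number of reserved cells and then fill the remaining range.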