The Basic interpreter

Most of the ROM and GROM memory in the TI-99/4A is devoted to the Basic interpreter. One could even consider TI-Basic to be the operating system of the TI-99/4A.

This interpreter is written in GPL, itself an interpreted language, with a bunch of assembly language routines. This has the unfortunate consequence of making TI-Basic extremely sluggish, as the double interpretation process takes a lot of time. On the other hand, one could argue that it makes programs more compact (although I'm not fully convinced of that).

I don't have the pretension of understanding the TI-Basic interpreter. On this page, I have summarized information that I gathered from several sources and that (I hope) could be useful to somebody studying TI-Basic "under the hood".

Structure of a Basic program
VDP memory
Statement list
Tokens
Line number table
Symbol table
String space
PAB chain
Value stack
Scratch-pad usage

The Basic GROMs
Tables
PARSE
EXEC
CONT
NUD
Tokens

Routines


Structure of a Basic program

A TI-Basic program consists of a line number table and a list of Basic statements. It is generally stored at the top of the VDP memory (although Basic programs can be stored in GRAM/GROM). The statement list comes just under the area reserved by the disk controller for file buffers, and grows downwards. The line number table comes under it.

Under the program come all the variables and constants it uses: the value stack, the symbol table, the string space and the PAB space.

VDP memory usage with TI-Basic

Address        Contents
>0000-02FF     Screen image (add >60 to each char) + char pattern table (chars >00-60)
>0300-031F     Color table
>0320-036F     Crunch buffer for current line (human-readable Basic <=> tokens)
>0370-07FF     Char pattern table for chars >6E-FF
>03C0-03DF     VDP roll out buffer (within the area above)

Pointer(s)     Area
@>8324, @>836E Value stack
@>833C         PAB chain
@>831A, @>8318 String space
@>833E         Symbol table
@>8330, @>8332 Line number table
@>8370         Statement list
               Disk file buffers


Disk file buffers

The TI disk controller card reserves space at the top of the VDP RAM at power-up time. Other controllers may or may not do so. The TI routines are written in such a way that several controller cards can reserve space under each other.

The last free address, just under the bottom of these buffers, is stored in >8370. Normally it will be >37D7, with CALL FILES(3). If you change the number of files, add or subtract 518 bytes (>206) per file.
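The arithmetic can be sketched in a few lines of Python. This is only an illustration: the helper name `last_free_address` is made up for the example, and it assumes that each extra file lowers the limit by >206 bytes while each file removed raises it, starting from >37D7 for CALL FILES(3).

```python
BASE_FILES = 3          # normal setting, CALL FILES(3)
BASE_ADDRESS = 0x37D7   # value found in >8370 with CALL FILES(3)
BYTES_PER_FILE = 0x206  # 518 bytes reserved per file

def last_free_address(files: int) -> int:
    """Expected content of >8370 after CALL FILES(files) (hypothetical helper).

    Assumption: more files means more buffer space, hence a lower address.
    """
    return BASE_ADDRESS - (files - BASE_FILES) * BYTES_PER_FILE

for n in (1, 2, 3, 4):
    print(f"CALL FILES({n}): >{last_free_address(n):04X}")
```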


Basic statement list

This list starts at the top of the VDP memory, just under the area reserved by the disk controller. The list grows downwards, with the last line that was typed in at the bottom. This means that the lines are not sorted by number (the line number table takes care of that). During Basic execution, XML >1B can be used to get the next token and place it in >8342.

Each statement begins with a length byte and ends with a >00 byte. All TI-Basic keywords are replaced with 1-byte codes known as "tokens". This both saves space and speeds up execution. The "crunch buffer" at >0320-036F is used to hold the decoded Basic statement while the conversion operations are performed.

Scratch-pad addresses

>832C contains a pointer to the next token to be processed in the current statement.
Byte >8342 contains the value of the previous token fetched from the statement.
>8332 points to the top of the line number table, just below the statement list.
>8370 points to the last byte in the statement list.

Example

Here is an example of how a TI-Basic statement is encoded.

100 CALL MYSUB(A,"TEST2",U$,512)

Address Token/chars Meaning 
>37BA >1D Line size
>37BB >9D CALL
>37BC >C8 Unquoted string
>37BD >05 String length
>37BE MYSUB The name is, of course, not encoded
>37C3 >B7 (
>37C4 A Variable names are not encoded either
>37C5 >B3 ,
>37C6 >C7 Quoted string "..."
>37C7 >05 String length
>37C8 TEST2 Content of the string
>37CD >B3 ,
>37CE U$ Another variable
>37D0 >B3 ,
>37D1 >C8 Unquoted string
>37D2 >03 String length
>37D3 512 Numeric constants are passed as strings
>37D6 >B6 )
>37D7 >00 End-of-line mark
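To make the encoding concrete, here is a minimal Python sketch that expands the bytes listed above back into text. It is not the interpreter's own algorithm: the function name `detokenize` is invented for the example, and the `TOKENS` dictionary only holds the four tokens that this particular statement needs.

```python
# Only the tokens used by the example statement (values from the token table).
TOKENS = {0x9D: "CALL ", 0xB3: ",", 0xB6: ")", 0xB7: "("}
QUOTED, UNQUOTED = 0xC7, 0xC8

def detokenize(statement: bytes) -> str:
    """Expand one crunched statement: size byte, tokens/chars, >00 terminator."""
    size, body = statement[0], statement[1:]
    if size != len(body):
        raise ValueError("size byte does not match statement length")
    out, i = [], 0
    while body[i] != 0x00:
        b = body[i]
        if b in (QUOTED, UNQUOTED):
            # String tokens are followed by a length byte and the characters.
            length = body[i + 1]
            text = body[i + 2:i + 2 + length].decode("ascii")
            out.append(f'"{text}"' if b == QUOTED else text)
            i += 2 + length
        elif b >= 0x80:
            out.append(TOKENS[b])      # 1-byte keyword or operator token
            i += 1
        else:
            out.append(chr(b))         # plain ASCII: part of a variable name
            i += 1
    return "".join(out)

example = (bytes([0x1D, 0x9D, 0xC8, 0x05]) + b"MYSUB" + bytes([0xB7]) + b"A"
           + bytes([0xB3, 0xC7, 0x05]) + b"TEST2" + bytes([0xB3]) + b"U$"
           + bytes([0xB3, 0xC8, 0x03]) + b"512" + bytes([0xB6, 0x00]))
print(detokenize(example))
```

Note that the size byte (>1D, i.e. 29) counts every byte that follows it, including the >00 end-of-line mark.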


TI-Basic tokens

A TI-Basic program is not stored as such in memory. Instead, each line is "crunched" into 1-byte tokens, as much as possible. All tokens have a value of >80 or above, which makes it possible to quickly distinguish them from a variable name or a number (since these are made of ASCII characters). The exceptions are those tokens that correspond to keywords that cannot be part of a program (e.g. NEW, RUN, etc).

Word Token Word Token Word Token
Quoted string >C7 Unquoted string >C8 Line number >C9
) >B6 EXP >CE SEG$ >D8
( >B7 INT >CF STR$ >DB
& >B8 LOG >D0 LIST >03
^ >C5 RND >D7 SAVE >08
= >BE SGN >D1 CALL >9D
* >C3 SIN >D2 OPEN >9F
/ >C4 SQR >D3 BREAK >8E
+ >C1 TAN >D4 GOSUB >87
- >C2 LEN >D5 FIXED >FA
< >BF POS >D9 INPUT >92
> >C0 VAL >DA PRINT >9C
: >B5 ASC >DC TRACE >90
; >B4 REC >DE CLOSE >A0
# >FD NEW >01 OPTION >9E
, >B3 RUN >00 RETURN >88
GO >85 CON >02 NUMBER >05
IF >84 NUM >05 OUTPUT >F7
ON >9B RES >07 APPEND >F9
TO >B1 BYE >04 UPDATE >F8
DEF >89 OLD >06 DELETE >99
DIM >8A BASE >F1 UNTRACE >91
END >8B DATA >93 UNBREAK >8F
EOF >CA EDIT >09 RESTORE >94
FOR >8C ELSE >81 DISPLAY >A2
LET >8D GOTO >86 CONTINUE >02
REM >9A NEXT >96 VARIABLE >F3
SUB >A1 READ >97 INTERNAL >F5
TAB >FC STEP >B2 RELATIVE >F4
ABS >CB STOP >98 RANDOMIZE >95
ATN >CC THEN >B0 PERMANENT >FB
COS >CD CHR$ >D6 SEQUENTIAL >F6
RESEQUENCE >07



Line number table

The line number table begins right below the statement list and grows downwards in VDP memory. The line numbers are arranged in numerical order from top to bottom. Each entry consists of two words: the line number and the address of that line in the statement list. This is the address of the first executable byte in the statement, i.e. after the size byte.

Scratch-pad addresses

Word >832E: Current line entry in the line number table
Word >8330: Bottom of the line number table
Word >8332: Top of the line number table
Word >8336: Entry for the line containing the next DATA element

Example

100 CALL MYSUB(A,"TEST2",U$,512)
110 STOP

The line number table for the above program would be:

Address Contents  Meaning             
>37AF : >006E Line number (110)
>37B1 : >37B8 Pointer to the line in the statement list
>37B3 : >0064 Line number (100)
>37B5 : >37BB Pointer to the line in the statement list
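As a rough illustration, the table for this two-line program can be read back with a few lines of Python. The sketch assumes that words are stored high byte first, and uses >006E as the hexadecimal form of 110; the function name `parse_line_table` is made up for the example.

```python
import struct

def parse_line_table(raw: bytes):
    """Split a line number table image into (line number, statement pointer) pairs."""
    return [struct.unpack_from(">HH", raw, off) for off in range(0, len(raw), 4)]

# Table image for the two-line program, lowest VDP address (>37AF) first.
table = bytes.fromhex("006E37B8" "006437BB")

entries = parse_line_table(table)
for number, pointer in entries:
    # The size byte sits just before the first executable byte of the statement.
    print(f"line {number}: statement at >{pointer:04X}, size byte at >{pointer - 1:04X}")
```

Because the entries are kept in numerical order, the highest line number comes first when the table is read from low to high addresses.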



Symbol table

This table is used to store all variables used in a Basic program. Each entry is at least 8 bytes (i.e. 4 words) long, but can be much longer if needed. For instance, entries for arrays contain a size word for each dimension, followed by pointers for each element.

XML >13 and XML >16 can be used to search the symbol table for a given symbol. Both place a pointer to the symbol in the symbol table at >834A. Then XML >14 can be called to generate an 8-byte description for this symbol.
XML >15 can be used to modify the value of the variable, after pushing the description on stack with XML >17.

Type Byte 1 Byte 2 Word 2 Word 3 Word 4 More words
DEF 0 Name size Ptr to next entry Ptr to symbol name Ptr to definition
String >80 Name size Ptr to next entry Ptr to name Ptr to string
Numeric 0 Name size Ptr to next entry Ptr to name Floating point value (4 words)
Numeric array Dim Name size Ptr to next entry Ptr to name Size for each dimension. Floating point values.
String array >80+Dim Name size Ptr to next entry Ptr to name Size for each dimension. Pointers to strings.

Ptr to name: points to a location in the statement list where the variable name is spelled out, e.g. the X in X=25.
Ptr to next entry: allows walking the symbol table by linking from one variable to the next. The last entry contains >0000.

Scratch-pad addresses

Word >833E: Bottom of the symbol table
Word >8330: Bottom of the line number table (which is just above the symbol table).

Example

100 X=25
110 A$="STRING IN A$"

This sample program generates the following symbol table:

Address  Content                                   
>3778 : >80 02 >3780 >37AB >3759
>3780 : >00 01 >0000 >37BE >4019 >0000 >0000 >0000

>37AB and >37BE are addresses in the statement list where the variable names A$ and X are mentioned. The sizes of their names are 02 and 01 respectively.
>3759 is an address in the string space, pointing at the S in "STRING IN A$".
>4019 0000 0000 0000 is the floating point value for 25.0
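The floating point value can be checked with a small decoder. The following Python sketch rests on assumptions that are not spelled out on this page: the first byte is taken to be a power-of-100 exponent biased by >40, the seven remaining bytes are taken to be base-100 digits, and negative numbers are assumed to have their first word negated. The name `radix100_to_float` is invented for the example.

```python
def radix100_to_float(raw: bytes) -> float:
    """Decode an 8-byte radix-100 number (sketch; see the assumptions above)."""
    if len(raw) != 8:
        raise ValueError("expected 8 bytes")
    first_word = (raw[0] << 8) | raw[1]
    if first_word == 0:
        return 0.0
    negative = (first_word & 0x8000) != 0
    if negative:
        # Assumed convention: a negative number has its first word negated.
        first_word = (-first_word) & 0xFFFF
    exponent = (first_word >> 8) - 0x40       # power of 100, assumed bias >40
    digits = [first_word & 0xFF, *raw[2:]]    # seven base-100 digits
    value = 0.0
    for position, digit in enumerate(digits):
        value += digit * 100.0 ** (exponent - position)
    return -value if negative else value

print(radix100_to_float(bytes.fromhex("4019000000000000")))
```

With the bytes from the symbol table entry for X, the exponent is 0 and the first digit is >19 (25 decimal), which gives 25.0.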

Here is another example, that deals with arrays:

100 DIM A$(2,3)
110 A$(0,0)="TEST"
120 A$(0,1)="THIS"
130 A$(1,1)="THAT"
140 A$(1,2)="SOMETHING ELSE"

Value    Meaning
>82      String flag (>80) + number of dimensions (2 here)
>02      Name size (A$ is 2 chars long)
>0000    Next entry (none here)
>37CC    Ptr to name in program
>0002    First dimension
>0003    Second dimension
>3735    A$(0,0)
>372D    A$(0,1)
>0000    A$(0,2) (empty string)
>0000    A$(0,3) (empty string)
>0000    A$(1,0) (empty string)
>3725    A$(1,1)
>3713    A$(1,2)
>0000    A$(1,3) to A$(2,3): five more empty strings

For a string array, the values following the dimensions (>3735, >372D, etc) are pointers to the strings in the string space. If the string is empty, there is nothing in the string space and the pointer is >0000.

For a numerical array the structure is similar, except that the string pointers are replaced with numerical values. Each value is 8 bytes long, in floating point format.
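The layout of the A$(2,3) entry shown earlier can be explored with a short Python sketch. It handles string arrays only (one word per element), assumes OPTION BASE 0 so that each dimension holds size+1 elements, and assumes the elements are stored with the last index varying fastest, as the example suggests. The names `parse_array_entry` and `element` are made up for the illustration.

```python
# Symbol table entry for A$(2,3) from the example, as 16-bit words.
words = [0x8202, 0x0000, 0x37CC, 0x0002, 0x0003,
         0x3735, 0x372D, 0x0000, 0x0000,
         0x0000, 0x3725, 0x3713, 0x0000,
         0x0000, 0x0000, 0x0000, 0x0000]

def parse_array_entry(words):
    """Return (is_string, name_size, dimension sizes, element words)."""
    flags, name_size = words[0] >> 8, words[0] & 0xFF
    is_string = bool(flags & 0x80)   # >80 marks a string
    dims = flags & 0x7F              # the rest is the number of dimensions
    sizes = words[3:3 + dims]
    elements = words[3 + dims:]
    return is_string, name_size, sizes, elements

def element(sizes, elements, *indices):
    """Look up one element, assuming OPTION BASE 0 and last index fastest."""
    offset = 0
    for size, index in zip(sizes, indices):
        offset = offset * (size + 1) + index
    return elements[offset]

is_string, name_size, sizes, elements = parse_array_entry(words)
print(is_string, name_size, sizes, len(elements))
print(f"A$(1,2) -> >{element(sizes, elements, 1, 2):04X}")
```

For a numeric array the same header applies, but each element would take four words instead of one, so the offset arithmetic would have to be scaled accordingly.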


String space

Strings are saved in the string space, which lies below the symbol table in VDP memory. The strings accumulate downwards. When you change the value of a string variable, a new string is created in the string space. The old one remains in place, but is not used anymore.

The GROM routine >0038 can be used to allocate space for a new string in the string space. If necessary, it will call the "garbage collection routine" to free more space by getting rid of unused strings.

Each string is preceded and followed by a size byte. In addition, each string is preceded with a pointer to its entry in the symbol table (to make it easier to change its pointer when the string has to be moved). This pointer is >0000 for unused strings that can be deleted by the garbage collection routine.

Scratch-pad addresses

Word >8318: Top of the string space.
Word >831A: Bottom of the string space.
Word >831C: Temporary pointer to a string.

Example:

100 PRINT "THIS IS A TEST"
110 A$="STRING IN A$"
120 A$="NEW STRING"

This sample program generates the following string space:

Address Contents                      Comments                        
>374D : >3783 >0A NEW STRING >0A New version. >3783 points to >3750 in A$.
>375B : >0000 >0C STRING IN A$ >0C Old version: can be deleted
>376B : >0000 >0E THIS IS A TEST >0E Temporary string used by PRINT

The symbol table contains only one entry, the one for A$:

Address Contents                      Comments                        
>377D : >80 >02 >0000 >37B3 >3750 >3750 points at the N in "NEW STRING"
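The record format (back pointer, size byte, characters, size byte) can be sketched in Python as follows. This only rebuilds and walks the example above; it is not the garbage collection routine itself, whose actual strategy is not described here. The helper names `record` and `walk` are invented for the example.

```python
def record(back_pointer: int, text: bytes) -> bytes:
    """Build one string record: pointer word, size, characters, size."""
    return (back_pointer.to_bytes(2, "big") + bytes([len(text)])
            + text + bytes([len(text)]))

BASE = 0x374D  # lowest address of the example string space
image = (record(0x3783, b"NEW STRING")
         + record(0x0000, b"STRING IN A$")
         + record(0x0000, b"THIS IS A TEST"))

def walk(image: bytes, base: int):
    """Yield (address of first char, back pointer, text) for each record."""
    pos = 0
    while pos < len(image):
        back_pointer = int.from_bytes(image[pos:pos + 2], "big")
        size = image[pos + 2]
        text = image[pos + 3:pos + 3 + size]
        if image[pos + 3 + size] != size:
            raise ValueError("leading and trailing size bytes differ")
        yield base + pos + 3, back_pointer, text.decode("ascii")
        pos += size + 4   # 2 pointer bytes + 2 size bytes + characters

for address, back_pointer, text in walk(image, BASE):
    state = "in use" if back_pointer else "can be deleted"
    print(f">{address:04X} {text!r}: {state}")
```

The three records add up to 48 bytes, which brings the walk exactly to >377D, where the symbol table entry for A$ begins.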



PAB chain

In Basic, all Peripheral Access Blocks are linked together, in a list of the following format:

Bytes 0-1 : link to next PAB (>0000 in last one)
Byte 2 : file #
Byte 3 : internal offset (used to write in PAB data buffer)

Byte 4 : opcode
Byte 5 : error/type flags
Bytes 6-7 : data buffer address
Byte 8 : record length
Byte 9 : characters count
Bytes 10-11: record number (relative files)
Byte 12 : screen offset/status returned by opcode >09
Byte 13 : file name length
Byte 14+ : file name

By now, you have probably realised that bytes 4 to 14 correspond to bytes 0 to 10 of the PAB used to call a DSR: TI-Basic just adds a 4-byte header above them.
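The layout can be expressed as a fixed 14-byte header followed by the file name. Here is a minimal Python sketch of a parser; the field names, the `BasicPab` type and the sample PAB (file #1, "DSK1.TEST") are all hypothetical and only serve to exercise the byte offsets listed above. Words are assumed to be stored high byte first.

```python
import struct
from typing import NamedTuple

class BasicPab(NamedTuple):
    link: int            # bytes 0-1
    file_number: int     # byte 2
    offset: int          # byte 3
    opcode: int          # byte 4
    flags: int           # byte 5
    buffer_address: int  # bytes 6-7
    record_length: int   # byte 8
    char_count: int      # byte 9
    record_number: int   # bytes 10-11
    screen_offset: int   # byte 12
    name_length: int     # byte 13
    name: str            # bytes 14+

HEADER = struct.Struct(">HBBBBHBBHBB")  # 14 bytes, big-endian, no padding

def parse_pab(raw: bytes) -> BasicPab:
    fields = HEADER.unpack_from(raw)
    name_length = fields[-1]
    name = raw[HEADER.size:HEADER.size + name_length].decode("ascii")
    return BasicPab(*fields, name)

# Made-up sample: last PAB in the chain, file #1, 80-byte records.
sample = HEADER.pack(0x0000, 1, 0, 0, 0x14, 0x1000, 80, 0, 0, 0, 9) + b"DSK1.TEST"
pab = parse_pab(sample)
dsr_pab = sample[4:]   # what remains once the 4-byte Basic header is skipped
print(pab.file_number, pab.name, len(dsr_pab))
```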

Scratch-pad addresses

Word >833C: Pointer to the first PAB in the chain


Value stack

Basic maintains a stack in VDP memory onto which values can be pushed with XML >17 and retrieved with XML >18. The internal description of Basic symbols is fairly complex, since there are many different types of symbols. However, XML >14 can be used to generate a description that always fits in 8 bytes (i.e. 4 words), to be placed on the value stack. Always use XML >17 and >18 to push/pop strings from the stack, since they update the pointers in the description (this is required if a string was moved while its entry was on the stack, for instance).

Note that it may happen that a program does not require any value to be placed on the stack.

Here are the formats for the different entries in the value stack, as created by XML >14:

Type of entry Word 1 Word 2 Word 3 Word 4
Numeric constant Floating point number, in radix 100 notation
String constant >001C >6500 pointer to string string size
Numeric var pointer to entry >0000 pointer to value >0000
String var pointer to entry >6500 pointer to string string size
Numeric array pointer to entry >00 Dim pointer to value >0000
String array pointer to entry >65 Dim pointer to value >0000
GOSUB ptr to line# >6600 n.a. n.a.
FOR * pointer to entry >6700 pointer to value ptr to line #
DEF number ptr to line >6800 old symbol table ptr old free space ptr
DEF string ptr to line >6880 old symbol table ptr old free space ptr

Ptr to entry: points to the entry for that variable in the symbol table.
Dim: number of dimensions of the array.
*: each FOR entry is followed by two numeric entries: the step and the upper limit values.


Scratch-pad usage summary

Address Use
>8300-8316 Temporary variables storage.
>8318 Beginning of string space (i.e. top).
>831A End of string space (bottom), first free address in VDP
>831C Temporary string pointer. Also: PAB error.
>831E Start of current Basic statement.
>8320 Current screen address.
>8322 Error code returned by assembly language routines.
>8324 VDP value stack base pointer.
>8326 Return address for assembly language routines.
>8328 Pointer to NUD tables for PARSe and EXEC.
>832A Pointer to end of screen display (cf >8320).
>832C Pointer to current token (or text) in the current statement.
>832E Pointer to current line number, in line number table.
>8330 Pointer to start (bottom) of line number table.
>8332 Pointer to end (top) of line number table, just below the statement list.
>8334 Data pointer for READ.
>8336 Line number table pointer for READ.
>8338 Address of intrinsic Poly constants ???
>833A Subprogram symbol table pointer.
>833C PAB address in VDP RAM: first link in PAB list.
>833E Pointer to bottom of symbol table.
>8340 VDP RAM free space pointer.
>8342 Current char/token (value).
>8344 Contains >FF if RUN, else >00 ( * READY * ).
>8345 Extended Basic flags: bit 0=1 Auto-num, 1=1 On break next,
3=1 Trace, 4=1 Edit mode, 5=1 On warning stop,
6=1 On warning next, bits 2 and 7 unused.
>8346 Crunch buffer destruction level
>8348 Last subprogram block on stack.
>836C Floating point error address in GROM ??
>836D Contains >08 for DSR call.





Basic GROMs structure

Here is an outline of the content of the Basic GROMs (addresses >2000 to >57FF).

Tables

EXEC table

The GPL opcode EXECute can call 34 procedures, their addresses being stored in a table located at >1C9C in CPU memory. These procedures may be in ROM or in GROM, in which case they are written in GPL. For the latter, the table only contains an index with a >8000 flag to indicate a GROM procedure. The index is used to branch inside the NUD table located in GROM memory.

NB >1A2C corresponds to an error code 0

Token Address Token Address
reserved >1A2C INPUT NUD >16
ELSE >1A2C DATA >19E6
: : >1A2C RESTORE NUD >12
IF >1BB6 RANDOMIZE NUD >14
GO >1A8E NEXT >1C14
GOTO >1AFC READ NUD >0A
GOSUB >1AE0 STOP >1A3C
RETURN >1B74 DELETE NUD >3E
DEF >19E6 REM >19E6
DIM >19E6 ON >1A92
END >1A36 PRINT NUD >0C
FOR NUD >00 CALL NUD >0E
LET >1BEA OPTION >19E6
BREAK NUD >02 OPEN NUD >18
UNBREAK NUD >04 CLOSE NUD >1A
TRACE NUD >06 SUB >1A2C
UNTRACE NUD >08 DISPLAY NUD >3C


PARS table

The GPL opcode PARSe uses a table located at >1CE2 in CPU memory, that contains addresses for 38 procedures. These procedures can be in assembly or in GPL. NUD refers to an entry in the NUD table in GROM.


NB. >1A2C corresponds to an error code 0

Token Address Token Address Token Address
( NUD >1C / >1A2C SGN NUD >2E
& >1A2C ^ >1A2C SIN NUD >30
reserved >1A2C reserved >1A2C SQR NUD >32
OR >1A2C Quoted string NUD >10 TAN NUD >34
AND >1A2C String >1A5C LEN NUD >36
XOR >1A2C Line number >1A2C CHR$ NUD >38
NOT >1A2C EOF NUD >4A RND NUD >3A
= >1A2C ABS NUD >22 SEG$ NUD >40
< >1A2C ATN NUD >24 POS NUD >46
> >1A2C COS NUD >26 VAL NUD >44
+ NUD >1E EXP NUD >28 STR$ NUD >42
- NUD >20 INT NUD >2A ASC NUD >48
* >1A2C LOG NUD >2C

CONT table

The GPL opcode CONTinue uses a table of 8 procedures located at >1D2E in CPU memory. These procedures are all in ROM.

Token Address
= >1D5C
< >1D3E
> >1D4C
+ >1DEC
- >1E18
* >1E24
/ >1E30
^ >1E3C


NUD table

Some of the procedures called by PARSe and EXEC are written in GPL and stored in the Basic GROMs. They are listed in the NUD table in the form of BR G@xxxx statements (that may well lead to another branch statement). The NUD table is in GROM memory, at the address specified by word >8328 (normally >4E84). The parse and exec tables in ROM only contain index values within the NUD table (plus a flag bit of >8000 to indicate a GPL procedure).

Keyword Address Keyword Address Keyword Address
FOR >4FB6 ( >4FF9 CHR$ >52EA
BREAK >5463 + >4FB2 RND >4F00
UNBREAK >5479 - >4FA8 DISPLAY >4000
TRACE >5459 ABS >4ED1 DELETE >4002
UNTRACE >545E ATN >4EDC SEG$ >524A
READ >400E COS >4EE2 STR$ >531A
PRINT >4004 EXP >4EE8 VAL >5349
CALL >50DB INT >4EEE POS >53A9
quoted string >5111 LOG >4EFA ASC >5306
RESTORE >400C SGN >4F26 EOF >401C
RANDOMIZE >50C8 SIN >4F40
INPUT >4006 SQR >4F46
OPEN >4008 TAN >4F4C
CLOSE >400A LEN >52BE
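How an entry of the EXEC or PARS table leads to a routine can be sketched in Python. The sketch assumes that the index is a byte offset added to the base of the NUD table (each BR statement taking two bytes), which matches the index values listed in the tables but is otherwise an assumption; the function name `resolve` is invented.

```python
NUD_BASE = 0x4E84   # normal location of the NUD table in GROM
GPL_FLAG = 0x8000   # set in a table entry when the procedure is in GPL

def resolve(entry: int, nud_base: int = NUD_BASE):
    """Return ("GROM", address of the BR in the NUD table) or ("ROM", address)."""
    if entry & GPL_FLAG:
        # Assumption: the remaining bits are a byte offset into the NUD table.
        return "GROM", nud_base + (entry & 0x7FFF)
    return "ROM", entry

for name, entry in (("IF", 0x1BB6), ("FOR", 0x8000), ("INPUT", 0x8016)):
    where, address = resolve(entry)
    print(f"{name}: {where} >{address:04X}")
```

The GROM address obtained this way is that of a branch statement, not of the routine itself: for INPUT, the BR found there would lead to >4006.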


Token tables

The crunching and expanding routines use a token table located in GROM at address >2870. The table is split into several sub-tables, depending on the length of the keywords. For each keyword length, the sub-table ends with a >FF byte.

The pointers for the different sub-tables are just before them, at address >285C:

Address  Value  Pointer to subtable for 
>285C >2870 1-byte keywords
>285E >288F 2-byte keywords
>2860 >289C 3-byte keywords
>2862 >291D 4-byte keywords
>2864 >2973 5-byte keywords
>2866 >299E 6-byte keywords
>2868 >29D0 7-byte keywords
>286A >29F1 8-byte keywords
>286C >2A16 9-byte keywords
>286E >2A2B 10-byte keywords

Tokens >C7 to >C9 are not part of the token table.
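Since the pointers are two bytes each and ordered by keyword length, finding the sub-table for a given keyword is simple arithmetic. The following Python sketch only models the pointer table; the internal format of the sub-tables is not covered, and the names `pointer_address` and `subtable_for` are made up.

```python
POINTER_TABLE = 0x285C   # address of the first sub-table pointer in GROM

# Pointer values as listed in the table of sub-table pointers.
SUBTABLES = {1: 0x2870, 2: 0x288F, 3: 0x289C, 4: 0x291D, 5: 0x2973,
             6: 0x299E, 7: 0x29D0, 8: 0x29F1, 9: 0x2A16, 10: 0x2A2B}

def pointer_address(keyword_length: int) -> int:
    """GROM address holding the pointer for keywords of the given length."""
    if not 1 <= keyword_length <= 10:
        raise ValueError("keywords are 1 to 10 characters long")
    return POINTER_TABLE + 2 * (keyword_length - 1)

def subtable_for(keyword: str) -> int:
    """Start of the sub-table in which a keyword would be searched."""
    return SUBTABLES[len(keyword)]

print(f">{pointer_address(5):04X} -> >{subtable_for('PRINT'):04X}")
```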


Routines

See the NUD table above, for the address of some Basic procedures.

Here are a few other entry points in GROM memory:

Address Use
>1387 OPEN cassette.
>13CF READ cassette.
>13DA WRITE cassette.
>13F2 OLD cassette.
>140E CLOSE cassette.
>1444 Verify cassette.
>1489 SAVE cassette.
>216F Start of Basic interpreter (Entry point for NEW).
>2214 Address table for RUN, NEW, CONTINUE, LIST, BYE, NUMBER, OLD, RES, SAVE and EXIT.
>27E3 Clears screen, resets cursor and continues as below:
>27F1 Loads char patterns, resets colors and VDP registers 2,3 and 4.
>2A42 Start line editor with default position and length.
>2A49 Ditto with max length in >835E.
>2A4F Ditto with starting screen position in >8361.
>3450 Checks if a char is valid for a variable name (A-Z, a-z, 0-9..).
>351C CALL CLEAR.
>3538 CALL SOUND.
>360E CALL HCHAR.
>362A CALL VCHAR.
>3643 CALL CHAR.
>3708 CALL KEY.
>3748 CALL JOYST.
>37D6 CALL SCREEN.
>401E OPEN a file.
>4160 DELETE a file.
>4174 CLOSE a file
>41CF Closes all files.
>41D7 RESTORE a file.
>4227 PRINT in a file or on screen.
>426C DISPLAY on screen.
>4344 INPUT from files or keyboard.
>45E3 READ the DATA inserted in a program.
>4641 OLD loads a program.
>46FC SAVE a program.
>474C LIST a program.
>482B EOF tests for end of file.
>4D7C Prints "Bad Value".
>4D81 Prints "String-number mismatch".
>566C Prints "Can't do that".
>56CD Scrolls up.
>56EF CALL GCHAR.
>5713 CALL COLOR.


Revision 1. 6/13/99 Preliminary, but ok to release
Revision II. 9/12/99 Made some corrections. Added examples.
Revision III. 9/19/99 More examples added (symbol table, value stack).
Revision IV. 3/4/00. Added PARS, EXEC, and CONT tables.


