Assembly language primer

Hey, I said "primer". Not "tutorial", ok?

Machine language
Assembly language
CPU registers
Addressing modes
Bytes vs words

Opcodes
Common assembly-time instructions

Encoding format
Hexadecimal notation


Machine language

Basically, a computer consists of a microprocessor together with some memory. All the additional gear (keyboard, screen, mouse, etc.) is just there for a secondary purpose: to allow a human to interface with the computer.

The microprocessor is the brain of the computer: it controls the whole system according to the instructions contained in a program. But no matter how sophisticated, the microprocessor still does not understand English! The only things it understands are numbers. In the case of the TI-99/4A, numbers from 0 to >FFFF, i.e. 65535 (if you don't know what I mean by >FFFF, there will be an explanation of the hexadecimal notation at the end of this page).

Each operation that the microprocessor can execute has been assigned a number: >A000 (40960) for addition, >6000 for subtraction, etc. A program is nothing more than a carefully arranged list of such numbers.

Actually, I slightly oversimplified the situation. Some instructions are encoded by a unique number, but most can be encoded by a range of numbers, depending on their targets. Let's take the negation operation for instance: it can be encoded by any number between >0500 and >053F. Each number corresponds to a different target for the instruction: >0500 instructs the microprocessor to negate register 0, >0501 to negate register 1, etc.
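To make the idea of a "range of numbers" concrete, here is a small sketch in Python (used purely for illustration, it has nothing to do with the TI-99/4A). It assumes the usual TMS9900 layout for single-operand instructions: the lowest four bits hold the register number and the next two bits select how the register is used (the addressing modes discussed further down). The name `encode_neg` is made up for this example.

```python
NEG_BASE = 0x0500  # first number of the NEG range (>0500)

def encode_neg(register, mode=0):
    """Return the machine-language number for NEG on a given target.

    register: 0 to 15
    mode:     0 to 3, assumed 2-bit field (0 means "the register itself")
    """
    if not 0 <= register <= 15:
        raise ValueError("register must be 0..15")
    if not 0 <= mode <= 3:
        raise ValueError("mode must be 0..3")
    return NEG_BASE | (mode << 4) | register

print(hex(encode_neg(0)))   # NEG R0
print(hex(encode_neg(1)))   # NEG R1
# 4 modes x 16 registers = 64 different numbers, from >0500 to >053F
print(len({encode_neg(r, m) for m in range(4) for r in range(16)}))
```

Under these assumptions the 64 possible values are exactly the range >0500 to >053F quoted above.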

The set of valid numbers accepted by the processor therefore constitutes a language. It is called "machine language". In the case of the TI-99/4A, the microprocessor (the TMS9900) understands 69 different instructions, encoded by a total of 57792 numbers (i.e. some numbers do not correspond to any valid instruction).


Assembly language

As you probably realise, machine language is not very convenient for a human to use. Who's gonna recall 57792 numbers and know what they are used for? It is possible to program directly in machine language: I can do it to some extent, to patch a program with a sector editor for instance. But this is only convenient for tiny bits of program using a small subset of instructions. If we want to do anything more complex, we'll need an assembler.

An assembler is a program that translates human-manageable mnemonics into machine language. For instance, the mnemonic for addition is "A", the one for subtraction is "S" and the one for negation is "NEG". Such mnemonics are called "opcodes", for operation codes.

Another set of mnemonics can be combined with the opcodes to define the target of an instruction (i.e. its operand). For instance, register 0 would be designated as "R0" and register 1 as "R1".

To negate register 0 you would write:

       NEG  R0       

Similarly, to add R1 to R0:

       A    R1,R0      

It's easier to handle than machine language, isn't it?

This set of opcodes and operands is called "assembly language". It's nothing more than a literal representation of machine language in a human-manageable form.

Of course, since we are using an assembler, we could as well add some extra features that will make our life easier. The assembler may automatically check for syntax errors and tell us about it. It may also allow us to replace numbers with alphanumeric "labels". For instance, I could define a label for the number 10:

TEN   EQU  10       

EQU means "equates" and is not an opcode, in that it will not be translated to machine language. It is an assembly-time instruction, used to tell the assembler that we would like to use the word "TEN" as a synonym for the number ten. Wherever we write "TEN" in our program the assembler will replace it with 10 before it translates the program into machine language:

       LI   R0,TEN       Is equivalent to:
       LI   R0,10        And loads the value 10 into R0

Whereas opcodes and operands are pretty much defined by the structure of the microprocessor, assembly-time instructions are only limited by the programmer's imagination. Therefore, different assemblers may use very different instructions. In all examples you'll find on this website, I have used the instruction set of the original assembler released by Texas Instruments with the "Editor/Assembler" module.

Object files

Conceivably, an assembler could load machine code directly into memory at the end of the assembly process. But that would not be convenient because you would have to re-assemble a program every time you want to execute it. Therefore most assemblers are file-based programs.

The TI assembler takes its input from a Dis/Var 80 text file and produces a Dis/Fix 80 object file in tagged-object format. It also offers you the option to produce a list file, where all syntax errors will be reported: this is easier than trying to jot them down as they flash on screen!

The object file contains machine language mixed with special instructions for the linker and the loader (the 'tags'). If you are curious, the tagged-object format of this file is described in my page on the Editor/Assembler cartridge.

Linker

A linker is a program that lets you piece together several object files assembled independently. This is a nice feature when you write really big assembly programs, because it means that you don't have to re-assemble a mammoth program every time you make a small change. You can split the program into smaller files and assemble them separately, then you only re-assemble the one you modified.

Also, you could have a library of useful routines, whether written by you or not, and link them to your program as independently assembled files. This saves you the need to reinvent the wheel for every program.

Finally, some linkers will let you link assembly with other languages, such as GPL or Extended Basic.

A very good linker is the RAG linker, from R.G. Green. It takes one or more DF80 object files and produces an executable version of the program, in the form of a memory-image "program" file.

The memory-image file must then be loaded into memory by yet another program, called a loader. The Editor/Assembler cartridge contains such a loader in its option 5, which is why memory-image files are often referred to as EA5 files. The TI-Writer option 3 also calls an EA5 loader, as do Funnelweb options 1 to 3. Finally, you will find a stand-alone loader on this site: my MILD loader, which lets you load both machine language and GPL.

Texas Instruments also came up with several hybrid "linking-loaders" which you'll find in Editor/Assembler option 3, Mini-memory option 1 and Extended Basic CALL LOAD. As their name implies, these programs perform linking while the program is loaded. This is much slower than loading EA5 files, but it has the advantage that object files can be loaded anywhere there is room in memory, whereas EA5 files are meant to run at a predefined location. Also, linking with Extended Basic is easier this way.

Once the program is in memory, you can use a small utility called SAVE to dump it into an EA5 program file.

In summary, the process of creating assembly language programs can be represented as such:

    Your brain
        |
        | Editor
        |
        V
 Text File (DV80)
        |
        | Assembler
        |
        V
Object file (DF80)      Other object files (DF80)
        |                        |
        |         Linker         |
        |                        |
        V                        V
      Memory-image EA5 file (Program)
        |
        | EA5 loader
        |
        V
Program is executed in memory

Alternatively, using the TI Editor/Assembler cartridge:

    Your brain
        |
        | Editor
        |
        V
 Text File (DV80)
        |
        | Assembler
        |
        V
Object file (DF80)        SAVE utility (DF80)
        |                        :
        | EA3 linking loader     :
        |                        :
        V                        V
      Program is executed in memory
                  :
                  : SAVE is called (optional)
                  :
                  V
      Memory-image file (Program)

Source file format

The format of the source file is the only one we are concerned with for the time being, since this is the file that you must write. Fortunately, its format is very simple: each line in the file contains an opcode with its operands, generally in the form:

[label] Opcode [operand][,operand] [comments]
       ^      ^                   ^
       Spaces

The fields in [brackets] are optional. Note however that the number of operands is determined by the opcode and is therefore not optional for a given opcode: some have no operands, some have one, some have two. The syntax is a little more relaxed for instructions to the assembler, and some of these can take a variable number of operands.

Labels in the left margin are used to easily refer to a location in your program, for instance to branch to it. There must be at least one space between label and opcode, between opcode and operands, and between operands and comments (if any). If there are two operands, they should be separated with a comma, not with spaces.

Comments are very important with assembly language, since it can be quite difficult to figure out what a program is doing when reading the source file. This even applies to your own programs: you'd be surprised how difficult it is to understand what you wrote just a few months ago! Therefore, use comments a lot: leave messages for yourself.

If you need more room than just the end of the line, you can enter lines that contain only a comment. Just make sure that such lines begin with a *, which instructs the assembler to ignore them.

Example:

* This is a comment line
MYTEST MOV  R1,R2    This is a sample assembly language line


Minimal assembly program

So what's the smallest assembly program that you can write? Something like this:

START  RT            return to caller
       END  START

RT is not a real opcode: it's an assembler alias for one of the most common return statements, which should actually be written B *R11. This statement returns to the caller, in this case the loader, which will return control to you (hopefully).

END is an instruction that tells the assembler that the end of the file has been reached. An optional operand can also be used to indicate the entry point of your program: in our case this is the label START.

If you don't specify an entry point, an EA5 loader will begin execution at the top of your program. An EA3 loader will wait for you to enter the name of a label where you want to start execution. For this to work, you must tell the assembler to include the label in the object file. This is done as follows:

       DEF  START
START  RT            return to caller
       END

The DEF instruction makes one or more labels available to the linker and/or the loader.


CPU Registers

Before we discuss assembly language further, I would like to introduce you briefly to the structure of the TMS9900 microprocessor (a.k.a. CPU, for "Central Processing Unit"). You'll find a more complete description here, in case you are interested.

First let's talk about registers. A register is nothing else than a memory cell that is integrated inside the microprocessor. The advantage is that it's much easier and faster for the microprocessor to access a register than to access the external memory. The TMS9900 has 3 main registers:

The program counter
The workspace pointer
The status register


Program counter

The Program Counter (PC) keeps track of the currently executed instruction. This allows the TMS9900 to fetch the next instruction to be executed. Some instructions allow you to load a different value into PC, thereby performing a "jump" or "branch" inside your program (the equivalent of a Basic GOTO). The program counter is a 16-bit register that simply contains the memory address of the current instruction.


Status register

The Status register (ST) contains flag bits used by the TMS9900 to make decisions. Most operations affect one or another of these bits. Conversely, some instructions behave differently according to the status of a given bit (e.g. conditional jumps). The structure of the status register is the following:

Bit   0     1    2    3      4    5    6    7 to 11   12 to 15
Use   High  GT   Equ  Carry  Ovf  Par  Xop  not used  Interrupt mask

The first bits deal with mathematical operations:

High means logically higher than, for unsigned operations. For instance, you could use the compare opcode "C" to compare R0 and R1:

      LI  R0,>0001     Loads number 1 into R0
      LI  R1,>FFFF     Loads number 65535 (a.k.a. -1) into R1
      C   R1,R0        Compares R1 to R0

After the comparison instruction, the "High" bit will be set to 1 since >FFFF is logically higher than 1.

GT (greater than) performs the same function, but considers that the operands contain signed values. By convention, >FFFF means -1, >FFFE means -2, >FFFD means -3, ... and >8000 means -32768.

In the above example, GT will be 0 since -1 is arithmetically smaller than 1.

Equ means equal and is set by the comparison operation when the two operands contain equal values.
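The difference between the first three bits can be checked with a few lines of Python. This is only a model of the comparison described above, not TI code; the names `to_signed` and `compare` are invented for the illustration.

```python
def to_signed(word):
    """Interpret a 16-bit word as a signed value (>8000 to >FFFF are negative)."""
    return word - 0x10000 if word & 0x8000 else word

def compare(a, b):
    """Model of what C a,b does to the first three status bits."""
    a &= 0xFFFF
    b &= 0xFFFF
    return {
        "High": int(a > b),                        # unsigned comparison
        "GT":   int(to_signed(a) > to_signed(b)),  # signed comparison
        "Equ":  int(a == b),
    }

# R1 = >FFFF and R0 = >0001, as in the example above: C R1,R0
print(compare(0xFFFF, 0x0001))   # {'High': 1, 'GT': 0, 'Equ': 0}
```

The same pair of words is "higher" when read as 65535 and "not greater" when read as -1, which is exactly why the TMS9900 keeps two separate bits.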

And now, let's mention a very useful feature of the TMS9900: after almost every operation that deals with numbers, the processor will automatically compare the result to zero and set the status bits accordingly. For instance, if you decrement R0 by 1:

       DEC  R0       

the content of the status register will reflect the result of the comparison of R0 (after it was decremented) with zero. If you were using R0 as a countdown counter, there is no need to use a "C" instruction to compare it to zero: this was already done by the DEC instruction.

Carry is used to indicate a carry over. It can be viewed as a 17th bit, left of bit 0. For instance, if you do:

      LI  R0,>0001     Load number 1 into R0
      LI  R1,>FFFF     Load number 65535 (a.k.a. -1) into R1
      A   R1,R0        Add R1 to R0

The result will be >10000, which is too big to fit in 16 bits. R0 will therefore contain >0000 and the "Carry" bit will be set to indicate that the value of R0 "wrapped around".

Ovf indicates an overflow during some operations. Mostly, ovf deals with the sign bit (i.e. bit 0). For instance, if you add two positive numbers and the result may be understood as a negative number, ovf will be set:

      LI  R0,>4000     Load >4000 (16384) into R0
      LI  R1,>4001     Load >4001 (16385) into R1
      A   R1,R0        Add R1 to R0

The result of the addition (to be placed in R0) is >8001, which is perfectly correct for unsigned values: 16384+16385 = 32769. However, if we were to consider these numbers as signed, we would get: 16384+16385 = -32767. The TMS9900 sets ovf to warn us about a potential problem in this case, as well as in similar situations.
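Carry and overflow are easy to mix up, so here is a hedged Python model of a 16-bit addition that reports both. The function name `add16` is made up; the overflow rule used (both operands have the same sign bit and the result has the other one) is the standard two's complement rule, assumed here to match what the A instruction does.

```python
def add16(a, b):
    """Add two 16-bit words and return (result, carry, ovf)."""
    a &= 0xFFFF
    b &= 0xFFFF
    total = a + b
    result = total & 0xFFFF
    carry = int(total > 0xFFFF)   # a 17th bit would be needed
    # Overflow: the sign bit of the result differs from the sign bit of BOTH operands
    ovf = int(((a ^ result) & (b ^ result) & 0x8000) != 0)
    return result, carry, ovf

print(add16(0x0001, 0xFFFF))   # (0, 1, 0): wraps around, carry set, no overflow
print(add16(0x4000, 0x4001))   # (32769, 0, 1): >8001, no carry, overflow set
```

Carry matters when you treat the words as unsigned, overflow when you treat them as signed; a single addition can set either one, both, or neither.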

Par stands for parity. It is only used by a limited number of opcodes. The microprocessor checks all the bits involved in these operations and counts how many of them were "1". If the result is odd, the Par bit will be set. This can serve as a transmission error control mechanism for instance, but is generally of little use to the average programmer.
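As a quick illustration, counting the "1" bits of a value takes one line of Python. The sketch assumes the operation involves a single byte, and `odd_parity` is a name invented for the example.

```python
def odd_parity(byte):
    """Return 1 if the byte contains an odd number of 1 bits, else 0."""
    return bin(byte & 0xFF).count("1") % 2

print(odd_parity(0b00000111))  # three bits set: 1
print(odd_parity(0b00000011))  # two bits set: 0
```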

Xop is set during the execution of an XOP instruction.

The interrupt mask tells the TMS9900 up to which priority level it can accept interrupts. In the TI-99/4A console, all interrupts are hardwired at level 2. Therefore only two cases are of importance for us:

  • If the interrupt mask is 0 or 1, interrupts are not allowed.
  • If the interrupt mask is 2 to 15, interrupts are allowed.

In case you wonder, an interrupt is a hardware-triggered event that causes the TMS9900 to temporarily stop execution of the current program and to execute another program instead: the interrupt service routine (ISR). Once the ISR is completed, the TMS9900 will (hopefully) resume execution of the program where it was interrupted.


    Workspace pointer

    Registers are also meant to be used by the programmer: since they are located on chip they are faster than external memory. Furthermore, since there is only a limited number of registers, they can be encoded within the instruction number itself (cf. the example above for NEG R0 and NEG R1). By contrast, if we wanted to negate a word in the external memory, we would write:

         NEG  @>2000   Negates word at address >2000 

    But now we need 16 bits to specify the address of the word to negate: obviously it cannot fit together with the opcode into a 16-bit instruction word! Therefore, this instruction will require an extra word and will be encoded as: >0520 >2000 (the >0020 in >0520 indicates that there is an extra word in this instruction).

    With instructions that deal with two operands (like the addition) it may be necessary to have an extra word for each of them. Therefore a TMS9900 instruction can be 1, 2 or 3 words long.
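Here is a rough Python sketch of how an addition grows from one to three words. It assumes the usual TMS9900 layout for two-operand instructions (destination mode and register in the upper bits, source mode and register in the lower six bits) and that the mode value 2 is the one that calls for an extra word, as the >0020 in >0520 suggests. The names `operand_fields` and `encode_a` are invented for the illustration.

```python
A_BASE = 0xA000       # addition, as given at the top of this page
EXTRA_WORD_MODE = 2   # assumed mode value for "@address" operands

def operand_fields(operand):
    """operand is ('R', n) for a register or ('@', address) for memory."""
    kind, value = operand
    if kind == "R":
        return 0, value, []                      # mode 0, register n, no extra word
    return EXTRA_WORD_MODE, 0, [value & 0xFFFF]  # address goes in an extra word

def encode_a(src, dst):
    """Return the list of words for 'A src,dst'."""
    s_mode, s_reg, s_extra = operand_fields(src)
    d_mode, d_reg, d_extra = operand_fields(dst)
    first = A_BASE | (d_mode << 10) | (d_reg << 6) | (s_mode << 4) | s_reg
    return [first] + s_extra + d_extra

for src, dst in [(("R", 1), ("R", 0)),
                 (("@", 0x2000), ("R", 0)),
                 (("@", 0x2000), ("@", 0x3000))]:
    words = encode_a(src, dst)
    print(len(words), [hex(w) for w in words])
```

The three calls print instructions of 1, 2 and 3 words respectively.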

    Now how many registers are there for us to use inside the TMS9900? Well, none... or 16... or as many as we want!

    Let me explain that. Instead of placing registers inside the microprocessor, the TI designers decided that registers will be in the external memory. The TMS9900 only contains a workspace pointer, i.e. a register that contains the address of 16 pseudo-registers in the external memory.

    This kind of defeats the principle of a register: it will take as much time to access a register as to access a memory address... since registers are memory addresses. Well, not quite. We saw above that an instruction requires an extra word to use an operand in the external memory. Thus, by using a register we make the program shorter, and we also make it faster: we save the time that would be required to read that extra word.

    In addition, if we change the value of the workspace pointer, we automatically start to work with a fresh set of 16 registers. If we change it back to what it was before, we'll find our previous registers with their original content. This is very useful for context switching, i.e. to switch from one task to another. This is what occurs during the execution of an interrupt for instance: the TMS9900 automatically switches to a workspace placed at >83C0 and "remembers" where the workspace used by the main program was. This allows it to return from the interrupt with an undisturbed set of registers.

    You can also cause such a context switch programmatically, using instructions like BLWP (Branch and Load a new Workspace Pointer), LWPI (Load Workspace Pointer from Immediate value) and RTWP (ReTurn using old Workspace Pointer).
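Since each register is a 16-bit word, finding where a register lives is simple arithmetic: the workspace pointer plus twice the register number. The Python sketch below illustrates this; `register_address` is a made-up name, and >83E0 is just an example workspace.

```python
def register_address(wp, n):
    """Memory address of register Rn for a given workspace pointer."""
    if not 0 <= n <= 15:
        raise ValueError("register must be 0..15")
    return (wp + 2 * n) & 0xFFFF   # each register is one word, i.e. 2 bytes

wp = 0x83E0                               # example workspace
print(hex(register_address(wp, 0)))       # 0x83e0
print(hex(register_address(wp, 11)))      # 0x83f6
# Changing the workspace pointer moves all 16 registers at once
print(hex(register_address(0x83C0, 11)))  # 0x83d6
```

Nothing is copied during a context switch: only the pointer changes, which is why the old registers are found intact afterwards.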


    Addressing modes

    If you followed the above, you will now have understood that a given instruction can address memory in two different ways: as a register (i.e. an offset with respect to the workspace pointer) or as an absolute address. These different ways are known as "addressing modes". There is a total of eight addressing modes for the operands of the TMS9900 instructions, but three of these are reserved for special sets of instructions.

    Immediate operands: e.g. LI R0,12
    Register addressing: e.g. CLR R1
    Indirect register addressing: e.g. CLR *R1
    Auto-incrementing addressing: e.g. CLR *R1+
    Direct addressing: e.g. CLR @>2000
    Indexed addressing: e.g. CLR @>2000(R1)
    PC relative addressing: e.g. JMP $+4
    CRU relative addressing: e.g. SBO 0


    Immediate operands

    These only apply to a restricted set of instructions. An immediate value is a number that immediately follows the instruction in the program. This value is to be used as an operand.

    We already encountered one of these: the LI instruction that appeared in many examples above:

          LI  R0,>0001     Loads number 1 into R0

    In machine language this corresponds to:
    >0200
    >0001.

    Where >020x encodes the LI opcode, >xxx0 encodes register 0 and the extra word >0001 encodes the immediate value.
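The same encoding can be written as a tiny Python function. This is only a sketch built from the numbers given above; `encode_li` is an invented name, and the handling of negative values (stored in two's complement) is an assumption consistent with the signed convention described for the status register.

```python
LI_BASE = 0x0200   # >020x encodes the LI opcode

def encode_li(register, value):
    """Return the two words for 'LI Rn,value'."""
    if not 0 <= register <= 15:
        raise ValueError("register must be 0..15")
    if not -0x8000 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    return [LI_BASE | register, value & 0xFFFF]

print([hex(w) for w in encode_li(0, 0x0001)])   # ['0x200', '0x1']
print([hex(w) for w in encode_li(1, -12)])      # ['0x201', '0xfff4']
```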

    The "immediate" opcodes are:
    LI R1,123 (load immediate value)
    AI R1,-12 (add immediate value)
    ANDI R1,>1FFF (and immediate value)
    ORI R1,>8040 (or immediate value)
    CI R1,25 (compare with immediate value)
    LWPI >83E0 (load workspace pointer with immediate value)
    LIMI 2 (load interrupt mask with immediate value)

    As you can see, they all end with letter I (and no other opcode does). They are thus pretty easy to recognize and to remember.

    Most have two operands. The first one must be a register (no indirect, no indexing, just plain register addressing). LWPI and LIMI have only one operand because the register is implicitly defined by the instruction: the Workspace pointer register and the Status register of the TMS9900, respectively.

    The second operand must be an immediate value: it's a requirement, not a choice.

    By contrast, other opcodes generally have the choice between the 5 other addressing modes. There are a few exceptions to this rule though: some opcodes can only deal with registers, for no apparent reason. That's because the TI designers ran out of numbers to encode all these possibilities in machine language.


    Register addressing

          NEG   R0    Negates the content of R0

    This one is already an old friend: the operand is simply one of the sixteen registers in the current workspace. They are abbreviated R0 through R15.

    Note that the "R" is optional with the TI assembler. In fact, to use the "R" you must set a special option when launching the assembler. I didn't know that when I learned assembly (using a hacked assembler, with no manual) and I therefore got the habit of not using the R. I had a hard time remembering to add it while I was typing the examples for these pages...


    Indirect register addressing

          NEG  *R1   Negates the content of the word pointed at by R1

    Here the register is not the final target of the instruction. Instead, it contains the address of the target (i.e. it is a pointer to it).

    In the above example, if R1 contains >A123 the instruction will result in negating the word at address >A123.

    This allows you to operate on different addresses according to the situation, even though the exact address could not be predicted by the time you wrote the program.


    Indirect auto-incrementing register addressing

          NEG  *R1+   Negates the word pointed at by R1, then adds 2 to R1

    This is almost exactly the same as the above, with one exception: once the instruction is completed, the content of R1 will be incremented (by two in this case, because we are dealing with words).

    That comes in extremely handy for copy loops for instance:

           LI   R1,>2000      We want to copy this memory area
           LI   R2,>A000      Into that one
           LI   R0,>1000      That's the number of bytes to copy
    L1     MOV  *R1+,*R2+     Copy one word
           DECT R0            We decrement by two because 1 word is 2 bytes
           JGT  L1            DECT automatically compares R0 with 0.
    *                         If it's greater we jump to L1, otherwise we come here

    Do you see how it works?
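If you prefer to watch it run, the loop can be replayed step by step in Python. This is a hedged model, not an emulator: memory is a plain 64K `bytearray`, and the function name `copy_loop` is made up.

```python
def copy_loop(memory, src=0x2000, dst=0xA000, count=0x1000):
    """Step-by-step model of the copy loop; memory is a 64K bytearray."""
    r1, r2, r0 = src, dst, count           # the three LI instructions
    while True:
        # MOV *R1+,*R2+ : copy one word (2 bytes), then bump both pointers by 2
        memory[r2:r2 + 2] = memory[r1:r1 + 2]
        r1 = (r1 + 2) & 0xFFFF
        r2 = (r2 + 2) & 0xFFFF
        # DECT R0 : subtract 2; the result is compared to zero as a signed value
        r0 = (r0 - 2) & 0xFFFF
        signed_r0 = r0 - 0x10000 if r0 & 0x8000 else r0
        # JGT L1 : loop again only while R0 is greater than zero
        if not signed_r0 > 0:
            break
    return r1, r2, r0

memory = bytearray(0x10000)
memory[0x2000:0x3000] = bytes(range(256)) * 16    # something recognizable to copy
print([hex(r) for r in copy_loop(memory)])        # ['0x3000', '0xb000', '0x0']
print(memory[0xA000:0xB000] == memory[0x2000:0x3000])   # True
```

One thing the model makes visible: because JGT is a signed test, the loop only works as intended for byte counts below >8000.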


    Direct addressing

    A.k.a. Symbolic addressing

          NEG  @>2000   Negates the word at address >2000 

    That one is easy to understand: you just use the memory address of the word you want to target. Optionally, you could designate this word with a label, hence the name "symbolic":

          NEG  @COUNTER    Assuming you define COUNTER as >2000 

    Finally, most assemblers will let you enter arithmetic expressions like:

          NEG  @COUNTER+4    Negates the word at address COUNTER + 4

    The assembler calculates the value of COUNTER+4 (in our case >2004) at assembly time and encodes the instruction as NEG @>2004. One more example of the useful things an assembler can do for you.


    Indexed addressing

          NEG  @>2000(R1)  Negates the word at address: >2000 plus the value of R1 

    This one kind of combines @>2000 with *R1: the target address is calculated by adding the direct (a.k.a. symbolic) address and the content of R1.

    For instance, if R1 contains 2, the above instruction negates the word at >2002.
    If R1 contains >0548, it negates the word at >2548, etc.

    Note that R0 cannot be used as an index (this is because NEG @>2000 is in fact encoded as NEG @>2000(R0), the TMS9900 knows that in this case it must not use R0).

    This addressing mode is useful in two situations:

          LI   R1,8           Loads the number 8 into R1
          NEG  @TABLE(R1)     Negates entry 8 in the table

    In this case, the symbolic part of the operand refers to a table of values you placed in memory. R1 represents an index inside that table (hence the name of this addressing mode). Of course, we could have written NEG @TABLE+8, but in the above example, the content of R1 can vary. Therefore, the NEG instruction can apply to one or the other word in the table, according to the value of R1.

    The second situation is the "mirror image" of the first one:

          LI   R1,LIST        Makes R1 a pointer to LIST
          NEG  @8(R1)         Negates entry 8 in the list

    It does the same job as the above: which one you want to use is mainly a matter of taste. Sometimes you prefer to think "I'm targeting the 8th entry in the table" and sometimes you like better "I'm dealing with the 8th byte after the current position in the list".
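The five general addressing modes seen so far all boil down to "where is the word I am working on?". The Python sketch below answers that question for each mode. It is an illustration under stated assumptions: memory is a 64K `bytearray` with the most significant byte of each word at the even address, registers live at the workspace pointer, and the mode names ('R', '*', '*+', '@') and function names are invented for the example.

```python
def read_word(memory, address):
    address &= 0xFFFE
    return (memory[address] << 8) | memory[address + 1]

def write_word(memory, address, value):
    address &= 0xFFFE
    memory[address] = (value >> 8) & 0xFF
    memory[address + 1] = value & 0xFF

def operand_address(memory, wp, mode, reg, address=0):
    """Return the address of the word targeted by an operand.

    mode: 'R'   register            Rn
          '*'   indirect            *Rn
          '*+'  auto-incrementing   *Rn+
          '@'   direct / indexed    @address or @address(Rn)
    """
    reg_addr = (wp + 2 * reg) & 0xFFFF          # registers live in memory
    if mode == "R":
        return reg_addr
    if mode == "*":
        return read_word(memory, reg_addr)
    if mode == "*+":
        target = read_word(memory, reg_addr)
        write_word(memory, reg_addr, (target + 2) & 0xFFFF)  # word access: +2
        return target
    if mode == "@":
        # With R0 there is no indexing at all: plain direct addressing
        index = read_word(memory, reg_addr) if reg != 0 else 0
        return (address + index) & 0xFFFF
    raise ValueError("unknown mode")

memory = bytearray(0x10000)
wp = 0x8300
write_word(memory, wp + 2, 0x0548)                           # R1 = >0548
print(hex(operand_address(memory, wp, "@", 1, 0x2000)))      # 0x2548
print(hex(operand_address(memory, wp, "*+", 1)))             # 0x548
print(hex(read_word(memory, wp + 2)))                        # R1 is now 0x54a
```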

    There are two more addressing modes that are only used by well defined sets of opcodes (just like immediate operands are).


    PC relative addressing

    This one is used exclusively by jump instructions:
    JMP (unconditional jump)
    JEQ (Jump if Equal, i.e. if the Equ bit is 1 in the status register)
    JNE (Jump if Not Equal, i.e. if the Equ bit is 0)
    JGT (Jump if Greater Than)
    JH (Jump if Higher)
    JHE (Jump if Higher or Equal)
    JL (Jump if Lower, i.e. if the "High" and the "Equ" bits are 0)
    JLE (Jump if Lower or Equal)
    JLT (Jump if Less Than, i.e. if the "GT" and the "Equ" bits are 0)
    JNC (Jump if No Carry)
    JOC (Jump On Carry)
    JNO (Jump if No Overflow)
    JOP (Jump on Odd Parity, i.e. if the "Par" bit is set)

    Again these are easy to recognise: they all begin with "J" (no other opcode does).
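The conditions can be summarised as a small Python table. The source above spells out the bits tested by JEQ, JNE, JL, JLT and JOP; the other entries are filled in by analogy from my understanding of the TMS9900 and should be read as assumptions. The status register is modelled as a plain dictionary, and `jump_taken` is an invented name.

```python
JUMP_CONDITIONS = {
    "JMP": lambda st: True,
    "JEQ": lambda st: st["Equ"] == 1,
    "JNE": lambda st: st["Equ"] == 0,
    "JGT": lambda st: st["GT"] == 1,
    "JH":  lambda st: st["High"] == 1 and st["Equ"] == 0,
    "JHE": lambda st: st["High"] == 1 or st["Equ"] == 1,
    "JL":  lambda st: st["High"] == 0 and st["Equ"] == 0,
    "JLE": lambda st: st["High"] == 0 or st["Equ"] == 1,
    "JLT": lambda st: st["GT"] == 0 and st["Equ"] == 0,
    "JNC": lambda st: st["Carry"] == 0,
    "JOC": lambda st: st["Carry"] == 1,
    "JNO": lambda st: st["Ovf"] == 0,
    "JOP": lambda st: st["Par"] == 1,
}

def jump_taken(opcode, status):
    """Tell whether a given jump would be taken for these status bits."""
    return JUMP_CONDITIONS[opcode](status)

# Status bits after comparing >FFFF to >0001: High=1, GT=0, Equ=0
status = {"High": 1, "GT": 0, "Equ": 0, "Carry": 0, "Ovf": 0, "Par": 0}
print([name for name in JUMP_CONDITIONS if jump_taken(name, status)])
# ['JMP', 'JNE', 'JH', 'JHE', 'JLT', 'JNC', 'JNO']
```

Note how JH and JLT are both taken for the same comparison: >FFFF is "higher" than 1 as an unsigned number and "less than" 1 as a signed one.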

    You will note however that there are some missing: what about JLTE and JGTE? And why isn't there a "jump if overflow" or a "jump on even parity"? Again that's because the machine language is running out of valid numbers.

    But the missing ones can always be built from existing instructions. For instance, "jump if greater than or equal" is just a JGT followed by a JEQ. Don't simply swap the operands of the comparison and use JLT: that would miss the case where both values are equal.

          C    R1,R4     Compare R1 to R4
          JGTE SK1       Error: "illegal opcode"

          C    R1,R4     Same comparison
          JGT  SK1       Jump if R1 is greater than R4...
          JEQ  SK1       ...or if they are equal

    Similarly, you could compensate for the missing "jump if overflow" by changing the structure of your program:

           A    R1,R4         Add R1 to R4
           JIO  SK1           Jump if overflow? Does not exist!
           MOV  R4,R2         Continue if no overflow
           ...
    SK1    NEG  R1            Do something if overflow occurred

           A    R1,R4         Let's try again
           JNO  SK1           Jump if no overflow
           NEG  R1            React to overflow here
           ...
    SK1    MOV  R4,R2         And continue here if no overflow

    The main advantage of jump instructions is that their destination can be encoded together with the opcode. How is this possible? Didn't I just mention that a memory address uses up 16 bits and therefore always requires an extra word in the program?

    Yes, but you see, jumps do not use memory addresses. Instead they jump to a given distance from the current instruction: plus or minus 127 words. This allows all jump values to be encoded in a single byte and leaves the other 8 bits for the opcode. BUT it means that we cannot jump everywhere in the program: the destination address must be within the 127-word limit (actually it's 128 words forwards and 127 backwards). Here again, the assembler will calculate this for you and issue an error message if the jump cannot be encoded. This "out of range" message is one of the most frequent you will encounter...

    There are two ways to specify a jump in assembly. First you can type the displacement yourself:

          C    R1,R0    Compare R1 to R0
          JLT  $+4      Jump 4 bytes down if R1 is less than R0 (signed)
          JMP  $-120    Jump 120 bytes up otherwise (replaces a JGTE)

    The $ sign stands for "current value of the program counter". The assembler then knows that the offset you specified is relative to the current instruction. (FYI: in machine language, jumps are counted in WORDS relative to the NEXT instruction: therefore a JMP $+4 is encoded as >1001, not as >1004. But the assembler takes care of this for you.)
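The conversion the assembler performs can be sketched in a few lines of Python. The only number taken from the text is >1001 for JMP $+4; the function name `encode_jump` is invented, and the displacement is assumed to be a signed byte, which is what gives the "128 words forwards, 127 backwards" limits.

```python
JMP_BASE = 0x1000   # >10xx, as in the JMP $+4 example

def encode_jump(base, offset_bytes):
    """Encode a jump written as $+offset_bytes (relative to the jump itself)."""
    if offset_bytes % 2:
        raise ValueError("instructions sit at even addresses")
    # The machine counts in words, starting from the NEXT instruction ($+2)
    displacement = (offset_bytes - 2) // 2
    if not -128 <= displacement <= 127:
        raise ValueError("out of range")
    return base | (displacement & 0xFF)

print(hex(encode_jump(JMP_BASE, 4)))      # 0x1001
print(hex(encode_jump(JMP_BASE, -120)))   # 0x10c3
```

Calling `encode_jump(JMP_BASE, 258)` raises the "out of range" error: 129 words forward is one word too far.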

    In the above example, it's relatively easy to figure that JLT $+4 jumps over the JMP. But who wants to count 120 bytes upwards to figure out where the JMP is going to land?

    Therefore, the assembler lets you use a label to specify the destination. The value of this label is most simply defined by placing it in the left margin of the target instruction: the assembler automatically assigns the current value of the program counter (i.e. the memory location of the opcode) to such labels.

           JNE  OK            Perform some kind of test
           JLT  UHOH          And another one here
           ...                Continue here if both are negative
    OK     MOV  R1,R2         Here if first test hit
           ...
    UHOH   NEG  @>2000        And here if second jump taken

    Note that, contrary to branching instructions (e.g. B @OK), you must not include the @ sign before a label in the case of a jump. This is useful to remind you that you are restricted to a limited range of motion.


    CRU relative addressing

    The CRU (Communication Register Unit) is a special hardware trick that allows the TMS9900 to communicate with pieces of equipment without using the data bus. Conceptually, it could be viewed as a set of 4096 wires linking the microprocessor to peripherals (in fact it uses only three lines, plus the address bus, but that's not the point).

    You can also view it as a set of 4096 bits, one for each "wire". By reading a bit you can test the status of a wire, by writing to a bit you can change the status of a wire. Provided of course that the peripheral in question is designed to allow these operations: there could be read-only bits, write-only bits and not-used bits!

    There are five CRU operations:

    TB (Test Bit: transfer the bit into the Equ bit of the status register)
    SBO (Set Bit to One)
    SBZ (Set Bit to Zero)
    LDCR (Load CRU: lets you modify up to 16 bits at a time)
    STCR (Store CRU: lets you read up to 16 bits at a time)

          LI   R12,>1300     CRU base of the RS232 card
          SBO  7             Turn light on
          TB   7             Read it back
          JEQ  OK            Jump if bit is 1 (not zero)

    Note that TB is somewhat misleading: JEQ jumps if the bit is 1 and JNE if the bit is zero. That's because the bit is copied into the Equ bit (that normally stands for "equals zero" in automatic comparisons).

    So what's the use of the LI R12,>1300 instruction? Well, instead of numbering the bits from zero, the TMS9900 numbers them from the value found in R12, divided by two (because the last address line does not exist on a word-oriented microprocessor, we cannot encode odd addresses).

    That's useful when dealing with peripherals: each card is assigned a chunk of 128 bits at addresses starting at >1000, >1100, >1200, etc. up to >1F00 (addresses >0000 to >0FFF are internal to the console). If you want to turn on the disk controller card, whose CRU address is >1100, you may do:

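          LI   R12,>0000
          SBO  >880          That's >1100 divided by two

    But that's kinda hard to remember, right? (And it only serves to illustrate the point: the bit number in SBO is encoded as a signed byte, so a value as big as >880 would not actually fit in the instruction.)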

    Now you could do the same with:

          LI   R12,>1100
          SBO  0             By convention, the first bit of a card turns it on

    Which is somewhat easier to use: the bits are relative to the card, not to the whole CRU address space.

    And when it comes to LDCR and STCR, you don't even have a choice: you must use the second style since you can't specify the starting bit in the instruction (it's always zero).

          LI   R12,>1102
          LDCR R1,3          Load 3 CRU bits from R1 into the disk controller card

    Because >1102 accesses bit 1 of the disk controller card (remember: the content of R12 is the bit number times 2), the above example affects the values of bits 1, 2 and 3 of the controller card.
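The bit arithmetic is easy to get wrong, so here is a short Python sketch of it. It only models the numbering described above (R12 divided by two, plus the bit number given in the instruction); the names `cru_bit` and `ldcr_bits` are made up.

```python
def cru_bit(r12, displacement=0):
    """Absolute CRU bit number reached from a base in R12."""
    return ((r12 & 0xFFFF) >> 1) + displacement

def ldcr_bits(r12, count):
    """Absolute CRU bits touched by LDCR/STCR: they always start at bit 0 of the base."""
    return [cru_bit(r12) + i for i in range(count)]

# Both ways of reaching the first bit of the disk controller card
print(hex(cru_bit(0x0000, 0x880)), hex(cru_bit(0x1100, 0)))   # 0x880 0x880

# LI R12,>1102 followed by LDCR R1,3
card_base = cru_bit(0x1100)
print([bit - card_base for bit in ldcr_bits(0x1102, 3)])      # [1, 2, 3]
```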

    Note that with LDCR and STCR the number of bits must be a plain number from 0 to 15, where 0 stands for 16 bits (that's one of those cases where numbers were missing...). The first operand, on the other hand, can use any of the five general addressing modes. The bits are always taken/stored from the right to the left of the register. If the operation deals with 1 to 8 bits, only the left byte of the register will be used: a byte-oriented operation is assumed. With 9 to 16 bits, the whole register is accessed.

    For a more complete explanation of the CRU, see here.


    Bytes versus words

    The TMS9900, and the TI-99/4A that is built around it, is a word-oriented, byte-addressable processor.

    What this means is that the processor mainly deals with 16-bit words, but that these words are stored in a memory that is addressed as bytes. It may seem silly, but that's what happens in most computer systems: even a 32-bit Pentium still uses a byte-addressable memory.

    With 16 bits we can encode numbers from 0 to 65535, which means that the size of the memory has to be at most 64K bytes if we want to be able to deal with an address as a single value. The memory could be twice as large if it were word-addressable, but that's not the case.

    For the processor, it means that the least significant address line (A15 in the TI numbering convention) is perfectly useless: it just represents the byte inside the word. Therefore, the TMS9900 has only 15 address lines: A0 through A14.

    Nevertheless, the CPU can deal with byte values if necessary. It just uses an internal trick: when you write a value to a byte, the processor reads the whole word, modifies the byte you were accessing, and writes the word back. There is a bunch of opcodes that allow such byte-oriented operations: AB, SB, MOVB, CB, SOCB, SZCB and in some cases LDCR and STCR.

    When the operand is a register, the only byte that you can access in this manner is the leftmost one, the most significant. That's the byte stored in memory at an even address. When using direct addressing, you can specify either an odd or an even address for byte operations.

    On the other hand, word operations always require an even address, since the 16-bit value is transferred as such on the data bus (and the address is always even since A15 is missing).

          LI   R1,>0104     Loads >0104 into R1
          S    R1,@>2000    Subtracts >0104 from the word at >2000-2001
          S    R1,@>2001    Ditto! It does not affect >2002 !!!
          SB   R1,@>2000    Subtracts >01 from >2000. >2001 unchanged
          SB   R1,@>2001    Subtracts >01 from >2001. >2000 unchanged

    Just remember: for word operations always use an even address.
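
    To make the point concrete, here is a rough model in Python of how word and byte accesses treat an address. It is only a sketch: the memory is a plain 64K array, the function names are mine, and status bits are ignored.

```python
memory = bytearray(0x10000)  # 64K bytes, all zero to begin with


def read_word(addr):
    addr &= 0xFFFE  # A15 does not exist: an odd address falls back on the even one
    return (memory[addr] << 8) | memory[addr + 1]  # most significant byte first


def write_word(addr, value):
    addr &= 0xFFFE
    memory[addr] = (value >> 8) & 0xFF
    memory[addr + 1] = value & 0xFF


def sub_word(register, addr):
    """Models S Rx,@addr."""
    write_word(addr, (read_word(addr) - register) & 0xFFFF)


def sub_byte(register, addr):
    """Models SB Rx,@addr: only the left byte of the register is used."""
    memory[addr] = (memory[addr] - (register >> 8)) & 0xFF


if __name__ == "__main__":
    write_word(0x2000, 0x1234)
    write_word(0x2002, 0x5678)
    sub_word(0x0104, 0x2001)  # odd address: still hits >2000-2001
    print(hex(read_word(0x2000)), hex(read_word(0x2002)))
```

    Byte operations, by contrast, use the address exactly as given: sub_byte(0x0104, 0x2001) only changes the byte at >2001.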


    Data bus multiplexing

    For some mysterious reason, the TI-99/4A designers decided to cripple their machine by using an 8-bit data bus to access most peripherals. Only the console ROMs and the 256 bytes of console RAM known as the "scratch-pad" are accessed via a 16-bit data bus.

    For the rest of the memory, special circuitry in the console multiplexes the data bus and passes each word as two 8-bit transfers. This requires the creation of an extra address line, A15, which indicates which byte is currently accessed. We therefore end up with a 16-bit address bus. Note that A15 is multiplexed with the CRU data output line, but this is not a problem during memory operations since this CRU line will be inactive.

    The big drawback is that memory access is now twice as slow, and the multiplexer circuit has to put the TMS9900 on hold until the transfer is completed!


    Opcodes

    Branching opcodes
    Arithmetic opcodes
    Bitwise logic opcodes
    Shift opcodes
    CRU opcodes
    Various opcodes
    Forbidden opcodes

    Opcode  Value  Fmt   Operands      Status
    A       >A000  I     source,dest   HGECO
    AB      >B000  I     source,dest   HGECOP
    ABS     >0740  VI    dest          HGECO
    AI      >0220  VIII  reg,immed     HGECO
    ANDI    >0240  VIII  reg,immed     HGE
    B       >0440  VI    dest          -
    BL      >0680  VI    dest          -
    BLWP    >0400  VI    dest          -
    C       >8000  I     source,dest   HGE
    CB      >9000  I     source,dest   HGE  P
    CI      >0280  VIII  reg,immed     HGE
    CKOF    >03C0  VII   -             -
    CKON    >03A0  VII   -             -
    CLR     >04C0  VI    dest          -
    COC     >2000  III   source,reg      E
    CZC     >2400  III   source,reg      E
    DEC     >0600  VI    dest          HGECO
    DECT    >0640  VI    dest          HGECO
    DIV     >3C00  IX    source,reg2       O
    IDLE    >0340  VII   -             -
    INC     >0580  VI    dest          HGECO
    INCT    >05C0  VI    dest          HGECO
    INV     >0540  VI    dest          HGE
    JEQ     >1300  II    PC-rel        E=1
    JGT     >1500  II    PC-rel        G=1
    JH      >1B00  II    PC-rel        H=1 and E=0
    JHE     >1400  II    PC-rel        H=1 or E=1
    JL      >1A00  II    PC-rel        H=0 and E=0
    JLE     >1200  II    PC-rel        H=0 or E=1
    JLT     >1100  II    PC-rel        G=0 and E=0
    JMP     >1000  II    PC-rel        Always
    JNC     >1700  II    PC-rel        C=0
    JNE     >1600  II    PC-rel        E=0
    JNO     >1900  II    PC-rel        O=0
    JOC     >1800  II    PC-rel        C=1
    JOP     >1C00  II    PC-rel        P=1
    LDCR    >3000  IV    source,nbits  HGECOP
    LI      >0200  VIII  reg,immed     HGE
    LIMI    >0300  VIII  immed         -
    LREX    >03E0  VII   -             -
    LWPI    >02E0  VIII  immed         -
    MOV     >C000  I     source,dest   HGE
    MOVB    >D000  I     source,dest   HGE  P
    MPY     >3800  IX    source,reg2   -
    NEG     >0500  VI    dest          HGECO
    ORI     >0260  VIII  reg,immed     HGE
    RSET    >0360  VII   -             -
    RTWP    >0380  VII   -             All
    S       >6000  I     source,dest   HGECO
    SB      >7000  I     source,dest   HGECOP
    SBO     >1D00  II    bit           -
    SBZ     >1E00  II    bit           -
    SETO    >0700  VI    dest          -
    SLA     >0A00  V     reg,count     HGECO
    SOC     >E000  I     source,dest   HGE
    SOCB    >F000  I     source,dest   HGE  P
    SRA     >0800  V     reg,count     HGEC
    SRC     >0B00  V     reg,count     HGEC
    SRL     >0900  V     reg,count     HGEC
    STCR    >3400  IV    dest,nbits    HGE  P
    STST    >02C0  VIII  reg           -
    STWP    >02A0  VIII  reg           -
    SWPB    >06C0  VI    dest          -
    SZC     >4000  I     source,dest   HGE
    SZCB    >5000  I     source,dest   HGE  P
    TB      >1F00  II    bit             E
    X       >0480  VI    dest          depends
    XOP     >2C00  IX    source,xop#   X
    XOR     >2800  III   source,reg    HGE
    Illegal >0000-01FF, >0320-033F, >0780-07FF, >0C00-0FFF



    Branching opcodes

    These opcodes allow you to control the flow of your program, by jumping from one point to another, possibly remembering where to come back.

    Jxx offset

    We already discussed the jump opcodes. They are very useful since they are conditional: your program can make decisions and follow one or the other path of execution according to the value of a given status bit.


    B dest

    The first one is very easy to understand: B just loads the value of the destination operand into the program counter register of the TMS9900. As a result, the program execution continues from that address. Note that, unlike jumps, B has no range limitation: the whole 64K range is within reach. This is because the address is passed as an extra word after the opcode.

    Status bits affected: none


    BL dest

    Branch-and-Link does exactly the same thing, except that it remembers where it came from: the address of the next instruction (the one that would be executed if the branch were not taken) is placed in register R11. Then the program execution continues from the specified address: in the examples below it is marked by a label called "THERE".

    Status bits affected: none

    How to return from such a branch? Very easy: just use a B instruction with R11 as an indirect operand. Like this:

          B    *R11    Which can also be abbreviated as:
          RT

    Here we are just telling the TMS9900: branch to the address that you will find in R11. Which happens to be the return address, since BL automatically placed it there.

    Of course, you should carefully preserve this return address. If you plan to use R11, make sure you first transfer its content into another register or memory location. A very common mistake is to place another BL within a function called with a BL: the new return point will overwrite the old one.

          BL    @HERE       Branch at "HERE", place address "BACK" into R11
    BACK  NEG   R0
    HERE  BL    @THERE      Incorrect example: branch at "THERE" and place "DONE" into R11
    DONE  ABS   R0
          B     *R11        This returns at "DONE" since it's now the value in R11

    THERE B     *R11        This returns at "DONE"
    * Now here is the correct example
          BL    @HERE       Branch at "HERE", place address "BACK" into R11
    BACK  NEG   R0
    HERE  MOV   R11,R10     Save return point
          BL    @THERE      Branch at "THERE" and place "DONE" into R11
    DONE  ABS   R0
          B     *R10        This returns at "BACK" since it's the value in R10

    THERE B     *R11        This returns at "DONE" since it's the value in R11


    BLWP dest

    Branch-and-Load-Workspace-Pointer is slightly trickier: here we want to branch, but also to change the workspace. To be able to return to the calling point, we must not only remember the return address (i.e. the content of the PC register), but also the location of the old workspace (i.e. the content of the WP register). And since we are at it, we may also record the value of the status register. This way, when we come back, we can restore the situation just as it is now.

    You may think that BLWP is called with two operands, one for the address to branch at, the other for the new workspace. If that's the case, you came close but no cigar! BLWP has only one operand, a pointer: it contains the address of a pair of words that contain the new workspace and the address to branch at, in that order. It is more flexible this way since you only have to know one value (the pointer) to perform the branch.

          BLWP  @WHERE
    WHERE DATA  >8300       New workspace to be used
          DATA  HERE        Address to branch at
    HERE  MOV   R0,R1       The BLWP branches here

    So where are the three values we wanted to remember upon branching? The pointer to the old workspace is automatically placed into R13, the return address (the instruction that follows the BLWP) into R14, and the current value of the status register into R15.

    Here also, we should be careful not to overwrite them with our data. However, unlike BL, there is no risk of erasing them with a second BLWP, since that one will save its return values into R13-R15 of the NEW workspace. That's a very neat feature: it means that BLWPs can be nested as deep as we want! The drawback is that this operation is fairly slow, as the TMS9900 must do a lot of memory transfers to complete it...

    Status bits affected: none


    RTWP

    Now how do we return to the caller and restore the three TMS9900 registers? There is a dedicated instruction for that purpose: Return-with-Workspace-Pointer. Upon execution it will branch to the address found in R14, with the workspace specified in R13, and place the content of R15 in the status register. And we have just restored the situation as it was before the BLWP was taken.

    That's an ideal feature for a multi-tasking operating system. Too bad there isn't one around... Actually, there is one multi-tasking feature in the TI-99/4A: the interrupt service routine. Upon reception of an interrupt, the TMS9900 performs an implicit BLWP @>0004, i.e. it uses the values found at >0004 and >0006 in the console ROM as workspace (>83C0) and address (>0900). At this address you'll find the interrupt service routine, the subprogram that handles all the interrupts. This subprogram returns to the main program with a simple RTWP and the main program never knows it was interrupted.

    Now, if you have an "assembly freak" mind, you'll probably have thought of an alternative use for RTWP: change the content of the registers in the TMS9900. All we have to do is to load values into R13, R14 and R15 and execute a RTWP.

          LI   R13,>83E0    This will be the new workspace
          LI   R14,THERE    This will be the new address
          MOV  R1,R15       The content of R1 will be the new status
          RTWP              Do it
          LWPI >83E0        We could alter the workspace this way
          B    @THERE       We could change the address this way
    * ???                   But there is no opcode to load the status

    This trick is often used to return to a different address in case an error occurred within a procedure: just place the address of the error handling routine into R14 and the next RTWP will branch to it. It is good practice to save the previous value of R14 somewhere, in case the error handling routine needs it after all.


    XOP

    Extended operations are implicit BLWPs that use vectors located at precise addresses in the console ROMs:

    XOP 0 uses addresses >0040 and >0042 (workspace >280A, address >0C1C)
    XOP 1 uses addresses >0044 and >0046 (workspace >FFD8, address >FFF8)
    XOP 2 uses addresses >0048 and >004A (values may vary according to the ROM version), etc

    XOPs constitute an advanced feature and are discussed in the TMS9900 page.


    Parameter passing

    While we are talking about branching and calling subroutines, I would like to briefly describe the three classical ways for a subroutine to get parameters from the calling program: global variables, registers and immediate data.

    * This snippet illustrates three ways to pass parameters
    * to a subroutine called with BL
          CLR   @COUNTER     Place the value >0000 into the word at address COUNTER
          LI    R0,>01B4     Place value >01B4 into R0
          BL    @SUB1        Call a subroutine
          DATA  'BA'         Place the value 'BA' (>4241) into a data word
    BACK  MOV   R1,R0        Continue execution here
    SUB1  MOV   @COUNTER,R2  Use the value in COUNTER: global parameter
          A     R0,R2        Use the value in R0: parameter in register
          MOV   *R11+,R0     Get the value in the data word. AND change R11 to return to BACK
          LI    R1,>0456     To return a result: put it in a register
          MOV   R2,@RESULT   Or into a global

    It's not considered good programming practice to make too wide a use of global parameters. You should reserve them for critical values that are needed by many subroutines in your program: using them as globals avoids having to pass them in a register every time.

    Whether to use a register or a data word largely depends on what kind of information is passed by this parameter. If it can contain a wide range of values that vary often, better use a register. If it just serves to differentiate between a few situations, use a data word: this will save you a register, and a word in memory (the one occupied by the LI instruction in the caller).

    For instance: a routine that places a character somewhere onscreen should probably pass the character and the screen position in registers. This will make it more versatile. By contrast, a routine used to display predefined error messages may well use data words, as there won't be that many messages and they will probably always be displayed at the same address.

    Note the way the data word is retrieved in the subroutine: MOV *R11+,R0 automatically increments R11 by two, which will skip the data word upon return.

    The corresponding operations for a subroutine called with BLWP would be the following:

    * This snippet illustrates three ways to pass parameters
    * to a subroutine called with BLWP
          CLR   @COUNTER     Place the value >0000 into the word at address COUNTER
          LI    R0,>01B4     Place value >01B4 into R0
          BLWP  @SUB4        Call a subroutine, this time changing the context
          DATA  'BA'         Place the value 'BA' (>4241) into a data word
    BACK  MOV   R1,R0        Continue execution here
    SUB4  DATA  WREGS,SUB40  Use this workspace and this address
    SUB40 MOV   @COUNTER,R2  Use the value in COUNTER: global parameter
          A     *R13,R2      Use the value in old R0: parameter in register
          MOV   *R14+,R0     Get the value in the data word. AND change R14 to return to BACK
          LI    R1,>0456     To return a result: put it in a register
          MOV   R1,@2(R13)   Remember: the address of the old workspace is in R13
          MOV   R2,@RESULT   Or return the value in a global

    Did you get the trick with R13? Since R13 contains the address of the old workspace, *R13 points at the old R0, @2(R13) at the old R1, @4(R13) at the old R2, etc.


    Returning values

    If the subroutine wants to return a value, it can place it in a register, or in a global variable. Generally, one does not return values in a DATA statement (because the program could be placed in ROM, in which case the data word won't take the new value). On the other hand, the following trick is often used to indicate a special condition upon return:

          BL   @SUB2       Call a subroutine
          JMP  ERROR       In principle, returns here
          MOV  R0,R1       In fact, returns here if no error occurred
    SUB2  CI   R0,>0111    Compare R0 to >0111
          JL   ERR         We don't want it to be smaller (silly example)
          INCT R11         Change return address: skip the JMP if no error
    ERR   B    *R11        Return at the address found in R11

    SUB2 indicates to the caller that everything went OK by skipping the JMP upon return. Another way to do it would be to place the test in the caller, since the B instruction does not affect the status register:

          BL   @SUB3       Call a subroutine
          JL   ERROR       Use the comparison performed in SUB3
          MOV  R0,R1
    SUB3  CI   R0,>0111    Compare R0 to >0111
          B    *R11        Let the caller do the conditional branch

    Now what about routines called with BLWP? The first solution is identical, just replace R11 with R14:

          BLWP @SUB8       Call a subroutine
          JMP  ERROR       In principle, returns here
          MOV  R0,R1       In fact, returns here if no error occurred
    SUB8  DATA WREGS,SUB80 New workspace and branching address
    SUB80 CI   R0,>0111    Compare R0 to >0111
          JL   ERR         We don't want it to be smaller (silly example)
          INCT R14         Change return address: skip the JMP if no error
    ERR   RTWP             Return at the address found in R14

    The second solution is trickier since the RTWP discards the content of the status register and replaces it with R15. We therefore have to store the status in R15 beforehand (or some value of our choice).

          BLWP @SUB9       Call a subroutine
          JL   ERROR       Use the comparison performed in SUB9
          MOV  R0,R1
    SUB9  DATA WREGS,SUB90 New workspace and branching address
    SUB90 CI   R0,>0111    Compare R0 to >0111
          STST R15         Store the status register in R15
          RTWP             Return with the new status taken from R15

    Finally, let me remind you that you can change the return address by loading any value in R11 (or R14 for RTWP):

           BL   @ENTRY1        Call a procedure
    BACK   ABS  @TABLE(R2)     That will return here
    ENTRY2 LI   R11,ERROR      Special entry point: always return to error
    ENTRY1 MOV  R0,R0          Regular entry point: return provided by BL
           B    *R11


    Arithmetic opcodes

    The following opcodes can be used to perform integer math. Unless otherwise indicated, the source and destination operands can be accessed in any of the five main addressing modes: Rx, *Rx, *Rx+, @xxxx and @xxxx(Rx).

    MOV source,dest

    Copies the content of the source operand into the destination operand and compares it to zero.
    Status bits affected: High, Gt, Equ


    MOVB source,dest

    Same as MOV, but affects only the leftmost (most significant) byte of the operands.
    Status bits affected: High, Gt, Equ, Parity


    LI register,immediate

    Loads the immediate value into the destination register and compares it to zero.
    Status bits affected: High, Gt, Equ


    A source,dest

    Adds the content of the source operand to the destination operand and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf


    AB source,dest

    Same as A, but affects only the leftmost (most significant) byte of the operands.
    Status bits affected: High, Gt, Equ, Carry, Ovf, Parity


    AI register,immediate

    Adds an immediate value to the destination register.
    Status bits affected: High, Gt, Equ, Carry, Ovf

    NB. There is no SI (subtract immediate) instruction, but you can use AI with negative values: AI R0,-5


    S source,dest

    Subtracts the content of the source operand from the destination operand and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf


    SB source,dest

    Same as S, but affects only the leftmost (most significant) byte of the operands.
    Status bits affected: High, Gt, Equ, Carry, Ovf, Parity


    C source,dest

    Compares the content of the source operand to that of the destination operand.
    Status bits affected: High, Gt, Equ


    CB source,dest

    Same as C, but affects only the leftmost (most significant) byte of the operands.
    Status bits affected: High, Gt, Equ, Parity


    CI register,immediate

    Compares the register to an immediate value.
    Status bits affected: High, Gt, Equ


    DEC dest

    Decrements the destination operand by 1 and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf


    DECT dest

    Decrements the destination operand by 2 and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf


    INC dest

    Increments the destination operand by 1 and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf


    INCT dest

    Increments the destination operand by 2 and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf


    NEG dest

    Negates the destination operand and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf


    ABS dest

    Takes the absolute value of the destination operand (i.e. negates it if it is less than 0) and compares the result to zero.
    Status bits affected: High, Gt, Equ, Carry, Ovf

    NB There are no DECB, DECTB, INCB, INCTB, NEGB, nor ABSB byte-oriented operations.


    MPY source,register

    Multiplies the source operand by the destination register. The result can be up to 2 words long (32 bits) and is therefore placed into the destination register and the following register! (If R15 is the destination, the next memory word after the workspace is used).
    Status bits affected: none

    Example: R0 * R1 --> [R1-R2]

          LI   R0,5
          LI   R1,10
          MPY  R0,R1       Now R1-R2 contains 50
    *                      i.e. R1 is 0, R2 is 50 (>0032)


    DIV source,register

    Divides the 2-word value found in the destination register and the following register by the content of the source operand. The result is placed in the destination register, the remainder in the next register. If the source operand is smaller than or equal to the destination register, the result would not fit in one word: the Ovf bit is set and the registers are left unchanged.
    Status bits affected: Ovf

    Example: [R1-R2] / R0 --> R1, Remainder in R2

          LI   R0,4
          LI   R1,0
          LI   R2,11       R1-R2 contains >0000 000B (i.e. 11)
          DIV  R0,R1       Now R1 is >0002
    *                      and R2 is >0003 (remainder)
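
    For those who prefer to see the arithmetic spelled out, here is a small Python model of MPY and DIV. It is a sketch under my own assumptions: the function names are invented, all values are treated as unsigned, and the overflow test (divisor smaller than or equal to the high word of the dividend) is the rule as I understand it from the TMS9900 documentation.

```python
def mpy(source, register):
    """Models MPY: returns the pair (Rn, Rn+1) holding the 32-bit product."""
    product = (source & 0xFFFF) * (register & 0xFFFF)  # unsigned 16 x 16 bits
    return product >> 16, product & 0xFFFF


def div(source, high, low):
    """Models DIV: returns (Rn, Rn+1, overflow)."""
    if source <= high:
        # The quotient would need more than 16 bits (this also catches a
        # division by zero): nothing is changed and Ovf is set
        return high, low, True
    dividend = (high << 16) | low
    return dividend // source, dividend % source, False


if __name__ == "__main__":
    print(mpy(5, 10))
    print(div(4, 0, 11))
    print(div(4, 8, 3))  # high word too big for the divisor: overflow
```

    The last call shows why the high word matters: with 8 in the first register, the dividend is 8 * 65536 + 3 and the quotient could never fit in 16 bits.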


    Bitwise logic opcodes

    The following operations deal with operands on a bitwise basis, i.e. the operands are not considered as words or bytes but as a collection of individual bits. The change of a given bit will never affect neighbouring bits.


    CLR dest

    Resets (clears) all bits in the destination operand to zero. The result is >0000.
    Status bits affected: none


    SETO dest

    Sets all bits in the destination operand to one. The result is >FFFF.
    Status bits affected: none


    INV dest

    Inverts all bits in the destination operand (logical NOT) and compares the result to zero.
    A 0 bit becomes a 1, a 1 bit becomes a 0.
    inv 0 = 1
    inv 1 = 0

    Status bits affected: High, Gt, Equ


    ANDI register,immediate

    Ands the bits in the destination register with those in the immediate value and compares the result to zero.
    If both the source and destination bit are 1, the resulting bit will be 1. Otherwise it will be 0.
    0 andi 0 = 0
    0 andi 1 = 0
    1 andi 0 = 0
    1 andi 1 = 1

    Status bits affected: High, Gt, Equ


    ORI register,immediate

    Ors the bits in the destination register with those in the immediate value and compares the result to zero.
    If both the source and destination bit are 0, the resulting bit will be 0. Otherwise it will be 1.
    0 ori 0 = 0
    0 ori 1 = 1
    1 ori 0 = 1
    1 ori 1 = 1

    Status bits affected: High, Gt, Equ


    XOR source, register

    Exclusive OR of the bits in the source operand with those in the destination register. The result is compared to zero.

    If a bit is 1 in either the source or the destination operand, but not both, it will be one. Bits that are identical in both operands are reset to 0:
    0 xor 0 = 0
    0 xor 1 = 1
    1 xor 0 = 1
    1 xor 1 = 0

    Status bits affected: High, Gt, Equ

    NB There is no XORI instruction


    SOC source,dest

    Sets to 1 all bits in the destination operand that correspond to a 1 in the source operand and compares the result to 0. The other bits are unchanged.

    Set Ones Corresponding is therefore the equivalent of a logical OR:
    0 soc 0 = 0
    0 soc 1 = 1
    1 soc 0 = 1
    1 soc 1 = 1

    Status bits affected: High, Gt, Equ

    Example:

          LI   R0,>0401
          LI   R1,>1021
          SOC  R0,R1       R1 now contains >1421


    SOCB source,dest

    Same as SOC, but affects only the leftmost (most significant) byte of the operands.
    Status bits affected: High, Gt, Equ, Parity



    SZC source,dest

    Resets to 0 all bits in the destination operand that correspond to a 1 in the source operand and compares the result to 0. The other bits are unchanged.

    Set Zeros Corresponding is therefore the equivalent of a logical NOT-A AND B:
    0 szc 0 = 0
    0 szc 1 = 1
    1 szc 0 = 0
    1 szc 1 = 0

    Status bits affected: High, Gt, Equ

    Example:

          LI   R0,>0401
          LI   R1,>1420
          SZC  R0,R1       R1 now contains >1020


    SZCB source,dest

    Same as SZC, but affects only the leftmost (most significant) byte of the operands.
    Status bits affected: High, Gt, Equ, Parity


    COC source,register

    Checks whether all bits in the destination operand that correspond to a 1 in the source operand are 1. Sets the Equ bit if this is the case. All bits are unchanged.

    Compare Ones Corresponding therefore performs a masked comparison with >FFFF:
    0 coc 0 --> not considered
    0 coc 1 --> not considered
    1 coc 0 --> Equ will be 0 if this happens at least once
    1 coc 1 --> ok so far.

    Status bits affected: Equ

    Example:

          LI   R0,>0401
          LI   R1,>1021
          LI   R2,>4481
          COC  R0,R1       Equ = 0 (because >0400 is not set in R1)
          JEQ  SK1         The jump is not taken
          COC  R0,R2       Equ = 1
          JEQ  SK2         The jump is taken


    CZC source,register

    Checks whether all bits in the destination operand that correspond to a 1 in the source operand are 0. Sets the Equ bit if this is the case. All bits are unchanged.

    Compare Zeros Corresponding therefore performs a masked comparison with >0000:
    0 czc 0 --> not considered
    0 czc 1 --> not considered
    1 czc 0 --> ok so far
    1 czc 1 --> Equ will be 0 if this happens at least once.

    Status bits affected: Equ

    Example:

          LI   R0,>0401
          LI   R1,>1021
          LI   R2,>4080
          CZC  R0,R1       Equ = 0 (because >0001 is not reset in R1)
          JEQ  SK1         The jump is not taken
          CZC  R0,R2       Equ = 1
          JEQ  SK2         The jump is taken

    NB There is no COCB, nor CZCB byte-oriented instruction.
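
    The four "corresponding" operations boil down to simple masks. Here is a Python sketch of them (word versions only, status bits other than Equ ignored). The function names mirror the opcodes but are otherwise my own invention.

```python
MASK = 0xFFFF  # everything is a 16-bit word


def soc(source, dest):
    """Set Ones Corresponding: logical OR."""
    return (dest | source) & MASK


def szc(source, dest):
    """Set Zeros Corresponding: NOT-source AND dest."""
    return dest & ~source & MASK


def coc(source, register):
    """Compare Ones Corresponding: returns the Equ bit as a boolean."""
    return (register & source & MASK) == (source & MASK)


def czc(source, register):
    """Compare Zeros Corresponding: returns the Equ bit as a boolean."""
    return (register & source & MASK) == 0


if __name__ == "__main__":
    print(hex(soc(0x0401, 0x1021)), hex(szc(0x0401, 0x1420)))
    print(coc(0x0401, 0x1021), coc(0x0401, 0x4481))
    print(czc(0x0401, 0x1021), czc(0x0401, 0x4080))
```

    The values are the same as in the assembly examples of this section, so the results can be compared with the comments there.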



    Shift opcodes

    The following operations shift the content of a register, i.e. move all bits in it towards the left or the right.

    Shifting one position to the left corresponds to a multiplication by two (just as moving the decimal point by one position corresponds to a multiplication/division by 10). Shifting to the right corresponds to a division by two. However, in the latter case we may have to take the sign bit into account, if we are dealing with signed values. There are therefore two shift-right instructions: one for logical operands, one for arithmetic operands.

    For each you can specify by how many positions you want to shift the bits: legal values are 1 to 15. Shifting by 0 has of course no effect, so this value is interpreted differently: if you specify a shift by 0, the number of positions is taken from the four rightmost bits of R0 (i.e. a value from 0 to 15). If this value is zero, the bits will be shifted by 16 positions.


    SLA register,count

    Shifts the content of the register by count positions to the left, filling up the bits on the right with 0. The result is compared to zero, the last bit shifted out is placed in the carry bit.

    Status bits affected: High, Gt, Equ, Carry, Ovf (=1 if the sign bit changes during the shift)

    NB There is no SLL (Shift Left Logically) because it's equivalent to SLA (Shift Left Arithmetic).

    Example:

          LI   R0,>0401
          SLA  R0,1        R0 now contains >0802
          SLA  R0,4        R0 now contains >8020
          SLA  R0,1        R0 now contains >0040 and the Carry is 1


    SRL register,count

    Shifts the content of the register by count positions to the right, filling up the bits on the left with 0. The result is compared to zero, the last bit shifted out is placed in the carry bit.

    Status bits affected: High, Gt, Equ, Carry

    Example:

          LI   R1,>8401
          SRL  R1,1        R1 now contains >4200 and Carry is 1
          SRL  R1,4        R1 now contains >0420 and Carry is 0
          LI   R0,1        Shift by one
          SRL  R1,0        R1 now contains >0210 (shift value taken from R0)


    SRA register,count

    Shifts the content of the register by count positions to the right, filling up the bits on the left with copies of the sign bit. The result is compared to zero, the last bit shifted out is placed in the carry bit.

    Status bits affected: High, Gt, Equ, Carry

    Example:

          LI   R1,>8002      (i.e. -32766)
          SRA  R1,1          R1 now contains >C001 (i.e. -16383, correct division by 2)
    *     SRL  R1,1          By contrast, a SRL would give >4001, which is 16385!


    SRC register,count

    Shifts the content of the register by count positions to the right, filling up the bits on the left with those ejected on the right. The result is compared to zero, the last bit shifted out is placed in the carry bit.

    Status bits affected: High, Gt, Equ, Carry

    NB There is no SLC (Shift Left Circular), but you can do it with SRC (Shift Right Circular): just subtract the desired displacement from 16. For instance, SLC R1,5 is encoded by SRC R1,11.

    Example:

          LI   R1,>0401
          SRC  R1,1        R1 now contains >8200 and Carry is 1
          SRC  R1,4        R1 now contains >0820 and Carry is 0
          SRC  R1,12       R1 now contains >8200 (equivalent to SLC R1,4)
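
    If you want to experiment with the four shifts without a TI-99/4A at hand, the following Python sketch models them. Each function returns the new register value and the carry bit. The names, and the helper that works out the count when 0 is specified, are my own and only meant as an illustration; counts are expected to be between 1 and 16.

```python
def effective_count(count, r0):
    """Number of positions actually shifted, given the count in the instruction."""
    if count != 0:
        return count
    from_r0 = r0 & 0x000F      # only the four rightmost bits of R0 are used
    return from_r0 or 16       # and zero means 16 positions


def sla(value, count):
    shifted = (value & 0xFFFF) << count
    return shifted & 0xFFFF, (shifted >> 16) & 1


def srl(value, count):
    value &= 0xFFFF
    return value >> count, (value >> (count - 1)) & 1


def sra(value, count):
    value &= 0xFFFF
    signed = value - 0x10000 if value & 0x8000 else value  # keep the sign
    return (signed >> count) & 0xFFFF, (value >> (count - 1)) & 1


def src(value, count):
    value &= 0xFFFF
    rotated = ((value >> count) | (value << (16 - count))) & 0xFFFF
    return rotated, rotated >> 15  # the last bit ejected is now the leftmost one


if __name__ == "__main__":
    for name, (result, carry) in [("SLA", sla(0x8020, 1)), ("SRL", srl(0x8401, 1)),
                                  ("SRA", sra(0x8002, 1)), ("SRC", src(0x0401, 1))]:
        print(name, hex(result), carry)
```

    Note how src(value, 16 - n) gives the same register content as a circular shift to the left by n positions.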


    CRU opcodes

    These opcodes affect the CRU. See above for the CRU addressing mode. I also have a whole page that discusses CRU issues. You may want to have a look at it.

    SBO bit

    Sets a CRU bit to 1. The bit number is a signed value from -128 to 127 and is relative to the CRU address in R12.
    Status bits affected: none


    SBZ bit

    Sets a CRU bit to 0. The bit number is a signed value from -128 to 127 and is relative to the CRU address in R12.
    Status bits affected: none


    TB bit

    Tests a CRU bit. The bit number is a signed value from -128 to 127 and is relative to the CRU address in R12. The bit is copied into the Equ bit in the status register.
    Status bits affected: Equ


    LDCR register,nbits

    Loads nbits bits to the CRU, starting from the CRU address in R12. If there are 1 to 8 bits, the bits are taken (from right to left) from the most significant byte of the source register. If there are 9 to 16 bits, they are taken (from right to left) from the whole register.

    Status bits affected: High, Gt, Equ, Carry. (Parity for 1-byte operations)


    STCR register,nbits

    Reads nbits bits from the CRU, starting from the CRU address in R12. If there are 1 to 8 bits, the bits are stored (from right to left) into the most significant byte of the destination register. If there are 9 to 16 bits, they are stored (from right to left) into the whole register.

    Status bits affected: High, Gt, Equ, Carry. (Parity for 1-byte operations)


    Various opcodes

    Finally, here are some opcodes that I couldn't fit in any of the above categories.

    LWPI immediate

    Loads an immediate value into the workspace pointer register of the TMS9900, effectively changing the workspace. Just make sure that immediate does not correspond to an address in ROM !

    Status bits affected: none


    STWP register

    Stores the content of the workspace pointer register of the TMS9900 (i.e. the address of the current workspace) into the destination register.

    Status bits affected: none

    NB. There is no LDWP, so if you want to change the value of the workspace you have two solutions:

    * Assume we want to use the value in R0 for our new workspace
    * We have two solutions:
    * 1) Put the value in our program and perform a LWPI
           MOV  R0,@HERE+2    Replaces the >0000 below with our value
    HERE   LWPI >0000         Loads the new workspace pointer
    * 2) Put the value in R13 and perform a dummy RTWP
           STST R15           Make sure the status won't be affected
           LI   R14,CONT      Make sure we go on normally
           MOV  R0,R13
           RTWP
    CONT   ...                Execution continues here with our new workspace


    STST register

    Stores the status register of the TMS9900 into the destination register.

    Status bits affected: none

    NB There is no LDST to load a value in the status register. But you can put it in R15 and perform a RTWP (just make sure that R13 contains the proper workspace, and R14 a valid address).

    * Assume we want to put the value in R0 into the status register
    * Solution: Put the value in R15 and perform a dummy RTWP
           STWP R13           Make sure the workspace won't change
           LI   R14,CONT1     Make sure we go on normally
           MOV  R0,R15
           RTWP
    CONT1  ...                Execution continues here with new status


    LIMI immediate

    Loads a value from 0 to 15 into the interrupt mask part of the status register. Only interrupts with a level equal to or smaller than this value will be recognized. On the TI-99/4A all interrupts are hardwired as level 1. Therefore:

    LIMI 0 disables interrupts
    LIMI 1 to LIMI 15 enable interrupts

    Status bits affected: interrupt mask
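
    The masking rule is easy to check with a few lines of Python (this is a model of the rule, not TMS9900 code). The function name interrupt_recognized and the constant TI99_LEVEL are made up for this illustration; the only thing taken from the text is the "level equal to or smaller than the mask" test.

```python
def interrupt_recognized(level: int, mask: int) -> bool:
    """Return True if an interrupt of the given level passes the mask set by LIMI."""
    if not 0 <= mask <= 15:
        raise ValueError("LIMI takes a value from 0 to 15")
    return level <= mask

TI99_LEVEL = 1  # on the TI-99/4A every interrupt arrives as level 1

# Which LIMI values let a level-1 interrupt through?
enabled = [mask for mask in range(16) if interrupt_recognized(TI99_LEVEL, mask)]
print(enabled)
```

    Only a mask of 0 keeps a level-1 interrupt out.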


    SWPB dest

    Swaps the most significant and the least significant bytes of the destination operand.

    Status bits affected: none.

    Example:

           LI   R1,>A52B
           SWPB R1            R1 now contains >2BA5
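
    If you want to check a byte swap on a PC, here is a small Python sketch of the same operation. The function name swpb is just borrowed from the opcode; it only models the effect on a 16-bit value.

```python
def swpb(word: int) -> int:
    """Swap the two bytes of a 16-bit word."""
    word &= 0xFFFF
    return ((word << 8) | (word >> 8)) & 0xFFFF

print(f">{swpb(0xA52B):04X}")  # prints >2BA5
```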


    X dest

    Executes the machine language instruction whose value is found in the destination operand. If the instruction requires additional memory words for operands, they will be taken after the X instruction, not after the address of the operand.

    Status bits affected: depends on the instruction executed.

    Example (also see the page on the TMS9900)

    * This performs: B    *R11
           LI   R9,>045B      This means B *R11
           X    R9            Do it
    * This performs: LI   R1,>1234
           LI   R0,>0201      This means LI R1,xxxx
           X    R0
           DATA >1234         Value to load in R1
    * This performs: NEG  @>2000
           X    @TEST         Opcode to execute
           DATA >2000         Operand used
    TEST   NEG  @>0000


    Forbidden opcodes

    There are five opcodes that should never be used with the TI-99/4A, because they would be mistaken for CRU operations. These are:

    CKON
    CKOF
    LREX
    RSET
    IDLE

    See the TMS9900 page for details.



    Assembly-time instructions

    Besides opcodes, you can also include instructions to the assembler within your source file. These instructions may or may not generate code: some deal with a printout feature, some issue commands for the linker, etc.

    They may vary from assembler to assembler, therefore I only included the most common ones here.

    Instructions that generate data: DATA, BYTE, TEXT, or reserve room for it: BSS, BES, EVEN
    Instructions for the linker: DEF, REF
    Instructions for the loader: AORG, RORG, END
    Others: COPY


    DATA value[,value]

    This instruction is used to generate code that is not an opcode, generally to place data in your program.

    value can be any number (or math expression) between 0 and >FFFF. It can also be entered as a 2-character string constant. If needed, several data words can be placed on the same line, separated with commas.

    Note that a data word will always be loaded at an even address, on a word boundary.

    Example:

           DATA 5,6,>8001,0
           DATA 'TE'


    BYTE value[,value]

    Pretty much like DATA, except that it generates only one byte, and is therefore not limited by word boundaries.

    Value can be any number between 0 and 255, or a single-character string.


    TEXT 'string'

    This instruction is used to insert data into your program. The data is specified in the form of a quoted string.

    Example:

           TEXT 'This is a test'


    BSS nbytes
    BES nbytes

    These two instructions are used to reserve space in your program. Generally you will place data there. Contrary to the above instructions, BSS (block starting with symbol) and BES (block ending with symbol) don't place values into memory: they just tell the loader to skip nbytes before loading the next instruction.

    BSS only differs from BES when you use a label: the label value corresponds to the current address with BSS and to the current address plus nbytes with BES (i.e. the end address).

    Example

    CLRBUF LI   R1,BUFFER     Point to our buffer
           LI   R2,256        Buffer size in bytes
    L1     CLR  *R1+          Clear a word in it
           DECT R2            Decrement bytes counter
           JNE  L1            More to clear
           B    *R11
    BUFFER BSS  256           Reserve 256 bytes of memory
           MOV  R0,R1         Next procedure
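
    To make the difference between the two labels concrete, here is a small Python model of what an assembler does with its location counter. The function name reserve and the start address >A000 are assumptions chosen for the illustration, not something a real assembler exposes.

```python
def reserve(location: int, nbytes: int, directive: str) -> tuple[int, int]:
    """Return (label value, next location) for a BSS or BES of nbytes bytes."""
    end = location + nbytes
    if directive == "BSS":
        return location, end      # label marks the start of the block
    if directive == "BES":
        return end, end           # label marks the end of the block
    raise ValueError("directive must be 'BSS' or 'BES'")

# Hypothetical current address >A000, reserving 256 bytes
print(reserve(0xA000, 256, "BSS"))  # (40960, 41216)
print(reserve(0xA000, 256, "BES"))  # (41216, 41216)
```

    In both cases the next instruction is loaded at the same place; only the value of the label changes.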


    EVEN

    Is used to make sure the loading will continue from a word boundary. This instruction is useful after a TEXT statement, when you don't want to count the number of characters to know whether you must add a BYTE statement or not. EVEN issues a >00 data byte if the current address is odd; it does nothing otherwise.
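
    The padding rule can be modelled in Python as follows. This is only a sketch: the helper names text and even mimic the directives, and the program image is assumed to start on a word boundary.

```python
def text(image: bytearray, string: str) -> None:
    """Append the characters of a TEXT statement, one byte each."""
    image.extend(string.encode("ascii"))

def even(image: bytearray) -> None:
    """Mimic EVEN: pad with one >00 byte if the current address is odd."""
    if len(image) % 2:
        image.append(0x00)

image = bytearray()          # assumed to start on a word boundary
text(image, "Hello")         # 5 bytes: the next address is odd
even(image)                  # one padding byte is added
text(image, "test")          # 4 bytes: the next address is even
even(image)                  # nothing is added
print(len(image))            # 10
```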


    DEF label[,label]

    This instruction is used to pass one or more labels to the loader. The values of these labels will thereby be available for other files, or for the loader itself (to launch your program for instance).

    label must be a valid label (1 to 6 characters, the first one being not a digit) and must be defined in the source file.

    Example:

           DEF  START,INIT
           DEF  COUNT
    START  MOV  R11,R10       Entry point of my program
           ...                Long program
           B    *R10          Return to the caller
    INIT   CLR  @COUNT        Procedure available to another file
           B    *R11
    COUNT  DATA 10            Data word available to another file
           END


    REF label[,label]

    This is the mirror-image of DEF. It allows your source file to reference labels that are part of another source file. You can thereby access its variables, call its procedures, etc.

    However, you are not allowed to perform arithmetic on REF labels, since these are undefined at assembly time.

    Example:

           REF  COUNT,INIT
           BL   @INIT         Call a procedure in another file
           MOV  @COUNT,R0     Use a data word from another file
           MOV  @COUNT+2,R1   Illegal: no math allowed with REF labels
           LI   R1,COUNT      Instead, do this
           MOV  @2(R1),R1     It's ok, since the math is done at execution time


    AORG address

    This instruction will force the loader to load the program at a precise memory location. In general you won't use it, except to modify a precise data word. E.g. to place values in the non-maskable interrupt vectors at >FFFC-FFFF.


    RORG [offset]

    This instruction lets the loader determine by itself where the program should be loaded. The TI loader starts from >A000 towards >FFD8, and continues with the low-memory expansion (in which there is very limited space, since that's where the loader is located, and it also contains a symbol table for the REFs and DEFs).

    RORG can be used to cancel the effect of an AORG.

    If an offset is specified, it will be added to the current address. The effect is similar to that of the BSS instruction, except that you could specify a negative offset, thereby causing the loader to overwrite what it just loaded. (Sounds like a silly thing to do but it may be useful with paged-memory devices).


    END [label]

    This is the only instruction that must be part of any program. It tells the assembler to stop processing the source file.

    If an optional label is specified, it will be used by the loader to automatically start the program once this file is loaded. Otherwise, you must DEFine a label and enter its name when the loader asks you where to start.


    COPY "filename"

    This allows for writing very long programs. It instructs the assembler to switch to the file specified in double-quotes. Once this file is completely assembled, the assembler will resume with the current one. Generally, assemblers don't let you nest COPY statements, i.e. you cannot have a COPY in a copied file. But you can have as many COPY as you want in the initial file.



    Encoding format

    Assembly language opcodes and operands are encoded into machine language according to nine fundamental formats. Each uses up a word of memory per opcode, possibly together with one or two extra words for operands.

    In the table below, the fields of the opcode word are listed from the most significant bit (>80 of the first byte) to the least significant bit (>01 of the second byte). The number in parentheses is the width of the field in bits.

    Format  Operands         Fields
    I       source,dest      Opcode (3), B (1), Td (2), destination reg (4), Ts (2), source reg (4)
    II      PC-relative      Opcode (8), PC-relative offset in words (8)
    III     source,register  Opcode (6), register (4), Ts (2), source reg (4)
    IV      source,nbits     Opcode (6), nbits (4), Ts (2), source reg (4)
    V       register,count   Opcode (8), count (4), register (4)
    VI      dest             Opcode (10), Ts (2), source reg (4)
    VII     -                Opcode (11), 0 0 0 0 0
    VIII    register,immed   Opcode (11), 0, register (4)
    IX      source,register  Opcode (6), register (4), Ts (2), source reg (4)

    Ts and Td define the type of addressing for the source and destination operand respectively.

    00: Rx
    01: *Rx
    10: @yyyy(Rx), or @yyyy if Rx = 0. This requires an additional memory word to store the yyyy value.
    11: *Rx+

    Source and dest contain the workspace register, used in the way indicated by the addressing mode.

    Immed operands also require an additional word to store the immediate value.
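
    As an illustration, here is a Python sketch that packs the fields of formats I and VI into an opcode word. The function names are made up for the example. The field positions are those of the TMS9900 (source register in the four rightmost bits, Ts in the two bits to its left, and so on), and the base opcodes >0440 (B), >0500 (NEG) and >A000 (A) are the values implied by the examples on this page.

```python
def encode_format_i(opcode: int, ts: int, src: int, td: int, dst: int, byte: bool = False) -> int:
    """Format I: opcode (3 bits), B, Td, destination register, Ts, source register."""
    b_bit = 0x1000 if byte else 0
    return (opcode & 0xE000) | b_bit | (td << 10) | (dst << 6) | (ts << 4) | src

def encode_format_vi(opcode: int, ts: int, src: int) -> int:
    """Format VI: opcode (10 bits), Ts, register."""
    return (opcode & 0xFFC0) | (ts << 4) | src

REG, INDIRECT, INDEXED, AUTOINC = 0b00, 0b01, 0b10, 0b11   # Ts / Td values

print(f">{encode_format_vi(0x0440, INDIRECT, 11):04X}")   # B *R11  -> >045B
print(f">{encode_format_vi(0x0500, REG, 1):04X}")         # NEG R1  -> >0501
print(f">{encode_format_i(0xA000, REG, 1, REG, 2):04X}")  # A R1,R2 -> >A081
```

    Operands that need an extra word (the yyyy of mode 10, or an immediate value) are not handled here; they would simply follow the opcode word in memory.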


    Hexadecimal notation

    Decimal notation: base 10

    In almost all cultures throughout the ages, the numbering system has been based on 10 digits. This is of course because we have 10 fingers (finger is "digitus" in Latin...). The only exception I know of is the Mayas, who used 20 digits. Guess why?

    In our culture these ten digits are represented by ten symbols derived from Arabic: "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9".

    With 10 digits we can represent ten numbers: zero through nine. Now what if we want to write down a larger number? Well, we just combine two digits: one for the tens and one for the units. 23 means "two times ten, plus three", which is twenty-three. Similarly, we can add a third digit for the hundreds, a fourth for the thousands, etc.


    Hexadecimal notation: base 16

    But nothing prevents us from using more (or fewer) than 10 digits. Let's say we want to use sixteen digits instead of ten. First we need to find names and symbols for the extra six. We could come up with goofy names (borg, spam, taku, etc) and fancy symbols, but we wouldn't be able to type them from a computer keyboard. Therefore let's keep it simple and decide that the extra digits will be "A", "B", "C", "D", "E" and "F" (lower case may or may not be ok). In our new system, "A" has the value ten, "B" is eleven, etc up to "F" which is fifteen.

    To represent numbers greater than fifteen, we must combine two digits: one for the "sixteens" and one for the units. >23 means "two times sixteen, plus three", which is thirty-five. This way we can write down 16*16=256 numbers.

    To go further, we need a third digit whose value will be 256, a fourth one whose value will be 4096 (i.e. 16*16*16), etc.

    For instance: >123B means "1 times 4096, plus 2 times 256, plus 3 times 16, plus 11" which is 4667.
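
    The same "digit times weight" reading can be written as a short Python function. The name hex_value is made up for this example, and Python's built-in int("123B", 16) would do the same job; the loop is spelled out only to mirror the explanation.

```python
def hex_value(digits: str) -> int:
    """Add up each digit times its weight (1, 16, 256, 4096, ...)."""
    symbols = "0123456789ABCDEF"
    total = 0
    for position, digit in enumerate(reversed(digits.upper())):
        total += symbols.index(digit) * 16 ** position
    return total

print(hex_value("123B"))  # 4667
print(hex_value("23"))    # 35
```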


    Base indication

    We also need a way to distinguish our new numbering system, that we'll call "hexadecimal", from the good old decimal one. Obviously, any number containing digits from "A" through "F" has to be hexadecimal. But if I write 10, do I mean ten or sixteen?

    Texas Instruments adopted the following convention: any hexadecimal number must be preceded by a "greater than" sign. E.g. >1234

    Most PC people use another convention: append an h to the number: 1234h (which also allows appending a d for decimal numbers, a b for binary, etc).

    High-level languages like C and C++ use yet another convention: start any hexadecimal number with "0x". E.g. 0x1234. The "x" stands of course for hexadecimal and the leading 0 is only here so that the compiler won't mistake the number for a variable name (variable names cannot start with a digit from 0 to 9 in C/C++).


    Binary notation: base 2

    As mentioned above, we could also use fewer than 10 digits, if we wanted to. Let's say we use only two: "0" and "1". This will allow us to write down the numbers zero and one. To go further we'll need an extra digit for the pairs: 10 means "one pair, plus no unit" which is two. And 11 means "one pair, plus one unit" which is three.

    The next digit will have a weight of 4, the one after that a weight of 8, etc.

    For instance, the number 11001101 reads as "1 times 128, plus 1 times 64, plus 0 times 32, plus 0 times 16, plus 1 times 8, plus 1 times 4, plus 0 times 2, plus 1" which is 205 in decimal.

    Why would we want to use such a clumsy numbering system? Because it's easy to translate in terms of hardware: 0 means "no current" and 1 means "current flowing" for instance. Or 0 could mean "0 volts" and 1 mean "5 volts".

    That's how today's computers are built: they represent all numbers in binary and encode them as two voltage levels. In the early times of computing, there were also "analog" computers, that were using 10 different voltage levels to encode decimal numbers. But these were technically difficult to build and were completely superseded by digital computers.


    Base conversions

    As you have probably noted from the examples above, converting a hex number into a decimal number is not easy, especially with very large numbers (You have 30 seconds to calculate the decimal value of >123456789ABCDEF0).

    The problem is even worse with binary numbers: how much is 10010110101101001001010 in decimal?

    Converting decimal numbers to another base is also annoying: how much is 3333 in hexadecimal? We must do the following:

    How many times >1000 (i.e. 4096) in 3333? Zero.
    How many times >0100 (i.e. 256) in 3333? Thirteen, which is >D in hex notation. The remainder is 3333-(13*256)=5.
    How many times >0010 (i.e. 16) in 5? Zero.
    How many times >0001 (i.e. 1) in 5? Five.
    The result is thus: >0D05
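
    These four questions translate directly into a loop. Here is a Python sketch (the function name to_hex is made up, and it is limited to 16-bit values to match the four weights used above):

```python
def to_hex(number: int) -> str:
    """Convert 0..65535 to four hex digits by dividing by each weight in turn."""
    if not 0 <= number <= 0xFFFF:
        raise ValueError("expected a value from 0 to 65535")
    symbols = "0123456789ABCDEF"
    result = ""
    for weight in (4096, 256, 16, 1):
        count, number = divmod(number, weight)  # how many times, and the remainder
        result += symbols[count]
    return ">" + result

print(to_hex(3333))  # >0D05
```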

    By contrast, conversions between hexadecimal and binary are very easy. Do you see why? Because 16 is a power of 2, whereas 10 is not. Just compare the "weight" of the digits in the various bases:

    Decimal: 1, 10, 100, 1000, 10000, etc.
    Hexadecimal: 1, 16, 256, 4096, 65536, etc.
    Binary: 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, etc.

    Now do you see the pattern? Every fourth binary digit has the same weight as a hexadecimal digit (this never occurs with decimal). This means that we can split any binary number into groups of four digits, starting from the right, and convert each group individually into a hex digit.

    To come back to the example above (how much is 10010110101101001001010 in hexadecimal?). Piece of cake:

     100 1011 0101 1010 0100 1010 is
      >4    B    5    A    4    A i.e. >4B5A4A

    And conversely, to translate >1234 in binary:

     >1    2    3    4
    0001 0010 0011 0100 Done in five seconds!

    All you need to know are the 16 first binary values. That's easier to memorize than the powers of two!

    Hex Binary Hex Binary
    >0 0000 >8 1000
    >1 0001 >9 1001
    >2 0010 >A 1010
    >3 0011 >B 1011
    >4 0100 >C 1100
    >5 0101 >D 1101
    >6 0110 >E 1110
    >7 0111 >F 1111

    And this is why computer people use hexadecimal a lot. Note that we could also have used another power of two as a base. Eight for instance, using only digits "0" through "7". This is known as "octal" and is sometimes used, but much less often than hexadecimal. It has the advantage that you do not need extra digits.
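
    The group-of-four trick between binary and hexadecimal is simple enough to sketch in Python. The names NIBBLES, bin_to_hex and hex_to_bin are made up for the example; the lookup table is the same sixteen-entry table as above, just built by the program.

```python
NIBBLES = {format(value, "04b"): format(value, "X") for value in range(16)}

def bin_to_hex(bits: str) -> str:
    """Convert a binary string to hex, four bits at a time from the right."""
    bits = bits.replace(" ", "")
    bits = bits.zfill(-(-len(bits) // 4) * 4)      # pad on the left to a multiple of 4
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return ">" + "".join(NIBBLES[group] for group in groups)

def hex_to_bin(digits: str) -> str:
    """Convert hex digits to binary, one group of four bits per digit."""
    lookup = {hexdigit: group for group, hexdigit in NIBBLES.items()}
    return " ".join(lookup[digit] for digit in digits.upper())

print(bin_to_hex("10010110101101001001010"))  # >4B5A4A
print(hex_to_bin("1234"))                     # 0001 0010 0011 0100
```

    No multiplication or division is needed anywhere, which is the whole point.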


    Hex math

    To perform hexadecimal operations, we'll follow the exact same rules as for decimal operations:

      >1234
    + >96FB

    Let's start from the rightmost digit: "4" plus "B" (four plus eleven) is "F" (fifteen):

      >1234
    + >96FB
          F


    Second digit: "3" plus "F" (three plus fifteen) is eighteen (>12). We thus have a carry of sixteen and put down two.

        1
      >1234
    + >96FB
         2F

    Now, "2" plus "6", plus the carried "1" is "9".

      >1234
    + >96FB
        92F

    And finally "1" plus "9" is "A". Et voila!

      >1234
    + >96FB
      >A92F

    We can do the same thing in base 2:

                     1       11       1
     100010   100010   100010   100010   100010   100010
    +010110  +010110  +010110  +010110  +010110  +010110
          0       00      000     1000    11000   111000
             carry 1  carry    report
                      again    carry


    I'll let you do subtractions as an exercise...
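
    For those who would rather let a program do the work, here is a Python sketch of the column-by-column method. The function name add_digits and its base parameter are made up for the example; with base=2 the same loop handles binary numbers.

```python
def add_digits(a: str, b: str, base: int = 16) -> str:
    """Add two numbers digit by digit, carrying 1 whenever a column reaches the base."""
    symbols = "0123456789ABCDEF"[:base]
    width = max(len(a), len(b))
    a, b = a.upper().zfill(width), b.upper().zfill(width)
    result, carry = "", 0
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        column = symbols.index(digit_a) + symbols.index(digit_b) + carry
        carry, digit = divmod(column, base)   # what we carry, what we put down
        result = symbols[digit] + result
    if carry:
        result = "1" + result
    return result

print(">" + add_digits("1234", "96FB"))  # >A92F
```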


    Negative numbers

    It's often necessary to deal with negative numbers. Therefore several conventions have been established to represent a negative number in binary format. Generally, the leftmost bit is used as a sign bit: e.g. "0" means positive and "1" means negative.

    The remaining bits may represent the number, and they do in some conventions. However, the most common convention is "two's complement". It is the one used by Texas Instruments for the TI-99/4A.

    In two's complement notation, negative numbers are represented as follows:

    >FFFF is -1
    >FFFE is -2
    >FFFD is -3
    ...
    >8001 is -32767
    >8000 is -32768

    The big advantage of this notation is that >FFFF is greater than >FFFE, which is mathematically correct (-1 is greater than -2), and also true for unsigned numbers.

    Problems only occur when comparing a negative number with a positive one: >FFFF is greater than >0001, but -1 is smaller than 1.

    That's why the TMS9900 status register contains two status bits for number comparisons: "high" that deals with unsigned values, and "Gt" that considers the values as signed.
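
    Here is a Python sketch of the two ways of reading a 16-bit word. The names as_signed and compare are made up, and the "high" and "gt" entries only illustrate the unsigned and signed points of view; they are not a model of how the TMS9900 actually sets its status bits.

```python
def as_signed(word: int) -> int:
    """Interpret a 16-bit word as a two's complement number."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def compare(a: int, b: int) -> dict:
    """Return the two 'greater than' answers for a compared with b."""
    return {"high": (a & 0xFFFF) > (b & 0xFFFF),      # unsigned comparison
            "gt": as_signed(a) > as_signed(b)}        # signed comparison

print(as_signed(0xFFFF), as_signed(0x8000))   # -1 -32768
print(compare(0xFFFF, 0xFFFE))                # both answers are True
print(compare(0xFFFF, 0x0001))                # high is True, gt is False
```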


    Revision 1. 6/9/99 Preliminary
    Revision 2. 6/13/99 Ok to release

