The repository contains my C
implementation of the write your own VM
course. The README
will mainly contain useful notes I can read to look back at
this project and more material if I feel the need for it. I also re-ordered some
of the points as it made more sense to me when I worked on the VM.
If you want to play around with this VM, as long as you have a Unix-like distribution, you can do the following :
git clone https://github.com/Nathan-Furnal/small-vm
Then, go to the small-vm
directory and do :
make
Once this is done and you’re still in the directory, you can play one of the games with :
./vm games/2048.obj # or rogue.obj
2^16 memory locations, each storing a 16-bit value. Stored in an array.
10 total registers, 16 bits each. 8 general purpose, 1 program counter and 1 condition flags. Stored in an array.
The R_COND
register stores condition flags with information about calculation
(positive, zero, negative).
An instruction is a command, made up of an opcode which indicates the task to perform and a set of parameters which provides inputs. There are 16 opcodes in LC-3. Each instruction is 16 bits long, the left 4 bits store the opcode and the rest of the bits are for the parameters.
A specific instruction such as ADD
or AND
always has the same opcode since
it defines it. But it can have different mixes of parameters, meaning that the
output machine code for an instruction will differ depending on parameters. This
is the case because an instruction such as ADD
can take a destination register
(DR
) to store the result of a sum between two values stored in source
registers or between a source register and an immediate value. We talk about
register mode when all the parameters are registers and immediate mode if at
least a parameter is an immediate value. Bit 5 of the instruction is set to 1 if
immediate mode is used, it is set to 0 otherwise.
Finally, there are only 5 bits allowed for the immediate value, up to 2^5=32
(unsigned).
When implementing operations, bit shifting (<<
or >>
) is used to reach the
proper register and boolean masking is applied to only get the relevant
bits. For example, in the ADD
operation, accessing the destination register (DR
)
which goes from bit 9 to 11 (3 bits wide) requires to right shift by 9 and apply
a 3 bits boolean mask (& 0x7
).
The addition operation. If bit 5
is 1, the immediate value is sign extended to
16 bits. In both cases, the source register content and the second operand (the
content of another register or an immediate value) are added and the result is
stored in the destination register.
Both these operations are not used here (thus not implemented).
The bit-wise logical AND
, it works in exactly the same way as ADD
but stores
the result of the bit-wise AND
in the destination register.
The bit-wise complement, it takes a source directory and stores its bit-wise complement in the destination directory.
The conditional branching operation. It tests the sign bits [11:9], if any of
the condition code tested is set, the program branches to the location specified
by the sign extended offset. Meaning that it increases the program counter
(PC
) which is a 16-bit register containing the address of the next
instruction.
The unconditional branching operation. Unconditionally jumps to the location
specified by the location of the base register. RET
is a special case of
JMP
, which jumps to R7
. The content of the register is copied into PC
in
either case.
First, the value incremented PC
is saved in R7
. This is the linkage back to
the calling routine. Then the program counter is loaded with the address of the
first instruction of the subroutine, causing an unconditional jump to that
address. The address of that subroutine is obtained from the base register if
bit 11 is 0 or the address computed by sign-extending bits [10:0] and adding
this value to the incremented PC
.
Loads the content from the given memory address in parameter to the destination register, the condition codes are updated afterwards.
This is works like a pointer in C
. An address is computed by sign-extending
bits [8:0] to 16 bits and adding this value to the incremented PC
. What is
stored at this address is the address of the data to be loaded into the
destination register. The condition codes are set afterwards.
An address is computed by sign-extending bits [5:0] to 16 bits and adding this
value to the contents of the register specified by bits [8:6]. The contents of
memory at this address are loaded into DR
. The condition codes are set, based on
whether the value loaded is negative, zero, or positive.
An address is computed by sign-extending bits [8:0] to 16 bits and adding this
value to the incremented PC. This address is loaded into DR
. The condition codes
are set, based on whether the value loaded is negative, zero, or positive.
Stores the content of the source register into the memory at the program counter + a sign-extended offset.
Stores the content of the source register (SR
) into the memory address
specified by the memory address given the program counter + a sign-extended
offset. This is similar to indirection in C
.
The contents of the register specified by SR
are stored in the memory location
whose address is computed by sign-extending bits [5:0] to 16 bits and adding
this value to the contents of the register specified by bits [8:6].
The LC-3
provides a few predefined routines for performing common tasks and
interacting with I/O devices. For example, there are routines for getting input
from the keyboard and for displaying strings to the console. These are called
trap routines which you can think of as the operating system or API for the
LC-3
. Each trap routine is assigned a trap code which identifies it (similar
to an opcode). To execute one, the TRAP
instruction is called with the trap
code of the desired routine.
Those are not new instructions but conveniences similar to system calls. When a
trap code is called, the PC
is moved to that code’s address.
Some special registers are not accessible from the normal register table. Instead, a special address is reserved for them in memory. To read and write to these registers, you read and write to their memory location. These are called memory mapped registers. They are used to interact with special hardware devices.
The LC-3 has two memory mapped registers that need to be implemented. They are
the keyboard status register (KBSR
) and keyboard data register (KBDR
). The
KBSR
indicates whether a key has been pressed and the KDBR
identifies which
key was pressed.
Although you can request keyboard input using GETC
, this blocks execution
until input is received. KBSR
and KBDR
allows you to poll the state of the
device and continue execution, so the program can stay responsive while waiting
for input.
I think there is a “useful” way to read the source files to understand what’s going on and I’d start with the order laid out below. Additional comments and documentation can be found in the files.
File | Content |
---|---|
vm.h | VM building blocks, contains the memory and registers implementation as well as the basic needed operations. |
vm.c | Contains the implementation of the needed operations. |
main.c | Main application loop, tying everything together. |