TinyVM

TinyVM is a simple register-based virtual machine implemented in C (tinyvm.c). Its bytecode assembler is written in Python (tas.py). By default, TinyVM has 4 registers ($0 - $3) and 64k of memory, addressed as 0x00000000 - 0x0000FFFF within a 32-bit address space.

Each instruction is encoded in a single 64-bit word. The register count and memory size are defined at compile time. Because an instruction has only 32 bits available for an address and 8 bits for each register number, allocating more than 4GB of memory or 256 registers is pointless: nothing beyond those limits could be addressed.

The following instructions (loosely based on MIPS) have been implemented:

No.   Keyword  Instruction                    Description
0x00  halt     Halt                           Terminate program
0x01  nop      No Operation                   Do nothing
0x02  li       Load Immediate                 Load 0x00000000 into $0
0x03  lw       Load Word                      Load the contents of the memory location pointed to by $1 into $0
0x04  sw       Store Word                     Store the contents of $1 in the memory location pointed to by $0
0x05  add      Add                            Add $0 to $1 and store the result in $2
0x06  sub      Subtract                       Subtract $1 from $0 and store the result in $2
0x07  mult     Multiply                       Multiply $0 by $1 and store the result in $2
0x08  div      Divide                         Divide $0 by $1 and store the result in $2
0x09  j        Unconditional Jump             Jump to memory location 0x00000000
0x0A  jr       Unconditional Jump (Register)  Jump to memory location stored in $0
0x0B  beq      Branch if Equal                Branch to memory location stored in $2 if $0 and $1 are equal
0x0C  bne      Branch if Not Equal            Branch to memory location stored in $2 if $0 and $1 are not equal
0x0D  inc      Increment Register             Increment $0
0x0E  dec      Decrement Register             Decrement $0

Bytecode Format

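Each instruction is a single 64-bit word composed of eight 8-bit octets, stored little endian.

Reading the word from its most significant octet down, the first octet contains the instruction number, while the second, third, and fourth octets contain register numbers. The lower 32 bits are either an unsigned memory location or a signed immediate value. Because the word is stored little endian, the octets appear in the reverse order in a bytecode file.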

Instruction  Register 0  Register 1  Register 2  Immediate Value
0000 0000    0000 0000   0000 0000   0000 0000   0000 0000 0000 0000 0000 0000 0000 0000

Assembly          Hex                 Binary
li $2 0x7fffffff  0x020200007fffffff  0000 0010 0000 0010 0000 0000 0000 0000 0111 1111 1111 1111 1111 1111 1111 1111
add $2 $0 $1      0x0502000100000000  0000 0101 0000 0010 0000 0000 0000 0001 0000 0000 0000 0000 0000 0000 0000 0000
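
The field layout described in this section can be sketched in a few lines of Python. This is only an illustration of the packing, not the code in tas.py: the helper names encode and decode and the OPCODES dictionary are hypothetical. The opcode numbers come from the instruction table, and the two sample words are taken from the assembler listing for examples/test.asm.

```python
import struct

# Opcode numbers copied from the instruction table (subset for brevity).
OPCODES = {"halt": 0x00, "li": 0x02, "sw": 0x04, "bne": 0x0C, "inc": 0x0D}


def encode(op, r0=0, r1=0, r2=0, imm=0):
    """Pack one instruction into a 64-bit integer (hypothetical helper)."""
    return (
        (OPCODES[op] << 56)       # first octet: instruction number
        | (r0 << 48)              # second octet: register 0
        | (r1 << 40)              # third octet: register 1
        | (r2 << 32)              # fourth octet: register 2
        | (imm & 0xFFFFFFFF)      # lower 32 bits; negatives wrap to two's complement
    )


def decode(word):
    """Split a 64-bit word into (opcode, r0, r1, r2, imm)."""
    return (
        (word >> 56) & 0xFF,
        (word >> 48) & 0xFF,
        (word >> 40) & 0xFF,
        (word >> 32) & 0xFF,
        word & 0xFFFFFFFF,
    )


word = encode("li", r0=2, imm=0x0000FFFF)
print(f"{word:016x}")                 # 020200000000ffff
print(struct.pack("<Q", word).hex())  # ffff000000000202 (little endian on disk)
print(decode(0x0C01020300000000))     # (12, 1, 2, 3, 0), i.e. bne $1 $2 $3
```

The second print shows why a hex dump of a bytecode file looks reversed compared with the assembler listing: the low octets of the immediate come first.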

Assembly Language

Comments

Lines which begin with a semicolon are treated as comments.

Labels

Lines which end with a colon are treated as labels. The assembler works in two passes, so a label can be used before the line that defines it; there is no need to forward-declare your labels.
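
A two-pass scheme of this kind can be sketched as follows. This is a minimal illustration rather than the code in tas.py; the function name resolve_labels is hypothetical, and it assumes a label resolves to the zero-based index of the instruction that follows it, which is what the assembler listing for `li $3 loop` (immediate 0x00000003) suggests.

```python
def resolve_labels(source):
    """Return instructions as token lists with label operands replaced by addresses."""
    lines = [ln.strip() for ln in source.splitlines()]
    # Drop blank lines and comment lines (those beginning with a semicolon).
    lines = [ln for ln in lines if ln and not ln.startswith(";")]

    # Pass 1: record each label as the index of the next instruction.
    labels, instructions = {}, []
    for ln in lines:
        if ln.endswith(":"):
            labels[ln[:-1]] = len(instructions)
        else:
            instructions.append(ln.split())

    # Pass 2: replace label operands (never the mnemonic) with their index.
    resolved = []
    for mnemonic, *operands in instructions:
        operands = [f"0x{labels[t]:08x}" if t in labels else t for t in operands]
        resolved.append([mnemonic, *operands])
    return resolved


SOURCE = """
; counter
li $1 0x00000000
; end
li $2 0x0000FFFF
; memory location of loop start (used before it is defined)
li $3 loop

loop:
  sw $1 $1
  inc $1
  bne $1 $2 $3
  halt
"""

for tokens in resolve_labels(SOURCE):
    print(" ".join(tokens))  # third line printed: li $3 0x00000003
```

Collecting every label before touching any operand is what makes the forward reference to loop work.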

Example Assembly

; fill up 64k of memory

; counter
li $1 0x00000000

; end
li $2 0x0000FFFF

; memory location of loop start
li $3 loop

loop:
  ; store the value of the counter in the memory location
  ; contained in the counter.
  sw $1 $1

  ; increment the counter
  inc $1

  ; loop if the counter hasn't yet reached the end
  bne $1 $2 $3

  ; end program
  halt

Example

Assuming the assembly source above…

Assemble, generate bytecode

$ python tas.py examples/test.asm examples/test.tvm
0201000000000000 li $1 0x00000000
020200000000ffff li $2 0x0000FFFF
0203000000000003 li $3 loop
0401010000000000 sw $1 $1
0d01000000000000 inc $1
0c01020300000000 bne $1 $2 $3
0000000000000000 halt

Inspect bytecode

$ od -x examples/test.tvm
0000000          00000000        02010000        0000ffff        02020000
0000020          00000003        02030000        00000000        04010100
0000040          00000000        0d010000        00000000        0c010203
0000060          00000000        00000000
0000070
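
The same dump can be reproduced from Python as a cross-check. The sketch below assumes a .tvm file is nothing more than a sequence of little-endian 64-bit words with no header, which matches the 56 bytes (octal 70) that od reports for seven instructions. The function name read_words is hypothetical, and the file is written to a temporary path so the example runs on its own.

```python
import os
import struct
import tempfile

# The seven instruction words from the assembler listing for examples/test.asm.
WORDS = [
    0x0201000000000000,  # li $1 0x00000000
    0x020200000000FFFF,  # li $2 0x0000FFFF
    0x0203000000000003,  # li $3 loop
    0x0401010000000000,  # sw $1 $1
    0x0D01000000000000,  # inc $1
    0x0C01020300000000,  # bne $1 $2 $3
    0x0000000000000000,  # halt
]


def read_words(path):
    """Read a bytecode file as a list of little-endian 64-bit integers."""
    with open(path, "rb") as f:
        data = f.read()
    return [w for (w,) in struct.iter_unpack("<Q", data)]


# Write the words to a temporary file, then read them back.
with tempfile.NamedTemporaryFile(suffix=".tvm", delete=False) as tmp:
    tmp.write(struct.pack(f"<{len(WORDS)}Q", *WORDS))
    path = tmp.name

size = os.path.getsize(path)
loaded = read_words(path)
os.remove(path)

print(oct(size))  # 0o70, the final offset shown by od
for w in loaded:
    print(f"{w:016x}")
```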

Compile TinyVM

$ make

Execute bytecode

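TinyVM prints one line per CPU cycle. Each cycle it prints the value of the program counter and the next instruction, executes that instruction, and then prints the values of the registers ($0 through $3, left to right). Note that the instruction words in the sample trace below do not match the memory-fill example: they decode to three li instructions, an add, a sub, and a halt.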

$ ./tinyvm examples/test.tvm
00000001 02000000ffffffff ffffffff 00000000 00000000 00000000
00000002 0201000012345678 ffffffff 12345678 00000000 00000000
00000003 0202000000012ac0 ffffffff 12345678 00012ac0 00000000
00000004 0503010200000000 ffffffff 12345678 00012ac0 12358138
00000005 0600010200000000 12332bb8 12345678 00012ac0 12358138
00000006 0000000000000000 12332bb8 12345678 00012ac0 12358138
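
The fetch, decode, execute cycle behind this trace can be sketched in Python. This is not a port of tinyvm.c: the function name run is hypothetical, and it assumes 32-bit registers that wrap on overflow, a program counter that is incremented before it is printed (the trace starts at 00000001), and a destination register in the first register field for add and sub, as the trace suggests. lw, sw, mult and div are left out because the memory model is not described here.

```python
MASK = 0xFFFFFFFF  # assumption: registers are 32 bits wide and wrap around


def run(program, num_regs=4):
    """Execute a list of 64-bit instruction words and return the trace lines."""
    regs = [0] * num_regs
    pc = 0
    trace = []
    while pc < len(program):
        word = program[pc]
        pc += 1  # assumption: the counter advances before it is printed
        op = (word >> 56) & 0xFF
        a, b, c = (word >> 48) & 0xFF, (word >> 40) & 0xFF, (word >> 32) & 0xFF
        imm = word & MASK

        if op == 0x02:    # li: load immediate into register a
            regs[a] = imm
        elif op == 0x05:  # add: a = b + c
            regs[a] = (regs[b] + regs[c]) & MASK
        elif op == 0x06:  # sub: a = b - c
            regs[a] = (regs[b] - regs[c]) & MASK
        elif op == 0x09:  # j: jump to immediate address
            pc = imm
        elif op == 0x0A:  # jr: jump to address held in register a
            pc = regs[a]
        elif op == 0x0B:  # beq: branch to register c if a == b
            if regs[a] == regs[b]:
                pc = regs[c]
        elif op == 0x0C:  # bne: branch to register c if a != b
            if regs[a] != regs[b]:
                pc = regs[c]
        elif op == 0x0D:  # inc
            regs[a] = (regs[a] + 1) & MASK
        elif op == 0x0E:  # dec
            regs[a] = (regs[a] - 1) & MASK
        elif op not in (0x00, 0x01):  # halt and nop change nothing
            raise ValueError(f"unsupported opcode 0x{op:02x} in this sketch")

        trace.append(f"{pc:08x} {word:016x} " + " ".join(f"{r:08x}" for r in regs))
        if op == 0x00:  # halt: stop after printing the final state
            break
    return trace


# The instruction words that appear in the trace above.
PROGRAM = [
    0x02000000FFFFFFFF,  # li  $0 0xffffffff
    0x0201000012345678,  # li  $1 0x12345678
    0x0202000000012AC0,  # li  $2 0x00012ac0
    0x0503010200000000,  # add $3 $1 $2
    0x0600010200000000,  # sub $0 $1 $2
    0x0000000000000000,  # halt
]

for line in run(PROGRAM):
    print(line)
```

Under those assumptions the printed lines match the six lines of the trace above, including the final halt line.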