Publis a post on assembly and stack-based registers
This commit is contained in:
parent
a169ee4c61
commit
e3dca542e9
1 changed files with 175 additions and 0 deletions
|
@ -0,0 +1,175 @@
|
||||||
|
X-Date: 2024-01-21T21:00:00Z
|
||||||
|
X-Note-Id: 5605fe4e-a614-4477-9540-62923aea61e8
|
||||||
|
Subject: Better assembler and stack-based register file
|
||||||
|
X-Slug: better_assembler_and_stack_based_register_file
|
||||||
|
|
||||||
|
During the last few weeks, I've made a lot of changes to the programming language I'm working on.
|
||||||
|
It looks like I'm close to being able to write actually useful programs in it (at least on the level of assembly).
|
||||||
|
So let's go over the interesting parts.
|
||||||
|
|
||||||
|
|
||||||
|
## S-expression-based assembler
|
||||||
|
|
||||||
|
In the previous post, I described the "reader" and "writer": two modules that allow to parse and serialize
|
||||||
|
data structures in the form of extended S-expressions. Originally, S-expressions are just a way to represent
|
||||||
|
nested lists containing "atomic" values. My extension allows to represent dictionaries and arrays as well.
|
||||||
|
|
||||||
|
Here's an example such data structure represented as an S-expression:
|
||||||
|
|
||||||
|
```
|
||||||
|
;; a list
|
||||||
|
(
|
||||||
|
1 2 3 ;; numbers
|
||||||
|
foobar ;; symbol
|
||||||
|
{"foo" "bar"} ;; dict with str key
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
Since my language would be based on Lisp, all programs will be written as sequences of S-expressions.
|
||||||
|
In Lisp, function calls look like this:
|
||||||
|
|
||||||
|
```
|
||||||
|
(function_name arg1 arg2 arg3)
|
||||||
|
```
|
||||||
|
|
||||||
|
This would call `function_name` with 3 arguments. Because of how universal this notation is, any program
|
||||||
|
can be represented this way. For example, this is what factorial implementation looks like:
|
||||||
|
|
||||||
|
```
|
||||||
|
(defun factorial (x)
|
||||||
|
(if (zerop x)
|
||||||
|
1
|
||||||
|
(* x (factorial (- x 1)))))
|
||||||
|
```
|
||||||
|
|
||||||
|
My "reader" can already successfully parse the syntax tree of this program and turn it into a data structure.
|
||||||
|
I don't yet have a compiler for the high level language that would turn it into bytecode, but I already have
|
||||||
|
an assembler.
|
||||||
|
|
||||||
|
The only problem with the assembler was that I had to implement a custom parser for it. For example, this is
|
||||||
|
a program from one of the older posts:
|
||||||
|
|
||||||
|
```
|
||||||
|
; Load value '2' into register r0
|
||||||
|
li r0, 2
|
||||||
|
; Multiply register r0 by 2 and
|
||||||
|
; put result back to r0
|
||||||
|
mul r0, r0, 2
|
||||||
|
ret
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
As you can see, it has a custom syntax where a mnemonic needs to be writen first, and then followed by
|
||||||
|
arguments separated by a comma. Because of the subtleties of this format, it has taken around 700 lines of C
|
||||||
|
to implement and was very hacky. So I thought to myself: if assembly programs are just a list of opcodes with
|
||||||
|
arguments, then maybe I can write them as S-expressions as well and save on writing a custom parser?
|
||||||
|
|
||||||
|
And this is what I ended up doing. Now the same program would look like this:
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
; Load value '2' into register r0
|
||||||
|
(li r0 2)
|
||||||
|
; Multiply register r0 by 2 and
|
||||||
|
; put result back to r0
|
||||||
|
(mul r0 r0 2)
|
||||||
|
(ret)
|
||||||
|
```
|
||||||
|
|
||||||
|
And here's an example of evaluating a factorial of 15 and printing it to the screen:
|
||||||
|
|
||||||
|
```
|
||||||
|
(const c1 "! is")
|
||||||
|
(const max 15)
|
||||||
|
(const one 1)
|
||||||
|
|
||||||
|
(sr 10)
|
||||||
|
|
||||||
|
(mov r0 max)
|
||||||
|
(mov r3 one)
|
||||||
|
(mov r2 one)
|
||||||
|
|
||||||
|
(label factorial)
|
||||||
|
(mul r2 r2 r3)
|
||||||
|
(add r3 r3 one)
|
||||||
|
(jle r3 r0 factorial)
|
||||||
|
|
||||||
|
(label end)
|
||||||
|
|
||||||
|
(lfi r5 print)
|
||||||
|
(mov r6 r0)
|
||||||
|
(mov r7 c1)
|
||||||
|
(mov r8 r2)
|
||||||
|
|
||||||
|
(call r5 3)
|
||||||
|
(retnil)
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
## Stack-based register file
|
||||||
|
|
||||||
|
Initially, I designed the virual machine loosely after a RISC-V architecture. There were few opcodes, a stack,
|
||||||
|
and 16 general-purpose registers. Of course, since the virtual machine is not real hardware and is targeting
|
||||||
|
a dynamic language - it means that the registers and stack were tagged with a type identifier of an object
|
||||||
|
they point to. But otherwise the design was quite close.
|
||||||
|
|
||||||
|
Eventually, when I started thinking how I'm going to compile actual high-level code to the virtual machine
|
||||||
|
bytecode, I've found a problem called "register allocation". Usually, when you compile a C or C++ program,
|
||||||
|
you have local variables in your functions. Those variables live on a stack, and are loaded into registers
|
||||||
|
when your program performs computation on them (say, adding two variables). Because operating with registers
|
||||||
|
is very fast, and loading data from a stack is a lot slower, compilers have an intricate optimizer that would
|
||||||
|
try to use registers to store and access the intermediate results as much as possible.
|
||||||
|
|
||||||
|
On one hand, I don't want to write a complex register allocation optimizer. On the other hand, I don't want
|
||||||
|
to write the virtual machine purely as a stack machine. Mostly because stack machines shuffle around values
|
||||||
|
too much, and it's very hard to read their disassembly because of this. Try reading Python disassembly, and
|
||||||
|
you'll see what I mean.
|
||||||
|
|
||||||
|
After thinking about this for a few evenings, I had a thought: if in my virtual machine there's no difference
|
||||||
|
in performance between accessing a register and a value on the stack - maybe I can put registers on a separate
|
||||||
|
stack? In this case, I can have essentially unlimited number of registers available for a function to work with,
|
||||||
|
and it will have its own set of registers that are independent from the registers of the function that called it.
|
||||||
|
|
||||||
|
Excited, I did a bit of googling to see if there are any other runtimes that did this, and sure enough some did!
|
||||||
|
This was the case for CLR (.NET runtime) and Lua virtual machine. I didn't want to dive into CLR, because it
|
||||||
|
was too big, but [this Lua 5 paper](https://www.lua.org/doc/jucs05.pdf) was short and lovely, and described
|
||||||
|
everything I needed.
|
||||||
|
|
||||||
|
Changing the approach to accessing registers wasn't easy, and took almost a whole day of refactoring. But after
|
||||||
|
finishing that, I can write things like this:
|
||||||
|
|
||||||
|
```
|
||||||
|
(const c1 42)
|
||||||
|
(const c2 5)
|
||||||
|
(const c3 47)
|
||||||
|
(sr 3)
|
||||||
|
|
||||||
|
;; Call a function that
|
||||||
|
;; adds two numbers and
|
||||||
|
;; puts result back to r0
|
||||||
|
(setjump r0 add2)
|
||||||
|
(mov r1 c1)
|
||||||
|
(mov r2 c2)
|
||||||
|
(call r0 2)
|
||||||
|
|
||||||
|
;; Check that r0 == 47
|
||||||
|
(aeq r0 c3)
|
||||||
|
|
||||||
|
(retnil)
|
||||||
|
|
||||||
|
;; Function that adds
|
||||||
|
;; two numbers
|
||||||
|
(label add2)
|
||||||
|
(sr 3)
|
||||||
|
;; These registers are
|
||||||
|
;; not the same as the
|
||||||
|
;; caller's
|
||||||
|
(add r2 r0 r1)
|
||||||
|
(ret r2)
|
||||||
|
```
|
||||||
|
|
||||||
|
In this example you see only a few registers used, like `r0`, `r1` and `r2`. But the current implementation
|
||||||
|
allows you to go all the way up to `r65536`, which should be plenty for addressing local variables of pretty
|
||||||
|
much any program.
|
||||||
|
|
||||||
|
Now, with this freedom of using large range of registers, writing a compiler would be a lot simpler.
|
Loading…
Reference in a new issue