Add a post about strings and slices
parent c4d2281704
commit 2756c48cfa
1 changed files with 40 additions and 0 deletions
@@ -0,0 +1,40 @@
X-Date: 2023-09-02T23:50:00Z
X-Note-Id: b8bf5799-4608-4762-925e-8de0b759a970
Subject: VM progress update: strings, slices and function bindings
X-Slug: vm_progress_update_strings_slices_calls

For the last few days I've been working on landing initial string support in the VM implementation.
As soon as I finish this, it will be possible to write programs in assembly that can show something
to the user.

The string implementation actually consists of two things:

- strings themselves (objects on the heap that hold a character array and its size)
- string slices (which can reference part of a string without creating a copy)

String slices are convenient because in theory they can be small enough to fit into registers
or to be stored on the stack. This means that code that walks strings (parsing, splitting, etc.) won't
put much pressure on the garbage collector. And since strings are immutable, it should always
be safe to keep a slice around.

From the garbage collector's point of view, a slice points to the beginning of the string and contains
a range. This keeps the original string in memory for as long as you have a slice pointing to it.
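
To make the string/slice split concrete, here is a minimal C sketch. All names (`vm_string`, `vm_slice`,
`vm_string_new`, `vm_slice_of`, `vm_slice_sub`, `vm_slice_ptr`) and the field layout are assumptions made
for illustration, not the VM's actual definitions, and `malloc` stands in for a VM heap allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap string: a size followed by the character data. */
typedef struct {
    size_t size;
    char data[];              /* flexible array member, not NUL-terminated */
} vm_string;

/* Hypothetical slice, small enough to be passed by value.
 * `base` points to the beginning of the string object, so a collector
 * that scans a slice finds the whole string; the range is stored apart. */
typedef struct {
    const vm_string *base;
    uint32_t start;
    uint32_t len;
} vm_slice;

/* Copies `size` bytes into a new string (malloc is a stand-in here). */
static vm_string *vm_string_new(const char *src, size_t size) {
    vm_string *s = malloc(sizeof *s + size);
    if (s == NULL) return NULL;
    s->size = size;
    memcpy(s->data, src, size);
    return s;
}

/* Slice covering the whole string. */
static vm_slice vm_slice_of(const vm_string *s) {
    return (vm_slice){ .base = s, .start = 0, .len = (uint32_t)s->size };
}

/* Sub-slice clamped to the parent's range: no copy, no allocation. */
static vm_slice vm_slice_sub(vm_slice parent, uint32_t start, uint32_t len) {
    if (start > parent.len) start = parent.len;
    if (len > parent.len - start) len = parent.len - start;
    return (vm_slice){ .base = parent.base,
                       .start = parent.start + start,
                       .len = len };
}

/* Pointer to the first character the slice refers to. */
static const char *vm_slice_ptr(vm_slice sl) {
    return sl.base->data + sl.start;
}
```

With this layout a slice is one pointer plus two 32-bit integers, so taking a sub-slice while parsing or
splitting only produces a new small value and never touches the heap.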

In theory, string and array slices should work the same way. I don't have array slice support yet, but
in the end it will likely be the same VM opcode for both.

The latest patches also removed the custom implementation of slices from the assembler code, which is now
based on the same functionality that the VM uses. This complicated the assembler code a little bit, as
I have to carry VM data structures around in order to do memory allocations. The reason is that memory
arenas aren't flexible enough to allocate arbitrary sizes. An arena has a fixed limit, and as
soon as you hit it, a garbage collection is triggered. In the VM bytecode that doesn't pose a problem,
since the registers and the stack serve as GC roots. In the assembler, which is written in C, the GC
roots are spread around the code and it's not easy to wrap them.

Fixing the GC issues calls for a separate implementation, where I would borrow a few ideas
from how memory arenas work in Zig. It would allow me to safely work with the VM memory from C code,
and only do GC once the execution fully leaves the C procedure. This mostly means implementing a linked
list of memory pages (pretty much what malloc does).
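
A rough sketch of what such a page-based arena could look like in C follows. It is not the VM's code:
the names (`arena`, `arena_page`, `arena_alloc`, `arena_release`) and the 4096-byte page size are
assumptions, and the garbage collector is left out entirely. The point is only that running out of
space adds a new page to the list instead of triggering a collection.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_ALIGN     alignof(max_align_t)
#define ARENA_PAGE_SIZE 4096u   /* assumed default payload size per page */

/* Rounds n up to a multiple of ARENA_ALIGN (a power of two). */
static size_t align_up(size_t n) {
    return (n + ARENA_ALIGN - 1) & ~(ARENA_ALIGN - 1);
}

/* Page header; the payload follows it in the same malloc block. */
typedef struct arena_page {
    struct arena_page *next;  /* previously filled page, or NULL */
    size_t used;              /* payload bytes handed out so far */
    size_t cap;               /* payload capacity in bytes */
} arena_page;

typedef struct {
    arena_page *head;         /* newest page; allocations come from here */
} arena;

static unsigned char *page_data(arena_page *p) {
    return (unsigned char *)p + align_up(sizeof *p);
}

/* Bump allocation; when the current page is full, link in a new page. */
static void *arena_alloc(arena *a, size_t size) {
    if (size > SIZE_MAX - ARENA_ALIGN) return NULL;  /* avoid wrap-around */
    size = align_up(size);
    if (a->head == NULL || a->head->cap - a->head->used < size) {
        /* Oversized requests get a page of their own. */
        size_t cap = size > ARENA_PAGE_SIZE ? size : ARENA_PAGE_SIZE;
        arena_page *p = malloc(align_up(sizeof *p) + cap);
        if (p == NULL) return NULL;
        p->next = a->head;
        p->used = 0;
        p->cap = cap;
        a->head = p;
    }
    void *out = page_data(a->head) + a->head->used;
    a->head->used += size;
    return out;
}

/* Frees every page at once, e.g. after the C procedure has returned. */
static void arena_release(arena *a) {
    while (a->head != NULL) {
        arena_page *next = a->head->next;
        free(a->head);
        a->head = next;
    }
}
```

In this sketch pointers returned by `arena_alloc` stay valid until `arena_release`, which is the
property that would let C code hold on to them without registering GC roots.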

All in all, a few more steps and I will be able to implement a `print` function and
write a "game of life" in the VM assembly. Can't wait for that to start working.