Add a post about garbage collection
This commit is contained in:
parent
15644f98ac
commit
29fb6aa1eb
1 changed files with 69 additions and 0 deletions
|
@ -0,0 +1,69 @@
|
|||
X-Date: 2023-08-15T21:30:00Z
|
||||
X-Note-Id: 81666e7b-ed56-400a-b08d-5edad9f966f5
|
||||
Subject: VM progress update: simple garbage collection
|
||||
X-Slug: vm_progress_update_simple_garbage_collection
|
||||
|
||||
Today I've reached another milestone with the virtual machine.
|
||||
It can now allocate memory and perform garbage collection.
|
||||
|
||||
It is interesting that I don't yet have a high-level language implemented,
|
||||
but the garbage collection is still useful. In bare-metal assembly, you have
|
||||
to track memory allocations and deallocations yourself, since the memory is unstructured,
|
||||
and a potential garbage collector won't be able to differentiate between pointers and
|
||||
integers. In my virtual machine (and its assembly language), all memory is tagged and
|
||||
structured, and thus if you have a pointer to the virtual machine data structure, the
|
||||
garbage collector is able to walk it. So even the assembly/bytecode will soon be able to
|
||||
work as if memory is not a concern.
|
||||
|
||||
For the implementation of garbage collection, I've chosen a modified
|
||||
[Cheney's algorithm](https://en.wikipedia.org/wiki/Cheney%27s_algorithm).
|
||||
This algorithm contains two distinct "heaps", one of which is currently in use for allocation.
|
||||
If it runs out, the algorithm would walk all objects that are reachable from the registers and
|
||||
the stack of the virtual machine, and move them to the other heap. The previous heap would then
|
||||
contain only dead objects, and can be grown/shrunk for the next cycle if needed.
|
||||
|
||||
For any practical purposes, Cheney's algorithm would be quite slow, and many real programming
|
||||
languages contain multiple heaps (python has 3) and vary the size of the heaps to reclaim
|
||||
short-lived objects faster. However, I predict that the naive algorithm would get me pretty far along
|
||||
the path to the implementation of the actual language, so I don't care that much about it now.
|
||||
|
||||
What follows below is a piece of code that demonstrates the C interface for interacting with
|
||||
VM memory allocation and GC capabilities. It is taken almost verbatim from the unittest and
|
||||
annotated.
|
||||
|
||||
```
|
||||
# Allocate all structures required by the virtual machine
|
||||
vm_t* vm = new_vm();
|
||||
|
||||
# Allocate an array of 10 elements
|
||||
tagged_value_t arr = vm_mk_array(vm, 10);
|
||||
array_set(arr, 0, mk_i64(42));
|
||||
|
||||
# Save pointer to the array in a register so it survives garbage collection
|
||||
vm->registers[R1] = arr;
|
||||
|
||||
assert(arena_generation(&vm->arena, arr) == 0,
|
||||
"Array should be in first generation\n");
|
||||
|
||||
# Perform garbage collection
|
||||
vm_gc(vm);
|
||||
|
||||
# The array has moved, so retreive a pointer to it from the register
|
||||
arr = vm->registers[R1];
|
||||
|
||||
# Check that the array contains correct value at 0-th position
|
||||
tagged_value_t val = array_get(arr, 0);
|
||||
assert(op_eq(val, mk_i64(42)), "0-th element of an array == %ld\n",
|
||||
val.value.i64);
|
||||
|
||||
# Make sure that the array has really moved between generations
|
||||
assert(arena_generation(&vm->arena, arr) == 1,
|
||||
"Array should be in second generation\n");
|
||||
```
|
||||
|
||||
The API is in place, so the next step would be to add a few instructions to the bytecode
|
||||
that can call memory allocations, and write an inefficient merge sort (the one that would
|
||||
copy sub-ranges).
|
||||
|
||||
As usual, the code can be found [here](https://git.sr.ht/~knazarov/lisp.experimental)
|
||||
(note that it's experimental, so don't expect any clarity).
|
Loading…
Reference in a new issue