Publish a post on new garbage collector
This commit is contained in:
parent
8e08e2ec07
commit
6a77c697f1
1 changed files with 60 additions and 0 deletions
60
content/posts/new_spin_on_cpp_garbage_collector/note.md
Normal file
60
content/posts/new_spin_on_cpp_garbage_collector/note.md
Normal file
|
@ -0,0 +1,60 @@
|
|||
X-Date: 2024-08-03T01:59:20Z
|
||||
X-Note-Id: 926d77c4-2332-4b0a-a9de-d0215facb90c
|
||||
Subject: New spin on a C++ garbage collector
|
||||
X-Slug: new_spin_on_cpp_garbage_collector
|
||||
|
||||
While doing another round of implementing my programming language in C++, I've reached
|
||||
a stage where garbage collection works and is integrated with the "host language".
|
||||
An example is worth a thousand words, so here you go:
|
||||
|
||||
```
|
||||
StaticArena<64 * 1024 * 1024> arena;
|
||||
|
||||
TEST_CASE(dict_insert) {
|
||||
auto d = DIEX(Dict::create(arena));
|
||||
|
||||
d = DIEX(d.insert(arena, 1, 2));
|
||||
d = DIEX(d.insert(arena, 1, 3));
|
||||
d = DIEX(d.insert(arena, 3, 3));
|
||||
d = DIEX(d.insert(arena, 0, 4));
|
||||
d = DIEX(d.insert(arena, 0, 5));
|
||||
d = DIEX(d.insert(arena, 2, 6));
|
||||
|
||||
DIEX(arena.gc());
|
||||
|
||||
auto s = DIEX(write_one(arena, d));
|
||||
|
||||
DIEX(arena.gc());
|
||||
|
||||
ASSERT_EQUALS(s, "{0 5 1 3 2 6 3 3}");
|
||||
}
|
||||
```
|
||||
|
||||
Here, `Dict::create()` creates a dictionary and places its data into a garbage-collected arena.
|
||||
The variable `d` that is placed on the C++ stack is just a garbage collector "root". When the
|
||||
object is moved by the garbage collector between different arena heaps, the roots get re-pointed
|
||||
to the new locations, so it is safe to operate with such variables and place them to other C++
|
||||
objects.
|
||||
|
||||
I then call `d.insert()` to insert key-value pairs to the dictionary. Note that this function
|
||||
returns a new object that I assign back to `d`. This is because in my language objects are actually
|
||||
immutable, and `d.insert()` creates a new dictionary instead of changing the old one.
|
||||
|
||||
The call to `arena.gc()` performs a full garbage collection cycle and switches the primary and
|
||||
secondary arena heaps, achieving data compaction and removal of objects that are no longer
|
||||
accessible (here - previous versions of dictionary `d`).
|
||||
|
||||
Then I convert dictionary into a string with `write_one()`, which is a way to get textual
|
||||
representation of any hierarchy of objects.
|
||||
|
||||
Note that `arena.gc()` is not the only way to trigger garbage-collection. Any memory allocation that
|
||||
overflows the active heap will trigger a GC cycle, and it means that you can't normally hold any
|
||||
direct pointers to such heap and need to always use gc roots to refer to objects.
|
||||
|
||||
Most importantly, the example doesn't use any additional dynamically allocated memory apart from
|
||||
a contiguous region occupied by the `arena`. This is a property that I would like to keep, as
|
||||
it would make embedding into other programs easier: you'd always keep data of the host program
|
||||
and the extension language separate.
|
||||
|
||||
The end result looks pretty satisfactory to me to a degree where I think I can start prototyping
|
||||
a bytecode compiler.
|
Loading…
Reference in a new issue