diff --git a/content/posts/immutable_tree_structured_stack/note.md b/content/posts/immutable_tree_structured_stack/note.md new file mode 100644 index 0000000..221b43c --- /dev/null +++ b/content/posts/immutable_tree_structured_stack/note.md @@ -0,0 +1,41 @@ +X-Date: 2024-09-13T01:44:20Z +X-Note-Id: 54bf0663-7e0c-462f-831b-e27fabb8f943 +Subject: Immutable tree-structured stack +X-Slug: immutable_tree_structured_stack + +I'm writing [valeri](https://git.knazarov.com/knazarov/valeri) to be a pure language. As a pure +language, it needs a way to somehow do IO in a way that doesn't violate purity and doesn't lead +to mutation of state or non-reproducible behavior. + +One of the ways to achieve this is called "algebraic effects". You can read theory on +algebraic effects in the [Eff paper](https://www.eff-lang.org/handlers-tutorial.pdf). In short, +IO can be represented as a "continuation", where an attempt to call a function doing IO leads +to saving the full state of computation and "throwing" it up the stack until it reaches the +top level. The VM then performs the requested IO and calls the continuation object with the +results. It looks like sort of a resumable try-catch. + +The reason why systems with "algebraic effects" can be considered pure is because in a sense +you invert the control. As if the program is divided into 2 pieces - the one before the IO and the +one after the IO. Both of those parts are themselves pure function and you just call them +sequentially. + +There are multiple clever ways to implement continuations, but most of them reduce program +debuggability in the same way that automatic tail-call optimizations do. Mostly because +they don't preserve the call stack except for what is necessary for continuing the execution +past a certain point. An attempt to print a stacktrace on exception will likely lead to +nonsensical results. (or even exploring values of certain local variables in the debugger can be +impossible) + +So instead, I went for the most straightforward approach to saving program state. Initially I +started out with having a single mutable stack backed by an array. As many regular languages +do, it used a base pointer to delimit the current "frame". And now I've converted one single +stack to a linked list of frames: whenever a function is called, a new frame is allocated +on the heap, and it points back to the previous frame. So in theory you can return back to +the previous frame and start a new "branch" from there. + +The only trick here is to prevent modification of the shared parts of the stack when we throw +the continuation and a function somewhere up the stack performs some computation. To achieve +this, my stack data structure is immutable: all modifications of the frame actually create +a new copy of the same frame, and the VM switches to that. It's not efficient, but it's a +way to get a working solution. And there are ways to remove unneeded copies where we can +perform direct mutation of the frame if there's nobody else who refers to the same frame.