Publish a note about tree structured stack

2024-09-13 03:38:56 +01:00 · 2024-09-13 03:38:56 +01:00 · ecbbdcede2
commit ecbbdcede2
parent 83b7cc0385
1 changed files with 41 additions and 0 deletions
--- a/content/posts/immutable_tree_structured_stack/note.md
+++ b/content/posts/immutable_tree_structured_stack/note.md
@ -0,0 +1,41 @@
+X-Date: 2024-09-13T01:44:20Z
+X-Note-Id: 54bf0663-7e0c-462f-831b-e27fabb8f943
+Subject: Immutable tree-structured stack
+X-Slug: immutable_tree_structured_stack
+
+I'm writing [valeri](https://git.knazarov.com/knazarov/valeri) to be a pure language. As a pure
+language, it needs a way to somehow do IO in a way that doesn't violate purity and doesn't lead
+to mutation of state or non-reproducible behavior.
+
+One of the ways to achieve this is called "algebraic effects". You can read theory on
+algebraic effects in the [Eff paper](https://www.eff-lang.org/handlers-tutorial.pdf). In short,
+IO can be represented as a "continuation", where an attempt to call a function doing IO leads
+to saving the full state of computation and "throwing" it up the stack until it reaches the
+top level. The VM then performs the requested IO and calls the continuation object with the
+results. It looks like sort of a resumable try-catch.
+
+The reason why systems with "algebraic effects" can be considered pure is because in a sense
+you invert the control. As if the program is divided into 2 pieces - the one before the IO and the
+one after the IO. Both of those parts are themselves pure function and you just call them
+sequentially.
+
+There are multiple clever ways to implement continuations, but most of them reduce program
+debuggability in the same way that automatic tail-call optimizations do. Mostly because
+they don't preserve the call stack except for what is necessary for continuing the execution
+past a certain point. An attempt to print a stacktrace on exception will likely lead to
+nonsensical results. (or even exploring values of certain local variables in the debugger can be
+impossible)
+
+So instead, I went for the most straightforward approach to saving program state. Initially I
+started out with having a single mutable stack backed by an array. As many regular languages
+do, it used a base pointer to delimit the current "frame". And now I've converted one single
+stack to a linked list of frames: whenever a function is called, a new frame is allocated
+on the heap, and it points back to the previous frame. So in theory you can return back to
+the previous frame and start a new "branch" from there.
+
+The only trick here is to prevent modification of the shared parts of the stack when we throw
+the continuation and a function somewhere up the stack performs some computation. To achieve
+this, my stack data structure is immutable: all modifications of the frame actually create
+a new copy of the same frame, and the VM switches to that. It's not efficient, but it's a
+way to get a working solution. And there are ways to remove unneeded copies where we can
+perform direct mutation of the frame if there's nobody else who refers to the same frame.