121 lines
5.2 KiB
Markdown
121 lines
5.2 KiB
Markdown
|
X-Date: 2024-10-05T22:14:26Z
|
||
|
X-Note-Id: 235642a1-b67f-4836-a6ae-1b9272e5275b
|
||
|
Subject: Capturing and serializing continuations
|
||
|
X-Slug: capturing_and_serializing_continuations
|
||
|
|
||
|
With latest patches to [Valeri](https://git.knazarov.com/knazarov/valeri) it became possible to
|
||
|
do some magic that you just can't do in Python. At least not without creating lots of tricky
|
||
|
edge cases. Because of Valeri's immutability, objects in memory always form a unidirectional
|
||
|
graph. The same is true for virtual machine runtime. This means that it is possible to take
|
||
|
current state of computation, serialize it to a contiguous byte array, send over the network
|
||
|
and then resume the computation on the other end.
|
||
|
|
||
|
Because of the [referential transparency](https://en.wikipedia.org/wiki/Referential_transparency),
|
||
|
objects in memory don't have any "identity" other than their contents. So if they are equal,
|
||
|
you can substitute one with the other and nothing will change. Let's look at the example, and
|
||
|
then I'll break it down:
|
||
|
|
||
|
```
|
||
|
(def c (guard (fn body () (+ 1 (raise :unused)))
|
||
|
(fn handler (err cont) cont)))
|
||
|
|
||
|
;; "serialized" will hold the detached representation
|
||
|
;; of the continuation "c"
|
||
|
(def serialized (serialize c))
|
||
|
(println "serialized: " serialized "\n")
|
||
|
|
||
|
(def deserialized (deserialize serialized))
|
||
|
(println "deserialized: " deserialized "\n")
|
||
|
|
||
|
;; Call the deserialized continuation
|
||
|
(println "result: " (deserialized 2))
|
||
|
```
|
||
|
|
||
|
First, let's start with `guard`. It does pretty much the same thing as exception handlers do
|
||
|
in other languages. Its prototype is `(guard <function> <exception-handler>)`. It calls the
|
||
|
`<function>` and if an exception occurs during the execution, it calls the `<exception-handler>`.
|
||
|
Exception handler takes two arguments: first is the exception object itself, and the second is
|
||
|
the "continuation" object. The only difference with typical programming languages is that by
|
||
|
calling the "continuation" object, you will jump back to where the exception has occured, and
|
||
|
the `(raise)` function will return the value passed to the continuation as a parameter.
|
||
|
|
||
|
In this example, we suspend execution right at the time the second argument to addition is
|
||
|
computed. If the continuation is called with an integer parameter - it will be used in this
|
||
|
addition. So effectively, you can think that the continuation contains a function roughly
|
||
|
equivalent to `(fn (x) (+ 1 x))`. For simplicity, we have a very simple function here. But
|
||
|
it can in fact be arbitrarily nested and use recursion. In such case the whole call stack
|
||
|
starting from the "guard" will be preserved.
|
||
|
|
||
|
Because in the handler function we immediately return the continuation object - it will become
|
||
|
the result of the whole guard statement and then assigned to variable `c`. We can then repeatedly
|
||
|
call `c` with different arguments like `(c 5)` to get `6`. The number of times we call it doesn't
|
||
|
matter: every time execution will be "forked off" from the point where the original computation
|
||
|
has been suspended.
|
||
|
|
||
|
Next in the example we have this:
|
||
|
|
||
|
```
|
||
|
(def serialized (serialize c))
|
||
|
(println "serialized: " serialized "\n")
|
||
|
```
|
||
|
|
||
|
Here, `(serialize c)` takes the computation state graph in memory and flattens it into the byte array
|
||
|
(possible due to the graph being unidirectional). When executed, it will give you something like this:
|
||
|
|
||
|
```
|
||
|
serialized: #<bytearray 1200076003000608000710000740000...>
|
||
|
;; Some of the output was trimmed to save screen space
|
||
|
```
|
||
|
|
||
|
This flattened representation is computed by getting hold of `c` as a "root" and then traversing the
|
||
|
graph, copying and compacting every object traversed into the contiguous area in memory. Because
|
||
|
memory addressing in Valeri is offset-based, there is no separate format for the serialized data.
|
||
|
Data structures in memory are exactly the same as in the serialized form.
|
||
|
|
||
|
Then we deserialize the byte array back:
|
||
|
|
||
|
```
|
||
|
(def deserialized (deserialize serialized))
|
||
|
(println "deserialized: " deserialized "\n")
|
||
|
```
|
||
|
|
||
|
When executed, this code gives us:
|
||
|
|
||
|
```
|
||
|
deserialized: #<continuation>
|
||
|
```
|
||
|
|
||
|
It's important to reiterate that there is no separate format for serialized data. Because of this,
|
||
|
"deserialization" just involves copying the content of the byte array into the heap, and reinterpreting
|
||
|
the beginning of that data as the beginning of the resulting data structure. Of course I'll do some
|
||
|
validation in the future so that you can't corrupt the runtime by handcrafting the byte array, but the
|
||
|
general approach will stay the same.
|
||
|
|
||
|
And finally, we execute the deserialized continuation:
|
||
|
|
||
|
```
|
||
|
(println "result: " (deserialized 2))
|
||
|
```
|
||
|
|
||
|
Which gives us:
|
||
|
|
||
|
```
|
||
|
result: 3
|
||
|
```
|
||
|
|
||
|
And of course we can easily run it again:
|
||
|
|
||
|
```
|
||
|
(println "result: " (deserialized 42))
|
||
|
result: 43
|
||
|
```
|
||
|
|
||
|
This execution model is very flexible. By using it, you can save the full "image" of the running program,
|
||
|
or only part of it. Or send computation over the network to be performed on the "other side", where that
|
||
|
"other side" can be reasonably certain that the computation won't be able to do anything it's not supposed
|
||
|
to do.
|
||
|
|
||
|
Another interesting thing you can do with this is "execution snapshotting", where you can save state of
|
||
|
your program at a certain point in time, and then revert back to it if you are debugging or otherwise
|
||
|
reproducing unintended behavior.
|