Add a post about the vm progress

Konstantin Nazarov 2023-12-31 04:13:18 +00:00
parent 52c85e9d35
commit 78c9c6a1ba
Signed by: knazarov
GPG key ID: 4CFE0A42FA409C22
2 changed files with 42 additions and 1 deletions


@ -0,0 +1,42 @@
X-Date: 2023-12-31T09:00:00Z
X-Note-Id: 9ef1ba1c-2fea-4bac-9126-db53b6c127cf
Subject: VM progress update: dicts and parsing
X-Slug: vm_progress_update_dicts_and_parsing
This will be a short update, because I don't have much time to write at the moment.
Still, the virtual machine is progressing at a steady pace. I feel that I'm pretty close
to having everything in place to write a compiler on top of it.
First, I've added support for proper dictionaries. As I wrote in the previous post, the dictionary
implementation is based on red-black trees, which means that I had to carefully refactor all
existing basic data structures so that they all support a "strict total order" comparison.
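
To illustrate what a "strict total order" across different value types can look like, here is a minimal sketch in Python (used only for illustration). The type ranks and the `compare` function are hypothetical and are not taken from the VM: values of different types are ordered by a fixed rank, and values of the same type are ordered by content.

```python
from functools import cmp_to_key

# Hypothetical type ranks: values of different types are ordered by rank first.
TYPE_RANK = {type(None): 0, bool: 1, int: 2, str: 3, list: 4}

def compare(a, b):
    """Return -1, 0 or 1 for any pair of supported values."""
    ra, rb = TYPE_RANK[type(a)], TYPE_RANK[type(b)]
    if ra != rb:
        return -1 if ra < rb else 1
    if a is None:
        return 0
    if isinstance(a, list):
        # Arrays compare element by element, then by length.
        for x, y in zip(a, b):
            c = compare(x, y)
            if c != 0:
                return c
        return (len(a) > len(b)) - (len(a) < len(b))
    return (a > b) - (a < b)

mixed = ["b", 3, None, [1, "x"], True, "a", [1], 2]
ordered = sorted(mixed, key=cmp_to_key(compare))
```

Because every pair of values is comparable, any value can serve as a key in a tree-based dictionary.
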
Like other data structures, dictionaries can be "frozen", in which case they turn into a
binary search tree represented as an array (this is OK, as there's no way to resize a frozen
data structure).
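
One simple way to picture a frozen dictionary is a sorted array searched with binary search, which behaves like an implicit balanced search tree. The Python sketch below is an assumption about the general idea rather than the VM's actual layout; `freeze` and `frozen_get` are hypothetical names.

```python
from bisect import bisect_left

def freeze(d):
    """Flatten a mapping into two parallel tuples sorted by key."""
    items = sorted(d.items())
    return tuple(k for k, _ in items), tuple(v for _, v in items)

def frozen_get(frozen, key, default=None):
    """Look a key up with binary search over the sorted key array."""
    keys, values = frozen
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return default

frozen = freeze({"pear": 3, "apple": 1, "fig": 2})
```

Lookups stay logarithmic, and since nothing is ever inserted after freezing, no rebalancing metadata is needed.
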
Next, arrays no longer have a special-case implementation for storing integer types. I thought this
would be a good idea, but its implementation details leaked into other pieces of the codebase,
causing bloat and bugs. I've since refactored the special cases away, and things have become much simpler.
It means that an array of 64-bit integers is twice as large as it could have been, but that's fine for now.
Arrays have also received support for in-place resizing. You can take a pointer to an array and increase or
decrease its size. It works in a way that doesn't invalidate existing pointers to the array.
Strings are now represented as arrays of 32-bit Unicode code points (effectively UTF-32). Initially I
thought that for the sake of compactness it would be better to store the variable-length UTF-8 encoding
internally, but the lack of random access to characters messed up a few of the things that depended on
strings. So I've rewritten them to have a less compact, but more user-friendly form.
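
The trade-off can be sketched in Python: with one fixed 4-byte slot per code point, the n-th character sits at byte offset `n * 4`, while a UTF-8 buffer has to be scanned from the start. The helpers `to_codepoints` and `char_at` are hypothetical and only illustrate the idea.

```python
import struct

def to_codepoints(text):
    """Pack each code point into a fixed 4-byte little-endian slot."""
    return struct.pack(f"<{len(text)}I", *(ord(ch) for ch in text))

def char_at(buf, index):
    """Constant-time lookup: the slot offset is simply index * 4."""
    (cp,) = struct.unpack_from("<I", buf, index * 4)
    return chr(cp)

text = "na\u00efve \u2615"      # 7 characters, two of them non-ASCII
fixed = to_codepoints(text)      # 4 bytes per character
utf8 = text.encode("utf-8")      # 1 to 4 bytes per character
```

Here `fixed` takes 28 bytes against 10 for `utf8`, but `char_at(fixed, 6)` finds the last character without decoding anything before it.
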
There is now support for proper boolean values. Initially I thought that having the "truthy" `'t`
symbol and `nil` would do. But then I realized that many external real-world apps do care about
proper boolean types (e.g. anything that consumes JSON). So I've added a boolean type as a first-class
citizen.
And finally, there is now a "writer" and a "reader". The purpose of the "writer" is to serialize
data structures into text form. This can be used to print data structures to the screen, or to
debug the raw frozen slices taken from memory. The "reader" does the reverse: it takes the
textual representation and turns it into the language's data structures.
Because I'm basing the language on S-expressions, the data structures that the "reader" produces
from the textual representation are almost the AST (abstract syntax tree) that the compiler can use
to produce the lower-level bytecode.
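
As a rough illustration of the reader/writer pair, here is a toy version in Python. It is a sketch under simplifying assumptions (only lists, integers and symbols; no strings, quoting or error reporting), and the function names `read` and `write` are hypothetical rather than the VM's API.

```python
def tokenize(src):
    """Split the source text into parentheses and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()

def read(src):
    """Parse one S-expression into nested lists, ints and symbol strings."""
    tokens = tokenize(src)
    pos = 0

    def parse():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(parse())
            pos += 1  # skip the closing ")"
            return items
        try:
            return int(tok)
        except ValueError:
            return tok  # anything that is not an integer is a symbol

    return parse()

def write(value):
    """Serialize nested lists, ints and symbols back into text."""
    if isinstance(value, list):
        return "(" + " ".join(write(v) for v in value) + ")"
    return str(value)

tree = read("(define (square x) (* x x))")
```

The nested list in `tree` already has the shape of the program, which is why a compiler for an S-expression language can work on the reader's output almost directly.
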

1
result

@ -1 +0,0 @@
/nix/store/0jp7ls9glc96hwd7y4jhxcvb30dwjljc-knazarov.com-0.1.0