Add a post about the vm progress

2023-12-31 04:13:18 +00:00 · 2023-12-31 04:13:18 +00:00 · 78c9c6a1ba
commit 78c9c6a1ba
parent 52c85e9d35
2 changed files with 42 additions and 1 deletions
--- a/content/posts/vm_progress_update_dicts_and_parsing/note.md
+++ b/content/posts/vm_progress_update_dicts_and_parsing/note.md
@@ -0,0 +1,42 @@
+X-Date: 2023-12-31T09:00:00Z
+X-Note-Id: 9ef1ba1c-2fea-4bac-9126-db53b6c127cf
+Subject: VM progress update: dicts and parsing
+X-Slug: vm_progress_update_dicts_and_parsing
+
+This will be a short update, because I don't have much time to write at the moment.
+Still, the virtual machine is progressing at a steady pace. I feel that I'm pretty close
+to having everything in place to write a compiler on top of it.
+
+First, I've added support for proper dictionaries. As I wrote in the previous post, the dictionary
+implementation is based on red-black trees, which means that I had to carefully refactor all
+existing basic data structures so that they all support a "strict total order" comparison.
+Like other data structures, dictionaries can be "frozen", in which case they turn into a
+binary search tree represented as an array (a frozen data structure can never be resized, so a
+fixed-size layout is OK).
+
+Then, arrays no longer have a special-case implementation for storing integer types. I thought it
+would be a good idea, but its implementation details leaked into other pieces of the codebase,
+causing bloat and bugs. I've since refactored the special cases away, and things became much simpler.
+It means that an array of 64-bit integers is 2x as large as it could have been, but that's fine for now.
+
+Arrays have received support for in-place resizing. You can take a pointer to an array and increase or
+decrease its size, and doing so doesn't invalidate existing pointers to the array.
+
+Strings are now represented as arrays of 32-bit Unicode code points (effectively UTF-32). Initially I
+thought that for the sake of compactness it would be better to store a variable-length encoding
+internally, but the lack of random access to characters messed up a few of the things that depended
+on strings. So I've rewritten them to use a less compact but more user-friendly form.
+
+There is now support for proper boolean values. Initially I thought that having the "truthy" `'t`
+symbol and `nil` would do. But then I realized that many external real-world apps do care about
+proper boolean types (e.g. anything that consumes JSON). So I've added a boolean type as a first-class
+citizen.
+
+And finally, there is now a "writer" and a "reader". The purpose of the "writer" is to serialize
+data structures into text form. This can be used to print data structures to the screen, or to
+debug the raw frozen slices taken from memory. The "reader" does the reverse: it takes the
+textual representation and turns it into the language's data structures.
+
+Because I'm basing the language on S-expressions, the textual representation that the
+"reader" consumes is almost the AST (abstract syntax tree) that the compiler can use to produce
+the lower-level bytecode.
--- a/1
+++ b/1
@@ -1 +0,0 @@
-/nix/store/0jp7ls9glc96hwd7y4jhxcvb30dwjljc-knazarov.com-0.1.0
				`@ -1 +0,0 @@`
				`/nix/store/0jp7ls9glc96hwd7y4jhxcvb30dwjljc-knazarov.com-0.1.0`