diff --git a/content/posts/vm_progress_update_dicts_and_parsing/note.md b/content/posts/vm_progress_update_dicts_and_parsing/note.md
new file mode 100644
index 0000000..d1bd7ba
--- /dev/null
+++ b/content/posts/vm_progress_update_dicts_and_parsing/note.md
@@ -0,0 +1,42 @@
+X-Date: 2023-12-31T09:00:00Z
+X-Note-Id: 9ef1ba1c-2fea-4bac-9126-db53b6c127cf
+Subject: VM progress update: dicts and parsing
+X-Slug: vm_progress_update_dicts_and_parsing
+
+This will be a short update, because I don't have much time to write at the moment.
+But still, the virtual machine progresses at a steady pace. I feel that I'm pretty close
+to having things in place to write a compiler on top of.
+
+First, I've added support for proper dictionaries. As I wrote in the previous post, the dictionary
+implementation is based on red-black trees, which means that I had to carefully refactor all
+existing basic data structures, so that they all have "strict total order" comparison capability.
+Like other data structures, dictionaries can be "frozen", in which case they turn into a
+binary search tree represented as an array (as there's no way to resize a frozen data structure, this
+is OK).
+
+Then, arrays no longer have a special-case implementation for storing integer types. I thought this
+would be a good idea, but its implementation details leaked into other pieces of the codebase,
+causing bloat and bugs. I've since refactored the special cases away, and things became much simpler.
+It means that an array of 64-bit integers is 2x as large as it could have been, but it's fine right now.
+
+Arrays have received support for in-place resize. You can take a pointer to an array and increase or
+decrease its size. It works in a way that doesn't invalidate existing pointers to the array.
+
+Strings are now represented as arrays of 32-bit Unicode code points. Initially I thought that for the
+sake of compactness it would be better to store the variable-length encoding internally, but the lack of
+random access to characters messed up a few of the things that depended on strings. So I've
+rewritten them to have a less compact form, but be more user-friendly.
+
+There is now support for proper boolean values. Initially I thought that having the "truthy" `'t`
+symbol and `nil` would do. But then I realized that many external real-world apps do care about
+proper boolean types (e.g. anything that consumes JSON). So I've added a boolean type as a first-class
+citizen.
+
+And finally, there is now a "writer" and a "reader". The purpose of the "writer" is to serialize
+data structures into text form. This can be used to print data structures to the screen, or to
+debug the raw frozen slices taken from memory. The "reader" does the reverse: it takes the
+textual representation and turns it into the language's data structures.
+
+Because I'm basing the language on S-expressions, the textual representation that the
+"reader" consumes is almost the AST (abstract syntax tree) that the compiler can use to produce
+the lower-level bytecode.
diff --git a/result b/result
deleted file mode 120000
index e947cf0..0000000
--- a/result
+++ /dev/null
@@ -1 +0,0 @@
-/nix/store/0jp7ls9glc96hwd7y4jhxcvb30dwjljc-knazarov.com-0.1.0
\ No newline at end of file