From 30f6f5e030fa76589531b468a6c7640dc1dc5f44 Mon Sep 17 00:00:00 2001
From: Konstantin Nazarov
Date: Mon, 7 Oct 2024 00:14:29 +0100
Subject: [PATCH] Publish a post on disassembling functions

---
 content/posts/disassembling_functions/note.md | 68 +++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100644 content/posts/disassembling_functions/note.md

diff --git a/content/posts/disassembling_functions/note.md b/content/posts/disassembling_functions/note.md
new file mode 100644
index 0000000..2bd4492
--- /dev/null
+++ b/content/posts/disassembling_functions/note.md
@@ -0,0 +1,68 @@
+X-Date: 2024-10-06T23:01:40Z
+X-Note-Id: 710a16d0-f850-4d9b-b054-0e6b1c94b1c1
+Subject: Disassembling functions
+X-Slug: disassembling_functions
+
+In the latest patch to [Valeri](https://git.knazarov.com/knazarov/valeri), I've added support
+for disassembling arbitrary functions. It can be done both in the REPL and in file execution mode.
+
+When you disassemble a function, you get back the human-readable virtual machine instructions
+that can be used to see if code generation has been done correctly. This of course can't
+compare with a fully-featured debugger, but is enough to iron out the basics.
+
+Anyway, here's an example:
+
+```
+;; Function that calculates n!
+(fn fact (n)
+  (if (<= n 0)
+    1
+    (* n (fact (- n 1)))))
+
+;; Output the VM bytecode of the function
+(println (disassemble fact))
+```
+
+When executed, this program outputs:
+
+```
+constants:
+  c0: true
+  c1: false
+  c2: 0
+  c3: 1
+  c4: nil
+
+code:
+  mov r1 c0
+  mov r2 r0
+  mov r3 c2
+  less-equal r2 r3 0
+  mov r1 c1
+  equal r1 c0 0
+  jump 3
+  mov r2 c3
+  jump 8
+  mov r2 r0
+  mov r3 c4
+  mov r4 r0
+  mov r5 c3
+  sub r4 r4 r5
+  selfcall r3 r5
+  mul r2 r2 r3
+  mov r1 r2
+  mov r0 r1
+  ret r0
+```
+
+As you can see, there are `constants` and `code` sections. This is because every function
+in Valeri is self-contained and the code is not "glued" together with other functions.
+
+Most of the opcodes take the destination (accumulator) as their first parameter and the
+source operands as the rest. For example, `mov r1 c0` moves the value of constant `c0` into register `r1`.
+And `sub r4 r4 r5` subtracts the value of `r5` from `r4` and puts the result back into `r4`.
+
+If you carefully study the opcodes, you'll notice that there are many redundant operations
+that can be eliminated. This is true, and mostly due to the code generator being naive. In
+the future, when I get to optimizations, it should be possible to reduce the size of the
+bytecode.
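
The destination-first operand order described in the post can be illustrated with a small Python sketch. This is not Valeri's VM: the `run` helper, the dictionary-based register file, and the choice of `r0` as the register holding the argument `n` are all assumptions made for illustration, and only `mov`, `sub`, and `mul` are modelled.

```python
# Hypothetical mini-evaluator for the destination-first convention.
# It is NOT Valeri's actual VM; names and data layout are assumptions.
def run(code, constants, registers):
    """Execute destination-first instructions over a dict of registers."""

    def read(operand):
        # Operands starting with "c" name constants (e.g. "c3");
        # everything else is treated as a register name.
        if operand.startswith("c"):
            return constants[operand]
        return registers[operand]

    for line in code:
        # First parameter is the destination, the rest are sources.
        op, dst, *src = line.split()
        if op == "mov":
            registers[dst] = read(src[0])
        elif op == "sub":
            registers[dst] = read(src[0]) - read(src[1])
        elif op == "mul":
            registers[dst] = read(src[0]) * read(src[1])
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return registers


constants = {"c3": 1}
registers = {"r0": 5}  # assumption: r0 holds the argument n

# The fragment from the listing that computes n - 1 into r4.
result = run(["mov r4 r0", "mov r5 c3", "sub r4 r4 r5"], constants, registers)
print(result["r4"])  # prints 4
```

Under these assumptions, `sub r4 r4 r5` reads both sources before overwriting `r4`, which is why using the same register as source and destination is safe.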