X-Date: 2024-08-11T22:05:08Z
X-Note-Id: feb302ea-ad13-4727-befa-83b7d4cc4333
Subject: First working code generation for simple arithmetic
X-Slug: code_generation_for_simple_arithmetic

I've got the first bytecode output from the compiler. The runtime now takes a string, parses it into S-expressions, and passes them to the compilation phase. The compilation phase is more or less one-pass and emits unoptimized bytecode for a "register machine". The register machine itself doesn't exist yet, so nothing can be executed. But I already know how to build it, because the previous version of Valeri had one.

This is what the entry point in Valeri looks like now:

auto code_str = TRY(String::create("(* (+ 1 2 3) 4)"));
auto reader = Reader(code_str);

auto parsed = TRY(reader.read_one());
auto code_str_written = TRY(write_one(parsed));

// Print the s-expression back, to make sure that
// we've parsed it correctly
TRY(debug_print(code_str_written));

auto compiled = TRY(compile(parsed));
Function& fun = *compiled.to<Function>();

TRY(debug_print(TRY(fun.constants())));
TRY(debug_print(TRY(fun.code())));

And this is what it outputs to the console:

(* (+ 1 2 3) 4)

[1 2 3 4]

[
  #<opcode mov r0 c0>
  #<opcode mov r1 c1>
  #<opcode add r2 r0 r1>
  #<opcode mov r3 c2>
  #<opcode add r0 r2 r3>
  #<opcode mov r1 c3>
  #<opcode mul r0 r0 r1>
  #<opcode ret r0>
]
  • The first line is the expression we are compiling, printed back from the parsed S-expression. It is exactly the same as the input string.
  • The second line is the array of constants. The bytecode refers to them as c<index>, where <index> is the position in this array.
  • The rest is a text representation of the bytecode. It should be fairly obvious what it does. The only thing you need to know is that the accumulator (the register that receives the result) is usually the first argument to the instruction.
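
To check that the listing really computes (* (+ 1 2 3) 4), here is a minimal sketch of an interpreter for it. This is not Valeri's virtual machine, which doesn't exist yet. The `Instr` layout, the `Op` enum, the four-register file, and the `run` and `run_listing` functions are all assumptions made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical instruction encoding, not Valeri's actual one.
enum class Op { Mov, Add, Mul, Ret };

struct Instr {
  Op op;
  std::size_t a;  // destination register (for ret: the register to return)
  std::size_t b;  // mov: constant index; add/mul: first source register
  std::size_t c;  // add/mul: second source register; unused otherwise
};

// Runs the instructions in order and returns the register named by `ret`.
std::int64_t run(const std::vector<std::int64_t>& constants,
                 const std::vector<Instr>& code) {
  std::array<std::int64_t, 4> regs{};  // the listing only uses r0..r3
  for (const Instr& in : code) {
    switch (in.op) {
      case Op::Mov: regs.at(in.a) = constants.at(in.b); break;
      case Op::Add: regs.at(in.a) = regs.at(in.b) + regs.at(in.c); break;
      case Op::Mul: regs.at(in.a) = regs.at(in.b) * regs.at(in.c); break;
      case Op::Ret: return regs.at(in.a);
    }
  }
  throw std::runtime_error("bytecode ended without ret");
}

// The constants and bytecode from the listing, transcribed by hand.
std::int64_t run_listing() {
  const std::vector<std::int64_t> constants = {1, 2, 3, 4};
  const std::vector<Instr> code = {
      {Op::Mov, 0, 0, 0},  // mov r0 c0
      {Op::Mov, 1, 1, 0},  // mov r1 c1
      {Op::Add, 2, 0, 1},  // add r2 r0 r1
      {Op::Mov, 3, 2, 0},  // mov r3 c2
      {Op::Add, 0, 2, 3},  // add r0 r2 r3
      {Op::Mov, 1, 3, 0},  // mov r1 c3
      {Op::Mul, 0, 0, 1},  // mul r0 r0 r1
      {Op::Ret, 0, 0, 0},  // ret r0
  };
  return run(constants, code);
}
```

Under these assumptions, `run_listing()` returns 24: r0 ends up holding (1 + 2 + 3) * 4.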

In this example, a few things could be eliminated. If I were writing the bytecode by hand, I would have dropped the pointless assignments of constants to registers and referred to the constants directly in subsequent instructions. But at this point, that would make the compiler a lot more complex than the prototype needs.
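
For comparison, here is a rough sketch of what such hand-written bytecode could look like if instructions were allowed to name constants directly. The compiler does not emit this, and the encoding is purely hypothetical: `Operand`, `Src`, `DirectOp`, `DirectInstr`, `run_direct`, and `run_direct_example` are names invented for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical operand encoding: a source is either a register or a constant.
enum class Src { Reg, Const };

struct Operand {
  Src kind;
  std::size_t index;
};

enum class DirectOp { Add, Mul, Ret };

struct DirectInstr {
  DirectOp op;
  std::size_t dst;  // destination register (for ret: the register to return)
  Operand lhs;      // unused by ret
  Operand rhs;      // unused by ret
};

std::int64_t run_direct(const std::vector<std::int64_t>& constants,
                        const std::vector<DirectInstr>& code) {
  std::vector<std::int64_t> regs(4, 0);
  // Reads an operand from the constant array or the register file.
  auto load = [&](const Operand& o) {
    return o.kind == Src::Const ? constants.at(o.index) : regs.at(o.index);
  };
  for (const DirectInstr& in : code) {
    switch (in.op) {
      case DirectOp::Add: regs.at(in.dst) = load(in.lhs) + load(in.rhs); break;
      case DirectOp::Mul: regs.at(in.dst) = load(in.lhs) * load(in.rhs); break;
      case DirectOp::Ret: return regs.at(in.dst);
    }
  }
  throw std::runtime_error("bytecode ended without ret");
}

// (* (+ 1 2 3) 4) written by hand: four instructions instead of eight.
std::int64_t run_direct_example() {
  const std::vector<std::int64_t> constants = {1, 2, 3, 4};
  const std::vector<DirectInstr> code = {
      {DirectOp::Add, 0, {Src::Const, 0}, {Src::Const, 1}},  // add r0 c0 c1
      {DirectOp::Add, 0, {Src::Reg, 0}, {Src::Const, 2}},    // add r0 r0 c2
      {DirectOp::Mul, 0, {Src::Reg, 0}, {Src::Const, 3}},    // mul r0 r0 c3
      {DirectOp::Ret, 0, {Src::Reg, 0}, {Src::Reg, 0}},      // ret r0
  };
  return run_direct(constants, code);
}
```

The cost is that every operand now carries a tag saying where it lives, and the compiler has to decide for each argument whether it is a constant or a computed value. That is the extra complexity being put off for now.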

Next steps will be:

  • Writing a simple virtual machine
  • Adding lambda functions, conditionals and such
  • Writing a REPL

Stay tuned.