Add a post on calling convention

2023-08-28 22:57:58 +01:00 · 2023-08-28 22:57:58 +01:00 · c4d2281704
commit c4d2281704
parent 4b1817e1ae
1 changed files with 66 additions and 0 deletions
--- a/content/posts/calling_convention_dilemma/note.md
+++ b/content/posts/calling_convention_dilemma/note.md
@@ -0,0 +1,66 @@
 X-Date: 2023-08-28T23:30:00Z
 X-Note-Id: 4204588c-e5bd-4214-a08f-b2cd77029d4b
 Subject: Calling convention dilemma
 X-Slug: calling_convention_dilemma
 Statically typed languages usually know how to pass arguments to a function, because
 that function's signature is known in advance. If you look at the x86-64 System V C calling convention, for example, you'd
 see that parameters are passed either in registers (for small values like integers or pointers) or
 on the stack (for larger values, or once the argument registers run out).
 Even if you don't know exactly which function you're calling (as with function pointers), the prototype
 of that function tells the compiler everything it needs to know to produce the platform-specific machine code.
 For dynamic languages, it is different. The prototype is not known in advance, so you have to rely
 on a higher-level construct. For example, you always pass one parameter, which is an array, or two parameters:
 an array and a hash table (for Python-like keyword arguments).
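
The "array plus hash table" convention can be sketched in a few lines of Python. This is only an illustration: `call` and `print_builtin` are made-up names, and the builtin returns the joined text instead of printing it so the result is easy to inspect.

```python
# Hypothetical uniform convention: every callable takes exactly two
# parameters, a list of positional arguments and a dict of keyword arguments.
def print_builtin(args, kwargs):
    sep = kwargs.get("sep", " ")
    return sep.join(str(a) for a in args)


def call(fn, args, kwargs=None):
    # The caller needs no knowledge of fn's real signature:
    # the shape of the call is always (list, dict).
    return fn(list(args), dict(kwargs or {}))


greeting = call(print_builtin, ["Hello", "World"])
dashed = call(print_builtin, ["Hello", "World"], {"sep": "-"})
```

The price of this uniformity is that every call allocates at least one container, even for a single argument.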
 I've suddenly found myself somewhere in the middle. I'm designing the VM for my programming language
 specifically for dynamic languages. Still, its architecture is derived from a RISC CPU.
 It has a stack, but opcodes deal exclusively with registers, so memory access needs explicit load/store instructions.
 Having registers means that it would be nice to be able to pass parameters in them when the number
 of arguments to the function is small enough. It would also help with optimizing tail calls
 (less shuffling of the stack).
 The problem here can be demonstrated on a "print" function. Imagine that it's just a built-in that accepts
 an arbitrary number of arguments. In a made-up assembly, it would look something like this:
 ```
 ;; Load two constant strings
 ;; into registers r0 and r1
 loadc r0, hello
 loadc r1, world
 ;; How to detect arg count?
 call print
 ret
 .const
    hello: "Hello"
    world: "World"
 ```
 In this example, the `print` function has no way to know that it should use registers r0 and r1, even if
 the calling convention allows passing arguments through registers: nothing tells it that there
 are two arguments (there could have been more). And even for a non-variadic function, the caller
 may only hold a pointer to it and thus have no way to inspect its signature.
 What I'm leaning towards is embedding the argument count into the low-level "virtual CPU"
 calling convention. So instead of `call print`, you'd have `call 2, print`, with the count encoded as an "immediate"
 value in the instruction. When this instruction is executed, it sets up a new call frame and
 stores the number of arguments in the frame itself (along with the return
 address and a link to the previous call frame). `print` can then look at the frame and read the argument
 count from it.
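
To make the idea concrete, here is a minimal toy interpreter in Python. It is a sketch under assumptions, not the actual VM: the `Frame` fields, the tuple-based instruction encoding, the eight-register file and the `builtin_print` helper are all invented for illustration, and output is collected in a list rather than printed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    return_addr: int          # where to resume in the caller
    argc: int                 # immediate taken from the `call` instruction
    prev: Optional["Frame"]   # link to the previous call frame


class VM:
    def __init__(self, constants):
        self.regs = [None] * 8        # assumed register file size
        self.constants = constants
        self.frame = None
        self.output = []              # collected instead of printed
        self.builtins = {"print": self.builtin_print}

    def builtin_print(self):
        # The callee learns the argument count from its own frame.
        argc = self.frame.argc
        self.output.append(" ".join(str(v) for v in self.regs[:argc]))

    def run(self, program):
        pc = 0
        while pc < len(program):
            op, *operands = program[pc]
            pc += 1
            if op == "loadc":
                reg, name = operands
                self.regs[reg] = self.constants[name]
            elif op == "call":
                argc, target = operands
                # Set up a new frame that records the argument count.
                self.frame = Frame(return_addr=pc, argc=argc, prev=self.frame)
                self.builtins[target]()
                # Return: resume after the call and drop the frame.
                pc = self.frame.return_addr
                self.frame = self.frame.prev
            elif op == "ret":
                return


program = [
    ("loadc", 0, "hello"),
    ("loadc", 1, "world"),
    ("call", 2, "print"),   # `call 2, print`
    ("ret",),
]
vm = VM({"hello": "Hello", "world": "World"})
vm.run(program)
```

Here `vm.output` ends up holding a single line built from exactly two registers, because `print` never has to guess how many of them are live.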
 This works because the caller always knows exactly how many arguments it is passing. The
 callee can then take up to a certain predefined number of arguments from the registers, and the
 rest from the stack.
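
The register/stack split could look roughly like this in Python. The limit of four argument registers and the helper names `pass_args` and `collect_args` are assumptions made for the sketch, as is the choice that spilled arguments are pushed in order, so that the last ones end up on top of the stack.

```python
NUM_ARG_REGS = 4  # assumed number of registers reserved for arguments


def pass_args(registers, stack, args):
    """Caller side: fill argument registers first, spill the rest to the stack."""
    for i, value in enumerate(args[:NUM_ARG_REGS]):
        registers[i] = value
    stack.extend(args[NUM_ARG_REGS:])
    return len(args)  # this is what `call N, target` would encode


def collect_args(registers, stack, argc):
    """Callee side: rebuild the argument list using only argc."""
    in_regs = min(argc, NUM_ARG_REGS)
    on_stack = argc - in_regs
    # Spilled arguments occupy the top `on_stack` slots of the stack.
    return registers[:in_regs] + stack[len(stack) - on_stack:]


# Six arguments: four travel in registers, two on the stack.
registers = ["stale"] * 8
stack = ["caller-local"]
argc = pass_args(registers, stack, [1, 2, 3, 4, 5, 6])
many = collect_args(registers, stack, argc)

# Two arguments: stale register contents and the stack are ignored.
registers2 = ["stale"] * 8
stack2 = ["caller-local"]
argc2 = pass_args(registers2, stack2, ["x", "y"])
few = collect_args(registers2, stack2, argc2)
```

Without `argc`, the callee could not distinguish a live register from a stale one, which is exactly the problem in the `print` example.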
 You may be wondering -- why go to all these lengths when a purely stack-based virtual machine would
 be much simpler and probably already solves these problems? Well, the most straightforward answer
 to this would be that I want to make the VM a good target for code generation. Reading the code
 generated for a register-based machine is a lot easier than for a stack-based one. Same for debugging.
 But at the end of the day, it's just fun to do. So let's see how it goes.