Add a post on calling convention

2023-08-28 22:57:58 +01:00 · 2023-08-28 22:57:58 +01:00 · c4d2281704
commit c4d2281704
parent 4b1817e1ae
1 changed files with 66 additions and 0 deletions
--- a/content/posts/calling_convention_dilemma/note.md
+++ b/content/posts/calling_convention_dilemma/note.md
@@ -0,0 +1,66 @@
+X-Date: 2023-08-28T23:30:00Z
+X-Note-Id: 4204588c-e5bd-4214-a08f-b2cd77029d4b
+Subject: Calling convention dilemma
+X-Slug: calling_convention_dilemma
+
+Statically typed languages usually know how to pass arguments to a function, because
+that function's signature is known in advance. If you look at the x86-64 C calling convention for example, you'd
+see that parameters are passed either in registers (for small values like integers or pointers) or
+on the stack (for larger values, or once the argument registers run out).
+
+Even if you don't know what exact function you're calling (in case of function pointers), the prototype
+of that function tells the compiler everything it needs to know to produce the platform-specific machine code.
+
+For dynamic languages, it is different. The prototype is not known in advance, and so you'd have to rely
+on a higher-level construct. For example, you always pass one parameter, which is an array. Or two parameters,
+one of which is an array and another is a hash table (for Python-like keyword arguments).
+
+I've suddenly found myself somewhere in the middle. I'm designing my programming language VM specifically
+for dynamic languages, but its architecture is derived from a RISC CPU.
+It has a stack, but opcodes operate exclusively on registers, so stack access needs explicit load/store instructions.
+
+Having registers means that it would be nice to pass parameters in them when the number
+of arguments to the function is small enough. It would help with optimizing tail calls as well
+(less shuffling of the stack).
+
+The problem can be demonstrated with a `print` function. Imagine that it's just a built-in that accepts
+an arbitrary number of arguments. In a made-up assembly, calling it would look something like this:
+
+```
+;; Load two constant strings
+;; into registers r0 and r1
+loadc r0, hello
+loadc r1, world
+
+;; How to detect arg count?
+call print
+
+ret
+
+.const
+    hello: "Hello"
+    world: "World"
+```
+
+In this example, the `print` function has no way to know that it should use registers r0 and r1, even if
+the calling convention allows passing arguments through registers: it cannot tell that there
+are two arguments (there could have been more). And even when the callee is not a "variadic" function,
+the caller may only hold a pointer to it and thus have no way to inspect its signature.
+
+What I'm leaning towards in this case is to embed the argument count into the low-level "virtual CPU"
+calling convention. So instead of `call print`, you'd have `call 2, print`, with the count encoded as an
+"immediate" value in the instruction. When this instruction is executed, it sets up a new call frame and
+stores the number of arguments in the frame itself (along with the return
+address and a link to the previous call frame). `print` can then look at the frame and read the correct
+number.
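+
+With that change, the earlier example might look like the sketch below. It reuses the same made-up assembly;
+the only difference is the `call 2, print` form, and the exact encoding is still an open design choice.
+
+```
+;; Load two constant strings
+;; into registers r0 and r1
+loadc r0, hello
+loadc r1, world
+
+;; The immediate 2 is stored in
+;; the new call frame for print
+call 2, print
+
+ret
+
+.const
+    hello: "Hello"
+    world: "World"
+```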
+
+The benefit of this approach is that the caller always knows the exact number of arguments, so it can
+simply state it at every call site. The callee can then take up to a certain pre-defined number of arguments from the registers, and the
+rest from the stack.
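+
+As a rough sketch of that split, assume purely for illustration that only r0 and r1 carry arguments and
+that a `push` instruction exists. Neither is settled here; both are hypothetical. A three-argument call
+could then look like this:
+
+```
+;; First two arguments fit
+;; into registers r0 and r1
+loadc r0, hello
+loadc r1, world
+
+;; Hypothetical: the third one
+;; does not fit, so it is pushed
+loadc r2, bang
+push r2
+
+;; print reads 3 from its frame:
+;; r0, r1, then one stack slot
+call 3, print
+
+ret
+
+.const
+    hello: "Hello"
+    world: "World"
+    bang: "!"
+```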
+
+You may be wondering -- why go to all these lengths when a purely stack-based virtual machine would
+be much simpler and probably already solves these problems? Well, the most straightforward answer
+to this would be that I want to make the VM a good target for code generation. Reading the code
+generated for a register-based machine is a lot easier than for a stack-based one. Same for debugging.
+
+But at the end of the day, it's just fun to do. So let's see how it goes.