Add a post on calling convention
parent
4b1817e1ae
commit
c4d2281704
1 changed files with 66 additions and 0 deletions
66
content/posts/calling_convention_dilemma/note.md
Normal file
@ -0,0 +1,66 @@
X-Date: 2023-08-28T23:30:00Z
X-Note-Id: 4204588c-e5bd-4214-a08f-b2cd77029d4b
Subject: Calling convention dilemma
X-Slug: calling_convention_dilemma

Compilers for statically typed languages usually know how to pass arguments to a function, because
that function's signature is known in advance. If you look at the System V x86-64 C calling convention,
for example, you'd see that parameters are passed either in registers (for small values like integers
or pointers) or on the stack (for larger values, or once the argument registers run out).

Even if you don't know exactly which function you're calling (as with function pointers), the prototype
of that function tells the compiler everything it needs to know to produce the platform-specific machine code.

Dynamic languages are different. The prototype is not known in advance, so you have to rely
on a higher-level construct. For example, you always pass one parameter, which is an array, or two
parameters: an array and a hash table (for Python-like keyword arguments).

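A minimal sketch of that uniform convention, written in Python. The names `greet` and `call_dynamic` are
hypothetical and exist only for illustration: every callee receives exactly one array and one hash table,
no matter how many arguments the caller supplied.

```python
from typing import Any, Callable

# Hypothetical uniform signature: every callee takes one array of positional
# arguments and one hash table of keyword arguments.
Callee = Callable[[list[Any], dict[str, Any]], Any]


def greet(args: list[Any], kwargs: dict[str, Any]) -> str:
    # The callee unpacks what it needs and checks the count itself.
    if len(args) != 1:
        raise TypeError(f"greet expects 1 positional argument, got {len(args)}")
    greeting = kwargs.get("greeting", "Hello")
    return f"{greeting}, {args[0]}!"


def call_dynamic(fn: Callee, *args: Any, **kwargs: Any) -> Any:
    # The caller never needs the callee's prototype: it always packs
    # everything into the same two containers.
    return fn(list(args), dict(kwargs))
```

The price of this simplicity is that every call allocates containers, even for a single argument.
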
I've suddenly found myself somewhere in the middle. I'm designing my programming language VM specifically
for dynamic languages, yet its architecture is derived from a RISC CPU. It has a stack, but opcodes
deal exclusively with registers, so you need explicit load/store instructions to access it.

Having registers means that it would be nice to pass parameters in them when the number
of arguments to the function is small enough. It would also help with optimizing tail calls
(less shuffling of the stack).

The problem can be demonstrated with a "print" function. Imagine that it's just a built-in that accepts
an arbitrary number of arguments. In a made-up assembly, calling it would look something like this:

```
;; Load two constant strings
;; into registers r0 and r1
loadc r0, hello
loadc r1, world

;; How to detect arg count?
call print

ret

.const
hello: "Hello"
world: "World"
```

In this example, the `print` function has no way to know that it should use registers r0 and r1, even if
the calling convention allows passing arguments through registers. It simply cannot tell that there
are two arguments (there could have been more). And even if we are not talking about a "variadic" function,
the caller may only have a pointer to it and thus no way to inspect its prototype.

What I'm leaning towards in this case is embedding the argument count into the low-level "virtual CPU"
calling convention. So instead of `call print`, you'd have `call 2, print`, with the count encoded into
the instruction as an "immediate" value. When this opcode is executed, it sets up a new call frame and
puts the number of arguments into the frame itself (along with the return
address and a link to the previous call frame). `print` can then look at the frame and deduce the correct
number.

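Here is a rough Python sketch of how such a `call` could behave. Everything in it is an assumption made
for illustration (the `Frame` and `VM` classes, the eight registers, the `builtin_print` helper); it is not
the actual VM, only the idea of a frame that remembers the argument count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    argc: int                 # immediate taken from the `call` instruction
    return_addr: int          # where to resume in the caller
    prev: Optional["Frame"]   # link to the caller's frame


class VM:
    def __init__(self) -> None:
        self.regs = [None] * 8   # r0..r7; the register count is an assumption
        self.frame: Optional[Frame] = None
        self.pc = 0
        self.output: list[str] = []

    def call(self, argc: int, builtin) -> None:
        # `call argc, target`: set up a frame that records the argument count.
        self.frame = Frame(argc=argc, return_addr=self.pc + 1, prev=self.frame)
        builtin(self)
        # Return: resume the caller and drop the frame.
        self.pc = self.frame.return_addr
        self.frame = self.frame.prev


def builtin_print(vm: VM) -> None:
    # The callee reads the count from its frame instead of guessing.
    argc = vm.frame.argc
    vm.output.append(" ".join(str(v) for v in vm.regs[:argc]))


def run_example() -> list[str]:
    vm = VM()
    vm.regs[0] = "Hello"        # loadc r0, hello
    vm.regs[1] = "World"        # loadc r1, world
    vm.call(2, builtin_print)   # call 2, print
    return vm.output
```

The sketch only handles built-ins and only register arguments; a real frame would also have to deal with
calls into bytecode.
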
This approach works because the caller always knows exactly how many arguments it passes. The
callee may then take up to a certain pre-defined number of arguments from the registers, and the
rest from the stack.

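The split between registers and stack could look roughly like this in Python. The limit of four register
arguments and the helper names `pass_args` and `fetch_args` are assumptions of this sketch, as is the
choice to push overflow arguments in left-to-right order.

```python
MAX_REG_ARGS = 4  # assumed limit; the text only says "a certain pre-defined number"


def pass_args(args, regs, stack):
    """Caller side: the first MAX_REG_ARGS go in registers, the rest on the stack."""
    for i, value in enumerate(args[:MAX_REG_ARGS]):
        regs[i] = value
    # Push the overflow in order; the callee reads it back in the same order.
    stack.extend(args[MAX_REG_ARGS:])
    return len(args)  # this becomes the immediate of `call argc, target`


def fetch_args(argc, regs, stack):
    """Callee side: rebuild the argument list from the count stored in the frame."""
    in_regs = min(argc, MAX_REG_ARGS)
    on_stack = argc - in_regs
    # The overflow arguments are the topmost `on_stack` entries of the stack.
    overflow = stack[len(stack) - on_stack:]
    return regs[:in_regs] + overflow
```

With the count available, the callee never touches a register or stack slot that the caller did not fill.
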
You may be wondering: why go to all these lengths when a purely stack-based virtual machine would
be much simpler and probably already solves these problems? Well, the most straightforward answer
is that I want to make the VM a good target for code generation. Reading the code
generated for a register-based machine is a lot easier than for a stack-based one. The same goes for debugging.

But at the end of the day, it's just fun to do. So let's see how it goes.