X-Date: 2024-01-28T16:00:00Z
X-Note-Id: 566a90f3-f0c6-4348-89c1-8fe83535e365
Subject: Implementing globals
X-Slug: implementing_globals
Global variables in dynamic languages are usually implemented differently from those in static languages.
For example, consider this C code:
```
const int globalvar = 42;
int foo() {
    return globalvar;
}
```
Here, the compiler knows that `globalvar` is defined, and the linker can even replace the reference
with a direct pointer to it. This is because a static language, by design, usually requires
that all references be known in advance. Otherwise it would be impossible to determine their types and
check some of the memory safety guarantees.

Many dynamic languages use a different strategy, especially in a REPL. Consider, for example, this
Python REPL session:
```
>>> def foo():
...     return globalvar
...
>>> globalvar = 42
>>> foo()
42
```
As you can see, Python has no problem compiling `foo` even though the variable has not been defined
yet. We can then define the variable and call `foo`, and everything works. This is because Python
determines that `globalvar` is not a local variable in the context of `foo` and emits a dynamic lookup
that finds the variable by name in a dictionary of globals when the function runs.
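
The same behaviour can be observed outside the REPL. The sketch below is only an illustration for CPython: it uses the standard `dis` module to check that `foo` contains a `LOAD_GLOBAL` instruction, and it writes to `foo.__globals__` (the globals dictionary the function was defined in) instead of using a plain top-level assignment, so that the dictionary is visible. The helper variable names are mine.

```python
import dis


def foo():
    return globalvar  # never assigned inside foo, so compiled as a global lookup


# The compiled bytecode refers to the variable by name.
opnames = [ins.opname for ins in dis.get_instructions(foo)]
uses_global_lookup = "LOAD_GLOBAL" in opnames

# Calling foo before the name exists fails at run time, not at compile time.
try:
    foo()
    failed_before_define = False
except NameError:
    failed_before_define = True

# Define, then redefine, the global through the function's globals dictionary.
foo.__globals__["globalvar"] = 42
first = foo()
foo.__globals__["globalvar"] = 43
second = foo()

print(uses_global_lookup, failed_before_define, first, second)
# prints: True True 42 43
```

Note that `foo` is never recompiled: it picks up the new value because the lookup happens on every call.
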
For languages with a REPL, it is very convenient to be able to define and re-define top-level global
variables and have all the rest of your functions pick them up without having to re-compile.
A REPL is a form of image-based development, just a tiny bit less powerful. In my language, I'm trying
to build something that is familiar to Lisp developers: the ability to dynamically re-evaluate blocks
of code from the editor by sending them to a running process.

For that, I've implemented support for addressing global variables in the virtual machine and the assembler.
Ideally they should be module-scoped, but for now I only have one global scope, which will do for a while.
Here's an example in the assembler of how to use globals:
```
;; Declare a global symbol
(global foo)
(const forty-two 42)
(sr 2)
;; Set the global to a constant
(setglobal foo forty-two)
;; Make sure the value of the global
;; is correct
(getglobal r0 foo)
(aeq r0 forty-two)
(retnil)
```
In this case, `(global foo)` is not an actual opcode, but a directive telling the assembler to
add `foo` to an array of globals. The virtual machine uses a few optimizations so that
it performs a dictionary lookup only on first access, not every time.
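
One way such a "lookup only on first access" scheme could work is sketched below in Python. This is not the actual virtual machine code: `Cell`, `GlobalTable`, `Chunk` and the `lookups` counter are hypothetical names, and I'm assuming that each piece of bytecode keeps its array of global names next to a lazily filled array of resolved cells.

```python
class Cell:
    """Storage for one global; every chunk using the name shares this object."""

    def __init__(self):
        self.defined = False
        self.value = None


class GlobalTable:
    """The single global scope: a dictionary from names to cells."""

    def __init__(self):
        self.cells = {}
        self.lookups = 0  # counts dictionary lookups, for demonstration only

    def resolve(self, name):
        self.lookups += 1
        return self.cells.setdefault(name, Cell())


class Chunk:
    """A piece of bytecode with its own array of global names."""

    def __init__(self, table, names):
        self.table = table
        self.names = list(names)               # as declared with (global ...)
        self.cache = [None] * len(self.names)  # resolved cells, filled lazily

    def _cell(self, index):
        cell = self.cache[index]
        if cell is None:  # first access: one dictionary lookup
            cell = self.table.resolve(self.names[index])
            self.cache[index] = cell
        return cell       # later accesses: array index only

    def setglobal(self, index, value):
        cell = self._cell(index)
        cell.value = value
        cell.defined = True

    def getglobal(self, index):
        cell = self._cell(index)
        if not cell.defined:
            raise NameError(f"global {self.names[index]!r} is not defined")
        return cell.value


table = GlobalTable()
writer = Chunk(table, ["foo"])
reader = Chunk(table, ["foo"])

writer.setglobal(0, 42)
first = reader.getglobal(0)
writer.setglobal(0, 43)  # redefinition is visible through the shared cell
second = reader.getglobal(0)

print(first, second, table.lookups)
# prints: 42 43 2
```

Each chunk pays for one dictionary lookup per global, and because both chunks end up holding the same cell, a redefinition made through one is seen by the other without recompiling anything.
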
So far, this does pretty much what Lua and Python do, and it doesn't look like anything special.
But it will become important later on, when I add the ability to load multiple pieces of bytecode
and have them call each other.