Add a post about syntax objects
parent
e10d4e0566
commit
346db7677f
1 changed files with 75 additions and 0 deletions
75
content/posts/error_reporting_and_syntax_objects/note.md
Normal file
@ -0,0 +1,75 @@
X-Date: 2024-09-08T01:43:33Z
X-Note-Id: cf3ed5fd-cbf3-41cc-8c75-3b8b1220fb34
Subject: Error reporting and syntax objects
X-Slug: error_reporting_and_syntax_objects

If you know a little bit about Lisp, you may have heard that it is "homoiconic": the code
that it compiles is written the same way as regular data. For example:

```
valeri> (+ 1 2 3 (* 4 5))
26
```

This is of course a program, but you can also quote it to get back a list:

```
valeri> '(+ 1 2 3 (* 4 5))
(+ 1 2 3 (* 4 5))
```

Many people either know or soon realize that this opens up the possibility of source code
transformation, and macros in particular. I've even heard some say that in Lisp you write
code directly as an AST. But this is actually wrong!

Consider what happens if an arithmetic operation fails with a runtime error. How would the
runtime show you the source code location of the error? To do that, the compiler must emit
debug information that maps back to the source code. And if the data structures that the
compiler receives as input are just regular lists, the source mapping is lost.

So practical Lisp implementations (at least Scheme ones) actually do have an AST, in the form
of "syntax objects". See the [Racket docs](https://docs.racket-lang.org/guide/stx-obj.html) for an
in-depth explanation.

In Scheme, syntax objects can wrap any other object and give it additional context such as
lexical scope, source code location, or any other custom metadata. You can "pack" and "unpack"
syntax objects if you really want to fiddle with the low-level representation. Scheme also uses
syntax objects for its hygienic macro system, but that's out of scope for me right now.
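
To make the wrapping idea concrete, here is a minimal Python sketch of a reader that attaches a line and column to every datum it produces. It is not how Valeri or Racket implement this: the `Syntax` class, its fields, and the `read` function are hypothetical names chosen for illustration, and only integers, symbols, and parenthesized lists are handled.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Syntax:
    """A datum plus where it came from (hypothetical layout, for illustration)."""
    datum: Any  # int, str (a symbol), or a list of Syntax
    line: int   # 1-based
    col: int    # 1-based


def read(src: str) -> Syntax:
    """Read one expression, wrapping every datum in a Syntax object."""
    pos, line, col = 0, 1, 1

    def advance() -> None:
        nonlocal pos, line, col
        if src[pos] == "\n":
            line, col = line + 1, 1
        else:
            col += 1
        pos += 1

    def skip_ws() -> None:
        while pos < len(src) and src[pos].isspace():
            advance()

    def expr() -> Syntax:
        skip_ws()
        if pos >= len(src):
            raise SyntaxError(f"{line}:{col} unexpected end of input")
        start_line, start_col = line, col
        if src[pos] == ")":
            raise SyntaxError(f"{line}:{col} unexpected ')'")
        if src[pos] == "(":
            advance()
            items = []
            while True:
                skip_ws()
                if pos >= len(src):
                    raise SyntaxError(f"{start_line}:{start_col} unterminated list")
                if src[pos] == ")":
                    advance()
                    return Syntax(items, start_line, start_col)
                items.append(expr())
        # Anything else is an atom: consume up to whitespace or a parenthesis.
        start = pos
        while pos < len(src) and not src[pos].isspace() and src[pos] not in "()":
            advance()
        token = src[start:pos]
        try:
            datum: Any = int(token)
        except ValueError:
            datum = token  # treat non-numbers as symbols
        return Syntax(datum, start_line, start_col)

    return expr()


# The nested list remembers that it started on line 2, column 4.
stx = read("(+ 1\n   (* 4 5))")
inner = stx.datum[2]
print(inner.line, inner.col)  # 2 4
```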

Since I want [Valeri](https://git.sr.ht/~knazarov/valeri) to be friendly, I've taken a stab
at implementing syntax objects. To play with them in the REPL, you can do the following:

```
valeri> (syntax 42)
#<syntax 42>

valeri> (syntax (1 2 3))
#<syntax (#<syntax 1> #<syntax 2> #<syntax 3>)>

valeri> (syntax {1 2 3 4})
#<syntax (dict #<syntax 1> #<syntax 2> #<syntax 3> #<syntax 4>)>
```

Here, `syntax` is a special form that preserves the syntax information of its argument.
Compare `(syntax (1 2 3))` in the example with the following:

```
valeri> (quote (1 2 3))
(1 2 3)
```

Quote actually does the reverse: it strips the syntax information from its argument, so the user
sees what they expect. Whenever "atoms" (numbers, strings, symbols, etc.) are compiled into
bytecode, their syntax information is stripped.
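
The stripping step can be sketched in a few lines of Python. This is only an illustration of the idea, not Valeri's code: `Syntax` and `strip` are hypothetical names, and the wrapper is assumed to hold either an atom or a list of further wrappers.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Syntax:
    """Hypothetical wrapper: a datum plus its source location."""
    datum: Any
    line: int = 0
    col: int = 0


def strip(value: Any) -> Any:
    """Recursively remove Syntax wrappers, leaving plain data behind."""
    if isinstance(value, Syntax):
        return strip(value.datum)
    if isinstance(value, list):
        return [strip(item) for item in value]
    return value  # a bare atom has nothing to strip


# Roughly what (syntax (1 2 3)) holds, and what (quote (1 2 3)) shows.
wrapped = Syntax([Syntax(1, 1, 2), Syntax(2, 1, 4), Syntax(3, 1, 6)], 1, 1)
print(strip(wrapped))  # [1, 2, 3]
```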

In the current implementation, the reader that parses source code into the object hierarchy
already embeds source code information. The compiler and runtime don't use this information yet
to enrich error messages, but that's coming up soon.

And finally, because I've added the collection of syntax context to the reader, it now reports
errors that happen during the read phase, like this:

```
valeri> (1 2 "foo)
#<error:syntax-error "<unknown>:1:6 Syntax error: unterminated string">
```
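
As a rough illustration of how a reader can produce a message like this, here is a small Python sketch that tracks line and column while scanning and reports where an unclosed string began. It rests on assumptions: that columns are 1-based and that the reported position is the opening quote (which happens to agree with the `1:6` above), and `find_unterminated_string` is a hypothetical name, not part of Valeri.

```python
from typing import Optional, Tuple


def find_unterminated_string(src: str, filename: str = "<unknown>") -> Optional[str]:
    """Return an error message if a string literal is never closed, else None."""
    line, col = 1, 1
    opened_at: Optional[Tuple[int, int]] = None  # position of the open quote
    escaped = False
    for ch in src:
        if opened_at is None:
            if ch == '"':
                opened_at = (line, col)
        elif escaped:
            escaped = False       # this character was escaped, ignore it
        elif ch == "\\":
            escaped = True        # the next character cannot close the string
        elif ch == '"':
            opened_at = None      # string closed normally
        # Keep the position in sync with what has been consumed so far.
        if ch == "\n":
            line, col = line + 1, 1
        else:
            col += 1
    if opened_at is None:
        return None
    return f"{filename}:{opened_at[0]}:{opened_at[1]} Syntax error: unterminated string"


print(find_unterminated_string('(1 2 "foo)'))
# <unknown>:1:6 Syntax error: unterminated string
```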