Publish a post on writing programming languages

Konstantin Nazarov 2023-10-29 22:19:06 +00:00
parent 047c9eb406
commit 04f6f7b549
Signed by: knazarov
GPG key ID: 4CFE0A42FA409C22


@ -0,0 +1,41 @@
X-Date: 2023-10-29T22:30:00Z
X-Note-Id: a2b2c668-ab55-4939-b07c-9315c70084f8
Subject: Writing programming languages is fun
X-Slug: writing_programming_languages_is_fun
If you've never thought to write a programming language yourself because you believe that it's too hard, I'll try to change your mind.
Many software engineers I know personally approach programming languages as "black boxes". They use the language but never look under the hood to see what's happening inside and how it works. That's understandable: a typical programming language consists of layers of historical baggage that are hard to navigate. I'm no exception here: even though I've looked at the LLVM and Python code multiple times in my career, it was always to solve a particular problem.
In practice, for languages written in C or C++, the code will be littered with macros, platform-specific branches and all sorts of optimizations. For higher-level ones it may be better, but you'd still be unlikely to understand even how the modules are organized and how they talk to each other without spending a significant amount of time.
I'd argue that studying the implementation of an existing mainstream language is the wrong approach if you'd like to understand the key mechanics.
Now, why should you care at all about these things? Isn't it the case that there are lots of great engineers who have never touched the internals of a compiler/interpreter? Sure, I'll grant you that. I would even say that I'm not a great engineer myself, so I can't be the judge. Every time I figure something out, it's usually a hard-won battle that takes _way_ longer than people around me realize.
Having said that, there are a few things that are special about programming languages that make them good learning material:
- They are multi-stage transformers: starting from lexing and ending with code generation. The stages are more or less independent, and writing them a few times will teach you how to compose programs out of "pure" blocks. This is something that you rarely see in mature codebases (unless, of course, they are written in OCaml/Haskell).
- They would teach you to think in trees. A compiler/interpreter is mostly a tree walker, whether for evaluation or optimization.
- And finally, you'd start to appreciate the cost of abstractions. You'd learn the costs of closures, different types of dispatch and memory access.
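To make the first two points concrete, here is a minimal sketch of such a pipeline in Python for plain arithmetic. It is an illustration under my own assumptions, not how any real language does it: the names `tokenize`, `parse`, `evaluate` and `run` are made up, the tree is just nested tuples, and the lexer silently skips characters it doesn't recognize.

```python
import operator
import re

# Stage 1: lexing. Text in, flat list of tokens out.
# (Unknown characters are silently skipped to keep the sketch short.)
def tokenize(source):
    return re.findall(r"\d+\.?\d*|[-+*/()]", source)

# Stage 2: parsing. Tokens in, a tree of nested tuples out.
def parse(tokens):
    tree, rest = parse_sum(tokens)
    if rest:
        raise SyntaxError(f"unexpected token: {rest[0]}")
    return tree

def parse_sum(tokens):
    left, tokens = parse_product(tokens)
    while tokens and tokens[0] in ("+", "-"):
        op = tokens[0]
        right, tokens = parse_product(tokens[1:])
        left = (op, left, right)
    return left, tokens

def parse_product(tokens):
    left, tokens = parse_atom(tokens)
    while tokens and tokens[0] in ("*", "/"):
        op = tokens[0]
        right, tokens = parse_atom(tokens[1:])
        left = (op, left, right)
    return left, tokens

def parse_atom(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    head, rest = tokens[0], tokens[1:]
    if head == "(":
        tree, rest = parse_sum(rest)
        if not rest or rest[0] != ")":
            raise SyntaxError("expected ')'")
        return tree, rest[1:]
    if head in ("+", "-", "*", "/", ")"):
        raise SyntaxError(f"unexpected token: {head}")
    return float(head), rest

# Stage 3: evaluation. A recursive walk over the tree.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(tree):
    if isinstance(tree, float):      # leaf: a number
        return tree
    op, left, right = tree           # inner node: (operator, left, right)
    return OPS[op](evaluate(left), evaluate(right))

# The whole "language" is just the three stages composed.
def run(source):
    return evaluate(parse(tokenize(source)))
```

Each stage is a plain function from one simple data structure to another, so you can test them separately: `parse(tokenize("1 + 2 * 3"))` gives `("+", 1.0, ("*", 2.0, 3.0))`, and `run("1 + 2 * 3")` gives `7.0`.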
For me personally, it has changed the way I write code to be more composable. My code now consists more of passing simple data structures around and transforming them with plain functions than of "objects".
The first language that I ever wrote wasn't even Turing-complete. I was just interested in plotting graphs of various mathematical functions and displaying them on the screen. So I came up with a simple in-place evaluator for the formulas: you fed it an array of points and it gave you back another array of points. You could then send them to a 2D surface for display.
Then, I invented a couple of macro languages of varying complexity. This was still way before I even discovered that there is a formalism for parsing grammars. My most vivid memory is a command console for a simple 3D game where you could experiment with the world without recompilation.
And then at some point, I discovered Lisp. I never quite enjoyed writing large programs in it, but it turned out to be exceptionally easy to implement. After reading about the basic building blocks, you can bring up a working version in Python in just a few days.
Sure, a production version of a Lisp dialect would be as hard as other languages, because it would include a JIT compiler and a generational garbage collector. But for educational purposes you don't need this. It is exactly the simplicity of the core idea that matters for a hobby implementation.
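To give a feeling for how small that core is, here is a rough sketch of a toy Lisp evaluator in Python. Everything in it is my own simplification for illustration: the names `tokenize`, `read`, `evaluate`, `global_env` and `run` are hypothetical, only integers and symbols are supported, and there are just three special forms (`if`, `define`, `lambda`).

```python
import operator

def tokenize(source):
    # Parentheses are the only syntax, so lexing is a one-liner.
    return source.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """Consume tokens from the front of the list and return one expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read(tokens))
        if not tokens:
            raise SyntaxError("missing ')'")
        tokens.pop(0)  # drop the closing ')'
        return items
    if token == ")":
        raise SyntaxError("unexpected ')'")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is a symbol

def evaluate(expr, env):
    if isinstance(expr, str):        # symbol: look it up
        return env[expr]
    if not isinstance(expr, list):   # number: evaluates to itself
        return expr
    head = expr[0]
    if head == "if":
        _, test, then, otherwise = expr
        return evaluate(then if evaluate(test, env) else otherwise, env)
    if head == "define":
        _, name, value = expr
        env[name] = evaluate(value, env)
        return None
    if head == "lambda":
        _, params, body = expr
        # The closure sees `env` plus its own arguments at call time.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    # Everything else is a function call.
    func = evaluate(head, env)
    args = [evaluate(arg, env) for arg in expr[1:]]
    return func(*args)

def global_env():
    # A handful of built-ins borrowed from Python itself.
    return {"+": operator.add, "-": operator.sub, "*": operator.mul,
            "<": operator.lt, "=": operator.eq}

def run(source, env):
    return evaluate(read(tokenize(source)), env)
```

With an environment from `env = global_env()`, running `(define fact (lambda (n) (if (< n 2) 1 (* n (fact (- n 1))))))` and then `(fact 5)` yields `120`. There is no error reporting, no strings, no tail calls and no garbage collector of its own, which is exactly the point: the parts you skip are the hard parts of a production implementation, not of the idea.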
Finally, there are many programming languages that are useful without being "fully-featured":
- text processing: sed, awk, and others
- templating: m4, jinja, etc.
- shells: bash, ash, etc.
- and many, many more
The case I'm trying to make here is that to have fun, you need to stay "small". Just pick a niche where you would have only a limited goal, and build from there.
If you need a practical guide, take a look at [Crafting Interpreters](https://craftinginterpreters.com/).
If you're looking for inspiration, check out this video: [The most beautiful program ever written](https://www.youtube.com/watch?v=OyfBQmvr2Hc).