X-Date: 2023-10-07T23:50:00Z
X-Note-Id: 4aa1fd3d-ff26-4662-ba20-4e8b5c9a3ad7
Subject: Statically linked Python interpreter
X-Slug: statically_linked_python_interpreter

This would be helpful for you if you want to run a Python program on a variety of Linux systems without modifying or repackaging it. For example, if you wrote a system administration or orchestration tool and just want it to work everywhere.

So far, the only feasible option was to write your tool in Go or Rust. Both can produce truly redistributable binaries, which has been one of their killer features. This is why many sysadmin tools and CLIs are written in Go. In some cases, however, you may not want to introduce another language to your stack (especially if you're already familiar with Python).

## Quickstart

If you just want to play with it, run a script from my [static-python](https://git.sr.ht/~knazarov/static-python) repository. You'd get a Python binary that you can just drop on any Linux box and it will work.

For an explanation, read on.

## What about the official instructions?

There's an article on the Python wiki called [Building Python Statically](https://wiki.python.org/moin/BuildStatically). It points in the right direction, but it is outdated. You'd need more than that to build the latest version of Python.

## How does it work

Python itself is written in C, so the first step is to get yourself a toolchain (a compiler and a set of libraries) suitable for embedding into the binary. The default glibc (which is normally shipped on Linux) is not meant to be used this way. So, you'd need something like [musl libc](https://musl.libc.org/). Thanks to Alpine Linux, which uses it by default, a lot of software has been fixed to work well with it. Alpine Linux itself links with musl libc dynamically, but it doesn't make a difference for compatibility.

Then, you'd need to compile all the standard libraries that Python depends on statically as well. This includes sqlite, ncurses, bzip2, and others.
Instead of .so (dynamic library) files, you'd get .a (static library) files, and a set of accompanying headers.

Then comes the difficult part of persuading Python to link with them. Currently, there are certain targets where Python is built statically, like WASM where it would be loaded into the browser. This is often used to build a web experimentation environment like the interactive console on [python.org](https://www.python.org/).

Unfortunately, some of the functionality that makes parts of Python link statically together is explicitly gated behind WASM flags. This is what we'd need to change.

First, if you look into [Modules/Setup.stdlib.in](https://github.com/python/cpython/blob/main/Modules/Setup.stdlib.in), you'd see that there's a configuration option `*@MODULE_BUILDTYPE@*`, which controls whether the standard modules should be built as shared or static libraries. The wiki page on Static Linking recommends manually modifying the generated files, but this is not needed in the latest version. What you need is to do this:

```
MODULE_BUILDTYPE=static ./configure
```

Passing `MODULE_BUILDTYPE=static` will switch all modules to be built statically.

If you do this, the build will fail further down the line. This is because some modules will still be built dynamically. `Setup.stdlib.in` even mentions this:

```
# Some testing modules MUST be built as shared libraries.
*shared*
@MODULE__TESTIMPORTMULTIPLE_TRUE@_testimportmultiple _testimportmultiple.c
@MODULE__TESTMULTIPHASE_TRUE@_testmultiphase _testmultiphase.c
@MODULE__TESTMULTIPHASE_TRUE@_testsinglephase _testsinglephase.c
@MODULE__CTYPES_TEST_TRUE@_ctypes_test _ctypes/_ctypes_test.c

# Limited API template modules; must be built as shared modules.
@MODULE_XXLIMITED_TRUE@xxlimited xxlimited.c
@MODULE_XXLIMITED_35_TRUE@xxlimited_35 xxlimited_35.c
```

This means that we have to disable those. Fortunately, they don't seem essential.
You'd likely lose compatibility with some of the modules on PyPI (especially ones that use 2to3), but that should not be too many of them, as Python 2 has been deprecated.

The testing modules can be disabled with a configuration option, `--disable-test-modules`, which already exists. However, the last two cannot. When compiling to WASM, they are disabled automatically, as the configure script detects the lack of `dlopen()`. But for a static binary, `dlopen()` is still there, so it keeps them. And there is no dedicated switch.

This is why I've created a small patch to the configure script: [staticbuild.patch](https://git.sr.ht/~knazarov/static-python/tree/master/item/staticbuild.patch)

This patch introduces an additional option, `--disable-xxlimited-modules`, which acts as an explicit instruction not to build and link those two modules.

In the end, this is the command line you'd use:

```
MODULE_BUILDTYPE=static ./configure --disable-test-modules --disable-xxlimited-modules
```

And after typing `make`, you'd get your statically linked Python.

I'm not quite sure why this is not fixed upstream yet. Maybe I should contribute a patch.

## NixOS static build environment for Python

This brings us to another interesting topic. As I said previously, setting up a static toolchain is not a trivial task. Doing it manually is very time-consuming and not very reproducible.

Recently I've learned that NixOS has support for cross-compilation, including cross-compiling to the same platform but with a static toolchain. You can read more about it [here](https://nix.dev/tutorials/cross-compilation#developer-environment-with-a-cross-compiler).

Essentially, it gives you a compiler and a vast number of packages already prepared properly as dependencies for your statically linked project. This makes it very easy to maintain a static build environment, and is why I've implemented the Python build script with Nix.
As of now, the standard Nix recipe for compiling Python doesn't cross-compile to static musl libc, but I plan to contribute my patch there, so in the future you would be able to grab the binary directly from Nix.

## Further advice

Just getting a static Python binary is probably not enough for you to comfortably run your software. You'd need to package all your code into one distributable archive and ship it with the static Python binary.

It can be done with one of these tools:

- [pex](https://pex.readthedocs.io), a packer for your entire virtualenv to a single executable archive (python binary will be external)
- [pyinstaller](https://pyinstaller.org), which can pack your code to an executable archive, and include the python binary along with it

Covering these tools is out of scope for this article; I'm mentioning them in case you want to take this further.

You may wonder why I didn't just use pyinstaller if it bundles the Python binary together with the code. This is because it will still depend on a couple of system libraries. The pyinstaller docs say it explicitly: you'd have to build a package for every major distribution this way. But if you use a static Python, you don't have to. So use my code in combination with pyinstaller; they are not mutually exclusive.
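
Before packaging anything on top of the resulting interpreter, it may be worth checking that the binary really is static. The following is a minimal sketch of one way to do that, not part of my build script: it reads the ELF program headers and looks for a `PT_INTERP` entry, which is how a dynamically linked executable names its loader. The function names (`elf_segment_types`, `requests_dynamic_loader`, `is_statically_linked`) are made up for illustration.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader


def elf_segment_types(data: bytes) -> list[int]:
    """Return the p_type field of every program header in an ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = data[4] == 2                # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little endian
    if is_64bit:
        # e_phoff lives at offset 32; e_phentsize and e_phnum at 54 and 56
        (phoff,) = struct.unpack_from(endian + "Q", data, 32)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 54)
    else:
        # 32-bit layout: e_phoff at 28; e_phentsize and e_phnum at 42 and 44
        (phoff,) = struct.unpack_from(endian + "I", data, 28)
        phentsize, phnum = struct.unpack_from(endian + "HH", data, 42)
    # p_type is the first 4-byte field of each entry in both layouts
    return [
        struct.unpack_from(endian + "I", data, phoff + i * phentsize)[0]
        for i in range(phnum)
    ]


def requests_dynamic_loader(data: bytes) -> bool:
    """True if the ELF image asks the kernel to run a dynamic loader."""
    return PT_INTERP in elf_segment_types(data)


def is_statically_linked(path: str) -> bool:
    """Read a binary from disk and report whether it lacks PT_INTERP."""
    with open(path, "rb") as f:
        return not requests_dynamic_loader(f.read())
```

Calling `is_statically_linked("./python")` from any working Python would then report on the freshly built binary (the path is an assumption; use wherever `make` put yours). Note that this only checks that no loader is requested. It says nothing about whether every standard module was actually compiled in.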