Reproducible medical imaging software environments in Nix, or, Living in the future is hard.

infrastructure deployment-environments nix

My journey creating a reproducible environment for building medical imaging machine learning models with Nix.

Chris Hammill
2023-02-28

I’ve been thinking for a long time about how to build software environments that are resilient to the ravages of time and that others can pick up and use effectively at their leisure. So when in the course of my work at DSAA I needed to develop an effective environment for doing machine learning on medical imaging data, I jumped at the chance. The fruit of that labour is:

https://github.com/LKS-CHART/medical-imaging-nix

a working environment of software, defined in software. It allows me or anyone else to reproduce my work or use the exact software computing environment I use. What do I mean by environment? In this case I mean isolated collections of programs, packages, and libraries that can be used together, but don’t interact with the rest of your software. The concept might be familiar if you’ve worked with renv, conda, or lmod.

The project is powered by nix, a futuristic technology gaining a foothold among certain groups of software developers, devops people, and programming language researchers1. Renowned for its ability to build software almost fully deterministically and reproducibly, it enables strongly isolated environments.

What is medical-imaging-nix?

The repo above is a nix project2 that generates a software environment that can be used as a starting point for future projects.

The environment contains R, python, jupyter, a suite of tools for working with medical imaging files, and pytorch and tensorflow with GPU support enabled. All3 the things a medical imaging data scientist needs to hit the ground running on a new project. It can be forked and expanded to include other kinds of dependencies; you could add julia4, rust, fortran, and many others with just a few lines of code.
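
For a flavour of what “a few lines” means, here is a minimal sketch of a dev shell in the standard flake schema. This is not the project’s actual flake, which is more involved:

{
  outputs = { self, nixpkgs }: let
    pkgs = nixpkgs.legacyPackages.x86_64-linux;
  in {
    devShells.x86_64-linux.default = pkgs.mkShell {
      # each extra language is one more entry in this list
      packages = with pkgs; [ julia rustc cargo gfortran ];
    };
  };
}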

The environment can be compiled into an apptainer container5 for portability to places that don’t have nix. To get started with the environment you need nix installed with flakes enabled, but then a new project can be initialized as simply as:

mkdir my-awesome-project
cd my-awesome-project
nix flake init -t github:LKS-CHART/medical-imaging-nix
nix develop

the first time you run this it will take a very long time, because it will build most of the software universe from scratch for you6. But after that all subsequent calls to nix develop will quickly drop you into your shiny new environment. If you’re ready to package up your environment into a container you can run:

nix build

this also takes a while, so start it before a meeting. But once that’s done you can send your environment image to anywhere it might be needed.
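
For instance, you might do something like the following, assuming the flake’s default output is a single apptainer image file (the file and host names here are hypothetical):

# copy the built image to a machine without nix and enter it
scp ./result user@cluster:medical-imaging.sif
ssh user@cluster apptainer shell medical-imaging.sif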

How does it work?

I decided to build the environment using nix7. Nix is at once a package manager, a build system, and a programming language. The programming language lets you write code that describes how to build software, the build system builds the software and caches it in a content-addressable store, and the package-manager-like features give you access to those packages. This allows you to generate supercharged versions of conda environments and renvs, and can even obviate the need to use docker. I’ll explain the advantages in a moment.
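
To make the “code that builds software” idea concrete, here is a toy derivation, not from the project, that you could save to hello.nix and build with nix-build hello.nix:

with import <nixpkgs> {};

runCommand "hello-demo" { } ''
  mkdir -p $out/bin
  echo '#!/bin/sh' > $out/bin/hello
  echo 'echo hello from the nix store' >> $out/bin/hello
  chmod +x $out/bin/hello
''

nix runs the script in isolation and caches the result under a content-addressed path like /nix/store/<hash>-hello-demo.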

The nix ecosystem also provides a curated set of packages referred to as nixpkgs, akin to conda packages or the debian packages available from apt. Nixpkgs is the largest set of packages provided by any package manager, so on paper building environments with nix should be easier than with any of the alternatives: you can just grab your packages from the massive set in nixpkgs.

But what’s different?

If you talk to users of nix and ask them why nix, the argument is rarely centered on the size of nixpkgs, although that is certainly a plus. You’re more likely to hear about how the builds are isolated (they don’t interact with the rest of your software), how the builds are specified in a single programming language, and how versioning of the entire repository can be done through git. These are no mean feats.

Building software is really hard. If you’ve ever wanted to build someone’s project from source, especially a complicated modern project, it can be a trying ordeal. Software can be built with make, cmake, qmake, autotools, setuptools, R CMD INSTALL, ninja, bazel, shake, and many many more build systems and tools. The extremely brilliant devs who designed the nix build system found ways to hook into most of the other common build systems, and then ecosystem contributors used those tools to create reproducible recipes to build each of the 81k+ packages in nixpkgs. Adding your own additional packages using the nix language and build system is relatively simple8, so as your project grows it can absorb new packages into your declaratively specified software environment9.
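
As a sketch of what adding a package looks like, a recipe for a hypothetical python package absent from nixpkgs might be (the package name is made up and the hash is a placeholder):

{ lib, buildPythonPackage, fetchPypi, numpy }:

buildPythonPackage rec {
  pname = "some-imaging-tool"; # hypothetical package
  version = "1.2.3";

  src = fetchPypi {
    inherit pname version;
    sha256 = lib.fakeSha256; # swap in the real hash reported on the first build attempt
  };

  # runtime dependencies come from the same declaratively pinned package set
  propagatedBuildInputs = [ numpy ];
}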

Once you’ve built the software, you then have to worry about dynamic dependencies. Dynamically linked C programs rely on dependencies that are hanging around on disk. When you run the program, the dynamic linker goes and finds the libraries providing each feature you use; if the dependencies providing those features have changed on disk, your results could be different. Dynamically linking C programs/libraries relies on a byzantine collection of different types of mutable global state: your system default library locations (global state, disk-backed), LIBRARY_PATH/LD_LIBRARY_PATH (global state, shell-environment-local and disk-backed), and pkg-config, which is sometimes used to configure linking (disk-backed global state pointing to more disk-backed global state). Global state has the potential to do what state does and change. This means your results are balanced on a house of cards where shifting one part of the state could break them.

Don’t believe me that this is a real issue? What about silently getting the wrong results from your numerical code because you had LD_LIBRARY_PATH or update-alternatives set incorrectly? Oof.

Or not being able to build your software at all because one of your package managers got in the way. This may sound like puritanical nerd worries, but these are real issues I’ve experienced in practice. I was soured on conda long ago when merely having it on my executable path broke my ability to build the R package I was developing for work. What happened? Conda’s addition to my PATH overrode my system h5cc, a compiler wrapper for building C libraries that depend on HDF5; this linked in the wrong version of HDF5 and prevented my R package from building. I had not asked conda for h5cc, it was pulled in as a dependency of some arbitrary conda package I was using. It wasn’t just conda’s fault, the R package itself was an eldritch horror of an autotools build, but ditching conda was easier than fixing the R build to ignore conda’s h5cc.

Nix gets around these issues by avoiding the system default libraries wherever possible, avoiding LIBRARY_PATH/LD_LIBRARY_PATH wherever possible10, and making the places where state is unavoidable immutable (your nix store of built things is read-only). So builds are hermetic and isolated, and you can happily have multiple versions of the same C library running around without paying any extra attention to where your C dependencies are coming from. This means you don’t need a separate docker container to have an alternate universe of C libraries to make sure your analysis works; you just have it beside all your normal stuff, and that’s revelatory when you’ve been bitten by these problems enough times.
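
You can see this isolation for yourself with ldd; the store hashes below are elided placeholders, but the shape of the output is the point:

# inspect the dynamic dependencies of a nix-built binary
ldd /nix/store/<hash>-python3-3.10.9/bin/python3
# every library resolves to a pinned, read-only store path, e.g.
#   libc.so.6 => /nix/store/<hash>-glibc-2.35/lib/libc.so.6
# rather than to /usr/lib or wherever LD_LIBRARY_PATH happens to point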

These builds, instead of being an imperative sequence of commands that build and install software to a specific place, are written in the nix programming language, a functional language designed to make it easy to modify and tweak builds and dependencies so that your software environment is fully specified in code.

By contrast, conda has 8000 prebuilt packages and ubuntu offers an admirable 36k. These builds assume things about the directory layout of your system, and they can be broken by updating system packages using another package manager.

Nix also empowers you to be your own package repository; significant effort has gone into making builds fully deterministic where possible. This means once I’ve built “pySweetDataToolR.jl” I can give it to you: if we’re on the same architecture you can just slot the relevant parts of my nix store into yours. For a medium or large organization, you can set up a global cache of nix builds on a server that each user can download from. For smaller orgs you might be able to get away with a single nix store that everyone shares. No more N numpys per employee.
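
In practice, that sharing can be as simple as the following (the server name is hypothetical):

# push a locally built environment to a shared machine's nix store
nix copy --to ssh://build-cache.example.org ./result

# a colleague pulls the same build instead of recompiling it
nix copy --from ssh://build-cache.example.org /nix/store/<hash>-my-env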

Where’s the rub?

So far I might have sounded effusive, if not fanboyish, about the space alien wizard technology that can replace apt, conda, renv, etc., but there are real and significant sharp corners to nix and nixpkgs, especially for data science. First off, while nixpkgs includes every single package available on CRAN at the time of the last snapshot, its coverage of pypi is piddly. Nixpkgs includes ~5200 python3.10 packages, whereas pypi has ~432k packages11. In order to put together the data science environment I built, I needed to package or modify 33 python packages: some medical imaging related, some for working with jupyter notebooks, some machine learning related. While generally not very challenging once I got the hang of it, some were quite thorny to package. Most are properly built from source, but some are just a thin wrapper around the wheels available on pypi, which defeats the purpose of nix12. Nix also encourages you to run the full test suite for your packages, but test code is often scrubbed from packages on pypi, so unless you get the code from github you may not have the tests. And you might need to disable some tests anyway: nix’s build environment is immutable and builds are disallowed from downloading supplemental data by default, which sometimes breaks test code, so either you patch the tests yourself or you turn them off (guess which you choose if you’re pushing for a deadline).
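
For example, turning tests off looks something like this overlay fragment; nibabel stands in as an arbitrary example here, this isn’t lifted from the project:

final: prev: {
  python310 = prev.python310.override {
    packageOverrides = pyFinal: pyPrev: {
      nibabel = pyPrev.nibabel.overridePythonAttrs (old: {
        # the tests want to download data, which nix builds forbid
        doCheck = false;
      });
    };
  };
}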

So life in the future is tough, because you become part of the team building it. The future, I mean. Since starting my nix journey several years ago I’ve contributed code to nixpkgs more than a few times, but when you are under pressure to achieve actual business goals it can be very frustrating to have to solve these problems yourself. What’s worse is that I still feel like a beginner with nix despite having years of experience. Others I’ve encountered like the documentation, participate in the discourse or IRC channels, and feel comfortable using nix; my experience hasn’t been so pleasant, with most issues I’ve encountered feeling almost ungoogleable. Different learning styles, I guess?

The other place I’ve found nixpkgs frustrating to use for python is upgrading. Unless you’re maintaining your own branch of nixpkgs you are somewhat at the whims of other nixpkgs contributors as to what gets upgraded when. And due to a potentially poor choice of strategy, the most up-to-date version of nixpkgs may contain a large set of temporarily broken python packages. Often upgrading one package will break many packages that depend on it, not only because the code becomes incompatible, but because it is fashionable in python packaging to set strict version upper bounds, so the dependent package won’t even build (and we can’t check whether all the tests still pass with the new version). So I find myself checking out old versions of files from nixpkgs to build my package overlays when I need to downdate a certain dependency. This is tedious, and I should probably switch to maintaining my own version of nixpkgs with my downdates and modifications, but that makes it harder to share with others.
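
One way to downdate without forking all of nixpkgs is to pin a second, older nixpkgs as an extra flake input and pull the offending package from there. A minimal sketch, with an illustrative revision and package:

{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    nixpkgs-old.url = "github:NixOS/nixpkgs/nixos-22.05"; # illustrative pin
  };

  outputs = { self, nixpkgs, nixpkgs-old }: let
    pkgs = nixpkgs.legacyPackages.x86_64-linux;
    pkgsOld = nixpkgs-old.legacyPackages.x86_64-linux;
  in {
    devShells.x86_64-linux.default = pkgs.mkShell {
      # most things come from the current snapshot, one dependency from the pin
      packages = [ pkgs.git pkgsOld.python310Packages.nibabel ];
    };
  };
}

Mixing python packages from two snapshots like this has its own sharp edges, since the two package sets don’t know about each other; that’s the merging problem I come back to below.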

So this points to the biggest advantage of conda over nix: when conda does not have a package, it gets it from pypi. Nixpkgs cannot fail over to grabbing from pypi and installing with pip. It also can’t do dependency solving: if the nixpkgs version of a python lib isn’t compatible with another, you have to go find a satisfactory version yourself, or cheat and lie about the version requirements (I was reminded after the first draft of this post about the existence of pythonRelaxDeps, a way to automatically soften version requirements; I may switch to using this in the future). This is painful. There are two nix projects that have aimed to address this problem: mach-nix, which has been abandoned13, and poetry2nix, which may solve some of my woes but which I haven’t tried yet.
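
For the curious, the pythonRelaxDeps escape hatch looks roughly like this inside a package override (somePackage and the relaxed dependency are placeholders):

pyPrev.somePackage.overridePythonAttrs (old: {
  nativeBuildInputs = (old.nativeBuildInputs or [ ])
    ++ [ pyFinal.pythonRelaxDepsHook ];
  # strip the strict upper bound that would otherwise fail the build
  pythonRelaxDeps = [ "numpy" ];
})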

Another disadvantage, alluded to already, is speed. Compiling things takes time, so unless you’re always getting prebuilt binaries from the nix servers you can be in for long build times. Nix is good at not duplicating build work, but sometimes rebuilding is unavoidable. Say you want to use a newer version of cuda, or gcc: most of your environment will need to be rebuilt14.

Finally, the last difficulty is with irreducible system dependencies. When you’re not running the nix operating system, nixpkgs does not interact well with graphics drivers. There is a wrapper project I use called NixGL which gives you access to graphics drivers and allows you to run programs that use CUDA15. However, this has meant I needed to hardcode my graphics driver into the nix flake, severely hindering portability16.
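
Concretely, with NixGL available you launch programs through a driver-specific wrapper; the programs here are just examples:

# the wrapper you need depends on your driver
nixGLNvidia python train_model.py # proprietary NVIDIA driver (CUDA)
nixGLIntel jupyter notebook       # mesa / intel graphics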

Is this worth it?

I don’t know. I’ve sunk considerable cost into building this environment so one might reasonably expect me to be quite biased.

Would cutting down my effort by 75% at the risk of some amount of computational irreproducibility and unportability be worth it?

Is this a case of good enough practices trumping best practices?

I’m not sure, but now that it’s built I know that I will be able to revisit this exact environment for many years to come. I’ve acquired the skills to fix package sets as issues arise, so it’s no longer a large burden to develop. If I can iron out a few more details about merging python package sets from multiple versions of nixpkgs, I would say this is substantively superior to managing my software with conda, apt, renv, and docker.

Would I encourage others to adopt the strategy of creating a nix environment from scratch for each new project? Probably not; at least, don’t go it alone. Sharing working flakes and overlays for data scientists to springboard off of makes this type of reproducible, portable, futuristic software versioning possible17, but keep your novelty budget in mind: you might be wise to pick boring technology.

Future directions

My plans are to continue improving and refining this environment. Future directions I plan to investigate are:

- trying poetry2nix for handling python dependencies
- ironing out merging python package sets from multiple versions of nixpkgs
- adding a docker image output alongside apptainer
- removing the need to hardcode the graphics driver in the flake

I, like many others, think that approaches like nix will become more prevalent in the future, and it’s nice to get a taste of it now, even if there are pain points.

Acknowledgements

Thanks to Ben Darwin, Chloe Pou-Prom, Meggie Debnath, Dimuth Kurukulaarachchi, and Derek Beaton for providing useful feedback on drafts of this post.



  1. a lot of ink has been spilled detailing how nix works and why you should use it. My treatment will be relatively superficial, instead focusing on my efforts and challenges using it in practice. If you are interested in the former I recommend:
    - https://serokell.io/blog/what-is-nix
    - https://revelry.co/insights/development/nix-time/ ↩︎

  2. specifically a nix flake, something like a library or package, useful for combining different projects that use nix↩︎

  3. well, most anyway↩︎

4. although it is recommended that you manage julia packages outside of nix.↩︎

  5. I will probably switch to or add docker before long↩︎

6. including pytorch and tensorflow, so let it run overnight. This is mostly due to CUDA, as I explain a bit later.↩︎

  7. Nix can mean a lot of things: https://www.haskellforall.com/2022/08/stop-calling-everything-nix.html↩︎

  8. once you’ve conquered learning it.↩︎

  9. this is also a pain point, which I’ll talk about later. But it’s painful only because nix allows you to do something that essentially no one would even try to do with other package managers.↩︎

10. it does this by patching the produced compiled artifacts to point to their “dynamic” dependencies statically using the rpath mechanism↩︎

  11. to be fair, pypi doesn’t have package quality standards, many of these are abandoned or malware↩︎

  12. by introducing possible system dependencies and portability issues, although I would argue incrementally better is still better, a few risky packages is better than all risky packages.↩︎

  13. although hopefully returning as part of https://github.com/nix-community/dream2nix in the future↩︎

  14. a colleague suggests that we create a build farm where a background process builds new versions of things so that there are prebuilt versions always available, but that is an infrastructure investment that hasn’t felt worth it yet. Apparently the issue is that CUDA is technically not free software and by policy the nix build farms won’t build it for us. ↩︎

15. you need to start programs with nixGL<driver> <program>, which is irritating, but I think this can be fixed in my flake↩︎

16. Ok, you’d only need to edit one line in the flake, but that’s enough of a barrier to discourage users. If anyone in the nix community can help me solve this problem I’d be extremely grateful.↩︎

  17. and of course contributing back to nixpkgs where you can↩︎