nix-vm-test: reproducible integration tests

We released nix-vm-test, a test framework to quickly spin up a virtual machine and run tests using a single CLI command.

The NixOS Linux distribution has shown us that using a combination of Nix as a test driver and QEMU VMs for isolation is a powerful concept. It makes it simple to spin up fresh environments while keeping the variance between runs low. This approach is used extensively to test many parts of the distro and is at the heart of the project's velocity.

So when we wanted to test system-manager, the natural thing to do was to port this approach to other Linux distributions. It allows us to test that the changes system-manager brings to Ubuntu are correct. Every run gets a fresh VM, so there is no fear of breaking something. And the whole setup is very simple: (1) install Nix, (2) write a bit of Nix code, (3) run nix build .#your-tests to exercise the tests.

Now the test harness has been extracted from system-manager and made generally available. We also added support for Debian and Fedora on top of the existing Ubuntu support, and took the opportunity to clean up the API a bit.

A minimal example

To give you a sense of what technical usage looks like, here is a synthetic example that shows the test framework in action:


  {
    # Load the dependencies
    inputs = {
      nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
      nix-vm-test.url = "github:numtide/nix-vm-test";
    };

    outputs = { self, nixpkgs, nix-vm-test }:
      let
        lib = nix-vm-test.lib.x86_64-linux;
        # Create a test for Debian 13
        myTest = lib.debian."13" {
          sharedDirs = {
            dir1 = {
              # This makes the current folder available in the test at /tmp/dir1
              source = "${self}";
              target = "/tmp/dir1";
            };
          };
          # A synthetic test
          testScript = ''
            # Wait for the system to be fully booted
            # (the unit name used here is an assumption)
            vm.wait_for_unit("multi-user.target")
            # Test that the mount worked
            vm.succeed('ls /tmp/dir1')
            # Test that flake.nix contains the string "WOOT"
            vm.succeed('grep WOOT /tmp/dir1/flake.nix')
          '';
        };
      in
      {
        # Run the sandboxed test with `nix flake check`
        checks.x86_64-linux.myTest = myTest.sandboxed;
        # Spin up an interactive environment with `nix run .#`
        packages.x86_64-linux.default = myTest.driverInteractive;
        # Run the non-sandboxed environment with `nix run .#myTest`
        packages.x86_64-linux.myTest = myTest.driver;
      };
  }

In the above example, we re-use the same synthetic test for all three modes.

Here is how much time it takes to run the sandboxed mode:

$ time nix flake check -L
vm-test> (finished: waiting for unit, in 0.06 seconds)
vm-test> vm: must succeed: ls /tmp/dir1
vm-test> vm # [  OK  ] Finished systemd-update-utmp-runle…e - Record Runlevel Change in UTMP.
vm-test> (finished: must succeed: ls /tmp/dir1, in 0.02 seconds)
vm-test> vm: must succeed: grep WOOT /tmp/dir1/flake.nix
vm-test> (finished: must succeed: grep WOOT /tmp/dir1/flake.nix, in 0.01 seconds)
vm-test> (finished: run the VM test script, in 15.58 seconds)
vm-test> test script finished in 15.72s
vm-test> cleanup
vm-test> kill machine (pid 8)
vm-test> vm # qemu-kvm: terminating on signal 15 from pid 5 (/nix/store/y027d3bvlaizbri04c1bzh28hqd6lj01-python3-3.11.7/bin/python3.11)
vm-test> (finished: cleanup, in 0.05 seconds)
vm-test> kill vlan (pid 6)

real	0m20.720s
user	0m0.956s
sys	0m0.288s

And here is what it looks like with the interactive Python console:

$ nix run .#
Machine state will be reset. To keep it, pass --keep-vm-state
start all VLans
start vlan
running vlan (pid 2563436; ctl /tmp/vde1.ctl)
(finished: start all VLans, in 0.00 seconds)
additionally exposed symbols:
    start_all, test_script, machines, vlans, driver, log, os, create_machine, subtest, run_tests, join_all, retry, serial_stdout_off, serial_stdout_on, polling_condition, Machine

Complete example

A full example would be a bit large for this blog post, but have a look for yourself here:

In this example we test the installation of the Garage project package across a test matrix of different distros and versions.
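
To give a rough idea of what such a matrix can look like, here is a minimal sketch. It is not the Garage example itself. It reuses the synthetic test from above and assumes that each distro is exposed the same way as lib.debian."13". The names vmLib, matrix, mkTest and tests are made up for illustration, and the Ubuntu and Fedora version keys are placeholders that may not match the keys the framework actually provides.

```nix
{
  inputs = {
    nixpkgs.url = "github:nixos/nixpkgs?ref=nixos-unstable";
    nix-vm-test.url = "github:numtide/nix-vm-test";
  };

  outputs = { self, nixpkgs, nix-vm-test }:
    let
      vmLib = nix-vm-test.lib.x86_64-linux;

      # Distro -> versions to test. Only debian."13" comes from the example
      # above; the other version keys are assumed placeholders.
      matrix = {
        debian = [ "13" ];
        ubuntu = [ "23_04" ];
        fedora = [ "39" ];
      };

      # One test definition shared by every cell of the matrix.
      mkTest = distro: version: vmLib.${distro}.${version} {
        sharedDirs = {
          dir1 = {
            source = "${self}";
            target = "/tmp/dir1";
          };
        };
        testScript = ''
          vm.succeed('ls /tmp/dir1')
        '';
      };

      # Flatten the matrix into { "debian-13" = <test>; "fedora-39" = <test>; ... }.
      tests = builtins.listToAttrs (
        builtins.concatLists (
          nixpkgs.lib.mapAttrsToList
            (distro: versions:
              map
                (version: {
                  name = "${distro}-${version}";
                  value = (mkTest distro version).sandboxed;
                })
                versions)
            matrix));
    in
    {
      # `nix flake check` then runs one sandboxed VM test per matrix cell.
      checks.x86_64-linux = tests;
    };
}
```

Keeping the matrix as plain data means adding a distro or a version is a one-line change, and every cell still gets its own fresh VM.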

Future work

The project is stable in its current condition, but could do with a bit more work around:

  • Documentation.
  • Add support for more Linux distributions and other OSes.
  • Add a network layer to allow cluster testing.

We hope to get around to it during our next contract! (hint hint)


Hopefully this post gives you a sense of what this project is useful for, what it can do, and how to use it, and makes you curious to try it out in one of your next projects.