The Dune retreat, organised by Jane Street, is an event where the developers of Dune meet for a week to hack on it together. The Dune team has grown quite a lot and is now a mix of developers from Jane Street, OCaml Labs, and Tarides; developers from companies that are using or looking at using Dune, such as LexiFi or CEA; developers from friendly communities such as Coq; and open source contributors. The event is a great way for all of them to get together.

Just like for the first retreat, we all came out of it recharged, with our heads full of ideas to fuel the development of Dune for the year to come. Here is a summary of the things we discussed, worked on, and are continuing to work on.

More dynamism!

The initial architecture of Dune was as follows:

  • Step 1: Gather all the configuration (command line, dune files, etc…)
  • Step 2: Generate all the build rules
  • Step 3: Execute the build, i.e., follow the build DAG and execute commands
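
Concretely, a hypothetical OCaml skeleton of this architecture could look like the following; every name in it is an illustrative stub, not Dune's actual API:

(* Hypothetical skeleton with stubbed-out phases. Note that the full
   set of rules is computed before any command runs, so the output of
   a command cannot feed back into rule generation. *)
type config = { profile : string }
type rule = { targets : string list; command : string }

let read_configuration () = { profile = "dev" }        (* Step 1 *)
let generate_rules (_ : config) : rule list = []       (* Step 2 *)
let execute_build rules =                              (* Step 3 *)
  List.iter (fun r -> ignore (Sys.command r.command)) rules

let () = execute_build (generate_rules (read_configuration ()))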

Since then, the design has evolved a bit but is still mostly like this. In particular, we currently cannot use the result of executing a build command to produce more rules. While this is fine for the majority of projects, it can be quite limiting. We have now reached a point where it is clear that we cannot continue like this. The extra flexibility is needed in a number of cases, such as the linked build contexts feature discussed later in this post.

The main difficulty is a well-known one for OCaml developers: we need to spread the monad. Indeed, the parallelism at Step 3 is obtained by using a concurrency monad. We traverse the build DAG and systematically fork execution when descending into the children of a node, thus spawning many lightweight threads that all try to start external commands. It is then the scheduler that throttles the execution of these commands according to the -j option passed to dune.

So if we want to use the result of a command, then we have to be in the monad. However, this monad is currently only used at Step 3. So if we want to compute more things dynamically, then we have to spread the monad through the rest of the codebase.

There is obviously a potential performance concern: we are turning direct code into monadic code, creating more closures, putting more pressure on the GC, etc. While we will certainly benchmark this change, we are not too concerned about it slowing down Dune. The monad we use in Dune is called Fiber and is very light. Unlike Async or Lwt, a Fiber.t can only be used once, and sharing or forking must be explicit. As a result, Fiber.t is pretty much just a CPS monad and should be a piece of cake for the compiler to optimise away.
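
To illustrate why we expect it to be cheap, here is a toy version of such a one-shot CPS monad; this is a minimal sketch, much simplified compared to the real Fiber library:

(* A toy one-shot CPS concurrency monad in the spirit of Fiber;
   hypothetical and deliberately simplified. *)
module Toy_fiber = struct
  (* A fiber is just a function expecting a continuation. *)
  type 'a t = ('a -> unit) -> unit

  let return x k = k x
  let bind t ~f k = t (fun x -> f x k)

  (* Forking is explicit: run both sides, and resume the continuation
     once both results have been delivered. *)
  let fork_and_join fa fb k =
    let a = ref None and b = ref None in
    let check () =
      match !a, !b with
      | Some x, Some y -> k (x, y)
      | _ -> ()
    in
    fa () (fun x -> a := Some x; check ());
    fb () (fun y -> b := Some y; check ())
end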

Rudi Grinberg from OCaml Labs has started working on this gargantuan refactoring.

Linked build contexts

Dune allows building against multiple configurations simultaneously. For instance, in a single invocation of Dune you can build:

  • against a 4.08 compiler
  • against a 4.09 compiler
  • against a 4.09 compiler with flambda
  • against a Windows cross-compiler

At the moment, the configuration of each build context is determined once and for all at the beginning of Dune’s execution. This is, however, too limited for complex workflows such as the ones used by the MirageOS community. In that world, one needs to build and run user code in order to determine the configuration of a subsequent build context in which the target “unikernels” are built.

To support such workflows, we are planning to allow configuring a build context using programs built in another build context. For instance, Dune will support obtaining the default C flags for building esp32 unikernels by running a command that was itself built in a separate context.

More generally, the concept we want to introduce is that the public elements of one build context will be visible to another build context, a bit like if we ran a first build, installed the artefacts, and then ran another build using a different configuration, except that all these steps will be transparent and orchestrated by Dune itself.
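
To give a flavour of the idea, a dune-workspace file with linked contexts might look something like this; the syntax below is purely illustrative, as the actual design is not settled:

; dune-workspace (hypothetical syntax)
(context
 (default
  (name host)))

(context
 (default
  (name esp32)
  ; compute the C flags by running a program built in the host context
  (c_flags (run_in_context host %{bin:esp32-config} --cflags))))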

This work will also make it possible to encode a full bootstrap build of the OCaml compiler with Dune.

We discussed this feature during the retreat, but the work is currently blocked on spreading the fiber monad.

Distributed shared cache

When using a distributed shared artefact cache, Dune will prefetch artefacts into the local shared cache so that they are ready when it needs them. If they are not present when needed, it will simply run the build command locally, creating a race of sorts between the CPU and the network. For the distributed cache to be efficient and speed up builds, the prefetch heuristics need to be good.

One idea would be to get help from the VCS, or in the case of Jane Street, from Iron, so that when working on a PR/feature the system would tell us which artefacts from the distributed cache we are the most likely to need. This idea is, however, quite complex to put in motion as it would require coordination between various systems. It might also be computationally expensive.

Another idea is the following: we can decompose the execution of a rule into the three following steps:

  • Step 1: Dune knows an over-approximation of the dependencies and needs to build a bit more in order to know the exact set of dependencies (for instance, it needs to call ocamldep)
  • Step 2: Dune knows the exact set of dependencies, the exact command to run, etc…
  • Step 3: once all the dependencies are built, Dune fires the rule

At Step 2, we know exactly what we need and can send a precise prefetch request to the shared cache. Unfortunately, it is not clear that Step 2 happens soon enough to be useful in practice: it might happen quite close to Step 3, meaning that the CPU would win the race against the network.

However, in Dune, Step 1 happens very early and the over-approximation is actually not that big compared to reality. So the idea is to compute a key for the shared cache that takes into account:

  • the current configuration (such as flambda, version of OCaml, …)
  • an over-approximation of all the source files a rule transitively depends on

This key is overly specific: changing one comment in a file would change many such keys, even though it wouldn’t change the key computed at Step 2. However, because we know it very early, we might as well try it, as a hit on this key will almost certainly give us what we need. We hope that using prefetch hints at Steps 1 and/or 2 will be enough to find good prefetch candidates.
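
As a sketch, computing such a key could look like the following; this is hypothetical code using the stdlib’s Digest module, not Dune’s actual implementation:

(* Hash the configuration together with the contents of the
   over-approximated set of source files. *)
let early_prefetch_key ~config ~over_approximated_deps =
  let file_hashes =
    over_approximated_deps
    |> List.sort String.compare  (* a stable order gives a stable key *)
    |> List.map (fun file -> Digest.to_hex (Digest.file file))
  in
  Digest.to_hex (Digest.string (String.concat ";" (config :: file_hashes)))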

The distributed shared cache is being developed by Quentin Hocquet from Tarides.

Splitting out the daemon

Right now, when using the shared cache, Dune can either promote things to the shared cache itself or talk to a daemon. This daemon is currently part of Dune itself. However, to add support for the distributed shared cache, we need more library dependencies, such as Irmin. Since we cannot add these dependencies to Dune, we plan to extract the daemon into a separate project; only the “direct” mode, where Dune operates without a daemon, will remain in the main repository.

This means that after installing Dune, users will only be able to use the local shared cache and will need to install an additional package in order to use a distributed cache, which seems fine.

Runtime data files and relocation

Whenever possible, we build and distribute self-contained binary applications. However, this is not always possible, and an application might need to read additional data files at runtime or even load plugins. Doing so involves a lot of boilerplate to deal with questions such as:

  • where to find the files once the application is installed
  • where to find the files during testing before the application has been installed
  • in the case of plugins, how to handle dependencies between plugins
  • whether, once installed, the application and its data can be moved to another location, i.e., whether the installation is relocatable

François Bobot from CEA is very much interested in this question for Frama-C, as they are currently porting their old build system to Dune. We discussed a proposal, and he has been working on implementing it. This feature will allow packages to declare an “installation site” where the package itself and other packages can install data files or plugins. Dune will then take care of the low-level details and expose a high-level interface for the application to load data files or plugins.
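
As a rough illustration of what such a high-level interface could let an application do (all names below, in particular the My_sites module, are hypothetical, not the actual proposal):

(* Hypothetical module that the build system would generate, telling
   the application where its "plugins" site lives, whether it runs
   from the build tree or from an install tree. *)
module My_sites = struct
  let plugins = [ "/usr/lib/my_app/plugins" ]
end

(* Load every plugin installed in this package's "plugins" site. *)
let load_plugins () =
  List.iter
    (fun dir ->
      Array.iter
        (fun file ->
          if Filename.check_suffix file ".cmxs" then
            Dynlink.loadfile (Filename.concat dir file))
        (Sys.readdir dir))
    My_sites.plugins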

Better vendoring workflows

Historically, OCaml packages were black boxes with three buttons on them:

  • ./configure
  • make
  • make install

So if you wanted to use a third-party library in your project, you had to install it first. This was generally done via a package manager such as opam.

Dune opened the door to an alternative workflow: when you put two projects side by side, Dune is able to open them and understand them as a whole, as if they were a single project. This means that vendoring, i.e., importing the source code of a dependency into your own project, is very easy with Dune. More and more users are making use of this feature for development because it simplifies their workflow.

Indeed, in the past, when you submitted a pull request to a project, you had to wait for that project to make a new release before you could start using the change in your own project. You could use “pins” instead, but that was tedious and didn’t scale well. With Dune, users can temporarily or permanently vendor their dependencies to get going until the release.
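
For instance, if the imported sources live under a vendor/ directory, they can be marked as vendored in a dune file; Dune then builds them as part of the project while relaxing warnings for them and not running their tests as part of yours:

; in the dune file at the root of the project
(vendored_dirs vendor)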

This all works well, but there are a few points of friction around releasing a package with vendored dependencies in opam. In some cases it is preferable to use the embedded vendored code, for instance when publishing a binary; in others it is preferable to ignore it and use what opam installed instead. This is not very well supported by Dune at the moment.

We discussed various solutions and have at least one concrete proposal. However, because it is possible to work around this limitation by hacking the opam build commands, we decided to wait until a standard workflow has emerged before committing to a particular one in Dune.

GraphQL introspection

Etienne Millon from Tarides is adding support for querying various information about a Dune project via GraphQL, a standard query language developed by Facebook. With this feature, it will be easy for third-party applications to extract information out of Dune.

Using GraphQL would be much easier than, say, reading the dune files and interpreting them ourselves.
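
For example, a third-party tool could list the libraries a project defines with a query along these lines; the schema shown is hypothetical, as the actual one is still being designed:

# hypothetical schema, for illustration only
query {
  libraries {
    name
    modules {
      name
    }
  }
}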

Modern CI integration

A few people from OCaml Labs and Tarides are working on ocaml-ci, a modern configuration-less CI for OCaml projects based on OCurrent. ocaml-ci will rely on standard dune invocations for various tasks such as building, testing, and checking formatting, so that users have nothing to configure.

Up to now, users have relied on CI scripts using opam. These require configuration and are not always very flexible; for instance, testing multi-package projects tends to be difficult. The new CI system will be simpler and will just work out of the box.

Craig Ferguson from Tarides, who is working on this project and attended the retreat, gave us a demo of the system and showed various interesting aspects of its implementation, such as a nice DSL they use to generate Dockerfiles.

Mdx

Nathan Rebours from Tarides has been working on adding mdx support to Dune. Mdx is a tool that allows users to have verified OCaml toplevel and shell fragments in markdown and mli files, making it a good tool for writing and maintaining up-to-date documentation examples. At the moment, using mdx with Dune requires writing and maintaining tedious custom rules.

With this new feature, all the user will have to do is write an mdx stanza. A follow-up to this work will be to improve the way mdx and dune interact, to make the overall system simpler and faster.
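
The stanza might look something like the following; the exact shape may differ in the released version:

; hypothetical; the exact fields may change before release
(mdx
 (files README.md))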

As part of this work, Nathan will look at unifying the native and bytecode versions of the OCaml toplevel libraries, which should benefit a lot of people.

Cram tests

Since the beginning, we have been writing a lot of integration tests for Dune using a small cram test tool written in OCaml. Such tests are very simple to read, write, and work with. And because their syntax is simple and close to the shell commands a lot of people already know, it is easy for contributors to write tests.
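
For the curious, a cram test is just a text file interleaving commands, prefixed with “$”, and their expected output; here is a minimal made-up example, not one of Dune’s actual tests:

  $ echo a > file.txt
  $ cat file.txt
  a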

Other projects are interested in writing such tests, so I have been working on generalising the cram test tool so that we can eventually publish it.

One aspect I am looking at before releasing the tool is reducing the boilerplate to a minimum. Right now, to run a cram test one has to write at least the following:

(rule
 (progn
  (run dune-cram run %{dep:run.t})
  (diff? run.t run.t.corrected)))

I’d like to make the last line unnecessary by adding a general mechanism for commands to report additional information to dune, in this case that they produced a correction. If the required dune boilerplate becomes so simple that a dedicated cram stanza is unnecessary, then it will be easy for users to develop their own build or test tools that integrate nicely with Dune without Dune having to know about them.

Polling mode

We didn’t particularly discuss this topic during the retreat, but since this post mentions many of the big new things that will happen during the year, I thought it would be good to give an update on it as well.

For a while, dune has supported a “watch” mode in which it watches the file system and automatically rebuilds when something changes. While this works, the implementation is not much more than:

while true; do
  dune build
  wait_for_change   # block until the file system reports a change
done

This means that it doesn’t scale well to large repositories. At Jane Street, we are particularly interested in making it scale, since we want to use Dune on our massive internal repository. So we have been working for a while on a principled system to make polling builds fast and scalable. This work started about two years ago, when we introduced a new computation model with Rudi Horn, who interned at Jane Street. It continued last year with my colleague Arseniy Alekseyev, who pushed the system forward and integrated it deeper into the core of Dune. And it is now Andrey Mokhov who is putting the last stone on the edifice: he has been working on literally “purifying” the code of Dune, i.e., removing all side effects that are not tracked by the memoisation system.
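
To give an idea of what “tracked” means here, the sketch below shows a toy memoisation cell that records its dependents, so that a file-system event can invalidate exactly the computations that depend on it; this is hypothetical code, far simpler than Dune’s actual memoisation system:

(* A cell caches a computed value and remembers who read it. *)
type 'a cell = {
  compute : unit -> 'a;
  mutable cached : 'a option;
  mutable dependents : (unit -> unit) list;  (* invalidation callbacks *)
}

let read ?on_invalidate cell =
  (match on_invalidate with
   | Some f -> cell.dependents <- f :: cell.dependents
   | None -> ());
  match cell.cached with
  | Some v -> v
  | None ->
    let v = cell.compute () in
    cell.cached <- Some v;
    v

(* Called on a file-system event: drop the cache and cascade to the
   computations that depended on this cell. *)
let invalidate cell =
  cell.cached <- None;
  List.iter (fun f -> f ()) cell.dependents;
  cell.dependents <- []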

(At the same time, Andrey is continuing to rewrite and clean up some of the terrible original code I wrote. That’s a story for another post, but during the original development of Jbuilder there was a ton of work to do in limited time, and I didn’t always take the time to polish every bit of my code as I usually like to do, especially for open source projects where the code is put up in the open for everyone to see!)

Thank you

I’ll conclude this post on this note, and would like to thank all the people who make the Dune project and events such as the retreat possible. Thanks to the whole Dune team for their hard work (and also great LAN parties :)), to Jane Street, which is putting both financing and developer time into the project, to OCaml Labs and Tarides for their support and developer time, to CEA and LexiFi for their developer time, to the Coq team for choosing Dune and helping make it better, to all the contributors who help in their free time, to all the people who report issues, and of course to all the users of Dune, without whom the project would have no reason to exist.