14 Docker apps
David Anderson edited this page 2024-12-03 13:03:12 -08:00

Terminology

In this context, words like 'app' have many possible meanings. To avoid confusion we use these terms with specific meanings:

  • "BOINC app" and "BOINC app version": the BOINC concepts described here.

  • "Science app": a set of programs that execute a job, i.e. that process input files and produce output files.

  • Science apps can evolve over time. Each instance is a "science app version".

  • "Science executable": a part of a science app version in compiled form, e.g. an x64/Linux executable.

Overview

BOINC lets you use Docker to run science apps on volunteer hosts (Windows, Mac and Linux). To do so:

  • Develop your science app in the software environment of your choice (say, particular versions of Linux and Python, with particular libraries and packages installed).

  • Write a Dockerfile that builds this environment.

Note: your Docker image must include the ps command. Most Linux Docker images do, but the Debian image does not. If you use the Debian image, add this line to your Dockerfile:

RUN apt-get update && apt-get install -y procps && rm -rf /var/lib/apt/lists/*

  • Create BOINC app versions that combine the Dockerfile and your science executables with a "Docker wrapper" program (supplied by BOINC) that interfaces between Docker and the BOINC client.

Your science application can then run on all major platforms (Linux, Windows, Mac OS). In that sense it's similar to BOINC's support for apps that run in VirtualBox virtual machines. However, the Docker approach has several advantages:

  • Docker apps can access GPUs.
  • Docker apps use much less disk space (tens of MBs rather than GBs).
  • Starting a Docker container takes less time than starting a virtual machine.

The remainder of this document describes BOINC's support for Docker apps. For a simple example, see the Docker app cookbook.

The Docker wrapper

The Docker wrapper (docker_wrapper) interfaces BOINC to Docker. It is the main program of Docker apps. Usage:

docker_wrapper [options] arg1 arg2 ...

Options:

--verbose: write Docker commands and output to stderr.

--config <filename>: config file name; default job.toml.

--dockerfile <filename>: Dockerfile name; default Dockerfile.

--sporadic: the application is sporadic.

The Docker wrapper reads an optional config file, default job.toml. This file, which is in TOML format, can contain the following items:

project_dir_mount = "/project"

Mounts the job's project directory at the given mount point (an absolute path) in the container.

use_gpu = true

Allow GPU access from the container.

checkpoint_interval = 3600

Specify a checkpoint interval, overriding the computing preferences.
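
As an illustration, the three items above could be combined in one config file. The following is a minimal sketch, not a required layout: it writes a job.toml from a shell script, and OUT_DIR is a hypothetical variable (defaulting to the current directory) used only for this example.

```shell
#!/bin/bash
# Write a job.toml that uses the three documented items.
# OUT_DIR is hypothetical; it is not something BOINC defines.
out_dir="${OUT_DIR:-.}"

cat > "$out_dir/job.toml" <<'EOF'
# Mount the job's project directory at /project in the container.
project_dir_mount = "/project"

# Allow GPU access from the container.
use_gpu = true

# Checkpoint interval, overriding the computing preferences.
checkpoint_interval = 3600
EOF
```

Because the file is named job.toml, docker_wrapper reads it by default; a file with a different name would have to be passed with --config.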

Command line arguments

Command-line arguments that docker_wrapper does not recognize as its own options are passed into the container in an environment variable ARGS. To use this feature, include in your Dockerfile

ENV ARGS=""
CMD <cmd> ${ARGS}

Programs in the container can then access the arguments via the environment variable ARGS. For example, if the main program is a bash script it could do

...
./program $ARGS infile outfile
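
Since ARGS arrives as a single string, the main script relies on the shell to split it into separate words. The sketch below makes that step explicit and turns off filename globbing while splitting, so that a character such as * in an argument stays literal. It assumes that individual arguments contain no embedded whitespace or quotes; run_with_args and ./program are hypothetical names used only for illustration.

```shell
#!/bin/bash
# run_with_args CMD ARGSTRING
# Split ARGSTRING on whitespace (without filename globbing) and run CMD with
# the resulting words, followed by the input and output file names.
run_with_args () {
    cmd=$1
    set -f          # keep characters like * in ARGSTRING literal
    set -- $2       # intentionally unquoted: split into separate words
    set +f
    "$cmd" "$@" infile outfile
}

# In the container, with a hypothetical science executable ./program:
# run_with_args ./program "${ARGS:-}"
```

Quoting the variable instead (./program "$ARGS" infile outfile) would pass all arguments to the program as one word, which is usually not what is wanted.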

Accessing input files

docker_wrapper mounts the job's slot directory at WORKDIR in the container. So there are two ways to access an input file.

Direct access

Mark the file as <copy_file/> in the input template. The BOINC client will copy the file to the slot directory (with its logical name) and the science executable can access it directly.

Indirect access

If your science app has large input files (100 MB+) you can avoid the space and time overhead of copying them to the slot directory by accessing the file in the project directory.

To do this, don't mark the file as <copy_file/>. The client will create a "link file" in the slot directory. The link file is an XML document that points to the file in the project directory; for example

<soft_link>../../projects/proj_url/infile</soft_link>

Mount the project directory in the container by adding this to job.toml:

project_dir_mount = "/project"

Your executables (in the container) must convert BOINC's link files to physical names. This is easy to do in a shell script:

#! /bin/bash

resolve () {
    sed 's/<soft_link>\.\.\/\.\.\/projects\/[^\/]*\//\/project\//; s/<\/soft_link>//' "$1" | tr -d '\r\n'
}

./worker $(resolve in) out

Here, the resolve() function takes the name of a link file and returns the path of the file in the project directory (assuming that this directory is mounted at /project).
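
If the same script has to work whether or not an input file is marked <copy_file/>, the function can fall back to the logical name when the file is not a link file. The following variant is a sketch under two assumptions: link files have exactly the format shown above, and the project directory is mounted at /project. It inspects only the first 256 bytes (an arbitrary choice) so that a large copied input file is not scanned.

```shell
#!/bin/bash
# resolve NAME: print the path to use for the logical file NAME.
# If NAME is a BOINC link file, print the corresponding path under /project;
# otherwise print NAME unchanged (the file was copied to the slot directory).
resolve () {
    if head -c 256 "$1" 2>/dev/null | grep -q '<soft_link>'; then
        sed -e 's|<soft_link>\.\./\.\./projects/[^/]*/|/project/|' \
            -e 's|</soft_link>||' "$1" | tr -d '\r\n'
    else
        printf '%s' "$1"
    fi
}

# Usage in the container, with a hypothetical executable ./worker:
# ./worker "$(resolve in)" out
```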

Accessing output files

Output files should not be marked <copy_file/>. Write them (with logical names) in the WORKDIR.

Packaging options

A BOINC job has

  • A BOINC app version.
  • A workunit.

Each of these is a collection of files. The files in a BOINC app version are code-signed. Signing is normally done manually, which prevents attackers from using your project to distribute malware even if they break into your server.

The files in a BOINC app version are cached on the client. They are deleted only when the app version has been superseded by a later version. Workunit files are deleted after a job is finished, unless they are marked as <sticky/> in the job's input template.

The files of a Docker app can be divided between app version and workunit in two ways.

Single-purpose BOINC app

In this model, there is one BOINC app per science app. The BOINC app version for a platform contains

  • Dockerfile
  • docker_wrapper (compiled for that platform)
  • job.toml
  • science executables

and each workunit contains

  • input files

To deploy a new science application you need to create a new BOINC app, and to deploy a new science application version you need to create a new BOINC app version. These both require login access to the BOINC server.
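
Before creating an app version in this model, it can help to check that the files listed above are all present in one place. The sketch below is a hypothetical helper, not part of BOINC: it assumes the files are collected in a single staging directory, that the Dockerfile uses the default name, and that the wrapper's file name starts with "docker_wrapper". Science executables are not checked because their names are project-specific.

```shell
#!/bin/bash
# check_app_version DIR: report which expected files are missing from the
# staging directory DIR. Returns non-zero if a required file is missing.
check_app_version () {
    dir=$1
    status=0
    [ -f "$dir/Dockerfile" ] || { echo "missing: Dockerfile"; status=1; }
    # Assumption: the wrapper's file name starts with "docker_wrapper".
    ls "$dir"/docker_wrapper* >/dev/null 2>&1 || { echo "missing: docker_wrapper"; status=1; }
    # The config file is optional, so its absence is only noted.
    [ -f "$dir/job.toml" ] || echo "note: no job.toml"
    return $status
}

# Usage, with a hypothetical staging directory:
# check_app_version ./my_app_version
```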

Universal BOINC app

In this model, a single BOINC app handles multiple science apps. This app can have app versions for different platforms. Each app version contains docker_wrapper compiled for that platform.

Each workunit includes

  • Dockerfile
  • science executables
  • job.toml (config file for docker_wrapper)
  • input files

This model facilitates interfaces where job submitters can deploy new science app versions or new science apps without server login access. But it also has some limitations.

See more on the universal Docker model