The BOINC application programming interface (API)

The BOINC API is a set of C++ functions. Unless otherwise specified, the functions return an integer error code; zero indicates success. The graphics API is described separately.

Initialization and termination

The application must call
    int boinc_init();
before calling other BOINC functions or doing I/O. It may call
    int boinc_get_init_data(APP_INIT_DATA&);
to get the following information:

    struct APP_INIT_DATA {
        char project_preferences[4096];
        char user_name[256];
        char team_name[256];
        double wu_cpu_time;           // CPU time from previous sessions
        double total_cobblestones;
        double recent_avg_cobblestones;
    };

These items might be used by the application in its graphics. At any time it may call

    double boinc_cpu_time();
to get its current CPU time.

When the application has completed it must call

    int boinc_finish(int status);
The status argument is nonzero if an error was encountered.
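
The call sequence described above can be sketched as follows. This is a minimal sketch, not the BOINC implementation: the boinc_* functions are local stand-ins with the signatures listed above so that the example compiles without the BOINC library, and run_app(), do_work(), and the stub state variables are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstring>

// Stand-in declarations so the sketch compiles without the BOINC library.
// A real application would link against BOINC instead of defining these.
struct APP_INIT_DATA {
    char project_preferences[4096];
    char user_name[256];
    char team_name[256];
    double wu_cpu_time;           // CPU time from previous sessions
    double total_cobblestones;
    double recent_avg_cobblestones;
};

static bool g_initialized = false;   // hypothetical stub state
static int g_finish_status = -1;     // hypothetical stub state

int boinc_init() { g_initialized = true; return 0; }

int boinc_get_init_data(APP_INIT_DATA& data) {
    if (!g_initialized) return -1;                 // boinc_init() must come first
    std::memset(&data, 0, sizeof(data));
    std::strcpy(data.user_name, "example_user");   // made-up value
    return 0;
}

// The real boinc_finish() ends with exit(status); this stub only records it.
int boinc_finish(int status) {
    assert(g_initialized);
    g_finish_status = status;
    return 0;
}

// Hypothetical computation; returns nonzero on error.
int do_work(const APP_INIT_DATA&) { return 0; }

// Call order: init, optional init data, work, finish.
int run_app() {
    int retval = boinc_init();
    if (retval) return retval;

    APP_INIT_DATA data;
    retval = boinc_get_init_data(data);
    if (retval) {
        boinc_finish(retval);
        return retval;
    }

    int status = do_work(data);
    boinc_finish(status);          // nonzero status reports an error
    return status;
}
```

In a real application the body of run_app() would live in main(), and nothing after boinc_finish() would run.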

Resolving file names

Applications that use named input or output files must call
    int boinc_resolve_filename(char *logical_name, char *physical_name);
to convert logical file names to physical names. For example, instead of
    f = fopen("my_file", "r");

the application might use

    char resolved_name[256];
    retval = boinc_resolve_filename("my_file", resolved_name);
    if (retval) fail("can't resolve filename");
    f = fopen(resolved_name, "r");
boinc_resolve_filename() doesn't need to be used for temporary files.
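
The resolve-then-open pattern can be wrapped in a small helper. The sketch below assumes a stand-in boinc_resolve_filename() backed by a made-up mapping table, so it runs without the BOINC library; resolve(), open_resolved(), g_name_map, and the physical name shown are hypothetical. The helper copies the logical name into a writable buffer because the function takes a non-const char*.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <map>
#include <string>

// Hypothetical table standing in for the mappings set up by the core client.
static std::map<std::string, std::string> g_name_map = {
    {"my_file", "../project/my_file_0123"}   // made-up physical name
};

// Stand-in with the signature given above; returns nonzero if unmapped.
int boinc_resolve_filename(char* logical_name, char* physical_name) {
    auto it = g_name_map.find(logical_name);
    if (it == g_name_map.end()) return -1;
    assert(it->second.size() < 256);   // callers in this sketch pass 256-byte buffers
    std::strcpy(physical_name, it->second.c_str());
    return 0;
}

// Hypothetical helper: resolve a logical name into `physical`.
// Returns true on success and leaves `physical` untouched on failure.
bool resolve(const std::string& logical, std::string& physical) {
    char logical_buf[256];
    char physical_buf[256];
    if (logical.size() >= sizeof(logical_buf)) return false;
    std::strcpy(logical_buf, logical.c_str());
    if (boinc_resolve_filename(logical_buf, physical_buf)) return false;
    physical = physical_buf;
    return true;
}

// Hypothetical helper: open a named input or output file by its logical
// name. Returns nullptr if the name cannot be resolved or opened.
FILE* open_resolved(const std::string& logical, const char* mode) {
    std::string physical;
    if (!resolve(logical, physical)) return nullptr;
    return std::fopen(physical.c_str(), mode);
}
```

Temporary files would bypass the helper and be opened directly by name.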

Checkpointing

Computations that use a significant amount of time per work unit may want to periodically write the current state of the computation to disk. This is known as checkpointing. The state file must include everything required to start the computation again at the same place it was checkpointed. On startup, the application reads the state file to determine where to begin computation. If the BOINC client quits or exits, the computation can be restarted from the most recent checkpoint.

To use checkpointing, an application should write to output and state files using the MFILE class.

    class MFILE {
    public:
        int open(char* path, char* mode);
        int _putchar(char);
        int puts(char*);
        int printf(char* format, ...);
        size_t write(const void* buf, size_t size, size_t nitems);
        int close();
        int flush();
    };
MFILE buffers data in memory and writes to disk only on flush() or close(). This lets you write output files and state files more or less atomically. Frequency of checkpointing is a user preference (e.g. laptop users might want to checkpoint infrequently). An application must call
    bool boinc_time_to_checkpoint();
whenever it reaches a point where it is able to checkpoint. If this returns true, the application should write the state file and flush all output files, then call
    void boinc_checkpoint_completed();
A call to boinc_time_to_checkpoint() is extremely fast, so there is little penalty in calling it frequently.
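
A checkpointing loop built on these two calls might look like the sketch below. It rests on several assumptions: the two BOINC functions are stubs (the stub asks for a checkpoint on every 100th call, whereas the real function follows the user's preference), the state file is written with std::ofstream rather than MFILE to keep the example self-contained, and State, read_state(), write_state(), discard_state(), and compute() are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>

// Stubs standing in for the BOINC calls (assumptions, not the real library).
static int g_calls = 0;
static int g_checkpoints = 0;
bool boinc_time_to_checkpoint() { return ++g_calls % 100 == 0; }
void boinc_checkpoint_completed() { ++g_checkpoints; }

// Everything needed to resume the computation where it left off.
struct State {
    long next_i = 0;    // next loop index to process
    double sum = 0.0;   // partial result so far
};

// Returns false (leaving `s` unchanged) if no valid state file exists.
bool read_state(const char* path, State& s) {
    std::ifstream in(path);
    State tmp;
    if (!(in >> tmp.next_i >> tmp.sum)) return false;
    s = tmp;
    return true;
}

bool write_state(const char* path, const State& s) {
    std::ofstream out(path, std::ios::trunc);
    out.precision(17);   // enough digits to round-trip a double
    out << s.next_i << ' ' << s.sum << '\n';
    out.flush();
    return out.good();
}

// Removes the state file so the next run starts from the beginning.
void discard_state(const char* path) { std::remove(path); }

// Sums 0..n-1. `stop_after` simulates the client exiting after that many
// iterations in this session; pass -1 to run to completion.
double compute(const char* state_path, long n, long stop_after) {
    assert(n >= 0);
    State s;
    read_state(state_path, s);   // falls back to a fresh State if absent
    long done_this_session = 0;
    while (s.next_i < n) {
        if (stop_after >= 0 && done_this_session == stop_after) break;
        s.sum += static_cast<double>(s.next_i);
        ++s.next_i;
        ++done_this_session;
        // The loop body has finished, so this is a point where the
        // computation is able to checkpoint.
        if (boinc_time_to_checkpoint()) {
            if (write_state(state_path, s)) boinc_checkpoint_completed();
        }
    }
    return s.sum;
}
```

Work done after the last checkpoint is repeated on restart, which is why the state must be written before boinc_checkpoint_completed() is called.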

Fraction done

The core client GUI displays the percent done of workunits in progress. To keep this display current, an application should periodically call
    int boinc_fraction_done(double fraction_done);
The fraction_done argument is a rough estimate of the workunit fraction complete (0 to 1). This function is extremely fast and can be called often.

Multi-program applications

Some applications consist of multiple programs: a main program that acts as coordinator, and one or more subsidiary programs. Each program should use the BOINC API as described above.

Each program should have its own state file; the state file of the coordinator program records which subsidiary program was last active.

To correctly implement fraction done, the main program should pass information to subsidiary programs (perhaps as command-line arguments) indicating the starting and ending fractions for that program.
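
One way to use the starting and ending fractions is a linear mapping from a subsidiary program's local progress to the overall workunit fraction. The sketch below is an illustration under that assumption; FractionRange, overall_fraction(), and equal_share() are hypothetical names, and splitting the workunit evenly is only one possible policy.

```cpp
#include <cassert>

// Hypothetical range handed by the coordinator to one subsidiary program.
struct FractionRange {
    double start;   // overall fraction done when the program starts
    double end;     // overall fraction done when the program finishes
};

// Maps local progress (0 to 1) into the overall workunit fraction.
// Out-of-range local values are clamped.
double overall_fraction(const FractionRange& r, double local) {
    assert(r.start >= 0.0 && r.end <= 1.0 && r.start <= r.end);
    if (local < 0.0) local = 0.0;
    if (local > 1.0) local = 1.0;
    return r.start + local * (r.end - r.start);
}

// Splits the workunit evenly among `count` programs run in sequence and
// returns the range for the program at position `index` (0-based).
FractionRange equal_share(int index, int count) {
    assert(count > 0 && index >= 0 && index < count);
    return { static_cast<double>(index) / count,
             static_cast<double>(index + 1) / count };
}
```

A subsidiary program would pass the result of overall_fraction() to boinc_fraction_done(), so the displayed percentage increases steadily across program boundaries.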

The coordinator must call

    void boinc_child_start();
prior to forking a child process. When the child is done, the coordinator must get the child's CPU time, then call
    void boinc_child_done(double total_cpu);
before forking the next child process.

Implementation

Applications are executed in separate "catbox" directories, allowing them to create and use temporary files without name conflicts. Input and output files are kept outside the catbox. The mappings from virtual (logical) to physical filenames use "symbolic link" files in the catbox directory. The name of such a file is the virtual name, and the file contains an XML tag with the physical name. (This scheme is used because of the lack of filesystem links in Windows.)

Communication between the core client and applications is done through XML files in the catbox directory. Several files are used.

Files created by the core client, read by the app: (Once, at start of app)

Files created by the API implementation, read by the core client:

The API implementation uses a timer (60 Hz); the real-time clock is not available to applications. This timer is used for several purposes:

Exit status: The core client does a wait() to get the status. boinc_finish() ends with an exit(status).

Accounting of CPU time: In Unix, a parent can't get the CPU time of a child until the child exits, so the CPU time has to be measured in the child. The core client passes the WU CPU time in init_data.xml. boinc_checkpoint_completed() and boinc_finish() compute the new WU CPU time and write it to checkpoint_cpu.xml. The core client deletes this file after reading it. If there is no checkpoint_cpu.xml on exit, the app called exit(0) rather than boinc_finish(); in this case the core client measures the child's CPU time itself.

The core client maintains

Timing of checkpoints

The app library maintains time_until_checkpoint, which is decremented from the timer handler. boinc_time_to_checkpoint() returns true if this value is zero or less. boinc_checkpoint_completed() resets it.
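
The countdown described above can be modeled in a few lines. This is a sketch of the logic only, not the app library's code: CheckpointTimer and its members are hypothetical, and resetting the counter to the full checkpoint period is an assumption about what "resets it" means.

```cpp
#include <cassert>

// Hypothetical model of the checkpoint countdown kept by the app library.
struct CheckpointTimer {
    double period;                  // seconds between checkpoints (user preference)
    double time_until_checkpoint;   // seconds left until the next checkpoint is due

    explicit CheckpointTimer(double period_seconds)
        : period(period_seconds), time_until_checkpoint(period_seconds) {
        assert(period_seconds > 0.0);
    }

    // Called from the timer handler on every tick.
    void on_timer_tick(double tick_seconds) {
        time_until_checkpoint -= tick_seconds;
    }

    // Mirrors boinc_time_to_checkpoint(): only a comparison, so it is cheap
    // enough to call frequently.
    bool time_to_checkpoint() const {
        return time_until_checkpoint <= 0.0;
    }

    // Mirrors the reset done once a checkpoint has been written
    // (assumed here to restore the full period).
    void checkpoint_completed() {
        time_until_checkpoint = period;
    }
};
```

Because the decrement happens in the timer handler, the application's own call only reads a value, which is consistent with the call being very fast.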

Maintaining fraction done and current CPU

These two quantities are transferred from the app library to the core client in the file fraction_done.xml. The parameter time_until_fraction_done_update, passed in the initialization file, determines how often this file is written. It is written from the timer handler.

For multi-program applications, only the active program should write the file. The functions boinc_child_start() and boinc_child_done() tell the app library to stop and to resume writing the file, respectively.

TO DO: this creates disk traffic. Either figure out a way of increasing the period for users who don't want disk access, or don't use disk files.