Back end programs

A project back end is implemented as a set of programs. Some parts of these programs are supplied by BOINC; other parts are project- or application-specific:

Work generator
Purpose: generates workunits, results, and the corresponding input files.
BOINC-supplied part: functions and programs that handle the details of creating workunit and result database records.
Project-supplied part: programs or scripts that generate input files, install them on data servers, and call the BOINC functions.

Timeout check
Purpose: checks for various timeout conditions, such as result timeout, and reissues results for workunits as needed.
BOINC-supplied part: a program, timeout_check.
Project-supplied part: some parameters used by timeout_check.

Result validation and accounting
Purpose: compares redundant results; selects a canonical result representing the correct output, and a canonical credit granted to users and hosts that return the correct output.
BOINC-supplied part: a program, validate, that contains the basic logic for validation.
Project-supplied part: an application-specific function, linked with validate, that compares sets of redundant results.

Assimilator
Purpose: handles workunits that are "completed", that is, workunits that have a canonical result or for which an error condition has occurred. Handling a successfully completed result might involve recording results in a database and perhaps generating more work.
BOINC-supplied part: a main program that enumerates unassimilated workunits, calls a project-supplied "handler" function, and updates the database.
Project-supplied part: a handler function that assimilates a workunit, either by processing its canonical result or by handling an error return.

File deleter
Purpose: deletes input and output files when they are no longer needed.
BOINC-supplied part: a program, file_deleter.
Project-supplied part: none.

Timeout checker

The timeout checker is passed the following parameters:

max_errors: give up on a workunit if it gets this many error results (i.e., there must be a bug in the application).
max_results: give up on a workunit if it gets this many non-error results without finding a canonical result.
redundancy: try to get at least this many non-error results.
application: which application to handle.

Use crontab to run timeout_checker continuously.

    for each WU with timeout_check_time < now
        for each result of WU
            if result.server_state=IN_PROGRESS and now > result.report_deadline
                result.server_state = OVER
                result.outcome = NO_REPLY
        if any result has outcome COULDNT_SEND
            wu.error_mask |= COULDNT_SEND
            got_error = true
        if too many error results
            wu.error_mask |= TOO_MANY_ERROR_RESULTS
            got_error = true
        if too many results
            wu.error_mask |= TOO_MANY_RESULTS
            got_error = true
        else
            generate new results as needed

        if got_error
            for all results server_state UNSENT
                result.server_state = OVER
                result.outcome = DIDNT_NEED
            if wu.assimilate_state == INIT
                wu.assimilate_state = READY

        if all results are OVER and wu.assimilate_state == DONE
            wu.file_delete_state = READY
            wu.timeout_check_time = 0
        else
            wu.timeout_check_time = now + delay_bound
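
The pass above can be sketched in C++ roughly as follows. This is only an illustration working on in-memory structures, not the actual timeout_check source: the Workunit, Result and Params types, the have_canonical flag, the enum values, and the choice to count NO_REPLY and CLIENT_ERROR as "error results" are all assumptions made for the example.

```cpp
#include <cassert>
#include <ctime>
#include <vector>

// Hypothetical in-memory stand-ins for the workunit and result records.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { OUTCOME_INIT, SUCCESS, COULDNT_SEND, CLIENT_ERROR, NO_REPLY, DIDNT_NEED };
enum AssimilateState { ASSIMILATE_INIT, ASSIMILATE_READY, ASSIMILATE_DONE };
enum WuError { WU_COULDNT_SEND = 1, WU_TOO_MANY_ERROR_RESULTS = 2, WU_TOO_MANY_RESULTS = 4 };

struct Result {
    ServerState server_state = UNSENT;
    Outcome outcome = OUTCOME_INIT;
    std::time_t report_deadline = 0;
};

struct Workunit {
    std::vector<Result> results;
    int error_mask = 0;
    bool have_canonical = false;   // assumed flag: a canonical result exists
    AssimilateState assimilate_state = ASSIMILATE_INIT;
    bool file_delete_ready = false;
    std::time_t timeout_check_time = 0;
    std::time_t delay_bound = 0;
};

struct Params { int max_errors; int max_results; int redundancy; };

// One timeout-check pass over a single workunit.
void check_workunit(Workunit& wu, const Params& p, std::time_t now) {
    int n_errors = 0, n_success = 0, n_pending = 0;
    bool got_error = false;

    for (Result& r : wu.results) {
        // Results that missed their deadline are closed out as NO_REPLY.
        if (r.server_state == IN_PROGRESS && now > r.report_deadline) {
            r.server_state = OVER;
            r.outcome = NO_REPLY;
        }
        if (r.server_state != OVER) {
            ++n_pending;
        } else if (r.outcome == SUCCESS) {
            ++n_success;
        } else if (r.outcome == COULDNT_SEND) {
            wu.error_mask |= WU_COULDNT_SEND;
            got_error = true;
        } else if (r.outcome == CLIENT_ERROR || r.outcome == NO_REPLY) {
            ++n_errors;   // assumed definition of an "error result"
        }
    }

    if (n_errors >= p.max_errors) {
        wu.error_mask |= WU_TOO_MANY_ERROR_RESULTS;
        got_error = true;
    }
    if (!wu.have_canonical && n_success >= p.max_results) {
        wu.error_mask |= WU_TOO_MANY_RESULTS;
        got_error = true;
    } else if (!got_error && !wu.have_canonical) {
        // Top up to `redundancy` potential non-error results. Skipping this
        // when an error is already flagged is an assumption of this sketch.
        for (int n = n_pending + n_success; n < p.redundancy; ++n) {
            wu.results.push_back(Result{});
        }
    }

    if (got_error) {
        // Cancel anything not yet sent and hand the workunit to the assimilator.
        for (Result& r : wu.results) {
            if (r.server_state == UNSENT) {
                r.server_state = OVER;
                r.outcome = DIDNT_NEED;
            }
        }
        if (wu.assimilate_state == ASSIMILATE_INIT) {
            wu.assimilate_state = ASSIMILATE_READY;
        }
    }

    bool all_over = true;
    for (const Result& r : wu.results) {
        if (r.server_state != OVER) all_over = false;
    }
    if (all_over && wu.assimilate_state == ASSIMILATE_DONE) {
        wu.file_delete_ready = true;
        wu.timeout_check_time = 0;
    } else {
        wu.timeout_check_time = now + wu.delay_bound;
    }
}
```

In this sketch a workunit whose results have all timed out reaches max_errors, is flagged with TOO_MANY_ERROR_RESULTS, and is handed to the assimilator; it is rechecked after delay_bound until assimilation is done.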

Validator

BOINC supplies a utility program validate to perform validation and credit-granting. This program must be linked with two project-specific functions:

    int check_set(vector<RESULT> results, int& canonicalid, double& credit);
    int check_pair(RESULT& r1, RESULT& r2, bool& match);

check_set() takes a set of results. If there is sufficient agreement, it selects one of them as the "canonical" result (returning its ID) and also decides what credit should be granted for correct results for this workunit.

check_pair() compares two results and returns match=true if they agree.

The file validate_test.C contains an example implementation of check_set() and check_pair().
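
As a rough illustration of what a project might supply, the sketch below implements both functions over a simplified result type. It is not the contents of validate_test.C. The RESULT structure here is a hypothetical stand-in (the real record has different fields), and the policies used are assumptions: outputs agree when they are byte-identical, "sufficient agreement" means a strict majority, the granted credit is the lowest claimed credit among the agreeing results, and canonicalid is set to 0 when no canonical result is found.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for BOINC's RESULT record.
struct RESULT {
    int id;
    std::string output;      // assumed: output file contents already in memory
    double claimed_credit;   // assumed field
};

// Two results agree if their outputs are identical (assumed policy).
int check_pair(RESULT& r1, RESULT& r2, bool& match) {
    match = (r1.output == r2.output);
    return 0;
}

// Pick the first result that a strict majority of the set agrees with.
int check_set(std::vector<RESULT> results, int& canonicalid, double& credit) {
    canonicalid = 0;   // assumed convention: 0 means no canonical result
    credit = 0.0;
    const std::size_t needed = results.size() / 2 + 1;

    for (std::size_t i = 0; i < results.size(); ++i) {
        std::size_t agreeing = 0;
        double min_credit = results[i].claimed_credit;
        for (std::size_t j = 0; j < results.size(); ++j) {
            bool match = false;
            int retval = check_pair(results[i], results[j], match);
            if (retval) return retval;   // propagate comparison errors
            if (match) {
                ++agreeing;
                min_credit = std::min(min_credit, results[j].claimed_credit);
            }
        }
        if (agreeing >= needed) {
            canonicalid = results[i].id;
            credit = min_credit;
            return 0;
        }
    }
    return 0;   // no error, but no canonical result either
}
```

Applications whose outputs differ slightly between platforms would need a fuzzier comparison in check_pair() than the exact match assumed here.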

    for each WU with need_validate = true
        if already have canonical result
            for each result with validate_state = INIT and outcome = SUCCESS
                if matches canonical, grant credit
                set result.validate_state to VALID or INVALID
        else
            build set of results with outcome = SUCCESS
            if find canonical result
                wu.assimilate_state = READY
                for all results server_state = UNSENT
                    result.server_state = OVER
                    result.outcome = DIDNT_NEED
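
The validation pass above might look like this in C++. It is a sketch under stated assumptions rather than the code of validate: the record types are hypothetical, the project-specific comparison functions are passed in as callbacks instead of being linked in, a canonical_resultid of 0 is taken to mean "none yet", and clearing need_validate at the end is a guess that the pseudocode does not spell out.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical in-memory stand-ins for the database records.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { OUTCOME_INIT, SUCCESS, CLIENT_ERROR, DIDNT_NEED };
enum ValidateState { VALIDATE_INIT, VALID, INVALID };
enum AssimilateState { ASSIMILATE_INIT, ASSIMILATE_READY, ASSIMILATE_DONE };

struct Result {
    int id = 0;
    ServerState server_state = UNSENT;
    Outcome outcome = OUTCOME_INIT;
    ValidateState validate_state = VALIDATE_INIT;
    double granted_credit = 0.0;
};

struct Workunit {
    std::vector<Result> results;
    bool need_validate = false;
    int canonical_resultid = 0;   // 0 = none yet (assumed convention)
    double canonical_credit = 0.0;
    AssimilateState assimilate_state = ASSIMILATE_INIT;
};

// Project-specific hooks, modeled as callbacks for this sketch.
using PairCheck = std::function<bool(const Result&, const Result&)>;
// Returns the canonical result's id (0 if none) and sets the credit.
using SetCheck = std::function<int(const std::vector<Result>&, double&)>;

void validate_workunit(Workunit& wu, const PairCheck& matches,
                       const SetCheck& pick_canonical) {
    if (!wu.need_validate) return;

    if (wu.canonical_resultid != 0) {
        // Copy the canonical result so it can be compared while others change.
        Result canonical;
        for (const Result& r : wu.results) {
            if (r.id == wu.canonical_resultid) canonical = r;
        }
        for (Result& r : wu.results) {
            if (r.validate_state != VALIDATE_INIT || r.outcome != SUCCESS) continue;
            if (matches(canonical, r)) {
                r.validate_state = VALID;
                r.granted_credit = wu.canonical_credit;
            } else {
                r.validate_state = INVALID;
            }
        }
    } else {
        std::vector<Result> candidates;
        for (const Result& r : wu.results) {
            if (r.outcome == SUCCESS) candidates.push_back(r);
        }
        double credit = 0.0;
        int id = pick_canonical(candidates, credit);
        if (id != 0) {
            wu.canonical_resultid = id;
            wu.canonical_credit = credit;
            wu.assimilate_state = ASSIMILATE_READY;
            // Results not yet sent are no longer needed.
            for (Result& r : wu.results) {
                if (r.server_state == UNSENT) {
                    r.server_state = OVER;
                    r.outcome = DIDNT_NEED;
                }
            }
        }
    }
    wu.need_validate = false;   // assumption: flag is cleared after each pass
}
```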

Scheduler

    - when sending a result
        result.server_state = IN_PROGRESS
        result.report_deadline = now + wu.delay_bound
        NOTE: no lookup is needed before updating; the shared-memory copy can't be stale
    - when receiving a result
        client_state = (from reply msg)
        switch result.server_state
            case IN_PROGRESS:
                result.server_state = OVER
            case OVER:
                result.file_delete_state = READY

        if client_state is DONE
            result.outcome = SUCCESS
            wu.need_validate = true
        else
            result.outcome = CLIENT_ERROR
            result.validate_state = INVALID
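
The two scheduler transitions can be written as a pair of small C++ functions. This is a sketch, not the scheduler's code: the types and the ClientState values are hypothetical, and the pseudocode does not say whether the IN_PROGRESS case falls through to the OVER case, so the version below assumes it does not.

```cpp
#include <cassert>
#include <ctime>

// Hypothetical in-memory stand-ins for the database records.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { OUTCOME_INIT, SUCCESS, CLIENT_ERROR };
enum ValidateState { VALIDATE_INIT, VALID, INVALID };
enum ClientState { CLIENT_DONE, CLIENT_FAILED };   // assumed values

struct Workunit {
    std::time_t delay_bound = 0;
    bool need_validate = false;
};

struct Result {
    ServerState server_state = UNSENT;
    Outcome outcome = OUTCOME_INIT;
    ValidateState validate_state = VALIDATE_INIT;
    bool file_delete_ready = false;
    std::time_t report_deadline = 0;
};

// Called when a result is handed to a client.
void on_send(Result& r, const Workunit& wu, std::time_t now) {
    r.server_state = IN_PROGRESS;
    r.report_deadline = now + wu.delay_bound;
}

// Called when a client reports a result back.
void on_receive(Result& r, Workunit& wu, ClientState client_state) {
    switch (r.server_state) {
    case IN_PROGRESS:
        r.server_state = OVER;
        break;   // assumed: no fall-through into the OVER case
    case OVER:
        // The result was already closed out (for example, it timed out),
        // so its files can be deleted.
        r.file_delete_ready = true;
        break;
    default:
        break;
    }

    if (client_state == CLIENT_DONE) {
        r.outcome = SUCCESS;
        wu.need_validate = true;
    } else {
        r.outcome = CLIENT_ERROR;
        r.validate_state = INVALID;
    }
}
```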

Assimilator

    for each WU with assimilate_state = READY
        call project-specific handler function
            NOTE: canonical_resultid and error_mask are not mutually exclusive
        if all results are OVER with outcomes SUCCESS or CLIENT_ERROR
            set result.file_delete_state = READY for all results
        else
            for each non-canonical result
                if state is OVER and outcome is SUCCESS or CLIENT_ERROR
                    set result.file_delete_state = READY
        wu.assimilate_state = DONE
        if all results are OVER
            wu.file_delete_state = READY
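
A C++ sketch of the assimilation step for one workunit follows. It is an illustration only: the record types are hypothetical, and the project-specific handler is passed as a callback rather than linked into the main program. The handler receives the whole workunit because, as noted above, canonical_resultid and error_mask may both be set.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical in-memory stand-ins for the database records.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { OUTCOME_INIT, SUCCESS, CLIENT_ERROR, NO_REPLY, DIDNT_NEED };
enum AssimilateState { ASSIMILATE_INIT, ASSIMILATE_READY, ASSIMILATE_DONE };

struct Result {
    int id = 0;
    ServerState server_state = UNSENT;
    Outcome outcome = OUTCOME_INIT;
    bool file_delete_ready = false;
};

struct Workunit {
    std::vector<Result> results;
    int canonical_resultid = 0;   // 0 = none (assumed convention)
    int error_mask = 0;
    AssimilateState assimilate_state = ASSIMILATE_INIT;
    bool file_delete_ready = false;
};

// True if the result is OVER and its client actually ran the job.
static bool finished(const Result& r) {
    return r.server_state == OVER &&
           (r.outcome == SUCCESS || r.outcome == CLIENT_ERROR);
}

void assimilate(Workunit& wu, const std::function<void(const Workunit&)>& handler) {
    if (wu.assimilate_state != ASSIMILATE_READY) return;

    handler(wu);   // project-specific processing of the result or the error

    bool all_over = true;
    bool all_finished = true;
    for (const Result& r : wu.results) {
        if (r.server_state != OVER) all_over = false;
        if (!finished(r)) all_finished = false;
    }

    for (Result& r : wu.results) {
        // The canonical result's files are kept while other results may
        // still arrive and need to be compared against it.
        bool canonical = (r.id == wu.canonical_resultid);
        if (all_finished || (finished(r) && !canonical)) {
            r.file_delete_ready = true;
        }
    }

    wu.assimilate_state = ASSIMILATE_DONE;
    if (all_over) wu.file_delete_ready = true;
}
```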