<title>Back end programs</title>
<body bgcolor=ffffff>
<h2>Back end programs</h2>
<p>
A project back end is implemented as a set of programs.
Some parts of these programs are supplied by BOINC;
other parts are project- or application-specific:
<br>
<img vspace=10 src=backend.png>
<br>
<p>
<table border=1 cellpadding=8>
<tr>
<th>Component</th>
<th>BOINC-supplied part</th>
<th>project-supplied part</th>
</tr>
<tr>
<td valign=top>
<b>Work generator</b>: generates workunits, results,
and the corresponding input files.
</td>
<td valign=top>
Functions and programs that handle the details of
creating workunit and result database records.
</td>
<td valign=top>
Programs or scripts that generate input files,
install them on data servers, and call the BOINC functions.
</td></tr>
<tr>
<td valign=top><b>Timeout check</b>:
Checks for various timeout conditions,
such as result timeout.
Reissues results for workunits as needed.
</td>
<td valign=top>A program <b>timeout_check</b>.</td>
<td valign=top>Some parameters used by timeout_check.</td>
</tr>
<tr>
<td valign=top><b>Result validation and accounting</b>:
compares redundant results; selects a <b>canonical result</b>
representing the correct output,
and a <b>canonical credit</b> granted to users and hosts
that return the correct output.</td>
<td valign=top>A program, <b>validate</b>, that contains the
basic logic for validation.</td>
<td valign=top>An application-specific function, linked with <b>validate</b>,
that compares sets of redundant results.</td>
</tr>
<tr>
<td valign=top><b>Assimilator</b>:
handles workunits that are "completed":
that is, those that have a canonical result or for which
an error condition has occurred.
Handling a successfully completed result might involve
recording results in a database and perhaps generating more work.</td>
<td valign=top>
A main program that enumerates unassimilated workunits,
calls a project-supplied "handler" function,
and updates the database.
</td>
<td valign=top>
A handler function that assimilates a workunit,
either by processing its canonical result
or handling an error return.
</td>
</tr>
<tr>
<td valign=top><b>File deleter</b>: deletes input and output files
when they are no longer needed.</td>
<td valign=top>A program <b>file_deleter</b>.</td>
<td valign=top>None.</td>
</tr>
</table>
<h3>Timeout checker</h3>
<p>
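The timeout checker is passed the following parameters:
<table border=1 cellpadding=8>
<tr>
<th>Parameter</th>
<th>Meaning</th>
</tr>
<tr>
<td valign=top>max_errors</td>
<td valign=top>Give up on a workunit if it gets this many error results
(i.e., there must be a bug in the application).</td>
</tr>
<tr>
<td valign=top>max_results</td>
<td valign=top>Give up on a workunit if it gets this many
non-error results without finding a canonical result.</td>
</tr>
<tr>
<td valign=top>redundancy</td>
<td valign=top>Try to get at least this many non-error results.</td>
</tr>
<tr>
<td valign=top>application</td>
<td valign=top>Which application to handle.</td>
</tr>
</table>
<p>
Use crontab to run timeout_check continuously.
The logic of each pass is: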
<pre>
for each WU with timeout_check_time < now
    for each result of WU
        if result.server_state == IN_PROGRESS and now > result.report_deadline
            result.server_state = OVER
            result.outcome = NO_REPLY
    if any result has outcome COULDNT_SEND
        wu.error_mask |= COULDNT_SEND
        got_error = true
    if too many error results
        wu.error_mask |= TOO_MANY_ERROR_RESULTS
        got_error = true
    if too many results
        wu.error_mask |= TOO_MANY_RESULTS
        got_error = true
    else
        generate new results as needed
    if got_error
        for all results with server_state == UNSENT
            result.server_state = OVER
            result.outcome = DIDNT_NEED
        if wu.assimilate_state == INIT
            wu.assimilate_state = READY
    if all results are OVER and wu.assimilate_state == DONE
        wu.file_delete_state = READY
        wu.timeout_check_time = 0
    else
        wu.timeout_check_time = now + delay_bound
</pre>
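<p>
The pass above can be sketched in C++ for a single workunit.
This is a minimal illustration, not the actual timeout_check source:
the <b>Result</b>, <b>Workunit</b> and <b>Params</b> structs and the
ERR_* bit values are hypothetical stand-ins for the database records.
It also assumes that an "error result" is one whose outcome is
CLIENT_ERROR or NO_REPLY, and that "as needed" means topping up to
<b>redundancy</b> results that are successful or still outstanding.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; names follow the pseudocode, not the BOINC headers.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { NO_OUTCOME, SUCCESS, COULDNT_SEND, CLIENT_ERROR, NO_REPLY, DIDNT_NEED };
enum Stage { INIT, READY, DONE };
enum { ERR_COULDNT_SEND = 1, ERR_TOO_MANY_ERROR_RESULTS = 2, ERR_TOO_MANY_RESULTS = 4 };

struct Result {
    ServerState server_state = UNSENT;
    Outcome outcome = NO_OUTCOME;
    double report_deadline = 0;
};

struct Workunit {
    std::vector<Result> results;
    int error_mask = 0;
    Stage assimilate_state = INIT;
    Stage file_delete_state = INIT;
    double timeout_check_time = 0;
    double delay_bound = 0;
};

struct Params { int max_errors, max_results, redundancy; };

// One timeout-check pass over a single workunit.
void timeout_check(Workunit& wu, const Params& p, double now) {
    assert(p.max_errors > 0 && p.max_results > 0);

    int n_errors = 0, n_success = 0, n_outstanding = 0;
    bool couldnt_send = false;
    for (Result& r : wu.results) {
        // Time out results whose deadline has passed.
        if (r.server_state == IN_PROGRESS && now > r.report_deadline) {
            r.server_state = OVER;
            r.outcome = NO_REPLY;
        }
        if (r.server_state != OVER) ++n_outstanding;
        else if (r.outcome == SUCCESS) ++n_success;
        else if (r.outcome == COULDNT_SEND) couldnt_send = true;
        else if (r.outcome == CLIENT_ERROR || r.outcome == NO_REPLY) ++n_errors;
    }

    bool got_error = false;
    if (couldnt_send) { wu.error_mask |= ERR_COULDNT_SEND; got_error = true; }
    if (n_errors >= p.max_errors) { wu.error_mask |= ERR_TOO_MANY_ERROR_RESULTS; got_error = true; }
    if (n_success >= p.max_results) {
        wu.error_mask |= ERR_TOO_MANY_RESULTS;
        got_error = true;
    } else {
        // Generate new results as needed (assumed policy, see lead-in).
        for (int i = n_success + n_outstanding; i < p.redundancy; ++i)
            wu.results.push_back(Result{});
    }

    if (got_error) {
        // Cancel anything not yet sent and hand the workunit to the assimilator.
        for (Result& r : wu.results)
            if (r.server_state == UNSENT) { r.server_state = OVER; r.outcome = DIDNT_NEED; }
        if (wu.assimilate_state == INIT) wu.assimilate_state = READY;
    }

    bool all_over = true;
    for (const Result& r : wu.results)
        if (r.server_state != OVER) all_over = false;

    if (all_over && wu.assimilate_state == DONE) {
        wu.file_delete_state = READY;
        wu.timeout_check_time = 0;      // no further checks needed
    } else {
        wu.timeout_check_time = now + wu.delay_bound;
    }
}
```

<p>
Like the pseudocode, the sketch creates replacement results before it
looks at <b>got_error</b>, so a workunit that has just failed may get
new results that are cancelled in the same pass.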
<h3>Validator</h3>
<p>
BOINC supplies a utility program <b>validate</b>
to perform validation and credit-granting.
This program must be linked with two project-specific functions:
<pre>
int check_set(vector&lt;RESULT&gt; results, int& canonicalid, double& credit);
int check_pair(RESULT& r1, RESULT& r2, bool& match);
</pre>
<b>check_set()</b> takes a set of results.
If there is sufficient agreement,
it selects one of them as the "canonical" result
(returning its ID) and also decides what credit should
be granted for correct results for this workunit.
<p>
<b>check_pair()</b> compares two results and returns match=true
if they agree.
<p>
The file <b>validate_test.C</b> contains an example
implementation of check_set() and check_pair().
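<p>
As a rough illustration of the two functions (this is not the contents
of validate_test.C), the sketch below uses a hypothetical <b>RESULT</b>
struct whose <b>output</b> string stands in for the result's output file.
The majority rule, the "lowest claimed credit" policy, the return value 0
for success, and canonicalid = 0 for "no canonical result" are all
assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
using std::vector;

// Hypothetical stand-in for BOINC's RESULT record.
struct RESULT {
    int id;
    double claimed_credit;
    std::string output;   // stands in for the contents of the output file
};

// Two results agree if their outputs are byte-for-byte identical.
int check_pair(RESULT& r1, RESULT& r2, bool& match) {
    match = (r1.output == r2.output);
    return 0;
}

// Select as canonical the first result that agrees with a strict majority
// of the set (itself included). The credit granted is the lowest credit
// claimed among the agreeing results.
int check_set(vector<RESULT> results, int& canonicalid, double& credit) {
    canonicalid = 0;   // assumed convention: 0 means "no canonical result"
    credit = 0;
    for (std::size_t i = 0; i < results.size(); ++i) {
        std::size_t n_agree = 0;
        double min_credit = results[i].claimed_credit;
        for (std::size_t j = 0; j < results.size(); ++j) {
            bool match = false;
            int retval = check_pair(results[i], results[j], match);
            if (retval) return retval;
            if (match) {
                ++n_agree;
                min_credit = std::min(min_credit, results[j].claimed_credit);
            }
        }
        assert(n_agree >= 1);   // every result must at least match itself
        if (2 * n_agree > results.size()) {
            canonicalid = results[i].id;
            credit = min_credit;
            return 0;
        }
    }
    return 0;   // no sufficient agreement yet
}
```

<p>
A real application would usually compare output files with some numerical
tolerance rather than exact equality.
The main loop of <b>validate</b> works as follows: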
<pre>
for each WU with need_validate = true
    if already have canonical result
        for each result with validate_state = INIT and outcome = SUCCESS
            if matches canonical, grant credit
            set result.validate_state to VALID or INVALID
    else
        build set of results with outcome = SUCCESS
        if find canonical result
            wu.assimilate_state = READY
            for all results with server_state = UNSENT
                result.server_state = OVER
                result.outcome = DIDNT_NEED
</pre>
<h3>Scheduler</h3>
<pre>
- when send a result
    result.server_state = IN_PROGRESS
    result.report_deadline = now + wu.delay_bound
    ??? should do lookup before updating? shmem may be stale
        doesn't matter; can't be stale
- when receive a result
    switch result.server_state
        client_state = (from reply msg)
        case IN_PROGRESS:
            result.server_state = OVER
        case OVER:
            result.file_delete_state = READY
    if client_state is DONE
        result.outcome = SUCCESS
        wu.need_validate = true
    else
        result.outcome = CLIENT_ERROR
        result.validate_state = INVALID
</pre>
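<p>
The state changes above can be sketched as two small C++ functions.
This is one possible reading of the pseudocode, not the scheduler source:
the structs, the enum names and the <b>ClientState</b> values are
hypothetical, and the outcome is assumed to be set after the switch
regardless of which case was taken.

```cpp
#include <cassert>

// Hypothetical stand-ins; names follow the pseudocode, not the BOINC headers.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { NO_OUTCOME, SUCCESS, CLIENT_ERROR };
enum ValidateState { VALIDATE_INIT, VALID, INVALID };
enum FileDeleteState { FILE_DELETE_INIT, FILE_DELETE_READY };
enum ClientState { CLIENT_DONE, CLIENT_FAILED };   // taken from the reply message

struct Workunit {
    double delay_bound = 0;
    bool need_validate = false;
};

struct Result {
    ServerState server_state = UNSENT;
    Outcome outcome = NO_OUTCOME;
    ValidateState validate_state = VALIDATE_INIT;
    FileDeleteState file_delete_state = FILE_DELETE_INIT;
    double report_deadline = 0;
};

// Called when the scheduler sends a result to a client.
void on_send(Result& r, const Workunit& wu, double now) {
    assert(wu.delay_bound >= 0);
    r.server_state = IN_PROGRESS;
    r.report_deadline = now + wu.delay_bound;
}

// Called when a client reports a result.
void on_receive(Result& r, Workunit& wu, ClientState client_state) {
    switch (r.server_state) {
    case IN_PROGRESS:
        r.server_state = OVER;
        break;
    case OVER:
        // Already over (e.g. timed out): its files can be deleted.
        r.file_delete_state = FILE_DELETE_READY;
        break;
    default:
        break;
    }
    if (client_state == CLIENT_DONE) {
        r.outcome = SUCCESS;
        wu.need_validate = true;    // wake up the validator for this workunit
    } else {
        r.outcome = CLIENT_ERROR;
        r.validate_state = INVALID;
    }
}
```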
<h3>Assimilator</h3>
<pre>
for each WU with assimilate_state = READY
    call project-specific handler function
        NOTE: canonical_resultid and error_mask are not mutually exclusive
    if all results are OVER with outcomes SUCCESS or CLIENT_ERROR
        set result.file_delete_state = READY for all results
    else
        for each non-canonical result
            if state is OVER and outcome is SUCCESS or CLIENT_ERROR
                set result.file_delete_state = READY
    wu.assimilate_state = DONE
    if all results are OVER
        wu.file_delete_state = READY
</pre>
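<p>
A minimal C++ sketch of this loop body for one workunit follows.
It is illustrative only: the structs, the <b>Handler</b> type and its
signature are hypothetical, and canonical_resultid = 0 is assumed to
mean "no canonical result".

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-ins; names follow the pseudocode, not the BOINC headers.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { NO_OUTCOME, SUCCESS, CLIENT_ERROR };
enum Stage { INIT, READY, DONE };

struct Result {
    int id;
    ServerState server_state;
    Outcome outcome;
    Stage file_delete_state;
};

struct Workunit {
    int canonical_resultid = 0;   // assumed: 0 means "none"
    int error_mask = 0;
    Stage assimilate_state = INIT;
    Stage file_delete_state = INIT;
    std::vector<Result> results;
};

// Project-supplied handler: receives the workunit and its canonical result
// (nullptr if there is none, in which case error_mask says why).
using Handler = std::function<int(const Workunit&, const Result*)>;

static bool finished(const Result& r) {
    return r.server_state == OVER &&
           (r.outcome == SUCCESS || r.outcome == CLIENT_ERROR);
}

void assimilate(Workunit& wu, const Handler& handler) {
    assert(wu.assimilate_state == READY);

    const Result* canonical = nullptr;
    bool all_finished = true, all_over = true;
    for (const Result& r : wu.results) {
        if (wu.canonical_resultid != 0 && r.id == wu.canonical_resultid) canonical = &r;
        if (!finished(r)) all_finished = false;
        if (r.server_state != OVER) all_over = false;
    }

    // Both a canonical result and a nonzero error_mask may be present.
    handler(wu, canonical);

    // Release files: all of them if every result is finished; otherwise
    // only those of finished non-canonical results.
    for (Result& r : wu.results) {
        bool is_canonical = (&r == canonical);
        if (finished(r) && (all_finished || !is_canonical))
            r.file_delete_state = READY;
    }

    wu.assimilate_state = DONE;
    if (all_over) wu.file_delete_state = READY;
}
```

<p>
Keeping the canonical result's files while other results are still
outstanding lets late results be compared against it by the validator.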