mirror of https://github.com/BOINC/boinc.git
<title>Back end programs</title>
<body bgcolor=ffffff>
<h2>Back end programs</h2>

<p>
A project back end is implemented as a set of programs.
Some parts of these programs are supplied by BOINC;
other parts are project- or application-specific:

<br>
<img vspace=10 src=backend.png>
<br>

<p>
<table border=1 cellpadding=8>
<tr>
<th>Component</th>
<th>BOINC-supplied part</th>
<th>Project-supplied part</th>
</tr>
<tr>
<td valign=top>
<b>Work generator</b>: generates workunits, results,
and the corresponding input files.
</td>
<td valign=top>
Functions and programs that handle the details of
creating workunit and result database records.
</td>
<td valign=top>
Programs or scripts that generate input files,
install them on data servers, and call the BOINC functions.
</td></tr>
<tr>
<td valign=top><b>Timeout check</b>:
checks for various timeout conditions,
such as result timeout,
and reissues results for workunits as needed.
</td>
<td valign=top>A program, <b>timeout_check</b>.</td>
<td valign=top>Some parameters used by timeout_check.</td>
</tr>
<tr>
<td valign=top><b>Result validation and accounting</b>:
compares redundant results and selects a <b>canonical result</b>
representing the correct output,
and a <b>canonical credit</b> granted to users and hosts
that return the correct output.</td>
<td valign=top>A program, <b>validate</b>, that contains the
basic logic for validation.</td>
<td valign=top>An application-specific function, linked with <b>validate</b>,
that compares sets of redundant results.</td>
</tr>
<tr>
<td valign=top><b>Assimilator</b>:
handles workunits that are "completed",
that is, workunits that have a canonical result or for which
an error condition has occurred.
Handling a successfully completed workunit might involve
recording results in a database and perhaps generating more work.</td>
<td valign=top>
A main program that enumerates unassimilated workunits,
calls a project-supplied "handler" function,
and updates the database.
</td>
<td valign=top>
A handler function that assimilates a workunit,
either by processing its canonical result
or handling an error return.
</td>
</tr>
<tr>
<td valign=top><b>File deleter</b>: deletes input and output files
when they are no longer needed.</td>
<td valign=top>A program, <b>file_deleter</b>.</td>
<td valign=top>None.</td>
</tr>
</table>

<h3>Timeout checker</h3>
<p>
The timeout checker is passed the following parameters:
<p>
<b>max_errors</b>:
give up on a workunit if it gets this many error results
(i.e., there must be a bug in the application).
<p>
<b>max_results</b>:
give up on a workunit if it gets this many
non-error results without finding a canonical result.
<p>
<b>redundancy</b>:
try to get at least this many non-error results.
<p>
<b>application</b>:
which application to handle.
<p>
Use crontab to run <b>timeout_check</b> continuously.

<pre>
for each WU with timeout_check_time &lt; now
    for each result of WU
        if result.server_state == IN_PROGRESS and now &gt; result.report_deadline
            result.server_state = OVER
            result.outcome = NO_REPLY
    if any result has outcome COULDNT_SEND
        wu.error_mask |= COULDNT_SEND
        got_error = true
    if too many error results
        wu.error_mask |= TOO_MANY_ERROR_RESULTS
        got_error = true
    if too many results
        wu.error_mask |= TOO_MANY_RESULTS
        got_error = true
    else
        generate new results as needed

    if got_error
        for all results with server_state UNSENT
            result.server_state = OVER
            result.outcome = DIDNT_NEED
        if wu.assimilate_state == INIT
            wu.assimilate_state = READY

    if all results are OVER and wu.assimilate_state == DONE
        wu.file_delete_state = READY
        wu.timeout_check_time = 0
    else
        wu.timeout_check_time = now + wu.delay_bound
</pre>
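<p>
The pass above can be sketched in C++ for a single workunit.
This is a minimal illustration, not BOINC's implementation:
the <b>Result</b>, <b>Workunit</b> and <b>Params</b> structs, the
<b>has_canonical</b> flag, the enum values, and the exact thresholds
("too many" is read as "at least max_errors / max_results") are all assumptions
made for the example. It also generates new results only when no error
was detected and no canonical result exists, which is one reading of the pseudocode.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the database records.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { NONE, SUCCESS, COULDNT_SEND, CLIENT_ERROR, NO_REPLY, DIDNT_NEED };
enum AssimilateState { ASSIM_INIT, ASSIM_READY, ASSIM_DONE };
enum ErrorBits { WU_COULDNT_SEND = 1, WU_TOO_MANY_ERROR_RESULTS = 2, WU_TOO_MANY_RESULTS = 4 };

struct Result {
    ServerState server_state = UNSENT;
    Outcome outcome = NONE;
    long report_deadline = 0;
};

struct Workunit {
    std::vector<Result> results;
    bool has_canonical = false;          // assumption: set elsewhere by the validator
    int error_mask = 0;
    AssimilateState assimilate_state = ASSIM_INIT;
    bool file_delete_ready = false;
    long timeout_check_time = 0;
    long delay_bound = 0;
};

struct Params { int max_errors; int max_results; int redundancy; };

// One timeout-check pass over a single workunit.
void timeout_check(Workunit& wu, const Params& p, long now) {
    assert(p.redundancy > 0);
    int n_error = 0, n_success = 0, n_pending = 0;
    bool couldnt_send = false;
    for (Result& r : wu.results) {
        // Time out results whose deadline has passed.
        if (r.server_state == IN_PROGRESS && now > r.report_deadline) {
            r.server_state = OVER;
            r.outcome = NO_REPLY;
        }
        if (r.server_state != OVER) { ++n_pending; continue; }
        if (r.outcome == SUCCESS) ++n_success;
        else if (r.outcome == COULDNT_SEND) couldnt_send = true;
        else if (r.outcome == CLIENT_ERROR || r.outcome == NO_REPLY) ++n_error;
    }

    bool got_error = false;
    if (couldnt_send) { wu.error_mask |= WU_COULDNT_SEND; got_error = true; }
    if (n_error >= p.max_errors) { wu.error_mask |= WU_TOO_MANY_ERROR_RESULTS; got_error = true; }
    if (!wu.has_canonical && n_success >= p.max_results) {
        wu.error_mask |= WU_TOO_MANY_RESULTS;
        got_error = true;
    }

    if (!got_error && !wu.has_canonical) {
        // Top up to `redundancy` results that are pending or successful.
        for (int n = n_pending + n_success; n < p.redundancy; ++n) {
            wu.results.push_back(Result());
        }
    }

    if (got_error) {
        // Cancel unsent results and hand the workunit to the assimilator.
        for (Result& r : wu.results) {
            if (r.server_state == UNSENT) { r.server_state = OVER; r.outcome = DIDNT_NEED; }
        }
        if (wu.assimilate_state == ASSIM_INIT) wu.assimilate_state = ASSIM_READY;
    }

    bool all_over = true;
    for (const Result& r : wu.results) {
        if (r.server_state != OVER) all_over = false;
    }
    if (all_over && wu.assimilate_state == ASSIM_DONE) {
        wu.file_delete_ready = true;
        wu.timeout_check_time = 0;       // no further checks needed
    } else {
        wu.timeout_check_time = now + wu.delay_bound;
    }
}
```

<p>
In this sketch a workunit with one overdue result and redundancy 2
ends up with the overdue result marked NO_REPLY and two fresh unsent results.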

<h3>Validator</h3>
<p>
BOINC supplies a utility program <b>validate</b>
to perform validation and credit-granting.
This program must be linked with two project-specific functions:
<pre>
int check_set(vector&lt;RESULT&gt; results, int&amp; canonicalid, double&amp; credit);
int check_pair(RESULT&amp; r1, RESULT&amp; r2, bool&amp; match);
</pre>
<b>check_set()</b> takes a set of results.
If there is sufficient agreement,
it selects one of them as the "canonical" result
(returning its ID in canonicalid) and also decides what credit should
be granted for correct results for this workunit.
<p>
<b>check_pair()</b> compares two results and sets match to true
if they agree.

<p>
The file <b>validate_test.C</b> contains an example
implementation of check_set() and check_pair().

<pre>
for each WU with need_validate = true
    if already have canonical result
        for each result with validate_state = INIT and outcome = SUCCESS
            if matches canonical, grant credit
            set result.validate_state to VALID or INVALID
    else
        build set of results with outcome = SUCCESS
        if find canonical result
            wu.assimilate_state = READY
            for all results server_state = UNSENT
                result.server_state = OVER
                result.outcome = DIDNT_NEED
</pre>
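<p>
As an illustration of the two project-specific functions, here is a minimal
sketch that is not taken from validate_test.C. It assumes a hypothetical
<b>RESULT</b> struct with an id, an output string and a claimed credit;
it treats two results as matching when their outputs are byte-identical,
requires a strict majority for agreement, grants the lowest credit claimed
by the agreeing results, and uses canonicalid 0 to mean "none found".
All of these policies are assumptions chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
using std::vector;

// Hypothetical stand-in for BOINC's RESULT record.
struct RESULT {
    int id;
    std::string output;
    double claimed_credit;
};

// Two results agree if their outputs are identical (assumed policy).
int check_pair(RESULT& r1, RESULT& r2, bool& match) {
    match = (r1.output == r2.output);
    return 0;
}

// Pick the first result that a strict majority of the set agrees with.
// Leaves canonicalid at 0 if there is no such result.
int check_set(vector<RESULT> results, int& canonicalid, double& credit) {
    canonicalid = 0;
    for (std::size_t i = 0; i < results.size(); ++i) {
        std::size_t n_agree = 0;
        double min_credit = results[i].claimed_credit;
        for (std::size_t j = 0; j < results.size(); ++j) {
            bool match = false;
            int retval = check_pair(results[i], results[j], match);
            if (retval) return retval;
            if (!match) continue;
            ++n_agree;
            min_credit = std::min(min_credit, results[j].claimed_credit);
        }
        if (2 * n_agree > results.size()) {
            canonicalid = results[i].id;
            credit = min_credit;
            return 0;
        }
    }
    return 0;
}
```

<p>
A real application would usually compare outputs with a tolerance
rather than byte for byte, since floating-point results can differ across hosts.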

<h3>Scheduler</h3>
<pre>
- when send a result
    result.server_state = IN_PROGRESS
    result.report_deadline = now + wu.delay_bound
    ??? should do lookup before updating? shmem may be stale
        doesn't matter; can't be stale
- when receive a result
    client_state = (from reply msg)
    switch result.server_state
        case IN_PROGRESS:
            result.server_state = OVER
        case OVER:
            result.file_delete_state = READY

    if client_state is DONE
        result.outcome = SUCCESS
        wu.need_validate = true
    else
        result.outcome = CLIENT_ERROR
        result.validate_state = INVALID
</pre>
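<p>
The two scheduler transitions can be sketched as a pair of C++ functions.
The struct layouts, enum values and function names below are hypothetical.
The sketch applies the outcome update regardless of the result's previous
server state, which is only one possible reading of the pseudocode.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the database records.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { NONE, SUCCESS, CLIENT_ERROR, NO_REPLY };
enum ValidateState { VALIDATE_INIT, VALID, INVALID };
enum ClientState { CLIENT_DONE, CLIENT_FAILED };

struct Result {
    ServerState server_state = UNSENT;
    Outcome outcome = NONE;
    ValidateState validate_state = VALIDATE_INIT;
    bool file_delete_ready = false;
    long report_deadline = 0;
};

struct Workunit {
    long delay_bound = 0;
    bool need_validate = false;
};

// Called when the scheduler sends a result to a client.
void on_send(Result& r, const Workunit& wu, long now) {
    assert(r.server_state == UNSENT);
    r.server_state = IN_PROGRESS;
    r.report_deadline = now + wu.delay_bound;
}

// Called when a client reports a result.
void on_receive(Result& r, Workunit& wu, ClientState client_state) {
    switch (r.server_state) {
    case IN_PROGRESS:
        r.server_state = OVER;
        break;
    case OVER:
        // Already timed out or cancelled: its files can be deleted.
        r.file_delete_ready = true;
        break;
    default:
        break;
    }
    if (client_state == CLIENT_DONE) {
        r.outcome = SUCCESS;
        wu.need_validate = true;      // wakes up the validator for this workunit
    } else {
        r.outcome = CLIENT_ERROR;
        r.validate_state = INVALID;
    }
}
```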

<h3>Assimilator</h3>
<pre>
for each WU with assimilate_state = READY
    call project-specific handler function
        NOTE: canonical_resultid and error_mask are not mutually exclusive
    if all results are OVER with outcomes SUCCESS or CLIENT_ERROR
        set result.file_delete = READY for all results
    else
        for each non-canonical result
            if state is OVER and outcome is SUCCESS or CLIENT_ERROR
                set result.file_delete = READY
    wu.assimilate_state = DONE
    if all results are OVER
        wu.file_delete_state = READY
</pre>
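<p>
A minimal C++ sketch of the assimilator step for one workunit follows.
The structs, the <b>returned()</b> helper and the use of std::function for the
project-supplied handler are assumptions made for the example, as is the
convention that canonical_resultid 0 means "no canonical result".
The canonical result's files are kept while other results are still outstanding,
presumably so that late results can still be compared against it.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical, simplified stand-ins for the database records.
enum ServerState { UNSENT, IN_PROGRESS, OVER };
enum Outcome { NONE, SUCCESS, CLIENT_ERROR, NO_REPLY, DIDNT_NEED };
enum AssimilateState { ASSIM_INIT, ASSIM_READY, ASSIM_DONE };

struct Result {
    int id = 0;
    ServerState server_state = UNSENT;
    Outcome outcome = NONE;
    bool file_delete_ready = false;
};

struct Workunit {
    std::vector<Result> results;
    int canonical_resultid = 0;   // assumption: 0 means no canonical result
    int error_mask = 0;           // may be nonzero even if a canonical result exists
    AssimilateState assimilate_state = ASSIM_READY;
    bool file_delete_ready = false;
};

// True if the client actually reported this result (successfully or not).
static bool returned(const Result& r) {
    return r.server_state == OVER &&
           (r.outcome == SUCCESS || r.outcome == CLIENT_ERROR);
}

void assimilate(Workunit& wu, const std::function<void(const Workunit&)>& handler) {
    assert(wu.assimilate_state == ASSIM_READY);
    handler(wu);   // project-specific: process canonical result or error

    bool all_returned = true, all_over = true;
    for (const Result& r : wu.results) {
        if (!returned(r)) all_returned = false;
        if (r.server_state != OVER) all_over = false;
    }
    for (Result& r : wu.results) {
        // Keep the canonical result's files until every result has been returned.
        bool keep_canonical = !all_returned && r.id == wu.canonical_resultid;
        if (returned(r) && !keep_canonical) r.file_delete_ready = true;
    }
    wu.assimilate_state = ASSIM_DONE;
    if (all_over) wu.file_delete_ready = true;
}
```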