<?
require_once("docutil.php");
page_head("Back end programs");
echo "
<p>
A project back end is implemented as a set of programs.
Some parts of these programs are supplied by BOINC;
other parts are project- or application-specific:

<br>
<img vspace=10 src=backend.png>
<br>

<p>
<table border=1 cellpadding=8>
<tr>
<th>Component</th>
<th>BOINC-supplied part</th>
<th>Project-supplied part</th>
</tr>
<tr>
<td valign=top>
<b>Work generator</b>: generates workunits, results,
and the corresponding input files.
</td>
<td valign=top>
Functions and programs that handle the details of
creating workunit and result database records.
</td>
<td valign=top>
Programs or scripts that generate input files,
install them on data servers, and call the BOINC functions.
</td></tr>
<tr>
<td valign=top><b>Timeout check</b>:
checks for various timeout conditions,
such as result timeout,
and reissues results for workunits as needed.
</td>
<td valign=top>A program <b>timeout_check</b>.</td>
<td valign=top>Parameters passed to <b>timeout_check</b>.</td>
</tr>
<tr>
<td valign=top><b>Result validation and accounting</b>:
compares redundant results; selects a <b>canonical result</b>
representing the correct output,
and a <b>canonical credit</b> granted to users and hosts
that return the correct output.</td>
<td valign=top>A program, <b>validate</b>, that contains the
basic logic for validation.</td>
<td valign=top>Application-specific functions, linked with <b>validate</b>,
that compare redundant results.</td>
</tr>
<tr>
<td valign=top><b>Assimilator</b>:
handles workunits that are 'completed':
that is, those that have a canonical result or for which
an error condition has occurred.
Handling a successfully completed workunit might involve
recording results in a database and perhaps generating more work.</td>
<td valign=top>
A main program that enumerates unassimilated workunits,
calls a project-supplied 'handler' function,
and updates the database.
</td>
<td valign=top>
A handler function that assimilates a workunit,
either by processing its canonical result
or handling an error return.
</td>
</tr>
<tr>
<td valign=top><b>File deleter</b>: deletes input and output files
when they are no longer needed.</td>
<td valign=top>A program <b>file_deleter</b>.</td>
<td valign=top>None.</td>
</tr>
</table>

<h3>Timeout checker</h3>
<p>
The timeout checker is passed the following parameters:
<p>
<table border=1 cellpadding=8>
<tr>
<td valign=top><b>max_errors</b></td>
<td valign=top>Give up on a workunit if it gets this many error results
(this suggests a bug in the application).</td>
</tr>
<tr>
<td valign=top><b>max_results</b></td>
<td valign=top>Give up on a workunit if it gets this many
non-error results without finding a canonical result.</td>
</tr>
<tr>
<td valign=top><b>redundancy</b></td>
<td valign=top>Try to get at least this many non-error results.</td>
</tr>
<tr>
<td valign=top><b>application</b></td>
<td valign=top>Which application to handle.</td>
</tr>
</table>
<p>
Use crontab to run <b>timeout_check</b> continuously.

<pre>
for each WU with timeout_check_time &lt; now
    got_error = false
    for each result of WU
        if result.server_state == IN_PROGRESS and now &gt; result.report_deadline
            result.server_state = OVER
            result.outcome = NO_REPLY
    if any result has outcome COULDNT_SEND
        wu.error_mask |= COULDNT_SEND
        got_error = true
    if too many error results
        wu.error_mask |= TOO_MANY_ERROR_RESULTS
        got_error = true
    if too many results
        wu.error_mask |= TOO_MANY_RESULTS
        got_error = true
    if not got_error
        generate new results as needed

    if got_error
        for all results with server_state == UNSENT
            result.server_state = OVER
            result.outcome = DIDNT_NEED
        if wu.assimilate_state == INIT
            wu.assimilate_state = READY

    if all results are OVER and wu.assimilate_state == DONE
        wu.file_delete_state = READY
        wu.timeout_check_time = 0
    else
        wu.timeout_check_time = now + delay_bound
</pre>
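<p>
The pass above can be sketched in C++ for a single workunit.
This is only an illustration under stated assumptions, not the code of <b>timeout_check</b>:
the types Result, Workunit and Params, the error_mask bit values,
and the field canonical_resultid are hypothetical stand-ins for the database records.
The sketch also assumes that an 'error result' is one whose outcome is CLIENT_ERROR or NO_REPLY,
that 'this many' means greater than or equal to the limit,
and that new results are generated only while no canonical result exists.

```cpp
#include <cassert>
#include <vector>

// Hypothetical in-memory stand-ins for the database records.
enum class ServerState { Unsent, InProgress, Over };
enum class Outcome { None, Success, CouldntSend, ClientError, NoReply, DidntNeed };
enum class Stage { Init, Ready, Done };

// Bits of wu.error_mask; the numeric values are assumptions.
const unsigned ERR_COULDNT_SEND = 1;
const unsigned ERR_TOO_MANY_ERROR_RESULTS = 2;
const unsigned ERR_TOO_MANY_RESULTS = 4;

struct Result {
    ServerState server_state = ServerState::Unsent;
    Outcome outcome = Outcome::None;
    long report_deadline = 0;
};

struct Workunit {
    std::vector<Result> results;
    unsigned error_mask = 0;
    int canonical_resultid = 0;            // 0 means no canonical result yet
    Stage assimilate_state = Stage::Init;
    Stage file_delete_state = Stage::Init;
    long timeout_check_time = 0;
    long delay_bound = 0;
};

struct Params { int max_errors; int max_results; int redundancy; };

// One timeout-check pass over a single workunit.
void timeout_check(Workunit& wu, const Params& p, long now) {
    assert(p.max_errors > 0 && p.max_results > 0);
    int n_error = 0, n_success = 0, n_pending = 0;
    bool got_error = false;
    for (Result& r : wu.results) {
        // Time out results that were sent but not reported by the deadline.
        if (r.server_state == ServerState::InProgress && now > r.report_deadline) {
            r.server_state = ServerState::Over;
            r.outcome = Outcome::NoReply;
        }
        if (r.server_state != ServerState::Over) { n_pending++; continue; }
        switch (r.outcome) {
        case Outcome::Success: n_success++; break;
        case Outcome::ClientError:
        case Outcome::NoReply: n_error++; break;
        case Outcome::CouldntSend:
            wu.error_mask |= ERR_COULDNT_SEND;
            got_error = true;
            break;
        default: break;
        }
    }
    if (n_error >= p.max_errors) {
        wu.error_mask |= ERR_TOO_MANY_ERROR_RESULTS;
        got_error = true;
    }
    if (wu.canonical_resultid == 0 && n_success >= p.max_results) {
        wu.error_mask |= ERR_TOO_MANY_RESULTS;
        got_error = true;
    }
    if (got_error) {
        // Give up: cancel unsent results and hand the workunit to the assimilator.
        for (Result& r : wu.results) {
            if (r.server_state == ServerState::Unsent) {
                r.server_state = ServerState::Over;
                r.outcome = Outcome::DidntNeed;
            }
        }
        if (wu.assimilate_state == Stage::Init) wu.assimilate_state = Stage::Ready;
    } else if (wu.canonical_resultid == 0) {
        // Top up to 'redundancy' results that are pending or successful.
        for (int n = n_pending + n_success; n < p.redundancy; n++) {
            wu.results.push_back(Result());
        }
    }
    bool all_over = true;
    for (const Result& r : wu.results) {
        if (r.server_state != ServerState::Over) all_over = false;
    }
    if (all_over && wu.assimilate_state == Stage::Done) {
        wu.file_delete_state = Stage::Ready;
        wu.timeout_check_time = 0;
    } else {
        wu.timeout_check_time = now + wu.delay_bound;
    }
}
```

<p>
A caller would select the workunits whose timeout_check_time has passed
and call timeout_check() on each one, then write the modified records back to the database.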

<h3>Validator</h3>
<p>
BOINC supplies a utility program <b>validate</b>
to perform validation and credit-granting.
This program must be linked with two project-specific functions:
<pre>
int check_set(vector&lt;RESULT&gt; results, int&amp; canonicalid, double&amp; credit);
int check_pair(RESULT&amp; r1, RESULT&amp; r2, bool&amp; match);
</pre>
<b>check_set()</b> takes a set of results.
If there is sufficient agreement,
it selects one of them as the canonical result
(returning its ID in canonicalid) and also decides what credit should
be granted for correct results for this workunit
(returned in credit).
<p>
<b>check_pair()</b> compares two results and returns match=true
if they agree.

<p>
The file <b>validate_test.C</b> contains an example
implementation of check_set() and check_pair().
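<p>
To illustrate the shape of these two functions, here is a minimal sketch.
It is not the code in validate_test.C.
The RESULT structure shown is a hypothetical stand-in whose fields (id, output, claimed_credit) are assumptions,
as are the agreement rule (a strict majority of the set),
the credit rule (the smallest credit claimed among the agreeing results),
the tolerance, and the convention that a return value of 0 means success
and canonicalid = 0 means no canonical result was found.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>
using std::vector;

// Hypothetical stand-in for the RESULT record; field names are assumptions.
struct RESULT {
    int id;
    double output;          // assumed: the output reduced to a single number
    double claimed_credit;  // assumed: the credit claimed by the host
};

// Assumed tolerance for comparing floating-point outputs.
const double TOLERANCE = 1e-6;

// Two results agree if their outputs differ by at most TOLERANCE.
int check_pair(RESULT& r1, RESULT& r2, bool& match) {
    match = std::fabs(r1.output - r2.output) <= TOLERANCE;
    return 0;
}

// Select as canonical the first result that agrees with a strict majority
// of the set (counting itself). The credit is the smallest credit claimed
// among the results that agree with it.
int check_set(vector<RESULT> results, int& canonicalid, double& credit) {
    canonicalid = 0;
    credit = 0;
    int n = static_cast<int>(results.size());
    for (int i = 0; i < n; i++) {
        int agree = 1;  // a result counts as agreeing with itself
        double min_credit = results[i].claimed_credit;
        for (int j = 0; j < n; j++) {
            if (j == i) continue;
            bool match = false;
            int retval = check_pair(results[i], results[j], match);
            if (retval != 0) return retval;
            if (match) {
                agree++;
                min_credit = std::min(min_credit, results[j].claimed_credit);
            }
        }
        if (2 * agree > n) {
            canonicalid = results[i].id;
            credit = min_credit;
            return 0;
        }
    }
    return 0;  // no sufficient agreement; canonicalid stays 0
}
```

<p>
With three results whose outputs are 3.0, 5.0 and 5.0, this sketch selects the second result as canonical,
since it agrees with two of the three.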

<pre>
for each WU with need_validate = true
    if already have canonical result
        for each result with validate_state = INIT and outcome = SUCCESS
            if matches canonical, grant credit
            set result.validate_state to VALID or INVALID
    else
        build set of results with outcome = SUCCESS
        if find canonical result
            wu.assimilate_state = READY
            for all results with server_state = UNSENT
                result.server_state = OVER
                result.outcome = DIDNT_NEED
</pre>
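<p>
The validation pass above can be sketched in C++ for a single workunit.
This is an illustration only, not the code of <b>validate</b>.
The Result and Workunit types and their fields (output, granted_credit, canonical_credit) are hypothetical,
and the check_pair() and check_set() shown are trivial placeholders for the project-specific functions.
The pseudocode does not say when need_validate is cleared, so the sketch leaves it unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class ServerState { Unsent, InProgress, Over };
enum class Outcome { None, Success, ClientError, DidntNeed };
enum class ValidateState { Init, Valid, Invalid };
enum class AssimilateState { Init, Ready, Done };

// Hypothetical stand-ins for the database records.
struct Result {
    int id = 0;
    int output = 0;  // placeholder for the contents of the output files
    ServerState server_state = ServerState::Unsent;
    Outcome outcome = Outcome::None;
    ValidateState validate_state = ValidateState::Init;
    double granted_credit = 0;
};

struct Workunit {
    std::vector<Result> results;
    bool need_validate = false;
    int canonical_resultid = 0;  // 0 means no canonical result yet
    double canonical_credit = 0;
    AssimilateState assimilate_state = AssimilateState::Init;
};

// Placeholder: two results agree if their outputs are equal.
int check_pair(Result& r1, Result& r2, bool& match) {
    match = (r1.output == r2.output);
    return 0;
}

// Placeholder: the first result matched by any later one is canonical,
// with a fixed credit of 1.0.
int check_set(std::vector<Result> results, int& canonicalid, double& credit) {
    canonicalid = 0;
    credit = 0;
    for (std::size_t i = 0; i < results.size(); i++) {
        for (std::size_t j = i + 1; j < results.size(); j++) {
            bool match = false;
            check_pair(results[i], results[j], match);
            if (match) {
                canonicalid = results[i].id;
                credit = 1.0;
                return 0;
            }
        }
    }
    return 0;
}

// One validation pass over a single workunit.
void validate_workunit(Workunit& wu) {
    if (!wu.need_validate) return;
    if (wu.canonical_resultid != 0) {
        // Compare each unvalidated successful result with the canonical one.
        Result* canonical = nullptr;
        for (Result& r : wu.results) {
            if (r.id == wu.canonical_resultid) canonical = &r;
        }
        assert(canonical != nullptr);
        for (Result& r : wu.results) {
            if (r.validate_state != ValidateState::Init) continue;
            if (r.outcome != Outcome::Success) continue;
            bool match = false;
            check_pair(r, *canonical, match);
            if (match) {
                r.granted_credit = wu.canonical_credit;
                r.validate_state = ValidateState::Valid;
            } else {
                r.validate_state = ValidateState::Invalid;
            }
        }
    } else {
        // Try to find a canonical result among the successful results.
        std::vector<Result> candidates;
        for (const Result& r : wu.results) {
            if (r.outcome == Outcome::Success) candidates.push_back(r);
        }
        int id = 0;
        double credit = 0;
        check_set(candidates, id, credit);
        if (id != 0) {
            wu.canonical_resultid = id;
            wu.canonical_credit = credit;
            wu.assimilate_state = AssimilateState::Ready;
            // Results not yet sent are no longer needed.
            for (Result& r : wu.results) {
                if (r.server_state == ServerState::Unsent) {
                    r.server_state = ServerState::Over;
                    r.outcome = Outcome::DidntNeed;
                }
            }
        }
    }
}
```

<p>
In this sketch a first call finds the canonical result,
and a later call (with need_validate still set) grants credit to the results that match it.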

<h3>Scheduler</h3>
<pre>
- when sending a result
    result.server_state = IN_PROGRESS
    result.report_deadline = now + wu.delay_bound
    Q: should we do a lookup before updating, since shmem may be stale?
    A: doesn't matter; it can't be stale
- when receiving a result
    client_state = (from reply msg)
    switch result.server_state
        case IN_PROGRESS:
            result.server_state = OVER
        case OVER:
            result.file_delete_state = READY

    if client_state is DONE
        result.outcome = SUCCESS
        wu.need_validate = true
    else
        result.outcome = CLIENT_ERROR
        result.validate_state = INVALID
</pre>
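<p>
The two scheduler actions above can be sketched in C++.
This is an illustration only, not the scheduler's code.
The Result and Workunit types, the ClientState enumeration and the function names
send_result() and receive_result() are hypothetical.
The pseudocode does not say what happens when a report arrives for a result in any other state;
the sketch assumes such a report is ignored.

```cpp
#include <cassert>

enum class ServerState { Unsent, InProgress, Over };
enum class Outcome { None, Success, ClientError, NoReply };
enum class ValidateState { Init, Valid, Invalid };
enum class FileDeleteState { Init, Ready, Done };
enum class ClientState { Done, Failed };  // hypothetical: taken from the reply message

// Hypothetical stand-ins for the database records.
struct Workunit {
    long delay_bound = 0;
    bool need_validate = false;
};

struct Result {
    ServerState server_state = ServerState::Unsent;
    Outcome outcome = Outcome::None;
    ValidateState validate_state = ValidateState::Init;
    FileDeleteState file_delete_state = FileDeleteState::Init;
    long report_deadline = 0;
};

// Called when the scheduler sends a result to a host.
void send_result(Result& r, const Workunit& wu, long now) {
    assert(r.server_state == ServerState::Unsent);
    r.server_state = ServerState::InProgress;
    r.report_deadline = now + wu.delay_bound;
}

// Called when a host reports a result.
// Returns false if the report was ignored (assumed behavior).
bool receive_result(Result& r, Workunit& wu, ClientState client_state) {
    switch (r.server_state) {
    case ServerState::InProgress:
        r.server_state = ServerState::Over;
        break;
    case ServerState::Over:
        // The result was already marked over, for example by a timeout.
        r.file_delete_state = FileDeleteState::Ready;
        break;
    default:
        return false;
    }
    // As in the pseudocode, the outcome is set whichever case was taken.
    if (client_state == ClientState::Done) {
        r.outcome = Outcome::Success;
        wu.need_validate = true;
    } else {
        r.outcome = Outcome::ClientError;
        r.validate_state = ValidateState::Invalid;
    }
    return true;
}
```

<p>
Setting need_validate is what causes the validator to pick up the workunit on its next pass.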

<h3>Assimilator</h3>
<pre>
for each WU with assimilate_state = READY
    call project-specific handler function
        NOTE: canonical_resultid and error_mask are not mutually exclusive
    if all results are OVER with outcomes SUCCESS or CLIENT_ERROR
        set result.file_delete_state = READY for all results
    else
        for each non-canonical result
            if state is OVER and outcome is SUCCESS or CLIENT_ERROR
                set result.file_delete_state = READY
    wu.assimilate_state = DONE
    if all results are OVER
        wu.file_delete_state = READY
</pre>
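<p>
The assimilator pass above can be sketched in C++ for a single workunit.
This is an illustration only, not the BOINC main program.
The Result and Workunit types and the function name assimilate() are hypothetical,
and the project-supplied handler is modeled as a std::function taking the workunit.

```cpp
#include <cassert>
#include <functional>
#include <vector>

enum class ServerState { Unsent, InProgress, Over };
enum class Outcome { None, Success, ClientError, NoReply, DidntNeed };
enum class Stage { Init, Ready, Done };

// Hypothetical stand-ins for the database records.
struct Result {
    int id = 0;
    ServerState server_state = ServerState::Unsent;
    Outcome outcome = Outcome::None;
    Stage file_delete_state = Stage::Init;
};

struct Workunit {
    std::vector<Result> results;
    int canonical_resultid = 0;  // 0 means no canonical result
    unsigned error_mask = 0;     // may be nonzero even if a canonical result exists
    Stage assimilate_state = Stage::Init;
    Stage file_delete_state = Stage::Init;
};

// True if the result is OVER with outcome SUCCESS or CLIENT_ERROR.
static bool finished(const Result& r) {
    return r.server_state == ServerState::Over &&
           (r.outcome == Outcome::Success || r.outcome == Outcome::ClientError);
}

// One assimilator pass over a single workunit.
void assimilate(Workunit& wu, const std::function<void(const Workunit&)>& handler) {
    assert(static_cast<bool>(handler));
    if (wu.assimilate_state != Stage::Ready) return;

    // The handler must check both canonical_resultid and error_mask.
    handler(wu);

    bool all_over = true;
    bool all_finished = true;
    for (const Result& r : wu.results) {
        if (r.server_state != ServerState::Over) all_over = false;
        if (!finished(r)) all_finished = false;
    }
    for (Result& r : wu.results) {
        bool is_canonical =
            wu.canonical_resultid != 0 && r.id == wu.canonical_resultid;
        // Keep the canonical result's files while other results are outstanding.
        if (finished(r) && (all_finished || !is_canonical)) {
            r.file_delete_state = Stage::Ready;
        }
    }
    wu.assimilate_state = Stage::Done;
    if (all_over) wu.file_delete_state = Stage::Ready;
}
```

<p>
A project would pass its own handler, for example one that copies the canonical result into a science database
when canonical_resultid is set and logs the workunit when error_mask is nonzero.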
";
page_tail();
?>