PyBOINC: simplified BOINC application development in Python
This is a proposed design for making the development of BOINC applications as simple as possible. PyBOINC provides a master/slave model: the master runs on the server, and the slave runs as jobs distributed to client hosts.
Here's an example, which sums the squares of integers from 1 to 100. The application consists of three files. The first, app_types.py, defines the input and output types:
class Input:
    def __init__(self, arg):
        self.value = arg

class Output:
    def __init__(self, arg):
        self.value = arg
The second file, app_master.py, is the master program:
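from app_types import Input
# pyboinc_call() and pyboinc_master() are provided by PyBOINC (see the pseudocode below)

def make_calls():
    # submit one job per integer from 1 to 100
    for i in range(1, 101):
        input = Input(i)
        pyboinc_call('app_slave.py', input)

def handle_result(output):
    # accumulate the squares returned by the slaves
    global total
    total += output.value

total = 0
pyboinc_master(make_calls, handle_result)
print("The answer is %d" % total)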
The third file, app_slave.py, is the slave program:
from app_types import Output
# pyboinc_get_input() and pyboinc_return_output() are provided by PyBOINC

input = pyboinc_get_input()
output = Output(input.value * input.value)
pyboinc_return_output(output)
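As a sanity check on the example, the value the master should print can be computed locally without BOINC. The sketch below is not part of PyBOINC; the function names are made up for illustration. It compares a direct loop against the closed form n(n+1)(2n+1)/6 for n = 100.

```python
# Local check of the expected answer; no PyBOINC involved.

def sum_of_squares(n):
    """Sum of i*i for i = 1..n, one term per slave job in the example."""
    return sum(i * i for i in range(1, n + 1))

def closed_form(n):
    """Closed form n(n+1)(2n+1)/6 for the same sum."""
    return n * (n + 1) * (2 * n + 1) // 6

expected = sum_of_squares(100)
print(expected)  # 338350
```

If the distributed run prints a different number, the likely causes are a wrong loop range in make_calls() or results that were handled twice.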
The procedure for running this program is:
- Create a BOINC project
- Run a script ops/py_boinc.php that configures the project to use PyBOINC
- Set the environment variable PYBOINC_DIR to the root directory of the project
- Create a directory (anywhere) containing the above files
- In that directory, type
python app_master.py
- This command may take a long time. If it is aborted via ^C, it may be rerun later. In that case no new jobs are created, and the master waits for the remaining slaves to complete.
Implementation
PyBOINC uses a new table, 'batch', which represents a group of jobs. Its fields are:
- ID
- ID of user who submitted this batch
- path of 'batch directory'
PyBOINC uses the following files and subdirectories in the batch directory:
- pyboinc_checkpoint: if present, this contains the job ID (the ID of the batch record)
- new/: result files not yet handled
- old/: result files already handled
PyBOINC uses the Pickler and Unpickler classes from Python's pickle module for serialization.
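A minimal sketch of the serialization round trip, using only the standard pickle module. An in-memory buffer stands in for the input or output file, and a plain dict stands in for an Input object; instances of classes such as Input pickle the same way, provided the module that defines the class (app_types here) is importable where the data is loaded.

```python
import io
import pickle

# Writer side: dump a value into a binary stream.
# A real job would write to a file in binary mode instead of a BytesIO buffer.
buf = io.BytesIO()
pickle.Pickler(buf).dump({"value": 7})

# Reader side: rewind the stream and load the value back.
buf.seek(0)
restored = pickle.Unpickler(buf).load()
print(restored["value"])
```

Note that loading is done with Unpickler; the Pickler class only writes.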
The PyBOINC setup script creates an application 'pyboinc'. Its work units have two input files: a Python program and a data file. The application runs a Python interpreter on the program file. Its executable is a shell script on Linux/Mac and a batch file on Windows, which runs the Python interpreter with the client code:
python app_client.py
- Question: what if a Python interpreter is not present on a Windows machine? Does Python's license allow distributing the interpreter?
PyBOINC uses the following daemons:
- validator: uses the sample bitwise validator (need to check whether Python produces identical floating-point results on different platforms)
- assimilator: uses a variant of sample_assimilator. Given a completed result, it looks up the batch record, then copies the output file to BATCH_DIR/new/
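The copy step of the assimilator could look roughly like the sketch below. It is written in Python for illustration only, and the function name assimilate_output is hypothetical. The batch record lookup is omitted: batch_dir is assumed to be the path already read from the batch record. Copying to a temporary name first and then renaming is an added assumption, intended to keep the master from reading a half-written file in new/.

```python
import os
import shutil

def assimilate_output(output_path, batch_dir):
    """Copy one completed result's output file into BATCH_DIR/new/."""
    new_dir = os.path.join(batch_dir, "new")
    os.makedirs(new_dir, exist_ok=True)

    name = os.path.basename(output_path)
    # Stage the copy outside new/ so the master never sees a partial file.
    tmp_path = os.path.join(batch_dir, name + ".tmp")
    shutil.copyfile(output_path, tmp_path)

    # Rename into place; os.replace is atomic within one filesystem.
    final_path = os.path.join(new_dir, name)
    os.replace(tmp_path, final_path)
    return final_path
```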
Pseudocode for the various PyBOINC functions:
static jobID

pyboinc_call(slave_filename, input)
    create a uniquely-named file x in the download hierarchy; the file name should contain the batch ID
    Pickler(x).dump(input)
    create_work()

pyboinc_master(make_calls, handle_result)
    read jobID from pyboinc_checkpoint
    if none
        create a batch record; jobID = its ID
        make_calls()
        write jobID to checkpoint file
    move all files from old/ to new/
    while (not all jobs done)
        if there is a file x in new/
            output = Unpickler(x).load()
            handle_result(output)
            move x to old/
        else
            sleep(1)

pyboinc_get_input()
    boinc_resolve_filename("input", infile)
    return Unpickler(infile).load()

pyboinc_return_output(output)
    boinc_resolve_filename("output", outfile)
    Pickler(outfile).dump(output)
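The result-handling loop of pyboinc_master can be tried out locally with the sketch below. It covers only the file handling in the batch directory. The function name run_master_loop and its parameters are hypothetical, and "all jobs done" is approximated by counting handled files against n_jobs, whereas a real implementation would presumably check job state in the BOINC database.

```python
import os
import pickle
import time

def run_master_loop(batch_dir, handle_result, n_jobs, poll_seconds=1.0):
    """Handle result files in batch_dir/new/ until n_jobs have been processed."""
    new_dir = os.path.join(batch_dir, "new")
    old_dir = os.path.join(batch_dir, "old")
    os.makedirs(new_dir, exist_ok=True)
    os.makedirs(old_dir, exist_ok=True)

    # On (re)start, replay results handled by an earlier run,
    # because the master's in-memory state was lost when it stopped.
    for name in os.listdir(old_dir):
        os.replace(os.path.join(old_dir, name), os.path.join(new_dir, name))

    handled = 0
    while handled < n_jobs:
        names = sorted(os.listdir(new_dir))
        if not names:
            time.sleep(poll_seconds)  # nothing to do yet; poll again later
            continue
        for name in names:
            path = os.path.join(new_dir, name)
            with open(path, "rb") as f:
                output = pickle.Unpickler(f).load()
            handle_result(output)
            # Mark the result as handled by moving it to old/.
            os.replace(path, os.path.join(old_dir, name))
            handled += 1
    return handled
```

Moving files from old/ back to new/ at startup is what makes an aborted master restartable: handle_result sees every result again, so an accumulator such as the running sum in the example is rebuilt from scratch.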