svn path=/trunk/boinc/; revision=562
David Anderson 2002-11-03 19:11:05 +00:00
parent cca13c496d
commit 438be0cf44
3 changed files with 176 additions and 1 deletions


@ -12,7 +12,8 @@ BOINC's abstractions of data and computation.
<li><a href=project.html>Anatomy of a BOINC project</a>
<li><a href=parallelize.html>What applications are suitable for BOINC?</a>
<li><a href=files.html>Files and file references</a>
<li><a href=app.html>Platforms, applications, and versions</a>
<li><a href=platform.html>Platforms</a>
<li><a href=app.html>Applications and versions</a>
<li><a href=work.html>Workunits</a>
<li><a href=result.html>Results</a>
<li><a href=batch.html>Batches</a>

115
doc/platform.html Normal file

@ -0,0 +1,115 @@
<title>Platforms</title>
<body bgcolor=ffffff>
<h2>Platforms</h2>
<b>Goals</b>
<p>
BOINC is intended to accommodate participant hosts
with a wide range of operating systems and hardware architectures.
For example, the hosts may run many versions of Windows
(95, 98, ME, 2000, XP) on many processors
(486, Pentium, AMD) with many architectural extensions
(ATI or NVidia graphics coprocessors).
BOINC addresses the following goals:
<ul>
<li> The system should avoid sending code that the host can't execute.
<li> Applications should be able to exploit specific architectural features,
i.e. to use nonstandard instructions or coprocessors if available.
<li>
Enough architectural information should be stored on the server
so that statistics can be broken down
according to specific features.
<li> Simplicity.
The combinatorial explosion of versions and architectures should be
excluded from the internals of BOINC.
</ul>
<b>Design</b>
<p>
A <b>platform</b> is a compilation target.
A set of platforms is maintained in the BOINC database of each project.
Each platform has a <b>name</b> and a <b>description</b> of
the range of architectures it can handle.
Each BOINC program (core client and application) is linked to a platform.
<p>
At a minimum, a platform is a combination
of a CPU architecture and an operating system.
Examples might include:
<p>
<table cellpadding=8 border=1>
<tr><th>name</th><th>description</th></tr>
<tr><td>windows_intelx86</td><td>Microsoft Windows (95 or later) running on an Intel x86-compatible processor</td></tr>
<tr><td>linux_xxx</td><td>Linux running on an Intel x86-compatible processor</td></tr>
<tr><td>macos_ppc</td><td>Mac OS 9.0 or later running on Motorola PowerPC</td></tr>
<tr><td>sparc_solaris</td><td>Solaris 2.1 or later running on a SPARC-compatible processor</td></tr>
</table>
<p>
The name of a platform should specify a particular version (e.g. of an OS)
only if it uses features new to that version.
For example, the platform <b>sparc_solaris2.8</b> should apply
ONLY to SPARC machines running Solaris 2.8 or greater.
<p>
For simplicity, platforms are assumed to be mutually exclusive:
i.e. an application for platform X is not assumed to work
on a host running core client platform Y if X and Y differ.
The BOINC scheduling server will send work to a host only
if there is an application version for the same platform.
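<p>
The matching rule above can be sketched as follows.
This is an illustration only, written in Python for brevity:
<b>AppVersion</b> and <b>pick_app_version</b> are hypothetical names, not part of BOINC,
and choosing the newest matching version is an assumption.
The only behavior taken from the text is that platform names must match exactly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AppVersion:
    """Hypothetical record: one build of an application for one platform."""
    app_name: str
    platform: str  # platform name, e.g. "windows_intelx86"
    version: int


def pick_app_version(host_platform: str, app_name: str,
                     versions: List[AppVersion]) -> Optional[AppVersion]:
    """Return a version of app_name built for exactly host_platform, or None.

    Platforms are treated as mutually exclusive, so no compatibility is
    inferred between differently named platforms. Picking the highest
    version number among the matches is an assumption of this sketch.
    """
    candidates = [v for v in versions
                  if v.app_name == app_name and v.platform == host_platform]
    return max(candidates, key=lambda v: v.version, default=None)
```

A host whose core client platform has no matching application version gets <b>None</b>, i.e. no work is sent.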
<p>
There should be as few platforms as possible.
For example, suppose that there are both Solaris2.6
and Solaris2.7 platforms.
Then any host running the Solaris2.6 core client will
only be able to run Solaris2.6 applications.
Application developers will have to create versions
for both 2.6 and 2.7, even if they're identical.
<p>
<b>Handling architectural diversity</b>
<p>
BOINC allows applications to exploit specific architectures,
but shifts the burden of recognizing the architecture
to the application developer.
<p>
In other words, if you want to make a version of your application
that can use the AMD 3DNow instruction set,
<i>don't</i> create a new <b>windows_amd_3dnow</b> platform.
Instead, make a version for the <b>windows_intelx86</b> platform
that recognizes when it's running on a 3DNow machine,
and branches to the appropriate code.
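<p>
A minimal sketch of this kind of run-time branching, in Python for brevity.
How the CPU features are detected is outside the sketch;
the feature string "3dnow" and all function names here are hypothetical,
and the "accelerated" routine is only a stand-in that returns the same value as the generic one.

```python
from typing import Callable, Iterable, Sequence


def dot_generic(a: Sequence[float], b: Sequence[float]) -> float:
    """Portable code path that works on every host of the platform."""
    return sum(x * y for x, y in zip(a, b))


def dot_3dnow(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in for a code path that would use 3DNow instructions.

    A real application would call hand-tuned native code here; this
    placeholder only has to produce the same result as dot_generic.
    """
    return sum(x * y for x, y in zip(a, b))


def select_kernel(cpu_features: Iterable[str]) -> Callable[..., float]:
    """Pick the code path once, at startup, from the detected features."""
    if "3dnow" in set(cpu_features):
        return dot_3dnow
    return dot_generic
```

The same binary is shipped for the whole <b>windows_intelx86</b> platform; only the branch taken differs from host to host.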
<p>
It is, however, desirable to report architecture back to the
BOINC server.
This makes it possible, for example, to report average or total
performance statistics for 3DNow hosts contrasted
with other Intel-compatible hosts.
This is done using the <b>boinc_architecture()</b>
function from <a href=api.html>the BOINC API</a>.
This passes a string (project-specific, but typically in XML)
to the core client, which records it in the
<b>architecture_xml</b> field of the <b>result</b> database record.
For example, the application might pass a description like
<pre>
<
</pre>
<b>Avoiding platform anarchy</b>
<p>
Each BOINC project is free to create its own platforms.
To avoid anarchy, however,
<b>Tools</b>
Each
Platforms are maintained in the <b>platform</b> table in the BOINC DB,
and can be created using the <a href=tools_other.html>add</a> utility.
<p>

59
doc/sequence.html Executable file

@ -0,0 +1,59 @@
Long computations
Heartbeat: make sure that host is still working on it
Termination: check results, stop computation if needed
Restart: restart sequence on another host if first fails
optional:
Rate adjustment: if host is too slow, move sequence elsewhere
Redundancy checking: relocate sequence if error
---------------
A sequence is a method of executing lengthy computations, usually on
the order of one or more months. A sequence is represented by a chain
of results. Each result has a successor, except for the last one in
the chain.
Calculation of a sequence begins when a host is assigned the first
result in the chain. The host computes and uploads the data for the
result, and is assigned the next result in the sequence. If the host
is assigned several successive elements in the sequence all at once, it
can start processing result N+1 before finishing the upload of result
N, thereby always keeping the processor busy.
If a result is marked as restartable, the host must upload data for
that result which is sufficient to restart the computation at that
point on a different host. It is up to the work generator to ensure
that this is the case by setting the upload flag in the proper file
info objects. The first result in the sequence is by definition
restartable. If a host completes result N+1 before finishing the
upload of data for result N, the data for both results will be uploaded.
The server has the ability to terminate a sequence prematurely. A host
will receive credit for the portion of the sequence that was
completed. If a host fails to return a result in the sequence before
its deadline passes, the sequence will be reassigned to a new host,
starting at the last available restartable result.
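
The reassignment rule above can be sketched as follows (Python, illustration
only). The Result fields and the function name are hypothetical, and the
sketch assumes "available" means that the server already holds the uploaded
data for that result.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    """Hypothetical view of one element of a sequence."""
    index: int
    restartable: bool
    uploaded: bool  # assumed meaning: the server holds this result's data


def last_restart_point(chain: List[Result]) -> int:
    """Index of the last restartable result whose data is on the server.

    Falls back to the first result, which is restartable by definition.
    """
    for result in reversed(chain):
        if result.restartable and result.uploaded:
            return result.index
    return chain[0].index
```

When a deadline passes, the server would hand the sequence to a new host
beginning from the index returned here.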
Examples:
climateprediction.net (known computation length, large state files)
Let's say a simulation takes 6 months. Suppose we want a small
progress report from the user every 3 days, so we generate 6*30/3 = 60
results per sequence. The scheduling server ensures that each host has
at least 2 elements of the sequence at a time, so that it doesn't have
to wait for data upload in order to continue. If we want a full state
save every 3 weeks, we make every 7th result restartable and set the
XML file infos so that the large state files will be uploaded.
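
The arithmetic in this example can be checked with a short sketch (Python,
illustration only; the function name and the 1-based numbering of results are
assumptions of the sketch, and a month is taken as 30 days as in the text).

```python
from typing import List, Tuple


def plan_sequence(total_days: int, report_every_days: int,
                  state_save_every_days: int) -> Tuple[int, int, List[int]]:
    """Return (number of results, restart stride, restartable result numbers).

    Results are numbered from 1. The first result is always included,
    since the first result in a sequence is restartable by definition.
    """
    n_results = total_days // report_every_days
    stride = state_save_every_days // report_every_days
    restartable = sorted({1} | set(range(stride, n_results + 1, stride)))
    return n_results, stride, restartable


n_results, stride, restartable = plan_sequence(6 * 30, 3, 21)
print(n_results, stride, restartable)
# prints: 60 7 [1, 7, 14, 21, 28, 35, 42, 49, 56]
```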
Folding@home (unknown computation length, small state files)
For Folding@home things would be slightly different, since we don't
know in advance how long a computation will take. The server generates
a large group of trajectory sequences, but only creates 2 or 3 results
in each sequence. The backend work generator periodically checks how
much of each sequence has been completed, and extends any sequences
that are nearing completion unless it has been decided to permanently
terminate them (e.g. because a more promising trajectory has been
found).
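
One periodic pass of such a work generator might look like the sketch below
(Python, illustration only). The Sequence fields, the low_water threshold and
the batch size are all hypothetical; the text only says that sequences
nearing completion are extended unless they have been terminated.

```python
from dataclasses import dataclass


@dataclass
class Sequence:
    """Hypothetical bookkeeping for one trajectory sequence."""
    created: int      # results created so far
    completed: int    # results already returned by the host
    terminated: bool = False


def extend_if_needed(seq: Sequence, low_water: int = 1, batch: int = 2) -> int:
    """Append `batch` results if at most `low_water` remain unfinished.

    Returns the number of results added (0 if nothing was done).
    """
    if seq.terminated:
        return 0
    if seq.created - seq.completed > low_water:
        return 0
    seq.created += batch
    return batch
```

Keeping each sequence only a few results ahead of the host lets the project
stop a trajectory at any time without having created much unused work.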