diff --git a/doc/create_project.html b/doc/create_project.html index d8a729ce33..0d0625b60f 100644 --- a/doc/create_project.html +++ b/doc/create_project.html @@ -12,7 +12,8 @@ BOINC's abstractions of data and computation.
+BOINC is intended to accommodate participant hosts +with a wide range of operating systems and hardware architectures. +For example, the hosts may run many versions of Windows +(95, 98, ME, 2000, XP) on many processors +(486, Pentium, AMD) with many architectural extensions +(ATI or NVidia graphics coprocessors). +BOINC addresses the following goals: +
+A platform is a compilation target. +A set of platforms is maintained in the BOINC database of each project. +Each platform has a name and a description of +the range of architectures it can handle. +Each BOINC program (core client and application) is linked to a platform. +
+At the minimum, a platform is a combination +of a CPU architecture and an operating system. +Examples might include: + +
+
name | description |
---|---|
windows_intelx86 | Microsoft Windows (95 or later) running on an Intel x86-compatible processor |
linux_xxx | Linux running on an Intel x86-compatible processor |
macos_ppc | Mac OS 9.0 or later running on Motorola PowerPC |
sparc_solaris | Solaris 2.1 or later running on a SPARC-compatible processor |
+ +The name of a platform should specify a particular version (e.g. of an OS) +only if it uses features new to that version. +For example, the platform sparc_solaris2.8 should apply +ONLY to SPARC machines running Solaris 2.8 or greater. + +
+For simplicity, platforms are assumed to be mutually exclusive: +i.e. an application for platform X is not assumed to work +on a host running core client platform Y if X <> Y. +The BOINC scheduling server will send work to a host only +if there is an application version for the same platform. + +
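The exact-match rule above can be sketched as follows. This is a minimal illustration, not the BOINC scheduler's actual code: the `AppVersion` record, the `pick_app_version` helper, and the choice to prefer the highest version number are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppVersion:
    """Hypothetical record: one build of an application for one platform."""
    app_name: str
    platform: str
    version: int


def pick_app_version(host_platform, app_name, app_versions):
    """Return a version of app_name built for exactly host_platform, or None.

    Platforms are treated as mutually exclusive, so names must match exactly.
    Preferring the highest version number is an assumption of this sketch.
    """
    candidates = [
        v for v in app_versions
        if v.app_name == app_name and v.platform == host_platform
    ]
    return max(candidates, key=lambda v: v.version, default=None)


# Illustrative data only; "demo_app" is a made-up application name.
versions = [
    AppVersion("demo_app", "windows_intelx86", 3),
    AppVersion("demo_app", "windows_intelx86", 4),
    AppVersion("demo_app", "sparc_solaris", 4),
]
```

A host whose core client platform has no matching entry gets `None`, which corresponds to the server sending it no work for that application.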
+There should be as few platforms as possible. +For example, suppose that there are both Solaris2.6 +and Solaris2.7 platforms. +Then any host running the Solaris2.6 core client will +only be able to run Solaris2.6 applications. +Application developers will have to create versions +for both 2.6 and 2.7, even if they're identical. + +
+Handling architectural diversity + +
+BOINC allows applications to exploit specific architectures, +but shifts the burden of recognizing the architecture +to the application developer. + +
+In other words, if you want to make a version of your application +that can use the AMD 3DNow instruction set, +don't create a new windows_amd_3dnow platform. +Instead, make a version for the windows_intelx86 platform +that recognizes when it's running on a 3DNow machine, +and branches to the appropriate code. +
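The "recognize and branch" approach described above can be sketched as a dispatch on detected CPU features. The feature set is passed in as a parameter because how features are detected is outside the scope of this text. The function names and the `"3dnow"` feature string are hypothetical, and the "3DNow" kernel here is only a stand-in that computes the same result as the generic one.

```python
def sum_squares_generic(values):
    """Portable code path that works on any x86-compatible host."""
    return sum(v * v for v in values)


def sum_squares_3dnow(values):
    """Stand-in for a code path that would use 3DNow instructions.

    In a real application this would call optimized native code; here it
    computes the same result so the sketch stays runnable.
    """
    return sum(v * v for v in values)


def select_kernel(cpu_features):
    """Pick a code path at run time from a set of feature names.

    cpu_features is a set of strings supplied by the caller; the feature
    name "3dnow" is an assumption of this sketch.
    """
    if "3dnow" in cpu_features:
        return sum_squares_3dnow
    return sum_squares_generic
```

A single windows_intelx86 build can then call `select_kernel(...)` once at startup and use the returned function for the rest of the run.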
+It is, however, desirable to report architecture back to the +BOINC server. +This makes it possible, for example, to report average or total +performance statistics for 3DNow hosts contrasted +with other Intel-compatible hosts. +This is done using the boinc_architecture() +function from the BOINC API. +This passes a string (project-specific, but typically in XML) +to the core client, which records it in the +architecture_xml field of the result database record. +For example, the application might pass a description like +
+ < ++ + +Avoiding platform anarchy +
+Each BOINC project is free to create its own platforms. +To avoid anarchy, however, + + +Tools +Each +Platforms are maintained in the platform table in the BOINC DB, +and can be created using the add utility. + +
diff --git a/doc/sequence.html b/doc/sequence.html new file mode 100755 index 0000000000..a6ed2e54d7 --- /dev/null +++ b/doc/sequence.html @@ -0,0 +1,59 @@ +Long computations + +Heartbeat: make sure that host is still working on it +Termination: check results, stop computation if needed +Restart: restart sequence on another host if first fails + +optional: + +Rate adjustment: if host is too slow, move sequence elsewhere +Redundancy checking: relocate sequence if error + +--------------- + + +A sequence is a method of executing lengthy computations, usually on +the order of one or more months. A sequence is represented by a chain +of results. Each result has a successor, except for the last one in +the chain. + +Calculation of a sequence begins when a host is assigned the first +result in the chain. The host computes and uploads the data for the +result, and is assigned the next result in the sequence. If the host +is assigned several successive elements in the sequence all at once, it +can start processing result N+1 before finishing the upload of result +N, thereby always keeping the processor busy. + +If a result is marked as restartable, the host must upload data for +that result which is sufficient to restart the computation at that +point on a different host. It is up to the work generator to ensure +that this is the case by setting the upload flag in the proper file +info objects. The first result in the sequence is by definition +restartable. If a host completes result N+1 before finishing the +upload of data for result N, the data for both results will be uploaded. + +The server has the ability to terminate a sequence prematurely. A host +will receive credit for the portion of the sequence that was +completed. If a host fails to return a result in the sequence before +its deadline passes, the sequence will be reassigned to a new host, +starting at the last available restartable result. 
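The reassignment rule above (restart at the last available restartable result) can be sketched as a backward scan over the chain. This is an illustration under stated assumptions, not BOINC code: the two boolean lists and the function name are hypothetical, and treating result 0 as always available, because it is restartable by definition, is this sketch's interpretation.

```python
def last_restart_point(restartable, uploaded):
    """Return the index of the last available restartable result.

    restartable[i] is True if result i was marked restartable;
    uploaded[i] is True if its restart data reached the server.
    Result 0 is treated as always available (an assumption of this
    sketch), so the function falls back to index 0.
    """
    for i in range(len(restartable) - 1, 0, -1):
        if restartable[i] and uploaded[i]:
            return i
    return 0
```

For example, with restartable results at indices 0 and 2, a host that missed its deadline after uploading result 2 leads to a restart point of 2. If result 2 was never uploaded, the sequence falls back to index 0.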
+ +Examples: +climateprediction.net (known computation length, large state files) +Let's say a simulation takes 6 months. Suppose we want a small +progress report from the user every 3 days, so we generate 6*30/3 = 60 +results per sequence. The scheduling server ensures that each host has +at least 2 elements of the sequence at a time, so that it doesn't have +to wait for data upload in order to continue. If we want a full state +save every 3 weeks, we make every 7th result restartable and set the +XML file infos so that the large state files will be uploaded. + +Folding@home (unknown computation length, small state files) +For Folding@home things would be slightly different, since we don't +know in advance how long a computation will take. The server generates +a large group of trajectory sequences, but only creates 2 or 3 results +in each sequence. The backend work generator periodically checks how +much of each sequence has been completed, and extends any sequences +that are nearing completion unless it has been decided to permanently +terminate them (e.g. because a more promising trajectory has been +found).
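The arithmetic in the climateprediction.net example above can be sketched as a small planning helper. The function name and parameters are hypothetical, and a month is taken as 30 days, as in the text. Marking index 0 restartable follows the statement that the first result is restartable by definition; counting "every 7th" as positions 7, 14, 21, ... (1-based) is an assumption of this sketch.

```python
def plan_sequence(total_days, report_interval_days, state_save_interval_days):
    """Return (number of results, restartable stride, restartable flags).

    One result is generated per progress report; a result is flagged
    restartable when it falls on a full-state-save boundary.
    """
    n_results = total_days // report_interval_days
    stride = state_save_interval_days // report_interval_days
    flags = [
        i == 0 or (i + 1) % stride == 0  # first result, then every stride-th
        for i in range(n_results)
    ]
    return n_results, stride, flags


# 6 months of 30 days, a report every 3 days, a full state save every 21 days.
n_results, stride, flags = plan_sequence(6 * 30, 3, 21)
```

With these inputs the helper yields 60 results and a stride of 7, matching the figures in the example.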