<?xml version="1.0"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<sect1 id="concepts">
<title>General concepts</title>
<sect2>
<title>Programming model</title>
<para>
DC-API applications consist of two major components: a master application
and one or more client applications. The master is responsible for
dividing the global input data into smaller chunks and distributing these
chunks in the form of work units. Interpreting the output generated by the
work units and combining them to form a global output is also the job of
the master.
</para>
<para>
The master application usually runs as a daemon, but it is also possible
to write a master that runs periodically (e.g. from
<application>cron</application>), processes the outstanding events, and
exits.
</para>
<para>
Client applications are simple sequential programs that take their input
from the master, perform some computation on it and produce some output.
</para>
<sect3>
<title>Writing a master application</title>
<para>
A typical master application performs the following steps:
</para>
<itemizedlist>
<listitem>
<para>
Initializes the DC-API library by calling the <function><link
linkend="DC-initMaster">DC_initMaster()</link></function>
function.
</para>
</listitem>
<listitem>
<para>
Calls the <function><link
linkend="DC-setResultCb">DC_setResultCb()</link></function>
function and optionally some of the <function><link
linkend="DC-setSubresultCb">DC_setSubresultCb()</link></function>,
<function><link
linkend="DC-setMessageCb">DC_setMessageCb()</link></function>,
<function><link
linkend="DC-setSuspendCb">DC_setSuspendCb()</link></function>
and <function><link
linkend="DC-setValidateCb">DC_setValidateCb()</link></function>
functions, depending on the features (messaging, subresults etc.)
it wants to use.
</para>
</listitem>
<listitem>
<para>
In its main loop, the master calls the <function><link
linkend="DC-createWU">DC_createWU()</link></function> function
to create new work units when needed. If the total number of work
units is small (depending on the grid infrastructure), then the
master may also create all the work units in advance. If the total
number of work units is too large for this, the master may use the
<function><link
linkend="DC-getWUNumber">DC_getWUNumber()</link></function>
function to determine the number of running work units, and create
new work units only if this number falls below a certain threshold.
</para>
</listitem>
<listitem>
<para>
Also in its main loop, the master calls the <function><link
linkend="DC-processMasterEvents">DC_processMasterEvents()</link></function>
function, which checks for outstanding events and invokes the
appropriate callbacks.
</para>
<para>
Alternatively, the master may use the <function><link
linkend="DC-waitMasterEvent">DC_waitMasterEvent()</link></function>
and <function><link
linkend="DC-waitWUEvent">DC_waitWUEvent()</link></function>
functions instead of <function><link
linkend="DC-processMasterEvents">DC_processMasterEvents()</link></function>
if it prefers to receive event structures instead of using
callbacks.
</para>
</listitem>
</itemizedlist>
</sect3>
<sect3>
<title>Writing a client application</title>
<para>
A typical client application performs the following steps:
</para>
<itemizedlist>
<listitem>
<para>
Initializes the DC-API library by calling the <function><link
linkend="DC-initClient">DC_initClient()</link></function>
function.
</para>
</listitem>
<listitem>
<para>
Identifies the location of its input/output files by calling the
<function><link
linkend="DC-resolveFileName">DC_resolveFileName()</link></function>
function.
<note>
<para>
The client application must not assume that it can read, create or
write any files other than those whose names are returned by
<function><link
linkend="DC-resolveFileName">DC_resolveFileName()</link></function>.
</para>
</note>
</para>
</listitem>
<listitem>
<para>
During the computation, the client should periodically call the
<function><link
linkend="DC-checkClientEvent">DC_checkClientEvent()</link></function>
function and process the received events.
</para>
</listitem>
<listitem>
<para>
If possible, the client should call the <function><link
linkend="DC-fractionDone">DC_fractionDone()</link></function>
function with the fraction of the work completed. On some grid
infrastructures (e.g. BOINC) this allows the client's supervisor
process to show the progress of the application to the user.
</para>
<para>
Ideally the value passed to the <function><link
linkend="DC-fractionDone">DC_fractionDone()</link></function>
function should be proportional to the time elapsed so far compared
to the total time that will be needed to complete the computation.
</para>
</listitem>
<listitem>
<para>
The client should call the <function><link
linkend="DC-finishClient">DC_finishClient()</link></function>
function at the end of the computation. As a result all output files
will be sent to the master and the master will be notified about the
completion of the work unit.
</para>
</listitem>
</itemizedlist>
</sect3>
</sect2>
<sect2>
<title>Messaging</title>
<para>
The DC-API provides limited messaging functionality between the master
application and the clients. This messaging facility has the following
features and restrictions:
<itemizedlist>
<listitem>
<para>
Messages are not reliable in the sense that if the client is not
actually running when a message is being sent to it (e.g. because it
is queued by the backend grid infrastructure), then the message may
be silently dropped.
</para>
</listitem>
<listitem>
<para>
The ordering of messages is not necessarily maintained.
</para>
</listitem>
<listitem>
<para>
Messages are delivered asynchronously. There is no upper bound on the
time that may elapse before a message is actually delivered.
</para>
</listitem>
</itemizedlist>
</para>
<para>
Due to the above restrictions, DC-API messages are not suitable for
message-based parallel processing. They are meant for sending short status
messages about long-running operations, or for sending control messages
like a command to cancel a given computation.
</para>
</sect2>
<sect2>
<title>Checkpointing</title>
</sect2>
</sect1>
<!-- vim: set ai sw=2 tw=80: -->