<?xml version="1.0"?>
<!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
<sect1 id="condor">
<title>CONDOR</title>
<sect2 id="condor_app">
<title>Application in Condor environment</title>
<para>
A master-worker application developed with DC-API can be run in a
Condor environment. The master program must be started by hand; it
then submits workunits to a Condor execution pool.
</para>
<para>
All files generated by the application, that is, by the master
program, the worker programs and the DC-API library itself, are placed
under a directory called the <emphasis>working directory</emphasis>.
</para>
</sect2>
<sect2 id="condor_environment">
<title>Condor environment</title>
<para>
To execute a DC-API application using the Condor version of the
DC-API library, you have to set up a Condor environment and have
access to it.
</para>
<para>
The master program of the application must be started on a Condor
submit host so that it can submit workunits as Condor jobs.
</para>
<para>
The working directory of the application must be accessible to both
the master and the worker processes, so it should be placed on a
shared filesystem (e.g. NFS) that is available on the submit and
execution hosts of the Condor pool.
</para>
</sect2>
<sect2 id="condor_required">
<title>Required tools</title>
<para>
To compile the application with the Condor version of the DC-API
library you need an additional library, <filename
class="libraryfile">libcondorapi.a</filename>, which is included in the
Condor installation. This library must be linked into the application
in addition to the DC-API library.
<caution><title>Do not use Condor's lib directory</title>
<para>
Do not pass Condor's lib directory to the linker when
building the application. For example, do not use the option:
<example><title>Linker option</title>
<programlisting>
... -L$CONDOR_HOME/lib ...
</programlisting>
</example>
Instead, copy the <filename
class="libraryfile">libcondorapi.a</filename> file to another directory
and pass that directory to the linker's -L option.
</para>
</caution>
</para>
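<para>
The steps described in the caution can be sketched as follows. This is
only an illustration: the directory
<filename>$HOME/condor-link</filename> is a hypothetical location
chosen for the example, and the elided parts of the link command
depend on the application and on how the DC-API library is installed.
</para>
<example><title>Linking against a private copy of libcondorapi.a (sketch)</title>
<programlisting>
mkdir -p $HOME/condor-link
cp $CONDOR_HOME/lib/libcondorapi.a $HOME/condor-link/
... -L$HOME/condor-link -lcondorapi ...
</programlisting>
</example>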
</sect2>
<sect2 id="condor_configuration">
<title>Configuration options</title>
<variablelist>
<varlistentry>
<term>InstanceUUID</term>
<listitem>
<para>
REQUIRED. Identifier of the running instance of the
application. For the Condor backend it can be any string,
not just a UUID.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>WorkingDirectory</term>
<listitem>
<para>
REQUIRED. Name of the working directory of the
application. All files that are generated by the
application or the DC-API library are placed under this
directory. Different applications can use the same working
directory because every instance has its own subdirectory
there.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>ClientMessageBox</term>
<listitem>
<para>
Name of the directory in the workunit's working directory
where messages sent by the client to the master with
<function><link
linkend="DC-sendMessage">DC_sendMessage()</link></function> are
placed. Default value is <filename>_dcapi_client_messages</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>MasterMessageBox</term>
<listitem>
<para>
Name of the directory in the workunit's working directory
where <function><link
linkend="DC-sendWUMessage">DC_sendWUMessage()</link></function>
places messages sent by the master to the client. Default
value is <filename>_dcapi_master_messages</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>SubresultBox</term>
<listitem>
<para>
Name of the directory in the workunit's working directory
where <function><link
linkend="DC-sendResult">DC_sendResult()</link></function>
places subresults generated by the client. Default
value is <filename>_dcapi_client_subresults</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>SystemMessageBox</term>
<listitem>
<para>
Name of the directory in the workunit's working directory
where the master and client programs place management
messages, for example when the master asks the client to
suspend and the client sends back an acknowledgement.
Default value is <filename>_dcapi_system_messages</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>SubmitFile</term>
<listitem>
<para>
Name of the file in the workunit's working directory that is
generated by the master and passed to Condor as the submit
description when a workunit is ready to start. Default value
is <filename>_dcapi_condor_submit.txt</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Executable</term>
<listitem>
<para>
Name of the executable file of the client (workunit). By
default it is the value of the <parameter>clientName</parameter>
parameter passed to <function><link
linkend="DC-createWU">DC_createWU()</link></function>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>LeaveFiles</term>
<listitem>
<para>
Specifies whether the files and directories generated in the
workunit's working directory are deleted after the workunit
ends. Zero means they are deleted; a non-zero value means
they are kept. Default value is 0.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>CondorLog</term>
<listitem>
<para>
Name of the file in the workunit's working directory where
Condor writes records about events that happen to the Condor
job. Default value is
<filename>_dcapi_internal_log.txt</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>CheckpointFile</term>
<listitem>
<para>
Name of the file in the workunit's working directory where
the client writes checkpoint information.
<function><link
linkend="DC-resolveFileName">DC_resolveFileName()</link></function>
will resolve <link
linkend="DC-CHECKPOINT-FILE:CAPS">DC_CHECKPOINT_FILE</link>
to this name. Default value is
<filename>_dcapi_checkpoint</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>SavedOutputs</term>
<listitem>
<para>
Name of the directory in the workunit's working directory where
the workunit's standard output is saved when it is
suspended. Default value is
<filename>_dcapi_saved_output</filename>. There is no
facility in the DC-API yet to merge the saved outputs together.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>CondorSubmitTemplate</term>
<listitem>
<para>
Name of the file used as a template when generating the
Condor submit file. If not specified, a built-in template
is used. The % character can be used to include variable data
in the generated file. Recognized % instructions:
<variablelist>
<varlistentry>
<term>%%</term>
<listitem>
<para>
Literal %.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%d</term>
<listitem>
<para>
Current date and time.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%n</term>
<listitem>
<para>
Name of the workunit.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%i</term>
<listitem>
<para>
Internal ID of the workunit.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%w</term>
<listitem>
<para>
Name of working directory of the workunit.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%c</term>
<listitem>
<para>
Client name.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%r</term>
<listitem>
<para>
Number of arguments.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%x</term>
<listitem>
<para>
Name of the executable.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%a</term>
<listitem>
<para>
Argument list.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%u</term>
<listitem>
<para>
Condor universe (always vanilla).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%o</term>
<listitem>
<para>
File for standard output of the job.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%e</term>
<listitem>
<para>
File for standard error of the job.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%l</term>
<listitem>
<para>
File for Condor user log.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%I</term>
<listitem>
<para>
Comma-separated list of input files (physical file names with path). The letter is a capital 'i'.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>%O</term>
<listitem>
<para>
Comma-separated list of output files.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>SubmitRetry</term>
<listitem>
<para>
Number of times DC-API tries to submit a job that cannot be
submitted before giving up and reporting it as failed. Default
value is 5.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>SubmitRetrySleepTime</term>
<listitem>
<para>
Initial sleep period, in seconds, between job submission retries.
Default value is 2. The period is doubled after each retry: 2 seconds
of sleep before the first retry, 4 seconds before the second,
8 seconds before the third, and so on.
</para>
</listitem>
</varlistentry>
</variablelist>
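<para>
As an illustration of the % instructions listed for
<literal>CondorSubmitTemplate</literal>, a template might look like
the sketch below. It is not the built-in template of DC-API: the
selection of submit commands is an assumption made for this example,
and only standard Condor submit commands are used.
</para>
<example><title>Condor submit template (sketch)</title>
<programlisting>
# Generated on %d for workunit %n (internal ID %i)
universe = %u
executable = %x
arguments = %a
initialdir = %w
output = %o
error = %e
log = %l
transfer_input_files = %I
transfer_output_files = %O
queue
</programlisting>
</example>
<para>
Such a template could then be referenced from the configuration of
the master. The fragment below assumes a key = value configuration
file with a <literal>[Master]</literal> group; the group name, the
paths and the <literal>InstanceUUID</literal> value are hypothetical
and serve only to show how the options described above fit together.
</para>
<example><title>Configuration fragment (sketch)</title>
<programlisting>
[Master]
InstanceUUID = my-condor-test
WorkingDirectory = /shared/dcapi/workdir
CondorSubmitTemplate = /shared/dcapi/submit.template
SubmitRetry = 5
SubmitRetrySleepTime = 2
</programlisting>
</example>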
</sect2>
</sect1>
<!-- End of condor.xml -->