Compute model: workunits and results
A workunit describes a computation to be performed.
The attributes of a workunit include:
- The application it's associated with.
- Its name (unique across all workunits in the project).
- A list of its input files.
Each has an input file association
that specifies the filename or file descriptor number
by which the application will refer to the file.
- The command-line arguments to be passed to the application.
- The environment variables to be set for the application.
- The estimated resource requirements of the workunit
(computation, memory, disk space, network traffic).
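The attributes above can be pictured as a plain record. The following is a minimal sketch only, not the project's actual schema: every class and field name (Workunit, InputFileAssociation, ResourceEstimate, WorkunitRegistry) is hypothetical, and the units of the resource estimates are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InputFileAssociation:
    """Binds one input file to the name or descriptor the application uses."""
    physical_name: str   # hypothetical: name of the file as stored by the project
    open_name: str = ""  # filename by which the application refers to the file
    fd: int = -1         # file descriptor number used instead (-1 means unused)


@dataclass
class ResourceEstimate:
    computation: float   # assumed unit: floating-point operations
    memory_bytes: int
    disk_bytes: int
    network_bytes: int


@dataclass
class Workunit:
    app_name: str                 # the application, not a specific version
    name: str                     # unique across all workunits in the project
    input_files: List[InputFileAssociation]
    command_line: List[str]
    env: Dict[str, str]
    estimate: ResourceEstimate


class WorkunitRegistry:
    """Hypothetical helper that enforces project-wide name uniqueness."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Workunit] = {}

    def add(self, wu: Workunit) -> None:
        if wu.name in self._by_name:
            raise ValueError(f"duplicate workunit name: {wu.name}")
        self._by_name[wu.name] = wu


registry = WorkunitRegistry()
registry.add(Workunit(
    app_name="example_app",
    name="wu_001",
    input_files=[InputFileAssociation("data_001.bin", open_name="in.dat")],
    command_line=["--fast"],
    env={"MODE": "test"},
    estimate=ResourceEstimate(1e12, 64_000_000, 10_000_000, 1_000_000),
))
```

Adding a second workunit named "wu_001" to the same registry raises ValueError, which mirrors the uniqueness requirement on workunit names.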
A workunit is associated with an application,
not with a particular version or set of versions.
If the format of your input data changes in a way that
is incompatible with older versions,
you must create a new application.
This can often be avoided by using XML-like data representations,
which allow new fields to be added without breaking older versions.
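As an illustration of why an XML-like input format tolerates change, here is a small sketch using Python's standard xml.etree.ElementTree module. The element names (input, steps, seed, tolerance) and the default values are invented for the example and do not come from any real application.

```python
import xml.etree.ElementTree as ET


def read_input(xml_text: str) -> dict:
    """Read only the fields this version knows about.

    Unknown elements are ignored and missing ones fall back to defaults,
    so older and newer input files can both be read.
    """
    root = ET.fromstring(xml_text)
    return {
        "steps": int(root.findtext("steps", default="100")),
        "seed": int(root.findtext("seed", default="0")),
    }


# Hypothetical input written for an earlier version of the application.
old_format = "<input><steps>500</steps></input>"

# Hypothetical later input with a field this reader does not know about.
new_format = (
    "<input><steps>500</steps><seed>7</seed>"
    "<tolerance>1e-6</tolerance></input>"
)

old_params = read_input(old_format)
new_params = read_input(new_format)
```

Both inputs parse without error: the older one picks up the default seed, and the extra tolerance element in the newer one is skipped. A fixed binary layout would typically not allow either.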
A result describes a particular instance of a computation,
either to be performed or already performed.
The attributes of a result include:
- The workunit it's associated with.
- Its name (unique across all results in the project).
- A list of its output files.
Each has an output file association
that specifies the filename or file descriptor number
by which the application will refer to the file.
- Its state. Values include:
  - Inactive (not ready to dispatch)
  - Unsent (ready to dispatch, but not dispatched)
  - In progress (dispatched, not done)
  - Done successfully
  - Timed out
  - Done with error
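The result attributes and states listed above can be sketched as a record with an enumerated state. The state names follow the list; the allowed transitions are an assumption inferred from the state descriptions, and the class and field names are hypothetical. Output files are simplified to plain names here, without the file association.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class ResultState(Enum):
    INACTIVE = "inactive"            # not ready to dispatch
    UNSENT = "unsent"                # ready to dispatch, but not dispatched
    IN_PROGRESS = "in progress"      # dispatched, not done
    DONE_SUCCESS = "done successfully"
    TIMED_OUT = "timed out"
    DONE_ERROR = "done with error"


# Assumption: transitions are inferred from the descriptions above,
# not taken from the scheduling server's actual logic.
ALLOWED: Dict[ResultState, Set[ResultState]] = {
    ResultState.INACTIVE: {ResultState.UNSENT},
    ResultState.UNSENT: {ResultState.IN_PROGRESS},
    ResultState.IN_PROGRESS: {
        ResultState.DONE_SUCCESS,
        ResultState.TIMED_OUT,
        ResultState.DONE_ERROR,
    },
}


@dataclass
class Result:
    workunit_name: str    # the workunit this result belongs to
    name: str             # unique across all results in the project
    output_files: List[str] = field(default_factory=list)
    state: ResultState = ResultState.INACTIVE

    def advance(self, new_state: ResultState) -> None:
        """Move to new_state, rejecting transitions outside ALLOWED."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(
                f"cannot go from {self.state.name} to {new_state.name}"
            )
        self.state = new_state


example = Result(workunit_name="wu_001", name="wu_001_0")
example.advance(ResultState.UNSENT)
example.advance(ResultState.IN_PROGRESS)
example.advance(ResultState.DONE_SUCCESS)
```

In this sketch the three final states have no outgoing transitions, so calling advance on a finished result raises ValueError.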
Several results may be associated with a single workunit.
Results may be generated in either of two ways
(selected as part of the application):
- Advance generation of results.
One or more result records are stored in the database
when the workunit is produced.
The scheduling server dispatches each result to a single participant host.
When all result records have been dispatched,
participant hosts are "turned away".
- On-demand generation of results.
The application specifies a "result template",
which has place-holder tokens for the output filenames.
The scheduling server, in response to a host request,
generates a new result record and sends the result template.
The host generates unique output filenames,
and returns them when the computation is done.
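The two generation modes can be contrasted with a toy model. This is a sketch under stated assumptions, not the scheduling server's implementation: the class names, the result naming scheme, and the "<UNIQUE>" placeholder token are all hypothetical, and the database is replaced by in-memory structures.

```python
import itertools
import uuid
from typing import Dict, List, Optional, Tuple


class AdvanceScheduler:
    """Advance generation: result records exist before any host asks."""

    def __init__(self, wu_name: str, n_results: int) -> None:
        # Records are created when the workunit is produced.
        self.unsent: List[str] = [f"{wu_name}_{i}" for i in range(n_results)]
        self.dispatched: Dict[str, str] = {}   # result name -> host id

    def request_work(self, host_id: str) -> Optional[str]:
        if not self.unsent:
            return None                        # host is "turned away"
        name = self.unsent.pop(0)
        self.dispatched[name] = host_id        # each result goes to one host
        return name


class OnDemandScheduler:
    """On-demand generation: a record is created for each host request."""

    def __init__(self, wu_name: str, template: List[str]) -> None:
        self.wu_name = wu_name
        self.template = template               # output names with placeholders
        self._counter = itertools.count()
        self.records: Dict[str, dict] = {}

    def request_work(self, host_id: str) -> Tuple[str, List[str]]:
        name = f"{self.wu_name}_{next(self._counter)}"
        self.records[name] = {"host": host_id, "outputs": None}
        return name, self.template

    def report(self, result_name: str, outputs: List[str]) -> None:
        # The host returns the filenames it generated once it is done.
        self.records[result_name]["outputs"] = outputs


def fill_template(template: List[str]) -> List[str]:
    """Host side: replace the hypothetical placeholder with a unique token."""
    return [t.replace("<UNIQUE>", uuid.uuid4().hex) for t in template]


advance = AdvanceScheduler("wu_001", n_results=2)
on_demand = OnDemandScheduler("wu_002", template=["out_<UNIQUE>.dat"])
```

With two result records, the first two calls to advance.request_work return result names and the third returns None. The on-demand scheduler never turns a host away in this model; each request creates a new record, and the host later calls report with the filenames produced by fill_template.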
The following diagram illustrates the relationship between
workunits, results, files, and I/O associations.