From 8bfe5dbf179f28a03e9d535314743522ba211a0f Mon Sep 17 00:00:00 2001
From: Vitalii Koshura
Date: Fri, 7 Apr 2023 04:10:12 +0200
Subject: [PATCH] Update CondorBoinc.md file

Signed-off-by: Vitalii Koshura
---
 CondorBoinc.md | 174 ++++++++++++++++++++++++-------------------------
 1 file changed, 87 insertions(+), 87 deletions(-)

diff --git a/CondorBoinc.md b/CondorBoinc.md
index c469612..de929a0 100644
--- a/CondorBoinc.md
+++ b/CondorBoinc.md
@@ -1,4 +1,4 @@
-[[PageOutline]]
+# Condor-B: BOINC/Condor integration
 This document describes the design of Condor-B,
 extensions to BOINC and Condor so that a BOINC-based volunteer computing project
 can provide computing resources to a Condor pool.
@@ -14,7 +14,7 @@ Condor-B must address some basic differences between Condor and BOINC:
 and the file associated with a given physical name is immutable.
 Files may be used by many jobs.
 In Condor, a file is associated with a job, and has a single name.
- * BOINC is designed for apps for which the number and names of output files
+* BOINC is designed for apps for which the number and names of output files
 is fixed at the time of job submission.
 Condor doesn't have this restriction.

@@ -26,7 +26,7 @@ Condor-B must address some basic differences between Condor and BOINC:
 e.g. versions for different platforms, GPU types, etc.
 A job is associated with an application, not an app version.

-# Assumptions
+## Assumptions

 For simplicity, we'll assume that the BOINC project
 has been configured to run a certain set of applications
@@ -43,7 +43,7 @@ For each of these applications, admins must
 * Build the app for one or more platforms (ways of doing this are discussed below).
 * Create BOINC "app versions".

-# Job submission mechanism
+## Job submission mechanism

 We'll use Condor's existing mechanism for sending jobs to non-Condor back ends.
 This will involve 2 components:
@@ -53,7 +53,7 @@ This will involve 2 components:
 * A new class in Condor's job_router for managing
 communication with the BOINC GAHP.

-[[Image(condor.png)]]
+![Image](condor.png)

 The GAHP protocol will be based on the one used for HTCondor's interactions with Globus GRAM. That protocol's description can be found at http://research.cs.wisc.edu/htcondor/gahp/gahp_protocol.txt.
 From that protocol, we will take the basic syntax and command structure, and these commands:
@@ -78,28 +78,28 @@ a RESULTS command fetches the results of completed asynchronous commands.

 The commands are:

-## Specify BOINC project and credentials
-
- BOINC_SELECT_PROJECT project_url authenticator
-
+### Specify BOINC project and credentials
+```
+BOINC_SELECT_PROJECT project_url authenticator
+```
 Result (immediate): NULL or error message

 Specify the URL of a BOINC project and the authenticator of an account
 on that project to which requests will be sent.

-## Submit a new job batch
-
- BOINC_SUBMIT
- <#jobs>
- <#args> ...
- <#input files>
-
- ...
- ...
-
- Result:
- NULL (success) or
-
+### Submit a new job batch
+```
+BOINC_SUBMIT
+ <#jobs>
+ <#args> ...
+ <#input files>
+
+ ...
+ ...
+
+Result:
+ NULL (success) or
+```
 Notes:
 * The batch name and job names must be unique over all submissions.
 * Each job will have its own set of arguments and input files.
@@ -107,37 +107,37 @@ Notes:
 * The input s must agree with the app's template.
 * As of now, will always be the filename part of
-* We could add a argument to prepend to input paths.
+* We could add a \ argument to prepend to input paths.

-## Query the status of the jobs of one or more batches
-
- BOINC_QUERY_BATCH min_mod_time #batches ...
-
- Result:
- | NULL server_time ... ...
-
+### Query the status of the jobs of one or more batches
+```
+BOINC_QUERY_BATCHES min_mod_time #batches ...
+
+Result:
+ | NULL server_time ... ...
+```
 Query the jobs in a given set of batches.
 Only jobs whose DB record has changed (e.g. whose status has changed)
-since the given *min_mod_time* are reported
-(*min_mod_time* = 0 returns all jobs).
+since the given **min_mod_time** are reported
+(**min_mod_time** = 0 returns all jobs).

-The output includes the current on the server;
-you can pass this as *min_mod_time* in a subsequent call.
+The output includes the current time on the server;
+you can pass this as **min_mod_time** in a subsequent call.

 The status of each job is either IN_PROGRESS, DONE, or ERROR

-## Retrieve the outputs of a completed job
+### Retrieve the outputs of a completed job

-
- BOINC_FETCH_OUTPUT
-
- <#file-specs>
-
- ...
- Result:
- error_msg | NULL
-
+```
+BOINC_FETCH_OUTPUT
+
+ <#file-specs>
+
+ ...
+Result:
+ error_msg | NULL
+```
 Get the results of a completed job, including some or all of its output files.

 BOINC may replicate jobs to ensure that results are valid.
@@ -145,17 +145,17 @@ One replica, the "canonical instance", is designated as the authoritative result
 If the status is DONE,
 then the output files of the canonical instance, and its stderr output, are fetched.
 will be zero in this case.
-
-* is a directory on the local machine where output files are placed by default.
+
+* \ is a directory on the local machine where output files are placed by default.
 * If mode is ALL, all the job's output files are fetched.
 File specs are then applied to rename or move output files.
 * If mode is SOME, only those output files described by file specs are fetched.
-* Each file spec consists of and . is a filename written by the job.
- specifies where that file should be placed on the local machine.
+* Each file spec consists of and \. \ is a filename written by the job.
+ \ specifies where that file should be placed on the local machine.
 It may be either:
- * An absolute path
- * A relative path, in which case is prepended.
- Any directories within must already exist.
+* An absolute path
+* A relative path, in which case \ is prepended.
+ Any directories within \ must already exist.

 If the status is ERROR, the BOINC GAHP looks for an instance for which some information
 is available (e.g., exit status and stderr output),
@@ -165,38 +165,38 @@ If there is no such instance, it returns an error message.
 or there is no consensus among the instances,
 or no instances could be dispatched.)

-## Abort jobs
-
- BOINC_ABORT_JOBS ...
- Result:
- NULL|
-
+### Abort jobs
+```
+BOINC_ABORT_JOBS ...
+Result:
+ NULL|
+```

-## Retire a batch
-
- BOINC_RETIRE_BATCH
- Result:
- NULL|
-
+### Retire a batch
+```
+BOINC_RETIRE_BATCH
+Result:
+ NULL|
+```
 The batch's files and database records can be deleted.

-## Set the "lease time" for a batch
-
- BOINC_SET_LEASE
- Result:
- NULL|
-
+### Set the "lease time" for a batch
+```
+BOINC_SET_LEASE
+Result:
+ NULL|
+```
 After this time its files and database records can be deleted.

-## Results command
-
- RESULTS
-
- Result:
- # of completed commands
- result1
- ...
-
+### Results command
+```
+RESULTS
+
+Result:
+ # of completed commands
+ result1
+ ...
+```

 If any commands have completed, return their results.

@@ -204,12 +204,12 @@ Note: the GAHP protocol defines an "async mode"
 where the GAHP can notify the grid manager that a command has completed by sending "R\n".
 This is probably not worth doing since polling is very cheap.

-# Project selection and authentication
+## Project selection and authentication

 For the time being we'll do it this way:
 Each job submitter has a separate account on the BOINC project
 (these accounts can be assigned [access rights and quotas](MultiUser)).
-The account has a private *authenticator* (a random string).
+The account has a private **authenticator** (a random string).

 The job submitter will create a configuration file containing
 * the URL of the BOINC project
@@ -221,7 +221,7 @@ and will handle requests using that account on that project.
 Note: we could generalize this a bit by including the project URL
 and authenticator as an argument to each GAHP request.

-# Data model
+## Data model

 The BOINC GAHP uses BOINC's
 [content-based file management system](RemoteInputFiles#Content-basedfilemanagement)
@@ -235,7 +235,7 @@ a given file is used by many jobs or batches.
 The BOINC database stores records associating files and batches;
 a file is deleted only when it is no longer associated with any batches.

-# Implementation notes
+## Implementation notes

 The BOINC GAHP handles BOINC_SUBMIT as follows:

@@ -249,15 +249,15 @@ The BOINC GAHP handles BOINC_SUBMIT as follows:
 and create batch/file associations for these files.
 * Do an RPC create jobs

-# Ways to deploy applications on BOINC
+## Ways to deploy applications on BOINC

 BOINC offers three "environments" in which applications can be deployed:

-* *Native*:
+* **Native**:
 This requires making source-code modifications and building the app for different platforms,
 linking with the BOINC API library.
-* *BOINC wrapper*:
+* **BOINC wrapper**:
 Requires apps to be built for different platforms, but no source code mods.
-* *Virtual machine-based*:
+* **Virtual machine-based**:
 This would eliminate multi-platform issues
- but would require volunteer hosts to have VirtualBox installed.
+ but would require volunteer hosts to have [VirtualBox](VirtualBox) installed.
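
Reviewer note on the protocol text touched by this patch: the incremental polling described under BOINC_QUERY_BATCHES (pass 0 as min_mod_time on the first call, then pass back the returned server time) can be sketched as follows. This is a minimal illustration, not the BOINC GAHP implementation. `FakeBoincServer`, `BatchPoller`, and their methods are hypothetical names, and the in-memory server treats "changed since" as strictly newer than min_mod_time, which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBoincServer:
    """Hypothetical in-memory stand-in for the BOINC project's job database."""
    clock: int = 100
    jobs: dict = field(default_factory=dict)  # job name -> (status, mod_time)

    def set_status(self, job_name: str, status: str) -> None:
        # Every change advances the server clock and stamps the record.
        self.clock += 1
        self.jobs[job_name] = (status, self.clock)

    def query_batches(self, min_mod_time: int):
        # Report only jobs modified after min_mod_time (assumed: strictly after),
        # together with the current server time.
        changed = {
            name: status
            for name, (status, mod_time) in self.jobs.items()
            if mod_time > min_mod_time
        }
        return self.clock, changed


@dataclass
class BatchPoller:
    """Hypothetical client that remembers the last server time it saw."""
    server: FakeBoincServer
    min_mod_time: int = 0  # 0 asks for all jobs on the first poll
    known: dict = field(default_factory=dict)

    def poll(self) -> dict:
        server_time, changed = self.server.query_batches(self.min_mod_time)
        self.known.update(changed)       # merge the reported statuses
        self.min_mod_time = server_time  # reuse server time on the next poll
        return changed


server = FakeBoincServer()
server.set_status("job_a", "IN_PROGRESS")
server.set_status("job_b", "IN_PROGRESS")

poller = BatchPoller(server)
first = poller.poll()    # first poll uses min_mod_time = 0
server.set_status("job_a", "DONE")
second = poller.poll()   # later poll reuses the server time from the first
```

In the design described by the patched page, these calls would be issued as GAHP commands and their replies collected through RESULTS; the direct method calls here only stand in for that exchange.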