diff --git a/BUDA-implementation.md b/BUDA-implementation.md
new file mode 100644
index 0000000..83902ab
--- /dev/null
+++ b/BUDA-implementation.md
@@ -0,0 +1,96 @@
+On the server, there is a single BOINC app; let's call it 'buda'.
+This has app versions for the various platforms (Win, Mac, Linux).
+Each app version contains the Docker wrapper built for that platform.
+
+Each science app variant is a collection of files:
+
+* A Dockerfile
+* A config file, `job.toml`
+* Input and output templates
+* A main program or script
+* Other files
+* A file 'file_list' listing the other files in template order.
+
+The set of science apps and variants is represented
+by a directory hierarchy of the form
+```
+project/buda_apps/
+   <sci_app_name>/
+      cpu/
+         ... files
+      <plan_class>/
+         ... files
+      ...
+   ...
+```
+Note: you can build this hierarchy manually, but
+typically it's maintained using a web interface; see below.
+
+This is similar to BOINC's hierarchy of apps and app versions, except:
+
+* It's represented in a directory hierarchy, not in database tables.
+* Science app variants are not associated with platforms
+(since we're using Docker).
+* It stores only the current version, not a sequence of versions
+(that's why we call them 'variants', not 'versions').
+
+## BUDA is not polymorphic
+
+Conventional BOINC apps are 'polymorphic':
+if an app has both CPU and GPU variants,
+you submit jobs without specifying which one to use;
+the BOINC scheduler makes the decision.
+
+It would be possible to make BUDA polymorphic,
+but this would be complex, requiring significant changes to the scheduler.
+So - at least for now - BUDA is not polymorphic.
+
+When you submit jobs, you have to specify which plan class to use.
+This can be a slight nuisance:
+a plan class might represent only a small amount of computing power,
+so you might avoid using it - but then you don't get that power at all.
+
+## Validators and assimilators
+
+In the current BOINC architecture,
+each BOINC app has its own validator and assimilator.
+If multiple science apps "share" the same BOINC app,
+we'll need a way to let them have different validators and assimilators.
+
+This could be built on the script-based framework;
+each science app could specify the names
+of validator and assimilator scripts,
+which would be stored in workunits.
+
+## Interfaces
+
+BOINC provides a web interface for managing BUDA apps
+and submitting batches of jobs to them.
+Other interfaces are possible;
+e.g. we could make a Python-based remote API
+that could be used to integrate BUDA into other batch systems.
+
+## Implementation notes
+
+BUDA will require changes to the scheduler.
+
+Currently, given a job, the scheduler scans app versions,
+looking for one the host can accept based on plan class.
+That won't work here: the plan class is already fixed.
+
+Instead:
+
+* Add a plan_class field to the workunit (or put it in xml_doc).
+* When considering sending a WU to a host, if the WU has a plan class:
+  * skip it if there is no app version with that platform / plan class (e.g. we can't send a Metal job to a Win host);
+  * skip it if the host can't handle the plan class.
+
+## If we wanted to make BUDA polymorphic
+
+* The scheduler would have to scan the `buda_apps` dir structure (or we could add this info to the DB).
+* Jobs are tagged with the BUDA science app name.
+* The scheduler scans the variants of that science app.
+* If it finds a plan class the host can accept, it builds wu.xml_doc based on the BUDA variant info.
+
+The above is possible but would be a lot of work.
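To make the scheduler change sketched in the implementation notes above more concrete, here is a minimal Python sketch of the per-workunit check (the real BOINC scheduler is C++; the names `AppVersion`, `Host`, and `can_send_buda_wu` are invented for illustration and are not BOINC APIs).

```python
from dataclasses import dataclass, field

@dataclass
class AppVersion:
    platform: str        # e.g. 'windows_x86_64'
    plan_class: str      # '' for plain CPU versions

@dataclass
class Host:
    platforms: list                                    # platforms the host reports
    plan_classes: list = field(default_factory=list)   # plan classes the host can handle

def can_send_buda_wu(wu_plan_class: str, host: Host, app_versions: list) -> bool:
    """Return True if a BUDA workunit with this plan class can go to `host`."""
    if not wu_plan_class:
        return True  # 'cpu' variant: no extra constraint beyond having a buda version

    # Skip if there's no buda app version (Docker wrapper) for this
    # platform / plan class, e.g. a Metal job can't go to a Windows host.
    if not any(av.plan_class == wu_plan_class and av.platform in host.platforms
               for av in app_versions):
        return False

    # Skip if the host itself can't handle the plan class
    # (no suitable GPU, too few CPUs, ...).
    return wu_plan_class in host.plan_classes
```

The sketch assumes the workunit's plan class was recorded at submission time, either as a new workunit field or inside xml_doc, as proposed above.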
diff --git a/BUDA-job-submission.me b/BUDA-job-submission.me
new file mode 100644
index 0000000..867d29e
--- /dev/null
+++ b/BUDA-job-submission.me
@@ -0,0 +1,70 @@
+## BUDA science apps and variants
+
+We call BUDA applications 'science apps'.
+Each science app has a name, like 'worker' or 'autodock'.
+A science app can have multiple 'variants' that can
+use different types of computer hardware.
+The name of a variant is 'cpu' if it uses a single CPU.
+Otherwise it's the name of a [plan class](AppPlan).
+There might be variants for 1 CPU, for N CPUs, and for various GPU types.
+
+## User file sandbox
+
+## Managing science apps and variants
+
+In the menu bar of the BOINC project's web site,
+select `Computing / Job Submission`.
+Then click on `BUDA`.
+This shows a list of existing science apps and their variants.
+You can:
+
+* add or delete a variant
+* add or delete a science app
+* submit jobs to a variant
+
+## Adding a variant
+
+The form for adding a variant includes:
+
+* a plan class name (leave blank for a CPU app)
+* a set of 'app files', selected from your file sandbox. This includes:
+  * a Dockerfile
+  * a main program to run in the container
+  * other files if needed
+* a list of input file names
+* a list of output file names
+
+## Submitting jobs
+
+The form for submitting a batch of jobs asks you to
+select (from the file sandbox) a zip file of job descriptions.
+This file has one directory per job:
+```
+jobname1/
+   [cmdline]
+   file1
+   file2
+   ...
+jobname2/
+...
+```
+The file names in each job directory must match
+the variant's list of input file names.
+
+## Monitoring a batch
+
+When you submit a batch of jobs,
+you end up at a web page showing the status of the batch.
+This shows you, among other things,
+how many of the jobs have completed.
+Reload it to update this information.
+
+You can click on a job to see its status
+(and, if it failed, its stderr output).
+You can view or download its input files.
+
+On the batch page, you can click to download a zip file
+of the output files of all completed jobs.
+
+When you're done with the batch, you can 'retire' it.
+This removes its input and output files from the server.
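As an illustration of the batch format described in the file above (one directory per job, an optional `cmdline` file, input files whose names match the variant's input list), here is a small Python sketch of how a submitter might package and sanity-check such a zip before uploading it to the sandbox. The function name and arguments are hypothetical, not part of BOINC.

```python
import os
import zipfile

def make_batch_zip(jobs_dir: str, zip_path: str, input_names: list) -> None:
    """Zip one-directory-per-job batches, checking each job's files against
    the variant's expected input file names ('cmdline' is optional)."""
    expected = set(input_names)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for job in sorted(os.listdir(jobs_dir)):
            job_path = os.path.join(jobs_dir, job)
            if not os.path.isdir(job_path):
                continue
            files = set(os.listdir(job_path))
            missing = expected - files
            extra = files - expected - {'cmdline'}
            if missing or extra:
                raise ValueError(f'{job}: missing {sorted(missing)}, unexpected {sorted(extra)}')
            for name in sorted(files):
                zf.write(os.path.join(job_path, name), arcname=f'{job}/{name}')

# e.g. make_batch_zip('my_jobs/', 'batch.zip', ['file1', 'file2'])
```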
diff --git a/Docker-universal-app.md b/BUDA-overview.md
similarity index 84%
rename from Docker-universal-app.md
rename to BUDA-overview.md
index 3196d60..8f13747 100644
--- a/Docker-universal-app.md
+++ b/BUDA-overview.md
@@ -3,18 +3,24 @@
 is a framework for running Docker-based science apps on BOINC.
 It's 'universal' in the sense that one BOINC app
 handles arbitrary science apps.
-The science app executables (and Dockerfile)
+The science app's Dockerfile and executables
 are in workunits rather than app versions.
 
 On the server, there is a single BOINC app; let's call it 'buda'.
-This has app versions for
-the various platforms (Win, Mac, Linux)
+This has app versions for the various platforms (Win, Mac, Linux).
 Each app version contains the Docker wrapper built for that platform.
 
+There are various possible interfaces for job submission to BUDA.
+We could make a Python-based remote API.
+We (or others) could use this API to integrate it into batch systems.
+
+But for starters, we implemented a generic (multi-application)
+web-based job submission system,
+using the per-user file sandbox system.
+
 ## BUDA science apps and versions
 
-BOINC provides server-side tools
-(CLI and web interfaces) for managing BUDA science apps
+BOINC provides a web interface for managing BUDA science apps
 and submitting jobs to them.
 
 These tools assume the following structure:
@@ -46,7 +52,7 @@ project/buda_apps/
       ...
    ...
 ```
 Note: you can build this hierarchy manually but
-typically it's maintained by a job-submission system; see below.
+typically it's maintained using a web interface; see below.
 
 This is similar to BOINC's hierarchy of apps and app versions, except:
@@ -58,7 +64,7 @@
 
 ## BUDA is not polymorphic
 
-The existing BOINC design is 'polymorphic':
+Conventional BOINC apps are 'polymorphic':
 if an app has both CPU and GPU variants,
 you submit jobs without specifying which one to use;
 the BOINC scheduler makes the decision.
@@ -99,7 +105,7 @@ Instead:
   * skip if no app version with that platform / plan class (e.g. can't send metal job to Win host)
   * skip if host can't handle the plan class
 
-If we wanted to make BUDA polymorphic,
+## If we wanted to make BUDA polymorphic
 
 * The scheduler would have to scan the `buda_apps` dir structure (or we could add this info to the DB).
 
diff --git a/BUDA-setup.md b/BUDA-setup.md
new file mode 100644
index 0000000..02e0c03
--- /dev/null
+++ b/BUDA-setup.md
@@ -0,0 +1,5 @@
+BUDA is 'universal' in the sense that one BOINC app
+handles arbitrary science apps.
+The science app's Dockerfile and executables
+are in workunits rather than app versions.
+
diff --git a/Docker-universal-app-web-interface.md b/Docker-universal-app-web-interface.md
deleted file mode 100644
index 18e3884..0000000
--- a/Docker-universal-app-web-interface.md
+++ /dev/null
@@ -1,50 +0,0 @@
-There are various possible interfaces for job submission to BUDA.
-We could make a Python-based remote API.
-We (or others) could use this API to integrate it into batch systems.
-
-But for starters, I propose implementing a generic (multi-application)
-web-based job submission system,
-using the per-user file sandbox system.
-
-## Managing science apps and variants.
-
-First, there's a web interface for managing BUDA science apps
-This shows you the existing apps,
-and lets you delete them or create new ones.
-
-For a given science app it shows you the variants
-(i.e. for different GPU types).
-It lets you delete them or create new ones.
-The form for this includes:
-
-* Select (from your file sandbox) a set of 'app files'. This includes:
-  * a Dockerfile
-  * a main prog to run in the container
-  * other files if needed
-  * info per file: logical name, copy flag
-* a plan class name (CPU/GPU)
-* list of input files (logical name, copy_file)
-* list of output files (logical name)
-* cmdline (passed to main prog for all jobs)
-
-## Submitting jobs
-
-The form for submitting a batch of jobs:
-
-* batch name (optional)
-* select a BUDA science app and variant
-* select (from the file sandbox) a zip file of job descriptions.
-This file has one dir per job:
-```
-jobname/
-   [cmdline]
-   file1 (logical name)
-   ...
-...
-```
-
-This system should manage file immutability.
-The above filenames are logical and don't need to be unique;
-e.g. input files for different jobs can have the same name.
-The system will create unique physical names.
-
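The deleted page ends by noting that job file names are logical (not necessarily unique) and that the submission system manages file immutability by creating unique physical names. The sketch below shows one way such a mapping could work (content-addressed names); it is only an illustration of the idea, not BOINC's actual naming scheme, and the function name is hypothetical.

```python
import hashlib
import os

def physical_name(logical_name: str, path: str) -> str:
    """Map a job's logical input file name to a unique, immutable physical name.

    Including a hash of the file contents means two jobs can both call their
    input 'file1' without colliding, while files with identical content map
    to the same physical name and can be stored once.
    """
    with open(path, 'rb') as f:
        digest = hashlib.md5(f.read()).hexdigest()[:16]
    return f'buda_{digest}_{os.path.basename(logical_name)}'
```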