mirror of https://github.com/BOINC/boinc.git
Created The BOINC test drive (markdown)
parent
7495ca615e
commit
5c8d7d29db
Suppose we've solved the supply side of the problem:
BOINC has 10 million users, supplying many ExaFLOPS.
How do we get more scientists to use it?

The major conference and trade show for scientific computing is Supercomputing (SC).
Scientists who do high-throughput computing (HTC) go there.
Suppose BOINC had a booth at SC 2022.
Scientists walk up, and we give them a flyer.
What should it say?
What "test drive" experience do we want them to have?

Ideally, in 10 or 15 minutes they'd be running jobs on ~100 CPUs,
and there'd be a clear path to scaling up to millions.

The test drive can't include:

- reading any existing BOINC doc
- writing any XML
- doing sysadmin
- creating a web site
- recruiting volunteers
- building apps on Windows, Mac, or Android
- developing validators or assimilators

---
First, we create a "BOINC app library".
It includes a number of widely-used apps (like Autodock, Charm, Rosetta, etc.),
compiled to run on BOINC (with the BOINC library).
For each app, the library includes app versions for various platforms,
CPU features, and GPUs.
Each app version has an associated plan class specification.
One of the apps is the VBox wrapper.

These apps are viewed as "secure":
running them on a computer doesn't pose a security risk,
regardless of the input files and cmdline parameters,
even if the job was created by a malevolent hacker.
That means we have to be careful about what we put in the library;
we need to build it ourselves or vet the people who build it.

The app library exports a list of the app versions and their hashes.
The BOINC client imports this list,
so it can know if an app version is from the BOINC library.

In the BOINC client,
an attachment to a project can be marked "restricted",
in which case the client will only run apps for that project that are
from the app library.
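
The restricted-mode check could look roughly like the sketch below. This is only an illustration: the choice of SHA-256, the function names (`build_library_index`, `may_run`), and the app version name are all assumptions, not existing BOINC code or formats.

```python
import hashlib
from typing import Dict, Set


def sha256_hex(payload: bytes) -> str:
    """Hash an app version file (SHA-256 is an assumption; the notes only say "hashes")."""
    return hashlib.sha256(payload).hexdigest()


def build_library_index(app_versions: Dict[str, bytes]) -> Dict[str, str]:
    """Library side: export a map from app version name to its hash."""
    return {name: sha256_hex(payload) for name, payload in app_versions.items()}


def may_run(payload: bytes, library_hashes: Set[str], restricted: bool) -> bool:
    """Client side: in restricted mode, run only files whose hash is in the library list."""
    if not restricted:
        return True
    return sha256_hex(payload) in library_hashes


# Hypothetical app version name and contents, for illustration only.
index = build_library_index({"autodock_linux_x86_64": b"trusted binary"})
hashes = set(index.values())
print(may_run(b"trusted binary", hashes, restricted=True))
print(may_run(b"tampered binary", hashes, restricted=True))
```

The point of keying on the hash rather than the name is that a project can't pass off a different binary under a library app's name.
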

Notes:

1. Maintaining this library could be a lot of work!
2. The library could be useful for other purposes;
   e.g. we could bundle Android app versions with the BOINC Android client.

Second, we create a "Demo grid":
a set of computers willing to run jobs for anyone, in restricted mode.
These could be volunteers, or cluster nodes somewhere, or Amazon spot instances.
The BOINC client running on these nodes is attached to
an account manager which lets us dynamically attach them to projects.
This may as well be an enhanced version of Science United.

Third, we create a BOINC project that I'll call BOINC Central
(the name doesn't matter; no one sees it).
Its job is to dispatch jobs for users who don't or can't run their own BOINC server.
It has all the apps in the library, and all their versions, with the plan classes set up
(these are the only app versions it has).

Finally, we use Science United as a "switchboard" for dynamically
attaching hosts to projects.
It knows which hosts are part of the Demo grid.
For each project, it knows whether it is

- unvetted
- vetted (partially or fully; see below)

This info is used in deciding which projects to attach each host to.

## Test-drive scenarios

### unvetted/central
```
goal: quickly run batches of jobs on computers you don't own
User experience:
- create an account on BOINC Central, reCAPTCHA, verify email address
2 variants:
1) Command line interface (Condor-like)
    install a package
    make a "submit file" that specifies a batch of jobs
        - app
        - input files
        - cmdline params
        - optional resource usage estimates
    run "boinc_submit"
    other cmdline commands to
        - wait for completion of batch
            (or email notification)
        - show pending jobs (like condor_q)
        - abort jobs
        - get resource usage of completed jobs
            (for use in later submissions)
        - get output files of completed jobs
2) Web interface: go to BOINC Central
    pick an application
    specify (through a web interface) a set of cmdline args
        and/or a range of input files
    click submit
    email notification option
    web interfaces for showing status, aborting
    download output files as zip

How to implement
- Use BOINC Central for dispatching jobs
    use existing job-submission and file-management RPCs
- Use the Demo grid;
    Science United (SU) attaches all Demo nodes to BOINC Central
    (in restricted mode, even though apps coming from there are secure).

There are limits on
- how much computing you get per week
- size of input/output files

possible variant:
- you can pay to get more computing

This is similar to Open Science Grid but
- no vetting of job submitters.
- has the BOINC "polymorphic app" concept
```

This is the "test drive" experience.
It gives anyone - scientist or not - sporadic access to a few hundred computers.
This may be all that some scientists need.
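
To make the "submit file" idea concrete, here is a minimal sketch of reading one and expanding it into a batch of jobs. Everything about the format is an assumption made for illustration (JSON, the field names `app`, `jobs`, `input_files`, `cmdline`, `est_flops`, and the placeholder file names and arguments); the notes above don't define a format, and this is not what `boinc_submit` would necessarily accept.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    app: str                            # name of an app in the library
    input_files: List[str]              # files to ship with the job
    cmdline: str                        # command-line parameters
    est_flops: Optional[float] = None   # optional resource usage estimate


def parse_submit_file(text: str) -> List[Job]:
    """Expand a (hypothetical) JSON submit file into one Job per entry."""
    spec = json.loads(text)
    default_est = spec.get("est_flops")  # batch-wide estimate, may be absent
    return [
        Job(
            app=spec["app"],
            input_files=list(entry.get("input_files", [])),
            cmdline=entry.get("cmdline", ""),
            est_flops=entry.get("est_flops", default_est),
        )
        for entry in spec["jobs"]
    ]


# Placeholder batch: two jobs for one app; file names and args are made up.
EXAMPLE = """
{
  "app": "autodock",
  "jobs": [
    {"input_files": ["in1.dat"], "cmdline": "--placeholder 1"},
    {"input_files": ["in2.dat"], "cmdline": "--placeholder 2", "est_flops": 2e12}
  ]
}
"""

for job in parse_submit_file(EXAMPLE):
    print(job)
```

Leaving `est_flops` optional matches the goal that users shouldn't have to specify job sizes up front.
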

One of the apps in the library is the VBox wrapper,
so you can bring your own apps, but they have to run in VMs.
Use boinc2docker (and TACC's extensions) to automate converting
any Linux/Intel app to a Docker image.
We could also develop tools for managing a set of these images
(my earlier "tire-kicking" Google doc describes this).

Notes:

- No result validation is done; Demo grid nodes are assumed to be reliable.
- You don't have to specify job sizes (CPU, RAM, disk).
  We could have a system that estimates these for you, based on past jobs.
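
One simple way such an estimator might work, sketched under my own assumptions (median CPU time, and peak RAM/disk padded by a safety margin; the `Usage` record and the 1.5 margin are invented for illustration, not an existing BOINC mechanism):

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Usage:
    cpu_seconds: float
    peak_ram_mb: float
    disk_mb: float


def estimate(past: List[Usage], margin: float = 1.5) -> Optional[Usage]:
    """Guess the size of the next job from completed jobs of the same app.

    CPU time uses the median (robust to a few outliers); RAM and disk use
    the observed maximum times a safety margin, since underestimating them
    is what makes jobs fail on a host.
    """
    if not past:
        return None  # no history yet: caller falls back to defaults
    return Usage(
        cpu_seconds=median(u.cpu_seconds for u in past),
        peak_ram_mb=margin * max(u.peak_ram_mb for u in past),
        disk_mb=margin * max(u.disk_mb for u in past),
    )


history = [Usage(100, 200, 10), Usage(300, 400, 20), Usage(200, 300, 30)]
print(estimate(history))
```
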

### unvetted/distributed

```
Similar, but user has their own BOINC server;
avoids storage and bandwidth bottleneck of central server.
Also lets you attach your own computers directly.
- get a Linux machine visible on the Internet
    could be a cloud node
- install BOINC server on that machine and create a project
    could be from a package
    could be BOINC server Docker
    could be from a VM image
- BOINC server is a black box to user
- run commands to install apps from library
- submit jobs through same cmdline or web interface
- register your BOINC server with BOINC Central
    no vetting
    server is registered with SU as "unvetted project"

Implementation
    Uses Demo grid hosts
    Science United attaches Demo grid hosts to unvetted projects in restricted mode
```

---

```
Vetting:
    partially vetted: we believe that
        - your identity and affiliation are true
        - you're doing the kind of computing you claim
            (science area, location)
        This gives you access to more computing, but you still need to use trusted apps
    fully vetted: partial vetting plus
        - we believe that your apps are not malware
        - we believe that you do code signing
        This lets you use your own non-VM apps

Partially vetted
    You can use either the central or distributed model.
    Your apps run on all Science United hosts (currently about 5,000).

Fully vetted
    Use with distributed model (your own server)
    You can add your own apps and app versions.
        May as well use the current BOINC tools for this;
        requires logging in to your project server,
        code-signing, maybe writing XML plan class specs
    Your project is registered on Science United,
        and it's attached to hosts based on science area
        and computing resources (that's how SU currently works)
    Your apps run on all Science United hosts in trusted mode
    Your project is listed on the BOINC web site,
        and in the project list in the client GUI,
        so volunteers can attach to it explicitly.

Notes:
- result validation becomes an issue,
    mostly because of possible credit cheating.
    Need to figure out how to do this in a way that doesn't require
    users to write validators.

    Or get rid of credit
```
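
Putting the three vetting levels together, the switchboard's decision might be sketched as below. This is my reading of the notes, not Science United's actual logic: the names `Vetting`, `Attachment`, `attachment_policy`, and `should_attach` are invented, and SU's matching on science area and computing resources is left out.

```python
from dataclasses import dataclass
from enum import Enum


class Vetting(Enum):
    UNVETTED = "unvetted"
    PARTIAL = "partially vetted"
    FULL = "fully vetted"


@dataclass(frozen=True)
class Attachment:
    demo_grid_only: bool  # attach only hosts in the Demo grid
    restricted: bool      # client runs only app-library apps for this project


def attachment_policy(level: Vetting) -> Attachment:
    """Map a project's vetting level to how hosts get attached to it."""
    if level is Vetting.UNVETTED:
        # Demo grid hosts only, library apps only.
        return Attachment(demo_grid_only=True, restricted=True)
    if level is Vetting.PARTIAL:
        # All SU hosts, but still library (trusted) apps only.
        return Attachment(demo_grid_only=False, restricted=True)
    # Fully vetted: all SU hosts, project's own apps allowed.
    return Attachment(demo_grid_only=False, restricted=False)


def should_attach(host_in_demo_grid: bool, level: Vetting) -> bool:
    """Would the switchboard consider this host for a project at this level?"""
    policy = attachment_policy(level)
    return host_in_demo_grid or not policy.demo_grid_only


print(attachment_policy(Vetting.PARTIAL))
print(should_attach(host_in_demo_grid=False, level=Vetting.UNVETTED))
```
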

---

How hard is this to implement?

```
Things I can do:
    BOINC library framework
    BOINC Central
    Changes to SU
    Changes to BOINC client

Things I'd need help with:
    Job submission interfaces

Things others would have to do:
    build app versions for BOINC library
```