mirror of https://github.com/BOINC/boinc.git
Created The BOINC test drive (markdown)
parent
7495ca615e
commit
5c8d7d29db
Suppose we've solved the supply side of the problem:
BOINC has 10 million users, supplying many ExaFLOPS.
How do we get more scientists to use it?

The major conference and trade show for scientific computing is Supercomputing (SC).
Scientists who do high-throughput computing (HTC) go there.
Suppose BOINC had a booth at SC 2022.
Scientists walk up, and we give them a flyer.
What should it say?
What "test drive" experience do we want them to have?

Ideally, in 10 or 15 minutes they'd be running jobs on ~100 CPUs,
and there'd be a clear path to scaling up to millions.

The test drive can't include:

- reading any existing BOINC doc
- writing any XML
- doing sysadmin
- creating a web site
- recruiting volunteers
- building apps on Windows, Mac, or Android
- developing validators or assimilators

---
First, we create a "BOINC app library".
It includes a number of widely-used apps (like Autodock, Charm, Rosetta, etc.),
compiled to run on BOINC (w/ the BOINC library).
For each app, the library includes app versions for various platforms,
CPU features, and GPUs.
Each app version has an associated plan class specification.
One of the apps is the VBox wrapper.

These apps are viewed as "secure":
running them on a computer doesn't pose a security risk,
regardless of the input files and cmdline parameters,
even if the job was created by a malevolent hacker.
That means we have to be careful about what we put in the library;
we need to build it ourselves or vet the people who build it.

The app library exports a list of the app versions and their hashes.
The BOINC client imports this list,
so it can know if an app version is from the BOINC library.

In the BOINC client,
an attachment to a project can be marked "restricted",
in which case the client will only run apps for that project that are
from the app library.

Notes:
1. maintaining this library could be a lot of work!
1. the library could be useful for other purposes;
   e.g. we could bundle Android app versions with the BOINC Android client
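
The hash list and "restricted" attachments described above could work roughly as in the following sketch. It is only an illustration: the function names are hypothetical, and SHA-256 is an assumption (the notes above don't say which hash the library would export).

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of an app version file's contents."""
    return hashlib.sha256(data).hexdigest()


def is_library_version(data: bytes, library_hashes: set[str]) -> bool:
    """True if the file's hash appears in the list exported by the app library."""
    return sha256_hex(data) in library_hashes


def may_run(data: bytes, restricted: bool, library_hashes: set[str]) -> bool:
    """Restricted attachments run only library app versions; others run anything."""
    return (not restricted) or is_library_version(data, library_hashes)
```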

Second, we create a "Demo grid":
a set of computers willing to run jobs for anyone, in restricted mode.
These could be volunteers, cluster nodes somewhere, or Amazon spot instances.
The BOINC client running on these nodes is attached to
an account manager which lets us dynamically attach them to projects.
This may as well be an enhanced version of Science United.

Third, we create a BOINC project that I'll call BOINC Central
(the name doesn't matter; no one sees it).
Its job is to dispatch jobs for users who don't or can't run their own BOINC server.
It has all the apps in the library, and all versions, with the plan classes set up
(these are the only app versions it has).

Finally, we use Science United as a "switchboard" for dynamically
attaching hosts to projects.
It knows which hosts are part of the Demo grid.
For each project, it knows whether it is
- unvetted
- vetted (shallow or deep, called partially and fully vetted below)

This info is used in deciding what projects to attach each host to.

## Test-drive scenarios

### unvetted/central
```
goal: quickly run batches of jobs on computers you don't own
User experience:
    - create an account on BOINC Central, reCAPTCHA, verify email address
    2 variants:
    1) Command line interface (Condor-like)
        install a package
        make a "submit file" that specifies a batch of jobs
            - app
            - input files
            - cmdline params
            - possible resource usage estimates
        run "boinc_submit"
        other cmdline commands to
            - wait for completion of batch
                (or email notification)
            - show pending jobs (like condor_q)
            - abort jobs
            - get resource usage of completed jobs
                (for use in later submissions)
            - get output files of completed jobs
    2) Web interface: go to BOINC Central
        pick an application
        specify (through a web interface) a set of cmdline args
            and/or a range of input files
        click submit
        email notification option
        web interfaces for showing status, aborting
        download output files as zip

How to implement
    - Use BOINC Central for dispatching jobs
        use existing job-submission and file-management RPCs
    - Use the Demo grid;
        SU attaches all Demo nodes to BOINC Central
        (in restricted mode, though apps coming from there are secure).

There are limits on
    - how much computing you get per week
    - size of input/output files

possible variant:
    - you can pay to get more computing

This is similar to Open Science Grid but
    - no vetting of job submitters.
    - has the BOINC "polymorphic app" concept
```
This is the "test drive" experience.
It gives anyone - scientist or not - sporadic access to a few hundred computers.
This may be all that some scientists need.
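
To make the "submit file" idea from the command-line variant more concrete, here is a minimal sketch of how such a file might be expanded into a batch of jobs. Everything in it is an assumption for illustration: the `key = value` syntax, the field names (`app`, `input_files`, `cmdline`, `count`), and the `{i}` placeholder for the job index are hypothetical, not an existing BOINC format.

```python
from dataclasses import dataclass


@dataclass
class Job:
    app: str
    input_files: list[str]
    cmdline: str


def parse_submit_file(text: str) -> list[Job]:
    """Expand a hypothetical 'key = value' submit file into a batch of jobs."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split at the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()

    count = int(fields.get("count", "1"))
    jobs = []
    for i in range(count):
        # "{i}" in a value is replaced by the job index (an assumed convention).
        inputs = [
            name.replace("{i}", str(i))
            for name in fields.get("input_files", "").split()
        ]
        cmdline = fields.get("cmdline", "").replace("{i}", str(i))
        jobs.append(Job(fields["app"], inputs, cmdline))
    return jobs
```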

One of the apps in the library is the VBox wrapper,
so you can bring your own apps but they have to run in VMs.
Use boinc2docker (and TACC's extensions) to automate converting
any Linux/Intel app to a Docker image.
We could also develop tools for managing a set of these images
(my earlier "tire-kicking" Google doc describes this).

Notes:
- no result validation is done; Demo grid nodes are assumed to be reliable.
- you don't have to specify job sizes (CPU, RAM, disk).
  We could have a system that estimates these for you, based on past jobs.
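
One simple way such an estimator might work is sketched below. It assumes completed jobs report `cpu_seconds`, `ram_mb`, and `disk_mb` (hypothetical field names), and it uses the median of past usage scaled by a safety margin; both the statistic and the default margin are illustrative choices, not part of the proposal.

```python
import statistics


def estimate_job_size(past_jobs: list[dict], margin: float = 1.5) -> dict:
    """Estimate per-job resource needs from completed jobs of the same app.

    Takes the median of each resource over past jobs and multiplies it by
    a safety margin, so that typical jobs fit within the estimate.
    """
    if not past_jobs:
        raise ValueError("no completed jobs to learn from")
    keys = ("cpu_seconds", "ram_mb", "disk_mb")
    return {
        key: statistics.median(job[key] for job in past_jobs) * margin
        for key in keys
    }
```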

### unvetted/distributed
```
Similar, but user has their own BOINC server;
avoids storage and BW bottleneck of central server
Also lets you attach your own computers directly.
    - get a Linux machine visible on Internet
        could be Cloud node
    - install BOINC server on that machine and create a project
        could be from a package
        could be BOINC server Docker
        could be from a VM image
    - BOINC server is a black box to user
    - run commands to install apps from library
    - submit jobs through same cmdline or web interface
    - register your BOINC server with BOINC Central
        no vetting
        server is registered with SU as "unvetted project"

Implementation
    Uses Demo grid hosts
    Science United attaches Demo grid hosts to unvetted projects in restricted mode
```
---------------
```
Vetting:
    partially vetted: we believe that
        - your identity and affiliation are true
        - you're doing the kind of computing you claim
            (science area, location)
        This gives you access to more computing but you still need to use trusted apps
    fully vetted: partial vetting plus
        - we believe that your apps are not malware
        - we believe that you do code signing
        This lets you use your own non-VM apps

Partially vetted
    You can use either the central or distributed model.
    Your apps run on all Science United hosts (currently about 5,000).

Fully vetted
    Use with distributed model (your own server)
    You can add your own apps and app versions.
        May as well use the current BOINC tools for this;
        requires logging in to your project server,
        code-signing, maybe writing XML plan class specs
    Your project is registered on Science United,
        and it's attached to hosts based on science area
        and computing resources (that's how SU currently works)
    Your apps run on all Science United hosts in trusted mode
    Your project is listed on the BOINC web site,
        and in the project list in the client GUI,
        so volunteers can attach to it explicitly.

Notes:
    - result validation becomes an issue,
        mostly because of possible credit cheating.
        Need to figure out how to do this in a way that doesn't require
        users to write validators.

        Or get rid of credit
```
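
Putting the vetting levels together, the switchboard's attach decision might look roughly like the sketch below. The names are hypothetical, and it deliberately ignores the science-area and computing-resource matching that Science United also does; it only encodes the rules stated above (unvetted projects get Demo grid hosts in restricted mode, partially vetted projects get all hosts but still only trusted apps, fully vetted projects get all hosts in trusted mode).

```python
from enum import Enum
from typing import Optional


class Vetting(Enum):
    UNVETTED = "unvetted"
    PARTIAL = "partially vetted"
    FULL = "fully vetted"


def attach_mode(vetting: Vetting, host_in_demo_grid: bool) -> Optional[str]:
    """Return "restricted", "trusted", or None (do not attach this host)."""
    if vetting is Vetting.UNVETTED:
        # Unvetted projects only get Demo grid hosts, running library apps only.
        return "restricted" if host_in_demo_grid else None
    if vetting is Vetting.PARTIAL:
        # The submitter is believed, their apps are not: any host, library apps only.
        return "restricted"
    # Fully vetted: the project's own code-signed apps may run.
    return "trusted"
```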
--------------
How hard is this to implement?
```
Things I can do:
    BOINC library framework
    BOINC Central
    Changes to SU
    Changes to BOINC client

Things I'd need help with:
    Job submission interfaces

Things others would have to do:
    build app versions for BOINC library
```