diff --git a/The-BOINC-test-drive.md b/The-BOINC-test-drive.md
new file mode 100644
index 0000000..98412e0
--- /dev/null
+++ b/The-BOINC-test-drive.md
@@ -0,0 +1,218 @@
+Suppose we've solved the supply side of the problem:
+BOINC has 10 million users, supplying many ExaFLOPS.
+How do we get more scientists to use it?
+
+The major conference and trade show for scientific computing is Supercomputing (SC).
+Scientists who do high-throughput computing (HTC) go there.
+Suppose BOINC had a booth at SC 2022.
+Scientists walk up, and we give them a flyer.
+What should it say?
+What "test drive" experience do we want them to have?
+
+Ideally, in 10 or 15 minutes they'd be running jobs on ~100 CPUs,
+and there'd be a clear path to scaling up to millions.
+
+The test drive can't include:
+
+- reading any existing BOINC docs
+- writing any XML
+- doing sysadmin
+- creating a web site
+- recruiting volunteers
+- building apps on Windows, Mac, or Android
+- developing validators or assimilators
+
+---
+First, we create a "BOINC app library".
+It includes a number of widely-used apps (like Autodock, Charm, Rosetta, etc.),
+compiled to run on BOINC (with the BOINC library).
+For each app, the library includes app versions for various platforms,
+CPU features, and GPUs.
+Each app version has an associated plan class specification.
+One of the apps is the VBox wrapper.
+
+These apps are viewed as "secure":
+running them on a computer doesn't pose a security risk,
+regardless of the input files and cmdline parameters,
+even if the job was created by a malevolent hacker.
+That means we have to be careful about what we put in the library;
+we need to build it ourselves or vet the people who build it.
+
+The app library exports a list of the app versions and their hashes.
+The BOINC client imports this list,
+so it can tell whether an app version is from the BOINC app library.
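The check described above could be as simple as hashing an app version file and looking the hash up in the imported list. The following is a minimal sketch under stated assumptions, not BOINC client code: the JSON layout of the exported list, the choice of SHA-256, and the names `load_library_hashes` and `is_library_app_version` are all hypothetical.

```python
import hashlib
import json
from typing import FrozenSet


def load_library_hashes(exported_list: str) -> FrozenSet[str]:
    """Parse the list exported by the app library.

    Assumed (hypothetical) format: a JSON array of objects,
    each with a "sha256" field holding a hex digest.
    """
    entries = json.loads(exported_list)
    return frozenset(entry["sha256"] for entry in entries)


def is_library_app_version(file_bytes: bytes, library_hashes: FrozenSet[str]) -> bool:
    """Return True if the file's hash appears in the imported list."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return digest in library_hashes
```

A client would call `load_library_hashes` when it imports the list, then call `is_library_app_version` on each app version file it receives from a project.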
+
+In the BOINC client,
+an attachment to a project can be marked "restricted",
+in which case the client will only run apps for that project that are
+from the app library.
+
+Notes:
+1. maintaining this library could be a lot of work!
+1. the library could be useful for other purposes;
+   e.g. we could bundle Android app versions with the BOINC Android client
+
+Second, we create a "Demo grid":
+a set of computers willing to run jobs for anyone, in restricted mode.
+These could be volunteers, or cluster nodes somewhere, or Amazon spot instances.
+The BOINC client running on these nodes is attached to
+an account manager which lets us dynamically attach them to projects.
+This may as well be an enhanced version of Science United.
+
+Third, we create a BOINC project that I'll call BOINC Central
+(the name doesn't matter, no one sees it).
+Its job is to dispatch jobs for users who don't or can't run their own BOINC server.
+It has all the apps in the library, and all versions, with the plan classes set up.
+(These are the only app versions it has.)
+
+Finally, we use Science United (SU) as a "switchboard" for dynamically
+attaching hosts to projects.
+It knows which hosts are part of the Demo grid.
+For each project, it knows whether it is
+- unvetted
+- vetted (partially or fully; see below)
+This info is used in deciding what projects to attach each host to.
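To make the switchboard decision concrete, here is a rough sketch of the rule as I read it from the vetting levels described later in this document: unvetted projects get only Demo grid hosts, in restricted mode; partially vetted projects get any host, still restricted; fully vetted projects get any host in trusted mode. This is an illustration under those assumptions, not Science United code; the `Host`, `Project`, `Vetting`, and `attachment_mode` names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Vetting(Enum):
    UNVETTED = "unvetted"
    PARTIAL = "partially vetted"
    FULL = "fully vetted"


@dataclass(frozen=True)
class Host:
    name: str
    in_demo_grid: bool


@dataclass(frozen=True)
class Project:
    name: str
    vetting: Vetting


def attachment_mode(host: Host, project: Project) -> Optional[str]:
    """Return "restricted", "trusted", or None (do not attach)."""
    if project.vetting is Vetting.UNVETTED:
        # Only Demo grid hosts run jobs for unvetted projects.
        return "restricted" if host.in_demo_grid else None
    if project.vetting is Vetting.PARTIAL:
        # More hosts, but still only apps from the app library.
        return "restricted"
    # Fully vetted projects may run their own apps.
    return "trusted"
```

The real decision would also weigh science area and computing resources, which this sketch leaves out.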
+
+## Test-drive scenarios
+
+### unvetted/central
+```
+    goal: quickly run batches of jobs on computers you don't own
+    User experience:
+        - create an account on BOINC Central, reCAPTCHA, verify email address
+        2 variants:
+        1) Command line interface (Condor-like)
+            install a package
+            make a "submit file" that specifies a batch of jobs
+                - app
+                - input files
+                - cmdline params
+                - optionally, resource usage estimates
+            run "boinc_submit"
+            other cmdline commands to
+                - wait for completion of batch
+                    (or email notification)
+                - show pending jobs (like condor_q)
+                - abort jobs
+                - get resource usage of completed jobs
+                    (for use in later submissions)
+                - get output files of completed jobs
+        2) Web interface: go to BOINC Central
+            pick an application
+            specify (through a web interface) a set of cmdline args
+                and/or a range of input files
+            click submit
+            email notification option
+            web interfaces for showing status, aborting
+            download output files as zip
+
+    How to implement
+        - Use BOINC Central for dispatching jobs
+            use existing job-submission and file-management RPCs
+        - Use the Demo grid;
+            SU attaches all Demo nodes to BOINC Central
+            (in restricted mode, even though apps coming from there are secure anyway).
+
+    There are limits on
+        - how much computing you get per week
+        - size of input/output files
+
+    possible variant:
+        - you can pay to get more computing
+
+    This is similar to Open Science Grid but
+        - no vetting of job submitters.
+        - has the BOINC "polymorphic app" concept
+```
+This is the "test drive" experience.
+It gives anyone - scientist or not - sporadic access to a few hundred computers.
+This may be all that some scientists need.
+
+One of the apps in the library is the VBox wrapper,
+so you can bring your own apps, but they have to run in VMs.
+We can use boinc2docker (and TACC's extensions) to automate converting
+any Linux/Intel app to a Docker image.
+We could also develop tools for managing a set of these images.
+(my earlier "tire-kicking" Google doc describes this)
+
+Notes:
+- no result validation is done; Demo grid nodes are assumed to be reliable.
+- you don't have to specify job sizes (CPU, RAM, disk).
+  We could have a system that estimates these for you, based on past jobs.
+
+### unvetted/distributed
+```
+    Similar, but user has their own BOINC server;
+        avoids storage and bandwidth bottleneck of central server
+    Also lets you attach your own computers directly.
+    - get a Linux machine visible on the Internet
+        could be a cloud node
+    - install BOINC server on that machine and create a project
+        could be from a package
+        could be the BOINC server Docker image
+        could be from a VM image
+    - BOINC server is a black box to user
+    - run commands to install apps from library
+    - submit jobs through same cmdline or web interface
+    - register your BOINC server with BOINC Central
+        no vetting
+        server is registered with SU as "unvetted project"
+
+    Implementation
+        Uses Demo grid hosts
+        Science United attaches Demo grid hosts to unvetted projects in restricted mode
+```
+---------------
+```
+Vetting:
+    partially vetted: we believe that
+        - your identity and affiliation are true
+        - you're doing the kind of computing you claim
+            (science area, location)
+        This gives you access to more computing but you still need to use trusted apps
+    fully vetted: partial vetting plus
+        - we believe that your apps are not malware
+        - we believe that you do code signing
+        This lets you use your own non-VM apps
+
+Partially vetted
+    You can use either the central or distributed model.
+    Your apps run on all Science United hosts (currently about 5,000).
+
+Fully vetted
+    Use with distributed model (your own server)
+    You can add your own apps and app versions.
+        May as well use the current BOINC tools for this;
+        requires logging in to your project server,
+        code-signing, maybe writing XML plan class specs
+    Your project is registered on Science United,
+        and it's attached to hosts based on science area
+        and computing resources (that's how SU currently works)
+    Your apps run on all Science United hosts in trusted mode
+    Your project is listed on the BOINC web site,
+        and in the project list in the client GUI,
+        so volunteers can attach to it explicitly.
+
+Notes:
+- result validation becomes an issue,
+    mostly because of possible credit cheating.
+    Need to figure out how to do this in a way that doesn't require
+    users to write validators.
+
+    Or get rid of credit
+```
+--------------
+How hard is this to implement?
+```
+Things I can do:
+    BOINC library framework
+    BOINC Central
+    Changes to SU
+    Changes to BOINC client
+
+Things I'd need help with:
+    Job submission interfaces
+
+Things others would have to do:
+    build app versions for BOINC library
+```
\ No newline at end of file