mirror of https://github.com/BOINC/boinc.git
Updated Sporadic Applications (markdown)
parent 14709d03b0
commit aae2ef17bd
@@ -1,16 +1,15 @@
 BOINC was originally designed as a batch processing system:
-you submit jobs, they run (independently of one another)
-and eventually they finish.
-Some potential uses of volunteer computing don't fit this model.
-They may require that their apps run simultaneously,
+you submit jobs, they run (independently of each other)
+and eventually finish.
+But some potential uses of volunteer computing don't fit this model.
+They may require that their apps run simultaneously on different computers,
 and perhaps that they communicate directly with each other.
-Examples include MPI-type parallel apps
-and distributed machine learning.
+Examples include MPI-type parallel computing and distributed machine learning.
 BOINC's 'sporadic application' mechanism is designed to support these types of systems,
 and to allow them to coexist with batch processing.

 The jobs of a sporadic app run (i.e. are present in memory)
-all the time (like non-CPU-intensive jobs)
+all the time, like non-CPU-intensive jobs,
 but compute only some of the time.

 Like regular apps, a sporadic app can have multiple app versions.
@@ -28,7 +27,7 @@ A sporadic app is typically part of another distributed system -
 a 'guest system' - that exists outside of BOINC.
 The guest system typically has its own server that handles requests
 and dispatches them to 'worker nodes' (running BOINC).
-Its worker nodes may communicate directly with each other - peer-to-peer -
+Its worker nodes may communicate directly with each other - peer-to-peer or via a relay -
 as well as with the server.

 A sporadic job engages in conversations with both the BOINC client
@@ -40,14 +39,17 @@ The client/app protocol uses the following messages:

 Client to app:

-```DONT_COMPUTE```: you can't compute now (e.g. because resources are not available)
-```COULD_COMPUTE```: you could compute if you want
-```COMPUTING```: you're computing as far as I'm concerned
+```DONT_COMPUTE```: the app can't compute now (e.g. because resources are not available)
+
+```COULD_COMPUTE```: the app could potentially compute
+
+```COMPUTING```: the app is computing as far as the client is concerned

 App to client:

-```DONT_WANT_COMPUTE```: I don't want to compute now
-```WANT_COMPUTE```: I want to compute
+```DONT_WANT_COMPUTE```: the app doesn't want to compute now
+
+```WANT_COMPUTE```: the app wants to compute

 The protocol between the app and the guest server isn't specified.
 It could be based on polling from the app,
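The two message sets in this hunk can be modelled as a pair of enumerations, one per direction. The following is a minimal sketch in Python; the class names `ClientToApp` and `AppToClient` and the `MEANING` table are hypothetical names chosen for illustration and are not taken from the BOINC API. Only the five message names and their descriptions come from the text.

```python
from enum import Enum


class ClientToApp(Enum):
    """Messages the BOINC client sends to a sporadic app (hypothetical class name)."""
    DONT_COMPUTE = "DONT_COMPUTE"
    COULD_COMPUTE = "COULD_COMPUTE"
    COMPUTING = "COMPUTING"


class AppToClient(Enum):
    """Messages a sporadic app sends to the BOINC client (hypothetical class name)."""
    DONT_WANT_COMPUTE = "DONT_WANT_COMPUTE"
    WANT_COMPUTE = "WANT_COMPUTE"


# Meanings as worded in the updated wiki text.
MEANING = {
    ClientToApp.DONT_COMPUTE: "the app can't compute now",
    ClientToApp.COULD_COMPUTE: "the app could potentially compute",
    ClientToApp.COMPUTING: "the app is computing as far as the client is concerned",
    AppToClient.DONT_WANT_COMPUTE: "the app doesn't want to compute now",
    AppToClient.WANT_COMPUTE: "the app wants to compute",
}
```

Keeping the two directions in separate types makes it harder to send a client-side message from the app by mistake.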
@@ -78,23 +80,25 @@ The steps are:
 perhaps because the user has suspended computation.
 * The app relays this to the server;
 this tells the server not to send any requests.
+The server can keep track of which worker nodes
+are available for computing at a given point.
 * Eventually the user enables computing;
 the client relays this as a ```COULD_COMPUTE``` message to the app,
 and the app relays it to the server,
 indicating that it can now accept requests.
 * The server sends a request to the app, asking it to do some computing
 (and possibly some network communication with other workers).
-* The app sends WANT_COMPUTE to the client.
+* The app sends ```WANT_COMPUTE``` to the client.
 * The client reserves the needed computing resources
-and sends COMPUTING to the app
-* The app computes. When it's done, it sends DONT_WANT_COMPUTE to the client.
-* The client (assuming computing is not suspended) sents COULD_COMPUTE
+and sends ```COMPUTING``` to the app
+* The app computes. When it's done, it sends ```DONT_WANT_COMPUTE``` to the client.
+* The client (assuming computing is not suspended) sends ```COULD_COMPUTE```

 It's also possible that the app must stop computing before the request is finished -
 for example, because the user suspends computing.
 In this case:

-* The client sends DONT_COMPUTE to the app
+* The client sends ```DONT_COMPUTE``` to the app
 * The app notifies the server that it can't finish the request
 (or it might wait before doing this, in case computing is re-enabled quickly).
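The step sequence in this hunk can be traced with a small simulation. This is a sketch under stated assumptions, not BOINC's implementation: the functions `client_message` and `run_request` are hypothetical, the client is assumed to always succeed in reserving resources when computing is not suspended, and the app-to-server labels (`unavailable`, `available`, `cannot finish`) are invented because the text leaves that protocol unspecified.

```python
def client_message(suspended: bool, app_wants_compute: bool) -> str:
    """Return the client-to-app message for the current situation.

    Assumption: when computing is not suspended, the client can always
    reserve the resources the app asks for.
    """
    if suspended:
        return "DONT_COMPUTE"
    if app_wants_compute:
        return "COMPUTING"
    return "COULD_COMPUTE"


def run_request(suspend_midway: bool = False) -> list:
    """Trace one request from the guest server through the app and client."""
    log = []
    # The user has suspended computation; the app tells the server.
    log.append("client->app " + client_message(True, False))
    log.append("app->server unavailable")        # hypothetical label
    # The user enables computing; the app can accept requests again.
    log.append("client->app " + client_message(False, False))
    log.append("app->server available")          # hypothetical label
    # The server dispatches work and the app asks the client for resources.
    log.append("server->app request")
    log.append("app->client WANT_COMPUTE")
    log.append("client->app " + client_message(False, True))
    if suspend_midway:
        # The user suspends computing before the request is finished.
        log.append("client->app " + client_message(True, True))
        log.append("app->server cannot finish")  # hypothetical label
        return log
    # Normal completion: the app releases the resources.
    log.append("app->client DONT_WANT_COMPUTE")
    log.append("client->app " + client_message(False, False))
    return log


if __name__ == "__main__":
    for line in run_request():
        print(line)
```

Calling `run_request(suspend_midway=True)` follows the interrupted path instead, ending with the app telling the server that it cannot finish the request.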