# Credit options

"Credit" is a number associated with completed jobs,
reflecting how much floating-point computation was (or could have been) done.
For CPU applications the basic formula is:

1 unit of credit = 1/200 day runtime on a CPU whose Whetstone benchmark is 1 GFLOPS.

Whetstone measures peak performance, and applications that do a lot of memory or disk access get lower FLOPS.
So credit measures peak, not actual, FLOPs.
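
For example, under this definition a 1 GFLOPS CPU running for half a day earns 0.5 * 1 * 200 = 100 units of credit, and a 10 GFLOPS CPU running for the same half day earns 1000 (the hosts here are made up; the arithmetic just restates the formula above).
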
Credit is used for two purposes:

1. For users, to see their rate of progress,
to compete with other users or teams,
and to compare the performance of hosts.

2. To get an estimate of the peak performance available to a particular project,
or of the volunteer host pool as a whole.

For 2) we care only about averages.
For 1) we also care about parity between similar jobs;
users get upset if someone else gets a lot more credit for a similar job.

BOINC provides 4 ways of determining credit.
The choice (per app) depends on the properties of the app:

* If you can estimate a job's FLOPs in advance, use **pre-assigned** credit.

* Else if you can estimate a job's FLOPs after it completes, use **post-assigned** credit.

* Else if the app has only CPU versions, use **runtime credit**.

* Else use **adaptive credit**.
# Pre-assigned credit

You can use this if the amount of computation done by each job is known in advance,
e.g. if all jobs do the same computation.
Measure the runtime on a machine with known Whetstone benchmarks.
Pick a machine with enough RAM that you're not paging.
The credit is then

(runtime in days) * (Whetstone benchmark in GFLOPS) * ncpus * 200

ncpus is the number of CPUs used by the app version; use a sequential version if possible.
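
As a concrete illustration, here is a minimal sketch of that calculation; the function name and the numbers are made up, not part of the BOINC server API.

```
// Hypothetical helper: compute pre-assigned credit from a calibration run.
// runtime_days: measured runtime of one job on the calibration host
// whetstone_gflops: that host's Whetstone benchmark, in GFLOPS
// ncpus: number of CPUs used by the app version (1 for a sequential version)
double preassigned_credit(double runtime_days, double whetstone_gflops, int ncpus) {
    return runtime_days * whetstone_gflops * ncpus * 200;
}

// Example: a 0.05-day (72-minute) run on a 4 GFLOPS host with a sequential
// app version yields 0.05 * 4 * 1 * 200 = 40 units of credit.
```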
You can also use it if the runtime is a linear function of
some job attribute (e.g. input file size) that's known in advance.

To specify:
* use the --credit argument to the create_work command-line program
* if using the C++ API, set wu.canonical_credit in the workunit passed as the first argument to create_work() (see the sketch below).

You must run the app's validator with the **--credit_from_wu** option.
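
For the C++ route, a minimal sketch showing only the credit-related field; the helper and workunit names are made up, and the rest of the workunit setup (templates, resource bounds, input files) is the usual create_work() configuration.

```
#include <cstring>
#include "boinc_db.h"   // DB_WORKUNIT, from the BOINC server-side headers

// Hypothetical helper: prepare a workunit with pre-assigned credit.
void make_preassigned_wu(DB_WORKUNIT& wu, double credit) {
    wu.clear();
    std::strcpy(wu.name, "example_wu");   // made-up workunit name
    wu.canonical_credit = credit;         // credit computed as described above
    // ... set the templates, rsc_* bounds, etc. as usual,
    // then pass wu as the first argument to create_work().
}
```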
TODO: add this to the remote job submission RPCs if anyone wants it.
# Post-assigned credit

Use this if you can estimate the FLOPs done by a completed job,
based on the contents of its output files or stderr.
For example, if your app has an outer loop,
and you can measure (as above) the credit C due for each iteration,
the job credit is C times the number of iterations performed.

To use this:
* In your validator, have the init_result() function set result.claimed_credit (see the sketch below).
* Run the validator with **--post_assigned_credit**.

A job's granted credit is the claimed credit of its canonical instance.
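
A minimal sketch of such an init_result(), assuming the app reports its iteration count in stderr in the form parsed below, and that a per-iteration credit of 0.01 was measured as described above; the usual reading of output files for result comparison is omitted.

```
#include <cstdio>
#include <cstring>
#include "boinc_db.h"   // RESULT, from the BOINC server-side headers

const double CREDIT_PER_ITERATION = 0.01;   // measured per-iteration credit (made up)

int init_result(RESULT& result, void*& data) {
    data = NULL;   // comparison data would normally be set up here
    int iterations = 0;
    const char* p = std::strstr(result.stderr_out, "<iterations>");
    if (p && std::sscanf(p, "<iterations>%d", &iterations) == 1) {
        result.claimed_credit = iterations * CREDIT_PER_ITERATION;
    }
    return 0;
}
```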
# Runtime-based credit

Use this if the app has only CPU app versions.
The "claimed credit" for a job instance is runtime * ncpus * peak_flops,
where peak_flops is the host's Whetstone benchmark.
The job's granted credit is the average of the instance claimed credits.
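
For example (made-up numbers): if a job is replicated and its two instances claim 90 and 110 units of credit because their runtimes differ, the job is granted 100.
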
To use this: pass the **--credit_from_runtime** option to the app's validator.
You must also supply **--max_granted_credit**.

Runtime-based credit can't be used if the app has GPU versions
because efficiency can vary by orders of magnitude between CPU and GPU versions.

Runtime-based credit is limited by max_granted_credit, but is otherwise not cheat-proof.
# Runtime-based credit via trickle messages

If you have very long-running jobs (a week or more) you may want to
grant credit incrementally.
To do so:

* Have your application periodically send [trickle-up messages](TrickleApi)
  with variety **runtime** and content
  ```
  <runtime>X</runtime>
  ```
  where X is the runtime since the last trickle message
  (see the application-side sketch after this list).

* Run the **trickle_credit** daemon as follows:
  ```
  trickle_credit --variety runtime --max_runtime Y
  ```
  where Y is the limit on runtime
  (typically the period of the trickle messages).

* Run your validator for the app with the **--no_credit** option.
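
A minimal sketch of the application side, assuming the standard BOINC application API; the function name and the bookkeeping of runtime since the last message are made up.

```
#include <cstdio>
#include "boinc_api.h"   // boinc_send_trickle_up()

// Call periodically (e.g. once per checkpoint) from the app's main loop.
// runtime_since_last: runtime accumulated since the previous trickle-up
// message; how it is tracked is up to the application.
void send_runtime_trickle(double runtime_since_last) {
    char buf[256];
    std::snprintf(buf, sizeof(buf), "<runtime>%f</runtime>", runtime_since_last);
    boinc_send_trickle_up(
        const_cast<char*>("runtime"),   // variety expected by trickle_credit
        buf                             // message body
    );
}
```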
The **trickle_credit** daemon grants credit in proportion to (runtime * CPU FLOPS),
hence this approach should be used only for applications with only single-CPU versions.

This approach is not device-neutral because hosts with the same peak FLOPS
may have different actual FLOPS for the app version.
# Adaptive credit

Use this if you have GPU apps, and are unable to estimate FLOPs even after job completion.
This method maintains performance statistics at the (host, app version) level,
and uses these to normalize credit between CPU and GPU versions.
See [CreditNew].

To use this: it is the default, so nothing needs to be specified.

If you use this, the adaptation will happen faster if you provide
values for workunit fp_ops_est that are correlated with the actual FLOPs.
Use a constant value if you're not sure.
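
A minimal sketch of supplying that estimate when creating work with the C++ API; the helper name and the example value are made up, and it assumes fp_ops_est corresponds to the workunit field rsc_fpops_est.

```
#include "boinc_db.h"   // DB_WORKUNIT, from the BOINC server-side headers

// Hypothetical helper: attach a FLOPs estimate so adaptive credit converges faster.
void set_flops_estimate(DB_WORKUNIT& wu, double estimated_flops) {
    // e.g. 1e14 for a job expected to do about 1e14 floating-point operations;
    // if you have no per-job information, pass the same constant for every job.
    wu.rsc_fpops_est = estimated_flops;
}
```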
# Credit averaging

If you use replication, runtime-based and adaptive credit can produce
different "claimed credit" for each job instance.
The validator code averages these in a tricky way I don't quite understand
(Kevin invented it).
It does not take the minimum.
We should probably offer taking the minimum as an option,
to make runtime-based credit more cheat-proof.
However, this won't work if you use adaptive replication,
where many jobs have only one instance.