Generalizing credit

Note: in the following, these terms are used:

FLOP = floating-point operation
FLOPs = plural of FLOP
FLOPS = FLOPs per second

The current credit system is based on FLOPs: a 1 GFLOPS computer running BOINC all the time gets 200 credits per day. We did things this way because BOINC was designed to support scientific computing, where most apps are floating-point intensive and FLOPS is the standard unit of performance (for example, supercomputer performance is measured in FLOPS).
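As a minimal sketch of the rule above (1 GFLOPS for one day earns 200 credits), the conversion from FLOPs to credit could look like this. The function and constant names are made up for illustration; they are not taken from the BOINC code.

```python
SECONDS_PER_DAY = 86_400
CREDITS_PER_GFLOPS_DAY = 200  # from the text: 1 GFLOPS for a full day = 200 credits


def credit_for_flops(flops: float) -> float:
    """Credit for a job that performed `flops` floating-point operations."""
    # Express the work in "GFLOPS-days", then scale by 200 credits per GFLOPS-day.
    gflops_days = flops / (1e9 * SECONDS_PER_DAY)
    return gflops_days * CREDITS_PER_GFLOPS_DAY


# A 1 GFLOPS computer running for one full day:
one_day_at_1_gflops = 1e9 * SECONDS_PER_DAY
print(credit_for_flops(one_day_at_1_gflops))  # 200.0
```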

Also, for grant-writing purposes I need to be able to say "BOINC has a peak performance of X PetaFLOPS"; the current credit system lets me do this.

Recently, a new project (Bitcoin Utopia, or BU) started. Their jobs involve Bitcoin mining, which consists of computing SHA-256 hashes. This is an integer algorithm. It can be done on CPUs and GPUs, but it can be done much faster on ASICs. These ASICs can only do hashing; they can't do floating-point math, and they're of no use to any BOINC project other than BU.

The question was: how should BU grant credit for its jobs? One approach is to decide on an "equivalence" between hashes and FLOPs, and assign credit based on the current formula.

How many FLOPs is a hash equivalent to? One approach is to look at a CPU or GPU, measure its FLOPS and its hashes/sec, and divide. Depending on the device, this gives an answer in the range of 1,000 to 10,000 FLOPs per hash.
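The "measure and divide" step can be sketched as follows. The device figures are hypothetical, chosen only so the results land in the 1,000 to 10,000 range mentioned above; they are not measurements of any real hardware.

```python
def flops_per_hash(device_flops: float, device_hashes_per_sec: float) -> float:
    """FLOPs-per-hash equivalence implied by one device's two measured rates."""
    if device_hashes_per_sec <= 0:
        raise ValueError("hash rate must be positive")
    return device_flops / device_hashes_per_sec


# Hypothetical devices (illustrative numbers only):
cpu_ratio = flops_per_hash(device_flops=20e9, device_hashes_per_sec=5e6)
gpu_ratio = flops_per_hash(device_flops=2e12, device_hashes_per_sec=400e6)

print(cpu_ratio)  # 4000.0
print(gpu_ratio)  # 5000.0
```

Different devices give different ratios, which is why the equivalence is a range rather than a single number.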

BU did these things, which are completely reasonable. But it turns out - because ASICs are so fast - that BU is granting huge amounts of credit. With fewer than 1,000 users, BU is granting more credit than all other projects combined grant to their 300,000 users.
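To see why the numbers blow up, here is a rough sketch that applies a FLOPs-per-hash equivalence to a hash rate and then uses the 200 credits per GFLOPS-day rule. The hash rates and the 4,000 FLOPs-per-hash figure are hypothetical, not BU's actual values.

```python
CREDITS_PER_GFLOPS_DAY = 200  # from the text: 1 GFLOPS for a full day = 200 credits


def daily_credit_from_hash_rate(hashes_per_sec: float, flops_per_hash: float) -> float:
    """Credit per day for a device hashing continuously, under a FLOPs-per-hash equivalence."""
    equivalent_gflops = hashes_per_sec * flops_per_hash / 1e9
    return equivalent_gflops * CREDITS_PER_GFLOPS_DAY


FLOPS_PER_HASH = 4_000    # hypothetical equivalence
CPU_HASHES_PER_SEC = 5e6  # hypothetical CPU hash rate
ASIC_HASHES_PER_SEC = 1e12  # hypothetical ASIC hash rate

cpu_credit = daily_credit_from_hash_rate(CPU_HASHES_PER_SEC, FLOPS_PER_HASH)
asic_credit = daily_credit_from_hash_rate(ASIC_HASHES_PER_SEC, FLOPS_PER_HASH)

print(cpu_credit, asic_credit, asic_credit / cpu_credit)
```

With these made-up numbers, one ASIC earns 200,000 times the daily credit of the CPU, even though it cannot execute a single floating-point operation.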

This situation has several undesirable consequences:

Credit no longer measures FLOPs; BOINC's combined average credit no longer measures its peak performance in FLOPS.
The competitive balance between volunteers is lost. The top users and teams (BOINC-wide) will be only those with ASICs running BU.
The competitive balance between projects is lost. BU will always grant far more credit than other projects, for a type of computation that is specific to BU and is not usable for other projects.

Proposal

The basic problem is that we have a credit system based on FLOPs, but we want to give credit for things (like hashes) that are not FLOPs. A similar situation actually already exists in BOINC. We'd like to be able to give credit for disk storage and network communication; some projects have applications that use these resources rather than computing. But there's no obvious way to translate storage or bandwidth into "equivalent FLOPs", and even if there were, we'd be destroying the meaning of credit as a measure of FLOPs.

So, I propose that, rather than trying to shoehorn everything into one number, we keep track of multiple types of credit. In particular, I propose 4 types:

Computing credit: general-purpose FLOPs, i.e. what we have now.
Storage credit: measured in byte-seconds (possibly multiplied by availability).
Network credit: measured in bytes, the sum of upload and download.
Project-defined credit. Projects can define and grant this however they like. For BU, this would be proportional to hashes. Projects like Quake-Catcher Network or Radioactive@Home might grant credit for monitoring types of sensors. Projects like Wildlife@Home might grant credit for a human activity, like annotating video.
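The four types and the raw units for the storage and network types could be sketched like this. All names are hypothetical, and no scaling factors are applied because the proposal doesn't define any.

```python
from enum import Enum


class CreditType(Enum):
    """The four proposed credit types (identifiers are illustrative)."""
    COMPUTING = "computing"  # general-purpose FLOPs, as now
    STORAGE = "storage"      # byte-seconds
    NETWORK = "network"      # bytes uploaded plus bytes downloaded
    PROJECT = "project"      # defined by each project


def storage_credit(bytes_stored: float, seconds: float, availability: float = 1.0) -> float:
    """Byte-seconds of storage, optionally weighted by availability in [0, 1]."""
    return bytes_stored * seconds * availability


def network_credit(bytes_uploaded: float, bytes_downloaded: float) -> float:
    """Total bytes transferred in both directions."""
    return bytes_uploaded + bytes_downloaded


# 1 GB held for an hour by a host that is reachable half the time:
print(storage_credit(1e9, 3600, availability=0.5))
# 100 bytes up, 250 bytes down:
print(network_credit(100, 250))  # 350
```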

We'll add APIs so that project validators can grant the new types of credit. We'll figure out how to make them cheat-resistant.

The BOINC database will maintain each of these types of credit for each host, user, and team. It will store both total and recent average for each type.
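A rough sketch of what "total and recent average for each type" might look like for one host, user, or team. The one-week half-life and the particular decay formula are assumptions for illustration; this is not BOINC's actual averaging code, and the class names are hypothetical.

```python
import math
from dataclasses import dataclass, field

SECONDS_PER_DAY = 86_400.0
HALF_LIFE_DAYS = 7.0  # assumption: recent average decays with a one-week half-life
CREDIT_TYPES = ("computing", "storage", "network", "project")


@dataclass
class CreditRecord:
    total: float = 0.0
    recent_avg: float = 0.0   # roughly "credit per day", exponentially decayed
    last_update: float = 0.0  # time of the last grant, in seconds

    def grant(self, amount: float, now: float) -> None:
        # Decay the old average, then add the new grant's contribution.
        elapsed_days = max(now - self.last_update, 0.0) / SECONDS_PER_DAY
        decay = 0.5 ** (elapsed_days / HALF_LIFE_DAYS)
        self.recent_avg = self.recent_avg * decay + amount * math.log(2) / HALF_LIFE_DAYS
        self.total += amount
        self.last_update = now


@dataclass
class CreditAccount:
    """Credit of every type for one host, user, or team."""
    records: dict = field(
        default_factory=lambda: {t: CreditRecord() for t in CREDIT_TYPES}
    )

    def grant(self, credit_type: str, amount: float, now: float) -> None:
        self.records[credit_type].grant(amount, now)


user = CreditAccount()
user.grant("computing", 200.0, now=0.0)
user.grant("project", 5_000.0, now=0.0)
print(user.records["computing"].total, user.records["project"].total)  # 200.0 5000.0
```

Keeping a fixed set of four records per entity matches the idea of adding fields to existing tables rather than creating new ones.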

Wherever we show credit on project web sites - leader boards, user and team pages, etc. - we'll show one or more credit types; the choice of types will be configurable by the project.

The new types of credit will be included in the XML statistics files exported by projects. Statistics sites (such as BOINCStats) will be extended to show the new types of credit.

Discussion

Are these 4 types enough?

I'd prefer to use only these 4 types, rather than e.g. having an extensible system where projects can add several of their own types. Reasons: conceptual simplicity, and also ease and efficiency of implementation (add fields to existing DB tables rather than create new/big tables).

Other app-specific coprocessors

Eric pointed out the possibility of a variant of the SETI@home app that uses an ASIC or FPGA to compute FFTs. What if these were 1000X faster than GPUs or CPUs? We'd have the same problem as we do now with BU.

My feeling about this is that computing credit should measure general-purpose FLOPs, i.e. FLOPs that are usable by most science applications. FFT FLOPs are not general-purpose. So the right thing would be for SETI@home to grant both computing credit and project-defined credit. CPU and GPU jobs would be granted both; jobs done by ASICs or FPGAs would be granted only project-defined credit.

Similarly, BU could grant computing credit for mining jobs done by CPU or GPU; but for ASIC jobs it would grant only project-defined credit.
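The rule for SETI@home and BU can be sketched as a small policy function. The device labels and the function name are hypothetical; where exactly to draw the general-purpose line is the judgment call discussed in this section.

```python
# Devices whose FLOPs count as general-purpose (the line drawn in this proposal).
GENERAL_PURPOSE_DEVICES = frozenset({"cpu", "gpu"})


def credit_types_for_job(device: str) -> frozenset:
    """Which credit types a validator would grant for a job run on `device`."""
    if device.lower() in GENERAL_PURPOSE_DEVICES:
        # CPU and GPU jobs earn both computing and project-defined credit.
        return frozenset({"computing", "project"})
    # ASICs, FPGAs, and other app-specific coprocessors earn only project-defined credit.
    return frozenset({"project"})


for device in ("cpu", "gpu", "asic", "fpga"):
    print(device, sorted(credit_types_for_job(device)))
```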

Of course this is all subjective and fuzzy; you might argue that GPU FLOPs are not general-purpose because some apps don't map well to GPUs. But we need to draw a line somewhere, and I think that, with the advent of OpenCL and CUDA, GPUs can be considered general-purpose.
