GpuSync
Vitalii Koshura edited this page 2023-04-10 04:06:11 +02:00

Support for synchronous GPU apps

GPU apps can run much faster (roughly 2X) if their CPU and GPU parts synchronize using busy-waiting instead of interrupts. (Busy-waiting means the app uses 100% of a CPU, and its process runs at normal OS scheduling priority.) For example, you might gain 100 GFLOPS of GPU speed at the expense of 5 GFLOPS of CPU speed.
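To make the trade-off concrete, here is a minimal sketch of the two waiting styles. It doesn't touch a real GPU: a timer thread stands in for a kernel finishing, and all the names (`wait_busy`, `wait_blocking`, `cpu_seconds_while_waiting`) are made up for illustration. It only shows that spinning burns CPU time for the whole wait, while blocking uses almost none.

```python
import threading
import time


def wait_busy(done_flag):
    """Busy-wait: poll the flag in a tight loop, using a CPU the whole time."""
    while not done_flag.is_set():
        pass


def wait_blocking(done_flag):
    """Interrupt-style wait: sleep until another thread signals completion."""
    done_flag.wait()


def cpu_seconds_while_waiting(wait_fn, kernel_seconds=0.05):
    """Return the process CPU time consumed while waiting for a fake 'kernel'."""
    done = threading.Event()
    # The timer thread stands in for the GPU finishing its work.
    fake_kernel = threading.Timer(kernel_seconds, done.set)
    cpu_start = time.process_time()
    fake_kernel.start()
    wait_fn(done)
    fake_kernel.join()
    return time.process_time() - cpu_start


if __name__ == "__main__":
    print("busy-wait CPU seconds:", cpu_seconds_while_waiting(wait_busy))
    print("blocking CPU seconds: ", cpu_seconds_while_waiting(wait_blocking))
```

In a real app the payoff of spinning is lower latency in noticing that the GPU is done; the sketch only illustrates the CPU cost side.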

However, many volunteers do not want this behavior, so we can't make it the default.

This raises several issues:

What is the volunteer interface?

Possibilities:

  • A new preference. Pros: automatically propagated; available to server. Cons: too many prefs already.
  • Config file flag. Pros: simple. Cons: not propagated.

I'd lean towards a new pref.

Gianni suggested adding a new "Use GPU Full Speed" item to the activity menu, but this isn't practical - among other things, it would prevent the use of GPU prefs like "don't use GPU when computer is in use".

What are the semantics? If the user changes the pref (or config option), should running GPU jobs change their synch mode? What about unstarted jobs?

How does it affect apps?

Should there be separate app versions for the 2 types of synch? No; that would prevent users from changing mode mid-job. Apps should be able to do either type of synch, based on a flag in APP_INIT_DATA. Let's call such app versions synch-selectable.
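A synch-selectable app might pick its mode roughly as sketched below. This is only an illustration: the `AppInitData` class is a stand-in for APP_INIT_DATA, and the field name `gpu_busy_wait` is hypothetical, since the actual flag hasn't been defined yet.

```python
from dataclasses import dataclass
from enum import Enum


class SyncMode(Enum):
    INTERRUPT = "interrupt"
    BUSY_WAIT = "busy_wait"


@dataclass
class AppInitData:
    # Stand-in for BOINC's APP_INIT_DATA; only the field this sketch needs.
    # The name gpu_busy_wait is hypothetical.
    gpu_busy_wait: bool = False


def choose_sync_mode(init_data):
    """Pick the CPU/GPU synch mode from the flag passed by the client."""
    if init_data.gpu_busy_wait:
        return SyncMode.BUSY_WAIT
    return SyncMode.INTERRUPT
```

To support changing mode mid-job, the app would presumably call something like `choose_sync_mode` again whenever it gets updated init data, rather than only at startup.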

BOINC needs to know whether a given app version is synch-selectable. We'll do this with an element in the app version's XML doc (update_versions will have to be changed to support this).
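As a rough sketch of what that could look like: the element name `sync_selectable` and the sample XML below are assumptions for illustration, not an agreed format, and the parsing is plain Python rather than the actual update_versions code.

```python
import xml.etree.ElementTree as ET

# Illustrative app version XML doc. The <sync_selectable/> element name is
# hypothetical, and the file name is made up.
VERSION_XML = """
<version>
    <file>
        <physical_name>example_app_1.0_cuda</physical_name>
        <main_program/>
    </file>
    <sync_selectable/>
</version>
"""


def is_sync_selectable(version_xml):
    """Return True if the version doc contains the (hypothetical) flag element."""
    root = ET.fromstring(version_xml)
    return root.find("sync_selectable") is not None
```

A doc without the element would be treated as not synch-selectable, so existing app versions keep their current behavior.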

How does it affect server scheduling?

What avg_ncpus does the scheduler report? (I guess the non-busywait value.)

What avg_ncpus does it use in accounting the CPU seconds for jobs sent? (If the pref is set, maybe it should use 1.)

How is job completion time estimation impacted? (Potentially not at all - the system will adapt.)
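The accounting question can be sketched as follows. This assumes the scheduler charges a job's estimated runtime times its CPU usage against the request, and that it would switch to 1 CPU when the pref is set and the app version is synch-selectable. Both are guesses, and the function and parameter names are made up.

```python
def accounted_cpu_seconds(est_runtime_s, avg_ncpus, full_speed_pref, sync_selectable):
    """CPU seconds to charge for one GPU job sent to a host (illustrative only)."""
    # Busy-waiting only applies if the user asked for it AND the app supports it.
    busy_waiting = full_speed_pref and sync_selectable
    effective_ncpus = 1.0 if busy_waiting else avg_ncpus
    return est_runtime_s * effective_ncpus
```

For a one-hour job with avg_ncpus of 0.1, this charges about 360 CPU seconds normally and 3600 when busy-waiting.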

How does it affect client scheduling?

Let's assume that app_version.avg_ncpus is the non-busywait value (e.g. 0.1 or so).

If the full-speed pref is set, and the app version is synch-selectable, then the client needs to set avg_ncpus to 1. Otherwise we'll overcommit the CPUs, since each busy-waiting job occupies a full CPU while being budgeted for only a fraction of one. (Note: we need to save the original avg_ncpus somewhere, in case the user turns off the pref.)
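A minimal sketch of that save-and-restore logic, assuming a simplified `AppVersion` record. The `sync_selectable` and `saved_avg_ncpus` fields and the function name are hypothetical, not existing client code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AppVersion:
    avg_ncpus: float                          # non-busywait value, e.g. 0.1
    sync_selectable: bool = False             # hypothetical field
    saved_avg_ncpus: Optional[float] = None   # hypothetical: original value


def apply_full_speed_pref(av, full_speed):
    """Adjust avg_ncpus when the full-speed pref is turned on or off."""
    if full_speed and av.sync_selectable:
        # Save the original only once, so repeated calls don't overwrite it.
        if av.saved_avg_ncpus is None:
            av.saved_avg_ncpus = av.avg_ncpus
        av.avg_ncpus = 1.0
    elif av.saved_avg_ncpus is not None:
        # Pref turned off: restore the non-busywait value.
        av.avg_ncpus = av.saved_avg_ncpus
        av.saved_avg_ncpus = None
```

App versions that aren't synch-selectable are left alone, so the pref has no effect on their CPU budgeting.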