A small attempt at future proofing things. The value is 1 on my new GPU, but might not be on higher end cards. Docs implies it should be part of the calculation.
Use AMD's vendor specific extension if it is available to calculate the total number of shaders and determine the peak FLOP rate from that.
My new GPU I got for Christmas was only reporting 30% of its peak FLOP rate and does not support CAL.
If an NVIDIA GPU is detected only by OpenCL we don't know how many
cores per proc it has.
Instead of assuming 8 (compute capability 1) assume 48 (CC 2).
We could assume 192 (CC 3) but better to err on the low side.
There was a bug where, when you suspend GPU activity,
GPU jobs show as suspended but are not actually suspended.
This was because of recent changes to distinguish GPU and non-GPU coprocs.
Change things so that coprocs are by default GPUs.
If you want to declare a non-GPU coproc in your cc_config.xml,
you much put <non_gpu/> in its <coproc> element.
I.e. treat miner ASICs as a distinct processor type;
send miner_asic jobs only if the client requests them.
Note: I was planning to do this in a more general way,
in which the scheduler wouldn't have a hard-wired list of processor types.
However, that would be a large code change,
so for now I just added miner_asic to the list of processor types
(nvidia, ati, intel_gpu),
and made various changes to get things to work.
Also: in the job dispatch logic, try to send coproc jobs
before CPU jobs.
That way if e.g. there's a limit on jobs in progress,
we'll preferentially send coproc jobs.
We weren't copying the request fields from RSC_WORK_FETCH to COPROC.
Do this, and clean up the code a bit.
Note: the arrays that parallel the COPROCS::coprocs array
are a bit of a kludge; that stuff logically belongs in COPROC.
But it's specific to the client, so I can't put it there.
Maybe I could do something fancy with derived classes, not sure.
A "generic" coprocessor is one that's reported by the client,
but's not of a type that the scheduler knows about (NVIDIA, AMD, Intel).
With this commit the following works:
- On the client, define a <coproc> in your cc_config.xml
with a custom name, say 'miner_asic'.
- define a plan class such as
<plan_class>
<name>foobar</name>
<gpu_type>miner_asic</gpu_type>
<cpu_frac>0.5</cpu_frac>
<plan_class>
- App versions of this plan class will be sent only to hosts
that report a coproc of type "miner_asic".
The <app_version>s in the scheduler reply will include
a <coproc> element with the given name and count=1.
This will cause the client (at least the current client)
to run only one of these jobs at a time,
and to schedule the CPU appropriately.
Note: there's a lot missing from this;
- app version FLOPS will be those of a CPU app;
- jobs will be sent only if CPU work is requested
... and many other things.
Fixing these issues requires a significant re-architecture of the scheduler,
in particular getting rid of the PROC_TYPE_* constants
and the associated arrays,
which hard-wire the 3 fixed GPU types.
For now, handle AMD/ATI, NVIDIA or Intel GPUs as before. But for other, "new" vendors, we treat each device as a separate resource, creating an entry for each instance in the COPROCS::coprocs[] array and copying the device name COPROC::opencl_prop.name into the COPROC::type field (instead of the vendor name.)
For devices from "new" vendors, set <gpu_type> field in init_data.xml file to the vendor string supplied by OpenCL. This should allow boinc_get_opencl_ids() to work correctly with these "new" devices without modification.
Notes:
- The same CPU can have a different cpu_opencl_prop for each of multiple OpenCL platforms. We send them all to the project server because:
- Different OpenCL platforms report different values for the same CPU.
- Some OpenCL CPU apps may work better with certain OpenCL platforms.
- OpenCL has only 64 bits for global_mem_size, so it can report a max of only 4GB; get the CPU RAM size from gstate.hostinfo.m_nbytes.
Some dual-GPU laptops (e.g., Macbook Pro) don't power down the more powerful GPU until all applications which used them exit. To save battery life, the client launches a second instance of the client as a child process to detect and get info about the GPUs.
The child process writes the info to a temp file which our main client then reads.
This option is enabled at compile time by defining USE_CHILD_PROCESS_TO_DETECT_GPUS as non-zero in gpu_detect.cpp
Add OPENCL_DEVICE_PROP cpu_opencl_prop to HOST_INFO;
this store info about the host's ability to run CPU OpenCL apps.
Detect this, and report it in scheduler requests.