pointers to dynamically allocated COPROC-derived objects,
just have the objects themselves.
Dynamic allocation should be avoided at all costs.
svn path=/trunk/boinc/; revision=21564
and default it to off
- client: if we print available GPU RAM (which we now don't),
have a separate timer per GPU type
- scheduler: add new plan classes cuda_opencl (sic) and ati_opencl
svn path=/trunk/boinc/; revision=21498
Some of them allow only 1 CUDA context at a time.
You need to create a CUDA context to get available VRAM.
So the client would run a CUDA job, then immediately kill it.
Solution:
- If a GPU app is running,
let it keep running regardless of available VRAM
(if it's still running, it has enough VRAM).
- But don't start new apps if there's not enough available VRAM,
or if the amount is unknown
(if the client can't create a CUDA context,
the app won't be able to either)
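The rule above can be sketched roughly as follows. This is a minimal illustration, not the client's actual code: the names GpuRamInfo, GpuJob and may_run are hypothetical, and "unknown" is modeled with a simple flag.

```cpp
// Hypothetical types for illustration; not the BOINC client's real structures.
struct GpuRamInfo {
    bool known;     // false if the client could not create a CUDA context
    double bytes;   // available VRAM in bytes; meaningful only if known
};

struct GpuJob {
    bool is_running;        // the app is already executing on the GPU
    double gpu_ram_needed;  // VRAM in bytes that the app needs
};

// Returns true if the job may run (or keep running) on this GPU.
bool may_run(const GpuJob& job, const GpuRamInfo& avail) {
    // A running app keeps running: if it is still alive, it has enough VRAM.
    if (job.is_running) return true;
    // Unknown amount: the client could not create a context,
    // so assume the app could not either.
    if (!avail.known) return false;
    // Otherwise start the app only if enough VRAM is available.
    return avail.bytes >= job.gpu_ram_needed;
}
```

Checking is_running first is what avoids the start-then-kill behavior described above: a running job is never stopped merely because the VRAM query fails or reports a low value.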
- client: if <coproc_debug> is set, print available GPU RAM periodically
svn path=/trunk/boinc/; revision=21253
of other jobs of that type.
They're waiting for GPU RAM, which may now be available.
- client: bug fix in GPU RAM availability
- client: fix testing setup for GPU RAM availability
svn path=/trunk/boinc/; revision=21206
old: assign GPUs, then check available RAM
Problem: may cause starvation on multi-GPU systems.
new: use available RAM info in the assignment process.
Prevents starvation, also reduces the number of driver calls.
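A rough sketch of the new approach, under assumptions: Gpu and pick_gpu are hypothetical names, and available_ram is taken to have been queried once before assignment starts. The point is that a GPU without enough free RAM is skipped during assignment, rather than being assigned first and rejected afterwards.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type for illustration only.
struct Gpu {
    double available_ram;  // free bytes, queried once before assignment
    bool in_use;           // already assigned to a job in this pass
};

// Returns the index of the first idle GPU with enough free RAM,
// or -1 if no GPU qualifies. Marks the chosen GPU as in use.
int pick_gpu(std::vector<Gpu>& gpus, double ram_needed) {
    for (std::size_t i = 0; i < gpus.size(); ++i) {
        if (gpus[i].in_use) continue;
        if (gpus[i].available_ram < ram_needed) continue;  // skip, don't assign
        gpus[i].in_use = true;
        return static_cast<int>(i);
    }
    return -1;
}
```

With two GPUs where only the second has enough free RAM, the job goes to the second GPU. Under the old order it could be assigned the first GPU and then fail the RAM check.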
svn path=/trunk/boinc/; revision=21205
that way it will query server for file length when it resumes,
rather than uploading from the beginning
- client: back out SEH handling for GPU detection
svn path=/trunk/boinc/; revision=20750
routines. Take care of situations where something within
the vendor's functions causes a crash.
client/
coproc_detect.cpp
svn path=/trunk/boinc/; revision=20744
RAM to run the job, but when we actually run the job,
not enough GPU RAM is free, so the application fails.
This can cause a large number of jobs to fail.
Solution:
- app_plan() can specify the GPU RAM requirements of an app version.
This is passed to the client in a new field
<gpu_ram> of the <app_version> element.
- prior to starting or restarting a GPU app, the client
checks the amount of free RAM on the particular GPU.
If it's not enough for the app version,
the client doesn't start it,
and arranges for the scheduler to ignore it for 5 minutes
(by which point there might be more free GPU RAM)
Notes:
1) this change takes effect only when
both client and scheduler are updated.
2) the check is done in enforce_schedule(),
rather than schedule_cpus(),
because only at that point
have we assigned a specific GPU to the job.
3) there's another case to deal with:
a GPU app's malloc of GPU RAM fails in the middle of the job.
Currently the job fails.
I plan to add an API call boinc_temporary_exit(x) so
that the job can exit and potentially restart in x seconds.
(In principle this mechanism is sufficient for all cases,
but it could lead to a lot of starting/exiting,
so the current change is worthwhile).
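The client-side check in the solution above could look roughly like this. It is a sketch only: AppVersion, Job, try_start and GPU_RAM_DEFER_SECONDS are hypothetical names, and the real check lives in enforce_schedule() as stated in note 2.

```cpp
// Hypothetical names throughout; not the client's real structures.
const double GPU_RAM_DEFER_SECONDS = 300;  // the 5 minutes mentioned above

struct AppVersion {
    double gpu_ram;  // bytes; from <gpu_ram> in the <app_version> element
};

struct Job {
    AppVersion av;
    double defer_until;  // the scheduler ignores the job before this time
};

// Called just before starting or restarting a GPU app, once a specific
// GPU has been assigned. free_gpu_ram is the free RAM on that GPU.
bool try_start(Job& job, double free_gpu_ram, double now) {
    if (now < job.defer_until) return false;  // still being ignored
    if (free_gpu_ram < job.av.gpu_ram) {
        // Not enough free RAM: don't start, and retry after the delay.
        job.defer_until = now + GPU_RAM_DEFER_SECONDS;
        return false;
    }
    return true;
}
```

The deferral avoids re-checking the same job on every scheduling pass while GPU RAM is still short.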
svn path=/trunk/boinc/; revision=19864