where an app has both GPU and CPU versions,
but for certain jobs (VLAR WUs in this case)
the GPU version performs poorly and shouldn't be used.
The fix is a kludge - it will result in these jobs
not being sent to the host at all,
rather than being sent with the CPU app.
The current architecture makes it difficult to do otherwise.
One possible fix would be to create a separate app
for VLAR jobs, with only CPU app versions.
svn path=/trunk/boinc/; revision=20419
transitioning every 10 days, hence never become eligible for purging.
The problem: the transitioner has a "safety net" where,
if the WU doesn't have a canonical result,
it arranges for another transition in 10 days.
Skip this if error_mask<>0.
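The safety-net rule above can be sketched roughly as follows. This is a minimal illustration, not the transitioner's actual code: the `WorkUnit` class, its field names, and `apply_safety_net()` are hypothetical stand-ins.

```python
from dataclasses import dataclass

TEN_DAYS = 10 * 86400  # safety-net delay, in seconds


@dataclass
class WorkUnit:
    # Hypothetical stand-in for the WU record; names are illustrative.
    canonical_resultid: int  # 0 means no canonical result yet
    error_mask: int          # nonzero means the WU has an error condition
    transition_time: int     # next time the transitioner should visit the WU


def apply_safety_net(wu: WorkUnit, now: int) -> bool:
    """Arrange a follow-up transition in 10 days, unless the WU has an error.

    Returns True if a follow-up transition was scheduled.
    """
    if wu.canonical_resultid == 0 and wu.error_mask == 0:
        wu.transition_time = now + TEN_DAYS
        return True
    # Errored WUs are left alone so they can become eligible for purging.
    return False
```

With this rule, a WU that has errored out is no longer rescheduled every 10 days.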
svn path=/trunk/boinc/; revision=20265
anonymous-platform app versions.
Otherwise fractional GPU requirements get truncated to zero.
Thanks to Crunch3r for identifying the problem.
svn path=/trunk/boinc/; revision=20189
10000*major + 100*minor + release,
rather than 100*major + minor.
Sometimes you need release-level resolution.
This affects:
- app_version.min_core_version
- config: min_core_client_version_announced
- config: min_core_client_version
Projects using these must multiply them by 100.
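As an illustration of the new encoding, here is a small sketch. The helper names `encode_version()` and `upgrade_old_encoding()` are made up for this example and are not part of BOINC.

```python
def encode_version(major: int, minor: int, release: int = 0) -> int:
    """New format: 10000*major + 100*minor + release."""
    return 10000 * major + 100 * minor + release


def upgrade_old_encoding(old_value: int) -> int:
    """Convert an old-format value (100*major + minor) to the new format.

    Multiplying by 100 gives 10000*major + 100*minor, i.e. release 0.
    """
    return old_value * 100
```

For example, an old setting of 610 (version 6.10) becomes 61000, and a specific release such as 6.10.18 would be written 61018.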
svn path=/trunk/boinc/; revision=20149
if project has crazy DCF, don't automatically request 1 sec;
only request work if there's a shortfall.
- intermediate checkin for notices stuff
svn path=/trunk/boinc/; revision=20145
deletion' pass to 50000 (can be changed with -delete_antiques_limit).
Previously, a large number of antiques led to none being deleted at all.
- Allow changing the interval between passes with -delete_antiques_interval.
svn path=/trunk/boinc/; revision=20138
- remove obsolete and buggy code from transitioner (create_result() in backend_lib)
- account for 'mixed' scheduling in explain_to_user() in sched_send.cpp
- finish transition to configurable patterns for distinguishing files reported by the client
in the Einstein@home-specific part of send_work_locality in sched_locality
(removed previous hardcoded strcmps)
svn path=/trunk/boinc/; revision=20074
will have enough jobs to use its share of resource instances.
This avoids situations where e.g. on a 2-CPU system
a project has 75% resource share and 1 CPU job,
and its STD increases without bound.
Did a general cleanup of the logic for computing
work request sizes (seconds and instances).
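The 2-CPU example above can be worked through numerically. The function below is only one reading of "share of resource instances" and is not the client's actual logic; `instance_shortfall()` and its parameters are hypothetical.

```python
def instance_shortfall(ninstances: int, share_fraction: float, njobs: int) -> float:
    """Instances in the project's share that its current jobs leave unused.

    Assumes each job uses exactly one instance (an assumption of this sketch).
    """
    share_instances = ninstances * share_fraction
    return max(0.0, share_instances - njobs)


# 2 CPUs, 75% resource share, 1 CPU job: the share is 1.5 instances,
# so half an instance's worth of work is missing.
shortfall = instance_shortfall(2, 0.75, 1)
```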
svn path=/trunk/boinc/; revision=20036
and avg_ncpus for GPU apps.
App versions are now characterized by two parameters
(we assume that the app uses either the CPU or the GPU
at a given time, but not both):
- the fraction of FLOPs performed on the CPU
- when the app is using the GPU, the fraction of peak FLOPS
that it gets.
We then run the numbers to get the total FLOPS and avg_ncpus.
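One way to "run the numbers" from those two parameters is sketched below. The formula is an assumption made for illustration (time per FLOP is split between a CPU phase and a GPU phase), and the names `estimate_app_version()`, `cpu_frac` and `gpu_efficiency` are hypothetical; the scheduler's actual calculation may differ.

```python
def estimate_app_version(cpu_frac: float, gpu_efficiency: float,
                         cpu_flops: float, gpu_peak_flops: float):
    """Return (total_flops, avg_ncpus) for a GPU app version.

    cpu_frac:       fraction of the job's FLOPs performed on the CPU
    gpu_efficiency: fraction of peak GPU FLOPS the app gets while on the GPU
    Assumes the app uses either the CPU or the GPU at a given time, not both.
    """
    # Time spent in each phase per FLOP of total work.
    cpu_time = cpu_frac / cpu_flops
    gpu_time = (1.0 - cpu_frac) / (gpu_efficiency * gpu_peak_flops)
    total_time = cpu_time + gpu_time

    total_flops = 1.0 / total_time     # overall processing rate
    avg_ncpus = cpu_time / total_time  # fraction of wall time spent on the CPU
    return total_flops, avg_ncpus
```

Under this model, an app that does half its FLOPs on a 1 GFLOPS CPU and gets 10% of a 10 GFLOPS GPU spends equal time in both phases, giving about 1 GFLOPS overall and avg_ncpus of about 0.5.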
svn path=/trunk/boinc/; revision=19977
RAM to run job, but when we actually run the job
not enough GPU RAM is free, so the application fails.
This can cause a large number of jobs to fail.
Solution:
- app_plan() can specify the GPU RAM requirements of an app version.
This is passed to the client in a new field
<gpu_ram> of the <app_version> element.
- prior to starting or restarting a GPU app, the client
checks the amount of free RAM on the particular GPU.
If it's not enough for the app version,
the client doesn't start it,
and arranges for the scheduler to ignore it for 5 minutes
(by which point there might be more free GPU RAM).
Notes:
1) this change will have effect only when
both client and scheduler are updated.
2) the check is done in enforce_schedule(),
rather than schedule_cpus(),
because only at that point
have we assigned a specific GPU to the job.
3) there's another case to deal with:
a GPU app's malloc of GPU RAM fails in the middle of the job.
Currently the job fails.
I plan to add an API call boinc_temporary_exit(x) so
that the job can exit and potentially restart in x seconds.
(In principle this mechanism is sufficient for all cases,
but it could lead to a lot of starting/exiting,
so the current change is worthwhile).
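The client-side check described in the solution can be sketched as follows. This is a simplified model, not the client's code: `GpuJob`, `try_start()` and the `defer_until` field are hypothetical names.

```python
from dataclasses import dataclass

GPU_RAM_RETRY_DELAY = 300.0  # 5 minutes, in seconds


@dataclass
class GpuJob:
    gpu_ram_needed: float     # from <gpu_ram> in the <app_version> element
    defer_until: float = 0.0  # the job is ignored until this time


def try_start(job: GpuJob, free_gpu_ram: float, now: float) -> bool:
    """Decide whether to start (or restart) a GPU job on its assigned GPU."""
    if now < job.defer_until:
        return False  # still within the 5-minute back-off
    if free_gpu_ram < job.gpu_ram_needed:
        # Not enough free RAM on this GPU: skip the job for 5 minutes,
        # by which point more GPU RAM might be free.
        job.defer_until = now + GPU_RAM_RETRY_DELAY
        return False
    return True
```

The check takes the free RAM of one specific GPU, which matches note 2: it can only be done once a GPU has been assigned to the job.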
svn path=/trunk/boinc/; revision=19864
Make them both peak FLOPS,
according to the formula supplied by the manufacturer.
The impact on the client is minor:
- the startup message describing the GPU
- the weight of the resource type in computing long-term debt
On the server, I changed the example app_plan() function
to assume that app FLOPS is 20% of peak FLOPS
(that's about what it is for SETI@home).
svn path=/trunk/boinc/; revision=19310
for certain periods (e.g. when Remote Desktop is used on Win).
- add is_usable() member function to COPROC.
Currently this just calls the respective (CUDA or CAL)
initialization function.
We need to check whether this works and/or causes problems.
- in enforce_schedule(), check whether usability has changed
for each GPU type.
If we've gone from usable to unusable,
flag all jobs for that GPU as coproc_missing
(so they won't get run, and will quit if they're running).
If we've gone from unusable to usable, clear the flag.
This should deal with all cases except where
the client is started up with GPUs unusable.
- scheduler: more query optimizations for locality scheduling
(from Oliver Bock)
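The usability check in enforce_schedule() described above can be sketched as follows. This is a simplified model with hypothetical names (`Job`, `update_usability()`); only the `coproc_missing` flag comes from the description.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Job:
    gpu_type: str                 # e.g. "CUDA" or "CAL"
    coproc_missing: bool = False  # set while the job's GPU type is unusable


def update_usability(jobs: List[Job], was_usable: Dict[str, bool],
                     is_usable: Dict[str, bool]) -> None:
    """Flag or unflag jobs when a GPU type's usability changes."""
    for gpu_type, usable_now in is_usable.items():
        # A type seen for the first time counts as unchanged in this sketch.
        usable_before = was_usable.get(gpu_type, usable_now)
        if usable_before == usable_now:
            continue
        for job in jobs:
            if job.gpu_type == gpu_type:
                # usable -> unusable sets the flag; unusable -> usable clears it
                job.coproc_missing = not usable_now
    was_usable.update(is_usable)  # remember the state for the next pass
```

Because only transitions are acted on, a GPU that is already unusable when first observed produces no flag change, which mirrors the startup case noted above as not yet handled.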
svn path=/trunk/boinc/; revision=19301
to accept CPU, NVIDIA and ATI jobs.
These prefs are shown only where relevant:
e.g., only for processor types for which the project has app versions,
and if it has versions for only one type, no pref is shown.
These prefs affect both client and scheduler.
The client won't ask for work for a device blocked by prefs,
and the scheduler won't send it.
This replaces earlier optional project-specific prefs for
"no CPU jobs" and "no GPU jobs".
(However, these prefs continue to be honored on the server side).
- client: if NVIDIA driver is unknown, say that rather than 0
svn path=/trunk/boinc/; revision=19194
(amdcalrt.dll is old version w/ funky DLL names)
- client: make GPU enumeration warnings more consistent
(e.g., "NVIDIA" instead of "CUDA").
- scheduler: get rid of ati13 plan class.
Require 1.4+ driver for plan class ati.
svn path=/trunk/boinc/; revision=19158