RAM to run the job, but when we actually run the job
not enough GPU RAM is free, so the application fails.
This can cause a large number of jobs to fail.
Solution:
- app_plan() can specify the GPU RAM requirements of an app version.
This is passed to the client in a new field
<gpu_ram> of the <app_version> element.
- prior to starting or restarting a GPU app, the client
checks the amount of free RAM on the particular GPU.
If it's not enough for the app version,
the client doesn't start it,
and arranges for the scheduler to ignore it for 5 minutes
(by which point there might be more free GPU RAM).
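A minimal sketch of the client-side check (the names here are illustrative, not the actual BOINC identifiers):

    // Hedged sketch: before (re)starting a GPU job, compare the app
    // version's declared requirement (from <gpu_ram>) with the free RAM
    // on the GPU instance the job was assigned.
    struct GPU_JOB_SKETCH {
        double gpu_ram_required;   // from <gpu_ram> in <app_version>
        int device_num;            // GPU instance chosen in enforce_schedule()
    };

    // If this returns false, the client defers the job and asks the
    // scheduler to ignore it for ~300 seconds.
    bool can_start_gpu_job(const GPU_JOB_SKETCH& job, const double* free_gpu_ram) {
        return job.gpu_ram_required <= free_gpu_ram[job.device_num];
    }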
Notes:
1) this change will take effect only when
both client and scheduler are updated.
2) the check is done in enforce_schedule(),
rather than schedule_cpus(),
because only at that point
have we assigned a specific GPU to the job.
3) there's another case to deal with:
a GPU app's malloc of GPU RAM fails in the middle of the job.
Currently the job fails.
I plan to add an API call boinc_temporary_exit(x) so
that the job can exit and potentially restart in x seconds.
(In principle this mechanism is sufficient for all cases,
but it could lead to a lot of starting/exiting,
so the current change is worthwhile).
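A rough sketch of how an app might use the planned call once it exists (the header and exact signature are assumptions at this point):

    #include <cuda_runtime.h>   // cudaMalloc, cudaSuccess
    #include "boinc_api.h"      // assumed home of the planned boinc_temporary_exit()

    void* alloc_gpu_buffer(size_t nbytes) {
        void* p = NULL;
        if (cudaMalloc(&p, nbytes) != cudaSuccess) {
            // GPU RAM allocation failed mid-job: exit now and ask to be
            // restarted in 300 seconds instead of failing the job.
            boinc_temporary_exit(300);
        }
        return p;
    }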
svn path=/trunk/boinc/; revision=19864
<ignore_cuda_dev>n</ignore_cuda_dev>
<ignore_ati_dev>n</ignore_ati_dev>
to ignore (not use) specific NVIDIA or ATI GPUs.
You can ignore more than one.
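For example, assuming these flags go in the <options> section of cc_config.xml (the placement is my assumption), a host could skip NVIDIA devices 0 and 2 and ATI device 1:

    <cc_config>
        <options>
            <ignore_cuda_dev>0</ignore_cuda_dev>
            <ignore_cuda_dev>2</ignore_cuda_dev>
            <ignore_ati_dev>1</ignore_ati_dev>
        </options>
    </cc_config>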
svn path=/trunk/boinc/; revision=19566
Make them both peak FLOPS,
according to the formula supplied by the manufacturer.
The impact on the client is minor:
- the startup message describing the GPU
- the weight of the resource type in computing long-term debt
On the server, I changed the example app_plan() function
to assume that app FLOPS is 20% of peak FLOPS
(that's about what it is for SETI@home)
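As a sketch of what this means in practice (the per-multiprocessor constants are assumptions for older NVIDIA parts; the real formula is whatever the manufacturer specifies, presumably in lib/coproc):

    // Peak FLOPS = multiprocessors x cores per MP x FLOPs per clock x clock rate
    double nvidia_peak_flops(int mp_count, double clock_hz) {
        const double cores_per_mp    = 8;   // assumption: compute capability 1.x
        const double flops_per_clock = 2;   // assumption: one multiply-add per clock
        return mp_count * cores_per_mp * flops_per_clock * clock_hz;
    }

    // The example app_plan() then estimates app FLOPS as 20% of peak:
    //   double app_flops = 0.2 * nvidia_peak_flops(mp_count, clock_hz);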
svn path=/trunk/boinc/; revision=19310
for certain periods (e.g. when Remote Desktop is used on Windows).
- add is_usable() member function to COPROC.
Currently this just calls the respective (CUDA or CAL)
initialization function.
We need to check whether this works and/or causes problems.
- in enforce_schedule(), check whether usability has changed
for each GPU type.
If we've gone from usable to unusable,
flag all jobs for that GPU as coproc_missing
(so they won't get run, and will quit if they're running).
If we've gone from unusable to usable, clear the flag.
This should deal with all cases except where
the client is started up with GPUs unusable.
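A rough sketch of the per-GPU-type check (COPROC::is_usable() is the new call; the usable member and flag_coproc_missing() helper are illustrative assumptions):

    void check_gpu_usability(COPROC& cp) {
        bool now_usable = cp.is_usable();   // runs the CUDA or CAL init function
        if (cp.usable && !now_usable) {
            // usable -> unusable: mark this GPU type's jobs coproc_missing,
            // so they won't be run and will quit if running
            flag_coproc_missing(cp, true);
        } else if (!cp.usable && now_usable) {
            // unusable -> usable: clear the flag
            flag_coproc_missing(cp, false);
        }
        cp.usable = now_usable;
    }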
- scheduler: more query optimizations for locality scheduling
(from Oliver Bock)
svn path=/trunk/boinc/; revision=19301
start only enough jobs to fill CPUs per project,
not all the CPU jobs at once.
I'm not sure how much difference this makes,
but this is how it's supposed to work.
- client: if app_info.xml doesn't specify flops,
use an estimate that takes GPUs into account (see the sketch below).
- client: if it's been more than 2 weeks since time stats update,
don't decay on_frac at all.
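Regarding the flops-estimate item above, a hedged sketch of the idea (the 0.2 derating is borrowed from the 20%-of-peak figure mentioned under revision 19310 above and is only illustrative here):

    // Fallback estimate when app_info.xml has no <flops>: fold GPU peak
    // speed into the number instead of assuming a CPU-only speed.
    double estimate_flops(double avg_ncpus, double p_fpops,
                          double ncudas, double cuda_peak_flops) {
        double x = avg_ncpus * p_fpops;        // CPU contribution
        x += ncudas * cuda_peak_flops * 0.2;   // assumed GPU derating
        return x;
    }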
svn path=/trunk/boinc/; revision=19035
is running a graphics application.
Change the semantics of the "don't use GPU while computer in use" pref
to "don't use a GPU that's running a graphics app while
computer is in use".
This will increase GPU utilization on multi-GPU systems.
svn path=/trunk/boinc/; revision=18942
- different data structure for keeping track of coproc usage;
instead of COPROC having per-instance pointers to ACTIVE_TASK,
ACTIVE_TASK now has an array of device number indices
for each instance that it's using.
- in enforce_schedule(), we call a new function assign_coprocs()
that decides what coproc instances each job will use,
and prunes jobs for which we can't get an assignment.
This function embodies lots of subtlety.
- coproc_cmdline() no longer deals with reserving instances;
it just has to generate the --device X cmdline
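A hedged sketch of the shape of this change (struct and function bodies are illustrative; the real code is assign_coprocs() in enforce_schedule() and coproc_cmdline()):

    #include <string>
    #include <vector>

    // ACTIVE_TASK now records, per coproc instance it uses, the device
    // number assigned by assign_coprocs().
    struct ACTIVE_TASK_SKETCH {
        std::vector<int> coproc_indices;   // e.g. {1} = "use GPU device 1"
    };

    // coproc_cmdline() no longer reserves instances; it just emits the flag(s).
    std::string coproc_cmdline(const ACTIVE_TASK_SKETCH& at) {
        std::string s;
        for (int d : at.coproc_indices) s += " --device " + std::to_string(d);
        return s;
    }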
svn path=/trunk/boinc/; revision=18880
e.g. the Milkyway@home ATI app, of which we can typically run
2 or 3 instances at once on a GPU.
Changes include:
- In APP_VERSION, don't use a COPROCS to represent the GPU
requirements; just use doubles ncudas and natis.
- sufficient_coprocs() etc. are no longer members of COPROCS
- in HOST_USAGE, ncudas and natis are doubles
- in scheduler request, req_instances is now a double
This checkin doesn't include the job scheduling logic,
i.e. assigning jobs to GPUs. That will follow.
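For illustration (the values are made up), fractional usage for a Milkyway-style ATI app would look like:

    // With ncudas/natis as doubles, an app version can claim a fraction of
    // a GPU; 0.5 means two such jobs can share one GPU at a time.
    struct HOST_USAGE_SKETCH {
        double avg_ncpus;
        double ncudas;
        double natis;
    };

    HOST_USAGE_SKETCH milkyway_ati_usage() {
        HOST_USAGE_SKETCH hu;
        hu.avg_ncpus = 0.05;   // illustrative CPU fraction
        hu.ncudas    = 0;
        hu.natis     = 0.5;    // half an ATI GPU per job
        return hu;
    }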
svn path=/trunk/boinc/; revision=18868
old: find the fastest GPU, and pretend that the others are the same.
Problem: other GPUs might be less capable,
and not able to handle jobs sent by the server.
new: find the most "capable" GPU, use others that are equivalent,
don't use those that are not.
"Capable" is defined by
- compute capability (i.e., hardware version)
- driver version
- memory size
- FLOPs
in that priority order.
See comments in lib/coproc.h
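A sketch of that ordering (struct and field names are illustrative; the authoritative version is the comparison in lib/coproc.h):

    struct GPU_SKETCH {
        int    compute_capability;   // hardware version
        int    driver_version;
        double ram;                  // bytes
        double flops;
    };

    // True if a is strictly more capable than b, using the priority order above.
    bool more_capable(const GPU_SKETCH& a, const GPU_SKETCH& b) {
        if (a.compute_capability != b.compute_capability)
            return a.compute_capability > b.compute_capability;
        if (a.driver_version != b.driver_version)
            return a.driver_version > b.driver_version;
        if (a.ram != b.ram) return a.ram > b.ram;
        return a.flops > b.flops;
    }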
svn path=/trunk/boinc/; revision=17855
(say what kind of job and why we're scheduling it)
- client: log messages describing GPUs: one line per GPU; fixes #879
svn path=/trunk/boinc/; revision=17847
and passing them the corresponding --device N cmdline args.
This fixes a bug introduced in 17402 (Feb 26)
that broke the --device feature,
presumably causing problems on systems with multiple GPUs.
svn path=/trunk/boinc/; revision=17549
(app versions don't have a <coprocs> element around their coproc elements,
maybe an oversight, but let's stick with it).
Anyway, I think it's working now.
- lib: remove "owner" array from COPROC.
This was used in client to keep track of assignment of
coprocessors to tasks, but we got rid of the reserve/free scheme.
NOTE: this breaks the mechanism for passing --device N to apps;
I'll have to do this another way. Stay tuned.
svn path=/trunk/boinc/; revision=17543
There are two mechanisms to prevent the scheduler from
sending jobs that won't finish by their deadline.
Simple mechanism:
The client sends the interval x for which CPUs are projected
to be saturated.
Given a job with estimated duration y,
the scheduler doesn't send it if x + y exceeds the delay bound.
If it does send it, x is incremented by y.
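In code terms, the Simple mechanism's send decision amounts to (illustrative names):

    // x: seconds until the resource is projected to become unsaturated
    // y: estimated duration of the candidate job
    // Returns true if the job can still finish within its delay bound;
    // after a send, the caller does x += y.
    bool simple_deadline_check(double x, double y, double delay_bound) {
        return x + y <= delay_bound;
    }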
Complex mechanism:
Client sends workload description.
Scheduler does EDF simulation, sees if deadlines are missed.
The only project using this AFAIK is BOINC alpha test.
Neither of these mechanisms takes coprocessors into account,
and as a result jobs could be sent that are doomed to
miss their deadline.
This checkin adds coprocessor awareness to the Simple mechanism.
Changes:
Client:
compute estimated delay (i.e. time until non-saturation)
for coprocessors as well as CPU.
Send them in scheduler request as part of coproc descriptor.
Scheduler:
Keep track of estimated delays separately for different resources
- client: fixed bug that computed CPU estimated delay incorrectly
- client: the work request (req_secs) for a resource is the min
of the project's share and the shortfall.
svn path=/trunk/boinc/; revision=17086
put a textual summary of them in host.serialnum (currently unused)
- web: show coprocs on host detail page
- db_dump: include coproc info in host XML
svn path=/trunk/boinc/; revision=16697
a modified boinc version.
- Added new header "boinc_fcgi.h" to be used instead of "fcgi_stdio.h".
This header defines I/O functions in the namespace FCGI rather than using
redefined functions the way "fcgi_stdio.h" does. This was causing a lot
of headaches when both <cstdio> and "fcgi_stdio.h" were included. Using
overloaded functions fixes this problem, except when the only difference
between functions is the return type (for example ::fopen() returns FILE*
and FCGI::fopen() returns FCGI_FILE*). See the usage sketch below.
- Fixed some missing "#ifdef _WIN32" blocks in filesys.C
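A hedged usage sketch of the new header (file names are made up; this assumes the usual stdio calls are wrapped in the FCGI namespace, per the description above):

    #include <cstdio>
    #include "boinc_fcgi.h"

    void example() {
        // FCGI I/O lives in a namespace instead of macro-redefining stdio
        // names, so <cstdio> and FCGI can coexist in one translation unit.
        FCGI_FILE* f = FCGI::fopen("state.xml", "r");
        FILE*      g = ::fopen("log.txt", "a");
        if (f) FCGI::fclose(f);
        if (g) ::fclose(g);
    }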
svn path=/trunk/boinc/; revision=15984
should be 2.0. This avoids crashes related to data structure
changes in the Runtime.
coprocs/CUDA/mswin/Win32/Debug/bin/
cudart.dll
coprocs/CUDA/mswin/Win32/Release/bin/
cudart.dll
coprocs/CUDA/mswin/Win32/ReleaseSigned/bin/
cudart.dll
coprocs/CUDA/mswin/x64/Debug/bin/
cudart.dll
coprocs/CUDA/mswin/x64/Release/bin/
cudart.dll
coprocs/CUDA/mswin/x64/ReleaseSigned/bin/
cudart.dll
lib/
coproc.C, .h
svn path=/trunk/boinc/; revision=15925
Old: when checking whether an app can be run,
check for sufficient coprocessors relative to
the current coprocessor usage.
Bug: if there are 2 CUDA jobs,
the scheduler will decide to run both.
enforce_schedule() will only be able to run one,
and the other CPU will be idle.
New: include coprocessor usage (along with RAM and CPUs)
in the check, and do a simulated reservation.
In the above scenario, the scheduler will select
one CUDA app and one non-CUDA app.
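A rough sketch of the simulated reservation (structure and names are illustrative):

    #include <vector>

    struct JOB_SKETCH {
        double ncpus;
        double ram;
        double ncudas;   // coproc instances needed
    };

    // While choosing jobs to run, tentatively reserve CPUs, RAM, and coproc
    // instances; skip any job whose coprocs are already spoken for.
    std::vector<int> pick_runnable(const std::vector<JOB_SKETCH>& jobs,
                                   double cpus_free, double ram_free,
                                   double cudas_free) {
        std::vector<int> chosen;
        for (int i = 0; i < (int)jobs.size(); i++) {
            const JOB_SKETCH& j = jobs[i];
            if (j.ncpus > cpus_free || j.ram > ram_free || j.ncudas > cudas_free) {
                continue;   // e.g. a second CUDA job when only one GPU is free
            }
            cpus_free  -= j.ncpus;
            ram_free   -= j.ram;
            cudas_free -= j.ncudas;
            chosen.push_back(i);
        }
        return chosen;
    }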
svn path=/trunk/boinc/; revision=15904
- scheduler: fix bug in adaptive replication:
if we send an unreplicated job to an untrusted host,
set both wu.target_nresults and wu.min_quorum to app.target_nresults.
svn path=/trunk/boinc/; revision=15762