old: find fastest GPU, and pretend that others are the same.
Problem: other GPUs might be less capable,
and not able to handle jobs sent by server.
new: find the most "capable" GPU, use others that are equivalent,
don't use those that are not.
"Capable" is defined by
- compute capability (i.e., hardware version)
- driver version
- memory size
- FLOPs
in that priority order.
See comments in lib/coproc.h
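
A minimal sketch of the priority-ordered comparison described above. The struct, its field names, and the rule that "equivalent" means equal on every field are assumptions made for illustration; the real definitions are in lib/coproc.h and may differ (for example, by allowing a tolerance on memory size or FLOPs).

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical GPU descriptor; field names are illustrative only.
struct GpuInfo {
    int cc_major;        // compute capability (hardware version), major part
    int cc_minor;        // compute capability, minor part
    int driver_version;
    double mem_bytes;    // memory size
    double flops;        // peak FLOPs
};

typedef std::tuple<int, int, int, double, double> CapKey;

// Fields appear in priority order; tuples compare lexicographically,
// so a later field only matters when all earlier fields are equal.
static CapKey capability_key(const GpuInfo& g) {
    return std::make_tuple(
        g.cc_major, g.cc_minor, g.driver_version, g.mem_bytes, g.flops
    );
}

// Return the indices of the GPUs to use: the most capable one plus
// any others that compare equal to it. Less capable GPUs are skipped.
std::vector<int> usable_gpus(const std::vector<GpuInfo>& gpus) {
    std::vector<int> out;
    if (gpus.empty()) return out;
    CapKey best = capability_key(gpus[0]);
    for (std::size_t i = 1; i < gpus.size(); i++) {
        CapKey k = capability_key(gpus[i]);
        if (k > best) best = k;
    }
    for (std::size_t i = 0; i < gpus.size(); i++) {
        if (capability_key(gpus[i]) == best) out.push_back((int)i);
    }
    return out;
}
```

Note that in this sketch a GPU with more memory and more FLOPs still loses to one with a higher compute capability, because compute capability is compared first.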
svn path=/trunk/boinc/; revision=17855
(say what kind of job and why we're scheduling it)
- client: log messages describing GPUs: one line per GPU; fixes #879
svn path=/trunk/boinc/; revision=17847
and passing them the corresponding --device N cmdline args.
This fixes a bug introduced in 17402 (Feb 26)
that broke the --device feature,
presumably causing problems on systems with multiple GPUs.
svn path=/trunk/boinc/; revision=17549
(app versions don't have a <coprocs> around coproc elements,
maybe an oversight, but let's stick with it).
Anyway, I think it's working now.
- lib: remove "owner" array from COPROC.
This was used in client to keep track of assignment of
coprocessors to tasks, but we got rid of the reserve/free scheme.
NOTE: this breaks the mechanism for passing --device N to apps;
I'll have to do this another way. Stay tuned.
svn path=/trunk/boinc/; revision=17543
There are two mechanisms to prevent the scheduler from
sending jobs that won't finish by their deadline.
Simple mechanism:
The client sends the interval x for which CPUs are projected
to be saturated.
Given a job with estimated duration y,
the scheduler doesn't send it if x + y exceeds the delay bound.
If it does send it, x is incremented by y.
Complex mechanism:
Client sends workload description.
Scheduler does EDF simulation, sees if deadlines are missed.
The only project using this AFAIK is BOINC alpha test.
Neither of these mechanisms takes coprocessors into account,
and as a result jobs could be sent that are doomed to
miss their deadline.
This checkin adds coprocessor awareness to the Simple mechanism.
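
A rough sketch of the Simple mechanism's check, applied to one resource at a time. The struct and function names are hypothetical and are not the scheduler's own; the idea is that the same test runs against a separate delay for the CPU and for each coprocessor type.

```cpp
// Hypothetical per-resource state; one instance for the CPU and one
// per coprocessor type.
struct ResourceDelay {
    double estimated_delay;  // x: seconds for which the resource is saturated
};

// Decide whether to send a job with estimated duration y (seconds).
// The job is refused if x + y exceeds the delay bound; otherwise it
// is sent and x is incremented by y.
bool try_send(ResourceDelay& r, double est_duration, double delay_bound) {
    if (r.estimated_delay + est_duration > delay_bound) {
        return false;  // doomed to miss its deadline; x is left unchanged
    }
    r.estimated_delay += est_duration;
    return true;
}
```

With x = 1000 for the CPU and x = 0 for a GPU, a 500-second job with a delay bound of 1200 is refused for the CPU but accepted for the GPU, which is the distinction a single CPU-only delay cannot make.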
Changes:
Client:
compute estimated delay (i.e. time until non-saturation)
for coprocessors as well as CPU.
Send them in scheduler request as part of coproc descriptor.
Scheduler:
Keep track of estimated delays separately for different resources
- client: fixed bug that computed CPU estimated delay incorrectly
- client: the work request (req_secs) for a resource is the min
of the project's share and the shortfall.
svn path=/trunk/boinc/; revision=17086
put a textual summary of them in host.serialnum (currently unused)
- web: show coprocs on host detail page
- db_dump: include coproc info in host XML
svn path=/trunk/boinc/; revision=16697
a modified boinc version.
- Added new header "boinc_fcgi.h" to be used instead of "fcgi_stdio.h".
This header defines I/O functions in the namespace FCGI rather than using
redefined functions the way "fcgi_stdio.h" does. This was causing a lot
of headaches when both <cstdio> and "fcgi_stdio.h" were included. Using
overloaded functions fixes this problem, except when the only difference
between functions is the return type (for example ::fopen() returns FILE*
and FCGI::fopen() returns FCGI_FILE*).
- Fixed some missing "#ifdef _WIN32" blocks in filesys.C
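
To illustrate the namespace approach noted above for "boinc_fcgi.h": the sketch below is not the contents of that header. DEMO and DEMO_FILE are made-up stand-ins for the FCGI namespace and FCGI_FILE, and the wrappers simply forward to <cstdio>. It shows why fclose() and fputs() can be plain overloads while fopen() cannot.

```cpp
#include <cstdio>

// Hypothetical stand-in for the FastCGI stream type.
struct DEMO_FILE {
    std::FILE* f;
};

namespace DEMO {
    // Same name and same arguments as ::fopen(), different return type.
    // C++ cannot overload on return type alone, so this function can
    // only coexist with ::fopen() because it lives in its own namespace,
    // and callers must pick one explicitly (DEMO::fopen or ::fopen).
    inline DEMO_FILE* fopen(const char* path, const char* mode) {
        std::FILE* f = std::fopen(path, mode);
        if (!f) return nullptr;
        return new DEMO_FILE{f};
    }

    // The argument type differs (DEMO_FILE* vs FILE*), so this is an
    // ordinary overload and the compiler can choose it from the argument.
    inline int fclose(DEMO_FILE* s) {
        int r = std::fclose(s->f);
        delete s;
        return r;
    }

    inline int fputs(const char* str, DEMO_FILE* s) {
        return std::fputs(str, s->f);
    }
}
```

Unlike macro redefinition, nothing here rewrites the names that <cstdio> declares, so both sets of functions can be visible in the same translation unit.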
svn path=/trunk/boinc/; revision=15984
should be 2.0. This avoids crashes related to data structure
changes in the Runtime.
coprocs/CUDA/mswin/Win32/Debug/bin/
cudart.dll
coprocs/CUDA/mswin/Win32/Release/bin/
cudart.dll
coprocs/CUDA/mswin/Win32/ReleaseSigned/bin/
cudart.dll
coprocs/CUDA/mswin/x64/Debug/bin/
cudart.dll
coprocs/CUDA/mswin/x64/Release/bin/
cudart.dll
coprocs/CUDA/mswin/x64/ReleaseSigned/bin/
cudart.dll
lib/
coproc.C, .h
svn path=/trunk/boinc/; revision=15925
Old: when checking whether an app can be run,
check for sufficient coprocessors relative to
the current coprocessor usage.
Bug: if there are 2 CUDA jobs,
the scheduler will decide to run both.
enforce_scheduler() will only be able to run one,
and the other CPU will be idle.
New: include coprocessor usage (along with RAM and CPUs)
in the check, and do a simulated reservation.
In the above scenario, the scheduler will select
one CUDA app and one non-CUDA app.
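
A minimal sketch of a simulated reservation, under stated assumptions: the Job and Host types are hypothetical, jobs are assumed to arrive already sorted by priority, and the RAM check is left out for brevity. This is not the client's actual scheduling code.

```cpp
#include <vector>

// Hypothetical job description; names are illustrative only.
struct Job {
    double ncpus;    // CPUs the job needs
    int ncudas;      // CUDA devices the job needs
    bool selected;   // set by select_jobs()
};

// Hypothetical host description.
struct Host {
    double ncpus;
    int ncudas;
};

// Walk the jobs in order, reserving CPUs and CUDA devices as we go.
// A job is selected only if everything it needs is still unreserved,
// so two jobs can never both be promised the same device.
// Returns the number of jobs selected.
int select_jobs(std::vector<Job>& jobs, const Host& host) {
    double cpus_left = host.ncpus;
    int cudas_left = host.ncudas;
    int nselected = 0;
    for (auto& j : jobs) {
        j.selected = false;
        if (j.ncpus > cpus_left || j.ncudas > cudas_left) continue;
        cpus_left -= j.ncpus;
        cudas_left -= j.ncudas;
        j.selected = true;
        nselected++;
    }
    return nselected;
}
```

Assuming a host with 2 CPUs and 1 CUDA device, and a job list of two CUDA jobs followed by one CPU-only job, this selects the first CUDA job and the CPU-only job, so neither CPU is left idle.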
svn path=/trunk/boinc/; revision=15904
- scheduler: fix bug in adaptive replication:
if we send an unreplicated job to an untrusted host,
set both wu.target_nresults and wu.min_quorum to app.target_nresults.
svn path=/trunk/boinc/; revision=15762
- client: better messages reporting coprocessors
- manager: bounds checks to avoid wxWidgets asserts
when job CPU estimates are absurdly large
svn path=/trunk/boinc/; revision=15644
libcudart{32,64}.so is bundled with client.
client loads it and if successful calls the device-query functions.
- client, Linux: append the current directory
(i.e., the BOINC data directory) to the LD_LIBRARY_PATH for apps.
This goes after the project dir and the slot dir.
This lets apps link to libcudartX.so.
NOTE: this is not recommended; better to include it with your app.
- client: allow for multiple messages from coproc probing
- fixed indentation in cs_platforms.C
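
For the LD_LIBRARY_PATH item above, the search order can be sketched as follows. The function and parameter names are hypothetical, and placing any pre-existing value last is an assumption; the check-in note only states that the data directory comes after the project dir and the slot dir.

```cpp
#include <string>

// Build the library search path for an app: project dir first, then
// slot dir, then the BOINC data directory. Appending the existing
// value at the end is an assumption of this sketch.
std::string app_library_path(
    const std::string& project_dir,
    const std::string& slot_dir,
    const std::string& data_dir,
    const std::string& existing
) {
    std::string p = project_dir + ":" + slot_dir + ":" + data_dir;
    if (!existing.empty()) {
        p += ":" + existing;
    }
    return p;
}
```

Because the dynamic linker searches the entries left to right, a library shipped in the project or slot directory takes precedence over the copy in the data directory.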
svn path=/trunk/boinc/; revision=15591
to avoid confusion with "name" field of CUDA.
This is a bug fix - please port.
- start script: don't error out if run_state.xml file is empty
(which happens if project runs out of disk space)
svn path=/trunk/boinc/; revision=15168
in <app_version>s from the server,
keep track of the number of free coprocs of each type,
and don't run an app that needs more than are available.
(not quite working yet)
svn path=/trunk/boinc/; revision=14992
and change the corresponding structure field from 64KB to 256KB
(could increase this if needed).
This is needed to handle app versions with lots (> 100) of files
- change LARGE_BLOB_SIZE to BLOB_SIZE in a bunch of places
- Change COPROCS from vector<COPROC> to vector<COPROC*>.
Otherwise the right virtual functions of COPROCs don't get called
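
The virtual-function problem with vector<COPROC> is object slicing, which the sketch below reproduces. DemoCoproc and DemoCuda are hypothetical stand-ins, not the classes in lib/coproc.h.

```cpp
#include <string>
#include <vector>

// Hypothetical base class standing in for COPROC.
struct DemoCoproc {
    virtual ~DemoCoproc() {}
    virtual std::string describe() const { return "generic coproc"; }
};

// Hypothetical derived class standing in for a CUDA-specific subclass.
struct DemoCuda : DemoCoproc {
    std::string describe() const override { return "CUDA device"; }
};

// Storing by value copies only the base-class part of the object
// ("slicing"), so the element is a plain DemoCoproc and the
// base-class function runs.
std::string describe_by_value() {
    DemoCuda c;
    std::vector<DemoCoproc> v;
    v.push_back(c);
    return v[0].describe();
}

// Storing pointers keeps the object's dynamic type, so the call
// dispatches to the derived-class override.
std::string describe_by_pointer() {
    DemoCuda c;
    std::vector<DemoCoproc*> v;
    v.push_back(&c);
    return v[0]->describe();
}
```

Here describe_by_value() returns "generic coproc" while describe_by_pointer() returns "CUDA device". The pointer version leaves ownership and lifetime of the pointed-to objects to the caller.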
svn path=/trunk/boinc/; revision=14986