where the client tells the scheduler which app versions
its queued jobs use
(this is needed, e.g., to enforce per-app or per-resource job limits).
In this mechanism, the client sends an array of <app_version>s,
and each <other_result> includes an index into this array.
- The wrong index was being sent (client).
- If an <app_version> had a non-existent app name
(e.g. because that app had been deprecated)
it wasn't getting put in the array, invalidating array indices
Furthermore, an erroneous message was being sent to the user
Fix: if parse error for <app_version>,
put it in the array anyway, but with cav.app = NULL,
meaning that it's a place-holder.
Send a message to user only if anon platform.
- manager: increase notice buffers to 64K
svn path=/trunk/boinc/; revision=22052
for messages intended as notices.
This will avoid showing lots of obscure stuff as notices
for projects with old server code.
svn path=/trunk/boinc/; revision=21836
- daily quota mechanism
- reliable mechanism (accelerated retries)
- "trusted" mechanism (adaptive replication)
- scheduler: enforce host scale probation only for apps with
host_scale_check set.
- validator: do scale probation on invalid results
(need this in addition to error and timeout cases)
- feeder: update app version scales every 10 min, not 10 sec
- back-end apps: support --foo as well as -foo for options
Notes:
- If you have, say, cuda, cuda23 and cuda_fermi plan classes,
a host will have separate quotas for each one.
That means it could error out on 100 jobs for cuda_fermi,
and when its quota goes to zero,
error out on 100 jobs for cuda23, etc.
This is intentional; there may be cases where one version
works but not the others.
- host.error_rate and host.max_results_day are deprecated
TODO:
- the values in the app table for limits on jobs in progress etc.
should override rather than config.xml.
Implementation notes:
scheduler:
process_request():
read all host_app_versions for host at start;
Compute "reliable" and "trusted" for each one.
write modified records at end
get_app_version():
add "reliable_only" arg; if set, use only reliable versions
skip over-quota versions
Multi-pass scheduling: if have at least one reliable version,
do a pass for jobs that need reliable,
and use only reliable versions.
Then clear best_app_versions cache.
Score-based scheduling: for need-reliable jobs,
it will pick the fastest version,
then give a score bonus if that version happens to be reliable.
When get back a successful result from client:
increase daily quota
When get back an error result from client:
impose scale probation
decrease daily quota if not aborted
Validator:
when handling a WU, create a vector of HOST_APP_VERSION
parallel to vector of RESULT.
Pass it to assign_credit_set().
Make copies of originals so we can update only modified ones
update HOST_APP_VERSION error rates
Transitioner:
decrease quota on timeout
svn path=/trunk/boinc/; revision=21181
Triggering the work generator is now done via the DB
instead of flat files.
Since only E@h uses locality scheduling,
I kept the DB changes in a separate file (db/schema_locality.sql).
There's a new field in the workunit table,
and that's a required update (in db_update.php)
- manager: compile fix
svn path=/trunk/boinc/; revision=20807
- remove obsolete and buggy code from transitioner (create_result() in backend_lib)
- account for 'mixed' scheduling in explain_to_user() in sched_send.cpp
- finish transition to configurable patterns for distinguishing files reported by the client
in the Einstein@home-specific part of send_work_locality in sched_locality
(removed previous hardcoded strcmps)
svn path=/trunk/boinc/; revision=20074
for certain periods (e.g. when Remote Desktop is used on Win).
- add is_usable() member function to COPROC.
Currently this just calls the respective (CUDA or CAL)
initialization function.
We need to check whether this works and/or causes problems.
- in enforce_schedule(), check whether usability has changed
for each GPU type.
If we've gone from usable to unusable,
flag all jobs for that GPU as coproc_missing
(so they won't get run, and will quit if they're running).
If we've gone from unusable to usable, clear the flag.
This should deal with all cases except where
the client is started up with GPUs unusable.
- scheduler: more query optimizations for locality scheduling
(from Oliver Bock)
svn path=/trunk/boinc/; revision=19301
Old:
1) check deadline based on wu.delay_bound
2) in add_result_to_reply(), potentially modify wu.delay_bound,
e.g. because of retry acceleration
problem: reducing delay bound may cause deadline miss
New:
1) new function get_delay_bound_range()
(called from wu_is_infeasible_fast())
returns optimistic and pessimistic delay bounds.
Retry acceleration logic is here.
2) check deadline based on optimistic bound;
if that fails, check based on pessimistic bound.
Set wu.delay_bound to the one that worked.
Notes:
- get_delay_bound_range() needs result priority and report deadline,
and it's called before we read the full result.
So add these items to WORK_ITEM and WU_RESULT.
- get_delay_bound_range() could be customized for
project-specific deadline policy.
- add_result_to_reply() was becoming a toxic waste dump.
Deadline-related stuff should have been factored out in any case.
svn path=/trunk/boinc/; revision=18946
- client (Unix): if client crashes while benchmark processes are going,
make sure they detect this and exit.
- back-end programs: remove hardwired assumptions about
what directory they run in, and hence where config.xml is.
E.g., daemons look for it in "..", others expect it in current dir.
New approach: all the programs look for the project dir as follows:
1) the environment var BOINC_PROJECT_DIR, if defined
2) the current dir, if config.xml is there.
3) else ".."
This means you can run programs in either proj/bin/ or proj/,
or (using BOINC_PROJECT_DIR) you can keep executables
outside of the project dir.
svn path=/trunk/boinc/; revision=18042
which of those files to include
- Modified MAC address check to work on some non-Linux unixes.
(mac_address.cpp)
- Added suggested change to "already attached to project" checking.
(ProjectInfoPage.cpp)
- changed includes of standard c header files to their c++ equivalents
(i.e. replaced <stdio.h> with <cstdio>) for namespace protection.
- replaced "using namespace std;" with more explicit "using std::function" in
several files.
- Fixed bug in checking whether the os is OS/2 and added conditional OS_OS2
to the build environment. (boinc_platform.m4,configure.ac)
- Changed build environment to not use -nostandardlibs unless we are using
G++ and static linkage is specified. (configure.ac)
- Added makefiles and package building files for solaris CSW package manager.
- Fixed bug with attempting to find login name using logname. (configure.ac)
- Added ifdef HAVE_* protection around some include files commonly found in
sys.
- Added support for unified binary for x86_64/i686-pc-solaris.
(cs_platforms.cpp)
- generate_host_cpid() now uses MAC address on non-linux unix.
(hostinfo_network.cpp)
- Macro BOINC_SET_COMPILE_FLAGS now doesn't check gcc only flags on non-gcc
compilers. (boinc_set_compile_flags.m4)
- Library compiles no longer depend upon the library extension or require
the library to be prefixed with lib.
- More fixes for fcgi builds.
- Added declaration of "struct ether_addr" and ether_ntoa(). Have not yet
implemented ether_ntoa() for machines that don't have it, or where it is
buggy. (unix_util.h)
- Added FCGI::perror() which calls FCGI_perror(). (boinc_fcgi.{h,cpp})
- Fixed library Makefiles so that all required headers get installed.
svn path=/trunk/boinc/; revision=17388
- web: check whether to show profile in separate function
from displaying profile; eliminate double headers
- scheduler: finish purge of redundant arguments
svn path=/trunk/boinc/; revision=16726
for the selected APP_VERSION, rather than on the CPU benchmarks.
Otherwise estimates are wrong for GPU or multi-thread apps.
- scheduler: start switching from having SCHED_REQUEST and
SCHED_REPLY as globals instead of passing them around as args;
to be continued.
svn path=/trunk/boinc/; revision=16691
If set the "effective NCPUS" (which is used to scale
daily_result_quota and max_wus_in_progress)
is max'd with the # of CUDA GPUs.
svn path=/trunk/boinc/; revision=16246