Old: various redundant and/or misleading messages were sent.
New:
- if host w/ no GPU contacts a GPU-only project,
send high-pri message saying they need a GPU
- if host w/ GPU has driver too old for all versions,
send high-pri message saying to update driver
- if host w/ GPU has driver too old for some versions,
send low-pri message saying to update driver
- if host has GPU but too little RAM for any app,
send low-pri message saying so
- scheduler: revamp GPU plan class functions
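A rough sketch (not the actual scheduler code) of the messaging rules listed above;
GpuStatus, VersionReq and choose_gpu_message() are made-up names for illustration:

    #include <cstdio>

    // Hypothetical host GPU state and per-app-version requirements.
    struct GpuStatus {
        bool has_gpu;
        int driver_version;
        double gpu_ram_mb;
    };
    struct VersionReq {
        int min_driver_version;
        double min_gpu_ram_mb;
    };

    // "HIGH" = host can't get any work; "LOW" = host can get some work
    // but is missing out on certain versions.
    void choose_gpu_message(const GpuStatus& g, const VersionReq* v, int n) {
        if (!g.has_gpu) {
            printf("HIGH: this project requires a GPU\n");
            return;
        }
        int driver_ok = 0, ram_ok = 0;
        for (int i = 0; i < n; i++) {
            if (g.driver_version >= v[i].min_driver_version) driver_ok++;
            if (g.gpu_ram_mb >= v[i].min_gpu_ram_mb) ram_ok++;
        }
        if (driver_ok == 0) {
            printf("HIGH: GPU driver too old for any app version; please upgrade\n");
        } else if (driver_ok < n) {
            printf("LOW: GPU driver too old for some app versions; upgrading may help\n");
        }
        if (ram_ok == 0) {
            printf("LOW: GPU has too little RAM for any app version\n");
        }
    }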
svn path=/trunk/boinc/; revision=21760
- daily quota mechanism
- reliable mechanism (accelerated retries)
- "trusted" mechanism (adaptive replication)
- scheduler: enforce host scale probation only for apps with
host_scale_check set.
- validator: do scale probation on invalid results
(need this in addition to error and timeout cases)
- feeder: update app version scales every 10 min, not 10 sec
- back-end apps: support --foo as well as -foo for options
Notes:
- If you have, say, cuda, cuda23 and cuda_fermi plan classes,
a host will have separate quotas for each one.
That means it could error out on 100 jobs for cuda_fermi,
and when its quota goes to zero,
error out on 100 jobs for cuda23, etc.
This is intentional; there may be cases where one version
works but not the others.
- host.error_rate and host.max_results_day are deprecated
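Illustrative sketch of the per-(host, app version) bookkeeping that replaces
those host-level fields; the field names here are assumptions, not the actual
host_app_version table layout:

    #include <map>
    #include <string>

    struct HostAppVersionQuota {
        int max_jobs_per_day = 100;   // daily quota for this (host, app version)
        int n_jobs_today = 0;
        double error_rate = 0.1;      // feeds the "reliable"/"trusted" checks
    };

    // One record per plan class, so erroring out the cuda_fermi quota
    // leaves the cuda23 and cuda quotas untouched.
    std::map<std::string, HostAppVersionQuota> quota = {
        {"cuda", {}}, {"cuda23", {}}, {"cuda_fermi", {}},
    };

    bool over_quota(const std::string& plan_class) {
        HostAppVersionQuota& q = quota[plan_class];
        return q.n_jobs_today >= q.max_jobs_per_day;
    }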
TODO:
- the values in the app table for limits on jobs in progress etc.
should override those in config.xml, rather than the other way around.
Implementation notes:
scheduler:
process_request():
read all host_app_versions for the host at start;
compute "reliable" and "trusted" for each one;
write modified records at end
get_app_version():
add "reliable_only" arg; if set, use only reliable versions
skip over-quota versions
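A simplified sketch of that selection rule; the real get_app_version() works
on BOINC's own structures, and these names are illustrative:

    struct VersionInfo {
        bool reliable;      // computed per host_app_version at request start
        bool over_quota;    // daily quota exhausted for this version
        double flops;       // projected speed of this version on this host
    };

    // Return index of the best usable version, or -1 if none qualifies.
    int pick_version(const VersionInfo* v, int n, bool reliable_only) {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (reliable_only && !v[i].reliable) continue;   // "reliable_only" arg
            if (v[i].over_quota) continue;                   // skip over-quota versions
            if (best < 0 || v[i].flops > v[best].flops) best = i;
        }
        return best;
    }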
Multi-pass scheduling: if the host has at least one reliable version,
do a pass for jobs that need reliable,
and use only reliable versions.
Then clear best_app_versions cache.
Score-based scheduling: for need-reliable jobs,
it will pick the fastest version,
then give a score bonus if that version happens to be reliable.
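Sketch of that score-based variant; the bonus value is illustrative, not the
scheduler's actual weighting:

    // Pick the fastest version as usual, then bump the job's score
    // if that version happens to be reliable.
    double job_score(double base, bool job_needs_reliable, bool version_is_reliable) {
        double score = base;
        if (job_needs_reliable && version_is_reliable) {
            score += 1.0;   // arbitrary bonus
        }
        return score;
    }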
When we get back a successful result from the client:
increase daily quota
When we get back an error result from the client:
impose scale probation
decrease daily quota if not aborted
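Sketch of that daily-quota feedback; the exact grow/shrink policy here
(increment on success, halve on error) is an assumption for illustration:

    struct DailyQuota {
        int max_jobs_per_day = 100;

        void on_success() {                 // successful result: grow quota
            if (max_jobs_per_day < 200) max_jobs_per_day++;
        }
        void on_error(bool aborted) {       // error result: shrink quota
            if (aborted) return;            // aborted jobs don't count against the host
            max_jobs_per_day /= 2;
            if (max_jobs_per_day < 1) max_jobs_per_day = 1;
        }
    };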
Validator:
when handling a WU, create a vector of HOST_APP_VERSION
parallel to vector of RESULT.
Pass it to assign_credit_set().
Make copies of originals so we can update only modified ones
update HOST_APP_VERSION error rates
Transitioner:
decrease quota on timeout
svn path=/trunk/boinc/; revision=21181
- scheduler: when calculating scheduler runtime,
don't include the part that reads the request message from the client;
that can be misleadingly long.
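Sketch of the change, assuming a simple wall-clock helper; the real scheduler
has its own timing code:

    #include <sys/time.h>

    static double dtime() {
        struct timeval tv;
        gettimeofday(&tv, 0);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    void handle_request_timed() {
        // ... read and parse the request message from the client first ...
        double start = dtime();         // start the clock only after the read
        // ... scheduling work ...
        double elapsed = dtime() - start;
        (void)elapsed;                  // this is what gets logged as runtime
    }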
svn path=/trunk/boinc/; revision=20781
Old:
1) check deadline based on wu.delay_bound
2) in add_result_to_reply(), potentially modify wu.delay_bound,
e.g. because of retry acceleration
problem: reducing the delay bound may cause a deadline miss
New:
1) new function get_delay_bound_range()
(called from wu_is_infeasible_fast())
returns optimistic and pessimistic delay bounds.
Retry acceleration logic is here.
2) check deadline based on optimistic bound;
if that fails, check based on pessimistic bound.
Set wu.delay_bound to the one that worked.
Notes:
- get_delay_bound_range() needs result priority and report deadline,
and it's called before we read the full result.
So add these items to WORK_ITEM and WU_RESULT.
- get_delay_bound_range() could be customized for
project-specific deadline policy.
- add_result_to_reply() was becoming a toxic waste dump.
Deadline-related stuff should have been factored out in any case.
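Sketch of the two-bound check described above; the retry-acceleration factor
is illustrative:

    struct DelayBounds { double optimistic, pessimistic; };

    // Retry acceleration lives here: retries get a shorter optimistic bound.
    DelayBounds get_delay_bound_range(double wu_delay_bound, bool is_retry) {
        DelayBounds b;
        b.pessimistic = wu_delay_bound;
        b.optimistic = is_retry ? wu_delay_bound / 2 : wu_delay_bound;
        return b;
    }

    // Check the optimistic bound first; fall back to the pessimistic one,
    // and record whichever bound worked in wu.delay_bound.
    bool deadline_ok(double est_turnaround, DelayBounds b, double& wu_delay_bound) {
        if (est_turnaround <= b.optimistic)  { wu_delay_bound = b.optimistic;  return true; }
        if (est_turnaround <= b.pessimistic) { wu_delay_bound = b.pessimistic; return true; }
        return false;
    }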
svn path=/trunk/boinc/; revision=18946
The limit on jobs in progress is now
max_wus_in_progress * NCPUS
+ max_wus_in_progress_gpu * NGPUS
where NCPUS and NGPUS reflect prefs and are capped.
Furthermore: if the client reports plan class for in-progress jobs
(see checkin of 31 May 2009)
then these limits are enforced separately;
i.e. the # of in-progress CPU jobs is <= max_wus_in_progress*NCPUS,
and the # of in-progress GPU jobs is <= max_wus_in_progress_gpu*NGPUS
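Sketch of those limits; NCPUS/NGPUS are assumed to already reflect prefs and caps:

    struct InProgressLimits { int cpu_jobs, gpu_jobs; };

    InProgressLimits compute_limits(int max_wus_in_progress,
                                    int max_wus_in_progress_gpu,
                                    int ncpus, int ngpus) {
        InProgressLimits lim;
        lim.cpu_jobs = max_wus_in_progress * ncpus;
        lim.gpu_jobs = max_wus_in_progress_gpu * ngpus;
        // If the client doesn't report plan classes for in-progress jobs,
        // only the combined limit (cpu_jobs + gpu_jobs) can be enforced.
        return lim;
    }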
- scheduler config: rename <cuda_multiplier> to <gpu_multiplier>
- scheduler: <max_wus_to_send> is now scaled by
(NCPUS + gpu_multiplier*NGPUS)
- scheduler: don't keep scanning array if !work_needed()
- scheduler: moved array-scan logic from sched_send.cpp to sched_array.cpp
- scheduler: don't say "no work available" if jobs are available
but work_needed() is initially false
svn path=/trunk/boinc/; revision=18255
(app versions don't have a <coprocs> element around their coproc elements;
maybe an oversight, but let's stick with it).
Anyway, I think it's working now.
- lib: remove "owner" array from COPROC.
This was used in client to keep track of assignment of
coprocessors to tasks, but we got rid of the reserve/free scheme.
NOTE: this breaks the mechanism for passing --device N to apps;
I'll have to do this another way. Stay tuned.
svn path=/trunk/boinc/; revision=17543
- web: check whether to show a profile in a separate function
from the one that displays it; eliminate double headers
- scheduler: finish purge of redundant arguments
svn path=/trunk/boinc/; revision=16726
for the selected APP_VERSION, rather than on the CPU benchmarks.
Otherwise estimates are wrong for GPU or multi-thread apps.
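Sketch of the estimate; the names here are illustrative:

    // Base the runtime estimate on the projected FLOPS of the chosen
    // app version rather than on CPU benchmarks.
    double estimate_runtime_seconds(double wu_fpops_est, double app_version_flops) {
        return wu_fpops_est / app_version_flops;
    }
    // e.g. a 1e15-FLOP job on a version projected at 1e11 FLOPS: ~1e4 s;
    // a CPU-benchmark-based estimate would be far off for a GPU app.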
- scheduler: start switching to having SCHED_REQUEST and
SCHED_REPLY as globals instead of passing them around as args;
to be continued.
svn path=/trunk/boinc/; revision=16691
- scheduler: fix egregious bug where wu_is_infeasible_fast() result
is ignored, and we send jobs to hosts that can't handle them.
- scheduler: don't check for disk space in work_needed();
do it in check_disk(), which generates a message to user.
- scheduler: add -debug_log flag, which sends stderr to
"debug_log" rather than scheduler_log.txt (for debugging)
svn path=/trunk/boinc/; revision=16578
If set, the "effective NCPUS" (which is used to scale
daily_result_quota and max_wus_in_progress)
is max'd with the # of CUDA GPUs.
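Sketch of that rule, assuming the counts are already capped by prefs:

    #include <algorithm>

    int effective_ncpus(int ncpus, int ncuda_gpus, bool option_set) {
        // When the option is set, the count used to scale daily_result_quota
        // and max_wus_in_progress is the larger of the two.
        return option_set ? std::max(ncpus, ncuda_gpus) : ncpus;
    }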
svn path=/trunk/boinc/; revision=16246