I was doing the job-limit check on each call to get_app_version().
But it turns out this gets called during the job scan,
before calling add_result_to_reply(),
which is where n_jobs_today gets incremented.
Solution: move the check to the loop (in sched_score.cpp)
where add_result_to_reply() is called.
Also: fix bad logic in work_needed() where a global check
(on job limit in user project prefs)
was being done inside a loop over resources.
- add DB field for storing job keywords: workunit.keywords
add this to various DB parse/write functions
- add --keywords option to create_work for specifying job keywords
- add <keyword_sched> option in config.xml for enabling keyword score
(it's disabled by default).
If set, increment score for "yes" keyword matches,
and disallow jobs with "no" matches (see the sketch below).
- in scheduler, add array job_keywords_array for parsed versions
of job keywords (vector<int>)
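A rough sketch of what the keyword check might look like, assuming keywords
are integer IDs as in job_keywords_array; the helper name and the size of the
score increment are illustrative, not the actual scheduler code:

    #include <vector>

    // Illustrative only: bump the score for each "yes" keyword match,
    // and refuse the job outright on any "no" match.
    bool apply_keyword_score(
        const std::vector<int>& job_keywords,   // parsed keywords of the job
        const std::vector<int>& user_yes,       // keywords the user marked "yes"
        const std::vector<int>& user_no,        // keywords the user marked "no"
        double& score
    ) {
        for (int kw: job_keywords) {
            for (int nk: user_no) {
                if (kw == nk) return false;     // "no" match: don't send this job
            }
            for (int yk: user_yes) {
                if (kw == yk) score += 1;       // "yes" match: increase the score
            }
        }
        return true;
    }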
also:
- use symbols instead of numbers for slow_check() return values
- parse unused fields in req message to remove unparsed-XML warnings
The SETI@home result table is about to run out of 32-bit IDs,
so we need to move to 64-bit result IDs.
This will happen to the workunit table at some point too.
I changed the server C++ code to use the "long" type for all DB IDs
(and to use appropriate conversion codes like %lu).
"long" is 64 bit on 64-bit machines.
For uniformity I did this for all tables,
even ones (like app) that will never get big.
I chose NOT to change the DB schema for now.
The new code will work with 32-bit ID fields in the DB.
As projects approach the 32-bit limit on a table they can change
its ID field, and fields that reference this table, to BIGINT.
This is likely to happen only on the result and workunit tables.
I put functions in html/ops/db_update.php
to change the IDs of these tables.
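A minimal illustration of the ID-type change; DB_ID_TYPE and the query shape
are made up here, but the idea (use "long" and long conversion specifiers so
the same code works with either INT or BIGINT columns) is as described above:

    #include <cstddef>
    #include <cstdio>

    typedef long DB_ID_TYPE;    // 64 bits on 64-bit machines

    void make_lookup_query(char* buf, size_t len, DB_ID_TYPE result_id) {
        // a long conversion specifier matches the widened ID type
        snprintf(buf, len, "select * from result where id=%ld", result_id);
    }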
Treat miner ASICs as a distinct processor type;
send miner_asic jobs only if the client requests them.
Note: I was planning to do this in a more general way,
in which the scheduler wouldn't have a hard-wired list of processor types.
However, that would be a large code change,
so for now I just added miner_asic to the list of processor types
(nvidia, ati, intel_gpu),
and made various changes to get things to work.
Also: in the job dispatch logic, try to send coproc jobs
before CPU jobs.
That way if e.g. there's a limit on jobs in progress,
we'll preferentially send coproc jobs.
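A rough sketch of the ordering idea, with made-up enum values and loop shape
(the real scheduler's processor-type list and send logic differ):

    // Coprocessors first, CPU last, so that any shared in-progress limit
    // is consumed by coproc jobs preferentially.
    enum ProcType {
        PROC_NVIDIA, PROC_ATI, PROC_INTEL_GPU, PROC_MINER_ASIC, PROC_CPU
    };

    static const ProcType send_order[] = {
        PROC_NVIDIA, PROC_ATI, PROC_INTEL_GPU, PROC_MINER_ASIC,
        PROC_CPU    // CPU jobs are considered last
    };

    void send_work() {
        for (ProcType pt: send_order) {
            // (omitted) if the client requested this resource type,
            // scan jobs, pick app versions, add results to the reply
            (void)pt;
        }
    }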
The old scheduler worked as follows:
    scan jobs; for each job
        get_app_version
        do various checks
        add_job_to_reply
The check for per-app job limits was in get_app_version,
and the incrementing of per-app job count is in add_job_to_reply.
The new (score-based) scheduler works as follows:
    scan jobs; for each job
        get_app_version
        add to list
    sort list by score
    scan list; for each job
        do various checks
        add_job_to_reply
So the limit check (in get_app_version) was ineffective
because it happens before we've incremented counts.
Fix: do the limit check (also) in the "scan list" loop.
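A simplified sketch of the fix, with stand-in types and helpers (not the
real sched_score.cpp code); the point is that the limit check sits in the
same loop as add_job_to_reply(), which is what updates the per-app count:

    #include <algorithm>
    #include <vector>

    struct Job { int appid; double score; };      // stand-in for the real job struct

    static int jobs_sent_for_app[100];            // toy per-app counts
    static const int PER_APP_LIMIT = 10;          // assumed limit

    bool app_limit_exceeded(int appid) {
        return jobs_sent_for_app[appid] >= PER_APP_LIMIT;
    }
    bool other_checks_ok(const Job&) { return true; }
    void add_job_to_reply(const Job& j) { jobs_sent_for_app[j.appid]++; }

    void score_based_send(std::vector<Job>& jobs) {
        // phase 1 (scan jobs, get app versions, build the list) omitted
        std::sort(jobs.begin(), jobs.end(),
                  [](const Job& a, const Job& b) { return a.score > b.score; });
        // phase 2: scan the sorted list; the limit check belongs here,
        // since only add_job_to_reply() increments the count
        for (const Job& j: jobs) {
            if (app_limit_exceeded(j.appid)) continue;
            if (!other_checks_ok(j)) continue;
            add_job_to_reply(j);
        }
    }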
Bigger picture: we need to restructure app version selection;
job limit enforcement doesn't belong there any more.
This is the problem where the scheduler marks one job with its PID
to reserve it.
It needs to be able to send this job.
I fixed this recently in the old scheduler, but neglected to fix it in the score-based scheduler as well.
There are now 3 flags for job dispatch logging:
<debug_send/>: info about work request, jobs sent, other high-level stuff
<debug_send_scan/>: info about scans through job cache
<debug_send_job/>: info about individual jobs (e.g. reason for not sending)
On some hosts, gpu_active_frac may be much less than active_frac
(i.e., GPUs may be available much less than CPUs).
Use gpu_active_frac in the following places:
- scheduler: in estimating the elapsed time of jobs,
to decide whether they can meet deadline
- scheduler: in computing the effective speed of a (host, app version),
when deciding what size class it belongs to
- size_census: in computing effective speed of (host, app versions)
(Previously, we were just using active_frac in all these cases)
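A sketch of how gpu_active_frac might enter the elapsed-time estimate,
assuming wall-clock time scales inversely with resource availability
(names are illustrative):

    // Estimate wall-clock completion time of a job, using the GPU
    // availability fraction for GPU jobs and the CPU one otherwise.
    double estimated_turnaround(
        double est_run_time,      // run time if the resource were always available
        bool is_gpu_job,
        double active_frac,       // fraction of time the client computes (CPU)
        double gpu_active_frac    // fraction of time GPUs are usable
    ) {
        double avail = is_gpu_job ? gpu_active_frac : active_frac;
        if (avail < 1e-6) avail = 1e-6;    // guard against division by zero
        return est_run_time / avail;       // compare this against the deadline
    }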
See http://boinc.berkeley.edu/trac/wiki/MultiSize
The components of this include:
- DB changes:
add size_class to workunit and result
n_size_classes to app; >1 means multi-size
- size_regulator daemon program: change result states
from INACTIVE to UNSENT carefully
- size_census program: writes quantile info to flat files
- transitioner: when creating results for multi-size apps,
set server state to INACTIVE
- sched shmem (feeder): read quantile info from flat files,
store in shared memory
- scheduler (score-based scheduling): for multi-size apps,
add a component to the score function for size class (see the sketch after this list).
- show_shmem: show result size class
- make_work (and other callers of count_unsent_results()):
count both INACTIVE and UNSENT
- create_work: add --size_class cmdline option
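As referenced above, one possible shape for the size-class term of the score
function; the names and constants here are illustrative, not the actual
scheduler formula:

    #include <cstdlib>

    // Favor jobs whose size class matches the host's effective-speed
    // quantile; penalize mismatches in proportion to the distance.
    double size_class_term(int job_size_class, int host_size_class) {
        int diff = std::abs(job_size_class - host_size_class);
        if (diff == 0) return 1.0;     // exact match: bonus
        return -0.5 * diff;            // mismatch: penalty grows with distance
    }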
Also:
- if we get MySQL errors during upgrade, don't rewrite db_version
(reported by Kevin Reed).
The problem: cache inconsistency.
If there are 2 results for the same WU in shared mem,
and 2 scheduler instances get them around the same time,
they can send them with different app versions.
We already fixed this problem for HR by
1) rereading the relevant WU fields while deciding
whether to send the result
2) doing a "careful update" of the WU field using a where clause
to make sure it wasn't modified in the (short) interval
since rereading it.
I fixed the HAV problem in the same way,
and merged the two mechanisms to combine the DB queries.
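A sketch of the "careful update" pattern, assuming a DB helper that reports
the number of affected rows; the real code is in update_wu_on_send() and
updates the actual WU fields:

    #include <cstdio>

    // Stub standing in for the DB layer: assume it returns the number of
    // rows the UPDATE actually changed.
    int db_exec_affected_rows(const char* /*query*/) { return 1; }

    // Update an HR-related WU field only if it still has the value we
    // reread a moment ago; if another scheduler instance changed it first,
    // the WHERE clause doesn't match and zero rows are affected.
    bool careful_update_hr_class(long wu_id, int old_hr_class, int new_hr_class) {
        char query[256];
        snprintf(query, sizeof(query),
            "update workunit set hr_class=%d where id=%ld and hr_class=%d",
            new_hr_class, wu_id, old_hr_class
        );
        return db_exec_affected_rows(query) == 1;
    }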
Also:
- The rereads are done in slow_check() (see below).
- The careful updates are done in update_wu_on_send(),
and this is called *before* doing careful updates on result fields.
That way, if the WU updates fail, we don't have orphaned results.
- already_sent_to_different_platform_careful() (sic)
no longer does DB stuff, so it's merged with
already_sent_to_different_hr_class() (better name)
NOTE: slow_check() is used in array scheduling only.
Score-based scheduling uses other code,
in which this bug is not yet fixed.
Locality scheduling doesn't support HR or HAV at all.
This should be unified.
svn path=/trunk/boinc/; revision=24484
This lets you specify, on a per-app basis,
that all instances of a workunit should be done using the same app version.
This is for validation in the presence of GPUs.
- scheduler: code cleanup
- Instead of adding a bunch of non-DB fields to RESULT,
used a derived class SCHED_DB_RESULT.
- Instead of storing a pointer to BEST_APP_VERSION in RESULT,
store the structure itself.
This simplifies the memory allocation situation.
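A simplified picture of the refactor; the field lists are stand-ins, only
the class names come from the note above:

    // DB-backed result record (fields abridged).
    struct RESULT {
        long id;
        long workunitid;
        // ... other DB fields ...
    };

    // Info about the chosen app version (fields abridged).
    struct BEST_APP_VERSION {
        long appid;
        double projected_flops;
    };

    // Scheduler-only state lives in a derived class rather than being
    // mixed into RESULT, and BEST_APP_VERSION is stored by value, so no
    // separately allocated object has to be tracked.
    struct SCHED_DB_RESULT : RESULT {
        BEST_APP_VERSION bav;
        double score;
    };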
- client: condition "Got server request to delete file" messages
on <file_xfer_debug>
svn path=/trunk/boinc/; revision=23636
My change of 1 Oct ([22440]) required that such jobs
(those needing more than 2 GB of memory) be processed with 64-bit apps,
on the assumption that 32-bit apps have a 2 GB user address space limit.
However, it turns out this limit applies only to Windows
(kernel and user mode share the 4 GB address space; each gets half).
On Linux, the split is 3 GB user / 1 GB kernel.
On Mac OS X, user mode and kernel mode have separate address spaces,
each of them 4 GB.
svn path=/trunk/boinc/; revision=22599
That produced a messed-up query that assigned garbage values to:
host_app_version.turnaround_var
host_app_version.turnaround_q
host_app_version.max_jobs_per_day
host_app_version.consecutive_valid
To repair these:
- set turnaround_var and turnaround_q to zero
- if max_jobs_per_day is outside of
(0..config.daily_result_quota)
set it to config.daily_result_quota
- if consecutive_valid is outside (0..1000), set it to zero
I added a script, html/ops/repair_21812.php, that does this;
if you ran server code between [21181] and [21812], run this script.
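A C++ sketch of the repair rules above (the actual script is
html/ops/repair_21812.php and issues DB updates; the struct here just
mirrors the affected fields):

    struct HOST_APP_VERSION_FIELDS {
        double turnaround_var;
        double turnaround_q;
        int max_jobs_per_day;
        int consecutive_valid;
    };

    void repair(HOST_APP_VERSION_FIELDS& hav, int daily_result_quota) {
        hav.turnaround_var = 0;
        hav.turnaround_q = 0;
        // clamp max_jobs_per_day into the (0..daily_result_quota) range
        if (hav.max_jobs_per_day <= 0 || hav.max_jobs_per_day > daily_result_quota) {
            hav.max_jobs_per_day = daily_result_quota;
        }
        // reset consecutive_valid if it's outside (0..1000)
        if (hav.consecutive_valid < 0 || hav.consecutive_valid > 1000) {
            hav.consecutive_valid = 0;
        }
    }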
- scheduler/transitioner: add <debug_quota> log flag
- changed the build system to always use -Wall
(if we'd done this before, this bug wouldn't have happened)
- fixed a bunch of other compile warnings
svn path=/trunk/boinc/; revision=21812
"min # of GPU processors" attribute (stored in batch)
and are sent only to hosts whose GPUs have at least this #.
The logical place for this is in the scoring function, JOB::get_score().
I added a clause (#ifdef'd out) that does this.
It rejects the WU if the host's GPU has fewer processors than the job's minimum;
otherwise it adds min/actual to the score.
This favors sending jobs that need lots of procs to GPUs that have them.
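An illustrative version of the (#ifdef'd out) clause; the names and exact
arithmetic are assumptions, but the shape follows the description above:

    // Returns false if the host's GPU has too few processors for the job;
    // otherwise adds min/actual to the running score, so proc-hungry jobs
    // gravitate toward GPUs that actually have the processors.
    bool apply_min_gpu_procs(int job_min_procs, int host_gpu_procs, double& score) {
        if (host_gpu_procs < job_min_procs) {
            return false;                          // reject: GPU too small
        }
        score += (double)job_min_procs / (double)host_gpu_procs;
        return true;
    }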
svn path=/trunk/boinc/; revision=18764
The limit on jobs in progress is now
max_wus_in_progress * NCPUS
+ max_wus_in_progress_gpu * NGPUS
where NCPUS and NGPUS reflect prefs and are capped.
Furthermore: if the client reports plan class for in-progress jobs
(see checkin of 31 May 2009)
then these limits are enforced separately;
i.e. the # of in-progress CPU jobs is <= max_wus_in_progress*NCPUS,
and the # of in-progress GPU jobs is <= max_wus_in_progress_gpu*NGPUS
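A sketch of the separate limits, assuming plan classes are reported so CPU
and GPU jobs in progress can be counted separately (names are illustrative;
NCPUS/NGPUS are taken to already reflect prefs and caps):

    bool can_send_cpu_job(int n_cpu_in_progress, int max_wus_in_progress, int ncpus) {
        return n_cpu_in_progress < max_wus_in_progress * ncpus;
    }

    bool can_send_gpu_job(int n_gpu_in_progress, int max_wus_in_progress_gpu, int ngpus) {
        return n_gpu_in_progress < max_wus_in_progress_gpu * ngpus;
    }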
- scheduler config: rename <cuda_multiplier> to <gpu_multiplier>
- scheduler: <max_wus_to_send> is now scaled by
(NCPUS + gpu_multiplier*NGPUS)
- scheduler: don't keep scanning array if !work_needed()
- scheduler: moved array-scan logic from sched_send.cpp to sched_array.cpp
- scheduler: don't say "no work available" if jobs are available
but work_needed() is initially false
svn path=/trunk/boinc/; revision=18255