boinc

Commit Graph

Author	SHA1	Message	Date
David Anderson	b7d48765a8	- client: if have coproc jobs but coproc is missing, skip those jobs in RR sim. Otherwise we add stuff to uninitialized data structures, and a crash can result. - client: initialize the above data structures anyway svn path=/trunk/boinc/; revision=20753	2010-02-28 04:32:10 +00:00
David Anderson	f716dcf7ae	- client: if a project has zero resource share, treat it as a "backup project": fetch work from it only if there is an idle instance and no other projects have work. svn path=/trunk/boinc/; revision=20286	2010-01-28 05:21:14 +00:00
David Anderson	b5124fe729	- client: brute-force attempt at eliminating domino-effect preemption: if job A is unstarted and EDF, and there's a job B that is later in the list, is started, has the same app version, and has the same arrival time, move A after B. - client: remove the "temp_dcf" mechanism, which had the same goal but didn't work. - client: in computing overall debt for a project, subtract a term that reflects pending work. This should reduce repeated fetches from the same project. - client simulator: tweaks svn path=/trunk/boinc/; revision=20223	2010-01-21 00:14:56 +00:00
David Anderson	fe7d8b34f3	- client simulator: done for now svn path=/trunk/boinc/; revision=20204	2010-01-20 06:35:57 +00:00
David Anderson	d6b6f8d5db	- client (Mac): append /usr/local/cuda/lib to LD_LIBRARY_PATH and DYLD_LIBRARY_PATH - client simulator: compile fixes svn path=/trunk/boinc/; revision=20117	2010-01-09 16:41:17 +00:00
David Anderson	37aae854f3	- client: scheduling problem: - a project overestimates job FLOP counts - the client starts jobs in EDF mode - as job progresses and fraction done increases, its completion time estimate decreases until it's no longer a deadline miss. - job gets preempted by other job from that project; you end up with lots of partly completed jobs. Solution (I hope): if an app version has running jobs, compute a "temp DCF" for the app version, which is the min of dynamic/static estimates for its jobs. Apply this scaling factor to completion time estimates for unstarted jobs in RR simulation - client: the estimation of remaining time of running jobs was wrong (how did this bug survive so long?) svn path=/trunk/boinc/; revision=20077	2010-01-06 06:01:23 +00:00
David Anderson	876522c6aa	- client: add logic to work fetch so that each project will have enough jobs to use its share of resource instances. This avoids situations where e.g. on a 2-CPU system a project has 75% resource share and 1 CPU job, and its STD increases without bound. Did a general cleanup of the logic for computing work request sizes (seconds and instances). svn path=/trunk/boinc/; revision=20036	2009-12-24 20:40:27 +00:00
David Anderson	e9a4debf9c	- client: scheduling tweak. Old: if a project has RR sim deadline misses, select jobs to run high-priority on the basis of: 1) deadline (earliest first) 2) estimated time to completion (least first) This ignores whether jobs missed their deadline in RR sim, so it may choose to run a job that's actually in no danger of missing its deadline over one that is. New: choose only jobs that miss their deadline in RR sim svn path=/trunk/boinc/; revision=19826	2009-12-08 20:39:46 +00:00
David Anderson	4d96415576	- client: fix bug introduced in [19035] that causes wrong nidle instances (and resulting work fetch problems) - Unix build: don't touch svn_version.sh if it hasn't changed, to avoid remake of sched/ (from Gabor Gombas) svn path=/trunk/boinc/; revision=19096	2009-09-18 19:26:34 +00:00
David Anderson	f5a6f862bf	- client: fix bug in RR simulation: start only enough jobs to fill CPUs per project, not all the CPU jobs at once. I'm not sure how much difference this makes, but this is how it's supposed to work. - client: if app_info.xml doesn't specify flops, use an estimate that takes GPUs into account. - client: if it's been more than 2 weeks since time stats update, don't decay on_frac at all. svn path=/trunk/boinc/; revision=19035	2009-09-09 22:18:02 +00:00
David Anderson	c3fe504e1d	- client: add ATI support to job scheduling and work fetch svn path=/trunk/boinc/; revision=18850	2009-08-17 16:50:40 +00:00
David Anderson	0a523d5f3f	svn path=/trunk/boinc/; revision=18843	2009-08-14 17:10:52 +00:00
David Anderson	e606170b14	- client: try to fix situations where the scheduler runs GPU jobs in a seemingly random order, or preempts GPU jobs needlessly. The change has two parts: 1) sort the "results" vector by received_time, so that the RR simulation processes GPU jobs FIFO. 2) in the CPU scheduler (earliest_deadline_result()) instead of choosing the earliest-deadline GPU job that misses its deadline, pick the earliest-deadline GPU job from a project that has a deadline miss for that GPU type (this is what's done in the CPU case) - client: fix bug where if you have an exclusive app, then remove it from cc_config.xml and do "update config", it doesn't go away. Need to clear the list before parsing. svn path=/trunk/boinc/; revision=18842	2009-08-14 16:54:45 +00:00
David Anderson	b358089006	svn path=/trunk/boinc/; revision=18632	2009-07-20 17:30:10 +00:00
David Anderson	5753153909	- client: 2nd try on my last checkin. We need to estimate 2 different delays for each resource type: 1) "saturated time": the time the resource will be fully utilized (new name for the old "estimated delay"). This is used to compute work requests. 2) "busy time": the time a new job would have to wait to start using this resource. This is passed to the scheduler and used for a crude deadline check. Note: this is ill-defined; a single number doesn't suffice. But as a very rough estimate, I'll use the sum of (J.duration * J.ninstances)/ninstances over all jobs that miss their deadline under RR sim. svn path=/trunk/boinc/; revision=18629	2009-07-17 18:29:10 +00:00
David Anderson	8a1c0816ed	- client: change the way a resource's "estimated delay" (passed to server for crude deadline check) is computed. Old: estimated delay is the interval for which the resource is fully used (i.e., all instances busy). Problem: this may cause unnecessary project starvation. example: 1 CPU machine, has a month-long CPDN job with a 1-year deadline (it's not in deadline trouble). Then the CPU estimated delay will be 1 month, and the client won't get any work from projects with deadlines shorter than 1 month. New: estimated delay is the latest time at which the resource is fully used and is being used by at least 1 job that is projected to miss its deadline under RR. Note: this isn't precise, but I don't think we can improve it much without getting a lot more complex. svn path=/trunk/boinc/; revision=18607	2009-07-16 21:21:47 +00:00
David Anderson	c2097091fe	- client: show "est. delay" correctly in work fetch debug msgs - client: show times correctly in rr_sim debug msgs - client: in "requesting new tasks" msg, say what resources we're requesting (if there's more than CPU) - client: estimated delay was possibly being calculated incorrectly because of roundoff error svn path=/trunk/boinc/; revision=18269	2009-06-02 22:53:57 +00:00
David Anderson	cf638ae3a6	- client: instead of scheduling coproc jobs EDF: - first schedule jobs projected to miss deadline in EDF order - then schedule remaining jobs in FIFO order This is intended to reduce the number of preemptions of coproc jobs, and hence (since they are always preempted by quit) to reduce the wasted time due to checkpoint gaps. - client: the CPU scheduling policy made use of the number of deadline misses in various places. This should include only the deadline misses of CPU jobs. So move "deadlines_missed" from RR_SIM_STATUS and PROJECT to RSC_PROJECT_WORK_FETCH so that we have separate counts for CPU and coproc jobs, and use the count for CPU jobs. - GUI RPC: removed the rr_sim_deadlines_missed field from project descriptor. This is no longer meaningful, and it didn't seem to be used anywhere. svn path=/trunk/boinc/; revision=17785	2009-04-10 19:01:38 +00:00
David Anderson	7e256c0995	- client: work fetch: in RR sim, keep track of the number of device instances used by jobs that miss deadline. Don't do "variety" work fetch if this is >= # of instances svn path=/trunk/boinc/; revision=17631	2009-03-19 16:55:04 +00:00
David Anderson	edca22818e	- client: in RR simulation, use app_version.flops instead of host_info.fpops as the FLOPS estimate for non-GPU apps. I don't see why this would make any difference (these two are equal for non-GPU apps) but people have reported that this change improves estimates. svn path=/trunk/boinc/; revision=17624	2009-03-18 17:24:56 +00:00
David Anderson	fb1187e398	svn path=/trunk/boinc/; revision=17501	2009-03-04 22:07:16 +00:00
David Anderson	346ac348b3	- client: RR sim FLOPS estimate for GPU jobs should reflect fraction of time BOINC is running. svn path=/trunk/boinc/; revision=17412	2009-02-27 21:44:39 +00:00
David Anderson	125c90d1da	- client: work-fetch bug fix: if we're fetching work for a starved project, it must have no runnable jobs for ANY resource. - client: work-fetch bug fix: when setting requests in the shortfall case, don't request anything if project is backed off or overworked for the resource. svn path=/trunk/boinc/; revision=17338	2009-02-23 21:34:13 +00:00
David Anderson	6a75b78de4	- client: don't ignore jobs with fraction_done=1 (but still running) in RR simulation; we may need to mark them as deadline miss. - web: replace & with &amp; in various places svn path=/trunk/boinc/; revision=17278	2009-02-17 17:39:57 +00:00
David Anderson	a4a2a68f7d	- fix tabs svn path=/trunk/boinc/; revision=17101	2009-02-02 18:47:34 +00:00
David Anderson	9f170696a4	- client: code cleanup svn path=/trunk/boinc/; revision=17100	2009-02-02 18:45:00 +00:00
David Anderson	6120b02306	- client: code cleanup svn path=/trunk/boinc/; revision=17098	2009-02-02 05:15:12 +00:00
David Anderson	89188fca84	- client: there was a problem with how the round-robin simulator worked in the presence of coprocessors. The simulator maintained per-project queues of pending jobs. When a job finished (in the simulation) it would get one or more jobs from that project's pending queue. The problem: this could cause "holes" in the scheduling of GPUs, and produce an erroneous nonzero shortfall for GPUs, leading to infinite work fetch. The solution: maintain a separate (per-resource, not per-project) queue of pending coprocessor jobs. When a coprocessor job finishes, start pending jobs from the queue for that resource. Another change: the simulator did strict reservation of coprocessors. If there are 2 instances of CUDA, and a 1-instance job is running in the simulation, it wouldn't start an additional 2-instance job. This also can cause erroneous nonzero shortfalls. So instead, schedule coprocessors like CPUs, i.e. saturate them. This can cause distorted completion time estimates, but it's better than infinite work fetch. svn path=/trunk/boinc/; revision=17093	2009-02-01 04:37:19 +00:00
David Anderson	9e7cb42084	- client: computation of # idle CUDA instances was wrong svn path=/trunk/boinc/; revision=17087	2009-01-30 21:49:20 +00:00
David Anderson	b7a2c227ca	- Work fetch / scheduler: There are two mechanisms to prevent the scheduler from sending jobs that won't finish by their deadline. Simple mechanism: The client sends the interval x for which CPUs are projected to be saturated. Given a job with estimated duration y, the scheduler doesn't send it if x + y exceeds the delay bound. If it does send it, x is incremented by y. Complex mechanism: Client sends workload description. Scheduler does EDF simulation, sees if deadlines are missed. The only project using this AFAIK is BOINC alpha test. Neither of these mechanisms takes coprocessors into account, and as a result jobs could be sent that are doomed to miss their deadline. This checkin adds coprocessor awareness to the Simple mechanism. Changes: Client: compute estimated delay (i.e. time until non-saturation) for coprocessors as well as CPU. Send them in scheduler request as part of coproc descriptor. Scheduler: Keep track of estimated delays separately for different resources - client: fixed bug that computed CPU estimated delay incorrectly - client: the work request (req_secs) for a resource is the min of the project's share and the shortfall. svn path=/trunk/boinc/; revision=17086	2009-01-30 21:25:24 +00:00
David Anderson	f33631cbbc	- client: fix messages svn path=/trunk/boinc/; revision=16960	2009-01-20 18:06:49 +00:00
David Anderson	f90dddc9a6	- client: clamp long term debts to +- 1 week - client: fix CUDA debt calculation - client: don't accumulate debt if project->dont_request_more_work - client: improve messages svn path=/trunk/boinc/; revision=16909	2009-01-14 23:56:07 +00:00
David Anderson	132cc6bba3	- client: debugging CUDA-related stuff - client: if reset a project, clear its overall and per-resource backoffs svn path=/trunk/boinc/; revision=16862	2009-01-10 00:48:22 +00:00
David Anderson	2860574fa5	compile fixes and debug message fixes svn path=/trunk/boinc/; revision=16836	2009-01-08 00:20:04 +00:00
David Anderson	8740ffdc94	- client: more work-fetch stuff. No more per-project shortfall. It's getting pretty close. svn path=/trunk/boinc/; revision=16765	2009-01-03 06:01:17 +00:00
David Anderson	72937e5c4f	win compile fixes svn path=/trunk/boinc/; revision=16756	2008-12-31 23:30:38 +00:00
David Anderson	8c591e31df	- client: first whack at new work-fetch logic. Very preliminary. svn path=/trunk/boinc/; revision=16754	2008-12-31 23:07:59 +00:00
David Anderson	cd4ca5fb17	- client: fix calculation of a job's FLOPS rate in round-robin simulation svn path=/trunk/boinc/; revision=16662	2008-12-09 20:01:01 +00:00
David Anderson	fbb899f1c0	- client: in round-robin simulation, don't count a project in total resource share if it has coproc jobs and no CPU jobs. svn path=/trunk/boinc/; revision=16652	2008-12-08 23:00:23 +00:00
David Anderson	af183bc2db	- client: in round-robin simulation, remove code that sets CPU shortfall for projects with no active results. This is now wrong because coproc apps might have pending results. Also remove nidle_cpus > 0 conditional that increments CPU shortfall; I think this is vestigial code. svn path=/trunk/boinc/; revision=16646	2008-12-08 18:26:25 +00:00
Charlie Fenton	6d18e79466	client: fix compiler warning. svn path=/trunk/boinc/; revision=16615	2008-12-04 02:18:01 +00:00
David Anderson	ea0146d154	- client: fix calculation of CPU shortfall; don't fetch work from projects with zero CPU shortfall svn path=/trunk/boinc/; revision=16613	2008-12-03 23:30:54 +00:00
David Anderson	84f1193a9d	- client: use FLOPs, rather than CPU time, as the basis for estimating job completion times. This should improve estimates for GPU apps, and prevent the DCF from getting messed up. svn path=/trunk/boinc/; revision=16598	2008-12-02 03:58:32 +00:00
David Anderson	e3fd56f5e8	- client: work-fetch tweak: don't increment overall CPU shortfall if any jobs pending in simulation svn path=/trunk/boinc/; revision=16595	2008-12-01 22:06:24 +00:00
David Anderson	57639bdaae	- client: in round-robin simulation, only increment CPU shortfall (per-project or overall) if there are no pending tasks. This is needed when there are coproc (i.e. CUDA) jobs; CPUs may be idle because pending jobs are waiting for active jobs to release coprocs. In this situation the CPU idleness should not be counted as shortfall; otherwise (if there are only coproc jobs) there will always be a shortfall, and the client will fetch infinite work. svn path=/trunk/boinc/; revision=16545	2008-11-24 18:57:04 +00:00
David Anderson	98d6931d63	- client (Unix): if app uses < 1 CPU, run at nice 10 (not 0) - client: suppress specious error message svn path=/trunk/boinc/; revision=16496	2008-11-14 22:08:50 +00:00
Charlie Fenton	deaeae4eda	client: fix compiler warning indicating real error in RR simulation svn path=/trunk/boinc/; revision=16391	2008-11-03 10:19:25 +00:00
David Anderson	719921bfaf	- client: fix the updating of CPU time left in RR simulation; don't print msgs about non-CPU-intensive projects. svn path=/trunk/boinc/; revision=16386	2008-11-01 21:10:08 +00:00
Charlie Fenton	046146317e	client: fix compiler warning svn path=/trunk/boinc/; revision=16383	2008-11-01 00:14:20 +00:00
David Anderson	9987f9d245	- client: revise round-robin simulation to take variable avg_ncpus into account svn path=/trunk/boinc/; revision=16366	2008-10-30 21:07:35 +00:00
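
The "Simple mechanism" described in the revision 17086 entry above (b7a2c227ca) can be illustrated with a minimal sketch. This is not BOINC's code: the `Job` record, its field names, the `select_jobs` function, and the resource keys "cpu" and "cuda" are all hypothetical, chosen only to show the check as the commit message states it. The client reports an interval x per resource, a job of estimated duration y is withheld if x + y exceeds its delay bound, and x is incremented by y for each job that is sent.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative, not BOINC's."""
    name: str
    resource: str        # assumed resource key, e.g. "cpu" or "cuda"
    est_duration: float  # y: estimated duration in seconds
    delay_bound: float   # seconds allowed before the deadline


def select_jobs(candidates, estimated_delay):
    """Apply the crude per-resource deadline check to candidates in order.

    estimated_delay maps a resource key to x, the interval in seconds for
    which the client projects that resource to be saturated. A job is
    skipped when x + y exceeds its delay bound; otherwise it is sent and
    x for its resource is incremented by y.
    Returns the list of jobs sent and the updated per-resource delays.
    """
    delays = dict(estimated_delay)  # copy so the caller's dict is untouched
    sent = []
    for job in candidates:
        x = delays.get(job.resource, 0.0)
        if x + job.est_duration > job.delay_bound:
            continue  # would miss its deadline under this estimate
        sent.append(job)
        delays[job.resource] = x + job.est_duration
    return sent, delays
```

With a reported CPU delay of 1800 seconds, a first one-hour CPU job with a two-hour delay bound passes the check, while a second identical job is withheld because the first one has already pushed x up by an hour. Keeping one delay per resource mirrors the change in that commit, which tracks estimated delays separately for CPUs and coprocessors.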

54 Commits