boinc

Commit Graph

Author	SHA1	Message	Date
David Anderson	ef22b2bd4b	client: show projects in alphabetical order of project name A while back I changed the job sched and work fetch policies to use REC-based project priority. The work fetch logic sorts the project list (in CLIENT_STATE::projects) by descending priority. This causes two problems: - If you have a lot of projects, it's hard to find a particular one in the event log, e.g. in work_fetch_debug output. - In the manager's Statistics tab, the selected project can change unexpectedly since we identify it by array index, and the array order may change. Solution: sort CLIENT_STATE::projects alphabetically (case insensitive). In WORK_FETCH, copy this array to a separate array, that is then sorted by decreasing priority.	2014-12-17 09:56:01 -08:00
David Anderson	7a4672e7d6	client: increase limit on coproc instances from 31 to 64 We were using an int bitmap to store flags for the instances of a coproc. Furthermore, because of the use of 2^n-1 to generate a bitmap of 1s, the limit on instances was 31. Use a long long for the bitmap instead, and don't use 2^n-1. This increases the limit to 64.	2014-11-24 00:14:23 -08:00
David Anderson	eafd70ecc6	client: request work from backed-off resources if doing RPC anyway	2014-11-18 00:05:17 -08:00
David Anderson	fbc6e40dca	Client: fix bug that prevented work fetch for zero-share projects In work fetch setup, we were computing rsc_project_reason before doing the round-robin simulation. It needs to be done after, because it uses the # of idle devices, which is computed by the simulation.	2014-11-17 13:56:06 -08:00
David Anderson	4c9d1d6659	client: code cleanup and possible debugging in work fetch - Remove code that tries to keep track of available GPU RAM and defer jobs that don't fit. This never worked, it relied on project estimates of RAM usage, and it's been replaced by having the app do temporary exit if alloc fails. - Move logic for checking for deferred jobs from CPU to work fetch. - Rename rsc_defer_sched to has_deferred_job, and move it from PROJECT to RSC_PROJECT_WORK_FETCH - tweak work_fetch_debug output	2014-10-10 14:35:00 -07:00
David Anderson	9c96108c67	client: work fetch code cleanup The logic for deciding whether to fetch work for a project or a (project, resource type) pair was scattered among several functions, with confusing names. Consolidate this logic, and use consistent names.	2014-10-10 10:37:07 -07:00
David Anderson	f63f259ce5	client: code cleanup	2014-10-10 07:15:10 -07:00
David Anderson	31541e166d	client: set work requests for coprocs specified in cc_config.xml We weren't copying the request fields from RSC_WORK_FETCH to COPROC. Do this, and clean up the code a bit. Note: the arrays that parallel the COPROCS::coprocs array are a bit of a kludge; that stuff logically belongs in COPROC. But it's specific to the client, so I can't put it there. Maybe I could do something fancy with derived classes, not sure.	2014-08-09 21:44:39 -07:00
David Anderson	b076a947fc	client: work fetch tweak to avoid starvation in a particular case My commit of Feb 7 caused work fetch to project P to be deferred for up to 5 min if an upload to P is active, even if some instances are idle. This was to deal with a case where the idleness was caused by a jobs-in-progress limit by P, and work requests lead to long backoff. However, this can cause instances to be idle unnecessarily. I changed things so that, if instances are idle, a work fetch can happen even during upload. But only one such fetch will be done.	2014-03-09 17:09:21 -07:00
David Anderson	fe8b26ac73	client: when not piggybacking work request, explain why in log msg	2014-02-24 18:45:25 -08:00
David Anderson	4d47e2f170	client: don't request work from a project w/ > 1000 runnable jobs Because of O(N^2) algorithms, the client becomes CPU-intensive when there are lots of jobs. This limit could be somewhat lower.	2013-07-07 13:13:57 -07:00
David Anderson	57a6d3d17a	client (Android): make max battery temperature a preference Note: internal change only; there's no GUI for this yet	2013-06-20 21:47:34 -07:00
David Anderson	8a1569c384	client: fix work-fetch bug that could starve a GPU if exclusions used	2013-05-16 12:38:55 -07:00
David Anderson	c00f27a5a5	client: message tweak (show "don't need" in work request msg)	2013-04-26 12:19:43 -07:00
David Anderson	6b6c2ac519	- client: fix bug that could cause idle GPUs when exclusions are present. The basic problem: the way we assign GPU instances when creating the "run list" is slightly different from the way we assign them when we actually run the jobs; the latter assigns a running job to the instance it's using, but the former doesn't. Solution (kludge): when building the run list, don't reserve instances for currently running jobs. This will result in more jobs in the run list, and avoid starvation. For efficiency, do this only if there are exclusions for this type. Comment: this is yet another complexity that would be eliminated if GPU instances were modeled separately. I wish I had time to do that. - client emulator: change default latency bound from 1 day to 10 days	2013-04-07 13:00:15 -07:00
David Anderson	330a25893f	- client emulator: parse <max_concurrent> in <app> in client_state.xml. This gives you a way to simulate the effects of app_config.xml - client: piggyback requests for resources even if we're backed off from them - client: change resource backoff logic Old: if we requested work and didn't get any, back off from resources for which we requested work New: for each resource type T: if we requested work for T and didn't get any, back off from T Also, don't back off if we're already backed off (i.e. if this is a piggyback request) Also, only back off if the RPC was due to an automatic and potentially rapid source (namely: work fetch, result report, trickle up) - client: fix small work fetch bug	2013-04-04 10:25:56 -07:00
David Anderson	f6a61fe801	- client: major overhaul of work-fetch logic based on suggestions by Jacob Klein. The new policy is roughly as follows: - find the highest-priority project P that is allowed to fetch work for a resource below buf_min - Ask P for work for all resources R below buf_max for which it's allowed to fetch work, unless there's a higher-priority project allowed to request work for R. If we're going to do an RPC to P for reasons other than work fetch, the policy is: - for each resource R for which P is the highest-priority project allowed to fetch work, and R is below buf_max, request work for R.	2013-04-02 12:32:28 -07:00
David Anderson	b93e80c6f5	- client: code cleanup. Some variable/function/constant names contained "debt" when they actually refer to REC. Change these names to use "rec".	2013-03-24 11:22:01 -07:00
David Anderson	128da198b6	- client: rename two different functions named backoff() to make it easier to see what's going on. - fix code formatting in manager	2013-03-22 10:43:05 +01:00
David Anderson	546ea233a0	- client: fix small work fetch bug that caused the client to not add a piggyback work request when it should have.	2013-03-15 13:38:45 +01:00
David Anderson	fc6b050883	- client: removed unused code for old work-fetch logic	2013-03-15 13:38:45 +01:00
David Anderson	2e23bfedaa	- client, work fetch policy. Change policy for projects w/ GPU exclusions	2013-03-07 11:28:43 +01:00
David Anderson	a63ebbc13e	- client: change work fetch policy to work better with GPU exclusions - scale amount of work request by (# non-excluded instances)/#instances - change policy: old: don't fetch work if #jobs > #non-excluded instances new: don't fetch work if # of instance-seconds used in RR sim > work_buf_min * (#non-excluded instances)/#instances	2013-03-07 11:28:42 +01:00
David Anderson	7768f6da60	- client: fix bug where, when updating a project, we fail to request work even though higher-priority projects are marked as no-new-tasks or are otherwise ineligible for work fetch.	2013-03-04 14:09:43 +01:00
David Anderson	777f1f11e8	- client: change work fetch policy to avoid starving GPUs in situations where GPU exclusions are used. - client: fix bug in round-robin simulation when GPU exclusions are used. Note: this fixes a major problem (starvation) with project-level GPU exclusion. However, project-level GPU exclusion interferes with most of the client's scheduling policies. E.g., round-robin simulation doesn't take GPU exclusion into account, and the resulting completion estimates and device shortfalls can be wrong by an order of magnitude. The only way I can see to fix this would be to model each GPU instance as a separate resource, and to associate each job with a particular GPU instance. This would be a sweeping change in both client and server.	2013-03-01 15:31:41 +01:00
David Anderson	446bc4ca28	- client: take GPU exclusions into account when making initial work request to a project - client: put some casts to double in NVIDIA detect code. Shouldn't make any difference. - volunteer storage: truncate file to right size after retrieval svn path=/trunk/boinc/; revision=26051	2012-08-20 23:41:27 +00:00
David Anderson	4fea52c6f2	- client: if a project has excluded GPUs of a given type, allow it to fetch work of that type if the # of runnable jobs is <= the # of non-excluded instances (rather than 0). svn path=/trunk/boinc/; revision=26045	2012-08-18 23:26:10 +00:00
David Anderson	ff1a391ced	- client: when we're making a scheduler RPC for a reason other than work fetch, and we're deciding whether to piggyback a work request, skip the checks for hysteresis (buffer < min) and for per-resource backoff time. These checks are there only to limit the rate of RPCs, which is not relevant since we're doing one anyway. This fixes a bug where a project w/ sporadic jobs specifies a next_rpc_delay to ensure regular polling from clients. When these polls occur they should request work regardless of backoff. svn path=/trunk/boinc/; revision=26002	2012-08-10 18:29:00 +00:00
David Anderson	f6bd141b30	- client: further msg tweaks svn path=/trunk/boinc/; revision=25830	2012-07-02 05:10:58 +00:00
David Anderson	1d717c6fcc	- client: msg tweak svn path=/trunk/boinc/; revision=25829	2012-07-02 04:45:19 +00:00
David Anderson	7dcf119854	- client: msg tweak svn path=/trunk/boinc/; revision=25828	2012-07-02 04:06:11 +00:00
David Anderson	89578050f7	- When the client makes a scheduler RPC without requesting work, and there's a simple reason (e.g. the project is suspended, no-new-tasks, downloads stalled, etc.) show it in the event log. If the reason is more complex, don't try to explain. svn path=/trunk/boinc/; revision=25827	2012-07-02 03:43:05 +00:00
David Anderson	82d64e9403	- msg tweak and fix compile warnings svn path=/trunk/boinc/; revision=25408	2012-03-12 23:34:41 +00:00
David Anderson	64a371173b	- client: fix crashing bug when there is 1 instance of a resource. I'm not sure how this ever worked. svn path=/trunk/boinc/; revision=25362	2012-03-02 03:56:26 +00:00
David Anderson	a6bf5aecf3	- client: tweak to work-fetch policy: if we're making a scheduler RPC to a project for reasons other than work fetch, and we're deciding whether to ask for work, ignore hysteresis; i.e. ask for work even if we're above the min buffer (idea from John McLeod). svn path=/trunk/boinc/; revision=25291	2012-02-18 23:19:06 +00:00
David Anderson	69834e0c01	- client: compile fix; remove redundant total_peak_flops() svn path=/trunk/boinc/; revision=24738	2011-12-06 09:20:30 +00:00
David Anderson	bc35060726	- client: when contacting a project for reasons other than work fetch (e.g. to report completed jobs) only request work if it's the project we would have chosen if we were fetching work. - client: the way in which project priorities were adjusted in work fetch to reflect currently queued work was wrong. - client: fix bug in the way project priorities are adjusted in RR simulator - client emulator: if there are results in the state file with states DOWNLOADING or UPLOADING, change them to DOWNLOADED or UPLOADED. Otherwise they're stuck. svn path=/trunk/boinc/; revision=24737	2011-12-06 04:21:27 +00:00
David Anderson	0d37f69a6a	- client emulator fixes svn path=/trunk/boinc/; revision=24644	2011-11-22 07:47:45 +00:00
David Anderson	7b28215032	- client: reimplement the round-robin simulator to reduce its runtime from O(N^2) to O(N), where N is the number of runnable jobs (which can be in the thousands). This will make the client emulator run a lot faster, and will reduce the client CPU overhead a bit. - API: change boinc_get_opencl_ids() so that it returns a BOINC error code (< -100) if the app_init.xml is missing or bad (i.e. we're running standalone), and an OpenCL error code (> -100) if an OpenCL call failed. svn path=/trunk/boinc/; revision=24469	2011-10-24 17:53:09 +00:00
David Anderson	b95ac02c5b	- client: change the way project priorities are computed, so that they do what they're supposed to (i.e. enforce resource shares) - client: change log flag <debt_debug> to <priority_debug> - client simulator: update REC even with large delta-t. - client simulator: handle "no new work" apps correctly svn path=/trunk/boinc/; revision=24429	2011-10-19 06:37:03 +00:00
David Anderson	7f2a3c0ce1	- client: get GPU available RAM at startup (only) - client: fix compile warning svn path=/trunk/boinc/; revision=24188	2011-09-13 22:58:39 +00:00
David Anderson	9856f795ed	- client: remove code related to debt-based scheduling svn path=/trunk/boinc/; revision=24163	2011-09-12 17:57:31 +00:00
David Anderson	f81cb82b8e	- client: make RR simulation more accurate by simulating time-slicing explicitly. Also simulate changes in project REC and hence in scheduling priority. - client: add a log flag "rrsim_detail" that prints time-slice-level info. svn path=/trunk/boinc/; revision=24161	2011-09-12 17:01:54 +00:00
David Anderson	cceea7f6d4	- client: rename MODE to RUN_MODE, and rename vars accordingly svn path=/trunk/boinc/; revision=23974	2011-08-09 20:41:15 +00:00
David Anderson	5b159c6735	- remote job submission: bug fix and tweaks - client: cc_config.xml: if <devnum> is omitted from a <exclude_gpu>, it means exclude all instances of that GPU type - client: if all instances of a GPU type are excluded for a project, don't ask the project for jobs of that type svn path=/trunk/boinc/; revision=23898	2011-07-29 00:07:20 +00:00
David Anderson	8ca24cbbab	- client, work fetch policy: adjust project REC by the amount of work queued, to increase variety NOTE: at some point I think I had a reason to not do this, but I can't remember what it is. - client, job scheduling policy: fix how project REC is adjusted svn path=/trunk/boinc/; revision=23838	2011-07-13 19:46:03 +00:00
David Anderson	c0417a8aaa	- client: fix scheduler bug that treated all CPU jobs as non-high-priority - client: don't print spurious "domino prevention" and "thrashing prevention" msgs - manager: show project descriptions in same size font as the rest of the dialog svn path=/trunk/boinc/; revision=23831	2011-07-11 05:34:09 +00:00
David Anderson	3b906a191c	- client: generalize the GPU framework so that - new GPU types can be added easily - users can specify GPUs in cc_config.xml, referred to by app_info.xml, and they will be scheduled by BOINC and passed --device N options Note: the parsing of cc_config.xml is not done yet. - RPC protocols (account manager and scheduler) can now specify GPU types in separate elements rather than embedding them in tag names e.g. <no_rsc>NVIDIA</no_rsc> rather than <no_cuda/> - client: in account manager replies, parse elements of the form <no_rsc>NAME</no_rsc> indicating that GPUs of type NAME should not be used. This allows account managers to control GPU types not hardwired into the client. Note: <no_cuda/> and <no_ati/> will continue to be supported. - scheduler RPC reply: add <no_rsc_apps>NAME</no_rsc_apps> (NAME = GPU name) to indicate that the project has no jobs for the indicated GPU type. <no_cuda_apps> etc. are still supported - client/lib: remove set_debts() GUI RPC - client/scheduler RPC remove <cuda_backoff> etc. (superseded by no_app) Exception: <ip_result> elements in sched request still have <ncudas> and <natis>. Fix this later. Implementation notes: - client/lib: change "CUDA" to "NVIDIA" in type/variable names, and in XML Continue to recognize "CUDA" for compatibility - host_info.coprocs no longer used within the client; use a global var (COPROCS coprocs) instead. COPROCS now has an array of COPROCs; GPU types are identified by the array index. Index zero means CPU. - a bunch of other resource-specific structs (like RSC_WORK_FETCH) are now stored in arrays, with same indices as COPROCS (i.e. index 0 is CPU) - COPROCS still has COPROC_NVIDIA and COPROC_ATI structs to hold vendor-specific info - APP_VERSION now has a struct GPU_USAGE to describe its GPU usage svn path=/trunk/boinc/; revision=23253	2011-03-25 03:44:09 +00:00
David Anderson	9e2abe135e	- simulator work svn path=/trunk/boinc/; revision=22927	2011-01-19 00:32:49 +00:00
David Anderson	717c45a2db	- client: use std::deque instead of std::vector for RR sim's pending-job lists. Erasing head of vector is slow. - lib: allow GPU peak FLOPS to be specified in XML (for simulator) - simulator work - client: old work fetch policy: projects may need enough jobs for all device instances, not just resource_share*ninst. E.g. a project that has only CPU jobs in a CPU/GPU client - client: with REC scheduling, don't ask for work for secondary resources if project has negative priority. - client: in RR sim, make sure we saturate devices if possible. Otherwise we may report a shortfall incorrectly svn path=/trunk/boinc/; revision=22894	2011-01-12 00:47:51 +00:00
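
Commit 7a4672e7d6 above describes replacing an int bitmap of coproc instance flags with a long long, and no longer generating the all-ones mask as 2^n-1. The following is only a rough sketch of that idea, not the BOINC code: `MAX_INSTANCES`, `set_instance`, `instance_is_set` and `first_n_instances` are hypothetical names chosen for illustration, and `std::uint64_t` stands in for the long long mentioned in the message.

```cpp
#include <cstdint>

// Assumed width of the bitmap; matches the 64-instance limit in the commit message.
constexpr int MAX_INSTANCES = 64;

// Set the flag for instance i (0-based, i < MAX_INSTANCES).
inline void set_instance(std::uint64_t& bitmap, int i) {
    bitmap |= (std::uint64_t{1} << i);
}

// Test the flag for instance i (0-based, i < MAX_INSTANCES).
inline bool instance_is_set(std::uint64_t bitmap, int i) {
    return (bitmap >> i) & 1u;
}

// Build a bitmap with flags set for instances 0..n-1 one bit at a time.
// This avoids computing (1 << n) - 1, whose shift is not valid when n
// equals the width of the type.
inline std::uint64_t first_n_instances(int n) {
    std::uint64_t bitmap = 0;
    for (int i = 0; i < n && i < MAX_INSTANCES; i++) {
        set_instance(bitmap, i);
    }
    return bitmap;
}
```

Setting bits in a loop costs a few more instructions than a single shift, but every shift count stays strictly below the type's width, so all 64 bits are usable.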


109 Commits
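
Commit ef22b2bd4b in the table above keeps CLIENT_STATE::projects sorted alphabetically (case insensitive) and has WORK_FETCH sort a separate copy by decreasing priority. A minimal sketch of that arrangement follows, assuming a simplified `Project` struct; the struct and the function names `name_less_ci`, `sort_by_name` and `by_descending_priority` are hypothetical and are not taken from the BOINC source.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for BOINC's PROJECT; only the fields needed here.
struct Project {
    std::string name;
    double priority;
};

// Case-insensitive "less than" on project names.
inline bool name_less_ci(const Project* a, const Project* b) {
    return std::lexicographical_compare(
        a->name.begin(), a->name.end(),
        b->name.begin(), b->name.end(),
        [](unsigned char x, unsigned char y) {
            return std::tolower(x) < std::tolower(y);
        });
}

// Keep the main list in a stable, user-visible order.
inline void sort_by_name(std::vector<Project*>& projects) {
    std::stable_sort(projects.begin(), projects.end(), name_less_ci);
}

// Work fetch gets its own copy, sorted by decreasing priority;
// the main list is left untouched.
inline std::vector<Project*> by_descending_priority(
        const std::vector<Project*>& projects) {
    std::vector<Project*> copy(projects);
    std::stable_sort(copy.begin(), copy.end(),
        [](const Project* a, const Project* b) {
            return a->priority > b->priority;
        });
    return copy;
}
```

Because only the copy is reordered, any code that identifies a project by its index in the main list keeps pointing at the same project when priorities change.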