99 Commits

Author SHA1 Message Date
David Anderson 4d47e2f170 client: don't request work from a project w/ > 1000 runnable jobs
Because of O(N^2) algorithms, the client becomes CPU-intensive
when there are lots of jobs.
This limit could be somewhat lower.
2013-07-07 13:13:57 -07:00
David Anderson 57a6d3d17a client (Android): make max battery temperature a preference
Note: internal change only; there's no GUI for this yet
2013-06-20 21:47:34 -07:00
David Anderson 8a1569c384 client: fix work-fetch bug that could starve a GPU if exclusions used 2013-05-16 12:38:55 -07:00
David Anderson c00f27a5a5 client: message tweak (show "don't need" in work request msg) 2013-04-26 12:19:43 -07:00
David Anderson 6b6c2ac519 - client: fix bug that could cause idle GPUs when exclusions are present.
The basic problem: the way we assign GPU instances when creating
        the "run list" is slightly different from the way we assign them
        when we actually run the jobs;
        the latter assigns a running job to the instance it's using,
        but the former doesn't.
    Solution (kludge): when building the run list,
        don't reserve instances for currently running jobs.
        This will result in more jobs in the run list, and avoid starvation.
        For efficiency, do this only if there are exclusions for this type.
    Comment: this is yet another complexity that would be eliminated
        if GPU instances were modeled separately.
        I wish I had time to do that.
- client emulator: change default latency bound from 1 day to 10 days
2013-04-07 13:00:15 -07:00
David Anderson 330a25893f - client emulator: parse <max_concurrent> in <app> in client_state.xml.
This gives you a way to simulate the effects of app_config.xml
- client: piggyback requests for resources even if we're backed off from them
- client: change resource backoff logic
    Old: if we requested work and didn't get any,
        back off from resources for which we requested work
    New: for each resource type T:
        if we requested work for T and didn't get any, back off from T
        Also, don't back off if we're already backed off
            (i.e. if this is a piggyback request)
        Also, only back off if the RPC was due to an automatic
            and potentially rapid source
            (namely: work fetch, result report, trickle up)
- client: fix small work fetch bug
2013-04-04 10:25:56 -07:00
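The per-resource backoff rule described in the commit above can be sketched roughly as follows. This is an illustration only, not the client's code: `RpcReason`, `ResourceState`, `is_automatic` and `resources_to_back_off` are hypothetical names, and the `UserRequest` reason is an assumed example of a non-automatic RPC source.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reasons for a scheduler RPC; names are illustrative only.
enum class RpcReason { WorkFetch, ResultReport, TrickleUp, UserRequest };

// Hypothetical per-resource state for one project around a single RPC.
struct ResourceState {
    bool requested_work;  // did this RPC ask for work for the resource?
    bool got_work;        // did the reply contain jobs for the resource?
    bool backed_off;      // was the resource already backed off (piggyback request)?
};

// True if the RPC came from an automatic and potentially rapid source.
bool is_automatic(RpcReason r) {
    return r == RpcReason::WorkFetch || r == RpcReason::ResultReport
        || r == RpcReason::TrickleUp;
}

// For each resource type T, decide whether to start a backoff for T.
std::vector<bool> resources_to_back_off(
    const std::vector<ResourceState>& rs, RpcReason reason
) {
    std::vector<bool> out(rs.size(), false);
    for (std::size_t i = 0; i < rs.size(); i++) {
        out[i] = rs[i].requested_work && !rs[i].got_work
            && !rs[i].backed_off && is_automatic(reason);
    }
    return out;
}
```

Under these assumptions, a resource that was requested but not supplied is backed off only when it was not already backed off and the RPC was automatic.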
David Anderson f6a61fe801 - client: major overhaul of work-fetch logic based on suggestions
by Jacob Klein.
    The new policy is roughly as follows:
    - find the highest-priority project P that is allowed
        to fetch work for a resource below buf_min
    - Ask P for work for all resources R below buf_max
        for which it's allowed to fetch work,
        unless there's a higher-priority project allowed
        to request work for R.
    If we're going to do an RPC to P for reasons other than work fetch,
    the policy is:
    - for each resource R for which P is the highest-priority project
        allowed to fetch work, and R is below buf_max,
        request work for R.
2013-04-02 12:32:28 -07:00
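A minimal sketch of the work-fetch policy outlined in the commit above, assuming a simplified model in which each project has a numeric priority and a per-resource "allowed to fetch" flag. The `Project` struct, the `choose_project` function and the use of buffered seconds per resource are hypothetical; the real client tracks far more state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified project record.
struct Project {
    double priority;              // larger value = higher priority (assumption)
    std::vector<bool> can_fetch;  // can_fetch[r]: allowed to fetch work for resource r
};

// buffered[r] is the amount of work buffered for resource r.
// Returns the index of the chosen project P, or -1 if none qualifies,
// and sets request[r] for each resource R to ask P about.
int choose_project(
    const std::vector<Project>& projects, const std::vector<double>& buffered,
    double buf_min, double buf_max, std::vector<bool>& request
) {
    const std::size_t nr = buffered.size();
    request.assign(nr, false);
    int best = -1;
    // Step 1: highest-priority project allowed to fetch for a resource below buf_min.
    for (std::size_t p = 0; p < projects.size(); p++) {
        assert(projects[p].can_fetch.size() == nr);
        bool eligible = false;
        for (std::size_t r = 0; r < nr; r++) {
            if (buffered[r] < buf_min && projects[p].can_fetch[r]) eligible = true;
        }
        if (eligible && (best < 0 || projects[p].priority > projects[best].priority)) {
            best = static_cast<int>(p);
        }
    }
    if (best < 0) return -1;
    // Step 2: ask P for every resource below buf_max that it may fetch,
    // unless a higher-priority project is allowed to request work for it.
    for (std::size_t r = 0; r < nr; r++) {
        if (buffered[r] >= buf_max || !projects[best].can_fetch[r]) continue;
        bool outranked = false;
        for (std::size_t q = 0; q < projects.size(); q++) {
            if (projects[q].priority > projects[best].priority && projects[q].can_fetch[r]) {
                outranked = true;
            }
        }
        request[r] = !outranked;
    }
    return best;
}
```

The piggyback case in the commit message (an RPC to P for other reasons) corresponds to running only step 2 with P fixed.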
David Anderson b93e80c6f5 - client: code cleanup. Some variable/function/constant names
contained "debt" when they actually refer to REC.
    Change these names to use "rec".
2013-03-24 11:22:01 -07:00
David Anderson 128da198b6 - client: rename two different functions named backoff()
to make it easier to see what's going on.
- fix code formatting in manager
2013-03-22 10:43:05 +01:00
David Anderson 546ea233a0 - client: fix small work fetch bug that caused the client to
not add a piggyback work request when it should have.
2013-03-15 13:38:45 +01:00
David Anderson fc6b050883 - client: removed unused code for old work-fetch logic 2013-03-15 13:38:45 +01:00
David Anderson 2e23bfedaa - client, work fetch policy. Change policy for projects w/ GPU exclusions 2013-03-07 11:28:43 +01:00
David Anderson a63ebbc13e - client: change work fetch policy to work better with GPU exclusions
- scale amount of work request by
        (# non-excluded instances)/#instances
    - change policy:
        old: don't fetch work if #jobs > #non-excluded instances
        new: don't fetch work if # of instance-seconds used in RR sim
            > work_buf_min * (#non-excluded instances)/#instances
2013-03-07 11:28:42 +01:00
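The scaling and the new threshold from the commit above amount to simple arithmetic, sketched here under assumed names (`scaled_request`, `should_fetch`) and with the number of excluded instances passed in directly. It is not the client's actual code.

```cpp
#include <cassert>

// Scale a work request (in seconds) by (# non-excluded instances) / (# instances).
double scaled_request(double req_secs, int n_instances, int n_excluded) {
    assert(n_instances > 0 && n_excluded >= 0 && n_excluded <= n_instances);
    return req_secs * (n_instances - n_excluded) / n_instances;
}

// New policy: don't fetch if the instance-seconds used in the round-robin
// simulation exceed work_buf_min scaled by the same fraction.
bool should_fetch(
    double sim_instance_secs, double work_buf_min, int n_instances, int n_excluded
) {
    assert(n_instances > 0 && n_excluded >= 0 && n_excluded <= n_instances);
    double limit = work_buf_min * (n_instances - n_excluded) / n_instances;
    return sim_instance_secs <= limit;
}
```

For example, with 4 instances of which 1 is excluded, a 1000-second request becomes 750 seconds, and fetching stops once the simulation uses more than 750 instance-seconds against a 1000-second minimum buffer.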
David Anderson 7768f6da60 - client: fix bug where, when updating a project, we fail to request work even though higher-priority projects are marked as no-new-tasks or are otherwise ineligible for work fetch. 2013-03-04 14:09:43 +01:00
David Anderson 777f1f11e8 - client: change work fetch policy to avoid starving GPUs in situations where GPU exclusions are used.
- client: fix bug in round-robin simulation when GPU exclusions are used.
Note: this fixes a major problem (starvation)
    with project-level GPU exclusion.
    However, project-level GPU exclusion interferes with most of
    the client's scheduling policies.
    E.g., round-robin simulation doesn't take GPU exclusion into account,
    and the resulting completion estimates and device shortfalls
    can be wrong by an order of magnitude.

    The only way I can see to fix this would be to model each
    GPU instance as a separate resource,
    and to associate each job with a particular GPU instance.
    This would be a sweeping change in both client and server.
2013-03-01 15:31:41 +01:00
David Anderson 446bc4ca28 - client: take GPU exclusions into account when making
initial work request to a project
- client: put some casts to double in NVIDIA detect code.
    Shouldn't make any difference.
- volunteer storage: truncate file to right size after retrieval


svn path=/trunk/boinc/; revision=26051
2012-08-20 23:41:27 +00:00
David Anderson 4fea52c6f2 - client: if a project has excluded GPUs of a given type,
allow it to fetch work of that type if the # of runnable
    jobs is <= the # of non-excluded instances (rather than 0).


svn path=/trunk/boinc/; revision=26045
2012-08-18 23:26:10 +00:00
David Anderson ff1a391ced - client: when we're making a scheduler RPC
for a reason other than work fetch,
    and we're deciding whether to piggyback a work request,
    skip the checks for hysteresis (buffer < min)
    and for per-resource backoff time.
    These checks are there only to limit the rate of RPCs,
    which is not relevant since we're doing one anyway.

    This fixes a bug where a project w/ sporadic jobs specifies
    a next_rpc_delay to ensure regular polling from clients.
    When these polls occur they should request work regardless of backoff.


svn path=/trunk/boinc/; revision=26002
2012-08-10 18:29:00 +00:00
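The idea in the commit above is that hysteresis and per-resource backoff exist only to limit the RPC rate, so they can be skipped when an RPC is happening regardless. A rough sketch, with the hypothetical `FetchState` struct and `rate_checks_pass` function standing in for the client's real logic:

```cpp
#include <cassert>

// Hypothetical inputs for one resource of one project; names are illustrative.
struct FetchState {
    double buffered_secs;  // work currently buffered for the resource
    double buf_min_secs;   // minimum buffer (hysteresis threshold)
    double now;            // current time, in seconds
    double backoff_until;  // end of the per-resource backoff, in seconds
};

// Returns true if the rate-limiting checks allow a work request.
// piggyback: the RPC is being made anyway for a reason other than work fetch.
bool rate_checks_pass(const FetchState& s, bool piggyback) {
    assert(s.buf_min_secs >= 0);
    if (piggyback) return true;  // no extra RPC is caused, so skip both checks
    bool below_min = s.buffered_secs < s.buf_min_secs;
    bool backoff_over = s.now >= s.backoff_until;
    return below_min && backoff_over;
}
```

Other eligibility checks (for example, whether the project is allowed to fetch work at all) would still apply and are outside this sketch.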
David Anderson f6bd141b30 - client: further msg tweaks
svn path=/trunk/boinc/; revision=25830
2012-07-02 05:10:58 +00:00
David Anderson 1d717c6fcc - client: msg tweak
svn path=/trunk/boinc/; revision=25829
2012-07-02 04:45:19 +00:00
David Anderson 7dcf119854 - client: msg tweak
svn path=/trunk/boinc/; revision=25828
2012-07-02 04:06:11 +00:00
David Anderson 89578050f7 - When the client makes a scheduler RPC without requesting work,
and there's a simple reason
    (e.g. the project is suspended, no-new-tasks, downloads stalled, etc.)
    show it in the event log.
    If the reason is more complex, don't try to explain.


svn path=/trunk/boinc/; revision=25827
2012-07-02 03:43:05 +00:00
David Anderson 82d64e9403 - msg tweak and fix compile warnings
svn path=/trunk/boinc/; revision=25408
2012-03-12 23:34:41 +00:00
David Anderson 64a371173b - client: fix crashing bug when there is 1 instance of a resource.
I'm not sure how this ever worked.


svn path=/trunk/boinc/; revision=25362
2012-03-02 03:56:26 +00:00
David Anderson a6bf5aecf3 - client: tweak to work-fetch policy:
if we're making a scheduler RPC to a project for reasons
    other than work fetch,
    and we're deciding whether to ask for work, ignore hysteresis;
    i.e. ask for work even if we're above the min buffer
    (idea from John McLeod).


svn path=/trunk/boinc/; revision=25291
2012-02-18 23:19:06 +00:00
David Anderson 69834e0c01 - client: compile fix; remove redundant total_peak_flops()
svn path=/trunk/boinc/; revision=24738
2011-12-06 09:20:30 +00:00
David Anderson bc35060726 - client: when contacting a project for reasons other than
work fetch (e.g. to report completed jobs)
    only request work if it's the project we would have chosen
    if we were fetching work.
- client: the way in which project priorities were adjusted
    in work fetch to reflect currently queued work was wrong.
- client: fix bug in the way project priorities are adjusted
    in RR simulator
- client emulator: if there are results in the state file
    with states DOWNLOADING or UPLOADING,
    change them to DOWNLOADED or UPLOADED.
    Otherwise they're stuck.


svn path=/trunk/boinc/; revision=24737
2011-12-06 04:21:27 +00:00
David Anderson 0d37f69a6a - client emulator fixes
svn path=/trunk/boinc/; revision=24644
2011-11-22 07:47:45 +00:00
David Anderson 7b28215032 - client: reimplement the round-robin simulator to
reduce its runtime from O(N^2) to O(N),
    where N is the number of runnable jobs
    (which can be in the thousands).
    This will make the client emulator run a lot faster,
    and will reduce the client CPU overhead a bit.
- API: change boinc_get_opencl_ids() so that it returns
    a BOINC error code (< -100) if the app_init.xml is
    missing or bad (i.e. we're running standalone),
    and an OpenCL error code (> -100) if an OpenCL call failed.


svn path=/trunk/boinc/; revision=24469
2011-10-24 17:53:09 +00:00
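The return-code convention described for boinc_get_opencl_ids() in the commit above lets a caller tell the two error families apart by value. A minimal sketch of such a check, assuming that 0 means success; the `ErrorSource` enum and `classify_retval` function are hypothetical helpers, not part of the BOINC API, and the value -100 itself is left unclassified because the commit message does not cover it.

```cpp
// Hypothetical classification of a return value under the stated convention.
enum class ErrorSource { None, OpenCL, Boinc, Unspecified };

ErrorSource classify_retval(int retval) {
    if (retval == 0) return ErrorSource::None;      // assumption: 0 means success
    if (retval < -100) return ErrorSource::Boinc;   // app_init.xml missing or bad
    if (retval > -100) return ErrorSource::OpenCL;  // an OpenCL call failed
    return ErrorSource::Unspecified;                // exactly -100: not covered by the commit message
}
```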
David Anderson b95ac02c5b - client: change the way project priorities are computed,
so that they do what they're supposed to
    (i.e. enforce resource shares)
- client: change log flag <debt_debug> to <priority_debug>
- client simulator: update REC even with large delta-t.
- client simulator: handle "no new work" apps correctly


svn path=/trunk/boinc/; revision=24429
2011-10-19 06:37:03 +00:00
David Anderson 7f2a3c0ce1 - client: get GPU available RAM at startup (only)
- client: fix compile warning


svn path=/trunk/boinc/; revision=24188
2011-09-13 22:58:39 +00:00
David Anderson 9856f795ed - client: remove code related to debt-based scheduling
svn path=/trunk/boinc/; revision=24163
2011-09-12 17:57:31 +00:00
David Anderson f81cb82b8e - client: make RR simulation more accurate
by simulating time-slicing explicitly.
    Also simulate changes in project REC
    and hence in scheduling priority.
- client: add a log flag "rrsim_detail" that prints
    time-slice-level info.


svn path=/trunk/boinc/; revision=24161
2011-09-12 17:01:54 +00:00
David Anderson cceea7f6d4 - client: rename MODE to RUN_MODE, and rename vars accordingly
svn path=/trunk/boinc/; revision=23974
2011-08-09 20:41:15 +00:00
David Anderson 5b159c6735 - remote job submission: bug fix and tweaks
- client: cc_config.xml: if <devnum> is omitted from a <exclude_gpu>,
    it means exclude all instances of that GPU type
- client: if all instances of a GPU type are excluded for a project,
    don't ask the project for jobs of that type


svn path=/trunk/boinc/; revision=23898
2011-07-29 00:07:20 +00:00
David Anderson 8ca24cbbab - client, work fetch policy:
adjust project REC by the amount of work queued, to increase variety
    NOTE: at some point I think I had a reason to not do this,
    but I can't remember what it is.
- client, job scheduling policy: fix how project REC is adjusted


svn path=/trunk/boinc/; revision=23838
2011-07-13 19:46:03 +00:00
David Anderson c0417a8aaa - client: fix scheduler bug that treated all CPU jobs
as non-high-priority
	- client: don't print spurious "domino prevention"
		and "thrashing prevention" msgs
	- manager: show project descriptions in same size font
		as the rest of the dialog

svn path=/trunk/boinc/; revision=23831
2011-07-11 05:34:09 +00:00
David Anderson 3b906a191c - client: generalize the GPU framework so that
		- new GPU types can be added easily
		- users can specify GPUs in cc_config.xml,
			referred to by app_info.xml,
			and they will be scheduled by BOINC
			and passed --device N options
			Note: the parsing of cc_config.xml is not done yet.
		- RPC protocols (account manager and scheduler)
			can now specify GPU types in separate elements
			rather than embedding them in tag names
			e.g. <no_rsc>NVIDIA</no_rsc> rather than <no_cuda/>
	- client: in account manager replies, parse elements of the form
		<no_rsc>NAME</no_rsc>
		indicating the GPUs of type NAME should not be used.
		This allows account managers to control GPU types
		not hardwired into the client.
		Note: <no_cuda/> and <no_ati/> will continue to be supported.
	- scheduler RPC reply: add
		<no_rsc_apps>NAME</no_rsc_apps>
		(NAME = GPU name)
		to indicate that the project has no jobs for the indicated GPU type.
		<no_cuda_apps> etc. are still supported 
	- client/lib: remove set_debts() GUI RPC
	- client/scheduler RPC
		remove <cuda_backoff> etc. (superseded by no_app)
		Exception: <ip_result> elements in sched request
		still have <ncudas> and <natis>.
		Fix this later.

	Implementation notes:
	- client/lib: change "CUDA" to "NVIDIA" in type/variable names, and in XML
		Continue to recognize "CUDA" for compatibility
	- host_info.coprocs no longer used within the client;
		use a global var (COPROCS coprocs) instead.
		COPROCS now has an array of COPROCs;
		GPU types are identified by the array index.
		Index zero means CPU.
	- a bunch of other resource-specific structs (like RSC_WORK_FETCH)
		are now stored in arrays, with same indices as COPROCS
		(i.e. index 0 is CPU)
	- COPROCS still has COPROC_NVIDIA and COPROC_ATI structs to hold vendor-specific info
	- APP_VERSION now has a struct GPU_USAGE to describe its GPU usage

svn path=/trunk/boinc/; revision=23253
2011-03-25 03:44:09 +00:00
David Anderson 9e2abe135e - simulator work
svn path=/trunk/boinc/; revision=22927
2011-01-19 00:32:49 +00:00
David Anderson 717c45a2db - client: use std::deque instead of std::vector
for RR sim's pending-job lists.
    Erasing head of vector is slow.
- lib: allow GPU peak FLOPS to be specified in XML (for simulator)
- simulator work
- client: old work fetch policy: projects may need enough jobs
    for all device instances, not just resource_share*ninst.
    E.g. a project that has only CPU jobs in a CPU/GPU client
- client: with REC scheduling, don't ask for work for
    secondary resources if project has negative priority.
- client: in RR sim, make sure we saturate devices if possible.
    Otherwise we may report a shortfall incorrectly


svn path=/trunk/boinc/; revision=22894
2011-01-12 00:47:51 +00:00
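The first item in the commit above rests on a standard-library property: removing the first element of a std::vector shifts every remaining element, while std::deque::pop_front takes constant time. A small sketch of the two access patterns, with hypothetical function names and integer job IDs standing in for the simulator's pending-job entries:

```cpp
#include <deque>
#include <vector>

// Drain a FIFO from the front with std::deque; each pop_front is O(1).
long drain_deque(std::deque<int> q) {
    long sum = 0;
    while (!q.empty()) {
        sum += q.front();
        q.pop_front();
    }
    return sum;
}

// Same traversal with std::vector; each erase(begin()) moves all remaining
// elements down by one, so draining N elements costs O(N^2) overall.
long drain_vector(std::vector<int> v) {
    long sum = 0;
    while (!v.empty()) {
        sum += v.front();
        v.erase(v.begin());
    }
    return sum;
}
```

Both functions compute the same result; only the cost of removing the head differs.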
David Anderson eeab2aee92 - simulator work
- fix some indentation

svn path=/trunk/boinc/; revision=22891
2011-01-07 20:23:22 +00:00
David Anderson c5462e4917 - client: more hysteresis work fetch policy stuff
- client simulator work

svn path=/trunk/boinc/; revision=22858
2010-12-30 22:41:50 +00:00
David Anderson 7aeef3070a - client: enabled REC-based scheduling with a cmdline option
rather than a compile flag

svn path=/trunk/boinc/; revision=22855
2010-12-25 19:05:57 +00:00
David Anderson f3169fb77a - client: initial, partial checkin for hysteresis work-fetch
svn path=/trunk/boinc/; revision=22853
2010-12-23 23:39:30 +00:00
David Anderson a129c0d8cd - client: do exponential backoff (from 10 min to 24 hours)
on account manager RPC failures,
    rather than always waiting 24 hours

svn path=/trunk/boinc/; revision=22747
2010-11-25 04:35:50 +00:00
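A minimal sketch of exponential backoff between the 10-minute and 24-hour bounds given in the commit above. The doubling factor and the function name `am_retry_delay` are assumptions for illustration; the commit message does not say how the delay grows or whether it is randomized.

```cpp
#include <algorithm>
#include <cassert>

// Delay in seconds before the next account manager RPC attempt,
// after n_failures consecutive failures (n_failures >= 1).
double am_retry_delay(int n_failures) {
    const double min_delay = 600.0;    // 10 minutes
    const double max_delay = 86400.0;  // 24 hours
    assert(n_failures >= 1);
    double d = min_delay;
    // Assumed growth: double the delay after each additional failure.
    for (int i = 1; i < n_failures && d < max_delay; i++) {
        d *= 2;
    }
    return std::min(d, max_delay);
}
```

With these assumptions the delay is 600 seconds after the first failure, 1200 after the second, and reaches the 24-hour cap at the ninth.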
David Anderson b39615d461 - client: work fetch fix: try to maintain GPU work from all projects,
since we now do round-robin for GPUs as well as CPU.
    NOTE: this bug was found using the client simulator!
- client simulator: generate REC graph

svn path=/trunk/boinc/; revision=22746
2010-11-24 20:51:25 +00:00
David Anderson 6478b3e05d - client: implement more scheduler changes that use
recent estimated credit (REC) instead of debt.
    These changes are enabled by
        #define USE_REC
    in work_fetch.h.
    If this is commented out (the default) the client uses
    debt-based scheduling, same as before.
    TODO: work-fetch policy changes
- client simulator: various fixes:
    - compute idle and wasted fraction based on all processing resources,
        not just CPU
    - compute job completion times based on FLOPS, not CPU seconds
    - compute and use project->no_X_apps
    etc.


svn path=/trunk/boinc/; revision=22741
2010-11-23 19:39:47 +00:00
David Anderson ef472e3df7 - client simulator: model the scheduler's deadline check mechanism
- scheduler: improve the deadline check mechanism slightly.
    When updating "estimated delay" (a rough measure of how long
    a resource is saturated with high-priority work)
    take into account the # of instances used by the job,
    and the # of total instances


svn path=/trunk/boinc/; revision=22612
2010-11-01 16:53:41 +00:00
David Anderson 4edfe2ec28 - client: small initial checkin for new scheduling system.
Keep track of per-project recent estimated credit

svn path=/trunk/boinc/; revision=22608
2010-10-29 23:41:34 +00:00
David Anderson 1c4422985f - client: add <no_info_fetch> config option and --no_info_fetch
cmdline arg.
    Suppresses the fetch of project list and of current client version #.
    Use when running on grid nodes.
- debugging on client simulator.  Not done yet.

svn path=/trunk/boinc/; revision=22414
2010-09-27 20:34:47 +00:00