Commit Graph

69 Commits

Author SHA1 Message Date
David Anderson 7f2a3c0ce1 - client: get GPU available RAM at startup (only)
- client: fix compile warning


svn path=/trunk/boinc/; revision=24188
2011-09-13 22:58:39 +00:00
David Anderson 9856f795ed - client: remove code related to debt-based scheduling
svn path=/trunk/boinc/; revision=24163
2011-09-12 17:57:31 +00:00
David Anderson f81cb82b8e - client: make RR simulation more accurate
by simulating time-slicing explicitly.
    Also simulate changes in project REC
    and hence in scheduling priority.
- client: add a log flag "rrsim_detail" that prints
    time-slice-level info.


svn path=/trunk/boinc/; revision=24161
2011-09-12 17:01:54 +00:00
David Anderson cceea7f6d4 - client: rename MODE to RUN_MODE, and rename vars accordingly
svn path=/trunk/boinc/; revision=23974
2011-08-09 20:41:15 +00:00
David Anderson 5b159c6735 - remote job submission: bug fix and tweaks
- client: cc_config.xml: if <devnum> is omitted from a <exclude_gpu>,
    it means exclude all instances of that GPU type
- client: if all instances of a GPU type are excluded for a project,
    don't ask the project for jobs of that type


svn path=/trunk/boinc/; revision=23898
2011-07-29 00:07:20 +00:00
David Anderson 8ca24cbbab - client, work fetch policy:
adjust project REC by the amount of work queued, to increase variety
    NOTE: at some point I think I had a reason to not do this,
    but I can't remember what it is.
- client, job scheduling policy: fix how project REC is adjusted


svn path=/trunk/boinc/; revision=23838
2011-07-13 19:46:03 +00:00
David Anderson c0417a8aaa - client: fix scheduler bug that treated all CPU jobs
as non-high-priority
	- client: don't print spurious "domino prevention"
		and "thrashing prevention" msgs
	- manager: show project descriptions in same size font
		as the rest of the dialog

svn path=/trunk/boinc/; revision=23831
2011-07-11 05:34:09 +00:00
David Anderson 3b906a191c - client: generalize the GPU framework so that
		- new GPU types can be added easily
		- users can specify GPUs in cc_config.xml,
			referred to by app_info.xml,
			and they will be scheduled by BOINC
			and passed --device N options
			Note: the parsing of cc_config.xml is not done yet.
		- RPC protocols (account manager and scheduler)
			can now specify GPU types in separate elements
			rather than embedding them in tag names
			e.g. <no_rsc>NVIDIA</no_rsc> rather than <no_cuda/>
	- client: in account manager replies, parse elements of the form
		<no_rsc>NAME</no_rsc>
		indicating the GPUs of type NAME should not be used.
		This allows account managers to control GPU types
		not hardwired into the client.
		Note: <no_cuda/> and <no_ati/> will continue to be supported.
	- scheduler RPC reply: add
		<no_rsc_apps>NAME</no_rsc_apps>
		(NAME = GPU name)
		to indicate that the project has no jobs for the indicated GPU type.
		<no_cuda_apps> etc. are still supported 
	- client/lib: remove set_debts() GUI RPC
	- client/scheduler RPC
		remove <cuda_backoff> etc. (superseded by no_app)
		Exception: <ip_result> elements in sched request
		still have <ncudas> and <natis>.
		Fix this later.

	Implementation notes:
	- client/lib: change "CUDA" to "NVIDIA" in type/variable names, and in XML
		Continue to recognize "CUDA" for compatibility
	- host_info.coprocs no longer used within the client;
		use a global var (COPROCS coprocs) instead.
		COPROCS now has an array of COPROCs;
		GPU types are identified by the array index.
		Index zero means CPU.
	- a bunch of other resource-specific structs (like RSC_WORK_FETCH)
		are now stored in arrays, with same indices as COPROCS
		(i.e. index 0 is CPU)
	- COPROCS still has COPROC_NVIDIA and COPROC_ATI structs to hold vendor-specific info
	- APP_VERSION now has a struct GPU_USAGE to describe its GPU usage

svn path=/trunk/boinc/; revision=23253
2011-03-25 03:44:09 +00:00
David Anderson 9e2abe135e - simulator work
svn path=/trunk/boinc/; revision=22927
2011-01-19 00:32:49 +00:00
David Anderson 717c45a2db - client: use std::deque instead of std::vector
for RR sim's pending-job lists.
    Erasing head of vector is slow.
- lib: allow GPU peak FLOPS to be specified in XML (for simulator)
- simulator work
- client: old work fetch policy: projects may need enough jobs
    for all device instances, not just resource_share*ninst.
    E.g. a project that has only CPU jobs in a CPU/GPU client
- client: with REC scheduling, don't ask for work for
    secondary resources if project has negative priority.
- client: in RR sim, make sure we saturate devices if possible.
    Otherwise we may report a shortfall incorrectly


svn path=/trunk/boinc/; revision=22894
2011-01-12 00:47:51 +00:00
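
A minimal sketch of the container point in the checkin above (717c45a2db), assuming a pending-job list that is consumed from the front. PendingJob, drain_deque and drain_vector are hypothetical names for illustration, not BOINC code.

```cpp
#include <deque>
#include <vector>

// Hypothetical stand-in for a pending job; not a BOINC type.
struct PendingJob {
    int id;
};

// Drain a pending list from the front. std::deque::pop_front is O(1).
inline std::vector<int> drain_deque(std::deque<PendingJob> pending) {
    std::vector<int> order;
    while (!pending.empty()) {
        order.push_back(pending.front().id);
        pending.pop_front();
    }
    return order;
}

// Same loop on a std::vector: erase(begin()) shifts every remaining
// element, so draining n jobs costs O(n^2) element moves.
inline std::vector<int> drain_vector(std::vector<PendingJob> pending) {
    std::vector<int> order;
    while (!pending.empty()) {
        order.push_back(pending.front().id);
        pending.erase(pending.begin());
    }
    return order;
}
```

Both functions return the same order; only the cost of removing the head differs.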
David Anderson eeab2aee92 - simulator work
- fix some indentation

svn path=/trunk/boinc/; revision=22891
2011-01-07 20:23:22 +00:00
David Anderson c5462e4917 - client: more hysteresis work fetch policy stuff
- client simulator work

svn path=/trunk/boinc/; revision=22858
2010-12-30 22:41:50 +00:00
David Anderson 7aeef3070a - client: enabled REC-based scheduling with a cmdline option
rather than a compile flag

svn path=/trunk/boinc/; revision=22855
2010-12-25 19:05:57 +00:00
David Anderson f3169fb77a - client: initial, partial checkin for hysteresis work-fetch
svn path=/trunk/boinc/; revision=22853
2010-12-23 23:39:30 +00:00
David Anderson a129c0d8cd - client: do exponential backoff (from 10 min to 24 hours)
on account manager RPC failures,
    rather than always waiting 24 hours

svn path=/trunk/boinc/; revision=22747
2010-11-25 04:35:50 +00:00
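
A rough sketch of the backoff described in the checkin above (a129c0d8cd). Only the 10-minute and 24-hour bounds come from the message; the doubling factor and the names am_retry_delay, AM_BACKOFF_MIN and AM_BACKOFF_MAX are assumptions.

```cpp
#include <algorithm>

// Bounds taken from the commit message: 10 minutes to 24 hours.
const double AM_BACKOFF_MIN = 600.0;      // seconds
const double AM_BACKOFF_MAX = 86400.0;    // seconds

// Delay before the next retry after `nfailures` consecutive failures.
// The doubling factor is an assumption; the message gives only the bounds.
inline double am_retry_delay(int nfailures) {
    double delay = AM_BACKOFF_MIN;
    for (int i = 1; i < nfailures && delay < AM_BACKOFF_MAX; i++) {
        delay *= 2;
    }
    return std::min(delay, AM_BACKOFF_MAX);
}
```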
David Anderson b39615d461 - client: work fetch fix: try to maintain GPU work for all projects,
since we now do round-robin for GPUs as well as CPU.
    NOTE: this bug was found using the client simulator!
- client simulator: generate REC graph

svn path=/trunk/boinc/; revision=22746
2010-11-24 20:51:25 +00:00
David Anderson 6478b3e05d - client: implement more scheduler changes that use
recent estimated credit (REC) instead of debt.
    These changes are enabled by
        #define USE_REC
    in work_fetch.h.
    If this is commented out (the default) the client uses
    debt-based scheduling, same as before.
    TODO: work-fetch policy changes
- client simulator: various fixes:
    - compute idle and wasted fraction based on all processing resources,
        not just CPU
    - compute job completion times based on FLOPS, not CPU seconds
    - compute and use project->no_X_apps
    etc.


svn path=/trunk/boinc/; revision=22741
2010-11-23 19:39:47 +00:00
David Anderson ef472e3df7 - client simulator: model the scheduler's deadline check mechanism
- scheduler: improve the deadline check mechanism slightly.
    When updating "estimated delay" (a rough measure of how long
    a resource is saturated with high-priority work)
    take into account the # of instances used by the job,
    and the # of total instances


svn path=/trunk/boinc/; revision=22612
2010-11-01 16:53:41 +00:00
David Anderson 4edfe2ec28 - client: small initial checkin for new scheduling system.
Keep track of per-project recent estimated credit

svn path=/trunk/boinc/; revision=22608
2010-10-29 23:41:34 +00:00
David Anderson 1c4422985f - client: add <no_info_fetch> config option and --no_info_fetch
cmdline arg.
    Suppresses the fetch of project list and of current client version #.
    Use when running on grid nodes.
- debugging on client simulator.  Not done yet.

svn path=/trunk/boinc/; revision=22414
2010-09-27 20:34:47 +00:00
David Anderson 31db3207e4 - client: fix bug that caused a wasted scheduler RPC
    Old: when a job finished, we cleared the backoffs for the
        resources it used.  The idea was to get more jobs
        immediately in the case where the client was at
        a jobs-in-progress limit.
    Problem: this resulted in an RPC immediately,
        typically before the output files were uploaded.
        So the client is still at the limit, and doesn't get jobs.
    New: clear the backoffs at the point when output files
        have been uploaded and the job is ready to report.
- client: change range in resource backoff from (0,x) to (.5, 1.5*x)


svn path=/trunk/boinc/; revision=22411
2010-09-24 21:24:02 +00:00
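
A small sketch of the backoff range change in the checkin above (31db3207e4). It assumes the new range is meant as (0.5*x, 1.5*x), where x is the nominal backoff; the function names are hypothetical. The random number is passed in so the functions stay deterministic.

```cpp
// `x` is the nominal backoff in seconds; `r` is a uniform random number
// in [0, 1), supplied by the caller.

// Old range: (0, x). A draw near zero gives almost no backoff.
inline double old_backoff(double x, double r) {
    return r * x;
}

// New range, read here as (0.5*x, 1.5*x) (an assumption about the
// commit's notation): never less than half the nominal backoff.
inline double new_backoff(double x, double r) {
    return (0.5 + r) * x;
}
```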
David Anderson 6faf48e3a8 - client: fix crashing bug on VC 2008/10;
don't memset(0,) structures containing vectors.

svn path=/trunk/boinc/; revision=21963
2010-07-15 21:43:51 +00:00
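
A sketch of the pattern behind the checkin above (6faf48e3a8): a structure that holds a std::vector is reset member by member instead of with memset. HostState and its fields are hypothetical, not an actual BOINC structure.

```cpp
#include <string>
#include <vector>

// Hypothetical structure; not an actual BOINC type.
struct HostState {
    int ncpus;
    double flops;
    std::vector<std::string> gpu_names;

    // Reset members one by one. memset(this, 0, sizeof(*this)) would
    // overwrite the vector's internal state, which is undefined behavior
    // for a type that is not trivially copyable.
    void clear() {
        ncpus = 0;
        flops = 0.0;
        gpu_names.clear();
    }
};
```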
David Anderson d78b5fb79a - client: if a project is anonymous platform and it has no
app versions that use a resource,
    don't request work from it for that resource.

svn path=/trunk/boinc/; revision=20549
2010-02-11 22:19:22 +00:00
David Anderson ee889ac9dd
svn path=/trunk/boinc/; revision=20284
2010-01-27 19:14:29 +00:00
David Anderson b5124fe729 - client: brute-force attempt at eliminating domino-effect preemption:
if job A is unstarted and EDF,
    and there's a job B that is later in the list,
    is started, has the same app version,
    and has the same arrival time,
    move A after B.
- client: remove the "temp_dcf" mechanism,
    which had the same goal but didn't work.
- client: in computing overall debt for a project,
    subtract a term that reflects pending work.
    This should reduce repeated fetches from the same project.
- client simulator: tweaks

svn path=/trunk/boinc/; revision=20223
2010-01-21 00:14:56 +00:00
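
The reordering rule in the checkin above (b5124fe729) could be sketched roughly as follows. The Job fields and the function name are hypothetical; where several jobs qualify as B, this sketch picks the last one, which is also an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical job record; field names are illustrative, not BOINC's.
struct Job {
    int id;
    bool started;
    bool edf;              // being scheduled earliest-deadline-first
    int app_version;
    double arrival_time;
};

// For each unstarted EDF job A, find the last later job B that is started
// and shares A's app version and arrival time; if one exists, move A to
// just after B.
inline void move_unstarted_after_started(std::vector<Job>& jobs) {
    std::size_t i = 0;
    while (i < jobs.size()) {
        const Job a = jobs[i];
        std::size_t target = i;
        if (!a.started && a.edf) {
            for (std::size_t j = i + 1; j < jobs.size(); j++) {
                const Job& b = jobs[j];
                if (b.started && b.app_version == a.app_version
                    && b.arrival_time == a.arrival_time) {
                    target = j;
                }
            }
        }
        if (target > i) {
            // Rotate [i, target] left by one: A lands at index `target`.
            std::rotate(jobs.begin() + i, jobs.begin() + i + 1,
                        jobs.begin() + target + 1);
            // A different job now sits at index i; re-examine it.
        } else {
            i++;
        }
    }
}
```

Each unstarted job is moved at most once, so the loop terminates.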
David Anderson 876522c6aa - client: add logic to work fetch so that each project
will have enough jobs to use its share of resource instances.
    This avoids situations where e.g. on a 2-CPU system
    a project has 75% resource share and 1 CPU job,
    and its STD increases without bound.
    
    Did a general cleanup of the logic for computing
    work request sizes (seconds and instances).

svn path=/trunk/boinc/; revision=20036
2009-12-24 20:40:27 +00:00
David Anderson 37ea627866 - Win compile fixes. Also, needed to provide a replacement
for strptime() on Win.  WTF?

svn path=/trunk/boinc/; revision=20003
2009-12-21 19:20:28 +00:00
David Anderson 2ef5c5895b - client: fix bug in debt calculation
- client: <zero_debts> zeroes STD too

svn path=/trunk/boinc/; revision=19783
2009-12-04 21:21:18 +00:00
David Anderson 2d4ceb618a - client: my STD-related checkin of Dec 1 was bad.
It computed an "overall STD" as the sum of CPU and coprocs,
    weighted by the coproc's speed, as we do for LTD.
    This was the wrong idea; in the presence of GPUs,
    STDs quickly get pushed to +- 1 day and are truncated there.

    New scheme: STD is maintained per (resource type, project).
    This fixes the above problem,
    and it opens the door to round-robin scheduling of GPUs.
- client: the calculation of "anticipated debt" was scaling
    by relative resource share.
    This wasn't correct, seems to me.
- client: rename "debt" to "long_term_debt" in a few places
    (but not in the client state file, for compatibility)

svn path=/trunk/boinc/; revision=19777
2009-12-03 23:09:25 +00:00
David Anderson 59328aaccb - client: change how short term debt is updated.
    Old: it's based entirely on CPU time.
        So a GPU project, whose app uses only a fraction
        of a CPU, accrues positive debt.
        This is OK if the project has only GPU apps,
        since STD is not (currently) used for GPU scheduling.
        But some projects have both CPU and GPU apps.
    New: STD is based on total processing.
        It has terms for each resource type.
        The notion of "runnable resource share" is specific to a type.
    Note: the notion of "resource share fraction" appears in
        a couple of other places:
        - it's passed to apps in app_init_data.xml
        - it's passed in scheduler requests.
        It should be broken down by resource type in these cases too.
        Note to self: do this later.

svn path=/trunk/boinc/; revision=19762
2009-12-02 03:41:52 +00:00
David Anderson e86584f6cc - client: the weight of GPU debt in computing total debt should be
(estimated throughput of all GPUs)/(estimated throughput of all CPUs)
    rather than the ratio of 1 GPU to 1 CPU.
    This change will hopefully cause ratios of granted credit
    to more closely match resource shares.

svn path=/trunk/boinc/; revision=19311
2009-10-16 02:48:55 +00:00
David Anderson a7b32b486e - client: fix crash with <ncpus>0</ncpus>
svn path=/trunk/boinc/; revision=19208
2009-09-29 02:12:35 +00:00
David Anderson 71c7e7a74b - client/scheduler/web: add per-project preferences for whether
to accept CPU, NVIDIA and ATI jobs.
    These prefs are shown only where relevant:
    e.g., only for processor types for which the project has app versions,
    and if it has versions for only one type, no pref is shown.

    These prefs affect both client and scheduler.
    The client won't ask for work for a device blocked by prefs,
    and the scheduler won't send it.

    This replaces earlier optional project-specific prefs for
    "no CPU jobs" and "no GPU jobs".
    (However, these prefs continue to be honored on the server side).

- client: if NVIDIA driver is unknown, say that rather than 0


svn path=/trunk/boinc/; revision=19194
2009-09-28 04:24:18 +00:00
David Anderson 41e3b06b23 - client and scheduler RPC: add optional <cpu_backoff>, <cuda_backoff>,
and <ati_backoff> elements to scheduler reply.
    These specify backoffs for the resource types,
    overriding the existing backoff mechanism.
    Projects can supply these if they don't have apps of a particular type
    and don't want to get periodic requests for them.

svn path=/trunk/boinc/; revision=19059
2009-09-16 17:34:19 +00:00
David Anderson 42e8d1137d
svn path=/trunk/boinc/; revision=19058
2009-09-16 16:54:42 +00:00
David Anderson 4c52989f59 - client: improve the estimation of "busy time" (see 17 July checkin).
If you have 2 CPUs and a 1-day job in EDF mode,
        the busy time should be zero, not .5 days.

        Add a class BUSY_TIME_ESTIMATOR that makes a somewhat better
        (though still fairly crude) estimate.

svn path=/trunk/boinc/; revision=19003
2009-09-03 20:31:04 +00:00
David Anderson 112cec62a5 - client: fix to [18945]; we only want to max the overall request
with a GPU request if project is anonymous platform
    AND it has an app for that GPU type
- client: report overall work request as well as per-resource-type requests

svn path=/trunk/boinc/; revision=18994
2009-09-02 21:36:25 +00:00
David Anderson 29c1751898 - client: if project is anonymous platform, set the overall work req
to the max of the requests for different resource types.
    Otherwise projects with old schedulers won't send us work.

svn path=/trunk/boinc/; revision=18945
2009-08-31 03:42:01 +00:00
David Anderson c3fe504e1d - client: add ATI support to job scheduling and work fetch
svn path=/trunk/boinc/; revision=18850
2009-08-17 16:50:40 +00:00
David Anderson e606170b14 - client: try to fix situations where the scheduler
runs GPU jobs in a seemingly random order,
        or preempts GPU jobs needlessly.
        The change has two parts:
        1) sort the "results" vector by received_time,
            so that the RR simulation processes GPU jobs FIFO.
        2) in the CPU scheduler (earliest_deadline_result())
            instead of choosing the earliest-deadline GPU job that
            misses its deadline,
            pick the earliest_deadline GPU from a project that
            has a deadline miss for that GPU type
            (this is what's done in the CPU case)
    - client: fix bug where if you have an exclusive app,
        then remove it from cc_config.xml and do "update config",
        it doesn't go away.
        Need to clear the list before parsing.

svn path=/trunk/boinc/; revision=18842
2009-08-14 16:54:45 +00:00
David Anderson 5753153909 - client: 2nd try on my last checkin.
We need to estimate 2 different delays for each resource type:
    1) "saturated time": the time the resource will be fully utilized
        (new name for the old "estimated delay").
        This is used to compute work requests.
    2) "busy time": the time a new job would have to wait
        to start using this resource.
        This is passed to the scheduler and used for a crude deadline check.
        Note: this is ill-defined; a single number doesn't suffice.
        But as a very rough estimate, I'll use the sum of
            (J.duration * J.ninstances)/ninstances
        over all jobs that miss their deadline under RR sim.

svn path=/trunk/boinc/; revision=18629
2009-07-17 18:29:10 +00:00
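
The "busy time" estimate in the checkin above (5753153909) can be written out as a short sketch. SimJob and busy_time are hypothetical names; the formula is the one quoted in the message.

```cpp
#include <vector>

// Hypothetical per-job result of the RR simulation.
struct SimJob {
    double duration;        // estimated remaining run time, seconds
    double ninstances;      // instances of the resource the job uses
    bool misses_deadline;   // projected to miss its deadline under RR sim
};

// Sum of (J.duration * J.ninstances) / ninstances over all jobs that
// miss their deadline under RR sim.
inline double busy_time(const std::vector<SimJob>& jobs, double ninstances) {
    double sum = 0.0;
    for (const SimJob& j : jobs) {
        if (j.misses_deadline) {
            sum += (j.duration * j.ninstances) / ninstances;
        }
    }
    return sum;
}
```

With a single 1-day deadline-missing job on a 2-instance resource this gives 0.5 days, the case that the 2009-09-03 checkin (4c52989f59) revisits.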
David Anderson 8a1c0816ed - client: change the way a resource's "estimated delay"
(passed to server for crude deadline check) is computed.
    Old: estimated delay is the interval for which the resource
        is fully used (i.e., all instances busy).
    Problem: this may cause unnecessary project starvation.
        example: 1 CPU machine, has a month-long CPDN job
        with a 1-year deadline (it's not in deadline trouble).
        Then the CPU estimated delay will be 1 month,
        and the client won't get any work from projects
        with deadlines shorter than 1 month.
    New: estimated delay is the latest time at which the
        resource is fully used and is being used by at least 1 job
        that is projected to miss its deadline under RR.

    Note: this isn't precise, but I don't think we can improve it
    much without getting a lot more complex.


svn path=/trunk/boinc/; revision=18607
2009-07-16 21:21:47 +00:00
David Anderson cf638ae3a6 - client: instead of scheduling coproc jobs EDF:
    - first schedule jobs projected to miss deadline in EDF order
    - then schedule remaining jobs in FIFO order
    This is intended to reduce the number of preemptions of coproc jobs,
    and hence (since they are always preempted by quit)
    to reduce the wasted time due to checkpoint gaps.
- client: the CPU scheduling policy made use of the number
    of deadline misses in various places.
    This should include only the deadline misses of CPU jobs.
    So move "deadlines_missed" from RR_SIM_STATUS and PROJECT
    to RSC_PROJECT_WORK_FETCH so that we have separate counts
    for CPU and coproc jobs, and use the count for CPU jobs.
- GUI RPC: removed the rr_sim_deadlines_missed field
    from project descriptor.
    This is no longer meaningful, and it didn't seem to be used anywhere.

svn path=/trunk/boinc/; revision=17785
2009-04-10 19:01:38 +00:00
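
A sketch of the two-phase coprocessor ordering described in the checkin above (cf638ae3a6). CoprocJob and order_coproc_jobs are hypothetical names, and using received time as the FIFO key is an assumption.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical coprocessor job record.
struct CoprocJob {
    int id;
    double deadline;
    double received_time;
    bool misses_deadline;   // projected by the RR simulation
};

inline std::vector<CoprocJob> order_coproc_jobs(std::vector<CoprocJob> jobs) {
    // Put jobs projected to miss their deadline first.
    auto mid = std::stable_partition(jobs.begin(), jobs.end(),
        [](const CoprocJob& j) { return j.misses_deadline; });
    // EDF order among the deadline-missing jobs.
    std::stable_sort(jobs.begin(), mid,
        [](const CoprocJob& a, const CoprocJob& b) {
            return a.deadline < b.deadline;
        });
    // FIFO order (by received time, an assumption) among the rest.
    std::stable_sort(mid, jobs.end(),
        [](const CoprocJob& a, const CoprocJob& b) {
            return a.received_time < b.received_time;
        });
    return jobs;
}
```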
David Anderson 837d3fc0a1 - get_project_config.php: include plan classes in platform list;
i.e., list both win/x86 and win/x86 + NVIDIA.
    This will allow the manager to show which projects can
    use the host's coprocessors,
    and also to gray out projects that require an absent coproc.
- fix compile warnings

svn path=/trunk/boinc/; revision=17735
2009-04-03 21:55:26 +00:00
David Anderson 7e256c0995 - client: work fetch: in RR sim, keep track of the number
of device instances used by jobs that miss deadline.
    Don't do "variety" work fetch if this is >= # of instances

svn path=/trunk/boinc/; revision=17631
2009-03-19 16:55:04 +00:00
David Anderson 88a4482894 - client: consider fetching work from overworked projects
if resource is saturated for < work_buf_min()
    (rather than saturated for 0).
    So now the only significance of "overworked" is that we
    won't ask overworked projects for work if resource is saturated
    more than work_buf_min() but less than work_buf_total()

svn path=/trunk/boinc/; revision=17620
2009-03-18 15:53:02 +00:00
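
One way to read the rule in the checkin above (88a4482894) is as a predicate on how long the resource is saturated. The function name and parameters are hypothetical; buf_min and buf_total stand in for work_buf_min() and work_buf_total(), and the "no request once the buffer is full" branch is an assumption.

```cpp
// All times in seconds. `saturated` is how long the resource stays fully
// busy with the work already queued.
inline bool may_request_work(bool overworked, double saturated,
                             double buf_min, double buf_total) {
    // Assumed: nothing is requested once the whole buffer is covered.
    if (saturated >= buf_total) return false;
    // Overworked projects are asked only while below the minimum buffer.
    if (overworked) return saturated < buf_min;
    return true;
}
```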
David Anderson 3709c1e9f4 - scheduler: include driver version in the CUDA description string
stored in the database
- web: display the above

svn path=/trunk/boinc/; revision=17341
2009-02-24 00:06:45 +00:00
David Anderson 125c90d1da - client: work-fetch bug fix: if we're fetching work for a starved
project, it must have no runnable jobs for ANY resource.
- client: work-fetch bug fix: when setting requests in the
    shortfall case, don't request anything if project is backed off
    or overworked for the resource.

svn path=/trunk/boinc/; revision=17338
2009-02-23 21:34:13 +00:00
David Anderson f257101d36 - client: fix work-fetch bug that caused infinite fetch;
cleanup/reorganization of work fetch logic

svn path=/trunk/boinc/; revision=17337
2009-02-23 20:35:52 +00:00
David Anderson 3b31a9d803 - client: remove the "debt repair" mechanism added earlier today.
There are situations where multiple projects can legitimately
    have large negative LTD on a uniprocessor.
    Instead...
- client: add <zero_debts> option to cc_config.xml

svn path=/trunk/boinc/; revision=17328
2009-02-20 22:16:03 +00:00