Commit Graph

151 Commits

Author SHA1 Message Date
David Anderson ec2c92665a - client: change the calculation of exponential backoff used for
1) individual file transfers
    2) project-level file transfer backoff
    3) scheduler operations
    Old: scale by e.
        Use random backoff in the range min..x
    New: scale by 2.
        Use random backoff in the range x/2..x
- client: for file transfers, use backoff range of 10 min .. 12 hrs
    rather than 1 min .. 4 hrs


svn path=/trunk/boinc/; revision=21887
2010-07-09 19:24:13 +00:00
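
The new backoff rule in revision 21887 (scale by 2, pick a random delay in x/2..x) might look roughly like the sketch below. This is an illustration only, not the client's code: the function name `backoff_delay`, the failure counter, and the way x is derived from the minimum are assumptions. Only the factor of 2, the x/2..x range, and the 10 min .. 12 hr bounds for file transfers come from the commit message.

```python
import random

MIN_DELAY = 10 * 60        # 10 minutes in seconds (file-transfer bound from the message)
MAX_DELAY = 12 * 60 * 60   # 12 hours in seconds (file-transfer bound from the message)

def backoff_delay(n_failures, min_delay=MIN_DELAY, max_delay=MAX_DELAY):
    """Return a random delay in [x/2, x], where x doubles per failure.

    Assumption: x starts at 2 * min_delay after the first failure, so the
    lower end of the range never drops below min_delay.
    """
    n = max(1, n_failures)
    x = min(max_delay, min_delay * 2 ** n)
    return random.uniform(x / 2, x)
```

Under these assumptions the first failure gives a delay between 10 and 20 minutes, and after enough failures the delay settles between 6 and 12 hours.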
David Anderson eca79028cc - client: don't consider a result "nearly runnable"
if one of its downloads is stalled.
    This fixes a situation that can cause processor or GPU
    idleness when download servers are down for a while


svn path=/trunk/boinc/; revision=21877
2010-07-06 20:53:27 +00:00
David Anderson cec8b2950a - client: add --fetch_minimal_work option (cmdline and config file)
If set, then:
        if there are any active jobs at startup, don't fetch more work
        otherwise make exactly 1 scheduler RPC requesting work,
        and request only enough jobs to fill all devices.
- client: --exit_when_idle: make it available in config file
    and change semantics to:
    If set: exit if
        1) there are no tasks, and
        2) either there was an active task on startup,
            or we made a scheduler RPC requesting work
    Note: if there are no active tasks on startup,
        and the client makes a work request which doesn't return work,
        it will exit.


svn path=/trunk/boinc/; revision=21680
2010-06-02 17:50:47 +00:00
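
The revised --exit_when_idle semantics in revision 21680 reduce to a small predicate. The sketch below is only a restatement of the two conditions in the message; the function and parameter names are hypothetical and do not come from the client source.

```python
def should_exit_when_idle(n_tasks, had_active_task_at_startup, made_work_request):
    """Exit if there are no tasks, and either a task was active at startup
    or a scheduler RPC requesting work has been made."""
    return n_tasks == 0 and (had_active_task_at_startup or made_work_request)
```

This also reproduces the case in the note: with no tasks at startup, a work request that returns nothing leaves n_tasks at 0 with made_work_request set, so the client exits.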
David Anderson 40eebe00af - client/scheduler: in COPROCS, instead of having a vector of
pointers to dynamically allocated COPROC-derived objects,
    just have the objects themselves.
    Dynamic allocation should be avoided at all costs.

svn path=/trunk/boinc/; revision=21564
2010-05-18 19:22:34 +00:00
Rom Walton 9cb3e6ffc7 - client & lib: bring header inclusion up-to-date for the CC to begin
hunting down a memory leak.
        
    client/
        <Various Files>
    lib/
        <Various Files>

svn path=/trunk/boinc/; revision=21457
2010-05-11 19:10:29 +00:00
David Anderson e53fdf854b - manager: if user clicks Retry in Transfer tab while network is suspended,
show an alert.
	- manager: in transfers tab, show it if transfers are suspended
		because network is suspended
	- manager: in tasks tab, if a task is downloading or uploading
		and network is suspended, show it

svn path=/trunk/boinc/; revision=21346
2010-05-01 05:28:59 +00:00
David Anderson 7db608660f - client: standardize debug messages.
Messages enabled by <foo_debug> are prefixed by "[foo]"


svn path=/trunk/boinc/; revision=21335
2010-04-29 20:32:51 +00:00
David Anderson b6ec320db9 - client: minor code cleanup
- manager: fix typo


svn path=/trunk/boinc/; revision=21327
2010-04-29 15:12:22 +00:00
David Anderson 678d880c64 - client: clean up logic related to GPU available memory.
If a driver call to get available mem fails, mark the GPU as unusable.


svn path=/trunk/boinc/; revision=21210
2010-04-19 18:35:10 +00:00
David Anderson 12869ae674 - client: give dynamic estimate (based on fraction done)
a greater weight in time-to-completion estimate


svn path=/trunk/boinc/; revision=21040
2010-04-01 03:32:14 +00:00
David Anderson e5ac873205 - boinccmd: add --set_gpu_mode command
- fix some compile warnings


svn path=/trunk/boinc/; revision=21002
2010-03-25 23:48:58 +00:00
David Anderson 4f77556c74 - client: if a GPU job is blocked on available mem,
don't fetch more jobs for that resource type

svn path=/trunk/boinc/; revision=20817
2010-03-10 06:00:37 +00:00
David Anderson 0a9f5d1433 - client: fix bug that interfered with work fetch
for particular resources in anonymous platform case.

svn path=/trunk/boinc/; revision=20755
2010-03-01 04:35:39 +00:00
David Anderson b7d48765a8 - client: if have coproc jobs but coproc is missing,
skip those jobs in RR sim.
    Otherwise we add stuff to uninitialized data structures,
    and a crash can result.
- client: initialize the above data structures anyway


svn path=/trunk/boinc/; revision=20753
2010-02-28 04:32:10 +00:00
David Anderson 6a3a03ee1a - client: don't accumulate LTD for projects w/ suspended jobs
svn path=/trunk/boinc/; revision=20630
2010-02-18 19:09:46 +00:00
David Anderson d78b5fb79a - client: if a project is anonymous platform and it has no
app versions that use a resource,
    don't request work from it for that resource.

svn path=/trunk/boinc/; revision=20549
2010-02-11 22:19:22 +00:00
David Anderson 1fd5b96d34 - client: undo [17160]. <ncpus>0</ncpus> in cc_config.xml
no longer means simulate zero CPUs.
		There are several places that divide by ncpus.
		Zero CPUs doesn't make any sense anyway.

svn path=/trunk/boinc/; revision=20474
2010-02-09 17:25:14 +00:00
David Anderson 09b92f0841 - user web: allow zero resource share
- client: allow zero resource share


svn path=/trunk/boinc/; revision=20315
2010-01-29 15:50:47 +00:00
David Anderson f716dcf7ae - client: if a project has zero resource share,
treat it as a "backup project":
    fetch work from it only if there is an idle instance
    and no other projects have work.


svn path=/trunk/boinc/; revision=20286
2010-01-28 05:21:14 +00:00
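
The "backup project" rule in revision 20286 can be sketched as a fetch-eligibility check. The names below are hypothetical, and the rule for projects with nonzero share is simplified to "always eligible", which the real work-fetch logic certainly refines.

```python
def may_fetch_work(resource_share, idle_instances, other_projects_have_work):
    """Zero-share projects are backups: fetch from them only if an instance
    is idle and no other project has work."""
    if resource_share > 0:
        return True  # simplification: normal projects are handled elsewhere
    return idle_instances > 0 and not other_projects_have_work
```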
David Anderson ee889ac9dd
svn path=/trunk/boinc/; revision=20284
2010-01-27 19:14:29 +00:00
David Anderson d1a3243f57 - client: fix small bug that could interfere with work fetch
on hosts with both NVIDIA and ATI GPUs

svn path=/trunk/boinc/; revision=20283
2010-01-27 19:06:40 +00:00
David Anderson b5124fe729 - client: brute-force attempt at eliminating domino-effect preemption:
if job A is unstarted and EDF,
    and there's a job B that is later in the list,
    is started, has the same app version,
    and has the same arrival time,
    move A after B.
- client: remove the "temp_dcf" mechanism,
    which had the same goal but didn't work.
- client: in computing overall debt for a project,
    subtract a term that reflects pending work.
    This should reduce repeated fetches from the same project.
- client simulator: tweaks

svn path=/trunk/boinc/; revision=20223
2010-01-21 00:14:56 +00:00
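
The reordering rule in the first item of revision 20223 can be illustrated as follows. This is a sketch under assumptions: the `Job` fields and function names are hypothetical, and when several jobs qualify as B the sketch picks the last one in the list, which the message does not specify.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    started: bool
    edf: bool            # scheduled in earliest-deadline-first mode
    app_version: str
    arrival_time: float

def _last_matching_started(jobs, i):
    """Index of the last job after position i that is started and shares
    app version and arrival time with jobs[i], or None."""
    a = jobs[i]
    for j in range(len(jobs) - 1, i, -1):
        b = jobs[j]
        if (b.started and b.app_version == a.app_version
                and b.arrival_time == a.arrival_time):
            return j
    return None

def reorder_to_avoid_domino_preemption(jobs):
    """Move each unstarted EDF job A after a later, started job B with the
    same app version and arrival time. Returns a new list."""
    jobs = list(jobs)
    i = 0
    while i < len(jobs):
        a = jobs[i]
        j = _last_matching_started(jobs, i) if (a.edf and not a.started) else None
        if j is None:
            i += 1
        else:
            moved = jobs.pop(i)     # B shifts from index j to j - 1
            jobs.insert(j, moved)   # so index j is directly after B
    return jobs
```

Started jobs never move, so each unstarted job is moved at most once and the loop terminates.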
David Anderson eeffc6de96 - web: translation fix from Nicolas:
"There is a bug in tra() that causes problems if one of the arguments
    contains a replacement marker itself. For example, if the first
    argument contains an encoded URL, which contains '%2', the second
    argument may appear in the middle of the URL."
- client simulator: further fiddling around.  Not done.

svn path=/trunk/boinc/; revision=20201
2010-01-19 23:01:09 +00:00
David Anderson e7dcff182f - web DB code: fix PHP warning when enumeration returns nothing.
From Nicolas. fixes #974
- client: tiny code shuffle

svn path=/trunk/boinc/; revision=20178
2010-01-15 23:08:55 +00:00
David Anderson ee343cea02 - client: small tweak to work fetch:
if project has crazy DCF, don't automatically request 1 sec;
    only request work if there's a shortfall.
- intermediate checkin for notices stuff

svn path=/trunk/boinc/; revision=20145
2010-01-12 21:53:40 +00:00
David Anderson d6b6f8d5db - client (Mac): append /usr/local/cuda/lib to LD_LIBRARY_PATH
and DYLD_LIBRARY_PATH
- client simulator: compile fixes

svn path=/trunk/boinc/; revision=20117
2010-01-09 16:41:17 +00:00
David Anderson d800ae43eb - client: work fetch fix: avoid sending null request in certain cases.
- client: fix crash in notices code

svn path=/trunk/boinc/; revision=20101
2010-01-07 21:00:42 +00:00
David Anderson 21ff6cadea - client: bug in ACTIVE_TASK::est_dur()
svn path=/trunk/boinc/; revision=20084
2010-01-06 17:08:07 +00:00
David Anderson 37aae854f3 - client: scheduling problem:
- a project overestimates job FLOP counts
    - the client starts jobs in EDF mode
    - as job progresses and fraction done increases,
        its completion time estimate decreases until
        it's no longer a deadline miss.
    - job gets preempted by other job from that project;
        you end up with lots of partly completed jobs.
    Solution (I hope): if an app version has running jobs,
        compute a "temp DCF" for the app version,
        which is the min of dynamic/static estimates for its jobs.
        Apply this scaling factor to completion time estimates
        for unstarted jobs in RR simulation
- client: the estimation of remaining time of running jobs was wrong
    (how did this bug survive so long?)

svn path=/trunk/boinc/; revision=20077
2010-01-06 06:01:23 +00:00
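
The "temp DCF" idea in revision 20077 can be sketched as below. It reads "the min of dynamic/static estimates" as the minimum ratio of dynamic to static estimate over an app version's running jobs, which is an interpretation; the function names and the fallback of 1.0 when nothing is running are assumptions.

```python
def temp_dcf(running_jobs):
    """running_jobs: list of (dynamic_estimate, static_estimate) pairs, in
    seconds, for the running jobs of one app version."""
    ratios = [dyn / stat for dyn, stat in running_jobs if stat > 0]
    return min(ratios) if ratios else 1.0  # assumed: no scaling without data

def scaled_estimate(static_estimate, running_jobs):
    """Completion-time estimate for an unstarted job in the RR simulation."""
    return static_estimate * temp_dcf(running_jobs)
```

For example, if running jobs show dynamic estimates at half of their static estimates, an unstarted job with a 200-second static estimate is simulated as taking 100 seconds.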
David Anderson bf65c8ab30 - client: fix format strings for ninstances (can be fraction now)
svn path=/trunk/boinc/; revision=20075
2010-01-05 16:36:42 +00:00
David Anderson b5e47dd29b
svn path=/trunk/boinc/; revision=20040
2009-12-25 15:41:14 +00:00
David Anderson 876522c6aa - client: add logic to work fetch so that each project
will have enough jobs to use its share of resource instances.
    This avoids situations where e.g. on a 2-CPU system
    a project has 75% resource share and 1 CPU job,
    and its STD increases without bound.
    
    Did a general cleanup of the logic for computing
    work request sizes (seconds and instances).

svn path=/trunk/boinc/; revision=20036
2009-12-24 20:40:27 +00:00
David Anderson c145a143b1 - client: divide LTD deltas by ninstances, same as for STD.
This is cosmetic - it won't affect work fetch,
    but it will prevent LTD from changing faster than real time

svn path=/trunk/boinc/; revision=20034
2009-12-24 17:09:27 +00:00
David Anderson 19a69b5725 - client: maintain mean STD at zero over all projects,
not just runnable ones


svn path=/trunk/boinc/; revision=19878
2009-12-13 00:02:56 +00:00
David Anderson 1f78855ae1 - client: a couple of switch statements were missing breaks.
This would have caused work-fetch errors if
    using the no_cuda, no_cpu etc. prefs

svn path=/trunk/boinc/; revision=19867
2009-12-12 04:28:18 +00:00
David Anderson a151ad6cb3 - client/scheduler: deal with situation where GPU has enough
RAM to run job, but when we actually run the job
    not enough GPU RAM is free, so the application fails.
    This can cause a large number of jobs to fail.
    Solution:
    - app_plan() can specify the GPU RAM requirements of an app version.
        This is passed to the client in a new field
        <gpu_ram> of the <app_version> element.
    - prior to starting or restarting a GPU app, the client
        checks the amount of free RAM on the particular GPU.
        If it's not enough for the app version,
        the client doesn't start it,
        and arranges for the scheduler to ignore it for 5 minutes
        (by which point there might be more free GPU RAM)
    Notes:
    1) this change will take effect only when
        both client and scheduler are updated.
    2) the check is done in enforce_schedule(),
        rather than schedule_cpus(),
        because only at that point
        have we assigned a specific GPU to the job.
    3) there's another case to deal with:
        a GPU app's malloc of GPU RAM fails in the middle of the job.
        Currently the job fails.
        I plan to add an API call boinc_temporary_exit(x) so
        that the job can exit and potentially restart in x seconds.
        (In principle this mechanism is sufficient for all cases,
        but it could lead to a lot of starting/exiting,
        so the current change is worthwhile).

svn path=/trunk/boinc/; revision=19864
2009-12-11 22:45:59 +00:00
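
The client-side check described in revision 19864 could be sketched like this. The class, field, and function names are hypothetical; only the comparison of free GPU RAM against the app version's requirement and the 5-minute deferral come from the message.

```python
from dataclasses import dataclass

RETRY_DELAY = 5 * 60  # seconds; the message says the job is ignored for 5 minutes

@dataclass
class GpuJob:
    gpu_ram: float             # bytes required, from <gpu_ram> in <app_version>
    ignore_until: float = 0.0  # time before which the job is not considered

def can_start(job, free_gpu_ram, now):
    """Return True if the job may start on the GPU it was assigned to."""
    if now < job.ignore_until:
        return False
    if free_gpu_ram < job.gpu_ram:
        job.ignore_until = now + RETRY_DELAY  # retry once more RAM may be free
        return False
    return True
```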
David Anderson ea54aa7759 - client: STD for a device with N instances
can increase or decrease at N times real time.
    My checkin of 7 Dec reflects this by changing
    the STD limits to +- N*MAX_STD.
    This looks like a bug to users.
    Instead, scale the rate of STD change by 1/N,
    and keep the old limits of +- MAX_STD

svn path=/trunk/boinc/; revision=19851
2009-12-10 17:07:45 +00:00
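
The change in revision 19851 amounts to dividing each STD increment by the number of instances and clamping to the original limits. A minimal sketch, with a hypothetical function name and an assumed MAX_STD of one day (suggested by the "+- 1 day" remark in the 3 Dec message, not confirmed by this one):

```python
MAX_STD = 86400.0  # assumed: one day, in seconds

def update_std(std, delta, n_instances, max_std=MAX_STD):
    """Apply a debt change scaled by 1/N and clamp to +- max_std."""
    std += delta / n_instances
    return max(-max_std, min(max_std, std))
```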
David Anderson e9a4debf9c - client: scheduling tweak.
Old: if a project has RR sim deadline misses,
			select jobs to run high-priority on the basis of:
			1) deadline (earliest first)
			2) estimated time to completion (least first)
			This ignores whether jobs missed their deadline in RR sim,
			so it may choose to run a job that's actually in no
			danger of missing its deadline over one that is.
		New: choose only jobs that miss their deadline in RR sim

svn path=/trunk/boinc/; revision=19826
2009-12-08 20:39:46 +00:00
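
The new selection rule in revision 19826 can be sketched as a filter followed by the old ordering. The dictionary keys and function name are hypothetical, and keeping the old sort order (deadline, then estimated time to completion) among the filtered jobs is an assumption.

```python
def high_priority_candidates(jobs):
    """Keep only jobs that missed their deadline in the RR simulation,
    ordered by deadline, then by estimated time to completion."""
    missed = [j for j in jobs if j["rr_sim_misses_deadline"]]
    return sorted(missed, key=lambda j: (j["deadline"], j["est_time_to_completion"]))
```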
David Anderson b41ea18233 - client: don't set STDs for non-runnable projects to zero.
Let them float around with other projects.
    Fixes problem where, when a project finishes its last job
    and has a negative STD, it gets an unfair increment
    by being set to zero.

svn path=/trunk/boinc/; revision=19804
2009-12-07 18:58:37 +00:00
David Anderson b70229c093 - code shuffle: move client-specific GPU code to a separate file
svn path=/trunk/boinc/; revision=19794
2009-12-07 00:42:03 +00:00
David Anderson e9fc909f3c - client: scale STD limit by # instances
svn path=/trunk/boinc/; revision=19784
2009-12-04 21:46:43 +00:00
David Anderson 2ef5c5895b - client: fix bug in debt calculation
- client: <zero_debts> zeroes STD too

svn path=/trunk/boinc/; revision=19783
2009-12-04 21:21:18 +00:00
David Anderson 09759104f4 compile fixes, message tweaks
svn path=/trunk/boinc/; revision=19778
2009-12-03 23:27:32 +00:00
David Anderson 2d4ceb618a - client: my STD-related checkin of Dec 1 was bad.
It computed an "overall STD" as the sum of CPU and coprocs,
    weighted by the coproc's speed, as we do for LTD.
    This was the wrong idea; in the presence of GPUs,
    STDs quickly get pushed to +- 1 day and are truncated there.

    New scheme: STD is maintained per (resource type, project).
    This fixes the above problem,
    and it opens the door to round-robin scheduling of GPUs.
- client: the calculation of "anticipated debt" was scaling
    by relative resource share.
    This wasn't correct, seems to me.
- client: rename "debt" to "long_term_debt" in a few places
    (but not in the client state file, for compatibility)

svn path=/trunk/boinc/; revision=19777
2009-12-03 23:09:25 +00:00
David Anderson fb4797adfd - client: Add offset to LTD of non-eligible projects
only if the offset is positive.
- client: some cmdline args set members of config.
    However, config was being cleared after cmdline args were parsed,
    so these args had no effect.
    Instead, clear config before parsing cmdline

svn path=/trunk/boinc/; revision=19776
2009-12-03 19:09:45 +00:00
David Anderson 59328aaccb - client: change how short term debt is updated.
Old: it's based entirely on CPU time.
        So a GPU project, whose app uses only a fraction
        of a CPU, accrues positive debt.
        This is OK if the project has only GPU apps,
        since STD is not (currently) used for GPU scheduling.
        But some projects have both CPU and GPU apps.
    New: STD is based on total processing.
        It has terms for each resource type.
        The notion of "runnable resource share" is specific to a type.
    Note: the notion of "resource share fraction" appears in
        a couple of other places:
        - it's passed to apps in app_init_data.xml
        - it's passed in scheduler requests.
        It should be broken down by resource type in these cases too.
        Note to self: do this later.

svn path=/trunk/boinc/; revision=19762
2009-12-02 03:41:52 +00:00
David Anderson 67ad130477
svn path=/trunk/boinc/; revision=19761
2009-12-01 20:36:11 +00:00
David Anderson 5ac92cdc01 - client: apply the LTD normalizing offset to all projects,
even non-debt-eligible ones.

svn path=/trunk/boinc/; revision=19758
2009-12-01 20:01:13 +00:00
David Anderson 6fc27ffc44 - client: use [wfd] consistently
svn path=/trunk/boinc/; revision=19725
2009-11-27 21:21:39 +00:00
David Anderson 56efa3ec27 - client: if a project has a no_{cpu,cuda,ati} pref set,
don't accumulate debt for that resource.
    Otherwise we'll accumulate debt forever,
    pushing other projects into overworked state.

svn path=/trunk/boinc/; revision=19547
2009-11-12 17:19:50 +00:00