if resource is saturated for < work_buf_min()
(rather than saturated for 0).
So now the only significance of "overworked" is that we
won't ask overworked projects for work if the resource is saturated
for more than work_buf_min() but less than work_buf_total().
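A rough sketch of the rule described above (placeholder function
and values, not the client's actual code):

    #include <cstdio>

    double work_buf_min()   { return 0.25 * 86400; }  // user's min buffer (assumed value)
    double work_buf_total() { return 1.0 * 86400; }   // min + additional buffer (assumed value)

    // May we ask a project for work for a resource,
    // given how long the resource is already saturated?
    bool may_ask_for_work(double saturated_secs, bool project_overworked) {
        if (saturated_secs < work_buf_min()) {
            return true;                  // below the min buffer: ask anyone
        }
        if (saturated_secs < work_buf_total()) {
            return !project_overworked;   // in between: skip overworked projects
        }
        return false;                     // buffer is full: don't ask at all
    }

    int main() {
        printf("%d\n", may_ask_for_work(3600, true));    // 1: below min buffer
        printf("%d\n", may_ask_for_work(43200, true));   // 0: overworked, in between
    }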
svn path=/trunk/boinc/; revision=17620
other than work fetch (e.g., user request, project request)
temporarily clear resource backoffs while deciding
whether to request work.
The backoffs are there only to delay RPCs,
and we're doing an RPC anyway.
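Roughly, with hypothetical names (the real client keeps per-project,
per-resource backoff state):

    #include <vector>

    struct ResourceBackoff {
        double backoff_until = 0;   // time before which we'd normally not contact the project
    };

    struct WorkRequest { double req_secs = 0; };

    // Stand-in for the real work-request computation (details omitted).
    WorkRequest compute_work_request(const std::vector<ResourceBackoff>&) {
        WorkRequest r;
        r.req_secs = 3600;   // placeholder
        return r;
    }

    // An RPC is happening anyway, so backoffs (which exist only to delay
    // RPCs) shouldn't stop us from piggybacking a work request on it.
    WorkRequest request_for_existing_rpc(std::vector<ResourceBackoff>& backoffs) {
        std::vector<ResourceBackoff> saved = backoffs;   // remember current backoffs
        for (auto& b : backoffs) b.backoff_until = 0;    // temporarily clear them
        WorkRequest req = compute_work_request(backoffs);
        backoffs = saved;                                 // restore afterwards
        return req;
    }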
svn path=/trunk/boinc/; revision=17416
to ask for work inappropriately,
and to tell the user that it wasn't asking for work.
Here's what was going on:
There are two different structures with work request fields
(req_secs, req_instances, estimated_delay):
COPROC_CUDA *coproc_cuda
and
RSC_WORK_FETCH cuda_work_fetch.
WORK_FETCH::choose_project() copied from cuda_work_fetch to coproc_cuda,
but only if a project was selected.
WORK_FETCH::clear_request() clears cuda_work_fetch but not coproc_cuda.
Scenario:
- a scheduler op is made to project A requesting X>0 secs of CUDA
- later, a scheduler op is made to project B for a reason
other than work fetch (e.g., user request)
- choose_project() doesn't choose anything,
so the value of coproc_cuda->req_secs remains X
- clear_request() is called but that doesn't change *coproc_cuda
Solution: work-fetch code no longer knows about internals of
COPROC_CUDA and is not responsible for setting its request fields.
The copying of request fields from RSC_WORK_FETCH to COPROC
is done at a higher level,
in CLIENT_STATE::make_scheduler_request()
Additional bug fix: estimated_delay wasn't being cleared in some cases.
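A simplified sketch of the resulting structure (the types below are
stand-ins for illustration, not the real RSC_WORK_FETCH/COPROC_CUDA):

    struct RscWorkFetch {          // stands in for RSC_WORK_FETCH
        double req_secs = 0;
        int    req_instances = 0;
        double estimated_delay = 0;
        void clear_request() { req_secs = 0; req_instances = 0; estimated_delay = 0; }
    };

    struct CoprocDescriptor {      // stands in for COPROC_CUDA's request fields
        double req_secs = 0;
        int    req_instances = 0;
        double estimated_delay = 0;
    };

    // Done in one place, when the request message is actually built
    // (per this checkin, in CLIENT_STATE::make_scheduler_request()),
    // so a stale copy can't linger in the coproc structure.
    void fill_coproc_request(CoprocDescriptor& c, const RscWorkFetch& wf) {
        c.req_secs        = wf.req_secs;
        c.req_instances   = wf.req_instances;
        c.estimated_delay = wf.estimated_delay;
    }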
svn path=/trunk/boinc/; revision=17411
project, it must have no runnable jobs for ANY resource.
- client: work-fetch bug fix: when setting requests in the
shortfall case, don't request anything if project is backed off
or overworked for the resource.
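In sketch form (hypothetical fields), the check amounts to:

    struct ProjectRscState {
        bool backed_off = false;   // resource-level backoff in effect
        bool overworked = false;   // long-term debt below the overworked threshold
    };

    double request_for_shortfall(const ProjectRscState& p, double shortfall_secs) {
        if (p.backed_off || p.overworked) return 0;   // don't request anything
        return shortfall_secs;                        // otherwise request up to the shortfall
    }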
svn path=/trunk/boinc/; revision=17338
There are situations where multiple projects can legitimately
have large negative LTD on a uniprocessor.
Instead...
- client: add <zero_debts> option to cc_config.xml
svn path=/trunk/boinc/; revision=17328
1) if an instance is idle, get work from highest-debt project,
even if it's overworked.
2) if resource has a shortfall, get work from highest-debt
non-overworked project
3) if there's a fetchable non-overworked project with no runnable jobs,
get work from the highest-debt one.
(each step is done first for GPU, then CPU)
Clause 3) is new.
It will cause the client to get jobs for as many projects as possible,
even if there is no shortfall.
This is necessary to make the notion of "overworked" meaningful
(otherwise, any project with long jobs can become overworked).
It also maintains as much variety as possible (like pre-6.6 clients).
Also (small bug fix) if a project is overworked for resource R,
request work for R only in case 1).
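A rough sketch of the three steps, using simplified stand-in
structures rather than the client's real work-fetch state:

    #include <vector>

    struct Project {
        double debt = 0;                // long-term debt for this resource
        bool overworked = false;
        bool fetchable = true;          // not backed off, allows new work, etc.
        bool has_runnable_jobs = false;
    };

    struct Resource {
        bool has_idle_instance = false;
        double shortfall = 0;
    };

    // Returns the project to ask for work for one resource, or nullptr.
    // (The client runs this first for the GPU, then for the CPU.)
    Project* choose_project(Resource& r, std::vector<Project>& projects) {
        Project* best = nullptr;
        auto better = [&](Project& p) { return !best || p.debt > best->debt; };

        // 1) an instance is idle: highest-debt project, even if overworked
        if (r.has_idle_instance) {
            for (auto& p : projects)
                if (p.fetchable && better(p)) best = &p;
            return best;
        }
        // 2) shortfall: highest-debt non-overworked project
        if (r.shortfall > 0) {
            for (auto& p : projects)
                if (p.fetchable && !p.overworked && better(p)) best = &p;
            return best;
        }
        // 3) otherwise: highest-debt fetchable, non-overworked project
        //    with no runnable jobs (keeps every project supplied with jobs)
        for (auto& p : projects)
            if (p.fetchable && !p.overworked && !p.has_runnable_jobs && better(p)) best = &p;
        return best;
    }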
svn path=/trunk/boinc/; revision=17327
stop accumulating debt if it's at or around zero.
This prevents other projects from being driven unboundedly negative.
- client: if the number of overworked projects exceeds the number
of device instances, clear debts; this indicates that an earlier
client was buggy and produced bad debt values.
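Sketch of both rules, with made-up field names:

    #include <cstddef>
    #include <vector>

    struct ProjectDebt {
        double long_term_debt = 0;
        bool overworked = false;
    };

    // Rule 1: a project whose debt is already at or around zero stops
    // accumulating, so it can't drive the others unboundedly negative.
    void accumulate_debt(ProjectDebt& p, double delta) {
        if (p.long_term_debt >= -1e-6 && delta > 0) return;
        p.long_term_debt += delta;
    }

    // Rule 2: more overworked projects than device instances suggests
    // debts left over from a buggy older client, so start from scratch.
    void sanity_check_debts(std::vector<ProjectDebt>& projects, size_t n_device_instances) {
        size_t n_overworked = 0;
        for (auto& p : projects) if (p.overworked) n_overworked++;
        if (n_overworked > n_device_instances) {
            for (auto& p : projects) p.long_term_debt = 0;   // clear debts
        }
    }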
svn path=/trunk/boinc/; revision=17325
This fixes a bug that can cause debts to NEVER get updated.
- client: added "abort_jobs_on_exit" feature
(available via the --abort_jobs_on_exit command-line option
or <abort_jobs_on_exit> in cc_config.xml).
If set, when the client is exited by user request
(this includes signals on Unix)
it marks all pending jobs as aborted,
and does a scheduler RPC to all projects with jobs.
When these are completed the client exits.
This is useful when BOINC is being used on grids
where it is wiped clean after each run.
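The exit sequence, roughly (placeholder names, not the client's
actual functions):

    #include <vector>

    struct Job { bool aborted = false; };

    struct Proj {
        std::vector<Job*> jobs;
        bool rpc_pending = false;
        void start_scheduler_rpc() { rpc_pending = true; }   // reports the aborted jobs
    };

    // Called when the user exits the client and abort_jobs_on_exit is set.
    void begin_abort_sequence(std::vector<Proj>& projects) {
        for (auto& p : projects) {
            if (p.jobs.empty()) continue;
            for (auto* j : p.jobs) j->aborted = true;   // mark pending jobs aborted
            p.start_scheduler_rpc();                    // tell the project about them
        }
    }

    // The client exits only once all of those RPCs have completed.
    bool ready_to_exit(const std::vector<Proj>& projects) {
        for (auto& p : projects) if (p.rpc_pending) return false;
        return true;
    }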
svn path=/trunk/boinc/; revision=17300
so that the largest debt among eligible projects tends towards zero
- client: change definition of "overworked"; debt must be < 1 day
svn path=/trunk/boinc/; revision=17206
- client: if a project-requested RPC doesn't return work,
don't do resource backoff.
- client: if a user-requested scheduler RPC errors out, clear the request
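Sketch of the two rules (hypothetical types; the real reply handling
covers many more cases):

    enum class RpcReason { WorkFetch, UserRequest, ProjectRequest };

    struct ProjState {
        double resource_backoff = 0;   // seconds
        double work_request = 0;       // pending request, seconds
    };

    void on_scheduler_rpc_done(ProjState& p, RpcReason why, bool got_work, bool rpc_failed) {
        if (rpc_failed) {
            if (why == RpcReason::UserRequest) p.work_request = 0;   // clear the request
            return;
        }
        // back off the resource only if the RPC was for work fetch and got none;
        // a project-requested RPC that returns no work causes no backoff
        if (!got_work && why == RpcReason::WorkFetch) {
            p.resource_backoff = 600;   // placeholder backoff interval
        }
    }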
svn path=/trunk/boinc/; revision=17191
ignore intervals longer than 10 secs;
that could only happen if the client or host was suspended/hibernated.
- client: in adjust_debts(), ignore intervals longer than
2*work fetch period, not 2*CPU sched period.
adjust_debts() is called from work fetch.
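Roughly (the period value is an assumption, for illustration only):

    double work_fetch_period() { return 60; }   // seconds, assumed

    void adjust_debts(double now, double& last_time /*, per-project debt state */) {
        double dt = now - last_time;
        last_time = now;
        if (dt > 2 * work_fetch_period()) {
            // the client or host was probably suspended/hibernated for this
            // interval; don't count it toward or against any project's debt
            return;
        }
        // ... normal debt update over dt goes here ...
    }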
svn path=/trunk/boinc/; revision=17154
worked in the presence of coprocessors.
The simulator maintained per-project queues of pending jobs.
When a job finished (in the simulation) it would get
one or more jobs from that project's pending queue.
The problem: this could cause "holes" in the scheduling of GPUs,
and produce an erroneous nonzero shortfall for GPUs,
leading to infinite work fetch.
The solution: maintain a separate (per-resource, not per-project)
queue of pending coprocessor jobs.
When a coprocessor job finishes,
start pending jobs from the queue for that resource.
Another change: the simulator did strict reservation of coprocessors.
If there are 2 instances of CUDA,
and a 1-instance job is running in the simulation,
it wouldn't start an additional 2-instance job.
This also can cause erroneous nonzero shortfalls.
So instead, schedule coprocessors like CPUs, i.e. saturate them.
This can cause distorted completion time estimates,
but it's better than infinite work fetch.
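A rough sketch of the per-resource pending queue (much simplified;
the real simulator tracks far more state):

    #include <deque>
    #include <map>
    #include <string>

    struct SimJob {
        std::string project;
        std::string resource;   // "CPU", "CUDA", ...
    };

    struct Simulator {
        std::map<std::string, std::deque<SimJob>> pending;   // per-resource, not per-project

        void enqueue(const SimJob& j) { pending[j.resource].push_back(j); }

        // When a coprocessor job finishes in the simulation, start pending
        // jobs for that resource regardless of which project they belong
        // to; this avoids "holes" in the simulated GPU schedule and the
        // phantom shortfall they produced.
        void on_job_finished(const SimJob& finished, int freed_instances) {
            auto& q = pending[finished.resource];
            while (freed_instances-- > 0 && !q.empty()) {
                // ... start q.front() in the simulation ...
                q.pop_front();
            }
        }
    };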
svn path=/trunk/boinc/; revision=17093
There are two mechanisms to prevent the scheduler from
sending jobs that won't finish by their deadline.
Simple mechanism:
The client sends the interval x for which CPUs are projected
to be saturated.
Given a job with estimated duration y,
the scheduler doesn't send it if x + y exceeds the delay bound.
If it does send it, x is incremented by y.
Complex mechanism:
Client sends workload description.
Scheduler does EDF simulation, sees if deadlines are missed.
The only project using this AFAIK is BOINC alpha test.
Neither of these mechanisms takes coprocessors into account,
and as a result jobs could be sent that are doomed to
miss their deadline.
This checkin adds coprocessor awareness to the Simple mechanism.
Changes:
Client:
compute estimated delay (i.e. time until non-saturation)
for coprocessors as well as CPU.
Send them in the scheduler request as part of the coproc descriptor.
Scheduler:
Keep track of estimated delays separately for different resources
- client: fixed bug that computed CPU estimated delay incorrectly
- client: the work request (req_secs) for a resource is the min
of the project's share and the shortfall.
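For reference, the Simple mechanism above, applied per resource, looks
roughly like this (hypothetical names):

    struct ResourceLoad {
        double estimated_delay;   // x: seconds until the resource is no longer saturated
    };

    // Returns true if the job may be sent; if it is sent, the resource's
    // estimated delay is extended by the job's duration (x += y).
    bool try_send(ResourceLoad& r, double est_duration, double delay_bound) {
        if (r.estimated_delay + est_duration > delay_bound) {
            return false;   // would be doomed to miss its deadline
        }
        r.estimated_delay += est_duration;
        return true;
    }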
svn path=/trunk/boinc/; revision=17086
- client: restore notion of overworked;
if a project is overworked for a resource R,
don't fetch work for R unless there are idle instances
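In other words (hypothetical helper):

    // Work may be fetched for resource R from an overworked project
    // only when R has an idle instance.
    bool may_fetch_for_resource(bool overworked_for_rsc, bool rsc_has_idle_instance) {
        if (!overworked_for_rsc) return true;
        return rsc_has_idle_instance;
    }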
svn path=/trunk/boinc/; revision=17057
1) net adjustment for eligible projects is zero;
2) max LTD is zero
- scheduler: fix msgs so disk size is shown in GB
svn path=/trunk/boinc/; revision=17031
- client: respect work-fetch backoff for non-CPU-intensive projects
- client: for non-CPU-intensive project, fetch new job
if no currently running jobs
- client: skip non-CPU-intensive projects in debt calculations
- manager: show resource backoff times correctly
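Sketch of the non-CPU-intensive rules above (made-up fields):

    struct NciProject {
        bool non_cpu_intensive = true;
        bool backed_off = false;   // work-fetch backoff in effect
        int  running_jobs = 0;
    };

    // Fetch a new job for a non-CPU-intensive project only when it has no
    // running jobs, and only if its work-fetch backoff isn't in effect.
    bool should_fetch_nci_job(const NciProject& p) {
        if (!p.non_cpu_intensive || p.backed_off) return false;
        return p.running_jobs == 0;
    }

    // Debt calculations consider CPU-intensive projects only.
    bool counts_toward_debt(const NciProject& p) {
        return !p.non_cpu_intensive;
    }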
svn path=/trunk/boinc/; revision=16998