so that largest debt among eligible projects tends towards zero
- client: change definition of "overworked"; debt must be < 1 day
svn path=/trunk/boinc/; revision=17206
this gets called when the op fails, either at initialization or later on;
it clears the project's sched_rpc_pending flag if needed.
This fixes a bug that caused user-requested RPCs to retry every 10 seconds
when the network is down (see the sketch below).
- client: if debt-adjust period is too long, reset accounting.
Otherwise we'd keep hitting this condition indefinitely.
- API: add optional alpha argument to TEXTURE_DESC::draw()
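A minimal sketch of the failure path described above; sched_rpc_pending
is the flag named in this entry, everything else is illustrative:

    struct PROJECT {
        int sched_rpc_pending = 0;  // nonzero if a user/project RPC is pending
    };

    // Called when the scheduler op fails, at initialization or later on.
    // Clearing the flag stops the client from retrying a user-requested
    // RPC every 10 seconds while the network is down.
    void scheduler_op_failed(PROJECT& p) {
        if (p.sched_rpc_pending) {
            p.sched_rpc_pending = 0;
        }
    }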
svn path=/trunk/boinc/; revision=17195
- client: if a project-requested RPC doesn't return work,
don't do resource backoff.
- client: if a user-requested scheduler RPC errors out, clear the request
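A sketch of the backoff rule, with hypothetical reason codes:

    enum RPC_REASON { REASON_USER_REQ, REASON_PROJECT_REQ, REASON_NEED_WORK };

    // Back off a resource only when the client itself asked for work
    // and got none; a project-requested RPC that returns no work is
    // not a signal to back off.
    bool should_do_resource_backoff(RPC_REASON reason, bool got_work) {
        if (got_work) return false;
        return reason == REASON_NEED_WORK;
    }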
svn path=/trunk/boinc/; revision=17191
using a coprocessor we don't know about, ignore it
(and all results using that app_version will be flushed).
This deals with the situation where we have some GPU jobs,
but the GPU card is removed (previously this resulted in a crash).
This requires some code shuffling so that we check for coprocessors
before reading the state file.
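A sketch of the check, with illustrative types (the real client works on
its APP_VERSION list after parsing coprocessor info):

    #include <set>
    #include <string>
    #include <vector>

    struct APP_VERSION {
        std::string coproc_name;      // empty if the version uses no coprocessor
        bool missing_coproc = false;
    };

    // Must run after coprocessor detection but before the rest of the
    // state file is acted on; hence the code shuffling mentioned above.
    void flag_unknown_coprocs(std::vector<APP_VERSION>& avs,
                              const std::set<std::string>& known) {
        for (auto& av : avs) {
            if (!av.coproc_name.empty() && !known.count(av.coproc_name)) {
                av.missing_coproc = true;  // its results will be flushed
            }
        }
    }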
svn path=/trunk/boinc/; revision=17161
ignore intervals longer than 10 secs;
that could only happen if the client or host was suspended/hibernated.
- client: in adjust_debts(), ignore intervals longer than
2*work fetch period, not 2*CPU sched period.
adjust_debts() is called from work fetch.
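A sketch of both guards (names are illustrative):

    // Per-sample guard: a gap longer than 10 seconds can only mean the
    // client or host was suspended/hibernated, so don't count it.
    bool countable_interval(double dt) {
        return dt <= 10;
    }

    // adjust_debts() is driven by work fetch, so the sanity bound on
    // the total accounting interval is twice the work-fetch period,
    // not twice the CPU-scheduling period.
    void adjust_debts(double elapsed, double work_fetch_period) {
        if (elapsed > 2 * work_fetch_period) {
            return;  // implausibly long; skip this round of adjustment
        }
        // ... apply short- and long-term debt deltas ...
    }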
svn path=/trunk/boinc/; revision=17154
worked in the presence of coprocessors.
The simulator maintained per-project queues of pending jobs.
When a job finished (in the simulation), the simulator would start
one or more jobs from that project's pending queue.
The problem: this could cause "holes" in the scheduling of GPUs,
and produce an erroneous nonzero shortfall for GPUs,
leading to infinite work fetch.
The solution: maintain a separate (per-resource, not per-project)
queue of pending coprocessor jobs.
When a coprocessor job finishes,
start pending jobs from the queue for that resource.
Another change: the simulator did strict reservation of coprocessors.
If there are 2 instances of CUDA,
and a 1-instance job is running in the simulation,
it wouldn't start an additional 2-instance job.
This also can cause erroneous nonzero shortfalls.
So instead, schedule coprocessors like CPUs, i.e. saturate them.
This can cause distorted completion time estimates,
but it's better than infinite work fetch.
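A sketch of the per-resource pending queue (illustrative types; the real
simulator state is richer):

    #include <deque>
    #include <map>
    #include <string>

    struct SIM_JOB {
        std::string project;
        std::string resource;   // e.g. "CPU" or "CUDA"
    };

    // Pending coprocessor jobs queued per resource, not per project,
    // so a finishing GPU job is backfilled immediately: no scheduling
    // "holes", hence no phantom shortfall and no infinite work fetch.
    std::map<std::string, std::deque<SIM_JOB>> pending;

    void on_sim_job_finished(const SIM_JOB& done) {
        auto& q = pending[done.resource];
        if (!q.empty()) {
            // start the next pending job for this resource; coprocessors
            // are saturated like CPUs, not strictly reserved per instance
            SIM_JOB next = q.front();
            q.pop_front();
            (void)next;  // the simulation would start it here
        }
    }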
svn path=/trunk/boinc/; revision=17093
There are two mechanisms to prevent the scheduler from
sending jobs that won't finish by their deadline.
Simple mechanism:
The client sends the interval x for which CPUs are projected
to be saturated.
Given a job with estimated duration y,
the scheduler doesn't send it if x + y exceeds the delay bound.
If it does send it, x is incremented by y.
Complex mechanism:
Client sends workload description.
Scheduler does EDF simulation, sees if deadlines are missed.
The only project using this AFAIK is BOINC alpha test.
Neither of these mechanisms takes coprocessors into account,
and as a result jobs could be sent that are doomed to
miss their deadline.
This checkin adds coprocessor awareness to the Simple mechanism
(sketched in code below).
Changes:
Client:
compute estimated delay (i.e. time until non-saturation)
for coprocessors as well as CPU.
Send them in the scheduler request as part of the coproc descriptor.
Scheduler:
Keep track of estimated delays separately for different resources.
- client: fixed bug that computed CPU estimated delay incorrectly
- client: the work request (req_secs) for a resource is the min
of the project's share and the shortfall.
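A sketch of the per-resource Simple mechanism (hypothetical names; the
delays arrive in the request's coproc descriptors):

    #include <map>
    #include <string>

    // resource name -> projected time until that resource is unsaturated
    std::map<std::string, double> estimated_delay;

    // Refuse a job whose estimated duration won't fit before its delay
    // bound; if we do send it, extend the projected saturation interval.
    bool try_send(const std::string& rsc, double est_duration, double delay_bound) {
        double& x = estimated_delay[rsc];
        if (x + est_duration > delay_bound) {
            return false;  // doomed to miss its deadline
        }
        x += est_duration;
        return true;
    }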
svn path=/trunk/boinc/; revision=17086
- client: restore notion of overworked;
if a project is overworked for a resource R,
don't fetch work for R unless there are idle instances
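A sketch of the fetch gate (illustrative fields):

    struct RSC_STATE {
        int nidle_now = 0;        // currently idle instances of R
        bool overworked = false;  // per-project, per-resource in reality
    };

    // Don't fetch work for R from an overworked project unless some
    // instances of R are idle.
    bool may_fetch(const RSC_STATE& r) {
        return !r.overworked || r.nidle_now > 0;
    }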
svn path=/trunk/boinc/; revision=17057
but we don't need to send any more CUDA jobs,
delete the BEST_APP_VERSION record and look for another app version.
This lets the scheduler send both CUDA and CPU app versions
for a given app in a single RPC.
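A sketch of the retry (hypothetical lookup; BEST_APP_VERSION is the
scheduler's cache of the best version per app):

    #include <map>

    struct APP_VERSION { bool uses_cuda = false; };
    struct BEST_APP_VERSION { APP_VERSION* avp = nullptr; };

    std::map<int, BEST_APP_VERSION> best;   // keyed by application ID

    // If the cached best version needs CUDA but we need no more CUDA
    // jobs, forget the cache entry and scan again; that way one RPC can
    // carry both CUDA and CPU versions of the same app.
    APP_VERSION* get_best_version(int appid, bool need_more_cuda) {
        auto it = best.find(appid);
        if (it != best.end()) {
            APP_VERSION* avp = it->second.avp;
            if (avp && avp->uses_cuda && !need_more_cuda) {
                best.erase(it);   // delete the BEST_APP_VERSION record
            } else {
                return avp;
            }
        }
        // ... re-scan app versions, now skipping CUDA ones ...
        return nullptr;
    }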
svn path=/trunk/boinc/; revision=17051
1) net adjustment for eligible projects is zero;
2) max LTD is zero (see the sketch below)
- scheduler: fix msgs so disk size is shown in GB
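A sketch of the two properties from the client item above (illustrative;
the actual accounting is spread across the debt code):

    #include <algorithm>
    #include <vector>

    void apply_ltd_update(std::vector<double>& debts,
                          const std::vector<double>& deltas) {
        if (debts.empty() || debts.size() != deltas.size()) return;
        double mean = 0;
        for (double d : deltas) mean += d;
        mean /= deltas.size();
        // 1) net adjustment across eligible projects is zero
        for (size_t i = 0; i < debts.size(); i++) {
            debts[i] += deltas[i] - mean;
        }
        // 2) shift so the maximum LTD is zero
        double mx = *std::max_element(debts.begin(), debts.end());
        for (double& d : debts) d -= mx;
    }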
svn path=/trunk/boinc/; revision=17031
- client: respect work-fetch backoff for non-CPU-intensive projects
- client: for a non-CPU-intensive project, fetch a new job
if there are no currently running jobs
- client: skip non-CPU-intensive projects in debt calculations
- manager: show resource backoff times correctly
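A sketch of the non-CPU-intensive rules from this entry (hypothetical
fields):

    struct PROJECT {
        bool non_cpu_intensive = false;
        int n_running_jobs = 0;
        double backoff_until = 0;   // per-resource in reality
    };

    // Respect backoff; a non-CPU-intensive project wants exactly one
    // job at a time. (Such projects are also skipped when computing
    // debts.)
    bool want_new_job(const PROJECT& p, double now) {
        if (now < p.backoff_until) return false;
        if (p.non_cpu_intensive) return p.n_running_jobs == 0;
        // ... normal work-fetch policy ...
        return true;
    }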
svn path=/trunk/boinc/; revision=16998
1) it uses a coprocessor
2) it has checkpointed since the client started
3) it's being preempted because of a user action
(suspend job, project, or all processing)
or user preference (time of day, computer in use); see the sketch below
- scheduler: if shared mem seg doesn't exist,
report it and don't crash
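A sketch of the remove-from-memory test from the client item above
(illustrative names and reason codes):

    enum PREEMPT_REASON {
        PREEMPT_CPU_SCHED,   // routine timeslicing
        PREEMPT_USER_ACTION, // suspend job, project, or all processing
        PREEMPT_USER_PREF    // time of day, computer in use
    };

    struct TASK {
        bool uses_coproc = false;
        double last_checkpoint = 0;  // wall-clock time of last checkpoint
    };

    bool remove_from_memory(const TASK& t, PREEMPT_REASON why, double client_start) {
        if (t.uses_coproc) return true;                     // 1) free the coprocessor
        if (t.last_checkpoint > client_start) return true;  // 2) it has checkpointed
        if (why == PREEMPT_USER_ACTION || why == PREEMPT_USER_PREF) return true;  // 3)
        return false;  // otherwise keep it in memory
    }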
svn path=/trunk/boinc/; revision=16992
even if it doesn't use a coprocessor.
- scheduler: added an "nci" (non-CPU-intensive) plan class
to sched_plan.cpp. It declares the use of 1% of a CPU.
The above two changes are intended to allow the QCN app to
run at above_idle priority, which it needs in order to do 500 Hz polling
(see the sketch below).
- API: the std::string version of boinc_resolve_filename()
acts the same as the char[] version.
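A sketch of an "nci" plan-class handler in the spirit of sched_plan.cpp
(the real signature and HOST_USAGE fields may differ):

    struct HOST_USAGE {
        double avg_ncpus = 1;
        double max_ncpus = 1;
        double flops = 0;
    };

    bool app_plan_nci(HOST_USAGE& hu, double host_flops) {
        hu.avg_ncpus = 0.01;            // declares use of 1% of a CPU
        hu.max_ncpus = 0.01;
        hu.flops = 0.01 * host_flops;   // scale the FLOPS estimate to match
        return true;                    // host can handle this plan class
    }

And a sketch of how the std::string overload can simply delegate to the
char[] version (buffer size is illustrative):

    #include <string>

    extern int boinc_resolve_filename(const char* virt, char* phys, int len);

    int boinc_resolve_filename_s(const char* virt, std::string& phys) {
        char buf[512];
        int retval = boinc_resolve_filename(virt, buf, sizeof(buf));
        if (!retval) phys = buf;
        return retval;
    }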
svn path=/trunk/boinc/; revision=16985