If present, "file_prefix/" is prepended to the logical names
of input and output files of jobs using that app version.
E.g. for Vbox wrapper based app versions, file_prefix is "share",
so that I/O files are put in a "share" subdirectory of the slot dir.
- update_versions: add support for
<dont_throttle>
<file_prefix>x</file_prefix>
in version.xml
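A minimal sketch of the prefixing rule described above. The function name slot_path and its arguments are hypothetical, chosen only for illustration; they are not taken from the BOINC source.

```python
def slot_path(logical_name, file_prefix=None):
    """Return the path, relative to the slot dir, of a job's I/O file.

    Hypothetical helper: if the app version has a file_prefix,
    "file_prefix/" is prepended to the logical name.
    """
    if file_prefix:
        return f"{file_prefix}/{logical_name}"
    return logical_name
```

With file_prefix "share", a logical name such as "in.dat" would map to "share/in.dat"; with no prefix it is left unchanged.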
svn path=/trunk/boinc/; revision=23924
- client: cc_config.xml: if <devnum> is omitted from a <exclude_gpu>,
it means exclude all instances of that GPU type
- client: if all instances of a GPU type are excluded for a project,
don't ask the project for jobs of that type
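The two rules above can be sketched as follows. This is an illustration, not the client's code: it models a single GPU type, the helper names are hypothetical, and it accepts both <devnum> and <device_num> because these notes use both spellings.

```python
import xml.etree.ElementTree as ET

def parse_exclusions(cc_config_xml):
    """Return a list of (project_url, device_number_or_None) pairs."""
    root = ET.fromstring(cc_config_xml)
    exclusions = []
    for elem in root.iter("exclude_gpu"):
        url = elem.findtext("url", "").strip()
        dev = elem.findtext("devnum")
        if dev is None:
            dev = elem.findtext("device_num")
        # A missing device number means "all instances".
        exclusions.append((url, int(dev) if dev is not None else None))
    return exclusions

def usable_instances(exclusions, project_url, n_instances):
    """Return the GPU instance numbers the project may still use."""
    excluded = set()
    for url, dev in exclusions:
        if url != project_url:
            continue
        if dev is None:
            return []  # every instance is excluded
        excluded.add(dev)
    return [i for i in range(n_instances) if i not in excluded]

def should_request_gpu_jobs(exclusions, project_url, n_instances):
    """Don't ask a project for GPU jobs if it can use no instance."""
    return bool(usable_instances(exclusions, project_url, n_instances))
```

The project URLs used to exercise this sketch would be placeholders; the real client's matching of URLs and GPU types may differ.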
svn path=/trunk/boinc/; revision=23898
as described here: http://boinc.berkeley.edu/trac/wiki/ClientDataModel
Compatibility: if your project is using upload signatures:
- set ignore_upload_certificates
- disable job creation
- let your job queue drain
- upgrade to new server software
- clear ignore_upload_certificates
- enable job creation
svn path=/trunk/boinc/; revision=23863
as described here: http://boinc.berkeley.edu/trac/wiki/ClientDataModel
Compatibility:
clients that upgrade to this version should see nothing unusual.
Clients that downgrade from this version to a previous version
should see all projects reset
(i.e. tasks disappear and then get re-downloaded).
- manager: always show whether a file transfer is upload or download
- client: don't scale work requests by resource share
svn path=/trunk/boinc/; revision=23862
If you put an element of the form
<exclude_gpu>
<url>http://project_url.com/</url>
<device_num>1</device_num>
</exclude_gpu>
in your cc_config.xml, that GPU won't be used for that project
svn path=/trunk/boinc/; revision=23774
- client: allow "non_cpu_intensive" to be specified independently
for different apps in a project.
This is intended to support projects that use the
Attic file distribution system,
which needs to have a daemon running.
svn path=/trunk/boinc/; revision=23610
individual jobs rather than globally.
To use this, projects must add <report_immediately/>
to the <result> elements in job templates
svn path=/trunk/boinc/; revision=23515
- All sticky files are reported on each scheduler RPC
- If a scheduler reply says to delete a file, clear its sticky flag
In particular:
- remove the "send file list" tag in scheduler RPC replies
- remove FILE_INFO::marked_for_delete
- remove FILE_INFO::report_on_rpc
- remove the request_file_list program
svn path=/trunk/boinc/; revision=23431
- new GPU types can be added easily
- users can specify GPUs in cc_config.xml,
referred to by app_info.xml,
and they will be scheduled by BOINC
and passed --device N options
Note: the parsing of cc_config.xml is not done yet.
- RPC protocols (account manager and scheduler)
can now specify GPU types in separate elements
rather than embedding them in tag names
e.g. <no_rsc>NVIDIA</no_rsc> rather than <no_cuda/>
- client: in account manager replies, parse elements of the form
<no_rsc>NAME</no_rsc>
indicating that GPUs of type NAME should not be used.
This allows account managers to control GPU types
not hardwired into the client.
Note: <no_cuda/> and <no_ati/> will continue to be supported.
- scheduler RPC reply: add
<no_rsc_apps>NAME</no_rsc_apps>
(NAME = GPU name)
to indicate that the project has no jobs for the indicated GPU type.
<no_cuda_apps> etc. are still supported
- client/lib: remove set_debts() GUI RPC
- client/scheduler RPC
remove <cuda_backoff> etc. (superseded by no_app)
Exception: <ip_result> elements in sched request
still have <ncudas> and <natis>.
Fix this later.
Implementation notes:
- client/lib: change "CUDA" to "NVIDIA" in type/variable names, and in XML
Continue to recognize "CUDA" for compatibility
- host_info.coprocs no longer used within the client;
use a global var (COPROCS coprocs) instead.
COPROCS now has an array of COPROCs;
GPU types are identified by the array index.
Index zero means CPU.
- a bunch of other resource-specific structs (like RSC_WORK_FETCH)
are now stored in arrays, with same indices as COPROCS
(i.e. index 0 is CPU)
- COPROCS still has COPROC_NVIDIA and COPROC_ATI structs to hold vendor-specific info
- APP_VERSION now has a struct GPU_USAGE to describe its GPU usage
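As an illustration of the <no_rsc> handling described above, here is a rough sketch of how a reply could be scanned for excluded GPU types while still honoring the legacy <no_cuda/> and <no_ati/> flags and the "CUDA" spelling. The function names and the root element used when trying it out are assumptions, not taken from the client.

```python
import xml.etree.ElementTree as ET

# Legacy flag tags and the GPU type name each one stands for.
LEGACY_FLAGS = {"no_cuda": "NVIDIA", "no_ati": "ATI"}

def normalize_gpu_name(name):
    """Continue to recognize "CUDA" as an alias for "NVIDIA"."""
    return "NVIDIA" if name.upper() == "CUDA" else name

def excluded_gpu_types(reply_xml):
    """Return the set of GPU type names that should not be used."""
    root = ET.fromstring(reply_xml)
    names = set()
    for elem in root.iter("no_rsc"):
        if elem.text and elem.text.strip():
            names.add(normalize_gpu_name(elem.text.strip()))
    for tag, name in LEGACY_FLAGS.items():
        if root.find(f".//{tag}") is not None:
            names.add(name)
    return names
```

Because the type name is carried in element content rather than in the tag name, a GPU type unknown to this code still ends up in the returned set.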
svn path=/trunk/boinc/; revision=23253
The problem arises when there are jobs of projects
with widely differing resource shares,
and results in an overestimation of saturated time.
Old: at the start of simulation, call WORK_FETCH::compute_shares()
to get resource shares of runnable projects.
Use these throughout the simulation.
Problem: suppose you have 2 runnable projects;
P1 has large RS, P2 has small RS.
P1's jobs finish quickly.
P2's jobs then are running alone,
but their FLOPS is scaled (incorrectly) by P2's small RS.
Solution: recompute relative CPU resource share within the
simulation loop,
and compute it over the projects that have active jobs
in the simulation.
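The effect can be seen in a toy model. The sketch below is not the client's RR simulation: it gives each project one lump of remaining FLOPs, a single CPU resource, and a hypothetical simulate() function whose flag switches between the old (fixed shares) and new (shares recomputed over active projects) behavior.

```python
def simulate(projects, host_flops, recompute_shares):
    """projects maps name -> (resource_share, remaining_flops).

    Returns a dict mapping each project name to its finish time.
    """
    share = {p: rs for p, (rs, _) in projects.items()}
    remaining = {p: flops for p, (_, flops) in projects.items()}
    fixed_total = sum(share.values())
    active = set(projects)
    finish = {}
    t = 0.0
    while active:
        # New behavior: normalize over projects that still have jobs.
        total = sum(share[p] for p in active) if recompute_shares else fixed_total
        rate = {p: host_flops * share[p] / total for p in active}
        time_left = {p: remaining[p] / rate[p] for p in active}
        dt = min(time_left.values())
        t += dt
        for p in list(active):
            if time_left[p] == dt:
                finish[p] = t
                active.remove(p)
            else:
                remaining[p] -= rate[p] * dt
    return finish
```

With P1 at share 90 and 90 FLOPs of work, P2 at share 10 and 100 FLOPs, and a host rate of 10, both variants finish P1 at t=10. The fixed-share variant then keeps running P2 at one tenth of the host rate and finishes it at t=100; the recomputed variant gives P2 the full rate and finishes it at t=19.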
svn path=/trunk/boinc/; revision=23162
(either at startup or during execution)
reset a number of "wait until X" variables;
otherwise we might wait years to contact a project, restart a file xfer, etc.
Notes:
- there is no problem setting clocks forward; things just happen prematurely
- some variables (e.g. task deadlines) are not reset,
because it's not clear what to set them to
- sched: remove ati_opencl plan class until we understand what it is
svn path=/trunk/boinc/; revision=22842
and an upload started in the last 5 min, don't fetch work from it.
The goal is to merge the 2 scheduler RPCs
(fetch work, report completed tasks) into a single RPC.
Note: this may result in idleness in some cases.
- scheduler: if client doesn't handle plan class (pre-5.10),
check plan-class app versions anyway,
but only use if it's a single-CPU app.
This allows single-CPU app versions with specific requirements
(like SSE) to be issued to old clients.
From Bernd Machenschalk
svn path=/trunk/boinc/; revision=22841
Old: enforce_schedule() won't run an active job if its
working set size exceeds remaining available RAM.
Problem: there may be a lot of similar jobs.
The client starts one, finds that its working set is too large,
starts the second, and so on.
Solution: if J is an unstarted job,
and there are started jobs using the same app version,
consider J's working set size to be the largest of
the working sets of those jobs.
- client: fix an apparent bug that could oversaturate
the CPUs with single-thread jobs
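The working-set rule for unstarted jobs might be sketched like this. The Job class and its field names are hypothetical stand-ins for whatever the client actually tracks.

```python
from dataclasses import dataclass

@dataclass
class Job:
    app_version: str
    started: bool
    working_set: float  # bytes; for an unstarted job this is only a guess

def estimated_working_set(job, all_jobs):
    """Working set size to assume when deciding whether to run `job`."""
    if job.started:
        return job.working_set
    peers = [
        j.working_set
        for j in all_jobs
        if j.started and j.app_version == job.app_version
    ]
    # Unstarted job: assume the largest working set seen among started
    # jobs of the same app version, if there are any.
    return max(peers, default=job.working_set)
```

This way, once one job of an app version has shown a large working set, further unstarted jobs of that version are judged against that size before being started.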
svn path=/trunk/boinc/; revision=22840
recent estimated credit (REC) instead of debt.
These changes are enabled by
#define USE_REC
in work_fetch.h.
If this is commented out (the default) the client uses
debt-based scheduling, same as before.
TODO: work-fetch policy changes
- client simulator: various fixes:
- compute idle and wasted fraction based on all processing resources,
not just CPU
- compute job completion times based on FLOPS, not CPU seconds
- compute and use project->no_X_apps
etc.
svn path=/trunk/boinc/; revision=22741
Additions to request message:
<not_started_dur>X</not_started_dur>
<in_progress_dur>X</in_progress_dur>
The estimated remaining duration of unstarted
and in-progress tasks
Additions to reply message, within <project>, optional:
<suspend>0|1</suspend>
suspend or resume project (overrides local state)
<abort_not_started>0|1</abort_not_started>
if set, abort unstarted jobs
svn path=/trunk/boinc/; revision=22698
if project P is anonymous platform
don't request work for resource R from P
if there is no app version using R in P/app_info.xml
else
don't request work for resource R from P
if P tells us it has no app versions using R
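The rule above, written out as a small sketch. The Project fields are hypothetical: app_info_resources stands for the resources used by app versions in P/app_info.xml, and no_apps_for stands for the resources the project has said it has no app versions for.

```python
from dataclasses import dataclass, field

@dataclass
class Project:
    anonymous_platform: bool = False
    app_info_resources: set = field(default_factory=set)
    no_apps_for: set = field(default_factory=set)

def may_request_work(project, resource):
    """Return True if work for `resource` may be requested from `project`."""
    if project.anonymous_platform:
        # Only resources used by some app version in app_info.xml.
        return resource in project.app_info_resources
    # Otherwise trust what the project has told us.
    return resource not in project.no_apps_for
```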
svn path=/trunk/boinc/; revision=22675
- scheduler: improve the deadline check mechanism slightly.
When updating "estimated delay" (a rough measure of how long
a resource is saturated with high-priority work)
take into account the # of instances used by the job,
and the # of total instances
svn path=/trunk/boinc/; revision=22612
They can be determined implicitly by WUs/results,
or explicitly in the <app> record.
If you do neither, the app is ignored.
svn path=/trunk/boinc/; revision=22591
as the major criterion in choosing non-EDF GPU jobs.
GPU scheduling now respects resource share,
and as a result STD should no longer diverge.
- client simulator: various improvements, most notably
that we now generate gnuplot graphs of all debt types
NOTE: the client problem was found and fixed using the simulator!
svn path=/trunk/boinc/; revision=22536
cmdline arg.
Suppresses the fetch of project list and of current client version #.
Use when running on grid nodes.
- debugging on client simulator. Not done yet.
svn path=/trunk/boinc/; revision=22414
Instead of using its own XML input files,
the simulator now takes a client_state.xml file as input.
The simulator generates a synthetic workload based on the
projects, apps, app versions, WUs, and results it finds there.
This means that a user seeing aberrant behavior
can just send their client_state.xml file
and (hopefully) we can use the simulator to repro.
The simulator now can model GPUs.
As of this checkin, the simulator compiles but doesn't work.
There should be no change in the actual client.
svn path=/trunk/boinc/; revision=22409
Implementation: create a base class PROJ_AM,
from which both PROJECT and ACCT_MGR_INFO are derived,
with basic stuff like name, URL, and RSS feed list
svn path=/trunk/boinc/; revision=22324
to request new work on exit
- client: change "unparsed tag" to "unrecognized tag" in msgs
- client: get rid of unused var work_fetch_no_new_work
svn path=/trunk/boinc/; revision=22000
Add more info to "project in-progress job list".
Old: entries included only job name and app plan class;
this was used to resend lost jobs,
and to count the # of CPU and GPU jobs.
But it's not usable e.g. for per-app in-progress limits.
New: send the client's app versions (including usage info)
and for each in-progress job, which app version it uses.
(This reduces request-message size compared with sending
usage info and app name per job).
- client and scheduler RPC:
Add more info to "all in-progress job list", and make it optional.
This list is used by schedulers that do deadline checks
using EDF workload simulation.
Old: the list is always sent, and it contains no info
about job resource usage
New: the list is sent only if the scheduler asked for it
in a previous reply,
and each entry now contains resource usage (CPU, GPUs)
Note: the scheduler's EDF simulator is outdated;
it doesn't know about GPU jobs.
But we may as well get the info in place.
svn path=/trunk/boinc/; revision=21513
old: assign GPUs, then check available RAM
Problem: may cause starvation on multi-GPU systems.
new: use available RAM info in the assignment process.
Prevents starvation, also reduces the number of driver calls.
svn path=/trunk/boinc/; revision=21205
favor those that are partially done
- client: fix crashing bug if a project is detached
while an RSS feed fetch for it is in progress
- code cleanup: switch from /// back to // for comments
(so much for doxygen)
svn path=/trunk/boinc/; revision=21041
Removed my changes of 19 Jan 2010, which didn't work.
Added new mechanism: keep track of whether a job J has ever run in EDF.
If so, and if another job of the same project and resource type as J
is marked as deadline miss, then mark J as deadline miss,
so that it won't get preempted.
- web: change "result" to "task" in server status page
- admin web: show server stable SVN revision, not trunk
svn path=/trunk/boinc/; revision=20805
treat it as a "backup project":
fetch work from it only if there is an idle instance
and no other projects have work.
svn path=/trunk/boinc/; revision=20286
if job A is unstarted and EDF,
and there's a job B that is later in the list,
is started, has the same app version,
and has the same arrival time,
move A after B.
- client: remove the "temp_dcf" mechanism,
which had the same goal but didn't work.
- client: in computing overall debt for a project,
subtract a term that reflects pending work.
This should reduce repeated fetches from the same project.
- client simulator: tweaks
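The reordering rule for unstarted EDF jobs could look roughly like the sketch below. The Job fields are hypothetical, and where several started jobs qualify as B this sketch moves A after the first one it finds and then re-examines the list; the notes do not say which B the client picks.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    app_version: str
    arrival: float
    started: bool
    edf: bool

def reorder(jobs):
    """Move each unstarted EDF job after a later, started job that has
    the same app version and arrival time."""
    jobs = list(jobs)
    i = 0
    while i < len(jobs):
        a = jobs[i]
        target = None
        if a.edf and not a.started:
            for k in range(i + 1, len(jobs)):
                b = jobs[k]
                if (b.started and b.app_version == a.app_version
                        and b.arrival == a.arrival):
                    target = k
                    break
        if target is None:
            i += 1
        else:
            # After pop(i), B sits at target - 1, so A lands right after it.
            jobs.insert(target, jobs.pop(i))
    return jobs
```

The intent is that a partly completed job is preferred over an unstarted sibling from the same batch.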
svn path=/trunk/boinc/; revision=20223
- a project overestimates job FLOP counts
- the client starts jobs in EDF mode
- as job progresses and fraction done increases,
its completion time estimate decreases until
it's no longer a deadline miss.
- job gets preempted by other job from that project;
you end up with lots of partly completed jobs.
Solution (I hope): if an app version has running jobs,
compute a "temp DCF" for the app version,
which is the min of dynamic/static estimates for its jobs.
Apply this scaling factor to completion time estimates
for unstarted jobs in RR simulation
- client: the estimation of remaining time of running jobs was wrong
(how did this bug survive so long?)
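A sketch of the "temp DCF" idea as described in this entry: the minimum, over an app version's running jobs, of the ratio of dynamic to static completion-time estimates, applied to unstarted jobs. Function names and the tuple layout are hypothetical.

```python
def temp_dcf(running_jobs):
    """running_jobs: list of (dynamic_estimate, static_estimate) pairs
    for the running jobs of one app version.

    Returns the scaling factor, or None if there are no usable jobs.
    """
    ratios = [dyn / static for dyn, static in running_jobs if static > 0]
    return min(ratios) if ratios else None

def scaled_estimate(static_estimate, running_jobs):
    """Completion-time estimate for an unstarted job in RR simulation."""
    factor = temp_dcf(running_jobs)
    if factor is None:
        return static_estimate
    return static_estimate * factor
```

For example, if running jobs have dynamic/static ratios of 0.5 and 0.25, an unstarted job with a static estimate of 200 would be treated as needing 50.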
svn path=/trunk/boinc/; revision=20077