If set, the feeder doesn't read jobs into shmem,
and the scheduler doesn't send jobs.
Intended for use when a project wants to process
a backlog of completed jobs and not issue more.
svn path=/trunk/boinc/; revision=22601
- scheduler: add max_download_urls_per_file config option
(to limit the length of workunit.xml_doc,
which is currently capped at 64KB).
From Bernd.
svn path=/trunk/boinc/; revision=22082
This feature lets you run the BOINC client as a job on grid systems
that handle only 1-CPU jobs;
it disables various mechanisms that prevent multiple clients per host
(which is normally a bad thing).
Old:
- Run the client with a --allow_multiple_clients flag.
This tells it not to use a mutex that prevents
multiple clients per host.
- Run the project with the <multiple_clients_per_host> config flag.
This suppresses two mechanisms:
- (avoid duplicate host records)
on a scheduler request with no host ID,
looks for a host with same domain name, OS type,
and mem size, and assumes the request is from that host
- (job retry)
If we get a request that doesn't have a host ID
but does have a host CPID,
mark its in-progress results as over
NOTE: I CAN'T REMEMBER WHY WE SUPPRESS THIS;
MARK S, DO YOU REMEMBER?
Problem:
if the grid clients attach to a project that
doesn't use <multiple_clients_per_host>, bad things happen.
E.g., if there are several requests at about the same time,
most of them will fail with
"another RPC already in progress" errors.
If a project does include this flag,
it loses protection from duplicate host records.
New:
- If the client is run with --allow_multiple_clients flag,
it passes a <allow_multiple_clients> element
in scheduler requests.
- The scheduler skips the duplicate-host check on
requests that include this flag.
- There is no more <multiple_clients_per_host> scheduler option.
Note: if a project using the old mechanism upgrades to this change,
it will need to use new clients for its grid deployment.
svn path=/trunk/boinc/; revision=21839
That produced a messed-up query that assigned garbage values to:
host_app_version.turnaround_var
host_app_version.turnaround_q
host_app_version.max_jobs_per_day
host_app_version.consecutive_valid
To repair these:
- set turnaround_var and turnaround_q to zero
- if max_jobs_per_day is outside of
(0..config.daily_result_quota)
set it to config.daily_result_quota
- if consecutive_valid is outside (0..1000), set it to zero
I added a script, html/ops/repair_21812.php, that does this;
if you ran server code between [21181] and [21812], run this script.
- scheduler/transitioner: add <debug_quota> log flag
- changed the build system to always use -Wall
(if we'd done this before, this bug wouldn't have happened)
- fixed a bunch of other compile warnings
svn path=/trunk/boinc/; revision=21812
This file originally used code from the following tutorial,
which shows how to open a window using GLUT:
http://nehe.gamedev.net/data/lessons/lesson.asp?lesson=01
The code has now been completely rewritten;
in particular, it doesn't use GLUT anymore.
- scheduler: change default limit on #CPUs from 16 to 64
svn path=/trunk/boinc/; revision=21784
Old: back off until random time in 1st hour of next day
New: no server-dictated backoff; rely on client backoff
This is needed to let hosts recover in a reasonable amount of time
after a burst of errors.
- scheduler config: it turns out we can't put arbitrary XML in config.xml;
The Python code is set up to parse only 1 level of tags (??),
and I'm not up to the task of changing this.
So the fine-grained job limit feature [21674] needs to use
a different file, namely config_aux.xml
svn path=/trunk/boinc/; revision=21686
You can now specify limits for specific apps,
and/or for the project as a whole.
Within each of these, you can specify limits on
CPU jobs, GPU jobs, or total jobs.
In the case of CPU and GPU limits, you can specify
whether the limit should be scaled by the number of devices.
Note: the enforcement of this is done in get_app_version(),
since per-resource-type limits may dictate what app versions
we can use for a particular job.
svn path=/trunk/boinc/; revision=21674
see http://boinc.berkeley.edu/trac/wiki/CreditNew
Projects will need to update DB and recompile all back-end programs.
Summary:
- new way of computing credit
- "reliable host" mechanism is per app version
- "host punishment" mechanism is per app version
- adjustment of wu.rsc_fpops_est provides the
equivalent of per app version DCF
- max jobs in progress is now per app
- max jobs per RPC is now per app
TODO:
- reliable mechanism:
- populate and use host_app_version.error_rate
- populate host_app_version.turnaround
- host punishment:
- populate host_app_version.max_jobs_per_day
- populate host_app_version.n_jobs_today
- use app.max_jobs_per_day_init
- job limits:
- use app.max_jobs_in_progress, max_gpu_jobs_in_progress
- use app.max_jobs_per_rpc
- adjust wu.rsc_fpops_est
- remove old credit stuff
fpops_cumulative, credit_multiplier
credit computation in scheduler
- AVERAGE class: use the Knuth algorithm (Wikipedia)
svn path=/trunk/boinc/; revision=21021
and treat it as a recoverable error (i.e., retry).
The file deleter may run on a host that NSF-mounts
the upload/download dirs, and NSF mounts can file.
- scheduler: include WU#ID in log msgs for handled results
svn path=/trunk/boinc/; revision=18351
The limit on jobs in progress is now
max_wus_in_progress * NCPUS
+ max_wus_in_progress * NGPUS
where NCPUS and NGPUS reflect prefs and are capped.
Furthermore: if the client reports plan class for in-progress jobs
(see checkin of 31 May 2009)
then these limits are enforced separately;
i.e. the # of in-progress CPU jobs is <= max_wus_in_progress*NCPUS,
and the # of in-progress GPU jobs is <= max_wus_in_progress_gpu*NGPUS
- scheduler config: rename <cuda_multiplier> to <gpu_multiplier>
- scheduler: <max_wus_to_send> is now scaled by
(NCPUS + gpu_multiplier*NGPUS)
- scheduler: don't keep scanning array if !work_needed()
- scheduler: moved array-scan logic from sched_send.cpp to sched_array.cpp
- scheduler: don't say "no work available" if jobs are available
but work_needed() is initially false
svn path=/trunk/boinc/; revision=18255
limits the # of completed results handled per scheduler RPC.
This may be needed to avoid crashes due to memory allocation
failure (each reported result uses about 128KB memory).
- web: In showing result lists,
include "Validate error" results in the "Invalid" category.
(Previously they didn't appear in any category)
svn path=/trunk/boinc/; revision=18104
- client (Unix): if client crashes while benchmark processes are going,
make sure they detect this and exit.
- back-end programs: remove hardwired assumptions about
what directory they run in, and hence where config.xml is.
E.g., daemons look for it in "..", others expect it in current dir.
New approach: all the programs look for the project dir as follows:
1) the environment var BOINC_PROJECT_DIR, if defined
2) the current dir, if config.xml is there.
3) else ".."
This means you can run programs in either proj/bin/ or proj/,
or (using BOINC_PROJECT_DIR) you can keep executables
outside of the project dir.
svn path=/trunk/boinc/; revision=18042
and add <cuda_multiplier>.
The latter is used in calculating max jobs/day for a host;
namely, it's host.max_results_day * (NCPUS + NCUDA*cuda_multiplier).
Set it to 10 or so if you have CUDA apps.
- scheduler: don't overload effective_ncpus();
instead, add two new functions,
max_results_day_multiplier() and max_wus_in_progress_multiplier()
- scheduler: don't reduce max_results_day if we get an aborted job
(it might have been aborted by the project;
not appopriate to punish host in this case)
svn path=/trunk/boinc/; revision=16959
If set the "effective NCPUS" (which is used to scale
daily_result_quota and max_wus_in_progress)
is max'd with the # of CUDA GPUs.
svn path=/trunk/boinc/; revision=16246