sched_customize.h. Their definitions in source are now protected with #ifdef
to prevent warnings.
- Created #defined parameters for GPU memory requirements of the GPU plan
classes
- Fixed some logging problems in sched_customize.h and expanded logging in some
place in the GPU plan class selection.
- Reeplaced opencl_*_101 plan classes with more generic (but compatible)
opencl_*_<ver> plan classes where the opencl version is determined from the
plan class name. This is so we don't need to write new plan classes to go
from openc 1.1 to opencl 1.2. Just change the name of the plan class from
opencl_nvidia_101 to opencl_nvidia_102 and it's done.
(but not all) wasn't finished.
New logic: if the project has an NCI app then:
- make a list of NCI apps for which the client doesn't have
a job in progress.
- try to send one job for each of these apps
- do this even if no work is being requested.
- don't send jobs for NCI apps by other mechanisms
NOTE: the client logic isn't quite right for mixed NCI projects.
If there's no job for a given NCI app,
the client should do a scheduler RPC.
This isn't critical so we won't do this now.
svn path=/trunk/boinc/; revision=26068
Some credit cheats (e.g. with credit_by_runtime) can be done
by reporting a huge value.
Fix this by capping the value at 1.1 times the 95th percentile
of host.p_fpops, taken over active hosts.
svn path=/trunk/boinc/; revision=25017
- whether host is "reliable" for an app version
- whether host is eligible for single replication for an app version
- whether to use host scaling
In each case, the answer is yes if the number of
consecutive valid results is above a threshold.
This replaces existing "error rate" and "scale probation" mechanisms.
TODO: the # of consecutive valid results should also determine
a limit on jobs in progress for an app version.
Namely, if N is the threshold for host scaling, the limit should be
ndevices*(max(1, consecutive_valid - N))
The client currently doesn't supply enough
app version info to do this.
It could be approximated; that would give some protection
against cherry-picking.
- credit: more conservative formulas for combining claimed credit
among replicas.
If there are normal replicas, we use a "low average"
that weights each sample by the sum of the other samples.
Otherwise we use the min (not the average) of the approximate samples.
NOTE: a DB update is required
svn path=/trunk/boinc/; revision=21230
Old:
1) check deadline based on wu.delay_bound
2) in add_result_to_reply(), potentially modify wu.delay_bound,
e.g. because of retry acceleration
problem: reducing delay bound may cause deadline miss
New:
1) new function get_delay_bound_range()
(called from wu_is_infeasible_fast())
returns optimistic and pessimistic delay bounds.
Retry acceleration logic is here.
2) check deadline based on optimistic bound;
if that fails, check based on pessimistic bound.
Set wu.delay_bound to the one that worked.
Notes:
- get_delay_bound_range() needs result priority and report deadline,
and it's called before we read the full result.
So add these items to WORK_ITEM and WU_RESULT.
- get_delay_bound_range() could be customized for
project-specific deadline policy.
- add_result_to_reply() was becoming a toxic waste dump.
Deadline-related stuff should have been factored out in any case.
svn path=/trunk/boinc/; revision=18946
- Update to libtool 1.5.24
- build environment: Major automake changes that I've been warning about
for some time.
- Now uses libtool to build libraries.
- Builds separate boinc_fcgi and sched_fcgi libraries for use with
FCGI server components.
- New macro "BOINC_CHECK_LIB_WITH" that executes a "AC_CHECK_LIB" on
a library only if --with-libname[=DIR] is specified on the configure
command line. This is to allow inclusion of libraries when the
ssl, gtk, wxWidgets, or other configuration is incorrect for static
libraries.
- Added a lot of "--with-*" for some libraries that might be required for
static builds.
- The sea directory has been moved to packages/generic. Changes to sea
and the associated scripts might be required to better make use of the
staging mechanism and shared libraries.
- Fixed includes of boinc_fcgi.h in many files.
- Fixed places where FCGI_FILE needs to be used implicitly.
- Fixed missing define of _SC_PAGESIZE on hosts that define only
_SC_PAGE_SIZE.
- Moved build of boinc_cmd (and source file) from lib to client
svn path=/trunk/boinc/; revision=16904
- scheduler: fix bug in adaptive replication:
if send an unreplicated job to untrusted host,
set both wu.target_nresults and wu.min_quorum to app.target_nresults.
svn path=/trunk/boinc/; revision=15762
(attempt to send big jobs to fast hosts, small jobs to slow hosts).
- have "census" compute mean/stdev of host speeds,
write it to a file perf_info.txt
- have feeder compute mean/stdev of sizes of jobs in shmem
- have feeder read perf_info.txt into shmem
- scheduler: add some debugging messages for app version selection
- Add LGPL license to a few files
- upgrade/setup scripts: copy census to bin/
svn path=/trunk/boinc/; revision=15136
and apps that use coprocessors.
There now can be several app_versions for the same
(app, platform, version_num) combination.
This changes a number of things.
- Added app_version.plan_class field to DB
- update_versions now looks for a :plan-class in the
file or directory name, and puts it in the app_version's DB record
- Change uniqueness constraint to include plan_class
- Feeder: the feeder was putting non-deprecated app_versions
in shared mem, and leaving it to the scheduler to
find the latest version for a given platform.
This is dumb.
Instead, for each app/platform pair the feeder now
finds the highest version number of a non-deprecated app version,
and enumerates all non-deprecated app_versions with that
app/platform/version
- Scheduler: add a BEST_APP_VERSION data structure that keeps track,
for each app, what the best app_version is for this host.
This saves the work of recomputing it for each job.
svn path=/trunk/boinc/; revision=14906
Lets you assign a WU to a particular host,
to one or all hosts belonging to a user or team, or to all hosts.
See http://boinc.berkeley.edu/trac/wiki/AssignedWork
Disabled unless you include <enable_assignment> in config.xml
Uses a new DB table.
Tested but only a little.
- Server: code cleanup; moved result-handling to a new file,
and removed the PLATFORM_LIST arg to everything
(put it in SCHEDULER_REQUEST instead)
svn path=/trunk/boinc/; revision=14767
This cause the core client to exit immediately before or after
running a job,
letting you examine the contents of the slot directory.
- scheduler: changed max # of CPUs used in daily_result_quota
limit from 4 to 8, and make it a compile-time parameter
- feeder/scheduler: make the number of work items in shared
memory configurable (in config.xml).
The element is <shmem_work_items>
- feeder: make the size of the work item query configurable
(<feeder_query_size)
- feeder: remove code related to removing infeasible results
from shared mem.
This mechanism was never needed,
and I think a timeout would accomplish the same effect.
client/
app.C
app_start.C
client_state.C,h
cs_cmdline.C
sched/
feeder.C
sched_array.C
sched_config.C,h
sched_send.C
sched_shmem.C,h
sched_util.C
show_shmem.C
svn path=/trunk/boinc/; revision=12771
and add <platform> and <version_num> elements to <result>
(server half of multi-version changes)
- scheduler: make ssp a global; could eliminate from args everywhere
db/
boinc_db.h
sched/
feeder.C
main.C,h
sched_send.C
sched_shmem.C,h
server_types.C
svn path=/trunk/boinc/; revision=12535