Commit Graph

26 Commits

Author SHA1 Message Date
David Anderson a6bf5aecf3 - client: tweak to work-fetch policy:
if we're making a scheduler RPC to a project for reasons
    other than work fetch,
    and we're deciding whether to ask for work, ignore hysteresis;
    i.e. ask for work even if we're above the min buffer
    (idea from John McLeod).


svn path=/trunk/boinc/; revision=25291
2012-02-18 23:19:06 +00:00
David Anderson 8cd608605f - feeder fix
svn path=/branches/server_stable/; revision=25113
2012-01-20 23:43:37 +00:00
David Anderson dd16170fc1 - scheduler: the p_fpops value reported by clients can't be trusted.
Some credit cheats (e.g. with credit_by_runtime) can be done
    by reporting a huge value.
    Fix this by capping the value at 1.1 times the 95th percentile
    of host.p_fpops, taken over active hosts.


svn path=/trunk/boinc/; revision=25017
2012-01-09 17:35:48 +00:00
Jeff Cobb 2e526a0956 server: more fixes to DB to handle unsigned result IDs
svn path=/trunk/boinc/; revision=24564
2011-11-09 20:24:48 +00:00
David Anderson 2840281c56 - boinccmd: --get_cc_status now prints its result
- feeder: don't panic if can't find app for result;
    if the app is deprecated, it won't be in shmem

svn path=/trunk/boinc/; revision=22674
2010-11-10 18:17:20 +00:00
David Anderson b169e5ab0f - server programs: print error message instead of numeric retval
in log messages

svn path=/trunk/boinc/; revision=22647
2010-11-08 17:51:57 +00:00
David Anderson b356552c9c - scheduler/feeder: add a project config option <dont_send_jobs>.
If set, the feeder doesn't read jobs into shmem,
    and the scheduler doesn't send jobs.
    Intended for use when a project wants to process
    a backlog of completed jobs and not issue more.

svn path=/trunk/boinc/; revision=22601
2010-10-28 19:02:19 +00:00
David Anderson ac83e1e9f7 - client: fix bug with the <max_tasks_reported> config option.
If # of ready-to-report tasks > max_tasks_reported,
    then the excess ready-to-report tasks weren't getting
    reported to the scheduler at all (i.e. not in <other_results> either)
    so the scheduler would resend them
    (not a fatal problem, but a waste of bandwidth).
    From Josef Segur.

svn path=/trunk/boinc/; revision=22500
2010-10-13 23:21:19 +00:00
David Anderson e4763dc759 - web: server_status.php is not an ops page
svn path=/trunk/boinc/; revision=22381
2010-09-17 03:45:39 +00:00
David Anderson 1dfeecf698 - feeder: don't error out when an ordering option is used with HR;
if some apps don't use HR the ordering option will apply to them.

svn path=/trunk/boinc/; revision=22238
2010-08-14 23:16:11 +00:00
David Anderson 969100a00c - feeder: error out if an ordering option (e.g. --priority)
is used in combination with homogeneous redundancy.
    HR requires a cyclic scan of all sendable results.

svn path=/trunk/boinc/; revision=21973
2010-07-16 18:38:14 +00:00
David Anderson 7c51512cbf - transitioner: the format string for a DB query had %.15d instead of %.15e.
That produced a messed-up query that assigned garbage values to:
        host_app_version.turnaround_var
        host_app_version.turnaround_q
        host_app_version.max_jobs_per_day
        host_app_version.consecutive_valid
    To repair these:
        - set turnaround_var and turnaround_q to zero
        - if max_jobs_per_day is outside of
            (0..config.daily_result_quota)
            set it to config.daily_result_quota
        - if consecutive_valid is outside (0..1000), set it to zero
    I added a script, html/ops/repair_21812.php, that does this;
    if you ran server code between [21181] and [21812], run this script.
- scheduler/transitioner: add <debug_quota> log flag
- changed the build system to always use -Wall
    (if we'd done this before, this bug wouldn't have happened)
- fixed a bunch of other compile warnings


svn path=/trunk/boinc/; revision=21812
2010-06-25 18:54:37 +00:00
David Anderson b2451544e1 - server: change the following from per-host to per-(host, app version):
- daily quota mechanism
    - reliable mechanism (accelerated retries)
    - "trusted" mechanism (adaptive replication)
- scheduler: enforce host scale probation only for apps with
    host_scale_check set.
- validator: do scale probation on invalid results
    (need this in addition to error and timeout cases)
- feeder: update app version scales every 10 min, not 10 sec
- back-end apps: support --foo as well as -foo for options

Notes:
- If you have, say, cuda, cuda23 and cuda_fermi plan classes,
    a host will have separate quotas for each one.
    That means it could error out on 100 jobs for cuda_fermi,
    and when its quota goes to zero,
    error out on 100 jobs for cuda23, etc.
    This is intentional; there may be cases where one version
    works but not the others.
- host.error_rate and host.max_results_day are deprecated

TODO:
    - the values in the app table for limits on jobs in progress etc.
        should override rather than config.xml.

Implementation notes:
scheduler:
    process_request():
        read all host_app_versions for host at start;
        Compute "reliable" and "trusted" for each one.
        write modified records at end
    get_app_version():
        add "reliable_only" arg; if set, use only reliable versions
        skip over-quota versions
    Multi-pass scheduling: if have at least one reliable version,
        do a pass for jobs that need reliable,
        and use only reliable versions.
        Then clear best_app_versions cache.
    Score-based scheduling: for need-reliable jobs,
        it will pick the fastest version,
        then give a score bonus if that version happens to be reliable.
    When get back a successful result from client:
        increase daily quota
    When get back an error result from client:
        impose scale probation
        decrease daily quota if not aborted
Validator:
    when handling a WU, create a vector of HOST_APP_VERSION
        parallel to vector of RESULT.
        Pass it to assign_credit_set().
        Make copies of originals so we can update only modified ones
    update HOST_APP_VERSION error rates
Transitioner:
    decrease quota on timeout


svn path=/trunk/boinc/; revision=21181
2010-04-15 03:13:56 +00:00
David Anderson 515113d7dd - server: change all backend programs so that -d 4 means
-d 3 plus print DB queries

svn path=/trunk/boinc/; revision=21106
2010-04-05 21:59:33 +00:00
David Anderson fb851311e0 - server: various changes;
see http://boinc.berkeley.edu/trac/wiki/CreditNew

    Projects will need to update DB and recompile all back-end programs.

    Summary:
    - new way of computing credit
    - "reliable host" mechanism is per app version
    - "host punishment" mechanism is per app version
    - adjustment of wu.rsc_fpops_est provides the
        equivalent of per app version DCF
    - max jobs in progress is now per app
    - max jobs per RPC is now per app

    TODO:
    - reliable mechanism:
        - populate and use host_app_version.error_rate
        - populate host_app_version.turnaround
    - host punishment:
        - populate host_app_version.max_jobs_per_day
        - populate host_app_version.n_jobs_today
        - use app.max_jobs_per_day_init
    - job limits:
        - use app.max_jobs_in_progress, max_gpu_jobs_in_progress
        - use app.max_jobs_per_rpc
    - adjust wu.rsc_fpops_est
    - remove old credit stuff
        fpops_cumulative, credit_multiplier
        credit computation in scheduler

- AVERAGE class: use the Knuth algorithm (Wikipedia)


svn path=/trunk/boinc/; revision=21021
2010-03-29 22:28:20 +00:00
David Anderson d40be2dbf7 - feeder: compile fix
svn path=/trunk/boinc/; revision=20987
2010-03-23 17:26:05 +00:00
David Anderson f198dc76ad - feeder: with -allapps option, allow some apps to have zero weights;
no jobs will be sent for them.


svn path=/trunk/boinc/; revision=20980
2010-03-22 20:12:24 +00:00
David Anderson 75d2a45491 - server programs: add --help and --version cmdline options to all.
From Nils Chr. Brause.

svn path=/trunk/boinc/; revision=19079
2009-09-17 17:56:59 +00:00
David Anderson 8b701fc73f - scheduler: fix messed-up deadline check logic.
Old:
        1) check deadline based on wu.delay_bound
        2) in add_result_to_reply(), potentially modify wu.delay_bound,
            e.g. because of retry acceleration
        problem: reducing delay bound may cause deadline miss
    New:
        1) new function get_delay_bound_range()
            (called from wu_is_infeasible_fast())
            returns optimistic and pessimistic delay bounds.
            Retry acceleration logic is here.
        2) check deadline based on optimistic bound;
            if that fails, check based on pessimistic bound.
            Set wu.delay_bound to the one that worked.
    Notes:
    - get_delay_bound_range() needs result priority and report deadline,
        and it's called before we read the full result.
        So add these items to WORK_ITEM and WU_RESULT.
    - get_delay_bound_range() could be customized for
        project-specific deadline policy.
    - add_result_to_reply() was becoming a toxic waste dump.
        Deadline-related stuff should have been factored out in any case.

svn path=/trunk/boinc/; revision=18946
2009-08-31 19:35:46 +00:00
David Anderson 8333c35c81 - feeder: process array slots even if enum has ended;
this is needed to handle stale entries and slots
    reserved by now-dead PIDs
- client: unify code for writing soft link files

svn path=/trunk/boinc/; revision=18256
2009-06-02 00:22:45 +00:00
David Anderson 12eb6057e5 - client, Mac: don't do res_init(). It causes a crash.
- client (Unix): if client crashes while benchmark processes are going,
    make sure they detect this and exit.
- back-end programs: remove hardwired assumptions about
    what directory they run in, and hence where config.xml is.
    E.g., daemons look for it in "..", others expect it in current dir.
    New approach: all the programs look for the project dir as follows:
    1) the environment var BOINC_PROJECT_DIR, if defined
    2) the current dir, if config.xml is there.
    3) else ".."
    This means you can run programs in either proj/bin/ or proj/,
    or (using BOINC_PROJECT_DIR) you can keep executables
    outside of the project dir.


svn path=/trunk/boinc/; revision=18042
2009-05-07 13:54:51 +00:00
David Anderson 6262401394 - feeder: add -appids option: lets you specify which apps to
get jobs for (default it all).
    Useful if you're mixing locality and regular scheduling.
- a little E@h-specific stuff
From Bernd Machenschalk.


svn path=/trunk/boinc/; revision=18039
2009-05-06 21:52:50 +00:00
David Anderson e36e700f22 svn path=/trunk/boinc/; revision=17430 2009-03-03 00:14:51 +00:00
David Anderson ebe3b090e8 - add a script "upgrade_db.php" that updates project DB structure
(after user confirmation).
    This is called from "upgrade", and can also be run by itself.

    NOTE: this mechanism will handle all DB updates going forward.
    Older updates must be done the old way (edit and run db_update.php)

- Web: let teams determine whether they're accepting new members


svn path=/trunk/boinc/; revision=16160
2008-10-08 16:48:11 +00:00
David Anderson 57b92fb40a - scheduler: #ifdef'd tweaks for server simulator
svn path=/trunk/boinc/; revision=16097
2008-09-30 18:21:41 +00:00
David Anderson 98cfb8d3b0 - rename .C files to .cpp so that Doxygen will work
svn path=/trunk/boinc/; revision=16069
2008-09-26 18:20:24 +00:00