Commit Graph

124 Commits

Author SHA1 Message Date
David Anderson 6d1133fb1d - scheduler: add <user_filter> config option.
If set, and a WU has nonzero batch,
    it is interpreted as a user ID,
    and the job will be sent only to hosts with that user ID.

    Note: the use of workunit.batch is arbitrary;
    we could also use workunit.opaque or other deprecated field.


svn path=/trunk/boinc/; revision=23556
2011-05-17 21:11:39 +00:00
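The <user_filter> rule described in revision 23556 above can be sketched roughly as follows. This is an illustrative Python sketch, not the scheduler's C++ code; the function name and parameters are hypothetical.

```python
def may_send_to_host(user_filter_enabled: bool, wu_batch: int, host_userid: int) -> bool:
    """Return True if a job may be sent to a host under the <user_filter> rule.

    Hypothetical helper: the names here are illustrative, not BOINC's own.
    """
    # With the option off, or with a zero batch, no restriction applies.
    if not user_filter_enabled or wu_batch == 0:
        return True
    # Otherwise workunit.batch is read as a user ID and must match the host's user.
    return wu_batch == host_userid
```

For example, with the option set, a WU whose batch is 7 would go only to hosts owned by user 7, while a WU with batch 0 would go to any host.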
David Anderson 53a7307305 - scheduler: fix nasty bug introduced in [23040]
that caused no jobs to be sent.


svn path=/trunk/boinc/; revision=23096
2011-02-23 21:22:45 +00:00
David Anderson 5421335dbb - transitioner: fix bug that could cause file deletion to not be done
for some WUs
- back end: fix the way "report grace period" is implemented
    old: result.report_deadline (i.e. what's in the DB) and
        the deadline sent to the client are the same.
        Some confusing and incorrect logic in the transitioner
        tries to provide the desired semantics.
    new: result.report_deadline is the deadline sent to the client,
        plus the grace period.
        No logic in the transitioner is needed.


svn path=/trunk/boinc/; revision=23040
2011-02-15 22:07:14 +00:00
David Anderson 43a3036101 - back end: allow the specification of a read-only DB replica
(in config.xml) to include DB name, user, and password.
- back end: add read-only replica info to SCHED_CONFIG,
    so that C++ programs can use the replica
    (currently only PHP code can use it)
- db_dump: use the read-only DB replica if it exists.


svn path=/trunk/boinc/; revision=22958
2011-01-28 22:03:46 +00:00
David Anderson b356552c9c - scheduler/feeder: add a project config option <dont_send_jobs>.
If set, the feeder doesn't read jobs into shmem,
    and the scheduler doesn't send jobs.
    Intended for use when a project wants to process
    a backlog of completed jobs and not issue more.

svn path=/trunk/boinc/; revision=22601
2010-10-28 19:02:19 +00:00
David Anderson 84679f482a - scheduler: change the "primary_platform_only" config option
to "prefer_primary_platform".
    If an app has only 32-bit versions, use them for 64-bit clients.


svn path=/trunk/boinc/; revision=22282
2010-08-22 19:13:25 +00:00
David Anderson d79ca6a9f2 - scheduler: add <primary_platform_only> config option:
send only 64-bit app versions to 64-bit hosts 
    (the default is to send whatever app version is fastest)

svn path=/trunk/boinc/; revision=22183
2010-08-10 22:17:59 +00:00
David Anderson e0cea31781 - API: add result name to APP_INFO_DATA structure (for Volpex)
- scheduler: add max_download_urls_per_file config option
    (to limit the length of workunit.xml_doc,
    which is currently capped at 64KB).
    From Bernd.

svn path=/trunk/boinc/; revision=22082
2010-07-30 21:43:23 +00:00
David Anderson c0776ea188 - user web: put RSS item titles in CDATA
- sched: get rid of unused config items
- manager: msg tweak

svn path=/trunk/boinc/; revision=22045
2010-07-22 22:57:15 +00:00
David Anderson 0f613d61d8 - scheduler and client: fix the "allow multiple clients" feature.
This feature lets you run the BOINC client as a job on grid systems
    that handle only 1-CPU jobs;
    it disables various mechanisms that prevent multiple clients per host
    (which is normally a bad thing).
    Old:
        - Run the client with a --allow_multiple_clients flag.
            This tells it not to use a mutex that prevents
            multiple clients per host.
        - Run the project with the <multiple_clients_per_host> config flag.
            This suppresses two mechanisms:
            - (avoid duplicate host records)
                on a scheduler request with no host ID,
                looks for a host with same domain name, OS type,
                and mem size, and assumes the request is from that host
            - (job retry)
                If we get a request that doesn't have a host ID
                but does have a host CPID,
                mark its in-progress results as over
                NOTE: I CAN'T REMEMBER WHY WE SUPPRESS THIS;
                MARK S, DO YOU REMEMBER?

    Problem:
        if the grid clients attach to a project that
        doesn't use <multiple_clients_per_host>, bad things happen.
        E.g., if there are several requests at about the same time,
        most of them will fail with
        "another RPC already in progress" errors.
        If a project does include this flag,
        it loses protection from duplicate host records.

    New:
        - If the client is run with --allow_multiple_clients flag,
            it passes a <allow_multiple_clients> element
            in scheduler requests.
        - The scheduler skips the duplicate-host check on
            requests that include this flag.
        - There is no more <multiple_clients_per_host> scheduler option.

    Note: if a project using the old mechanism upgrades to this change,
    it will need to use new clients for its grid deployment.


svn path=/trunk/boinc/; revision=21839
2010-06-29 16:37:28 +00:00
David Anderson 7c51512cbf - transitioner: the format string for a DB query had %.15d instead of %.15e.
That produced a messed-up query that assigned garbage values to:
        host_app_version.turnaround_var
        host_app_version.turnaround_q
        host_app_version.max_jobs_per_day
        host_app_version.consecutive_valid
    To repair these:
        - set turnaround_var and turnaround_q to zero
        - if max_jobs_per_day is outside of
            (0..config.daily_result_quota)
            set it to config.daily_result_quota
        - if consecutive_valid is outside (0..1000), set it to zero
    I added a script, html/ops/repair_21812.php, that does this;
    if you ran server code between [21181] and [21812], run this script.
- scheduler/transitioner: add <debug_quota> log flag
- changed the build system to always use -Wall
    (if we'd done this before, this bug wouldn't have happened)
- fixed a bunch of other compile warnings


svn path=/trunk/boinc/; revision=21812
2010-06-25 18:54:37 +00:00
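The repair rules listed in revision 21812 above can be sketched as below. This is a hedged Python illustration, not the contents of html/ops/repair_21812.php; the function name and the dict-based row are hypothetical, and treating the ranges as inclusive at both ends is an assumption.

```python
def repair_host_app_version(row: dict, daily_result_quota: int) -> dict:
    """Return a repaired copy of one host_app_version row (illustrative only)."""
    fixed = dict(row)
    # The turnaround statistics are simply reset.
    fixed["turnaround_var"] = 0.0
    fixed["turnaround_q"] = 0.0
    # Assumption: the range 0..daily_result_quota is inclusive at both ends.
    if not (0 <= fixed["max_jobs_per_day"] <= daily_result_quota):
        fixed["max_jobs_per_day"] = daily_result_quota
    # Assumption: the range 0..1000 is inclusive at both ends.
    if not (0 <= fixed["consecutive_valid"] <= 1000):
        fixed["consecutive_valid"] = 0
    return fixed
```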
David Anderson 4147249de2 - server: delete old credit stuff
- user web: show host link in user result list.  Fixes #999


svn path=/trunk/boinc/; revision=21735
2010-06-12 22:08:15 +00:00
David Anderson 356327d88c - scheduler: change backoff policy if a host has reached daily job quota.
    Old: back off until random time in 1st hour of next day
    New: no server-dictated backoff; rely on client backoff
    This is needed to let hosts recover in a reasonable amount of time
    after a burst of errors.
- scheduler config: it turns out we can't put arbitrary XML in config.xml;
    The Python code is set up to parse only 1 level of tags (??),
    and I'm not up to the task of changing this.
    So the fine-grained job limit feature [21674] needs to use
    a different file, namely config_aux.xml

svn path=/trunk/boinc/; revision=21686
2010-06-03 04:59:27 +00:00
David Anderson cf7fb29227 - scheduler: add fine-grained "max jobs in progress" control.
You can now specify limits for specific apps,
    and/or for the project as a whole.
    Within each of these, you can specify limits on
    CPU jobs, GPU jobs, or total jobs.
    In the case of CPU and GPU limits, you can specify
    whether the limit should be scaled by the number of devices.

    Note: the enforcement of this is done in get_app_version(),
    since per-resource-type limits may dictate what app versions
    we can use for a particular job.

svn path=/trunk/boinc/; revision=21674
2010-06-01 23:41:07 +00:00
David Anderson d45d3b488f - server: code cleanup
svn path=/trunk/boinc/; revision=21664
2010-06-01 03:45:49 +00:00
David Anderson 9187cb52ba - client and scheduler RPC:
Add more info to "project in-progress job list".
    Old: entries included only job name and app plan class;
        this was used to resend lost jobs,
        and to count the # of CPU and GPU jobs.
        But it's not usable e.g. for per-app in-progress limits.
    New: send the client's app versions (including usage info)
        and for each in-progress job, which app version it uses.
        (This reduces request-message size compared with sending
        usage info and app name per job).
- client and scheduler RPC:
    Add more info to "all in-progress job list", and make it optional.
    This list is used by schedulers that do deadline checks
    using EDF workload simulation.
    Old: the list is always sent, and it contains no info
        about job resource usage
    New: the list is sent only if the scheduler asked for it
        in a previous reply,
        and each entry now contains resource usage (CPU, GPUs)
    Note: the scheduler's EDF simulator is outdated;
        it doesn't know about GPU jobs.
        But we may as well get the info in place.


svn path=/trunk/boinc/; revision=21513
2010-05-13 20:18:27 +00:00
David Anderson 5035007b90 - back end: new way of deciding:
    - whether host is "reliable" for an app version
    - whether host is eligible for single replication for an app version
    - whether to use host scaling
    In each case, the answer is yes if the number of
    consecutive valid results is above a threshold.
    This replaces existing "error rate" and "scale probation" mechanisms.

    TODO: the # of consecutive valid results should also determine
        a limit on jobs in progress for an app version.
        Namely, if N is the threshold for host scaling, the limit should be
            ndevices*(max(1, consecutive_valid - N))
        The client currently doesn't supply enough
        app version info to do this.
        It could be approximated; that would give some protection
        against cherry-picking.
- credit: more conservative formulas for combining claimed credit
    among replicas.
    If there are normal replicas, we use a "low average"
    that weights each sample by the sum of the other samples.
    Otherwise we use the min (not the average) of the approximate samples.

NOTE: a DB update is required


svn path=/trunk/boinc/; revision=21230
2010-04-21 19:33:20 +00:00
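One possible reading of the credit-combining rule in revision 21230 above is sketched here in Python. The function names are hypothetical, samples are assumed to be non-negative, and the fallback to a plain mean when the weights sum to zero (a single sample, or all zeros) is an assumption not stated in the checkin note.

```python
def low_average(samples):
    """Weighted mean in which each sample's weight is the sum of the other samples."""
    total = sum(samples)
    weights = [total - x for x in samples]
    weight_sum = sum(weights)
    if weight_sum <= 0:
        # Assumed fallback: one sample or all zeros, so use the plain mean.
        return total / len(samples) if samples else 0.0
    return sum(w * x for w, x in zip(weights, samples)) / weight_sum


def combine_claimed_credit(normal, approximate):
    """Use the low average of normal samples if any exist, else the min of the approximate ones."""
    if normal:
        return low_average(normal)
    # Assumes at least one approximate sample is present.
    return min(approximate)
```

Because a large sample gets a small weight (the sum of the others), an outlying high claim pulls the result up far less than it would in a plain average.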
David Anderson 021edb02c2 - back end programs: improve log msgs
svn path=/trunk/boinc/; revision=21193
2010-04-16 18:07:08 +00:00
David Anderson fb851311e0 - server: various changes;
see http://boinc.berkeley.edu/trac/wiki/CreditNew

    Projects will need to update DB and recompile all back-end programs.

    Summary:
    - new way of computing credit
    - "reliable host" mechanism is per app version
    - "host punishment" mechanism is per app version
    - adjustment of wu.rsc_fpops_est provides the
        equivalent of per app version DCF
    - max jobs in progress is now per app
    - max jobs per RPC is now per app

    TODO:
    - reliable mechanism:
        - populate and use host_app_version.error_rate
        - populate host_app_version.turnaround
    - host punishment:
        - populate host_app_version.max_jobs_per_day
        - populate host_app_version.n_jobs_today
        - use app.max_jobs_per_day_init
    - job limits:
        - use app.max_jobs_in_progress, max_gpu_jobs_in_progress
        - use app.max_jobs_per_rpc
    - adjust wu.rsc_fpops_est
    - remove old credit stuff
        fpops_cumulative, credit_multiplier
        credit computation in scheduler

- AVERAGE class: use the Knuth algorithm (Wikipedia)


svn path=/trunk/boinc/; revision=21021
2010-03-29 22:28:20 +00:00
Rytis Slatkevičius f239587bdb Sched: config option not to store stderr_out if exit_status==0 (to save on DB size). With help from Nicolas Alvarez.
svn path=/trunk/boinc/; revision=18528
2009-06-30 18:00:58 +00:00
David Anderson 2e5d9bd778 - scheduler: add new config option <max_wus_in_progress_gpus>.
The limit on jobs in progress is now
        max_wus_in_progress * NCPUS
        + max_wus_in_progress_gpu * NGPUS
    where NCPUS and NGPUS reflect prefs and are capped.
    Furthermore: if the client reports plan class for in-progress jobs
    (see checkin of 31 May 2009)
    then these limits are enforced separately;
    i.e. the # of in-progress CPU jobs is <= max_wus_in_progress*NCPUS,
    and the # of in-progress GPU jobs is <= max_wus_in_progress_gpu*NGPUS
- scheduler config: rename <cuda_multiplier> to <gpu_multiplier>
- scheduler: <max_wus_to_send> is now scaled by
    (NCPUS + gpu_multiplier*NGPUS)
- scheduler: don't keep scanning array if !work_needed()
- scheduler: moved array-scan logic from sched_send.cpp to sched_array.cpp
- scheduler: don't say "no work available" if jobs are available
    but work_needed() is initially false


svn path=/trunk/boinc/; revision=18255
2009-06-01 22:15:14 +00:00
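The limits described in revision 18255 above work out as in the following Python sketch. The function names are hypothetical, the per-resource limits follow the separate CPU and GPU bounds stated in that entry, and the capping of NCPUS and NGPUS by preferences is not modeled.

```python
def in_progress_limits(max_wus_in_progress, max_wus_in_progress_gpu, ncpus, ngpus):
    """Return (cpu_limit, gpu_limit, total_limit) for jobs in progress."""
    cpu_limit = max_wus_in_progress * ncpus
    gpu_limit = max_wus_in_progress_gpu * ngpus
    return cpu_limit, gpu_limit, cpu_limit + gpu_limit


def scaled_max_wus_to_send(max_wus_to_send, gpu_multiplier, ncpus, ngpus):
    """Scale <max_wus_to_send> by (NCPUS + gpu_multiplier*NGPUS)."""
    return max_wus_to_send * (ncpus + gpu_multiplier * ngpus)
```

For instance, with max_wus_in_progress = 2, max_wus_in_progress_gpu = 3, 4 CPUs and 1 GPU, the CPU limit is 8, the GPU limit is 3, and the overall limit is 11.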
David Anderson c2fda4db09 - scheduler: add <report_max> config parameter;
limits the # of completed results handled per scheduler RPC.
    This may be needed to avoid crashes due to memory allocation
    failure (each reported result uses about 128KB memory).
- web: In showing result lists,
    include "Validate error" results in the "Invalid" category.
    (Previously they didn't appear in any category)

svn path=/trunk/boinc/; revision=18104
2009-05-14 19:01:40 +00:00
David Anderson 12eb6057e5 - client, Mac: don't do res_init(). It causes a crash.
- client (Unix): if client crashes while benchmark processes are going,
    make sure they detect this and exit.
- back-end programs: remove hardwired assumptions about
    what directory they run in, and hence where config.xml is.
    E.g., daemons look for it in "..", others expect it in current dir.
    New approach: all the programs look for the project dir as follows:
    1) the environment var BOINC_PROJECT_DIR, if defined
    2) the current dir, if config.xml is there.
    3) else ".."
    This means you can run programs in either proj/bin/ or proj/,
    or (using BOINC_PROJECT_DIR) you can keep executables
    outside of the project dir.


svn path=/trunk/boinc/; revision=18042
2009-05-07 13:54:51 +00:00
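The three-step project directory lookup from revision 18042 above could look roughly like this in Python. The back-end programs themselves are C++; the function name and its parameters are hypothetical, and treating an empty BOINC_PROJECT_DIR as "defined" is an assumption.

```python
import os


def find_project_dir(environ=None, cwd="."):
    """Locate the project directory using the three-step rule (illustrative sketch)."""
    environ = os.environ if environ is None else environ
    # 1) the environment variable BOINC_PROJECT_DIR, if defined
    env_dir = environ.get("BOINC_PROJECT_DIR")
    if env_dir is not None:
        return env_dir
    # 2) the current directory, if config.xml is there
    if os.path.exists(os.path.join(cwd, "config.xml")):
        return cwd
    # 3) otherwise the parent directory
    return os.path.join(cwd, "..")
```

Passing `environ` and `cwd` explicitly is only a convenience for trying the rule out without changing the real environment or working directory.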
David Anderson 41ed82f791 - scheduler: fix bugs that caused only 1 job to be sent
svn path=/trunk/boinc/; revision=17555
2009-03-07 01:00:05 +00:00
David Anderson 66ec889431 - scheduler: add <locality_scheduling_sticky_file>
and <locality_scheduling_workunit_file> options
    From Bernd M.

svn path=/trunk/boinc/; revision=17431
2009-03-03 00:25:41 +00:00
David Anderson aadf813336 - scheduler/feeder: add <locality_scheduler_fraction> option;
lets you intermix locality and job-cache scheduling
    From Bernd M.

svn path=/trunk/boinc/; revision=17429
2009-03-03 00:12:55 +00:00
David Anderson 2d707927ab - scheduler: replace choose_download_url_by_timezone with
replace_download_url_by_timezone.


svn path=/trunk/boinc/; revision=17427
2009-03-02 23:47:11 +00:00
Eric J. Korpela 8f3abcc835 - Added checks for net/*.h, arpa/*.h, netinet/*.h and code to figure out
which of those files to include
    - Modified MAC address check to work on some non-Linux unixes.
      (mac_address.cpp)
    - Added suggested change to "already attached to project" checking.
      (ProjectInfoPage.cpp)
    - changed includes of standard c header files to their c++ equivalents
      (i.e. replaced <stdio.h> with <cstdio>) for namespace protection.
    - replaced "using namespace std;" with more explicit "using std::function" in
      several files.
    - Fixed bug in checking whether the os is OS/2 and added conditional OS_OS2
      to the build environment. (boinc_platform.m4,configure.ac)
    - Changed build environment to not use -nostandardlibs unless we are using
      G++ and static linkage is specified. (configure.ac)
    - Added makefiles and package building files for solaris CSW package manager.
    - Fixed bug with attempting to find login name using logname. (configure.ac)
    - Added ifdef HAVE_* protection around some include files commonly found in
      sys.
    - Added support for unified binary for x86_64/i686-pc-solaris.
      (cs_platforms.cpp)
    - generate_host_cpid() now uses MAC address on non-linux unix.
      (hostinfo_network.cpp)
    - Macro BOINC_SET_COMPILE_FLAGS now doesn't check gcc only flags on non-gcc
      compilers. (boinc_set_compile_flags.m4)
    - Library compiles no longer depend upon the library extension or require
      the library to be prefixed with lib.
    - More fixes for fcgi builds.
    - Added declaration of "struct ether_addr" and ether_ntoa().  Have not yet
      implemented ether_ntoa() for machines that don't have it, or where it is
      buggy.  (unix_util.h)
    - Added FCGI::perror() which calls FCGI_perror(). (boinc_fcgi.{h,cpp})
    - Fixed library Makefiles so that all required headers get installed.


svn path=/trunk/boinc/; revision=17388
2009-02-26 00:23:23 +00:00
David Anderson 85a8e6a772 - scheduler: remove the config flag <have_cuda_apps>,
and add <cuda_multiplier>.
    The latter is used in calculating max jobs/day for a host;
    namely, it's host.max_results_day * (NCPUS + NCUDA*cuda_multiplier).
    Set it to 10 or so if you have CUDA apps.
- scheduler: don't overload effective_ncpus();
    instead, add two new functions,
    max_results_day_multiplier() and max_wus_in_progress_multiplier()
- scheduler: don't reduce max_results_day if we get an aborted job
    (it might have been aborted by the project;
    not appropriate to punish host in this case)

svn path=/trunk/boinc/; revision=16959
2009-01-20 00:54:16 +00:00
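As a worked example of the <cuda_multiplier> formula in revision 16959 above, here is a small Python sketch; the function name is hypothetical.

```python
def max_jobs_per_day(max_results_day, ncpus, ncuda, cuda_multiplier):
    """host.max_results_day * (NCPUS + NCUDA*cuda_multiplier)."""
    return max_results_day * (ncpus + ncuda * cuda_multiplier)
```

With max_results_day = 100, 4 CPUs, 1 CUDA GPU and cuda_multiplier = 10, the daily limit comes to 100 * (4 + 1*10) = 1400 jobs.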
David Anderson 91e120b3f4 - scheduler: improve message formatting; add <debug_locality> flag
for locality scheduling messages

svn path=/trunk/boinc/; revision=16921
2009-01-15 20:23:20 +00:00
David Anderson 14f6f9a386 - scheduler: add <ignore_dcf> option;
use this temporarily when you've fixed FLOPS estimate


svn path=/trunk/boinc/; revision=16672
2008-12-12 17:03:54 +00:00
David Anderson 5039207e2c - scheduler: add <have_cuda_apps> config flag.
If set, the "effective NCPUS" (which is used to scale
    daily_result_quota and max_wus_in_progress)
    is max'd with the # of CUDA GPUs.

svn path=/trunk/boinc/; revision=16246
2008-10-21 23:16:07 +00:00
David Anderson bb9d546a02 - scheduler: add <no_vista_sandbox> option.
If set, don't send work to sandboxed Vista clients
    (e.g., because of CUDA issue)

svn path=/trunk/boinc/; revision=16105
2008-10-01 19:48:52 +00:00
David Anderson 4ad5249bf2 - scheduler: various bug fixes in score-based schedule;
get rid of no_darwin_6 option


svn path=/trunk/boinc/; revision=16015
2008-09-17 23:35:16 +00:00
David Anderson cc7d507789 - DB interface: in update(), check that 1 row was updated
- API: in APP_INIT_DATA, enclose project preferences in tags
    so that it's legal XML
- scheduler: add <multiple_clients_per_host> option.
    Use this if your project runs on Condor or grids
    and (to use multicore machines) you're running
    multiple clients per host.
    This will skip the host lookup based on IP address.


svn path=/trunk/boinc/; revision=15954
2008-09-04 08:33:21 +00:00
David Anderson 53ccd10f13 - scheduler: add <debug_resend> config option to enable messages
about job resending

svn path=/trunk/boinc/; revision=15889
2008-08-19 03:00:17 +00:00
David Anderson b5b3fd43b7 - scheduler: make credit_multiplier stuff conditional on
<use_credit_multiplier> flag in config.xml

svn path=/trunk/boinc/; revision=15766
2008-08-06 23:30:22 +00:00
David Anderson 4f66bb4c95 - added copyright and license info to .C, .cpp, .h files
- scheduler: fix bug in adaptive replication:
    if we send an unreplicated job to an untrusted host,
    set both wu.target_nresults and wu.min_quorum to app.target_nresults.

svn path=/trunk/boinc/; revision=15762
2008-08-06 18:36:30 +00:00
David Anderson 007b3ba9dd - server compile fix for gcc 4.3
svn path=/trunk/boinc/; revision=15647
2008-07-21 22:29:10 +00:00
David Anderson 16b1305db7 - server code: at some point I made a global var "SCHED_CONFIG config",
mostly so that the parse function could assume
    that everything was initially zero.
    However, various back-end functions pass around SCHED_CONFIG&
    as an argument (also named "config").
    This creates a shadow, which is always bad.
    Worse is the possibility that some projects have back-end programs
    that have a SCHED_CONFIG variable that's automatic,
    and therefore isn't zero initially,
    and therefore isn't parsing correctly.

    To fix this, I changed the 2 vectors in SCHED_CONFIG into pointers,
    and have the parse routine zero the structure.
    I was tempted to remove the SCHED_CONFIG& args to back-end functions,
    but this would have broken some projects' code.
    I did, however, change the name from config to config_loc
    to avoid shadowing.

    Also fixed various other compiler warnings.

svn path=/trunk/boinc/; revision=15541
2008-07-02 17:24:53 +00:00
David Anderson 49eb1246cf - scheduler: added
    - config option <matchmaker> for matchmaker scheduling
    - config options <mm_min_slots>, <mm_max_slots>, <job_size_matching>
        to control matchmaker scheduling
- scheduler: tweaks to matchmaker scheduling from Kevin Reed
- web: fixes to alternative stylesheet from Simek

svn path=/trunk/boinc/; revision=15281
2008-05-23 16:13:30 +00:00
David Anderson e6a5d0cbfb - scheduler: add new log flags debug_edf_sim_workload, debug_edf_sim_details
for getting info on EDF simulation;
    change output from seconds to hours
- API: remove extern "C" from graphics API
    (convince me it's needed before restoring)

svn path=/trunk/boinc/; revision=15148
2008-05-08 17:48:40 +00:00
David Anderson 05f703559f - scheduler: add preliminary support for "job size matching"
(attempt to send big jobs to fast hosts, small jobs to slow hosts).
    - have "census" compute mean/stdev of host speeds,
        write it to a file perf_info.txt
    - have feeder compute mean/stdev of sizes of jobs in shmem
    - have feeder read perf_info.txt into shmem
- scheduler: add some debugging messages for app version selection
- Add LGPL license to a few files
- upgrade/setup scripts: copy census to bin/


svn path=/trunk/boinc/; revision=15136
2008-05-06 19:53:49 +00:00
David Anderson 6e6fab3e7c - scheduler: clean up message log.
Merge redundant messages.
    Condition various messages on config flags.
- client (Unix) fix to CUDA detection if LD_LIBRARY_PATH is ""

svn path=/trunk/boinc/; revision=15122
2008-05-02 17:48:29 +00:00
David Anderson fbabb7cee7 - web: tweaks to host list
- scheduler: condition lots of log file writes on config flags
    (i.e. divide "debug" output into a bunch of categories, individually selectable)

svn path=/trunk/boinc/; revision=15101
2008-04-26 23:34:38 +00:00
David Anderson 5fd6c676cf - scheduler: fix bug where scheduler sends a WU when
an app version is not available for that platform

svn path=/trunk/boinc/; revision=15088
2008-04-23 23:34:26 +00:00
David Anderson ee0ca3973c - scheduler: add "distinct_beta_apps" option;
lets users filter out beta apps as well as others
    (from Nicolas Maire)

svn path=/trunk/boinc/; revision=14971
2008-03-27 21:39:02 +00:00
David Anderson 4e9fbac5e0 - admin web: touch reread_db in manage_app_versions.php
- DB code: remove "is_high_priority" stuff.
- scheduler: merge find_app_version() into get_app_version().
    Have the latter memoize its results (both positive and negative).
    Have it call app_plan() for apps with nonempty plan_class.
- scheduler: first steps towards improved selectability of log messages.
    It will eventually be like the client,
    where you can select among various types of messages.
- feeder: if can't unlink the reread_db trigger file, exit
    (else we'd go into an infinite loop)

svn path=/trunk/boinc/; revision=14940
2008-03-18 21:22:44 +00:00
David Anderson 95772cba77 - removed boinc_ncpus_available() and boinc_nthreads() calls.
The design has been changed to constant #threads per app version
    Various changes from Kevin Reed/WCG:
    - server: add workunit.rsc_bandwidth_bound: if nonzero,
        send this WU only to hosts with that much download bandwidth
    - assimilators: if a handler returns DEFER_ASSIMILATION,
        the WU remains in INIT state and will be handled when the
        next instance completes.
        Useful if you want the assimilator to see all instances.
    - scheduler: when setting result.outcome = DETACHED,
        set received_time to now
    - scheduler: removed the reliable_time and reliable_min_avg_credit
        options
    - scheduler/web: add optional <allow_non_preferred_projects>
        in project preferences.
        If present, user will accept work from non-selected apps
        if no work is available for selected apps
    - scheduler: improved messages for projects with multiple apps
    - scheduler: added config options
        <granted_credit_weight> and <granted_credit_ramp_up>.
        Used in calculating host.claimed_credit_per_cpu_sec,
        but I'm not sure how.
    - Added two new credit-granting formulas (validate_util.C):
        stddev_credit() and two_credit()
    - server DB: add rollback_transaction() and affected_rows() to DB_CONN

    NOTE: DB update required

svn path=/trunk/boinc/; revision=14870
2008-03-07 21:13:01 +00:00
David Anderson a09e19b8dc - scheduler: add a general method for excluding hosts from job distribution.
config.xml has optional <ban_os> and <ban_cpu> elements,
    which contain regular expressions matched against
    os_name\tos_version and p_vendor\tp_model.
    If a host matches either one, it's not sent jobs.
- scheduler: fix bug in job assignment
- scheduler: initial (incomplete, commented-out) code for
    matchmaker scheduling
- server programs: declare "SCHED_CONFIG config" in sched_config.C;
    remove declarations of it from all other .C files
    (because I added a vector to it, I can no longer use memset
    to initialize it to zero; instead, it must be a global variable,
    not an automatic)

svn path=/trunk/boinc/; revision=14783
2008-02-25 18:05:04 +00:00
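The <ban_os> and <ban_cpu> matching described in revision 14783 above can be illustrated with the Python sketch below. The function name and the dict-based host record are hypothetical. Python's `re` syntax is used here and may differ from the regular-expression dialect the scheduler uses, and matching anywhere in the string (rather than anchored) is an assumption.

```python
import re


def host_is_banned(host, ban_os=None, ban_cpu=None):
    """Return True if the host matches either ban pattern (illustrative sketch)."""
    # The patterns are matched against tab-separated pairs of host fields.
    os_key = f"{host['os_name']}\t{host['os_version']}"
    cpu_key = f"{host['p_vendor']}\t{host['p_model']}"
    if ban_os and re.search(ban_os, os_key):
        return True
    if ban_cpu and re.search(ban_cpu, cpu_key):
        return True
    return False
```

For example, a pattern such as `^Darwin\t6\.` for ban_os would exclude hosts whose OS name is Darwin and whose version begins with "6.".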