145 Commits

Author SHA1 Message Date
David Anderson a71a03f698 - Mac install: fix things a better way
- install: don't install internal .h files
- scheduler: fix spurious "reached limit of 0 GPU tasks" message,
    slight code cleanup

svn path=/trunk/boinc/; revision=18480
2009-06-22 21:11:19 +00:00
David Anderson 2e5d9bd778 - scheduler: add new config option <max_wus_in_progress_gpu>.
The limit on jobs in progress is now
        max_wus_in_progress * NCPUS
        + max_wus_in_progress_gpu * NGPUS
    where NCPUS and NGPUS reflect prefs and are capped.
    Furthermore: if the client reports plan class for in-progress jobs
    (see checkin of 31 May 2009)
    then these limits are enforced separately;
    i.e. the # of in-progress CPU jobs is <= max_wus_in_progress*NCPUS,
    and the # of in-progress GPU jobs is <= max_wus_in_progress_gpu*NGPUS
- scheduler config: rename <cuda_multiplier> to <gpu_multiplier>
- scheduler: <max_wus_to_send> is now scaled by
    (NCPUS + gpu_multiplier*NGPUS)
- scheduler: don't keep scanning array if !work_needed()
- scheduler: moved array-scan logic from sched_send.cpp to sched_array.cpp
- scheduler: don't say "no work available" if jobs are available
    but work_needed() is initially false


svn path=/trunk/boinc/; revision=18255
2009-06-01 22:15:14 +00:00
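The in-progress limits described in revision 18255 can be sketched roughly as follows. This is an illustration only: the struct and function names are hypothetical and are not taken from the BOINC sources; only the arithmetic comes from the commit message.

```cpp
#include <cassert>

// Hypothetical holder for the two config options named in the commit message.
struct JobLimits {
    int max_wus_in_progress;      // per-CPU limit
    int max_wus_in_progress_gpu;  // per-GPU limit
};

// Overall limit: max_wus_in_progress * NCPUS + max_wus_in_progress_gpu * NGPUS.
// ncpus and ngpus are assumed to be already capped and adjusted for prefs.
int max_jobs_in_progress(const JobLimits& cfg, int ncpus, int ngpus) {
    assert(ncpus >= 0 && ngpus >= 0);
    return cfg.max_wus_in_progress * ncpus
         + cfg.max_wus_in_progress_gpu * ngpus;
}

// Separate enforcement, used when the client reports plan classes:
// another CPU job may be sent only while the CPU count is below its own limit.
bool can_send_cpu_job(const JobLimits& cfg, int ncpus, int cpu_in_progress) {
    return cpu_in_progress < cfg.max_wus_in_progress * ncpus;
}

// Same check for GPU jobs against the GPU-specific limit.
bool can_send_gpu_job(const JobLimits& cfg, int ngpus, int gpu_in_progress) {
    return gpu_in_progress < cfg.max_wus_in_progress_gpu * ngpus;
}
```

With limits of 2 per CPU and 3 per GPU, a host with 4 CPUs and 1 GPU would be allowed 11 jobs in total, at most 8 of them on the CPU.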
David Anderson 84afd18450 - scheduler: move app-version selection and score-based scheduling
to new files.

svn path=/trunk/boinc/; revision=17630
2009-03-19 16:35:35 +00:00
David Anderson 04cdfe9cab - scheduler and web: add a project preference for whether to use the CPU.
This complements the "use GPU?" pref.
    Neither should be necessary, but what the heck.

svn path=/trunk/boinc/; revision=17628
2009-03-18 21:14:44 +00:00
David Anderson 52d46f05d6 - scheduler: parse and return platform name in anon platform apps.
Otherwise, if an app version has a platform different from
    the client's primary platform, the client won't find it.


svn path=/trunk/boinc/; revision=17508
2009-03-05 17:54:39 +00:00
David Anderson e6f3027567 - scheduler: add support for anonymous-platform coproc apps.
Old: although the request message contained all info
        about the app version (flops, coproc usage etc.)
        the server ignored this info,
        and assumed that all anonymous platform apps were CPU.
        With 6.6 client, this could produce infinite work fetch:
        - client uses anon platform, has coproc app
        - client has idle CPU, requests CPU work
        - scheduler sends it jobs, thinking they will be done by CPU app
        - client asks for more work etc.
    New: scheduler parses full info on anon platform app versions:
        plan class, FLOPS, coprocs.
        It uses this info to make scheduling decisions;
        in particular, if the request is for CUDA work,
        it will only send jobs that use a CUDA app version.
        The <result> records it returns contain info
        (plan_class) that tells the client which app_version to use.
    This will work correctly even if the client has multiple app versions
    for the same app (e.g., a CPU version and a GPU version)


svn path=/trunk/boinc/; revision=17506
2009-03-05 17:30:10 +00:00
David Anderson 8544b20886 - client: reorganize and improve the logic for deciding
when to do a scheduler RPC:
    if user request or acct mgr request, ignore backoff and suspend via GUI;
    in all other cases honor both of these.

svn path=/trunk/boinc/; revision=17504
2009-03-05 00:10:16 +00:00
David Anderson dcc3bbe36f - scheduler: slight code cleanup
svn path=/trunk/boinc/; revision=17395
2009-02-26 03:03:35 +00:00
David Anderson b7a2c227ca - Work fetch / scheduler:
There are two mechanisms to prevent the scheduler from
    sending jobs that won't finish by their deadline.
    Simple mechanism:
        The client sends the interval x for which CPUs are projected
        to be saturated.
        Given a job with estimated duration y,
        the scheduler doesn't send it if x + y exceeds the delay bound.
        If it does send it, x is incremented by y.
    Complex mechanism:
        Client sends workload description.
        Scheduler does EDF simulation, sees if deadlines are missed.
        The only project using this AFAIK is BOINC alpha test.
    Neither of these mechanisms takes coprocessors into account,
    and as a result jobs could be sent that are doomed to
    miss their deadline.
    This checkin adds coprocessor awareness to the Simple mechanism.

    Changes:
    Client:
        compute estimated delay (i.e. time until non-saturation)
        for coprocessors as well as CPU.
        Send them in scheduler request as part of coproc descriptor.
    Scheduler:
        Keep track of estimated delays separately for different resources
- client: fixed bug that computed CPU estimated delay incorrectly
- client: the work request (req_secs) for a resource is the min
    of the project's share and the shortfall.

svn path=/trunk/boinc/; revision=17086
2009-01-30 21:25:24 +00:00
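A minimal sketch of the "Simple mechanism" from revision 17086, extended per resource as that checkin describes. The type and function names are hypothetical; the real scheduler code may be organized quite differently.

```cpp
#include <cassert>

enum class Resource { CPU, GPU };

// Estimated delay (time until non-saturation) reported by the client,
// tracked separately for each resource. Hypothetical structure.
struct EstimatedDelays {
    double cpu = 0.0;
    double gpu = 0.0;
    double& of(Resource r) { return r == Resource::CPU ? cpu : gpu; }
};

// Given a job with estimated duration y on resource r, refuse it if
// x + y exceeds the delay bound; otherwise accept it and add y to x.
bool try_send_job(EstimatedDelays& delays, Resource r,
                  double est_duration, double delay_bound) {
    assert(est_duration >= 0.0);
    double& x = delays.of(r);
    if (x + est_duration > delay_bound) {
        return false;  // job would be doomed to miss its deadline
    }
    x += est_duration;
    return true;
}
```

Because each resource has its own delay, a saturated CPU no longer blocks a GPU job that would finish in time, and vice versa.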
David Anderson 592bf5c17b - scheduler: get effective RAM sizes and running fraction just once
svn path=/trunk/boinc/; revision=17072
2009-01-29 20:42:45 +00:00
David Anderson 574d1fe087 - client: don't request work for a resource if it has no shortfall.
- client and server: get rid of coproc_cuda global.

svn path=/trunk/boinc/; revision=17019
2009-01-26 05:00:49 +00:00
David Anderson db8f15e396 - scheduler: if anonymous platform, ignore coprocessor requests
(since anonymous platform apps are treated as CPU)

svn path=/trunk/boinc/; revision=17011
2009-01-24 21:51:19 +00:00
David Anderson 50405c89e3 - scheduler: improve no-work messages
- web: don't use DB conn in mysql_real_escape_string()
    (otherwise won't work if DB is down)

svn path=/trunk/boinc/; revision=16961
2009-01-20 21:31:13 +00:00
David Anderson 85a8e6a772 - scheduler: remove the config flag <have_cuda_apps>,
and add <cuda_multiplier>.
    The latter is used in calculating max jobs/day for a host;
    namely, it's host.max_results_day * (NCPUS + NCUDA*cuda_multiplier).
    Set it to 10 or so if you have CUDA apps.
- scheduler: don't overload effective_ncpus();
    instead, add two new functions,
    max_results_day_multiplier() and max_wus_in_progress_multiplier()
- scheduler: don't reduce max_results_day if we get an aborted job
    (it might have been aborted by the project;
    not appropriate to punish host in this case)

svn path=/trunk/boinc/; revision=16959
2009-01-20 00:54:16 +00:00
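The max jobs/day formula from revision 16959, written out as a small sketch. The function name and the integer types are assumptions made for illustration.

```cpp
#include <cassert>

// host.max_results_day * (NCPUS + NCUDA * cuda_multiplier)
int max_jobs_per_day(int max_results_day, int ncpus, int ncuda,
                     int cuda_multiplier) {
    assert(ncpus >= 0 && ncuda >= 0 && cuda_multiplier >= 0);
    return max_results_day * (ncpus + ncuda * cuda_multiplier);
}
```

For example, a host with max_results_day of 100, 4 CPUs, 1 CUDA GPU and a multiplier of 10 would get 100 * (4 + 10) = 1400 jobs per day.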
David Anderson 377545a056 - scheduler: if we're not sending work because of the user's "no GPUs" pref,
tell them so.
- scheduler: fix bug that caused no CUDA jobs to be sent

svn path=/trunk/boinc/; revision=16893
2009-01-12 23:47:52 +00:00
David Anderson b5a33323d2 - scheduler: if a Windows host has a GPU slower than 60 GFLOPS,
don't send it CUDA jobs (they may cause BSOD);
    send user a message to this effect

svn path=/trunk/boinc/; revision=16881
2009-01-12 05:28:36 +00:00
David Anderson a9050243d6 - scheduler: add support for resource-specific scheduler requests:
- parse new request message elements
        (CPU and coproc requested seconds and instances)
    - decide how many jobs to send based on these params
    - select app version based on these params
        (may send both CPU and CUDA app versions for the same app!)

svn path=/trunk/boinc/; revision=16861
2009-01-10 00:43:33 +00:00
David Anderson 5495ec64df - web/scheduler: add a project-specific preference for
whether to accept GPU jobs

svn path=/trunk/boinc/; revision=16723
2008-12-18 21:25:51 +00:00
David Anderson 58cdf84551 - scheduler: in estimating job duration,
clamp running fraction to [.1, 1] and clamp DCF to [.1, 100]

svn path=/trunk/boinc/; revision=16722
2008-12-18 18:19:42 +00:00
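A sketch of the clamping from revision 16722. The two ranges come from the commit message. The duration formula itself is an assumption made for illustration (estimated FLOPs divided by the app version's FLOPS, scaled by DCF and divided by running fraction) and may not match the scheduler's actual code.

```cpp
#include <algorithm>
#include <cassert>

// Clamp the host's running fraction to [.1, 1].
double clamp_running_fraction(double rf) {
    return std::max(0.1, std::min(rf, 1.0));
}

// Clamp the duration correction factor (DCF) to [.1, 100].
double clamp_dcf(double dcf) {
    return std::max(0.1, std::min(dcf, 100.0));
}

// ASSUMED formula, not taken from the commit message.
double estimate_duration(double fpops_est, double flops,
                         double dcf, double running_frac) {
    assert(flops > 0.0);
    return fpops_est / flops * clamp_dcf(dcf)
         / clamp_running_fraction(running_frac);
}
```

The clamps keep a bogus host value (for example a running fraction of 0) from producing an infinite or absurdly small estimate.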
David Anderson ef52366c1b - web: fix bug that caused login to fail
- sched: more global vars

svn path=/trunk/boinc/; revision=16695
2008-12-16 16:29:54 +00:00
David Anderson 49a69de194 - scheduler: estimate job durations based on the FLOPS estimate
for the selected APP_VERSION, rather than on the CPU benchmarks.
    Otherwise estimates are wrong for GPU or multi-thread apps.
- scheduler: start switching to having SCHED_REQUEST and
    SCHED_REPLY as globals instead of passing them around as args;
    to be continued.

svn path=/trunk/boinc/; revision=16691
2008-12-15 21:14:32 +00:00
David Anderson be4cd9bb79 - scheduler: notify user if we're not sending work
because we don't have any (matchmaker only).
- back end programs: for programs that do enumerations,
    check for error returns and exit
    (otherwise we'll get stuck forever if DB fails)

NOTE: In the course of researching this I came across a bug
in the transitioner: if there's a WU with more than 1000 results,
the enumeration will always return ERR_DB_NOT_FOUND,
and the transitioner won't ever do anything again.
Fixing this is a little tricky, so I'm not going to do it right now.


svn path=/trunk/boinc/; revision=16324
2008-10-27 21:23:07 +00:00
David Anderson 5039207e2c - scheduler: add <have_cuda_apps> config flag.
If set, the "effective NCPUS" (which is used to scale
    daily_result_quota and max_wus_in_progress)
    is max'd with the # of CUDA GPUs.

svn path=/trunk/boinc/; revision=16246
2008-10-21 23:16:07 +00:00
David Anderson e43e8a408d - client: major changes to enforce_schedule() to handle GPUs
svn path=/trunk/boinc/; revision=16178
2008-10-09 22:44:45 +00:00
David Anderson d973bcac15 - scheduler: move core_client_version from WORK_REQ to SCHEDULER_REQUEST;
WORK_REQ doesn't get initialized in all cases.

svn path=/trunk/boinc/; revision=16107
2008-10-01 22:07:35 +00:00
David Anderson bb9d546a02 - scheduler: add <no_vista_sandbox> option.
If set, don't send work to sandboxed Vista clients
    (e.g., because of CUDA issue)

svn path=/trunk/boinc/; revision=16105
2008-10-01 19:48:52 +00:00
David Anderson 6869242f75 - scheduler: fixed bug that caused spurious messages
saying "no work was available for the apps you requested"
    with locality scheduling (i.e. Einstein@home)
    even if the user hasn't selected apps.

    Note: the logic for printing these messages won't work
    for matchmaker scheduling.

svn path=/trunk/boinc/; revision=15847
2008-08-14 22:06:51 +00:00
David Anderson 4f66bb4c95 - added copyright and license info to .C, .cpp, .h files
- scheduler: fix bug in adaptive replication:
    if we send an unreplicated job to an untrusted host,
    set both wu.target_nresults and wu.min_quorum to app.target_nresults.

svn path=/trunk/boinc/; revision=15762
2008-08-06 18:36:30 +00:00
David Anderson ba6526f8c9 - scheduler: add constructor for HOST_USAGE structure
(otherwise get random crap in cmdline)


svn path=/trunk/boinc/; revision=15605
2008-07-14 22:32:20 +00:00
David Anderson 0c5c51d531 - web: when hide/unhide/delete posts,
set the timestamp of the thread to the timestamp of
    the latest non-hidden post (rather than to now).
    Same thing for forum timestamp.
- scheduler: return more informative message to user in case of
    request message parse error

svn path=/trunk/boinc/; revision=15526
2008-07-01 16:34:51 +00:00
David Anderson 0e03df254b - Back end: add adaptive validation feature
(DB update required)
- Fixed typo in Eric's 5/28 checkin

svn path=/trunk/boinc/; revision=15357
2008-06-04 23:04:12 +00:00
David Anderson b622f64e30 - scheduler: performance optimization for EDF simulation.
Keep track of the "easiest" job that has been rejected by EDF sim.
    Any jobs harder than this one can be rejected without doing the sim.


svn path=/trunk/boinc/; revision=15171
2008-05-09 23:01:53 +00:00
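The EDF optimization in revision 15171 might look something like the sketch below. The commit does not say how "harder" is defined, so this assumes a job is at least as hard as another when it runs at least as long and has a delay bound at least as tight. All names are hypothetical, and the simulation itself is passed in as a callback.

```cpp
#include <cassert>
#include <functional>
#include <utility>

struct Job {
    double est_duration;
    double delay_bound;
};

// ASSUMED ordering: a is at least as hard as b.
bool at_least_as_hard(const Job& a, const Job& b) {
    return a.est_duration >= b.est_duration && a.delay_bound <= b.delay_bound;
}

class EdfFilter {
public:
    // The simulator returns true if no deadlines are missed with the job added.
    using Simulator = std::function<bool(const Job&)>;

    explicit EdfFilter(Simulator sim) : simulate_(std::move(sim)) {
        assert(static_cast<bool>(simulate_));
    }

    bool accept(const Job& j) {
        // Anything at least as hard as a known reject is rejected without a sim.
        if (have_rejected_ && at_least_as_hard(j, easiest_rejected_)) {
            return false;
        }
        ++sim_runs_;
        if (simulate_(j)) return true;
        // Remember the easiest job rejected so far.
        if (!have_rejected_ || at_least_as_hard(easiest_rejected_, j)) {
            easiest_rejected_ = j;
            have_rejected_ = true;
        }
        return false;
    }

    int sim_runs() const { return sim_runs_; }

private:
    Simulator simulate_;
    Job easiest_rejected_{0.0, 0.0};
    bool have_rejected_ = false;
    int sim_runs_ = 0;
};
```

The shortcut is only sound if the simulation is monotonic in this ordering, i.e. a harder job is never accepted where an easier one was rejected.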
David Anderson 6e6fab3e7c - scheduler: clean up message log.
Merge redundant messages.
    Condition various messages on config flags.
- client (Unix) fix to CUDA detection if LD_LIBRARY_PATH is ""

svn path=/trunk/boinc/; revision=15122
2008-05-02 17:48:29 +00:00
David Anderson 8ba1188dd0 - Client/server protocol:
send <client_cap_plan_class/> if client understands
    app version plan class.
    The server checks for this instead of version > 6.11.
- clean up unix_util: .h files declare only (extern) interfaces;
    no reason for daemon() to be C

svn path=/trunk/boinc/; revision=15006
2008-04-02 19:05:08 +00:00
David Anderson 6af9f66b4e - DB/feeder/scheduler: change app_version.xml_doc from blob to mediumblob,
and change the corresponding structure field from 64KB to 256KB
    (could increase this if needed).
    This is needed to handle app versions with lots (> 100) of files
- change LARGE_BLOB_SIZE to BLOB_SIZE a bunch of places
- Change COPROCS from vector<COPROC> to vector<COPROC*>.
    Otherwise the right virtual functions of COPROCs don't get called

svn path=/trunk/boinc/; revision=14986
2008-03-31 16:19:45 +00:00
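The COPROCS change in revision 14986 addresses what is usually called object slicing. The sketch below uses hypothetical stand-in classes (not the real COPROC hierarchy) to show why a vector of values loses the derived class's virtual functions while a vector of pointers keeps them.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for a base coprocessor type and a derived one.
struct Coproc {
    virtual ~Coproc() = default;
    virtual std::string describe() const { return "generic"; }
};

struct CoprocCuda : Coproc {
    std::string describe() const override { return "cuda"; }
};

// Storing by value copies only the Coproc part, so the base version runs.
std::string describe_by_value(const CoprocCuda& c) {
    std::vector<Coproc> v;
    v.push_back(c);  // sliced: the element is a plain Coproc
    return v[0].describe();
}

// Storing a pointer keeps the dynamic type, so the override runs.
std::string describe_by_pointer(CoprocCuda& c) {
    std::vector<Coproc*> v;
    v.push_back(&c);
    return v[0]->describe();
}
```

Here describe_by_value() yields "generic" and describe_by_pointer() yields "cuda" for the same object.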
David Anderson 13400c9516 Changes for multithread app support:
- update_versions: use __ (not :) as separator for plan class
- client: add plan_class to APP_VERSION;
    an app version is now identified by platform/version/plan_class
- client CPU scheduler: don't assume apps use 1 CPU
- client: add avg_ncpus, max_cpus, flops, cmdline to RESULT
- scheduler: implement app planning scheme

Other changes:

- client: if symlink() fails, make an XML soft link instead
    (for Unix running off a FAT32 FS)
- client: don't accept nonpositive resource share from AMS
- daemons and DB: check for error returns from enumerations,
    and exit if so.  Thus, if the MySQL server goes down,
    all the daemons will soon exit.
    The cron script will restart them every 5 min,
    so when the DB server comes back up so will the project.
- web: show empty max CPU % as ---
- API: get rid of all_threads_cpu_time option (always the case now)


svn path=/trunk/boinc/; revision=14966
2008-03-27 18:25:29 +00:00
David Anderson 4e9fbac5e0 - admin web: touch reread_db in manage_app_versions.php
- DB code: remove "is_high_priority" stuff.
- scheduler: merge find_app_version() into get_app_version().
    Have the latter memoize its results (both positive and negative).
    Have it call app_plan() for apps with nonempty plan_class.
- scheduler: first steps towards improved selectability of log messages.
    It will eventually be like the client,
    where you can select among various types of messages.
- feeder: if can't unlink the reread_db trigger file, exit
    (else we'd go into an infinite loop)

svn path=/trunk/boinc/; revision=14940
2008-03-18 21:22:44 +00:00
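Memoizing both positive and negative results, as revision 14940 describes for get_app_version(), could be sketched like this. The class, its interface, and the use of integer ids are assumptions for illustration; the real function works on scheduler data structures.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

class AppVersionCache {
public:
    // Hypothetical lookup: returns true and sets version_id if a usable
    // app version exists for app_id, false otherwise.
    using Lookup = std::function<bool(int app_id, int& version_id)>;

    explicit AppVersionCache(Lookup lookup) : lookup_(std::move(lookup)) {
        assert(static_cast<bool>(lookup_));
    }

    bool get(int app_id, int& version_id) {
        auto it = cache_.find(app_id);
        if (it == cache_.end()) {
            // First request for this app: do the real lookup and cache the
            // outcome, including "not found".
            Entry e{false, 0};
            e.found = lookup_(app_id, e.version_id);
            it = cache_.emplace(app_id, e).first;
        }
        if (it->second.found) version_id = it->second.version_id;
        return it->second.found;
    }

private:
    struct Entry {
        bool found;
        int version_id;
    };
    std::map<int, Entry> cache_;
    Lookup lookup_;
};
```

Caching the negative case matters as much as the positive one: without it, every job for an app with no usable version would repeat the failed search.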
David Anderson 8098622210 - Validator framework: remove some consts, and other changes,
to allow validator to assign different credit
    to different instances of a job
- Scheduler: if can't open DB, return <project_is_down/>
    (fixes #578)
- clean up logic of modify_claimed_credit
- feeder: for -priority_order_create_time, use workunitid
    rather than create time (faster for the DB)
from Kevin Reed

svn path=/trunk/boinc/; revision=14908
2008-03-13 23:35:13 +00:00
David Anderson 815b8fc043 Various preparation for handling multithreaded apps
and apps that use coprocessors.
There now can be several app_versions for the same
(app, platform, version_num) combination.
This changes a number of things.

- Added app_version.plan_class field to DB
- update_versions now looks for a :plan-class in the
    file or directory name, and puts it in the app_version's DB record
- Change uniqueness constraint to include plan_class
- Feeder: the feeder was putting non-deprecated app_versions
    in shared mem, and leaving it to the scheduler to
    find the latest version for a given platform.
    This is dumb.
    Instead, for each app/platform pair the feeder now
    finds the highest version number of a non-deprecated app version,
    and enumerates all non-deprecated app_versions with that
    app/platform/version
- Scheduler: add a BEST_APP_VERSION data structure that keeps track,
    for each app, what the best app_version is for this host.
    This saves the work of recomputing it for each job.

svn path=/trunk/boinc/; revision=14906
2008-03-13 22:57:24 +00:00
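The feeder's selection rule from revision 14906 can be sketched as a two-pass filter. The struct below is a simplified, hypothetical stand-in for an app_version DB record, and "cuda" in the usage note is only an example plan class.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Simplified, hypothetical view of an app_version record.
struct AppVersion {
    int appid;
    int platformid;
    int version_num;
    std::string plan_class;
    bool deprecated;
};

std::vector<AppVersion> select_for_shared_mem(const std::vector<AppVersion>& all) {
    // Pass 1: highest non-deprecated version number per app/platform pair.
    std::map<std::pair<int, int>, int> best;
    for (const AppVersion& av : all) {
        if (av.deprecated) continue;
        std::pair<int, int> key(av.appid, av.platformid);
        auto it = best.find(key);
        if (it == best.end() || av.version_num > it->second) {
            best[key] = av.version_num;
        }
    }
    // Pass 2: keep every non-deprecated app version with that
    // app/platform/version, whatever its plan class.
    std::vector<AppVersion> out;
    for (const AppVersion& av : all) {
        if (av.deprecated) continue;
        std::pair<int, int> key(av.appid, av.platformid);
        if (best.at(key) == av.version_num) out.push_back(av);
    }
    return out;
}
```

If an app/platform pair has version 601 both with an empty plan class and with plan class "cuda", both records are kept, which is what allows several app_versions per (app, platform, version_num) combination.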
David Anderson cc2f1a20a0 - lib: moved "run program as user" stuff to a separate file,
so it doesn't screw up the linkage of apps that don't use it
- start of server-side support for coprocessors

svn path=/trunk/boinc/; revision=14878
2008-03-10 21:59:27 +00:00
David Anderson 95772cba77 - removed boinc_ncpus_available() and boinc_nthreads() calls.
The design has been changed to constant #threads per app version
    Various changes from Kevin Reed/WCG:
    - server: add workunit.rsc_bandwidth_bound: if nonzero,
        send this WU only to hosts with that much download bandwidth
    - assimilators: if a handler returns DEFER_ASSIMILATION,
        the WU remains in INIT state and will be handled when the
        next instance completes.
        Useful if you want the assimilator to see all instances.
    - scheduler: when setting result.outcome = DETACHED,
        set received_time to now
    - scheduler: removed the reliable_time and reliable_min_avg_credit
        options
    - scheduler/web: add optional <allow_non_preferred_projects>
        in project preferences.
        If present, user will accept work from non-selected apps
        if no work is available for selected apps
    - scheduler: improved messages for projects with multiple apps
    - scheduler: added config options
        <granted_credit_weight> and <granted_credit_ramp_up>.
        Used in calculating host.claimed_credit_per_cpu_sec,
        but I'm not sure how.
    - Added two new credit-granting formulas (validate_util.C):
        stddev_credit() and two_credit()
    - server DB: add rollback_transaction() and affected_rows() to DB_CONN

    NOTE: DB update required

svn path=/trunk/boinc/; revision=14870
2008-03-07 21:13:01 +00:00
David Anderson b6cc885abf - server: make the special substring for assigned WUs
into a #define'd symbol (ASSIGNED_WU_STR)
- scheduler: when sending the client a command to abort a WU,
    include a reason code in the scheduler log

svn path=/trunk/boinc/; revision=14798
2008-02-26 17:24:29 +00:00
David Anderson 54519a4ee1 - Server: add "job assignment" feature.
Lets you assign a WU to a particular host,
    to one or all hosts belonging to a user or team, or to all hosts.
    See http://boinc.berkeley.edu/trac/wiki/AssignedWork
    Disabled unless you include <enable_assignment> in config.xml
    Uses a new DB table.
    Tested but only a little.
- Server: code cleanup; moved result-handling to a new file,
    and removed the PLATFORM_LIST arg to everything
    (put it in SCHEDULER_REQUEST instead)

svn path=/trunk/boinc/; revision=14767
2008-02-21 00:47:50 +00:00
David Anderson 7ea74282f4 - client: limit global prefs mod time to now
- server: limit global prefs mod time to now
    These changes address the situation where a server
    sends out prefs with mod time far in the future,
    and there's no way to undo them

svn path=/trunk/boinc/; revision=14664
2008-02-03 21:46:30 +00:00
David Anderson 2be6f8e53a - Client: add <run_apps_manually> config flag.
This is for debugging apps (currently works only in Unix).
    What it does: when running an app,
    the client does everything except actually fork/exec the app,
    i.e. it sets up the slot dir, creates shared mem segment etc.
    It then continues as if the app were actually running,
    and you can then manually run your app under a debugger
    in the slot directory.
    Note: the client won't notice the termination of your app.
- API, Unix: in situations where the timer thread wants to exit
    (e.g. it notices a missing heartbeat),
    don't directly call boinc_exit(),
    since this touches data structures that the worker thread
    may be using concurrently.
    Instead, set a flag telling the worker thread to call boinc_exit()
    (which it will do from its signal handler)
    This is an attempt to fix problems reported by Bernd;
    I haven't tested it.
- scheduler: add config flag for uploading usage data
- web: show account key and weak account key on user page
- added some code for multithread support (not finished)

api/
    boinc_api.C


svn path=/trunk/boinc/; revision=14542
2008-01-13 00:12:14 +00:00
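The flag-based exit described for the API in revision 14542 follows a common pattern, sketched below with std::atomic. This is not the code in api/boinc_api.C; the names are hypothetical, and the real change has the worker act from its signal handler rather than by polling.

```cpp
#include <atomic>

// Set by the timer thread, read by the worker thread.
std::atomic<bool> exit_requested{false};

// Timer thread: instead of exiting directly (which would touch data the
// worker may be using concurrently), just record the request.
void timer_thread_request_exit() {
    exit_requested.store(true);
}

// Worker thread: check the flag at a point where it is safe to exit,
// then perform the actual exit itself.
bool worker_should_exit() {
    return exit_requested.load();
}
```

The point of the design is that only the worker thread ever runs the exit path, so the data structures it owns are never torn down underneath it.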
Frank Thomas 3bfc78b511 Updated the postal address of the Free Software Foundation in all license headers. See http://lists.ssl.berkeley.edu/pipermail/boinc_dev/2007-October/008939.html for reference.
svn path=/trunk/boinc/; revision=13804
2007-10-09 11:35:47 +00:00
David Anderson 4e1f0b7019 - client: add a <data_dir> option in cc_config.xml;
tells the client to use this as the data directory
- scheduler: improve the message telling the client that
    more disk or memory is needed;
    tell them the minimum amount needed to
    send any of the jobs rejected,
    rather than the amount needed for the first job rejected
- manager: fix text in "connect now" dialog

svn path=/trunk/boinc/; revision=13387
2007-08-16 17:33:41 +00:00
David Anderson 797c464b3a - Back end: add a feature for "blackballing" hosts.
To do this, set host.max_results_day to -1.
    If you do this, scheduler requests from that host
    will get an error message, and will otherwise be ignored
    (no jobs in or out, no trickles).
- Scheduler: send_message() should be called ONLY if you're
    not going to call handle_request();
    otherwise we'll write two separate replies.
    To fix this, I added a separate function (send_error_message())
    that can be called within handle_request()
    to deal with error situations.
- Scheduler: moved debug_sched() to main.C
- Scheduler: moved logic to send "delete file" commands
    out of handle_request() into a separate function,
    send_file_deletes() in sched_locality.C.
    Remove #ifdef EINSTEIN_AT_HOMEs; maybe someday another project
    will use locality scheduling!

svn path=/trunk/boinc/; revision=13108
2007-07-06 16:37:00 +00:00
David Anderson 0bbe224c21 - scheduler: the "max_wus_in_progress" option only worked if
"resend_lost_results" option was used also
    (because the count of in-progress results was
    based on the DB query used by resend_lost_results).

    Fix: initialize the count of in-progress results to
    the list provided in the scheduler request.
- scheduler: add "--mark_jobs_done" flag; if set, all jobs
    sent are marked as done, and their WUs enabled for transition.
    This is used for simulation purposes,
    in conjunction with sched_driver.
- scheduler: if --batch option is set, don't check RPC seqnos
    (for simulation purposes)

svn path=/trunk/boinc/; revision=13101
2007-07-05 04:18:48 +00:00
David Anderson 01f4851323 - scheduler: add max_wus_in_progress option.
Limits total # of in-progress results per host
    (independently of #CPUs)

sched/
    sched_config.C,h
    sched_resend.C
    sched_send.C
    server_types.h


svn path=/trunk/boinc/; revision=12661
2007-05-14 15:21:38 +00:00