Commit Graph

22 Commits

Author SHA1 Message Date
David Anderson 2363ec359d Server programs: any time we can't connect to DB, try to explain why 2020-05-06 13:01:21 -07:00
Christian Beer d2442d13b8
Merge pull request #2630 from BOINC/dpa_trickle
server: add the ability to have > 1 stages of trickle handling
2019-02-02 10:56:03 +01:00
Vitalii Koshura 1ce3793c76
Remove unused BOINC_RCSID constants
This fixes #2953

Signed-off-by: Vitalii Koshura <lestat.de.lionkur@gmail.com>
2019-01-12 23:43:48 +02:00
David Anderson 5edddd5766 trickle_handler: fix wrong sense of retval; add error option
Add global variable handler_error; if handle_trickle() returns error,
set message_from_host.handled to this.  Default 1.
2018-08-04 21:17:43 -07:00
David Anderson 3e365013fb server: add the ability to have > 1 stages of trickle handling
Some projects (namely CPDN) need to process trickle-up messages
with two different daemons.
We do this as follows:
- daemon 1 enumerates messages with handled==0, and sets handled=1
- daemon 2 enumerates messages with handled==1, and sets handled=2
You can then purge records with handled==2

To implement this, trickle_handler.cpp has two global vars,
int handled_enum and int handled_set.
By default these are 0 and 1.
For daemon 2, set them to 1 and 2 in trickle_handler_init()
2018-08-02 12:32:51 -07:00
Christian Beer 972c0b9a9f Server: ignore infinite loop defects
ignores CID 27886, 27830, 27766 found by Coverity
2015-11-04 08:20:30 +01:00
David Anderson e91eee67da trickle handler daemon: mark message as handled even if handler returns error.
This is because errors in general are non-recoverable,
and we'll end up retrying infinitely.
If an error actually is recoverable, exit().
2014-03-29 09:25:01 -07:00
David Anderson f13c3d58ea fix bug in trickle handler framework; from Christian 2013-08-23 13:01:53 -07:00
David Anderson 78f7610f6e remove dependency of boinc_api.h on str_replace.h (and hence config.h)
Any files that use strlcpy() or strlcat() must directly include str_replace.h
2013-06-06 17:31:46 -07:00
David Anderson b9f0733c06 server: replace strcpy() with strlcpy() various places 2013-06-03 22:42:53 -07:00
David Anderson d41f79588d - server daemons: add daemon_sleep(n), which sleeps for n secs
but checks for the "stop_daemons" trigger file every 1 sec.
    Use this instead of sleep() in daemons.
    This will speed up bin/stop.


svn path=/trunk/boinc/; revision=25708
2012-05-23 18:11:59 +00:00
David Anderson d8f20bceea - vboxwrapper: report network usage to the client
- client: include the above in enforcing network quota preferences


svn path=/trunk/boinc/; revision=24227
2011-09-16 19:16:12 +00:00
David Anderson 176b0a4327 - validator: add a --credit_from_runtime option.
This assigns credit proportional to runtime*p_fpops.
    To prevent cheating, p_fpops is capped at the 95th percentile value
    among active hosts,
    and runtime is capped at a specified limit.
    This option supports apps, like LHC's CERNvm app,
    that run for a certain amount of time and then exit.
    The CreditNew system doesn't work for such apps.
- trickle_credit:
    To prevent cheating,
    cap p_fpops at the 95th percentile value among active hosts,
    and require a limit on runtime.
- require that trickle handlers supply an initialization function


svn path=/trunk/boinc/; revision=24182
2011-09-13 21:01:42 +00:00
David Anderson 732866b8aa - back end: add two example trickle handlers:
trickle_credit: grants credit based on CPU time reported in msg
    trickle_echo: echoes trickle-up as a trickle-down

svn path=/trunk/boinc/; revision=23118
2011-02-27 00:10:14 +00:00
David Anderson b169e5ab0f - server programs: print error message instead of numeric retval
in log messages

svn path=/trunk/boinc/; revision=22647
2010-11-08 17:51:57 +00:00
David Anderson b2451544e1 - server: change the following from per-host to per-(host, app version):
- daily quota mechanism
    - reliable mechanism (accelerated retries)
    - "trusted" mechanism (adaptive replication)
- scheduler: enforce host scale probation only for apps with
    host_scale_check set.
- validator: do scale probation on invalid results
    (need this in addition to error and timeout cases)
- feeder: update app version scales every 10 min, not 10 sec
- back-end apps: support --foo as well as -foo for options

Notes:
- If you have, say, cuda, cuda23 and cuda_fermi plan classes,
    a host will have separate quotas for each one.
    That means it could error out on 100 jobs for cuda_fermi,
    and when its quota goes to zero,
    error out on 100 jobs for cuda23, etc.
    This is intentional; there may be cases where one version
    works but not the others.
- host.error_rate and host.max_results_day are deprecated

TODO:
    - the values in the app table for limits on jobs in progress etc.
        should override rather than config.xml.

Implementation notes:
scheduler:
    process_request():
        read all host_app_versions for host at start;
        Compute "reliable" and "trusted" for each one.
        write modified records at end
    get_app_version():
        add "reliable_only" arg; if set, use only reliable versions
        skip over-quota versions
    Multi-pass scheduling: if have at least one reliable version,
        do a pass for jobs that need reliable,
        and use only reliable versions.
        Then clear best_app_versions cache.
    Score-based scheduling: for need-reliable jobs,
        it will pick the fastest version,
        then give a score bonus if that version happens to be reliable.
    When get back a successful result from client:
        increase daily quota
    When get back an error result from client:
        impose scale probation
        decrease daily quota if not aborted
Validator:
    when handling a WU, create a vector of HOST_APP_VERSION
        parallel to vector of RESULT.
        Pass it to assign_credit_set().
        Make copies of originals so we can update only modified ones
    update HOST_APP_VERSION error rates
Transitioner:
    decrease quota on timeout


svn path=/trunk/boinc/; revision=21181
2010-04-15 03:13:56 +00:00
David Anderson 515113d7dd - server: change all backend programs so that -d 4 means
-d 3 plus print DB queries

svn path=/trunk/boinc/; revision=21106
2010-04-05 21:59:33 +00:00
David Anderson 75d2a45491 - server programs: add --help and --version cmdline options to all.
From Nils Chr. Brause.

svn path=/trunk/boinc/; revision=19079
2009-09-17 17:56:59 +00:00
David Anderson 12eb6057e5 - client, Mac: don't do res_init(). It causes a crash.
- client (Unix): if client crashes while benchmark processes are going,
    make sure they detect this and exit.
- back-end programs: remove hardwired assumptions about
    what directory they run in, and hence where config.xml is.
    E.g., daemons look for it in "..", others expect it in current dir.
    New approach: all the programs look for the project dir as follows:
    1) the environment var BOINC_PROJECT_DIR, if defined
    2) the current dir, if config.xml is there.
    3) else ".."
    This means you can run programs in either proj/bin/ or proj/,
    or (using BOINC_PROJECT_DIR) you can keep executables
    outside of the project dir.


svn path=/trunk/boinc/; revision=18042
2009-05-07 13:54:51 +00:00
David Anderson c2710be9f9 - removed outdated translation files; updated template
svn path=/trunk/boinc/; revision=17962
2009-05-01 17:05:12 +00:00
David Anderson be4cd9bb79 - scheduler: notify user if we're not sending work
because we don't have any (matchmaker only).
- back end programs: for programs that do enumerations,
    check for error returns and exit
    (otherwise we'll get stuck forever if DB fails)

NOTE: In the course of researching this I came across a bug
in the transitioner: if there's a WU with more than 1000 results,
the enumeration will always return ERR_DB_NOT_FOUND,
and the transitioner won't ever do anything again.
Fixing this is a little tricky, so I'm not going to do it right now.


svn path=/trunk/boinc/; revision=16324
2008-10-27 21:23:07 +00:00
David Anderson 98cfb8d3b0 - rename .C files to .cpp so that Doxygen will work
svn path=/trunk/boinc/; revision=16069
2008-09-26 18:20:24 +00:00