Commit Graph

31 Commits

Author SHA1 Message Date
David Anderson a87d039f49 server: more 64-bit ID fixes
negative values are stored in app_version_id fields to represent
anonymous platform versions.
So need to use %ld rather than %lu for these fields.

Also there were a couple of more changes of int do DB_ID_TYPE
2015-07-29 17:32:57 -07:00
David Anderson 8cd8c8e7ee server software: handle 64-bit database IDs
The SETI@home result table is about to run out of 32-bit IDs,
so we need to move to 64-bit result IDs.
This will happen to the workunit table at some point too.

I changed the server C++ code to use the "long" type for all DB IDs
(and to use appropriate conversion codes like %lu).
"long" is 64 bit on 64-bit machines.
For uniformity I did this for all tables,
even ones (like app) that will never get big.

I chose NOT to change the DB schema for now.
The new code will work with 32-bit ID fields in the DB.
As projects approach the 32-bit limit on a table they can change
its ID field, and fields that reference this table, to BIGINT.
This is likely to happen only on the result and workunit tables.
I put functions in html/ops/db_update.php
to change the IDs of these tables.
2015-07-23 10:11:08 -07:00
David Anderson c2a0421074 scheduler: add support for miner_asic coprocessor type
I.e. treat miner ASICs as a distinct processor type;
send miner_asic jobs only if the client requests them.

Note: I was planning to do this in a more general way,
in which the scheduler wouldn't have a hard-wired list of processor types.
However, that would be a large code change,
so for now I just added miner_asic to the list of processor types
(nvidia, ati, intel_gpu),
and made various changes to get things to work.

Also: in the job dispatch logic, try to send coproc jobs
before CPU jobs.
That way if e.g. there's a limit on jobs in progress,
we'll preferentially send coproc jobs.
2014-09-21 21:08:09 -07:00
David Anderson d77f322014 scheduler, show_shmem: if shared mem size mismatch, give details 2014-08-11 11:46:25 -07:00
David Anderson d861862ca1 server: fix compile warnings and file descriptor leaks
Also, we were using memset() to zero WORK_REQ,
which contains several std::vector's.
This apparently works on Linux, but not in general.
2014-01-08 22:00:13 -08:00
David Anderson ef82d5d9fb server: fix compile error on systems that don't define MAXPATHLEN 2013-08-22 17:01:45 -07:00
Eric J Korpela 24353261c4 Fixed problems with FCGI compiles introduced in recent checkins. 2013-04-25 12:46:22 -07:00
David Anderson 0c430ce1fa Add support for multi-size apps
See http://boinc.berkeley.edu/trac/wiki/MultiSize
The components of this include:
- DB changes:
    add size_class to workunit and result
    n_size_classes to app; >1 means multi-size
- size_regulator daemon program: change results states
    from INACTIVE to UNSENT carefully
- size_census program; writes quantile info in flat files
- transitioner: when creating results for multi-size apps,
    set server state to INACTIVE
- sched shmem (feeder): read quantile info from flat files,
    store in shared memory
- scheduler (score-based scheduling): for multi-size apps,
    add component to score function for size class.
- show_shmem: show result size class
- make_work (and other callers of count_unsent_results()):
    count both INACTIVE and UNSENT
- create_work: add --size_class cmdline option

Also:
- if get MySQL errors in upgrade, don't rewrite db_version
2013-04-25 00:27:35 -07:00
David Anderson 12319ca82b - scheduler: add code (commented out for now) for new implementation
of score-based scheduling.
2013-04-09 11:10:50 -07:00
David Anderson 657c0f9730 - show_shmem: show more stuff, and improve display
- job prioritization tweaks
2013-03-04 17:48:45 +01:00
David Anderson 11a6e85632 - scheduler: support for projects with some non-CPU-intensive apps
(but not all) wasn't finished.
    New logic: if the project has an NCI app then:
    - make a list of NCI apps for which the client doesn't have
        a job in progress.
    - try to send one job for each of these apps
    - do this even if no work is being requested.
    - don't send jobs for NCI apps by other mechanisms

NOTE: the client logic isn't quite right for mixed NCI projects.
    If there's no job for a given NCI app,
    the client should do a scheduler RPC.
    This isn't critical so we won't do this now.


svn path=/trunk/boinc/; revision=26068
2012-09-01 04:58:12 +00:00
David Anderson 9ccb8fa38d - scheduler: add support for limited locality scheduling
- API: remove support for PPM files


svn path=/trunk/boinc/; revision=26062
2012-08-27 17:00:43 +00:00
David Anderson 8c71f6d59a - scheduler: add support for Intel GPUs, and restructure things
to make it easier to add other GPU types in the future


svn path=/trunk/boinc/; revision=25792
2012-06-25 23:09:45 +00:00
David Anderson fd0983b991 - web: server status page should show elapsed time, not CPU time
svn path=/trunk/boinc/; revision=25785
2012-06-22 07:35:54 +00:00
David Anderson ba04760fee - scheduler: fix bug that broke broadcast jobs (from Kevin)
svn path=/trunk/boinc/; revision=25258
2012-02-14 16:58:18 +00:00
David Anderson 130d6ed4f0 - server: revamp the "assigned job" mechanism.
This now supports two main use cases:
    1) there's a job that you want to run once on all hosts,
        present and future
        (or all hosts belonging to a user, or to a team).
        The job is never transitioned, validated, or assimilated.
    2) There's a normal job for which you want to use only
        hosts belonging to a specific user (e.g. cluster or cloud hosts).
        This restriction can be made either when the job is created,
        or on the fly,
        e.g. as part of a scheme for accelerating batch completion.
        For the latter purposes we now provide a function
            restrict_wu_to_user(DB_WORKUNIT&, int userid);

        The job goes through the standard
        transitioner/validator/assimilator path.

    These cases are enabled by config flags
        <enable_assignment_multi/>
        <enable_assignment/>
    respectively.

    Assignment of type 2) are no longer stored in shared mem,
    so there is no limit on their number.

    There is no longer a rule that assigned job names must contain "asgn".

    NOTE: this requires a database update.


svn path=/trunk/boinc/; revision=25169
2012-01-30 22:39:13 +00:00
David Anderson dd16170fc1 - scheduler: the p_fpops value reported by clients can't be trusted.
Some credit cheats (e.g. with credit_by_runtime) can be done
    by reporting a huge value.
    Fix this by capping the value at 1.1 times the 95th percentile
    of host.p_fpops, taken over active hosts.


svn path=/trunk/boinc/; revision=25017
2012-01-09 17:35:48 +00:00
David Anderson 1366dc1cdc - scheduler: if using homogeneous app version,
and a WU is committed to an app version that's been superceded,
    treat it as committed to the later version.


svn path=/trunk/boinc/; revision=24776
2011-12-12 22:57:58 +00:00
Jeff Cobb 2e526a0956 server: more fixes to DB to handle unsigned result IDs
svn path=/trunk/boinc/; revision=24564
2011-11-09 20:24:48 +00:00
David Anderson b1456cfc0b - scheduler: fix a bug that would choose app versions erroneously.
The problem: the choice of app version was based on
    the "projected FLOPS" return by estimate_flops(av).
    If usage stats exist for the host / app version,
    this returns a number X such that
    WU.rsc_fpops_est/X approximates the runtime of a job
    using the given app version..
    (If WU.rsc_fpops_est is way off, this will be correspondingly way off
    from the actual FLOPS the app version will get.)
    However, if there are no usage stats,
    it return an estimate based on host hardware speed,
    which might be 100X less.
    Hence, in some cases a new app version would never get used.

    Solution: choose app versions based on the values
    returned by the app plan functions.
    Use estimate_flops() AFTER choosing the version.
- scheduler: improve the accuracy of FLOPS estimation for GPU apps.
    The "flops_scale" argument to coproc_perf
    (which expresses the difference between peak GPU FLOPS
    and actual FLOPS) should be used to scale GPU FLOPS
    prior to calling coproc_perf(),
    rather than scaling the estimate returned by coproc_perf().
- show_shmem: show have_X_apps flags


svn path=/trunk/boinc/; revision=24385
2011-10-12 23:59:38 +00:00
David Anderson 7c51512cbf - transitioner: the format string for a DB query had %.15d instead of %.15e.
That produced a messed-up query that assigned garbage values to:
        host_app_version.turnaround_var
        host_app_version.turnaround_q
        host_app_version.max_jobs_per_day
        host_app_version.consecutive_valid
    To repair these:
        - set turnaround_var and turnaround_q to zero
        - if max_jobs_per_day is outside of
            (0..config.daily_result_quota)
            set it to config.daily_result_quota
        - if consecutive_valid is outside (0..1000), set it to zero
    I added a script, html/ops/repair_21812.php, that does this;
    if you ran server code between [21181] and [21812], run this script.
- scheduler/transitioner: add <debug_quota> log flag
- changed the build system to always use -Wall
    (if we'd done this before, this bug wouldn't have happened)
- fixed a bunch of other compile warnings


svn path=/trunk/boinc/; revision=21812
2010-06-25 18:54:37 +00:00
David Anderson 5035007b90 - back end: new way of deciding:
- whether host is "reliable" for an app version
    - whether host is eligible for single replication for an app version
    - whether to use host scaling
    In each case, the answer is yes if the number of
    consecutive valid results is above a threshold.
    This replaces existing "error rate" and "scale probation" mechanisms.

    TODO: the # of consecutive valid results should also determine
        a limit on jobs in progress for an app version.
        Namely, if N is the threshold for host scaling, the limit should be
            ndevices*(max(1, consecutive_valid - N))
        The client currently doesn't supply enough
        app version info to do this.
        It could be approximated; that would give some protection
        against cherry-picking.
- credit: more conservative formulas for combining claimed credit
    among replicas.
    If there are normal replicas, we use a "low average"
    that weights each sample by the sum of the other samples.
    Otherwise we use the min (not the average) of the approximate samples.

NOTE: a DB update is required


svn path=/trunk/boinc/; revision=21230
2010-04-21 19:33:20 +00:00
David Anderson 1d765245ed - scheduler: sweeping changes to the way job runtimes are estimated:
see http://boinc.berkeley.edu/trac/wiki/RuntimeEstimation


svn path=/trunk/boinc/; revision=21153
2010-04-08 23:14:47 +00:00
David Anderson f198dc76ad - feeder: with -allapps option, allow some apps to have zero weights;
no jobs will be sent for them.


svn path=/trunk/boinc/; revision=20980
2010-03-22 20:12:24 +00:00
David Anderson 0c1a1421f8 - scheduler/feeder: if any client version number field
(min_core_version etc.) is < 10000,
    multiply it by 100 and print a warning.

svn path=/trunk/boinc/; revision=20187
2010-01-18 04:52:58 +00:00
David Anderson 545d137804 - client: no network activity if running CPU benchmarks
svn path=/trunk/boinc/; revision=19375
2009-10-23 21:57:58 +00:00
Eric J. Korpela 4e60ef3003 - STILL WORK TO BE DONE TO GET locale STUFF INSTALLED PROPERLY!!!
- Update to libtool 1.5.24
- build environment:  Major automake changes that I've been warning about
  for some time.
- Now uses libtool to build libraries.
- Builds separate boinc_fcgi and sched_fcgi libraries for use with 
  FCGI server components.
- New macro "BOINC_CHECK_LIB_WITH" that executes a "AC_CHECK_LIB" on
  a library only if --with-libname[=DIR] is specified on the configure
  command line.  This is to allow inclusion of libraries when the 
  ssl, gtk, wxWidgets, or other configuration is incorrect for static
  libraries.
- Added a lot of "--with-*" for some libraries that might be required for
  static builds.
- The sea directory has been moved to packages/generic.  Changes to sea
  and the associated scripts might be required to better make use of the
  staging mechanism and shared libraries.
- Fixed includes of boinc_fcgi.h in many files.
- Fixed places where FCGI_FILE needs to be used implicitly.
- Fixed missing define of _SC_PAGESIZE on hosts that define only
  _SC_PAGE_SIZE.
- Moved build of boinc_cmd (and source file) from lib to client



svn path=/trunk/boinc/; revision=16904
2009-01-13 23:06:02 +00:00
David Anderson 765eefd18d svn path=/trunk/boinc/; revision=16715 2008-12-17 22:05:20 +00:00
David Anderson 90740c7690 - feeder: include all app versions that have maximal version num
within their plan class

svn path=/trunk/boinc/; revision=16714
2008-12-17 21:58:54 +00:00
David Anderson 8ea8081626 - scheduler: fix memory leak when reporting time stats logs
- scheduler: fix egregious bug where wu_is_infeasible_fast() result
    is ignored, and we send jobs to hosts that can't handle them.
- scheduler: don't check for disk space in work_needed();
    do it in check_disk(), which generates a message to user.
- scheduler: add -debug_log flag, which sends stderr to
    "debug_log" rather than scheduler_log.txt (for debugging)

svn path=/trunk/boinc/; revision=16578
2008-11-26 21:49:36 +00:00
David Anderson 98cfb8d3b0 - rename .C files to .cpp so that Doxygen will work
svn path=/trunk/boinc/; revision=16069
2008-09-26 18:20:24 +00:00