boinc/sched/sched_util.h

// This file is part of BOINC.
// http://boinc.berkeley.edu
// Copyright (C) 2014 University of California
//
// BOINC is free software; you can redistribute it and/or modify it
// under the terms of the GNU Lesser General Public License
// as published by the Free Software Foundation,
// either version 3 of the License, or (at your option) any later version.
//
// BOINC is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
// See the GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with BOINC.  If not, see <http://www.gnu.org/licenses/>.

// server utility functions that refer to the DB

#ifndef BOINC_SCHED_UTIL_H
#define BOINC_SCHED_UTIL_H

#include "boinc_db_types.h"
#include "util.h"

#include "sched_util_basic.h"

extern void compute_avg_turnaround(HOST& host, double turnaround);

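// Summary statistics of host.p_fpops.
// The scheduler uses these to cap the p_fpops values reported by clients,
// which can't be trusted.
//
struct PERF_INFO {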
    double host_fpops_mean;
    double host_fpops_stddev;
    double host_fpops_50_percentile;
    double host_fpops_95_percentile;

    int get_from_db();
};

// Return a value for host_app_version.app_version_id.
// If the app version is anonymous platform (avid < 0),
// make a "pseudo ID" that combines the app ID and the resource type:
// appid*1000000 - avid (e.g. appid=7, avid=-2 gives 7000002).
// Otherwise just use the app_version ID.
//
inline DB_ID_TYPE generalized_app_version_id(
    DB_ID_TYPE avid, DB_ID_TYPE appid
) {
    if (avid < 0) {
        return appid*1000000 - avid;
    }
    return avid;
}

extern int count_workunits(long&, const char* query);
extern int count_unsent_results(long&, DB_ID_TYPE appid, int size_class=-1);
extern int restrict_wu_to_user(WORKUNIT& wu, DB_ID_TYPE userid);
extern int restrict_wu_to_host(WORKUNIT& wu, DB_ID_TYPE hostid);
// get the minimum workunit.transition_time
extern int min_transition_time(double&);

#endif