Commit Graph

74 Commits

Author SHA1 Message Date
David Anderson 2363ec359d Server programs: any time we can't connect to DB, try to explain why 2020-05-06 13:01:21 -07:00
Roman Trunov 097b510223 validator: Avoid infinite loop in --dry-run mode 2019-09-12 22:15:39 +03:00
David Anderson 8090bd81bc validator: fix bugs in #3024 2019-04-09 12:46:51 -07:00
root 978ed46c1b added the check_punitive usage option 2019-03-15 10:02:12 +01:00
David Anderson 419cd78fc7 validator: fix bug in punitive mechanism
The punitive mechanism was scanning for results with validate state INIT.
This is wrong because the scheduler immediately flags results
with client error as INVALID.
Fix: remove validate state check.
Also, don't update validate state; not needed any more.
2019-03-08 15:26:28 -08:00
David Anderson 94c8e53204 Server: add "punitive validation" mechanism
Say that a job has a "long-term failure" if it fails in a way
(as evidenced by its exit code and/or stderr)
suggesting that other jobs for that (host, app version) will fail too.
In this case we want to avoid sending more jobs to that (host, app version).

This implements this feature.
To use it, have your validator's init_result() return
VAL_RESULT_LONG_TERM_FAIL if it finds a long-term failure,
and run your validator with the --check_punitive option.
("Punitive" because we're "punishing" the host for its failure).

The validator punishes the (host, app version) by
setting host_app_version.max_jobs_per_day to 1.
One job per day can still be sent.
That way if the underlying problem is fixed
(e.g. the user enables VM acceleration in the BIOS)
we'll eventually go back to normal.

Also: normally HAV.max_jobs_per_day is scaled by the numbers
of CPUs and GPUs.
Disable this scaling in the case where it's 1.
2019-02-18 21:29:04 -08:00
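The quota behavior described in the commit above can be sketched roughly as follows. This is an illustration only, not the BOINC implementation: `HostAppVersionSketch`, `punish()`, and `effective_daily_quota()` are hypothetical names, and scaling by simple multiplication with the processor count is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for a host_app_version record; only the field
// named in the commit message is modeled.
struct HostAppVersionSketch {
    int max_jobs_per_day;
};

// Punish a (host, app version) after a long-term failure by clamping
// its daily quota to a single job.
void punish(HostAppVersionSketch& hav) {
    hav.max_jobs_per_day = 1;
}

// Daily quota actually enforced. Assumption: the normal case multiplies
// the quota by the number of processor instances (CPUs or GPUs) in use.
// A quota of exactly 1 is left unscaled so the punishment is not undone.
int effective_daily_quota(const HostAppVersionSketch& hav, int n_instances) {
    assert(n_instances >= 1);
    if (hav.max_jobs_per_day == 1) return 1;
    return hav.max_jobs_per_day * n_instances;
}
```

Because one job per day still goes out, a later success gives the server a chance to raise the quota again; that recovery path is not modeled here.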
Christian Beer 16e7c516d6
Merge pull request #2138 from bema-aei/validate_suspicious_results
validator: raise the quorum for 'suspicious' results to ensure validation
2019-02-16 16:39:11 +01:00
Vitalii Koshura 1ce3793c76
Remove unused BOINC_RCSID constants
This fixes #2953

Signed-off-by: Vitalii Koshura <lestat.de.lionkur@gmail.com>
2019-01-12 23:43:48 +02:00
David Anderson 83b9de7b08
Merge pull request #2518 from BOINC/dpa_credit4
Server: add support for post-assigned credit
2018-05-28 17:32:15 -07:00
David Anderson b857a37008 Add support for post-assigned credit
Add --post_assigned_credit option to validator.
If set, it gets claimed credit from result.claimed_credit
(put there by project's init_result() function).
The claimed credit of the canonical result is the job's granted credit.

Also changed --credit_from_runtime so that it averages
claimed credit across instances,
instead of just using the canonical instance.
2018-05-15 14:55:30 -07:00
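The two credit policies mentioned in the commit above might look like this in outline. `CreditResultSketch` and both function names are hypothetical, and the sketch assumes the caller passes only the instances that should count toward the average.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, reduced view of a result row.
struct CreditResultSketch {
    double claimed_credit;  // filled in by the project's init_result()
    bool is_canonical;
};

// Post-assigned credit: the canonical result's claim becomes the grant.
// Returns 0 if no canonical result is present.
double granted_from_canonical(const std::vector<CreditResultSketch>& results) {
    for (const CreditResultSketch& r : results) {
        if (r.is_canonical) return r.claimed_credit;
    }
    return 0.0;
}

// Runtime-based credit after this change: average the claims of all
// instances passed in, rather than using only the canonical one.
double granted_from_average(const std::vector<CreditResultSketch>& results) {
    assert(!results.empty());
    double sum = 0.0;
    for (const CreditResultSketch& r : results) sum += r.claimed_credit;
    return sum / static_cast<double>(results.size());
}
```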
David Anderson 924ff5dba9 Add support for pre-assigned credit
You can now pre-assign a job's credit, as described here:
https://boinc.berkeley.edu/trac/wiki/CreditOptions

Note: this feature was originally available via an
--additional_xml "<credit>xx</credit>" arg to create_work.
This is an ugly kludge; I removed it.
In fact, the --additional_xml arg should be removed at some point.

Also: change stage_file so it cd's to html/bin when including stuff;
this is needed since util_basic.inc now includes something else
2018-05-15 13:01:31 -07:00
Bernd Machenschalk c27594431d validator: raise the quorum for 'suspicious' results to ensure validation 2017-09-22 13:09:14 +02:00
Christian Beer 2999fc10cd Validator: fix usage info 2016-09-06 10:44:14 +02:00
Christian Beer 2c36e7246d Daemons: enhance validator framework
The validator handler can now pass unknown arguments to the project-specific handler.
Projects that have their own validator need to implement the validate_handler_init() function and handle project-specific arguments there. They also need to supply a validate_handler_usage() function that printf()'s a description of the custom options. For examples see sample_substr_validator.cpp or script_validator.cpp.
The validator test harness was also adapted to use these new functions.

This brings the implementation of the validator framework to the same level as the assimilator framework, where similar changes were made in 0038d275c and dd004404a.
2016-08-16 11:14:42 +02:00
Christian Beer 6c10091740 check return value of host.update_diff_validator()
fixes CID 27961 found by Coverity
2015-10-28 14:41:09 +01:00
David Anderson 1af264747f validator: fix 64-bit ID problem 2015-07-28 16:19:31 -07:00
David Anderson 8cd8c8e7ee server software: handle 64-bit database IDs
The SETI@home result table is about to run out of 32-bit IDs,
so we need to move to 64-bit result IDs.
This will happen to the workunit table at some point too.

I changed the server C++ code to use the "long" type for all DB IDs
(and to use appropriate conversion codes like %lu).
"long" is 64 bit on 64-bit machines.
For uniformity I did this for all tables,
even ones (like app) that will never get big.

I chose NOT to change the DB schema for now.
The new code will work with 32-bit ID fields in the DB.
As projects approach the 32-bit limit on a table they can change
its ID field, and fields that reference this table, to BIGINT.
This is likely to happen only on the result and workunit tables.
I put functions in html/ops/db_update.php
to change the IDs of these tables.
2015-07-23 10:11:08 -07:00
David Anderson 5ad43a6509 validator: add --wu_id N option for debugging single WU 2015-04-03 20:00:13 -07:00
David Anderson dbd2d03a0d server/web: add support for per-application credit
See http://boinc.berkeley.edu/trac/wiki/PerAppCredit
If enabled (by the <credit_by_app> config flag)
validators will maintain credit on a per-(app, user, credit type) basis,
and same for teams,
in new DB tables credit_user and credit_team.
This info is displayed in the web site, on user and team pages,
using project-supplied functions to generate the HTML.

Note: update_stats doesn't decay the recent-average values
for per-app credit; I'll add this if needed.
2014-08-15 14:01:32 -07:00
David Anderson dfc99e225c scheduler: don't resend job if app is deprecated or user has de-selected it 2014-06-08 20:20:25 -07:00
Bernd Machenschalk 34c823a9ab Merge branch 'EinsteinAtHome' into 'master'
This is meant not to break anything, just add some
(optional) logging and features needed for Einstein@Home.
Please contact me before changing or removing any of this.

Conflicts:
	sched/db_dump.cpp
	sched/file_deleter.cpp
	sched/validator.cpp
2014-05-26 14:42:36 +02:00
Bernd Machenschalk d67776a93c validator:
fix one_pass: leave main loop even if we did_something
2014-05-23 12:06:00 +02:00
Bernd Machenschalk 2f6d140c56 validator:
added options -min_wu_id and -max_wu_id to validator
2014-05-23 12:06:00 +02:00
Bernd Machenschalk 93798a5732 validator:
add '--dry_run' to validator daemon (run w/o DB update)
2014-05-23 12:06:00 +02:00
David Anderson 6f29a50812 validator: fixes and features
- add --is_gzip option to sample_bitwise_validator.
  If set, all files are treated as gzip archives.
  Check their 10-byte header to verify that it's a gzip file,
  but ignore it when comparing files.
- validator.cpp: don't error out on unparsed cmdline args,
  since we're now using them in sample_bitwise_validator
  and sample_substr_validator.
- fix build error on Debian
2014-03-20 12:38:29 -07:00
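The gzip handling described for --is_gzip in the commit above can be sketched as below. This is not the code from sample_bitwise_validator; the function names are hypothetical. The 10-byte fixed header and the magic bytes 0x1f 0x8b come from the gzip format (RFC 1952).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Length of the fixed part of a gzip member header (RFC 1952).
constexpr std::size_t GZIP_HEADER_LEN = 10;

// True if the buffer is long enough to hold the fixed header and
// begins with the gzip magic bytes 0x1f 0x8b.
bool looks_like_gzip(const std::vector<unsigned char>& buf) {
    return buf.size() >= GZIP_HEADER_LEN && buf[0] == 0x1f && buf[1] == 0x8b;
}

// Compare two buffers while ignoring the fixed header, which carries
// fields such as the modification time that can differ between hosts.
// Precondition: the caller has already verified both headers.
bool gzip_payload_equal(const std::vector<unsigned char>& a,
                        const std::vector<unsigned char>& b) {
    assert(looks_like_gzip(a) && looks_like_gzip(b));
    if (a.size() != b.size()) return false;
    return std::equal(a.begin() + GZIP_HEADER_LEN, a.end(),
                      b.begin() + GZIP_HEADER_LEN);
}
```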
David Anderson cf0a0817c0 server: fix some compile warnings
Add a derived class DB_APP_VERSION_VAL for use by the validator,
containing the extra fields it uses,
so that we're not doing memset 0 on vectors
2014-03-19 14:55:16 -07:00
David Anderson 834ac11661 server: add sample validator that checks for string in stderr 2014-03-18 19:12:13 -07:00
Eric J Korpela 244ba5bc85 SCHED: modified scheduler log output to use unsigned format for WU and RESULT
ids.  This allows IDs greater than 2^31 to be printed.
2013-06-19 10:15:08 -07:00
David Anderson 78f7610f6e remove dependency of boinc_api.h on str_replace.h (and hence config.h)
Any files that use strlcpy() or strlcat() must directly include str_replace.h
2013-06-06 17:31:46 -07:00
David Anderson b9f0733c06 server: replace strcpy() with strlcpy() various places 2013-06-03 22:42:53 -07:00
David Anderson 9049737d1f validator: retry if transient failure
check_set() wasn't returning "retry" properly in the case where
one of the calls to init_result() returns ERR_OPEN_DIR
(treated as a transient failure, since it can be caused by a failed NFS mount)
2013-05-20 13:01:10 -07:00
David Anderson 24e8133e4b - tabs -> spaces 2013-04-02 17:23:37 -07:00
Eric J Korpela f6ee54a602 Added a couple debugging statements. 2013-03-26 15:24:45 -07:00
David Anderson 980c9b66c9 - validator: fix confused logic.
A "viable" result is one that could potentially become the canonical result,
    i.e. the outcome is SUCCESS and the validate state is not INVALID.
    The existing code treated all results with outcome SUCCESS as viable,
    which is wrong.
    In particular, this could cause workunit.target_nresults
    to be incremented inappropriately.
2013-03-22 10:28:20 +01:00
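The corrected definition of a "viable" result in the commit above amounts to a two-part predicate. The enums and struct below are hypothetical reductions of the real result fields, and `quorum_reachable()` is an illustrative helper rather than anything in the validator.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, reduced versions of the result fields involved.
enum class Outcome { Success, ClientError, NoReply };
enum class ValidateState { Init, Valid, Invalid, Inconclusive };

struct ViabilityResultSketch {
    Outcome outcome;
    ValidateState validate_state;
};

// Viable: could still become the canonical result.
// The outcome must be SUCCESS and the validate state must not be INVALID.
bool is_viable(const ViabilityResultSketch& r) {
    return r.outcome == Outcome::Success &&
           r.validate_state != ValidateState::Invalid;
}

int count_viable(const std::vector<ViabilityResultSketch>& results) {
    int n = 0;
    for (const ViabilityResultSketch& r : results) {
        if (is_viable(r)) ++n;
    }
    return n;
}

// Illustrative helper: can the current results still satisfy a quorum?
bool quorum_reachable(const std::vector<ViabilityResultSketch>& results,
                      int min_quorum) {
    assert(min_quorum >= 1);
    return count_viable(results) >= min_quorum;
}
```

Counting by outcome alone would report two viable results in a set where one SUCCESS result is already INVALID, which is the kind of miscount the fix addresses.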
David Anderson 3017ed943f - scheduler: debug the above 2013-02-26 16:44:26 +01:00
David Anderson 282af6effc - user web: show the right page/message after the following actions:
    - rate a post
    - moderate a post
    - edit a post
    - report a post


svn path=/trunk/boinc/; revision=26152
2012-10-15 18:47:55 +00:00
David Anderson fc2af21221 - client: add missing end tag for <pci_info>. Doh!
- validator: add some sanity-checking for credit,
    to prevent granting 1e38 credit.
    max_granted_credit now defaults to the equivalent of 1 TeraFLOP-year.
    Instances that exceed this are not counted in the credit
    calculation, and a critical-mode log message is written
- wrapper: remove wall_cpu_time; not used anymore


svn path=/trunk/boinc/; revision=25825
2012-06-29 22:24:07 +00:00
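The validator sanity check in the commit above could be sketched like this. The function name and signature are hypothetical; the cap is passed in rather than hard-coded, and the stderr line merely stands in for the critical-mode log message.

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Average the claimed credit of the given instances, skipping any claim
// above max_granted_credit. Skipped instances are counted in n_rejected.
// Returns 0 if every instance was rejected.
double sane_average_credit(const std::vector<double>& claims,
                           double max_granted_credit,
                           int& n_rejected) {
    assert(max_granted_credit > 0);
    double sum = 0.0;
    int n_used = 0;
    n_rejected = 0;
    for (double c : claims) {
        if (c > max_granted_credit) {
            // Stand-in for a critical-mode log message.
            std::fprintf(stderr, "credit %g exceeds max %g; ignoring\n",
                         c, max_granted_credit);
            ++n_rejected;
            continue;
        }
        sum += c;
        ++n_used;
    }
    return n_used > 0 ? sum / n_used : 0.0;
}
```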
David Anderson d41f79588d - server daemons: add daemon_sleep(n), which sleeps for n secs
but checks for the "stop_daemons" trigger file every 1 sec.
    Use this instead of sleep() in daemons.
    This will speed up bin/stop.


svn path=/trunk/boinc/; revision=25708
2012-05-23 18:11:59 +00:00
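An interruptible sleep of the kind described in the commit above can be sketched as follows. This is not the BOINC daemon_sleep(): the name `daemon_sleep_sketch`, the predicate parameter, and the adjustable tick are assumptions made so the idea can be shown and exercised without touching the filesystem.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Sleep for `seconds` ticks, polling `stop_requested` before each tick.
// Returns true if a stop was observed, in which case the sleep ends early.
// The tick is one second in the described behavior; it is a parameter
// here only so the sketch can be run quickly.
bool daemon_sleep_sketch(
        int seconds,
        const std::function<bool()>& stop_requested,
        std::chrono::milliseconds tick = std::chrono::seconds(1)) {
    assert(seconds >= 0);
    for (int i = 0; i < seconds; ++i) {
        if (stop_requested()) return true;
        std::this_thread::sleep_for(tick);
    }
    return false;
}
```

In a daemon, the predicate would test for the existence of the "stop_daemons" trigger file. With a plain sleep(n), a stop request can wait up to n seconds to be noticed; polling every second bounds that delay, which is why bin/stop gets faster.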
David Anderson 759c23ed27 - server: create a harness for testing validator code.
If you link your functions (init_result(), compare_results(),
    cleanup_result()) with validate_test.cpp,
    you'll get a program that you can run as
        validate_test file1 file2
    and it will compare the two files
    (this works only for validators that expect 1 file per result).

    I added a makefile, sched/makefile_validator_test,
    that you can use for this.
- server: shuffle code so that the above doesn't need to
    link MySQL libraries
- client: if we fetch a master file and it contains no scheduler URLs,
    show a message of class INTERNAL_ERROR
- client/scheduler: make CUDA_DEVICE_PROP.totalGlobalMem a double,
    and remove dtotalGlobalMem.
    Although NVIDIA reports RAM size as a size_t,
    there's no reason to store it as an integer after that.


svn path=/trunk/boinc/; revision=25542
2012-04-10 00:32:35 +00:00
Bernd Machenschalk df439c128b validator: output the version string even when not in project directory
svn path=/trunk/boinc/; revision=25345
2012-02-27 11:54:02 +00:00
David Anderson e8657adfd2 - scheduler: change vbox_mt app plan function to use 1, 2 or 3 CPUs
depending on how many the host has,
    and whether CPU VM extensions are present
    (this reflects the requirements of CernVM).


svn path=/trunk/boinc/; revision=25009
2012-01-08 01:28:39 +00:00
David Anderson 5020e3af2f - validator: for credit_from_runtime,
use result.flops_estimate rather than host.p_fpops;
    otherwise it doesn't work for multicore apps.
    TODO: cheat-proofing


svn path=/trunk/boinc/; revision=25006
2012-01-06 22:22:02 +00:00
David Anderson e49f945908 - Validator: allow project-specific code to mark a result
as a "runtime outlier", i.e. its runtime does
    not correspond to the job's rsc_fpops_est.
    Runtime outliers are not counted in the statistics for
    elapsed time, turnaround time, and peak FLOPs count.

    This is intended for applications like SETI@home,
    some of whose jobs finish more or less instantly
    (this happens if the data contains a lot of interference).
    If a host happens to get a bunch of these short jobs,
    its statistics will get skewed: in essence, the server
    will think that the host is extremely fast,
    and will send it too many jobs.


svn path=/trunk/boinc/; revision=24225
2011-09-16 16:43:15 +00:00
David Anderson 176b0a4327 - validator: add a --credit_from_runtime option.
This assigns credit proportional to runtime*p_fpops.
    To prevent cheating, p_fpops is capped at the 95th percentile value
    among active hosts,
    and runtime is capped at a specified limit.
    This option supports apps, like LHC's CERNvm app,
    that run for a certain amount of time and then exit.
    The CreditNew system doesn't work for such apps.
- trickle_credit:
    To prevent cheating,
    cap p_fpops at the 95th percentile value among active hosts,
    and require a limit on runtime.
- require that trickle handlers supply an initialization function


svn path=/trunk/boinc/; revision=24182
2011-09-13 21:01:42 +00:00
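The capped runtime-based credit from the commit above can be sketched as below. The function and parameter names are hypothetical. The scale factor is an assumption based on the usual credit definition of 200 units per day of computing at 1 GFLOPS; the validator's actual constant may be expressed differently.

```cpp
#include <algorithm>
#include <cassert>

// Assumed scale: 200 credits per day at 1e9 FLOPS, expressed per FLOP.
constexpr double CREDIT_PER_FLOP = 200.0 / (86400.0 * 1e9);

// Credit proportional to runtime * p_fpops, with both factors capped:
// host speed at the 95th percentile among active hosts, and runtime at
// a project-specified limit.
double credit_from_runtime_sketch(double runtime_secs,
                                  double p_fpops,
                                  double p_fpops_95th,
                                  double max_runtime_secs) {
    assert(runtime_secs >= 0 && p_fpops >= 0);
    const double capped_speed = std::min(p_fpops, p_fpops_95th);
    const double capped_time = std::min(runtime_secs, max_runtime_secs);
    return capped_time * capped_speed * CREDIT_PER_FLOP;
}
```

Capping both inputs limits the payoff from a host that reports an inflated benchmark or an inflated runtime, since neither value can be verified directly by the server.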
David Anderson 048c6a48a4 - validator: add --no_credit option;
maintains stats but doesn't grant credit


svn path=/trunk/boinc/; revision=24175
2011-09-13 05:23:10 +00:00
David Anderson 9b89168c49 - validator: in "credit_from_wu" case, record what the new credit
system would have assigned in result.claimed_credit.

svn path=/trunk/boinc/; revision=24088
2011-08-30 22:28:52 +00:00
David Anderson 4d45dda3d9 - validator: update credit statistics even if credit_from_wu
is being used.
- web: make almost everything translatable.  From Christian Beer.


svn path=/trunk/boinc/; revision=24048
2011-08-25 22:12:48 +00:00
David Anderson bffeeb0851 - web: don't error out on old-style notice URL
svn path=/trunk/boinc/; revision=23506
2011-05-05 14:56:32 +00:00
David Anderson fb04266eaf - validator: fix bug when check_pair() returns retry=true.
svn path=/trunk/boinc/; revision=23443
2011-04-25 18:27:03 +00:00
David Anderson 73dfafde79 - validator: if --credit_from_wu is set, and no credit specified in WU,
assign zero credit and keep going
- client simulator work


svn path=/trunk/boinc/; revision=23231
2011-03-14 06:27:51 +00:00