A 'transient error' is one that will go away after a while,
e.g. an fopen() failure because of a broken NFS mount.
In general, the BOINC back-end code (validation, assimilation)
handles transient errors sensibly:
if there's a bad NFS mount, it retries validation for a few hours
rather than marking thousands of jobs as failed.
Extend this to script-based validation.
If a script (either init_result or compare_results) exits with 3,
treat that as a transient error.
Treat other nonzero exits (or lack of an exit code) as a permanent error.
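For illustration, a minimal sketch (hypothetical helper name,
not the actual script_validator code) of mapping a script's
exit status under this convention:

    #include <cstdlib>
    #include <sys/wait.h>

    enum SCRIPT_OUTCOME { SCRIPT_OK, SCRIPT_TRANSIENT, SCRIPT_PERMANENT };

    SCRIPT_OUTCOME run_validation_script(const char* cmd) {
        int status = std::system(cmd);
        if (status == -1 || !WIFEXITED(status)) {
            // no exit code at all (e.g. killed by a signal): permanent error
            return SCRIPT_PERMANENT;
        }
        int code = WEXITSTATUS(status);
        if (code == 0) return SCRIPT_OK;
        if (code == 3) return SCRIPT_TRANSIENT;  // exit 3 => transient; retry later
        return SCRIPT_PERMANENT;                 // any other nonzero exit
    }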
More generally (for all validators) add a return value
VAL_RESULT_TRANSIENT_ERROR for init_result() and compare_results().
This means any transient error.
Previously we checked only for ERR_OPENDIR.
And for compare_results() we treated all nonzero returns as permanent.
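A rough sketch of a project init_result() that reports a transient error;
the plugin signature and the get_output_file_path() helper follow the
validator framework, but treat the details as illustrative:

    #include <cstdio>
    #include <string>
    #include "validate_util.h"    // get_output_file_path(), per the framework
    #include "validate_util2.h"   // plugin declarations, VAL_RESULT_* values

    int init_result(RESULT& result, void*& data) {
        std::string path;
        int retval = get_output_file_path(result, path);
        if (retval) return retval;

        FILE* f = fopen(path.c_str(), "r");
        if (!f) {
            // could be a broken NFS mount; ask the framework to retry later
            return VAL_RESULT_TRANSIENT_ERROR;
        }
        // ... parse the output file, stash project-specific data in *data ...
        fclose(f);
        return 0;
    }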
Web: a couple of links still pointed to Trac;
change these to the Github wiki.
text_transform.inc: the [github]wiki:xxx[/github] tag linked
to a non-existent boinc-dev-doc repo.
Link to the Github wiki instead.
A validator can now mark a single result as "suspicious"
by having init_result() return VAL_RESULT_SUSPICIOUS.
If this is the single quorum result of an adaptive replication,
another task will be generated for validation.
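A minimal sketch (the anomaly check is hypothetical, project-specific code):

    int init_result(RESULT& result, void*& data) {
        // ... parse the result's output as usual ...
        if (output_looks_anomalous()) {     // hypothetical project-specific check
            return VAL_RESULT_SUSPICIOUS;   // ask for another replica before accepting
        }
        return 0;
    }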
The SETI@home result table is about to run out of 32-bit IDs,
so we need to move to 64-bit result IDs.
This will happen to the workunit table at some point too.
I changed the server C++ code to use the "long" type for all DB IDs
(and to use appropriate conversion codes like %lu).
"long" is 64 bit on 64-bit machines.
For uniformity I did this for all tables,
even ones (like app) that will never get big.
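Roughly, the pattern is (illustrative code, not the actual server source):

    #include <cstdio>

    void lookup_result(long result_id) {
        char query[256];
        // %ld matches the signed "long" ID; 64 bits on 64-bit Linux servers
        std::snprintf(query, sizeof(query),
            "select * from result where id=%ld", result_id);
        // ... run the query ...
    }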
I chose NOT to change the DB schema for now.
The new code will work with 32-bit ID fields in the DB.
As projects approach the 32-bit limit on a table they can change
its ID field, and fields that reference this table, to BIGINT.
This is likely to happen only on the result and workunit tables.
I put functions in html/ops/db_update.php
to change the IDs of these tables.
check_set() wasn't returning "retry" properly in the case where
one of the calls to init_result() returns ERR_OPENDIR
(treated as a transient failure, since it can be caused by a failed NFS mount).
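The corrected pattern, sketched with an abridged argument list
(the real check_set() in sched/validate_util2.cpp takes more parameters):

    #include <vector>

    int check_set(std::vector<RESULT>& results, bool& retry) {
        retry = false;
        for (RESULT& r : results) {
            void* data = NULL;
            int retval = init_result(r, data);
            if (retval == ERR_OPENDIR) {
                // transient failure (e.g. NFS mount not available);
                // tell the caller to try the whole set again later
                retry = true;
                return 0;
            }
            if (retval) {
                // permanent failure: mark just this result as invalid
                r.validate_state = VALIDATE_STATE_INVALID;
                continue;
            }
            // ... compare r against the other results, pick a canonical one ...
        }
        return 0;
    }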
This assigns credit proportional to runtime*p_fpops.
To prevent cheating, p_fpops is capped at the 95th percentile value
among active hosts,
and runtime is capped at a specified limit.
This option supports apps, like LHC's CERNvm app,
that run for a certain amount of time and then exit.
The CreditNew system doesn't work for such apps.
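A rough sketch of that computation; the 200-credit-per-GFLOPS-day
cobblestone scale is standard BOINC, the parameter names are illustrative:

    #include <algorithm>

    // 200 units of credit per day on a 1 GFLOPS host (the "cobblestone" scale)
    const double COBBLESTONE_SCALE = 200.0 / (86400.0 * 1e9);

    double runtime_credit(
        double runtime,        // seconds reported for the job
        double p_fpops,        // host's benchmarked FLOPS
        double p_fpops_95pct,  // 95th percentile of p_fpops over active hosts
        double max_runtime     // configured runtime cap
    ) {
        double fpops = std::min(p_fpops, p_fpops_95pct);
        double t = std::min(runtime, max_runtime);
        return t * fpops * COBBLESTONE_SCALE;
    }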
- trickle_credit:
To prevent cheating,
cap p_fpops at the 95th percentile value among active hosts,
and require a limit on runtime.
- require that trickle handlers supply an initialization function
svn path=/trunk/boinc/; revision=24182
see http://boinc.berkeley.edu/trac/wiki/CreditNew
Projects will need to update DB and recompile all back-end programs.
Summary:
- new way of computing credit
- "reliable host" mechanism is per app version
- "host punishment" mechanism is per app version
- adjustment of wu.rsc_fpops_est provides the
equivalent of per app version DCF (duration correction factor)
- max jobs in progress is now per app
- max jobs per RPC is now per app
TODO:
- reliable mechanism:
- populate and use host_app_version.error_rate
- populate host_app_version.turnaround
- host punishment:
- populate host_app_version.max_jobs_per_day
- populate host_app_version.n_jobs_today
- use app.max_jobs_per_day_init
- job limits:
- use app.max_jobs_in_progress, max_gpu_jobs_in_progress
- use app.max_jobs_per_rpc
- adjust wu.rsc_fpops_est
- remove old credit stuff
fpops_cumulative, credit_multiplier
credit computation in scheduler
- AVERAGE class: use the Knuth algorithm (Wikipedia)
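For reference, a sketch of the Knuth/Welford running-average update
this item refers to (names are illustrative, not the actual AVERAGE class):

    #include <cmath>

    struct AVERAGE_SKETCH {
        long long n = 0;
        double mean = 0;
        double m2 = 0;      // sum of squared deviations from the mean

        void update(double x) {
            n++;
            double delta = x - mean;
            mean += delta / n;          // incremental mean, no overflow-prone sum
            m2 += delta * (x - mean);   // uses the updated mean
        }
        double stdev() const {
            return n > 1 ? std::sqrt(m2 / (n - 1)) : 0;
        }
    };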
svn path=/trunk/boinc/; revision=21021