boinc

Commit Graph

Author	SHA1	Message	Date
David Anderson	d41f79588d	- server daemons: add daemon_sleep(n), which sleeps for n secs but checks for the "stop_daemons" trigger file every 1 sec. Use this instead of sleep() in daemons. This will speed up bin/stop. svn path=/trunk/boinc/; revision=25708	2012-05-23 18:11:59 +00:00
David Anderson	759c23ed27	- server: create a harness for testing validator code. If you link your functions (init_result(), compare_results(), cleanup_result()) with validate_test.cpp, you'll get a program that you can run as validate_test file1 file2 and it will compare the two files (this works only for validators that expect 1 file per result). I added a makefile, sched/makefile_validator_test, that you can use for this. - server: shuffle code so that the above doesn't need to link MySQL libraries - client: if we fetch a master file and it contains no scheduler URLs, show a message of class INTERNAL_ERROR - client/scheduler: make CUDA_DEVICE_PROP.totalGlobalMem a double, and remove dtotalGlobalMem. Although NVIDIA reports RAM size as a size_t, there's no reason to store it as an integer after that. svn path=/trunk/boinc/; revision=25542	2012-04-10 00:32:35 +00:00
Bernd Machenschalk	df439c128b	validator: output the version string even when not in project directory svn path=/trunk/boinc/; revision=25345	2012-02-27 11:54:02 +00:00
David Anderson	e8657adfd2	- scheduler: change vbox_mt app plan function to use 1, 2 or 3 CPUs depending on how many the host has, and whether CPU VM extensions are present (this reflects the requirements of CernVM). svn path=/trunk/boinc/; revision=25009	2012-01-08 01:28:39 +00:00
David Anderson	5020e3af2f	- validator: for credit_from_runtime, use result.flops_estimate rather than host.p_fpops; otherwise it doesn't work for multicore apps. TODO: cheat-proofing svn path=/trunk/boinc/; revision=25006	2012-01-06 22:22:02 +00:00
David Anderson	e49f945908	- Validator: allow project-specific code to mark a result as a "runtime outlier", i.e. its runtime does not correspond to the job's rsc_fpops_est. Runtime outliers are not counted in the statistics for elapsed time, turnaround time, and peak FLOPs count. This is intended for applications like SETI@home, some of whose jobs finish more or less instantly (this happens if the data contains a lot of interference). If a host happens to get a bunch of these short jobs, its statistics will get skewed: in essence, the server will think that the host is extremely fast, and will send it too many jobs. svn path=/trunk/boinc/; revision=24225	2011-09-16 16:43:15 +00:00
David Anderson	176b0a4327	- validator: add a --credit_from_runtime option. This assigns credit proportional to runtime*p_fpops. To prevent cheating, p_fpops is capped at the 95th percentile value among active hosts, and runtime is capped at a specified limit. This option supports apps, like LHC's CERNvm app, that run for a certain amount of time and then exit. The CreditNew system doesn't work for such apps. - trickle_credit: To prevent cheating, cap p_fpops at the 95th percentile value among active hosts, and require a limit on runtime. - require that trickle handlers supply an initialization function svn path=/trunk/boinc/; revision=24182	2011-09-13 21:01:42 +00:00
David Anderson	048c6a48a4	- validator: add --no_credit option; maintains stats but doesn't grant credit svn path=/trunk/boinc/; revision=24175	2011-09-13 05:23:10 +00:00
David Anderson	9b89168c49	- validator: in "credit_from_wu" case, record what the new credit system would have assigned in result.claimed_credit. svn path=/trunk/boinc/; revision=24088	2011-08-30 22:28:52 +00:00
David Anderson	4d45dda3d9	- validator: update credit statistics even if credit_from_wu is being used. - web: make almost everything translatable. From Christian Beer. svn path=/trunk/boinc/; revision=24048	2011-08-25 22:12:48 +00:00
David Anderson	bffeeb0851	- web: don't error out on old-style notice URL svn path=/trunk/boinc/; revision=23506	2011-05-05 14:56:32 +00:00
David Anderson	fb04266eaf	- validator: fix bug when check_pair() returns retry=true. svn path=/trunk/boinc/; revision=23443	2011-04-25 18:27:03 +00:00
David Anderson	73dfafde79	- validator: if --credit_from_wu is set, and no credit specified in WU, assign zero credit and keep going - client simulator work svn path=/trunk/boinc/; revision=23231	2011-03-14 06:27:51 +00:00
David Anderson	732866b8aa	- back end: add two example trickle handlers: trickle_credit: grants credit based on CPU time reported in msg trickle_echo: echoes trickle-up as a trickle-down svn path=/trunk/boinc/; revision=23118	2011-02-27 00:10:14 +00:00
David Anderson	b169e5ab0f	- server programs: print error message instead of numeric retval in log messages svn path=/trunk/boinc/; revision=22647	2010-11-08 17:51:57 +00:00
David Anderson	8aa29bec33	- validator: fix another bug with --credit_from_wu - make_project, update scripts: don't quit if user_profiles already exists svn path=/trunk/boinc/; revision=22630	2010-11-05 17:15:27 +00:00
David Anderson	805f73b66c	- validator: fix bug with --credit_from_wu HOWEVER: use of this option is discouraged. Use the default credit system. svn path=/trunk/boinc/; revision=22621	2010-11-03 22:06:56 +00:00
David Anderson	794214208f	- validator: if credit calculation returns an error, wait 6 hours before retrying svn path=/trunk/boinc/; revision=22418	2010-09-28 20:17:09 +00:00
Bernd Machenschalk	5cb98247a3	validator, assimilator: added --help and --version svn path=/trunk/boinc/; revision=21966	2010-07-16 07:15:57 +00:00
David Anderson	b677f0c25e	- validator: remove app and app_versions arguments from check_set(). These weren't used, and I'm not sure why they were added. - include sched_limit.h in "make install" list svn path=/trunk/boinc/; revision=21894	2010-07-12 21:35:05 +00:00
David Anderson	7c51512cbf	- transitioner: the format string for a DB query had %.15d instead of %.15e. That produced a messed-up query that assigned garbage values to: host_app_version.turnaround_var host_app_version.turnaround_q host_app_version.max_jobs_per_day host_app_version.consecutive_valid To repair these: - set turnaround_var and turnaround_q to zero - if max_jobs_per_day is outside of (0..config.daily_result_quota) set it to config.daily_result_quota - if consecutive_valid is outside (0..1000), set it to zero I added a script, html/ops/repair_21812.php, that does this; if you ran server code between [21181] and [21812], run this script. - scheduler/transitioner: add <debug_quota> log flag - changed the build system to always use -Wall (if we'd done this before, this bug wouldn't have happened) - fixed a bunch of other compile warnings svn path=/trunk/boinc/; revision=21812	2010-06-25 18:54:37 +00:00
David Anderson	89fab4ece5	- back end: change "daily result quota" mechanism. Old: config.xml specifies an initial daily quota (say, 100). Each host_app_version starts out with this quota. On the return of a SUCCESS result, the quota is doubled, up to the initial value. On the return of an error result, or a timeout, the quota is decremented down to 1. Problem: Doesn't accommodate hosts that can do more than 100 jobs/day. New: similar, but - on validation of a job, daily quota is incremented. - on invalidation of a job, daily quota is decremented. - on return of an error result, or a timeout, daily quota is min'd with initial quota, then decremented. Notes: - This allows a host to have an unboundedly large quota as long as it continues to return more valid than invalid results. - Even with this change, hosts that return SUCCESS but invalid results will continue to get the initial daily quota. It would be desirable to reduce their quota to 1. svn path=/trunk/boinc/; revision=21675	2010-06-02 00:11:01 +00:00
David Anderson	5035007b90	- back end: new way of deciding: - whether host is "reliable" for an app version - whether host is eligible for single replication for an app version - whether to use host scaling In each case, the answer is yes if the number of consecutive valid results is above a threshold. This replaces existing "error rate" and "scale probation" mechanisms. TODO: the # of consecutive valid results should also determine a limit on jobs in progress for an app version. Namely, if N is the threshold for host scaling, the limit should be ndevices*(max(1, consecutive_valid - N)) The client currently doesn't supply enough app version info to do this. It could be approximated; that would give some protection against cherry-picking. - credit: more conservative formulas for combining claimed credit among replicas. If there are normal replicas, we use a "low average" that weights each sample by the sum of the other samples. Otherwise we use the min (not the average) of the approximate samples. NOTE: a DB update is required svn path=/trunk/boinc/; revision=21230	2010-04-21 19:33:20 +00:00
David Anderson	b2451544e1	- server: change the following from per-host to per-(host, app version): - daily quota mechanism - reliable mechanism (accelerated retries) - "trusted" mechanism (adaptive replication) - scheduler: enforce host scale probation only for apps with host_scale_check set. - validator: do scale probation on invalid results (need this in addition to error and timeout cases) - feeder: update app version scales every 10 min, not 10 sec - back-end apps: support --foo as well as -foo for options Notes: - If you have, say, cuda, cuda23 and cuda_fermi plan classes, a host will have separate quotas for each one. That means it could error out on 100 jobs for cuda_fermi, and when its quota goes to zero, error out on 100 jobs for cuda23, etc. This is intentional; there may be cases where one version works but not the others. - host.error_rate and host.max_results_day are deprecated TODO: - the values in the app table for limits on jobs in progress etc. should override rather than config.xml. Implementation notes: scheduler: process_request(): read all host_app_versions for host at start; Compute "reliable" and "trusted" for each one. write modified records at end get_app_version(): add "reliable_only" arg; if set, use only reliable versions skip over-quota versions Multi-pass scheduling: if have at least one reliable version, do a pass for jobs that need reliable, and use only reliable versions. Then clear best_app_versions cache. Score-based scheduling: for need-reliable jobs, it will pick the fastest version, then give a score bonus if that version happens to be reliable. When get back a successful result from client: increase daily quota When get back an error result from client: impose scale probation decrease daily quota if not aborted Validator: when handling a WU, create a vector of HOST_APP_VERSION parallel to vector of RESULT. Pass it to assign_credit_set(). Make copies of originals so we can update only modified ones update HOST_APP_VERSION error rates Transitioner: decrease quota on timeout svn path=/trunk/boinc/; revision=21181	2010-04-15 03:13:56 +00:00
David Anderson	2536797068	- validator: remove update_credit_per_cpu_sec(). Irrelevant. TODO: remove related code - validator: update wu.canonical_credit correctly. However, this field should be deprecated. - validator: check for error return from assign_credit_set(). svn path=/trunk/boinc/; revision=21096	2010-04-05 20:03:54 +00:00
David Anderson	a2a661993b	- validator: -d 4 means -d 3 plus print all DB queries (todo: do this for all daemons) - validator: change cmdline args from -foo to --foo (todo: do this for all daemons) - validator: pass max_granted_credit to assign_credit_set() svn path=/trunk/boinc/; revision=21093	2010-04-05 18:59:16 +00:00
David Anderson	19f7d66b53	- backend programs: change the way PFC and elapsed-time statistics are written to the DB. The incremental approach was bogus. New approach: host_app_version: write directly; R/W interval is tiny app_version: maintain an explicit list of update samples for both PFC and credit. When the validator flushes its app_version cache, do careful updates. Note: when using double fields in careful updates, you can't test for equality. Use abs(new-old) < 1e-N svn path=/trunk/boinc/; revision=21057	2010-04-02 19:10:37 +00:00
David Anderson	fb851311e0	- server: various changes; see http://boinc.berkeley.edu/trac/wiki/CreditNew Projects will need to update DB and recompile all back-end programs. Summary: - new way of computing credit - "reliable host" mechanism is per app version - "host punishment" mechanism is per app version - adjustment of wu.rsc_fpops_est provides the equivalent of per app version DCF - max jobs in progress is now per app - max jobs per RPC is now per app TODO: - reliable mechanism: - populate and use host_app_version.error_rate - populate host_app_version.turnaround - host punishment: - populate host_app_version.max_jobs_per_day - populate host_app_version.n_jobs_today - use app.max_jobs_per_day_init - job limits: - use app.max_jobs_in_progress, max_gpu_jobs_in_progress - use app.max_jobs_per_rpc - adjust wu.rsc_fpops_est - remove old credit stuff fpops_cumulative, credit_multiplier credit computation in scheduler - AVERAGE class: use the Knuth algorithm (Wikipedia) svn path=/trunk/boinc/; revision=21021	2010-03-29 22:28:20 +00:00
David Anderson	381a15c724	- create_work function and script: check for valid ordering among max_success_results, max_total_results, max_error_results, and target_nresults svn path=/trunk/boinc/; revision=19054	2009-09-16 03:10:22 +00:00
David Anderson	fb443e5c31	- compile fixes svn path=/trunk/boinc/; revision=18832	2009-08-13 03:35:26 +00:00
David Anderson	3fb7c8f13f	- server code: moved everything related to credit-granting to credit.cpp, where it can be used by trickle handlers as well as by validators. svn path=/trunk/boinc/; revision=18831	2009-08-12 16:26:43 +00:00
David Anderson	7484aeccf1	- validator: prepare for code cleanup svn path=/trunk/boinc/; revision=18824	2009-08-10 04:22:02 +00:00
David Anderson	12eb6057e5	- client, Mac: don't do res_init(). It causes a crash. - client (Unix): if client crashes while benchmark processes are going, make sure they detect this and exit. - back-end programs: remove hardwired assumptions about what directory they run in, and hence where config.xml is. E.g., daemons look for it in "..", others expect it in current dir. New approach: all the programs look for the project dir as follows: 1) the environment var BOINC_PROJECT_DIR, if defined 2) the current dir, if config.xml is there. 3) else ".." This means you can run programs in either proj/bin/ or proj/, or (using BOINC_PROJECT_DIR) you can keep executables outside of the project dir. svn path=/trunk/boinc/; revision=18042	2009-05-07 13:54:51 +00:00
David Anderson	e3a730c334	- client: add <use_all_gpus> config option. If set, use GPUs even if they're not equivalent to the most capable one. - Validator: fix one_pass_N_WU option. svn path=/trunk/boinc/; revision=17896	2009-04-27 23:51:46 +00:00
Eric J. Korpela	8f3abcc835	- Added checks for net/*.h, arpa/*.h, netinet/*.h and code to figure out which of those files to include - Modified MAC address check to work on some non-Linux unixes. (mac_address.cpp) - Added suggested change to "already attached to project" checking. (ProjectInfoPage.cpp) - changed includes of standard c header files to their c++ equivalents (i.e. replaced <stdio.h> with <cstdio>) for namespace protection. - replaced "using namespace std;" with more explicit "using std::function" in several files. - Fixed bug in checking whether the os is OS/2 and added conditional OS_OS2 to the build environment. (boinc_platform.m4,configure.ac) - Changed build environment to not use -nostandardlibs unless we are using G++ and static linkage is specified. (configure.ac) - Added makefiles and package building files for solaris CSW package manager. - Fixed bug with attempting to find login name using logname. (configure.ac) - Added ifdef HAVE_ protection around some include files commonly found in sys. - Added support for unified binary for x86_64/i686-pc-solaris. (cs_platforms.cpp) - generate_host_cpid() now uses MAC address on non-linux unix. (hostinfo_network.cpp) - Macro BOINC_SET_COMPILE_FLAGS now doesn't check gcc only flags on non-gcc compilers. (boinc_set_compile_flags.m4) - Library compiles no longer depend upon the library extension or require the library to be prefixed with lib. - More fixes for fcgi builds. - Added declaration of "struct ether_addr" and ether_ntoa(). Have not yet implemented ether_ntoa() for machines that don't have it, or where it is buggy. (unix_util.h) - Added FCGI::perror() which calls FCGI_perror(). (boinc_fcgi.{h,cpp}) - Fixed library Makefiles so that all required headers get installed. svn path=/trunk/boinc/; revision=17388	2009-02-26 00:23:23 +00:00
David Anderson	57b92fb40a	- scheduler: #ifdef'd tweaks for server simulator svn path=/trunk/boinc/; revision=16097	2008-09-30 18:21:41 +00:00
David Anderson	98cfb8d3b0	- rename .C files to .cpp so that Doxygen will work svn path=/trunk/boinc/; revision=16069	2008-09-26 18:20:24 +00:00

37 Commits