treat it as a "backup project":
fetch work from it only if there is an idle instance
and no other projects have work.
svn path=/trunk/boinc/; revision=20286
if job A is unstarted and EDF,
and there's a job B that is later in the list,
is started, has the same app version,
and has the same arrival time,
move A after B.
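A minimal sketch of this reordering pass (JOB and its fields are
illustrative names, not the client's actual structures):

    #include <vector>

    struct JOB {
        int app_version_id;
        double arrival_time;
        bool started;
        bool edf;          // scheduled in earliest-deadline-first mode
    };

    // Move each unstarted EDF job A after the last later job B that is
    // started and has the same app version and arrival time.
    void promote_started_jobs(std::vector<JOB*>& list) {
        size_t i = 0;
        while (i < list.size()) {
            JOB* a = list[i];
            size_t target = i;
            if (!a->started && a->edf) {
                for (size_t j = i + 1; j < list.size(); j++) {
                    JOB* b = list[j];
                    if (b->started
                        && b->app_version_id == a->app_version_id
                        && b->arrival_time == a->arrival_time
                    ) {
                        target = j;
                    }
                }
            }
            if (target != i) {
                list.erase(list.begin() + i);
                list.insert(list.begin() + target, a);
                // don't advance i: the next job shifted into slot i
            } else {
                i++;
            }
        }
    }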
- client: remove the "temp_dcf" mechanism,
which had the same goal but didn't work.
- client: in computing overall debt for a project,
subtract a term that reflects pending work.
This should reduce repeated fetches from the same project.
- client simulator: tweaks
svn path=/trunk/boinc/; revision=20223
- a project overestimates job FLOP counts
- the client starts jobs in EDF mode
- as the job progresses and fraction done increases,
its completion time estimate decreases until
it's no longer a deadline miss.
- the job gets preempted by another job from that project;
you end up with lots of partly completed jobs.
Solution (I hope): if an app version has running jobs,
compute a "temp DCF" for the app version,
which is the min, over its jobs, of the dynamic/static estimate ratio.
Apply this scaling factor to completion time estimates
for unstarted jobs in RR simulation.
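A sketch of the scaling, with illustrative names (RUNNING_JOB and the
estimate fields are not the client's actual structures):

    #include <algorithm>
    #include <vector>

    struct RUNNING_JOB {
        double static_estimate;   // whole-job estimate from the FLOP count
        double dynamic_estimate;  // estimate based on elapsed time and fraction done
    };

    // "Temp DCF" for an app version: the most optimistic ratio of
    // dynamic to static completion estimates over its running jobs.
    // Assumes the app version has at least one running job.
    double temp_dcf(const std::vector<RUNNING_JOB>& running) {
        double dcf = running[0].dynamic_estimate / running[0].static_estimate;
        for (const RUNNING_JOB& j : running) {
            dcf = std::min(dcf, j.dynamic_estimate / j.static_estimate);
        }
        return dcf;
    }

    // In RR simulation, an unstarted job's completion time estimate
    // becomes temp_dcf(...) * its static estimate.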
- client: the estimation of remaining time of running jobs was wrong
(how did this bug survive so long?)
svn path=/trunk/boinc/; revision=20077
RAM to run a job, but when we actually run the job
not enough GPU RAM is free, so the application fails.
This can cause a large number of jobs to fail.
Solution:
- app_plan() can specify the GPU RAM requirements of an app version.
This is passed to the client in a new field
<gpu_ram> of the <app_version> element.
- prior to starting or restarting a GPU app, the client
checks the amount of free RAM on the particular GPU.
If it's not enough for the app version,
the client doesn't start it,
and arranges for the scheduler to ignore it for 5 minutes
(by which point there might be more free GPU RAM).
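A sketch of the check, assuming the client knows the free RAM of each
GPU instance (names are illustrative; the real check lives in
enforce_schedule()):

    // From the <gpu_ram> element of <app_version>.
    struct APP_VERSION {
        double gpu_ram;         // bytes required on the GPU
    };

    struct GPU_INSTANCE {
        double available_ram;   // bytes currently free on this GPU
    };

    const double GPU_RAM_DEFER_SECS = 300;  // ignore the job for 5 minutes

    // Called just before starting or restarting a GPU job on a
    // specific GPU instance.
    bool gpu_ram_ok(const GPU_INSTANCE& gpu, const APP_VERSION& av) {
        return gpu.available_ram >= av.gpu_ram;
    }

    // if (!gpu_ram_ok(gpu, av)) {
    //     result.schedule_backoff = now + GPU_RAM_DEFER_SECS;
    //     // by then there might be more free GPU RAM
    // }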
Notes:
1) this change will have effect only when
both client and scheduler are updated.
2) the check is done in enforce_schedule(),
rather than schedule_cpus(),
because only at that point
have we assigned a specific GPU to the job.
3) there's another case to deal with:
a GPU app's malloc of GPU RAM fails in the middle of the job.
Currently the job fails.
I plan to add an API call boinc_temporary_exit(x) so
that the job can exit and potentially restart in x seconds.
(In principle this mechanism is sufficient for all cases,
but it could lead to a lot of starting/exiting,
so the current change is worthwhile).
svn path=/trunk/boinc/; revision=19864
It computed an "overall STD" as the sum of CPU and coprocs,
weighted by the coproc's speed, as we do for LTD.
This was the wrong idea; in the presence of GPUs,
STDs quickly get pushed to ±1 day and are truncated there.
New scheme: STD is maintained per (resource type, project).
This fixes the above problem,
and it opens the door to round-robin scheduling of GPUs.
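A sketch of the new bookkeeping (illustrative; not the client's actual
declarations):

    #include <map>
    #include <utility>

    enum RESOURCE_TYPE { RSC_CPU, RSC_CUDA, RSC_ATI };
    struct PROJECT;   // opaque here

    // One short-term debt accumulator per (resource type, project),
    // so CPU and GPU debts are no longer summed into a single value
    // that saturates at +-1 day.
    std::map<std::pair<RESOURCE_TYPE, PROJECT*>, double> short_term_debt;

    void add_std(RESOURCE_TYPE rsc, PROJECT* p, double delta) {
        double& d = short_term_debt[{rsc, p}];
        d += delta;
        // truncation now applies per resource type
        if (d > 86400) d = 86400;
        if (d < -86400) d = -86400;
    }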
- client: the calculation of "anticipated debt" was scaling
by relative resource share.
This wasn't correct, it seems to me.
- client: rename "debt" to "long_term_debt" in a few places
(but not in the client state file, for compatibility)
svn path=/trunk/boinc/; revision=19777
Old: it's based entirely on CPU time.
So a GPU project, whose app uses only a fraction
of a CPU, accrues positive debt.
This is OK if the project has only GPU apps,
since STD is not (currently) used for GPU scheduling.
But some projects have both CPU and GPU apps.
New: STD is based on total processing.
It has terms for each resource type.
The notion of "runnable resource share" is specific to a type.
Note: the notion of "resource share fraction" appears in
a couple of other places:
- it's passed to apps in app_init_data.xml
- it's passed in scheduler requests.
It should be broken down by resource type in these cases too.
Note to self: do this later.
svn path=/trunk/boinc/; revision=19762
set a "coproc_missing" flag rather than aborting the job.
If the user removes a GPU board while there's a large queue of GPU jobs,
they'll stay queued (until their deadline passes).
Note: this doesn't fix the situation where user connects via
Remote Desktop while GPU jobs are running or queued.
We should check for Remote Desktop every minute or so, and stop GPU jobs.
svn path=/trunk/boinc/; revision=19287
to accept CPU, NVIDIA and ATI jobs.
These prefs are shown only where relevant:
e.g., only for processor types for which the project has app versions,
and if it has versions for only one type, no pref is shown.
These prefs affect both client and scheduler.
The client won't ask for work for a device blocked by prefs,
and the scheduler won't send it.
This replaces earlier optional project-specific prefs for
"no CPU jobs" and "no GPU jobs".
(However, these prefs continue to be honored on the server side).
- client: if the NVIDIA driver version is unknown, say that rather than 0
svn path=/trunk/boinc/; revision=19194
ones already running.
The problem: we considered a job as started if it had an ACTIVE_TASK.
However, we were creating ACTIVE_TASKs for jobs before deciding
to run them, because we needed a place to store the coproc reservations.
This caused the above bug, and also had the undesirable effect
of creating slot directories before they're needed.
Solution: store coprocessor reservations in RESULT
rather than ACTIVE_TASK.
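A sketch of the restructuring (field names are illustrative):

    // Coproc reservations now live in RESULT, which exists for the
    // whole lifetime of the job...
    struct RESULT {
        double coproc_cuda_reserved;   // GPU instances reserved, 0 if none
        double coproc_ati_reserved;
    };

    // ...so an ACTIVE_TASK no longer needs to be created just to hold
    // them, and its existence once again means "this job was started".
    struct ACTIVE_TASK {
        RESULT* result;
        // slot directory, process state, etc.
    };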
svn path=/trunk/boinc/; revision=19129
e.g. the Milkyway@home ATI app, of which we can typically run
2 or 3 instances at once on a GPU.
Changes include:
- In APP_VERSION, don't use a COPROCS to represent the GPU
requirements; just use doubles ncudas and natis.
- sufficient_coprocs() etc. are no longer members of COPROCS
- in HOST_USAGE, ncudas and natis are doubles
- in scheduler request, req_instances is now a double
This checkin doesn't include the job scheduling logic,
i.e. assigning jobs to GPUs. That will follow.
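A sketch of the fractional usage (illustrative names):

    // In HOST_USAGE, GPU requirements are now doubles.
    struct HOST_USAGE {
        double ncudas;   // e.g. 0.5 means two jobs share one NVIDIA GPU
        double natis;
    };

    // A GPU instance has capacity 1.0; a job fits if the instance's
    // current usage plus the job's requirement stays within it.
    bool fits_on_instance(double used, double job_usage) {
        return used + job_usage <= 1.0;
    }

With usage 0.33, three Milkyway-style jobs can occupy one ATI GPU at once.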
svn path=/trunk/boinc/; revision=18868
2 * max(ncpus, ngpus);
show this in the state displayed by <work_fetch_debug>
- manager: show project-wide backoff in transfers tab
svn path=/trunk/boinc/; revision=18662
uploads and downloads.
I originally added this on 30 Sept 2005
and disabled it 2 weeks later because there were reports of problems.
However, we need this functionality
(e.g. on GPU hosts with hundreds of files to upload,
we need to back off after a few failures, not try all of them).
I added messages (<file_xfer_debug>) so you can see what's going on.
Fixes #932.
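A sketch of the backoff policy (the constants here are assumptions,
not the client's actual tuning):

    #include <algorithm>
    #include <cmath>

    struct PROJECT_XFER_BACKOFF {
        int failures = 0;
        double next_xfer_time = 0;   // epoch seconds

        void on_failure(double now) {
            failures++;
            double delay = std::min(
                4 * 3600.0,                      // cap the backoff
                60.0 * std::pow(2.0, failures)   // exponential growth
            );
            next_xfer_time = now + delay;
        }
        void on_success() {
            failures = 0;
            next_xfer_time = 0;
        }
        bool ok_to_transfer(double now) const {
            return now >= next_xfer_time;
        }
    };

With one such structure per (project, direction), a few failed uploads
back off the whole batch instead of trying hundreds of files in turn.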
svn path=/trunk/boinc/; revision=18593
New approach: take the "ordered_schedule_results" list,
add running jobs that haven't finished their time slice,
and order the result appropriately.
Then run jobs in order until CPUs are filled.
Simpler and clearer than the old way.
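A sketch of the approach (JOB and its fields are illustrative):

    #include <algorithm>
    #include <vector>

    struct JOB {
        bool running;
        double time_slice_left;   // > 0: hasn't finished its time slice
        double priority;          // higher runs first
    };

    void enforce(std::vector<JOB*> jobs,    // ordered_schedule_results
                 const std::vector<JOB*>& running_jobs, int ncpus) {
        // add running jobs that haven't finished their time slice
        // (duplicate handling omitted in this sketch)
        for (JOB* j : running_jobs) {
            if (j->time_slice_left > 0) jobs.push_back(j);
        }
        // order the combined list appropriately
        std::sort(jobs.begin(), jobs.end(),
            [](const JOB* a, const JOB* b) { return a->priority > b->priority; });
        // run jobs in order until CPUs are filled
        int used = 0;
        for (JOB* j : jobs) {
            if (used == ncpus) break;
            // start j, or leave it running
            used++;
        }
    }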
svn path=/trunk/boinc/; revision=17992
- first schedule jobs projected to miss deadline in EDF order
- then schedule remaining jobs in FIFO order
This is intended to reduce the number of preemptions of coproc jobs,
and hence (since they are always preempted by quit)
to reduce the wasted time due to checkpoint gaps.
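A sketch of the two-phase ordering (illustrative names):

    #include <algorithm>
    #include <vector>

    struct JOB {
        double deadline;
        double arrival_time;
        bool projected_miss;   // RR simulation predicts a deadline miss
    };

    std::vector<JOB*> order_jobs(const std::vector<JOB*>& jobs) {
        std::vector<JOB*> edf, fifo;
        for (JOB* j : jobs) (j->projected_miss ? edf : fifo).push_back(j);
        // projected deadline misses first, earliest deadline first
        std::sort(edf.begin(), edf.end(),
            [](const JOB* a, const JOB* b) { return a->deadline < b->deadline; });
        // then the rest in FIFO (arrival) order
        std::sort(fifo.begin(), fifo.end(),
            [](const JOB* a, const JOB* b) { return a->arrival_time < b->arrival_time; });
        edf.insert(edf.end(), fifo.begin(), fifo.end());
        return edf;
    }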
- client: the CPU scheduling policy made use of the number
of deadline misses in various places.
This should include only the deadline misses of CPU jobs.
So move "deadlines_missed" from RR_SIM_STATUS and PROJECT
to RSC_PROJECT_WORK_FETCH so that we have separate counts
for CPU and coproc jobs, and use the count for CPU jobs.
- GUI RPC: removed the rr_sim_deadlines_missed field
from project descriptor.
This is no longer meaningful, and it didn't seem to be used anywhere.
svn path=/trunk/boinc/; revision=17785
keep track of the largest WSS (working set size) of tasks using it.
In checking whether tasks fit in RAM,
use this as an estimate for tasks that haven't started yet.
This avoids a situation where the client starts a lot of
tasks in sequence, only to find that each one doesn't fit in RAM.
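A sketch of the bookkeeping (illustrative names):

    #include <algorithm>

    struct APP_VERSION {
        double max_working_set_size = 0;   // largest WSS seen, bytes
    };

    // Called whenever a running task's working set is measured.
    void on_wss_sample(APP_VERSION& av, double wss) {
        av.max_working_set_size = std::max(av.max_working_set_size, wss);
    }

    // For a task that hasn't started, estimate its working set as the
    // largest seen for its app version, falling back to the project's
    // declared bound if nothing has run yet.
    double estimated_wss(const APP_VERSION& av, double declared_bound) {
        return av.max_working_set_size > 0
            ? av.max_working_set_size : declared_bound;
    }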
svn path=/trunk/boinc/; revision=17765
CPU time is visible in task Properties.
- Manager: in task Properties, show final CPU and elapsed times
if job is finished
- client: honor backoff for account-manager-requested scheduler RPCs
- client: keep track of final elapsed time for results
- GUI RPC: report final elapsed time
svn path=/trunk/boinc/; revision=17588
and passing them the corresponding --device N cmdline args.
This fixes a bug introduced in 17402 (Feb 26)
that broke the --device feature,
presumably causing problems on systems with multiple GPUs.
svn path=/trunk/boinc/; revision=17549
app versions in scheduler reply
- client: when reporting anonymous platform apps in sched request,
don't include <file_info>s (not relevant to server)
svn path=/trunk/boinc/; revision=17507
which of those files to include
- Modified MAC address check to work on some non-Linux unixes.
(mac_address.cpp)
- Added suggested change to "already attached to project" checking.
(ProjectInfoPage.cpp)
- changed includes of standard C header files to their C++ equivalents
(e.g. replaced <stdio.h> with <cstdio>) for namespace protection.
- replaced "using namespace std;" with more explicit "using std::function" in
several files.
- Fixed bug in checking whether the OS is OS/2 and added conditional OS_OS2
to the build environment. (boinc_platform.m4,configure.ac)
- Changed build environment to not use -nostandardlibs unless we are using
G++ and static linkage is specified. (configure.ac)
- Added makefiles and package building files for the Solaris CSW package manager.
- Fixed bug with attempting to find login name using logname. (configure.ac)
- Added ifdef HAVE_* protection around some include files commonly found in
sys.
- Added support for unified binary for x86_64/i686-pc-solaris.
(cs_platforms.cpp)
- generate_host_cpid() now uses the MAC address on non-Linux Unixes.
(hostinfo_network.cpp)
- Macro BOINC_SET_COMPILE_FLAGS now doesn't check gcc-only flags on non-gcc
compilers. (boinc_set_compile_flags.m4)
- Library compiles no longer depend upon the library extension or require
the library to be prefixed with lib.
- More fixes for fcgi builds.
- Added declaration of "struct ether_addr" and ether_ntoa(). Have not yet
implemented ether_ntoa() for machines that don't have it, or where it is
buggy. (unix_util.h)
- Added FCGI::perror() which calls FCGI_perror(). (boinc_fcgi.{h,cpp})
- Fixed library Makefiles so that all required headers get installed.
svn path=/trunk/boinc/; revision=17388
using a coprocessor we don't know about, ignore it
(and all results using that app_version will be flushed).
This deals with the situation where we have some GPU jobs,
but the GPU card is removed (previously this resulted in a crash).
This requires some code shuffling so that we check for coprocessors
before reading the state file.
svn path=/trunk/boinc/; revision=17161
worked in the presence of coprocessors.
The simulator maintained per-project queues of pending jobs.
When a job finished (in the simulation) it would get
one or more jobs from that project's pending queue.
The problem: this could cause "holes" in the scheduling of GPUs,
and produce an erroneous nonzero shortfall for GPUs,
leading to infinite work fetch.
The solution: maintain a separate (per-resource, not per-project)
queue of pending coprocessor jobs.
When a coprocessor job finishes,
start pending jobs from the queue for that resource.
Another change: the simulator did strict reservation of coprocessors.
If there are 2 instances of CUDA,
and a 1-instance job is running in the simulation,
it wouldn't start an additional 2-instance job.
This also can cause erroneous nonzero shortfalls.
So instead, schedule coprocessors like CPUs, i.e. saturate them.
This can cause distorted completion time estimates,
but it's better than infinite work fetch.
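A sketch of the revised bookkeeping (illustrative; not the simulator's
actual declarations):

    #include <deque>
    #include <map>

    enum RESOURCE_TYPE { RSC_CPU, RSC_CUDA, RSC_ATI };
    struct SIM_JOB;   // opaque here

    // One pending queue per resource type (not per project), so a GPU
    // is never left idle while some project still has a GPU job queued.
    std::map<int, std::deque<SIM_JOB*>> pending;

    SIM_JOB* next_pending(RESOURCE_TYPE rsc) {
        std::deque<SIM_JOB*>& q = pending[rsc];
        if (q.empty()) return nullptr;
        SIM_JOB* j = q.front();
        q.pop_front();
        return j;   // started immediately: the resource is saturated,
                    // not strictly reserved, just like the CPUs
    }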
svn path=/trunk/boinc/; revision=17093
- client: respect work-fetch backoff for non-CPU-intensive projects
- client: for a non-CPU-intensive project, fetch a new job
if there are no currently running jobs
- client: skip non-CPU-intensive projects in debt calculations
- manager: show resource backoff times correctly
svn path=/trunk/boinc/; revision=16998
(otherwise it doesn't work for coproc or multi-proc apps)
- client: in estimate of job completion time,
weight the estimate based on fraction done more heavily
(quadratic rather than linear)
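A sketch of the weighting (parameter names are illustrative):

    // Blend the two remaining-time estimates, trusting the
    // fraction-done-based one more as the job nears completion.
    double est_time_remaining(
        double fraction_done,      // 0..1, reported by the app
        double elapsed,            // wall time so far
        double static_remaining    // remaining time per the FLOP count
    ) {
        if (fraction_done <= 0) return static_remaining;
        double dynamic = elapsed * (1 - fraction_done) / fraction_done;
        double w = fraction_done * fraction_done;   // quadratic, not linear
        return w * dynamic + (1 - w) * static_remaining;
    }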
svn path=/trunk/boinc/; revision=16603
as the basis for estimating job completion times.
This should improve estimates for GPU apps,
and prevent the DCF from getting messed up.
svn path=/trunk/boinc/; revision=16598
- scheduler: fix bug in adaptive replication:
if we send an unreplicated job to an untrusted host,
set both wu.target_nresults and wu.min_quorum to app.target_nresults.
svn path=/trunk/boinc/; revision=15762
This supports apps that can do variable amounts of computing;
they can boinc_finish() if their deadline is near.
Rom: please back-port.
svn path=/trunk/boinc/; revision=15395
and clear all timeout variables.
This should fix the situation where, say:
1) the user sets the system clock forward by a year;
2) all projects get their min_rpc_time set;
3) the user sets the system clock back to the correct time.
Previously, BOINC would not do anything for a year.
Note: a restart of BOINC is required to fix things.
It would be harder to do this on the fly.
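A sketch of the detection, assuming the client records the last time it
ran in its state file (illustrative names):

    struct PROJECT {
        double min_rpc_time;   // don't contact the scheduler before this
    };

    // At startup, compare the current time with the time recorded in
    // the state file; if the clock went backward, stored timeouts may
    // lie up to a year in the future, so clear them all.
    void check_clock_reset(double now, double last_run_time,
                           PROJECT** projects, int nprojects) {
        if (now >= last_run_time) return;   // clock moved forward: fine
        for (int i = 0; i < nprojects; i++) {
            projects[i]->min_rpc_time = 0;
        }
    }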
svn path=/trunk/boinc/; revision=15314