Note: this fixes a major problem (starvation)
with project-level GPU exclusion.
However, project-level GPU exclusion interferes with most of
the client's scheduling policies.
E.g., round-robin simulation doesn't take GPU exclusion into account,
and the resulting completion estimates and device shortfalls
can be wrong by an order of magnitude.
The only way I can see to fix this would be to model each
GPU instance as a separate resource,
and to associate each job with a particular GPU instance.
This would be a sweeping change in both client and server.
Old: heartbeat mechanism
Problem: if the client is blocked for > 30 secs
(e.g. because it takes a long time to write the state file,
or because it's stopped in a debugger)
then apps exit.
This is bad if the app doesn't checkpoint and has been
running for a long time.
New: the client passes its PID to the app.
The app periodically (10 sec) checks that the process still exists.
Notes:
- For backward compatibility (e.g. new API w/ old client,
or vice versa) the client still sends heartbeats,
and the API checks heartbeats if the client doesn't pass a PID.
- The new mechanism works only if the client's PID isn't assigned
to a new process within 10 secs of the client exiting.
Windows 2000 reuses PIDs immediately, so check for Win2K
and don't use this mechanism if so.
TODO: For Unix multithread apps,
critical sections aren't currently being enforced.
Need to fix this by masking signals.
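Sketch of the PID check described above (POSIX only).
This is an illustration, not the actual API code:
process_exists() is a hypothetical name,
and the Windows path and the Win2K test are not shown.
The app would call it about every 10 sec with the PID
passed by the client, and exit if it returns false.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns true if a process with this PID exists.
// kill() with signal 0 sends nothing; it only performs the
// existence and permission checks.
bool process_exists(pid_t pid) {
    if (pid <= 0) return false;        // 0 and negatives address process groups
    if (kill(pid, 0) == 0) return true;
    // EPERM means the process exists but we may not signal it.
    return errno == EPERM;
}
```

As in the note above, this is reliable only if the OS
doesn't hand the PID to a new process between checks.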
svn path=/trunk/boinc/; revision=26147
clear the suspend_request flag.
Otherwise we'll end up doing two suspends,
and on Win the app will be suspended forever.
svn path=/trunk/boinc/; revision=26143
of NVIDIA APIs. This apparently caused crashes
(in app, not client, which I don't understand) for Einstein@Home.
From Steffen Moller.
svn path=/trunk/boinc/; revision=25527
File verify is done in 4 places:
- after a download finishes
- transition result to DOWNLOADED
- if project->verify_files_on_app_start, on app start
Use asynchrony only in the first 2 cases,
since the async logic is set up to mark the file as PRESENT
when done, not to restart a task.
svn path=/trunk/boinc/; revision=25219
resource-specific backoff and exclusion
Old: client writes
<rsc_backoff_time>
<rsc_backoff_interval>
<no_rsc_ams>
<no_rsc_apps>
<no_rsc_pref>
in GUI RPC entries for projects.
Manager (GUI RPC client): PROJECT struct has
cpu_backoff_time
cpu_backoff_interval
... cuda, ati
no_cpu_pref
... cuda, ati
and it parses tags of these names.
In other words, no information is being conveyed
from client to Manager.
New:
manager parses both forms
svn path=/trunk/boinc/; revision=25217
This now supports two main use cases:
1) there's a job that you want to run once on all hosts,
present and future
(or all hosts belonging to a user, or to a team).
The job is never transitioned, validated, or assimilated.
2) There's a normal job for which you want to use only
hosts belonging to a specific user (e.g. cluster or cloud hosts).
This restriction can be made either when the job is created,
or on the fly,
e.g. as part of a scheme for accelerating batch completion.
For the latter purposes we now provide a function
restrict_wu_to_user(DB_WORKUNIT&, int userid);
The job goes through the standard
transitioner/validator/assimilator path.
These cases are enabled by config flags
<enable_assignment_multi/>
<enable_assignment/>
respectively.
Assignments of type 2) are no longer stored in shared mem,
so there is no limit on their number.
There is no longer a rule that assigned job names must contain "asgn".
NOTE: this requires a database update.
svn path=/trunk/boinc/; revision=25169
- web: show BBCode info in the same page, rather than target=new.
On Firefox, this opens a new tab but doesn't switch to it,
which makes it look like nothing happened.
svn path=/trunk/boinc/; revision=24622
If the file "client_opaque.txt" exists on the client,
include its contents in scheduler request messages.
On the scheduler, parse this into SCHEDULER_REQUEST::client_opaque,
where it can be used by the customizable scheduler functions.
svn path=/trunk/boinc/; revision=24586
for canceling jobs
- added program cancel_jobs for canceling jobs
- DB interface: it's not an error if update_fields_noid()
affects != 1 rows
svn path=/trunk/boinc/; revision=24413
so that if you use <http_debug> and filter by project
you don't see other projects' HTTP stuff
- client simulator: cc_config.xml is part of the scenario;
log flags are part of the simulation
svn path=/trunk/boinc/; revision=24410
for when the job completed successfully but
one or more output files had permanent upload failures.
Show this state in web interfaces.
- sample_work_generator: check return value of count_unsent_results(),
so that we don't generate infinite work if there's a DB problem
- web: RSS feed shows news items from last 90 days, rather than 14
svn path=/trunk/boinc/; revision=24377
- Fix build problems on Mac OS X using autotools
- Consistently use #if HAVE_X for platform checks,
rather than #ifdef HAVE_X or #if defined(HAVE_X)
- In Unix build, make lots of compiler checks standard
- Fix some compile warnings
From Matt Arsenault.
Note: there are now lots of compile warnings in clientgui/ on Unix,
mostly in wxWidgets code
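Illustration of one way the forms differ.
HAVE_X here is a placeholder, and defining it to 0 is an assumption
made for the example: #ifdef and #if defined() test only whether
the macro is defined, while #if tests its value
(an undefined macro evaluates to 0 in #if).

```cpp
// Assumption for illustration: the feature macro is defined, but to 0.
#define HAVE_X 0

#ifdef HAVE_X
constexpr bool ifdef_sees_x = true;    // taken: the macro is defined
#else
constexpr bool ifdef_sees_x = false;
#endif

#if HAVE_X
constexpr bool if_sees_x = true;
#else
constexpr bool if_sees_x = false;      // taken: the macro's value is 0
#endif
```

With this definition the two checks disagree:
ifdef_sees_x is true and if_sees_x is false.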
svn path=/trunk/boinc/; revision=24303
- measure the available RAM of each GPU when BOINC starts up.
If this fails, set available = physical.
Show available RAM in startup messages.
- use available RAM rather than physical RAM in selecting
the "best" GPU instance
- report available RAM to the scheduler
TODO: change the scheduler to use available rather than physical
if it's reported
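Sketch of the fallback and selection described above.
GpuInfo, effective_available() and best_gpu() are hypothetical names,
and a negative value is assumed to mark a failed measurement.
The sketch ranks by available RAM alone;
the client's actual "best" comparison may weigh other properties too.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-GPU record; sizes in bytes.
struct GpuInfo {
    double physical_ram;
    double available_ram;   // assumed < 0 if the measurement failed
};

// If measuring available RAM failed, fall back to physical RAM.
double effective_available(const GpuInfo& g) {
    return g.available_ram < 0 ? g.physical_ram : g.available_ram;
}

// Index of the GPU with the most available RAM, or -1 if there are none.
// Ties keep the lowest index.
int best_gpu(const std::vector<GpuInfo>& gpus) {
    int best = -1;
    for (std::size_t i = 0; i < gpus.size(); i++) {
        if (best < 0 ||
            effective_available(gpus[i]) > effective_available(gpus[best])) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```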
svn path=/trunk/boinc/; revision=24210
Add parsed_tag and is_tag to the class,
so that parsing functions don't need to declare them
and pass them around.
- Complete the task of using XML_PARSER as the argument
to all parsing functions.
(Internally, many of these functions still use the old XML parser;
that's the next step.)
svn path=/trunk/boinc/; revision=23978
- client: extend <exclude_gpu> option so that if <device_num> is omitted,
all GPUs of the given type are excluded.
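Sketch of the matching rule described above.
ExcludeGpu and is_excluded() are hypothetical names,
-1 is assumed to represent an omitted <device_num>,
and the type strings in the usage note are placeholders.
Any other fields of the option are left out.

```cpp
#include <string>

// Hypothetical parsed form of one <exclude_gpu> entry.
struct ExcludeGpu {
    std::string type;
    int device_num;     // assumed -1 when <device_num> is omitted
};

// True if this entry excludes the given GPU instance.
bool is_excluded(const ExcludeGpu& e, const std::string& type, int device_num) {
    if (e.type != type) return false;
    // Omitted device number: every GPU of this type is excluded.
    return e.device_num < 0 || e.device_num == device_num;
}
```

E.g. an entry {"typeA", -1} excludes devices 0, 1, ... of typeA,
while {"typeA", 1} excludes only device 1.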
svn path=/trunk/boinc/; revision=23902
of the simulation, not the scenario.
If you want to run a simulation w/ different log flags,
you shouldn't have to create a new scenario.
- client emulator: add --config_prefix cmdline arg
- validator: prevent infinite loop when app_version.pfc_avg
is wonky (like 1e-300).
Next step: figure out how it got that way.
svn path=/trunk/boinc/; revision=23828
to mess up input templates containing
<copy_file/> or other attribute tags.
XML_PARSER now contains a member element() for when
you want to copy an element without knowing its structure.
svn path=/trunk/boinc/; revision=23790
- MGR: Fix a bug introduced in a previous commit where the plan class
was being surrounded by single quotes when generating an updated
project list.
clientgui/
ProjectInfoPage.cpp
doc/
get_platforms.inc
svn path=/trunk/boinc/; revision=23744
Lets you specify, on a per-app basis,
that all instances should be done using the same app version.
This is for validation in the presence of GPUs.
- scheduler: code cleanup
- Instead of adding a bunch of non-DB fields to RESULT,
use a derived class SCHED_DB_RESULT.
- Instead of storing a pointer to BEST_APP_VERSION in RESULT,
store the structure itself.
This simplifies the memory allocation situation.
- client: condition "Got server request to delete file" messages
on <file_xfer_debug>
svn path=/trunk/boinc/; revision=23636