VirtualBox 4.2 no longer requires certain commands to be executed
to unregister a VM, while older versions do. Just ignore any error
codes; if it becomes a problem, we can always make it conditional
on what version of VirtualBox is installed.
the binding of the get_state() RPC
- client: move client_start_time and previous_uptime
from CLIENT_STATE to TIME_STATS,
so that these are also visible in GUI RPC
- scheduler RPC: move uptime and previous_uptime
into <time_stats>
- client: condition an RR simulation message on <rrsim_detail>
- boinccmd: show TIME_STATS info in --get_state
Previously: elapsed_time was just incremented with the value of the polling
period each iteration through the main loop. This introduced issues
when vboxmanage lagged for whatever reason. This lag could go as high as 5
seconds. Over the timespan of a day this could increase the wall clock time
of a task a great deal.
Now: elapsed_time is incremented with the time it took to execute the main
loop.
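
A minimal sketch of the "Now" behavior, in Python rather than the wrapper's
own code. The names run_main_loop, poll_once and clock are hypothetical and
exist only for this illustration; the point is that elapsed_time grows by the
measured duration of each pass, so lag inside the loop is counted.

```python
import time


def run_main_loop(poll_once, iterations, clock=time.monotonic):
    """Accumulate elapsed time from measured loop durations.

    poll_once, iterations and clock are hypothetical names for this sketch.
    """
    elapsed_time = 0.0
    last = clock()
    for _ in range(iterations):
        poll_once()  # may lag, e.g. while waiting on an external command
        now = clock()
        # Add the time the pass actually took, not a fixed polling period.
        elapsed_time += now - last
        last = now
    return elapsed_time
```

With a fixed increment, a pass that stalls for 5 seconds would still add only
one polling period; here it adds the full 5 seconds.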
- client: if an app's finish file has existed for 10 seconds, kill it;
it must be hung in boinc_finish().
This behavior has been seen with LHC@home and maybe other projects.
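
A rough sketch of the finish-file timeout, assuming the check runs from a
periodic poll. FinishFileWatch and its arguments are hypothetical names, not
the client's actual code; only the 10-second limit comes from the note above.

```python
class FinishFileWatch:
    """Track how long an app's finish file has existed (illustrative only)."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout    # seconds, per the note above
        self.first_seen = None    # time the finish file was first observed

    def poll(self, finish_file_exists, still_running, now):
        """Return True if the app should be killed on this poll."""
        if not finish_file_exists:
            self.first_seen = None
            return False
        if self.first_seen is None:
            self.first_seen = now
        # The app wrote its finish file but has not exited: assume it is hung.
        return still_running and (now - self.first_seen >= self.timeout)
```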
Note: this fixes a major problem (starvation)
with project-level GPU exclusion.
However, project-level GPU exclusion interferes with most of
the client's scheduling policies.
E.g., round-robin simulation doesn't take GPU exclusion into account,
and the resulting completion estimates and device shortfalls
can be wrong by an order of magnitude.
The only way I can see to fix this would be to model each
GPU instance as a separate resource,
and to associate each job with a particular GPU instance.
This would be a sweeping change in both client and server.
Old: heartbeat mechanism
Problem: if the client is blocked for > 30 secs
(e.g. because it takes a long time to write the state file,
or because it's stopped in a debugger)
then apps exit.
This is bad if the app doesn't checkpoint and has been
running for a long time.
New: the client passes its PID to the app.
The app periodically (10 sec) checks that the process still exists.
Notes:
- For backward compatibility (e.g. new API w/ old client,
or vice versa) the client still sends heartbeats,
and the API checks heartbeats if the client doesn't pass a PID.
- The new mechanism works only if the client's PID isn't assigned
to a new process within 10 secs of the client exiting.
Windows 2000 reuses PIDs immediately, so check for Win2K
and don't use this mechanism if so.
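
The new check and its fallback can be sketched as follows, in Python and for
POSIX only. process_exists and client_is_alive are hypothetical names; the
real API is not shown in these notes. Passing client_pid=None stands for the
cases where no PID is available (old client, or Win2K), in which case the
heartbeat timeout applies.

```python
import os


def process_exists(pid):
    """POSIX-only: signal 0 tests for existence without sending a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # no such process
    except PermissionError:
        return True    # process exists but belongs to another user
    return True


def client_is_alive(client_pid, seconds_since_heartbeat,
                    heartbeat_timeout=30.0, pid_exists=process_exists):
    """Decide whether the app should keep running (illustrative only)."""
    if client_pid is not None:
        # New mechanism: a blocked client still has a live PID.
        return pid_exists(client_pid)
    # Old mechanism: fall back to heartbeats when no PID was passed.
    return seconds_since_heartbeat <= heartbeat_timeout
```

Do not use os.kill(pid, 0) this way on Windows, where it does not have the
same meaning.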
TODO: For Unix multithread apps,
critical sections aren't currently being enforced.
Need to fix this by masking signals.
svn path=/trunk/boinc/; revision=26147
- If all results for a workunit were PFC_MODE_INVALID, a NaN pfc
could be used, causing database update errors. Solved
by using wu_estimated_pfc() as pfc in that case.
- Sanity check was comparing raw_pfc directly to rsc_fpops_bound. That
was causing problems for GPUs with high performance estimates. Fixed by
including the app_version scale factor in the check. I thought I had
already committed this...
- Removed a few lines of commented out experimental code accidentally
committed earlier.
- Committed to git repository on 8/24
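
The two PFC fixes above, sketched in Python under stated assumptions: the
averaging in choose_pfc and the exact comparison in pfc_is_sane are guesses
for illustration, and both function names are hypothetical. Only the fallback
to the workunit estimate and the use of the app_version scale factor come
from the notes.

```python
import math


def choose_pfc(valid_pfcs, wu_estimated_pfc):
    """Use the workunit estimate when no valid PFC sample is available."""
    if not valid_pfcs:
        # Dividing by zero samples is how a NaN could arise.
        return wu_estimated_pfc
    avg = sum(valid_pfcs) / len(valid_pfcs)
    return avg if math.isfinite(avg) else wu_estimated_pfc


def pfc_is_sane(raw_pfc, scale, rsc_fpops_bound):
    """Compare the scaled PFC, not raw_pfc itself, against the bound."""
    return math.isfinite(raw_pfc) and 0 < raw_pfc * scale <= rsc_fpops_bound
```

For example, a raw_pfc of 2e15 against a bound of 1e15 fails unscaled, but
passes with a scale factor of 0.1.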
svn path=/trunk/boinc/; revision=26144
clear the suspend_request flag.
Otherwise we'll end up doing two suspends,
and on Win the app will be suspended forever.
svn path=/trunk/boinc/; revision=26143