Mem usage (WSS):
The easiest way to get the WSS of a Docker container is to ask Docker
using the "docker stats" command.
So I have docker_wrapper do this periodically (10 sec... it's a bit slow).
But how to get this back to the client?
Currently there's no provision for an app to report its own WSS.
So I added one, by adding an optional field to the app status messages
sent from app to client in shared mem.
If this is present, the client uses it instead of procinfo.
CPU time: "docker stats" reports CPU fraction
(averaged over what period?)
We multiply that by the stats polling period.
Not exactly the same as CPU time, but close enough.
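A minimal sketch of that approximation, assuming the CPU fraction is expressed so that 1.0 means one fully used core. The struct and member names are hypothetical and are not taken from docker_wrapper.

```cpp
#include <cassert>

// Hypothetical accumulator: approximates CPU time from periodic
// "CPU fraction" samples, as described above.
struct CpuTimeEstimator {
    double cpu_time = 0.0;  // accumulated estimate, in seconds

    // cpu_fraction: sampled CPU usage (assumed: 1.0 == one full core)
    // poll_period:  seconds elapsed since the previous sample
    void add_sample(double cpu_fraction, double poll_period) {
        assert(cpu_fraction >= 0.0 && poll_period >= 0.0);
        cpu_time += cpu_fraction * poll_period;
    }
};
```

With a 10-second poll, a sample of 0.5 followed by a sample of 1.0 adds 5 and then 10 seconds, for 15 seconds in total.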
sporadic state via files.
This allows, for example, sporadic VM apps.
The logic for reading and writing of files is in the API library,
rather than in the wrappers.
Also: wrappers show message and exit if bad command line option.
Also: small code shuffle in vboxwrapper to parse cmdline before doing anything.
Remove from boinc_api.h the declaration of boinc_try_critical_section(),
a function that was removed in 6984ec8cf4
Signed-off-by: Vitalii Koshura <lestat.de.lionkur@gmail.com>
When the BOINC API headers are included from C source code that gets compiled with "-Wstrict-prototypes", they generate a lot of "function declaration isn’t a prototype" warnings. The attached patch fixes it by turning "foo()" to "foo(void)".
Gabor
Signed-off-by: Vitalii Koshura <lestat.de.lionkur@gmail.com>
There were two parts to this:
- In the timer thread, we need to check for client death even if
we're in a critical section.
If both conditions hold, set the no_heartbeat status flag.
- In boinc_end_critical_section(), check no_heartbeat and exit if set.
Also: the various checks in boinc_end_critical_section()
(quit, abort, no heartbeat) should be conditioned on
options.direct_process_action.
Otherwise wrappers that use critical sections won't do the right thing.
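The condition described above can be sketched as follows. The struct and function names are hypothetical stand-ins for the runtime's status flags and options, not the actual BOINC definitions.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's status flags.
struct StatusFlags {
    bool quit_request = false;
    bool abort_request = false;
    bool no_heartbeat = false;
};

// Hypothetical stand-in for the relevant part of the options.
struct Options {
    bool direct_process_action = true;
};

// Returns true if the runtime itself should exit the process
// when a critical section ends.
bool exit_on_critical_section_end(const StatusFlags& s, const Options& o) {
    // Without direct_process_action the caller (e.g. a wrapper)
    // is expected to act on these flags itself.
    if (!o.direct_process_action) return false;
    return s.quit_request || s.abort_request || s.no_heartbeat;
}
```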
Sending or receiving trickle messages required setting flags in BOINC_OPTIONS.
There were two problems with this:
1) it wasn't documented
2) it's not necessary; the act of calling boinc_send_trickle_up()
tells the runtime system to do the trickle-up-related stuff.
Furthermore, because intermediate file upload shares message channels
with trickles, these functions also required the option flags
(also undocumented).
With this change, you don't need to set options to use
trickle messages or intermediate file upload.
Vboxwrapper detects known buggy versions of Vbox and calls
boinc_temporary_exit().
The "Incompatible version" message appears in the task status
in the BOINC Manager, where some users may never see it.
It needs to appear as a notice, telling the user to upgrade VBox.
To do this, I added an optional argument to boinc_temporary_exit()
saying that the message should be delivered as a notice.
This is conveyed to the client by adding
a line containing "notice" to the temp exit file.
I changed the client and vboxwrapper to use this.
My last commit did this using a new API call.
But this would require rebuilding apps any time you want to change it;
too much work.
So instead make it an attribute of apps,
which you can set via the admin web interface.
Corresponding changes to client.
Currently the duration estimate for a task is a combination of
- a static estimate, based on wu.rsc_fpops_est and the estimated FLOPS
- a dynamic estimate, based on fraction done (FD) and elapsed time
The weighting of the dynamic estimate is FD^2;
the assumption is that fraction done is imprecise and improves
toward the end of a task.
This isn't ideal for apps that can supply accurate FD.
Solution: add a new API function
boinc_fraction_done_exact().
This notifies the client that the FD is accurate,
and that it should use only the dynamic estimate.
(New clients will do this; old clients will use the FD as they currently do).
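A rough sketch of the two estimates and their combination. The message only states that the dynamic estimate is weighted by FD^2; the linear blend with weight (1 - FD^2) on the static estimate, and all names here, are assumptions made for illustration.

```cpp
#include <cassert>

// static_est: estimate from wu.rsc_fpops_est / estimated FLOPS (seconds)
// elapsed:    elapsed run time so far (seconds)
// fd:         fraction done, in [0, 1]
// fd_exact:   true if the app declared its fraction done to be accurate
double estimated_duration(double static_est, double elapsed,
                          double fd, bool fd_exact) {
    assert(fd >= 0.0 && fd <= 1.0);
    if (fd <= 0.0) return static_est;    // no dynamic information yet
    double dynamic_est = elapsed / fd;   // extrapolate from progress
    if (fd_exact) return dynamic_est;    // trust the app's FD fully
    double w = fd * fd;                  // weight of the dynamic estimate
    return w * dynamic_est + (1.0 - w) * static_est;  // assumed blend
}
```

For example, with a static estimate of 200 s, 50 s elapsed and FD = 0.5, the dynamic estimate is 100 s; the blended result is 0.25 * 100 + 0.75 * 200 = 175 s, while an app reporting exact FD would get 100 s.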
- If you run the client with --run_test_app,
runs "test_app" in the current directory and interacts with it
(and does nothing else).
It can suspend/resume it with arbitrary timing;
this is controlled in run_test_app() (app_start.cpp).
- example app: add --critical_section option.
This lets you test the runtime system for apps that do
most of their work in a critical section (like GPU apps).
- Add some logging messages (conditioned by DEBUG_BOINC_API)
to the runtime system.
- boinc_finish() waits for the timer thread to write final messages;
make sure it doesn't do anything else
(like suspend the worker thread) during this period
Lets application specify a min checkpoint interval.
The actual min checkpoint interval is the max of this
and the user-specified pref for min disk interval.
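As a small illustration of that rule (the function and parameter names are hypothetical):

```cpp
#include <algorithm>
#include <cassert>

// Returns true once enough time has passed since the last checkpoint.
// app_min:  minimum interval requested by the application (seconds)
// user_min: user's "min disk interval" preference (seconds)
bool time_to_checkpoint(double now, double last_checkpoint,
                        double app_min, double user_min) {
    assert(app_min >= 0.0 && user_min >= 0.0);
    double effective_min = std::max(app_min, user_min);
    return now - last_checkpoint >= effective_min;
}
```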
svn path=/trunk/boinc/; revision=26005
moved app_ipc.h inclusion outside __cplusplus
since it contains important C mode prototypes
(boinc_resolve_filename() etc.)
svn path=/trunk/boinc/; revision=25752
lets an application report its network usage to BOINC,
and hence take it into account with monthly limits etc.
- API: get rid of deprecated boinc_ops_per_cpu_sec(),
boinc_ops_cumulative(), and
boinc_set_credit_claim();
- admin web: update manage_apps.php;
add the ability to set homogeneous app version
svn path=/trunk/boinc/; revision=25700
and it timed out and we killed it, we'd treat it as a job error.
(This was a major bug).
- API: remove BOINC_STATUS::suspend_request.
I meant to do this before.
svn path=/trunk/boinc/; revision=25498
boinc_temporary_exit(),
explaining why the app is exiting.
Convey this to the client, and then to the Manager,
and display it there and in the log.
clientgui/
MainDocument.cpp
lib/
gui_rpc_client_ops.cpp
gui_rpc_client.h
api/
boinc_api.cpp,h
client/
client_types.cpp,h
app.h
app_control.cpp
svn path=/trunk/boinc/; revision=25315
core client. Next commit will create an extra "VM Console"
button in the manager when detected. Volunteers will just have
to click the button to see what is going on with the VM.
api/
boinc_api.cpp, .h
samples/vboxwrapper
vbox.cpp, .h
vboxwrapper.cpp, .h
svn path=/trunk/boinc/; revision=25035
allow applications to supply a "web graphics URL",
in which case the manager's "Show Graphics" button
opens a browser at that URL.
This typically would be used for applications that
implement a web server that serves pages showing
job information in HTML.
- vboxwrapper: if <pf_guest_port> is specified in the config file,
set up port forwarding to that port
and use the above API call with URL "http://localhost:port"
svn path=/trunk/boinc/; revision=24898
- Fix build problems on Mac OS X using autotools
- Consistently use #if HAVE_X for platform checks,
rather than #ifdef HAVE_X or #if defined(HAVE_X)
- In Unix build, make lots of compiler checks standard
- Fix some compile warnings
From Matt Arsenault.
Note: there are now lots of compile warnings in clientgui/ on Unix,
mostly in WxWidgets code
svn path=/trunk/boinc/; revision=24303
add a mechanism so that apps can report sub-processes
that are not descendants (e.g., virtual machines)
These processes are then counted as part of the app,
not as "non-BOINC CPU time".
This fixes a bug where processing was incorrectly suspended
because CPU usage by VM apps exceeded the "CPU usage limit" pref.
Implementation:
- the PIDs of the processes in question
are passed from app to client via shared-memory,
in the app_status channel.
A new variant of boinc_report_app_status() supports this.
- the VBox wrapper queries the PID of the VM,
and reports it in this way.
- procinfo_app() includes a new argument: a list of PIDs
that are part of the app, although not ancestrally
related to the main process.
- in the client, ACTIVE_TASK now includes a vector "other_pids".
If this is nonempty, it's passed to procinfo_app().
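The accounting idea can be sketched like this. It is not the real procinfo_app(); the types and names are hypothetical, and treating descendants of the extra PIDs as part of the app is an assumption.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified process table entry.
struct ProcInfo {
    int pid;
    int parent_pid;
    double cpu_time;  // seconds
};

// Sums CPU time over the main process, its descendants, and any
// explicitly reported "other" PIDs (plus their descendants).
double app_cpu_time(const std::vector<ProcInfo>& procs, int main_pid,
                    const std::vector<int>& other_pids) {
    assert(main_pid > 0);
    std::set<int> members(other_pids.begin(), other_pids.end());
    members.insert(main_pid);
    // Keep adding children of known members until nothing changes.
    bool changed = true;
    while (changed) {
        changed = false;
        for (const ProcInfo& p : procs) {
            if (members.count(p.parent_pid) && members.insert(p.pid).second) {
                changed = true;
            }
        }
    }
    double total = 0.0;
    for (const ProcInfo& p : procs) {
        if (members.count(p.pid)) total += p.cpu_time;
    }
    return total;
}
```

A VM process that is not a descendant of the wrapper is counted only when its PID appears in other_pids; otherwise it would show up as non-BOINC CPU time.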
svn path=/trunk/boinc/; revision=24123
could use the following for safe exit checking.
#ifdef _WIN32
// Jason: Safe exit check macro to play nicer with Cuda & MS-CRT
#ifdef USE_CUDA
#define SAFE_EXIT_CHECK do { \
    if (worker_thread_exit_request) { \
        fprintf(stderr, "-> Worker received exit request, syncing Cuda..."); \
        cudaThreadSynchronize(); fprintf(stderr, "Done.\n"); \
        fprintf(stderr, " Worker Freeing Cuda data..."); cudaAcc_free(); \
        fprintf(stderr, "Done.\n"); \
        fprintf(stderr, " Worker Acknowledging exit request, spinning->\n"); \
        worker_thread_exit_ack = true; \
        while (1) Sleep(10); \
    } \
} while (0)
#else
#define SAFE_EXIT_CHECK do { \
    if (worker_thread_exit_request) { \
        fprintf(stderr, " Worker Acknowledging exit request, spinning-> "); \
        worker_thread_exit_ack = true; \
        while (1) Sleep(10); \
    } \
} while (0)
#endif
#else
// Linux or other probably have their own safe exit handling;
// defined as blank, do nothing
#define SAFE_EXIT_CHECK
#endif
and install at the top of the cffft loop, and more locations if desired:
SAFE_EXIT_CHECK;
I'd like to implement these as BOINC API functions, but have not yet done so.
svn path=/trunk/boinc/; revision=23646
i.e. those that create subprocesses.
Previously, the client's job control options (suspend/resume/quit)
would not work for subprocesses.
Multiprocess apps must initialize with something like:
BOINC_OPTIONS options;
boinc_options_defaults(options);
options.multi_process = true;
boinc_init_options(&options);
Note: an application can be both multi-thread and multi-process.
In this case set options.multi_thread as well.
- wrapper: add support for multi-process apps.
Previously, suspend/resume operations did not work for subprocesses.
If a task is multi-process, you must include
<multi_process>1</multi_process>
in its descriptor.
svn path=/trunk/boinc/; revision=23369
Not necessary.
- wrapper: add optional <append_cmdline_args/> element to
task descriptor.
If set, pass the wrapper's cmdline args to that task.
NOTE: previously they were always passed.
If you want this behavior, you now must set this.
svn path=/trunk/boinc/; revision=23232
1) old-style apps with graphics in main program.
No one should be using these anymore.
2) writing init_data.xml in boinc_finish().
This was used by the deprecated "compound app" scheme.
- scheduler: if request reports results that were previously reported,
that's evidence that the previous reply was not received by client.
It may have contained results.
So set a "resend lost results" flag.
svn path=/trunk/boinc/; revision=22203
This is like boinc_init() but for multithread apps.
Unlike boinc_init(), it suspends/resumes all threads in the app,
not just one.
In Unix, this is done by forking,
and having the parent process handle suspend/resume messages
and suspend/resume the child using signals.
On Win, there's some nasty code that enumerates all
threads in the whole system, and suspends/resumes
those in a particular process.
svn path=/trunk/boinc/; revision=20054
This exits the app with status zero and no finish file,
so the client will restart it.
It creates a file "temporary_exit" containing dt.
The (new) client reads this file and will postpone
scheduling the job again for dt seconds.
Old clients will treat it as a premature exit,
and potentially try to reschedule the job immediately.
This function is intended for GPU applications that
fail to allocate GPU RAM,
presumably because a non-GPU application has it allocated.
We don't want the job to fail,
and we want to wait for a while before trying the allocation again.
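A sketch of the file handshake described above. The helper names and the exact file layout (a single line holding dt) are assumptions for illustration; the real API and client code may differ.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// App side (hypothetical helper): record the requested delay before exiting.
void write_temporary_exit(const std::string& path, int delay_secs) {
    assert(delay_secs >= 0);
    std::ofstream f(path);
    f << delay_secs << "\n";
}

// Client side (hypothetical helper): earliest time to schedule the job again.
// If the file is missing or unreadable, return "now", which mirrors the
// old-client behavior of possibly rescheduling immediately.
double next_start_time(const std::string& path, double now) {
    std::ifstream f(path);
    int delay_secs = 0;
    if (!(f >> delay_secs) || delay_secs < 0) return now;
    return now + delay_secs;
}
```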
svn path=/trunk/boinc/; revision=19879
want to grant approximately fixed credits, but don't want to express them in
terms of FPOPS and IOPS. This API just calls boinc_ops_cumulative(N*8.64000e+11,0).
CPU intensive projects that use this API should still use the
tools/calculate_credit_multiplier script in order to adjust their credit
claims as processing times vary.
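The conversion mentioned above amounts to a single multiplication. In this sketch the function name is hypothetical; only the constant 8.64e11 comes from the message.

```cpp
#include <cassert>

// Floating-point operations per credit, as used in the message above.
const double FPOPS_PER_CREDIT = 8.64e11;

// Hypothetical helper: the FPOPS figure that corresponds to a claim
// of N credits (the IOPS figure is passed as 0 in the message).
double credits_to_fpops(double credits) {
    assert(credits >= 0.0);
    return credits * FPOPS_PER_CREDIT;
}
```

For example, a fixed claim of 10 credits corresponds to 8.64e12 floating-point operations.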
svn path=/trunk/boinc/; revision=17743
time of day and process ID.
This should prefix all messages written to stderr
by applications or by the runtime system.
svn path=/trunk/boinc/; revision=17687
- web: check whether to show profile in separate function
from displaying profile; eliminate double headers
- scheduler: finish purge of redundant arguments
svn path=/trunk/boinc/; revision=16726