In deciding whether to schedule needs_network tasks,
we were looking at gstate.network_suspended.
The problem is that this remains false for 5 minutes
after any GUI RPC that could generate network activity.
Instead, look at gstate.file_xfers_suspended.
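Roughly, the change amounts to the following (a hedged sketch, not the actual BOINC source; the predicate function is hypothetical):

    // Gate network-dependent tasks on file_xfers_suspended rather than
    // network_suspended, since the latter stays false for a while after a
    // network-capable GUI RPC. Field names mirror those mentioned above;
    // the struct layout and helper are assumptions.
    struct CLIENT_STATE {
        bool network_suspended;      // stays false ~5 min after a network-capable GUI RPC
        bool file_xfers_suspended;   // better reflects whether network work should be deferred
    };

    CLIENT_STATE gstate;

    // Hypothetical helper used when deciding whether a needs_network task can run.
    bool can_schedule_network_task() {
        // old: return !gstate.network_suspended;
        return !gstate.file_xfers_suspended;
    }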
- store final network usage in RESULT; write/parse in state file
- final disk and memory usage weren't being written to the state file; do so (see the sketch after this list).
- add --network_usage option to the example app, to test the new usage reporting
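A rough sketch of writing/parsing these final usage figures in an XML state file; the struct and tag names are illustrative assumptions, not the actual BOINC ones:

    #include <cstdio>

    // Hypothetical container for the per-result usage numbers mentioned above.
    struct RESULT_USAGE {
        double final_bytes_sent = 0;
        double final_bytes_received = 0;
        double final_peak_disk_usage = 0;
        double final_peak_mem_usage = 0;

        // Write the fields as XML elements to the state file.
        void write(FILE* f) const {
            fprintf(f, "    <final_bytes_sent>%f</final_bytes_sent>\n", final_bytes_sent);
            fprintf(f, "    <final_bytes_received>%f</final_bytes_received>\n", final_bytes_received);
            fprintf(f, "    <final_peak_disk_usage>%f</final_peak_disk_usage>\n", final_peak_disk_usage);
            fprintf(f, "    <final_peak_mem_usage>%f</final_peak_mem_usage>\n", final_peak_mem_usage);
        }

        // Parse one line of the state file; returns true if it matched a usage tag.
        bool parse_line(const char* line) {
            return sscanf(line, " <final_bytes_sent>%lf", &final_bytes_sent) == 1
                || sscanf(line, " <final_bytes_received>%lf", &final_bytes_received) == 1
                || sscanf(line, " <final_peak_disk_usage>%lf", &final_peak_disk_usage) == 1
                || sscanf(line, " <final_peak_mem_usage>%lf", &final_peak_mem_usage) == 1;
        }
    };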
Also fixed a bug where, if a job was aborted while not running,
its final CPU and elapsed time weren't copied from ACTIVE_TASK to RESULT,
and hence weren't sent to the scheduler.
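A simplified sketch of that abort-path fix, with assumed struct members (not the real BOINC types):

    // Copy the task's accounting into the RESULT even when the task isn't
    // currently running, so the values get reported to the scheduler.
    struct RESULT {
        double final_cpu_time = 0;
        double final_elapsed_time = 0;
    };

    struct ACTIVE_TASK {
        RESULT* result;
        double current_cpu_time = 0;
        double elapsed_time = 0;

        void abort_task() {
            // Previously this copy effectively happened only for running tasks;
            // do it unconditionally so a job aborted while idle reports correct totals.
            result->final_cpu_time = current_cpu_time;
            result->final_elapsed_time = elapsed_time;
            // ... then mark the result as aborted, clean up, etc.
        }
    };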
Various bad things could happen when CPU throttling was used together with GPU apps.
Examples:
- on a multi-GPU system, several GPU tasks are assigned to the same GPU
- a suspended GPU task remains in memory (tying up its GPU resources)
while other tasks try to use the GPU.
The problem was that parts of the code assumed that suspended
GPU processes don't exist - i.e. that when a GPU task is suspended
it's always removed from memory.
This isn't true in the presence of CPU throttling.
So I made the following changes (a code sketch follows the list):
- When assigning GPUs to tasks, treat suspended tasks like running tasks
(i.e. reserve their GPUs)
- At the end of the CPU-scheduling logic, if there are any GPU tasks
that are suspended and not scheduled, remove them from memory,
and trigger a reschedule so we can reallocate their GPUs.
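A sketch of the two scheduler changes under simplified assumptions about the data structures (these are not the actual BOINC types):

    #include <vector>

    struct TASK {
        bool uses_gpu = false;
        bool suspended = false;     // e.g. suspended by CPU throttling
        bool scheduled = false;     // chosen to run in this scheduling pass
        bool in_memory = false;
        int device_num = -1;
    };

    // Change 1: when reserving GPUs, count suspended tasks as occupying
    // their device, since a throttled task may still be in memory.
    void reserve_gpus(std::vector<TASK*>& tasks, std::vector<bool>& device_busy) {
        for (TASK* t : tasks) {
            if (!t->uses_gpu || !t->in_memory) continue;
            if ((t->scheduled || t->suspended) && t->device_num >= 0) {
                device_busy[t->device_num] = true;
            }
        }
    }

    // Change 2: after CPU scheduling, evict suspended GPU tasks that weren't
    // scheduled, and request another scheduling pass to reassign their GPUs.
    bool cleanup_unscheduled_gpu_tasks(std::vector<TASK*>& tasks) {
        bool need_reschedule = false;
        for (TASK* t : tasks) {
            if (t->uses_gpu && t->suspended && !t->scheduled && t->in_memory) {
                t->in_memory = false;   // remove from memory, freeing its GPU
                need_reschedule = true;
            }
        }
        return need_reschedule;
    }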
Also, a cosmetic change: in the resource usage string shown in the GUI,
include "(device X)" even if the task is suspended (i.e. because of throttling).
Also: zero out COPROC::opencl_device_indexes[] so we don't write
a garbage number to init_data.xml for non-OpenCL jobs
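A sketch of that initialization fix; the array size and surrounding fields are simplified assumptions:

    #include <cstring>

    #define MAX_COPROC_INSTANCES 64   // assumed constant

    struct COPROC {
        int device_nums[MAX_COPROC_INSTANCES];
        int opencl_device_indexes[MAX_COPROC_INSTANCES];

        void clear_usage() {
            memset(device_nums, 0, sizeof(device_nums));
            // Previously left uninitialized, so a garbage value could end up
            // in init_data.xml for non-OpenCL jobs; zero it out.
            memset(opencl_device_indexes, 0, sizeof(opencl_device_indexes));
        }
    };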