original:
Info about resource usage (GPU usage, #cpus) is stored in APP_VERSION.
When we need this info for a RESULT, we look at rp->avp
new:
For BUDA apps, the info about the actual app (not the docker wrapper)
comes with the workunit, not the app version.
So create a new structure, RESOURCE_USAGE.
APP_VERSION has one, WORKUNIT has one.
So does RESULT; when we create the result we copy the struct
either from the app version or (for BUDA jobs) the workunit.
Then the code can just reference rp->resource_usage.
Nice. This enables BUDA/GPU functionality with almost no additional complexity.
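Rough sketch of the copy-on-create idea described above.
This is not the actual BOINC code: the struct fields and the
has_resource_usage flag are assumptions made for illustration;
only the names RESOURCE_USAGE, APP_VERSION, WORKUNIT, RESULT
and rp->resource_usage come from this change.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the real types.
struct RESOURCE_USAGE {
    double avg_ncpus = 1;         // number of CPUs used
    double coproc_usage = 0;      // number of GPU instances used (assumed field)
    bool missing_coproc = false;  // GPU absent, or lacks needed libraries
};

struct APP_VERSION {
    RESOURCE_USAGE resource_usage;
};

struct WORKUNIT {
    // Assumed flag: true if <workunit> carried resource usage items,
    // i.e. this is a BUDA job.
    bool has_resource_usage = false;
    RESOURCE_USAGE resource_usage;
};

struct RESULT {
    APP_VERSION* avp = nullptr;
    WORKUNIT* wup = nullptr;
    RESOURCE_USAGE resource_usage;

    // Called once when the result is created: copy the struct from
    // the workunit (BUDA) or from the app version (everything else).
    void init_resource_usage() {
        assert(avp && wup);
        resource_usage = wup->has_resource_usage
            ? wup->resource_usage
            : avp->resource_usage;
    }
};
```

After init_resource_usage(), callers read rp->resource_usage
and don't need to know where the values came from.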
Add code to parse resource usage items in <workunit>
Note: info about missing GPUs (or GPUs without needed libraries)
is also stored in RESOURCE_USAGE.
Maintain a 'buda_plan_classes' file
with a list of BUDA variant names (i.e. plan classes).
Update as variants are added and deleted.
This is used in project preferences for 'Use NVIDIA' type buttons.
feeder: the shared-mem segment has a list of resource types
for which the project has work.
This needs to include BUDA variants as well.
Do this by scanning the 'buda_plan_classes' file (see above).
Note: this means that when the set of BUDA variants changes,
we need to restart the project.
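Sketch of the scan step, assuming the 'buda_plan_classes' file
holds one plan class name per line (the actual file format is not
stated here). read_plan_classes() is a hypothetical helper, not a
function in the feeder; mapping each name to a resource type is
not shown.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read plan class names, one per line (assumed format).
// Leading/trailing whitespace is trimmed; blank lines are skipped.
std::vector<std::string> read_plan_classes(std::istream& in) {
    std::vector<std::string> names;
    const char* ws = " \t\r\n";
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type b = line.find_first_not_of(ws);
        if (b == std::string::npos) continue;   // blank line
        std::string::size_type e = line.find_last_not_of(ws);
        names.push_back(line.substr(b, e - b + 1));
    }
    return names;
}
```

Because the file is read only at startup in this sketch, a change
to the variant list is not seen until the next restart, which
matches the note above.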
plan_class_spec.xml.sample:
The 'cuda' class had a max compute capability of 200.
Remove it.
Moved RadioButton definitions closer to the relevant controls.
Locale files should now be defined in the main JSON file.
Add the ability to pass the main JSON file as a parameter to installer.exe.
Signed-off-by: Vitalii Koshura <lestat.de.lionkur@gmail.com>
If you make a variant of a BUDA app for a plan class
(e.g. NVIDIA GPU with CUDA),
this ensures that jobs submitted to that variant are sent
only to capable hosts,
and that the host usage and projected FLOPS are set correctly.
On the web side, we add a <plan_class> element to workunit.xml_doc.
This gets sent to the scheduler.
On the scheduler this required some reorganization.
As the scheduler scans jobs, it finds and caches
a BEST_APP_VERSION for each app.
This contains a HOST_USAGE.
In the case of BUDA, the host usage depends on the workunit,
not the app version.
We might scan several BUDA jobs;
they'll all use the same APP_VERSION,
but they could have different plan classes
and therefore different HOST_USAGE.
So if we're looking at a job to send,
and the WU has a <plan_class> element,
call app_plan() to check the host capability and get the host usage.
Change add_result_to_reply() so that it takes a HOST_USAGE& argument,
rather than getting it from the BEST_APP_VERSION.
We do this in several places:
- sched_array (old scheduling policy)
- sched_score (new scheduling policy)
- sched_locality (locality scheduling)
- sched_resend (resending lost jobs)
- sched_assign (assigned jobs)
so all these functions work properly with BUDA apps.
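Sketch of the per-job flow described above. All types here are
simplified stand-ins, and the signatures of app_plan() and
add_result_to_reply() are assumptions for illustration (the real
ones may differ). get_job_host_usage(), the HOST fields and the
"cuda" plan class name are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified stand-ins for the scheduler types.
struct HOST_USAGE {
    double avg_ncpus = 1;
    double gpu_usage = 0;
    double projected_flops = 0;
};
struct BEST_APP_VERSION {
    HOST_USAGE host_usage;      // cached per app; right for non-BUDA jobs
};
struct WORKUNIT {
    std::string plan_class;     // from <plan_class> in xml_doc; empty if absent
};
struct HOST {
    bool has_cuda_gpu = false;  // assumed capability flag
    double gpu_flops = 0;
};

// Stand-in for app_plan(): returns false if the host can't run
// the plan class, otherwise fills in hu.
bool app_plan(const HOST& host, const std::string& plan_class, HOST_USAGE& hu) {
    if (plan_class == "cuda") {              // hypothetical plan class
        if (!host.has_cuda_gpu) return false;
        hu.avg_ncpus = 0.1;
        hu.gpu_usage = 1;
        hu.projected_flops = host.gpu_flops;
        return true;
    }
    return false;                            // unknown plan class
}

// Pick the usage for one job: per-workunit for BUDA variants,
// otherwise the value cached in the BEST_APP_VERSION.
bool get_job_host_usage(
    const HOST& host, const WORKUNIT& wu,
    const BEST_APP_VERSION& bav, HOST_USAGE& hu
) {
    if (!wu.plan_class.empty()) {
        return app_plan(host, wu.plan_class, hu);
    }
    hu = bav.host_usage;
    return true;
}

// The usage is an explicit argument rather than being read from
// the BEST_APP_VERSION. Here the "reply" is just a running GPU total.
void add_result_to_reply(const WORKUNIT&, const HOST_USAGE& hu, double& gpu_total) {
    gpu_total += hu.gpu_usage;
}
```

Each of the scheduling paths listed above would do the equivalent
of get_job_host_usage() and pass the resulting HOST_USAGE along.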
-----------------
Also: the input and output templates for a BUDA app variant
depend only on the variant, not on batches or jobs.
So generate them when the variant is created,
and store them in the variant dir,
rather than generating them on batch submission.
Also: fix bug in downloading batch output as .zip;
need to run zip with -q (quiet).