Commit Graph

9 Commits

Author SHA1 Message Date
David Anderson 81973a9fff - scheduler: fix structural problems with sending user messages.
Old: various redundant and/or misleading messages were sent.
    New:
        - if host w/ no GPU contacts a GPU-only project,
            send high-pri message saying they need a GPU
        - if host w/ GPU has driver too old for all versions,
            send high-pri message saying to update driver
        - if host w/ GPU has driver too old for some versions,
            send low-pri message saying to update driver
        - if host has GPU but too little RAM for any app,
            send low-pri message saying so
- scheduler: revamp GPU plan class functions

svn path=/trunk/boinc/; revision=21760
2010-06-16 22:07:19 +00:00
David Anderson a151ad6cb3 - client/scheduler: deal with situation where GPU has enough
RAM to run job, but when we actually run the job
    not enough GPU RAM is free, so the application fails.
    This can cause a large number of jobs to fail.
    Solution:
    - app_plan() can specify the GPU RAM requirements of an app version.
        This is passed to the client in a new field
        <gpu_ram> of the <app_version> element.
    - prior to starting or restarting a GPU app, the client
        checks the amount of free RAM on the particular GPU.
        If it's not enough for the app version,
        the client doesn't start it,
        and arranges for the scheduler to ignore it for 5 minutes
        (by which point there might be more free GPU RAM)
    Notes:
    1) this change will have effect only when
        both client and scheduler are updated.
    2) the check is done in enforce_schedule(),
        rather than schedule_cpus(),
        because only at that point
        have we assigned a specific GPU to the job.
    3) there's another case to deal with:
        a GPU app's malloc of GPU RAM fails in the middle of the job.
        Currently the job fails.
        I plan to add an API call boinc_temporary_exit(x) so
        that the job can exit and potentially restart in x seconds.
        (In principle this mechanism is sufficient for all cases,
        but it could lead to a lot of starting/exiting,
        so the current change is worthwhile).

svn path=/trunk/boinc/; revision=19864
2009-12-11 22:45:59 +00:00
David Anderson 720ec66f28 - scheduler: fix CUDA RAM warning msg
svn path=/trunk/boinc/; revision=18922
2009-08-26 16:10:46 +00:00
David Anderson eafb410cf8 - scheduler: simplify and fix the way that app_plan() conveys messages
to the user.  app_plan() now generates the messages directly
    rather than returning integer error codes.

svn path=/trunk/boinc/; revision=18899
2009-08-21 20:38:39 +00:00
Eric J. Korpela 208956257a - Moved credit.cpp into libboinc_sched where it should have gone in the first
place.
- Added a separate GPU memory requirement for the CUDA23 plan


svn path=/trunk/boinc/; revision=18887
2009-08-21 02:12:31 +00:00
David Anderson 7278ab1787 - scheduler: add support for ATI GPUs
svn path=/trunk/boinc/; revision=18851
2009-08-17 17:07:38 +00:00
David Anderson a525453b5e - code shuffling
svn path=/trunk/boinc/; revision=18826
2009-08-10 04:56:46 +00:00
David Anderson f163897d8a - scheduler: add plan class for CUDA 2.3
svn path=/trunk/boinc/; revision=18804
2009-08-03 21:30:19 +00:00
David Anderson e3363c7eb8 - scheduler: on second thought, it would be better to add the above
feature without requiring use of score-based scheduling.
    So add a new customizable function, wu_is_infeasible_custom(),
    where projects can put job-specific checks.

    Also, move customizable functions (of which there are now 4)
    to a new file, sched_customize.cpp.

svn path=/trunk/boinc/; revision=18767
2009-07-29 18:55:50 +00:00