Commit Graph

43 Commits

Author SHA1 Message Date
David Anderson 5bcdacfa64 Condor interface: race condition fix, from Jaime 2014-04-14 13:29:04 -07:00
David Anderson 12b4f43793 Condor interface: when unzipping output file, use job-specific name for temp file to avoid conflict between jobs 2014-03-13 13:07:16 -07:00
David Anderson 01b78c714a Remote job submission: allow efficient batch query
The batch query call used by Condor (query_batch_set(), in the C++ API)
returned info about all the jobs in the set of batches,
even those that hadn't changed.
This is potentially inefficient - a query might return info
about 10,000 jobs, only a few (or none) of which have changed state
since the last call.

Solution: add a "min_mod_time" parameter to the call.
Only jobs that have changed state since that time are reported.
Also, add a "server_time" field to the return,
giving the current time on the server
(in case there's clock skew between client and server).

Also, fix some text scrambling introduced in previous checkin;
there must have been a gremlin in my vim.
2014-01-16 10:24:10 -08:00
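The filtering described in this commit can be sketched roughly as follows. This is only an illustration, not the project's code: the names JobStatus, QueryReply and query_changed_jobs are hypothetical, and the choice of ">=" for the comparison is an assumption (it may re-report a job modified at exactly min_mod_time, but it never skips one).

```cpp
#include <string>
#include <vector>

// Hypothetical job record; the field names are illustrative only.
struct JobStatus {
    std::string name;
    std::string state;
    double mod_time;    // server time of the job's last state change
};

// Hypothetical reply: the server's clock plus the filtered job list.
struct QueryReply {
    double server_time;             // current time on the server
    std::vector<JobStatus> jobs;    // only jobs changed since min_mod_time
};

// Report only the jobs whose state changed at or after min_mod_time.
// Assumption: ">=" is used so that a change at exactly min_mod_time
// is not missed.
QueryReply query_changed_jobs(
    const std::vector<JobStatus>& all_jobs,
    double min_mod_time,
    double server_now
) {
    QueryReply reply;
    reply.server_time = server_now;
    for (const JobStatus& j : all_jobs) {
        if (j.mod_time >= min_mod_time) {
            reply.jobs.push_back(j);
        }
    }
    return reply;
}
```

A client would presumably save reply.server_time and pass it back as min_mod_time on its next call, so the comparison uses only the server's clock and clock skew does not matter.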
David Anderson 438cd78b13 Remote job submission: add C++ APIs for query_batches() and query_batch()
- Add program (tools/remote_submit_test.cpp) for testing C++ API for remote job submission.
- Rename Condor-specific API to query_batch_set().
2013-10-22 15:27:34 -07:00
David Anderson 03cc6849c7 remote job submission: add C++ interface to estimate_batch function 2013-10-21 23:40:30 -07:00
David Anderson 8e8adf46b9 Condor interface: bug fixes. 2013-10-17 22:38:22 -07:00
David Anderson 9d07555ce3 Condor: bug fix to BOINC GAHP, from Jaime 2013-10-09 12:43:40 -07:00
David Anderson 7336bfd9ba Remote job submission, C++ API: simplify 2013-10-08 17:36:45 -07:00
David Anderson 3e21e8b7c4 Condor: debug set_expire_time RPC 2013-09-17 23:14:57 -07:00
David Anderson 2a2c9c4ad8 remote job submission: add notion of "expire time" for batches (for Condor)
- Batches now have optional "expire time".
  If this time passes and the batch is not retired, abort and retire it.
- Add script "expire_batches" which enforces the above.
  Run it as a periodic task.
- Add a web RPC for setting the expire time of a batch
  (it can be changed multiple times)
- Add a C++ interface for this RPC
- Add a BOINC_SET_LEASE command to the BOINC GAHP
  ("lease" is Condor term for expire time)
2013-09-17 13:35:55 -07:00
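The expire-time rule above (abort and retire a batch whose expire time has passed and which is not yet retired) might look like this in outline. The commit says the enforcement is done by a script run as a periodic task; the C++ below is only a sketch of the rule itself, and Batch, BatchState and expire_batches_pass are hypothetical names. Treating an expire time of 0 as "not set" is also an assumption.

```cpp
#include <vector>

// Hypothetical batch states and record; not the project's schema.
enum class BatchState { IN_PROGRESS, COMPLETE, ABORTED, RETIRED };

struct Batch {
    int id;
    double expire_time;   // assumption: 0 means "no expire time set"
    BatchState state;
};

// One pass of the periodic task: retire every batch whose expire time
// has passed and that is not already retired.
// Returns the IDs of the batches expired by this pass.
std::vector<int> expire_batches_pass(std::vector<Batch>& batches, double now) {
    std::vector<int> expired;
    for (Batch& b : batches) {
        if (b.expire_time <= 0) continue;               // optional field unset
        if (b.state == BatchState::RETIRED) continue;   // nothing left to do
        if (b.expire_time > now) continue;              // lease still valid
        // A real implementation would abort the batch's unfinished jobs here.
        b.state = BatchState::RETIRED;
        expired.push_back(b.id);
    }
    return expired;
}
```

Because the expire time can be changed multiple times, a submitter can keep a batch alive by repeatedly pushing the time forward, which matches the Condor notion of a lease.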
David Anderson 29c41fd980 Bug fixes for job file management and BOINC GAHP, from Jaime Frey 2013-09-09 16:18:43 -07:00
David Anderson 65fb860317 Condor interface: increase thread stack size 2013-08-30 08:44:33 -07:00
David Anderson 98123560e2 Condor: add variant of uppercase that processes multiple files 2013-08-27 21:24:01 -07:00
David Anderson 1c31f6feaa Condor: fix bug when 2 input files have same contents; fix error messages 2013-08-09 16:06:36 -07:00
David Anderson 74a8a8cad7 Condor: use flockfile() instead of pthread mutex 2013-08-09 14:55:41 -07:00
David Anderson 37e822fe3a Condor: implement asynchronous mode in BOINC GAHP; from Jaime 2013-08-09 13:41:58 -07:00
David Anderson 53f97f9cf1 Condor: fix memory-allocation bugs, from Jaime 2013-08-09 13:33:03 -07:00
David Anderson 213bb934a7 Condor interface changes
BOINC_QUERY_BATCHES now prints, for each queried batch,
    a count of jobs followed by the jobs
BOINC_ABORT_JOBS takes a list of jobs, which may belong to different batches.
    The handler for this looks up the batches and makes sure
    the jobs belong to the user.
2013-05-30 23:38:33 -07:00
David Anderson ff261cb6df Condor interface: bug fixes; add request_gen script; add retire_batch command 2013-05-30 09:44:58 -07:00
David Anderson 8009a8cecb Condor interface: various fixes, mostly from Jaime Frey
- XML parser: for parse_string(), malloc the 256KB buffer instead of
    allocating it on the stack; the latter crashes threads with 32KB stacks.
    However, do the malloc() only if we've actually seen the start tag
    (this required a bit of code shuffle).
- BOINC GAHP: escape spaces in error msgs
2013-05-27 11:45:10 -07:00
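The XML parser change above can be illustrated with a much-simplified sketch. It is not the project's parse_string(): the signature, the string-based input and the name parse_string_sketch are assumptions made for illustration. The point is only that the 256KB buffer lives on the heap, and is allocated only after the start tag has been seen, so a thread with a 32KB stack neither overflows nor pays for the allocation when the tag is absent.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

const std::size_t PARSE_BUF_SIZE = 256 * 1024;

// Simplified stand-in: extract the text between <tag> and </tag>.
// Returns false if the element is missing, unterminated, or too large.
bool parse_string_sketch(const char* input, const char* tag, std::string& out) {
    const std::string start = std::string("<") + tag + ">";
    const std::string end = std::string("</") + tag + ">";

    // Look for the start tag first; if it is absent, return without
    // allocating anything.
    const char* p = std::strstr(input, start.c_str());
    if (!p) return false;

    // Start tag seen: allocate the large buffer on the heap, not the stack.
    char* buf = static_cast<char*>(std::malloc(PARSE_BUF_SIZE));
    if (!buf) return false;

    p += start.size();
    const char* q = std::strstr(p, end.c_str());
    bool ok = q != nullptr
        && static_cast<std::size_t>(q - p) < PARSE_BUF_SIZE;
    if (ok) {
        const std::size_t len = static_cast<std::size_t>(q - p);
        std::memcpy(buf, p, len);
        buf[len] = '\0';
        out.assign(buf, len);
    }
    std::free(buf);   // released on every path after the malloc
    return ok;
}
```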
David Anderson ff1311bf11 BOINC/Condor interface: tweaks from Jaime Frey 2013-05-21 20:35:14 -07:00
David Anderson 1d63b6d8cc Condor: tweak 2013-05-15 16:50:27 -07:00
David Anderson ed2fb9d26a Condor: various fixes to BOINC GAHP 2013-05-15 16:47:17 -07:00
David Anderson a5bcf6ab3b - client: work fetch message tweaks: show state before actions 2013-04-02 17:04:45 -07:00
David Anderson a78705a8d4 - client emulator: ignore non-CPU-intensive apps
- remote job submission:
    - prefix error messages with "BOINC server:"
      so higher levels can tell where the error is coming from
    - "get templates" RPC can take job name instead of app name
- Condor interface
    - add BOINC_SELECT_PROJECT function
    - BOINC_SUBMIT no longer has info about output files
    - Change BOINC_FETCH_OUTPUT semantics
2013-03-22 22:04:35 -07:00
David Anderson d95da0f75c - Condor integration:
    - change "query_batch" to "query_batches"; allow multiple batches
    - add "ping server" web RPC and GAHP function
    - change BoincDb::get() so that it generates XML error message if needed
2013-03-22 10:25:39 +01:00
David Anderson bd8ecb1b00 - Condor integration:
    - add Web RPC for querying a completed job (returns status,
        stderr out, run times, etc.)
    - support Jaime's changes to GAHP protocol
    - support for zipped output files
2013-03-15 13:38:44 +01:00
David Anderson 72d38818b4 - BOINC/Condor stuff
- C++ interfaces to remote functions: add error_msg argument,
    so caller can get textual description of error
2013-03-07 11:31:39 +01:00
David Anderson d6c92d870c - Code cleanup for remote job submission
- Add abort_jobs operation to BOINC GAHP
- Change batch-related RPCs so that you can identify batch
    either by database ID or name
2013-03-05 17:15:18 +01:00
David Anderson a6af5bf272 - remote job submission tweaks 2013-03-05 17:10:31 +01:00
David Anderson c3ec0e91de - client: show nvidia driver version as 314.07 instead of 314.7 2013-03-05 16:58:44 +01:00
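The 314.07 versus 314.7 difference is the usual zero-padding issue when printing a minor version. A minimal sketch, assuming the driver version is held as a single integer equal to major*100 + minor (for example 31407); that encoding and the function name format_driver_version are assumptions, not taken from the commit:

```cpp
#include <cstdio>
#include <string>

// Assumption: the version is packed as major*100 + minor, e.g. 31407.
// "%02d" pads the minor part to two digits, so 31407 prints as
// "314.07"; a plain "%d" would print "314.7".
std::string format_driver_version(int packed) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%d.%02d", packed / 100, packed % 100);
    return std::string(buf);
}
```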
David Anderson 3a530a4c34 - client: check return value of the function (statfs or statvfs)
    used to find disk space and usage.
    This may be failing for in-memory filesystems on Linux.
2013-03-05 15:05:29 +01:00
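Checking the return value, as this commit describes, could look like the sketch below. It uses the POSIX statvfs() call; the wrapper name get_disk_space and its output parameters are hypothetical, not the client's actual function.

```cpp
#include <sys/statvfs.h>

#include <cerrno>
#include <cstdio>
#include <cstring>

// Fill in total and free bytes for the filesystem containing path.
// Returns false, leaving the outputs untouched, if statvfs() fails,
// instead of computing sizes from an uninitialized struct.
bool get_disk_space(const char* path, double& total, double& free_space) {
    struct statvfs fs;
    if (statvfs(path, &fs) != 0) {
        std::fprintf(stderr, "statvfs(%s) failed: %s\n",
            path, std::strerror(errno));
        return false;
    }
    // f_blocks and f_bavail are counted in units of f_frsize.
    total = static_cast<double>(fs.f_frsize) * static_cast<double>(fs.f_blocks);
    free_space = static_cast<double>(fs.f_frsize) * static_cast<double>(fs.f_bavail);
    return true;
}
```

A caller that sees false can log the failure and fall back to a conservative value rather than acting on garbage sizes.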
David Anderson 3977ad8845 - Condor interface: small code shuffle 2013-03-05 14:52:38 +01:00
David Anderson 4eed4c1b7d - client: message tweak
- remote job submission: output file fetch now working.  Woo hoo!
2013-03-05 14:52:37 +01:00
David Anderson 46f06b9350 - Remote job submission stuff for Condor.
Submit and Query are more or less working.
2013-03-05 14:52:37 +01:00
David Anderson 7f4263b079 - Condor stuff; basic function works now! 2013-03-05 14:26:40 +01:00
David Anderson 3321837b01 - wrapper: fix CPU time accounting on Unix 2013-03-05 14:05:04 +01:00
David Anderson 2a73dc0e01 - remote file management stuff for Condor 2013-03-05 14:05:04 +01:00
David Anderson a46a5926ae - remote file management and job submission stuff for Condor 2013-03-05 14:05:04 +01:00
David Anderson fce6266e23 - Web RPC for remote job submit: fix bugs
- scheduler: message tweaks
2013-03-05 13:53:58 +01:00
David Anderson 79c6225fc2 - configure: work with "gold" linker 2013-03-05 13:33:27 +01:00
David Anderson 35608c434b - fix Android compile warnings
- intermediate checkin for Condor stuff
2013-03-05 13:33:27 +01:00
David Anderson 18d0f1f4d9 more GAHP code 2013-03-04 17:24:20 +01:00