-----------------------
BUGS (arranged from high to low priority)
-----------------------
- window closes and does not reopen when workunit finishes
and new workunit starts
- CPU time updates infrequently (every 10 seconds),
add user control for this (HD write frequency)
- Client treats URL "maggie/ap/" differently from URL "maggie/ap";
though this isn't really a bug, it might be good to fix anyway
- global battery/user active prefs are always true in the client
- Client should display "Upload failed" and "Download failed" when failure occurs
- Result status should say "downloading files", "uploading files", etc.
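One possible fix for the "maggie/ap/" vs. "maggie/ap" item above is to canonicalize project URLs before comparing them. The sketch below is in Python for brevity and is not client code; the helper name canonicalize_url is hypothetical, and dropping the trailing slash (rather than always adding one) is an assumption.

```python
def canonicalize_url(url: str) -> str:
    """Return url with surrounding whitespace and trailing slashes removed.

    Hypothetical helper: the name and the choice to strip (rather than
    append) the trailing slash are assumptions, not existing client behavior.
    """
    return url.strip().rstrip("/")


def same_project(url_a: str, url_b: str) -> bool:
    """Compare two project URLs after canonicalization."""
    return canonicalize_url(url_a) == canonicalize_url(url_b)
```

With this, same_project("maggie/ap/", "maggie/ap") is true, while URLs that differ in any other way still compare as different.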
-----------------------
HIGH-PRIORITY (should do for beta test)
-----------------------
- Implement Screensaver "blank screen" functionality
implement server watchdogs
est_time_to_completion doesn't work for non-running tasks
Messages from core client
decide what messages should be shown to user, and how
log file? GUI? dialog?
Should tag messages with project they're from, if any?
-----------------------
THINGS TO TEST (preferably with test scripts)
-----------------------
- Test suspend/resume functionality on Windows/UNIX
- verify that if file xfer is interrupted, it resumes at right place
- result reissue
- WU failure: too many errors
- WU failure: too many good results
- credit is granted even if result arrives very late
- multiple preference sets
-----------------------
MEDIUM-PRIORITY (should do before public release)
-----------------------
decide what to do with invalid result files in upload directory
make get_local_ip_addr() work in all cases
Implement FIFO mechanism in scheduler for results that can't be sent
user profiles on web (borrow logic from SETI@home)
Devise system for porting applications
password-protected web-based interface for
uploading app versions and adding them to DB
XXX should do this manually since need to sign
Add 2-D waterfall display to Astropulse
get timezone working on all platforms
Deadline mechanism for results
- use in result dispatching
- use in file uploading (decide what to upload next)
- use in deciding when to make scheduler RPC (done already?)
Testing framework
better mechanisms to model server/client/communication failure
better mechanisms to simulate large load
do client/server on separate hosts?
Delete files if needed to honor disk usage constraint
inform user if this happens
Global preferences
implement disk usage prefs
time-of-day prefs?
test propagation mechanism
set up multi-project, multi-host test;
change global prefs at one web site,
make sure they propagate to all hosts
limit on frequency of disk writes?
Per-project preferences
test project-specific prefs
make example web edit pages
make app that uses them
set up a test with multiple projects
test "add project" feature, GUI and cmdline
test resource share mechanism
CPU benchmarking
review CPU benchmarks - do they do what we want?
what to do when tests show hardware problem?
How should we weight factors for credit?
run CPU tests unobtrusively, periodically
check that on/conn/active fracs are maintained correctly
check that bandwidth is measured correctly
measure disk/mem size on all platforms
get timezone to work
CPU accounting in the presence of checkpoint/restart
test
Redundancy checking and validation
test the validation mechanism
make sure credit is granted correctly
make sure average, total credit maintained correctly for user, host
Windows screensaver functionality
idle-only behavior without screensaver - test
Data transfer
make sure restart of downloads works
make sure restart of uploads works
test download/upload with multiple data servers
make sure it tries servers in succession,
does exponential backoff if all fail
review and document prioritization of transfers
review protocol; make sure error returns are possible and handled correctly
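The "tries servers in succession, does exponential backoff if all fail" behavior under Data transfer could be tested against a model like the following. This is a minimal sketch in Python, not the client's implementation; the function names, the 60-second base delay, and the 4-hour cap are all assumptions.

```python
def try_servers(servers, attempt):
    """Try each data server in order; return the first one for which
    attempt(server) is true, or None if every server fails."""
    for server in servers:
        if attempt(server):
            return server
    return None


def backoff_delay(n_failed_rounds, min_delay=60.0, max_delay=4 * 3600.0):
    """Seconds to wait after n_failed_rounds consecutive rounds in which
    all servers failed. The delay doubles each round, capped at max_delay.
    The default values are assumptions chosen for illustration."""
    if n_failed_rounds <= 0:
        return 0.0
    return min(max_delay, min_delay * 2 ** (n_failed_rounds - 1))
```

A test script could feed try_servers a fake attempt function that fails for chosen servers, then check that the wait between rounds follows backoff_delay.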
Scheduler
Should dispatch results based on deadline?
test that scheduler estimates WU completion time correctly
test that scheduler sends right amount of work
test that client estimates remaining work correctly,
requests correct # of seconds
test that hi/low water mark system works
test that scheduler sends only feasible WUs
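For the hi/low water mark test above, one reading of the mechanism is: the client requests work only when its queued work drops below the low water mark, and then asks for enough to reach the high water mark. The sketch below models that reading in Python; the function name and the assumption that both marks are expressed in seconds of estimated work are illustrative, not taken from the client.

```python
def seconds_to_request(queued_seconds, low_water, high_water):
    """Return how many seconds of work to request from the scheduler.

    Assumed policy: request nothing while queued work is at or above
    low_water; otherwise request enough to fill up to high_water.
    """
    if low_water > high_water:
        raise ValueError("low_water must not exceed high_water")
    if queued_seconds >= low_water:
        return 0.0
    return float(high_water - queued_seconds)
```

A test script can step queued_seconds downward and check that a request is issued only once it crosses the low mark.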
Scheduler RPC
formalize notion of "permanent failure" (e.g. can't download file)
report perm failures to scheduler, record in DB
make sure RPC backoff is done for any perm failure
(in general, should never make back-to-back RPCs to a project)
make sure that client eventually reloads master URL
Application graphics
finish design, implementation, doc, testing
size, frame rate, whether to generate
Work generation
generation of upload signature is very slow
prevent file_xfer->req1 from overflowing. This problem seems to
happen when the file_upload_handler returns a large message to the
client. This causes project->parsefile to get wrong
input and so on.
test HTTP redirect mechanism for all types of ops
Add batch features to ops web
-----------------------
LONG-TERM IDEAS AND PROJECTS
-----------------------
use https for login (don't send account ID or password in clear)
CPU benchmarking
This should be done by a pseudo-application
rather than by the core client.
This would eliminate the GUI-starvation problem,
and would make it possible to have architecture-specific
benchmarking programs (e.g. for graphics coprocessor)
or project-specific programs.
investigate binary diff mechanism for updating persistent files
verify support for > 4 GB files everywhere
use FTP instead of HTTP for file xfer??
measure speed diff
Local scheduling
more intelligent decision about when/what to work on
- monitor VM situation, run small-footprint programs
even if user active
- monitor network usage, do net xfers if network idle
even if user active
The following would require client to accept connections:
- clients can act as proxy scheduling server
- exiting client can pass work to another client
- client can transfer files to other clients
User/host "reputation"
keep track of % results bad, % results claimed > 2x granted credit
both per-host and per-user.
Make these visible to project, to that user (only)
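The two reputation figures above could be kept as simple counters per host and per user. A minimal Python sketch follows; the class and field names are hypothetical, and counting the "claimed > 2x granted" case only for results that were not bad is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Reputation:
    """Running counters for one host or one user (hypothetical structure)."""
    n_results: int = 0
    n_bad: int = 0
    n_overclaimed: int = 0

    def record(self, valid: bool, claimed_credit: float, granted_credit: float) -> None:
        """Record one result. Assumption: a bad result counts only as bad,
        so over-claiming is checked for valid results only."""
        self.n_results += 1
        if not valid:
            self.n_bad += 1
        elif claimed_credit > 2 * granted_credit:
            self.n_overclaimed += 1

    def pct_bad(self) -> float:
        return 100.0 * self.n_bad / self.n_results if self.n_results else 0.0

    def pct_overclaimed(self) -> float:
        return 100.0 * self.n_overclaimed / self.n_results if self.n_results else 0.0
```

One instance would be kept per host and one per user, each updated when a result is validated.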
Storage validation
periodic rehash of persistent files;
compare results between hosts
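The rehash-and-compare idea above might look like the following. This is a sketch in Python under stated assumptions: SHA-256 is chosen only for illustration, the function names are hypothetical, and "compare results between hosts" is read as flagging hosts whose digest differs from the most common one.

```python
import hashlib
from collections import Counter


def digest_chunks(chunks):
    """Return the hex SHA-256 digest of an iterable of byte chunks."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def file_digest(path, chunk_size=1 << 16):
    """Rehash a persistent file by reading it in fixed-size chunks."""
    with open(path, "rb") as f:
        return digest_chunks(iter(lambda: f.read(chunk_size), b""))


def hosts_disagreeing(digests_by_host):
    """Given {host: digest}, return the sorted hosts whose digest differs
    from the most common digest (assumed to be the correct one)."""
    if not digests_by_host:
        return []
    majority, _count = Counter(digests_by_host.values()).most_common(1)[0]
    return sorted(h for h, d in digests_by_host.items() if d != majority)
```

Each host would run file_digest periodically and report the value; the server side would then call hosts_disagreeing on the collected digests.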
Include account ID in URL for file xfers
This would let you verify network xfers by scanning web logs
(could use that to give credit for xfers)
WU/result sequence mechanism
design/implement/document
Multiple application files
document, test
Versioning
think through issues involved in:
compatibility of core client and scheduling server
compatibility of core client and data server
compatibility of core client and app version
compatibility of core client and client state file?
Need version numbers for protocols/interfaces?
What messages to show user? Project?
Persistent files
test
design/implement test reporting, retrieval mechanisms
(do this using WU/results with null application?)
NET_XFER_SET
review logic; prevent one stream from starving others
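One way to keep a single stream from starving the rest is to service streams round-robin with a fixed byte quantum per turn. The Python sketch below only models the ordering and is not the NET_XFER_SET logic; the function name and the quantum parameter are assumptions.

```python
from collections import deque


def round_robin_schedule(pending, bytes_per_turn):
    """Model round-robin servicing of transfers.

    pending maps a stream name to its bytes remaining. Each turn a stream
    sends at most bytes_per_turn, then goes to the back of the queue if it
    still has data. Returns the list of (name, bytes_sent) turns in order.
    """
    if bytes_per_turn <= 0:
        raise ValueError("bytes_per_turn must be positive")
    queue = deque(pending.items())
    turns = []
    while queue:
        name, remaining = queue.popleft()
        sent = min(bytes_per_turn, remaining)
        turns.append((name, sent))
        if remaining > sent:
            queue.append((name, remaining - sent))
    return turns
```

For example, with streams a (250 bytes) and b (100 bytes) and a 100-byte quantum, b gets its turn after a's first 100 bytes instead of waiting for a to finish.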
Kill app if there is a memory leak
Other user preferences:
memory restrictions
process priority/affinity
show disk usage as two pie charts (one for overall, one for per project)
disk write frequency