boinc/todo

-----------------------
BUGS (arranged from high to low priority)
-----------------------
- Suspend/resume not fully functional on Windows, no way to suspend/resume on UNIX
- Currently, if there are multiple CPUs they work on the same result
- "Show Graphics" menu item brings up minimized window, client does not remember window size/pos after close/reopen, window closes and does not reopen when workunit finishes and new workunit starts
- No easy way to quit/add projects on UNIX
- "ACTIVE_TASK.check_app_status_files: could not delete slots\0\fraction_done.xml: -110" appears in stderr.txt on Windows
- "no work available" appears sporadically though work is eventually assigned, not sure if it is assigned immediately or on next RPC
- Should include option in Windows client or installer whether to run client at startup or not
- Scheduler reply includes blank lines that XML parser complains about
- boinc_gui.exe priority should be lower (?), launched app priorities should be very low
- on final panel of install, add checkbox to let user view readme
- Time to completion isn't too accurate, this is more of an Astropulse problem involving fraction_done
- Report problems page on maggie doesn't link to anything
- Host stats incorrectly reports number of times connected with same ip (unconfirmed)
- CPU time updates infrequently (every 10 seconds), should there be a user control for this?
- Client treats URL "maggie/ap/" differently from URL "maggie/ap"; though this isn't really a bug, it might be good to fix anyway
- Astropulse uses a lot of memory (~70 MB) b/c of the dispersion table, should this be decreased?
- CPU time for a completed workunit is incorrect (unconfirmed)
- client died quickly on Mandrake Linux 9.0 (unconfirmed)
- make pie chart colors/labels easier to understand
- need a way to refresh prefs from client
- columns expand when window expands
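
Sketch for the trailing-slash URL item above: one possible fix is to
canonicalize project URLs before comparing them. The helper name
canonicalize_project_url is hypothetical (not an existing client function),
and ending every URL with exactly one slash is an assumption, not a decided
convention.

```python
def canonicalize_project_url(url):
    """Return url stripped of surrounding whitespace, with exactly one trailing slash.

    Hypothetical helper: the "always end with a slash" rule is an assumption.
    """
    return url.strip().rstrip("/") + "/"


# Both spellings from the bug report map to the same key.
same = canonicalize_project_url("maggie/ap") == canonicalize_project_url("maggie/ap/")
```

Comparing canonical forms, rather than raw strings, would make the two
spellings refer to the same project.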

-----------------------
HIGH-PRIORITY (should do for beta test)
-----------------------
make get_local_ip_addr() work in all cases
est_time_to_completion doesn't work for non-running tasks

run backend programs (validate/file_deleter/assimilate)
    from crontab; document

Windows client
    make mini-logo
    fix "unable to calculate"
    completed results: show final CPU time

Messages from core client
    decide what messages should be shown to user, and how
    log file?  GUI?  dialog?

-----------------------
MEDIUM-PRIORITY (should do before public release)
-----------------------

protect project admin web pages (htaccess)
get timezone working on all platforms

Deadline mechanism for results
    - use in result dispatching
    - use in file uploading (decide what to upload next)
    - use in deciding when to make scheduler RPC (done already?)

Testing framework
    better mechanisms to model server/client/communication failure
    better mechanisms to simulate large load
    do client/server on separate hosts?

Delete files if needed to honor disk usage constraint
    inform user if this happens

implement max bytes/sec network preferences

Global preferences
    implement disk usage prefs
    time-of-day prefs?
    test propagation mechanism
        set up multi-project, multi-host test;
        change global prefs at one web site,
        make sure they propagate to all hosts
    limit on frequency of disk writes?
    max net traffic per day?
        implement in client

Per-project preferences
    test project-specific prefs
        make example web edit pages
        make app that uses them
    set up a test with multiple projects
        test "add project" feature, GUI and cmdline
        test resource share mechanism

CPU benchmarking
    review CPU benchmarks - do they do what we want?
    what to do when tests show hardware problem?
    How should we weight factors for credit?
    run CPU tests unobtrusively, periodically
    check that on/conn/active fracs are maintained correctly
    check that bandwidth is measured correctly
    measure disk/mem size on all platforms
    get timezone to work

CPU accounting in the presence of checkpoint/restart
    test

Test nslots > 1

Redundancy checking and validation
    test the validation mechanism
    make sure credit is granted correctly
    make sure average, total credit maintained correctly for user, host

Windows screensaver functionality
    idle-only behavior without screensaver - test

Data transfer
    make sure restart of downloads works
    make sure restart of uploads works
    test download/upload with multiple data servers
        make sure it tries servers in succession,
        does exponential backoff if all fail
    review and document prioritization of transfers
    review protocol; make sure error returns are possible and handled correctly
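
Sketch of the "try servers in succession, exponential backoff if all fail"
behavior to be tested above. This is an illustration only: the function
names, the 60-second base delay, the 4-hour cap, and the random jitter are
all assumptions, not the client's actual parameters.

```python
import random


def try_servers(servers, transfer):
    """Try each data server in order; return the first URL that succeeds, else None.

    `transfer` is a caller-supplied function (hypothetical) returning True on success.
    """
    for url in servers:
        if transfer(url):
            return url
    return None


def backoff_delay(n_failures, base=60.0, cap=4 * 3600.0, rng=random.random):
    """Seconds to wait after n_failures consecutive rounds in which all servers failed.

    The delay doubles per failed round, is capped, and is scaled by a random
    factor in [0.5, 1.0) so that many clients do not retry in lockstep.
    """
    if n_failures <= 0:
        return 0.0
    exponent = min(n_failures - 1, 30)  # bound the exponent to avoid huge numbers
    delay = min(cap, base * 2 ** exponent)
    return delay * (0.5 + 0.5 * rng())
```

A test could drive try_servers with a transfer function that fails for
chosen servers, and check that the delay grows after each all-fail round.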

Scheduler
    Should dispatch results based on deadline?
    test that scheduler estimates WU completion time correctly
    test that scheduler sends right amount of work
    test that client estimates remaining work correctly,
        requests correct # of seconds
    test that hi/low water mark system works
    test that scheduler sends only feasible WUs
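
Sketch of one reading of the hi/low water mark system, for use when writing
the tests above. Assumption (not confirmed by these notes): the client asks
for work only when its estimated remaining work drops below the low water
mark, and then requests enough seconds to reach the high water mark. The
function name is hypothetical.

```python
def work_request_seconds(est_remaining, low_water, high_water):
    """Return how many seconds of work to request (0 if none is needed).

    All arguments are in seconds of estimated CPU time.
    """
    if low_water > high_water:
        raise ValueError("low water mark must not exceed high water mark")
    if est_remaining >= low_water:
        return 0  # still above the low mark: no request
    return high_water - est_remaining  # top up to the high mark
```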

Scheduler RPC
    formalize notion of "permanent failure" (e.g. can't download file)
    report perm failures to scheduler, record in DB
    make sure RPC backoff is done for any perm failure
        (in general, should never make back-to-back RPCs to a project)
    make sure that client eventually reloads master URL

Application graphics
    finish design, implementation, doc, testing
        size, frame rate, whether to generate

Work generation
    generation of upload signature is very slow

prevent file_xfer->req1 from overflowing. This problem seems to
    happen when the file_upload_handler returns a large message to the
    client. This causes project->parsefile to get wrong
    input and so on.

test HTTP redirect mechanism for all types of ops

Add batch features to ops web

-----------------------
LONG-TERM IDEAS AND PROJECTS
-----------------------

use https for login (don't send account ID or password in clear)

CPU benchmarking
    This should be done by a pseudo-application
    rather than by the core client.
    This would eliminate the GUI-starvation problem,
    and would make it possible to have architecture-specific
    benchmarking programs (e.g. for graphics coprocessor)
    or project-specific programs.

investigate binary diff mechanism for updating persistent files

verify support for > 4 GB files everywhere

use FTP instead of HTTP for file xfer??
    measure speed diff

Local scheduling
    more intelligent decision about when/what to work on
    - monitor VM situation, run small-footprint programs
        even if user active
    - monitor network usage, do net xfers if network idle
        even if user active

The following would require client to accept connections:
    - clients can act as proxy scheduling server
    - exiting client can pass work to another client
    - client can transfer files to other clients

User/host "reputation"
    keep track of % results bad, % results claimed > 2x granted credit
    both per-host and per-user.
    Make these visible to project, to that user (only)
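
Sketch of the two reputation figures described above, computed over a list
of results for one host or one user. The tuple layout and the function name
are assumptions made for illustration; the real data would come from the
project database.

```python
def reputation(results):
    """Compute reputation percentages from (valid, claimed_credit, granted_credit) tuples.

    Returns a dict with:
      pct_bad          percent of results that failed validation
      pct_overclaimed  percent of results that claimed more than 2x granted credit
    """
    results = list(results)
    n = len(results)
    if n == 0:
        return {"pct_bad": 0.0, "pct_overclaimed": 0.0}
    bad = sum(1 for valid, _, _ in results if not valid)
    over = sum(1 for _, claimed, granted in results if claimed > 2 * granted)
    return {"pct_bad": 100.0 * bad / n, "pct_overclaimed": 100.0 * over / n}
```

Note that under this definition a result granted zero credit counts as
overclaimed whenever it claimed anything; whether that is desirable is an
open design question.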

Storage validation
    periodic rehash of persistent files;
    compare results between hosts
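
Sketch of a periodic rehash pass for the storage validation idea above.
Assumptions: SHA-256 is used purely for illustration (the notes do not name
a hash), and the function names and the path-to-digest mapping are
hypothetical.

```python
import hashlib


def file_digest(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(known_digests):
    """Return paths whose current digest differs from the recorded one.

    known_digests maps path -> expected hex digest. Files that cannot be
    read are reported as changed.
    """
    changed = []
    for path, expected in known_digests.items():
        try:
            current = file_digest(path)
        except OSError:
            current = None
        if current != expected:
            changed.append(path)
    return changed
```

The digests reported by different hosts holding the same file could then be
compared on the server side.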

Include account ID in URL for file xfers
    This would let you verify network xfers by scanning web logs
    (could use that to give credit for xfers)

WU/result sequence mechanism
    design/implement/document

Multiple application files
    document, test

Versioning
    think through issues involved in:
    compatibility of core client and scheduling server
    compatibility of core client and data server
    compatibility of core client and app version
    compatibility of core client and client state file?
    Need version numbers for protocols/interfaces?
    What messages to show user?  Project?

Persistent files
    test
    design/implement test reporting, retrieval mechanisms
    (do this using WU/results with null application?)

NET_XFER_SET
    review logic; prevent one stream from starving others
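
Sketch of one way to keep a stream from starving others: serve active
transfers round-robin with a fixed byte quantum per turn. This is a generic
technique, not a description of the current NET_XFER_SET logic; the
function name and data layout are assumptions.

```python
from collections import deque


def round_robin_schedule(streams, quantum):
    """Return the service order as a list of (name, nbytes) pairs.

    streams maps stream name -> bytes remaining. Each turn a stream may move
    at most `quantum` bytes, then goes to the back of the queue.
    """
    if quantum <= 0:
        raise ValueError("quantum must be positive")
    queue = deque(streams.items())
    order = []
    while queue:
        name, remaining = queue.popleft()
        n = min(quantum, remaining)
        if n > 0:
            order.append((name, n))
        remaining -= n
        if remaining > 0:
            queue.append((name, remaining))
    return order
```

With streams {"a": 5, "b": 2} and a quantum of 2, stream "b" gets its turn
after the first 2 bytes of "a" instead of waiting for "a" to finish.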