mirror of https://github.com/perkeep/perkeep.git
There are two TODO lists. This file (good for airplanes) and the online
bug tracker:

   https://github.com/perkeep/perkeep/issues

Offline list:

-- add a build tag to allow perkeepd to be compiled without any GCE
   support (good for smaller binaries on Raspberry Pis or whatnot).
   Most code has moved to the osutil/gce package, and there's little
   GCE usage now in perkeepd except for 2-3 things behind "if
   env.OnGCE" checks. Those things inside the checks can be func
   pointers registered by a file with a +build gce or !without_gce,
   depending on which way we decide to go. But make sure our test
   coverage builds both.

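A sketch of the func-pointer idea above; `onGCEHook` is a hypothetical name,
and in the real split the `init` below would live in a separate file guarded
by `//go:build gce` (or `!without_gce`), so untagged builds leave the hook nil:

```go
package main

import "fmt"

// onGCEHook is a hypothetical hook that GCE-specific code registers;
// it stays nil when perkeepd is compiled without GCE support.
var onGCEHook func() string

// In the real layout this init would live in a file guarded by a build
// tag such as "//go:build gce", so untagged builds omit it entirely.
func init() {
	onGCEHook = func() string { return "running on GCE" }
}

func main() {
	if onGCEHook != nil {
		fmt.Println(onGCEHook())
	} else {
		fmt.Println("compiled without GCE support")
	}
}
```

Test coverage would then need one build of each flavor to make sure both the
nil and non-nil paths compile and run.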
-- fix the presubmit's gofmt check to tolerate emacs lock files
   (the .#* files emacs leaves next to modified buffers):

     go fmt perkeep.org/cmd... perkeep.org/dev... perkeep.org/misc... perkeep.org/pkg... perkeep.org/server...
     stat pkg/blobserver/.#multistream_test.go: no such file or directory
     exit status 2
     make: *** [fmt] Error 1

-- add HTTP handler for blobstreamer. stream a tar file? where to put
   the continuation token? special file after each tar entry? special
   file at the end? HTTP Trailers? (but nobody supports them)

-- reindexing:
   * add streaming interface to localdisk? maybe, even though not ideal, but
     really: migrate my personal instance from localdisk to blobpacked +
     maybe diskpacked for loose blobs? start by migrating to blobpacked and
     measuring size of loose.
   * add blobserver.EnumerateAllUnsorted (which could use StreamBlobs
     if available, else use EnumerateAll, else maybe even use a new
     interface method that goes forever and can't resume at a point,
     but can be canceled, and localdisk could implement that at least)
   * add buffered sorted.KeyValue implementation: a memory one (of
     configurable max size) in front of a real disk one. add a Flush method
     to it. also Flush when memory gets big enough.
     In progress: pkg/sorted/buffer

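A minimal sketch of the buffered KeyValue idea from the last bullet. The
interface here is cut down from the real sorted.KeyValue (which has more
methods and a not-found error), and the type names are hypothetical:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// KeyValue is a cut-down stand-in for sorted.KeyValue.
type KeyValue interface {
	Set(key, value string) error
	Get(key string) (string, error)
}

// memKV is a trivial backing store for the sketch.
type memKV struct{ m map[string]string }

func (kv *memKV) Set(k, v string) error        { kv.m[k] = v; return nil }
func (kv *memKV) Get(k string) (string, error) { return kv.m[k], nil }

// bufferedKV buffers writes in memory and flushes them, in sorted key
// order, to the backing store once the buffer exceeds maxBuf bytes.
type bufferedKV struct {
	mu      sync.Mutex
	buf     map[string]string
	bufSize int
	maxBuf  int
	back    KeyValue
}

func (b *bufferedKV) Set(k, v string) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.buf[k] = v
	b.bufSize += len(k) + len(v)
	if b.bufSize >= b.maxBuf {
		return b.flushLocked() // "also Flush when memory gets big enough"
	}
	return nil
}

func (b *bufferedKV) Get(k string) (string, error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if v, ok := b.buf[k]; ok {
		return v, nil // unflushed write served from the buffer
	}
	return b.back.Get(k)
}

func (b *bufferedKV) Flush() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.flushLocked()
}

func (b *bufferedKV) flushLocked() error {
	keys := make([]string, 0, len(b.buf))
	for k := range b.buf {
		keys = append(keys, k)
	}
	sort.Strings(keys) // write in sorted order, as the disk store expects
	for _, k := range keys {
		if err := b.back.Set(k, b.buf[k]); err != nil {
			return err
		}
	}
	b.buf = map[string]string{}
	b.bufSize = 0
	return nil
}

func main() {
	back := &memKV{m: map[string]string{}}
	b := &bufferedKV{buf: map[string]string{}, maxBuf: 1 << 20, back: back}
	b.Set("a", "1")
	v, _ := b.Get("a") // still only in the buffer
	fmt.Println(v)
	b.Flush()
	v, _ = back.Get("a") // now in the backing store
	fmt.Println(v)
}
```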
-- stop using the "cond" blob router storage type in genconfig, as
   well as the /bs-and-index/ "replica" storage type, and just let the
   index register its own AddReceiveHook like the sync handler
   (pkg/server/sync.go). But whereas the sync handler only synchronously
   _enqueues_ the blob to replicate, the indexer should synchronously
   do the ReceiveBlob (ooo-reindex) on it too before returning.
   But the sync handler, despite technically only synchronously-enqueueing
   and being therefore async, is still very fast. It's likely the
   sync handler will therefore send a ReceiveBlob to the indexer
   at the ~same time the indexer is already indexing it. So the indexer
   should have some dup/merge suppression, and not do double work.
   singleflight should work. The loser should still consume the
   source io.Reader body and reply with the same error value.

-- ditch the importer.Interrupt type and pass along a context.Context
   instead, which has its Done channel for cancelation.

-- S3-only mode doesn't work with a local disk index (kvfile) because
   there's no directory for us to put the kv in.

-- fault injection in many more places with pkg/fault. maybe even in
   all handlers automatically somehow?

-- sync handler's shard validation doesn't retry on error; it only
   reports the errors now.

-- export blobserver.checkHashReader and document it with
   the blob.Fetcher docs.

-- "filestogether" handler, putting related blobs (e.g. files)
|
|
next to each other in bigger blobs / separate files, and recording
|
|
offsets of small blobs into bigger ones
|
|
|
|
-- diskpacked doesn't seem to sync its index quickly enough.
   A new blob received + process exit + read in a new process
   doesn't find that blob. kv bug? Seems to need an explicit Close.
   This feels broken. Add tests & debug.

-- websocket upload protocol. different write & read on same socket,
   as opposed to HTTP, to have multiple chunks in flight.

-- extension to blobserver upload protocol to minimize fsyncs: maybe a
   client can say "no rush" on a bunch of data blobs first (which
   still don't get acked back over websocket until they've been
   fsynced), and then when the client uploads the schema/vivify blob,
   that websocket message won't have the "no rush" flag, calling the
   optional blobserver.Storage method to fsync (in the case of
   diskpacked/localdisk) and getting all the "uploaded" messages back
   for the data chunks that were written-but-not-synced.

-- measure FUSE operations, latency, round-trips, performance.
   see next item:

-- ... we probably need a "describe all chunks in file" HTTP handler.
   then FUSE (when it sees sequential access) can say "what's the
   list of all chunks in this file?" and then fetch them all at once.
   see next item:

-- ... HTTP handler to get multiple blobs at once. multi-download
   in multipart/mime body. we have this for stat and upload, but
   not download.

-- ... if we do blob fetching over websocket too, then we can support
   cancellation of blob requests. Then we can combine the previous
   two items: FUSE client can ask the server, over websockets, for a
   list of all chunks, and to also start streaming them all. assume a
   high-latency (but acceptable bandwidth) link. the chunks are
   already in flight, but some might be redundant. once the client figures
   out some might be redundant, it can issue "stop send" messages over
   that websocket connection to prevent dups. this should work on
   both "files" and "bytes" types.

-- cacher: configurable policy on max cache size. clean oldest
   things (consider mtime+atime) to get back under max cache size.
   maybe prefer keeping small things (metadata blobs) too,
   and only delete large data chunks.

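One possible eviction policy matching the note above: evict large blobs
before small (metadata-ish) ones, oldest first, until under the cap. The
threshold and ordering are assumptions, not Perkeep's actual policy:

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a cached blob with its size and last-access time (unix seconds).
type entry struct {
	ref   string
	size  int64
	atime int64
}

// evict returns the entries to keep so the total size fits in maxBytes.
// Blobs over smallThreshold are evicted first (oldest first), so small
// metadata blobs tend to survive.
func evict(entries []entry, maxBytes, smallThreshold int64) (keep []entry) {
	var total int64
	for _, e := range entries {
		total += e.size
	}
	cand := append([]entry(nil), entries...)
	sort.Slice(cand, func(i, j int) bool {
		bi, bj := cand[i].size > smallThreshold, cand[j].size > smallThreshold
		if bi != bj {
			return bi // large blobs are eviction candidates before small ones
		}
		return cand[i].atime < cand[j].atime // then oldest first
	})
	dropped := map[string]bool{}
	for _, e := range cand {
		if total <= maxBytes {
			break
		}
		dropped[e.ref] = true
		total -= e.size
	}
	for _, e := range entries {
		if !dropped[e.ref] {
			keep = append(keep, e)
		}
	}
	return keep
}

func main() {
	kept := evict([]entry{
		{"meta1", 100, 1},
		{"big-old", 5000, 2},
		{"big-new", 5000, 3},
	}, 6000, 1024)
	for _, e := range kept {
		fmt.Println(e.ref) // big-old is evicted; meta1 and big-new survive
	}
}
```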
-- UI: video, at least thumbnailing (use external program,
   like VLC or whatever nautilus uses?)

-- rename server.ImageHandler to ThumbnailRequest or something? It's
   not really a Handler in the normal sense. It's not built once and
   called repeatedly; it's built for every ServeHTTP request.

-- unexport more stuff from pkg/server. Cache, etc.

-- look into garbage from openpgp signing

-- make leveldb memdb's iterator struct only 8 bytes, pointing to a recycled
   object, and just nil out that pointer at EOF.

-- bring in the google glog package to third_party and use it in
   places that want selective logging (e.g. pkg/index/receive.go)

-- (Mostly done) verify all ReceiveBlob calls and see which should be
   blobserver.Receive instead, or ReceiveNoHash. git grep -E
   "\.ReceiveBlob\(" And maybe ReceiveNoHash should go away and be
   replaced with a "ReceiveString" method which combines the
   blobref-from-string and ReceiveNoHash at once.

-- union storage target. sharder can be thought of as a specialization
   of union. sharder already unions, but has a hard-coded policy
   of where to put new blobs. union could be a library (used by sharder)
   with a pluggable policy on that.

-- support for running pk-mount under perkeepd. especially for OS X,
   where the lifetime of the background daemon will be the same as the
   user's login session.

-- website: add godoc for /server/perkeepd (also without a "go get"
   line)

-- tests for all cmd/* stuff, perhaps as part of some integration
   tests.

-- move most of pk-put into a library, not a package main.

-- server cron support: full syncs, pk-put file backups, integrity
   checks.

-- status in top right of UI: sync, crons. (in-progress, un-acked
   problems)

-- finish metadata compaction on the encryption blobserver.Storage wrapper.

-- get security review on encryption wrapper. (agl?)

-- peer-to-peer server and blobserver target to store encrypted blobs
   on strangers' hard drives. server will be open source so groups of
   friends/family can run their own for small circles, or some company
   could run a huge instance. spray encrypted backup chunks across
   friends' machines, and have central server(s) present challenges to
   the replicas to have them verify what they have and how big, and
   also occasionally say what the SHA-1("challenge" + blob-data) is.

-- sharing: make pk-get work with permanode sets too, not just
   "directory" and "file" things.

-- sharing: when hitting e.g. http://myserver/share/sha1-xxxxx, if
   a web browser and not a smart client (Accept header? User-Agent?)
   then redirect or render a cutesy gallery or file browser instead,
   still with machine-readable data for slurping.

-- rethink the directory schema so it can represent directories
   with millions of files (without making a >1MB or >16MB schema blob),
   probably forming a tree, similar to files. but rather than a rolling
   checksum, just split lexically when nodes get too big.

-- delete mostly-obsolete camsigd. see big TODO in camsigd.go.

-- we used to be able to live-edit js/css files in server/perkeepd/ui when
   running under the App Engine dev_appserver.py. That's now broken with my
   latest efforts to revive it. The place to start looking is:
   server/perkeepd/ui/fileembed_appengine.go

-- should a "share" claim be not a claim but its own permanode, so it
|
|
can be rescinded? right now you can't really unshare a "haveref"
|
|
claim. or rather, TODO: verify we support "delete" claims to
|
|
delete any claim, and verify the share system and indexer all
|
|
support it. I think the indexer might, but not the share system.
|
|
Also TODO: "pk-put delete" or "rescind" subcommand.
|
|
Also TODO: document share claims in doc/schema/ and on website.
|
|
|
|
-- make the -transitive flag for "pk-put share -transitive" be a
   tri-state: unset, true, false. unset should then mean default to
   true for "file" and "directory" schema blobs, and false for other
   things.

-- index: static directory recursive sizes: search: ask to see biggest directories?

-- index: index dates in filenames ("yyyy-mm-dd-Foo-Trip", "yyyy-mm blah", etc).

-- get the webdav server working again, for mounting on Windows. This worked
   before Go 1 but bitrotted when we moved pkg/fs to use the rsc/fuse.

-- BUG: osutil paths.go on OS X: should use Library everywhere
   instead of a mix of Library and ~/.camlistore?

OLD:

-- add CORS support? Access-Control-Allow-Origin: * + w/ OPTIONS
   http://hacks.mozilla.org/2009/07/cross-site-xmlhttprequest-with-cors/

-- brackup integration, perhaps sans GPG? (requires Perl client?)

-- blobserver: clean up channel-closing consistency in blobserver
   interface (most close, one doesn't. all should probably close)

Android:

[ ] Fix wake locks in UploadThread. need to hold CPU + WiFi whenever
    something's enqueued at all and we're running. Move out of the Thread
    that's uploading itself.
[ ] GPG signing of blobs (brad)
    http://code.google.com/p/android-privacy-guard/
    http://www.thialfihar.org/projects/apg/
    (supports signing in code, but not an Intent?)
    http://code.google.com/p/android-privacy-guard/wiki/UsingApgForDevelopment
    ... mailed the author.

Client libraries:

[X] Go
[X] JavaScript
[/] Python (Brett); but see https://github.com/tsileo/camlipy
[ ] Perl
[ ] Ruby
[ ] PHP