Before the files were stored in directories like
sha1/012/345/sha-012345xxxxx.dat, meaning there were 4096 (16^3)
top-level directories, each with up to 4096 child directories. We
never really did the math, and the result millions (up to 16.7
million) directories with 1 file each.
Now the hashing structure is only 256 wide (two hex digits). If we
considered 4096 files in a directory acceptable before, that means the
new scheme can go up to 256*256*4096 files (268 million), which is
about 512 times bigger than my personal Camlistore instance
now. Larger users should probably be using the diskpacked storage
backend, anyway.
On start-up, the code now migrates the old format to the new format.
Change-Id: I17f7e830c50a5b770c57ee92d51f122340a0afbb
Previous TODO entry was:
-- Get rid of QueueCreator entirely. Plan:
-- sync handler still has a source and dest (one pair) but
instead of calling CreateQueue on the source, it instead
has an index.Storage (configured via a RequiredObject
so it can be a kvfile, leveldb, mysql, postgres etc)
-- make all the index.Storage types be instantiable
from a jsonconfig Object, perhaps with constructors keyed
on a "type" field.
-- make sync handler support blobserver.Receiver (or StatReceiver)
like indexes, so it can receive blobs. but all it needs to
do to acknowledge the ReceiveBlob is write and flush to its
index.Storage. the syncing is async by default. (otherwise callers
could just use "replica" if they wanted sync replication).
But maybe for ease of configuration switching, we could also
support a sync mode. when it needs to replicate a blob,
it uses the source.
-- future option: sync mirror to an alternate path on ReceiveBlob
that can delete. e.g. you're uploading to s3 and google,
but don't want to upload to both at once, so you use the localdisk
as a buffer to spread out your upstream bandwidth.
-- end result: no more hardlinks or queue creator.
Change-Id: I6244fc4f3a655f08470ae3160502659399f468ed
A lot is still broken, but most stuff at least compiles now.
The directory tree has been rearranged now too. Go libraries are now
under "pkg". Fully qualified, they are e.g. "camlistore.org/pkg/jsonsign".
The go tool cannot yet fetch from arbitrary domains, but discussion is
happening now on which mechanism to use to allow that.
For now, put the camlistore root under $GOPATH/src. Typically $GOPATH
is $HOME, so Camlistore should be at $HOME/src/camlistore.org.
Then you can:
$ go build ./server/camlistored
... etc
The build.pl script is currently disabled. It'll be resurrected at
some point, but with a very different role (helping create a fake
GOPATH and running the go build command, if things are installed at
the wrong place, and/or running fileembed generators).
Many things are certainly broken.
Many things are disabled. (MySQL, all indexing, etc).
Many things need to be moved into
camlistore.org/third_party/{code.google.com,github.com} and updated
from their r60 to Go 1 versions, where applicable.
The GoMySQL stuff should be updated to use database/sql and the ziutek
library implementing database/sql/driver.
Help wanted.
Change-Id: If71217dc5c8f0e70dbe46e9504ca5131c6eeacde