Commit Graph

5 Commits

Author SHA1 Message Date
Brad Fitzpatrick 8297d9614c localdisk: change hashing structure
Before the files were stored in directories like
sha1/012/345/sha-012345xxxxx.dat, meaning there were 4096 (16^3)
top-level directories, each with up to 4096 child directories.  We
never really did the math, and the result millions (up to 16.7
million) directories with 1 file each.

Now the hashing structure is only 256 wide (two hex digits). If we
considered 4096 files in a directory acceptable before, that means the
new scheme can go up to 256*256*4096 files (268 million), which is
about 512 times bigger than my personal Camlistore instance
now. Larger users should probably be using the diskpacked storage
backend, anyway.

On start-up, the code now migrates the old format to the new format.

Change-Id: I17f7e830c50a5b770c57ee92d51f122340a0afbb
2013-11-28 16:33:01 -08:00
Brad Fitzpatrick 70475701d1 Get rid of QueueCreator and all its associated complexity.
Previous TODO entry was:

-- Get rid of QueueCreator entirely. Plan:
     -- sync handler still has a source and dest (one pair) but
        instead of calling CreateQueue on the source, it instead
        has an index.Storage (configured via a RequiredObject
        so it can be a kvfile, leveldb, mysql, postgres etc)
     -- make all the index.Storage types be instantiable
        from a jsonconfig Object, perhaps with constructors keyed
        on a "type" field.
     -- make sync handler support blobserver.Receiver (or StatReceiver)
        like indexes, so it can receive blobs.  but all it needs to
        do to acknowledge the ReceiveBlob is write and flush to its
        index.Storage. the syncing is async by default. (otherwise callers
        could just use "replica" if they wanted sync replication).
        But maybe for ease of configuration switching, we could also
        support a sync mode.  when it needs to replicate a blob,
        it uses the source.
     -- future option: sync mirror to an alternate path on ReceiveBlob
        that can delete. e.g. you're uploading to s3 and google,
        but don't want to upload to both at once, so you use the localdisk
        as a buffer to spread out your upstream bandwidth.
     -- end result: no more hardlinks or queue creator.

Change-Id: I6244fc4f3a655f08470ae3160502659399f468ed
2013-11-22 14:33:31 -08:00
Brad Fitzpatrick 0bdf20884b all: delete pkg/blobref; convert all from *blobref.BlobRef to new blob.Ref
Change-Id: Id2dfb7f19452bedf4f3c9310b36227fd8117b225
2013-08-03 19:54:30 -07:00
Brad Fitzpatrick cf0d9aca6e More docs
Change-Id: I5c21f240c85bcf91fb67487cc172bf3faeb49fff
2013-07-07 18:52:14 -07:00
Brad Fitzpatrick 0714a463c9 Update from r60 to [almost] Go 1.
A lot is still broken, but most stuff at least compiles now.

The directory tree has been rearranged now too.  Go libraries are now
under "pkg".  Fully qualified, they are e.g. "camlistore.org/pkg/jsonsign".

The go tool cannot yet fetch from arbitrary domains, but discussion is
happening now on which mechanism to use to allow that.

For now, put the camlistore root under $GOPATH/src.  Typically $GOPATH
is $HOME, so Camlistore should be at $HOME/src/camlistore.org.

Then you can:

$ go build ./server/camlistored

... etc

The build.pl script is currently disabled.  It'll be resurrected at
some point, but with a very different role (helping create a fake
GOPATH and running the go build command, if things are installed at
the wrong place, and/or running fileembed generators).

Many things are certainly broken.

Many things are disabled.  (MySQL, all indexing, etc).

Many things need to be moved into
camlistore.org/third_party/{code.google.com,github.com} and updated
from their r60 to Go 1 versions, where applicable.

The GoMySQL stuff should be updated to use database/sql and the ziutek
library implementing database/sql/driver.

Help wanted.

Change-Id: If71217dc5c8f0e70dbe46e9504ca5131c6eeacde
2012-02-18 21:53:06 -08:00