Not just in blob.SizedRef, but in blobserver.Fetch and
blobserver.FetchStreaming, too.
Blobs have a max size of 10-32 MB anyway, and the index.Corpus is now using
uint32 to save memory.
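For reference, a sketch of the resulting shape in package blob (a
sketch only, not the exact diff):

    // SizedRef pairs a blob reference with its size. Any legal blob
    // fits in a uint32, since the maximum blob size is far below 4 GB,
    // and the smaller field saves memory in big in-memory structures
    // like index.Corpus.
    type SizedRef struct {
        Ref  Ref    // e.g. "sha1-..."
        Size uint32 // size in bytes
    }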
Change-Id: I1172445c2f9463fdaee55bfe0f1218d44be4aa53
From the package docs:
Package archiver zips lots of little blobs into bigger zip files
and stores them somewhere. While generic, it was designed to
incrementally create Amazon Glacier archives from many little
blobs, rather than creating millions of Glacier archives.
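A hedged usage sketch; the field and method names below are
assumptions about the package's API, and uploadToGlacier is
hypothetical:

    a := &archiver.Archiver{
        Source:     src,      // a blobserver.Storage holding the little blobs
        MinZipSize: 16 << 20, // assumed knob: skip zips smaller than this
        Store: func(zipData []byte, blobs []blob.SizedRef) error {
            // Write the finished zip wherever you like (Glacier here).
            return uploadToGlacier(zipData)
        },
    }
    // Each RunOnce (assumed name) packs one zip's worth of blobs and
    // stores it; loop until too little is left to fill a zip.
    for {
        if err := a.RunOnce(); err != nil {
            break
        }
    }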
Change-Id: If304b2d4bf144bfab073c61c148bb34fa0be2f2d
Bytes read/written per pack file, as well as per diskpacked
configuration, are now available as expvars.
Also add reader stat helpers to pkg/types and update the original
user in server/image.go.
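The underlying pattern is small enough to show; this self-contained
sketch uses only the standard library, and the var name is
illustrative rather than the actual pkg/types helper:

    package main

    import (
        "expvar"
        "io"
        "strings"
    )

    // countingReader adds each Read's byte count to an expvar.Int,
    // making the running total visible on the /debug/vars endpoint.
    type countingReader struct {
        r io.Reader
        n *expvar.Int
    }

    func (cr countingReader) Read(p []byte) (int, error) {
        n, err := cr.r.Read(p)
        cr.n.Add(int64(n))
        return n, err
    }

    var bytesRead = expvar.NewInt("diskpacked-bytes-read") // illustrative name

    func main() {
        cr := countingReader{strings.NewReader("some pack file bytes"), bytesRead}
        io.Copy(io.Discard, cr)
    }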
Change-Id: Ifc9d76c57aab329d4b947e9a4ef9eac008bc608d
As blob.fetcherToSeekerWrapper.Fetch erroneously asserted that
FetchStreaming returns a ReadSeekCloser every time, it had to be changed.
Move MaxBlobSize from blobserver to constants (new package).
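A sketch of the new package; the exact constant value is an assumption
(the text above only bounds blobs to roughly 10-32 MB):

    // Package constants holds Camlistore-wide constants, so low-level
    // packages like blob and blobserver can share them without import
    // cycles.
    package constants

    // MaxBlobSize is the maximum size of a single blob, in bytes.
    const MaxBlobSize = 16 << 20 // assumed value: 16 MB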
Change-Id: I4b4f22c302cbec84d77d21454e0c9e8aebdf73e5
No more dynamic upload URL, which trips up half our new users behind
reverse proxies when the camlistored process doesn't know its
forward-facing URL.
The original camlistore stat + upload protocol was influenced by App
Engine's limitations at the time, and some of our indecision about
where the Camlistore design is going. We understand the Camlistore
design now, and App Engine's former limitations are gone. Time to
clean things up.
More REST-y now too.
See http://camlistore.org/issue/123
Change-Id: I92c6552f830b925cef379c204a982a2213bf2f4b
The client configuration requires this if it's not passed in through the
environment. Since this is for a storage service, it makes sense to
place it with the specific remote.
Since SetupAuthFromConfig was a bit awkward and not used elsewhere, it's
replaced with a simpler, more explicit SetupAuthFromString, to which
you provide the exact auth details you wish to use.
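A sketch of the new call; the "userpass:..." auth-string syntax shown
here is an assumption:

    // cl is a *client.Client; the string carries the exact auth details.
    if err := cl.SetupAuthFromString("userpass:alice:secret"); err != nil {
        log.Fatal(err)
    }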
Change-Id: Id39ff314738794e299d48cbe634be2aa5d5c3bd1
This commit introduces the basic API required to implement
high-throughput blob streaming within the various blob storage
engines.
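One plausible shape for that API, sketched here as an assumption
rather than the actual interface:

    // BlobStreamer streams a storage engine's blobs in bulk, in an
    // implementation-defined order that favors sequential disk reads.
    type BlobStreamer interface {
        // StreamBlobs sends blobs on dest until it runs out, an error
        // occurs, or the receiver stops reading. The continuation
        // token (possibly empty) lets a caller resume a prior stream;
        // the returned token resumes after the last blob sent.
        StreamBlobs(dest chan<- blob.Blob, contToken string) (nextToken string, err error)
    }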
Change-Id: Ie170d11b229196617f96b298f864ad12af62c363
The new package sorted/kvtest provides a generic KeyValue test for all
implementations. Memory, SQLite, and kvfile now use it.
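Usage follows the usual shared-conformance-test pattern in Go; a
sketch, with the suite's entry-point name assumed:

    package memory_test

    import (
        "testing"

        "camlistore.org/pkg/sorted"
        "camlistore.org/pkg/sorted/kvtest"
    )

    func TestMemoryKV(t *testing.T) {
        kv := sorted.NewMemoryKeyValue()
        kvtest.TestSorted(t, kv) // entry-point name assumed
    }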
This speeds up the index slurping start-up of my personal Camlistore
server from 30 seconds (when it was doing 17,000+ queries in small
windows) to now just 5 seconds. That 5 seconds can be improved yet
further.
Change-Id: Idd55ba9ccd3ed12a26868a41db1af676aff7b67b
Will eventually be plumbed through lots of APIs, especially those requiring or benefiting from
cancelation notification and/or those needing access to the HTTP context (e.g. App Engine).
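A sketch of the intended pattern (the package's actual API may differ;
Done, ErrCanceled, and fetchOne are assumptions):

    // fetchAll abandons its work as soon as the context is canceled,
    // e.g. because the HTTP client that triggered it went away.
    func fetchAll(ctx *context.Context, refs []blob.Ref) error {
        for _, br := range refs {
            select {
            case <-ctx.Done(): // cancelation notification
                return context.ErrCanceled
            default:
            }
            if err := fetchOne(ctx, br); err != nil {
                return err
            }
        }
        return nil
    }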
Change-Id: I591496725d620126e09d49eb07cade7707c7fc64
Use it in localdisk (diskpacked already uses it).
Add an ErrNotImplemented error to blobserver, and mention that
RemoveBlobs may return it (a diskpacked deficit).
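A sketch of the pattern (the storage type here is illustrative):

    // In blobserver:
    var ErrNotImplemented = errors.New("blobserver: operation not implemented")

    // In a backend that can't (yet) delete, such as diskpacked:
    func (s *storage) RemoveBlobs(blobs []blob.Ref) error {
        return blobserver.ErrNotImplemented
    }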
Change-Id: I6a50f263a58c8d3d1611ff9a060ea9fa4aee6163
This regressed in rev cb6f423e. Eventually pkg storagetest should test all methods of blobserver.Storage
for all storage target types.
Change-Id: I2c1c93b76fd9280a3eb429b1d71c64a693ed1ace
Before, the files were stored in directories like
sha1/012/345/sha1-012345xxxxx.dat, meaning there were 4096 (16^3)
top-level directories, each with up to 4096 child directories. We
never really did the math, and the result was millions of directories
(up to 16.7 million) with 1 file each.
Now the hashing structure is only 256 wide (two hex digits). If we
considered 4096 files in a directory acceptable before, that means the
new scheme can go up to 256*256*4096 files (268 million), which is
about 512 times bigger than my personal Camlistore instance
now. Larger users should probably be using the diskpacked storage
backend, anyway.
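Concretely, for a blob whose digest starts 012345..., the path changes
like this (the helper name is hypothetical):

    // blobPath maps a blob ref to its on-disk file.
    // Old scheme: <root>/sha1/012/345/sha1-012345....dat (4096-wide fan-out)
    // New scheme: <root>/sha1/01/23/sha1-012345....dat   (256-wide fan-out)
    func blobPath(root string, ref blob.Ref) string {
        d := ref.Digest() // hex digest, e.g. "012345..."
        return filepath.Join(root, ref.HashName(), d[0:2], d[2:4],
            ref.String()+".dat")
    }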
On start-up, the code now migrates the old format to the new format.
Change-Id: I17f7e830c50a5b770c57ee92d51f122340a0afbb
Previous TODO entry was:
-- Get rid of QueueCreator entirely. Plan:
   -- sync handler still has a source and dest (one pair) but
      instead of calling CreateQueue on the source, it has an
      index.Storage (configured via a RequiredObject
      so it can be a kvfile, leveldb, mysql, postgres, etc.)
   -- make all the index.Storage types instantiable
      from a jsonconfig Object, perhaps with constructors keyed
      on a "type" field.
   -- make sync handler support blobserver.Receiver (or StatReceiver)
      like indexes, so it can receive blobs. But all it needs to
      do to acknowledge the ReceiveBlob is write and flush to its
      index.Storage; the syncing is async by default. (Otherwise callers
      could just use "replica" if they wanted sync replication.)
      But maybe for ease of configuration switching, we could also
      support a sync mode: when it needs to replicate a blob,
      it uses the source. (See the sketch after this list.)
   -- future option: sync mirror to an alternate path on ReceiveBlob
      that can delete. E.g. you're uploading to S3 and Google,
      but don't want to upload to both at once, so you use the localdisk
      as a buffer to spread out your upstream bandwidth.
   -- end result: no more hardlinks or queue creator.
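A rough sketch of the receive path described in that plan; everything
here (types, field names, the queue's Set method) is hypothetical,
since this is a TODO rather than code in the tree:

    // ReceiveBlob acks as soon as "br needs syncing" is durably
    // recorded in the handler's index.Storage; the copy to the
    // destination happens asynchronously.
    func (sh *SyncHandler) ReceiveBlob(br blob.Ref, source io.Reader) (blob.SizedRef, error) {
        n, err := io.Copy(io.Discard, source) // the source storage already has the bytes
        if err != nil {
            return blob.SizedRef{}, err
        }
        // Write and flush the work item to the index.Storage
        // (kvfile, leveldb, mysql, postgres, ...) before acking.
        if err := sh.queue.Set(br.String(), fmt.Sprint(n)); err != nil {
            return blob.SizedRef{}, err
        }
        return blob.SizedRef{Ref: br, Size: uint32(n)}, nil
    }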
Change-Id: I6244fc4f3a655f08470ae3160502659399f468ed