StreamingFetcher is now just Fetcher, and its FetchStreaming is now
just Fetch.
SeekFetcher is gone. Blobs are max 16 MB anyway, so we can slurp to
memory when needed. The main thing that cared about SeekFetcher
was the GET handler, ServeBlobref, because http.ServeContent needed
one for range requests. That's rewritten in an earlier commit, using
the FakeSeeker from another earlier commit.
A lot of code got simpler as a result.
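A minimal sketch of the "slurp to memory when needed" idea, assuming the
16 MB cap mentioned above. The names maxBlobSize, slurpSeeker and
tailFrom are hypothetical and are not taken from the tree; this is also
not the FakeSeeker approach used by the GET handler, only the simpler
in-memory fallback.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
)

// maxBlobSize mirrors the 16 MB figure from the message above.
// The constant name is an assumption made for this sketch.
const maxBlobSize = 16 << 20

var errTooBig = errors.New("blob larger than maxBlobSize")

// slurpSeeker reads all of r into memory and returns an io.ReadSeeker
// over the bytes, so callers that need Seek can still have it.
func slurpSeeker(r io.Reader) (io.ReadSeeker, error) {
	// Read one byte past the limit so oversized input is detected.
	buf, err := io.ReadAll(io.LimitReader(r, maxBlobSize+1))
	if err != nil {
		return nil, err
	}
	if len(buf) > maxBlobSize {
		return nil, errTooBig
	}
	return bytes.NewReader(buf), nil
}

// tailFrom slurps data and returns everything from byte offset off
// onward, roughly what serving an open-ended range request needs.
func tailFrom(data string, off int64) (string, error) {
	rs, err := slurpSeeker(strings.NewReader(data))
	if err != nil {
		return "", err
	}
	if _, err := rs.Seek(off, io.SeekStart); err != nil {
		return "", err
	}
	rest, err := io.ReadAll(rs)
	return string(rest), err
}

func main() {
	tail, err := tailFrom("hello, blob", 7)
	if err != nil {
		panic(err)
	}
	fmt.Println(tail)
}
```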
Change-Id: Ib819413e48a8f9b8d97f596d0fbf771dab211f11
Not just in blob.SizedRef, but in blobserver.Fetch and
blobserver.FetchStreaming, too.
Blobs have a max size of 10-32 MB anyway, and the index.Corpus is now using
uint32 to save memory.
Change-Id: I1172445c2f9463fdaee55bfe0f1218d44be4aa53
Will eventually be plumbed through lots of APIs, especially those requiring or benefiting from
cancelation notification and/or those needing access to the HTTP context (e.g. App Engine).
Change-Id: I591496725d620126e09d49eb07cade7707c7fc64
Add use into localdisk (diskpacked already uses it).
Add ErrNotImplemented error for blobserver and mention the possibility
for RemoveBlobs (diskpacked deficit).
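A rough sketch of how a sentinel like ErrNotImplemented might be
returned and checked. Only the name ErrNotImplemented comes from the
message above; appendOnlyStore, removalUnsupported and the []string
argument are simplifications assumed for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotImplemented is the sentinel named in the message above; the
// error text here is an assumption.
var ErrNotImplemented = errors.New("not implemented")

// appendOnlyStore is a hypothetical storage that cannot delete blobs.
type appendOnlyStore struct{}

// RemoveBlobs reports that removal is unsupported instead of silently
// succeeding. Blob refs are plain strings here for brevity.
func (appendOnlyStore) RemoveBlobs(refs []string) error {
	return ErrNotImplemented
}

// removalUnsupported lets callers tell "cannot delete at all" apart
// from a transient failure.
func removalUnsupported(err error) bool {
	return errors.Is(err, ErrNotImplemented)
}

func main() {
	err := appendOnlyStore{}.RemoveBlobs([]string{"sha1-abc"})
	fmt.Println(removalUnsupported(err))
}
```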
Change-Id: I6a50f263a58c8d3d1611ff9a060ea9fa4aee6163
Before, the files were stored in directories like
sha1/012/345/sha-012345xxxxx.dat, meaning there were 4096 (16^3)
top-level directories, each with up to 4096 child directories. We
never really did the math, and the result was millions (up to 16.7
million) of directories with 1 file each.
Now the hashing structure is only 256 wide (two hex digits). If we
considered 4096 files in a directory acceptable before, that means the
new scheme can go up to 256*256*4096 files (268 million), which is
about 512 times bigger than my personal Camlistore instance
now. Larger users should probably be using the diskpacked storage
backend, anyway.
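The arithmetic above can be checked with a short sketch. The exact
shape of the new path (two levels of two hex digits each) is inferred
from the 256*256 figure and is an assumption, as are the function and
constant names.

```go
package main

import "fmt"

const (
	oldFanout    = 16 * 16 * 16          // three hex digits: 4096
	oldLeafDirs  = oldFanout * oldFanout // 16,777,216 leaf directories
	newFanout    = 16 * 16               // two hex digits: 256
	filesPerDir  = 4096                  // per-directory count deemed acceptable
	newFileLimit = newFanout * newFanout * filesPerDir // 268,435,456 files
)

// oldDir returns the old-style directory for a digest, three hex
// digits per level. hexDigest must have at least 6 characters.
func oldDir(hashName, hexDigest string) string {
	return hashName + "/" + hexDigest[0:3] + "/" + hexDigest[3:6]
}

// newDir returns the assumed new-style directory, two hex digits per
// level. hexDigest must have at least 4 characters.
func newDir(hashName, hexDigest string) string {
	return hashName + "/" + hexDigest[0:2] + "/" + hexDigest[2:4]
}

func main() {
	fmt.Println(oldDir("sha1", "0123456789"), oldLeafDirs)
	fmt.Println(newDir("sha1", "0123456789"), newFileLimit)
}
```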
On start-up, the code now migrates the old format to the new format.
Change-Id: I17f7e830c50a5b770c57ee92d51f122340a0afbb
Previous TODO entry was:
-- Get rid of QueueCreator entirely. Plan:
-- sync handler still has a source and dest (one pair) but
instead of calling CreateQueue on the source, it instead
has an index.Storage (configured via a RequiredObject
so it can be a kvfile, leveldb, mysql, postgres etc)
-- make all the index.Storage types be instantiable
from a jsonconfig Object, perhaps with constructors keyed
on a "type" field.
-- make sync handler support blobserver.Receiver (or StatReceiver)
like indexes, so it can receive blobs. but all it needs to
do to acknowledge the ReceiveBlob is write and flush to its
index.Storage. the syncing is async by default. (otherwise callers
could just use "replica" if they wanted sync replication).
But maybe for ease of configuration switching, we could also
support a sync mode. when it needs to replicate a blob,
it uses the source.
-- future option: sync mirror to an alternate path on ReceiveBlob
that can delete. e.g. you're uploading to s3 and google,
but don't want to upload to both at once, so you use the localdisk
as a buffer to spread out your upstream bandwidth.
-- end result: no more hardlinks or queue creator.
Change-Id: I6244fc4f3a655f08470ae3160502659399f468ed
Refactor the localdisk, diskpacked common code to pkg/blobserver/local
(only StorageGeneration, ResetStorageGeneration in this CL)
Change-Id: Ib04125805d5a1960bd29a474d3fc7ca985708d8d
The errc channel needs to be buffered. Broken by b24cad68dd which
radically simplified this method.
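A generic sketch of why an error channel like errc wants a buffer: if
the receiver returns on the first error, unbuffered senders would block
forever. This is not the method touched by this change; firstError and
errBoom are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom")

// firstError runs each task in its own goroutine and returns the first
// non-nil error, or nil if every task succeeds.
func firstError(tasks []func() error) error {
	// One buffer slot per task, so every goroutine can send and exit
	// even after this function has stopped receiving.
	errc := make(chan error, len(tasks))
	for _, t := range tasks {
		go func(t func() error) { errc <- t() }(t)
	}
	for range tasks {
		if err := <-errc; err != nil {
			return err // early return; remaining senders do not block
		}
	}
	return nil
}

func main() {
	err := firstError([]func() error{
		func() error { return nil },
		func() error { return errBoom },
	})
	fmt.Println(err)
}
```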
Change-Id: I7f4df7d47cca78d6098000b926d7b734dd712bfa
Move up a layer to the HTTP. Also, start to remove ContextWrapper
stuff. We've done it differently for App Engine instead, and will do
it differently yet moving forward.
Also add blobserver.Receive and use it in most places, moving checksum
verification up a layer.
Bunch of other cleanup and TODO fixing too.
Much simpler and cleaner.
Change-Id: I12e56c5d4e53bfcf82bdd8fb0b6d57c248ff605c
Required some sync work (full syncs on start, blocking full syncs on
start, and also adding a dev-only hack to force a dependency from
search -> sync, to control the handler initialization order, otherwise
publish handlers would race with the sync handler and they'd create
new "blog" and "pics" permanodes and we'd end up with duplicates).
The webserver was initialized with "tcp" and ":3179" by default, and
listenURL assumed the address would be treated as IPv6, replacing [::]
with localhost. Hosts that were listening on IPv4 0.0.0.0 didn't get
the modification.
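A minimal sketch of handling both wildcard forms, assuming the goal is
a host:port that a browser can open. displayAddr is a hypothetical
helper, not the actual listenURL code.

```go
package main

import (
	"fmt"
	"net"
)

// displayAddr rewrites a wildcard listen address to localhost, covering
// the empty host, the IPv6 wildcard and the IPv4 wildcard.
func displayAddr(listenAddr string) (string, error) {
	host, port, err := net.SplitHostPort(listenAddr)
	if err != nil {
		return "", err
	}
	switch host {
	case "", "::", "0.0.0.0":
		host = "localhost"
	}
	return net.JoinHostPort(host, port), nil
}

func main() {
	for _, a := range []string{":3179", "[::]:3179", "0.0.0.0:3179"} {
		s, err := displayAddr(a)
		if err != nil {
			panic(err)
		}
		fmt.Println(a, "->", s)
	}
}
```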
Receive in localdisk was using link, which failed on Windows platforms.
Camlistored didn't use JSON marshaling, which caused problems with the
way Windows stores its paths.
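A small sketch of the likely failure mode, assuming the config was
built by string concatenation: backslashes in a Windows path produce
invalid JSON unless escaped. The "blobPath" key and the helper names
are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// naiveConfig pastes the path straight into a JSON string literal,
// leaving backslashes unescaped.
func naiveConfig(path string) string {
	return `{"blobPath": "` + path + `"}`
}

// marshaledConfig lets encoding/json do the escaping.
func marshaledConfig(path string) (string, error) {
	b, err := json.Marshal(map[string]string{"blobPath": path})
	return string(b), err
}

// validJSON reports whether s parses as JSON.
func validJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	p := `C:\camli\blobs`
	fmt.Println(validJSON(naiveConfig(p)))
	m, err := marshaledConfig(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(m, validJSON(m))
}
```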
Change-Id: I9f62f7d46399c3514707383efcb2752dbaf1f420