36 Commits

Author SHA1 Message Date
Brad Fitzpatrick bf2e1fa585 sync: don't replicate a shard's missing blobs until enumeration is complete
Prevents spurious replication of blobs on enumeration error.

Change-Id: I38db7406f6ea52137cb757b32599b18eb7fcf3da
2014-03-17 23:21:53 -07:00
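The idea in bf2e1fa585 above (act on a "missing" list only once enumeration has finished cleanly) can be sketched roughly as follows. This is an illustration under stated assumptions, not the sync handler's code: enumerateFunc, missingOnDest and errTruncated are hypothetical names, and blob refs are plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

// enumerateFunc is a hypothetical stand-in for a storage enumeration call.
// It returns every blob ref it saw, plus an error if enumeration was cut short.
type enumerateFunc func() ([]string, error)

// errTruncated is a made-up error used only to simulate a failed enumeration.
var errTruncated = errors.New("enumeration ended early")

// missingOnDest reports which source refs are absent from the destination.
// It returns no refs at all if either enumeration failed, because a partial
// listing would make present blobs look missing.
func missingOnDest(enumSrc, enumDest enumerateFunc) ([]string, error) {
	src, err := enumSrc()
	if err != nil {
		return nil, fmt.Errorf("enumerating source: %w", err)
	}
	dest, err := enumDest()
	if err != nil {
		return nil, fmt.Errorf("enumerating destination: %w", err)
	}
	have := make(map[string]bool, len(dest))
	for _, ref := range dest {
		have[ref] = true
	}
	var missing []string
	for _, ref := range src {
		if !have[ref] {
			missing = append(missing, ref)
		}
	}
	return missing, nil
}

func main() {
	src := func() ([]string, error) { return []string{"a", "b", "c"}, nil }
	partial := func() ([]string, error) { return []string{"a"}, errTruncated }

	missing, err := missingOnDest(src, partial)
	fmt.Println(len(missing), err != nil) // nothing is scheduled for replication
}
```

Returning an error instead of a partial list means the caller cannot schedule copies from an incomplete view by accident.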
Brad Fitzpatrick bf1ec32e39 sync: add paranoia around checking storage's Enumerate implementation
Didn't find anything, but is useful to keep in, to maybe find bugs in the future
for other storage types.

Change-Id: If0fd37e03578de233be8da95ca45623c5f12156b
2014-03-17 23:05:27 -07:00
Brad Fitzpatrick bf8f4b2423 sync: fix bug in prefix enumeration. could send one extra item.
Depending on timing, could lead to ListMissingDestinationBlobs getting out
of sync and causing a lot of blobs to be replicated that were fine and already
on the server.

Change-Id: I3710e59088f1fe4e526f8f11bc9d1837a727e512
2014-03-17 23:02:01 -07:00
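The log for bf8f4b2423 above does not show how the extra item was sent. One common way such an off-by-one arises is checking the limit after emitting an item rather than before. The sketch below shows the safe ordering; enumeratePrefix is a hypothetical helper over an in-memory slice, not the blobserver enumeration API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// enumeratePrefix returns at most limit keys that start with prefix, in
// sorted order. Hypothetical helper for illustration only.
func enumeratePrefix(keys []string, prefix string, limit int) []string {
	sorted := append([]string(nil), keys...) // copy so the caller's slice is untouched
	sort.Strings(sorted)

	var out []string
	for _, k := range sorted {
		// Check the bound BEFORE appending. Checking it afterwards is the
		// classic way to emit limit+1 items.
		if len(out) >= limit {
			break
		}
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{"sha1-aa3", "sha1-aa1", "sha1-bb1", "sha1-aa2"}
	fmt.Println(enumeratePrefix(keys, "sha1-aa", 2)) // [sha1-aa1 sha1-aa2]
}
```

If two sides of a comparison enumerate in lockstep, a single surplus item on one side can shift every later comparison, which matches the symptom the commit message describes.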
Brad Fitzpatrick bf88f5f06c sync: missing handler return + wording change
Change-Id: Iebd5344a0a0e1418cb48c92b91858ebf2f9486c8
2014-03-17 23:00:18 -07:00
Brad Fitzpatrick bf28dd4488 More status handler HTML+JSON, more sync status.
Change-Id: I0381853191d5b871af649d102b976e592def791f
2014-03-16 20:14:57 -07:00
Brad Fitzpatrick bf94a73859 Get rid of SeekFetcher vs StreamingFetcher distinction and complexity.
StreamingFetcher is now just Fetcher, and its FetchStreaming is now
just Fetch.

SeekFetcher is gone. Blobs are max 16 MB anyway, so we can slurp to
memory when needed. The main thing that cared about SeekFetcher
was the GET handler, ServeBlobref, because http.ServeContent needed
one for range requests. That's rewritten in an earlier commit, using
the FakeSeeker from another earlier commit.

Lot of code got simpler as a result.

Change-Id: Ib819413e48a8f9b8d97f596d0fbf771dab211f11
2014-03-14 12:29:13 -07:00
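The "slurp to memory when needed" approach mentioned in bf94a73859 above might look roughly like this. It is a sketch under assumptions: the Fetcher interface here uses a plain string ref and is not claimed to match the real blobserver signature, and memFetcher, slurp and maxBlobSize are names made up for the example. It does not reproduce the FakeSeeker-based GET handler the commit refers to.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// maxBlobSize mirrors the 16 MB ceiling quoted in the commit message.
const maxBlobSize = 16 << 20

// Fetcher is a simplified, hypothetical version of a streaming-only fetch
// interface: it hands back a reader and the blob's size, with no seeking.
type Fetcher interface {
	Fetch(ref string) (io.ReadCloser, uint32, error)
}

// memFetcher is a toy in-memory Fetcher used only for this example.
type memFetcher map[string]string

func (m memFetcher) Fetch(ref string) (io.ReadCloser, uint32, error) {
	s, ok := m[ref]
	if !ok {
		return nil, 0, fmt.Errorf("blob %q not found", ref)
	}
	return io.NopCloser(strings.NewReader(s)), uint32(len(s)), nil
}

// slurp reads a whole blob into memory and returns a *bytes.Reader, which
// supports Seek, so callers that need random access no longer require a
// separate seekable fetch interface.
func slurp(f Fetcher, ref string) (*bytes.Reader, error) {
	rc, size, err := f.Fetch(ref)
	if err != nil {
		return nil, err
	}
	defer rc.Close()
	if size > maxBlobSize {
		return nil, fmt.Errorf("blob %q is %d bytes, over the %d byte limit", ref, size, maxBlobSize)
	}
	data, err := io.ReadAll(io.LimitReader(rc, maxBlobSize))
	if err != nil {
		return nil, err
	}
	if uint32(len(data)) != size {
		return nil, fmt.Errorf("blob %q: read %d bytes, expected %d", ref, len(data), size)
	}
	return bytes.NewReader(data), nil
}

func main() {
	f := memFetcher{"ref-1": "hello, blob"}
	r, err := slurp(f, "ref-1")
	if err != nil {
		panic(err)
	}
	if _, err := r.Seek(7, io.SeekStart); err != nil {
		panic(err)
	}
	rest, _ := io.ReadAll(r)
	fmt.Println(string(rest)) // blob
}
```

Because blobs are bounded in size, the memory cost of buffering one is bounded too, which is what makes dropping the seekable interface reasonable.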
Brad Fitzpatrick bfa30b3013 sync: add a button to start a full validation, even when disabled
Change-Id: I229fa70843bb4b9d206788bf150c436c56c82cb7
2014-03-07 17:32:59 -08:00
Brad Fitzpatrick bfa8efc1b9 sync: background validation of src-vs-destination in sync handler
Currently controlled by an environment variable, but will become
a config option + on-demand button in UI in later commits.

Change-Id: I25fa878c9b30cdd713e2859585210eb722092f7b
2014-03-07 10:57:41 -08:00
Brad Fitzpatrick abca567581 sync: work on full-validation-on-startup. 90% done, disabled.
Change-Id: I5bf062f3b22c2cc41329ff6b23f11198ae543c0f
2014-03-06 16:52:29 -08:00
Brad Fitzpatrick 58ac8b5469 sync: restore key part accidentally removed prior to earlier submit
Would cause accounting errors before in the face of duplicate uploads.

Change-Id: Ie7c49da1adaf2b9c98ef1015f875a4df8b66729f
2014-03-06 13:23:27 -08:00
Brad Fitzpatrick c1892b5ae5 Rewritten sync handler.
Fix deadlock, much better status page, show per-blob status & errors,
clear errors when they've resolved themselves, fix known data race.

Change-Id: I968de0de4f308ff0a410adceb181a0712800d401
2014-03-05 08:51:22 -08:00
Brad Fitzpatrick 4731e3abec sync: rename lk to more conventional mu
Change-Id: I59f12da9462e4768b59eb135c10f356916191ea7
2014-03-04 13:57:33 -08:00
Andrew Gerrand 0a1fe281ca remove redundant return statement
Change-Id: I09f61e61c0aa2c0ecaec04eb7541374a3265879e
2014-02-11 10:50:30 +11:00
Tamás Gulácsi 97520583b8 Use 'uint32' instead of 'int64' for blob sizes everywhere.
Not just in blob.SizedRef, but in blobserver.Fetch and
blobserver.FetchStreaming, too.
Blobs have a max size of 10-32 MB anyway, and the index.Corpus is now using
uint32 to save memory.

Change-Id: I1172445c2f9463fdaee55bfe0f1218d44be4aa53
2014-02-08 17:58:12 +01:00
Brad Fitzpatrick c63fedf263 Fix sync handler spinning on failure.
Change-Id: I4bdd568103cc6d50e208a2f3bc7523fa3abad1c8
2014-01-13 20:10:45 -08:00
mpl 2d85e017ff cmd/camtool: (re)index command
http://camlistore.org/issue/193

Change-Id: I498f92bdc153f44dc84d4b47f03c47a8e7b54ad9
2013-12-26 18:23:15 +01:00
Brad Fitzpatrick 76171ddb3d Change sorted.KeyValue.Find to take an optional end bound; add tests.
The new package sorted/kvtest provides a generic KeyValue test for all
implementations. Memory, SQLite, and kvfile now use it.

This speeds up the index slurping start-up of my personal Camlistore
server from 30 seconds (when it was doing 17,000+ queries in small
windows) to now just 5 seconds. That 5 seconds can be improved yet
further.

Change-Id: Idd55ba9ccd3ed12a26868a41db1af676aff7b67b
2013-12-07 08:43:18 -08:00
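As a rough illustration of why an end bound helps in 76171ddb3d above: with one, a single range scan can replace many small windowed queries. The sketch works over an in-memory sorted slice; find is a hypothetical function, and treating an empty end as "no upper bound" is an assumption made for the example, not a statement about sorted.KeyValue.Find.

```go
package main

import (
	"fmt"
	"sort"
)

// find returns the keys k with start <= k < end from a sorted slice.
// An empty end means "no upper bound" (an assumption of this sketch).
func find(sortedKeys []string, start, end string) []string {
	// Binary-search to the first key >= start.
	i := sort.SearchStrings(sortedKeys, start)
	var out []string
	for ; i < len(sortedKeys); i++ {
		if end != "" && sortedKeys[i] >= end {
			break // past the requested range; stop scanning
		}
		out = append(out, sortedKeys[i])
	}
	return out
}

func main() {
	keys := []string{"have:a", "have:b", "meta:a", "meta:b", "recpn:x"}
	fmt.Println(find(keys, "meta:", "meta;")) // [meta:a meta:b]
	fmt.Println(find(keys, "meta:", ""))      // [meta:a meta:b recpn:x]
}
```

Here "meta;" works as an exclusive end for the "meta:" prefix because ';' is the byte immediately after ':'.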
Brad Fitzpatrick b82b8efe4c Start of new context package and *context.Context type.
Will eventually be plumbed through lots of APIs, especially those requiring or benefiting from
cancelation notification and/or those needing access to the HTTP context (e.g. App Engine).

Change-Id: I591496725d620126e09d49eb07cade7707c7fc64
2013-12-02 13:20:51 -08:00
Brad Fitzpatrick 03cc0fa8bd sync: wake up early from sleep if a blob arrives
Change-Id: I49240d1970e537e3ace36f4cd02315ff3ed9d6b2
2013-11-25 19:18:13 -08:00
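The "wake up early from sleep" behaviour in 03cc0fa8bd above is commonly written in Go as a select over a timer and a notification channel. The following is a minimal sketch of that pattern, not the sync handler's code; sleepOrWake and the wake channel are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// sleepOrWake blocks for up to d, but returns early if a value arrives on
// wake. It reports whether it was woken early.
func sleepOrWake(d time.Duration, wake <-chan struct{}) bool {
	t := time.NewTimer(d)
	defer t.Stop() // release the timer if we return before it fires
	select {
	case <-wake:
		return true
	case <-t.C:
		return false
	}
}

func main() {
	// A buffered channel of size 1 lets the sender signal without blocking.
	wake := make(chan struct{}, 1)

	go func() {
		time.Sleep(10 * time.Millisecond) // simulate a blob arriving
		wake <- struct{}{}
	}()

	fmt.Println(sleepOrWake(5*time.Second, wake)) // true
}
```

A buffered channel of capacity one, combined with a non-blocking send on the producer side, is a common way to coalesce many arrivals into a single wake-up.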
Brad Fitzpatrick cf388c2f2a sync: have handler register receive hook with its source.
Lets legacy configs work, even without replicating directly to it.

Change-Id: I8bdb8651040794ae346f19d6dd67a0da07505f07
2013-11-24 16:20:11 -08:00
Brad Fitzpatrick 3ccdb025c0 sync: add forgotten channel close.
Change-Id: Ie6d14b0bad1229fc775dc9a0afda349ee163cbda
2013-11-23 11:20:08 -08:00
Brad Fitzpatrick ab19715dc6 docs and TODOs
Change-Id: I434c4d00a4dd63d338646376a563f69b122a3c53
2013-11-23 09:09:40 -08:00
Brad Fitzpatrick 90c1e48afe Rename index.Storage to sorted.KeyValue and move it into a new package.
Having index.Index and index.Storage both in the same package led to
confusing discussions about "an index". Better names now, and smaller
packages.
2013-11-22 23:24:54 -08:00
Brad Fitzpatrick 70475701d1 Get rid of QueueCreator and all its associated complexity.
Previous TODO entry was:

-- Get rid of QueueCreator entirely. Plan:
     -- sync handler still has a source and dest (one pair) but
        instead of calling CreateQueue on the source, it instead
        has an index.Storage (configured via a RequiredObject
        so it can be a kvfile, leveldb, mysql, postgres etc)
     -- make all the index.Storage types be instantiable
        from a jsonconfig Object, perhaps with constructors keyed
        on a "type" field.
     -- make sync handler support blobserver.Receiver (or StatReceiver)
        like indexes, so it can receive blobs.  but all it needs to
        do to acknowledge the ReceiveBlob is write and flush to its
        index.Storage. the syncing is async by default. (otherwise callers
        could just use "replica" if they wanted sync replication).
        But maybe for ease of configuration switching, we could also
        support a sync mode.  when it needs to replicate a blob,
        it uses the source.
     -- future option: sync mirror to an alternate path on ReceiveBlob
        that can delete. e.g. you're uploading to s3 and google,
        but don't want to upload to both at once, so you use the localdisk
        as a buffer to spread out your upstream bandwidth.
     -- end result: no more hardlinks or queue creator.

Change-Id: I6244fc4f3a655f08470ae3160502659399f468ed
2013-11-22 14:33:31 -08:00
Brad Fitzpatrick 0457e9359a Start of making QueueCreator optional for replication sources.
Change-Id: I82290991208c6e8953bdc63424760b378437674d
2013-09-01 16:19:07 -07:00
mpl e036f96488 sync: delay copy retry on specific errors
http://camlistore.org/issue/206

Change-Id: I1dd07149352e3af6b39bcb86ed2312f19c3bae30
2013-08-30 19:51:04 +02:00
Brad Fitzpatrick b24cad68dd Cleanup: remove BlobHub and time.Duration waits from storage interface
Move up a layer to the HTTP.  Also, start to remove ContextWrapper
stuff.  We've done it differently for App Engine instead, and will do
it differently yet moving forward.

Also add blobserver.Receive and use it in most places, moving checksum
verification up a layer.

Bunch of other cleanup and TODO fixing too.

Much simpler and cleaner.

Change-Id: I12e56c5d4e53bfcf82bdd8fb0b6d57c248ff605c
2013-08-21 13:57:28 -07:00
mpl 4acc10e6e4 serverconfig: idle synchandler when no localdisk as primary storage
Because no localdisk means either s3 or google is the primary,
and none of them support efficient replication.

1) Added a dummy synchandler constructor for when config has "idle"
2) Set "idle" for synchandler config when no localdisk
3) fixed corresponding tests

Also:
- added error (and test) when no localdisk and both s3 and google
in config
- added s3 + mysql test

http://camlistore.org/issue/201

Change-Id: I861fdca0c203bc0181ab6d548adab501ed98d2f0
2013-08-21 15:17:13 +02:00
Brad Fitzpatrick 0bdf20884b all: delete pkg/blobref; convert all from *blobref.BlobRef to new blob.Ref
Change-Id: Id2dfb7f19452bedf4f3c9310b36227fd8117b225
2013-08-03 19:54:30 -07:00
Brad Fitzpatrick 9468e5ba70 More docs. Every package is documented now.
misc.CountingReader moves into readerutil.

pkg/atomics is folded into pkg/types.

pkg/test/testdep is folded into pkg/test, with better name/docs.

Old cruft from pkg/webserver is deleted.

Change-Id: I3f72d8b29804254ef944995fb085837c878f79f5
2013-07-07 21:12:30 -07:00
Brad Fitzpatrick ca58d8e2e0 Remove noisy var _ = log.Printf lines.
Change-Id: Ia58b8ef5f271f542ae4fe61c7fb1497322770322
2013-06-14 12:55:55 -07:00
mpl 3129e70380 discovery: add synchandlers
Change-Id: I2934c4a7d926770eaf59cb82f0fe48b6c8deb225
2013-01-12 01:55:32 +01:00
Brad Fitzpatrick caa50142dc sync: fix fd leak, update a TODO
Change-Id: I772ac951723300e4e5e767d7a25d6d4bcda3e825
2012-12-31 18:21:50 -08:00
Brad Fitzpatrick a41269e78e Reindex all dev-server blobs into memindex on restart.
Required some sync work (full syncs on start, blocking full syncs on
start, and also adding a dev-only hack to force a dependency from
search -> sync, to control the handler initialization order, otherwise
publish handlers would race with the sync handler and they'd create
new "blog" and "pics" permanodes and we'd end up with duplicates).
2012-11-07 22:40:17 +01:00
Brad Fitzpatrick 5c4d0f71f5 gofmt
2012-11-07 18:49:14 +01:00
Brad Fitzpatrick 0714a463c9 Update from r60 to [almost] Go 1.
A lot is still broken, but most stuff at least compiles now.

The directory tree has been rearranged now too.  Go libraries are now
under "pkg".  Fully qualified, they are e.g. "camlistore.org/pkg/jsonsign".

The go tool cannot yet fetch from arbitrary domains, but discussion is
happening now on which mechanism to use to allow that.

For now, put the camlistore root under $GOPATH/src.  Typically $GOPATH
is $HOME, so Camlistore should be at $HOME/src/camlistore.org.

Then you can:

$ go build ./server/camlistored

... etc

The build.pl script is currently disabled.  It'll be resurrected at
some point, but with a very different role (helping create a fake
GOPATH and running the go build command, if things are installed at
the wrong place, and/or running fileembed generators).

Many things are certainly broken.

Many things are disabled.  (MySQL, all indexing, etc).

Many things need to be moved into
camlistore.org/third_party/{code.google.com,github.com} and updated
from their r60 to Go 1 versions, where applicable.

The GoMySQL stuff should be updated to use database/sql and the ziutek
library implementing database/sql/driver.

Help wanted.

Change-Id: If71217dc5c8f0e70dbe46e9504ca5131c6eeacde
2012-02-18 21:53:06 -08:00