48 Commits

Author SHA1 Message Date
Brad Fitzpatrick bf2764cdfe index: rename the reindex method to indexBlob, to be less confusing.
Also, upon server --reindex, check that no out-of-order blobs are
pending.  From a quick reading, they shouldn't be, but I'm curious to
see.  Will do a full reindex of my data later.

Change-Id: Idebf93cc264e55512afcfb99e47320dd0ae745d1
2014-04-06 14:03:38 -07:00
Brad Fitzpatrick bf2f09cab3 index: reschedule indexing a claim blob if public key blob isn't yet available
Change-Id: Ie0174bf830eb4790080b2b5e7cdc4ea0af25406f
2014-04-02 13:39:36 -07:00
Brad Fitzpatrick bfc607fee7 index: reindex blobs when dependent blobs arrive out-of-order
Keep track of missing dependencies both in memory and in the index's
underlying sorted.KeyValue. When we see a dependent blob arrive, see
if we can reindex things.

Fixes camlistore.org/issue/102

Change-Id: I3d8cfc463e4b8c9d158be8f9656e772839b093b9
2014-03-15 08:44:09 -07:00
Brad Fitzpatrick bf94a73859 Get rid of SeekFetcher vs StreamingFetcher distinction and complexity.
StreamingFetcher is now just Fetcher, and its FetchStreaming is now
just Fetch.

SeekFetcher is gone. Blobs are max 16 MB anyway, so we can slurp to
memory when needed. The main thing that cared about SeekFetcher
was the GET handler, ServeBlobref, because http.ServeContent needed
one for range requests. That's rewritten in an earlier commit, using
the FakeSeeker from another earlier commit.

A lot of code got simpler as a result.

Change-Id: Ib819413e48a8f9b8d97f596d0fbf771dab211f11
2014-03-14 12:29:13 -07:00
Brad Fitzpatrick bf01b14961 index: move seekFetcherMissTracker up a layer
In prep for missing blob dependency rescheduling in indexer.

Change-Id: I1d492e6aa64cfb658daec17e4621d1453c6d3607
2014-03-14 09:14:46 -07:00
Tamás Gulácsi 97520583b8 Use 'uint32' instead of 'int64' for blob sizes everywhere.
Not just in blob.SizedRef, but in blobserver.Fetch and
blobserver.FetchStreaming, too.
Blobs have a max size of 10-32 MB anyway, and the index.Corpus is now using
uint32 to save memory.

Change-Id: I1172445c2f9463fdaee55bfe0f1218d44be4aa53
2014-02-08 17:58:12 +01:00
Daniel Erat 5603ea8e0d pkg/index: Index audio duration.
Add pkg/media with code to calculate MPEG audio duration.
Index it in a "durationms" property.

Change-Id: Ifb6251657cadc365ef3f5667a0512fde17575560
2014-01-25 10:40:06 -08:00
Daniel Erat 404548d31a pkg/index: Index more music-related properties.
Add disc and mediaref (a hash of the audio portion of the
file).

Also relocate taglib code to
third_party/github.com/hjfreyer/taglib-go.

Change-Id: I58364f525b787484af894663125163095256d7c6
2014-01-22 21:25:05 -08:00
Daniel Erat 704d3c6bfc pkg/index: Rename audiotag to mediatag.
Also fix up keys and values and add tests.

Change-Id: I7e6c5c4315705442e3517456f2ba16419af49f2f
2014-01-20 21:46:39 -08:00
Brad Fitzpatrick 5b03c3f8fb search, index: let media tags be searchable too.
git push from Dolores Park. Sorry, no tests. Dan Erat will tell me if
this doesn't work.

Change-Id: I557cc3d07983390b8a15b7756ee0825fced2f503
2014-01-20 15:47:36 -08:00
Brad Fitzpatrick 14b950496f index, corpus: prevent indexing dup blobs
With the sync handler + indexer in same process subscribing to all
incoming blobs, we were indexing everything twice.

Fixes camlistore.org/issue/306

Change-Id: I7da54a0e18ac613eeae36d6db29b6cdb73a37196
2013-12-30 20:17:47 -08:00
Brad Fitzpatrick a11ff22b8e camlistored: add --reindex flag; make sqlkv a sorted.Wiper
Change-Id: I6b16c1c32187fb754d3acdbe852d02a506236078
2013-12-23 19:07:17 -08:00
Brad Fitzpatrick a7b3f4ee01 index: index all photo EXIF tags
Change-Id: I00b2eebfc75de38eed5c212ac6d52e0da07297bc
2013-12-23 16:21:19 -08:00
Bill Thiede 2d4fb25c34 images: fix Decode when resize + rotate + max W/H.
Adds more tests to cover rotations with resize when used with
MaxWidth/MaxHeight, previously only ScaledWidth/ScaledHeight were
tested.

Improve tests to compare bounds when determining equality, otherwise
an image sized 0x0 is equal to all other images.

Sort test image filenames so test order is stable and obvious.

Keep more data in memory when indexing images upon receive.  Some
largish CR2 files need more data or the EXIF parsing will fail.

Should address some or all of https://camlistore.org/issue/274

Change-Id: I80d90c33538c9d62ce4480ccb58c003e18ee6629
2013-12-16 10:01:07 -08:00
Brad Fitzpatrick 91d735df4b index: start of re-indexing smartly when dependent blobs are missing
See https://camlistore.org/issue/102

Change-Id: Ia5f69475d8f47398bc228a96e7694d59edf277bf
2013-11-30 23:15:17 -08:00
mpl 6c75ceb8b5 pkg/index: do not record a keySignerAttrValue on DelAttributeClaim
Change-Id: Ib1f81fe4879de2be7d484a5a40cc6bf0449893d5
2013-11-30 00:56:09 +01:00
mpl 1ee5fd20c5 search: deletions are not modifications
1) pkg/search: documented that deletion times do not
qualify as modtimes

2) pkg/index: got rid of DeletedAt, and keyDeletes

http://camlistore.org/issue/191

Change-Id: I39578913345454d36af4599e29e7053f46577846
2013-11-29 00:29:57 +01:00
mpl 42e37d4456 pkg/index: update the deletes cache when receiving a delete claim
http://camlistore.org/issue/191

Change-Id: I49da2ef4e43675fba6a80db29ba96a473c159403
2013-11-27 18:44:39 +01:00
mpl c81f3147f6 pkg/index: write relevant keys when receiving a delete claim
This change:

1) Checks if the incoming claim is a delete claim with the use
of GetBlobMeta.

2) Writes the keyDeleted and keyDeletes keys when it's a delete
claim, plus the usual keys when the target is a permanode.

Yet to be done in the next CLs:
1) update the index deletes cache upon reception of a delete claim
2) update most of the search functions so they use deletedAt properly
3) add new keys necessary for GetRecentPermanodes to give a fully
correct result.

I also made indextest.DumpIndex public because it turned out to be useful
to debug within pkg/search/ as well.

http://camlistore.org/issue/191

Change-Id: I8d8b9d12a535b8b1de0018b4a0e359241f14d52a
2013-11-19 18:02:12 +01:00
Brad Fitzpatrick e8603b1293 Put claims in memory too for in-memory search. Required index schema version bump.
Change-Id: I194d65476bddea111277cd0b1472c56b5527226b
2013-11-17 16:52:51 -08:00
Brad Fitzpatrick 3eb493599e in-memory search: better structure for keeping memory corpus and kv
index in sync, both at start-up and while running and receiving blobs.
They both use the same mechanism now.

Also adds KeyId to the index and Corpus, as the next step. Plenty more
row types remain...

Change-Id: Id79955ba25dc79d5fbd94b0e5248d33dcf71d97e
2013-11-17 09:41:45 -08:00
Brad Fitzpatrick f3cc3c7ed9 search: more in-memory search work. make tests verify Scan doesn't hit Storage.
also some string interning work.

Change-Id: I7864b56eb97318bce943afdca3b1212f4729a9a8
2013-11-16 18:50:01 -08:00
Brad Fitzpatrick 2984897ac7 search: more in-memory search work.
keep blob metadata in memory, and start of testing all search queries in three modes:
classic index.Storage scanning, all in-memory with corpus scanned from the index.Storage,
and the in-memory corpus built up over time as blobs arrive.

Change-Id: I40536e498a63bece5bd4897cdbbd0cef78085f44
2013-11-16 17:24:02 -08:00
Brad Fitzpatrick 705107ad80 search/index: invert dependency. search now depends on index.
creates new package types/camtypes for misc types needed by both. might eventually go away as
search matures.

Change-Id: Ib771ead7bea39936ba478b7e5d58de997060861b
2013-11-16 15:00:30 -08:00
mpl e03d923fe1 pkg/index: use a map to populate the mutations
When indexing upon a blob reception, we first populate
all the mutations in a map instead of in a batch mutation.
Then we transfer all the mutations in a batch and commit
it immediately. This makes the window when the batch mutation
is open much shorter, and will ease future indexing because
it allows reading from the index while writing the mutations
to the map.

Change-Id: I276282388f59ca543835bfa5ec64986453b23fe1
2013-11-15 01:23:21 +01:00
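
The staging approach described in e03d923fe1 might look roughly like the sketch below. It is an illustration only: `store`, `batch`, `indexBlob`, and the `have:` and `signer:` key prefixes are hypothetical stand-ins, not the package's real types or key formats. Mutations are first collected in a plain map, during which the store can still be read, and only at the end are they copied into a batch that is committed immediately.

```go
package main

import (
	"fmt"
	"sort"
)

// store is a hypothetical stand-in for the index's key/value storage.
type store struct {
	data    map[string]string
	commits int // number of batches committed so far
}

// batch is a hypothetical stand-in for a batch mutation.
type batch struct {
	keys, vals []string
}

func (b *batch) Set(k, v string) {
	b.keys = append(b.keys, k)
	b.vals = append(b.vals, v)
}

func (s *store) Get(k string) (string, bool) {
	v, ok := s.data[k]
	return v, ok
}

func (s *store) CommitBatch(b *batch) {
	for i, k := range b.keys {
		s.data[k] = b.vals[i]
	}
	s.commits++
}

// indexBlob stages every mutation for ref in a map, then transfers
// them into a batch and commits it right away.
func indexBlob(s *store, ref, signer string) {
	mutations := make(map[string]string)
	mutations["have:"+ref] = "1"

	// No batch is open yet, so reading from the store is still fine.
	if _, ok := s.Get("have:" + signer); ok {
		mutations["signer:"+ref] = signer
	}

	// Copy the staged mutations into the batch in a stable key order.
	keys := make([]string, 0, len(mutations))
	for k := range mutations {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	b := &batch{}
	for _, k := range keys {
		b.Set(k, mutations[k])
	}
	s.CommitBatch(b)
}

func main() {
	s := &store{data: map[string]string{"have:sha1-key": "1"}}
	indexBlob(s, "sha1-claim", "sha1-key")
	fmt.Println(len(s.data), s.commits)
}
```

The batch exists only for the final copy and commit, which is the short window the commit message refers to.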
mpl 5031b01880 pkg/index: keyType keyPermanodeClaim for "claim" index entry
The index entry prefixed by "claim" had no keyType and
was always built "by hand" with pipes concatenation.
This change adds the documented keyPermanodeClaim to fix
that.

Change-Id: Ic59f7dbcccc6b223b155d5bffbf8e636209800cb
2013-11-08 16:20:43 +01:00
Brad Fitzpatrick 8319411ab4 Convert more ReceiveBlob into blobserver.Receive or blobserver.ReceiveNoHash
Change-Id: I9199555324b617167a6062a8b55ed09b449bae4f
2013-09-16 15:57:14 +01:00
mpl d488c576fc search: support for static directory children
This change introduces a new index entry
to help with finding the children of a static directory.
It also fixes ResolvePrefixHop so that it takes
into account static directories, and not only collections.

This is the first step to support publishing static directories.

http://camlistore.org/issue/179

Change-Id: I5666e5caa6c782004054ae4c19a6b6119d4fda8b
2013-09-10 23:06:48 +02:00
Brad Fitzpatrick 00d8ff5275 index: remove no-longer-necessary blob hash check
Change-Id: Ia2a79655832a840d37666b94a1f101042861c8ff
2013-09-08 12:38:20 -07:00
Hunter Freyer 6940b3991f Basic code to index id3 (and other audio) tags.
Does a few things:

1) Adds gotaglib to third_party. If you'd like to review that, feel
free, though there's a bit of organization I'd like to do first.

2) Adds an "audioTag" key type.

3) Indexes wholerefs by various audio tags. Doesn't yet add a map from
wholeref to tags, but I can add that next.

Change-Id: I8e2a5bc27260086bad3351ac57973d1ac23cff44
2013-09-02 14:39:51 -04:00
Brad Fitzpatrick b24cad68dd Cleanup: remove BlobHub and time.Duration waits from storage interface
Move up a layer to the HTTP.  Also, start to remove ContextWrapper
stuff.  We've done it differently for App Engine instead, and will do
it differently yet moving forward.

Also add blobserver.Receive and use it in most places, moving checksum
verification up a layer.

Bunch of other cleanup and TODO fixing too.

Much simpler and cleaner.

Change-Id: I12e56c5d4e53bfcf82bdd8fb0b6d57c248ff605c
2013-08-21 13:57:28 -07:00
Brad Fitzpatrick 0bdf20884b all: delete pkg/blobref; convert all from *blobref.BlobRef to new blob.Ref
Change-Id: Id2dfb7f19452bedf4f3c9310b36227fd8117b225
2013-08-03 19:54:30 -07:00
Brad Fitzpatrick 9468e5ba70 More docs. Every package is documented now.
misc.CountingReader moves into readerutil.

pkg/atomics is folded into pkg/types.

pkg/test/testdep is folded into pkg/test, with better name/docs.

Old cruft from pkg/webserver is deleted.

Change-Id: I3f72d8b29804254ef944995fb085837c878f79f5
2013-07-07 21:12:30 -07:00
Brad Fitzpatrick 7fd16c5df4 remove debugging
Change-Id: If83580e85cfb350bba059dde9e7bccb0c7658e99
2013-06-10 19:23:34 +02:00
mpl 0dfd84a7d8 search handler: return correct thumbnail dimensions
images: DecodeConfig to get the predicted width
and height after EXIF correction
search&index: add GetImageInfo and use it in search
to predict the thumbnail dimensions

http://camlistore.org/issue/115

Change-Id: I358136a2ab03ea09c8f8fd2fa0dc574921c819c5
2013-03-25 17:06:15 +01:00
Brad Fitzpatrick ace9474d95 index: index file times too, and return in index.GetFileInfo.
Change-Id: I59d91f0938c725a4cbdf5ca933cdff3529e25f5f
2013-02-18 21:31:41 -08:00
Brad Fitzpatrick 14239a5c23 index: cap reindex parallelism
Change-Id: Iaf8a54a547e7d74f9b3702901180b2253aef58aa
2013-02-07 21:02:42 -08:00
Brad Fitzpatrick 51d79e8759 index: re-index on file failure. Issue 103.
Change-Id: I740dbcf951d865df32c2f54d9d4119af135713db
2013-02-07 19:31:44 -08:00
Brad Fitzpatrick cfc32e4a05 schema, camget: more work on deleting the Superset type.
not much more remains.

Change-Id: I6cfe4145f67b100a0e2509f88ce6e1c580b7f9fe
2013-01-22 09:32:40 -08:00
Brad Fitzpatrick 7ceaaa0012 blobref: simplify the FromHash func. Make type implicit.
Change-Id: I2e01c3663bdb1151c11dfc9a1d59c7081940ffac
2013-01-20 13:36:27 -08:00
mpl 67c5678062 Index directories with "fileinfo", and use this to
find their name in a search request.

Fixes http://code.google.com/p/camlistore/issues/detail?id=79

Change-Id: I755afd8f52dbd2f8a48ba72bed0a6b0192d1dd71
2013-01-09 16:59:04 +01:00
Brad Fitzpatrick 898e522126 Close FileReaders. Hunting an fd leak, but this isn't it,
since FileReader.Close is back to doing nothing.

Change-Id: I65e906d75cf2825b9476ed5008ce042f44582113
2012-12-31 18:02:13 -08:00
Brad Fitzpatrick e3247edafb Change blobref.SeekerFromStreamingFetcher signature to not return an error.
Change-Id: I77f693e3b3d0d116e08bca3d3f4cb45ef2a00b27
2012-12-25 10:27:35 -08:00
Brad Fitzpatrick 3057358cfc Index the dimensions of images.
2012-11-07 23:54:00 +01:00
Brad Fitzpatrick e783ad1717 Add another search handler test, for recent permanodes.
Change-Id: Iaf40cd94aba7b96c16fa1b04c2bfcebdfeea870e
2012-11-04 15:26:13 +01:00
Brad Fitzpatrick 1466c77198 Add 'edgeback' key to index, for going backwards.
Change-Id: I43057a6fb96c3e8d9364002288d5c7b9ad2fd034
2012-11-03 14:25:48 +01:00
Brad Fitzpatrick 1d3703f7ef Clean up some logging.
Change-Id: I92ff6e68e9866784e643682c5e6db5d03f877c5b
2012-04-22 17:56:52 -07:00
Brad Fitzpatrick 0714a463c9 Update from r60 to [almost] Go 1.
A lot is still broken, but most stuff at least compiles now.

The directory tree has been rearranged now too.  Go libraries are now
under "pkg".  Fully qualified, they are e.g. "camlistore.org/pkg/jsonsign".

The go tool cannot yet fetch from arbitrary domains, but discussion is
happening now on which mechanism to use to allow that.

For now, put the camlistore root under $GOPATH/src.  Typically $GOPATH
is $HOME, so Camlistore should be at $HOME/src/camlistore.org.

Then you can:

$ go build ./server/camlistored

... etc

The build.pl script is currently disabled.  It'll be resurrected at
some point, but with a very different role (helping create a fake
GOPATH and running the go build command, if things are installed at
the wrong place, and/or running fileembed generators).

Many things are certainly broken.

Many things are disabled.  (MySQL, all indexing, etc).

Many things need to be moved into
camlistore.org/third_party/{code.google.com,github.com} and updated
from their r60 to Go 1 versions, where applicable.

The GoMySQL stuff should be updated to use database/sql and the ziutek
library implementing database/sql/driver.

Help wanted.

Change-Id: If71217dc5c8f0e70dbe46e9504ca5131c6eeacde
2012-02-18 21:53:06 -08:00