324 Commits

Author SHA1 Message Date
Paul Lindner 1383869054 all: lint fixes for "receiver name should be consistent with previous receiver name"
Change-Id: I05275cd20c92349e37365e2cbd29fa9f8d834101
2017-12-13 11:31:25 -08:00
Paul Lindner ba92702834 all: lint fixes for "should omit 2nd value from range"
Change-Id: I7bb19d376f96a39ecae7dbdb4d6808f704bae5fb
2017-12-13 11:31:25 -08:00
Paul Lindner 15feaeb24c all: lint fixes for 'error strings should not be capitalized or end with punctuation or a newline'
Change-Id: I9c3766a51ac8be694ae76befff4b6fa9a85e34eb
2017-12-11 06:13:25 -08:00
mpl a91a98c58a pkg/test: remove FakeIndex
And fix its main users: tests in pkg/search.

Fixes #883

Change-Id: Ib04b8d6f2d56bfb24a8900520c97b24bccb3c78b
2017-12-01 19:59:26 +01:00
Paul Lindner b09cd377d7 Switch to stdlib context from golang.org/x/net/context
This switches most usages of the pre-1.7 context library to use the
standard library.  Remaining usages are in:

  app/publisher/main.go
  pkg/fs/...

Change-Id: Ia74acc39499dcb39892342a2c9a2776537cf49f1
2017-11-26 01:12:26 -08:00
mpl 4a8e6f03b5 pkg/index: add simple integrity check
This change adds a check right after index initialization that
enumerates blobs for a few seconds and verifies that all of them are
indexed.

A warning is logged if any of the blobs are not found in the index.

Issue #947

Change-Id: Idc0df2121c1fb58e7560173b7753eaaddc4e653b
2017-09-15 15:52:58 +02:00
Paul Lindner fa46c3935d Correct various misspelled words
Change-Id: I236e880526e4c2b0bd318da041983d557e0aa885
2017-09-11 08:33:31 -07:00
mpl d717b58fd3 pkg/index: fix enumeration tests
related to 93c6d682d2

Change-Id: I2fa0df6da70df1297712ee0c9279a625b0ac88ca
2017-08-30 01:01:56 +02:00
Brad Fitzpatrick b411c990f7 Merge "pkg/search: change corpus enumeration signatures for speed" 2017-08-29 18:12:58 +00:00
Brad Fitzpatrick 93c6d682d2 pkg/search: change corpus enumeration signatures for speed
Avoid select overhead in hot paths. Just use funcs.

Also, for sort-by-map searches, don't do a describe and pass over all
the results doing location lookups a second time. Remember the
location from the initial matching. Cache it on the search value.

Reduces some sort-by-map searches from 10 seconds to 3 seconds for
me. (still too slow, but good start)

Change-Id: I632954738df9accd802f28364ed11e48ddba0d14
2017-08-29 11:10:10 -07:00
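
As an illustration of "Just use funcs" in 93c6d682d2, here is a minimal sketch contrasting a channel-based enumeration with a callback-based one. The corpus type and both method names are hypothetical and are not taken from pkg/search or pkg/index; the sketch only shows why the channel shape needs a select per item while the func shape does not.

```go
package main

import "fmt"

// corpus is a hypothetical stand-in for an in-memory index.
type corpus struct {
	refs []string
}

// enumerateChan is the channel-based shape: the producer needs a select on
// every item so that it can stop when the consumer cancels.
func (c *corpus) enumerateChan(done <-chan struct{}, ch chan<- string) {
	defer close(ch)
	for _, r := range c.refs {
		select {
		case ch <- r:
		case <-done:
			return
		}
	}
}

// enumerateFunc is the callback-based shape: one plain function call per
// item, and the callback returns false to stop early.
func (c *corpus) enumerateFunc(fn func(ref string) bool) {
	for _, r := range c.refs {
		if !fn(r) {
			return
		}
	}
}

func main() {
	c := &corpus{refs: []string{"sha1-a", "sha1-b", "sha1-c"}}

	done := make(chan struct{})
	ch := make(chan string)
	go c.enumerateChan(done, ch)
	for r := range ch {
		fmt.Println("chan:", r)
	}
	close(done)

	c.enumerateFunc(func(ref string) bool {
		fmt.Println("func:", ref)
		return ref != "sha1-b" // stop after the second ref
	})
}
```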
mpl d72a8b3045 pkg/index: ignore indexed NaN location
We only started preventing NaN locations from being indexed at
ee13a3060b, so files indexed before
that could have introduced indexed NaNs, which we were not checking
against, until now.

Change-Id: I31fc8b9482cbd546591d553d7d8804700c7cf175
2017-08-23 17:19:28 +02:00
mpl e4b7db8274 server/camlistored/ui: improve map aspect search and markers
Notably:

-do not load any markers on an empty search query, because that would
mean loading absolutely all of the items with a location, which seems
like a bad idea.

-use different markers for different nodes. For now, foursquare
checkins, file images, and files have their own marker.

-vendor in https://github.com/lvoogdt/Leaflet.awesome-markers to achieve
the above, which relies on Font Awesome, which we already vendor.
Icons available for the markers: http://fontawesome.io/icons/

-when no location can be inferred from the search query, set the view to
encompass all markers that were drawn.

-when a location search is known, draw a rectangle representing the
results zone.

-use thumber for image in marker popup

-use title, if possible, instead of blobRef for link text in marker
popup

-switch to directly using OpenStreetMap tiles, instead of MapBox ones.

https://storage.googleapis.com/camlistore-screenshots/Screenshot_20170622-232359.png

Change-Id: Ibc84fa988aea8b8d3a2588ee8790adf6d9b5ad7a
2017-07-06 01:03:03 +02:00
mpl ee13a3060b pkg/index: ignore NaN in EXIF lat/long
Fixes #927

Change-Id: I40b151ca0af30a65263c1daf9597221136ccdf54
2017-05-23 22:36:44 +02:00
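
A minimal sketch of why NaN needs its own check, related to ee13a3060b. The function names are hypothetical and this is not the code from pkg/index; it only relies on the fact that every ordered comparison with NaN is false, so a rejection-style range check on its own lets NaN through.

```go
package main

import (
	"fmt"
	"math"
)

// outOfRange is a rejection-style range check. Every comparison with NaN is
// false, so NaN is never reported as out of range.
func outOfRange(lat, long float64) bool {
	return lat < -90 || lat > 90 || long < -180 || long > 180
}

// indexable reports whether a latitude/longitude pair is safe to index:
// neither value is NaN and both are within the usual bounds.
func indexable(lat, long float64) bool {
	if math.IsNaN(lat) || math.IsNaN(long) {
		return false
	}
	return !outOfRange(lat, long)
}

func main() {
	nan := math.NaN()
	fmt.Println(outOfRange(nan, nan))   // false: NaN slips through the range check
	fmt.Println(indexable(nan, 2.35))   // false
	fmt.Println(indexable(48.85, 2.35)) // true
}
```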
mpl 1951498e63 pkg/index: use missing dep mechanism for static sets too
We relied on missTrackFetcher to return errMissingDep when the
underlying Fetch() returned os.ErrNotExist. The caller could then know
how to act if some indexing operation failed because of an errMissingDep
error.

This was wrong for 2 reasons:

1) if a function fn(tf blob.Fetcher) error does:

	if _, _, err := tf.Fetch(br); err != nil {
		return fmt.Errorf("wrapping this error in a nicer error message: %v", err)
	}

when we call err := fn(tf), we lose the ability to directly determine
whether err is an errMissingDep. We'd have to parse the error string,
which is gross.

This is exactly what happens in populateDir, when we call
dr.StaticSet().

And in order to fix issue #738, we want to be able to tell when a call
to dr.StaticSet() failed because the underlying Fetch() operation
failed.

2) The blob.Fetcher interface specifically states that os.ErrNotExist
should be returned when a blob is not found. We were breaking that rule
by returning errMissingDep.

In order to address both 1) and 2), it seemed like we could add an err
field to missTrackFetcher to keep track of when an os.ErrNotExist
occurred during a Fetch, and let Fetch return an os.ErrNotExist.
However, that would not work, as a missTrackFetcher is used concurrently
by several callers, so a given caller wouldn't be able to tell whether
"its" Fetch failed or a Fetch from a concurrent caller failed.

Therefore, we introduce trackErrorsFetcher, which has such an error field
and wraps the missTrackFetcher. All the callers can keep on sharing
the missTrackFetcher, but each of them initializes its own
trackErrorsFetcher, and can check the errors field when a failed call to a
function is suspected to be the result of a failed Fetch.

Also added a test to demonstrate that issue #738 is fixed.

Fixes #738

Change-Id: Ia5c3081b71c77be1e8cff0bbc847ade68f019bf9
2017-03-03 00:11:42 +01:00
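
A minimal sketch of the per-caller wrapper idea described in 1951498e63. All names here (Fetcher, sharedFetcher, trackingFetcher, populate) are hypothetical simplifications and do not reproduce blob.Fetcher, missTrackFetcher, or trackErrorsFetcher. It shows a wrapper that still returns os.ErrNotExist but also records the miss, so the caller can find out about it even after an intermediate function has flattened the error into a string.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"sync"
)

// Fetcher is a simplified, hypothetical stand-in for a blob fetcher.
type Fetcher interface {
	Fetch(ref string) ([]byte, error)
}

// sharedFetcher plays the role of the fetcher shared by all callers.
type sharedFetcher struct {
	blobs map[string][]byte
}

func (f *sharedFetcher) Fetch(ref string) ([]byte, error) {
	b, ok := f.blobs[ref]
	if !ok {
		return nil, os.ErrNotExist
	}
	return b, nil
}

// trackingFetcher wraps a shared Fetcher for ONE caller and records which
// refs were missing, while still returning os.ErrNotExist unchanged.
type trackingFetcher struct {
	Fetcher
	mu      sync.Mutex
	missing []string
}

func (tf *trackingFetcher) Fetch(ref string) ([]byte, error) {
	b, err := tf.Fetcher.Fetch(ref)
	if errors.Is(err, os.ErrNotExist) {
		tf.mu.Lock()
		tf.missing = append(tf.missing, ref)
		tf.mu.Unlock()
	}
	return b, err
}

// Missing returns a copy of the refs that this caller failed to fetch.
func (tf *trackingFetcher) Missing() []string {
	tf.mu.Lock()
	defer tf.mu.Unlock()
	return append([]string(nil), tf.missing...)
}

// populate formats the fetch error with %v, so the returned error no longer
// matches os.ErrNotExist.
func populate(f Fetcher, ref string) error {
	if _, err := f.Fetch(ref); err != nil {
		return fmt.Errorf("populating from %s: %v", ref, err)
	}
	return nil
}

func main() {
	shared := &sharedFetcher{blobs: map[string][]byte{"sha1-aaa": []byte("x")}}
	tf := &trackingFetcher{Fetcher: shared} // one wrapper per caller
	err := populate(tf, "sha1-bbb")
	fmt.Println(err)
	fmt.Println(errors.Is(err, os.ErrNotExist)) // false: the cause was flattened
	fmt.Println(tf.Missing())                   // the wrapper still knows
}
```

Because each caller owns its wrapper, a recorded miss can only come from that caller's own Fetch calls, which is the property a field on the shared fetcher could not provide.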
Mathieu Lonjaret d9200f9855 Merge "pkg/index: simplify out of order indexing" 2017-02-18 01:19:01 +00:00
mpl 66d75cbcf9 pkg/index: simplify out of order indexing
There's a race that takes place at the end of the reindexing process.

In func (x *Index) Reindex(), we wait for all the reindexing goroutines
to be done, with wg.Wait. However, any (or all) of those goroutines
could have triggered (they call indexBlob, which calls
blobserver.Receive, which calls noteBlobIndexed, which can send on
tickleOoo) an asynchronous out of order reindexing which is NOT waited
on.

The race can be trivially demonstrated by changing:

  WaitTickle:
  	for range ix.tickleOoo {
+  		time.Sleep(5*time.Second)

in receive.go, and running:
  go test ./pkg/index/ -run TestReindex_*

This CL rewrites the out of order indexing implementation to make it
simpler, and in the process, fixes the above bug.

Fixes #756

Change-Id: If79fb1ad8869cefce4a095ef2becdba333732bce
2017-02-17 20:29:02 +01:00
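
The race described in 66d75cbcf9 has a general shape: wg.Wait covers the workers, but not the asynchronous work the workers start. The sketch below is not the rewrite made in that CL; it is one common way to close this kind of gap, with hypothetical names, where each worker registers its follow-up on the same WaitGroup before the worker itself is done.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// reindexAll starts one worker per blob. Each worker triggers asynchronous
// follow-up work, and the function returns only after both the workers and
// their follow-ups have finished. It returns the number of follow-ups run.
func reindexAll(blobs []string) int64 {
	var wg sync.WaitGroup
	var followUps int64
	for range blobs {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Register the follow-up while this worker still holds its
			// own count, so the counter cannot drop to zero in between.
			wg.Add(1)
			go func() {
				defer wg.Done()
				atomic.AddInt64(&followUps, 1)
			}()
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&followUps)
}

func main() {
	blobs := []string{"sha1-a", "sha1-b", "sha1-c"}
	fmt.Println(reindexAll(blobs) == int64(len(blobs)))
}
```

If the inner wg.Add and wg.Done were dropped, reindexAll could return before some follow-ups had run, which mirrors the unwaited out-of-order reindexing in the commit message.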
mpl 5a24ffd854 new app: scanning cabinet
WARNING: this app is still experimental, and even its data schema might
change. Do not use in production.

This change adds a Camlistore-based port of the scanning cabinet app
originally created by Brad Fitzpatrick:
https://github.com/bradfitz/scanningcabinet

Some of it is inspired by the App Engine Go port by Patrick Borgeest:
https://bitbucket.org/pborgeest/nometicland

The data schema is roughly as follows:

-a scan is a permanode, with the node type: "scanningcabinet:scan".
-a scan's camliContent attribute is set to the actual image file.
-a scan also holds the "dateCreated" attribute, as well as the
"document" attribute, which references the document this scan is a part
of (if any).

-a document is a permanode, with the node type: "scanningcabinet:doc".
-a document page is modeled by the "camliPath:sha1-xxx" = "pageNumber"
relation, where sha1-xxx is the blobRef of a scan.
-a document can also hold the following attributes: "dateCreated",
"tag", "locationText", "title", "startDate", and "paymentDueDate".

Known caveats, in decreasing order of concern:
-the data schema might still change.
-the scancab tool, to actually create and upload the files from physical
documents, is practically untested (since I do not own a scanner).
-some parts, in particular those related to searches, are probably
suboptimal.
-the usual unavoidable bugs.

Change-Id: If6afc509e13f7c21164a3abd276fec075a3813bb
2017-02-15 17:14:45 +01:00
Mathieu Lonjaret 8a17e7252b Merge "pkg/sorted/mysql: drop tables on reindex" 2017-01-18 18:14:06 +00:00
mpl af77128123 pkg/sorted/mysql: drop tables on reindex
When reindexing on a (My)SQL based sorted.KeyValue, we should recreate
the database schema from scratch, which means dropping the tables.

However, index.Reindex just calls Wipe on the newly created
sorted.KeyValue, which only deletes the rows, and does not drop the
tables.

Therefore, this CL changes the implementation of Wipe in the MySQL case,
so that it takes care of dropping the tables, and doing everything that
needs to be done afterwards to set up the sorted.KeyValue.

In addition, with the introduction of the sorted.NeedWipeError, we detect
upon initialization of a sorted.KeyValue if it failed because it needed
a schema update. If that is the case, and we're in reindex mode, we can
fix the sorted.KeyValue with a Wipe and carry on.

Finally, we introduce the new sorted.NewKeyValueMaybeWipe function that
automatically wipes a KeyValue when a NeedWipeError was returned upon
its creation.

Next, do the same with other sorted SQLs.

Fixes #806

Change-Id: I2032781cbf453a364880bd3e2e8b3c09aac7aed9
2017-01-16 19:10:05 +01:00
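
A minimal sketch of the "open, and wipe if a wipe is needed" flow described in af77128123. The types and signatures below are hypothetical simplifications; they are not the real sorted.KeyValue, sorted.NeedWipeError, or sorted.NewKeyValueMaybeWipe. In particular, the sketch assumes that the opener returns a usable value together with the need-wipe error.

```go
package main

import (
	"errors"
	"fmt"
)

// needWipeError is a hypothetical error signalling an outdated schema.
type needWipeError struct{ msg string }

func (e needWipeError) Error() string { return e.msg }

// keyValue is a hypothetical, reduced key/value store interface.
type keyValue interface {
	Wipe() error
}

// memKV is a fake store used to exercise the flow.
type memKV struct {
	outdated bool
	wiped    bool
}

// Wipe recreates the (pretend) schema from scratch.
func (kv *memKV) Wipe() error {
	kv.outdated = false
	kv.wiped = true
	return nil
}

// openMem returns the store, plus a needWipeError if its schema is outdated.
func openMem(kv *memKV) (keyValue, error) {
	if kv.outdated {
		return kv, needWipeError{"schema version mismatch"}
	}
	return kv, nil
}

// newKeyValueMaybeWipe opens a store and wipes it only when opening failed
// with a needWipeError. Any other error is returned unchanged.
func newKeyValueMaybeWipe(open func() (keyValue, error)) (keyValue, error) {
	kv, err := open()
	if err == nil {
		return kv, nil
	}
	var nw needWipeError
	if !errors.As(err, &nw) || kv == nil {
		return nil, err
	}
	if werr := kv.Wipe(); werr != nil {
		return nil, fmt.Errorf("wiping after %q: %v", nw.msg, werr)
	}
	return kv, nil
}

func main() {
	store := &memKV{outdated: true}
	_, err := newKeyValueMaybeWipe(func() (keyValue, error) { return openMem(store) })
	fmt.Println(err, store.wiped)
}
```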
Stephen Searles bbecbc47cd search: fix panic when claim and permanode have no attributes.
fixes #881

Change-Id: Ifdea54a56bf879ca418763617acf7bd2b9159dad
2016-11-22 19:39:22 -08:00
Attila Tajti 1dad8f33da search: unify location Query with Describe
Use index/LocationHandler.PermanodeLocation instead of
index/Corpus.PermanodeLatLong for location matching.

Remove fileLoc constraint from location and hasLocation
predicates. They are now handled by index/LocationHandler.

index: Remove Corpus.PermanodeLatLong, its functionality
is now moved to search.

Change-Id: I01e72661470ffb9376f3491401db4e2ce7f8a131
2016-11-17 08:59:27 +01:00
Attila Tajti bdb350dc25 search: improve location describe performance
Use permanode attribute caches already in Corpus if
a suitable map exists for the owner and time being queried.
Move the location query back to the index to make
accessing these maps possible.

Add a test case to search/TestDescribeLocation where
no location should be returned.

Change-Id: Ic51451daf8f3610e3cc4a8fda0f0c005eba9b286
2016-11-15 17:48:00 +01:00
Brad Fitzpatrick db50bae0c4 pkg/index: read blob before acquiring index mutex
For #878

Change-Id: I8abaf5d923fc6dee7e8a9a3e84f82d4cf7484329
2016-11-08 08:47:59 -08:00
Brad Fitzpatrick fe09b0a28b pkg/index: clean up BlobSniffer a bit
Change-Id: I584754c452dbf827969f27ae98eb4d0913e2ce01
2016-11-08 08:47:43 -08:00
mpl 6c1ae1d865 pkg/index/corpus: more valuesAtSigner doc
Change-Id: I750caa46ebe6534a28e5567251688c06d3935c13
2016-11-03 16:47:56 +01:00
Mathieu Lonjaret 1a7452561a Merge "index: improve Corpus attr lookup with signer filter" 2016-10-26 20:50:12 +00:00
mpl 167bed4277 pkg/index: util_test.go not an index impl
Change-Id: Iccc1bb43d703ec8a05f30bdd3132058fbdff2f1f
2016-10-24 18:12:13 +02:00
mpl 096493bc13 pkg/index: skip reindexing tests on Travis CI
We've had issue #756 making reindexing tests fail for so long now, that
having Travis CI monitor builds has become useless, since we don't
expect any other result than a fail.

This CL skips the Reindexing tests when on Travis, so that it becomes
useful again in helping us catch regressions.

Change-Id: I7b837f62a4e2d7471a08155346150130af74f48f
2016-10-24 17:26:42 +02:00
Attila Tajti 0f55bfe980 index: improve Corpus attr lookup with signer filter
Keep attributes for each signer in PermanodeMeta
to make Corpus methods AppendPermanodeAttrValues,
PermanodeAttrValue perform well even with a signerFilter.

Also add index/util_test.go to the notAnIndexer slice
of index/index_test.go.

Fixes #861

Change-Id: Ic25470b7d42e40a6f9d0ed0bf868ef3755413289
2016-10-06 18:01:28 +02:00
Brad Fitzpatrick 3dedf3c72f Merge "search: add location info to DescribedBlob" 2016-09-16 19:34:00 +00:00
Brad Fitzpatrick 2d186e7773 Merge "index: add ClaimsAttrValue" 2016-09-16 19:33:53 +00:00
mpl 25652d66d9 pkg/index: use mime.TypeByExtension to record MIMEType
When receiving a file, we were only trying to guess its MIME type
through its contents (pkg/magic). We're now making a better effort at it
by guessing from the filename extension if needed.

Also:

pkg/magic: get rid of all the extra video extensions that are already
covered by mime.TypeByExtension. Because it's redundant and
confusing.

app/publisher, pkg/types/camtypes: also use mime.TypeByExtension as an
extra effort. Especially since a reindex would be necessary to benefit
from the pkg/index change.
There are other places in Camlistore that could use such an effort.
Maybe we should have a camtypes.*FileInfo.MIME() method that tries all
the ways to guess the MIME type of the file?

Change-Id: Ib9a2bc42af77c5394dac578ae415524b5111ad4e
2016-09-06 16:26:09 +02:00
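
A minimal sketch of the "sniff the contents first, then fall back to the filename extension" approach from 25652d66d9. The function name is hypothetical, and http.DetectContentType from the standard library is used here only as a stand-in for pkg/magic; mime.TypeByExtension is the standard library call named in the commit.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"path/filepath"
)

// guessMIME tries the file contents first and falls back to the filename
// extension. It returns "" when neither method yields a type.
func guessMIME(filename string, head []byte) string {
	// DetectContentType returns "application/octet-stream" when it cannot
	// recognize the data, so that value is treated as "unknown".
	if t := http.DetectContentType(head); t != "application/octet-stream" {
		return t
	}
	return mime.TypeByExtension(filepath.Ext(filename))
}

func main() {
	fmt.Println(guessMIME("scan", []byte("%PDF-1.4")))
	fmt.Println(guessMIME("photo.png", []byte{0x00, 0x01, 0x02}))
}
```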
Attila Tajti 8b4c324adb search: add location info to DescribedBlob
Add Handler.GetPermanodeLocation, based on the existing logic of loc:* and has:loc
search predicates, and that of index/Corpus.PermanodeLatLong:
  1. Permanode attributes "latitude" and "longitude"
  2. Referenced permanode attributes (eg. for "foursquare.com:checkin"
     its "foursquareVenuePermanode")
  3. Location in permanode camliContent file metadata
The sources are checked in this order; the location from
the first source yielding a valid result is returned.

camtypes: add new Location type

index: Add GetFileLocation to Index/Interface, to make
indexed location info accessible without a corpus.
Previously this differed from other file metadata, such as image info or
media tags, which had accessors in both Index and Corpus.

Fixes #777

Change-Id: I63cf143d67a12732ca2c941de64b63736be5de6e
2016-09-02 11:28:39 +02:00
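
The ordered lookup described in 8b4c324adb can be sketched as a list of sources tried in sequence. Everything below is hypothetical (the Location fields, the source type, and the lookup functions); it is not the Handler.GetPermanodeLocation implementation, only an illustration of "first source yielding a valid result wins".

```go
package main

import "fmt"

// Location is a hypothetical, reduced location type.
type Location struct {
	Latitude, Longitude float64
}

// source is one way of finding a location for a permanode.
type source struct {
	name   string
	lookup func(permanode string) (Location, bool)
}

// firstLocation tries the sources in order and returns the first hit,
// together with the name of the source that produced it.
func firstLocation(permanode string, sources []source) (Location, string, bool) {
	for _, s := range sources {
		if loc, ok := s.lookup(permanode); ok {
			return loc, s.name, true
		}
	}
	return Location{}, "", false
}

func main() {
	miss := func(string) (Location, bool) { return Location{}, false }
	sources := []source{
		{"permanode attributes", miss},
		{"referenced permanode", miss},
		{"camliContent file metadata", func(string) (Location, bool) {
			return Location{Latitude: 48.85, Longitude: 2.35}, true
		}},
	}
	loc, from, ok := firstLocation("sha1-xxx", sources)
	fmt.Println(loc, from, ok)
}
```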
Attila Tajti b5bdb7d502 index: add ClaimsAttrValue
ClaimsAttrValue can be used in clients of pkg index, such
as in search on values returned by Index.AppendClaims to
query permanode attributes.

Related: issue #777

Change-Id: I5a8fa2a970d88f9ddbcc3de350215a196e124b64
2016-08-27 07:42:59 +02:00
mpl f279013f65 pkg/search: optimize file search by wholeRef
Issue #684

Change-Id: Id1e317df36172bdcc21906339c46ece898d4d023
2016-08-18 18:32:23 +02:00
mpl f9a8e002b8 pkg/index: test showing issue #756
A word of caution: related to the issue demonstrated by the added
tests, an infinite loop can also occur, as it already could in
TestReindex_LevelDB. As it is, after all, a consequence of a race, I
haven't been able to determine what exactly makes the loop occur. But
what I observed is:

1) It seems to be occurring much more easily with LevelDB, which is why I
ended up just disabling TestReindex_LevelDB.
2) I've never seen it happen in TestReindex_Kvfile, but who knows.
3) I've seen it rarely happen with TestShowReindexRace_Kvfile, but it
seems that adding to TestShowReindexRace_Kvfile the kind of timed kill
that I had added to TestReindex_LevelDB actually makes the loop happen
much more often. And it ends up eclipsing the original issue that we
want to demonstrate, which is why I decided against it.

TL;DR: if you use -show_reindex_race=true , be prepared to maybe
have to kill(1) the test manually.

Change-Id: I47fd3c55363c8d0dda17ad19665115cb96f3d58f
2016-08-05 16:37:50 +02:00
Mathieu Lonjaret b507434d21 Merge "sorted/levedb: check error on closing iter" 2016-08-01 23:46:44 +00:00
mpl 7757b754b8 sorted/levedb: check error on closing iter
Follow-up from the findings of
https://camlistore-review.googlesource.com/6227 which hinted that the
iter "err" field was not needed.

-Added Error check on iterator Closing.

-Removed Error call in Next, because it.it.Next already does it. See
func (i *dbIter) Next(), that checks i.err, which is the same as calling
Error(). (in github.com/syndtr/goleveldb/leveldb/db_iter.go)

-the closed field, and related check in Next are not strictly necessary,
because that's part of what Release does, in conjunction with func (i
*dbIter) Next() which is checking if we already released (look for i.dir
== dirReleased). But we're keeping it and the panic for the benefit of
detecting programmer's errors.

Also added the missing leveldb tests in index pkg.

Because of a potential infinite loop error (likely related to issue #756,
and not introduced by this CL), I've added a timer triggered panic to
break that loop when it happens.

Change-Id: I26e0815f1d85279f0ead7bf90daae2ae03f1af63
2016-08-02 01:43:34 +02:00
mpl ba1877c4b6 pkg/fs: take At into account for (root)dirs
Also adds a more reliable check for "mounted broken" dirs on linux (that
require a fusermount -u). They were seen as not mounted (actually not
existing), because the check relied on df, which on them would just
return the "transport endpoint is not connected" error.

Fixes #826

Change-Id: I440d10a7b42c217ee85ea7a1726e581bc74e6f4a
2016-08-01 18:45:26 +02:00
Eric Drechsel 95f9a6b9a8 index: store exifgps keys without exponent
also check bounds of long, lat before storing

Fixes #758

Change-Id: Ife59ebeec23210bcb821a47765319c76688f7daa
2016-05-16 09:39:30 -07:00
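
For 95f9a6b9a8, one way to render a float without an exponent in Go is strconv.FormatFloat with the 'f' format. This is a sketch with a hypothetical helper name, not the code from the index package; it only contrasts the default formatting, which switches to exponent notation for small values, with the fixed-point form.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatCoord renders v in fixed-point notation, using the smallest number
// of digits that still round-trips to the same float64.
func formatCoord(v float64) string {
	return strconv.FormatFloat(v, 'f', -1, 64)
}

func main() {
	v := 0.00001234
	fmt.Println(fmt.Sprint(v))  // 1.234e-05
	fmt.Println(formatCoord(v)) // 0.00001234
}
```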
mpl 1fcacd8e85 pkg/index: add missing locking for some tests
As rev e93e4f3822 moved the responsibility
of locking the corpus to a higher up locking of the index, some tests
now need to add some locking of their own to avoid data races.

Context: issue #750

Change-Id: Ifaf87e275432fe5e66639fae2699d27b566c93aa
2016-05-03 15:07:49 +02:00
Brad Fitzpatrick e93e4f3822 Fix deadlock in search/index.
The describe requests were launching a storm of RLocks which weren't
safe in the presence of goroutines trying to acquire write locks.

Instead, make the corpus locking the responsibility of the caller and
add Lock/Unlock/RLock/RUnlock methods to the index and move locking up
a level.

This also adds a fair bit of context.Context plumbing which was used
in earlier debugging.

Fixes camlistore/camlistore#709

Change-Id: I8d7254d1e1da541f8c080d62f5408aac807fd3b1
2016-04-22 14:57:10 -07:00
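
A minimal sketch of "make the corpus locking the responsibility of the caller", as described in e93e4f3822. The corpus type and its methods are hypothetical. The underlying constraint is documented for sync.RWMutex: once a writer is waiting in Lock, new RLock calls block, so a goroutine that already holds a read lock and calls RLock again can deadlock. Taking the lock once at the top level and calling lock-free helpers underneath avoids the nested RLock.

```go
package main

import (
	"fmt"
	"sync"
)

// corpus is a hypothetical in-memory index guarded by one RWMutex.
type corpus struct {
	mu    sync.RWMutex
	attrs map[string]string
}

// The lock methods are exported to callers, one level up.
func (c *corpus) Lock()    { c.mu.Lock() }
func (c *corpus) Unlock()  { c.mu.Unlock() }
func (c *corpus) RLock()   { c.mu.RLock() }
func (c *corpus) RUnlock() { c.mu.RUnlock() }

// attrLocked returns an attribute value. The caller must hold the lock;
// the method takes no lock itself, so it is safe to call repeatedly under
// a single RLock.
func (c *corpus) attrLocked(key string) string {
	return c.attrs[key]
}

// setAttrLocked sets an attribute. The caller must hold the write lock.
func (c *corpus) setAttrLocked(key, value string) {
	c.attrs[key] = value
}

// describe takes the read lock once and performs all lookups under it.
func describe(c *corpus, keys []string) []string {
	c.RLock()
	defer c.RUnlock()
	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, c.attrLocked(k))
	}
	return out
}

func main() {
	c := &corpus{attrs: map[string]string{}}
	c.Lock()
	c.setAttrLocked("title", "Scan 1")
	c.setAttrLocked("tag", "bills")
	c.Unlock()
	fmt.Println(describe(c, []string{"title", "tag"}))
}
```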
mpl 8580b811cf pkg/index: make corpus RLock take a context
Also remove extra (deadlocking) RLock in pkg/search.

Updates issue #709

Change-Id: I556a1fbf9217f482b6a51e74c28a019dea8369a2
2016-04-21 11:28:24 -07:00
Brad Fitzpatrick 75d60962f6 Move remaining stuff in third_party/* to vendor/*
Change-Id: Ifbcc02817083cba68d8c1acec3e6ec50e8f61149
2016-04-20 16:49:15 -07:00
Tamás Gulácsi 7402cc0efd Delete misc unused objects
Using honnef.co/go/unused/cmd/unused

Change-Id: I672b3cb77f09e9bd80dcdc149cde4f7f2939e451
2016-04-06 17:59:51 +02:00
Tamás Gulácsi 8d6b156a0b Misc syntax cleanup found by gosimple.
https://github.com/dominikh/go-simple

Thanks to Dominik Honnef for this great little tool!

Change-Id: I789b3a37e18f535df1ff0da47c0366ed01b2429e
2016-04-04 17:19:57 +02:00
Will Norris 77ed42edf8 add canonical import paths
The import path was added to the go file that included the package
documentation if one existed.  Otherwise, I used what seemed to be the
primary file for the package.

Fixes #689

Change-Id: If51be0e86529fd6f179e80af6781e639f8550fd2
2016-03-13 19:57:14 -07:00
mpl e0d719ba21 pkg/types: remove
Most of it replaced with vendor/go4.org/types and
vendor/go4.org/readerutil

u32 went where needed in pkg/blobserver/*
invertedBool went in pkg/types/serverconfig
atomics64 went in pkg/fs

Change-Id: I230426cda35be4b45ed67e869f14e6fdae89be22
2016-02-05 18:28:47 +01:00
Attila Tajti 34ad44385b pkg/search: add parentof search expression
Change-Id: I74f6f5411355a0b0864739f135331ba304245ddb
2016-02-03 10:31:51 +01:00
Mathieu Lonjaret d20ad1570e Merge "server/synchandler: add exported constructor and IdleWait method" 2016-01-28 00:57:56 +00:00