When fetching shared blobs, we rely on the share chain to verify if a
blob can be reached. This chain is updated whenever we fetch an
additional link of the chain, by updating the Client.via map. However,
when some blobs of the chain are already cached in camget's DiskCache,
we get them from the cache instead of fetching them with
Client.FetchVia, so the Client.via map isn't updated, and the chain is
broken.
This change adds Client.UpdateShareChain, and sets it as a hook to be
called by the CachingFetcher in the event of a cache hit. That way, we
ensure that the share chain is updated even when we get blobs from the
cache (instead of from the Client).
We also add a mutex to guard Client.via, because it is accessed by
concurrent smartFetch calls in case of a static-set.
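As a rough illustration of the mechanism described above, here is a minimal sketch of a mutex-guarded chain map that is updated from a cache-hit hook. The shareChain and cachingFetcher types and their methods are hypothetical stand-ins, not the actual Client or CachingFetcher code.

```go
package main

import (
	"fmt"
	"sync"
)

// shareChain maps each blob ref to the ref it was reached through.
// Hypothetical stand-in for the Client.via map and the mutex guarding it.
type shareChain struct {
	mu  sync.Mutex
	via map[string]string
}

// update records that blob was reached through parent.
func (c *shareChain) update(blob, parent string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.via[blob] = parent
}

// path walks the chain from blob back to the first ref with no known parent.
func (c *shareChain) path(blob string) []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	p := []string{blob}
	// Bound the walk so that a malformed (cyclic) map cannot loop forever.
	for i := 0; i < len(c.via); i++ {
		parent, ok := c.via[blob]
		if !ok {
			break
		}
		p = append(p, parent)
		blob = parent
	}
	return p
}

// cachingFetcher (hypothetical) calls onHit on every cache hit, so that
// the caller can keep its share chain up to date.
type cachingFetcher struct {
	cache map[string]string
	onHit func(blob, parent string)
}

// fetch reports whether blob was served from the cache.
func (f *cachingFetcher) fetch(blob, parent string) bool {
	if _, ok := f.cache[blob]; !ok {
		return false // a real fetcher would go to the network here
	}
	if f.onHit != nil {
		f.onHit(blob, parent)
	}
	return true
}

func main() {
	chain := &shareChain{via: make(map[string]string)}
	f := &cachingFetcher{cache: map[string]string{"file": "contents"}, onHit: chain.update}
	f.fetch("file", "share")
	fmt.Println(chain.path("file"))
}
```

Without the onHit hook, a blob served from the cache would never be recorded in via, which is the broken-chain case this change addresses.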
As FetchVia was undocumented and not used by anyone, I made it
unexported. We can always export it again later when needed.
Fixes #856
Change-Id: I767cbec4b6f382cbccc25c0b97782b2a7472deb8
at rev 63cb68f1e3834e44683ca062ddf06cb9a889380a
Forgot to add it to the commit at ab06dbd80d
Fixes #854
Change-Id: Ic8a9a3d0fd279b2bcf20c1dd77bee56512ae3f39
rm google.golang.org/cloud
add cloud.google.com/go at a47b182e769f5e75f5fc927ff6ee2678f7f552cf
update google.golang.org/api to 63cb68f1e3834e44683ca062ddf06cb9a889380a
update google.golang.org/grpc to
0e6ec3a4501ee9ee2d023abe92e436fd04ed4081
update go4.org to f5283521d7365fb2875408726e9cbf349f173767
fix in cmd/ pkg/ server/
TODO(mpl): fix misc/docker tools as well. next CL.
Fixes #832
Change-Id: I842b968a0afea8a5822913bd614d67cdbe50ee63
When receiving a file, we were only trying to guess its MIME type
through its contents (pkg/magic). We're now making a better effort at it
by guessing from the filename extension if needed.
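The two-step guess can be sketched as follows. This is only an approximation: http.DetectContentType from the standard library stands in for pkg/magic, and guessMIME is a hypothetical name.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"path/filepath"
)

// guessMIME first sniffs the contents, and only falls back on the filename
// extension when sniffing yields nothing more specific than the generic
// binary type. It returns "" if the extension is unknown as well.
func guessMIME(name string, head []byte) string {
	// Stand-in for pkg/magic: DetectContentType returns
	// "application/octet-stream" when it cannot tell.
	if ct := http.DetectContentType(head); ct != "application/octet-stream" {
		return ct
	}
	return mime.TypeByExtension(filepath.Ext(name))
}

func main() {
	fmt.Println(guessMIME("photo.png", []byte("\x89PNG\x0D\x0A\x1A\x0A")))
	fmt.Println(guessMIME("doc.pdf", []byte{0, 1, 2}))
}
```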
Also:
pkg/magic: get rid of all the extra video extensions that are already
covered by mime.TypeByExtension, because they are redundant and
confusing.
app/publisher, pkg/types/camtypes: also use mime.TypeByExtension as an
extra effort. Especially since a reindex would be necessary to benefit
from the pkg/index change.
There are other places in Camlistore that could use such an effort.
Maybe we should have a camtypes.*FileInfo.MIME() method that tries all
the ways to guess the MIME type of the file?
Change-Id: Ib9a2bc42af77c5394dac578ae415524b5111ad4e
All the camweb features depending on docker had been stuck for a couple
of weeks because one of the docker containers was wedged somehow. I
don't know exactly what was wrong, but I figure it can't hurt to add the
same kind of cleanup that we have for the demo blobserver.
So in this CL, we name each container running a git command, so we can
stop and remove said container the next time we run the same command.
Change-Id: I66592fbbde73ea30e4cee7477ada450e0c6a645e
Because our go generate line is:
//go:generate go run gensearchtypes.go -out zsearch.go
which will run a binary of gensearchtypes.go built for whatever $GOOS is
set to, and which will therefore fail to run if $GOOS is different from
runtime.GOOS (the cross-compiling case).
I suppose it means that the day pkg/search becomes GOOS differentiated,
we may have to introduce an -os flag to gensearchtypes.go, since it
calls go doc on camlistore.org/pkg/search, whose output might depend on
GOOS?
Change-Id: I1ea32bb9190300120887ee8614dcdd2d1391a954
To decide whether a search submitted to the app search proxy is allowed,
we compare its results to the domain blobs, i.e. the results of the
master query, which we cache when the master query is set.
However, since the results of the master query are liable to change when
new blobs arrive (e.g. a new camliMember is added to the set that is
published), that cache may need to be invalidated. Otherwise, we might
reply with a 403 to a search query that is actually allowed.
Therefore, this CL adds a refresh of the cache in two cases:
- When the app handler gets a search query that seems to be forbidden.
Before replying with a 403, we refresh the cache with the master query,
and recheck whether the search query is allowed.
- When the publisher gets a request for a "members" page, or the "file"
page, it preemptively asks the app handler to refresh. Now that a lot of
the client workflow has been moved to javascript/the browser, these
kinds of requests should not happen too often, so it seems a reasonable
place to ask for a refresh. But this might change, so we should of
course be careful not to flood the app handler with refresh requests in
the future.
In any case, the app handler throttles the refresh requests, so that it
does not perform more than one refresh per minute.
As a smarter approach, we could later imagine a way for the app handler
to be aware of when new blobs get to the blobserver (akin to the blob
hub that the sync handler uses?), so that it only ever refreshes when
needed.
Fixes #851
Change-Id: Idc14cce5018053deac01ec454e5c936ed93e5a05
Add Handler.GetPermanodeLocation, based on the existing logic of loc:* and has:loc
search predicates, and that of index/Corpus.PermanodeLatLong:
1. Permanode attributes "latitude" and "longitude"
2. Referenced permanode attributes (eg. for "foursquare.com:checkin"
its "foursquareVenuePermanode")
3. Location in permanode camliContent file metadata
The sources are checked in this order, and the location from
the first source yielding a valid result is returned.
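The lookup order above can be sketched as a chain of fallbacks. The Location fields, the source type, and firstLocation are assumptions made for illustration, not the actual camtypes or search API.

```go
package main

import "fmt"

// Location is a hypothetical stand-in for the new camtypes location type.
type Location struct {
	Latitude, Longitude float64
}

// source looks up a location for a permanode, and reports whether it
// found a valid one.
type source func(permanode string) (Location, bool)

// firstLocation tries each source in order and returns the location from
// the first one that yields a valid result.
func firstLocation(permanode string, sources ...source) (Location, bool) {
	for _, s := range sources {
		if loc, ok := s(permanode); ok {
			return loc, true
		}
	}
	return Location{}, false
}

func main() {
	fromAttrs := func(string) (Location, bool) { return Location{}, false }
	fromReferenced := func(string) (Location, bool) { return Location{48.85, 2.35}, true }
	fromFile := func(string) (Location, bool) { return Location{1, 1}, true }
	fmt.Println(firstLocation("some-permanode", fromAttrs, fromReferenced, fromFile))
}
```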
camtypes: add new Location type
index: Add GetFileLocation to Index/Interface, to make
indexed location info accessible without a corpus.
Until now, location was unlike other file metadata, such as image info
or media tags, which have accessors in both Index and Corpus.
Fixes #777
Change-Id: I63cf143d67a12732ca2c941de64b63736be5de6e
So far only images were served with their MIME types set properly, so
they would display directly in the browser, instead of being served as a
file download.
Now the same is done for a subset of text types: i.e. text/plain,
text/html, text/xml, and text/json. Aside from the browsing convenience,
the obvious advantage is being able to serve HTML directly, which should
allow us to build other things on top of the publisher.
Also a bit of related refactoring: moving the extension matching to
pkg/magic
Change-Id: Id98065c7c685036a272d1d2e293bfcbca33015ee
So far we were returning an error, which appears to be a little too
strict, so we now just skip a key or value that is too large, log it,
and go on.
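A minimal sketch of that behavior, with made-up limits: kvStore, maxKeySize, and maxValueSize are hypothetical and do not reflect the real size limits.

```go
package main

import (
	"fmt"
	"log"
)

// Hypothetical limits, for illustration only.
const (
	maxKeySize   = 8
	maxValueSize = 16
)

type kvStore map[string]string

// set stores the pair, or logs and skips it when the key or the value is
// too large, instead of returning an error. It reports whether the pair
// was stored.
func (s kvStore) set(key, value string) bool {
	if len(key) > maxKeySize || len(value) > maxValueSize {
		log.Printf("skipping oversized entry: key is %d bytes, value is %d bytes", len(key), len(value))
		return false
	}
	s[key] = value
	return true
}

func main() {
	s := kvStore{}
	s.set("small", "ok")
	s.set("a-key-that-is-too-long", "ignored")
	fmt.Println(len(s), "entry stored")
}
```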
Fixes #849
Change-Id: Iadd4eaab7459643e22ab3043d1f45e3eab662b30
Because docker unpacks the file without its +x, which makes go generate
fail when trying to execute it.
Change-Id: I9998b849110437c6faff89090f5dbe98fe2f2c9b
vendoring: update gopherjs to rev
45518c130e5bd1525f20110830a4986365a153de
Patch on upstream to remove fsnotify dep added (for reference) as:
vendor/github.com/gopherjs/gopherjs/nofsnotify.diff
make.go: make Go1.7 the required version
Fixes #838
Change-Id: I2013ee4832a26f8be3a8b42f02e40a347674ec9a
Since the app handler should not trim the r.URL.Path of the handler's
prefix, it is now the responsibility of the app to cope with that
prefix.
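On the app side, coping with the prefix might amount to something like the following sketch. appPath and the "/pics/" prefix are hypothetical; how the app actually learns its prefix is not covered here.

```go
package main

import (
	"fmt"
	"strings"
)

// appPath converts a request path that still carries the app handler's
// prefix into a path relative to the app. It reports false if the path is
// not under prefix.
func appPath(prefix, urlPath string) (string, bool) {
	if !strings.HasPrefix(urlPath, prefix) {
		return "", false
	}
	return "/" + strings.TrimPrefix(urlPath, prefix), true
}

func main() {
	fmt.Println(appPath("/pics/", "/pics/foo/bar"))
}
```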
Fixes #833
Change-Id: Ie1fa9801b26767c3e3b6612498380261e22cdf07
Some of the publisher features have moved from the server-side app to
the client-side app (the browser) thanks to gopherjs. Some of these
features imply doing some search queries against Camlistore, which
requires authentication. The server-side app receives the necessary
credentials on creation, from Camlistore. However, we can't just
communicate them to the client-side (as we do with the web UI) since the
publisher app itself does not require any auth and is supposed to be
exposed to the world.
Therefore, we need to allow some search queries to be done without
authentication.
To this end, the app handler on Camlistore now assumes a new role: it is
also a search proxy for the app. The app sends an unauthenticated search
query to the app handler (instead of directly to the search handler),
and it is the role of the app handler to verify that this query is
allowed for the app, and if so, to forward the search to Camlistore's
search handler.
We introduce a new mechanism to filter the search queries in the form of
a master query. Upon startup, the publisher registers, using the new
CAMLI_APP_MASTERQUERY_URL env var, a *search.SearchQuery with the app
handler. The app handler runs that query and caches all the blob refs
included in the response to that query. From then on, all incoming
search queries are run by the app handler, which checks that none of the
response blobs are outside of the set defined by the aforementioned
cached blob refs. If that check fails, the search response is not
forwarded to the app/client.
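The filtering step can be sketched as a set membership check. The domain type and the plain-string blob refs are simplifications made for this example; the real handler works on search responses.

```go
package main

import "fmt"

// domain is the set of blob refs cached from the master query response.
type domain map[string]bool

// allows reports whether every blob ref in results belongs to the domain,
// i.e. whether the search response may be forwarded to the app/client.
func (d domain) allows(results []string) bool {
	for _, br := range results {
		if !d[br] {
			return false
		}
	}
	return true
}

func main() {
	// Blob refs cached when the master query was registered.
	d := domain{"sha1-aaa": true, "sha1-bbb": true}

	fmt.Println(d.allows([]string{"sha1-aaa"}))             // within the domain
	fmt.Println(d.allows([]string{"sha1-aaa", "sha1-zzz"})) // leaks a blob outside of it
}
```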
The process can be improved in a subsequent CL (or patchset), with finer
grained domains, i.e. a master search query per published camliPath,
instead of one for the whole app handler.
Change-Id: I00d91ff73e0cbe78744bfae9878077dc3a8521f4
ClaimsAttrValue can be used by clients of pkg index, such as pkg
search, on the claims returned by Index.AppendClaims, to query
permanode attributes.
Related: issue #777
Change-Id: I5a8fa2a970d88f9ddbcc3de350215a196e124b64
Since probably e93e4f3822 we've been using
(*search.Handler).DescribeLocked instead of
(*search.Handler).Describe when serving describe queries, so the index
was never locked :(
I was so focused on debugging the recursions in describeReally that I
missed that elephant in the room.
I still think the Describe and describeReally recursions protected by
dr.wg are too complicated, but that's for another CL.
Fixes #750
Change-Id: I376ee0f807df68ad1ef3752f944edc7c9bdb92ee
As soon as the (bs->index) sync handler is initialized, it starts
writing pending blobs to the index. However, so far the corpus
initialization has been left, unguarded, up to the search handler. Which
means lots of potential data races on the corpus fields between the sync
handler writing a blob to the index, and the search handler creating all
the pieces of the corpus itself.
This change fixes the issue by locking the index while the search
handler is creating the corpus.
This change does not entirely fix issue #750, as there seems to be yet
another data race going on, but it is related as the fixed race was
found thanks to the race trace at:
https://github.com/camlistore/camlistore/issues/750#issuecomment-241881987
Change-Id: I6de0165064a4f8934466f6a08710b4962f7b0256
This change allows the publisher to use resources from a SourceRoot
directory, without having to rebuild the publisher binary, instead of
only using embedded resources.
Change-Id: Ife29e3015b8595a33f175a62d98fcf5ffa689134
To avoid tmp file creation errors due to ulimit.
A different, more flexible, approach was discussed on
https://github.com/camlistore/camlistore/issues/812 , and could be
implemented later on if the current CL is too naive.
As a follow-up, issue #837 should then be fixed.
Fixes #812
Change-Id: I2590fdac137b0e8711a6a1bf4ba8a32259496515
Regardless of whether the database tables are newly created, we were
_always_ stamping the database schema version in the meta table BEFORE
doing the database version check. Which means said check was actually
useless.
Also add a small sanity check of the database name.
Issue #806
Change-Id: I85e19ef7583ebd5ef1043a6deb0fe61abaa4d190