I had intended for this to be a small change.
I was going to just add context.Context to the BlobReceiver interface,
but then I saw blob.Fetcher could also use one, so I decided to do two
in one CL.
And then it got a bit infectious and ended up touching everything.
I ended up doing SubFetch in the process by necessity.
At a certain point I finally started using context.TODO() in a few
spots, but not too many. But removing context.TODO() will come in the
future. There are more blob storage interfaces lacking context, too,
like RemoveBlobs.
Updates #733
Change-Id: Idf273180b3f8e397ac5929c6d7f520ccc5cdce08
Remove the blob.SHA{1,224}From{Bytes,String} constructors too. No
longer used. This adds blob.RefFromBytes which was missing. We had
blob.RefFromString. Now everything uses blob.RefFrom* instead of
specifying a hash function.
Some tests set a flag to force use of SHA-1 because there was too much
golden data to update. We can remove those one-by-one over time as we
fix up tests.
Updates #537
Change-Id: Ibe6428089a6221594c2b751f53f98b03b5a28dc2
followup of 7eda9fd502
We want existing Perkeep instances on GCE to be able to keep on running
with their DBNames-style existing databases.
To that end, we introduce the "perkeep-config-version" metadata key,
which will be set by the launcher from now on.
When perkeep configuration starts, it can lookup that key. If it is set,
it means we're in a newly created instance, and we don't need to care
about DBNames compatibility. If not, we modify the low level
configuration on the fly, so that it keeps on using the old DBNames
values that were set for a GCE instance.
Change-Id: I611811fdb9c68777c2ba799e9047d00ec0bae040
DBNames is supposed to provide configuration for the various databases
names. However,
1) I contend that nobody needs or wants to configure them as long as we
provide sane defaults.
2) it seems the only obvious user we have for this is to set up some of
the names on GCE.
3) having another external source for names complicates the code
further, especially when we already have the distinction between
database names for DBMS and file names for file-based databases.
4) writing a correct documentation for it is awkward.
Therefore, in this CL, I propose that we remove DBNames. Instead,
genconfig.go now sets some consistent default names for the various
queues and indexes set up on a DBMS (MySQL, PostGres, Mongo). To that
end, we introduce the new, but optional, DBUnique configuration
parameter, that is used as a part of all the database names, in order to
be able to run several Perkeep instances on the same DBMS, without name
conflicts.
In addition, the queue for the bs->index synchandler is now set up on
the same DBMS that is already in use for the index itself, instead of
using a file-base database.
And i think we could proceed likewise for the other queues.
Fixes#951
Change-Id: Ib6a638f088a563d881e3957e4042e932382b44f4
Part of the project renaming, issue #981.
After this, users will need to mv their $GOPATH/src/camlistore.org to
$GOPATH/src/perkeep.org. Sorry.
This doesn't yet rename the tools like camlistored, camput, camget,
camtool, etc.
Also, this only moves the lru package to internal. More will move to
internal later.
Also, this doesn't yet remove the "/pkg/" directory. That'll likely
happen later.
This updates some docs, but not all.
devcam test now passes again, even with Go 1.10 (which requires vet
checks are clean too). So a bunch of vet tests are fixed in this CL
too, and a bunch of other broken tests are now fixed (introduced from
the past week of merging the CL backlog).
Change-Id: If580db1691b5b99f8ed6195070789b1f44877dd4
On GCE, on startup, we did not set the camliNetIP in the high-level
config if the camlistore-hostname instance var vas already set.
The camliNetIP presence in the config is the signal for the server
that it should be configured as part of camlistore.net (and in
particular that it should update the record for its name on the
camlistore.net DNS server). Part of this configuration is to set the
camlistore-hostname var for the instance.
Therefore, a server which had been configured once as described above,
would not, on a subsequent restart, behave as if part of camlistore.net
and would skip the related configuration code path. Unless the
camlistore-hostname var was manually wiped before restart.
As this manual step is not an obvious one, this CL changes the
initialization, so that if a camlistore-hostname var is set, but the
value is a subdomain of "camlistore.net", the var is treated as empty
and initialization proceeds as if with a new server.
Fixes#963
Change-Id: Iab70185d7b90ef7e70bb831d363ff9d525922e35
Some errors such are missing fields, or wrong values, or conflicting
values were already caught logically when the high-level configuration
is transformed into the low-level configuration. Or later when
initializing handlers.
However, there was no mechanism to catch typos such as "httsCert"
instead of "httpsCert", or "leveldb" instead of "levelDB" in the
high-level configuration. Such errors would result in misconfiguration
(e.g. use of Let's Encrypt instead of the desired HTTPS certificate)
which can then even go unnoticed since the server still starts.
Therefore, this change generates a fake serverconfig.Config with all its
fields set, so that its JSON encoding results in a list of all the
possible configuration fields. This allows to compare the given
configuration fields with that list, and catch invalid names.
Change-Id: I4d6e61462353e52938db93ba332df2d52225e750
The bit removed is already done right before by osutil.Username(). Plus
it wasn't finished as the envVar wasn't even used in a Getenv call.
Change-Id: Ic75853ad883e7acad4347e7f2d8851dc470b5bbf
The default server config on GCE (as deployed by the launcher) did not
have a share handler. This CL adds one, so that users can benefit from
the new sharing feature from the web UI.
Also, the button for sharing from the web UI does not appear anymore if
the config that the web UI gets from the discovery does not have a
"shareRoot", because it's a strong hint that the server does not have a
share handler.
Change-Id: I6c444995339fda8dba864b1d6729fb7c1b6d72bd
So far, when building camlistored docker image for CoreOS, we were not
using make.go, and we were neither running gopherjs nor embedding the
resources (but rather provide the UI resources at their default
filesystem location).
Now that we're using gopherjs for the web UI, it is a hard dependency
for the camlistore server.
We could reproduce the steps in make.go to build gopherjs, run it to
build the web ui resources, and then move the resources at the right
place, but since make.go already does the equivalent work it seems
to make more sense to use it, which is the main point of this CL.
Similarly, it seems to make more sense to now build a binary with the
resources embedded, which is the default make.go behaviour, instead of
building a "raw" camlistored, and provide the resources as additional
directories in the container image, so this CL takes that approach too.
Finally, it was necessary to add the "-static" flag to make.go, so we
can keep on building a static camlistored binary, that does not rely on
libc for DNS. Because our container image is FROM SCRATCH, with just the
necessary binaries, in order to get a container image of a reasonable
size.
One drawback of now using make.go in
misc/docker/server/build-camlistore-server.go is we're doing some
unnecessary (since we're already running in the isolation of a
container) copying to the virtual gopath, but that seems a very tiny
price to pay. Especially considering how rarely we run that code.
Change-Id: I416c86d366cd4ed2d3b8b1636a6a65a83b9f15d7
Since we don't actually ever remove blobs (until we add garbage
collection, and even then), if a share claim gets deleted (with a delete
claim), the only knowledge of the deletion resides in the Index.
So when the share handler verifies the sharing chain, there is nothing
preventing it from reading a supposedly deleted share claim (from the
blobserver), and concluding that the share chain is valid.
This change adds the index handler to the share handler, so it can check
the deletion status of a share claim, and hence support "share
cancellation".
Fixes#914
Change-Id: I572fdddee30e745aa2d2a6720c83c8e8c916515d
Since -update_golden reorders lexicographically, and it seems we forgot
to systematically use it in previous updates, the files are now
unordered. So I'm running it now, so that future real changes don't get
polluted and hard to review in the reordering churn.
Change-Id: I7618f2e8dcc22f19850bb15c57084fc9688d67ce
Minimizing redundant info when setting up an app:
* prefix - Now optional and defaults to the handler prefix the whole app is attached to
* serverListen - Defaults to the host part of serverBaseURL
Change-Id: I395e71c54b9e81849681fe406ff1b5f412e92c9c
WARNING: this app is still experimental, and even its data schema might
change. Do not use in production.
This change adds a Camlistore-based port of the scanning cabinet app
originally created by Brad Fitzpatrick:
https://github.com/bradfitz/scanningcabinet
Some of it is inspired from the App Engine Go port of Patrick Borgeest:
https://bitbucket.org/pborgeest/nometicland
The data schema is roughly as follows:
-a scan is a permanode, with the node type: "scanningcabinet:scan".
-a scan's camliContent attribute is set to the actual image file.
-a scan also holds the "dateCreated" attribute, as well as the
"document" attribute, which references the document this scan is a part
of (if any).
-a document is a permanode, with the node type: "scanningcabinet:doc".
-a document page, is modeled by the "camliPath:sha1-xxx" = "pageNumber"
relation, where sha1-xxx is the blobRef of a scan.
-a document can also hold the following attributes: "dateCreated",
"tag", "locationText", "title", "startDate", and "paymentDueDate".
Known caveats, in decreasing order of concern:
-the data schema might still change.
-the scancab tool, to actually create and upload the files from physical
documents, is practically untested (since I do not own a scanner).
-some parts, in particular related to searches, are probably
sub-optimized.
-the usual unavoidable bugs.
Change-Id: If6afc509e13f7c21164a3abd276fec075a3813bb
When reindexing on a (My)SQL based sorted.KeyValue, we should recreate
the database schema from scratch, which means dropping the tables.
However, index.Reindex just calls Wipe on the newly created
sorted.KeyValue, which only deletes the rows, and does not drop the
tables.
Therefore, this CL changes the implementation of Wipe in the MySQL case,
so that it takes care of dropping the tables, and doing everything that
needs to be done afterwards to set up the sorted.KeyValue.
In addition, with the introduction of the sorted.NeedWipeError, we detect
upon initialization of a sorted.KeyValue if it failed because it needed
a schema update. If that is the case, and we're in reindex mode, we can
fix the sorted.KeyValue with a Wipe and carry on.
Finally, we introduce the new sorted.NewKeyValueMaybeWipe function that
automatically wipes a KeyValue when a NeedWipeError was returned upon
its creation.
Next, do the same with other sorted SQLs.
Fixes#806
Change-Id: I2032781cbf453a364880bd3e2e8b3c09aac7aed9
This CL changes the GCE launcher to work with the new features of
camlistored: i.e. that it can automatically get a hostname in
camlistore.net, and that it can get an HTTPS certificate from Let's
Encrypt, for said hostname.
In order for the user to easily (without having to look at the logs)
know what their hostname is, camlistored stores it as the
"camlistore-hostname" key in the custom metadata of the GCE instance.
The deployer can then query for that key, to report the hostname on the
instance creation success page.
Change-Id: Iaaef2d51f34fa5e1e0ee90097919abab7ee72a12
In order to use HTTPS, one must have a certificate, and one must have a
domain name for which the certificate is valid.
The first part is solved by the use of Let's Encrypt. For the second
part, we want to provide to any Camlistore instance a name such as
<gpgKeyId>.camlistore.net, where gpgKeyId is the fingerprint of its GPG
key. The DNS for camlistore.net agrees to add a record for that name if
and only if the Camlistore instance can prove it owns the GPG key, as
well as the IP address bound to that name in the DNS record.
A protocol such as the above is already implemented in pkg/gpgchallenge.
This CL:
- uses the client-side of the gpgchallenge protocol in camlistored, so
that it can claim a hostname in camlistore.net on startup (and then use
that hostname when requesting a certificate from Let's Encrypt).
- adds the configuration parameter "CamliNetIP" for the high-level
config. This parameter specifies the IP address that camlistored will
supply during the gpgpchallenge, so it can prove to the DNS server that
we own this address.
Fixes#722
Change-Id: I6bf4ec149b6dffd0ae93a6fa7bf208b2e8a05445
As the requests to the publisher are proxied through Camlistore's app
handler, there's no point in the publisher having its own autocert
Manager to request a certificate. Therefore, the publisher reuses
(readonly) camlistored's autocert CacheDir to get its certificate.
It follows that, for now, Let's Encrypt only works for the publisher if
it is running on the same host as camlistored (or more precisely, if they
share the same filesystem).
Fixes#458
Change-Id: Icf3be2913f85f9ec6f94b831ad58e1949b4d6961
Or to be more precise, golang.org/x/crypto/acme/autocert
The default behaviour regarding HTTPS certificates changes as such:
1) If the high-level config does not specify a certificate, the
low-level config used to be generated with a default certificate path.
This is no longer the case.
2) If the low-level config does not specify a certificate, we used to
generate self-signed ones at the default path. This is no longer always
the case. We only do this if our hostname does not look like an FQDN,
otherwise we try Let's Encrypt.
3) As a result, if the high-level config does not specify a certificate,
and the hostname looks like an FQDN, it is no longer the case that we'll
generate a self-signed. Let's Encrypt will be tried instead.
To sum up, the new rules are:
If cert/key files are specified, and found, use them.
If cert/key files are specified, not found, and the default values,
generate them (self-signed CA used as a cert), and use them.
If cert/key files are not specified, use Let's Encrypt if we have an
FQDN, otherwise generate self-signed.
Regarding cert caching:
On non-GCE, store the autocert cache dir in
osutil.CamliConfigDir()/letsencrypt.cache
On GCE, store in /tmp/camli-letsencrypt.cache
Fixes#701Fixes#859
Change-Id: Id78a9c6f113fa93e38d690033c10a749d1844ea6
rm google.golang.org/cloud
add cloud.google.com/go at a47b182e769f5e75f5fc927ff6ee2678f7f552cf
update google.golang.org/api to 63cb68f1e3834e44683ca062ddf06cb9a889380a
update google.golang.org/grpc to
0e6ec3a4501ee9ee2d023abe92e436fd04ed4081
update go4.org to f5283521d7365fb2875408726e9cbf349f173767
fix in cmd/ pkg/ server/
TODO(mpl): fix misc/docker tools as well. next CL.
Fixes#832
Change-Id: I842b968a0afea8a5822913bd614d67cdbe50ee63
Some of the publisher features have moved from the server-side app to
the client-side app (the browser) thanks to gopherjs. Some of these
features imply doing some search queries against Camlistore, which
requires authentication. The server-side app receives the necessary
credentials on creation, from Camlistore. However, we can't just
communicate them to the client-side (as we do with the web UI) since the
publisher app itself does not require any auth and is supposed to be
exposed to the world.
Therefore, we need to allow some search queries to be done without
authentication.
To this end, the app handler on Camlistore now assumes a new role: it is
also a search proxy for the app. The app sends an unauthenticated search
query to the app handler (instead of directly to the search handler),
and it is the role of the app handler to verify that this query is
allowed for the app, and if yes, to forward the search to the Camlistore's
search handler.
We introduce a new mechanism to filter the search queries in the form of
a master query. Upon startup, the publisher registers, using the new
CAMLI_APP_MASTERQUERY_URL env var, a *search.SearchQuery with the app
handler. The app handler runs that query and caches all the blob refs
included in the response to that query. In the following, all incoming
search queries are run by the app handler, which checks that none of the
response blobs are out of the set defined by the aforementioned cached
blob refs. If that check fails, the search response is not forwarded to
the app/client.
The process can be improved in a subsequent CL (or patchset), with finer
grained domains, i.e. a master search query per published camliPath,
instead of one for the whole app handler.
Change-Id: I00d91ff73e0cbe78744bfae9878077dc3a8521f4
This change allows the publisher to use resources from a SourceRoot
directory, without having to rebuild the publisher binary, instead of
only using embedded resources.
Change-Id: Ife29e3015b8595a33f175a62d98fcf5ffa689134
These improvements on the server app handler should help writing
and running stand-alone apps.
The two main goals are:
1) "simple" configurations should work automatically; the parameters for
the app are derived from the Listen and BaseURL of the Camlistore
server.
2) More advanced configurations, such as being behind a proxy, should be
easily configurable through the app's Listen, BackendURL, and ApiHost
parameters.
I had worked on them while doing the scanning cabinet app, and I am
backporting them now since we haven't landed the scanning cabinet yet,
and people have been having trouble setting up the publisher.
pkg/app/app_test.go is gone because app.ListenAddress is now dumb. The
hard work is done in pkg/server/app instead.
Fixes#818
Change-Id: Ice2610d6bac611b209cc3a928e67fa6093a41d3e
The blobpacked storage, among others, needs a sorted storage
implementation for its meta index. This changes lets the high-level
config specify a sorted implementation (that various indexes will use)
even when the (main) indexer is specifically disabled.
Fixes#797
Change-Id: Id69a939318aa6654b9fb92ee34978ae1d1c24291
The b.sortedStorageAt call had an empty filePrefix argument, which
means it would fail if the current index was not a DBMS (but a
file-based DB).
Also, this change adds a default "blobpacked_index" database name value
for the case when a user wants a DBMS index, but hasn't configured a
DBNames map.
Issue #693
Change-Id: Ie4386e789542c006530983e20bccc7a564bc4ee3
configHandler strips the key "_knownkeys" from all JSON objects in its
output using a regex on the marshaled JSON. If this was the last key of
an object which contains multiples keys, this leaves the JSON in an
invalid state. For example,
{
"foo": "bar",
"_knownkeys": {}
}
becomes:
{
"foo": "bar",
}
The trailing comma here is illegal. To fix this, we run one more regex
across the output removing any comma at the end of a line just before a
closing brace. This is never valid in JSON, so is safe to remove
globally here.
Change-Id: I05f54e9f094fbe9cbc720f073b2b78ae80525106
make the "REDACTED" placeholder a valid JSON string and include trailing
comma in attempt to make the document valid JSON (primarily helpful with
browser extensions that parse and render JSON documents, even though the
content type is text/plain).
Add client_secret to the list of redacted keys (used with google cloud
storage, possibly others).
Change-Id: Ia10fc9fffbd667ea7018b750b2a98db0b05dcf82
The import path was added to the go file that included the package
documentation if one existed. Otherwise, I used what seemed to be the
primary file for the package.
Fixes#689
Change-Id: If51be0e86529fd6f179e80af6781e639f8550fd2