456 Commits

Author SHA1 Message Date
Brad Fitzpatrick 12894d4630 all: Windows fixes (don't listen on file descriptors in test.World, etc)
test/integration: don't listen on file descriptors.
make.go: unrelated, but options to make it much faster.
internal/images: t.Skip on HEIC dependency failures

Fixes #1140
Updates golang/go#25210

Change-Id: I8092155411826d6ed1f8d85230b753d1369044af
2018-05-01 21:38:19 -07:00
Brad Fitzpatrick f5de76de22 diskpacked: fix tests on Windows
Delete some test code I don't see the value of (not sure what it's
testing).

Change-Id: I7c27bce5601e1b2780eb8bb6eeadf45f7ea97d00
2018-05-01 13:14:41 -07:00
Brad Fitzpatrick b1abccd287 devcam, localhost, storagetest: fixes for Windows
Unset CGO_ENABLED if no sign of gcc.

The localdisk renaming stuff was fixed in Go ages ago in
golang/go#13673 and https://golang.org/cl/6140

And a defer in storagetest meant Windows couldn't delete files
because a file was still open.

Change-Id: I57aef85f24653b19ce10e3d1e18c778cee2d48f6
2018-05-01 10:46:19 -07:00
Brad Fitzpatrick b1c1d1be68 blobserver/localdisk: be sure to implement SubFetcher for blobpacked
My fault for not running the (slow) integration tests before I broke
things in a4d0cc6ab7.

Fixes #1136

Change-Id: Ia30051da02974d0c3e79e0b220ff86dcab5771e4
2018-04-30 16:23:43 -07:00
Brad Fitzpatrick a4d0cc6ab7 blobserver/{localdisk,files}: move generic localdisk code to the files package
Just code movement.

Except I did delete some 5 year old localdisk migration code that's no
longer relevant.

Updates #1111 (this is prep for SFTP support)

Change-Id: Ibe1de1d4d804a6c86919a9df454ab125027e4c33
2018-04-29 20:59:42 -07:00
Brad Fitzpatrick 0ff6a4954e pkg/blobserver/gethandler: fix vet error
Change-Id: Ib3a86f74230f76c5fa9e3965d360e82c07a41e80
2018-04-29 10:07:54 -07:00
Brad Fitzpatrick 3d5c28511f Log instead of panicking in HTTP handler on GET to /index/ (no Fetcher)
And change two Camlistore instances to Perkeep.

Change-Id: Id515480ecdc0c997e1700204d63ef4cc6a8c8cc4
2018-04-27 13:29:01 -07:00
Brad Fitzpatrick 8a67582cf9 blobserver/{files,localdisk}: add VFS layer for use by localdisk
For now, no user-visible changes.

But this will permit an SFTP blobstorage layer in the future.

Next step will be moving 90% of the code from the localdisk package
into the files package.

Updates #1111

Change-Id: I62b924e3d69ca47e7c0fa83c78a77808a71ea33e
2018-04-24 16:30:08 -07:00
mpl 2bb666ccf6 all: rename remaining occurrences of camput
Also removed misc/buildbot while at it (which contained camput
references) since we don't use it anymore at all.

TODO: the OSX app seems to be relying on finding a binary in ../bin,
which we do not use anymore. This will probably need fixing.

Updates #981

Change-Id: I14220fbad2e81181330fca4bb2d2e5fe170e1bd6
2018-04-21 16:20:24 -07:00
Brad Fitzpatrick 54578ea062 pkg/blobserver/blobpacked: log before starting integrity check
Fixes #1097

Change-Id: I3abea84a9cee090309098634e655721272386092
2018-04-21 12:05:41 -07:00
Brad Fitzpatrick ca76a40bbc Rename camlistored to perkeepd.
Updates #981

Change-Id: I8fe43c240c149074c23128a89ab426af9cbf94b4
2018-04-21 11:06:09 -07:00
mpl 4b47f45535 pkg/blobserver/blobpacked: reindex duplicates as well
For reasons yet to be explained, it can happen that zip blobs with the
exact same overall data contents get written to the large blobpacked.
Some of their packed schema blobs might differ though, e.g. if the
file names are different.

This is a problem, at least because of the added storage space used. And
also because it is not supposed to happen as there is a wholeRef check
on reception of a file.

However, it seems harmless to let these duplicates get indexed as z:
rows.

Therefore, this change slightly improves the inspection of duplicates and
the logging around it, to help users delete them by hand if they decide
to do so, but also adds detected duplicates as found zip blobs to the
index. This means finding duplicates is no longer considered a failure
of the recovery process.

Updates issue #1079

Change-Id: I5fec2f2226818eefd03ffce82c768de867fad6b0
2018-04-07 20:14:02 +02:00
mpl d917c62d76 blobserver/blobpacked: add new z: meta rows
As discussed in issue #1068, part of the problem with reindexing and
then checking the integrity of the index, is that neither b: nor w:
meta rows are keyed by the blobRef of the packed blob (the "zipRef").

This can lead, among other things, to reindexing actually erasing all
trace of a zipRef as it may have been in the value of a row, that gets
subsequently overwritten by a row that has a different value, but an
identical key (since e.g. w: rows are keyed by wholeRef).

To address that issue, this change introduces a new kind of meta row,
prefixed by z: , which indexes the following information:
key: blobref of the zip, prefixed by "z:"
value: size of the zip, blobref of the contents of the whole file, size
of the whole file, position in the whole file of the data in the zip,
size of the data in the zip.

Fixes #1068

Change-Id: Iae61fe823cda0accb22e55ea075407a2b8fd11f8
2018-03-30 20:47:18 +02:00
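The z: row layout described in d917c62d76 can be sketched roughly as follows. This is a minimal illustration, not the blobpacked code: the `zipMeta` type, its field names, and the space-separated value encoding are all assumptions made for the example. Only the key prefix and the list of fields come from the commit message.

```go
package main

import "fmt"

// zipMeta is a hypothetical container for the fields the commit message
// lists for a z: meta row.
type zipMeta struct {
	ZipRef    string // blobref of the zip; becomes the row key
	ZipSize   int64  // size of the zip
	WholeRef  string // blobref of the contents of the whole file
	WholeSize int64  // size of the whole file
	Offset    int64  // position in the whole file of the data in the zip
	DataSize  int64  // size of the data in the zip
}

// key returns the row key: the zip's blobref prefixed by "z:".
func (m zipMeta) key() string { return "z:" + m.ZipRef }

// value encodes the remaining fields. The space-separated encoding is an
// assumption of this sketch, not the documented on-disk format.
func (m zipMeta) value() string {
	return fmt.Sprintf("%d %s %d %d %d",
		m.ZipSize, m.WholeRef, m.WholeSize, m.Offset, m.DataSize)
}

func main() {
	rows := map[string]string{}
	// Two zips holding data from the same whole file: keyed by wholeRef they
	// would overwrite each other, keyed by zipRef they do not.
	for _, m := range []zipMeta{
		{ZipRef: "sha224-aaaa", ZipSize: 1024, WholeRef: "sha224-ffff", WholeSize: 4096, Offset: 0, DataSize: 1000},
		{ZipRef: "sha224-bbbb", ZipSize: 2048, WholeRef: "sha224-ffff", WholeSize: 4096, Offset: 1000, DataSize: 2000},
	} {
		rows[m.key()] = m.value()
	}
	for k, v := range rows {
		fmt.Println(k, "=>", v)
	}
}
```

Because each row is keyed by the zipRef, reindexing one zip cannot erase the trace of another zip that happens to share a wholeRef.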
Brad Fitzpatrick 77ef2484c2 blobserver/blobpacked: re-validate large integrity after repair
Also:
* lowercase the SHOUTYCASE log prefix
* close the meta index on Close (from mpl's CL)

For #1068

Change-Id: I26d9d77338ac850a20d9d631c0424129f6f98fe2
2018-03-23 11:34:57 -07:00
Brad Fitzpatrick 93422ae168 blobserver/blobpacked: add more debug
For #1068

Change-Id: I4c1210480d869f8d4fb1b53d28d69cc02ad51d72
2018-03-22 22:22:11 -07:00
mpl 4d9b0b1306 blobserver/blobpacked: fix incomplete Errorf call
Caught by Go1.10 yay

Change-Id: Ibb64ca8830b61d69350f3eddd50574d7d85b7c94
2018-03-02 19:48:37 +01:00
Attila Tajti b6baba23b0 pkg/blobserver/overlay: use better names for underlying storage
Use the names "lower" and "upper" from OverlayFS
instead of "base" and "stage".

Also make the deleted keyvalue (and thus support for deletion) optional,
and improve readability for the StatBlobs method.

Change-Id: Ic3f36609bf4599251f9ba7c648f513b788550298
2018-02-27 17:11:59 +01:00
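The lower/upper arrangement from b6baba23b0 can be illustrated with an in-memory sketch. Everything here is hypothetical (the `overlay` type, its methods, and the use of maps as storage); it only aims to show the lookup order and why a nil deleted set means deletion is unsupported.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("blob not found")

// overlay is a hypothetical in-memory stand-in for the overlay storage.
type overlay struct {
	lower   map[string][]byte // read-only layer; never modified
	upper   map[string][]byte // receives all writes
	deleted map[string]bool   // optional; nil means deletion is unsupported
}

// fetch hides deleted blobs, then prefers upper over lower.
func (o *overlay) fetch(ref string) ([]byte, error) {
	if o.deleted[ref] { // reading from a nil map is safe and yields false
		return nil, errNotFound
	}
	if b, ok := o.upper[ref]; ok {
		return b, nil
	}
	if b, ok := o.lower[ref]; ok {
		return b, nil
	}
	return nil, errNotFound
}

// receive writes to upper only and clears any deletion mark.
func (o *overlay) receive(ref string, data []byte) {
	o.upper[ref] = data
	delete(o.deleted, ref) // no-op when deleted is nil
}

// remove records a deletion without touching lower.
func (o *overlay) remove(ref string) error {
	if o.deleted == nil {
		return errors.New("overlay: deletion not supported")
	}
	o.deleted[ref] = true
	return nil
}

func main() {
	o := &overlay{
		lower:   map[string][]byte{"ref-1": []byte("from lower")},
		upper:   map[string][]byte{},
		deleted: map[string]bool{},
	}
	o.receive("ref-2", []byte("from upper"))
	_ = o.remove("ref-1")
	_, err := o.fetch("ref-1")
	fmt.Println("fetch ref-1:", err, "| blobs still in lower:", len(o.lower))
}
```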
Attila Tajti b71c054373 pkg/blobserver/overlay: add “overlay” blobserver type
The blobserver uses a base blobserver in read-only mode,
and uses another stage blobserver to record changes.

In a sense it is a read-write view on the master
blobserver that never changes base itself.

Change-Id: I39c7d7bbac713c32fd17710fab43754b546ebb3b
2018-02-21 05:51:00 +01:00
mpl 0d96057201 pkg/blobserver/blobpacked: check large storage integrity
Check that all blobpacked zips are in the blobpacked meta, and
vice-versa (that all entries in meta do exist as a zip in large
storage).

As the current recovery code would not fix the case of stale entries
(large blobRefs in the blobpacked index of large blobs that don't exist
anymore), this change also adds a new recovery mode, which wipes the
existing blobpacked index, before rebuilding it.

In doing so, the recovery var in blobpacked pkg, as well as the
flagRecovery in camlistored.go have been changed to ints instead of bools,
to take into account that we now have several modes of operation for
recovery.

Fixes #946

Change-Id: I1fe76b805af34933e362d70c9f27bfd5403e3f3a
2018-02-17 02:53:24 +01:00
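The two-way check described in 0d96057201 amounts to a set comparison. The sketch below is an assumption-laden illustration: `checkIntegrity` and its string-slice inputs are invented for the example and do not mirror the blobpacked code.

```go
package main

import "fmt"

// checkIntegrity compares the zips found in large storage with the zips
// recorded in the meta index. It returns the zips missing from the index
// and the stale index entries that no longer exist in storage.
func checkIntegrity(zips, meta []string) (unindexed, stale []string) {
	inMeta := make(map[string]bool, len(meta))
	for _, ref := range meta {
		inMeta[ref] = true
	}
	inStorage := make(map[string]bool, len(zips))
	for _, ref := range zips {
		inStorage[ref] = true
		if !inMeta[ref] {
			unindexed = append(unindexed, ref)
		}
	}
	for _, ref := range meta {
		if !inStorage[ref] {
			stale = append(stale, ref)
		}
	}
	return unindexed, stale
}

func main() {
	unindexed, stale := checkIntegrity(
		[]string{"zip-a", "zip-b"}, // present in large storage
		[]string{"zip-b", "zip-c"}, // present in the meta index
	)
	fmt.Println("unindexed:", unindexed, "stale:", stale)
}
```

Stale entries are the case the commit says the earlier recovery code could not repair, which is why a mode that wipes and rebuilds the index was added.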
Paul Lindner 84b2c6b3e4 all: various lint fixes
- correct logging that logged functions instead of their value
- use ID vs Id naming
- use correct function names in comments

Change-Id: I61562cef7ebac7337ec6c85312cdf7915cb1a84b
2018-02-05 11:59:00 -08:00
Paul Lindner 459c75410e all: more renaming of Camlistore to Perkeep
Change-Id: I118e3cbcf20d80afeffc84f001388c4556f21628
2018-01-30 03:02:56 -08:00
Brad Fitzpatrick 66791480b0 cmd/client: remove NewStorageClient, convert to an option
Also fix some docs, rename some Camlistore to Perkeep, and make Close
close idle connections.

Change-Id: Ib903c7f01728d36b87301674094ca8967306cda1
2018-01-24 08:52:07 -08:00
Brad Fitzpatrick d4ff75359c pkg/client: reduce the number of New constructors, return error by default
This removes NewDefault and NewFromParams.

Now the default way to create a client is:

    client.New() -> (*Client, error)

Specifying a server is optional and now requires
client.OptionServer(server).

If the caller really wants to log.Fatal on error, they can use
client.NewOrFail.

Also, some of the boilerplate from GopherJS callers is now promoted to
be the default behavior in the client package.

Change-Id: Icb106cf3e13cc492fe5b2f7f240e1ad4227eaf33
2018-01-24 07:42:04 -08:00
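The constructor shape described in d4ff75359c follows the functional-options pattern. The sketch below is a standalone imitation, not the perkeep.org/pkg/client package: the `Client` fields, the `ClientOption` type, the `defaultServer` value, and the validation rule are assumptions chosen to keep the example runnable. Only the names `New`, `OptionServer`, and `NewOrFail` come from the commit message.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// defaultServer is a placeholder used when no server option is given.
const defaultServer = "localhost:3179"

// Client is a hypothetical stand-in for the real client type.
type Client struct {
	server string
}

// ClientOption configures a Client during construction.
type ClientOption func(*Client)

// OptionServer selects the server to talk to.
func OptionServer(server string) ClientOption {
	return func(c *Client) { c.server = server }
}

// New applies the options and returns an error instead of exiting.
func New(opts ...ClientOption) (*Client, error) {
	c := &Client{server: defaultServer}
	for _, opt := range opts {
		opt(c)
	}
	// Illustrative validation only.
	if c.server == "" || strings.ContainsAny(c.server, " \t") {
		return nil, fmt.Errorf("client: invalid server %q", c.server)
	}
	return c, nil
}

// NewOrFail is for callers that prefer to log.Fatal on error.
func NewOrFail(opts ...ClientOption) *Client {
	c, err := New(opts...)
	if err != nil {
		log.Fatalf("error creating client: %v", err)
	}
	return c
}

func main() {
	c, err := New(OptionServer("example.com:3179"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using server", c.server)
}
```

Returning the error from `New` leaves the exit policy to the caller, and `NewOrFail` keeps the old fatal behavior as an explicit opt-in.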
Brad Fitzpatrick 8eec428c0a Merge "blobserver: add context to BlobRemover" 2018-01-23 18:33:20 +00:00
Brad Fitzpatrick 87694c3ba8 Merge "all: simplify constructs by running gofmt -s on all code" 2018-01-21 19:14:54 +00:00
Adam Shannon 2b655f8855 blobserver/localdisk: check that underlying filesystem can perform operations needed
Fixes #397

Change-Id: Idc8674d13336b29eb95db4be4dd39cd557ca38e7
2018-01-21 13:10:56 -06:00
Paul Lindner 6d2d9714de all: simplify constructs by running gofmt -s on all code
Change-Id: Idc12ddcfe8f735d77c6baa942f5bb7a2c7d9b40b
2018-01-21 10:27:12 -08:00
Brad Fitzpatrick 66db09453f blobserver: add context to BlobRemover
Updates #733

Change-Id: I2fffb5cad59aa994441ee82ac5d940270113ee5a
2018-01-19 09:54:46 -08:00
Brad Fitzpatrick 194d4f9443 blobserver, all: add contexts to ReceiveBlob, Fetch & million resulting deps
I had intended for this to be a small change.

I was going to just add context.Context to the BlobReceiver interface,
but then I saw blob.Fetcher could also use one, so I decided to do two
in one CL.

And then it got a bit infectious and ended up touching everything.

I ended up doing SubFetch in the process by necessity.

At a certain point I finally started using context.TODO() in a few
spots, but not too many. But removing context.TODO() will come in the
future. There are more blob storage interfaces lacking context, too,
like RemoveBlobs.

Updates #733

Change-Id: Idf273180b3f8e397ac5929c6d7f520ccc5cdce08
2018-01-18 16:22:16 -08:00
Tamás Gulácsi 2ba0c43003 pkg/blobserver/union: add "union" blobserver type
This blobserver is just "cat"ing the given "read" storages.
This is read-only, so you should use some other storage to augment this for
writing and removing - for example the "cond" storage is perfect for this.

My use-case is to use blobpacked with large=diskpacked, small=filesystem,
but consolidate the small blob storage into a diskpacked + filesystem
after the filesystem becomes huge.

Another use-case is joining separately built camlistore servers into one.
(For me, they have to be separated later, so I've built them separately,
but I have to use them joined for a month).

Change-Id: I4e7e42cd59286f0f34da2f6ff01e44439771d53c
2018-01-12 06:54:24 +01:00
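The read-only "cat" behavior described in 2ba0c43003 can be sketched as below. The `union` and `memStore` types are hypothetical and use plain maps as storage; the real blobserver interfaces are richer than this.

```go
package main

import (
	"errors"
	"fmt"
)

var errReadOnly = errors.New("union: storage is read-only")

// memStore is a hypothetical in-memory read storage.
type memStore map[string]string

// union consults each read storage in order.
type union struct {
	reads []memStore
}

// fetch returns the blob from the first storage that has it.
func (u union) fetch(ref string) (string, bool) {
	for _, s := range u.reads {
		if v, ok := s[ref]; ok {
			return v, true
		}
	}
	return "", false
}

// receive always fails: writes must go through some other storage.
func (u union) receive(ref, data string) error {
	return errReadOnly
}

func main() {
	u := union{reads: []memStore{
		{"ref-1": "from first"},
		{"ref-2": "from second"},
	}}
	v, ok := u.fetch("ref-2")
	fmt.Println(v, ok)
	fmt.Println(u.receive("ref-3", "data"))
}
```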
Brad Fitzpatrick 38f10a7bd0 all, testhooks: use sha224 by default, add hook for some tests to use sha-1
Remove the blob.SHA{1,224}From{Bytes,String} constructors too. No
longer used. This adds blob.RefFromBytes which was missing. We had
blob.RefFromString. Now everything uses blob.RefFrom* instead of
specifying a hash function.

Some tests set a flag to force use of SHA-1 because there was too much
golden data to update. We can remove those one-by-one over time as we
fix up tests.

Updates #537

Change-Id: Ibe6428089a6221594c2b751f53f98b03b5a28dc2
2018-01-09 20:03:38 -08:00
Brad Fitzpatrick 6bb4cc91ba pkg/blobserver/mongo: fix stat behavior on missing blob
Tests pass now.

Change-Id: Ib98ca8c213b638a79bfd61f1b7739459b3fd03da
2018-01-09 15:12:05 -08:00
Brad Fitzpatrick 0e8980b54b blobserver: change BlobStatter interface, simplify proxycache
This addresses a long-standing TODO in the BlobStatter interface to
clean it up. Just like all new Go programmers, I misused channels in
APIs. I should've cleaned this up years ago.

While here, I also added a context.

The rest should get contexts later.

This also cleans up a few things here & there.

The pkg/client statting no longer does batching, which added a lot of
complexity. There was a comment saying something like "once we have
SPDY, we can delete this". Well, we have HTTP/2 now, so seems
deletable.

All tests pass.

Change-Id: I034ce07d9b70e5cc9e5482213368993e638d4bc8
2018-01-08 16:54:52 -08:00
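One common way to replace a channel-based stat API is a callback plus a context, sketched below. The commit message for 0e8980b54b only says that the channel misuse was cleaned up and a context was added, so the callback shape, the `sizedRef` type, and the `store` type here are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// sizedRef is a hypothetical pairing of a blobref and its size.
type sizedRef struct {
	Ref  string
	Size uint32
}

// store is a hypothetical in-memory blob storage.
type store map[string][]byte

// statBlobs calls fn once for each requested blob that exists. Missing blobs
// are skipped. It stops at the first error from fn or once ctx is done.
func (s store) statBlobs(ctx context.Context, refs []string, fn func(sizedRef) error) error {
	for _, ref := range refs {
		if err := ctx.Err(); err != nil {
			return err
		}
		data, ok := s[ref]
		if !ok {
			continue
		}
		if err := fn(sizedRef{Ref: ref, Size: uint32(len(data))}); err != nil {
			return err
		}
	}
	return nil
}

// statSizes is a small usage helper that collects results into a map.
func statSizes(s store, refs []string) (map[string]uint32, error) {
	sizes := make(map[string]uint32)
	err := s.statBlobs(context.Background(), refs, func(sr sizedRef) error {
		sizes[sr.Ref] = sr.Size
		return nil
	})
	return sizes, err
}

func main() {
	s := store{"ref-1": []byte("hello"), "ref-2": []byte("hi")}
	sizes, err := statSizes(s, []string{"ref-1", "ref-missing", "ref-2"})
	fmt.Println(sizes, err)
}
```

With a callback, the caller no longer has to create, drain, or close a result channel, and an error returned from the callback ends the stat early.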
Brad Fitzpatrick 27bacd3df1 pkg/blob, all: support SHA-224 blobrefs, make them the default
Updates #537

Change-Id: I3966697cbdb05ca4b380974be604deebdaa258c2
2018-01-08 16:34:41 -08:00
Brad Fitzpatrick 57648c6b83 all: update copyright holder from Google Inc to The Perkeep Authors
The AUTHORS file is the list of copyright holders.
2018-01-03 16:52:49 -08:00
Paul Lindner d5b55e51a8 Merge "all: update mongo to latest version" 2018-01-03 21:04:12 +00:00
Paul Lindner 49b1af4a1b all: update mongo to latest version
The mongo integration was using a very old package.  It's using
a new namespace now.  Upgrade and adjust all call points

Removes labix.org/v2/mgo

Introduces gopkg.in/mgo.v2 from branch v2 with revision
  3f83fa5005286a7fe593b055f0d7771a7dce4655

Change-Id: I2784fca941998460f58e0ac8d3d51286401590b5
2018-01-03 10:49:07 -08:00
Brad Fitzpatrick c3d05cdce9 Move more packages out of pkg/ and into internal/
Moved hashutil, httputil, osutil, netutil,
images, media, magic, video, and rollsum.
2018-01-02 21:03:30 -08:00
Brad Fitzpatrick 11e9c5567c Move some packages from perkeep.org/pkg to perkeep.org/internal
Notably: pkg/misc all moves.

And pkg/googlestorage is deleted, since it's not used. Only the
x/net/http2/h2demo code used to use it, but that ended in
https://go-review.googlesource.com/33230 (our vendored code is old).
So just nuke that dir for now. When it's refreshed, it'll either be
gone (dep prune) or new enough to not need googlestorage.

Also move pkg/pools, pkg/leak, and pkg/geocode to internal.

More remains.

Change-Id: I2640c4d18424062fdb8461ba451f1ce26719ae9d
2018-01-01 20:54:48 -08:00
Brad Fitzpatrick d6a0b05df0 Rename import paths from camlistore.org to perkeep.org.
Part of the project renaming, issue #981.

After this, users will need to mv their $GOPATH/src/camlistore.org to
$GOPATH/src/perkeep.org. Sorry.

This doesn't yet rename the tools like camlistored, camput, camget,
camtool, etc.

Also, this only moves the lru package to internal. More will move to
internal later.

Also, this doesn't yet remove the "/pkg/" directory. That'll likely
happen later.

This updates some docs, but not all.

devcam test now passes again, even with Go 1.10 (which requires vet
checks are clean too). So a bunch of vet tests are fixed in this CL
too, and a bunch of other broken tests are now fixed (introduced from
the past week of merging the CL backlog).

Change-Id: If580db1691b5b99f8ed6195070789b1f44877dd4
2018-01-01 16:03:34 -08:00
mpl c366c68a84 pkg/blobpacked: adjust tests for rollsum change
After the rollsum fix in 4723d0f452
landed, the way files were truncated in blobs changed, and hence some
expected hashsums as well.

This CL adjusts such expectations.

Change-Id: I44fc1f5ce1922d7bc99f9a8096ef4b8d212571dc
2018-01-02 00:33:47 +01:00
Brad Fitzpatrick 26e0ff0c96 Merge "blobserver/blobpacked: log reindexing progress" 2017-12-31 02:05:45 +00:00
Markus Peröbner ca3118aa12 pkg/blobserver/remote: adds trusted certs option to remote blobserver
Allows using self-signed certificates with HTTPS endpoints.

Change-Id: I1e15bbf15b89e57c8a8cfaf85d778d912a3cc36e
2017-12-29 14:47:37 -08:00
Filippo Valsorda d388cab373 blobserver/encrypt: implement meta blob packing
Keeping it simple: every time a new meta blob is added, Push it into a
heap; if the heap gets too long, Pop out all the blobs, pack them,
upload the new one and delete the old ones.

The Push and Pop operations are done under Lock; packing, uploading and
deleting are done in a goroutine.

Meta blobs can't get bigger than twice the full size.  Packing happens
on average <max heap length> times before filling a blob because blobs
are added as single-lines.  This means uploading approximately
    <max heap length> * <full size> / 2
bytes of blobs that will be removed for each full size blob.

At start, push all non-full meta blobs, so that we do packing that might
have failed previously.

Change-Id: I1f2fbfc802c1b82dcc87fc0b333c30949229c928
2017-12-29 14:18:57 -08:00
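The Push/Pop scheme from d388cab373 can be sketched with container/heap. This is only an illustration under stated assumptions: the `metaBlob` and `packer` types are invented, the heap is ordered by line count (the commit does not say what the ordering is), and upload and deletion are left to the caller.

```go
package main

import (
	"container/heap"
	"fmt"
	"sync"
)

// metaBlob is a hypothetical meta blob: a ref plus its single-line entries.
type metaBlob struct {
	ref   string
	lines []string
}

// metaHeap orders meta blobs by line count, smallest first (an assumption).
type metaHeap []metaBlob

func (h metaHeap) Len() int            { return len(h) }
func (h metaHeap) Less(i, j int) bool  { return len(h[i].lines) < len(h[j].lines) }
func (h metaHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *metaHeap) Push(x interface{}) { *h = append(*h, x.(metaBlob)) }
func (h *metaHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// packer guards the heap with a lock, as the commit message describes.
type packer struct {
	mu     sync.Mutex
	h      metaHeap
	maxLen int // hypothetical maximum heap length
}

// add pushes b. If the heap is now too long, it pops every blob and returns
// them for packing; otherwise it returns nil.
func (p *packer) add(b metaBlob) []metaBlob {
	p.mu.Lock()
	defer p.mu.Unlock()
	heap.Push(&p.h, b)
	if p.h.Len() <= p.maxLen {
		return nil
	}
	var out []metaBlob
	for p.h.Len() > 0 {
		out = append(out, heap.Pop(&p.h).(metaBlob))
	}
	return out
}

// pack concatenates the lines of the popped blobs into one new meta blob.
// The ref is a placeholder; a real ref would be derived from the content.
func pack(blobs []metaBlob) metaBlob {
	var packed metaBlob
	for _, b := range blobs {
		packed.lines = append(packed.lines, b.lines...)
	}
	packed.ref = fmt.Sprintf("packed-%d-lines", len(packed.lines))
	return packed
}

func main() {
	p := &packer{maxLen: 2}
	for i := 0; i < 3; i++ {
		b := metaBlob{ref: fmt.Sprintf("meta-%d", i), lines: []string{fmt.Sprintf("line %d", i)}}
		if popped := p.add(b); popped != nil {
			// In the commit's design this part runs in a goroutine,
			// followed by uploading the new blob and deleting the old ones.
			fmt.Println("packed into", pack(popped).ref)
		}
	}
}
```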
Filippo Valsorda f9cfd754a2 blobserver/encrypt: rewrite encryption to use NaCl and a simpler meta
NaCl offers authenticated encryption, which means that the blobstore
can't tamper with the data.  Since SHA-1 hashes were checked, one could not
change a blob outright, but could still add new blobs by tampering with the
meta blobs.  It's true that only signed blobs should cause actions
just by being present, but we are already far too deep in the chain of
assumptions, just not to spend a bit of CPU adding a MAC.  The new
scheme is much easier to prove secure.

Also simplified the meta by removing the IV (which is in the encrypted
blob anyway) and the encrypted size (which is plaintext size + overhead).

Finally, added tests (including a storagetest) and tried to make this
sort of production-ready.

Still to do are meta compaction and a way to regenerate the meta from
the blobs, in case of meta corruption (which now we can do securely
thanks to NaCl authentication).

golang.org/x/crypto/nacl/secretbox:
golang.org/x/crypto/poly1305:
golang.org/x/crypto/salsa20/salsa:
golang.org/x/crypto/scrypt:
golang.org/x/crypto/pbkdf2:
	1e61df8d9ea476e2e1504cd9a32b40280c7c6c7e

Change-Id: I095c6204ac093f6292c7943dbb77655d2c51aba6
2017-12-29 14:16:34 -08:00
Govert Versluis 8548962dbe Add Azure blobserver support.
Fixes #425

Change-Id: I02bb29e6503bfef0894cbfde0c2a3304cf70c932
2017-12-29 12:39:49 -08:00
Tamás Gulácsi 4754ab6c4b pkg/blobserver/diskpacked: fail earlier in StreamBlobs
As Miki Habryn suggested at
https://groups.google.com/forum/#!topic/camlistore/WmUyUWMfZx0%5B1-25%5D

Change-Id: Ib910e5bcfa7eb33360f7b5e1085bd9bb1f0e9e6a
2017-12-29 12:19:25 -08:00
Tamás Gulácsi 2329b8038c blobserver/diskpacked: fix missing Close on Seek error
Fixes #667

Change-Id: I11eaa2cc21bfbc825b14cc91208fd8ebc9e7418e
2017-12-29 11:29:09 -08:00
Brad Fitzpatrick 956a0a810b pkg/blobserver/localdisk: simplify code, limit stat concurrency
Don't create an unbounded number of stat goroutines.

Change-Id: Ie66cc9c680bd83e649966258a8e7ef09c8af5c62
2017-12-29 11:22:47 -08:00
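Bounding the number of stat goroutines, as 956a0a810b describes, is often done with a buffered channel used as a semaphore. The sketch below shows that general technique; `statAll`, `maxParallelStats`, and the stat callback are hypothetical and not taken from the localdisk package.

```go
package main

import (
	"fmt"
	"sync"
)

// maxParallelStats is a hypothetical limit on concurrent stat calls.
const maxParallelStats = 4

// statAll runs stat for every ref with at most maxParallelStats calls in
// flight, and returns the collected sizes.
func statAll(refs []string, stat func(ref string) int) map[string]int {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	sem := make(chan struct{}, maxParallelStats)
	sizes := make(map[string]int, len(refs))
	for _, ref := range refs {
		sem <- struct{}{} // blocks while the limit is reached
		wg.Add(1)
		go func(ref string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			n := stat(ref)
			mu.Lock()
			sizes[ref] = n
			mu.Unlock()
		}(ref)
	}
	wg.Wait()
	return sizes
}

func main() {
	refs := []string{"ref-a", "ref-bb", "ref-ccc"}
	// The stat callback here just returns the length of the ref string.
	sizes := statAll(refs, func(ref string) int { return len(ref) })
	fmt.Println(len(sizes), "blobs statted")
}
```

Acquiring the semaphore before starting the goroutine means that no more than `maxParallelStats` goroutines exist at once, rather than many goroutines waiting on a limit.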
Stephen Searles 6b426cb10d improve proxycache and stats blobservers
improving proxycache
- added fuller sample config to the package documentation
- switched the stats caching from sorted.kv to the stats blobserver
- added a cleaning mechanism to evict the least recently used blobs
- implemented StatBlobs to actually inspect the local cache. It still
  always consults the origin, but only for the blobs necessary after
  giving the cache a 50ms headstart.
- logging a few errors that were previously ignored
- added tests modeled after the tests for the localdisk blobstore
- added a method to verify the cache, and call it on initialization
- added a strictStats option to always get stats from the origin
- filling in cacheBytes on initialization

improving stats blobserver
- implemented a few more of the blobserver interfaces, Enumerator and
  Remover
- Fixed a bug(?) in ReceiveBlob that seemed to prevent it from actually
  storing stats
- added a test

minor improvements include:
- blobserver/memory: allowing the memory blobserver to hold actually
  infinite items, if desired
- blobserver: closing dest in the NoImpl blobserver, as required by the
  BlobEnumerator interface
- storagetest: not closing dest leads to deadlock
- lru: max entries of 0 now means infinite (maybe do anything <0?)
- test: a helper function to create a random blob using a global random
  source that is, by default, deterministic, to make test results more
  consistent.

In the future, an improved BlobHub or similar interface could allow a
tighter feedback loop in providing cache consistency. i.e. the cache
could register with backend stores to be notified of content updates,
minimizing the time between backend changes and cache correction.

The proxycache will verify itself at startup, reporting an error if
any of its blobs do not exist in the backend storage or if the backend
storage has a different size for the content than the cache.

Fixes #443

Change-Id: I9ee1efd8c1d0eed49bb82930c2489a64122d3e00
2017-12-29 10:09:15 -08:00