/*
Copyright 2013 The Perkeep Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
	"log"
	"net/url"
	"os"
	"path/filepath"
	"sort"
	"strconv"
	"strings"
	"time"

	"perkeep.org/internal/osutil"
	"perkeep.org/pkg/blob"
	"perkeep.org/pkg/client"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

var errCacheMiss = errors.New("not in cache")

// KvHaveCache is a HaveCache on top of a single mutable LevelDB database
// on disk, managed with github.com/syndtr/goleveldb.
// It stores the blobref in binary as the key, and
// the blob size, as a decimal string, as the value.
// Access to the cache is restricted to one process
// at a time with a lock file. Close should be called
// to remove the lock.
type KvHaveCache struct {
	filename string
	db       *leveldb.DB
}
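
// For example, noting that a 123-byte blob exists stores one entry whose key
// is the blobref's binary encoding (blob.Ref.MarshalBinary) and whose value is
// the bytes of the string "123"; StatBlobCache parses that string back with
// strconv.ParseUint. (The size here is invented, for illustration only.)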

func NewKvHaveCache(gen string) *KvHaveCache {
	cleanCacheDir()
	fullPath := filepath.Join(osutil.CacheDir(), "camput.havecache."+escapeGen(gen)+".leveldb")
	db, err := leveldb.OpenFile(fullPath, nil)
	if err != nil {
		log.Fatalf("Could not create/open new have cache at %v, %v", fullPath, err)
	}
if err := maybeRunCompaction("HaveCache", db); err != nil {
|
|
|
|
log.Fatal(err)
|
|
|
|
}
|
|
|
|
|
2013-08-30 13:21:49 +00:00
|
|
|
return &KvHaveCache{
|
|
|
|
filename: fullPath,
|
|
|
|
db: db,
|
|
|
|
}
|
|
|
|
}
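
// The have cache is consulted by camput's uploader so it can skip asking the
// server about blobs it already knows exist there. A minimal usage sketch
// (illustrative only; the generation string, blob ref, and size are made up):
//
//	hc := NewKvHaveCache("some-server-generation")
//	defer hc.Close()
//	br := blob.MustParse("sha1-0000000000000000000000000000000000000000")
//	if size, ok := hc.StatBlobCache(br); ok {
//		fmt.Printf("server already has %v (%d bytes)\n", br, size)
//	} else {
//		// ... upload the blob, then record it:
//		hc.NoteBlobExists(br, 123)
//	}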

// maybeRunCompaction forces a compaction of db if the number of tables in
// level 0 is >= 4, which is goleveldb's default level-0 compaction trigger.
// Short-lived camput invocations rarely leave the background compaction
// enough time to run (and Close cancels any compaction in progress), so
// without this, level-0 files can pile up until opening them all exceeds the
// process's file descriptor limit. dbname is only used in error messages.
func maybeRunCompaction(dbname string, db *leveldb.DB) error {
	val, err := db.GetProperty("leveldb.num-files-at-level0")
	if err != nil {
		return fmt.Errorf("could not get number of level-0 files of %v's LevelDB: %v", dbname, err)
	}
	nbFiles, err := strconv.Atoi(val)
	if err != nil {
		return fmt.Errorf("could not convert number of level-0 files to int: %v", err)
	}
	// Only force compaction if we're at the default trigger (4), see
	// github.com/syndtr/goleveldb/leveldb/opt.DefaultCompactionL0Trigger
	if nbFiles < 4 {
		return nil
	}
	if err := db.CompactRange(util.Range{Start: nil, Limit: nil}); err != nil {
		return fmt.Errorf("could not run compaction on %v's LevelDB: %v", dbname, err)
	}
	return nil
}
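
// For debugging runaway table counts, goleveldb can also report per-level
// statistics. A possible sketch, not something camput does in normal
// operation:
//
//	if stats, err := db.GetProperty("leveldb.stats"); err == nil {
//		log.Printf("LevelDB stats:\n%s", stats)
//	}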

// Close should be called to commit all the writes
// to the db and to unlock the file.
func (c *KvHaveCache) Close() error {
	return c.db.Close()
}

func (c *KvHaveCache) StatBlobCache(br blob.Ref) (size uint32, ok bool) {
	if !br.Valid() {
		return
	}
	binBr, _ := br.MarshalBinary()
	binVal, err := c.db.Get(binBr, nil)
	if err != nil {
		if err == leveldb.ErrNotFound {
			cachelog.Printf("have cache MISS on %v", br)
			return
		}
		log.Fatalf("Could not query have cache %v for %v: %v", c.filename, br, err)
	}
	val, err := strconv.ParseUint(string(binVal), 10, 32)
	if err != nil {
		log.Fatalf("Could not decode have cache value for %v: %v", br, err)
	}
	cachelog.Printf("have cache HIT on %v", br)
	return uint32(val), true
}

func (c *KvHaveCache) NoteBlobExists(br blob.Ref, size uint32) {
	if !br.Valid() {
		return
	}
	binBr, _ := br.MarshalBinary()
	binVal := []byte(strconv.Itoa(int(size)))
	cachelog.Printf("Adding to have cache %v: %q", br, binVal)
	if err := c.db.Put(binBr, binVal, nil); err != nil {
		log.Fatalf("Could not write %v in have cache: %v", br, err)
	}
}

// KvStatCache is an UploadCache on top of a single mutable LevelDB database
// on disk, managed with github.com/syndtr/goleveldb.
// It stores a binary encoding of the file's path and the -filenodes option
// as the key, and a binary combination of the os.FileInfo fingerprint and
// the client.PutResult of the previous upload as the value.
// Access to the cache is restricted to one process
// at a time with a lock file. Close should be called
// to remove the lock.
type KvStatCache struct {
	filename string
	db       *leveldb.DB
}

func NewKvStatCache(gen string) *KvStatCache {
	fullPath := filepath.Join(osutil.CacheDir(), "camput.statcache."+escapeGen(gen)+".leveldb")
	db, err := leveldb.OpenFile(fullPath, nil)
	if err != nil {
		log.Fatalf("Could not create/open new stat cache at %v, %v", fullPath, err)
	}
if err := maybeRunCompaction("StatCache", db); err != nil {
|
|
|
|
log.Fatal(err)
|
|
|
|
}
|
|
|
|
|
2013-08-30 13:21:49 +00:00
|
|
|
return &KvStatCache{
|
|
|
|
filename: fullPath,
|
|
|
|
db: db,
|
|
|
|
}
|
|
|
|
}
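
// The stat cache lets camput skip re-uploading a file whose os.FileInfo has
// not changed since a previous run. A minimal usage sketch (illustrative
// only; the paths and generation string are made up):
//
//	sc := NewKvStatCache("some-server-generation")
//	defer sc.Close()
//	fi, _ := os.Stat("/home/me/pony.jpg")
//	if pr, err := sc.CachedPutResult("/home/me", "pony.jpg", fi, false); err == nil {
//		// Already uploaded; reuse pr.BlobRef without touching the server.
//		_ = pr
//	} else if err == errCacheMiss {
//		// ... upload the file, obtaining a *client.PutResult pr, then:
//		// sc.AddCachedPutResult("/home/me", "pony.jpg", fi, pr, false)
//	}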

// Close should be called to commit all the writes
// to the db and to unlock the file.
func (c *KvStatCache) Close() error {
	return c.db.Close()
}

func (c *KvStatCache) CachedPutResult(pwd, filename string, fi os.FileInfo, withPermanode bool) (*client.PutResult, error) {
	fullPath := fullpath(pwd, filename)
	cacheKey := &statCacheKey{
		Filepath:  fullPath,
		Permanode: withPermanode,
	}
	binKey, err := cacheKey.marshalBinary()
	if err != nil {
		return nil, fmt.Errorf("Could not marshal stat cache key for %v: %v", fullPath, err)
	}
	binVal, err := c.db.Get(binKey, nil)
	if err != nil {
		if err == leveldb.ErrNotFound {
			cachelog.Printf("stat cache MISS on %q", binKey)
			return nil, errCacheMiss
		}
		log.Fatalf("Could not query stat cache %q for %v: %v", binKey, fullPath, err)
	}
	val := &statCacheValue{}
	if err = val.unmarshalBinary(binVal); err != nil {
		return nil, fmt.Errorf("Bogus stat cached value for %q: %v", binKey, err)
	}
	fp := fileInfoToFingerprint(fi)
	if val.Fingerprint != fp {
		cachelog.Printf("cache MISS on %q: stats not equal:\n%#v\n%#v", binKey, val.Fingerprint, fp)
		return nil, errCacheMiss
	}
	cachelog.Printf("stat cache HIT on %q", binKey)
	return &val.Result, nil
}

func (c *KvStatCache) AddCachedPutResult(pwd, filename string, fi os.FileInfo, pr *client.PutResult, withPermanode bool) {
	fullPath := fullpath(pwd, filename)
	cacheKey := &statCacheKey{
		Filepath:  fullPath,
		Permanode: withPermanode,
	}
	val := &statCacheValue{fileInfoToFingerprint(fi), *pr}

	binKey, err := cacheKey.marshalBinary()
	if err != nil {
		log.Fatalf("Could not add %q to stat cache: %v", binKey, err)
	}
	binVal, err := val.marshalBinary()
	if err != nil {
		log.Fatalf("Could not add %q to stat cache: %v", binKey, err)
	}
	cachelog.Printf("Adding to stat cache %q: %q", binKey, binVal)
	if err := c.db.Put(binKey, binVal, nil); err != nil {
		log.Fatalf("Could not add %q to stat cache: %v", binKey, err)
	}
}

type statCacheKey struct {
	Filepath  string
	Permanode bool // whether -filenodes is being used.
}

// marshalBinary returns a more compact binary
// representation of the contents of sk.
func (sk *statCacheKey) marshalBinary() ([]byte, error) {
	if sk == nil {
		return nil, errors.New("Can not marshal from a nil stat cache key")
	}
	data := make([]byte, 0, len(sk.Filepath)+3)
	data = append(data, 1) // version number
	data = append(data, sk.Filepath...)
	data = append(data, '|')
	if sk.Permanode {
		data = append(data, 1)
	}
	return data, nil
}
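
// For example, the key for {Filepath: "/home/me/pony.jpg", Permanode: true}
// is laid out as (sketching the bytes, not an actual cache entry):
//
//	0x01 '/home/me/pony.jpg' '|' 0x01
//
// i.e. a version byte, the raw path bytes, a '|' separator, and a trailing
// 0x01 only when a permanode (-filenodes) is requested.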

type statFingerprint string

type statCacheValue struct {
	Fingerprint statFingerprint
	Result      client.PutResult
}

// marshalBinary returns a more compact binary
// representation of the contents of scv.
func (scv *statCacheValue) marshalBinary() ([]byte, error) {
	if scv == nil {
		return nil, errors.New("Can not marshal from a nil stat cache value")
	}
	binBr, _ := scv.Result.BlobRef.MarshalBinary()
	// Blob size fits in 4 bytes when binary encoded.
	data := make([]byte, 0, len(scv.Fingerprint)+1+4+1+len(binBr))
	buf := bytes.NewBuffer(data)
	_, err := buf.WriteString(string(scv.Fingerprint))
	if err != nil {
		return nil, fmt.Errorf("Could not write fingerprint %v: %v", scv.Fingerprint, err)
	}
	err = buf.WriteByte('|')
	if err != nil {
		return nil, fmt.Errorf("Could not write '|': %v", err)
	}
	err = binary.Write(buf, binary.BigEndian, int32(scv.Result.Size))
	if err != nil {
		return nil, fmt.Errorf("Could not write blob size %d: %v", scv.Result.Size, err)
	}
	err = buf.WriteByte('|')
	if err != nil {
		return nil, fmt.Errorf("Could not write '|': %v", err)
	}
	_, err = buf.Write(binBr)
	if err != nil {
		return nil, fmt.Errorf("Could not write binary blobref %q: %v", binBr, err)
	}
	return buf.Bytes(), nil
}
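
// The resulting value layout is, schematically:
//
//	<fingerprint> '|' <4-byte big-endian size> '|' <binary blobref>
//
// e.g. "123B/1410000000000000000MOD/sys-305419896" '|' 0x0000007b '|' <binary ref>
// (made-up numbers, shown only to illustrate the three '|'-separated parts
// that unmarshalBinary splits on below).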

var pipe = []byte("|")

func (scv *statCacheValue) unmarshalBinary(data []byte) error {
	if scv == nil {
		return errors.New("Can't unmarshalBinary into a nil stat cache value")
	}
	if scv.Fingerprint != "" {
		return errors.New("Can't unmarshalBinary into a non empty stat cache value")
	}

	parts := bytes.SplitN(data, pipe, 3)
	if len(parts) != 3 {
		return fmt.Errorf("Bogus stat cache value; was expecting fingerprint|blobSize|blobRef, got %q", data)
	}
	fingerprint := string(parts[0])
	buf := bytes.NewReader(parts[1])
	var size int32
	err := binary.Read(buf, binary.BigEndian, &size)
	if err != nil {
		return fmt.Errorf("Could not decode blob size from stat cache value part %q: %v", parts[1], err)
	}
	br := new(blob.Ref)
	if err := br.UnmarshalBinary(parts[2]); err != nil {
		return fmt.Errorf("Could not unmarshalBinary for %q: %v", parts[2], err)
	}

	scv.Fingerprint = statFingerprint(fingerprint)
	scv.Result = client.PutResult{
		BlobRef: *br,
		Size:    uint32(size),
		Skipped: true,
	}
	return nil
}

func fullpath(pwd, filename string) string {
	var fullPath string
	if filepath.IsAbs(filename) {
		fullPath = filepath.Clean(filename)
	} else {
		fullPath = filepath.Join(pwd, filename)
	}
	return fullPath
}

func escapeGen(gen string) string {
	// Good enough:
	return url.QueryEscape(gen)
}

var cleanSysStat func(v interface{}) interface{}

func fileInfoToFingerprint(fi os.FileInfo) statFingerprint {
	// We calculate the CRC32 of the underlying system stat structure to get
	// ctime, owner, group, etc. This is overkill (e.g. we don't care about
	// the inode or device number probably), but works.
	sysHash := uint32(0)
	if sys := fi.Sys(); sys != nil {
		if clean := cleanSysStat; clean != nil {
			// TODO: don't clean bad fields, but provide a
			// portable way to extract all good fields.
			// This is a Linux+Mac-specific hack for now.
			sys = clean(sys)
		}
		c32 := crc32.NewIEEE()
		fmt.Fprintf(c32, "%#v", sys)
		sysHash = c32.Sum32()
	}
	return statFingerprint(fmt.Sprintf("%dB/%dMOD/sys-%d", fi.Size(), fi.ModTime().UnixNano(), sysHash))
}
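
// A fingerprint therefore looks like "4096B/1515193305000000000MOD/sys-305419896"
// (size in bytes, mtime in nanoseconds, CRC32 of the stat structure; the values
// here are invented for illustration). Any change to the file's size, mtime, or
// hashed stat fields changes the fingerprint, which CachedPutResult treats as a
// cache miss.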

// cleanCacheDir deletes old havecache/statcache LevelDB directories from the
// cache dir, keeping the 5 most recently modified ones of each kind and only
// removing directories that are more than 30 days old.
func cleanCacheDir() {
	dir := osutil.CacheDir()
	f, err := os.Open(dir)
	if err != nil {
		return
	}
	defer f.Close()
	fis, err := f.Readdir(-1)
	if err != nil {
		return
	}
	var haveCache, statCache []os.FileInfo
	seen := make(map[string]bool)
	for _, fi := range fis {
		seen[fi.Name()] = true
	}

	for _, fi := range fis {
		if strings.HasPrefix(fi.Name(), "camput.havecache.") {
			haveCache = append(haveCache, fi)
			continue
		}
		if strings.HasPrefix(fi.Name(), "camput.statcache.") {
			statCache = append(statCache, fi)
			continue
		}
	}
	for _, list := range [][]os.FileInfo{haveCache, statCache} {
		if len(list) <= 5 {
			continue
		}
		sort.Sort(byModtime(list))
		list = list[:len(list)-5]
		for _, fi := range list {
			if fi.ModTime().Before(time.Now().Add(-30 * 24 * time.Hour)) {
				os.RemoveAll(filepath.Join(dir, fi.Name()))
			}
		}
	}
}

type byModtime []os.FileInfo

func (s byModtime) Len() int           { return len(s) }
func (s byModtime) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }
func (s byModtime) Less(i, j int) bool { return s[i].ModTime().Before(s[j].ModTime()) }