cmd/camput: compact LevelDB on HaveCache setup

This CL is about LevelDB as the HaveCache for camput, and there are
several aspects to it. To describe it, I'll take the example where you
want to add many permanodes (~33k) to a given set with camput.
Something like:

for _, blob := range blobs {
	do("camput attr -add sha1-foobar camliMember " + blob)
}

In a "normal" LevelDB use case, every time the number of level-0 .ldb
files reaches 4 (by default), a background compaction task is
started to turn these SSTs into level-1 ones and remove the level-0
ones.
However, since our particular camput call is very short-lived
(especially on a local Perkeep), not only might there not be enough time
for the compaction to be triggered, but even if it is, any ongoing
compaction is cancelled when the DB is flushed (on a Close call). This
makes level-0 compactions very unlikely to happen on short-lived camput
calls. As a result, the number of level-0 files keeps growing until
LevelDB fails while trying to open them all, because it hits the
ulimit of the current process.

Now, in this CL, what we propose is to force a compaction as soon as
the HaveCache is opened, whenever the number of level-0 files has
reached that default trigger (4). The compaction runs synchronously, so
we are sure that it completes before the DB actually gets used by
camput. This seems to keep the number of level-0 tables from growing
too much. With this change, I was able to run the above example on 33k
blobs without hitting the ulimit error.
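
The decision itself can be sketched on its own, without the LevelDB
dependency. This is only an illustration: needsCompaction and
defaultL0Trigger are hypothetical names, not part of this CL. In the
CL, the string comes from
db.GetProperty("leveldb.num-files-at-level0") and a true result leads
to a db.CompactRange call over the whole key range.

```go
package main

import (
	"fmt"
	"strconv"
)

// defaultL0Trigger mirrors the default level-0 compaction trigger (4)
// mentioned above. The constant name is an assumption for this sketch.
const defaultL0Trigger = 4

// needsCompaction is a hypothetical helper. It parses the value of the
// "leveldb.num-files-at-level0" property and reports whether the number
// of level-0 files has reached the trigger.
func needsCompaction(numL0Files string, trigger int) (bool, error) {
	n, err := strconv.Atoi(numL0Files)
	if err != nil {
		return false, fmt.Errorf("could not convert number of level-0 files to int: %v", err)
	}
	return n >= trigger, nil
}

func main() {
	for _, v := range []string{"0", "3", "4", "17"} {
		ok, err := needsCompaction(v, defaultL0Trigger)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("level-0 files: %s, force compaction: %v\n", v, ok)
	}
}
```

Checking the count first keeps the common case cheap: a cache with
fewer than 4 level-0 files is opened without any extra work.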

However, it should be noted that potential problems might remain. The
compaction for levels above 0 is triggered based only on the total size
of the level (e.g. at 100MB by default for level-1), and not on the
number of files. Since we're creating many tiny tables (basically 1
entry per table), the number of files grows very fast while the total
size does not, and the compaction does not get triggered, even if forced
with CompactRange. This does not seem to be a problem for our use case,
as levelDB does not seem to need to open many of the level-1 files at
the same time, so we're not hitting the ulimit problem because of that.

If needed, there's at least one way this problem (if it is one?) could
be fixed: make the compaction trigger on other conditions, such as the
number of files per level. I've experimented with it (forcing the
level-1 compaction to trigger at a 100-file limit), and it seems to
work. But I had to change the goleveldb code itself, and I don't think
LevelDB implementations are supposed to do that.

For information, at the end of the run on the 33K blobs:
$ du -sch *.ldb
...
83M	total
$ ll | wc -l
20988

And indeed, when asking for the leveldb.stats property of the DB:
 Level |   Tables   |    Size(MB)   |
-------+------------+---------------+
   0   |          1 |       0.00015 |
   1   |      20981 |       3.47307 |
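
Plugging the level-1 figures above into the two kinds of trigger
discussed earlier shows why the size-based one never fires here. This
is a rough sketch, not goleveldb code: levelStats, sizeTriggered and
countTriggered are hypothetical names, the 100-file limit is the one
from my experiment, and MB is assumed to mean 2^20 bytes.

```go
package main

import "fmt"

// mib is the number of bytes in one MB, assuming MB means 2^20 bytes.
const mib = 1 << 20

// levelStats holds the two per-level figures shown in the stats table.
type levelStats struct {
	tables int
	sizeMB float64
}

// sizeTriggered reports whether a purely size-based trigger would fire.
func sizeTriggered(s levelStats, limitMB float64) bool {
	return s.sizeMB >= limitMB
}

// countTriggered reports whether a hypothetical trigger based on the
// number of files in the level would fire.
func countTriggered(s levelStats, maxTables int) bool {
	return s.tables >= maxTables
}

// avgTableBytes returns the average size of one table in bytes.
func avgTableBytes(s levelStats) float64 {
	if s.tables == 0 {
		return 0
	}
	return s.sizeMB * mib / float64(s.tables)
}

func main() {
	level1 := levelStats{tables: 20981, sizeMB: 3.47307}
	fmt.Printf("average table size: %.0f bytes\n", avgTableBytes(level1))
	fmt.Println("size trigger (100MB):", sizeTriggered(level1, 100))
	fmt.Println("count trigger (100 files):", countTriggered(level1, 100))
}
```

With tables this small, level 1 would need several hundred thousand
files before reaching 100MB, which is why only a count-based condition
would kick in.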

Also, update github.com/syndtr/goleveldb to
34011bf325bce385408353a30b101fe5e923eb6e, and remove
github.com/syndtr/gosnappy, as goleveldb does not use it anymore.

Also apply this change to StatCache.

Fixes #1008

Change-Id: If9f790a003e67f3c075881470e52e5f2174afa73
mpl 2018-01-12 20:39:44 +01:00
parent b88b82f1ee
commit da9020ec71
77 changed files with 4745 additions and 4768 deletions

Gopkg.lock (generated)

@ -3,26 +3,12 @@
[[projects]]
name = "bazil.org/fuse"
packages = [
".",
"fs",
"fuseutil",
"syscallx"
]
packages = [".","fs","fuseutil","syscallx"]
revision = "371fbbdaa8987b715bdd21d6adc4c9b20155f748"
[[projects]]
name = "cloud.google.com/go"
packages = [
"compute/metadata",
"datastore",
"internal",
"internal/bundler",
"logging",
"logging/apiv2",
"logging/internal",
"storage"
]
packages = ["compute/metadata","datastore","internal","internal/bundler","logging","logging/apiv2","logging/internal","storage"]
revision = "b70ccc799b9d019708c3eb9395acef6e3f6b7bc8"
[[projects]]
@ -43,11 +29,7 @@
[[projects]]
name = "github.com/cznic/internal"
packages = [
"buffer",
"file",
"slice"
]
packages = ["buffer","file","slice"]
revision = "4747030f7cf2f4c0a01512b00cd68734b167ac3b"
[[projects]]
@ -104,16 +86,7 @@
[[projects]]
branch = "master"
name = "github.com/golang/protobuf"
packages = [
"proto",
"ptypes",
"ptypes/any",
"ptypes/duration",
"ptypes/empty",
"ptypes/struct",
"ptypes/timestamp",
"ptypes/wrappers"
]
packages = ["proto","ptypes","ptypes/any","ptypes/duration","ptypes/empty","ptypes/struct","ptypes/timestamp","ptypes/wrappers"]
revision = "1e59b77b52bf8e4b449a57e6f79f21226d571845"
[[projects]]
@ -128,21 +101,7 @@
[[projects]]
name = "github.com/gopherjs/gopherjs"
packages = [
".",
"build",
"compiler",
"compiler/analysis",
"compiler/astutil",
"compiler/filter",
"compiler/natives",
"compiler/prelude",
"compiler/typesutil",
"internal/sysutil",
"js",
"nosync",
"tests/otherpkg"
]
packages = [".","build","compiler","compiler/analysis","compiler/astutil","compiler/filter","compiler/natives","compiler/prelude","compiler/typesutil","internal/sysutil","js","nosync","tests/otherpkg"]
revision = "b40cd48c38f9a18eb3db20d163bad78de12cf0b7"
[[projects]]
@ -164,10 +123,7 @@
[[projects]]
name = "github.com/hjfreyer/taglib-go"
packages = [
"taglib",
"taglib/id3"
]
packages = ["taglib","taglib/id3"]
revision = "0ef8bba9c41b66c12f60ce9833786838d2c2d3d8"
[[projects]]
@ -188,10 +144,7 @@
[[projects]]
name = "github.com/lib/pq"
packages = [
".",
"oid"
]
packages = [".","oid"]
revision = "9afcd9aa793101bd0536da34e74ae0123345bab1"
[[projects]]
@ -244,10 +197,7 @@
[[projects]]
name = "github.com/rwcarlsen/goexif"
packages = [
"exif",
"tiff"
]
packages = ["exif","tiff"]
revision = "709fab3d192d7c62f86043caff1e7e3fb0f42bd8"
[[projects]]
@ -279,26 +229,8 @@
[[projects]]
name = "github.com/syndtr/goleveldb"
packages = [
"leveldb",
"leveldb/cache",
"leveldb/comparer",
"leveldb/errors",
"leveldb/filter",
"leveldb/iterator",
"leveldb/journal",
"leveldb/memdb",
"leveldb/opt",
"leveldb/storage",
"leveldb/table",
"leveldb/util"
]
revision = "4875955338b0a434238a31165cb87255ab6e9e4a"
[[projects]]
name = "github.com/syndtr/gosnappy"
packages = ["snappy"]
revision = "156a073208e131d7d2e212cb749feae7c339e846"
packages = ["leveldb","leveldb/cache","leveldb/comparer","leveldb/errors","leveldb/filter","leveldb/iterator","leveldb/journal","leveldb/memdb","leveldb/opt","leveldb/storage","leveldb/table","leveldb/util"]
revision = "34011bf325bce385408353a30b101fe5e923eb6e"
[[projects]]
name = "github.com/tgulacsi/picago"
@ -307,87 +239,27 @@
[[projects]]
name = "go4.org"
packages = [
"cloud/cloudlaunch",
"cloud/google/gceutil",
"cloud/google/gcsutil",
"ctxutil",
"errorutil",
"fault",
"jsonconfig",
"legal",
"lock",
"net/throttle",
"oauthutil",
"readerutil",
"strutil",
"syncutil",
"syncutil/singleflight",
"types",
"wkfs",
"wkfs/gcs",
"writerutil"
]
packages = ["cloud/cloudlaunch","cloud/google/gceutil","cloud/google/gcsutil","ctxutil","errorutil","fault","jsonconfig","legal","lock","net/throttle","oauthutil","readerutil","strutil","syncutil","syncutil/singleflight","types","wkfs","wkfs/gcs","writerutil"]
revision = "c3a8ba339e20006b054736f8eb9fc5e1d5fa6eab"
[[projects]]
name = "golang.org/x/crypto"
packages = [
"acme",
"acme/autocert",
"cast5",
"nacl/secretbox",
"openpgp",
"openpgp/armor",
"openpgp/elgamal",
"openpgp/errors",
"openpgp/packet",
"openpgp/s2k",
"pbkdf2",
"poly1305",
"salsa20/salsa",
"scrypt",
"ssh/terminal"
]
packages = ["acme","acme/autocert","cast5","nacl/secretbox","openpgp","openpgp/armor","openpgp/elgamal","openpgp/errors","openpgp/packet","openpgp/s2k","pbkdf2","poly1305","salsa20/salsa","scrypt","ssh/terminal"]
revision = "ede567c8e044a5913dad1d1af3696d9da953104c"
[[projects]]
name = "golang.org/x/image"
packages = [
"draw",
"math/f64",
"tiff",
"tiff/lzw"
]
packages = ["draw","math/f64","tiff","tiff/lzw"]
revision = "12117c17ca67ffa1ce22e9409f3b0b0a93ac08c7"
[[projects]]
name = "golang.org/x/net"
packages = [
"context",
"context/ctxhttp",
"html",
"html/atom",
"html/charset",
"http2",
"http2/hpack",
"idna",
"internal/timeseries",
"lex/httplex",
"trace",
"xsrftoken"
]
packages = ["context","context/ctxhttp","html","html/atom","html/charset","http2","http2/hpack","idna","internal/timeseries","lex/httplex","trace","xsrftoken"]
revision = "d866cfc389cec985d6fda2859936a575a55a3ab6"
[[projects]]
name = "golang.org/x/oauth2"
packages = [
".",
"google",
"internal",
"jws",
"jwt"
]
packages = [".","google","internal","jws","jwt"]
revision = "197281d4e0ecd78c33865daf9c6e51626feefcb2"
[[projects]]
@ -402,35 +274,7 @@
[[projects]]
name = "golang.org/x/text"
packages = [
".",
"collate",
"collate/build",
"encoding",
"encoding/charmap",
"encoding/htmlindex",
"encoding/internal",
"encoding/internal/identifier",
"encoding/japanese",
"encoding/korean",
"encoding/simplifiedchinese",
"encoding/traditionalchinese",
"encoding/unicode",
"internal/colltab",
"internal/gen",
"internal/tag",
"internal/triegen",
"internal/ucd",
"internal/utf8internal",
"language",
"runes",
"secure/bidirule",
"transform",
"unicode/bidi",
"unicode/cldr",
"unicode/norm",
"unicode/rangetable"
]
packages = [".","collate","collate/build","encoding","encoding/charmap","encoding/htmlindex","encoding/internal","encoding/internal/identifier","encoding/japanese","encoding/korean","encoding/simplifiedchinese","encoding/traditionalchinese","encoding/unicode","internal/colltab","internal/gen","internal/tag","internal/triegen","internal/ucd","internal/utf8internal","language","runes","secure/bidirule","transform","unicode/bidi","unicode/cldr","unicode/norm","unicode/rangetable"]
revision = "88f656faf3f37f690df1a32515b479415e1a6769"
[[projects]]
@ -440,99 +284,35 @@
[[projects]]
name = "golang.org/x/tools"
packages = [
"go/buildutil",
"go/gcimporter15",
"go/types/typeutil",
"refactor/importgraph"
]
packages = ["go/buildutil","go/gcimporter15","go/types/typeutil","refactor/importgraph"]
revision = "e531a2a1c15f94033f6fa87666caeb19a688175f"
[[projects]]
name = "google.golang.org/api"
packages = [
"cloudresourcemanager/v1",
"compute/v1",
"drive/v2",
"drive/v3",
"gensupport",
"googleapi",
"googleapi/internal/uritemplates",
"googleapi/transport",
"internal",
"iterator",
"option",
"servicemanagement/v1",
"sqladmin/v1beta3",
"storage/v1",
"transport"
]
packages = ["cloudresourcemanager/v1","compute/v1","drive/v2","drive/v3","gensupport","googleapi","googleapi/internal/uritemplates","googleapi/transport","internal","iterator","option","servicemanagement/v1","sqladmin/v1beta3","storage/v1","transport"]
revision = "48e49d1645e228d1c50c3d54fb476b2224477303"
[[projects]]
name = "google.golang.org/appengine"
packages = [
".",
"internal",
"internal/app_identity",
"internal/base",
"internal/datastore",
"internal/log",
"internal/modules",
"internal/remote_api",
"internal/socket",
"internal/urlfetch",
"socket",
"urlfetch"
]
packages = [".","internal","internal/app_identity","internal/base","internal/datastore","internal/log","internal/modules","internal/remote_api","internal/socket","internal/urlfetch","socket","urlfetch"]
revision = "150dc57a1b433e64154302bdc40b6bb8aefa313a"
version = "v1.0.0"
[[projects]]
name = "google.golang.org/genproto"
packages = [
"googleapis/api/label",
"googleapis/api/metric",
"googleapis/api/monitoredres",
"googleapis/api/serviceconfig",
"googleapis/datastore/v1",
"googleapis/logging/type",
"googleapis/logging/v2",
"googleapis/rpc/status",
"googleapis/type/latlng",
"protobuf"
]
packages = ["googleapis/api/label","googleapis/api/metric","googleapis/api/monitoredres","googleapis/api/serviceconfig","googleapis/datastore/v1","googleapis/logging/type","googleapis/logging/v2","googleapis/rpc/status","googleapis/type/latlng","protobuf"]
revision = "08f135d1a31b6ba454287638a3ce23a55adace6f"
[[projects]]
name = "google.golang.org/grpc"
packages = [
".",
"codes",
"credentials",
"credentials/oauth",
"grpclog",
"internal",
"metadata",
"naming",
"peer",
"stats",
"tap",
"transport"
]
packages = [".","codes","credentials","credentials/oauth","grpclog","internal","metadata","naming","peer","stats","tap","transport"]
revision = "188a132adcfba339f1f2d5da52498451341f9ee8"
source = "https://github.com/bradfitz/grpc-go.git"
[[projects]]
branch = "v2"
name = "gopkg.in/mgo.v2"
packages = [
".",
"bson",
"internal/json",
"internal/sasl",
"internal/scram"
]
packages = [".","bson","internal/json","internal/sasl","internal/scram"]
revision = "3f83fa5005286a7fe593b055f0d7771a7dce4655"
[[projects]]
@ -547,15 +327,7 @@
[[projects]]
name = "myitcv.io/react"
packages = [
".",
"cmd/reactGen",
"internal/bundle",
"internal/core",
"internal/dev",
"internal/preact",
"internal/prod"
]
packages = [".","cmd/reactGen","internal/bundle","internal/core","internal/dev","internal/preact","internal/prod"]
revision = "bca7c66b77ed8a5b86fb77cff70914c4a7cc3ce5"
[[projects]]
@ -565,16 +337,12 @@
[[projects]]
name = "rsc.io/qr"
packages = [
".",
"coding",
"gf256"
]
packages = [".","coding","gf256"]
revision = "48b2ede4844e13f1a2b7ce4d2529c9af7e359fc5"
[solve-meta]
analyzer-name = "dep"
analyzer-version = 1
inputs-digest = "0922eed8dece3458922d996935e4ca92abcc96da00d8ec0295ee44815079a0fc"
inputs-digest = "58aeea7aea438a25e7b53a5ba4847f3f28c99cedf3bdfad4e7b2dca110da839d"
solver-name = "gps-cdcl"
solver-version = 1


@ -25,7 +25,6 @@ required = [
"google.golang.org/grpc", # fork by bradfitz
"golang.org/x/text",
"github.com/neelance/sourcemap",
"github.com/syndtr/gosnappy/snappy",
"golang.org/x/sys/unix",
"golang.org/x/tools/go/types/typeutil", # for gopherjs
"golang.org/x/tools/go/gcimporter15", # for gopherjs
@ -195,11 +194,7 @@ required = [
[[constraint]]
name = "github.com/syndtr/goleveldb"
revision = "4875955338b0a434238a31165cb87255ab6e9e4a"
[[constraint]]
name = "github.com/syndtr/gosnappy"
revision = "156a073208e131d7d2e212cb749feae7c339e846"
revision = "34011bf325bce385408353a30b101fe5e923eb6e"
[[constraint]]
name = "github.com/tgulacsi/picago"


@ -36,6 +36,7 @@ import (
"perkeep.org/pkg/client"
"github.com/syndtr/goleveldb/leveldb"
"github.com/syndtr/goleveldb/leveldb/util"
)
var errCacheMiss = errors.New("not in cache")
@ -59,12 +60,39 @@ func NewKvHaveCache(gen string) *KvHaveCache {
if err != nil {
log.Fatalf("Could not create/open new have cache at %v, %v", fullPath, err)
}
if err := maybeRunCompaction("HaveCache", db); err != nil {
log.Fatal(err)
}
return &KvHaveCache{
filename: fullPath,
db: db,
}
}
// maybeRunCompaction forces compaction of db, if the number of
// tables in level 0 is >= 4. dbname should be provided for error messages.
func maybeRunCompaction(dbname string, db *leveldb.DB) error {
val, err := db.GetProperty("leveldb.num-files-at-level0")
if err != nil {
return fmt.Errorf("could not get number of level-0 files of %v's LevelDB: %v", dbname, err)
}
nbFiles, err := strconv.Atoi(val)
if err != nil {
return fmt.Errorf("could not convert number of level-0 files to int: %v", err)
}
// Only force compaction if we're at the default trigger (4), see
// github.com/syndtr/goleveldb/leveldb/opt.DefaultCompactionL0Trigger
if nbFiles < 4 {
return nil
}
if err := db.CompactRange(util.Range{nil, nil}); err != nil {
return fmt.Errorf("could not run compaction on %v's LevelDB: %v", dbname, err)
}
return nil
}
// Close should be called to commit all the writes
// to the db and to unlock the file.
func (c *KvHaveCache) Close() error {
@ -129,6 +157,11 @@ func NewKvStatCache(gen string) *KvStatCache {
if err != nil {
log.Fatalf("Could not create/open new stat cache at %v, %v", fullPath, err)
}
if err := maybeRunCompaction("StatCache", db); err != nil {
log.Fatal(err)
}
return &KvStatCache{
filename: fullPath,
db: db,


@ -10,94 +10,97 @@ Installation
Requirements
-----------
* Need at least `go1.2` or newer.
* Need at least `go1.5` or newer.
Usage
-----------
Create or open a database:
db, err := leveldb.OpenFile("path/to/db", nil)
...
defer db.Close()
...
```go
// The returned DB instance is safe for concurrent use. Which mean that all
// DB's methods may be called concurrently from multiple goroutine.
db, err := leveldb.OpenFile("path/to/db", nil)
...
defer db.Close()
...
```
Read or modify the database content:
// Remember that the contents of the returned slice should not be modified.
data, err := db.Get([]byte("key"), nil)
...
err = db.Put([]byte("key"), []byte("value"), nil)
...
err = db.Delete([]byte("key"), nil)
...
```go
// Remember that the contents of the returned slice should not be modified.
data, err := db.Get([]byte("key"), nil)
...
err = db.Put([]byte("key"), []byte("value"), nil)
...
err = db.Delete([]byte("key"), nil)
...
```
Iterate over database content:
iter := db.NewIterator(nil, nil)
for iter.Next() {
// Remember that the contents of the returned slice should not be modified, and
// only valid until the next call to Next.
key := iter.Key()
value := iter.Value()
...
}
iter.Release()
err = iter.Error()
```go
iter := db.NewIterator(nil, nil)
for iter.Next() {
// Remember that the contents of the returned slice should not be modified, and
// only valid until the next call to Next.
key := iter.Key()
value := iter.Value()
...
}
iter.Release()
err = iter.Error()
...
```
Seek-then-Iterate:
iter := db.NewIterator(nil, nil)
for ok := iter.Seek(key); ok; ok = iter.Next() {
// Use key/value.
...
}
iter.Release()
err = iter.Error()
```go
iter := db.NewIterator(nil, nil)
for ok := iter.Seek(key); ok; ok = iter.Next() {
// Use key/value.
...
}
iter.Release()
err = iter.Error()
...
```
Iterate over subset of database content:
iter := db.NewIterator(&util.Range{Start: []byte("foo"), Limit: []byte("xoo")}, nil)
for iter.Next() {
// Use key/value.
...
}
iter.Release()
err = iter.Error()
```go
iter := db.NewIterator(&util.Range{Start: []byte("foo"), Limit: []byte("xoo")}, nil)
for iter.Next() {
// Use key/value.
...
}
iter.Release()
err = iter.Error()
...
```
Iterate over subset of database content with a particular prefix:
iter := db.NewIterator(util.BytesPrefix([]byte("foo-")), nil)
for iter.Next() {
// Use key/value.
...
}
iter.Release()
err = iter.Error()
```go
iter := db.NewIterator(util.BytesPrefix([]byte("foo-")), nil)
for iter.Next() {
// Use key/value.
...
}
iter.Release()
err = iter.Error()
...
```
Batch writes:
batch := new(leveldb.Batch)
batch.Put([]byte("foo"), []byte("value"))
batch.Put([]byte("bar"), []byte("another value"))
batch.Delete([]byte("baz"))
err = db.Write(batch, nil)
...
```go
batch := new(leveldb.Batch)
batch.Put([]byte("foo"), []byte("value"))
batch.Put([]byte("bar"), []byte("another value"))
batch.Delete([]byte("baz"))
err = db.Write(batch, nil)
...
```
Use bloom filter:
o := &opt.Options{
Filter: filter.NewBloomFilter(10),
}
db, err := leveldb.OpenFile("path/to/db", o)
...
defer db.Close()
...
```go
o := &opt.Options{
Filter: filter.NewBloomFilter(10),
}
db, err := leveldb.OpenFile("path/to/db", o)
...
defer db.Close()
...
```
Documentation
-----------


@ -9,11 +9,15 @@ package leveldb
import (
"encoding/binary"
"fmt"
"io"
"github.com/syndtr/goleveldb/leveldb/errors"
"github.com/syndtr/goleveldb/leveldb/memdb"
"github.com/syndtr/goleveldb/leveldb/storage"
)
// ErrBatchCorrupted records reason of batch corruption. This error will be
// wrapped with errors.ErrCorrupted.
type ErrBatchCorrupted struct {
Reason string
}
@ -23,84 +27,102 @@ func (e *ErrBatchCorrupted) Error() string {
}
func newErrBatchCorrupted(reason string) error {
return errors.NewErrCorrupted(nil, &ErrBatchCorrupted{reason})
return errors.NewErrCorrupted(storage.FileDesc{}, &ErrBatchCorrupted{reason})
}
const (
batchHdrLen = 8 + 4
batchGrowRec = 3000
batchHeaderLen = 8 + 4
batchGrowRec = 3000
batchBufioSize = 16
)
// BatchReplay wraps basic batch operations.
type BatchReplay interface {
Put(key, value []byte)
Delete(key []byte)
}
type batchIndex struct {
keyType keyType
keyPos, keyLen int
valuePos, valueLen int
}
func (index batchIndex) k(data []byte) []byte {
return data[index.keyPos : index.keyPos+index.keyLen]
}
func (index batchIndex) v(data []byte) []byte {
if index.valueLen != 0 {
return data[index.valuePos : index.valuePos+index.valueLen]
}
return nil
}
func (index batchIndex) kv(data []byte) (key, value []byte) {
return index.k(data), index.v(data)
}
// Batch is a write batch.
type Batch struct {
data []byte
rLen, bLen int
seq uint64
sync bool
data []byte
index []batchIndex
// internalLen is sums of key/value pair length plus 8-bytes internal key.
internalLen int
}
func (b *Batch) grow(n int) {
off := len(b.data)
if off == 0 {
off = batchHdrLen
if b.data != nil {
b.data = b.data[:off]
}
}
if cap(b.data)-off < n {
if b.data == nil {
b.data = make([]byte, off, off+n)
} else {
odata := b.data
div := 1
if b.rLen > batchGrowRec {
div = b.rLen / batchGrowRec
}
b.data = make([]byte, off, off+n+(off-batchHdrLen)/div)
copy(b.data, odata)
o := len(b.data)
if cap(b.data)-o < n {
div := 1
if len(b.index) > batchGrowRec {
div = len(b.index) / batchGrowRec
}
ndata := make([]byte, o, o+n+o/div)
copy(ndata, b.data)
b.data = ndata
}
}
func (b *Batch) appendRec(kt kType, key, value []byte) {
func (b *Batch) appendRec(kt keyType, key, value []byte) {
n := 1 + binary.MaxVarintLen32 + len(key)
if kt == ktVal {
if kt == keyTypeVal {
n += binary.MaxVarintLen32 + len(value)
}
b.grow(n)
off := len(b.data)
data := b.data[:off+n]
data[off] = byte(kt)
off += 1
off += binary.PutUvarint(data[off:], uint64(len(key)))
copy(data[off:], key)
off += len(key)
if kt == ktVal {
off += binary.PutUvarint(data[off:], uint64(len(value)))
copy(data[off:], value)
off += len(value)
index := batchIndex{keyType: kt}
o := len(b.data)
data := b.data[:o+n]
data[o] = byte(kt)
o++
o += binary.PutUvarint(data[o:], uint64(len(key)))
index.keyPos = o
index.keyLen = len(key)
o += copy(data[o:], key)
if kt == keyTypeVal {
o += binary.PutUvarint(data[o:], uint64(len(value)))
index.valuePos = o
index.valueLen = len(value)
o += copy(data[o:], value)
}
b.data = data[:off]
b.rLen++
// Include 8-byte ikey header
b.bLen += len(key) + len(value) + 8
b.data = data[:o]
b.index = append(b.index, index)
b.internalLen += index.keyLen + index.valueLen + 8
}
// Put appends 'put operation' of the given key/value pair to the batch.
// It is safe to modify the contents of the argument after Put returns.
// It is safe to modify the contents of the argument after Put returns but not
// before.
func (b *Batch) Put(key, value []byte) {
b.appendRec(ktVal, key, value)
b.appendRec(keyTypeVal, key, value)
}
// Delete appends 'delete operation' of the given key to the batch.
// It is safe to modify the contents of the argument after Delete returns.
// It is safe to modify the contents of the argument after Delete returns but
// not before.
func (b *Batch) Delete(key []byte) {
b.appendRec(ktDel, key, nil)
b.appendRec(keyTypeDel, key, nil)
}
// Dump dumps batch contents. The returned slice can be loaded into the
@ -108,7 +130,7 @@ func (b *Batch) Delete(key []byte) {
// The returned slice is not its own copy, so the contents should not be
// modified.
func (b *Batch) Dump() []byte {
return b.encode()
return b.data
}
// Load loads given slice into the batch. Previous contents of the batch
@ -116,137 +138,212 @@ func (b *Batch) Dump() []byte {
// The given slice will not be copied and will be used as batch buffer, so
// it is not safe to modify the contents of the slice.
func (b *Batch) Load(data []byte) error {
return b.decode(0, data)
return b.decode(data, -1)
}
// Replay replays batch contents.
func (b *Batch) Replay(r BatchReplay) error {
return b.decodeRec(func(i int, kt kType, key, value []byte) {
switch kt {
case ktVal:
r.Put(key, value)
case ktDel:
r.Delete(key)
for _, index := range b.index {
switch index.keyType {
case keyTypeVal:
r.Put(index.k(b.data), index.v(b.data))
case keyTypeDel:
r.Delete(index.k(b.data))
}
})
}
return nil
}
// Len returns number of records in the batch.
func (b *Batch) Len() int {
return b.rLen
return len(b.index)
}
// Reset resets the batch.
func (b *Batch) Reset() {
b.data = b.data[:0]
b.seq = 0
b.rLen = 0
b.bLen = 0
b.sync = false
b.index = b.index[:0]
b.internalLen = 0
}
func (b *Batch) init(sync bool) {
b.sync = sync
func (b *Batch) replayInternal(fn func(i int, kt keyType, k, v []byte) error) error {
for i, index := range b.index {
if err := fn(i, index.keyType, index.k(b.data), index.v(b.data)); err != nil {
return err
}
}
return nil
}
func (b *Batch) append(p *Batch) {
if p.rLen > 0 {
b.grow(len(p.data) - batchHdrLen)
b.data = append(b.data, p.data[batchHdrLen:]...)
b.rLen += p.rLen
}
if p.sync {
b.sync = true
}
}
ob := len(b.data)
oi := len(b.index)
b.data = append(b.data, p.data...)
b.index = append(b.index, p.index...)
b.internalLen += p.internalLen
// size returns sums of key/value pair length plus 8-bytes ikey.
func (b *Batch) size() int {
return b.bLen
}
func (b *Batch) encode() []byte {
b.grow(0)
binary.LittleEndian.PutUint64(b.data, b.seq)
binary.LittleEndian.PutUint32(b.data[8:], uint32(b.rLen))
return b.data
}
func (b *Batch) decode(prevSeq uint64, data []byte) error {
if len(data) < batchHdrLen {
return newErrBatchCorrupted("too short")
}
b.seq = binary.LittleEndian.Uint64(data)
if b.seq < prevSeq {
return newErrBatchCorrupted("invalid sequence number")
}
b.rLen = int(binary.LittleEndian.Uint32(data[8:]))
if b.rLen < 0 {
return newErrBatchCorrupted("invalid records length")
}
// No need to be precise at this point, it won't be used anyway
b.bLen = len(data) - batchHdrLen
b.data = data
return nil
}
func (b *Batch) decodeRec(f func(i int, kt kType, key, value []byte)) (err error) {
off := batchHdrLen
for i := 0; i < b.rLen; i++ {
if off >= len(b.data) {
return newErrBatchCorrupted("invalid records length")
}
kt := kType(b.data[off])
if kt > ktVal {
return newErrBatchCorrupted("bad record: invalid type")
}
off += 1
x, n := binary.Uvarint(b.data[off:])
off += n
if n <= 0 || off+int(x) > len(b.data) {
return newErrBatchCorrupted("bad record: invalid key length")
}
key := b.data[off : off+int(x)]
off += int(x)
var value []byte
if kt == ktVal {
x, n := binary.Uvarint(b.data[off:])
off += n
if n <= 0 || off+int(x) > len(b.data) {
return newErrBatchCorrupted("bad record: invalid value length")
// Updating index offset.
if ob != 0 {
for ; oi < len(b.index); oi++ {
index := &b.index[oi]
index.keyPos += ob
if index.valueLen != 0 {
index.valuePos += ob
}
value = b.data[off : off+int(x)]
off += int(x)
}
f(i, kt, key, value)
}
return nil
}
func (b *Batch) memReplay(to *memdb.DB) error {
return b.decodeRec(func(i int, kt kType, key, value []byte) {
ikey := newIkey(key, b.seq+uint64(i), kt)
to.Put(ikey, value)
func (b *Batch) decode(data []byte, expectedLen int) error {
b.data = data
b.index = b.index[:0]
b.internalLen = 0
err := decodeBatch(data, func(i int, index batchIndex) error {
b.index = append(b.index, index)
b.internalLen += index.keyLen + index.valueLen + 8
return nil
})
}
func (b *Batch) memDecodeAndReplay(prevSeq uint64, data []byte, to *memdb.DB) error {
if err := b.decode(prevSeq, data); err != nil {
if err != nil {
return err
}
return b.memReplay(to)
if expectedLen >= 0 && len(b.index) != expectedLen {
return newErrBatchCorrupted(fmt.Sprintf("invalid records length: %d vs %d", expectedLen, len(b.index)))
}
return nil
}
func (b *Batch) revertMemReplay(to *memdb.DB) error {
return b.decodeRec(func(i int, kt kType, key, value []byte) {
ikey := newIkey(key, b.seq+uint64(i), kt)
to.Delete(ikey)
})
func (b *Batch) putMem(seq uint64, mdb *memdb.DB) error {
var ik []byte
for i, index := range b.index {
ik = makeInternalKey(ik, index.k(b.data), seq+uint64(i), index.keyType)
if err := mdb.Put(ik, index.v(b.data)); err != nil {
return err
}
}
return nil
}
func (b *Batch) revertMem(seq uint64, mdb *memdb.DB) error {
var ik []byte
for i, index := range b.index {
ik = makeInternalKey(ik, index.k(b.data), seq+uint64(i), index.keyType)
if err := mdb.Delete(ik); err != nil {
return err
}
}
return nil
}
func newBatch() interface{} {
return &Batch{}
}
func decodeBatch(data []byte, fn func(i int, index batchIndex) error) error {
var index batchIndex
for i, o := 0, 0; o < len(data); i++ {
// Key type.
index.keyType = keyType(data[o])
if index.keyType > keyTypeVal {
return newErrBatchCorrupted(fmt.Sprintf("bad record: invalid type %#x", uint(index.keyType)))
}
o++
// Key.
x, n := binary.Uvarint(data[o:])
o += n
if n <= 0 || o+int(x) > len(data) {
return newErrBatchCorrupted("bad record: invalid key length")
}
index.keyPos = o
index.keyLen = int(x)
o += index.keyLen
// Value.
if index.keyType == keyTypeVal {
x, n = binary.Uvarint(data[o:])
o += n
if n <= 0 || o+int(x) > len(data) {
return newErrBatchCorrupted("bad record: invalid value length")
}
index.valuePos = o
index.valueLen = int(x)
o += index.valueLen
} else {
index.valuePos = 0
index.valueLen = 0
}
if err := fn(i, index); err != nil {
return err
}
}
return nil
}
func decodeBatchToMem(data []byte, expectSeq uint64, mdb *memdb.DB) (seq uint64, batchLen int, err error) {
seq, batchLen, err = decodeBatchHeader(data)
if err != nil {
return 0, 0, err
}
if seq < expectSeq {
return 0, 0, newErrBatchCorrupted("invalid sequence number")
}
data = data[batchHeaderLen:]
var ik []byte
var decodedLen int
err = decodeBatch(data, func(i int, index batchIndex) error {
if i >= batchLen {
return newErrBatchCorrupted("invalid records length")
}
ik = makeInternalKey(ik, index.k(data), seq+uint64(i), index.keyType)
if err := mdb.Put(ik, index.v(data)); err != nil {
return err
}
decodedLen++
return nil
})
if err == nil && decodedLen != batchLen {
err = newErrBatchCorrupted(fmt.Sprintf("invalid records length: %d vs %d", batchLen, decodedLen))
}
return
}
func encodeBatchHeader(dst []byte, seq uint64, batchLen int) []byte {
dst = ensureBuffer(dst, batchHeaderLen)
binary.LittleEndian.PutUint64(dst, seq)
binary.LittleEndian.PutUint32(dst[8:], uint32(batchLen))
return dst
}
func decodeBatchHeader(data []byte) (seq uint64, batchLen int, err error) {
if len(data) < batchHeaderLen {
return 0, 0, newErrBatchCorrupted("too short")
}
seq = binary.LittleEndian.Uint64(data)
batchLen = int(binary.LittleEndian.Uint32(data[8:]))
if batchLen < 0 {
return 0, 0, newErrBatchCorrupted("invalid records length")
}
return
}
func batchesLen(batches []*Batch) int {
batchLen := 0
for _, batch := range batches {
batchLen += batch.Len()
}
return batchLen
}
func writeBatchesWithHeader(wr io.Writer, batches []*Batch, seq uint64) error {
if _, err := wr.Write(encodeBatchHeader(nil, seq, batchesLen(batches))); err != nil {
return err
}
for _, batch := range batches {
if _, err := wr.Write(batch.data); err != nil {
return err
}
}
return nil
}


@ -8,113 +8,140 @@ package leveldb
import (
"bytes"
"fmt"
"testing"
"testing/quick"
"github.com/syndtr/goleveldb/leveldb/comparer"
"github.com/syndtr/goleveldb/leveldb/memdb"
"github.com/syndtr/goleveldb/leveldb/testutil"
)
type tbRec struct {
kt kType
key, value []byte
func TestBatchHeader(t *testing.T) {
f := func(seq uint64, length uint32) bool {
encoded := encodeBatchHeader(nil, seq, int(length))
decSeq, decLength, err := decodeBatchHeader(encoded)
return err == nil && decSeq == seq && decLength == int(length)
}
config := &quick.Config{
Rand: testutil.NewRand(),
}
if err := quick.Check(f, config); err != nil {
t.Error(err)
}
}
type testBatch struct {
rec []*tbRec
type batchKV struct {
kt keyType
k, v []byte
}
func (p *testBatch) Put(key, value []byte) {
p.rec = append(p.rec, &tbRec{ktVal, key, value})
}
func (p *testBatch) Delete(key []byte) {
p.rec = append(p.rec, &tbRec{ktDel, key, nil})
}
func compareBatch(t *testing.T, b1, b2 *Batch) {
if b1.seq != b2.seq {
t.Errorf("invalid seq number want %d, got %d", b1.seq, b2.seq)
}
if b1.Len() != b2.Len() {
t.Fatalf("invalid record length want %d, got %d", b1.Len(), b2.Len())
}
p1, p2 := new(testBatch), new(testBatch)
err := b1.Replay(p1)
if err != nil {
t.Fatal("error when replaying batch 1: ", err)
}
err = b2.Replay(p2)
if err != nil {
t.Fatal("error when replaying batch 2: ", err)
}
for i := range p1.rec {
r1, r2 := p1.rec[i], p2.rec[i]
if r1.kt != r2.kt {
t.Errorf("invalid type on record '%d' want %d, got %d", i, r1.kt, r2.kt)
func TestBatch(t *testing.T) {
var (
kvs []batchKV
internalLen int
)
batch := new(Batch)
rbatch := new(Batch)
abatch := new(Batch)
testBatch := func(i int, kt keyType, k, v []byte) error {
kv := kvs[i]
if kv.kt != kt {
return fmt.Errorf("invalid key type, index=%d: %d vs %d", i, kv.kt, kt)
}
if !bytes.Equal(r1.key, r2.key) {
t.Errorf("invalid key on record '%d' want %s, got %s", i, string(r1.key), string(r2.key))
if !bytes.Equal(kv.k, k) {
return fmt.Errorf("invalid key, index=%d", i)
}
if r1.kt == ktVal {
if !bytes.Equal(r1.value, r2.value) {
t.Errorf("invalid value on record '%d' want %s, got %s", i, string(r1.value), string(r2.value))
if !bytes.Equal(kv.v, v) {
return fmt.Errorf("invalid value, index=%d", i)
}
return nil
}
f := func(ktr uint8, k, v []byte) bool {
kt := keyType(ktr % 2)
if kt == keyTypeVal {
batch.Put(k, v)
rbatch.Put(k, v)
kvs = append(kvs, batchKV{kt: kt, k: k, v: v})
internalLen += len(k) + len(v) + 8
} else {
batch.Delete(k)
rbatch.Delete(k)
kvs = append(kvs, batchKV{kt: kt, k: k})
internalLen += len(k) + 8
}
if batch.Len() != len(kvs) {
t.Logf("batch.Len: %d vs %d", len(kvs), batch.Len())
return false
}
if batch.internalLen != internalLen {
t.Logf("batch.internalLen: %d vs %d", internalLen, batch.internalLen)
return false
}
if len(kvs)%1000 == 0 {
if err := batch.replayInternal(testBatch); err != nil {
t.Logf("batch.replayInternal: %v", err)
return false
}
abatch.append(rbatch)
rbatch.Reset()
if abatch.Len() != len(kvs) {
t.Logf("abatch.Len: %d vs %d", len(kvs), abatch.Len())
return false
}
if abatch.internalLen != internalLen {
t.Logf("abatch.internalLen: %d vs %d", internalLen, abatch.internalLen)
return false
}
if err := abatch.replayInternal(testBatch); err != nil {
t.Logf("abatch.replayInternal: %v", err)
return false
}
nbatch := new(Batch)
if err := nbatch.Load(batch.Dump()); err != nil {
t.Logf("nbatch.Load: %v", err)
return false
}
if nbatch.Len() != len(kvs) {
t.Logf("nbatch.Len: %d vs %d", len(kvs), nbatch.Len())
return false
}
if nbatch.internalLen != internalLen {
t.Logf("nbatch.internalLen: %d vs %d", internalLen, nbatch.internalLen)
return false
}
if err := nbatch.replayInternal(testBatch); err != nil {
t.Logf("nbatch.replayInternal: %v", err)
return false
}
}
}
}
func TestBatch_EncodeDecode(t *testing.T) {
b1 := new(Batch)
b1.seq = 10009
b1.Put([]byte("key1"), []byte("value1"))
b1.Put([]byte("key2"), []byte("value2"))
b1.Delete([]byte("key1"))
b1.Put([]byte("k"), []byte(""))
b1.Put([]byte("zzzzzzzzzzz"), []byte("zzzzzzzzzzzzzzzzzzzzzzzz"))
b1.Delete([]byte("key10000"))
b1.Delete([]byte("k"))
buf := b1.encode()
b2 := new(Batch)
err := b2.decode(0, buf)
if err != nil {
t.Error("error when decoding batch: ", err)
}
compareBatch(t, b1, b2)
}
func TestBatch_Append(t *testing.T) {
b1 := new(Batch)
b1.seq = 10009
b1.Put([]byte("key1"), []byte("value1"))
b1.Put([]byte("key2"), []byte("value2"))
b1.Delete([]byte("key1"))
b1.Put([]byte("foo"), []byte("foovalue"))
b1.Put([]byte("bar"), []byte("barvalue"))
b2a := new(Batch)
b2a.seq = 10009
b2a.Put([]byte("key1"), []byte("value1"))
b2a.Put([]byte("key2"), []byte("value2"))
b2a.Delete([]byte("key1"))
b2b := new(Batch)
b2b.Put([]byte("foo"), []byte("foovalue"))
b2b.Put([]byte("bar"), []byte("barvalue"))
b2a.append(b2b)
compareBatch(t, b1, b2a)
}
func TestBatch_Size(t *testing.T) {
b := new(Batch)
for i := 0; i < 2; i++ {
b.Put([]byte("key1"), []byte("value1"))
b.Put([]byte("key2"), []byte("value2"))
b.Delete([]byte("key1"))
b.Put([]byte("foo"), []byte("foovalue"))
b.Put([]byte("bar"), []byte("barvalue"))
mem := memdb.New(&iComparer{comparer.DefaultComparer}, 0)
b.memReplay(mem)
if b.size() != mem.Size() {
t.Errorf("invalid batch size calculation, want=%d got=%d", mem.Size(), b.size())
if len(kvs)%10000 == 0 {
nbatch := new(Batch)
if err := batch.Replay(nbatch); err != nil {
t.Logf("batch.Replay: %v", err)
return false
}
if nbatch.Len() != len(kvs) {
t.Logf("nbatch.Len: %d vs %d", len(kvs), nbatch.Len())
return false
}
if nbatch.internalLen != internalLen {
t.Logf("nbatch.internalLen: %d vs %d", internalLen, nbatch.internalLen)
return false
}
if err := nbatch.replayInternal(testBatch); err != nil {
t.Logf("nbatch.replayInternal: %v", err)
return false
}
}
b.Reset()
return true
}
config := &quick.Config{
MaxCount: 40000,
Rand: testutil.NewRand(),
}
if err := quick.Check(f, config); err != nil {
t.Error(err)
}
t.Logf("length=%d internalLen=%d", len(kvs), internalLen)
}
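Note on the test style above: TestBatchHeader and TestBatch both lean on testing/quick to check a round-trip property against randomly generated inputs. As a minimal standalone sketch of that pattern (not the vendored code), the snippet below assumes a hypothetical 12-byte little-endian header holding an 8-byte sequence number and a 4-byte record count; encodeHeader, decodeHeader and roundTrips are illustrative names only.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"testing/quick"
)

// headerLen is an assumed layout for this sketch:
// 8-byte sequence number followed by a 4-byte record count.
const headerLen = 12

// encodeHeader appends a fixed-size little-endian header to dst.
func encodeHeader(dst []byte, seq uint64, n uint32) []byte {
	var buf [headerLen]byte
	binary.LittleEndian.PutUint64(buf[:8], seq)
	binary.LittleEndian.PutUint32(buf[8:], n)
	return append(dst, buf[:]...)
}

// decodeHeader reads the header back, rejecting truncated input.
func decodeHeader(data []byte) (uint64, uint32, error) {
	if len(data) < headerLen {
		return 0, 0, errors.New("header too short")
	}
	seq := binary.LittleEndian.Uint64(data[:8])
	n := binary.LittleEndian.Uint32(data[8:12])
	return seq, n, nil
}

// roundTrips is the property under test: decoding an encoded
// header must return exactly the values that were encoded.
func roundTrips(seq uint64, n uint32) bool {
	gotSeq, gotN, err := decodeHeader(encodeHeader(nil, seq, n))
	return err == nil && gotSeq == seq && gotN == n
}

func main() {
	// A nil config makes quick.Check use its default iteration count.
	if err := quick.Check(roundTrips, nil); err != nil {
		fmt.Println("property failed:", err)
		return
	}
	fmt.Println("round trip holds")
}
```

The test in the diff additionally passes a quick.Config with a seeded Rand and a MaxCount, which presumably makes runs reproducible and long enough to hit the periodic replay checks.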


@ -1,58 +0,0 @@
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// +build !go1.2
package leveldb
import (
"sync/atomic"
"testing"
)
func BenchmarkDBReadConcurrent(b *testing.B) {
p := openDBBench(b, false)
p.populate(b.N)
p.fill()
p.gc()
defer p.close()
b.ResetTimer()
b.SetBytes(116)
b.RunParallel(func(pb *testing.PB) {
iter := p.newIter()
defer iter.Release()
for pb.Next() && iter.Next() {
}
})
}
func BenchmarkDBReadConcurrent2(b *testing.B) {
p := openDBBench(b, false)
p.populate(b.N)
p.fill()
p.gc()
defer p.close()
b.ResetTimer()
b.SetBytes(116)
var dir uint32
b.RunParallel(func(pb *testing.PB) {
iter := p.newIter()
defer iter.Release()
if atomic.AddUint32(&dir, 1)%2 == 0 {
for pb.Next() && iter.Next() {
}
} else {
if pb.Next() && iter.Last() {
for pb.Next() && iter.Prev() {
}
}
}
})
}


@ -13,6 +13,7 @@ import (
"os"
"path/filepath"
"runtime"
"sync/atomic"
"testing"
"github.com/syndtr/goleveldb/leveldb/iterator"
@ -90,7 +91,7 @@ func openDBBench(b *testing.B, noCompress bool) *dbBench {
ro: &opt.ReadOptions{},
wo: &opt.WriteOptions{},
}
p.stor, err = storage.OpenFile(benchDB)
p.stor, err = storage.OpenFile(benchDB, false)
if err != nil {
b.Fatal("cannot open stor: ", err)
}
@ -103,7 +104,6 @@ func openDBBench(b *testing.B, noCompress bool) *dbBench {
b.Fatal("cannot open db: ", err)
}
runtime.GOMAXPROCS(runtime.NumCPU())
return p
}
@ -259,7 +259,6 @@ func (p *dbBench) close() {
p.keys = nil
p.values = nil
runtime.GC()
runtime.GOMAXPROCS(1)
}
func BenchmarkDBWrite(b *testing.B) {
@ -462,3 +461,47 @@ func BenchmarkDBGetRandom(b *testing.B) {
p.gets()
p.close()
}
func BenchmarkDBReadConcurrent(b *testing.B) {
p := openDBBench(b, false)
p.populate(b.N)
p.fill()
p.gc()
defer p.close()
b.ResetTimer()
b.SetBytes(116)
b.RunParallel(func(pb *testing.PB) {
iter := p.newIter()
defer iter.Release()
for pb.Next() && iter.Next() {
}
})
}
func BenchmarkDBReadConcurrent2(b *testing.B) {
p := openDBBench(b, false)
p.populate(b.N)
p.fill()
p.gc()
defer p.close()
b.ResetTimer()
b.SetBytes(116)
var dir uint32
b.RunParallel(func(pb *testing.PB) {
iter := p.newIter()
defer iter.Release()
if atomic.AddUint32(&dir, 1)%2 == 0 {
for pb.Next() && iter.Next() {
}
} else {
if pb.Next() && iter.Last() {
for pb.Next() && iter.Prev() {
}
}
}
})
}


@ -4,13 +4,12 @@
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// +build !go1.2
package cache
import (
"math/rand"
"testing"
"time"
)
func BenchmarkLRUCache(b *testing.B) {


@ -16,7 +16,7 @@ import (
)
// Cacher provides an interface to implement caching functionality.
// An implementation must be goroutine-safe.
// An implementation must be safe for concurrent use.
type Cacher interface {
// Capacity returns cache capacity.
Capacity() int
@ -47,17 +47,21 @@ type Cacher interface {
// so the Release method will be called once object is released.
type Value interface{}
type CacheGetter struct {
// NamespaceGetter provides convenient wrapper for namespace.
type NamespaceGetter struct {
Cache *Cache
NS uint64
}
func (g *CacheGetter) Get(key uint64, setFunc func() (size int, value Value)) *Handle {
// Get simply calls Cache.Get() method.
func (g *NamespaceGetter) Get(key uint64, setFunc func() (size int, value Value)) *Handle {
return g.Cache.Get(g.NS, key, setFunc)
}
// The hash tables implementation is based on:
// "Dynamic-Sized Nonblocking Hash Tables", by Yujie Liu, Kunlong Zhang, and Michael Spear. ACM Symposium on Principles of Distributed Computing, Jul 2014.
// "Dynamic-Sized Nonblocking Hash Tables", by Yujie Liu,
// Kunlong Zhang, and Michael Spear.
// ACM Symposium on Principles of Distributed Computing, Jul 2014.
const (
mInitialSize = 1 << 4
@ -507,18 +511,12 @@ func (r *Cache) EvictAll() {
}
}
// Close closes the 'cache map' and releases all 'cache node'.
// Close closes the 'cache map' and forcefully releases all 'cache node'.
func (r *Cache) Close() error {
r.mu.Lock()
if !r.closed {
r.closed = true
if r.cacher != nil {
if err := r.cacher.Close(); err != nil {
return err
}
}
h := (*mNode)(r.mHead)
h.initBuckets()
@ -537,10 +535,37 @@ func (r *Cache) Close() error {
for _, f := range n.onDel {
f()
}
n.onDel = nil
}
}
}
r.mu.Unlock()
// Avoid deadlock.
if r.cacher != nil {
if err := r.cacher.Close(); err != nil {
return err
}
}
return nil
}
// CloseWeak closes the 'cache map' and evicts all 'cache node' from cacher, but
// unlike Close it doesn't forcefully release 'cache node'.
func (r *Cache) CloseWeak() error {
r.mu.Lock()
if !r.closed {
r.closed = true
}
r.mu.Unlock()
// Avoid deadlock.
if r.cacher != nil {
r.cacher.EvictAll()
if err := r.cacher.Close(); err != nil {
return err
}
}
return nil
}
@ -610,10 +635,12 @@ func (n *Node) unrefLocked() {
}
}
// Handle is a 'cache handle' of a 'cache node'.
type Handle struct {
n unsafe.Pointer // *Node
}
// Value returns the value of the 'cache node'.
func (h *Handle) Value() Value {
n := (*Node)(atomic.LoadPointer(&h.n))
if n != nil {
@ -622,6 +649,8 @@ func (h *Handle) Value() Value {
return nil
}
// Release releases this 'cache handle'.
// It is safe to call release multiple times.
func (h *Handle) Release() {
nPtr := atomic.LoadPointer(&h.n)
if nPtr != nil && atomic.CompareAndSwapPointer(&h.n, nPtr, nil) {


@ -45,20 +45,29 @@ func set(c *Cache, ns, key uint64, value Value, charge int, relf func()) *Handle
return c.Get(ns, key, func() (int, Value) {
if relf != nil {
return charge, releaserFunc{relf, value}
} else {
return charge, value
}
return charge, value
})
}
type cacheMapTestParams struct {
nobjects, nhandles, concurrent, repeat int
}
func TestCacheMap(t *testing.T) {
runtime.GOMAXPROCS(runtime.NumCPU())
nsx := []struct {
nobjects, nhandles, concurrent, repeat int
}{
{10000, 400, 50, 3},
{100000, 1000, 100, 10},
var params []cacheMapTestParams
if testing.Short() {
params = []cacheMapTestParams{
{1000, 100, 20, 3},
{10000, 300, 50, 10},
}
} else {
params = []cacheMapTestParams{
{10000, 400, 50, 3},
{100000, 1000, 100, 10},
}
}
var (
@ -66,7 +75,7 @@ func TestCacheMap(t *testing.T) {
handles [][]unsafe.Pointer
)
for _, x := range nsx {
for _, x := range params {
objects = append(objects, make([]int32o, x.nobjects))
handles = append(handles, make([]unsafe.Pointer, x.nhandles))
}
@ -76,7 +85,7 @@ func TestCacheMap(t *testing.T) {
wg := new(sync.WaitGroup)
var done int32
for ns, x := range nsx {
for ns, x := range params {
for i := 0; i < x.concurrent; i++ {
wg.Add(1)
go func(ns, i, repeat int, objects []int32o, handles []unsafe.Pointer) {


@ -6,7 +6,9 @@
package leveldb
import "github.com/syndtr/goleveldb/leveldb/comparer"
import (
"github.com/syndtr/goleveldb/leveldb/comparer"
)
type iComparer struct {
ucmp comparer.Comparer
@ -33,43 +35,33 @@ func (icmp *iComparer) Name() string {
}
func (icmp *iComparer) Compare(a, b []byte) int {
x := icmp.ucmp.Compare(iKey(a).ukey(), iKey(b).ukey())
x := icmp.uCompare(internalKey(a).ukey(), internalKey(b).ukey())
if x == 0 {
if m, n := iKey(a).num(), iKey(b).num(); m > n {
x = -1
if m, n := internalKey(a).num(), internalKey(b).num(); m > n {
return -1
} else if m < n {
x = 1
return 1
}
}
return x
}
func (icmp *iComparer) Separator(dst, a, b []byte) []byte {
ua, ub := iKey(a).ukey(), iKey(b).ukey()
dst = icmp.ucmp.Separator(dst, ua, ub)
if dst == nil {
return nil
ua, ub := internalKey(a).ukey(), internalKey(b).ukey()
dst = icmp.uSeparator(dst, ua, ub)
if dst != nil && len(dst) < len(ua) && icmp.uCompare(ua, dst) < 0 {
// Append earliest possible number.
return append(dst, keyMaxNumBytes...)
}
if len(dst) < len(ua) && icmp.uCompare(ua, dst) < 0 {
dst = append(dst, kMaxNumBytes...)
} else {
// Did not close possibilities that n maybe longer than len(ub).
dst = append(dst, a[len(a)-8:]...)
}
return dst
return nil
}
func (icmp *iComparer) Successor(dst, b []byte) []byte {
ub := iKey(b).ukey()
dst = icmp.ucmp.Successor(dst, ub)
if dst == nil {
return nil
ub := internalKey(b).ukey()
dst = icmp.uSuccessor(dst, ub)
if dst != nil && len(dst) < len(ub) && icmp.uCompare(ub, dst) < 0 {
// Append earliest possible number.
return append(dst, keyMaxNumBytes...)
}
if len(dst) < len(ub) && icmp.uCompare(ub, dst) < 0 {
dst = append(dst, kMaxNumBytes...)
} else {
// Did not close possibilities that n maybe longer than len(ub).
dst = append(dst, b[len(b)-8:]...)
}
return dst
return nil
}
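Note on iComparer.Compare above: the ordering it implements is ascending by user key and, for equal user keys, descending by the trailing number, so the newest entry for a key sorts first. A minimal sketch of that ordering follows. It is a simplification: entry and compareEntries are hypothetical names, and a plain sequence number stands in for the packed value that num() returns in the real code.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// entry is a hypothetical, unpacked view of an internal key:
// a user key plus a sequence number.
type entry struct {
	ukey []byte
	seq  uint64
}

// compareEntries orders by user key ascending; ties are broken by
// sequence number descending, so newer entries come first.
func compareEntries(a, b entry) int {
	if x := bytes.Compare(a.ukey, b.ukey); x != 0 {
		return x
	}
	switch {
	case a.seq > b.seq:
		return -1
	case a.seq < b.seq:
		return 1
	}
	return 0
}

func main() {
	es := []entry{
		{[]byte("b"), 1},
		{[]byte("a"), 3},
		{[]byte("a"), 7},
	}
	sort.Slice(es, func(i, j int) bool {
		return compareEntries(es[i], es[j]) < 0
	})
	for _, e := range es {
		fmt.Printf("%s@%d\n", e.ukey, e.seq)
	}
	// prints:
	// a@7
	// a@3
	// b@1
}
```

With this ordering, a forward scan meets the most recent version of each user key before any older one.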


@ -9,12 +9,13 @@ package leveldb
import (
"bytes"
"fmt"
"github.com/syndtr/goleveldb/leveldb/filter"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
"io"
"math/rand"
"testing"
"github.com/syndtr/goleveldb/leveldb/filter"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
)
const ctValSize = 1000
@ -99,19 +100,17 @@ func (h *dbCorruptHarness) corrupt(ft storage.FileType, fi, offset, n int) {
p := &h.dbHarness
t := p.t
ff, _ := p.stor.GetFiles(ft)
sff := files(ff)
sff.sort()
fds, _ := p.stor.List(ft)
sortFds(fds)
if fi < 0 {
fi = len(sff) - 1
fi = len(fds) - 1
}
if fi >= len(sff) {
if fi >= len(fds) {
t.Fatalf("no such file with type %q with index %d", ft, fi)
}
file := sff[fi]
r, err := file.Open()
fd := fds[fi]
r, err := h.stor.Open(fd)
if err != nil {
t.Fatal("cannot open file: ", err)
}
@ -149,11 +148,11 @@ func (h *dbCorruptHarness) corrupt(ft storage.FileType, fi, offset, n int) {
buf[offset+i] ^= 0x80
}
err = file.Remove()
err = h.stor.Remove(fd)
if err != nil {
t.Fatal("cannot remove old file: ", err)
}
w, err := file.Create()
w, err := h.stor.Create(fd)
if err != nil {
t.Fatal("cannot create new file: ", err)
}
@ -165,25 +164,37 @@ func (h *dbCorruptHarness) corrupt(ft storage.FileType, fi, offset, n int) {
}
func (h *dbCorruptHarness) removeAll(ft storage.FileType) {
ff, err := h.stor.GetFiles(ft)
fds, err := h.stor.List(ft)
if err != nil {
h.t.Fatal("get files: ", err)
}
for _, f := range ff {
if err := f.Remove(); err != nil {
for _, fd := range fds {
if err := h.stor.Remove(fd); err != nil {
h.t.Error("remove file: ", err)
}
}
}
func (h *dbCorruptHarness) forceRemoveAll(ft storage.FileType) {
fds, err := h.stor.List(ft)
if err != nil {
h.t.Fatal("get files: ", err)
}
for _, fd := range fds {
if err := h.stor.ForceRemove(fd); err != nil {
h.t.Error("remove file: ", err)
}
}
}
func (h *dbCorruptHarness) removeOne(ft storage.FileType) {
ff, err := h.stor.GetFiles(ft)
fds, err := h.stor.List(ft)
if err != nil {
h.t.Fatal("get files: ", err)
}
f := ff[rand.Intn(len(ff))]
h.t.Logf("removing file @%d", f.Num())
if err := f.Remove(); err != nil {
fd := fds[rand.Intn(len(fds))]
h.t.Logf("removing file @%d", fd.Num)
if err := h.stor.Remove(fd); err != nil {
h.t.Error("remove file: ", err)
}
}
@ -221,6 +232,7 @@ func (h *dbCorruptHarness) check(min, max int) {
func TestCorruptDB_Journal(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.build(100)
h.check(100, 100)
@ -230,12 +242,11 @@ func TestCorruptDB_Journal(t *testing.T) {
h.openDB()
h.check(36, 36)
h.close()
}
func TestCorruptDB_Table(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.build(100)
h.compactMem()
@ -246,12 +257,11 @@ func TestCorruptDB_Table(t *testing.T) {
h.openDB()
h.check(99, 99)
h.close()
}
func TestCorruptDB_TableIndex(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.build(10000)
h.compactMem()
@ -260,8 +270,6 @@ func TestCorruptDB_TableIndex(t *testing.T) {
h.openDB()
h.check(5000, 9999)
h.close()
}
func TestCorruptDB_MissingManifest(t *testing.T) {
@ -271,6 +279,7 @@ func TestCorruptDB_MissingManifest(t *testing.T) {
Strict: opt.StrictJournalChecksum,
WriteBuffer: 1000 * 60,
})
defer h.close()
h.build(1000)
h.compactMem()
@ -286,10 +295,8 @@ func TestCorruptDB_MissingManifest(t *testing.T) {
h.compactMem()
h.closeDB()
h.stor.SetIgnoreOpenErr(storage.TypeManifest)
h.removeAll(storage.TypeManifest)
h.forceRemoveAll(storage.TypeManifest)
h.openAssert(false)
h.stor.SetIgnoreOpenErr(0)
h.recover()
h.check(1000, 1000)
@ -300,12 +307,11 @@ func TestCorruptDB_MissingManifest(t *testing.T) {
h.recover()
h.check(1000, 1000)
h.close()
}
func TestCorruptDB_SequenceNumberRecovery(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.put("foo", "v1")
h.put("foo", "v2")
@ -321,12 +327,11 @@ func TestCorruptDB_SequenceNumberRecovery(t *testing.T) {
h.reopenDB()
h.getVal("foo", "v6")
h.close()
}
func TestCorruptDB_SequenceNumberRecoveryTable(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.put("foo", "v1")
h.put("foo", "v2")
@ -344,12 +349,11 @@ func TestCorruptDB_SequenceNumberRecoveryTable(t *testing.T) {
h.reopenDB()
h.getVal("foo", "v6")
h.close()
}
func TestCorruptDB_CorruptedManifest(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.put("foo", "hello")
h.compactMem()
@ -360,12 +364,11 @@ func TestCorruptDB_CorruptedManifest(t *testing.T) {
h.recover()
h.getVal("foo", "hello")
h.close()
}
func TestCorruptDB_CompactionInputError(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.build(10)
h.compactMem()
@ -377,12 +380,11 @@ func TestCorruptDB_CompactionInputError(t *testing.T) {
h.build(10000)
h.check(10000, 10000)
h.close()
}
func TestCorruptDB_UnrelatedKeys(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.build(10)
h.compactMem()
@ -394,12 +396,11 @@ func TestCorruptDB_UnrelatedKeys(t *testing.T) {
h.getVal(string(tkey(1000)), string(tval(1000, ctValSize)))
h.compactMem()
h.getVal(string(tkey(1000)), string(tval(1000, ctValSize)))
h.close()
}
func TestCorruptDB_Level0NewerFileHasOlderSeqnum(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.put("a", "v1")
h.put("b", "v1")
@ -421,12 +422,11 @@ func TestCorruptDB_Level0NewerFileHasOlderSeqnum(t *testing.T) {
h.getVal("b", "v3")
h.getVal("c", "v0")
h.getVal("d", "v0")
h.close()
}
func TestCorruptDB_RecoverInvalidSeq_Issue53(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.put("a", "v1")
h.put("b", "v1")
@ -448,12 +448,11 @@ func TestCorruptDB_RecoverInvalidSeq_Issue53(t *testing.T) {
h.getVal("b", "v3")
h.getVal("c", "v0")
h.getVal("d", "v0")
h.close()
}
func TestCorruptDB_MissingTableFiles(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()
h.put("a", "v1")
h.put("b", "v1")
@ -467,8 +466,6 @@ func TestCorruptDB_MissingTableFiles(t *testing.T) {
h.removeOne(storage.TypeTable)
h.openAssert(false)
h.close()
}
func TestCorruptDB_RecoverTable(t *testing.T) {
@ -477,6 +474,7 @@ func TestCorruptDB_RecoverTable(t *testing.T) {
CompactionTableSize: 90 * opt.KiB,
Filter: filter.NewBloomFilter(10),
})
defer h.close()
h.build(1000)
h.compactMem()
@ -495,6 +493,4 @@ func TestCorruptDB_RecoverTable(t *testing.T) {
t.Errorf("invalid seq, want=%d got=%d", seq, h.db.seq)
}
h.check(985, 985)
h.close()
}

File diff suppressed because it is too large


@ -11,109 +11,79 @@ import (
"time"
"github.com/syndtr/goleveldb/leveldb/errors"
"github.com/syndtr/goleveldb/leveldb/memdb"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
)
var (
errCompactionTransactExiting = errors.New("leveldb: compaction transact exiting")
)
type cStats struct {
sync.Mutex
type cStat struct {
duration time.Duration
read uint64
write uint64
read int64
write int64
}
func (p *cStats) add(n *cStatsStaging) {
p.Lock()
func (p *cStat) add(n *cStatStaging) {
p.duration += n.duration
p.read += n.read
p.write += n.write
p.Unlock()
}
func (p *cStats) get() (duration time.Duration, read, write uint64) {
p.Lock()
defer p.Unlock()
func (p *cStat) get() (duration time.Duration, read, write int64) {
return p.duration, p.read, p.write
}
type cStatsStaging struct {
type cStatStaging struct {
start time.Time
duration time.Duration
on bool
read uint64
write uint64
read int64
write int64
}
func (p *cStatsStaging) startTimer() {
func (p *cStatStaging) startTimer() {
if !p.on {
p.start = time.Now()
p.on = true
}
}
func (p *cStatsStaging) stopTimer() {
func (p *cStatStaging) stopTimer() {
if p.on {
p.duration += time.Since(p.start)
p.on = false
}
}
type cMem struct {
s *session
level int
rec *sessionRecord
type cStats struct {
lk sync.Mutex
stats []cStat
}
func newCMem(s *session) *cMem {
return &cMem{s: s, rec: &sessionRecord{numLevel: s.o.GetNumLevel()}}
}
func (c *cMem) flush(mem *memdb.DB, level int) error {
s := c.s
// Write memdb to table.
iter := mem.NewIterator(nil)
defer iter.Release()
t, n, err := s.tops.createFrom(iter)
if err != nil {
return err
func (p *cStats) addStat(level int, n *cStatStaging) {
p.lk.Lock()
if level >= len(p.stats) {
newStats := make([]cStat, level+1)
copy(newStats, p.stats)
p.stats = newStats
}
p.stats[level].add(n)
p.lk.Unlock()
}
// Pick level.
if level < 0 {
v := s.version()
level = v.pickLevel(t.imin.ukey(), t.imax.ukey())
v.release()
func (p *cStats) getStat(level int) (duration time.Duration, read, write int64) {
p.lk.Lock()
defer p.lk.Unlock()
if level < len(p.stats) {
return p.stats[level].get()
}
c.rec.addTableFile(level, t)
s.logf("mem@flush created L%d@%d N·%d S·%s %q:%q", level, t.file.Num(), n, shortenb(int(t.size)), t.imin, t.imax)
c.level = level
return nil
}
func (c *cMem) reset() {
c.rec = &sessionRecord{numLevel: c.s.o.GetNumLevel()}
}
func (c *cMem) commit(journal, seq uint64) error {
c.rec.setJournalNum(journal)
c.rec.setSeqNum(seq)
// Commit changes.
return c.s.commit(c.rec)
return
}
func (db *DB) compactionError() {
var (
err error
wlocked bool
)
var err error
noerr:
// No error.
for {
@ -121,12 +91,12 @@ noerr:
case err = <-db.compErrSetC:
switch {
case err == nil:
case errors.IsCorrupted(err):
case err == ErrReadOnly, errors.IsCorrupted(err):
goto hasperr
default:
goto haserr
}
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}
@ -139,11 +109,11 @@ haserr:
switch {
case err == nil:
goto noerr
case errors.IsCorrupted(err):
case err == ErrReadOnly, errors.IsCorrupted(err):
goto hasperr
default:
}
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}
@ -155,9 +125,9 @@ hasperr:
case db.compPerErrC <- err:
case db.writeLockC <- struct{}{}:
// Hold write lock, so that write won't pass-through.
wlocked = true
case _, _ = <-db.closeC:
if wlocked {
db.compWriteLocking = true
case <-db.closeC:
if db.compWriteLocking {
// We should release the lock or Close will hang.
<-db.writeLockC
}
@ -202,7 +172,7 @@ func (db *DB) compactionTransact(name string, t compactionTransactInterface) {
disableBackoff = db.s.o.GetDisableCompactionBackoff()
)
for n := 0; ; n++ {
// Check wether the DB is closed.
// Check whether the DB is closed.
if db.isClosed() {
db.logf("%s exiting", name)
db.compactionExitTransact()
@ -225,7 +195,7 @@ func (db *DB) compactionTransact(name string, t compactionTransactInterface) {
db.logf("%s exiting (persistent error %q)", name, perr)
db.compactionExitTransact()
}
case _, _ = <-db.closeC:
case <-db.closeC:
db.logf("%s exiting", name)
db.compactionExitTransact()
}
@ -254,7 +224,7 @@ func (db *DB) compactionTransact(name string, t compactionTransactInterface) {
}
select {
case <-backoffT.C:
case _, _ = <-db.closeC:
case <-db.closeC:
db.logf("%s exiting", name)
db.compactionExitTransact()
}
@ -286,22 +256,27 @@ func (db *DB) compactionExitTransact() {
panic(errCompactionTransactExiting)
}
func (db *DB) compactionCommit(name string, rec *sessionRecord) {
db.compCommitLk.Lock()
defer db.compCommitLk.Unlock() // Defer is necessary.
db.compactionTransactFunc(name+"@commit", func(cnt *compactionTransactCounter) error {
return db.s.commit(rec)
}, nil)
}
func (db *DB) memCompaction() {
mem := db.getFrozenMem()
if mem == nil {
mdb := db.getFrozenMem()
if mdb == nil {
return
}
defer mem.decref()
defer mdb.decref()
c := newCMem(db.s)
stats := new(cStatsStaging)
db.logf("mem@flush N·%d S·%s", mem.mdb.Len(), shortenb(mem.mdb.Size()))
db.logf("memdb@flush N·%d S·%s", mdb.Len(), shortenb(mdb.Size()))
// Don't compact empty memdb.
if mem.mdb.Len() == 0 {
db.logf("mem@flush skipping")
// drop frozen mem
if mdb.Len() == 0 {
db.logf("memdb@flush skipping")
// drop frozen memdb
db.dropFrozenMem()
return
}
@ -313,39 +288,48 @@ func (db *DB) memCompaction() {
case <-db.compPerErrC:
close(resumeC)
resumeC = nil
case _, _ = <-db.closeC:
return
case <-db.closeC:
db.compactionExitTransact()
}
db.compactionTransactFunc("mem@flush", func(cnt *compactionTransactCounter) (err error) {
var (
rec = &sessionRecord{}
stats = &cStatStaging{}
flushLevel int
)
// Generate tables.
db.compactionTransactFunc("memdb@flush", func(cnt *compactionTransactCounter) (err error) {
stats.startTimer()
defer stats.stopTimer()
return c.flush(mem.mdb, -1)
flushLevel, err = db.s.flushMemdb(rec, mdb.DB, db.memdbMaxLevel)
stats.stopTimer()
return
}, func() error {
for _, r := range c.rec.addedTables {
db.logf("mem@flush revert @%d", r.num)
f := db.s.getTableFile(r.num)
if err := f.Remove(); err != nil {
for _, r := range rec.addedTables {
db.logf("memdb@flush revert @%d", r.num)
if err := db.s.stor.Remove(storage.FileDesc{Type: storage.TypeTable, Num: r.num}); err != nil {
return err
}
}
return nil
})
db.compactionTransactFunc("mem@commit", func(cnt *compactionTransactCounter) (err error) {
stats.startTimer()
defer stats.stopTimer()
return c.commit(db.journalFile.Num(), db.frozenSeq)
}, nil)
rec.setJournalNum(db.journalFd.Num)
rec.setSeqNum(db.frozenSeq)
db.logf("mem@flush committed F·%d T·%v", len(c.rec.addedTables), stats.duration)
// Commit.
stats.startTimer()
db.compactionCommit("memdb", rec)
stats.stopTimer()
for _, r := range c.rec.addedTables {
db.logf("memdb@flush committed F·%d T·%v", len(rec.addedTables), stats.duration)
for _, r := range rec.addedTables {
stats.write += r.size
}
db.compStats[c.level].add(stats)
db.compStats.addStat(flushLevel, stats)
// Drop frozen mem.
// Drop frozen memdb.
db.dropFrozenMem()
// Resume table compaction.
@ -353,13 +337,13 @@ func (db *DB) memCompaction() {
select {
case <-resumeC:
close(resumeC)
case _, _ = <-db.closeC:
return
case <-db.closeC:
db.compactionExitTransact()
}
}
// Trigger table compaction.
db.compSendTrigger(db.tcompCmdC)
db.compTrigger(db.tcompCmdC)
}
type tableCompactionBuilder struct {
@ -367,7 +351,7 @@ type tableCompactionBuilder struct {
s *session
c *compaction
rec *sessionRecord
stat0, stat1 *cStatsStaging
stat0, stat1 *cStatStaging
snapHasLastUkey bool
snapLastUkey []byte
@ -394,7 +378,7 @@ func (b *tableCompactionBuilder) appendKV(key, value []byte) error {
select {
case ch := <-b.db.tcompPauseC:
b.db.pauseCompaction(ch)
case _, _ = <-b.db.closeC:
case <-b.db.closeC:
b.db.compactionExitTransact()
default:
}
@ -421,9 +405,9 @@ func (b *tableCompactionBuilder) flush() error {
if err != nil {
return err
}
b.rec.addTableFile(b.c.level+1, t)
b.rec.addTableFile(b.c.sourceLevel+1, t)
b.stat1.write += t.size
b.s.logf("table@build created L%d@%d N·%d S·%s %q:%q", b.c.level+1, t.file.Num(), b.tw.tw.EntriesLen(), shortenb(int(t.size)), t.imin, t.imax)
b.s.logf("table@build created L%d@%d N·%d S·%s %q:%q", b.c.sourceLevel+1, t.fd.Num, b.tw.tw.EntriesLen(), shortenb(int(t.size)), t.imin, t.imax)
b.tw = nil
return nil
}
@ -468,7 +452,7 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {
}
ikey := iter.Key()
ukey, seq, kt, kerr := parseIkey(ikey)
ukey, seq, kt, kerr := parseInternalKey(ikey)
if kerr == nil {
shouldStop := !resumed && b.c.shouldStopBefore(ikey)
@ -494,14 +478,14 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {
hasLastUkey = true
lastUkey = append(lastUkey[:0], ukey...)
lastSeq = kMaxSeq
lastSeq = keyMaxSeq
}
switch {
case lastSeq <= b.minSeq:
// Dropped because newer entry for same user key exist
fallthrough // (A)
case kt == ktDel && seq <= b.minSeq && b.c.baseLevelForKey(lastUkey):
case kt == keyTypeDel && seq <= b.minSeq && b.c.baseLevelForKey(lastUkey):
// For this user key:
// (1) there is no data in higher levels
// (2) data in lower levels will have larger seq numbers
@ -523,7 +507,7 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {
// Don't drop corrupted keys.
hasLastUkey = false
lastUkey = lastUkey[:0]
lastSeq = kMaxSeq
lastSeq = keyMaxSeq
b.kerrCnt++
}
@ -546,8 +530,7 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {
func (b *tableCompactionBuilder) revert() error {
for _, at := range b.rec.addedTables {
b.s.logf("table@build revert @%d", at.num)
f := b.s.getTableFile(at.num)
if err := f.Remove(); err != nil {
if err := b.s.stor.Remove(storage.FileDesc{Type: storage.TypeTable, Num: at.num}); err != nil {
return err
}
}
@ -557,31 +540,29 @@ func (b *tableCompactionBuilder) revert() error {
func (db *DB) tableCompaction(c *compaction, noTrivial bool) {
defer c.release()
rec := &sessionRecord{numLevel: db.s.o.GetNumLevel()}
rec.addCompPtr(c.level, c.imax)
rec := &sessionRecord{}
rec.addCompPtr(c.sourceLevel, c.imax)
if !noTrivial && c.trivial() {
t := c.tables[0][0]
db.logf("table@move L%d@%d -> L%d", c.level, t.file.Num(), c.level+1)
rec.delTable(c.level, t.file.Num())
rec.addTableFile(c.level+1, t)
db.compactionTransactFunc("table@move", func(cnt *compactionTransactCounter) (err error) {
return db.s.commit(rec)
}, nil)
t := c.levels[0][0]
db.logf("table@move L%d@%d -> L%d", c.sourceLevel, t.fd.Num, c.sourceLevel+1)
rec.delTable(c.sourceLevel, t.fd.Num)
rec.addTableFile(c.sourceLevel+1, t)
db.compactionCommit("table-move", rec)
return
}
var stats [2]cStatsStaging
for i, tables := range c.tables {
var stats [2]cStatStaging
for i, tables := range c.levels {
for _, t := range tables {
stats[i].read += t.size
// Insert deleted tables into record
rec.delTable(c.level+i, t.file.Num())
rec.delTable(c.sourceLevel+i, t.fd.Num)
}
}
sourceSize := int(stats[0].read + stats[1].read)
minSeq := db.minSeq()
db.logf("table@compaction L%d·%d -> L%d·%d S·%s Q·%d", c.level, len(c.tables[0]), c.level+1, len(c.tables[1]), shortenb(sourceSize), minSeq)
db.logf("table@compaction L%d·%d -> L%d·%d S·%s Q·%d", c.sourceLevel, len(c.levels[0]), c.sourceLevel+1, len(c.levels[1]), shortenb(sourceSize), minSeq)
b := &tableCompactionBuilder{
db: db,
@ -591,49 +572,60 @@ func (db *DB) tableCompaction(c *compaction, noTrivial bool) {
stat1: &stats[1],
minSeq: minSeq,
strict: db.s.o.GetStrict(opt.StrictCompaction),
tableSize: db.s.o.GetCompactionTableSize(c.level + 1),
tableSize: db.s.o.GetCompactionTableSize(c.sourceLevel + 1),
}
db.compactionTransact("table@build", b)
// Commit changes
db.compactionTransactFunc("table@commit", func(cnt *compactionTransactCounter) (err error) {
stats[1].startTimer()
defer stats[1].stopTimer()
return db.s.commit(rec)
}, nil)
// Commit.
stats[1].startTimer()
db.compactionCommit("table", rec)
stats[1].stopTimer()
resultSize := int(stats[1].write)
db.logf("table@compaction committed F%s S%s Ke·%d D·%d T·%v", sint(len(rec.addedTables)-len(rec.deletedTables)), sshortenb(resultSize-sourceSize), b.kerrCnt, b.dropCnt, stats[1].duration)
// Save compaction stats
for i := range stats {
db.compStats[c.level+1].add(&stats[i])
db.compStats.addStat(c.sourceLevel+1, &stats[i])
}
}
func (db *DB) tableRangeCompaction(level int, umin, umax []byte) {
func (db *DB) tableRangeCompaction(level int, umin, umax []byte) error {
db.logf("table@compaction range L%d %q:%q", level, umin, umax)
if level >= 0 {
if c := db.s.getCompactionRange(level, umin, umax); c != nil {
if c := db.s.getCompactionRange(level, umin, umax, true); c != nil {
db.tableCompaction(c, true)
}
} else {
v := db.s.version()
m := 1
for i, t := range v.tables[1:] {
if t.overlaps(db.s.icmp, umin, umax, false) {
m = i + 1
}
}
v.release()
// Retry until nothing to compact.
for {
compacted := false
for level := 0; level < m; level++ {
if c := db.s.getCompactionRange(level, umin, umax); c != nil {
db.tableCompaction(c, true)
// Scan for maximum level with overlapped tables.
v := db.s.version()
m := 1
for i := m; i < len(v.levels); i++ {
tables := v.levels[i]
if tables.overlaps(db.s.icmp, umin, umax, false) {
m = i
}
}
v.release()
for level := 0; level < m; level++ {
if c := db.s.getCompactionRange(level, umin, umax, false); c != nil {
db.tableCompaction(c, true)
compacted = true
}
}
if !compacted {
break
}
}
}
return nil
}
func (db *DB) tableAutoCompaction() {
@ -651,7 +643,7 @@ func (db *DB) tableNeedCompaction() bool {
func (db *DB) pauseCompaction(ch chan<- struct{}) {
select {
case ch <- struct{}{}:
case _, _ = <-db.closeC:
case <-db.closeC:
db.compactionExitTransact()
}
}
@ -660,11 +652,11 @@ type cCmd interface {
ack(err error)
}
type cIdle struct {
type cAuto struct {
ackC chan<- error
}
func (r cIdle) ack(err error) {
func (r cAuto) ack(err error) {
if r.ackC != nil {
defer func() {
recover()
@ -688,38 +680,38 @@ func (r cRange) ack(err error) {
}
}
// This will trigger auto compation and/or wait for all compaction to be done.
func (db *DB) compSendIdle(compC chan<- cCmd) (err error) {
// This will trigger auto compaction but will not wait for it.
func (db *DB) compTrigger(compC chan<- cCmd) {
select {
case compC <- cAuto{}:
default:
}
}
// This will trigger auto compaction and/or wait for all compaction to be done.
func (db *DB) compTriggerWait(compC chan<- cCmd) (err error) {
ch := make(chan error)
defer close(ch)
// Send cmd.
select {
case compC <- cIdle{ch}:
case compC <- cAuto{ch}:
case err = <-db.compErrC:
return
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
// Wait cmd.
select {
case err = <-ch:
case err = <-db.compErrC:
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
return err
}
// This will trigger auto compaction but will not wait for it.
func (db *DB) compSendTrigger(compC chan<- cCmd) {
select {
case compC <- cIdle{}:
default:
}
}
// Send range compaction request.
func (db *DB) compSendRange(compC chan<- cCmd, level int, min, max []byte) (err error) {
func (db *DB) compTriggerRange(compC chan<- cCmd, level int, min, max []byte) (err error) {
ch := make(chan error)
defer close(ch)
// Send cmd.
@ -727,14 +719,14 @@ func (db *DB) compSendRange(compC chan<- cCmd, level int, min, max []byte) (err
case compC <- cRange{level, min, max, ch}:
case err := <-db.compErrC:
return err
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
// Wait cmd.
select {
case err = <-ch:
case err = <-db.compErrC:
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
return err
@ -759,14 +751,14 @@ func (db *DB) mCompaction() {
select {
case x = <-db.mcompCmdC:
switch x.(type) {
case cIdle:
case cAuto:
db.memCompaction()
x.ack(nil)
x = nil
default:
panic("leveldb: unknown command")
}
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}
@ -799,7 +791,7 @@ func (db *DB) tCompaction() {
case ch := <-db.tcompPauseC:
db.pauseCompaction(ch)
continue
case _, _ = <-db.closeC:
case <-db.closeC:
return
default:
}
@ -814,17 +806,16 @@ func (db *DB) tCompaction() {
case ch := <-db.tcompPauseC:
db.pauseCompaction(ch)
continue
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}
if x != nil {
switch cmd := x.(type) {
case cIdle:
case cAuto:
ackQ = append(ackQ, x)
case cRange:
db.tableRangeCompaction(cmd.level, cmd.min, cmd.max)
x.ack(nil)
x.ack(db.tableRangeCompaction(cmd.level, cmd.min, cmd.max))
default:
panic("leveldb: unknown command")
}
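
The renamed helpers above separate two ways of poking the compaction goroutine: compTrigger posts a command only if the worker is ready to take it, while compTriggerWait posts a command and blocks until it is acknowledged or the DB is closed. What follows is a minimal standalone sketch of that channel pattern, not the goleveldb code itself. The names request, trigger, triggerWait, worker and errClosed are hypothetical stand-ins, and the acknowledgement path is simplified compared to the real ack methods.

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is a hypothetical stand-in for the package's ErrClosed.
var errClosed = errors.New("closed")

// request carries an optional ack channel, similar in spirit to cAuto{ackC}.
type request struct {
	ackC chan<- error
}

// trigger posts a request only if a receiver is ready; otherwise it drops it.
// It reports whether the request was delivered.
func trigger(reqC chan<- request) bool {
	select {
	case reqC <- request{}:
		return true
	default:
		return false
	}
}

// triggerWait posts a request and blocks until it is acknowledged,
// or until closeC is closed.
func triggerWait(reqC chan<- request, closeC <-chan struct{}) error {
	ackC := make(chan error)
	// Send the command.
	select {
	case reqC <- request{ackC: ackC}:
	case <-closeC:
		return errClosed
	}
	// Wait for the acknowledgement.
	select {
	case err := <-ackC:
		return err
	case <-closeC:
		return errClosed
	}
}

// worker acknowledges every request that carries an ack channel.
func worker(reqC <-chan request, closeC <-chan struct{}) {
	for {
		select {
		case r := <-reqC:
			if r.ackC != nil {
				select {
				case r.ackC <- nil:
				case <-closeC:
					return
				}
			}
		case <-closeC:
			return
		}
	}
}

func main() {
	reqC := make(chan request)
	closeC := make(chan struct{})

	// No worker is receiving yet, so the non-blocking trigger is dropped.
	fmt.Println("trigger delivered:", trigger(reqC))

	go worker(reqC, closeC)
	fmt.Println("triggerWait error:", triggerWait(reqC, closeC))

	// After close, a wait on a channel nobody serves returns errClosed.
	close(closeC)
	idle := make(chan request)
	fmt.Println("after close:", triggerWait(idle, closeC))
}
```

The non-blocking form suits hints such as "a seek sample suggests compaction might help", where losing the hint is harmless. The blocking form suits callers that must not proceed until the work is done, which is the situation a forced compaction at HaveCache setup is in.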


@ -19,7 +19,7 @@ import (
)
var (
errInvalidIkey = errors.New("leveldb: Iterator: invalid internal key")
errInvalidInternalKey = errors.New("leveldb: Iterator: invalid internal key")
)
type memdbReleaser struct {
@ -33,40 +33,50 @@ func (mr *memdbReleaser) Release() {
})
}
func (db *DB) newRawIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
func (db *DB) newRawIterator(auxm *memDB, auxt tFiles, slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
strict := opt.GetStrict(db.s.o.Options, ro, opt.StrictReader)
em, fm := db.getMems()
v := db.s.version()
ti := v.getIterators(slice, ro)
n := len(ti) + 2
i := make([]iterator.Iterator, 0, n)
emi := em.mdb.NewIterator(slice)
emi.SetReleaser(&memdbReleaser{m: em})
i = append(i, emi)
if fm != nil {
fmi := fm.mdb.NewIterator(slice)
fmi.SetReleaser(&memdbReleaser{m: fm})
i = append(i, fmi)
tableIts := v.getIterators(slice, ro)
n := len(tableIts) + len(auxt) + 3
its := make([]iterator.Iterator, 0, n)
if auxm != nil {
ami := auxm.NewIterator(slice)
ami.SetReleaser(&memdbReleaser{m: auxm})
its = append(its, ami)
}
i = append(i, ti...)
strict := opt.GetStrict(db.s.o.Options, ro, opt.StrictReader)
mi := iterator.NewMergedIterator(i, db.s.icmp, strict)
for _, t := range auxt {
its = append(its, v.s.tops.newIterator(t, slice, ro))
}
emi := em.NewIterator(slice)
emi.SetReleaser(&memdbReleaser{m: em})
its = append(its, emi)
if fm != nil {
fmi := fm.NewIterator(slice)
fmi.SetReleaser(&memdbReleaser{m: fm})
its = append(its, fmi)
}
its = append(its, tableIts...)
mi := iterator.NewMergedIterator(its, db.s.icmp, strict)
mi.SetReleaser(&versionReleaser{v: v})
return mi
}
func (db *DB) newIterator(seq uint64, slice *util.Range, ro *opt.ReadOptions) *dbIter {
func (db *DB) newIterator(auxm *memDB, auxt tFiles, seq uint64, slice *util.Range, ro *opt.ReadOptions) *dbIter {
var islice *util.Range
if slice != nil {
islice = &util.Range{}
if slice.Start != nil {
islice.Start = newIkey(slice.Start, kMaxSeq, ktSeek)
islice.Start = makeInternalKey(nil, slice.Start, keyMaxSeq, keyTypeSeek)
}
if slice.Limit != nil {
islice.Limit = newIkey(slice.Limit, kMaxSeq, ktSeek)
islice.Limit = makeInternalKey(nil, slice.Limit, keyMaxSeq, keyTypeSeek)
}
}
rawIter := db.newRawIterator(islice, ro)
rawIter := db.newRawIterator(auxm, auxt, islice, ro)
iter := &dbIter{
db: db,
icmp: db.s.icmp,
@ -177,7 +187,7 @@ func (i *dbIter) Seek(key []byte) bool {
return false
}
ikey := newIkey(key, i.seq, ktSeek)
ikey := makeInternalKey(nil, key, i.seq, keyTypeSeek)
if i.iter.Seek(ikey) {
i.dir = dirSOI
return i.next()
@ -189,15 +199,15 @@ func (i *dbIter) Seek(key []byte) bool {
func (i *dbIter) next() bool {
for {
if ukey, seq, kt, kerr := parseIkey(i.iter.Key()); kerr == nil {
if ukey, seq, kt, kerr := parseInternalKey(i.iter.Key()); kerr == nil {
i.sampleSeek()
if seq <= i.seq {
switch kt {
case ktDel:
case keyTypeDel:
// Skip deleted key.
i.key = append(i.key[:0], ukey...)
i.dir = dirForward
case ktVal:
case keyTypeVal:
if i.dir == dirSOI || i.icmp.uCompare(ukey, i.key) > 0 {
i.key = append(i.key[:0], ukey...)
i.value = append(i.value[:0], i.iter.Value()...)
@ -240,13 +250,13 @@ func (i *dbIter) prev() bool {
del := true
if i.iter.Valid() {
for {
if ukey, seq, kt, kerr := parseIkey(i.iter.Key()); kerr == nil {
if ukey, seq, kt, kerr := parseInternalKey(i.iter.Key()); kerr == nil {
i.sampleSeek()
if seq <= i.seq {
if !del && i.icmp.uCompare(ukey, i.key) < 0 {
return true
}
del = (kt == ktDel)
del = (kt == keyTypeDel)
if !del {
i.key = append(i.key[:0], ukey...)
i.value = append(i.value[:0], i.iter.Value()...)
@ -282,7 +292,7 @@ func (i *dbIter) Prev() bool {
return i.Last()
case dirForward:
for i.iter.Prev() {
if ukey, _, _, kerr := parseIkey(i.iter.Key()); kerr == nil {
if ukey, _, _, kerr := parseInternalKey(i.iter.Key()); kerr == nil {
i.sampleSeek()
if i.icmp.uCompare(ukey, i.key) < 0 {
goto cont


@ -59,7 +59,7 @@ func (db *DB) releaseSnapshot(se *snapshotElement) {
}
}
// Gets minimum sequence that not being snapshoted.
// Gets minimum sequence that not being snapshotted.
func (db *DB) minSeq() uint64 {
db.snapsMu.Lock()
defer db.snapsMu.Unlock()
@ -110,7 +110,7 @@ func (snap *Snapshot) Get(key []byte, ro *opt.ReadOptions) (value []byte, err er
err = ErrSnapshotReleased
return
}
return snap.db.get(key, snap.elem.seq, ro)
return snap.db.get(nil, nil, key, snap.elem.seq, ro)
}
// Has returns true if the DB does contains the given key.
@ -127,11 +127,11 @@ func (snap *Snapshot) Has(key []byte, ro *opt.ReadOptions) (ret bool, err error)
err = ErrSnapshotReleased
return
}
return snap.db.has(key, snap.elem.seq, ro)
return snap.db.has(nil, nil, key, snap.elem.seq, ro)
}
// NewIterator returns an iterator for the snapshot of the uderlying DB.
// The returned iterator is not goroutine-safe, but it is safe to use
// NewIterator returns an iterator for the snapshot of the underlying DB.
// The returned iterator is not safe for concurrent use, but it is safe to use
// multiple iterators concurrently, with each in a dedicated goroutine.
// It is also safe to use an iterator concurrently with modifying its
// underlying DB. The resultant key/value pairs are guaranteed to be
@ -158,7 +158,7 @@ func (snap *Snapshot) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterat
}
// Since iterator already hold version ref, it doesn't need to
// hold snapshot ref.
return snap.db.newIterator(snap.elem.seq, slice, ro)
return snap.db.newIterator(nil, nil, snap.elem.seq, slice, ro)
}
// Release releases the snapshot. This will not release any returned


@ -7,19 +7,29 @@
package leveldb
import (
"errors"
"sync/atomic"
"time"
"github.com/syndtr/goleveldb/leveldb/journal"
"github.com/syndtr/goleveldb/leveldb/memdb"
"github.com/syndtr/goleveldb/leveldb/storage"
)
var (
errHasFrozenMem = errors.New("has frozen mem")
)
type memDB struct {
db *DB
mdb *memdb.DB
db *DB
*memdb.DB
ref int32
}
func (m *memDB) getref() int32 {
return atomic.LoadInt32(&m.ref)
}
func (m *memDB) incref() {
atomic.AddInt32(&m.ref, 1)
}
@ -27,12 +37,12 @@ func (m *memDB) incref() {
func (m *memDB) decref() {
if ref := atomic.AddInt32(&m.ref, -1); ref == 0 {
// Only put back memdb with std capacity.
if m.mdb.Capacity() == m.db.s.o.GetWriteBuffer() {
m.mdb.Reset()
m.db.mpoolPut(m.mdb)
if m.Capacity() == m.db.s.o.GetWriteBuffer() {
m.Reset()
m.db.mpoolPut(m.DB)
}
m.db = nil
m.mdb = nil
m.DB = nil
} else if ref < 0 {
panic("negative memdb ref")
}
@ -48,31 +58,40 @@ func (db *DB) addSeq(delta uint64) {
atomic.AddUint64(&db.seq, delta)
}
func (db *DB) sampleSeek(ikey iKey) {
func (db *DB) setSeq(seq uint64) {
atomic.StoreUint64(&db.seq, seq)
}
func (db *DB) sampleSeek(ikey internalKey) {
v := db.s.version()
if v.sampleSeek(ikey) {
// Trigger table compaction.
db.compSendTrigger(db.tcompCmdC)
db.compTrigger(db.tcompCmdC)
}
v.release()
}
func (db *DB) mpoolPut(mem *memdb.DB) {
defer func() {
recover()
}()
select {
case db.memPool <- mem:
default:
if !db.isClosed() {
select {
case db.memPool <- mem:
default:
}
}
}
func (db *DB) mpoolGet() *memdb.DB {
func (db *DB) mpoolGet(n int) *memDB {
var mdb *memdb.DB
select {
case mem := <-db.memPool:
return mem
case mdb = <-db.memPool:
default:
return nil
}
if mdb == nil || mdb.Capacity() < n {
mdb = memdb.New(db.s.icmp, maxInt(db.s.o.GetWriteBuffer(), n))
}
return &memDB{
db: db,
DB: mdb,
}
}
@ -85,7 +104,13 @@ func (db *DB) mpoolDrain() {
case <-db.memPool:
default:
}
case _, _ = <-db.closeC:
case <-db.closeC:
ticker.Stop()
// Make sure the pool is drained.
select {
case <-db.memPool:
case <-time.After(time.Second):
}
close(db.memPool)
return
}
@ -95,11 +120,10 @@ func (db *DB) mpoolDrain() {
// Create new memdb and froze the old one; need external synchronization.
// newMem only called synchronously by the writer.
func (db *DB) newMem(n int) (mem *memDB, err error) {
num := db.s.allocFileNum()
file := db.s.getJournalFile(num)
w, err := file.Create()
fd := storage.FileDesc{Type: storage.TypeJournal, Num: db.s.allocFileNum()}
w, err := db.s.stor.Create(fd)
if err != nil {
db.s.reuseFileNum(num)
db.s.reuseFileNum(fd.Num)
return
}
@ -107,7 +131,7 @@ func (db *DB) newMem(n int) (mem *memDB, err error) {
defer db.memMu.Unlock()
if db.frozenMem != nil {
panic("still has frozen mem")
return nil, errHasFrozenMem
}
if db.journal == nil {
@ -115,20 +139,14 @@ func (db *DB) newMem(n int) (mem *memDB, err error) {
} else {
db.journal.Reset(w)
db.journalWriter.Close()
db.frozenJournalFile = db.journalFile
db.frozenJournalFd = db.journalFd
}
db.journalWriter = w
db.journalFile = file
db.journalFd = fd
db.frozenMem = db.mem
mdb := db.mpoolGet()
if mdb == nil || mdb.Capacity() < n {
mdb = memdb.New(db.s.icmp, maxInt(db.s.o.GetWriteBuffer(), n))
}
mem = &memDB{
db: db,
mdb: mdb,
ref: 2,
}
mem = db.mpoolGet(n)
mem.incref() // for self
mem.incref() // for caller
db.mem = mem
// The seq only incremented by the writer. And whoever called newMem
// should hold write lock, so no need additional synchronization here.
@ -140,24 +158,26 @@ func (db *DB) newMem(n int) (mem *memDB, err error) {
func (db *DB) getMems() (e, f *memDB) {
db.memMu.RLock()
defer db.memMu.RUnlock()
if db.mem == nil {
if db.mem != nil {
db.mem.incref()
} else if !db.isClosed() {
panic("nil effective mem")
}
db.mem.incref()
if db.frozenMem != nil {
db.frozenMem.incref()
}
return db.mem, db.frozenMem
}
// Get frozen memdb.
// Get effective memdb.
func (db *DB) getEffectiveMem() *memDB {
db.memMu.RLock()
defer db.memMu.RUnlock()
if db.mem == nil {
if db.mem != nil {
db.mem.incref()
} else if !db.isClosed() {
panic("nil effective mem")
}
db.mem.incref()
return db.mem
}
@ -181,17 +201,25 @@ func (db *DB) getFrozenMem() *memDB {
// Drop frozen memdb; assume that frozen memdb isn't nil.
func (db *DB) dropFrozenMem() {
db.memMu.Lock()
if err := db.frozenJournalFile.Remove(); err != nil {
db.logf("journal@remove removing @%d %q", db.frozenJournalFile.Num(), err)
if err := db.s.stor.Remove(db.frozenJournalFd); err != nil {
db.logf("journal@remove removing @%d %q", db.frozenJournalFd.Num, err)
} else {
db.logf("journal@remove removed @%d", db.frozenJournalFile.Num())
db.logf("journal@remove removed @%d", db.frozenJournalFd.Num)
}
db.frozenJournalFile = nil
db.frozenJournalFd = storage.FileDesc{}
db.frozenMem.decref()
db.frozenMem = nil
db.memMu.Unlock()
}
// Clear mems ptr; used by DB.Close().
func (db *DB) clearMems() {
db.memMu.Lock()
db.mem = nil
db.frozenMem = nil
db.memMu.Unlock()
}
// Set closed flag; return true if not already closed.
func (db *DB) setClosed() bool {
return atomic.CompareAndSwapUint32(&db.closed, 0, 1)

File diff suppressed because it is too large


@ -0,0 +1,325 @@
// Copyright (c) 2016, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
package leveldb
import (
"errors"
"sync"
"time"
"github.com/syndtr/goleveldb/leveldb/iterator"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/util"
)
var errTransactionDone = errors.New("leveldb: transaction already closed")
// Transaction is the transaction handle.
type Transaction struct {
db *DB
lk sync.RWMutex
seq uint64
mem *memDB
tables tFiles
ikScratch []byte
rec sessionRecord
stats cStatStaging
closed bool
}
// Get gets the value for the given key. It returns ErrNotFound if the
// DB does not contain the key.
//
// The returned slice is its own copy, it is safe to modify the contents
// of the returned slice.
// It is safe to modify the contents of the argument after Get returns.
func (tr *Transaction) Get(key []byte, ro *opt.ReadOptions) ([]byte, error) {
tr.lk.RLock()
defer tr.lk.RUnlock()
if tr.closed {
return nil, errTransactionDone
}
return tr.db.get(tr.mem.DB, tr.tables, key, tr.seq, ro)
}
// Has returns true if the DB contains the given key.
//
// It is safe to modify the contents of the argument after Has returns.
func (tr *Transaction) Has(key []byte, ro *opt.ReadOptions) (bool, error) {
tr.lk.RLock()
defer tr.lk.RUnlock()
if tr.closed {
return false, errTransactionDone
}
return tr.db.has(tr.mem.DB, tr.tables, key, tr.seq, ro)
}
// NewIterator returns an iterator for the latest snapshot of the transaction.
// The returned iterator is not safe for concurrent use, but it is safe to use
// multiple iterators concurrently, with each in a dedicated goroutine.
// It is also safe to use an iterator concurrently with writes to the
// transaction. The resultant key/value pairs are guaranteed to be consistent.
//
// Slice allows slicing the iterator to contain only keys in the given
// range. A nil Range.Start is treated as a key before all keys in the
// DB. And a nil Range.Limit is treated as a key after all keys in
// the DB.
//
// The iterator must be released after use, by calling Release method.
//
// Also read Iterator documentation of the leveldb/iterator package.
func (tr *Transaction) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
tr.lk.RLock()
defer tr.lk.RUnlock()
if tr.closed {
return iterator.NewEmptyIterator(errTransactionDone)
}
tr.mem.incref()
return tr.db.newIterator(tr.mem, tr.tables, tr.seq, slice, ro)
}
func (tr *Transaction) flush() error {
// Flush memdb.
if tr.mem.Len() != 0 {
tr.stats.startTimer()
iter := tr.mem.NewIterator(nil)
t, n, err := tr.db.s.tops.createFrom(iter)
iter.Release()
tr.stats.stopTimer()
if err != nil {
return err
}
if tr.mem.getref() == 1 {
tr.mem.Reset()
} else {
tr.mem.decref()
tr.mem = tr.db.mpoolGet(0)
tr.mem.incref()
}
tr.tables = append(tr.tables, t)
tr.rec.addTableFile(0, t)
tr.stats.write += t.size
tr.db.logf("transaction@flush created L0@%d N·%d S·%s %q:%q", t.fd.Num, n, shortenb(int(t.size)), t.imin, t.imax)
}
return nil
}
func (tr *Transaction) put(kt keyType, key, value []byte) error {
tr.ikScratch = makeInternalKey(tr.ikScratch, key, tr.seq+1, kt)
if tr.mem.Free() < len(tr.ikScratch)+len(value) {
if err := tr.flush(); err != nil {
return err
}
}
if err := tr.mem.Put(tr.ikScratch, value); err != nil {
return err
}
tr.seq++
return nil
}
// Put sets the value for the given key. It overwrites any previous value
// for that key; a DB is not a multi-map.
// Please note that the transaction is not compacted until committed, so if you
// write the same key 10 times, all 10 entries are kept in the transaction.
//
// It is safe to modify the contents of the arguments after Put returns.
func (tr *Transaction) Put(key, value []byte, wo *opt.WriteOptions) error {
tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
return tr.put(keyTypeVal, key, value)
}
// Delete deletes the value for the given key.
// Please note that the transaction is not compacted until committed, so if you
// delete the same key 10 times, all 10 entries are kept in the transaction.
//
// It is safe to modify the contents of the arguments after Delete returns.
func (tr *Transaction) Delete(key []byte, wo *opt.WriteOptions) error {
tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
return tr.put(keyTypeDel, key, nil)
}
// Write applies the given batch to the transaction. The batch will be applied
// sequentially.
// Please note that the transaction is not compacted until committed, so if you
// write the same key 10 times, all 10 entries are kept in the transaction.
//
// It is safe to modify the contents of the arguments after Write returns.
func (tr *Transaction) Write(b *Batch, wo *opt.WriteOptions) error {
if b == nil || b.Len() == 0 {
return nil
}
tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
return b.replayInternal(func(i int, kt keyType, k, v []byte) error {
return tr.put(kt, k, v)
})
}
func (tr *Transaction) setDone() {
tr.closed = true
tr.db.tr = nil
tr.mem.decref()
<-tr.db.writeLockC
}
// Commit commits the transaction. If error is not nil, then the transaction is
// not committed, it can then either be retried or discarded.
//
// Other methods should not be called after transaction has been committed.
func (tr *Transaction) Commit() error {
if err := tr.db.ok(); err != nil {
return err
}
tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
if err := tr.flush(); err != nil {
// Return error, lets user decide either to retry or discard
// transaction.
return err
}
if len(tr.tables) != 0 {
// Committing transaction.
tr.rec.setSeqNum(tr.seq)
tr.db.compCommitLk.Lock()
tr.stats.startTimer()
var cerr error
for retry := 0; retry < 3; retry++ {
cerr = tr.db.s.commit(&tr.rec)
if cerr != nil {
tr.db.logf("transaction@commit error R·%d %q", retry, cerr)
select {
case <-time.After(time.Second):
case <-tr.db.closeC:
tr.db.logf("transaction@commit exiting")
tr.db.compCommitLk.Unlock()
return cerr
}
} else {
// Success. Set db.seq.
tr.db.setSeq(tr.seq)
break
}
}
tr.stats.stopTimer()
if cerr != nil {
// Return error, lets user decide either to retry or discard
// transaction.
return cerr
}
// Update compaction stats. This is safe as long as we hold compCommitLk.
tr.db.compStats.addStat(0, &tr.stats)
// Trigger table auto-compaction.
tr.db.compTrigger(tr.db.tcompCmdC)
tr.db.compCommitLk.Unlock()
// Additionally, wait compaction when certain threshold reached.
// Ignore error, returns error only if transaction can't be committed.
tr.db.waitCompaction()
}
// Only mark as done if transaction committed successfully.
tr.setDone()
return nil
}
func (tr *Transaction) discard() {
// Discard transaction.
for _, t := range tr.tables {
tr.db.logf("transaction@discard @%d", t.fd.Num)
if err1 := tr.db.s.stor.Remove(t.fd); err1 == nil {
tr.db.s.reuseFileNum(t.fd.Num)
}
}
}
// Discard discards the transaction.
//
// Other methods should not be called after transaction has been discarded.
func (tr *Transaction) Discard() {
tr.lk.Lock()
if !tr.closed {
tr.discard()
tr.setDone()
}
tr.lk.Unlock()
}
func (db *DB) waitCompaction() error {
if db.s.tLen(0) >= db.s.o.GetWriteL0PauseTrigger() {
return db.compTriggerWait(db.tcompCmdC)
}
return nil
}
// OpenTransaction opens an atomic DB transaction. Only one transaction can be
// open at a time. Subsequent calls to Write and OpenTransaction will block
// until the in-flight transaction is committed or discarded.
// The returned transaction handle is safe for concurrent use.
//
// Transaction is expensive and can overwhelm compaction, especially if
// transaction size is small. Use with caution.
//
// The transaction must be closed once done, either by committing or discarding
// the transaction.
// Closing the DB will discard open transaction.
func (db *DB) OpenTransaction() (*Transaction, error) {
if err := db.ok(); err != nil {
return nil, err
}
// The write happen synchronously.
select {
case db.writeLockC <- struct{}{}:
case err := <-db.compPerErrC:
return nil, err
case <-db.closeC:
return nil, ErrClosed
}
if db.tr != nil {
panic("leveldb: has open transaction")
}
// Flush current memdb.
if db.mem != nil && db.mem.Len() != 0 {
if _, err := db.rotateMem(0, true); err != nil {
return nil, err
}
}
// Wait compaction when certain threshold reached.
if err := db.waitCompaction(); err != nil {
return nil, err
}
tr := &Transaction{
db: db,
seq: db.seq,
mem: db.mpoolGet(0),
}
tr.mem.incref()
db.tr = tr
return tr, nil
}


@ -21,14 +21,16 @@ type Reader interface {
NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator
}
type Sizes []uint64
// Sizes is list of size.
type Sizes []int64
// Sum returns sum of the sizes.
func (p Sizes) Sum() (n uint64) {
for _, s := range p {
n += s
func (sizes Sizes) Sum() int64 {
var sum int64
for _, size := range sizes {
sum += size
}
return n
return sum
}
// Logging.
@ -40,59 +42,59 @@ func (db *DB) checkAndCleanFiles() error {
v := db.s.version()
defer v.release()
tablesMap := make(map[uint64]bool)
for _, tables := range v.tables {
tmap := make(map[int64]bool)
for _, tables := range v.levels {
for _, t := range tables {
tablesMap[t.file.Num()] = false
tmap[t.fd.Num] = false
}
}
files, err := db.s.getFiles(storage.TypeAll)
fds, err := db.s.stor.List(storage.TypeAll)
if err != nil {
return err
}
var nTables int
var rem []storage.File
for _, f := range files {
var nt int
var rem []storage.FileDesc
for _, fd := range fds {
keep := true
switch f.Type() {
switch fd.Type {
case storage.TypeManifest:
keep = f.Num() >= db.s.manifestFile.Num()
keep = fd.Num >= db.s.manifestFd.Num
case storage.TypeJournal:
if db.frozenJournalFile != nil {
keep = f.Num() >= db.frozenJournalFile.Num()
if !db.frozenJournalFd.Zero() {
keep = fd.Num >= db.frozenJournalFd.Num
} else {
keep = f.Num() >= db.journalFile.Num()
keep = fd.Num >= db.journalFd.Num
}
case storage.TypeTable:
_, keep = tablesMap[f.Num()]
_, keep = tmap[fd.Num]
if keep {
tablesMap[f.Num()] = true
nTables++
tmap[fd.Num] = true
nt++
}
}
if !keep {
rem = append(rem, f)
rem = append(rem, fd)
}
}
if nTables != len(tablesMap) {
var missing []*storage.FileInfo
for num, present := range tablesMap {
if nt != len(tmap) {
var mfds []storage.FileDesc
for num, present := range tmap {
if !present {
missing = append(missing, &storage.FileInfo{Type: storage.TypeTable, Num: num})
mfds = append(mfds, storage.FileDesc{storage.TypeTable, num})
db.logf("db@janitor table missing @%d", num)
}
}
return errors.NewErrCorrupted(nil, &errors.ErrMissingFiles{Files: missing})
return errors.NewErrCorrupted(storage.FileDesc{}, &errors.ErrMissingFiles{Fds: mfds})
}
db.logf("db@janitor F·%d G·%d", len(files), len(rem))
for _, f := range rem {
db.logf("db@janitor removing %s-%d", f.Type(), f.Num())
if err := f.Remove(); err != nil {
db.logf("db@janitor F·%d G·%d", len(fds), len(rem))
for _, fd := range rem {
db.logf("db@janitor removing %s-%d", fd.Type, fd.Num)
if err := db.s.stor.Remove(fd); err != nil {
return err
}
}


@ -7,6 +7,7 @@
package leveldb
import (
"sync/atomic"
"time"
"github.com/syndtr/goleveldb/leveldb/memdb"
@ -14,91 +15,95 @@ import (
"github.com/syndtr/goleveldb/leveldb/util"
)
func (db *DB) writeJournal(b *Batch) error {
w, err := db.journal.Next()
func (db *DB) writeJournal(batches []*Batch, seq uint64, sync bool) error {
wr, err := db.journal.Next()
if err != nil {
return err
}
if _, err := w.Write(b.encode()); err != nil {
if err := writeBatchesWithHeader(wr, batches, seq); err != nil {
return err
}
if err := db.journal.Flush(); err != nil {
return err
}
if b.sync {
if sync {
return db.journalWriter.Sync()
}
return nil
}
func (db *DB) jWriter() {
defer db.closeW.Done()
for {
select {
case b := <-db.journalC:
if b != nil {
db.journalAckC <- db.writeJournal(b)
}
case _, _ = <-db.closeC:
return
}
}
}
func (db *DB) rotateMem(n int) (mem *memDB, err error) {
func (db *DB) rotateMem(n int, wait bool) (mem *memDB, err error) {
retryLimit := 3
retry:
// Wait for pending memdb compaction.
err = db.compSendIdle(db.mcompCmdC)
err = db.compTriggerWait(db.mcompCmdC)
if err != nil {
return
}
retryLimit--
// Create new memdb and journal.
mem, err = db.newMem(n)
if err != nil {
if err == errHasFrozenMem {
if retryLimit <= 0 {
panic("BUG: still has frozen memdb")
}
goto retry
}
return
}
// Schedule memdb compaction.
db.compSendTrigger(db.mcompCmdC)
if wait {
err = db.compTriggerWait(db.mcompCmdC)
} else {
db.compTrigger(db.mcompCmdC)
}
return
}
func (db *DB) flush(n int) (mem *memDB, nn int, err error) {
func (db *DB) flush(n int) (mdb *memDB, mdbFree int, err error) {
delayed := false
slowdownTrigger := db.s.o.GetWriteL0SlowdownTrigger()
pauseTrigger := db.s.o.GetWriteL0PauseTrigger()
flush := func() (retry bool) {
v := db.s.version()
defer v.release()
mem = db.getEffectiveMem()
mdb = db.getEffectiveMem()
if mdb == nil {
err = ErrClosed
return false
}
defer func() {
if retry {
mem.decref()
mem = nil
mdb.decref()
mdb = nil
}
}()
nn = mem.mdb.Free()
tLen := db.s.tLen(0)
mdbFree = mdb.Free()
switch {
case v.tLen(0) >= db.s.o.GetWriteL0SlowdownTrigger() && !delayed:
case tLen >= slowdownTrigger && !delayed:
delayed = true
time.Sleep(time.Millisecond)
case nn >= n:
case mdbFree >= n:
return false
case v.tLen(0) >= db.s.o.GetWriteL0PauseTrigger():
case tLen >= pauseTrigger:
delayed = true
err = db.compSendIdle(db.tcompCmdC)
err = db.compTriggerWait(db.tcompCmdC)
if err != nil {
return false
}
default:
// Allow memdb to grow if it has no entry.
if mem.mdb.Len() == 0 {
nn = n
if mdb.Len() == 0 {
mdbFree = n
} else {
mem.decref()
mem, err = db.rotateMem(n)
mdb.decref()
mdb, err = db.rotateMem(n, false)
if err == nil {
nn = mem.mdb.Free()
mdbFree = mdb.Free()
} else {
nn = 0
mdbFree = 0
}
}
return false
@ -113,157 +118,265 @@ func (db *DB) flush(n int) (mem *memDB, nn int, err error) {
db.writeDelayN++
} else if db.writeDelayN > 0 {
db.logf("db@write was delayed N·%d T·%v", db.writeDelayN, db.writeDelay)
atomic.AddInt32(&db.cWriteDelayN, int32(db.writeDelayN))
atomic.AddInt64(&db.cWriteDelay, int64(db.writeDelay))
db.writeDelay = 0
db.writeDelayN = 0
}
return
}
// Write apply the given batch to the DB. The batch will be applied
// sequentially.
//
// It is safe to modify the contents of the arguments after Write returns.
func (db *DB) Write(b *Batch, wo *opt.WriteOptions) (err error) {
err = db.ok()
if err != nil || b == nil || b.Len() == 0 {
return
type writeMerge struct {
sync bool
batch *Batch
keyType keyType
key, value []byte
}
func (db *DB) unlockWrite(overflow bool, merged int, err error) {
for i := 0; i < merged; i++ {
db.writeAckC <- err
}
b.init(wo.GetSync())
// The write happen synchronously.
select {
case db.writeC <- b:
if <-db.writeMergedC {
return <-db.writeAckC
}
case db.writeLockC <- struct{}{}:
case err = <-db.compPerErrC:
return
case _, _ = <-db.closeC:
return ErrClosed
if overflow {
// Pass lock to the next write (that failed to merge).
db.writeMergedC <- false
} else {
// Release lock.
<-db.writeLockC
}
}
merged := 0
danglingMerge := false
defer func() {
if danglingMerge {
db.writeMergedC <- false
} else {
<-db.writeLockC
}
for i := 0; i < merged; i++ {
db.writeAckC <- err
}
}()
mem, memFree, err := db.flush(b.size())
// ourBatch if defined should equal with batch.
func (db *DB) writeLocked(batch, ourBatch *Batch, merge, sync bool) error {
// Try to flush memdb. This method also throttles writes if they arrive
// faster than compaction can keep up.
mdb, mdbFree, err := db.flush(batch.internalLen)
if err != nil {
return
db.unlockWrite(false, 0, err)
return err
}
defer mem.decref()
defer mdb.decref()
// Calculate maximum size of the batch.
m := 1 << 20
if x := b.size(); x <= 128<<10 {
m = x + (128 << 10)
}
m = minInt(m, memFree)
var (
overflow bool
merged int
batches = []*Batch{batch}
)
// Merge with other batch.
drain:
for b.size() < m && !b.sync {
select {
case nb := <-db.writeC:
if b.size()+nb.size() <= m {
b.append(nb)
db.writeMergedC <- true
merged++
} else {
danglingMerge = true
break drain
}
default:
break drain
if merge {
// Merge limit.
var mergeLimit int
if batch.internalLen > 128<<10 {
mergeLimit = (1 << 20) - batch.internalLen
} else {
mergeLimit = 128 << 10
}
}
// Set batch first seq number relative from last seq.
b.seq = db.seq + 1
// Write journal concurrently if it is large enough.
if b.size() >= (128 << 10) {
// Push the write batch to the journal writer
select {
case db.journalC <- b:
// Write into memdb
if berr := b.memReplay(mem.mdb); berr != nil {
panic(berr)
}
case err = <-db.compPerErrC:
return
case _, _ = <-db.closeC:
err = ErrClosed
return
mergeCap := mdbFree - batch.internalLen
if mergeLimit > mergeCap {
mergeLimit = mergeCap
}
// Wait for journal writer
select {
case err = <-db.journalAckC:
if err != nil {
// Revert memdb if error detected
if berr := b.revertMemReplay(mem.mdb); berr != nil {
panic(berr)
merge:
for mergeLimit > 0 {
select {
case incoming := <-db.writeMergeC:
if incoming.batch != nil {
// Merge batch.
if incoming.batch.internalLen > mergeLimit {
overflow = true
break merge
}
batches = append(batches, incoming.batch)
mergeLimit -= incoming.batch.internalLen
} else {
// Merge put.
internalLen := len(incoming.key) + len(incoming.value) + 8
if internalLen > mergeLimit {
overflow = true
break merge
}
if ourBatch == nil {
ourBatch = db.batchPool.Get().(*Batch)
ourBatch.Reset()
batches = append(batches, ourBatch)
}
// We can use same batch since concurrent write doesn't
// guarantee write order.
ourBatch.appendRec(incoming.keyType, incoming.key, incoming.value)
mergeLimit -= internalLen
}
return
sync = sync || incoming.sync
merged++
db.writeMergedC <- true
default:
break merge
}
case _, _ = <-db.closeC:
err = ErrClosed
return
}
}
// Seq number.
seq := db.seq + 1
// Write journal.
if err := db.writeJournal(batches, seq, sync); err != nil {
db.unlockWrite(overflow, merged, err)
return err
}
// Put batches.
for _, batch := range batches {
if err := batch.putMem(seq, mdb.DB); err != nil {
panic(err)
}
seq += uint64(batch.Len())
}
// Incr seq number.
db.addSeq(uint64(batchesLen(batches)))
// Rotate memdb if it has reached the threshold.
if batch.internalLen >= mdbFree {
db.rotateMem(0, false)
}
db.unlockWrite(overflow, merged, nil)
return nil
}
// Write applies the given batch to the DB. The batch records will be applied
// sequentially. Write may be used concurrently; when it is and the batch is
// small enough, Write will try to merge the batches. Set the NoWriteMerge
// option to true to disable write merge.
//
// It is safe to modify the contents of the arguments after Write returns but
// not before. Write will not modify content of the batch.
func (db *DB) Write(batch *Batch, wo *opt.WriteOptions) error {
if err := db.ok(); err != nil || batch == nil || batch.Len() == 0 {
return err
}
// If the batch size is larger than the write buffer, it may be justified to
// write using a transaction instead. With a transaction the batch is written
// into tables directly, skipping the journaling.
if batch.internalLen > db.s.o.GetWriteBuffer() && !db.s.o.GetDisableLargeBatchTransaction() {
tr, err := db.OpenTransaction()
if err != nil {
return err
}
if err := tr.Write(batch, wo); err != nil {
tr.Discard()
return err
}
return tr.Commit()
}
merge := !wo.GetNoWriteMerge() && !db.s.o.GetNoWriteMerge()
sync := wo.GetSync() && !db.s.o.GetNoSync()
// Acquire write lock.
if merge {
select {
case db.writeMergeC <- writeMerge{sync: sync, batch: batch}:
if <-db.writeMergedC {
// Write is merged.
return <-db.writeAckC
}
// Write is not merged, the write lock is handed to us. Continue.
case db.writeLockC <- struct{}{}:
// Write lock acquired.
case err := <-db.compPerErrC:
// Compaction error.
return err
case <-db.closeC:
// Closed
return ErrClosed
}
} else {
err = db.writeJournal(b)
if err != nil {
return
}
if berr := b.memReplay(mem.mdb); berr != nil {
panic(berr)
select {
case db.writeLockC <- struct{}{}:
// Write lock acquired.
case err := <-db.compPerErrC:
// Compaction error.
return err
case <-db.closeC:
// Closed
return ErrClosed
}
}
// Set last seq number.
db.addSeq(uint64(b.Len()))
return db.writeLocked(batch, nil, merge, sync)
}
if b.size() >= memFree {
db.rotateMem(0)
func (db *DB) putRec(kt keyType, key, value []byte, wo *opt.WriteOptions) error {
if err := db.ok(); err != nil {
return err
}
return
merge := !wo.GetNoWriteMerge() && !db.s.o.GetNoWriteMerge()
sync := wo.GetSync() && !db.s.o.GetNoSync()
// Acquire write lock.
if merge {
select {
case db.writeMergeC <- writeMerge{sync: sync, keyType: kt, key: key, value: value}:
if <-db.writeMergedC {
// Write is merged.
return <-db.writeAckC
}
// Write is not merged, the write lock is handed to us. Continue.
case db.writeLockC <- struct{}{}:
// Write lock acquired.
case err := <-db.compPerErrC:
// Compaction error.
return err
case <-db.closeC:
// Closed
return ErrClosed
}
} else {
select {
case db.writeLockC <- struct{}{}:
// Write lock acquired.
case err := <-db.compPerErrC:
// Compaction error.
return err
case <-db.closeC:
// Closed
return ErrClosed
}
}
batch := db.batchPool.Get().(*Batch)
batch.Reset()
batch.appendRec(kt, key, value)
return db.writeLocked(batch, batch, merge, sync)
}
// Put sets the value for the given key. It overwrites any previous value
// for that key; a DB is not a multi-map.
// for that key; a DB is not a multi-map. Write merge also applies for Put, see
// Write.
//
// It is safe to modify the contents of the arguments after Put returns.
// It is safe to modify the contents of the arguments after Put returns but not
// before.
func (db *DB) Put(key, value []byte, wo *opt.WriteOptions) error {
b := new(Batch)
b.Put(key, value)
return db.Write(b, wo)
return db.putRec(keyTypeVal, key, value, wo)
}
// Delete deletes the value for the given key. It returns ErrNotFound if
// the DB does not contain the key.
// Delete deletes the value for the given key. Delete will not return an error
// if the key doesn't exist. Write merge also applies for Delete, see Write.
//
// It is safe to modify the contents of the arguments after Delete returns.
// It is safe to modify the contents of the arguments after Delete returns but
// not before.
func (db *DB) Delete(key []byte, wo *opt.WriteOptions) error {
b := new(Batch)
b.Delete(key)
return db.Write(b, wo)
return db.putRec(keyTypeDel, key, nil, wo)
}
func isMemOverlaps(icmp *iComparer, mem *memdb.DB, min, max []byte) bool {
iter := mem.NewIterator(nil)
defer iter.Release()
return (max == nil || (iter.First() && icmp.uCompare(max, iKey(iter.Key()).ukey()) >= 0)) &&
(min == nil || (iter.Last() && icmp.uCompare(min, iKey(iter.Key()).ukey()) <= 0))
return (max == nil || (iter.First() && icmp.uCompare(max, internalKey(iter.Key()).ukey()) >= 0)) &&
(min == nil || (iter.Last() && icmp.uCompare(min, internalKey(iter.Key()).ukey()) <= 0))
}
// CompactRange compacts the underlying DB for the given key range.
@ -285,21 +398,24 @@ func (db *DB) CompactRange(r util.Range) error {
case db.writeLockC <- struct{}{}:
case err := <-db.compPerErrC:
return err
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
// Check for overlaps in memdb.
mem := db.getEffectiveMem()
defer mem.decref()
if isMemOverlaps(db.s.icmp, mem.mdb, r.Start, r.Limit) {
mdb := db.getEffectiveMem()
if mdb == nil {
return ErrClosed
}
defer mdb.decref()
if isMemOverlaps(db.s.icmp, mdb.DB, r.Start, r.Limit) {
// Memdb compaction.
if _, err := db.rotateMem(0); err != nil {
if _, err := db.rotateMem(0, false); err != nil {
<-db.writeLockC
return err
}
<-db.writeLockC
if err := db.compSendIdle(db.mcompCmdC); err != nil {
if err := db.compTriggerWait(db.mcompCmdC); err != nil {
return err
}
} else {
@ -307,5 +423,33 @@ func (db *DB) CompactRange(r util.Range) error {
}
// Table compaction.
return db.compSendRange(db.tcompCmdC, -1, r.Start, r.Limit)
return db.compTriggerRange(db.tcompCmdC, -1, r.Start, r.Limit)
}
// SetReadOnly makes DB read-only. It will stay read-only until reopened.
func (db *DB) SetReadOnly() error {
if err := db.ok(); err != nil {
return err
}
// Lock writer.
select {
case db.writeLockC <- struct{}{}:
db.compWriteLocking = true
case err := <-db.compPerErrC:
return err
case <-db.closeC:
return ErrClosed
}
// Set compaction read-only.
select {
case db.compErrSetC <- ErrReadOnly:
case perr := <-db.compPerErrC:
return perr
case <-db.closeC:
return ErrClosed
}
return nil
}
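
The merge-limit arithmetic used by writeLocked earlier in this file's diff can be isolated as a small function. This is only a sketch under assumptions: the name mergeLimit and its plain int parameters are hypothetical stand-ins for batch.internalLen and mdbFree, and none of the channel handling is reproduced.

```go
package main

import "fmt"

// mergeLimit is a hypothetical helper mirroring the limit computation shown
// in writeLocked: a small batch may absorb up to 128 KiB of other writes, a
// large batch may grow to at most 1 MiB in total, and in both cases the
// result is capped by the free space left in the memdb.
func mergeLimit(internalLen, mdbFree int) int {
	limit := 128 << 10
	if internalLen > 128<<10 {
		limit = (1 << 20) - internalLen
	}
	if mergeCap := mdbFree - internalLen; limit > mergeCap {
		limit = mergeCap
	}
	return limit
}

func main() {
	fmt.Println(mergeLimit(1<<10, 4<<20))   // small batch, roomy memdb
	fmt.Println(mergeLimit(512<<10, 4<<20)) // large batch
	fmt.Println(mergeLimit(1<<10, 64<<10))  // memdb nearly full
}
```

A non-positive result means nothing is merged, which matches the `for mergeLimit > 0` guard on the merge loop.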

@ -8,6 +8,8 @@
//
// Create or open a database:
//
// // The returned DB instance is safe for concurrent use, which means that all
// // DB's methods may be called concurrently from multiple goroutines.
// db, err := leveldb.OpenFile("path/to/db", nil)
// ...
// defer db.Close()

@ -10,8 +10,10 @@ import (
"github.com/syndtr/goleveldb/leveldb/errors"
)
// Common errors.
var (
ErrNotFound = errors.ErrNotFound
ErrReadOnly = errors.New("leveldb: read-only mode")
ErrSnapshotReleased = errors.New("leveldb: snapshot released")
ErrIterReleased = errors.New("leveldb: iterator released")
ErrClosed = errors.New("leveldb: closed")

@ -15,6 +15,7 @@ import (
"github.com/syndtr/goleveldb/leveldb/util"
)
// Common errors.
var (
ErrNotFound = New("leveldb: not found")
ErrReleased = util.ErrReleased
@ -29,21 +30,20 @@ func New(text string) error {
// ErrCorrupted is the type that wraps errors that indicate corruption in
// the database.
type ErrCorrupted struct {
File *storage.FileInfo
Err error
Fd storage.FileDesc
Err error
}
func (e *ErrCorrupted) Error() string {
if e.File != nil {
return fmt.Sprintf("%v [file=%v]", e.Err, e.File)
} else {
return e.Err.Error()
if !e.Fd.Zero() {
return fmt.Sprintf("%v [file=%v]", e.Err, e.Fd)
}
return e.Err.Error()
}
// NewErrCorrupted creates new ErrCorrupted error.
func NewErrCorrupted(f storage.File, err error) error {
return &ErrCorrupted{storage.NewFileInfo(f), err}
func NewErrCorrupted(fd storage.FileDesc, err error) error {
return &ErrCorrupted{fd, err}
}
// IsCorrupted returns a boolean indicating whether the error is indicating
@ -52,24 +52,26 @@ func IsCorrupted(err error) bool {
switch err.(type) {
case *ErrCorrupted:
return true
case *storage.ErrCorrupted:
return true
}
return false
}
// ErrMissingFiles is the type that indicating a corruption due to missing
// files.
// files. ErrMissingFiles always wrapped with ErrCorrupted.
type ErrMissingFiles struct {
Files []*storage.FileInfo
Fds []storage.FileDesc
}
func (e *ErrMissingFiles) Error() string { return "file missing" }
// SetFile sets 'file info' of the given error with the given file.
// SetFd sets 'file info' of the given error with the given file.
// Currently only ErrCorrupted is supported, otherwise will do nothing.
func SetFile(err error, f storage.File) error {
func SetFd(err error, fd storage.FileDesc) error {
switch x := err.(type) {
case *ErrCorrupted:
x.File = storage.NewFileInfo(f)
x.Fd = fd
return x
}
return err

@ -32,12 +32,12 @@ var _ = testutil.Defer(func() {
db := newTestingDB(o, nil, nil)
t := testutil.DBTesting{
DB: db,
Deleted: testutil.KeyValue_Generate(nil, 500, 1, 50, 5, 5).Clone(),
Deleted: testutil.KeyValue_Generate(nil, 500, 1, 1, 50, 5, 5).Clone(),
}
testutil.DoDBTesting(&t)
db.TestClose()
done <- true
}, 20.0)
}, 80.0)
})
Describe("read test", func() {
@ -54,5 +54,64 @@ var _ = testutil.Defer(func() {
db.(*testingDB).TestClose()
})
})
Describe("transaction test", func() {
It("should do transaction correctly", func(done Done) {
db := newTestingDB(o, nil, nil)
By("creating first transaction")
var err error
tr := &testingTransaction{}
tr.Transaction, err = db.OpenTransaction()
Expect(err).NotTo(HaveOccurred())
t0 := &testutil.DBTesting{
DB: tr,
Deleted: testutil.KeyValue_Generate(nil, 200, 1, 1, 50, 5, 5).Clone(),
}
testutil.DoDBTesting(t0)
testutil.TestGet(tr, t0.Present)
testutil.TestHas(tr, t0.Present)
By("committing first transaction")
err = tr.Commit()
Expect(err).NotTo(HaveOccurred())
testutil.TestIter(db, nil, t0.Present)
testutil.TestGet(db, t0.Present)
testutil.TestHas(db, t0.Present)
By("manipulating DB without transaction")
t0.DB = db
testutil.DoDBTesting(t0)
By("creating second transaction")
tr.Transaction, err = db.OpenTransaction()
Expect(err).NotTo(HaveOccurred())
t1 := &testutil.DBTesting{
DB: tr,
Deleted: t0.Deleted.Clone(),
Present: t0.Present.Clone(),
}
testutil.DoDBTesting(t1)
testutil.TestIter(db, nil, t0.Present)
By("discarding second transaction")
tr.Discard()
testutil.TestIter(db, nil, t0.Present)
By("creating third transaction")
tr.Transaction, err = db.OpenTransaction()
Expect(err).NotTo(HaveOccurred())
t0.DB = tr
testutil.DoDBTesting(t0)
By("committing third transaction")
err = tr.Commit()
Expect(err).NotTo(HaveOccurred())
testutil.TestIter(db, nil, t0.Present)
db.TestClose()
done <- true
}, 240.0)
})
})
})

@ -15,7 +15,7 @@ type iFilter struct {
}
func (f iFilter) Contains(filter, key []byte) bool {
return f.Filter.Contains(filter, iKey(key).ukey())
return f.Filter.Contains(filter, internalKey(key).ukey())
}
func (f iFilter) NewGenerator() filter.FilterGenerator {
@ -27,5 +27,5 @@ type iFilterGenerator struct {
}
func (g iFilterGenerator) Add(key []byte) {
g.FilterGenerator.Add(iKey(key).ukey())
g.FilterGenerator.Add(internalKey(key).ukey())
}

@ -17,7 +17,7 @@ var _ = testutil.Defer(func() {
Describe("Array iterator", func() {
It("Should iterates and seeks correctly", func() {
// Build key/value.
kv := testutil.KeyValue_Generate(nil, 70, 1, 5, 3, 3)
kv := testutil.KeyValue_Generate(nil, 70, 1, 1, 5, 3, 3)
// Test the iterator.
t := testutil.IteratorTesting{

@ -52,7 +52,7 @@ var _ = testutil.Defer(func() {
for _, x := range n {
sum += x
}
kv := testutil.KeyValue_Generate(nil, sum, 1, 10, 4, 4)
kv := testutil.KeyValue_Generate(nil, sum, 1, 1, 10, 4, 4)
for i, j := 0, 0; i < len(n); i++ {
for x := n[i]; x > 0; x-- {
key, value := kv.Index(j)
@ -69,7 +69,7 @@ var _ = testutil.Defer(func() {
}
testutil.DoIteratorTesting(&t)
done <- true
}, 1.5)
}, 15.0)
}
}

@ -21,13 +21,13 @@ var (
// IteratorSeeker is the interface that wraps the 'seeks method'.
type IteratorSeeker interface {
// First moves the iterator to the first key/value pair. If the iterator
// only contains one key/value pair then First and Last whould moves
// only contains one key/value pair then First and Last would moves
// to the same key/value pair.
// It returns whether such pair exist.
First() bool
// Last moves the iterator to the last key/value pair. If the iterator
// only contains one key/value pair then First and Last whould moves
// only contains one key/value pair then First and Last would moves
// to the same key/value pair.
// It returns whether such pair exist.
Last() bool
@ -48,7 +48,7 @@ type IteratorSeeker interface {
Prev() bool
}
// CommonIterator is the interface that wraps common interator methods.
// CommonIterator is the interface that wraps common iterator methods.
type CommonIterator interface {
IteratorSeeker
@ -71,14 +71,15 @@ type CommonIterator interface {
// Iterator iterates over a DB's key/value pairs in key order.
//
// When encouter an error any 'seeks method' will return false and will
// When encounter an error any 'seeks method' will return false and will
// yield no key/value pairs. The error can be queried by calling the Error
// method. Calling Release is still necessary.
//
// An iterator must be released after use, but it is not necessary to read
// an iterator until exhaustion.
// Also, an iterator is not necessarily goroutine-safe, but it is safe to use
// multiple iterators concurrently, with each in a dedicated goroutine.
// Also, an iterator is not necessarily safe for concurrent use, but it is
// safe to use multiple iterators concurrently, with each in a dedicated
// goroutine.
type Iterator interface {
CommonIterator
@ -87,7 +88,7 @@ type Iterator interface {
// its contents may change on the next call to any 'seeks method'.
Key() []byte
// Value returns the key of the current key/value pair, or nil if done.
// Value returns the value of the current key/value pair, or nil if done.
// The caller should not modify the contents of the returned slice, and
// its contents may change on the next call to any 'seeks method'.
Value() []byte
@ -98,7 +99,7 @@ type Iterator interface {
//
// ErrorCallbackSetter implemented by indexed and merged iterator.
type ErrorCallbackSetter interface {
// SetErrorCallback allows set an error callback of the coresponding
// SetErrorCallback allows set an error callback of the corresponding
// iterator. Use nil to clear the callback.
SetErrorCallback(f func(err error))
}

@ -24,7 +24,7 @@ var _ = testutil.Defer(func() {
// Build key/value.
filledKV := make([]testutil.KeyValue, filled)
kv := testutil.KeyValue_Generate(nil, 100, 1, 10, 4, 4)
kv := testutil.KeyValue_Generate(nil, 100, 1, 1, 10, 4, 4)
kv.Iterate(func(i int, key, value []byte) {
filledKV[rnd.Intn(filled)].Put(key, value)
})
@ -49,7 +49,7 @@ var _ = testutil.Defer(func() {
}
testutil.DoIteratorTesting(&t)
done <- true
}, 1.5)
}, 15.0)
}
}

@ -83,6 +83,7 @@ import (
"io"
"github.com/syndtr/goleveldb/leveldb/errors"
"github.com/syndtr/goleveldb/leveldb/storage"
"github.com/syndtr/goleveldb/leveldb/util"
)
@ -165,7 +166,7 @@ func (r *Reader) corrupt(n int, reason string, skip bool) error {
r.dropper.Drop(&ErrCorrupted{n, reason})
}
if r.strict && !skip {
r.err = errors.NewErrCorrupted(nil, &ErrCorrupted{n, reason})
r.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrCorrupted{n, reason})
return r.err
}
return errSkip
@ -179,34 +180,37 @@ func (r *Reader) nextChunk(first bool) error {
checksum := binary.LittleEndian.Uint32(r.buf[r.j+0 : r.j+4])
length := binary.LittleEndian.Uint16(r.buf[r.j+4 : r.j+6])
chunkType := r.buf[r.j+6]
unprocBlock := r.n - r.j
if checksum == 0 && length == 0 && chunkType == 0 {
// Drop entire block.
m := r.n - r.j
r.i = r.n
r.j = r.n
return r.corrupt(m, "zero header", false)
} else {
m := r.n - r.j
r.i = r.j + headerSize
r.j = r.j + headerSize + int(length)
if r.j > r.n {
// Drop entire block.
r.i = r.n
r.j = r.n
return r.corrupt(m, "chunk length overflows block", false)
} else if r.checksum && checksum != util.NewCRC(r.buf[r.i-1:r.j]).Value() {
// Drop entire block.
r.i = r.n
r.j = r.n
return r.corrupt(m, "checksum mismatch", false)
}
return r.corrupt(unprocBlock, "zero header", false)
}
if chunkType < fullChunkType || chunkType > lastChunkType {
// Drop entire block.
r.i = r.n
r.j = r.n
return r.corrupt(unprocBlock, fmt.Sprintf("invalid chunk type %#x", chunkType), false)
}
r.i = r.j + headerSize
r.j = r.j + headerSize + int(length)
if r.j > r.n {
// Drop entire block.
r.i = r.n
r.j = r.n
return r.corrupt(unprocBlock, "chunk length overflows block", false)
} else if r.checksum && checksum != util.NewCRC(r.buf[r.i-1:r.j]).Value() {
// Drop entire block.
r.i = r.n
r.j = r.n
return r.corrupt(unprocBlock, "checksum mismatch", false)
}
if first && chunkType != fullChunkType && chunkType != firstChunkType {
m := r.j - r.i
chunkLength := (r.j - r.i) + headerSize
r.i = r.j
// Report the error, but skip it.
return r.corrupt(m+headerSize, "orphan chunk", true)
return r.corrupt(chunkLength, "orphan chunk", true)
}
r.last = chunkType == fullChunkType || chunkType == lastChunkType
return nil

@ -11,132 +11,133 @@ import (
"fmt"
"github.com/syndtr/goleveldb/leveldb/errors"
"github.com/syndtr/goleveldb/leveldb/storage"
)
type ErrIkeyCorrupted struct {
// ErrInternalKeyCorrupted records internal key corruption.
type ErrInternalKeyCorrupted struct {
Ikey []byte
Reason string
}
func (e *ErrIkeyCorrupted) Error() string {
return fmt.Sprintf("leveldb: iKey %q corrupted: %s", e.Ikey, e.Reason)
func (e *ErrInternalKeyCorrupted) Error() string {
return fmt.Sprintf("leveldb: internal key %q corrupted: %s", e.Ikey, e.Reason)
}
func newErrIkeyCorrupted(ikey []byte, reason string) error {
return errors.NewErrCorrupted(nil, &ErrIkeyCorrupted{append([]byte{}, ikey...), reason})
func newErrInternalKeyCorrupted(ikey []byte, reason string) error {
return errors.NewErrCorrupted(storage.FileDesc{}, &ErrInternalKeyCorrupted{append([]byte{}, ikey...), reason})
}
type kType int
type keyType uint
func (kt kType) String() string {
func (kt keyType) String() string {
switch kt {
case ktDel:
case keyTypeDel:
return "d"
case ktVal:
case keyTypeVal:
return "v"
}
return "x"
return fmt.Sprintf("<invalid:%#x>", uint(kt))
}
// Value types encoded as the last component of internal keys.
// Don't modify; these values are saved to disk.
const (
ktDel kType = iota
ktVal
keyTypeDel = keyType(0)
keyTypeVal = keyType(1)
)
// ktSeek defines the kType that should be passed when constructing an
// keyTypeSeek defines the keyType that should be passed when constructing an
// internal key for seeking to a particular sequence number (since we
// sort sequence numbers in decreasing order and the value type is
// embedded as the low 8 bits in the sequence number in internal keys,
// we need to use the highest-numbered ValueType, not the lowest).
const ktSeek = ktVal
const keyTypeSeek = keyTypeVal
const (
// Maximum value possible for sequence number; the low 8 bits are
// used by the value type, so the two can be packed together in a single
// 64-bit integer.
kMaxSeq uint64 = (uint64(1) << 56) - 1
keyMaxSeq = (uint64(1) << 56) - 1
// Maximum value possible for packed sequence number and type.
kMaxNum uint64 = (kMaxSeq << 8) | uint64(ktSeek)
keyMaxNum = (keyMaxSeq << 8) | uint64(keyTypeSeek)
)
// Maximum number encoded in bytes.
var kMaxNumBytes = make([]byte, 8)
var keyMaxNumBytes = make([]byte, 8)
func init() {
binary.LittleEndian.PutUint64(kMaxNumBytes, kMaxNum)
binary.LittleEndian.PutUint64(keyMaxNumBytes, keyMaxNum)
}
type iKey []byte
type internalKey []byte
func newIkey(ukey []byte, seq uint64, kt kType) iKey {
if seq > kMaxSeq {
func makeInternalKey(dst, ukey []byte, seq uint64, kt keyType) internalKey {
if seq > keyMaxSeq {
panic("leveldb: invalid sequence number")
} else if kt > ktVal {
} else if kt > keyTypeVal {
panic("leveldb: invalid type")
}
ik := make(iKey, len(ukey)+8)
copy(ik, ukey)
binary.LittleEndian.PutUint64(ik[len(ukey):], (seq<<8)|uint64(kt))
return ik
dst = ensureBuffer(dst, len(ukey)+8)
copy(dst, ukey)
binary.LittleEndian.PutUint64(dst[len(ukey):], (seq<<8)|uint64(kt))
return internalKey(dst)
}
func parseIkey(ik []byte) (ukey []byte, seq uint64, kt kType, err error) {
func parseInternalKey(ik []byte) (ukey []byte, seq uint64, kt keyType, err error) {
if len(ik) < 8 {
return nil, 0, 0, newErrIkeyCorrupted(ik, "invalid length")
return nil, 0, 0, newErrInternalKeyCorrupted(ik, "invalid length")
}
num := binary.LittleEndian.Uint64(ik[len(ik)-8:])
seq, kt = uint64(num>>8), kType(num&0xff)
if kt > ktVal {
return nil, 0, 0, newErrIkeyCorrupted(ik, "invalid type")
seq, kt = uint64(num>>8), keyType(num&0xff)
if kt > keyTypeVal {
return nil, 0, 0, newErrInternalKeyCorrupted(ik, "invalid type")
}
ukey = ik[:len(ik)-8]
return
}
func validIkey(ik []byte) bool {
_, _, _, err := parseIkey(ik)
func validInternalKey(ik []byte) bool {
_, _, _, err := parseInternalKey(ik)
return err == nil
}
func (ik iKey) assert() {
func (ik internalKey) assert() {
if ik == nil {
panic("leveldb: nil iKey")
panic("leveldb: nil internalKey")
}
if len(ik) < 8 {
panic(fmt.Sprintf("leveldb: iKey %q, len=%d: invalid length", []byte(ik), len(ik)))
panic(fmt.Sprintf("leveldb: internal key %q, len=%d: invalid length", []byte(ik), len(ik)))
}
}
func (ik iKey) ukey() []byte {
func (ik internalKey) ukey() []byte {
ik.assert()
return ik[:len(ik)-8]
}
func (ik iKey) num() uint64 {
func (ik internalKey) num() uint64 {
ik.assert()
return binary.LittleEndian.Uint64(ik[len(ik)-8:])
}
func (ik iKey) parseNum() (seq uint64, kt kType) {
func (ik internalKey) parseNum() (seq uint64, kt keyType) {
num := ik.num()
seq, kt = uint64(num>>8), kType(num&0xff)
if kt > ktVal {
panic(fmt.Sprintf("leveldb: iKey %q, len=%d: invalid type %#x", []byte(ik), len(ik), kt))
seq, kt = uint64(num>>8), keyType(num&0xff)
if kt > keyTypeVal {
panic(fmt.Sprintf("leveldb: internal key %q, len=%d: invalid type %#x", []byte(ik), len(ik), kt))
}
return
}
func (ik iKey) String() string {
func (ik internalKey) String() string {
if ik == nil {
return "<nil>"
}
if ukey, seq, kt, err := parseIkey(ik); err == nil {
if ukey, seq, kt, err := parseInternalKey(ik); err == nil {
return fmt.Sprintf("%s,%s%d", shorten(string(ukey)), kt, seq)
} else {
return "<invalid>"
}
return fmt.Sprintf("<invalid:%#x>", []byte(ik))
}
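
The internal key layout handled by makeInternalKey and parseInternalKey above (user key followed by an 8-byte little-endian trailer holding `(seq<<8)|type`) can be tried out with a standalone sketch. The names pack and unpack are hypothetical, the key type is simplified to a uint8, and the sequence-number range check is omitted.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	keyTypeDel = 0
	keyTypeVal = 1
)

// pack appends the 8-byte trailer to a copy of ukey. Unlike makeInternalKey
// it neither reuses a destination buffer nor validates seq and kt.
func pack(ukey []byte, seq uint64, kt uint8) []byte {
	ik := make([]byte, len(ukey)+8)
	copy(ik, ukey)
	binary.LittleEndian.PutUint64(ik[len(ukey):], (seq<<8)|uint64(kt))
	return ik
}

// unpack splits an internal key back into its parts. ok is false when the
// key is shorter than the trailer or carries an unknown type byte.
func unpack(ik []byte) (ukey []byte, seq uint64, kt uint8, ok bool) {
	if len(ik) < 8 {
		return nil, 0, 0, false
	}
	num := binary.LittleEndian.Uint64(ik[len(ik)-8:])
	seq, kt = num>>8, uint8(num&0xff)
	if kt > keyTypeVal {
		return nil, 0, 0, false
	}
	return ik[:len(ik)-8], seq, kt, true
}

func main() {
	ik := pack([]byte("foo"), 100, keyTypeVal)
	ukey, seq, kt, ok := unpack(ik)
	fmt.Printf("%q seq=%d type=%d ok=%v\n", ukey, seq, kt, ok)
}
```

Because the type occupies the low byte of the packed number, it is the first byte of the trailer on disk, directly after the user key.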

@ -15,8 +15,8 @@ import (
var defaultIComparer = &iComparer{comparer.DefaultComparer}
func ikey(key string, seq uint64, kt kType) iKey {
return newIkey([]byte(key), uint64(seq), kt)
func ikey(key string, seq uint64, kt keyType) internalKey {
return makeInternalKey(nil, []byte(key), uint64(seq), kt)
}
func shortSep(a, b []byte) []byte {
@ -37,7 +37,7 @@ func shortSuccessor(b []byte) []byte {
return dst
}
func testSingleKey(t *testing.T, key string, seq uint64, kt kType) {
func testSingleKey(t *testing.T, key string, seq uint64, kt keyType) {
ik := ikey(key, seq, kt)
if !bytes.Equal(ik.ukey(), []byte(key)) {
@ -52,7 +52,7 @@ func testSingleKey(t *testing.T, key string, seq uint64, kt kType) {
t.Errorf("type does not equal, got %v, want %v", rt, kt)
}
if rukey, rseq, rt, kerr := parseIkey(ik); kerr == nil {
if rukey, rseq, rt, kerr := parseInternalKey(ik); kerr == nil {
if !bytes.Equal(rukey, []byte(key)) {
t.Errorf("user key does not equal, got %v, want %v", string(ik.ukey()), key)
}
@ -67,7 +67,7 @@ func testSingleKey(t *testing.T, key string, seq uint64, kt kType) {
}
}
func TestIkey_EncodeDecode(t *testing.T) {
func TestInternalKey_EncodeDecode(t *testing.T) {
keys := []string{"", "k", "hello", "longggggggggggggggggggggg"}
seqs := []uint64{
1, 2, 3,
@ -77,8 +77,8 @@ func TestIkey_EncodeDecode(t *testing.T) {
}
for _, key := range keys {
for _, seq := range seqs {
testSingleKey(t, key, seq, ktVal)
testSingleKey(t, "hello", 1, ktDel)
testSingleKey(t, key, seq, keyTypeVal)
testSingleKey(t, "hello", 1, keyTypeDel)
}
}
}
@ -89,45 +89,45 @@ func assertBytes(t *testing.T, want, got []byte) {
}
}
func TestIkeyShortSeparator(t *testing.T) {
func TestInternalKeyShortSeparator(t *testing.T) {
// When user keys are same
assertBytes(t, ikey("foo", 100, ktVal),
shortSep(ikey("foo", 100, ktVal),
ikey("foo", 99, ktVal)))
assertBytes(t, ikey("foo", 100, ktVal),
shortSep(ikey("foo", 100, ktVal),
ikey("foo", 101, ktVal)))
assertBytes(t, ikey("foo", 100, ktVal),
shortSep(ikey("foo", 100, ktVal),
ikey("foo", 100, ktVal)))
assertBytes(t, ikey("foo", 100, ktVal),
shortSep(ikey("foo", 100, ktVal),
ikey("foo", 100, ktDel)))
assertBytes(t, ikey("foo", 100, keyTypeVal),
shortSep(ikey("foo", 100, keyTypeVal),
ikey("foo", 99, keyTypeVal)))
assertBytes(t, ikey("foo", 100, keyTypeVal),
shortSep(ikey("foo", 100, keyTypeVal),
ikey("foo", 101, keyTypeVal)))
assertBytes(t, ikey("foo", 100, keyTypeVal),
shortSep(ikey("foo", 100, keyTypeVal),
ikey("foo", 100, keyTypeVal)))
assertBytes(t, ikey("foo", 100, keyTypeVal),
shortSep(ikey("foo", 100, keyTypeVal),
ikey("foo", 100, keyTypeDel)))
// When user keys are misordered
assertBytes(t, ikey("foo", 100, ktVal),
shortSep(ikey("foo", 100, ktVal),
ikey("bar", 99, ktVal)))
assertBytes(t, ikey("foo", 100, keyTypeVal),
shortSep(ikey("foo", 100, keyTypeVal),
ikey("bar", 99, keyTypeVal)))
// When user keys are different, but correctly ordered
assertBytes(t, ikey("g", uint64(kMaxSeq), ktSeek),
shortSep(ikey("foo", 100, ktVal),
ikey("hello", 200, ktVal)))
assertBytes(t, ikey("g", uint64(keyMaxSeq), keyTypeSeek),
shortSep(ikey("foo", 100, keyTypeVal),
ikey("hello", 200, keyTypeVal)))
// When start user key is prefix of limit user key
assertBytes(t, ikey("foo", 100, ktVal),
shortSep(ikey("foo", 100, ktVal),
ikey("foobar", 200, ktVal)))
assertBytes(t, ikey("foo", 100, keyTypeVal),
shortSep(ikey("foo", 100, keyTypeVal),
ikey("foobar", 200, keyTypeVal)))
// When limit user key is prefix of start user key
assertBytes(t, ikey("foobar", 100, ktVal),
shortSep(ikey("foobar", 100, ktVal),
ikey("foo", 200, ktVal)))
assertBytes(t, ikey("foobar", 100, keyTypeVal),
shortSep(ikey("foobar", 100, keyTypeVal),
ikey("foo", 200, keyTypeVal)))
}
func TestIkeyShortestSuccessor(t *testing.T) {
assertBytes(t, ikey("g", uint64(kMaxSeq), ktSeek),
shortSuccessor(ikey("foo", 100, ktVal)))
assertBytes(t, ikey("\xff\xff", 100, ktVal),
shortSuccessor(ikey("\xff\xff", 100, ktVal)))
func TestInternalKeyShortestSuccessor(t *testing.T) {
assertBytes(t, ikey("g", uint64(keyMaxSeq), keyTypeSeek),
shortSuccessor(ikey("foo", 100, keyTypeVal)))
assertBytes(t, ikey("\xff\xff", 100, keyTypeVal),
shortSuccessor(ikey("\xff\xff", 100, keyTypeVal)))
}

@ -17,6 +17,7 @@ import (
"github.com/syndtr/goleveldb/leveldb/util"
)
// Common errors.
var (
ErrNotFound = errors.ErrNotFound
ErrIterReleased = errors.New("leveldb/memdb: iterator released")
@ -206,6 +207,7 @@ func (p *DB) randHeight() (h int) {
return
}
// Must hold RW-lock if prev == true, as it use shared prevNode slice.
func (p *DB) findGE(key []byte, prev bool) (int, bool) {
node := 0
h := p.maxHeight - 1
@ -302,7 +304,7 @@ func (p *DB) Put(key []byte, value []byte) error {
node := len(p.nodeData)
p.nodeData = append(p.nodeData, kvOffset, len(key), len(value), h)
for i, n := range p.prevNode[:h] {
m := n + 4 + i
m := n + nNext + i
p.nodeData = append(p.nodeData, p.nodeData[m])
p.nodeData[m] = node
}
@ -327,7 +329,7 @@ func (p *DB) Delete(key []byte) error {
h := p.nodeData[node+nHeight]
for i, n := range p.prevNode[:h] {
m := n + 4 + i
m := n + nNext + i
p.nodeData[m] = p.nodeData[p.nodeData[m]+nNext+i]
}
@ -384,7 +386,7 @@ func (p *DB) Find(key []byte) (rkey, value []byte, err error) {
}
// NewIterator returns an iterator of the DB.
// The returned iterator is not goroutine-safe, but it is safe to use
// The returned iterator is not safe for concurrent use, but it is safe to use
// multiple iterators concurrently, with each in a dedicated goroutine.
// It is also safe to use an iterator concurrently with modifying its
// underlying DB. However, the resultant key/value pairs are not guaranteed
@ -410,7 +412,7 @@ func (p *DB) Capacity() int {
}
// Size returns sum of keys and values length. Note that deleted
// key/value will not be accouted for, but it will still consume
// key/value will not be accounted for, but it will still consume
// the buffer, since the buffer is append only.
func (p *DB) Size() int {
p.mu.RLock()
@ -434,27 +436,32 @@ func (p *DB) Len() int {
// Reset resets the DB to initial empty state. Allows reuse the buffer.
func (p *DB) Reset() {
p.mu.Lock()
p.rnd = rand.New(rand.NewSource(0xdeadbeef))
p.maxHeight = 1
p.n = 0
p.kvSize = 0
p.kvData = p.kvData[:0]
p.nodeData = p.nodeData[:4+tMaxHeight]
p.nodeData = p.nodeData[:nNext+tMaxHeight]
p.nodeData[nKV] = 0
p.nodeData[nKey] = 0
p.nodeData[nVal] = 0
p.nodeData[nHeight] = tMaxHeight
for n := 0; n < tMaxHeight; n++ {
p.nodeData[4+n] = 0
p.nodeData[nNext+n] = 0
p.prevNode[n] = 0
}
p.mu.Unlock()
}
// New creates a new initalized in-memory key/value DB. The capacity
// New creates a new initialized in-memory key/value DB. The capacity
// is the initial key/value buffer capacity. The capacity is advisory,
// not enforced.
//
// The returned DB instance is goroutine-safe.
// This DB is append-only; deleting an entry removes the entry node but does
// not reclaim the KV buffer.
//
// The returned DB instance is safe for concurrent use.
func New(cmp comparer.BasicComparer, capacity int) *DB {
p := &DB{
cmp: cmp,

@ -73,7 +73,7 @@ var _ = testutil.Defer(func() {
db := New(comparer.DefaultComparer, 0)
t := testutil.DBTesting{
DB: db,
Deleted: testutil.KeyValue_Generate(nil, 1000, 1, 30, 5, 5).Clone(),
Deleted: testutil.KeyValue_Generate(nil, 1000, 1, 1, 30, 5, 5).Clone(),
PostFn: func(t *testutil.DBTesting) {
Expect(db.Len()).Should(Equal(t.Present.Len()))
Expect(db.Size()).Should(Equal(t.Present.Size()))

@ -8,10 +8,11 @@
package opt
import (
"math"
"github.com/syndtr/goleveldb/leveldb/cache"
"github.com/syndtr/goleveldb/leveldb/comparer"
"github.com/syndtr/goleveldb/leveldb/filter"
"math"
)
const (
@ -35,8 +36,6 @@ var (
DefaultCompactionTotalSizeMultiplier = 10.0
DefaultCompressionType = SnappyCompression
DefaultIteratorSamplingRate = 1 * MiB
DefaultMaxMemCompationLevel = 2
DefaultNumLevel = 7
DefaultOpenFilesCacher = LRUCacher
DefaultOpenFilesCacheCapacity = 500
DefaultWriteBuffer = 4 * MiB
@ -250,6 +249,11 @@ type Options struct {
// The default value (DefaultCompression) uses snappy compression.
Compression Compression
// DisableBufferPool allows disable use of util.BufferPool functionality.
//
// The default value is false.
DisableBufferPool bool
// DisableBlockCache allows disable use of cache.Cache functionality on
// 'sorted table' block.
//
@ -261,6 +265,13 @@ type Options struct {
// The default value is false.
DisableCompactionBackoff bool
// DisableLargeBatchTransaction allows disabling switch-to-transaction mode
// on large batch writes. When that mode is enabled (the default), batch
// writes larger than WriteBuffer will use a transaction.
//
// The default is false.
DisableLargeBatchTransaction bool
// ErrorIfExist defines whether an error should returned if the DB already
// exist.
//
@ -296,18 +307,15 @@ type Options struct {
// The default is 1MiB.
IteratorSamplingRate int
// MaxMemCompationLevel defines maximum level a newly compacted 'memdb'
// will be pushed into if doesn't creates overlap. This should less than
// NumLevel. Use -1 for level-0.
// NoSync allows completely disable fsync.
//
// The default is 2.
MaxMemCompationLevel int
// The default is false.
NoSync bool
// NumLevel defines number of database level. The level shouldn't changed
// between opens, or the database will panic.
// NoWriteMerge allows disabling write merge.
//
// The default is 7.
NumLevel int
// The default is false.
NoWriteMerge bool
// OpenFilesCacher provides cache algorithm for open files caching.
// Specify NoCacher to disable caching algorithm.
@ -321,6 +329,11 @@ type Options struct {
// The default value is 500.
OpenFilesCacheCapacity int
// If true then opens DB in read-only mode.
//
// The default value is false.
ReadOnly bool
// Strict defines the DB strict level.
Strict Strict
@ -425,7 +438,7 @@ func (o *Options) GetCompactionTableSize(level int) int {
if o.CompactionTableSize > 0 {
base = o.CompactionTableSize
}
if len(o.CompactionTableSizeMultiplierPerLevel) > level && o.CompactionTableSizeMultiplierPerLevel[level] > 0 {
if level < len(o.CompactionTableSizeMultiplierPerLevel) && o.CompactionTableSizeMultiplierPerLevel[level] > 0 {
mult = o.CompactionTableSizeMultiplierPerLevel[level]
} else if o.CompactionTableSizeMultiplier > 0 {
mult = math.Pow(o.CompactionTableSizeMultiplier, float64(level))
@ -446,7 +459,7 @@ func (o *Options) GetCompactionTotalSize(level int) int64 {
if o.CompactionTotalSize > 0 {
base = o.CompactionTotalSize
}
if len(o.CompactionTotalSizeMultiplierPerLevel) > level && o.CompactionTotalSizeMultiplierPerLevel[level] > 0 {
if level < len(o.CompactionTotalSizeMultiplierPerLevel) && o.CompactionTotalSizeMultiplierPerLevel[level] > 0 {
mult = o.CompactionTotalSizeMultiplierPerLevel[level]
} else if o.CompactionTotalSizeMultiplier > 0 {
mult = math.Pow(o.CompactionTotalSizeMultiplier, float64(level))
@ -472,6 +485,20 @@ func (o *Options) GetCompression() Compression {
return o.Compression
}
func (o *Options) GetDisableBufferPool() bool {
if o == nil {
return false
}
return o.DisableBufferPool
}
func (o *Options) GetDisableBlockCache() bool {
if o == nil {
return false
}
return o.DisableBlockCache
}
func (o *Options) GetDisableCompactionBackoff() bool {
if o == nil {
return false
@ -479,6 +506,13 @@ func (o *Options) GetDisableCompactionBackoff() bool {
return o.DisableCompactionBackoff
}
func (o *Options) GetDisableLargeBatchTransaction() bool {
if o == nil {
return false
}
return o.DisableLargeBatchTransaction
}
func (o *Options) GetErrorIfExist() bool {
if o == nil {
return false
@ -507,26 +541,18 @@ func (o *Options) GetIteratorSamplingRate() int {
return o.IteratorSamplingRate
}
func (o *Options) GetMaxMemCompationLevel() int {
level := DefaultMaxMemCompationLevel
if o != nil {
if o.MaxMemCompationLevel > 0 {
level = o.MaxMemCompationLevel
} else if o.MaxMemCompationLevel < 0 {
level = 0
}
func (o *Options) GetNoSync() bool {
if o == nil {
return false
}
if level >= o.GetNumLevel() {
return o.GetNumLevel() - 1
}
return level
return o.NoSync
}
func (o *Options) GetNumLevel() int {
if o == nil || o.NumLevel <= 0 {
return DefaultNumLevel
func (o *Options) GetNoWriteMerge() bool {
if o == nil {
return false
}
return o.NumLevel
return o.NoWriteMerge
}
func (o *Options) GetOpenFilesCacher() Cacher {
@ -548,6 +574,13 @@ func (o *Options) GetOpenFilesCacheCapacity() int {
return o.OpenFilesCacheCapacity
}
func (o *Options) GetReadOnly() bool {
if o == nil {
return false
}
return o.ReadOnly
}
func (o *Options) GetStrict(strict Strict) bool {
if o == nil || o.Strict == 0 {
return DefaultStrict&strict != 0
@ -608,6 +641,11 @@ func (ro *ReadOptions) GetStrict(strict Strict) bool {
// WriteOptions holds the optional parameters for 'write operation'. The
// 'write operation' includes Write, Put and Delete.
type WriteOptions struct {
// NoWriteMerge allows disabling write merge.
//
// The default is false.
NoWriteMerge bool
// Sync is whether to sync underlying writes from the OS buffer cache
// through to actual disk, if applicable. Setting Sync can result in
// slower writes.
@ -623,6 +661,13 @@ type WriteOptions struct {
Sync bool
}
func (wo *WriteOptions) GetNoWriteMerge() bool {
if wo == nil {
return false
}
return wo.NoWriteMerge
}
func (wo *WriteOptions) GetSync() bool {
if wo == nil {
return false
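As an aside on the reordered bounds check in GetCompactionTableSize and GetCompactionTotalSize above: a minimal sketch, assuming a hypothetical levelSize helper that is not part of goleveldb. A per-level multiplier is used only when the slice has a positive entry for that level; otherwise the uniform multiplier is raised to the power of the level. The default of 1.0 when neither is set is a simplification made for this sketch.

```go
package main

import (
	"fmt"
	"math"
)

// levelSize is a hypothetical stand-in for the Get*Size option getters.
// It scales base by a per-level multiplier when one is configured for the
// level, and otherwise by uniform^level.
func levelSize(base int64, perLevel []float64, uniform float64, level int) int64 {
	mult := 1.0 // simplification: the real getters fall back to package defaults
	if level < len(perLevel) && perLevel[level] > 0 {
		// The index is checked against the slice length before it is used.
		mult = perLevel[level]
	} else if uniform > 0 {
		mult = math.Pow(uniform, float64(level))
	}
	return int64(float64(base) * mult)
}

func main() {
	perLevel := []float64{1, 4} // only levels 0 and 1 have explicit multipliers
	for level := 0; level < 4; level++ {
		fmt.Println(level, levelSize(10, perLevel, 10, level))
	}
}
```

Levels beyond the end of the per-level slice never index into it, which is the property both spellings of the condition rely on.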


@ -43,6 +43,8 @@ func (s *session) setOptions(o *opt.Options) {
s.o.cache()
}
const optCachedLevel = 7
type cachedOptions struct {
*opt.Options
@ -54,15 +56,13 @@ type cachedOptions struct {
}
func (co *cachedOptions) cache() {
numLevel := co.Options.GetNumLevel()
co.compactionExpandLimit = make([]int, optCachedLevel)
co.compactionGPOverlaps = make([]int, optCachedLevel)
co.compactionSourceLimit = make([]int, optCachedLevel)
co.compactionTableSize = make([]int, optCachedLevel)
co.compactionTotalSize = make([]int64, optCachedLevel)
co.compactionExpandLimit = make([]int, numLevel)
co.compactionGPOverlaps = make([]int, numLevel)
co.compactionSourceLimit = make([]int, numLevel)
co.compactionTableSize = make([]int, numLevel)
co.compactionTotalSize = make([]int64, numLevel)
for level := 0; level < numLevel; level++ {
for level := 0; level < optCachedLevel; level++ {
co.compactionExpandLimit[level] = co.Options.GetCompactionExpandLimit(level)
co.compactionGPOverlaps[level] = co.Options.GetCompactionGPOverlaps(level)
co.compactionSourceLimit[level] = co.Options.GetCompactionSourceLimit(level)
@ -72,21 +72,36 @@ func (co *cachedOptions) cache() {
}
func (co *cachedOptions) GetCompactionExpandLimit(level int) int {
return co.compactionExpandLimit[level]
if level < optCachedLevel {
return co.compactionExpandLimit[level]
}
return co.Options.GetCompactionExpandLimit(level)
}
func (co *cachedOptions) GetCompactionGPOverlaps(level int) int {
return co.compactionGPOverlaps[level]
if level < optCachedLevel {
return co.compactionGPOverlaps[level]
}
return co.Options.GetCompactionGPOverlaps(level)
}
func (co *cachedOptions) GetCompactionSourceLimit(level int) int {
return co.compactionSourceLimit[level]
if level < optCachedLevel {
return co.compactionSourceLimit[level]
}
return co.Options.GetCompactionSourceLimit(level)
}
func (co *cachedOptions) GetCompactionTableSize(level int) int {
return co.compactionTableSize[level]
if level < optCachedLevel {
return co.compactionTableSize[level]
}
return co.Options.GetCompactionTableSize(level)
}
func (co *cachedOptions) GetCompactionTotalSize(level int) int64 {
return co.compactionTotalSize[level]
if level < optCachedLevel {
return co.compactionTotalSize[level]
}
return co.Options.GetCompactionTotalSize(level)
}
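The cachedOptions change above swaps slices sized by NumLevel for a fixed cache of optCachedLevel entries, with a fallback to the uncached getter for deeper levels. A minimal sketch of that pattern, assuming a hypothetical levelCache type and a made-up compute function (neither exists in goleveldb):

```go
package main

import "fmt"

// cachedLevels mirrors optCachedLevel from the diff.
const cachedLevels = 7

// levelCache is a hypothetical, simplified stand-in for cachedOptions.
type levelCache struct {
	values [cachedLevels]int
	calls  int // how many times compute has run
}

// compute stands in for an uncached option getter; the formula is arbitrary.
func (c *levelCache) compute(level int) int {
	c.calls++
	return (level + 1) * 100
}

// newLevelCache precomputes the first cachedLevels values once.
func newLevelCache() *levelCache {
	c := &levelCache{}
	for level := 0; level < cachedLevels; level++ {
		c.values[level] = c.compute(level)
	}
	return c
}

// get serves cached levels from the array and computes deeper levels on demand.
func (c *levelCache) get(level int) int {
	if level < cachedLevels {
		return c.values[level]
	}
	return c.compute(level)
}

func main() {
	c := newLevelCache()
	shallow := c.get(3) // served from the cache
	deep := c.get(9)    // beyond the cache, computed on demand
	fmt.Println(shallow, deep, c.calls)
}
```

With this shape the number of levels no longer has to be known when the cache is built, which matches the removal of NumLevel elsewhere in the diff.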


@ -11,16 +11,15 @@ import (
"io"
"os"
"sync"
"sync/atomic"
"github.com/syndtr/goleveldb/leveldb/errors"
"github.com/syndtr/goleveldb/leveldb/iterator"
"github.com/syndtr/goleveldb/leveldb/journal"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
"github.com/syndtr/goleveldb/leveldb/util"
)
// ErrManifestCorrupted records manifest corruption. This error will be
// wrapped with errors.ErrCorrupted.
type ErrManifestCorrupted struct {
Field string
Reason string
@ -30,31 +29,32 @@ func (e *ErrManifestCorrupted) Error() string {
return fmt.Sprintf("leveldb: manifest corrupted (field '%s'): %s", e.Field, e.Reason)
}
func newErrManifestCorrupted(f storage.File, field, reason string) error {
return errors.NewErrCorrupted(f, &ErrManifestCorrupted{field, reason})
func newErrManifestCorrupted(fd storage.FileDesc, field, reason string) error {
return errors.NewErrCorrupted(fd, &ErrManifestCorrupted{field, reason})
}
// session represent a persistent database session.
type session struct {
// Need 64-bit alignment.
stNextFileNum uint64 // current unused file number
stJournalNum uint64 // current journal file number; need external synchronization
stPrevJournalNum uint64 // prev journal file number; no longer used; for compatibility with older version of leveldb
stNextFileNum int64 // current unused file number
stJournalNum int64 // current journal file number; need external synchronization
stPrevJournalNum int64 // prev journal file number; no longer used; for compatibility with older version of leveldb
stTempFileNum int64
stSeqNum uint64 // last mem compacted seq; need external synchronization
stTempFileNum uint64
stor storage.Storage
storLock util.Releaser
storLock storage.Locker
o *cachedOptions
icmp *iComparer
tops *tOps
fileRef map[int64]int
manifest *journal.Writer
manifestWriter storage.Writer
manifestFile storage.File
manifestFd storage.FileDesc
stCompPtrs []iKey // compaction pointers; need external synchronization
stVersion *version // current version
stCompPtrs []internalKey // compaction pointers; need external synchronization
stVersion *version // current version
vmu sync.Mutex
}
@ -68,9 +68,9 @@ func newSession(stor storage.Storage, o *opt.Options) (s *session, err error) {
return
}
s = &session{
stor: stor,
storLock: storLock,
stCompPtrs: make([]iKey, o.GetNumLevel()),
stor: stor,
storLock: storLock,
fileRef: make(map[int64]int),
}
s.setOptions(o)
s.tops = newTableOps(s)
@ -90,13 +90,12 @@ func (s *session) close() {
}
s.manifest = nil
s.manifestWriter = nil
s.manifestFile = nil
s.stVersion = nil
s.setVersion(&version{s: s, closing: true})
}
// Release session lock.
func (s *session) release() {
s.storLock.Release()
s.storLock.Unlock()
}
// Create a new database session; need external synchronization.
@ -111,27 +110,31 @@ func (s *session) recover() (err error) {
if os.IsNotExist(err) {
// Don't return os.ErrNotExist if the underlying storage contains
// other files that belong to LevelDB. So the DB won't get trashed.
if files, _ := s.stor.GetFiles(storage.TypeAll); len(files) > 0 {
err = &errors.ErrCorrupted{File: &storage.FileInfo{Type: storage.TypeManifest}, Err: &errors.ErrMissingFiles{}}
if fds, _ := s.stor.List(storage.TypeAll); len(fds) > 0 {
err = &errors.ErrCorrupted{Fd: storage.FileDesc{Type: storage.TypeManifest}, Err: &errors.ErrMissingFiles{}}
}
}
}()
m, err := s.stor.GetManifest()
fd, err := s.stor.GetMeta()
if err != nil {
return
}
reader, err := m.Open()
reader, err := s.stor.Open(fd)
if err != nil {
return
}
defer reader.Close()
strict := s.o.GetStrict(opt.StrictManifest)
jr := journal.NewReader(reader, dropper{s, m}, strict, true)
staging := s.stVersion.newStaging()
rec := &sessionRecord{numLevel: s.o.GetNumLevel()}
var (
// Options.
strict = s.o.GetStrict(opt.StrictManifest)
jr = journal.NewReader(reader, dropper{s, fd}, strict, true)
rec = &sessionRecord{}
staging = s.stVersion.newStaging()
)
for {
var r io.Reader
r, err = jr.Next()
@ -140,24 +143,23 @@ func (s *session) recover() (err error) {
err = nil
break
}
return errors.SetFile(err, m)
return errors.SetFd(err, fd)
}
err = rec.decode(r)
if err == nil {
// save compact pointers
for _, r := range rec.compPtrs {
s.stCompPtrs[r.level] = iKey(r.ikey)
s.setCompPtr(r.level, internalKey(r.ikey))
}
// commit record to version staging
staging.commit(rec)
} else {
err = errors.SetFile(err, m)
err = errors.SetFd(err, fd)
if strict || !errors.IsCorrupted(err) {
return
} else {
s.logf("manifest error: %v (skipped)", errors.SetFile(err, m))
}
s.logf("manifest error: %v (skipped)", errors.SetFd(err, fd))
}
rec.resetCompPtrs()
rec.resetAddedTables()
@ -166,18 +168,18 @@ func (s *session) recover() (err error) {
switch {
case !rec.has(recComparer):
return newErrManifestCorrupted(m, "comparer", "missing")
return newErrManifestCorrupted(fd, "comparer", "missing")
case rec.comparer != s.icmp.uName():
return newErrManifestCorrupted(m, "comparer", fmt.Sprintf("mismatch: want '%s', got '%s'", s.icmp.uName(), rec.comparer))
return newErrManifestCorrupted(fd, "comparer", fmt.Sprintf("mismatch: want '%s', got '%s'", s.icmp.uName(), rec.comparer))
case !rec.has(recNextFileNum):
return newErrManifestCorrupted(m, "next-file-num", "missing")
return newErrManifestCorrupted(fd, "next-file-num", "missing")
case !rec.has(recJournalNum):
return newErrManifestCorrupted(m, "journal-file-num", "missing")
return newErrManifestCorrupted(fd, "journal-file-num", "missing")
case !rec.has(recSeqNum):
return newErrManifestCorrupted(m, "seq-num", "missing")
return newErrManifestCorrupted(fd, "seq-num", "missing")
}
s.manifestFile = m
s.manifestFd = fd
s.setVersion(staging.finish())
s.setNextFileNum(rec.nextFileNum)
s.recordCommited(rec)
@ -206,250 +208,3 @@ func (s *session) commit(r *sessionRecord) (err error) {
return
}
// Pick a compaction based on current state; need external synchronization.
func (s *session) pickCompaction() *compaction {
v := s.version()
var level int
var t0 tFiles
if v.cScore >= 1 {
level = v.cLevel
cptr := s.stCompPtrs[level]
tables := v.tables[level]
for _, t := range tables {
if cptr == nil || s.icmp.Compare(t.imax, cptr) > 0 {
t0 = append(t0, t)
break
}
}
if len(t0) == 0 {
t0 = append(t0, tables[0])
}
} else {
if p := atomic.LoadPointer(&v.cSeek); p != nil {
ts := (*tSet)(p)
level = ts.level
t0 = append(t0, ts.table)
} else {
v.release()
return nil
}
}
return newCompaction(s, v, level, t0)
}
// Create compaction from given level and range; need external synchronization.
func (s *session) getCompactionRange(level int, umin, umax []byte) *compaction {
v := s.version()
t0 := v.tables[level].getOverlaps(nil, s.icmp, umin, umax, level == 0)
if len(t0) == 0 {
v.release()
return nil
}
// Avoid compacting too much in one shot in case the range is large.
// But we cannot do this for level-0 since level-0 files can overlap
// and we must not pick one file and drop another older file if the
// two files overlap.
if level > 0 {
limit := uint64(v.s.o.GetCompactionSourceLimit(level))
total := uint64(0)
for i, t := range t0 {
total += t.size
if total >= limit {
s.logf("table@compaction limiting F·%d -> F·%d", len(t0), i+1)
t0 = t0[:i+1]
break
}
}
}
return newCompaction(s, v, level, t0)
}
func newCompaction(s *session, v *version, level int, t0 tFiles) *compaction {
c := &compaction{
s: s,
v: v,
level: level,
tables: [2]tFiles{t0, nil},
maxGPOverlaps: uint64(s.o.GetCompactionGPOverlaps(level)),
tPtrs: make([]int, s.o.GetNumLevel()),
}
c.expand()
c.save()
return c
}
// compaction represent a compaction state.
type compaction struct {
s *session
v *version
level int
tables [2]tFiles
maxGPOverlaps uint64
gp tFiles
gpi int
seenKey bool
gpOverlappedBytes uint64
imin, imax iKey
tPtrs []int
released bool
snapGPI int
snapSeenKey bool
snapGPOverlappedBytes uint64
snapTPtrs []int
}
func (c *compaction) save() {
c.snapGPI = c.gpi
c.snapSeenKey = c.seenKey
c.snapGPOverlappedBytes = c.gpOverlappedBytes
c.snapTPtrs = append(c.snapTPtrs[:0], c.tPtrs...)
}
func (c *compaction) restore() {
c.gpi = c.snapGPI
c.seenKey = c.snapSeenKey
c.gpOverlappedBytes = c.snapGPOverlappedBytes
c.tPtrs = append(c.tPtrs[:0], c.snapTPtrs...)
}
func (c *compaction) release() {
if !c.released {
c.released = true
c.v.release()
}
}
// Expand compacted tables; need external synchronization.
func (c *compaction) expand() {
limit := uint64(c.s.o.GetCompactionExpandLimit(c.level))
vt0, vt1 := c.v.tables[c.level], c.v.tables[c.level+1]
t0, t1 := c.tables[0], c.tables[1]
imin, imax := t0.getRange(c.s.icmp)
// We expand t0 here just incase ukey hop across tables.
t0 = vt0.getOverlaps(t0, c.s.icmp, imin.ukey(), imax.ukey(), c.level == 0)
if len(t0) != len(c.tables[0]) {
imin, imax = t0.getRange(c.s.icmp)
}
t1 = vt1.getOverlaps(t1, c.s.icmp, imin.ukey(), imax.ukey(), false)
// Get entire range covered by compaction.
amin, amax := append(t0, t1...).getRange(c.s.icmp)
// See if we can grow the number of inputs in "level" without
// changing the number of "level+1" files we pick up.
if len(t1) > 0 {
exp0 := vt0.getOverlaps(nil, c.s.icmp, amin.ukey(), amax.ukey(), c.level == 0)
if len(exp0) > len(t0) && t1.size()+exp0.size() < limit {
xmin, xmax := exp0.getRange(c.s.icmp)
exp1 := vt1.getOverlaps(nil, c.s.icmp, xmin.ukey(), xmax.ukey(), false)
if len(exp1) == len(t1) {
c.s.logf("table@compaction expanding L%d+L%d (F·%d S·%s)+(F·%d S·%s) -> (F·%d S·%s)+(F·%d S·%s)",
c.level, c.level+1, len(t0), shortenb(int(t0.size())), len(t1), shortenb(int(t1.size())),
len(exp0), shortenb(int(exp0.size())), len(exp1), shortenb(int(exp1.size())))
imin, imax = xmin, xmax
t0, t1 = exp0, exp1
amin, amax = append(t0, t1...).getRange(c.s.icmp)
}
}
}
// Compute the set of grandparent files that overlap this compaction
// (parent == level+1; grandparent == level+2)
if c.level+2 < c.s.o.GetNumLevel() {
c.gp = c.v.tables[c.level+2].getOverlaps(c.gp, c.s.icmp, amin.ukey(), amax.ukey(), false)
}
c.tables[0], c.tables[1] = t0, t1
c.imin, c.imax = imin, imax
}
// Check whether compaction is trivial.
func (c *compaction) trivial() bool {
return len(c.tables[0]) == 1 && len(c.tables[1]) == 0 && c.gp.size() <= c.maxGPOverlaps
}
func (c *compaction) baseLevelForKey(ukey []byte) bool {
for level, tables := range c.v.tables[c.level+2:] {
for c.tPtrs[level] < len(tables) {
t := tables[c.tPtrs[level]]
if c.s.icmp.uCompare(ukey, t.imax.ukey()) <= 0 {
// We've advanced far enough.
if c.s.icmp.uCompare(ukey, t.imin.ukey()) >= 0 {
// Key falls in this file's range, so definitely not base level.
return false
}
break
}
c.tPtrs[level]++
}
}
return true
}
func (c *compaction) shouldStopBefore(ikey iKey) bool {
for ; c.gpi < len(c.gp); c.gpi++ {
gp := c.gp[c.gpi]
if c.s.icmp.Compare(ikey, gp.imax) <= 0 {
break
}
if c.seenKey {
c.gpOverlappedBytes += gp.size
}
}
c.seenKey = true
if c.gpOverlappedBytes > c.maxGPOverlaps {
// Too much overlap for current output; start new output.
c.gpOverlappedBytes = 0
return true
}
return false
}
// Creates an iterator.
func (c *compaction) newIterator() iterator.Iterator {
// Creates iterator slice.
icap := len(c.tables)
if c.level == 0 {
// Special case for level-0
icap = len(c.tables[0]) + 1
}
its := make([]iterator.Iterator, 0, icap)
// Options.
ro := &opt.ReadOptions{
DontFillCache: true,
Strict: opt.StrictOverride,
}
strict := c.s.o.GetStrict(opt.StrictCompaction)
if strict {
ro.Strict |= opt.StrictReader
}
for i, tables := range c.tables {
if len(tables) == 0 {
continue
}
// Level-0 is not sorted and may overlaps each other.
if c.level+i == 0 {
for _, t := range tables {
its = append(its, c.s.tops.newIterator(t, nil, ro))
}
} else {
it := iterator.NewIndexedIterator(tables.newIndexIterator(c.s.tops, c.s.icmp, nil, ro), strict)
its = append(its, it)
}
}
return iterator.NewMergedIterator(its, c.s.icmp, strict)
}


@ -0,0 +1,302 @@
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
package leveldb
import (
"sync/atomic"
"github.com/syndtr/goleveldb/leveldb/iterator"
"github.com/syndtr/goleveldb/leveldb/memdb"
"github.com/syndtr/goleveldb/leveldb/opt"
)
func (s *session) pickMemdbLevel(umin, umax []byte, maxLevel int) int {
v := s.version()
defer v.release()
return v.pickMemdbLevel(umin, umax, maxLevel)
}
func (s *session) flushMemdb(rec *sessionRecord, mdb *memdb.DB, maxLevel int) (int, error) {
// Create sorted table.
iter := mdb.NewIterator(nil)
defer iter.Release()
t, n, err := s.tops.createFrom(iter)
if err != nil {
return 0, err
}
// Pick level other than zero can cause compaction issue with large
// bulk insert and delete on strictly incrementing key-space. The
// problem is that the small deletion markers trapped at lower level,
// while key/value entries keep growing at higher level. Since the
// key-space is strictly incrementing it will not overlaps with
// higher level, thus maximum possible level is always picked, while
// overlapping deletion marker pushed into lower level.
// See: https://github.com/syndtr/goleveldb/issues/127.
flushLevel := s.pickMemdbLevel(t.imin.ukey(), t.imax.ukey(), maxLevel)
rec.addTableFile(flushLevel, t)
s.logf("memdb@flush created L%d@%d N·%d S·%s %q:%q", flushLevel, t.fd.Num, n, shortenb(int(t.size)), t.imin, t.imax)
return flushLevel, nil
}
// Pick a compaction based on current state; need external synchronization.
func (s *session) pickCompaction() *compaction {
v := s.version()
var sourceLevel int
var t0 tFiles
if v.cScore >= 1 {
sourceLevel = v.cLevel
cptr := s.getCompPtr(sourceLevel)
tables := v.levels[sourceLevel]
for _, t := range tables {
if cptr == nil || s.icmp.Compare(t.imax, cptr) > 0 {
t0 = append(t0, t)
break
}
}
if len(t0) == 0 {
t0 = append(t0, tables[0])
}
} else {
if p := atomic.LoadPointer(&v.cSeek); p != nil {
ts := (*tSet)(p)
sourceLevel = ts.level
t0 = append(t0, ts.table)
} else {
v.release()
return nil
}
}
return newCompaction(s, v, sourceLevel, t0)
}
// Create compaction from given level and range; need external synchronization.
func (s *session) getCompactionRange(sourceLevel int, umin, umax []byte, noLimit bool) *compaction {
v := s.version()
if sourceLevel >= len(v.levels) {
v.release()
return nil
}
t0 := v.levels[sourceLevel].getOverlaps(nil, s.icmp, umin, umax, sourceLevel == 0)
if len(t0) == 0 {
v.release()
return nil
}
// Avoid compacting too much in one shot in case the range is large.
// But we cannot do this for level-0 since level-0 files can overlap
// and we must not pick one file and drop another older file if the
// two files overlap.
if !noLimit && sourceLevel > 0 {
limit := int64(v.s.o.GetCompactionSourceLimit(sourceLevel))
total := int64(0)
for i, t := range t0 {
total += t.size
if total >= limit {
s.logf("table@compaction limiting F·%d -> F·%d", len(t0), i+1)
t0 = t0[:i+1]
break
}
}
}
return newCompaction(s, v, sourceLevel, t0)
}
func newCompaction(s *session, v *version, sourceLevel int, t0 tFiles) *compaction {
c := &compaction{
s: s,
v: v,
sourceLevel: sourceLevel,
levels: [2]tFiles{t0, nil},
maxGPOverlaps: int64(s.o.GetCompactionGPOverlaps(sourceLevel)),
tPtrs: make([]int, len(v.levels)),
}
c.expand()
c.save()
return c
}
// compaction represent a compaction state.
type compaction struct {
s *session
v *version
sourceLevel int
levels [2]tFiles
maxGPOverlaps int64
gp tFiles
gpi int
seenKey bool
gpOverlappedBytes int64
imin, imax internalKey
tPtrs []int
released bool
snapGPI int
snapSeenKey bool
snapGPOverlappedBytes int64
snapTPtrs []int
}
func (c *compaction) save() {
c.snapGPI = c.gpi
c.snapSeenKey = c.seenKey
c.snapGPOverlappedBytes = c.gpOverlappedBytes
c.snapTPtrs = append(c.snapTPtrs[:0], c.tPtrs...)
}
func (c *compaction) restore() {
c.gpi = c.snapGPI
c.seenKey = c.snapSeenKey
c.gpOverlappedBytes = c.snapGPOverlappedBytes
c.tPtrs = append(c.tPtrs[:0], c.snapTPtrs...)
}
func (c *compaction) release() {
if !c.released {
c.released = true
c.v.release()
}
}
// Expand compacted tables; need external synchronization.
func (c *compaction) expand() {
limit := int64(c.s.o.GetCompactionExpandLimit(c.sourceLevel))
vt0 := c.v.levels[c.sourceLevel]
vt1 := tFiles{}
if level := c.sourceLevel + 1; level < len(c.v.levels) {
vt1 = c.v.levels[level]
}
t0, t1 := c.levels[0], c.levels[1]
imin, imax := t0.getRange(c.s.icmp)
// We expand t0 here just incase ukey hop across tables.
t0 = vt0.getOverlaps(t0, c.s.icmp, imin.ukey(), imax.ukey(), c.sourceLevel == 0)
if len(t0) != len(c.levels[0]) {
imin, imax = t0.getRange(c.s.icmp)
}
t1 = vt1.getOverlaps(t1, c.s.icmp, imin.ukey(), imax.ukey(), false)
// Get entire range covered by compaction.
amin, amax := append(t0, t1...).getRange(c.s.icmp)
// See if we can grow the number of inputs in "sourceLevel" without
// changing the number of "sourceLevel+1" files we pick up.
if len(t1) > 0 {
exp0 := vt0.getOverlaps(nil, c.s.icmp, amin.ukey(), amax.ukey(), c.sourceLevel == 0)
if len(exp0) > len(t0) && t1.size()+exp0.size() < limit {
xmin, xmax := exp0.getRange(c.s.icmp)
exp1 := vt1.getOverlaps(nil, c.s.icmp, xmin.ukey(), xmax.ukey(), false)
if len(exp1) == len(t1) {
c.s.logf("table@compaction expanding L%d+L%d (F·%d S·%s)+(F·%d S·%s) -> (F·%d S·%s)+(F·%d S·%s)",
c.sourceLevel, c.sourceLevel+1, len(t0), shortenb(int(t0.size())), len(t1), shortenb(int(t1.size())),
len(exp0), shortenb(int(exp0.size())), len(exp1), shortenb(int(exp1.size())))
imin, imax = xmin, xmax
t0, t1 = exp0, exp1
amin, amax = append(t0, t1...).getRange(c.s.icmp)
}
}
}
// Compute the set of grandparent files that overlap this compaction
// (parent == sourceLevel+1; grandparent == sourceLevel+2)
if level := c.sourceLevel + 2; level < len(c.v.levels) {
c.gp = c.v.levels[level].getOverlaps(c.gp, c.s.icmp, amin.ukey(), amax.ukey(), false)
}
c.levels[0], c.levels[1] = t0, t1
c.imin, c.imax = imin, imax
}
// Check whether compaction is trivial.
func (c *compaction) trivial() bool {
return len(c.levels[0]) == 1 && len(c.levels[1]) == 0 && c.gp.size() <= c.maxGPOverlaps
}
func (c *compaction) baseLevelForKey(ukey []byte) bool {
for level := c.sourceLevel + 2; level < len(c.v.levels); level++ {
tables := c.v.levels[level]
for c.tPtrs[level] < len(tables) {
t := tables[c.tPtrs[level]]
if c.s.icmp.uCompare(ukey, t.imax.ukey()) <= 0 {
// We've advanced far enough.
if c.s.icmp.uCompare(ukey, t.imin.ukey()) >= 0 {
// Key falls in this file's range, so definitely not base level.
return false
}
break
}
c.tPtrs[level]++
}
}
return true
}
func (c *compaction) shouldStopBefore(ikey internalKey) bool {
for ; c.gpi < len(c.gp); c.gpi++ {
gp := c.gp[c.gpi]
if c.s.icmp.Compare(ikey, gp.imax) <= 0 {
break
}
if c.seenKey {
c.gpOverlappedBytes += gp.size
}
}
c.seenKey = true
if c.gpOverlappedBytes > c.maxGPOverlaps {
// Too much overlap for current output; start new output.
c.gpOverlappedBytes = 0
return true
}
return false
}
// Creates an iterator.
func (c *compaction) newIterator() iterator.Iterator {
// Creates iterator slice.
icap := len(c.levels)
if c.sourceLevel == 0 {
// Special case for level-0.
icap = len(c.levels[0]) + 1
}
its := make([]iterator.Iterator, 0, icap)
// Options.
ro := &opt.ReadOptions{
DontFillCache: true,
Strict: opt.StrictOverride,
}
strict := c.s.o.GetStrict(opt.StrictCompaction)
if strict {
ro.Strict |= opt.StrictReader
}
for i, tables := range c.levels {
if len(tables) == 0 {
continue
}
// Level-0 is not sorted and may overlaps each other.
if c.sourceLevel+i == 0 {
for _, t := range tables {
its = append(its, c.s.tops.newIterator(t, nil, ro))
}
} else {
it := iterator.NewIndexedIterator(tables.newIndexIterator(c.s.tops, c.s.icmp, nil, ro), strict)
its = append(its, it)
}
}
return iterator.NewMergedIterator(its, c.s.icmp, strict)
}
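The source-limit step in getCompactionRange above can be read on its own. A minimal sketch, assuming a hypothetical table type and limitSources helper (not goleveldb names): the input list is cut off once the cumulative size reaches the limit, except for level-0 (where overlapping files must stay together) or when noLimit is set.

```go
package main

import "fmt"

// table is a hypothetical, reduced stand-in for tFile.
type table struct {
	num  int64
	size int64
}

// limitSources truncates tables once their cumulative size reaches limit.
// Level-0 and noLimit requests are returned unchanged.
func limitSources(tables []table, sourceLevel int, limit int64, noLimit bool) []table {
	if noLimit || sourceLevel == 0 {
		return tables
	}
	var total int64
	for i, t := range tables {
		total += t.size
		if total >= limit {
			// Keep the table that crossed the limit, drop the rest.
			return tables[:i+1]
		}
	}
	return tables
}

func main() {
	tables := []table{{1, 40}, {2, 40}, {3, 40}, {4, 40}}
	fmt.Println(len(limitSources(tables, 1, 100, false)))
	fmt.Println(len(limitSources(tables, 0, 100, false)))
	fmt.Println(len(limitSources(tables, 1, 100, true)))
}
```

The noLimit flag is what lets a forced range compaction take every overlapping table in one pass instead of stopping at the per-level source limit.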


@ -13,6 +13,7 @@ import (
"strings"
"github.com/syndtr/goleveldb/leveldb/errors"
"github.com/syndtr/goleveldb/leveldb/storage"
)
type byteReader interface {
@ -35,30 +36,28 @@ const (
type cpRecord struct {
level int
ikey iKey
ikey internalKey
}
type atRecord struct {
level int
num uint64
size uint64
imin iKey
imax iKey
num int64
size int64
imin internalKey
imax internalKey
}
type dtRecord struct {
level int
num uint64
num int64
}
type sessionRecord struct {
numLevel int
hasRec int
comparer string
journalNum uint64
prevJournalNum uint64
nextFileNum uint64
journalNum int64
prevJournalNum int64
nextFileNum int64
seqNum uint64
compPtrs []cpRecord
addedTables []atRecord
@ -77,17 +76,17 @@ func (p *sessionRecord) setComparer(name string) {
p.comparer = name
}
func (p *sessionRecord) setJournalNum(num uint64) {
func (p *sessionRecord) setJournalNum(num int64) {
p.hasRec |= 1 << recJournalNum
p.journalNum = num
}
func (p *sessionRecord) setPrevJournalNum(num uint64) {
func (p *sessionRecord) setPrevJournalNum(num int64) {
p.hasRec |= 1 << recPrevJournalNum
p.prevJournalNum = num
}
func (p *sessionRecord) setNextFileNum(num uint64) {
func (p *sessionRecord) setNextFileNum(num int64) {
p.hasRec |= 1 << recNextFileNum
p.nextFileNum = num
}
@ -97,7 +96,7 @@ func (p *sessionRecord) setSeqNum(num uint64) {
p.seqNum = num
}
func (p *sessionRecord) addCompPtr(level int, ikey iKey) {
func (p *sessionRecord) addCompPtr(level int, ikey internalKey) {
p.hasRec |= 1 << recCompPtr
p.compPtrs = append(p.compPtrs, cpRecord{level, ikey})
}
@ -107,13 +106,13 @@ func (p *sessionRecord) resetCompPtrs() {
p.compPtrs = p.compPtrs[:0]
}
func (p *sessionRecord) addTable(level int, num, size uint64, imin, imax iKey) {
func (p *sessionRecord) addTable(level int, num, size int64, imin, imax internalKey) {
p.hasRec |= 1 << recAddTable
p.addedTables = append(p.addedTables, atRecord{level, num, size, imin, imax})
}
func (p *sessionRecord) addTableFile(level int, t *tFile) {
p.addTable(level, t.file.Num(), t.size, t.imin, t.imax)
p.addTable(level, t.fd.Num, t.size, t.imin, t.imax)
}
func (p *sessionRecord) resetAddedTables() {
@ -121,7 +120,7 @@ func (p *sessionRecord) resetAddedTables() {
p.addedTables = p.addedTables[:0]
}
func (p *sessionRecord) delTable(level int, num uint64) {
func (p *sessionRecord) delTable(level int, num int64) {
p.hasRec |= 1 << recDelTable
p.deletedTables = append(p.deletedTables, dtRecord{level, num})
}
@ -139,6 +138,13 @@ func (p *sessionRecord) putUvarint(w io.Writer, x uint64) {
_, p.err = w.Write(p.scratch[:n])
}
func (p *sessionRecord) putVarint(w io.Writer, x int64) {
if x < 0 {
panic("invalid negative value")
}
p.putUvarint(w, uint64(x))
}
func (p *sessionRecord) putBytes(w io.Writer, x []byte) {
if p.err != nil {
return
@ -158,11 +164,11 @@ func (p *sessionRecord) encode(w io.Writer) error {
}
if p.has(recJournalNum) {
p.putUvarint(w, recJournalNum)
p.putUvarint(w, p.journalNum)
p.putVarint(w, p.journalNum)
}
if p.has(recNextFileNum) {
p.putUvarint(w, recNextFileNum)
p.putUvarint(w, p.nextFileNum)
p.putVarint(w, p.nextFileNum)
}
if p.has(recSeqNum) {
p.putUvarint(w, recSeqNum)
@ -176,13 +182,13 @@ func (p *sessionRecord) encode(w io.Writer) error {
for _, r := range p.deletedTables {
p.putUvarint(w, recDelTable)
p.putUvarint(w, uint64(r.level))
p.putUvarint(w, r.num)
p.putVarint(w, r.num)
}
for _, r := range p.addedTables {
p.putUvarint(w, recAddTable)
p.putUvarint(w, uint64(r.level))
p.putUvarint(w, r.num)
p.putUvarint(w, r.size)
p.putVarint(w, r.num)
p.putVarint(w, r.size)
p.putBytes(w, r.imin)
p.putBytes(w, r.imax)
}
@ -196,9 +202,9 @@ func (p *sessionRecord) readUvarintMayEOF(field string, r io.ByteReader, mayEOF
x, err := binary.ReadUvarint(r)
if err != nil {
if err == io.ErrUnexpectedEOF || (mayEOF == false && err == io.EOF) {
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, "short read"})
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, "short read"})
} else if strings.HasPrefix(err.Error(), "binary:") {
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, err.Error()})
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, err.Error()})
} else {
p.err = err
}
@ -211,6 +217,14 @@ func (p *sessionRecord) readUvarint(field string, r io.ByteReader) uint64 {
return p.readUvarintMayEOF(field, r, false)
}
func (p *sessionRecord) readVarint(field string, r io.ByteReader) int64 {
x := int64(p.readUvarintMayEOF(field, r, false))
if x < 0 {
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, "invalid negative value"})
}
return x
}
func (p *sessionRecord) readBytes(field string, r byteReader) []byte {
if p.err != nil {
return nil
@ -223,7 +237,7 @@ func (p *sessionRecord) readBytes(field string, r byteReader) []byte {
_, p.err = io.ReadFull(r, x)
if p.err != nil {
if p.err == io.ErrUnexpectedEOF {
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, "short read"})
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, "short read"})
}
return nil
}
@ -238,10 +252,6 @@ func (p *sessionRecord) readLevel(field string, r io.ByteReader) int {
if p.err != nil {
return 0
}
if x >= uint64(p.numLevel) {
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, "invalid level number"})
return 0
}
return int(x)
}
@ -266,17 +276,17 @@ func (p *sessionRecord) decode(r io.Reader) error {
p.setComparer(string(x))
}
case recJournalNum:
x := p.readUvarint("journal-num", br)
x := p.readVarint("journal-num", br)
if p.err == nil {
p.setJournalNum(x)
}
case recPrevJournalNum:
x := p.readUvarint("prev-journal-num", br)
x := p.readVarint("prev-journal-num", br)
if p.err == nil {
p.setPrevJournalNum(x)
}
case recNextFileNum:
x := p.readUvarint("next-file-num", br)
x := p.readVarint("next-file-num", br)
if p.err == nil {
p.setNextFileNum(x)
}
@ -289,12 +299,12 @@ func (p *sessionRecord) decode(r io.Reader) error {
level := p.readLevel("comp-ptr.level", br)
ikey := p.readBytes("comp-ptr.ikey", br)
if p.err == nil {
p.addCompPtr(level, iKey(ikey))
p.addCompPtr(level, internalKey(ikey))
}
case recAddTable:
level := p.readLevel("add-table.level", br)
num := p.readUvarint("add-table.num", br)
size := p.readUvarint("add-table.size", br)
num := p.readVarint("add-table.num", br)
size := p.readVarint("add-table.size", br)
imin := p.readBytes("add-table.imin", br)
imax := p.readBytes("add-table.imax", br)
if p.err == nil {
@ -302,7 +312,7 @@ func (p *sessionRecord) decode(r io.Reader) error {
}
case recDelTable:
level := p.readLevel("del-table.level", br)
num := p.readUvarint("del-table.num", br)
num := p.readVarint("del-table.num", br)
if p.err == nil {
p.delTable(level, num)
}
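
The `readVarint` helper added in this file reads an unsigned varint, converts it to `int64`, and treats a negative result as manifest corruption. A minimal standalone sketch of that idea follows; it assumes `encoding/binary` for the decoding, and `errNegative` and `encode` are hypothetical names standing in for the package's own error type and encoder.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// errNegative is a hypothetical stand-in for the ErrManifestCorrupted
// value that the diff reports for "invalid negative value".
var errNegative = errors.New("invalid negative value")

// readVarint decodes an unsigned varint and rejects any value that does
// not fit in a non-negative int64 (i.e. the top bit is set).
func readVarint(r *bytes.Reader) (int64, error) {
	u, err := binary.ReadUvarint(r)
	if err != nil {
		return 0, err
	}
	x := int64(u)
	if x < 0 {
		return 0, errNegative
	}
	return x, nil
}

// encode is a small helper that returns a reader over the uvarint
// encoding of u.
func encode(u uint64) *bytes.Reader {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, u)
	return bytes.NewReader(buf[:n])
}

func main() {
	v, err := readVarint(encode(1 << 50))
	fmt.Println(v, err)

	// A value with bit 63 set wraps to a negative int64 and is rejected.
	_, err = readVarint(encode(1 << 63))
	fmt.Println(err)
}
```

The point of the check is that file numbers and sizes are now signed, so an on-disk value above the `int64` range can only come from a damaged or foreign manifest.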


@ -9,8 +9,6 @@ package leveldb
import (
"bytes"
"testing"
"github.com/syndtr/goleveldb/leveldb/opt"
)
func decodeEncode(v *sessionRecord) (res bool, err error) {
@ -19,7 +17,7 @@ func decodeEncode(v *sessionRecord) (res bool, err error) {
if err != nil {
return
}
v2 := &sessionRecord{numLevel: opt.DefaultNumLevel}
v2 := &sessionRecord{}
err = v.decode(b)
if err != nil {
return
@ -33,9 +31,9 @@ func decodeEncode(v *sessionRecord) (res bool, err error) {
}
func TestSessionRecord_EncodeDecode(t *testing.T) {
big := uint64(1) << 50
v := &sessionRecord{numLevel: opt.DefaultNumLevel}
i := uint64(0)
big := int64(1) << 50
v := &sessionRecord{}
i := int64(0)
test := func() {
res, err := decodeEncode(v)
if err != nil {
@ -49,16 +47,16 @@ func TestSessionRecord_EncodeDecode(t *testing.T) {
for ; i < 4; i++ {
test()
v.addTable(3, big+300+i, big+400+i,
newIkey([]byte("foo"), big+500+1, ktVal),
newIkey([]byte("zoo"), big+600+1, ktDel))
makeInternalKey(nil, []byte("foo"), uint64(big+500+1), keyTypeVal),
makeInternalKey(nil, []byte("zoo"), uint64(big+600+1), keyTypeDel))
v.delTable(4, big+700+i)
v.addCompPtr(int(i), newIkey([]byte("x"), big+900+1, ktVal))
v.addCompPtr(int(i), makeInternalKey(nil, []byte("x"), uint64(big+900+1), keyTypeVal))
}
v.setComparer("foo")
v.setJournalNum(big + 100)
v.setPrevJournalNum(big + 99)
v.setNextFileNum(big + 200)
v.setSeqNum(big + 1000)
v.setSeqNum(uint64(big + 1000))
test()
}


@ -17,15 +17,15 @@ import (
// Logging.
type dropper struct {
s *session
file storage.File
s *session
fd storage.FileDesc
}
func (d dropper) Drop(err error) {
if e, ok := err.(*journal.ErrCorrupted); ok {
d.s.logf("journal@drop %s-%d S·%s %q", d.file.Type(), d.file.Num(), shortenb(e.Size), e.Reason)
d.s.logf("journal@drop %s-%d S·%s %q", d.fd.Type, d.fd.Num, shortenb(e.Size), e.Reason)
} else {
d.s.logf("journal@drop %s-%d %q", d.file.Type(), d.file.Num(), err)
d.s.logf("journal@drop %s-%d %q", d.fd.Type, d.fd.Num, err)
}
}
@ -34,25 +34,21 @@ func (s *session) logf(format string, v ...interface{}) { s.stor.Log(fmt.Sprintf
// File utils.
func (s *session) getJournalFile(num uint64) storage.File {
return s.stor.GetFile(num, storage.TypeJournal)
func (s *session) newTemp() storage.FileDesc {
num := atomic.AddInt64(&s.stTempFileNum, 1) - 1
return storage.FileDesc{storage.TypeTemp, num}
}
func (s *session) getTableFile(num uint64) storage.File {
return s.stor.GetFile(num, storage.TypeTable)
}
func (s *session) getFiles(t storage.FileType) ([]storage.File, error) {
return s.stor.GetFiles(t)
}
func (s *session) newTemp() storage.File {
num := atomic.AddUint64(&s.stTempFileNum, 1) - 1
return s.stor.GetFile(num, storage.TypeTemp)
}
func (s *session) tableFileFromRecord(r atRecord) *tFile {
return newTableFile(s.getTableFile(r.num), r.size, r.imin, r.imax)
func (s *session) addFileRef(fd storage.FileDesc, ref int) int {
ref += s.fileRef[fd.Num]
if ref > 0 {
s.fileRef[fd.Num] = ref
} else if ref == 0 {
delete(s.fileRef, fd.Num)
} else {
panic(fmt.Sprintf("negative ref: %v", fd))
}
return ref
}
// Session state.
@ -62,65 +58,90 @@ func (s *session) tableFileFromRecord(r atRecord) *tFile {
func (s *session) version() *version {
s.vmu.Lock()
defer s.vmu.Unlock()
s.stVersion.ref++
s.stVersion.incref()
return s.stVersion
}
func (s *session) tLen(level int) int {
s.vmu.Lock()
defer s.vmu.Unlock()
return s.stVersion.tLen(level)
}
// Set current version to v.
func (s *session) setVersion(v *version) {
s.vmu.Lock()
v.ref = 1 // Holds by session.
if old := s.stVersion; old != nil {
v.ref++ // Holds by old version.
old.next = v
old.releaseNB()
defer s.vmu.Unlock()
// Held by the session. It is important to call this first, before releasing the
// current version; otherwise files that are still in use might get released.
v.incref()
if s.stVersion != nil {
// Release current version.
s.stVersion.releaseNB()
}
s.stVersion = v
s.vmu.Unlock()
}
// Get current unused file number.
func (s *session) nextFileNum() uint64 {
return atomic.LoadUint64(&s.stNextFileNum)
func (s *session) nextFileNum() int64 {
return atomic.LoadInt64(&s.stNextFileNum)
}
// Set current unused file number to num.
func (s *session) setNextFileNum(num uint64) {
atomic.StoreUint64(&s.stNextFileNum, num)
func (s *session) setNextFileNum(num int64) {
atomic.StoreInt64(&s.stNextFileNum, num)
}
// Mark file number as used.
func (s *session) markFileNum(num uint64) {
func (s *session) markFileNum(num int64) {
nextFileNum := num + 1
for {
old, x := s.stNextFileNum, nextFileNum
if old > x {
x = old
}
if atomic.CompareAndSwapUint64(&s.stNextFileNum, old, x) {
if atomic.CompareAndSwapInt64(&s.stNextFileNum, old, x) {
break
}
}
}
// Allocate a file number.
func (s *session) allocFileNum() uint64 {
return atomic.AddUint64(&s.stNextFileNum, 1) - 1
func (s *session) allocFileNum() int64 {
return atomic.AddInt64(&s.stNextFileNum, 1) - 1
}
// Reuse given file number.
func (s *session) reuseFileNum(num uint64) {
func (s *session) reuseFileNum(num int64) {
for {
old, x := s.stNextFileNum, num
if old != x+1 {
x = old
}
if atomic.CompareAndSwapUint64(&s.stNextFileNum, old, x) {
if atomic.CompareAndSwapInt64(&s.stNextFileNum, old, x) {
break
}
}
}
// Set compaction ptr at given level; need external synchronization.
func (s *session) setCompPtr(level int, ik internalKey) {
if level >= len(s.stCompPtrs) {
newCompPtrs := make([]internalKey, level+1)
copy(newCompPtrs, s.stCompPtrs)
s.stCompPtrs = newCompPtrs
}
s.stCompPtrs[level] = append(internalKey{}, ik...)
}
// Get compaction ptr at given level; need external synchronization.
func (s *session) getCompPtr(level int) internalKey {
if level >= len(s.stCompPtrs) {
return nil
}
return s.stCompPtrs[level]
}
// Manifest related utils.
// Fill given session record obj with current states; need external
@ -149,29 +170,28 @@ func (s *session) fillRecord(r *sessionRecord, snapshot bool) {
// Mark if record has been committed, this will update session state;
// need external synchronization.
func (s *session) recordCommited(r *sessionRecord) {
if r.has(recJournalNum) {
s.stJournalNum = r.journalNum
func (s *session) recordCommited(rec *sessionRecord) {
if rec.has(recJournalNum) {
s.stJournalNum = rec.journalNum
}
if r.has(recPrevJournalNum) {
s.stPrevJournalNum = r.prevJournalNum
if rec.has(recPrevJournalNum) {
s.stPrevJournalNum = rec.prevJournalNum
}
if r.has(recSeqNum) {
s.stSeqNum = r.seqNum
if rec.has(recSeqNum) {
s.stSeqNum = rec.seqNum
}
for _, p := range r.compPtrs {
s.stCompPtrs[p.level] = iKey(p.ikey)
for _, r := range rec.compPtrs {
s.setCompPtr(r.level, internalKey(r.ikey))
}
}
// Create a new manifest file; need external synchronization.
func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
num := s.allocFileNum()
file := s.stor.GetFile(num, storage.TypeManifest)
writer, err := file.Create()
fd := storage.FileDesc{storage.TypeManifest, s.allocFileNum()}
writer, err := s.stor.Create(fd)
if err != nil {
return
}
@ -182,7 +202,7 @@ func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
defer v.release()
}
if rec == nil {
rec = &sessionRecord{numLevel: s.o.GetNumLevel()}
rec = &sessionRecord{}
}
s.fillRecord(rec, true)
v.fillRecord(rec)
@ -196,16 +216,16 @@ func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
if s.manifestWriter != nil {
s.manifestWriter.Close()
}
if s.manifestFile != nil {
s.manifestFile.Remove()
if !s.manifestFd.Zero() {
s.stor.Remove(s.manifestFd)
}
s.manifestFile = file
s.manifestFd = fd
s.manifestWriter = writer
s.manifest = jw
} else {
writer.Close()
file.Remove()
s.reuseFileNum(num)
s.stor.Remove(fd)
s.reuseFileNum(fd.Num)
}
}()
@ -221,7 +241,7 @@ func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
if err != nil {
return
}
err = s.stor.SetManifest(file)
err = s.stor.SetMeta(fd)
return
}
@ -240,9 +260,11 @@ func (s *session) flushManifest(rec *sessionRecord) (err error) {
if err != nil {
return
}
err = s.manifestWriter.Sync()
if err != nil {
return
if !s.o.GetNoSync() {
err = s.manifestWriter.Sync()
if err != nil {
return
}
}
s.recordCommited(rec)
return
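
The file-number helpers earlier in this file (`markFileNum`, `allocFileNum`) rely on a compare-and-swap loop over a shared `int64`. A hedged sketch of that pattern is shown below. `counter`, `mark`, and `alloc` are hypothetical names, and unlike the code in the diff this version reads the current value with `atomic.LoadInt64` and returns early when nothing needs to change.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counter is a hypothetical stand-in for the session's stNextFileNum.
type counter struct{ next int64 }

// mark records num as used by making sure next is at least num+1.
// The loop retries when another goroutine changed next in between.
func (c *counter) mark(num int64) {
	want := num + 1
	for {
		old := atomic.LoadInt64(&c.next)
		if old >= want {
			return // already past num, nothing to do
		}
		if atomic.CompareAndSwapInt64(&c.next, old, want) {
			return
		}
	}
}

// alloc reserves and returns the next unused number.
func (c *counter) alloc() int64 {
	return atomic.AddInt64(&c.next, 1) - 1
}

func main() {
	var c counter
	var wg sync.WaitGroup
	for i := int64(0); i < 100; i++ {
		wg.Add(1)
		go func(n int64) {
			defer wg.Done()
			c.mark(n)
		}(i)
	}
	wg.Wait()
	// After marking 0..99, the first free number is returned.
	fmt.Println(c.alloc())
}
```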


@ -17,11 +17,12 @@ import (
"strings"
"sync"
"time"
"github.com/syndtr/goleveldb/leveldb/util"
)
var errFileOpen = errors.New("leveldb/storage: file still open")
var (
errFileOpen = errors.New("leveldb/storage: file still open")
errReadOnly = errors.New("leveldb/storage: storage is read-only")
)
type fileLock interface {
release() error
@ -31,41 +32,53 @@ type fileStorageLock struct {
fs *fileStorage
}
func (lock *fileStorageLock) Release() {
fs := lock.fs
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.slock == lock {
fs.slock = nil
func (lock *fileStorageLock) Unlock() {
if lock.fs != nil {
lock.fs.mu.Lock()
defer lock.fs.mu.Unlock()
if lock.fs.slock == lock {
lock.fs.slock = nil
}
}
return
}
const logSizeThreshold = 1024 * 1024 // 1 MiB
// fileStorage is a file-system backed storage.
type fileStorage struct {
path string
path string
readOnly bool
mu sync.Mutex
flock fileLock
slock *fileStorageLock
logw *os.File
buf []byte
mu sync.Mutex
flock fileLock
slock *fileStorageLock
logw *os.File
logSize int64
buf []byte
// Open file counter; open < 0 means the storage is closed.
open int
day int
}
// OpenFile returns a new filesystem-backed storage implementation with the given
// path. This also hold a file lock, so any subsequent attempt to open the same
// path will fail.
// path. This also acquires a file lock, so any subsequent attempt to open the
// same path will fail.
//
// The storage must be closed after use, by calling Close method.
func OpenFile(path string) (Storage, error) {
if err := os.MkdirAll(path, 0755); err != nil {
func OpenFile(path string, readOnly bool) (Storage, error) {
if fi, err := os.Stat(path); err == nil {
if !fi.IsDir() {
return nil, fmt.Errorf("leveldb/storage: open %s: not a directory", path)
}
} else if os.IsNotExist(err) && !readOnly {
if err := os.MkdirAll(path, 0755); err != nil {
return nil, err
}
} else {
return nil, err
}
flock, err := newFileLock(filepath.Join(path, "LOCK"))
flock, err := newFileLock(filepath.Join(path, "LOCK"), readOnly)
if err != nil {
return nil, err
}
@ -76,23 +89,42 @@ func OpenFile(path string) (Storage, error) {
}
}()
rename(filepath.Join(path, "LOG"), filepath.Join(path, "LOG.old"))
logw, err := os.OpenFile(filepath.Join(path, "LOG"), os.O_WRONLY|os.O_CREATE, 0644)
if err != nil {
return nil, err
var (
logw *os.File
logSize int64
)
if !readOnly {
logw, err = os.OpenFile(filepath.Join(path, "LOG"), os.O_WRONLY|os.O_CREATE, 0644)
if err != nil {
return nil, err
}
logSize, err = logw.Seek(0, os.SEEK_END)
if err != nil {
logw.Close()
return nil, err
}
}
fs := &fileStorage{path: path, flock: flock, logw: logw}
fs := &fileStorage{
path: path,
readOnly: readOnly,
flock: flock,
logw: logw,
logSize: logSize,
}
runtime.SetFinalizer(fs, (*fileStorage).Close)
return fs, nil
}
func (fs *fileStorage) Lock() (util.Releaser, error) {
func (fs *fileStorage) Lock() (Locker, error) {
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return nil, ErrClosed
}
if fs.readOnly {
return &fileStorageLock{}, nil
}
if fs.slock != nil {
return nil, ErrLocked
}
@ -101,7 +133,7 @@ func (fs *fileStorage) Lock() (util.Releaser, error) {
}
func itoa(buf []byte, i int, wid int) []byte {
var u uint = uint(i)
u := uint(i)
if u == 0 && wid <= 1 {
return append(buf, '0')
}
@ -126,6 +158,22 @@ func (fs *fileStorage) printDay(t time.Time) {
}
func (fs *fileStorage) doLog(t time.Time, str string) {
if fs.logSize > logSizeThreshold {
// Rotate log file.
fs.logw.Close()
fs.logw = nil
fs.logSize = 0
rename(filepath.Join(fs.path, "LOG"), filepath.Join(fs.path, "LOG.old"))
}
if fs.logw == nil {
var err error
fs.logw, err = os.OpenFile(filepath.Join(fs.path, "LOG"), os.O_WRONLY|os.O_CREATE, 0644)
if err != nil {
return
}
// Force printDay on new log file.
fs.day = 0
}
fs.printDay(t)
hour, min, sec := t.Clock()
msec := t.Nanosecond() / 1e3
@ -145,65 +193,87 @@ func (fs *fileStorage) doLog(t time.Time, str string) {
}
func (fs *fileStorage) Log(str string) {
t := time.Now()
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return
if !fs.readOnly {
t := time.Now()
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return
}
fs.doLog(t, str)
}
fs.doLog(t, str)
}
func (fs *fileStorage) log(str string) {
fs.doLog(time.Now(), str)
if !fs.readOnly {
fs.doLog(time.Now(), str)
}
}
func (fs *fileStorage) GetFile(num uint64, t FileType) File {
return &file{fs: fs, num: num, t: t}
}
func (fs *fileStorage) SetMeta(fd FileDesc) (err error) {
if !FileDescOk(fd) {
return ErrInvalidFile
}
if fs.readOnly {
return errReadOnly
}
func (fs *fileStorage) GetFiles(t FileType) (ff []File, err error) {
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return nil, ErrClosed
return ErrClosed
}
dir, err := os.Open(fs.path)
if err != nil {
return
}
fnn, err := dir.Readdirnames(0)
// Close the dir first before checking for Readdirnames error.
if err := dir.Close(); err != nil {
fs.log(fmt.Sprintf("close dir: %v", err))
}
if err != nil {
return
}
f := &file{fs: fs}
for _, fn := range fnn {
if f.parse(fn) && (f.t&t) != 0 {
ff = append(ff, f)
f = &file{fs: fs}
defer func() {
if err != nil {
fs.log(fmt.Sprintf("CURRENT: %v", err))
}
}()
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), fd.Num)
w, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return
}
_, err = fmt.Fprintln(w, fsGenName(fd))
if err != nil {
fs.log(fmt.Sprintf("write CURRENT.%d: %v", fd.Num, err))
return
}
if err = w.Sync(); err != nil {
fs.log(fmt.Sprintf("flush CURRENT.%d: %v", fd.Num, err))
return
}
if err = w.Close(); err != nil {
fs.log(fmt.Sprintf("close CURRENT.%d: %v", fd.Num, err))
return
}
if err != nil {
return
}
if err = rename(path, filepath.Join(fs.path, "CURRENT")); err != nil {
fs.log(fmt.Sprintf("rename CURRENT.%d: %v", fd.Num, err))
return
}
// Sync root directory.
if err = syncDir(fs.path); err != nil {
fs.log(fmt.Sprintf("syncDir: %v", err))
}
return
}
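
The `SetMeta` steps above (write a pending `CURRENT.<num>` file, sync it, close it, then rename it over `CURRENT`) follow the usual write-then-rename pattern for replacing a small file atomically. The sketch below reproduces only that part. `setCurrent` and `demo` are hypothetical names, the directory sync done by `syncDir` in the diff is left out, and unlike the diff the sketch closes the file on the error paths.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// setCurrent points dir/CURRENT at manifestName by writing a pending
// file CURRENT.<num>, flushing it to disk, and renaming it over CURRENT.
// A reader therefore sees either the old content or the new content,
// never a partially written file.
func setCurrent(dir, manifestName string, num int64) error {
	pending := fmt.Sprintf("%s.%d", filepath.Join(dir, "CURRENT"), num)
	w, err := os.OpenFile(pending, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
	if err != nil {
		return err
	}
	if _, err = fmt.Fprintln(w, manifestName); err != nil {
		w.Close()
		return err
	}
	if err = w.Sync(); err != nil {
		w.Close()
		return err
	}
	if err = w.Close(); err != nil {
		return err
	}
	return os.Rename(pending, filepath.Join(dir, "CURRENT"))
}

// demo runs setCurrent in a throwaway directory and returns the
// resulting content of CURRENT.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "current-sketch-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := setCurrent(dir, "MANIFEST-000007", 7); err != nil {
		return "", err
	}
	b, err := os.ReadFile(filepath.Join(dir, "CURRENT"))
	return string(b), err
}

func main() {
	content, err := demo()
	fmt.Printf("%q %v\n", content, err)
}
```

If the process dies before the rename, the leftover `CURRENT.<num>` file is what `GetMeta` later treats as a pending file and either promotes or removes.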
func (fs *fileStorage) GetManifest() (f File, err error) {
func (fs *fileStorage) GetMeta() (fd FileDesc, err error) {
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return nil, ErrClosed
return FileDesc{}, ErrClosed
}
dir, err := os.Open(fs.path)
if err != nil {
return
}
fnn, err := dir.Readdirnames(0)
names, err := dir.Readdirnames(0)
// Close the dir first before checking for Readdirnames error.
if err := dir.Close(); err != nil {
fs.log(fmt.Sprintf("close dir: %v", err))
if ce := dir.Close(); ce != nil {
fs.log(fmt.Sprintf("close dir: %v", ce))
}
if err != nil {
return
@ -212,55 +282,64 @@ func (fs *fileStorage) GetManifest() (f File, err error) {
var rem []string
var pend bool
var cerr error
for _, fn := range fnn {
if strings.HasPrefix(fn, "CURRENT") {
pend1 := len(fn) > 7
for _, name := range names {
if strings.HasPrefix(name, "CURRENT") {
pend1 := len(name) > 7
var pendNum int64
// Make sure it is valid name for a CURRENT file, otherwise skip it.
if pend1 {
if fn[7] != '.' || len(fn) < 9 {
fs.log(fmt.Sprintf("skipping %s: invalid file name", fn))
if name[7] != '.' || len(name) < 9 {
fs.log(fmt.Sprintf("skipping %s: invalid file name", name))
continue
}
if _, e1 := strconv.ParseUint(fn[8:], 10, 0); e1 != nil {
fs.log(fmt.Sprintf("skipping %s: invalid file num: %v", fn, e1))
var e1 error
if pendNum, e1 = strconv.ParseInt(name[8:], 10, 0); e1 != nil {
fs.log(fmt.Sprintf("skipping %s: invalid file num: %v", name, e1))
continue
}
}
path := filepath.Join(fs.path, fn)
path := filepath.Join(fs.path, name)
r, e1 := os.OpenFile(path, os.O_RDONLY, 0)
if e1 != nil {
return nil, e1
return FileDesc{}, e1
}
b, e1 := ioutil.ReadAll(r)
if e1 != nil {
r.Close()
return nil, e1
return FileDesc{}, e1
}
f1 := &file{fs: fs}
if len(b) < 1 || b[len(b)-1] != '\n' || !f1.parse(string(b[:len(b)-1])) {
fs.log(fmt.Sprintf("skipping %s: corrupted or incomplete", fn))
var fd1 FileDesc
if len(b) < 1 || b[len(b)-1] != '\n' || !fsParseNamePtr(string(b[:len(b)-1]), &fd1) {
fs.log(fmt.Sprintf("skipping %s: corrupted or incomplete", name))
if pend1 {
rem = append(rem, fn)
rem = append(rem, name)
}
if !pend1 || cerr == nil {
cerr = fmt.Errorf("leveldb/storage: corrupted or incomplete %s file", fn)
metaFd, _ := fsParseName(name)
cerr = &ErrCorrupted{
Fd: metaFd,
Err: errors.New("leveldb/storage: corrupted or incomplete meta file"),
}
}
} else if f != nil && f1.Num() < f.Num() {
fs.log(fmt.Sprintf("skipping %s: obsolete", fn))
} else if pend1 && pendNum != fd1.Num {
fs.log(fmt.Sprintf("skipping %s: inconsistent pending-file num: %d vs %d", name, pendNum, fd1.Num))
rem = append(rem, name)
} else if fd1.Num < fd.Num {
fs.log(fmt.Sprintf("skipping %s: obsolete", name))
if pend1 {
rem = append(rem, fn)
rem = append(rem, name)
}
} else {
f = f1
fd = fd1
pend = pend1
}
if err := r.Close(); err != nil {
fs.log(fmt.Sprintf("close %s: %v", fn, err))
fs.log(fmt.Sprintf("close %s: %v", name, err))
}
}
}
// Don't remove any files if there is no valid CURRENT file.
if f == nil {
if fd.Zero() {
if cerr != nil {
err = cerr
} else {
@ -268,52 +347,140 @@ func (fs *fileStorage) GetManifest() (f File, err error) {
}
return
}
// Rename pending CURRENT file to an effective CURRENT.
if pend {
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), f.Num())
if err := rename(path, filepath.Join(fs.path, "CURRENT")); err != nil {
fs.log(fmt.Sprintf("CURRENT.%d -> CURRENT: %v", f.Num(), err))
if !fs.readOnly {
// Rename pending CURRENT file to an effective CURRENT.
if pend {
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), fd.Num)
if err := rename(path, filepath.Join(fs.path, "CURRENT")); err != nil {
fs.log(fmt.Sprintf("CURRENT.%d -> CURRENT: %v", fd.Num, err))
}
}
}
// Remove obsolete or incomplete pending CURRENT files.
for _, fn := range rem {
path := filepath.Join(fs.path, fn)
if err := os.Remove(path); err != nil {
fs.log(fmt.Sprintf("remove %s: %v", fn, err))
// Remove obsolete or incomplete pending CURRENT files.
for _, name := range rem {
path := filepath.Join(fs.path, name)
if err := os.Remove(path); err != nil {
fs.log(fmt.Sprintf("remove %s: %v", name, err))
}
}
}
return
}
func (fs *fileStorage) SetManifest(f File) (err error) {
func (fs *fileStorage) List(ft FileType) (fds []FileDesc, err error) {
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return nil, ErrClosed
}
dir, err := os.Open(fs.path)
if err != nil {
return
}
names, err := dir.Readdirnames(0)
// Close the dir first before checking for Readdirnames error.
if cerr := dir.Close(); cerr != nil {
fs.log(fmt.Sprintf("close dir: %v", cerr))
}
if err == nil {
for _, name := range names {
if fd, ok := fsParseName(name); ok && fd.Type&ft != 0 {
fds = append(fds, fd)
}
}
}
return
}
func (fs *fileStorage) Open(fd FileDesc) (Reader, error) {
if !FileDescOk(fd) {
return nil, ErrInvalidFile
}
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return nil, ErrClosed
}
of, err := os.OpenFile(filepath.Join(fs.path, fsGenName(fd)), os.O_RDONLY, 0)
if err != nil {
if fsHasOldName(fd) && os.IsNotExist(err) {
of, err = os.OpenFile(filepath.Join(fs.path, fsGenOldName(fd)), os.O_RDONLY, 0)
if err == nil {
goto ok
}
}
return nil, err
}
ok:
fs.open++
return &fileWrap{File: of, fs: fs, fd: fd}, nil
}
func (fs *fileStorage) Create(fd FileDesc) (Writer, error) {
if !FileDescOk(fd) {
return nil, ErrInvalidFile
}
if fs.readOnly {
return nil, errReadOnly
}
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return nil, ErrClosed
}
of, err := os.OpenFile(filepath.Join(fs.path, fsGenName(fd)), os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return nil, err
}
fs.open++
return &fileWrap{File: of, fs: fs, fd: fd}, nil
}
func (fs *fileStorage) Remove(fd FileDesc) error {
if !FileDescOk(fd) {
return ErrInvalidFile
}
if fs.readOnly {
return errReadOnly
}
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return ErrClosed
}
f2, ok := f.(*file)
if !ok || f2.t != TypeManifest {
err := os.Remove(filepath.Join(fs.path, fsGenName(fd)))
if err != nil {
if fsHasOldName(fd) && os.IsNotExist(err) {
if e1 := os.Remove(filepath.Join(fs.path, fsGenOldName(fd))); !os.IsNotExist(e1) {
fs.log(fmt.Sprintf("remove %s: %v (old name)", fd, err))
err = e1
}
} else {
fs.log(fmt.Sprintf("remove %s: %v", fd, err))
}
}
return err
}
func (fs *fileStorage) Rename(oldfd, newfd FileDesc) error {
if !FileDescOk(oldfd) || !FileDescOk(newfd) {
return ErrInvalidFile
}
defer func() {
if err != nil {
fs.log(fmt.Sprintf("CURRENT: %v", err))
}
}()
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), f2.Num())
w, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return err
if oldfd == newfd {
return nil
}
_, err = fmt.Fprintln(w, f2.name())
// Close the file first.
if err := w.Close(); err != nil {
fs.log(fmt.Sprintf("close CURRENT.%d: %v", f2.num, err))
if fs.readOnly {
return errReadOnly
}
if err != nil {
return err
fs.mu.Lock()
defer fs.mu.Unlock()
if fs.open < 0 {
return ErrClosed
}
return rename(path, filepath.Join(fs.path, "CURRENT"))
return rename(filepath.Join(fs.path, fsGenName(oldfd)), filepath.Join(fs.path, fsGenName(newfd)))
}
func (fs *fileStorage) Close() error {
@ -326,209 +493,107 @@ func (fs *fileStorage) Close() error {
runtime.SetFinalizer(fs, nil)
if fs.open > 0 {
fs.log(fmt.Sprintf("refuse to close, %d files still open", fs.open))
return fmt.Errorf("leveldb/storage: cannot close, %d files still open", fs.open)
fs.log(fmt.Sprintf("close: warning, %d files still open", fs.open))
}
fs.open = -1
e1 := fs.logw.Close()
err := fs.flock.release()
if err == nil {
err = e1
if fs.logw != nil {
fs.logw.Close()
}
return err
return fs.flock.release()
}
type fileWrap struct {
*os.File
f *file
fs *fileStorage
fd FileDesc
closed bool
}
func (fw fileWrap) Sync() error {
func (fw *fileWrap) Sync() error {
if err := fw.File.Sync(); err != nil {
return err
}
if fw.f.Type() == TypeManifest {
if fw.fd.Type == TypeManifest {
// Also sync parent directory if file type is manifest.
// See: https://code.google.com/p/leveldb/issues/detail?id=190.
if err := syncDir(fw.f.fs.path); err != nil {
if err := syncDir(fw.fs.path); err != nil {
fw.fs.log(fmt.Sprintf("syncDir: %v", err))
return err
}
}
return nil
}
func (fw fileWrap) Close() error {
f := fw.f
f.fs.mu.Lock()
defer f.fs.mu.Unlock()
if !f.open {
func (fw *fileWrap) Close() error {
fw.fs.mu.Lock()
defer fw.fs.mu.Unlock()
if fw.closed {
return ErrClosed
}
f.open = false
f.fs.open--
fw.closed = true
fw.fs.open--
err := fw.File.Close()
if err != nil {
f.fs.log(fmt.Sprintf("close %s.%d: %v", f.Type(), f.Num(), err))
fw.fs.log(fmt.Sprintf("close %s: %v", fw.fd, err))
}
return err
}
type file struct {
fs *fileStorage
num uint64
t FileType
open bool
}
func (f *file) Open() (Reader, error) {
f.fs.mu.Lock()
defer f.fs.mu.Unlock()
if f.fs.open < 0 {
return nil, ErrClosed
}
if f.open {
return nil, errFileOpen
}
of, err := os.OpenFile(f.path(), os.O_RDONLY, 0)
if err != nil {
if f.hasOldName() && os.IsNotExist(err) {
of, err = os.OpenFile(f.oldPath(), os.O_RDONLY, 0)
if err == nil {
goto ok
}
}
return nil, err
}
ok:
f.open = true
f.fs.open++
return fileWrap{of, f}, nil
}
func (f *file) Create() (Writer, error) {
f.fs.mu.Lock()
defer f.fs.mu.Unlock()
if f.fs.open < 0 {
return nil, ErrClosed
}
if f.open {
return nil, errFileOpen
}
of, err := os.OpenFile(f.path(), os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
if err != nil {
return nil, err
}
f.open = true
f.fs.open++
return fileWrap{of, f}, nil
}
func (f *file) Replace(newfile File) error {
f.fs.mu.Lock()
defer f.fs.mu.Unlock()
if f.fs.open < 0 {
return ErrClosed
}
newfile2, ok := newfile.(*file)
if !ok {
return ErrInvalidFile
}
if f.open || newfile2.open {
return errFileOpen
}
return rename(newfile2.path(), f.path())
}
func (f *file) Type() FileType {
return f.t
}
func (f *file) Num() uint64 {
return f.num
}
func (f *file) Remove() error {
f.fs.mu.Lock()
defer f.fs.mu.Unlock()
if f.fs.open < 0 {
return ErrClosed
}
if f.open {
return errFileOpen
}
err := os.Remove(f.path())
if err != nil {
f.fs.log(fmt.Sprintf("remove %s.%d: %v", f.Type(), f.Num(), err))
}
// Also try remove file with old name, just in case.
if f.hasOldName() {
if e1 := os.Remove(f.oldPath()); !os.IsNotExist(e1) {
f.fs.log(fmt.Sprintf("remove %s.%d: %v (old name)", f.Type(), f.Num(), err))
err = e1
}
}
return err
}
func (f *file) hasOldName() bool {
return f.t == TypeTable
}
func (f *file) oldName() string {
switch f.t {
case TypeTable:
return fmt.Sprintf("%06d.sst", f.num)
}
return f.name()
}
func (f *file) oldPath() string {
return filepath.Join(f.fs.path, f.oldName())
}
func (f *file) name() string {
switch f.t {
func fsGenName(fd FileDesc) string {
switch fd.Type {
case TypeManifest:
return fmt.Sprintf("MANIFEST-%06d", f.num)
return fmt.Sprintf("MANIFEST-%06d", fd.Num)
case TypeJournal:
return fmt.Sprintf("%06d.log", f.num)
return fmt.Sprintf("%06d.log", fd.Num)
case TypeTable:
return fmt.Sprintf("%06d.ldb", f.num)
return fmt.Sprintf("%06d.ldb", fd.Num)
case TypeTemp:
return fmt.Sprintf("%06d.tmp", f.num)
return fmt.Sprintf("%06d.tmp", fd.Num)
default:
panic("invalid file type")
}
}
func (f *file) path() string {
return filepath.Join(f.fs.path, f.name())
func fsHasOldName(fd FileDesc) bool {
return fd.Type == TypeTable
}
func (f *file) parse(name string) bool {
var num uint64
func fsGenOldName(fd FileDesc) string {
switch fd.Type {
case TypeTable:
return fmt.Sprintf("%06d.sst", fd.Num)
}
return fsGenName(fd)
}
func fsParseName(name string) (fd FileDesc, ok bool) {
var tail string
_, err := fmt.Sscanf(name, "%d.%s", &num, &tail)
_, err := fmt.Sscanf(name, "%d.%s", &fd.Num, &tail)
if err == nil {
switch tail {
case "log":
f.t = TypeJournal
fd.Type = TypeJournal
case "ldb", "sst":
f.t = TypeTable
fd.Type = TypeTable
case "tmp":
f.t = TypeTemp
fd.Type = TypeTemp
default:
return false
return
}
f.num = num
return true
return fd, true
}
n, _ := fmt.Sscanf(name, "MANIFEST-%d%s", &num, &tail)
n, _ := fmt.Sscanf(name, "MANIFEST-%d%s", &fd.Num, &tail)
if n == 1 {
f.t = TypeManifest
f.num = num
return true
fd.Type = TypeManifest
return fd, true
}
return false
return
}
func fsParseNamePtr(name string, fd *FileDesc) bool {
_fd, ok := fsParseName(name)
if fd != nil {
*fd = _fd
}
return ok
}
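
With the `file` type gone, names are now derived from and parsed into plain `FileDesc` values. The round trip can be sketched on its own as below. The types are simplified copies made for illustration, `genName` and `parseName` are hypothetical names that mirror `fsGenName` and `fsParseName`, and the real package may differ in details not visible in this diff.

```go
package main

import "fmt"

// FileType and FileDesc are simplified copies of the storage types.
type FileType int

const (
	TypeManifest FileType = 1 << iota
	TypeJournal
	TypeTable
	TypeTemp
)

type FileDesc struct {
	Type FileType
	Num  int64
}

// genName returns the on-disk name for fd.
func genName(fd FileDesc) string {
	switch fd.Type {
	case TypeManifest:
		return fmt.Sprintf("MANIFEST-%06d", fd.Num)
	case TypeJournal:
		return fmt.Sprintf("%06d.log", fd.Num)
	case TypeTable:
		return fmt.Sprintf("%06d.ldb", fd.Num)
	case TypeTemp:
		return fmt.Sprintf("%06d.tmp", fd.Num)
	default:
		panic("invalid file type")
	}
}

// parseName is the inverse of genName. It also accepts the legacy
// ".sst" suffix for tables.
func parseName(name string) (fd FileDesc, ok bool) {
	var tail string
	if _, err := fmt.Sscanf(name, "%d.%s", &fd.Num, &tail); err == nil {
		switch tail {
		case "log":
			fd.Type = TypeJournal
		case "ldb", "sst":
			fd.Type = TypeTable
		case "tmp":
			fd.Type = TypeTemp
		default:
			return
		}
		return fd, true
	}
	// A manifest name has a number and nothing after it, so exactly
	// one item must be scanned.
	if n, _ := fmt.Sscanf(name, "MANIFEST-%d%s", &fd.Num, &tail); n == 1 {
		fd.Type = TypeManifest
		return fd, true
	}
	return
}

func main() {
	for _, name := range []string{"000100.log", "000000.sst", "MANIFEST-000007", "100.lop"} {
		fd, ok := parseName(name)
		fmt.Println(name, fd, ok)
	}
}
```

Because `FileDesc` is a comparable value type, callers can use it as a map key or compare two descriptors with `==`, as `Rename` does with `oldfd == newfd`.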


@ -0,0 +1,34 @@
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// +build nacl
package storage
import (
"os"
"syscall"
)
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
return nil, syscall.ENOTSUP
}
func setFileLock(f *os.File, readOnly, lock bool) error {
return syscall.ENOTSUP
}
func rename(oldpath, newpath string) error {
return syscall.ENOTSUP
}
func isErrInvalid(err error) bool {
return false
}
func syncDir(name string) error {
return syscall.ENOTSUP
}


@ -19,8 +19,21 @@ func (fl *plan9FileLock) release() error {
return fl.f.Close()
}
func newFileLock(path string) (fl fileLock, err error) {
f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, os.ModeExclusive|0644)
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
var (
flag int
perm os.FileMode
)
if readOnly {
flag = os.O_RDONLY
} else {
flag = os.O_RDWR
perm = os.ModeExclusive
}
f, err := os.OpenFile(path, flag, perm)
if os.IsNotExist(err) {
f, err = os.OpenFile(path, flag|os.O_CREATE, perm|0644)
}
if err != nil {
return
}


@ -18,18 +18,27 @@ type unixFileLock struct {
}
func (fl *unixFileLock) release() error {
if err := setFileLock(fl.f, false); err != nil {
if err := setFileLock(fl.f, false, false); err != nil {
return err
}
return fl.f.Close()
}
func newFileLock(path string) (fl fileLock, err error) {
f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0644)
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
var flag int
if readOnly {
flag = os.O_RDONLY
} else {
flag = os.O_RDWR
}
f, err := os.OpenFile(path, flag, 0)
if os.IsNotExist(err) {
f, err = os.OpenFile(path, flag|os.O_CREATE, 0644)
}
if err != nil {
return
}
err = setFileLock(f, true)
err = setFileLock(f, readOnly, true)
if err != nil {
f.Close()
return
@ -38,7 +47,7 @@ func newFileLock(path string) (fl fileLock, err error) {
return
}
func setFileLock(f *os.File, lock bool) error {
func setFileLock(f *os.File, readOnly, lock bool) error {
flock := syscall.Flock_t{
Type: syscall.F_UNLCK,
Start: 0,
@ -46,7 +55,11 @@ func setFileLock(f *os.File, lock bool) error {
Whence: 1,
}
if lock {
flock.Type = syscall.F_WRLCK
if readOnly {
flock.Type = syscall.F_RDLCK
} else {
flock.Type = syscall.F_WRLCK
}
}
return syscall.FcntlFlock(f.Fd(), syscall.F_SETLK, &flock)
}


@ -17,14 +17,14 @@ var cases = []struct {
oldName []string
name string
ftype FileType
num uint64
num int64
}{
{nil, "000100.log", TypeJournal, 100},
{nil, "000000.log", TypeJournal, 0},
{[]string{"000000.sst"}, "000000.ldb", TypeTable, 0},
{nil, "MANIFEST-000002", TypeManifest, 2},
{nil, "MANIFEST-000007", TypeManifest, 7},
{nil, "18446744073709551615.log", TypeJournal, 18446744073709551615},
{nil, "9223372036854775807.log", TypeJournal, 9223372036854775807},
{nil, "000100.tmp", TypeTemp, 100},
}
@ -55,9 +55,8 @@ var invalidCases = []string{
func TestFileStorage_CreateFileName(t *testing.T) {
for _, c := range cases {
f := &file{num: c.num, t: c.ftype}
if f.name() != c.name {
t.Errorf("invalid filename got '%s', want '%s'", f.name(), c.name)
if name := fsGenName(FileDesc{c.ftype, c.num}); name != c.name {
t.Errorf("invalid filename got '%s', want '%s'", name, c.name)
}
}
}
@ -65,16 +64,16 @@ func TestFileStorage_CreateFileName(t *testing.T) {
func TestFileStorage_ParseFileName(t *testing.T) {
for _, c := range cases {
for _, name := range append([]string{c.name}, c.oldName...) {
f := new(file)
if !f.parse(name) {
fd, ok := fsParseName(name)
if !ok {
t.Errorf("cannot parse filename '%s'", name)
continue
}
if f.Type() != c.ftype {
t.Errorf("filename '%s' invalid type got '%d', want '%d'", name, f.Type(), c.ftype)
if fd.Type != c.ftype {
t.Errorf("filename '%s' invalid type got '%d', want '%d'", name, fd.Type, c.ftype)
}
if f.Num() != c.num {
t.Errorf("filename '%s' invalid number got '%d', want '%d'", name, f.Num(), c.num)
if fd.Num != c.num {
t.Errorf("filename '%s' invalid number got '%d', want '%d'", name, fd.Num, c.num)
}
}
}
@ -82,32 +81,25 @@ func TestFileStorage_ParseFileName(t *testing.T) {
func TestFileStorage_InvalidFileName(t *testing.T) {
for _, name := range invalidCases {
f := new(file)
if f.parse(name) {
if fsParseNamePtr(name, nil) {
t.Errorf("filename '%s' should be invalid", name)
}
}
}
func TestFileStorage_Locking(t *testing.T) {
path := filepath.Join(os.TempDir(), fmt.Sprintf("goleveldbtestfd-%d", os.Getuid()))
_, err := os.Stat(path)
if err == nil {
err = os.RemoveAll(path)
if err != nil {
t.Fatal("RemoveAll: got error: ", err)
}
path := filepath.Join(os.TempDir(), fmt.Sprintf("goleveldb-testrwlock-%d", os.Getuid()))
if err := os.RemoveAll(path); err != nil && !os.IsNotExist(err) {
t.Fatal("RemoveAll: got error: ", err)
}
defer os.RemoveAll(path)
p1, err := OpenFile(path)
p1, err := OpenFile(path, false)
if err != nil {
t.Fatal("OpenFile(1): got error: ", err)
}
defer os.RemoveAll(path)
p2, err := OpenFile(path)
p2, err := OpenFile(path, false)
if err != nil {
t.Logf("OpenFile(2): got error: %s (expected)", err)
} else {
@ -118,7 +110,7 @@ func TestFileStorage_Locking(t *testing.T) {
p1.Close()
p3, err := OpenFile(path)
p3, err := OpenFile(path, false)
if err != nil {
t.Fatal("OpenFile(3): got error: ", err)
}
@ -134,9 +126,51 @@ func TestFileStorage_Locking(t *testing.T) {
} else {
t.Logf("storage lock got error: %s (expected)", err)
}
l.Release()
l.Unlock()
_, err = p3.Lock()
if err != nil {
t.Fatal("storage lock failed(2): ", err)
}
}
func TestFileStorage_ReadOnlyLocking(t *testing.T) {
path := filepath.Join(os.TempDir(), fmt.Sprintf("goleveldb-testrolock-%d", os.Getuid()))
if err := os.RemoveAll(path); err != nil && !os.IsNotExist(err) {
t.Fatal("RemoveAll: got error: ", err)
}
defer os.RemoveAll(path)
p1, err := OpenFile(path, false)
if err != nil {
t.Fatal("OpenFile(1): got error: ", err)
}
_, err = OpenFile(path, true)
if err != nil {
t.Logf("OpenFile(2): got error: %s (expected)", err)
} else {
t.Fatal("OpenFile(2): expect error")
}
p1.Close()
p3, err := OpenFile(path, true)
if err != nil {
t.Fatal("OpenFile(3): got error: ", err)
}
p4, err := OpenFile(path, true)
if err != nil {
t.Fatal("OpenFile(4): got error: ", err)
}
_, err = OpenFile(path, false)
if err != nil {
t.Logf("OpenFile(5): got error: %s (expected)", err)
} else {
t.Fatal("OpenFile(5): expect error")
}
p3.Close()
p4.Close()
}

@ -18,18 +18,27 @@ type unixFileLock struct {
}
func (fl *unixFileLock) release() error {
if err := setFileLock(fl.f, false); err != nil {
if err := setFileLock(fl.f, false, false); err != nil {
return err
}
return fl.f.Close()
}
func newFileLock(path string) (fl fileLock, err error) {
f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0644)
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
var flag int
if readOnly {
flag = os.O_RDONLY
} else {
flag = os.O_RDWR
}
f, err := os.OpenFile(path, flag, 0)
if os.IsNotExist(err) {
f, err = os.OpenFile(path, flag|os.O_CREATE, 0644)
}
if err != nil {
return
}
err = setFileLock(f, true)
err = setFileLock(f, readOnly, true)
if err != nil {
f.Close()
return
@ -38,10 +47,14 @@ func newFileLock(path string) (fl fileLock, err error) {
return
}
func setFileLock(f *os.File, lock bool) error {
func setFileLock(f *os.File, readOnly, lock bool) error {
how := syscall.LOCK_UN
if lock {
how = syscall.LOCK_EX
if readOnly {
how = syscall.LOCK_SH
} else {
how = syscall.LOCK_EX
}
}
return syscall.Flock(int(f.Fd()), how|syscall.LOCK_NB)
}
@ -50,13 +63,23 @@ func rename(oldpath, newpath string) error {
return os.Rename(oldpath, newpath)
}
func isErrInvalid(err error) bool {
if err == os.ErrInvalid {
return true
}
if syserr, ok := err.(*os.SyscallError); ok && syserr.Err == syscall.EINVAL {
return true
}
return false
}
func syncDir(name string) error {
f, err := os.Open(name)
if err != nil {
return err
}
defer f.Close()
if err := f.Sync(); err != nil {
if err := f.Sync(); err != nil && !isErrInvalid(err) {
return err
}
return nil

@ -29,12 +29,22 @@ func (fl *windowsFileLock) release() error {
return syscall.Close(fl.fd)
}
func newFileLock(path string) (fl fileLock, err error) {
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
pathp, err := syscall.UTF16PtrFromString(path)
if err != nil {
return
}
fd, err := syscall.CreateFile(pathp, syscall.GENERIC_READ|syscall.GENERIC_WRITE, 0, nil, syscall.CREATE_ALWAYS, syscall.FILE_ATTRIBUTE_NORMAL, 0)
var access, shareMode uint32
if readOnly {
access = syscall.GENERIC_READ
shareMode = syscall.FILE_SHARE_READ
} else {
access = syscall.GENERIC_READ | syscall.GENERIC_WRITE
}
fd, err := syscall.CreateFile(pathp, access, shareMode, nil, syscall.OPEN_EXISTING, syscall.FILE_ATTRIBUTE_NORMAL, 0)
if err == syscall.ERROR_FILE_NOT_FOUND {
fd, err = syscall.CreateFile(pathp, access, shareMode, nil, syscall.OPEN_ALWAYS, syscall.FILE_ATTRIBUTE_NORMAL, 0)
}
if err != nil {
return
}
@ -47,9 +57,8 @@ func moveFileEx(from *uint16, to *uint16, flags uint32) error {
if r1 == 0 {
if e1 != 0 {
return error(e1)
} else {
return syscall.EINVAL
}
return syscall.EINVAL
}
return nil
}

@ -10,8 +10,6 @@ import (
"bytes"
"os"
"sync"
"github.com/syndtr/goleveldb/leveldb/util"
)
const typeShift = 3
@ -20,7 +18,7 @@ type memStorageLock struct {
ms *memStorage
}
func (lock *memStorageLock) Release() {
func (lock *memStorageLock) Unlock() {
ms := lock.ms
ms.mu.Lock()
defer ms.mu.Unlock()
@ -32,10 +30,10 @@ func (lock *memStorageLock) Release() {
// memStorage is a memory-backed storage.
type memStorage struct {
mu sync.Mutex
slock *memStorageLock
files map[uint64]*memFile
manifest *memFilePtr
mu sync.Mutex
slock *memStorageLock
files map[uint64]*memFile
meta FileDesc
}
// NewMemStorage returns a new memory-backed storage implementation.
@ -45,7 +43,7 @@ func NewMemStorage() Storage {
}
}
func (ms *memStorage) Lock() (util.Releaser, error) {
func (ms *memStorage) Lock() (Locker, error) {
ms.mu.Lock()
defer ms.mu.Unlock()
if ms.slock != nil {
@ -57,147 +55,164 @@ func (ms *memStorage) Lock() (util.Releaser, error) {
func (*memStorage) Log(str string) {}
func (ms *memStorage) GetFile(num uint64, t FileType) File {
return &memFilePtr{ms: ms, num: num, t: t}
}
func (ms *memStorage) GetFiles(t FileType) ([]File, error) {
ms.mu.Lock()
var ff []File
for x, _ := range ms.files {
num, mt := x>>typeShift, FileType(x)&TypeAll
if mt&t == 0 {
continue
}
ff = append(ff, &memFilePtr{ms: ms, num: num, t: mt})
}
ms.mu.Unlock()
return ff, nil
}
func (ms *memStorage) GetManifest() (File, error) {
ms.mu.Lock()
defer ms.mu.Unlock()
if ms.manifest == nil {
return nil, os.ErrNotExist
}
return ms.manifest, nil
}
func (ms *memStorage) SetManifest(f File) error {
fm, ok := f.(*memFilePtr)
if !ok || fm.t != TypeManifest {
func (ms *memStorage) SetMeta(fd FileDesc) error {
if !FileDescOk(fd) {
return ErrInvalidFile
}
ms.mu.Lock()
ms.manifest = fm
ms.meta = fd
ms.mu.Unlock()
return nil
}
func (*memStorage) Close() error { return nil }
type memReader struct {
*bytes.Reader
m *memFile
}
func (mr *memReader) Close() error {
return mr.m.Close()
}
type memFile struct {
bytes.Buffer
ms *memStorage
open bool
}
func (*memFile) Sync() error { return nil }
func (m *memFile) Close() error {
m.ms.mu.Lock()
m.open = false
m.ms.mu.Unlock()
return nil
}
type memFilePtr struct {
ms *memStorage
num uint64
t FileType
}
func (p *memFilePtr) x() uint64 {
return p.Num()<<typeShift | uint64(p.Type())
}
func (p *memFilePtr) Open() (Reader, error) {
ms := p.ms
func (ms *memStorage) GetMeta() (FileDesc, error) {
ms.mu.Lock()
defer ms.mu.Unlock()
if m, exist := ms.files[p.x()]; exist {
if ms.meta.Zero() {
return FileDesc{}, os.ErrNotExist
}
return ms.meta, nil
}
func (ms *memStorage) List(ft FileType) ([]FileDesc, error) {
ms.mu.Lock()
var fds []FileDesc
for x := range ms.files {
fd := unpackFile(x)
if fd.Type&ft != 0 {
fds = append(fds, fd)
}
}
ms.mu.Unlock()
return fds, nil
}
func (ms *memStorage) Open(fd FileDesc) (Reader, error) {
if !FileDescOk(fd) {
return nil, ErrInvalidFile
}
ms.mu.Lock()
defer ms.mu.Unlock()
if m, exist := ms.files[packFile(fd)]; exist {
if m.open {
return nil, errFileOpen
}
m.open = true
return &memReader{Reader: bytes.NewReader(m.Bytes()), m: m}, nil
return &memReader{Reader: bytes.NewReader(m.Bytes()), ms: ms, m: m}, nil
}
return nil, os.ErrNotExist
}
func (p *memFilePtr) Create() (Writer, error) {
ms := p.ms
func (ms *memStorage) Create(fd FileDesc) (Writer, error) {
if !FileDescOk(fd) {
return nil, ErrInvalidFile
}
x := packFile(fd)
ms.mu.Lock()
defer ms.mu.Unlock()
m, exist := ms.files[p.x()]
m, exist := ms.files[x]
if exist {
if m.open {
return nil, errFileOpen
}
m.Reset()
} else {
m = &memFile{ms: ms}
ms.files[p.x()] = m
m = &memFile{}
ms.files[x] = m
}
m.open = true
return m, nil
return &memWriter{memFile: m, ms: ms}, nil
}
func (p *memFilePtr) Replace(newfile File) error {
p1, ok := newfile.(*memFilePtr)
if !ok {
func (ms *memStorage) Remove(fd FileDesc) error {
if !FileDescOk(fd) {
return ErrInvalidFile
}
ms := p.ms
x := packFile(fd)
ms.mu.Lock()
defer ms.mu.Unlock()
m1, exist := ms.files[p1.x()]
if !exist {
return os.ErrNotExist
}
m0, exist := ms.files[p.x()]
if (exist && m0.open) || m1.open {
return errFileOpen
}
delete(ms.files, p1.x())
ms.files[p.x()] = m1
return nil
}
func (p *memFilePtr) Type() FileType {
return p.t
}
func (p *memFilePtr) Num() uint64 {
return p.num
}
func (p *memFilePtr) Remove() error {
ms := p.ms
ms.mu.Lock()
defer ms.mu.Unlock()
if _, exist := ms.files[p.x()]; exist {
delete(ms.files, p.x())
if _, exist := ms.files[x]; exist {
delete(ms.files, x)
return nil
}
return os.ErrNotExist
}
func (ms *memStorage) Rename(oldfd, newfd FileDesc) error {
if !FileDescOk(oldfd) || !FileDescOk(newfd) {
return ErrInvalidFile
}
if oldfd == newfd {
return nil
}
oldx := packFile(oldfd)
newx := packFile(newfd)
ms.mu.Lock()
defer ms.mu.Unlock()
oldm, exist := ms.files[oldx]
if !exist {
return os.ErrNotExist
}
newm, exist := ms.files[newx]
if (exist && newm.open) || oldm.open {
return errFileOpen
}
delete(ms.files, oldx)
ms.files[newx] = oldm
return nil
}
func (*memStorage) Close() error { return nil }
type memFile struct {
bytes.Buffer
open bool
}
type memReader struct {
*bytes.Reader
ms *memStorage
m *memFile
closed bool
}
func (mr *memReader) Close() error {
mr.ms.mu.Lock()
defer mr.ms.mu.Unlock()
if mr.closed {
return ErrClosed
}
mr.m.open = false
mr.closed = true
return nil
}
type memWriter struct {
*memFile
ms *memStorage
closed bool
}
func (*memWriter) Sync() error { return nil }
func (mw *memWriter) Close() error {
mw.ms.mu.Lock()
defer mw.ms.mu.Unlock()
if mw.closed {
return ErrClosed
}
mw.memFile.open = false
mw.closed = true
return nil
}
func packFile(fd FileDesc) uint64 {
return uint64(fd.Num)<<typeShift | uint64(fd.Type)
}
func unpackFile(x uint64) FileDesc {
return FileDesc{FileType(x) & TypeAll, int64(x >> typeShift)}
}

@ -24,24 +24,23 @@ func TestMemStorage(t *testing.T) {
} else {
t.Logf("storage lock got error: %s (expected)", err)
}
l.Release()
l.Unlock()
_, err = m.Lock()
if err != nil {
t.Fatal("storage lock failed(2): ", err)
}
f := m.GetFile(1, TypeTable)
if f.Num() != 1 && f.Type() != TypeTable {
t.Fatal("invalid file number and type")
w, err := m.Create(FileDesc{TypeTable, 1})
if err != nil {
t.Fatal("Storage.Create: ", err)
}
w, _ := f.Create()
w.Write([]byte("abc"))
w.Close()
if ff, _ := m.GetFiles(TypeAll); len(ff) != 1 {
if fds, _ := m.List(TypeAll); len(fds) != 1 {
t.Fatal("invalid GetFiles len")
}
buf := new(bytes.Buffer)
r, err := f.Open()
r, err := m.Open(FileDesc{TypeTable, 1})
if err != nil {
t.Fatal("Open: got error: ", err)
}
@ -50,17 +49,17 @@ func TestMemStorage(t *testing.T) {
if got := buf.String(); got != "abc" {
t.Fatalf("Read: invalid value, want=abc got=%s", got)
}
if _, err := f.Open(); err != nil {
if _, err := m.Open(FileDesc{TypeTable, 1}); err != nil {
t.Fatal("Open: got error: ", err)
}
if _, err := m.GetFile(1, TypeTable).Open(); err == nil {
if _, err := m.Open(FileDesc{TypeTable, 1}); err == nil {
t.Fatal("expecting error")
}
f.Remove()
if ff, _ := m.GetFiles(TypeAll); len(ff) != 0 {
t.Fatal("invalid GetFiles len", len(ff))
m.Remove(FileDesc{TypeTable, 1})
if fds, _ := m.List(TypeAll); len(fds) != 0 {
t.Fatal("invalid GetFiles len", len(fds))
}
if _, err := f.Open(); err == nil {
if _, err := m.Open(FileDesc{TypeTable, 1}); err == nil {
t.Fatal("expecting error")
}
}

@ -11,12 +11,12 @@ import (
"errors"
"fmt"
"io"
"github.com/syndtr/goleveldb/leveldb/util"
)
type FileType uint32
// FileType represents a file type.
type FileType int
// File types.
const (
TypeManifest FileType = 1 << iota
TypeJournal
@ -40,12 +40,28 @@ func (t FileType) String() string {
return fmt.Sprintf("<unknown:%d>", t)
}
// Common error.
var (
ErrInvalidFile = errors.New("leveldb/storage: invalid file for argument")
ErrLocked = errors.New("leveldb/storage: already locked")
ErrClosed = errors.New("leveldb/storage: closed")
)
// ErrCorrupted is the type that wraps errors that indicate corruption of
// a file. Package storage has its own type instead of using
// errors.ErrCorrupted to prevent circular import.
type ErrCorrupted struct {
Fd FileDesc
Err error
}
func (e *ErrCorrupted) Error() string {
if !e.Fd.Zero() {
return fmt.Sprintf("%v [file=%v]", e.Err, e.Fd)
}
return e.Err.Error()
}
// Syncer is the interface that wraps basic Sync method.
type Syncer interface {
// Sync commits the current contents of the file to stable storage.
@ -67,91 +83,97 @@ type Writer interface {
Syncer
}
// File is the file. A file instance must be goroutine-safe.
type File interface {
// Open opens the file for read. Returns os.ErrNotExist error
// if the file does not exist.
// Returns ErrClosed if the underlying storage is closed.
Open() (r Reader, err error)
// Create creates the file for writing. Truncates the file if it
// already exists.
// Returns ErrClosed if the underlying storage is closed.
Create() (w Writer, err error)
// Replace replaces file with newfile.
// Returns ErrClosed if the underlying storage is closed.
Replace(newfile File) error
// Type returns the file type
Type() FileType
// Num returns the file number.
Num() uint64
// Remove removes the file.
// Returns ErrClosed if the underlying storage is closed.
Remove() error
// Locker is the interface that wraps Unlock method.
type Locker interface {
Unlock()
}
// Storage is the storage. A storage instance must be goroutine-safe.
// FileDesc is a 'file descriptor'.
type FileDesc struct {
Type FileType
Num int64
}
func (fd FileDesc) String() string {
switch fd.Type {
case TypeManifest:
return fmt.Sprintf("MANIFEST-%06d", fd.Num)
case TypeJournal:
return fmt.Sprintf("%06d.log", fd.Num)
case TypeTable:
return fmt.Sprintf("%06d.ldb", fd.Num)
case TypeTemp:
return fmt.Sprintf("%06d.tmp", fd.Num)
default:
return fmt.Sprintf("%#x-%d", fd.Type, fd.Num)
}
}
// Zero returns true if fd == (FileDesc{}).
func (fd FileDesc) Zero() bool {
return fd == (FileDesc{})
}
// FileDescOk returns true if fd is a valid 'file descriptor'.
func FileDescOk(fd FileDesc) bool {
switch fd.Type {
case TypeManifest:
case TypeJournal:
case TypeTable:
case TypeTemp:
default:
return false
}
return fd.Num >= 0
}
// Storage is the storage. A storage instance must be safe for concurrent use.
type Storage interface {
// Lock locks the storage. Any subsequent attempt to call Lock will fail
// until the last lock is released.
// After use the caller should call the Release method.
Lock() (l util.Releaser, err error)
// The caller should call the Unlock method after use.
Lock() (Locker, error)
// Log logs a string. This is used for logging. An implementation
// may write to a file, stdout or simply do nothing.
// Log logs a string. This is used for logging.
// An implementation may write to a file, stdout or simply do nothing.
Log(str string)
// GetFile returns a file for the given number and type. GetFile will never
// returns nil, even if the underlying storage is closed.
GetFile(num uint64, t FileType) File
// SetMeta stores a 'file descriptor' that can later be retrieved using the
// GetMeta method. The 'file descriptor' should point to a valid file.
// SetMeta should be implemented in such a way that the change happens
// atomically.
SetMeta(fd FileDesc) error
// GetFiles returns a slice of files that match the given file types.
// GetMeta returns the 'file descriptor' stored in meta. The 'file descriptor'
// can be updated using the SetMeta method.
// Returns os.ErrNotExist if meta doesn't store any 'file descriptor', or if
// the 'file descriptor' points to a nonexistent file.
GetMeta() (FileDesc, error)
// List returns file descriptors that match the given file types.
// The file types may be OR'ed together.
GetFiles(t FileType) ([]File, error)
List(ft FileType) ([]FileDesc, error)
// GetManifest returns a manifest file. Returns os.ErrNotExist if manifest
// file does not exist.
GetManifest() (File, error)
// Open opens file with the given 'file descriptor' read-only.
// Returns os.ErrNotExist error if the file does not exist.
// Returns ErrClosed if the underlying storage is closed.
Open(fd FileDesc) (Reader, error)
// SetManifest sets the given file as manifest file. The given file should
// be a manifest file type or error will be returned.
SetManifest(f File) error
// Create creates the file with the given 'file descriptor', truncating it if
// it already exists, and opens it write-only.
// Returns ErrClosed if the underlying storage is closed.
Create(fd FileDesc) (Writer, error)
// Close closes the storage. It is valid to call Close multiple times.
// Other methods should not be called after the storage has been closed.
// Remove removes file with the given 'file descriptor'.
// Returns ErrClosed if the underlying storage is closed.
Remove(fd FileDesc) error
// Rename renames file from oldfd to newfd.
// Returns ErrClosed if the underlying storage is closed.
Rename(oldfd, newfd FileDesc) error
// Close closes the storage.
// It is valid to call Close multiple times. Other methods should not be
// called after the storage has been closed.
Close() error
}
// FileInfo wraps basic file info.
type FileInfo struct {
Type FileType
Num uint64
}
func (fi FileInfo) String() string {
switch fi.Type {
case TypeManifest:
return fmt.Sprintf("MANIFEST-%06d", fi.Num)
case TypeJournal:
return fmt.Sprintf("%06d.log", fi.Num)
case TypeTable:
return fmt.Sprintf("%06d.ldb", fi.Num)
case TypeTemp:
return fmt.Sprintf("%06d.tmp", fi.Num)
default:
return fmt.Sprintf("%#x-%d", fi.Type, fi.Num)
}
}
// NewFileInfo creates new FileInfo from the given File. It will returns nil
// if File is nil.
func NewFileInfo(f File) *FileInfo {
if f == nil {
return nil
}
return &FileInfo{f.Type(), f.Num()}
}

@ -1,539 +0,0 @@
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
package leveldb
import (
"errors"
"fmt"
"io"
"io/ioutil"
"math/rand"
"os"
"path/filepath"
"sync"
"testing"
"github.com/syndtr/goleveldb/leveldb/storage"
"github.com/syndtr/goleveldb/leveldb/util"
)
const typeShift = 4
var (
tsErrInvalidFile = errors.New("leveldb.testStorage: invalid file for argument")
tsErrFileOpen = errors.New("leveldb.testStorage: file still open")
)
var (
tsFSEnv = os.Getenv("GOLEVELDB_USEFS")
tsTempdir = os.Getenv("GOLEVELDB_TEMPDIR")
tsKeepFS = tsFSEnv == "2"
tsFS = tsKeepFS || tsFSEnv == "" || tsFSEnv == "1"
tsMU = &sync.Mutex{}
tsNum = 0
)
type tsOp uint
const (
tsOpOpen tsOp = iota
tsOpCreate
tsOpRead
tsOpReadAt
tsOpWrite
tsOpSync
tsOpNum
)
type tsLock struct {
ts *testStorage
r util.Releaser
}
func (l tsLock) Release() {
l.r.Release()
l.ts.t.Log("I: storage lock released")
}
type tsReader struct {
tf tsFile
storage.Reader
}
func (tr tsReader) Read(b []byte) (n int, err error) {
ts := tr.tf.ts
ts.countRead(tr.tf.Type())
if tr.tf.shouldErrLocked(tsOpRead) {
return 0, errors.New("leveldb.testStorage: emulated read error")
}
n, err = tr.Reader.Read(b)
if err != nil && err != io.EOF {
ts.t.Errorf("E: read error, num=%d type=%v n=%d: %v", tr.tf.Num(), tr.tf.Type(), n, err)
}
return
}
func (tr tsReader) ReadAt(b []byte, off int64) (n int, err error) {
ts := tr.tf.ts
ts.countRead(tr.tf.Type())
if tr.tf.shouldErrLocked(tsOpReadAt) {
return 0, errors.New("leveldb.testStorage: emulated readAt error")
}
n, err = tr.Reader.ReadAt(b, off)
if err != nil && err != io.EOF {
ts.t.Errorf("E: readAt error, num=%d type=%v off=%d n=%d: %v", tr.tf.Num(), tr.tf.Type(), off, n, err)
}
return
}
func (tr tsReader) Close() (err error) {
err = tr.Reader.Close()
tr.tf.close("reader", err)
return
}
type tsWriter struct {
tf tsFile
storage.Writer
}
func (tw tsWriter) Write(b []byte) (n int, err error) {
if tw.tf.shouldErrLocked(tsOpWrite) {
return 0, errors.New("leveldb.testStorage: emulated write error")
}
n, err = tw.Writer.Write(b)
if err != nil {
tw.tf.ts.t.Errorf("E: write error, num=%d type=%v n=%d: %v", tw.tf.Num(), tw.tf.Type(), n, err)
}
return
}
func (tw tsWriter) Sync() (err error) {
ts := tw.tf.ts
ts.mu.Lock()
for ts.emuDelaySync&tw.tf.Type() != 0 {
ts.cond.Wait()
}
ts.mu.Unlock()
if tw.tf.shouldErrLocked(tsOpSync) {
return errors.New("leveldb.testStorage: emulated sync error")
}
err = tw.Writer.Sync()
if err != nil {
tw.tf.ts.t.Errorf("E: sync error, num=%d type=%v: %v", tw.tf.Num(), tw.tf.Type(), err)
}
return
}
func (tw tsWriter) Close() (err error) {
err = tw.Writer.Close()
tw.tf.close("writer", err)
return
}
type tsFile struct {
ts *testStorage
storage.File
}
func (tf tsFile) x() uint64 {
return tf.Num()<<typeShift | uint64(tf.Type())
}
func (tf tsFile) shouldErr(op tsOp) bool {
return tf.ts.shouldErr(tf, op)
}
func (tf tsFile) shouldErrLocked(op tsOp) bool {
tf.ts.mu.Lock()
defer tf.ts.mu.Unlock()
return tf.shouldErr(op)
}
func (tf tsFile) checkOpen(m string) error {
ts := tf.ts
if writer, ok := ts.opens[tf.x()]; ok {
if writer {
ts.t.Errorf("E: cannot %s file, num=%d type=%v: a writer still open", m, tf.Num(), tf.Type())
} else {
ts.t.Errorf("E: cannot %s file, num=%d type=%v: a reader still open", m, tf.Num(), tf.Type())
}
return tsErrFileOpen
}
return nil
}
func (tf tsFile) close(m string, err error) {
ts := tf.ts
ts.mu.Lock()
defer ts.mu.Unlock()
if _, ok := ts.opens[tf.x()]; !ok {
ts.t.Errorf("E: %s: redundant file closing, num=%d type=%v", m, tf.Num(), tf.Type())
} else if err == nil {
ts.t.Logf("I: %s: file closed, num=%d type=%v", m, tf.Num(), tf.Type())
}
delete(ts.opens, tf.x())
if err != nil {
ts.t.Errorf("E: %s: cannot close file, num=%d type=%v: %v", m, tf.Num(), tf.Type(), err)
}
}
func (tf tsFile) Open() (r storage.Reader, err error) {
ts := tf.ts
ts.mu.Lock()
defer ts.mu.Unlock()
err = tf.checkOpen("open")
if err != nil {
return
}
if tf.shouldErr(tsOpOpen) {
err = errors.New("leveldb.testStorage: emulated open error")
return
}
r, err = tf.File.Open()
if err != nil {
if ts.ignoreOpenErr&tf.Type() != 0 {
ts.t.Logf("I: cannot open file, num=%d type=%v: %v (ignored)", tf.Num(), tf.Type(), err)
} else {
ts.t.Errorf("E: cannot open file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
}
} else {
ts.t.Logf("I: file opened, num=%d type=%v", tf.Num(), tf.Type())
ts.opens[tf.x()] = false
r = tsReader{tf, r}
}
return
}
func (tf tsFile) Create() (w storage.Writer, err error) {
ts := tf.ts
ts.mu.Lock()
defer ts.mu.Unlock()
err = tf.checkOpen("create")
if err != nil {
return
}
if tf.shouldErr(tsOpCreate) {
err = errors.New("leveldb.testStorage: emulated create error")
return
}
w, err = tf.File.Create()
if err != nil {
ts.t.Errorf("E: cannot create file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
} else {
ts.t.Logf("I: file created, num=%d type=%v", tf.Num(), tf.Type())
ts.opens[tf.x()] = true
w = tsWriter{tf, w}
}
return
}
func (tf tsFile) Replace(newfile storage.File) (err error) {
ts := tf.ts
ts.mu.Lock()
defer ts.mu.Unlock()
err = tf.checkOpen("replace")
if err != nil {
return
}
err = tf.File.Replace(newfile.(tsFile).File)
if err != nil {
ts.t.Errorf("E: cannot replace file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
} else {
ts.t.Logf("I: file replace, num=%d type=%v", tf.Num(), tf.Type())
}
return
}
func (tf tsFile) Remove() (err error) {
ts := tf.ts
ts.mu.Lock()
defer ts.mu.Unlock()
err = tf.checkOpen("remove")
if err != nil {
return
}
err = tf.File.Remove()
if err != nil {
ts.t.Errorf("E: cannot remove file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
} else {
ts.t.Logf("I: file removed, num=%d type=%v", tf.Num(), tf.Type())
}
return
}
type testStorage struct {
t *testing.T
storage.Storage
closeFn func() error
mu sync.Mutex
cond sync.Cond
// Open files, true=writer, false=reader
opens map[uint64]bool
emuDelaySync storage.FileType
ignoreOpenErr storage.FileType
readCnt uint64
readCntEn storage.FileType
emuErr [tsOpNum]storage.FileType
emuErrOnce [tsOpNum]storage.FileType
emuRandErr [tsOpNum]storage.FileType
emuRandErrProb int
emuErrOnceMap map[uint64]uint
emuRandRand *rand.Rand
}
func (ts *testStorage) shouldErr(tf tsFile, op tsOp) bool {
if ts.emuErr[op]&tf.Type() != 0 {
return true
} else if ts.emuRandErr[op]&tf.Type() != 0 || ts.emuErrOnce[op]&tf.Type() != 0 {
sop := uint(1) << op
eop := ts.emuErrOnceMap[tf.x()]
if eop&sop == 0 && (ts.emuRandRand.Int()%ts.emuRandErrProb == 0 || ts.emuErrOnce[op]&tf.Type() != 0) {
ts.emuErrOnceMap[tf.x()] = eop | sop
ts.t.Logf("I: emulated error: file=%d type=%v op=%v", tf.Num(), tf.Type(), op)
return true
}
}
return false
}
func (ts *testStorage) SetEmuErr(t storage.FileType, ops ...tsOp) {
ts.mu.Lock()
for _, op := range ops {
ts.emuErr[op] = t
}
ts.mu.Unlock()
}
func (ts *testStorage) SetEmuErrOnce(t storage.FileType, ops ...tsOp) {
ts.mu.Lock()
for _, op := range ops {
ts.emuErrOnce[op] = t
}
ts.mu.Unlock()
}
func (ts *testStorage) SetEmuRandErr(t storage.FileType, ops ...tsOp) {
ts.mu.Lock()
for _, op := range ops {
ts.emuRandErr[op] = t
}
ts.mu.Unlock()
}
func (ts *testStorage) SetEmuRandErrProb(prob int) {
ts.mu.Lock()
ts.emuRandErrProb = prob
ts.mu.Unlock()
}
func (ts *testStorage) DelaySync(t storage.FileType) {
ts.mu.Lock()
ts.emuDelaySync |= t
ts.cond.Broadcast()
ts.mu.Unlock()
}
func (ts *testStorage) ReleaseSync(t storage.FileType) {
ts.mu.Lock()
ts.emuDelaySync &= ^t
ts.cond.Broadcast()
ts.mu.Unlock()
}
func (ts *testStorage) ReadCounter() uint64 {
ts.mu.Lock()
defer ts.mu.Unlock()
return ts.readCnt
}
func (ts *testStorage) ResetReadCounter() {
ts.mu.Lock()
ts.readCnt = 0
ts.mu.Unlock()
}
func (ts *testStorage) SetReadCounter(t storage.FileType) {
ts.mu.Lock()
ts.readCntEn = t
ts.mu.Unlock()
}
func (ts *testStorage) countRead(t storage.FileType) {
ts.mu.Lock()
if ts.readCntEn&t != 0 {
ts.readCnt++
}
ts.mu.Unlock()
}
func (ts *testStorage) SetIgnoreOpenErr(t storage.FileType) {
ts.ignoreOpenErr = t
}
func (ts *testStorage) Lock() (r util.Releaser, err error) {
r, err = ts.Storage.Lock()
if err != nil {
ts.t.Logf("W: storage locking failed: %v", err)
} else {
ts.t.Log("I: storage locked")
r = tsLock{ts, r}
}
return
}
func (ts *testStorage) Log(str string) {
ts.t.Log("L: " + str)
ts.Storage.Log(str)
}
func (ts *testStorage) GetFile(num uint64, t storage.FileType) storage.File {
return tsFile{ts, ts.Storage.GetFile(num, t)}
}
func (ts *testStorage) GetFiles(t storage.FileType) (ff []storage.File, err error) {
ff0, err := ts.Storage.GetFiles(t)
if err != nil {
ts.t.Errorf("E: get files failed: %v", err)
return
}
ff = make([]storage.File, len(ff0))
for i, f := range ff0 {
ff[i] = tsFile{ts, f}
}
ts.t.Logf("I: get files, type=0x%x count=%d", int(t), len(ff))
return
}
func (ts *testStorage) GetManifest() (f storage.File, err error) {
f0, err := ts.Storage.GetManifest()
if err != nil {
if !os.IsNotExist(err) {
ts.t.Errorf("E: get manifest failed: %v", err)
}
return
}
f = tsFile{ts, f0}
ts.t.Logf("I: get manifest, num=%d", f.Num())
return
}
func (ts *testStorage) SetManifest(f storage.File) error {
tf, ok := f.(tsFile)
if !ok {
ts.t.Error("E: set manifest failed: type assertion failed")
return tsErrInvalidFile
} else if tf.Type() != storage.TypeManifest {
ts.t.Errorf("E: set manifest failed: invalid file type: %s", tf.Type())
return tsErrInvalidFile
}
err := ts.Storage.SetManifest(tf.File)
if err != nil {
ts.t.Errorf("E: set manifest failed: %v", err)
} else {
ts.t.Logf("I: set manifest, num=%d", tf.Num())
}
return err
}
func (ts *testStorage) Close() error {
ts.CloseCheck()
err := ts.Storage.Close()
if err != nil {
ts.t.Errorf("E: closing storage failed: %v", err)
} else {
ts.t.Log("I: storage closed")
}
if ts.closeFn != nil {
if err := ts.closeFn(); err != nil {
ts.t.Errorf("E: close function: %v", err)
}
}
return err
}
func (ts *testStorage) CloseCheck() {
ts.mu.Lock()
if len(ts.opens) == 0 {
ts.t.Log("I: all files are closed")
} else {
ts.t.Errorf("E: %d files still open", len(ts.opens))
for x, writer := range ts.opens {
num, tt := x>>typeShift, storage.FileType(x)&storage.TypeAll
ts.t.Errorf("E: * num=%d type=%v writer=%v", num, tt, writer)
}
}
ts.mu.Unlock()
}
func newTestStorage(t *testing.T) *testStorage {
var stor storage.Storage
var closeFn func() error
if tsFS {
for {
tsMU.Lock()
num := tsNum
tsNum++
tsMU.Unlock()
tempdir := tsTempdir
if tempdir == "" {
tempdir = os.TempDir()
}
path := filepath.Join(tempdir, fmt.Sprintf("goleveldb-test%d0%d0%d", os.Getuid(), os.Getpid(), num))
if _, err := os.Stat(path); err != nil {
stor, err = storage.OpenFile(path)
if err != nil {
t.Fatalf("F: cannot create storage: %v", err)
}
t.Logf("I: storage created: %s", path)
closeFn = func() error {
for _, name := range []string{"LOG.old", "LOG"} {
f, err := os.Open(filepath.Join(path, name))
if err != nil {
continue
}
if log, err := ioutil.ReadAll(f); err != nil {
t.Logf("---------------------- %s ----------------------", name)
t.Logf("cannot read log: %v", err)
t.Logf("---------------------- %s ----------------------", name)
} else if len(log) > 0 {
t.Logf("---------------------- %s ----------------------\n%s", name, string(log))
t.Logf("---------------------- %s ----------------------", name)
}
f.Close()
}
if t.Failed() {
t.Logf("testing failed, test DB preserved at %s", path)
return nil
}
if tsKeepFS {
return nil
}
return os.RemoveAll(path)
}
break
}
}
} else {
stor = storage.NewMemStorage()
}
ts := &testStorage{
t: t,
Storage: stor,
closeFn: closeFn,
opens: make(map[uint64]bool),
emuErrOnceMap: make(map[uint64]uint),
emuRandErrProb: 0x999,
emuRandRand: rand.New(rand.NewSource(0xfacedead)),
}
ts.cond.L = &ts.mu
return ts
}

@ -21,10 +21,10 @@ import (
// tFile holds basic information about a table.
type tFile struct {
file storage.File
fd storage.FileDesc
seekLeft int32
size uint64
imin, imax iKey
size int64
imin, imax internalKey
}
// Returns true if given key is after largest key of this table.
@ -48,9 +48,9 @@ func (t *tFile) consumeSeek() int32 {
}
// Creates new tFile.
func newTableFile(file storage.File, size uint64, imin, imax iKey) *tFile {
func newTableFile(fd storage.FileDesc, size int64, imin, imax internalKey) *tFile {
f := &tFile{
file: file,
fd: fd,
size: size,
imin: imin,
imax: imax,
@ -77,6 +77,10 @@ func newTableFile(file storage.File, size uint64, imin, imax iKey) *tFile {
return f
}
func tableFileFromRecord(r atRecord) *tFile {
return newTableFile(storage.FileDesc{storage.TypeTable, r.num}, r.size, r.imin, r.imax)
}
// tFiles hold multiple tFile.
type tFiles []*tFile
@ -89,7 +93,7 @@ func (tf tFiles) nums() string {
if i != 0 {
x += ", "
}
x += fmt.Sprint(f.file.Num())
x += fmt.Sprint(f.fd.Num)
}
x += " ]"
return x
@ -101,7 +105,7 @@ func (tf tFiles) lessByKey(icmp *iComparer, i, j int) bool {
a, b := tf[i], tf[j]
n := icmp.Compare(a.imin, b.imin)
if n == 0 {
return a.file.Num() < b.file.Num()
return a.fd.Num < b.fd.Num
}
return n < 0
}
@ -109,7 +113,7 @@ func (tf tFiles) lessByKey(icmp *iComparer, i, j int) bool {
// Returns true if i file number is greater than j.
// This used for sort by file number in descending order.
func (tf tFiles) lessByNum(i, j int) bool {
return tf[i].file.Num() > tf[j].file.Num()
return tf[i].fd.Num > tf[j].fd.Num
}
// Sorts tables by key in ascending order.
@ -123,7 +127,7 @@ func (tf tFiles) sortByNum() {
}
// Returns sum of all tables size.
func (tf tFiles) size() (sum uint64) {
func (tf tFiles) size() (sum int64) {
for _, t := range tf {
sum += t.size
}
@ -132,7 +136,7 @@ func (tf tFiles) size() (sum uint64) {
// Searches for the smallest index of the table whose smallest
// key is after or equal to the given key.
func (tf tFiles) searchMin(icmp *iComparer, ikey iKey) int {
func (tf tFiles) searchMin(icmp *iComparer, ikey internalKey) int {
return sort.Search(len(tf), func(i int) bool {
return icmp.Compare(tf[i].imin, ikey) >= 0
})
@ -140,7 +144,7 @@ func (tf tFiles) searchMin(icmp *iComparer, ikey iKey) int {
// Searches for the smallest index of the table whose largest
// key is after or equal to the given key.
func (tf tFiles) searchMax(icmp *iComparer, ikey iKey) int {
func (tf tFiles) searchMax(icmp *iComparer, ikey internalKey) int {
return sort.Search(len(tf), func(i int) bool {
return icmp.Compare(tf[i].imax, ikey) >= 0
})
@ -162,7 +166,7 @@ func (tf tFiles) overlaps(icmp *iComparer, umin, umax []byte, unsorted bool) boo
i := 0
if len(umin) > 0 {
// Find the earliest possible internal key for min.
i = tf.searchMax(icmp, newIkey(umin, kMaxSeq, ktSeek))
i = tf.searchMax(icmp, makeInternalKey(nil, umin, keyMaxSeq, keyTypeSeek))
}
if i >= len(tf) {
// Beginning of range is after all files, so no overlap.
@ -205,7 +209,7 @@ func (tf tFiles) getOverlaps(dst tFiles, icmp *iComparer, umin, umax []byte, ove
}
// Returns tables key range.
func (tf tFiles) getRange(icmp *iComparer) (imin, imax iKey) {
func (tf tFiles) getRange(icmp *iComparer) (imin, imax internalKey) {
for i, t := range tf {
if i == 0 {
imin, imax = t.imin, t.imax
@ -227,10 +231,10 @@ func (tf tFiles) newIndexIterator(tops *tOps, icmp *iComparer, slice *util.Range
if slice != nil {
var start, limit int
if slice.Start != nil {
start = tf.searchMax(icmp, iKey(slice.Start))
start = tf.searchMax(icmp, internalKey(slice.Start))
}
if slice.Limit != nil {
limit = tf.searchMin(icmp, iKey(slice.Limit))
limit = tf.searchMin(icmp, internalKey(slice.Limit))
} else {
limit = tf.Len()
}
@ -255,7 +259,7 @@ type tFilesArrayIndexer struct {
}
func (a *tFilesArrayIndexer) Search(key []byte) int {
return a.searchMax(a.icmp, iKey(key))
return a.searchMax(a.icmp, internalKey(key))
}
func (a *tFilesArrayIndexer) Get(i int) iterator.Iterator {
@ -287,6 +291,7 @@ func (x *tFilesSortByNum) Less(i, j int) bool {
// Table operations.
type tOps struct {
s *session
noSync bool
cache *cache.Cache
bcache *cache.Cache
bpool *util.BufferPool
@ -294,16 +299,16 @@ type tOps struct {
// Creates an empty table and returns table writer.
func (t *tOps) create() (*tWriter, error) {
file := t.s.getTableFile(t.s.allocFileNum())
fw, err := file.Create()
fd := storage.FileDesc{storage.TypeTable, t.s.allocFileNum()}
fw, err := t.s.stor.Create(fd)
if err != nil {
return nil, err
}
return &tWriter{
t: t,
file: file,
w: fw,
tw: table.NewWriter(fw, t.s.o.Options),
t: t,
fd: fd,
w: fw,
tw: table.NewWriter(fw, t.s.o.Options),
}, nil
}
@ -339,21 +344,20 @@ func (t *tOps) createFrom(src iterator.Iterator) (f *tFile, n int, err error) {
// Opens table. It returns a cache handle, which should
// be released after use.
func (t *tOps) open(f *tFile) (ch *cache.Handle, err error) {
num := f.file.Num()
ch = t.cache.Get(0, num, func() (size int, value cache.Value) {
ch = t.cache.Get(0, uint64(f.fd.Num), func() (size int, value cache.Value) {
var r storage.Reader
r, err = f.file.Open()
r, err = t.s.stor.Open(f.fd)
if err != nil {
return 0, nil
}
var bcache *cache.CacheGetter
var bcache *cache.NamespaceGetter
if t.bcache != nil {
bcache = &cache.CacheGetter{Cache: t.bcache, NS: num}
bcache = &cache.NamespaceGetter{Cache: t.bcache, NS: uint64(f.fd.Num)}
}
var tr *table.Reader
tr, err = table.NewReader(r, int64(f.size), storage.NewFileInfo(f.file), bcache, t.bpool, t.s.o.Options)
tr, err = table.NewReader(r, f.size, f.fd, bcache, t.bpool, t.s.o.Options)
if err != nil {
r.Close()
return 0, nil
@ -389,14 +393,13 @@ func (t *tOps) findKey(f *tFile, key []byte, ro *opt.ReadOptions) (rkey []byte,
}
// Returns approximate offset of the given key.
func (t *tOps) offsetOf(f *tFile, key []byte) (offset uint64, err error) {
func (t *tOps) offsetOf(f *tFile, key []byte) (offset int64, err error) {
ch, err := t.open(f)
if err != nil {
return
}
defer ch.Release()
offset_, err := ch.Value().(*table.Reader).OffsetOf(key)
return uint64(offset_), err
return ch.Value().(*table.Reader).OffsetOf(key)
}
// Creates an iterator from the given table.
@ -413,15 +416,14 @@ func (t *tOps) newIterator(f *tFile, slice *util.Range, ro *opt.ReadOptions) ite
// Removes table from persistent storage. It waits until
// no one uses the table.
func (t *tOps) remove(f *tFile) {
num := f.file.Num()
t.cache.Delete(0, num, func() {
if err := f.file.Remove(); err != nil {
t.s.logf("table@remove removing @%d %q", num, err)
t.cache.Delete(0, uint64(f.fd.Num), func() {
if err := t.s.stor.Remove(f.fd); err != nil {
t.s.logf("table@remove removing @%d %q", f.fd.Num, err)
} else {
t.s.logf("table@remove removed @%d", num)
t.s.logf("table@remove removed @%d", f.fd.Num)
}
if t.bcache != nil {
t.bcache.EvictNS(num)
t.bcache.EvictNS(uint64(f.fd.Num))
}
})
}
@ -432,7 +434,7 @@ func (t *tOps) close() {
t.bpool.Close()
t.cache.Close()
if t.bcache != nil {
t.bcache.Close()
t.bcache.CloseWeak()
}
}
@ -441,22 +443,27 @@ func newTableOps(s *session) *tOps {
var (
cacher cache.Cacher
bcache *cache.Cache
bpool *util.BufferPool
)
if s.o.GetOpenFilesCacheCapacity() > 0 {
cacher = cache.NewLRU(s.o.GetOpenFilesCacheCapacity())
}
if !s.o.DisableBlockCache {
if !s.o.GetDisableBlockCache() {
var bcacher cache.Cacher
if s.o.GetBlockCacheCapacity() > 0 {
bcacher = cache.NewLRU(s.o.GetBlockCacheCapacity())
}
bcache = cache.NewCache(bcacher)
}
if !s.o.GetDisableBufferPool() {
bpool = util.NewBufferPool(s.o.GetBlockSize() + 5)
}
return &tOps{
s: s,
noSync: s.o.GetNoSync(),
cache: cache.NewCache(cacher),
bcache: bcache,
bpool: util.NewBufferPool(s.o.GetBlockSize() + 5),
bpool: bpool,
}
}
@ -465,9 +472,9 @@ func newTableOps(s *session) *tOps {
type tWriter struct {
t *tOps
file storage.File
w storage.Writer
tw *table.Writer
fd storage.FileDesc
w storage.Writer
tw *table.Writer
first, last []byte
}
@ -501,20 +508,21 @@ func (w *tWriter) finish() (f *tFile, err error) {
if err != nil {
return
}
err = w.w.Sync()
if err != nil {
return
if !w.t.noSync {
err = w.w.Sync()
if err != nil {
return
}
}
f = newTableFile(w.file, uint64(w.tw.BytesLen()), iKey(w.first), iKey(w.last))
f = newTableFile(w.fd, int64(w.tw.BytesLen()), internalKey(w.first), internalKey(w.last))
return
}
// Drops the table.
func (w *tWriter) drop() {
w.close()
w.file.Remove()
w.t.s.reuseFileNum(w.file.Num())
w.file = nil
w.t.s.stor.Remove(w.fd)
w.t.s.reuseFileNum(w.fd.Num)
w.tw = nil
w.first = nil
w.last = nil


@ -14,7 +14,7 @@ import (
"strings"
"sync"
"github.com/syndtr/gosnappy/snappy"
"github.com/golang/snappy"
"github.com/syndtr/goleveldb/leveldb/cache"
"github.com/syndtr/goleveldb/leveldb/comparer"
@ -26,12 +26,15 @@ import (
"github.com/syndtr/goleveldb/leveldb/util"
)
// Reader errors.
var (
ErrNotFound = errors.ErrNotFound
ErrReaderReleased = errors.New("leveldb/table: reader released")
ErrIterReleased = errors.New("leveldb/table: iterator released")
)
// ErrCorrupted describes error due to corruption. This error will be wrapped
// with errors.ErrCorrupted.
type ErrCorrupted struct {
Pos int64
Size int64
@ -61,7 +64,7 @@ type block struct {
func (b *block) seek(cmp comparer.Comparer, rstart, rlimit int, key []byte) (index, offset int, err error) {
index = sort.Search(b.restartsLen-rstart-(b.restartsLen-rlimit), func(i int) bool {
offset := int(binary.LittleEndian.Uint32(b.data[b.restartsOffset+4*(rstart+i):]))
offset += 1 // shared always zero, since this is a restart point
offset++ // shared always zero, since this is a restart point
v1, n1 := binary.Uvarint(b.data[offset:]) // key length
_, n2 := binary.Uvarint(b.data[offset+n1:]) // value length
m := offset + n1 + n2
@ -356,7 +359,7 @@ func (i *blockIter) Prev() bool {
i.value = nil
offset := i.block.restartOffset(ri)
if offset == i.offset {
ri -= 1
ri--
if ri < 0 {
i.dir = dirSOI
return false
@ -507,9 +510,9 @@ func (i *indexIter) Get() iterator.Iterator {
// Reader is a table reader.
type Reader struct {
mu sync.RWMutex
fi *storage.FileInfo
fd storage.FileDesc
reader io.ReaderAt
cache *cache.CacheGetter
cache *cache.NamespaceGetter
err error
bpool *util.BufferPool
// Options
@ -539,7 +542,7 @@ func (r *Reader) blockKind(bh blockHandle) string {
}
func (r *Reader) newErrCorrupted(pos, size int64, kind, reason string) error {
return &errors.ErrCorrupted{File: r.fi, Err: &ErrCorrupted{Pos: pos, Size: size, Kind: kind, Reason: reason}}
return &errors.ErrCorrupted{Fd: r.fd, Err: &ErrCorrupted{Pos: pos, Size: size, Kind: kind, Reason: reason}}
}
func (r *Reader) newErrCorruptedBH(bh blockHandle, reason string) error {
@ -551,7 +554,7 @@ func (r *Reader) fixErrCorruptedBH(bh blockHandle, err error) error {
cerr.Pos = int64(bh.offset)
cerr.Size = int64(bh.length)
cerr.Kind = r.blockKind(bh)
return &errors.ErrCorrupted{File: r.fi, Err: cerr}
return &errors.ErrCorrupted{Fd: r.fd, Err: cerr}
}
return err
}
@ -578,6 +581,7 @@ func (r *Reader) readRawBlock(bh blockHandle, verifyChecksum bool) ([]byte, erro
case blockTypeSnappyCompression:
decLen, err := snappy.DecodedLen(data[:bh.length])
if err != nil {
r.bpool.Put(data)
return nil, r.newErrCorruptedBH(bh, err.Error())
}
decData := r.bpool.Get(decLen)
@ -783,8 +787,8 @@ func (r *Reader) getDataIterErr(dataBH blockHandle, slice *util.Range, verifyChe
// table. And a nil Range.Limit is treated as a key after all keys in
// the table.
//
// The returned iterator is not goroutine-safe and should be released
// when not used.
// The returned iterator is not safe for concurrent use and should be released
// after use.
//
// Also read Iterator documentation of the leveldb/iterator package.
func (r *Reader) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
@ -826,18 +830,21 @@ func (r *Reader) find(key []byte, filtered bool, ro *opt.ReadOptions, noValue bo
index := r.newBlockIter(indexBlock, nil, nil, true)
defer index.Release()
if !index.Seek(key) {
err = index.Error()
if err == nil {
if err = index.Error(); err == nil {
err = ErrNotFound
}
return
}
dataBH, n := decodeBlockHandle(index.Value())
if n == 0 {
r.err = r.newErrCorruptedBH(r.indexBH, "bad data block handle")
return
return nil, nil, r.err
}
// The filter should only be used for exact match.
if filtered && r.filter != nil {
filterBlock, frel, ferr := r.getFilterBlock(true)
if ferr == nil {
@ -847,30 +854,53 @@ func (r *Reader) find(key []byte, filtered bool, ro *opt.ReadOptions, noValue bo
}
frel.Release()
} else if !errors.IsCorrupted(ferr) {
err = ferr
return nil, nil, ferr
}
}
data := r.getDataIter(dataBH, nil, r.verifyChecksum, !ro.GetDontFillCache())
if !data.Seek(key) {
data.Release()
if err = data.Error(); err != nil {
return
}
// The nearest greater-than key is the first key of the next block.
if !index.Next() {
if err = index.Error(); err == nil {
err = ErrNotFound
}
return
}
dataBH, n = decodeBlockHandle(index.Value())
if n == 0 {
r.err = r.newErrCorruptedBH(r.indexBH, "bad data block handle")
return nil, nil, r.err
}
data = r.getDataIter(dataBH, nil, r.verifyChecksum, !ro.GetDontFillCache())
if !data.Next() {
data.Release()
if err = data.Error(); err == nil {
err = ErrNotFound
}
return
}
}
data := r.getDataIter(dataBH, nil, r.verifyChecksum, !ro.GetDontFillCache())
defer data.Release()
if !data.Seek(key) {
err = data.Error()
if err == nil {
err = ErrNotFound
}
return
}
// Don't use block buffer, no need to copy the buffer.
// Key doesn't use block buffer, no need to copy the buffer.
rkey = data.Key()
if !noValue {
if r.bpool == nil {
value = data.Value()
} else {
// Use block buffer, and since the buffer will be recycled, the buffer
// need to be copied.
// Value uses the block buffer, and since the buffer will be
// recycled, it needs to be copied.
value = append([]byte{}, data.Value()...)
}
}
data.Release()
return
}
@ -888,7 +918,7 @@ func (r *Reader) Find(key []byte, filtered bool, ro *opt.ReadOptions) (rkey, val
return r.find(key, filtered, ro, false)
}
// Find finds key that is greater than or equal to the given key.
// FindKey finds key that is greater than or equal to the given key.
// It returns ErrNotFound if the table doesn't contain such key.
// If filtered is true then the nearest 'block' will be checked against
// 'filter data' (if present) and will immediately return ErrNotFound if
@ -987,14 +1017,14 @@ func (r *Reader) Release() {
// NewReader creates a new initialized table reader for the file.
// The fi, cache and bpool is optional and can be nil.
//
// The returned table reader instance is goroutine-safe.
func NewReader(f io.ReaderAt, size int64, fi *storage.FileInfo, cache *cache.CacheGetter, bpool *util.BufferPool, o *opt.Options) (*Reader, error) {
// The returned table reader instance is safe for concurrent use.
func NewReader(f io.ReaderAt, size int64, fd storage.FileDesc, cache *cache.NamespaceGetter, bpool *util.BufferPool, o *opt.Options) (*Reader, error) {
if f == nil {
return nil, errors.New("leveldb/table: nil file")
}
r := &Reader{
fi: fi,
fd: fd,
reader: f,
cache: cache,
bpool: bpool,
@ -1039,9 +1069,8 @@ func NewReader(f io.ReaderAt, size int64, fi *storage.FileInfo, cache *cache.Cac
if errors.IsCorrupted(err) {
r.err = err
return r, nil
} else {
return nil, err
}
return nil, err
}
// Set data end.
@ -1086,9 +1115,8 @@ func NewReader(f io.ReaderAt, size int64, fi *storage.FileInfo, cache *cache.Cac
if errors.IsCorrupted(err) {
r.err = err
return r, nil
} else {
return nil, err
}
return nil, err
}
if r.filter != nil {
r.filterBlock, err = r.readFilterBlock(r.filterBH)


@ -14,6 +14,7 @@ import (
"github.com/syndtr/goleveldb/leveldb/iterator"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
"github.com/syndtr/goleveldb/leveldb/testutil"
"github.com/syndtr/goleveldb/leveldb/util"
)
@ -59,7 +60,7 @@ var _ = testutil.Defer(func() {
It("Should be able to approximate offset of a key correctly", func() {
Expect(err).ShouldNot(HaveOccurred())
tr, err := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), nil, nil, nil, o)
tr, err := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), storage.FileDesc{}, nil, nil, o)
Expect(err).ShouldNot(HaveOccurred())
CheckOffset := func(key string, expect, threshold int) {
offset, err := tr.OffsetOf([]byte(key))
@ -96,7 +97,7 @@ var _ = testutil.Defer(func() {
tw.Close()
// Opening the table.
tr, _ := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), nil, nil, nil, o)
tr, _ := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), storage.FileDesc{}, nil, nil, o)
return tableWrapper{tr}
}
Test := func(kv *testutil.KeyValue, body func(r *Reader)) func() {
@ -110,7 +111,7 @@ var _ = testutil.Defer(func() {
}
testutil.AllKeyValueTesting(nil, Build, nil, nil)
Describe("with one key per block", Test(testutil.KeyValue_Generate(nil, 9, 1, 10, 512, 512), func(r *Reader) {
Describe("with one key per block", Test(testutil.KeyValue_Generate(nil, 9, 1, 1, 10, 512, 512), func(r *Reader) {
It("should have correct blocks number", func() {
indexBlock, err := r.readBlock(r.indexBH, true)
Expect(err).To(BeNil())


@ -12,7 +12,7 @@ import (
"fmt"
"io"
"github.com/syndtr/gosnappy/snappy"
"github.com/golang/snappy"
"github.com/syndtr/goleveldb/leveldb/comparer"
"github.com/syndtr/goleveldb/leveldb/filter"
@ -167,11 +167,7 @@ func (w *Writer) writeBlock(buf *util.Buffer, compression opt.Compression) (bh b
if n := snappy.MaxEncodedLen(buf.Len()) + blockTrailerLen; len(w.compressionScratch) < n {
w.compressionScratch = make([]byte, n)
}
var compressed []byte
compressed, err = snappy.Encode(w.compressionScratch, buf.Bytes())
if err != nil {
return
}
compressed := snappy.Encode(w.compressionScratch, buf.Bytes())
n := len(compressed)
b = compressed[:n+blockTrailerLen]
b[n] = blockTypeSnappyCompression
@ -353,7 +349,7 @@ func (w *Writer) Close() error {
// NewWriter creates a new initialized table writer for the file.
//
// Table writer is not goroutine-safe.
// Table writer is not safe for concurrent use.
func NewWriter(f io.Writer, o *opt.Options) *Writer {
w := &Writer{
writer: f,


@ -61,3 +61,31 @@ func newTestingDB(o *opt.Options, ro *opt.ReadOptions, wo *opt.WriteOptions) *te
stor: stor,
}
}
type testingTransaction struct {
*Transaction
ro *opt.ReadOptions
wo *opt.WriteOptions
}
func (t *testingTransaction) TestPut(key []byte, value []byte) error {
return t.Put(key, value, t.wo)
}
func (t *testingTransaction) TestDelete(key []byte) error {
return t.Delete(key, t.wo)
}
func (t *testingTransaction) TestGet(key []byte) (value []byte, err error) {
return t.Get(key, t.ro)
}
func (t *testingTransaction) TestHas(key []byte) (ret bool, err error) {
return t.Has(key, t.ro)
}
func (t *testingTransaction) TestNewIterator(slice *util.Range) iterator.Iterator {
return t.NewIterator(slice, t.ro)
}
func (t *testingTransaction) TestClose() {}


@ -72,20 +72,27 @@ func maxInt(a, b int) int {
return b
}
type files []storage.File
type fdSorter []storage.FileDesc
func (p files) Len() int {
func (p fdSorter) Len() int {
return len(p)
}
func (p files) Less(i, j int) bool {
return p[i].Num() < p[j].Num()
func (p fdSorter) Less(i, j int) bool {
return p[i].Num < p[j].Num
}
func (p files) Swap(i, j int) {
func (p fdSorter) Swap(i, j int) {
p[i], p[j] = p[j], p[i]
}
func (p files) sort() {
sort.Sort(p)
func sortFds(fds []storage.FileDesc) {
sort.Sort(fdSorter(fds))
}
func ensureBuffer(b []byte, n int) []byte {
if cap(b) < n {
return make([]byte, n)
}
return b[:n]
}
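
The `ensureBuffer` helper added in this hunk reuses the caller's slice when its capacity is large enough and allocates only otherwise. Below is a minimal standalone sketch of that behaviour. The function body is copied from the hunk; the `main` driver and the sizes it uses are illustrative assumptions, not part of the change.

```go
package main

import "fmt"

// ensureBuffer is copied from the hunk: it returns a slice of length n,
// reusing b's backing array when cap(b) >= n and allocating otherwise.
func ensureBuffer(b []byte, n int) []byte {
	if cap(b) < n {
		return make([]byte, n)
	}
	return b[:n]
}

func main() {
	// Hypothetical scratch buffer with room for 8 bytes.
	scratch := make([]byte, 0, 8)

	// Fits in the existing capacity, so the backing array is shared.
	small := ensureBuffer(scratch, 4)
	// Does not fit, so a fresh slice is allocated.
	large := ensureBuffer(scratch, 16)

	// A write through small is visible through scratch's backing array.
	small[0] = 0xff
	shared := scratch[:1][0] == 0xff

	fmt.Println("small:", len(small), cap(small), "shared:", shared)
	fmt.Println("large:", len(large), cap(large))
}
```

Note that a reused buffer is resliced, not cleared, so it may still hold bytes from its previous use; callers are expected to overwrite the first n bytes before reading them.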


@ -201,6 +201,7 @@ func (p *BufferPool) String() string {
func (p *BufferPool) drain() {
ticker := time.NewTicker(2 * time.Second)
defer ticker.Stop()
for {
select {
case <-ticker.C:


@ -7,38 +7,38 @@
package util
import (
"bytes"
"encoding/binary"
)
// Hash returns the hash of the given data.
func Hash(data []byte, seed uint32) uint32 {
// Similar to murmur hash
var m uint32 = 0xc6a4a793
var r uint32 = 24
h := seed ^ (uint32(len(data)) * m)
const (
m = uint32(0xc6a4a793)
r = uint32(24)
)
var (
h = seed ^ (uint32(len(data)) * m)
i int
)
buf := bytes.NewBuffer(data)
for buf.Len() >= 4 {
var w uint32
binary.Read(buf, binary.LittleEndian, &w)
h += w
for n := len(data) - len(data)%4; i < n; i += 4 {
h += binary.LittleEndian.Uint32(data[i:])
h *= m
h ^= (h >> 16)
}
rest := buf.Bytes()
switch len(rest) {
switch len(data) - i {
default:
panic("not reached")
case 3:
h += uint32(rest[2]) << 16
h += uint32(data[i+2]) << 16
fallthrough
case 2:
h += uint32(rest[1]) << 8
h += uint32(data[i+1]) << 8
fallthrough
case 1:
h += uint32(rest[0])
h += uint32(data[i])
h *= m
h ^= (h >> r)
case 0:


@ -0,0 +1,46 @@
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
package util
import (
"testing"
)
var hashTests = []struct {
data []byte
seed uint32
hash uint32
}{
{nil, 0xbc9f1d34, 0xbc9f1d34},
{[]byte{0x62}, 0xbc9f1d34, 0xef1345c4},
{[]byte{0xc3, 0x97}, 0xbc9f1d34, 0x5b663814},
{[]byte{0xe2, 0x99, 0xa5}, 0xbc9f1d34, 0x323c078f},
{[]byte{0xe1, 0x80, 0xb9, 0x32}, 0xbc9f1d34, 0xed21633a},
{[]byte{
0x01, 0xc0, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00,
0x14, 0x00, 0x00, 0x00,
0x00, 0x00, 0x04, 0x00,
0x00, 0x00, 0x00, 0x14,
0x00, 0x00, 0x00, 0x18,
0x28, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00,
0x02, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00,
}, 0x12345678, 0xf333dabb},
}
func TestHash(t *testing.T) {
for i, x := range hashTests {
h := Hash(x.data, x.seed)
if h != x.hash {
t.Fatalf("test-%d: invalid hash, %#x vs %#x", i, h, x.hash)
}
}
}
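
The rewritten `Hash` reads 4-byte little-endian words straight from the slice instead of going through `bytes.Buffer` and `binary.Read`. The sketch below reproduces the new body as a standalone program so it can be checked against the first entries of `hashTests`. The lowercase name `hash` and the `main` driver are illustrative, and the trailing `return h` is an assumption, since the hunk for the hash function is cut off after `case 0:`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// hash mirrors the new Hash body from the diff. The final `return h` is
// assumed: the hunk shown for this function stops at `case 0:`.
func hash(data []byte, seed uint32) uint32 {
	const (
		m = uint32(0xc6a4a793)
		r = uint32(24)
	)
	var (
		h = seed ^ (uint32(len(data)) * m)
		i int
	)

	// Consume all complete 4-byte little-endian words.
	for n := len(data) - len(data)%4; i < n; i += 4 {
		h += binary.LittleEndian.Uint32(data[i:])
		h *= m
		h ^= (h >> 16)
	}

	// Mix in the remaining 0 to 3 trailing bytes.
	switch len(data) - i {
	case 3:
		h += uint32(data[i+2]) << 16
		fallthrough
	case 2:
		h += uint32(data[i+1]) << 8
		fallthrough
	case 1:
		h += uint32(data[i])
		h *= m
		h ^= (h >> r)
	}
	return h
}

func main() {
	// Vectors taken from hashTests in the new test file.
	vectors := []struct {
		data []byte
		want uint32
	}{
		{nil, 0xbc9f1d34},
		{[]byte{0x62}, 0xef1345c4},
		{[]byte{0xc3, 0x97}, 0x5b663814},
		{[]byte{0xe2, 0x99, 0xa5}, 0x323c078f},
		{[]byte{0xe1, 0x80, 0xb9, 0x32}, 0xed21633a},
	}
	for _, v := range vectors {
		got := hash(v.data, 0xbc9f1d34)
		fmt.Printf("len=%d got=%#x want=%#x\n", len(v.data), got, v.want)
	}
}
```

Indexing the slice directly avoids the reflection-based decoding in `binary.Read` and the intermediate `bytes.Buffer`, while the arithmetic on `h` stays the same.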


@ -1,21 +0,0 @@
// Copyright (c) 2014, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// +build go1.3
package util
import (
"sync"
)
type Pool struct {
sync.Pool
}
func NewPool(cap int) *Pool {
return &Pool{}
}


@ -1,33 +0,0 @@
// Copyright (c) 2014, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
// +build !go1.3
package util
type Pool struct {
pool chan interface{}
}
func (p *Pool) Get() interface{} {
select {
case x := <-p.pool:
return x
default:
return nil
}
}
func (p *Pool) Put(x interface{}) {
select {
case p.pool <- x:
default:
}
}
func NewPool(cap int) *Pool {
return &Pool{pool: make(chan interface{}, cap)}
}


@ -19,7 +19,7 @@ var (
// Releaser is the interface that wraps the basic Release method.
type Releaser interface {
// Release releases associated resources. Release should always succeed
// and can be called multipe times without causing error.
// and can be called multiple times without causing error.
Release()
}


@ -7,6 +7,7 @@
package leveldb
import (
"fmt"
"sync/atomic"
"unsafe"
@ -23,7 +24,7 @@ type tSet struct {
type version struct {
s *session
tables []tFiles
levels []tFiles
// Level that should be compacted next and its compaction score.
// Score < 1 means compaction is not strictly needed. These fields
@ -33,43 +34,48 @@ type version struct {
cSeek unsafe.Pointer
ref int
// Succeeding version.
next *version
closing bool
ref int
released bool
}
func newVersion(s *session) *version {
return &version{s: s, tables: make([]tFiles, s.o.GetNumLevel())}
return &version{s: s}
}
func (v *version) incref() {
if v.released {
panic("already released")
}
v.ref++
if v.ref == 1 {
// Incr file ref.
for _, tt := range v.levels {
for _, t := range tt {
v.s.addFileRef(t.fd, 1)
}
}
}
}
func (v *version) releaseNB() {
v.ref--
if v.ref > 0 {
return
}
if v.ref < 0 {
} else if v.ref < 0 {
panic("negative version ref")
}
tables := make(map[uint64]bool)
for _, tt := range v.next.tables {
for _, tt := range v.levels {
for _, t := range tt {
num := t.file.Num()
tables[num] = true
}
}
for _, tt := range v.tables {
for _, t := range tt {
num := t.file.Num()
if _, ok := tables[num]; !ok {
if v.s.addFileRef(t.fd, -1) == 0 {
v.s.tops.remove(t)
}
}
}
v.next.releaseNB()
v.next = nil
v.released = true
}
func (v *version) release() {
@ -78,11 +84,26 @@ func (v *version) release() {
v.s.vmu.Unlock()
}
func (v *version) walkOverlapping(ikey iKey, f func(level int, t *tFile) bool, lf func(level int) bool) {
func (v *version) walkOverlapping(aux tFiles, ikey internalKey, f func(level int, t *tFile) bool, lf func(level int) bool) {
ukey := ikey.ukey()
// Aux level.
if aux != nil {
for _, t := range aux {
if t.overlaps(v.s.icmp, ukey, ukey) {
if !f(-1, t) {
return
}
}
}
if lf != nil && !lf(-1) {
return
}
}
// Walk tables level-by-level.
for level, tables := range v.tables {
for level, tables := range v.levels {
if len(tables) == 0 {
continue
}
@ -114,7 +135,11 @@ func (v *version) walkOverlapping(ikey iKey, f func(level int, t *tFile) bool, l
}
}
func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byte, tcomp bool, err error) {
func (v *version) get(aux tFiles, ikey internalKey, ro *opt.ReadOptions, noValue bool) (value []byte, tcomp bool, err error) {
if v.closing {
return nil, false, ErrClosed
}
ukey := ikey.ukey()
var (
@ -124,16 +149,16 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
// Level-0.
zfound bool
zseq uint64
zkt kType
zkt keyType
zval []byte
)
err = ErrNotFound
// Since entries never hope across level, finding key/value
// Since entries never hop across level, finding key/value
// in smaller level make later levels irrelevant.
v.walkOverlapping(ikey, func(level int, t *tFile) bool {
if !tseek {
v.walkOverlapping(aux, ikey, func(level int, t *tFile) bool {
if level >= 0 && !tseek {
if tset == nil {
tset = &tSet{level, t}
} else {
@ -150,6 +175,7 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
} else {
fikey, fval, ferr = v.s.tops.find(t, ikey, ro)
}
switch ferr {
case nil:
case ErrNotFound:
@ -159,9 +185,10 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
return false
}
if fukey, fseq, fkt, fkerr := parseIkey(fikey); fkerr == nil {
if fukey, fseq, fkt, fkerr := parseInternalKey(fikey); fkerr == nil {
if v.s.icmp.uCompare(ukey, fukey) == 0 {
if level == 0 {
// Tables in level <= 0 may overlap each other.
if level <= 0 {
if fseq >= zseq {
zfound = true
zseq = fseq
@ -170,12 +197,12 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
}
} else {
switch fkt {
case ktVal:
case keyTypeVal:
value = fval
err = nil
case ktDel:
case keyTypeDel:
default:
panic("leveldb: invalid iKey type")
panic("leveldb: invalid internalKey type")
}
return false
}
@ -189,12 +216,12 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
}, func(level int) bool {
if zfound {
switch zkt {
case ktVal:
case keyTypeVal:
value = zval
err = nil
case ktDel:
case keyTypeDel:
default:
panic("leveldb: invalid iKey type")
panic("leveldb: invalid internalKey type")
}
return false
}
@ -209,46 +236,40 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
return
}
func (v *version) sampleSeek(ikey iKey) (tcomp bool) {
func (v *version) sampleSeek(ikey internalKey) (tcomp bool) {
var tset *tSet
v.walkOverlapping(ikey, func(level int, t *tFile) bool {
v.walkOverlapping(nil, ikey, func(level int, t *tFile) bool {
if tset == nil {
tset = &tSet{level, t}
return true
} else {
if tset.table.consumeSeek() <= 0 {
tcomp = atomic.CompareAndSwapPointer(&v.cSeek, nil, unsafe.Pointer(tset))
}
return false
}
if tset.table.consumeSeek() <= 0 {
tcomp = atomic.CompareAndSwapPointer(&v.cSeek, nil, unsafe.Pointer(tset))
}
return false
}, nil)
return
}
func (v *version) getIterators(slice *util.Range, ro *opt.ReadOptions) (its []iterator.Iterator) {
// Merge all level zero files together since they may overlap
for _, t := range v.tables[0] {
it := v.s.tops.newIterator(t, slice, ro)
its = append(its, it)
}
strict := opt.GetStrict(v.s.o.Options, ro, opt.StrictReader)
for _, tables := range v.tables[1:] {
if len(tables) == 0 {
continue
for level, tables := range v.levels {
if level == 0 {
// Merge all level zero files together since they may overlap.
for _, t := range tables {
its = append(its, v.s.tops.newIterator(t, slice, ro))
}
} else if len(tables) != 0 {
its = append(its, iterator.NewIndexedIterator(tables.newIndexIterator(v.s.tops, v.s.icmp, slice, ro), strict))
}
it := iterator.NewIndexedIterator(tables.newIndexIterator(v.s.tops, v.s.icmp, slice, ro), strict)
its = append(its, it)
}
return
}
func (v *version) newStaging() *versionStaging {
return &versionStaging{base: v, tables: make([]tablesScratch, v.s.o.GetNumLevel())}
return &versionStaging{base: v}
}
// Spawn a new version based on this version.
@ -259,19 +280,22 @@ func (v *version) spawn(r *sessionRecord) *version {
}
func (v *version) fillRecord(r *sessionRecord) {
for level, ts := range v.tables {
for _, t := range ts {
for level, tables := range v.levels {
for _, t := range tables {
r.addTableFile(level, t)
}
}
}
func (v *version) tLen(level int) int {
return len(v.tables[level])
if level < len(v.levels) {
return len(v.levels[level])
}
return 0
}
func (v *version) offsetOf(ikey iKey) (n uint64, err error) {
for level, tables := range v.tables {
func (v *version) offsetOf(ikey internalKey) (n int64, err error) {
for level, tables := range v.levels {
for _, t := range tables {
if v.s.icmp.Compare(t.imax, ikey) <= 0 {
// Entire file is before "ikey", so just add the file size
@ -287,12 +311,11 @@ func (v *version) offsetOf(ikey iKey) (n uint64, err error) {
} else {
// "ikey" falls in the range for this table. Add the
// approximate offset of "ikey" within the table.
var nn uint64
nn, err = v.s.tops.offsetOf(t, ikey)
if err != nil {
if m, err := v.s.tops.offsetOf(t, ikey); err == nil {
n += m
} else {
return 0, err
}
n += nn
}
}
}
@ -300,37 +323,50 @@ func (v *version) offsetOf(ikey iKey) (n uint64, err error) {
return
}
func (v *version) pickLevel(umin, umax []byte) (level int) {
if !v.tables[0].overlaps(v.s.icmp, umin, umax, true) {
var overlaps tFiles
maxLevel := v.s.o.GetMaxMemCompationLevel()
for ; level < maxLevel; level++ {
if v.tables[level+1].overlaps(v.s.icmp, umin, umax, false) {
break
}
overlaps = v.tables[level+2].getOverlaps(overlaps, v.s.icmp, umin, umax, false)
if overlaps.size() > uint64(v.s.o.GetCompactionGPOverlaps(level)) {
break
func (v *version) pickMemdbLevel(umin, umax []byte, maxLevel int) (level int) {
if maxLevel > 0 {
if len(v.levels) == 0 {
return maxLevel
}
if !v.levels[0].overlaps(v.s.icmp, umin, umax, true) {
var overlaps tFiles
for ; level < maxLevel; level++ {
if pLevel := level + 1; pLevel >= len(v.levels) {
return maxLevel
} else if v.levels[pLevel].overlaps(v.s.icmp, umin, umax, false) {
break
}
if gpLevel := level + 2; gpLevel < len(v.levels) {
overlaps = v.levels[gpLevel].getOverlaps(overlaps, v.s.icmp, umin, umax, false)
if overlaps.size() > int64(v.s.o.GetCompactionGPOverlaps(level)) {
break
}
}
}
}
}
return
}
func (v *version) computeCompaction() {
// Precomputed best level for next compaction
var bestLevel int = -1
var bestScore float64 = -1
bestLevel := int(-1)
bestScore := float64(-1)
for level, tables := range v.tables {
statFiles := make([]int, len(v.levels))
statSizes := make([]string, len(v.levels))
statScore := make([]string, len(v.levels))
statTotSize := int64(0)
for level, tables := range v.levels {
var score float64
size := tables.size()
if level == 0 {
// We treat level-0 specially by bounding the number of files
// instead of number of bytes for two reasons:
//
// (1) With larger write-buffer sizes, it is nice not to do too
// many level-0 compactions.
// many level-0 compaction.
//
// (2) The files in level-0 are merged on every read and
// therefore we wish to avoid too many files when the individual
@ -339,17 +375,24 @@ func (v *version) computeCompaction() {
// overwrites/deletions).
score = float64(len(tables)) / float64(v.s.o.GetCompactionL0Trigger())
} else {
score = float64(tables.size()) / float64(v.s.o.GetCompactionTotalSize(level))
score = float64(size) / float64(v.s.o.GetCompactionTotalSize(level))
}
if score > bestScore {
bestLevel = level
bestScore = score
}
statFiles[level] = len(tables)
statSizes[level] = shortenb(int(size))
statScore[level] = fmt.Sprintf("%.2f", score)
statTotSize += size
}
v.cLevel = bestLevel
v.cScore = bestScore
v.s.logf("version@stat F·%v S·%s%v Sc·%v", statFiles, shortenb(int(statTotSize)), statSizes, statScore)
}
func (v *version) needCompaction() bool {
@ -357,43 +400,48 @@ func (v *version) needCompaction() bool {
}
type tablesScratch struct {
added map[uint64]atRecord
deleted map[uint64]struct{}
added map[int64]atRecord
deleted map[int64]struct{}
}
type versionStaging struct {
base *version
tables []tablesScratch
levels []tablesScratch
}
func (p *versionStaging) getScratch(level int) *tablesScratch {
if level >= len(p.levels) {
newLevels := make([]tablesScratch, level+1)
copy(newLevels, p.levels)
p.levels = newLevels
}
return &(p.levels[level])
}
func (p *versionStaging) commit(r *sessionRecord) {
// Deleted tables.
for _, r := range r.deletedTables {
tm := &(p.tables[r.level])
if len(p.base.tables[r.level]) > 0 {
if tm.deleted == nil {
tm.deleted = make(map[uint64]struct{})
scratch := p.getScratch(r.level)
if r.level < len(p.base.levels) && len(p.base.levels[r.level]) > 0 {
if scratch.deleted == nil {
scratch.deleted = make(map[int64]struct{})
}
tm.deleted[r.num] = struct{}{}
scratch.deleted[r.num] = struct{}{}
}
if tm.added != nil {
delete(tm.added, r.num)
if scratch.added != nil {
delete(scratch.added, r.num)
}
}
// New tables.
for _, r := range r.addedTables {
tm := &(p.tables[r.level])
if tm.added == nil {
tm.added = make(map[uint64]atRecord)
scratch := p.getScratch(r.level)
if scratch.added == nil {
scratch.added = make(map[int64]atRecord)
}
tm.added[r.num] = r
if tm.deleted != nil {
delete(tm.deleted, r.num)
scratch.added[r.num] = r
if scratch.deleted != nil {
delete(scratch.deleted, r.num)
}
}
}
@ -401,39 +449,62 @@ func (p *versionStaging) commit(r *sessionRecord) {
func (p *versionStaging) finish() *version {
// Build new version.
nv := newVersion(p.base.s)
for level, tm := range p.tables {
btables := p.base.tables[level]
n := len(btables) + len(tm.added) - len(tm.deleted)
if n < 0 {
n = 0
}
nt := make(tFiles, 0, n)
// Base tables.
for _, t := range btables {
if _, ok := tm.deleted[t.file.Num()]; ok {
continue
}
if _, ok := tm.added[t.file.Num()]; ok {
continue
}
nt = append(nt, t)
}
// New tables.
for _, r := range tm.added {
nt = append(nt, p.base.s.tableFileFromRecord(r))
}
// Sort tables.
if level == 0 {
nt.sortByNum()
} else {
nt.sortByKey(p.base.s.icmp)
}
nv.tables[level] = nt
numLevel := len(p.levels)
if len(p.base.levels) > numLevel {
numLevel = len(p.base.levels)
}
nv.levels = make([]tFiles, numLevel)
for level := 0; level < numLevel; level++ {
var baseTabels tFiles
if level < len(p.base.levels) {
baseTabels = p.base.levels[level]
}
if level < len(p.levels) {
scratch := p.levels[level]
var nt tFiles
// Prealloc list if possible.
if n := len(baseTabels) + len(scratch.added) - len(scratch.deleted); n > 0 {
nt = make(tFiles, 0, n)
}
// Base tables.
for _, t := range baseTabels {
if _, ok := scratch.deleted[t.fd.Num]; ok {
continue
}
if _, ok := scratch.added[t.fd.Num]; ok {
continue
}
nt = append(nt, t)
}
// New tables.
for _, r := range scratch.added {
nt = append(nt, tableFileFromRecord(r))
}
if len(nt) != 0 {
// Sort tables.
if level == 0 {
nt.sortByNum()
} else {
nt.sortByKey(p.base.s.icmp)
}
nv.levels[level] = nt
}
} else {
nv.levels[level] = baseTabels
}
}
// Trim levels.
n := len(nv.levels)
for ; n > 0 && nv.levels[n-1] == nil; n-- {
}
nv.levels = nv.levels[:n]
// Compute compaction score for new version.
nv.computeCompaction()
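
Note on the hunk above: the new finish() builds each level from the base tables minus the staged deletions plus the staged additions, then trims trailing empty levels. The following is only a minimal sketch of that idea over plain file numbers, not the goleveldb code. The names mergeLevel and trimLevels are hypothetical, and the descending sort is an assumption taken from the level-0 expectations such as {3, 2, 1} in TestVersionStaging (levels above 0 are sorted by key in the real code).

```go
package main

import (
	"fmt"
	"sort"
)

// mergeLevel applies one level's staged changes to its base file numbers.
// Base entries that were deleted or re-added are skipped, the added entries
// are appended, and the result is sorted by file number, newest first.
func mergeLevel(base []int64, added, deleted map[int64]struct{}) []int64 {
	var out []int64
	for _, num := range base {
		if _, ok := deleted[num]; ok {
			continue
		}
		if _, ok := added[num]; ok {
			continue
		}
		out = append(out, num)
	}
	for num := range added {
		out = append(out, num)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] > out[j] })
	return out
}

// trimLevels drops trailing empty levels, in the spirit of the
// "Trim levels" step (the real code checks for nil rather than length).
func trimLevels(levels [][]int64) [][]int64 {
	n := len(levels)
	for ; n > 0 && len(levels[n-1]) == 0; n-- {
	}
	return levels[:n]
}

func main() {
	base := []int64{3, 2, 1}
	added := map[int64]struct{}{4: {}}
	deleted := map[int64]struct{}{1: {}}
	fmt.Println(mergeLevel(base, added, deleted)) // prints [4 3 2]

	levels := [][]int64{{3, 1}, {2}, nil, nil}
	fmt.Println(len(trimLevels(levels))) // prints 2
}
```

Keeping the per-level slice nil when nothing survives is what lets the trim step shrink the version back down after a high level is emptied, as in the {6, 9} add-then-delete cases of the test.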

@ -0,0 +1,181 @@
package leveldb
import (
"encoding/binary"
"reflect"
"testing"
"github.com/onsi/gomega"
"github.com/syndtr/goleveldb/leveldb/testutil"
)
type testFileRec struct {
level int
num int64
}
func TestVersionStaging(t *testing.T) {
gomega.RegisterTestingT(t)
stor := testutil.NewStorage()
defer stor.Close()
s, err := newSession(stor, nil)
if err != nil {
t.Fatal(err)
}
v := newVersion(s)
v.newStaging()
tmp := make([]byte, 4)
mik := func(i uint64) []byte {
binary.BigEndian.PutUint32(tmp, uint32(i))
return []byte(makeInternalKey(nil, tmp, 0, keyTypeVal))
}
for i, x := range []struct {
add, del []testFileRec
levels [][]int64
}{
{
add: []testFileRec{
{1, 1},
},
levels: [][]int64{
{},
{1},
},
},
{
add: []testFileRec{
{1, 1},
},
levels: [][]int64{
{},
{1},
},
},
{
del: []testFileRec{
{1, 1},
},
levels: [][]int64{},
},
{
add: []testFileRec{
{0, 1},
{0, 3},
{0, 2},
{2, 5},
{1, 4},
},
levels: [][]int64{
{3, 2, 1},
{4},
{5},
},
},
{
add: []testFileRec{
{1, 6},
{2, 5},
},
del: []testFileRec{
{0, 1},
{0, 4},
},
levels: [][]int64{
{3, 2},
{4, 6},
{5},
},
},
{
del: []testFileRec{
{0, 3},
{0, 2},
{1, 4},
{1, 6},
{2, 5},
},
levels: [][]int64{},
},
{
add: []testFileRec{
{0, 1},
},
levels: [][]int64{
{1},
},
},
{
add: []testFileRec{
{1, 2},
},
levels: [][]int64{
{1},
{2},
},
},
{
add: []testFileRec{
{0, 3},
},
levels: [][]int64{
{3, 1},
{2},
},
},
{
add: []testFileRec{
{6, 9},
},
levels: [][]int64{
{3, 1},
{2},
{},
{},
{},
{},
{9},
},
},
{
del: []testFileRec{
{6, 9},
},
levels: [][]int64{
{3, 1},
{2},
},
},
} {
rec := &sessionRecord{}
for _, f := range x.add {
ik := mik(uint64(f.num))
rec.addTable(f.level, f.num, 1, ik, ik)
}
for _, f := range x.del {
rec.delTable(f.level, f.num)
}
vs := v.newStaging()
vs.commit(rec)
v = vs.finish()
if len(v.levels) != len(x.levels) {
t.Fatalf("#%d: invalid level count: want=%d got=%d", i, len(x.levels), len(v.levels))
}
for j, want := range x.levels {
tables := v.levels[j]
if len(want) != len(tables) {
t.Fatalf("#%d.%d: invalid tables count: want=%d got=%d", i, j, len(want), len(tables))
}
got := make([]int64, len(tables))
for k, t := range tables {
got[k] = t.fd.Num
}
if !reflect.DeepEqual(want, got) {
t.Fatalf("#%d.%d: invalid tables: want=%v got=%v", i, j, want, got)
}
}
}
}

@ -1,30 +0,0 @@
syntax:glob
.DS_Store
.git
.gitignore
*.[568ao]
*.ao
*.so
*.pyc
._*
.nfs.*
[568a].out
*~
*.orig
*.rej
*.exe
.*.swp
core
*.cgo*.go
*.cgo*.c
_cgo_*
_obj
_test
_testmain.go
build.out
snappy/testdata
test.out
y.tab.[ch]
syntax:regexp
^.*/core.[0-9]*$

@ -1,12 +0,0 @@
# This is the official list of Snappy-Go authors for copyright purposes.
# This file is distinct from the CONTRIBUTORS files.
# See the latter for an explanation.
# Names should be added to this file as
# Name or Organization <email address>
# The email address is not required for organizations.
# Please keep the list sorted.
Google Inc.
Jan Mercl <0xjnml@gmail.com>

@ -1,34 +0,0 @@
# This is the official list of people who can contribute
# (and typically have contributed) code to the Snappy-Go repository.
# The AUTHORS file lists the copyright holders; this file
# lists people. For example, Google employees are listed here
# but not in AUTHORS, because Google holds the copyright.
#
# The submission process automatically checks to make sure
# that people submitting code are listed in this file (by email address).
#
# Names should be added to this file only after verifying that
# the individual or the individual's organization has agreed to
# the appropriate Contributor License Agreement, found here:
#
# http://code.google.com/legal/individual-cla-v1.0.html
# http://code.google.com/legal/corporate-cla-v1.0.html
#
# The agreement for individuals can be filled out on the web.
#
# When adding J Random Contributor's name to this file,
# either J's name or J's organization's name should be
# added to the AUTHORS file, depending on whether the
# individual or corporate CLA was used.
# Names should be added to this file like so:
# Name <email address>
# Please keep the list sorted.
Jan Mercl <0xjnml@gmail.com>
Kai Backman <kaib@golang.org>
Marc-Antoine Ruel <maruel@chromium.org>
Nigel Tao <nigeltao@golang.org>
Rob Pike <r@golang.org>
Russ Cox <rsc@golang.org>

@ -1,27 +0,0 @@
Copyright (c) 2011 The Snappy-Go Authors. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Google Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

@ -1,11 +0,0 @@
This is a Snappy library for the Go programming language.
To download and install from source:
$ go get code.google.com/p/snappy-go/snappy
Unless otherwise noted, the Snappy-Go source files are distributed
under the BSD-style license found in the LICENSE file.
Contributions should follow the same procedure as for the Go project:
http://golang.org/doc/contribute.html

@ -1,292 +0,0 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package snappy
import (
"encoding/binary"
"errors"
"io"
)
var (
// ErrCorrupt reports that the input is invalid.
ErrCorrupt = errors.New("snappy: corrupt input")
// ErrUnsupported reports that the input isn't supported.
ErrUnsupported = errors.New("snappy: unsupported input")
)
// DecodedLen returns the length of the decoded block.
func DecodedLen(src []byte) (int, error) {
v, _, err := decodedLen(src)
return v, err
}
// decodedLen returns the length of the decoded block and the number of bytes
// that the length header occupied.
func decodedLen(src []byte) (blockLen, headerLen int, err error) {
v, n := binary.Uvarint(src)
if n == 0 {
return 0, 0, ErrCorrupt
}
if uint64(int(v)) != v {
return 0, 0, errors.New("snappy: decoded block is too large")
}
return int(v), n, nil
}
// Decode returns the decoded form of src. The returned slice may be a sub-
// slice of dst if dst was large enough to hold the entire decoded block.
// Otherwise, a newly allocated slice will be returned.
// It is valid to pass a nil dst.
func Decode(dst, src []byte) ([]byte, error) {
dLen, s, err := decodedLen(src)
if err != nil {
return nil, err
}
if len(dst) < dLen {
dst = make([]byte, dLen)
}
var d, offset, length int
for s < len(src) {
switch src[s] & 0x03 {
case tagLiteral:
x := uint(src[s] >> 2)
switch {
case x < 60:
s += 1
case x == 60:
s += 2
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-1])
case x == 61:
s += 3
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-2]) | uint(src[s-1])<<8
case x == 62:
s += 4
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-3]) | uint(src[s-2])<<8 | uint(src[s-1])<<16
case x == 63:
s += 5
if s > len(src) {
return nil, ErrCorrupt
}
x = uint(src[s-4]) | uint(src[s-3])<<8 | uint(src[s-2])<<16 | uint(src[s-1])<<24
}
length = int(x + 1)
if length <= 0 {
return nil, errors.New("snappy: unsupported literal length")
}
if length > len(dst)-d || length > len(src)-s {
return nil, ErrCorrupt
}
copy(dst[d:], src[s:s+length])
d += length
s += length
continue
case tagCopy1:
s += 2
if s > len(src) {
return nil, ErrCorrupt
}
length = 4 + int(src[s-2])>>2&0x7
offset = int(src[s-2])&0xe0<<3 | int(src[s-1])
case tagCopy2:
s += 3
if s > len(src) {
return nil, ErrCorrupt
}
length = 1 + int(src[s-3])>>2
offset = int(src[s-2]) | int(src[s-1])<<8
case tagCopy4:
return nil, errors.New("snappy: unsupported COPY_4 tag")
}
end := d + length
if offset > d || end > len(dst) {
return nil, ErrCorrupt
}
for ; d < end; d++ {
dst[d] = dst[d-offset]
}
}
if d != dLen {
return nil, ErrCorrupt
}
return dst[:d], nil
}
// NewReader returns a new Reader that decompresses from r, using the framing
// format described at
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
func NewReader(r io.Reader) *Reader {
return &Reader{
r: r,
decoded: make([]byte, maxUncompressedChunkLen),
buf: make([]byte, MaxEncodedLen(maxUncompressedChunkLen)+checksumSize),
}
}
// Reader is an io.Reader that can read Snappy-compressed bytes.
type Reader struct {
r io.Reader
err error
decoded []byte
buf []byte
// decoded[i:j] contains decoded bytes that have not yet been passed on.
i, j int
readHeader bool
}
// Reset discards any buffered data, resets all state, and switches the Snappy
// reader to read from r. This permits reusing a Reader rather than allocating
// a new one.
func (r *Reader) Reset(reader io.Reader) {
r.r = reader
r.err = nil
r.i = 0
r.j = 0
r.readHeader = false
}
func (r *Reader) readFull(p []byte) (ok bool) {
if _, r.err = io.ReadFull(r.r, p); r.err != nil {
if r.err == io.ErrUnexpectedEOF {
r.err = ErrCorrupt
}
return false
}
return true
}
// Read satisfies the io.Reader interface.
func (r *Reader) Read(p []byte) (int, error) {
if r.err != nil {
return 0, r.err
}
for {
if r.i < r.j {
n := copy(p, r.decoded[r.i:r.j])
r.i += n
return n, nil
}
if !r.readFull(r.buf[:4]) {
return 0, r.err
}
chunkType := r.buf[0]
if !r.readHeader {
if chunkType != chunkTypeStreamIdentifier {
r.err = ErrCorrupt
return 0, r.err
}
r.readHeader = true
}
chunkLen := int(r.buf[1]) | int(r.buf[2])<<8 | int(r.buf[3])<<16
if chunkLen > len(r.buf) {
r.err = ErrUnsupported
return 0, r.err
}
// The chunk types are specified at
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
switch chunkType {
case chunkTypeCompressedData:
// Section 4.2. Compressed data (chunk type 0x00).
if chunkLen < checksumSize {
r.err = ErrCorrupt
return 0, r.err
}
buf := r.buf[:chunkLen]
if !r.readFull(buf) {
return 0, r.err
}
checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
buf = buf[checksumSize:]
n, err := DecodedLen(buf)
if err != nil {
r.err = err
return 0, r.err
}
if n > len(r.decoded) {
r.err = ErrCorrupt
return 0, r.err
}
if _, err := Decode(r.decoded, buf); err != nil {
r.err = err
return 0, r.err
}
if crc(r.decoded[:n]) != checksum {
r.err = ErrCorrupt
return 0, r.err
}
r.i, r.j = 0, n
continue
case chunkTypeUncompressedData:
// Section 4.3. Uncompressed data (chunk type 0x01).
if chunkLen < checksumSize {
r.err = ErrCorrupt
return 0, r.err
}
buf := r.buf[:checksumSize]
if !r.readFull(buf) {
return 0, r.err
}
checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
// Read directly into r.decoded instead of via r.buf.
n := chunkLen - checksumSize
if !r.readFull(r.decoded[:n]) {
return 0, r.err
}
if crc(r.decoded[:n]) != checksum {
r.err = ErrCorrupt
return 0, r.err
}
r.i, r.j = 0, n
continue
case chunkTypeStreamIdentifier:
// Section 4.1. Stream identifier (chunk type 0xff).
if chunkLen != len(magicBody) {
r.err = ErrCorrupt
return 0, r.err
}
if !r.readFull(r.buf[:len(magicBody)]) {
return 0, r.err
}
for i := 0; i < len(magicBody); i++ {
if r.buf[i] != magicBody[i] {
r.err = ErrCorrupt
return 0, r.err
}
}
continue
}
if chunkType <= 0x7f {
// Section 4.5. Reserved unskippable chunks (chunk types 0x02-0x7f).
r.err = ErrUnsupported
return 0, r.err
} else {
// Section 4.4. Padding (chunk type 0xfe).
// Section 4.6. Reserved skippable chunks (chunk types 0x80-0xfd).
if !r.readFull(r.buf[:chunkLen]) {
return 0, r.err
}
}
}
}
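
Note on Reader.Read above: every framed chunk starts with a 4-byte header, one type byte followed by a 24-bit little-endian body length. The sketch below isolates just that parsing step. parseChunkHeader is a hypothetical helper, not part of the removed package, and the sample bytes are the header portion of magicChunk from snappy.go.

```go
package main

import (
	"errors"
	"fmt"
)

// parseChunkHeader splits a 4-byte framing header into the chunk type and
// the 24-bit little-endian body length, as Reader.Read does with r.buf[:4].
func parseChunkHeader(h []byte) (chunkType byte, chunkLen int, err error) {
	if len(h) < 4 {
		return 0, 0, errors.New("short chunk header")
	}
	return h[0], int(h[1]) | int(h[2])<<8 | int(h[3])<<16, nil
}

func main() {
	// Stream identifier header: type 0xff, body length 6 ("sNaPpY").
	typ, n, err := parseChunkHeader([]byte("\xff\x06\x00\x00"))
	fmt.Println(typ, n, err) // prints 255 6 <nil>
}
```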

@ -1,258 +0,0 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package snappy
import (
"encoding/binary"
"io"
)
// We limit how far copy back-references can go, the same as the C++ code.
const maxOffset = 1 << 15
// emitLiteral writes a literal chunk and returns the number of bytes written.
func emitLiteral(dst, lit []byte) int {
i, n := 0, uint(len(lit)-1)
switch {
case n < 60:
dst[0] = uint8(n)<<2 | tagLiteral
i = 1
case n < 1<<8:
dst[0] = 60<<2 | tagLiteral
dst[1] = uint8(n)
i = 2
case n < 1<<16:
dst[0] = 61<<2 | tagLiteral
dst[1] = uint8(n)
dst[2] = uint8(n >> 8)
i = 3
case n < 1<<24:
dst[0] = 62<<2 | tagLiteral
dst[1] = uint8(n)
dst[2] = uint8(n >> 8)
dst[3] = uint8(n >> 16)
i = 4
case int64(n) < 1<<32:
dst[0] = 63<<2 | tagLiteral
dst[1] = uint8(n)
dst[2] = uint8(n >> 8)
dst[3] = uint8(n >> 16)
dst[4] = uint8(n >> 24)
i = 5
default:
panic("snappy: source buffer is too long")
}
if copy(dst[i:], lit) != len(lit) {
panic("snappy: destination buffer is too short")
}
return i + len(lit)
}
// emitCopy writes a copy chunk and returns the number of bytes written.
func emitCopy(dst []byte, offset, length int) int {
i := 0
for length > 0 {
x := length - 4
if 0 <= x && x < 1<<3 && offset < 1<<11 {
dst[i+0] = uint8(offset>>8)&0x07<<5 | uint8(x)<<2 | tagCopy1
dst[i+1] = uint8(offset)
i += 2
break
}
x = length
if x > 1<<6 {
x = 1 << 6
}
dst[i+0] = uint8(x-1)<<2 | tagCopy2
dst[i+1] = uint8(offset)
dst[i+2] = uint8(offset >> 8)
i += 3
length -= x
}
return i
}
// Encode returns the encoded form of src. The returned slice may be a sub-
// slice of dst if dst was large enough to hold the entire encoded block.
// Otherwise, a newly allocated slice will be returned.
// It is valid to pass a nil dst.
func Encode(dst, src []byte) ([]byte, error) {
if n := MaxEncodedLen(len(src)); len(dst) < n {
dst = make([]byte, n)
}
// The block starts with the varint-encoded length of the decompressed bytes.
d := binary.PutUvarint(dst, uint64(len(src)))
// Return early if src is short.
if len(src) <= 4 {
if len(src) != 0 {
d += emitLiteral(dst[d:], src)
}
return dst[:d], nil
}
// Initialize the hash table. Its size ranges from 1<<8 to 1<<14 inclusive.
const maxTableSize = 1 << 14
shift, tableSize := uint(32-8), 1<<8
for tableSize < maxTableSize && tableSize < len(src) {
shift--
tableSize *= 2
}
var table [maxTableSize]int
// Iterate over the source bytes.
var (
s int // The iterator position.
t int // The last position with the same hash as s.
lit int // The start position of any pending literal bytes.
)
for s+3 < len(src) {
// Update the hash table.
b0, b1, b2, b3 := src[s], src[s+1], src[s+2], src[s+3]
h := uint32(b0) | uint32(b1)<<8 | uint32(b2)<<16 | uint32(b3)<<24
p := &table[(h*0x1e35a7bd)>>shift]
// We need to store values in [-1, inf) in table. To save
// some initialization time, (re)use the table's zero value
// and shift the values against this zero: add 1 on writes,
// subtract 1 on reads.
t, *p = *p-1, s+1
// If t is invalid or src[s:s+4] differs from src[t:t+4], accumulate a literal byte.
if t < 0 || s-t >= maxOffset || b0 != src[t] || b1 != src[t+1] || b2 != src[t+2] || b3 != src[t+3] {
s++
continue
}
// Otherwise, we have a match. First, emit any pending literal bytes.
if lit != s {
d += emitLiteral(dst[d:], src[lit:s])
}
// Extend the match to be as long as possible.
s0 := s
s, t = s+4, t+4
for s < len(src) && src[s] == src[t] {
s++
t++
}
// Emit the copied bytes.
d += emitCopy(dst[d:], s-t, s-s0)
lit = s
}
// Emit any final pending literal bytes and return.
if lit != len(src) {
d += emitLiteral(dst[d:], src[lit:])
}
return dst[:d], nil
}
// MaxEncodedLen returns the maximum length of a snappy block, given its
// uncompressed length.
func MaxEncodedLen(srcLen int) int {
// Compressed data can be defined as:
// compressed := item* literal*
// item := literal* copy
//
// The trailing literal sequence has a space blowup of at most 62/60
// since a literal of length 60 needs one tag byte + one extra byte
// for length information.
//
// Item blowup is trickier to measure. Suppose the "copy" op copies
// 4 bytes of data. Because of a special check in the encoding code,
// we produce a 4-byte copy only if the offset is < 65536. Therefore
// the copy op takes 3 bytes to encode, and this type of item leads
// to at most the 62/60 blowup for representing literals.
//
// Suppose the "copy" op copies 5 bytes of data. If the offset is big
// enough, it will take 5 bytes to encode the copy op. Therefore the
// worst case here is a one-byte literal followed by a five-byte copy.
// That is, 6 bytes of input turn into 7 bytes of "compressed" data.
//
// This last factor dominates the blowup, so the final estimate is:
return 32 + srcLen + srcLen/6
}
// NewWriter returns a new Writer that compresses to w, using the framing
// format described at
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
func NewWriter(w io.Writer) *Writer {
return &Writer{
w: w,
enc: make([]byte, MaxEncodedLen(maxUncompressedChunkLen)),
}
}
// Writer is an io.Writer that can write Snappy-compressed bytes.
type Writer struct {
w io.Writer
err error
enc []byte
buf [checksumSize + chunkHeaderSize]byte
wroteHeader bool
}
// Reset discards the writer's state and switches the Snappy writer to write to
// w. This permits reusing a Writer rather than allocating a new one.
func (w *Writer) Reset(writer io.Writer) {
w.w = writer
w.err = nil
w.wroteHeader = false
}
// Write satisfies the io.Writer interface.
func (w *Writer) Write(p []byte) (n int, errRet error) {
if w.err != nil {
return 0, w.err
}
if !w.wroteHeader {
copy(w.enc, magicChunk)
if _, err := w.w.Write(w.enc[:len(magicChunk)]); err != nil {
w.err = err
return n, err
}
w.wroteHeader = true
}
for len(p) > 0 {
var uncompressed []byte
if len(p) > maxUncompressedChunkLen {
uncompressed, p = p[:maxUncompressedChunkLen], p[maxUncompressedChunkLen:]
} else {
uncompressed, p = p, nil
}
checksum := crc(uncompressed)
// Compress the buffer, discarding the result if the improvement
// isn't at least 12.5%.
chunkType := uint8(chunkTypeCompressedData)
chunkBody, err := Encode(w.enc, uncompressed)
if err != nil {
w.err = err
return n, err
}
if len(chunkBody) >= len(uncompressed)-len(uncompressed)/8 {
chunkType, chunkBody = chunkTypeUncompressedData, uncompressed
}
chunkLen := 4 + len(chunkBody)
w.buf[0] = chunkType
w.buf[1] = uint8(chunkLen >> 0)
w.buf[2] = uint8(chunkLen >> 8)
w.buf[3] = uint8(chunkLen >> 16)
w.buf[4] = uint8(checksum >> 0)
w.buf[5] = uint8(checksum >> 8)
w.buf[6] = uint8(checksum >> 16)
w.buf[7] = uint8(checksum >> 24)
if _, err = w.w.Write(w.buf[:]); err != nil {
w.err = err
return n, err
}
if _, err = w.w.Write(chunkBody); err != nil {
w.err = err
return n, err
}
n += len(uncompressed)
}
return n, nil
}
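
Note on MaxEncodedLen above: as a worked instance of the formula, the sketch below evaluates it for the 65536-byte chunk limit that sizes the Writer's enc buffer and, with checksumSize added, the Reader's buf. The lower-case maxEncodedLen is a local copy made for illustration only.

```go
package main

import "fmt"

// maxEncodedLen repeats the formula from MaxEncodedLen: a fixed 32-byte
// slack plus the source length plus one extra byte per six input bytes.
func maxEncodedLen(srcLen int) int {
	return 32 + srcLen + srcLen/6
}

func main() {
	const chunk = 65536 // maxUncompressedChunkLen in snappy.go
	const checksum = 4  // checksumSize in snappy.go

	fmt.Println(maxEncodedLen(chunk))            // 32 + 65536 + 10922 = 76490
	fmt.Println(maxEncodedLen(chunk) + checksum) // 76494
}
```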

@ -1,68 +0,0 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Package snappy implements the snappy block-based compression format.
// It aims for very high speeds and reasonable compression.
//
// The C++ snappy implementation is at http://code.google.com/p/snappy/
package snappy
import (
"hash/crc32"
)
/*
Each encoded block begins with the varint-encoded length of the decoded data,
followed by a sequence of chunks. Chunks begin and end on byte boundaries. The
first byte of each chunk is broken into its 2 least and 6 most significant bits
called l and m: l ranges in [0, 4) and m ranges in [0, 64). l is the chunk tag.
Zero means a literal tag. All other values mean a copy tag.
For literal tags:
- If m < 60, the next 1 + m bytes are literal bytes.
- Otherwise, let n be the little-endian unsigned integer denoted by the next
m - 59 bytes. The next 1 + n bytes after that are literal bytes.
For copy tags, length bytes are copied from offset bytes ago, in the style of
Lempel-Ziv compression algorithms. In particular:
- For l == 1, the offset ranges in [0, 1<<11) and the length in [4, 12).
The length is 4 + the low 3 bits of m. The high 3 bits of m form bits 8-10
of the offset. The next byte is bits 0-7 of the offset.
- For l == 2, the offset ranges in [0, 1<<16) and the length in [1, 65).
The length is 1 + m. The offset is the little-endian unsigned integer
denoted by the next 2 bytes.
- For l == 3, this tag is a legacy format that is no longer supported.
*/
const (
tagLiteral = 0x00
tagCopy1 = 0x01
tagCopy2 = 0x02
tagCopy4 = 0x03
)
const (
checksumSize = 4
chunkHeaderSize = 4
magicChunk = "\xff\x06\x00\x00" + magicBody
magicBody = "sNaPpY"
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt says
// that "the uncompressed data in a chunk must be no longer than 65536 bytes".
maxUncompressedChunkLen = 65536
)
const (
chunkTypeCompressedData = 0x00
chunkTypeUncompressedData = 0x01
chunkTypePadding = 0xfe
chunkTypeStreamIdentifier = 0xff
)
var crcTable = crc32.MakeTable(crc32.Castagnoli)
// crc implements the checksum specified in section 3 of
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
func crc(b []byte) uint32 {
c := crc32.Update(0, crcTable, b)
return uint32(c>>15|c<<17) + 0xa282ead8
}
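
Note on crc() above: the framing checksum is a CRC-32C that is rotated right by 15 bits and then offset by a constant. The sketch below restates that masking and adds an inverse to show the step is reversible. mask and unmask are hypothetical names, and unmask does not exist in the removed package.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// mask rotates the CRC right by 15 bits and adds a constant, like crc().
func mask(c uint32) uint32 {
	return (c>>15 | c<<17) + 0xa282ead8
}

// unmask undoes mask: subtract the constant, then rotate left by 15 bits.
func unmask(m uint32) uint32 {
	r := m - 0xa282ead8
	return r<<15 | r>>17
}

func main() {
	c := crc32.Checksum([]byte("sNaPpY"), castagnoli)
	m := mask(c)
	fmt.Printf("crc=%#x masked=%#x roundtrip=%v\n", c, m, unmask(m) == c)
}
```

The arithmetic wraps modulo 2^32 because both functions operate on uint32, which is why the subtraction in unmask exactly cancels the addition in mask.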

@ -1,364 +0,0 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package snappy
import (
"bytes"
"flag"
"fmt"
"io"
"io/ioutil"
"math/rand"
"net/http"
"os"
"path/filepath"
"strings"
"testing"
)
var (
download = flag.Bool("download", false, "If true, download any missing files before running benchmarks")
testdata = flag.String("testdata", "testdata", "Directory containing the test data")
)
func roundtrip(b, ebuf, dbuf []byte) error {
e, err := Encode(ebuf, b)
if err != nil {
return fmt.Errorf("encoding error: %v", err)
}
d, err := Decode(dbuf, e)
if err != nil {
return fmt.Errorf("decoding error: %v", err)
}
if !bytes.Equal(b, d) {
return fmt.Errorf("roundtrip mismatch:\n\twant %v\n\tgot %v", b, d)
}
return nil
}
func TestEmpty(t *testing.T) {
if err := roundtrip(nil, nil, nil); err != nil {
t.Fatal(err)
}
}
func TestSmallCopy(t *testing.T) {
for _, ebuf := range [][]byte{nil, make([]byte, 20), make([]byte, 64)} {
for _, dbuf := range [][]byte{nil, make([]byte, 20), make([]byte, 64)} {
for i := 0; i < 32; i++ {
s := "aaaa" + strings.Repeat("b", i) + "aaaabbbb"
if err := roundtrip([]byte(s), ebuf, dbuf); err != nil {
t.Errorf("len(ebuf)=%d, len(dbuf)=%d, i=%d: %v", len(ebuf), len(dbuf), i, err)
}
}
}
}
}
func TestSmallRand(t *testing.T) {
rng := rand.New(rand.NewSource(27354294))
for n := 1; n < 20000; n += 23 {
b := make([]byte, n)
for i := range b {
b[i] = uint8(rng.Uint32())
}
if err := roundtrip(b, nil, nil); err != nil {
t.Fatal(err)
}
}
}
func TestSmallRegular(t *testing.T) {
for n := 1; n < 20000; n += 23 {
b := make([]byte, n)
for i := range b {
b[i] = uint8(i%10 + 'a')
}
if err := roundtrip(b, nil, nil); err != nil {
t.Fatal(err)
}
}
}
func cmp(a, b []byte) error {
if len(a) != len(b) {
return fmt.Errorf("got %d bytes, want %d", len(a), len(b))
}
for i := range a {
if a[i] != b[i] {
return fmt.Errorf("byte #%d: got 0x%02x, want 0x%02x", i, a[i], b[i])
}
}
return nil
}
func TestFramingFormat(t *testing.T) {
// src is comprised of alternating 1e5-sized sequences of random
// (incompressible) bytes and repeated (compressible) bytes. 1e5 was chosen
// because it is larger than maxUncompressedChunkLen (64k).
src := make([]byte, 1e6)
rng := rand.New(rand.NewSource(1))
for i := 0; i < 10; i++ {
if i%2 == 0 {
for j := 0; j < 1e5; j++ {
src[1e5*i+j] = uint8(rng.Intn(256))
}
} else {
for j := 0; j < 1e5; j++ {
src[1e5*i+j] = uint8(i)
}
}
}
buf := new(bytes.Buffer)
if _, err := NewWriter(buf).Write(src); err != nil {
t.Fatalf("Write: encoding: %v", err)
}
dst, err := ioutil.ReadAll(NewReader(buf))
if err != nil {
t.Fatalf("ReadAll: decoding: %v", err)
}
if err := cmp(dst, src); err != nil {
t.Fatal(err)
}
}
func TestReaderReset(t *testing.T) {
gold := bytes.Repeat([]byte("All that is gold does not glitter,\n"), 10000)
buf := new(bytes.Buffer)
if _, err := NewWriter(buf).Write(gold); err != nil {
t.Fatalf("Write: %v", err)
}
encoded, invalid, partial := buf.String(), "invalid", "partial"
r := NewReader(nil)
for i, s := range []string{encoded, invalid, partial, encoded, partial, invalid, encoded, encoded} {
if s == partial {
r.Reset(strings.NewReader(encoded))
if _, err := r.Read(make([]byte, 101)); err != nil {
t.Errorf("#%d: %v", i, err)
continue
}
continue
}
r.Reset(strings.NewReader(s))
got, err := ioutil.ReadAll(r)
switch s {
case encoded:
if err != nil {
t.Errorf("#%d: %v", i, err)
continue
}
if err := cmp(got, gold); err != nil {
t.Errorf("#%d: %v", i, err)
continue
}
case invalid:
if err == nil {
t.Errorf("#%d: got nil error, want non-nil", i)
continue
}
}
}
}
func TestWriterReset(t *testing.T) {
gold := bytes.Repeat([]byte("Not all those who wander are lost;\n"), 10000)
var gots, wants [][]byte
const n = 20
w, failed := NewWriter(nil), false
for i := 0; i <= n; i++ {
buf := new(bytes.Buffer)
w.Reset(buf)
want := gold[:len(gold)*i/n]
if _, err := w.Write(want); err != nil {
t.Errorf("#%d: Write: %v", i, err)
failed = true
continue
}
got, err := ioutil.ReadAll(NewReader(buf))
if err != nil {
t.Errorf("#%d: ReadAll: %v", i, err)
failed = true
continue
}
gots = append(gots, got)
wants = append(wants, want)
}
if failed {
return
}
for i := range gots {
if err := cmp(gots[i], wants[i]); err != nil {
t.Errorf("#%d: %v", i, err)
}
}
}
func benchDecode(b *testing.B, src []byte) {
encoded, err := Encode(nil, src)
if err != nil {
b.Fatal(err)
}
// Bandwidth is in amount of uncompressed data.
b.SetBytes(int64(len(src)))
b.ResetTimer()
for i := 0; i < b.N; i++ {
Decode(src, encoded)
}
}
func benchEncode(b *testing.B, src []byte) {
// Bandwidth is in amount of uncompressed data.
b.SetBytes(int64(len(src)))
dst := make([]byte, MaxEncodedLen(len(src)))
b.ResetTimer()
for i := 0; i < b.N; i++ {
Encode(dst, src)
}
}
func readFile(b testing.TB, filename string) []byte {
src, err := ioutil.ReadFile(filename)
if err != nil {
b.Fatalf("failed reading %s: %s", filename, err)
}
if len(src) == 0 {
b.Fatalf("%s has zero length", filename)
}
return src
}
// expand returns a slice of length n containing repeated copies of src.
func expand(src []byte, n int) []byte {
dst := make([]byte, n)
for x := dst; len(x) > 0; {
i := copy(x, src)
x = x[i:]
}
return dst
}
func benchWords(b *testing.B, n int, decode bool) {
// Note: the file is OS-language dependent so the resulting values are not
// directly comparable for non-US-English OS installations.
data := expand(readFile(b, "/usr/share/dict/words"), n)
if decode {
benchDecode(b, data)
} else {
benchEncode(b, data)
}
}
func BenchmarkWordsDecode1e3(b *testing.B) { benchWords(b, 1e3, true) }
func BenchmarkWordsDecode1e4(b *testing.B) { benchWords(b, 1e4, true) }
func BenchmarkWordsDecode1e5(b *testing.B) { benchWords(b, 1e5, true) }
func BenchmarkWordsDecode1e6(b *testing.B) { benchWords(b, 1e6, true) }
func BenchmarkWordsEncode1e3(b *testing.B) { benchWords(b, 1e3, false) }
func BenchmarkWordsEncode1e4(b *testing.B) { benchWords(b, 1e4, false) }
func BenchmarkWordsEncode1e5(b *testing.B) { benchWords(b, 1e5, false) }
func BenchmarkWordsEncode1e6(b *testing.B) { benchWords(b, 1e6, false) }
// testFiles' values are copied directly from
// https://raw.githubusercontent.com/google/snappy/master/snappy_unittest.cc
// The label field is unused in snappy-go.
var testFiles = []struct {
label string
filename string
}{
{"html", "html"},
{"urls", "urls.10K"},
{"jpg", "fireworks.jpeg"},
{"jpg_200", "fireworks.jpeg"},
{"pdf", "paper-100k.pdf"},
{"html4", "html_x_4"},
{"txt1", "alice29.txt"},
{"txt2", "asyoulik.txt"},
{"txt3", "lcet10.txt"},
{"txt4", "plrabn12.txt"},
{"pb", "geo.protodata"},
{"gaviota", "kppkn.gtb"},
}
// The test data files are present at this canonical URL.
const baseURL = "https://raw.githubusercontent.com/google/snappy/master/testdata/"
func downloadTestdata(basename string) (errRet error) {
filename := filepath.Join(*testdata, basename)
if stat, err := os.Stat(filename); err == nil && stat.Size() != 0 {
return nil
}
if !*download {
return fmt.Errorf("test data not found; skipping benchmark without the -download flag")
}
// Download the official snappy C++ implementation reference test data
// files for benchmarking.
if err := os.Mkdir(*testdata, 0777); err != nil && !os.IsExist(err) {
return fmt.Errorf("failed to create testdata: %s", err)
}
f, err := os.Create(filename)
if err != nil {
return fmt.Errorf("failed to create %s: %s", filename, err)
}
defer f.Close()
defer func() {
if errRet != nil {
os.Remove(filename)
}
}()
url := baseURL + basename
resp, err := http.Get(url)
if err != nil {
return fmt.Errorf("failed to download %s: %s", url, err)
}
defer resp.Body.Close()
if s := resp.StatusCode; s != http.StatusOK {
return fmt.Errorf("downloading %s: HTTP status code %d (%s)", url, s, http.StatusText(s))
}
_, err = io.Copy(f, resp.Body)
if err != nil {
return fmt.Errorf("failed to download %s to %s: %s", url, filename, err)
}
return nil
}
func benchFile(b *testing.B, n int, decode bool) {
if err := downloadTestdata(testFiles[n].filename); err != nil {
b.Fatalf("failed to download testdata: %s", err)
}
data := readFile(b, filepath.Join(*testdata, testFiles[n].filename))
if decode {
benchDecode(b, data)
} else {
benchEncode(b, data)
}
}
// Naming convention is kept similar to what snappy's C++ implementation uses.
func Benchmark_UFlat0(b *testing.B) { benchFile(b, 0, true) }
func Benchmark_UFlat1(b *testing.B) { benchFile(b, 1, true) }
func Benchmark_UFlat2(b *testing.B) { benchFile(b, 2, true) }
func Benchmark_UFlat3(b *testing.B) { benchFile(b, 3, true) }
func Benchmark_UFlat4(b *testing.B) { benchFile(b, 4, true) }
func Benchmark_UFlat5(b *testing.B) { benchFile(b, 5, true) }
func Benchmark_UFlat6(b *testing.B) { benchFile(b, 6, true) }
func Benchmark_UFlat7(b *testing.B) { benchFile(b, 7, true) }
func Benchmark_UFlat8(b *testing.B) { benchFile(b, 8, true) }
func Benchmark_UFlat9(b *testing.B) { benchFile(b, 9, true) }
func Benchmark_UFlat10(b *testing.B) { benchFile(b, 10, true) }
func Benchmark_UFlat11(b *testing.B) { benchFile(b, 11, true) }
func Benchmark_ZFlat0(b *testing.B) { benchFile(b, 0, false) }
func Benchmark_ZFlat1(b *testing.B) { benchFile(b, 1, false) }
func Benchmark_ZFlat2(b *testing.B) { benchFile(b, 2, false) }
func Benchmark_ZFlat3(b *testing.B) { benchFile(b, 3, false) }
func Benchmark_ZFlat4(b *testing.B) { benchFile(b, 4, false) }
func Benchmark_ZFlat5(b *testing.B) { benchFile(b, 5, false) }
func Benchmark_ZFlat6(b *testing.B) { benchFile(b, 6, false) }
func Benchmark_ZFlat7(b *testing.B) { benchFile(b, 7, false) }
func Benchmark_ZFlat8(b *testing.B) { benchFile(b, 8, false) }
func Benchmark_ZFlat9(b *testing.B) { benchFile(b, 9, false) }
func Benchmark_ZFlat10(b *testing.B) { benchFile(b, 10, false) }
func Benchmark_ZFlat11(b *testing.B) { benchFile(b, 11, false) }