mirror of https://github.com/perkeep/perkeep.git
cmd/camput: compact LevelDB on HaveCache setup
This CL is about levelDB as the HaveCache for camput, and there are several aspects to it. To describe it, I'll take the particular example where you want to add many permanodes (~33k) to a given set, with camput. Something like:

    for _, blob := range blobs {
        do("camput attr -add sha1-foobar camliMember " + blob)
    }

In a "normal" levelDB use case, every time the number of level-0 .ldb files goes over 4 (by default), a background compaction task is started to transform these SSTs into level-1 ones, and remove the level-0 ones. However, since our particular camput call is very short-lived (especially on a local Perkeep), not only might there not be enough time for the compaction to be triggered, but even if it is, any ongoing compaction is cancelled when the DB is flushed (on a Close call). This makes level-0 compactions very unlikely to happen on short-lived camput calls. As a result, the number of level-0 files keeps growing until levelDB fails while trying to open them all, because it hits the current process ulimit.

Now, in this CL, what we propose is to systematically force a compaction as soon as the HaveCache is opened. It is not scheduled concurrently, so we are sure that the compaction happens before the DB actually gets used by camput. This seems to make sure that the number of level-0 tables never grows too much. With this change, I was able to run the above example on 33K blobs without hitting the ulimit error.

However, it should be noted that potential problems might remain. The compaction for levels above 0 is triggered based only on the total size of the level (e.g. at 100MB by default for level-1), and not on the number of files. Since we're creating many tiny tables (basically 1 entry per table), the number of files grows very fast while the total size does not, and the compaction does not get triggered, even if forced with CompactRange. This does not seem to be a problem for our use case, as levelDB does not seem to need to open many of the level-1 files at the same time, so we're not hitting the ulimit problem because of that. If needed, there's at least one way this problem (if it is one?) could be fixed: make the compaction trigger on other conditions, such as the number of files per level. I've experimented with it (forcing the level-1 compaction to trigger at the 100 files limit), and it seems to work. But I had to change the goleveldb code itself, and I don't think levelDB implementations are supposed to do that.

For information, at the end of the run on the 33K blobs:

    $ du -sch *.ldb
    ...
    83M total
    $ ll | wc -l
    20988

And indeed, when asking for leveldb.stats on the table:

    Level | Tables | Size(MB)
    ------+--------+---------
      0   |      1 |  0.00015
      1   |  20981 |  3.47307

Also, update github.com/syndtr/goleveldb to 34011bf325bce385408353a30b101fe5e923eb6e, and remove github.com/syndtr/gosnappy as goleveldb does not use it anymore.

Also apply this change to StatCache.

Fixes #1008

Change-Id: If9f790a003e67f3c075881470e52e5f2174afa73
parent b88b82f1ee
commit da9020ec71
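The change itself is small: when camput opens its have-cache (and stat-cache), it now checks how many level-0 tables the LevelDB has and, if goleveldb's default trigger of 4 has been reached, runs a blocking CompactRange before handing the DB back to the caller. The sketch below illustrates that idea against goleveldb's public API; the helper name, the main function, and the database path are illustrative only — the version that actually ships in this commit is the maybeRunCompaction helper further down in the diff.

    package main

    import (
        "fmt"
        "log"
        "strconv"

        "github.com/syndtr/goleveldb/leveldb"
        "github.com/syndtr/goleveldb/leveldb/util"
    )

    // compactIfNeeded forces a compaction of db when the number of level-0
    // tables has reached goleveldb's default compaction trigger (4). It blocks
    // until the compaction is done, so a short-lived process cannot leave an
    // ever-growing pile of level-0 .ldb files behind.
    func compactIfNeeded(db *leveldb.DB) error {
        val, err := db.GetProperty("leveldb.num-files-at-level0")
        if err != nil {
            return fmt.Errorf("could not get number of level-0 files: %v", err)
        }
        n, err := strconv.Atoi(val)
        if err != nil {
            return fmt.Errorf("could not parse %q as int: %v", val, err)
        }
        if n < 4 {
            return nil
        }
        // A zero util.Range means "compact the whole key range".
        return db.CompactRange(util.Range{})
    }

    func main() {
        // "path/to/havecache.leveldb" is a placeholder path.
        db, err := leveldb.OpenFile("path/to/havecache.leveldb", nil)
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        if err := compactIfNeeded(db); err != nil {
            log.Fatal(err)
        }

        // "leveldb.stats" prints the per-level table counts and sizes
        // quoted in the commit message above.
        if stats, err := db.GetProperty("leveldb.stats"); err == nil {
            fmt.Println(stats)
        }
    }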
@ -3,26 +3,12 @@
[[projects]]
name = "bazil.org/fuse"
packages = [
".",
"fs",
"fuseutil",
"syscallx"
]
packages = [".","fs","fuseutil","syscallx"]
revision = "371fbbdaa8987b715bdd21d6adc4c9b20155f748"

[[projects]]
name = "cloud.google.com/go"
packages = [
"compute/metadata",
"datastore",
"internal",
"internal/bundler",
"logging",
"logging/apiv2",
"logging/internal",
"storage"
]
packages = ["compute/metadata","datastore","internal","internal/bundler","logging","logging/apiv2","logging/internal","storage"]
revision = "b70ccc799b9d019708c3eb9395acef6e3f6b7bc8"

[[projects]]
@ -43,11 +29,7 @@
|
|||
|
||||
[[projects]]
|
||||
name = "github.com/cznic/internal"
|
||||
packages = [
|
||||
"buffer",
|
||||
"file",
|
||||
"slice"
|
||||
]
|
||||
packages = ["buffer","file","slice"]
|
||||
revision = "4747030f7cf2f4c0a01512b00cd68734b167ac3b"
|
||||
|
||||
[[projects]]
|
||||
|
@ -104,16 +86,7 @@
|
|||
[[projects]]
|
||||
branch = "master"
|
||||
name = "github.com/golang/protobuf"
|
||||
packages = [
|
||||
"proto",
|
||||
"ptypes",
|
||||
"ptypes/any",
|
||||
"ptypes/duration",
|
||||
"ptypes/empty",
|
||||
"ptypes/struct",
|
||||
"ptypes/timestamp",
|
||||
"ptypes/wrappers"
|
||||
]
|
||||
packages = ["proto","ptypes","ptypes/any","ptypes/duration","ptypes/empty","ptypes/struct","ptypes/timestamp","ptypes/wrappers"]
|
||||
revision = "1e59b77b52bf8e4b449a57e6f79f21226d571845"
|
||||
|
||||
[[projects]]
|
||||
|
@ -128,21 +101,7 @@
|
|||
|
||||
[[projects]]
|
||||
name = "github.com/gopherjs/gopherjs"
|
||||
packages = [
|
||||
".",
|
||||
"build",
|
||||
"compiler",
|
||||
"compiler/analysis",
|
||||
"compiler/astutil",
|
||||
"compiler/filter",
|
||||
"compiler/natives",
|
||||
"compiler/prelude",
|
||||
"compiler/typesutil",
|
||||
"internal/sysutil",
|
||||
"js",
|
||||
"nosync",
|
||||
"tests/otherpkg"
|
||||
]
|
||||
packages = [".","build","compiler","compiler/analysis","compiler/astutil","compiler/filter","compiler/natives","compiler/prelude","compiler/typesutil","internal/sysutil","js","nosync","tests/otherpkg"]
|
||||
revision = "b40cd48c38f9a18eb3db20d163bad78de12cf0b7"
|
||||
|
||||
[[projects]]
|
||||
|
@ -164,10 +123,7 @@
|
|||
|
||||
[[projects]]
|
||||
name = "github.com/hjfreyer/taglib-go"
|
||||
packages = [
|
||||
"taglib",
|
||||
"taglib/id3"
|
||||
]
|
||||
packages = ["taglib","taglib/id3"]
|
||||
revision = "0ef8bba9c41b66c12f60ce9833786838d2c2d3d8"
|
||||
|
||||
[[projects]]
|
||||
|
@ -188,10 +144,7 @@
|
|||
|
||||
[[projects]]
|
||||
name = "github.com/lib/pq"
|
||||
packages = [
|
||||
".",
|
||||
"oid"
|
||||
]
|
||||
packages = [".","oid"]
|
||||
revision = "9afcd9aa793101bd0536da34e74ae0123345bab1"
|
||||
|
||||
[[projects]]
|
||||
|
@ -244,10 +197,7 @@
|
|||
|
||||
[[projects]]
|
||||
name = "github.com/rwcarlsen/goexif"
|
||||
packages = [
|
||||
"exif",
|
||||
"tiff"
|
||||
]
|
||||
packages = ["exif","tiff"]
|
||||
revision = "709fab3d192d7c62f86043caff1e7e3fb0f42bd8"
|
||||
|
||||
[[projects]]
|
||||
|
@ -279,26 +229,8 @@
[[projects]]
name = "github.com/syndtr/goleveldb"
packages = [
"leveldb",
"leveldb/cache",
"leveldb/comparer",
"leveldb/errors",
"leveldb/filter",
"leveldb/iterator",
"leveldb/journal",
"leveldb/memdb",
"leveldb/opt",
"leveldb/storage",
"leveldb/table",
"leveldb/util"
]
revision = "4875955338b0a434238a31165cb87255ab6e9e4a"

[[projects]]
name = "github.com/syndtr/gosnappy"
packages = ["snappy"]
revision = "156a073208e131d7d2e212cb749feae7c339e846"
packages = ["leveldb","leveldb/cache","leveldb/comparer","leveldb/errors","leveldb/filter","leveldb/iterator","leveldb/journal","leveldb/memdb","leveldb/opt","leveldb/storage","leveldb/table","leveldb/util"]
revision = "34011bf325bce385408353a30b101fe5e923eb6e"

[[projects]]
name = "github.com/tgulacsi/picago"
@ -307,87 +239,27 @@
|
|||
|
||||
[[projects]]
|
||||
name = "go4.org"
|
||||
packages = [
|
||||
"cloud/cloudlaunch",
|
||||
"cloud/google/gceutil",
|
||||
"cloud/google/gcsutil",
|
||||
"ctxutil",
|
||||
"errorutil",
|
||||
"fault",
|
||||
"jsonconfig",
|
||||
"legal",
|
||||
"lock",
|
||||
"net/throttle",
|
||||
"oauthutil",
|
||||
"readerutil",
|
||||
"strutil",
|
||||
"syncutil",
|
||||
"syncutil/singleflight",
|
||||
"types",
|
||||
"wkfs",
|
||||
"wkfs/gcs",
|
||||
"writerutil"
|
||||
]
|
||||
packages = ["cloud/cloudlaunch","cloud/google/gceutil","cloud/google/gcsutil","ctxutil","errorutil","fault","jsonconfig","legal","lock","net/throttle","oauthutil","readerutil","strutil","syncutil","syncutil/singleflight","types","wkfs","wkfs/gcs","writerutil"]
|
||||
revision = "c3a8ba339e20006b054736f8eb9fc5e1d5fa6eab"
|
||||
|
||||
[[projects]]
|
||||
name = "golang.org/x/crypto"
|
||||
packages = [
|
||||
"acme",
|
||||
"acme/autocert",
|
||||
"cast5",
|
||||
"nacl/secretbox",
|
||||
"openpgp",
|
||||
"openpgp/armor",
|
||||
"openpgp/elgamal",
|
||||
"openpgp/errors",
|
||||
"openpgp/packet",
|
||||
"openpgp/s2k",
|
||||
"pbkdf2",
|
||||
"poly1305",
|
||||
"salsa20/salsa",
|
||||
"scrypt",
|
||||
"ssh/terminal"
|
||||
]
|
||||
packages = ["acme","acme/autocert","cast5","nacl/secretbox","openpgp","openpgp/armor","openpgp/elgamal","openpgp/errors","openpgp/packet","openpgp/s2k","pbkdf2","poly1305","salsa20/salsa","scrypt","ssh/terminal"]
|
||||
revision = "ede567c8e044a5913dad1d1af3696d9da953104c"
|
||||
|
||||
[[projects]]
|
||||
name = "golang.org/x/image"
|
||||
packages = [
|
||||
"draw",
|
||||
"math/f64",
|
||||
"tiff",
|
||||
"tiff/lzw"
|
||||
]
|
||||
packages = ["draw","math/f64","tiff","tiff/lzw"]
|
||||
revision = "12117c17ca67ffa1ce22e9409f3b0b0a93ac08c7"
|
||||
|
||||
[[projects]]
|
||||
name = "golang.org/x/net"
|
||||
packages = [
|
||||
"context",
|
||||
"context/ctxhttp",
|
||||
"html",
|
||||
"html/atom",
|
||||
"html/charset",
|
||||
"http2",
|
||||
"http2/hpack",
|
||||
"idna",
|
||||
"internal/timeseries",
|
||||
"lex/httplex",
|
||||
"trace",
|
||||
"xsrftoken"
|
||||
]
|
||||
packages = ["context","context/ctxhttp","html","html/atom","html/charset","http2","http2/hpack","idna","internal/timeseries","lex/httplex","trace","xsrftoken"]
|
||||
revision = "d866cfc389cec985d6fda2859936a575a55a3ab6"
|
||||
|
||||
[[projects]]
|
||||
name = "golang.org/x/oauth2"
|
||||
packages = [
|
||||
".",
|
||||
"google",
|
||||
"internal",
|
||||
"jws",
|
||||
"jwt"
|
||||
]
|
||||
packages = [".","google","internal","jws","jwt"]
|
||||
revision = "197281d4e0ecd78c33865daf9c6e51626feefcb2"
|
||||
|
||||
[[projects]]
|
||||
|
@ -402,35 +274,7 @@
|
|||
|
||||
[[projects]]
|
||||
name = "golang.org/x/text"
|
||||
packages = [
|
||||
".",
|
||||
"collate",
|
||||
"collate/build",
|
||||
"encoding",
|
||||
"encoding/charmap",
|
||||
"encoding/htmlindex",
|
||||
"encoding/internal",
|
||||
"encoding/internal/identifier",
|
||||
"encoding/japanese",
|
||||
"encoding/korean",
|
||||
"encoding/simplifiedchinese",
|
||||
"encoding/traditionalchinese",
|
||||
"encoding/unicode",
|
||||
"internal/colltab",
|
||||
"internal/gen",
|
||||
"internal/tag",
|
||||
"internal/triegen",
|
||||
"internal/ucd",
|
||||
"internal/utf8internal",
|
||||
"language",
|
||||
"runes",
|
||||
"secure/bidirule",
|
||||
"transform",
|
||||
"unicode/bidi",
|
||||
"unicode/cldr",
|
||||
"unicode/norm",
|
||||
"unicode/rangetable"
|
||||
]
|
||||
packages = [".","collate","collate/build","encoding","encoding/charmap","encoding/htmlindex","encoding/internal","encoding/internal/identifier","encoding/japanese","encoding/korean","encoding/simplifiedchinese","encoding/traditionalchinese","encoding/unicode","internal/colltab","internal/gen","internal/tag","internal/triegen","internal/ucd","internal/utf8internal","language","runes","secure/bidirule","transform","unicode/bidi","unicode/cldr","unicode/norm","unicode/rangetable"]
|
||||
revision = "88f656faf3f37f690df1a32515b479415e1a6769"
|
||||
|
||||
[[projects]]
|
||||
|
@ -440,99 +284,35 @@
|
|||
|
||||
[[projects]]
|
||||
name = "golang.org/x/tools"
|
||||
packages = [
|
||||
"go/buildutil",
|
||||
"go/gcimporter15",
|
||||
"go/types/typeutil",
|
||||
"refactor/importgraph"
|
||||
]
|
||||
packages = ["go/buildutil","go/gcimporter15","go/types/typeutil","refactor/importgraph"]
|
||||
revision = "e531a2a1c15f94033f6fa87666caeb19a688175f"
|
||||
|
||||
[[projects]]
|
||||
name = "google.golang.org/api"
|
||||
packages = [
|
||||
"cloudresourcemanager/v1",
|
||||
"compute/v1",
|
||||
"drive/v2",
|
||||
"drive/v3",
|
||||
"gensupport",
|
||||
"googleapi",
|
||||
"googleapi/internal/uritemplates",
|
||||
"googleapi/transport",
|
||||
"internal",
|
||||
"iterator",
|
||||
"option",
|
||||
"servicemanagement/v1",
|
||||
"sqladmin/v1beta3",
|
||||
"storage/v1",
|
||||
"transport"
|
||||
]
|
||||
packages = ["cloudresourcemanager/v1","compute/v1","drive/v2","drive/v3","gensupport","googleapi","googleapi/internal/uritemplates","googleapi/transport","internal","iterator","option","servicemanagement/v1","sqladmin/v1beta3","storage/v1","transport"]
|
||||
revision = "48e49d1645e228d1c50c3d54fb476b2224477303"
|
||||
|
||||
[[projects]]
|
||||
name = "google.golang.org/appengine"
|
||||
packages = [
|
||||
".",
|
||||
"internal",
|
||||
"internal/app_identity",
|
||||
"internal/base",
|
||||
"internal/datastore",
|
||||
"internal/log",
|
||||
"internal/modules",
|
||||
"internal/remote_api",
|
||||
"internal/socket",
|
||||
"internal/urlfetch",
|
||||
"socket",
|
||||
"urlfetch"
|
||||
]
|
||||
packages = [".","internal","internal/app_identity","internal/base","internal/datastore","internal/log","internal/modules","internal/remote_api","internal/socket","internal/urlfetch","socket","urlfetch"]
|
||||
revision = "150dc57a1b433e64154302bdc40b6bb8aefa313a"
|
||||
version = "v1.0.0"
|
||||
|
||||
[[projects]]
|
||||
name = "google.golang.org/genproto"
|
||||
packages = [
|
||||
"googleapis/api/label",
|
||||
"googleapis/api/metric",
|
||||
"googleapis/api/monitoredres",
|
||||
"googleapis/api/serviceconfig",
|
||||
"googleapis/datastore/v1",
|
||||
"googleapis/logging/type",
|
||||
"googleapis/logging/v2",
|
||||
"googleapis/rpc/status",
|
||||
"googleapis/type/latlng",
|
||||
"protobuf"
|
||||
]
|
||||
packages = ["googleapis/api/label","googleapis/api/metric","googleapis/api/monitoredres","googleapis/api/serviceconfig","googleapis/datastore/v1","googleapis/logging/type","googleapis/logging/v2","googleapis/rpc/status","googleapis/type/latlng","protobuf"]
|
||||
revision = "08f135d1a31b6ba454287638a3ce23a55adace6f"
|
||||
|
||||
[[projects]]
|
||||
name = "google.golang.org/grpc"
|
||||
packages = [
|
||||
".",
|
||||
"codes",
|
||||
"credentials",
|
||||
"credentials/oauth",
|
||||
"grpclog",
|
||||
"internal",
|
||||
"metadata",
|
||||
"naming",
|
||||
"peer",
|
||||
"stats",
|
||||
"tap",
|
||||
"transport"
|
||||
]
|
||||
packages = [".","codes","credentials","credentials/oauth","grpclog","internal","metadata","naming","peer","stats","tap","transport"]
|
||||
revision = "188a132adcfba339f1f2d5da52498451341f9ee8"
|
||||
source = "https://github.com/bradfitz/grpc-go.git"
|
||||
|
||||
[[projects]]
|
||||
branch = "v2"
|
||||
name = "gopkg.in/mgo.v2"
|
||||
packages = [
|
||||
".",
|
||||
"bson",
|
||||
"internal/json",
|
||||
"internal/sasl",
|
||||
"internal/scram"
|
||||
]
|
||||
packages = [".","bson","internal/json","internal/sasl","internal/scram"]
|
||||
revision = "3f83fa5005286a7fe593b055f0d7771a7dce4655"
|
||||
|
||||
[[projects]]
|
||||
|
@ -547,15 +327,7 @@
|
|||
|
||||
[[projects]]
|
||||
name = "myitcv.io/react"
|
||||
packages = [
|
||||
".",
|
||||
"cmd/reactGen",
|
||||
"internal/bundle",
|
||||
"internal/core",
|
||||
"internal/dev",
|
||||
"internal/preact",
|
||||
"internal/prod"
|
||||
]
|
||||
packages = [".","cmd/reactGen","internal/bundle","internal/core","internal/dev","internal/preact","internal/prod"]
|
||||
revision = "bca7c66b77ed8a5b86fb77cff70914c4a7cc3ce5"
|
||||
|
||||
[[projects]]
|
||||
|
@ -565,16 +337,12 @@
|
|||
|
||||
[[projects]]
|
||||
name = "rsc.io/qr"
|
||||
packages = [
|
||||
".",
|
||||
"coding",
|
||||
"gf256"
|
||||
]
|
||||
packages = [".","coding","gf256"]
|
||||
revision = "48b2ede4844e13f1a2b7ce4d2529c9af7e359fc5"
|
||||
|
||||
[solve-meta]
|
||||
analyzer-name = "dep"
|
||||
analyzer-version = 1
|
||||
inputs-digest = "0922eed8dece3458922d996935e4ca92abcc96da00d8ec0295ee44815079a0fc"
|
||||
inputs-digest = "58aeea7aea438a25e7b53a5ba4847f3f28c99cedf3bdfad4e7b2dca110da839d"
|
||||
solver-name = "gps-cdcl"
|
||||
solver-version = 1
|
||||
|
|
|
@ -25,7 +25,6 @@ required = [
"google.golang.org/grpc", # fork by bradfitz
"golang.org/x/text",
"github.com/neelance/sourcemap",
"github.com/syndtr/gosnappy/snappy",
"golang.org/x/sys/unix",
"golang.org/x/tools/go/types/typeutil", # for gopherjs
"golang.org/x/tools/go/gcimporter15", # for gopherjs

@ -195,11 +194,7 @@ required = [
[[constraint]]
name = "github.com/syndtr/goleveldb"
revision = "4875955338b0a434238a31165cb87255ab6e9e4a"

[[constraint]]
name = "github.com/syndtr/gosnappy"
revision = "156a073208e131d7d2e212cb749feae7c339e846"
revision = "34011bf325bce385408353a30b101fe5e923eb6e"

[[constraint]]
name = "github.com/tgulacsi/picago"
@ -36,6 +36,7 @@ import (
	"perkeep.org/pkg/client"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

var errCacheMiss = errors.New("not in cache")

@ -59,12 +60,39 @@ func NewKvHaveCache(gen string) *KvHaveCache {
	if err != nil {
		log.Fatalf("Could not create/open new have cache at %v, %v", fullPath, err)
	}

	if err := maybeRunCompaction("HaveCache", db); err != nil {
		log.Fatal(err)
	}

	return &KvHaveCache{
		filename: fullPath,
		db:       db,
	}
}

// maybeRunCompaction forces compaction of db, if the number of
// tables in level 0 is >= 4. dbname should be provided for error messages.
func maybeRunCompaction(dbname string, db *leveldb.DB) error {
	val, err := db.GetProperty("leveldb.num-files-at-level0")
	if err != nil {
		return fmt.Errorf("could not get number of level-0 files of %v's LevelDB: %v", dbname, err)
	}
	nbFiles, err := strconv.Atoi(val)
	if err != nil {
		return fmt.Errorf("could not convert number of level-0 files to int: %v", err)
	}
	// Only force compaction if we're at the default trigger (4), see
	// github.com/syndtr/goleveldb/leveldb/opt.DefaultCompactionL0Trigger
	if nbFiles < 4 {
		return nil
	}
	if err := db.CompactRange(util.Range{nil, nil}); err != nil {
		return fmt.Errorf("could not run compaction on %v's LevelDB: %v", dbname, err)
	}
	return nil
}

// Close should be called to commit all the writes
// to the db and to unlock the file.
func (c *KvHaveCache) Close() error {

@ -129,6 +157,11 @@ func NewKvStatCache(gen string) *KvStatCache {
	if err != nil {
		log.Fatalf("Could not create/open new stat cache at %v, %v", fullPath, err)
	}

	if err := maybeRunCompaction("StatCache", db); err != nil {
		log.Fatal(err)
	}

	return &KvStatCache{
		filename: fullPath,
		db:       db,
@ -10,94 +10,97 @@ Installation
|
|||
Requirements
|
||||
-----------
|
||||
|
||||
* Need at least `go1.2` or newer.
|
||||
* Need at least `go1.5` or newer.
|
||||
|
||||
Usage
|
||||
-----------
|
||||
|
||||
Create or open a database:
|
||||
|
||||
db, err := leveldb.OpenFile("path/to/db", nil)
|
||||
...
|
||||
defer db.Close()
|
||||
...
|
||||
|
||||
```go
|
||||
// The returned DB instance is safe for concurrent use. Which mean that all
|
||||
// DB's methods may be called concurrently from multiple goroutine.
|
||||
db, err := leveldb.OpenFile("path/to/db", nil)
|
||||
...
|
||||
defer db.Close()
|
||||
...
|
||||
```
|
||||
Read or modify the database content:
|
||||
|
||||
// Remember that the contents of the returned slice should not be modified.
|
||||
data, err := db.Get([]byte("key"), nil)
|
||||
...
|
||||
err = db.Put([]byte("key"), []byte("value"), nil)
|
||||
...
|
||||
err = db.Delete([]byte("key"), nil)
|
||||
...
|
||||
```go
|
||||
// Remember that the contents of the returned slice should not be modified.
|
||||
data, err := db.Get([]byte("key"), nil)
|
||||
...
|
||||
err = db.Put([]byte("key"), []byte("value"), nil)
|
||||
...
|
||||
err = db.Delete([]byte("key"), nil)
|
||||
...
|
||||
```
|
||||
|
||||
Iterate over database content:
|
||||
|
||||
iter := db.NewIterator(nil, nil)
|
||||
for iter.Next() {
|
||||
// Remember that the contents of the returned slice should not be modified, and
|
||||
// only valid until the next call to Next.
|
||||
key := iter.Key()
|
||||
value := iter.Value()
|
||||
...
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
```go
|
||||
iter := db.NewIterator(nil, nil)
|
||||
for iter.Next() {
|
||||
// Remember that the contents of the returned slice should not be modified, and
|
||||
// only valid until the next call to Next.
|
||||
key := iter.Key()
|
||||
value := iter.Value()
|
||||
...
|
||||
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
...
|
||||
```
|
||||
Seek-then-Iterate:
|
||||
|
||||
iter := db.NewIterator(nil, nil)
|
||||
for ok := iter.Seek(key); ok; ok = iter.Next() {
|
||||
// Use key/value.
|
||||
...
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
```go
|
||||
iter := db.NewIterator(nil, nil)
|
||||
for ok := iter.Seek(key); ok; ok = iter.Next() {
|
||||
// Use key/value.
|
||||
...
|
||||
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
...
|
||||
```
|
||||
Iterate over subset of database content:
|
||||
|
||||
iter := db.NewIterator(&util.Range{Start: []byte("foo"), Limit: []byte("xoo")}, nil)
|
||||
for iter.Next() {
|
||||
// Use key/value.
|
||||
...
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
```go
|
||||
iter := db.NewIterator(&util.Range{Start: []byte("foo"), Limit: []byte("xoo")}, nil)
|
||||
for iter.Next() {
|
||||
// Use key/value.
|
||||
...
|
||||
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
...
|
||||
```
|
||||
Iterate over subset of database content with a particular prefix:
|
||||
|
||||
iter := db.NewIterator(util.BytesPrefix([]byte("foo-")), nil)
|
||||
for iter.Next() {
|
||||
// Use key/value.
|
||||
...
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
```go
|
||||
iter := db.NewIterator(util.BytesPrefix([]byte("foo-")), nil)
|
||||
for iter.Next() {
|
||||
// Use key/value.
|
||||
...
|
||||
|
||||
}
|
||||
iter.Release()
|
||||
err = iter.Error()
|
||||
...
|
||||
```
|
||||
Batch writes:
|
||||
|
||||
batch := new(leveldb.Batch)
|
||||
batch.Put([]byte("foo"), []byte("value"))
|
||||
batch.Put([]byte("bar"), []byte("another value"))
|
||||
batch.Delete([]byte("baz"))
|
||||
err = db.Write(batch, nil)
|
||||
...
|
||||
|
||||
```go
|
||||
batch := new(leveldb.Batch)
|
||||
batch.Put([]byte("foo"), []byte("value"))
|
||||
batch.Put([]byte("bar"), []byte("another value"))
|
||||
batch.Delete([]byte("baz"))
|
||||
err = db.Write(batch, nil)
|
||||
...
|
||||
```
|
||||
Use bloom filter:
|
||||
|
||||
o := &opt.Options{
|
||||
Filter: filter.NewBloomFilter(10),
|
||||
}
|
||||
db, err := leveldb.OpenFile("path/to/db", o)
|
||||
...
|
||||
defer db.Close()
|
||||
...
|
||||
|
||||
```go
|
||||
o := &opt.Options{
|
||||
Filter: filter.NewBloomFilter(10),
|
||||
}
|
||||
db, err := leveldb.OpenFile("path/to/db", o)
|
||||
...
|
||||
defer db.Close()
|
||||
...
|
||||
```
|
||||
Documentation
|
||||
-----------
|
||||
|
||||
|
|
|
@ -9,11 +9,15 @@ package leveldb
|
|||
import (
|
||||
"encoding/binary"
|
||||
"fmt"
|
||||
"io"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/errors"
|
||||
"github.com/syndtr/goleveldb/leveldb/memdb"
|
||||
"github.com/syndtr/goleveldb/leveldb/storage"
|
||||
)
|
||||
|
||||
// ErrBatchCorrupted records reason of batch corruption. This error will be
|
||||
// wrapped with errors.ErrCorrupted.
|
||||
type ErrBatchCorrupted struct {
|
||||
Reason string
|
||||
}
|
||||
|
@ -23,84 +27,102 @@ func (e *ErrBatchCorrupted) Error() string {
|
|||
}
|
||||
|
||||
func newErrBatchCorrupted(reason string) error {
|
||||
return errors.NewErrCorrupted(nil, &ErrBatchCorrupted{reason})
|
||||
return errors.NewErrCorrupted(storage.FileDesc{}, &ErrBatchCorrupted{reason})
|
||||
}
|
||||
|
||||
const (
|
||||
batchHdrLen = 8 + 4
|
||||
batchGrowRec = 3000
|
||||
batchHeaderLen = 8 + 4
|
||||
batchGrowRec = 3000
|
||||
batchBufioSize = 16
|
||||
)
|
||||
|
||||
// BatchReplay wraps basic batch operations.
|
||||
type BatchReplay interface {
|
||||
Put(key, value []byte)
|
||||
Delete(key []byte)
|
||||
}
|
||||
|
||||
type batchIndex struct {
|
||||
keyType keyType
|
||||
keyPos, keyLen int
|
||||
valuePos, valueLen int
|
||||
}
|
||||
|
||||
func (index batchIndex) k(data []byte) []byte {
|
||||
return data[index.keyPos : index.keyPos+index.keyLen]
|
||||
}
|
||||
|
||||
func (index batchIndex) v(data []byte) []byte {
|
||||
if index.valueLen != 0 {
|
||||
return data[index.valuePos : index.valuePos+index.valueLen]
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (index batchIndex) kv(data []byte) (key, value []byte) {
|
||||
return index.k(data), index.v(data)
|
||||
}
|
||||
|
||||
// Batch is a write batch.
|
||||
type Batch struct {
|
||||
data []byte
|
||||
rLen, bLen int
|
||||
seq uint64
|
||||
sync bool
|
||||
data []byte
|
||||
index []batchIndex
|
||||
|
||||
// internalLen is sums of key/value pair length plus 8-bytes internal key.
|
||||
internalLen int
|
||||
}
|
||||
|
||||
func (b *Batch) grow(n int) {
|
||||
off := len(b.data)
|
||||
if off == 0 {
|
||||
off = batchHdrLen
|
||||
if b.data != nil {
|
||||
b.data = b.data[:off]
|
||||
}
|
||||
}
|
||||
if cap(b.data)-off < n {
|
||||
if b.data == nil {
|
||||
b.data = make([]byte, off, off+n)
|
||||
} else {
|
||||
odata := b.data
|
||||
div := 1
|
||||
if b.rLen > batchGrowRec {
|
||||
div = b.rLen / batchGrowRec
|
||||
}
|
||||
b.data = make([]byte, off, off+n+(off-batchHdrLen)/div)
|
||||
copy(b.data, odata)
|
||||
o := len(b.data)
|
||||
if cap(b.data)-o < n {
|
||||
div := 1
|
||||
if len(b.index) > batchGrowRec {
|
||||
div = len(b.index) / batchGrowRec
|
||||
}
|
||||
ndata := make([]byte, o, o+n+o/div)
|
||||
copy(ndata, b.data)
|
||||
b.data = ndata
|
||||
}
|
||||
}
|
||||
|
||||
func (b *Batch) appendRec(kt kType, key, value []byte) {
|
||||
func (b *Batch) appendRec(kt keyType, key, value []byte) {
|
||||
n := 1 + binary.MaxVarintLen32 + len(key)
|
||||
if kt == ktVal {
|
||||
if kt == keyTypeVal {
|
||||
n += binary.MaxVarintLen32 + len(value)
|
||||
}
|
||||
b.grow(n)
|
||||
off := len(b.data)
|
||||
data := b.data[:off+n]
|
||||
data[off] = byte(kt)
|
||||
off += 1
|
||||
off += binary.PutUvarint(data[off:], uint64(len(key)))
|
||||
copy(data[off:], key)
|
||||
off += len(key)
|
||||
if kt == ktVal {
|
||||
off += binary.PutUvarint(data[off:], uint64(len(value)))
|
||||
copy(data[off:], value)
|
||||
off += len(value)
|
||||
index := batchIndex{keyType: kt}
|
||||
o := len(b.data)
|
||||
data := b.data[:o+n]
|
||||
data[o] = byte(kt)
|
||||
o++
|
||||
o += binary.PutUvarint(data[o:], uint64(len(key)))
|
||||
index.keyPos = o
|
||||
index.keyLen = len(key)
|
||||
o += copy(data[o:], key)
|
||||
if kt == keyTypeVal {
|
||||
o += binary.PutUvarint(data[o:], uint64(len(value)))
|
||||
index.valuePos = o
|
||||
index.valueLen = len(value)
|
||||
o += copy(data[o:], value)
|
||||
}
|
||||
b.data = data[:off]
|
||||
b.rLen++
|
||||
// Include 8-byte ikey header
|
||||
b.bLen += len(key) + len(value) + 8
|
||||
b.data = data[:o]
|
||||
b.index = append(b.index, index)
|
||||
b.internalLen += index.keyLen + index.valueLen + 8
|
||||
}
|
||||
|
||||
// Put appends 'put operation' of the given key/value pair to the batch.
|
||||
// It is safe to modify the contents of the argument after Put returns.
|
||||
// It is safe to modify the contents of the argument after Put returns but not
|
||||
// before.
|
||||
func (b *Batch) Put(key, value []byte) {
|
||||
b.appendRec(ktVal, key, value)
|
||||
b.appendRec(keyTypeVal, key, value)
|
||||
}
|
||||
|
||||
// Delete appends 'delete operation' of the given key to the batch.
|
||||
// It is safe to modify the contents of the argument after Delete returns.
|
||||
// It is safe to modify the contents of the argument after Delete returns but
|
||||
// not before.
|
||||
func (b *Batch) Delete(key []byte) {
|
||||
b.appendRec(ktDel, key, nil)
|
||||
b.appendRec(keyTypeDel, key, nil)
|
||||
}
|
||||
|
||||
// Dump dumps batch contents. The returned slice can be loaded into the
|
||||
|
@ -108,7 +130,7 @@ func (b *Batch) Delete(key []byte) {
|
|||
// The returned slice is not its own copy, so the contents should not be
|
||||
// modified.
|
||||
func (b *Batch) Dump() []byte {
|
||||
return b.encode()
|
||||
return b.data
|
||||
}
|
||||
|
||||
// Load loads given slice into the batch. Previous contents of the batch
|
||||
|
@ -116,137 +138,212 @@ func (b *Batch) Dump() []byte {
|
|||
// The given slice will not be copied and will be used as batch buffer, so
|
||||
// it is not safe to modify the contents of the slice.
|
||||
func (b *Batch) Load(data []byte) error {
|
||||
return b.decode(0, data)
|
||||
return b.decode(data, -1)
|
||||
}
|
||||
|
||||
// Replay replays batch contents.
|
||||
func (b *Batch) Replay(r BatchReplay) error {
|
||||
return b.decodeRec(func(i int, kt kType, key, value []byte) {
|
||||
switch kt {
|
||||
case ktVal:
|
||||
r.Put(key, value)
|
||||
case ktDel:
|
||||
r.Delete(key)
|
||||
for _, index := range b.index {
|
||||
switch index.keyType {
|
||||
case keyTypeVal:
|
||||
r.Put(index.k(b.data), index.v(b.data))
|
||||
case keyTypeDel:
|
||||
r.Delete(index.k(b.data))
|
||||
}
|
||||
})
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// Len returns number of records in the batch.
|
||||
func (b *Batch) Len() int {
|
||||
return b.rLen
|
||||
return len(b.index)
|
||||
}
|
||||
|
||||
// Reset resets the batch.
|
||||
func (b *Batch) Reset() {
|
||||
b.data = b.data[:0]
|
||||
b.seq = 0
|
||||
b.rLen = 0
|
||||
b.bLen = 0
|
||||
b.sync = false
|
||||
b.index = b.index[:0]
|
||||
b.internalLen = 0
|
||||
}
|
||||
|
||||
func (b *Batch) init(sync bool) {
|
||||
b.sync = sync
|
||||
func (b *Batch) replayInternal(fn func(i int, kt keyType, k, v []byte) error) error {
|
||||
for i, index := range b.index {
|
||||
if err := fn(i, index.keyType, index.k(b.data), index.v(b.data)); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (b *Batch) append(p *Batch) {
|
||||
if p.rLen > 0 {
|
||||
b.grow(len(p.data) - batchHdrLen)
|
||||
b.data = append(b.data, p.data[batchHdrLen:]...)
|
||||
b.rLen += p.rLen
|
||||
}
|
||||
if p.sync {
|
||||
b.sync = true
|
||||
}
|
||||
}
|
||||
ob := len(b.data)
|
||||
oi := len(b.index)
|
||||
b.data = append(b.data, p.data...)
|
||||
b.index = append(b.index, p.index...)
|
||||
b.internalLen += p.internalLen
|
||||
|
||||
// size returns sums of key/value pair length plus 8-bytes ikey.
|
||||
func (b *Batch) size() int {
|
||||
return b.bLen
|
||||
}
|
||||
|
||||
func (b *Batch) encode() []byte {
|
||||
b.grow(0)
|
||||
binary.LittleEndian.PutUint64(b.data, b.seq)
|
||||
binary.LittleEndian.PutUint32(b.data[8:], uint32(b.rLen))
|
||||
|
||||
return b.data
|
||||
}
|
||||
|
||||
func (b *Batch) decode(prevSeq uint64, data []byte) error {
|
||||
if len(data) < batchHdrLen {
|
||||
return newErrBatchCorrupted("too short")
|
||||
}
|
||||
|
||||
b.seq = binary.LittleEndian.Uint64(data)
|
||||
if b.seq < prevSeq {
|
||||
return newErrBatchCorrupted("invalid sequence number")
|
||||
}
|
||||
b.rLen = int(binary.LittleEndian.Uint32(data[8:]))
|
||||
if b.rLen < 0 {
|
||||
return newErrBatchCorrupted("invalid records length")
|
||||
}
|
||||
// No need to be precise at this point, it won't be used anyway
|
||||
b.bLen = len(data) - batchHdrLen
|
||||
b.data = data
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func (b *Batch) decodeRec(f func(i int, kt kType, key, value []byte)) (err error) {
|
||||
off := batchHdrLen
|
||||
for i := 0; i < b.rLen; i++ {
|
||||
if off >= len(b.data) {
|
||||
return newErrBatchCorrupted("invalid records length")
|
||||
}
|
||||
|
||||
kt := kType(b.data[off])
|
||||
if kt > ktVal {
|
||||
return newErrBatchCorrupted("bad record: invalid type")
|
||||
}
|
||||
off += 1
|
||||
|
||||
x, n := binary.Uvarint(b.data[off:])
|
||||
off += n
|
||||
if n <= 0 || off+int(x) > len(b.data) {
|
||||
return newErrBatchCorrupted("bad record: invalid key length")
|
||||
}
|
||||
key := b.data[off : off+int(x)]
|
||||
off += int(x)
|
||||
var value []byte
|
||||
if kt == ktVal {
|
||||
x, n := binary.Uvarint(b.data[off:])
|
||||
off += n
|
||||
if n <= 0 || off+int(x) > len(b.data) {
|
||||
return newErrBatchCorrupted("bad record: invalid value length")
|
||||
// Updating index offset.
|
||||
if ob != 0 {
|
||||
for ; oi < len(b.index); oi++ {
|
||||
index := &b.index[oi]
|
||||
index.keyPos += ob
|
||||
if index.valueLen != 0 {
|
||||
index.valuePos += ob
|
||||
}
|
||||
value = b.data[off : off+int(x)]
|
||||
off += int(x)
|
||||
}
|
||||
|
||||
f(i, kt, key, value)
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
func (b *Batch) memReplay(to *memdb.DB) error {
|
||||
return b.decodeRec(func(i int, kt kType, key, value []byte) {
|
||||
ikey := newIkey(key, b.seq+uint64(i), kt)
|
||||
to.Put(ikey, value)
|
||||
func (b *Batch) decode(data []byte, expectedLen int) error {
|
||||
b.data = data
|
||||
b.index = b.index[:0]
|
||||
b.internalLen = 0
|
||||
err := decodeBatch(data, func(i int, index batchIndex) error {
|
||||
b.index = append(b.index, index)
|
||||
b.internalLen += index.keyLen + index.valueLen + 8
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
func (b *Batch) memDecodeAndReplay(prevSeq uint64, data []byte, to *memdb.DB) error {
|
||||
if err := b.decode(prevSeq, data); err != nil {
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
return b.memReplay(to)
|
||||
if expectedLen >= 0 && len(b.index) != expectedLen {
|
||||
return newErrBatchCorrupted(fmt.Sprintf("invalid records length: %d vs %d", expectedLen, len(b.index)))
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (b *Batch) revertMemReplay(to *memdb.DB) error {
|
||||
return b.decodeRec(func(i int, kt kType, key, value []byte) {
|
||||
ikey := newIkey(key, b.seq+uint64(i), kt)
|
||||
to.Delete(ikey)
|
||||
})
|
||||
func (b *Batch) putMem(seq uint64, mdb *memdb.DB) error {
|
||||
var ik []byte
|
||||
for i, index := range b.index {
|
||||
ik = makeInternalKey(ik, index.k(b.data), seq+uint64(i), index.keyType)
|
||||
if err := mdb.Put(ik, index.v(b.data)); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (b *Batch) revertMem(seq uint64, mdb *memdb.DB) error {
|
||||
var ik []byte
|
||||
for i, index := range b.index {
|
||||
ik = makeInternalKey(ik, index.k(b.data), seq+uint64(i), index.keyType)
|
||||
if err := mdb.Delete(ik); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func newBatch() interface{} {
|
||||
return &Batch{}
|
||||
}
|
||||
|
||||
func decodeBatch(data []byte, fn func(i int, index batchIndex) error) error {
|
||||
var index batchIndex
|
||||
for i, o := 0, 0; o < len(data); i++ {
|
||||
// Key type.
|
||||
index.keyType = keyType(data[o])
|
||||
if index.keyType > keyTypeVal {
|
||||
return newErrBatchCorrupted(fmt.Sprintf("bad record: invalid type %#x", uint(index.keyType)))
|
||||
}
|
||||
o++
|
||||
|
||||
// Key.
|
||||
x, n := binary.Uvarint(data[o:])
|
||||
o += n
|
||||
if n <= 0 || o+int(x) > len(data) {
|
||||
return newErrBatchCorrupted("bad record: invalid key length")
|
||||
}
|
||||
index.keyPos = o
|
||||
index.keyLen = int(x)
|
||||
o += index.keyLen
|
||||
|
||||
// Value.
|
||||
if index.keyType == keyTypeVal {
|
||||
x, n = binary.Uvarint(data[o:])
|
||||
o += n
|
||||
if n <= 0 || o+int(x) > len(data) {
|
||||
return newErrBatchCorrupted("bad record: invalid value length")
|
||||
}
|
||||
index.valuePos = o
|
||||
index.valueLen = int(x)
|
||||
o += index.valueLen
|
||||
} else {
|
||||
index.valuePos = 0
|
||||
index.valueLen = 0
|
||||
}
|
||||
|
||||
if err := fn(i, index); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func decodeBatchToMem(data []byte, expectSeq uint64, mdb *memdb.DB) (seq uint64, batchLen int, err error) {
|
||||
seq, batchLen, err = decodeBatchHeader(data)
|
||||
if err != nil {
|
||||
return 0, 0, err
|
||||
}
|
||||
if seq < expectSeq {
|
||||
return 0, 0, newErrBatchCorrupted("invalid sequence number")
|
||||
}
|
||||
data = data[batchHeaderLen:]
|
||||
var ik []byte
|
||||
var decodedLen int
|
||||
err = decodeBatch(data, func(i int, index batchIndex) error {
|
||||
if i >= batchLen {
|
||||
return newErrBatchCorrupted("invalid records length")
|
||||
}
|
||||
ik = makeInternalKey(ik, index.k(data), seq+uint64(i), index.keyType)
|
||||
if err := mdb.Put(ik, index.v(data)); err != nil {
|
||||
return err
|
||||
}
|
||||
decodedLen++
|
||||
return nil
|
||||
})
|
||||
if err == nil && decodedLen != batchLen {
|
||||
err = newErrBatchCorrupted(fmt.Sprintf("invalid records length: %d vs %d", batchLen, decodedLen))
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func encodeBatchHeader(dst []byte, seq uint64, batchLen int) []byte {
|
||||
dst = ensureBuffer(dst, batchHeaderLen)
|
||||
binary.LittleEndian.PutUint64(dst, seq)
|
||||
binary.LittleEndian.PutUint32(dst[8:], uint32(batchLen))
|
||||
return dst
|
||||
}
|
||||
|
||||
func decodeBatchHeader(data []byte) (seq uint64, batchLen int, err error) {
|
||||
if len(data) < batchHeaderLen {
|
||||
return 0, 0, newErrBatchCorrupted("too short")
|
||||
}
|
||||
|
||||
seq = binary.LittleEndian.Uint64(data)
|
||||
batchLen = int(binary.LittleEndian.Uint32(data[8:]))
|
||||
if batchLen < 0 {
|
||||
return 0, 0, newErrBatchCorrupted("invalid records length")
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func batchesLen(batches []*Batch) int {
|
||||
batchLen := 0
|
||||
for _, batch := range batches {
|
||||
batchLen += batch.Len()
|
||||
}
|
||||
return batchLen
|
||||
}
|
||||
|
||||
func writeBatchesWithHeader(wr io.Writer, batches []*Batch, seq uint64) error {
|
||||
if _, err := wr.Write(encodeBatchHeader(nil, seq, batchesLen(batches))); err != nil {
|
||||
return err
|
||||
}
|
||||
for _, batch := range batches {
|
||||
if _, err := wr.Write(batch.data); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
|
|
@ -8,113 +8,140 @@ package leveldb
|
|||
|
||||
import (
|
||||
"bytes"
|
||||
"fmt"
|
||||
"testing"
|
||||
"testing/quick"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/comparer"
|
||||
"github.com/syndtr/goleveldb/leveldb/memdb"
|
||||
"github.com/syndtr/goleveldb/leveldb/testutil"
|
||||
)
|
||||
|
||||
type tbRec struct {
|
||||
kt kType
|
||||
key, value []byte
|
||||
func TestBatchHeader(t *testing.T) {
|
||||
f := func(seq uint64, length uint32) bool {
|
||||
encoded := encodeBatchHeader(nil, seq, int(length))
|
||||
decSeq, decLength, err := decodeBatchHeader(encoded)
|
||||
return err == nil && decSeq == seq && decLength == int(length)
|
||||
}
|
||||
config := &quick.Config{
|
||||
Rand: testutil.NewRand(),
|
||||
}
|
||||
if err := quick.Check(f, config); err != nil {
|
||||
t.Error(err)
|
||||
}
|
||||
}
|
||||
|
||||
type testBatch struct {
|
||||
rec []*tbRec
|
||||
type batchKV struct {
|
||||
kt keyType
|
||||
k, v []byte
|
||||
}
|
||||
|
||||
func (p *testBatch) Put(key, value []byte) {
|
||||
p.rec = append(p.rec, &tbRec{ktVal, key, value})
|
||||
}
|
||||
|
||||
func (p *testBatch) Delete(key []byte) {
|
||||
p.rec = append(p.rec, &tbRec{ktDel, key, nil})
|
||||
}
|
||||
|
||||
func compareBatch(t *testing.T, b1, b2 *Batch) {
|
||||
if b1.seq != b2.seq {
|
||||
t.Errorf("invalid seq number want %d, got %d", b1.seq, b2.seq)
|
||||
}
|
||||
if b1.Len() != b2.Len() {
|
||||
t.Fatalf("invalid record length want %d, got %d", b1.Len(), b2.Len())
|
||||
}
|
||||
p1, p2 := new(testBatch), new(testBatch)
|
||||
err := b1.Replay(p1)
|
||||
if err != nil {
|
||||
t.Fatal("error when replaying batch 1: ", err)
|
||||
}
|
||||
err = b2.Replay(p2)
|
||||
if err != nil {
|
||||
t.Fatal("error when replaying batch 2: ", err)
|
||||
}
|
||||
for i := range p1.rec {
|
||||
r1, r2 := p1.rec[i], p2.rec[i]
|
||||
if r1.kt != r2.kt {
|
||||
t.Errorf("invalid type on record '%d' want %d, got %d", i, r1.kt, r2.kt)
|
||||
func TestBatch(t *testing.T) {
|
||||
var (
|
||||
kvs []batchKV
|
||||
internalLen int
|
||||
)
|
||||
batch := new(Batch)
|
||||
rbatch := new(Batch)
|
||||
abatch := new(Batch)
|
||||
testBatch := func(i int, kt keyType, k, v []byte) error {
|
||||
kv := kvs[i]
|
||||
if kv.kt != kt {
|
||||
return fmt.Errorf("invalid key type, index=%d: %d vs %d", i, kv.kt, kt)
|
||||
}
|
||||
if !bytes.Equal(r1.key, r2.key) {
|
||||
t.Errorf("invalid key on record '%d' want %s, got %s", i, string(r1.key), string(r2.key))
|
||||
if !bytes.Equal(kv.k, k) {
|
||||
return fmt.Errorf("invalid key, index=%d", i)
|
||||
}
|
||||
if r1.kt == ktVal {
|
||||
if !bytes.Equal(r1.value, r2.value) {
|
||||
t.Errorf("invalid value on record '%d' want %s, got %s", i, string(r1.value), string(r2.value))
|
||||
if !bytes.Equal(kv.v, v) {
|
||||
return fmt.Errorf("invalid value, index=%d", i)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
f := func(ktr uint8, k, v []byte) bool {
|
||||
kt := keyType(ktr % 2)
|
||||
if kt == keyTypeVal {
|
||||
batch.Put(k, v)
|
||||
rbatch.Put(k, v)
|
||||
kvs = append(kvs, batchKV{kt: kt, k: k, v: v})
|
||||
internalLen += len(k) + len(v) + 8
|
||||
} else {
|
||||
batch.Delete(k)
|
||||
rbatch.Delete(k)
|
||||
kvs = append(kvs, batchKV{kt: kt, k: k})
|
||||
internalLen += len(k) + 8
|
||||
}
|
||||
if batch.Len() != len(kvs) {
|
||||
t.Logf("batch.Len: %d vs %d", len(kvs), batch.Len())
|
||||
return false
|
||||
}
|
||||
if batch.internalLen != internalLen {
|
||||
t.Logf("abatch.internalLen: %d vs %d", internalLen, batch.internalLen)
|
||||
return false
|
||||
}
|
||||
if len(kvs)%1000 == 0 {
|
||||
if err := batch.replayInternal(testBatch); err != nil {
|
||||
t.Logf("batch.replayInternal: %v", err)
|
||||
return false
|
||||
}
|
||||
|
||||
abatch.append(rbatch)
|
||||
rbatch.Reset()
|
||||
if abatch.Len() != len(kvs) {
|
||||
t.Logf("abatch.Len: %d vs %d", len(kvs), abatch.Len())
|
||||
return false
|
||||
}
|
||||
if abatch.internalLen != internalLen {
|
||||
t.Logf("abatch.internalLen: %d vs %d", internalLen, abatch.internalLen)
|
||||
return false
|
||||
}
|
||||
if err := abatch.replayInternal(testBatch); err != nil {
|
||||
t.Logf("abatch.replayInternal: %v", err)
|
||||
return false
|
||||
}
|
||||
|
||||
nbatch := new(Batch)
|
||||
if err := nbatch.Load(batch.Dump()); err != nil {
|
||||
t.Logf("nbatch.Load: %v", err)
|
||||
return false
|
||||
}
|
||||
if nbatch.Len() != len(kvs) {
|
||||
t.Logf("nbatch.Len: %d vs %d", len(kvs), nbatch.Len())
|
||||
return false
|
||||
}
|
||||
if nbatch.internalLen != internalLen {
|
||||
t.Logf("nbatch.internalLen: %d vs %d", internalLen, nbatch.internalLen)
|
||||
return false
|
||||
}
|
||||
if err := nbatch.replayInternal(testBatch); err != nil {
|
||||
t.Logf("nbatch.replayInternal: %v", err)
|
||||
return false
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestBatch_EncodeDecode(t *testing.T) {
|
||||
b1 := new(Batch)
|
||||
b1.seq = 10009
|
||||
b1.Put([]byte("key1"), []byte("value1"))
|
||||
b1.Put([]byte("key2"), []byte("value2"))
|
||||
b1.Delete([]byte("key1"))
|
||||
b1.Put([]byte("k"), []byte(""))
|
||||
b1.Put([]byte("zzzzzzzzzzz"), []byte("zzzzzzzzzzzzzzzzzzzzzzzz"))
|
||||
b1.Delete([]byte("key10000"))
|
||||
b1.Delete([]byte("k"))
|
||||
buf := b1.encode()
|
||||
b2 := new(Batch)
|
||||
err := b2.decode(0, buf)
|
||||
if err != nil {
|
||||
t.Error("error when decoding batch: ", err)
|
||||
}
|
||||
compareBatch(t, b1, b2)
|
||||
}
|
||||
|
||||
func TestBatch_Append(t *testing.T) {
|
||||
b1 := new(Batch)
|
||||
b1.seq = 10009
|
||||
b1.Put([]byte("key1"), []byte("value1"))
|
||||
b1.Put([]byte("key2"), []byte("value2"))
|
||||
b1.Delete([]byte("key1"))
|
||||
b1.Put([]byte("foo"), []byte("foovalue"))
|
||||
b1.Put([]byte("bar"), []byte("barvalue"))
|
||||
b2a := new(Batch)
|
||||
b2a.seq = 10009
|
||||
b2a.Put([]byte("key1"), []byte("value1"))
|
||||
b2a.Put([]byte("key2"), []byte("value2"))
|
||||
b2a.Delete([]byte("key1"))
|
||||
b2b := new(Batch)
|
||||
b2b.Put([]byte("foo"), []byte("foovalue"))
|
||||
b2b.Put([]byte("bar"), []byte("barvalue"))
|
||||
b2a.append(b2b)
|
||||
compareBatch(t, b1, b2a)
|
||||
}
|
||||
|
||||
func TestBatch_Size(t *testing.T) {
|
||||
b := new(Batch)
|
||||
for i := 0; i < 2; i++ {
|
||||
b.Put([]byte("key1"), []byte("value1"))
|
||||
b.Put([]byte("key2"), []byte("value2"))
|
||||
b.Delete([]byte("key1"))
|
||||
b.Put([]byte("foo"), []byte("foovalue"))
|
||||
b.Put([]byte("bar"), []byte("barvalue"))
|
||||
mem := memdb.New(&iComparer{comparer.DefaultComparer}, 0)
|
||||
b.memReplay(mem)
|
||||
if b.size() != mem.Size() {
|
||||
t.Errorf("invalid batch size calculation, want=%d got=%d", mem.Size(), b.size())
|
||||
if len(kvs)%10000 == 0 {
|
||||
nbatch := new(Batch)
|
||||
if err := batch.Replay(nbatch); err != nil {
|
||||
t.Logf("batch.Replay: %v", err)
|
||||
return false
|
||||
}
|
||||
if nbatch.Len() != len(kvs) {
|
||||
t.Logf("nbatch.Len: %d vs %d", len(kvs), nbatch.Len())
|
||||
return false
|
||||
}
|
||||
if nbatch.internalLen != internalLen {
|
||||
t.Logf("nbatch.internalLen: %d vs %d", internalLen, nbatch.internalLen)
|
||||
return false
|
||||
}
|
||||
if err := nbatch.replayInternal(testBatch); err != nil {
|
||||
t.Logf("nbatch.replayInternal: %v", err)
|
||||
return false
|
||||
}
|
||||
}
|
||||
b.Reset()
|
||||
return true
|
||||
}
|
||||
config := &quick.Config{
|
||||
MaxCount: 40000,
|
||||
Rand: testutil.NewRand(),
|
||||
}
|
||||
if err := quick.Check(f, config); err != nil {
|
||||
t.Error(err)
|
||||
}
|
||||
t.Logf("length=%d internalLen=%d", len(kvs), internalLen)
|
||||
}
|
||||
|
|
|
@ -1,58 +0,0 @@
|
|||
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
|
||||
// All rights reserved.
|
||||
//
|
||||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file.
|
||||
|
||||
// +build !go1.2
|
||||
|
||||
package leveldb
|
||||
|
||||
import (
|
||||
"sync/atomic"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func BenchmarkDBReadConcurrent(b *testing.B) {
|
||||
p := openDBBench(b, false)
|
||||
p.populate(b.N)
|
||||
p.fill()
|
||||
p.gc()
|
||||
defer p.close()
|
||||
|
||||
b.ResetTimer()
|
||||
b.SetBytes(116)
|
||||
|
||||
b.RunParallel(func(pb *testing.PB) {
|
||||
iter := p.newIter()
|
||||
defer iter.Release()
|
||||
for pb.Next() && iter.Next() {
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
func BenchmarkDBReadConcurrent2(b *testing.B) {
|
||||
p := openDBBench(b, false)
|
||||
p.populate(b.N)
|
||||
p.fill()
|
||||
p.gc()
|
||||
defer p.close()
|
||||
|
||||
b.ResetTimer()
|
||||
b.SetBytes(116)
|
||||
|
||||
var dir uint32
|
||||
b.RunParallel(func(pb *testing.PB) {
|
||||
iter := p.newIter()
|
||||
defer iter.Release()
|
||||
if atomic.AddUint32(&dir, 1)%2 == 0 {
|
||||
for pb.Next() && iter.Next() {
|
||||
}
|
||||
} else {
|
||||
if pb.Next() && iter.Last() {
|
||||
for pb.Next() && iter.Prev() {
|
||||
}
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
|
@ -13,6 +13,7 @@ import (
|
|||
"os"
|
||||
"path/filepath"
|
||||
"runtime"
|
||||
"sync/atomic"
|
||||
"testing"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/iterator"
|
||||
|
@ -90,7 +91,7 @@ func openDBBench(b *testing.B, noCompress bool) *dbBench {
|
|||
ro: &opt.ReadOptions{},
|
||||
wo: &opt.WriteOptions{},
|
||||
}
|
||||
p.stor, err = storage.OpenFile(benchDB)
|
||||
p.stor, err = storage.OpenFile(benchDB, false)
|
||||
if err != nil {
|
||||
b.Fatal("cannot open stor: ", err)
|
||||
}
|
||||
|
@ -103,7 +104,6 @@ func openDBBench(b *testing.B, noCompress bool) *dbBench {
|
|||
b.Fatal("cannot open db: ", err)
|
||||
}
|
||||
|
||||
runtime.GOMAXPROCS(runtime.NumCPU())
|
||||
return p
|
||||
}
|
||||
|
||||
|
@ -259,7 +259,6 @@ func (p *dbBench) close() {
|
|||
p.keys = nil
|
||||
p.values = nil
|
||||
runtime.GC()
|
||||
runtime.GOMAXPROCS(1)
|
||||
}
|
||||
|
||||
func BenchmarkDBWrite(b *testing.B) {
|
||||
|
@ -462,3 +461,47 @@ func BenchmarkDBGetRandom(b *testing.B) {
|
|||
p.gets()
|
||||
p.close()
|
||||
}
|
||||
|
||||
func BenchmarkDBReadConcurrent(b *testing.B) {
|
||||
p := openDBBench(b, false)
|
||||
p.populate(b.N)
|
||||
p.fill()
|
||||
p.gc()
|
||||
defer p.close()
|
||||
|
||||
b.ResetTimer()
|
||||
b.SetBytes(116)
|
||||
|
||||
b.RunParallel(func(pb *testing.PB) {
|
||||
iter := p.newIter()
|
||||
defer iter.Release()
|
||||
for pb.Next() && iter.Next() {
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
func BenchmarkDBReadConcurrent2(b *testing.B) {
|
||||
p := openDBBench(b, false)
|
||||
p.populate(b.N)
|
||||
p.fill()
|
||||
p.gc()
|
||||
defer p.close()
|
||||
|
||||
b.ResetTimer()
|
||||
b.SetBytes(116)
|
||||
|
||||
var dir uint32
|
||||
b.RunParallel(func(pb *testing.PB) {
|
||||
iter := p.newIter()
|
||||
defer iter.Release()
|
||||
if atomic.AddUint32(&dir, 1)%2 == 0 {
|
||||
for pb.Next() && iter.Next() {
|
||||
}
|
||||
} else {
|
||||
if pb.Next() && iter.Last() {
|
||||
for pb.Next() && iter.Prev() {
|
||||
}
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
|
|
|
@ -4,13 +4,12 @@
|
|||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file.
|
||||
|
||||
// +build !go1.2
|
||||
|
||||
package cache
|
||||
|
||||
import (
|
||||
"math/rand"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func BenchmarkLRUCache(b *testing.B) {
|
|
@ -16,7 +16,7 @@ import (
|
|||
)
|
||||
|
||||
// Cacher provides interface to implements a caching functionality.
|
||||
// An implementation must be goroutine-safe.
|
||||
// An implementation must be safe for concurrent use.
|
||||
type Cacher interface {
|
||||
// Capacity returns cache capacity.
|
||||
Capacity() int
|
||||
|
@ -47,17 +47,21 @@ type Cacher interface {
|
|||
// so the the Release method will be called once object is released.
|
||||
type Value interface{}
|
||||
|
||||
type CacheGetter struct {
|
||||
// NamespaceGetter provides convenient wrapper for namespace.
|
||||
type NamespaceGetter struct {
|
||||
Cache *Cache
|
||||
NS uint64
|
||||
}
|
||||
|
||||
func (g *CacheGetter) Get(key uint64, setFunc func() (size int, value Value)) *Handle {
|
||||
// Get simply calls Cache.Get() method.
|
||||
func (g *NamespaceGetter) Get(key uint64, setFunc func() (size int, value Value)) *Handle {
|
||||
return g.Cache.Get(g.NS, key, setFunc)
|
||||
}
|
||||
|
||||
// The hash tables implementation is based on:
|
||||
// "Dynamic-Sized Nonblocking Hash Tables", by Yujie Liu, Kunlong Zhang, and Michael Spear. ACM Symposium on Principles of Distributed Computing, Jul 2014.
|
||||
// "Dynamic-Sized Nonblocking Hash Tables", by Yujie Liu,
|
||||
// Kunlong Zhang, and Michael Spear.
|
||||
// ACM Symposium on Principles of Distributed Computing, Jul 2014.
|
||||
|
||||
const (
|
||||
mInitialSize = 1 << 4
|
||||
|
@ -507,18 +511,12 @@ func (r *Cache) EvictAll() {
|
|||
}
|
||||
}
|
||||
|
||||
// Close closes the 'cache map' and releases all 'cache node'.
|
||||
// Close closes the 'cache map' and forcefully releases all 'cache node'.
|
||||
func (r *Cache) Close() error {
|
||||
r.mu.Lock()
|
||||
if !r.closed {
|
||||
r.closed = true
|
||||
|
||||
if r.cacher != nil {
|
||||
if err := r.cacher.Close(); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
|
||||
h := (*mNode)(r.mHead)
|
||||
h.initBuckets()
|
||||
|
||||
|
@ -537,10 +535,37 @@ func (r *Cache) Close() error {
|
|||
for _, f := range n.onDel {
|
||||
f()
|
||||
}
|
||||
n.onDel = nil
|
||||
}
|
||||
}
|
||||
}
|
||||
r.mu.Unlock()
|
||||
|
||||
// Avoid deadlock.
|
||||
if r.cacher != nil {
|
||||
if err := r.cacher.Close(); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// CloseWeak closes the 'cache map' and evict all 'cache node' from cacher, but
|
||||
// unlike Close it doesn't forcefully releases 'cache node'.
|
||||
func (r *Cache) CloseWeak() error {
|
||||
r.mu.Lock()
|
||||
if !r.closed {
|
||||
r.closed = true
|
||||
}
|
||||
r.mu.Unlock()
|
||||
|
||||
// Avoid deadlock.
|
||||
if r.cacher != nil {
|
||||
r.cacher.EvictAll()
|
||||
if err := r.cacher.Close(); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
|
@ -610,10 +635,12 @@ func (n *Node) unrefLocked() {
|
|||
}
|
||||
}
|
||||
|
||||
// Handle is a 'cache handle' of a 'cache node'.
|
||||
type Handle struct {
|
||||
n unsafe.Pointer // *Node
|
||||
}
|
||||
|
||||
// Value returns the value of the 'cache node'.
|
||||
func (h *Handle) Value() Value {
|
||||
n := (*Node)(atomic.LoadPointer(&h.n))
|
||||
if n != nil {
|
||||
|
@ -622,6 +649,8 @@ func (h *Handle) Value() Value {
|
|||
return nil
|
||||
}
|
||||
|
||||
// Release releases this 'cache handle'.
|
||||
// It is safe to call release multiple times.
|
||||
func (h *Handle) Release() {
|
||||
nPtr := atomic.LoadPointer(&h.n)
|
||||
if nPtr != nil && atomic.CompareAndSwapPointer(&h.n, nPtr, nil) {
|
||||
|
|
|
@ -45,20 +45,29 @@ func set(c *Cache, ns, key uint64, value Value, charge int, relf func()) *Handle
|
|||
return c.Get(ns, key, func() (int, Value) {
|
||||
if relf != nil {
|
||||
return charge, releaserFunc{relf, value}
|
||||
} else {
|
||||
return charge, value
|
||||
}
|
||||
return charge, value
|
||||
})
|
||||
}
|
||||
|
||||
type cacheMapTestParams struct {
|
||||
nobjects, nhandles, concurrent, repeat int
|
||||
}
|
||||
|
||||
func TestCacheMap(t *testing.T) {
|
||||
runtime.GOMAXPROCS(runtime.NumCPU())
|
||||
|
||||
nsx := []struct {
|
||||
nobjects, nhandles, concurrent, repeat int
|
||||
}{
|
||||
{10000, 400, 50, 3},
|
||||
{100000, 1000, 100, 10},
|
||||
var params []cacheMapTestParams
|
||||
if testing.Short() {
|
||||
params = []cacheMapTestParams{
|
||||
{1000, 100, 20, 3},
|
||||
{10000, 300, 50, 10},
|
||||
}
|
||||
} else {
|
||||
params = []cacheMapTestParams{
|
||||
{10000, 400, 50, 3},
|
||||
{100000, 1000, 100, 10},
|
||||
}
|
||||
}
|
||||
|
||||
var (
|
||||
|
@ -66,7 +75,7 @@ func TestCacheMap(t *testing.T) {
|
|||
handles [][]unsafe.Pointer
|
||||
)
|
||||
|
||||
for _, x := range nsx {
|
||||
for _, x := range params {
|
||||
objects = append(objects, make([]int32o, x.nobjects))
|
||||
handles = append(handles, make([]unsafe.Pointer, x.nhandles))
|
||||
}
|
||||
|
@ -76,7 +85,7 @@ func TestCacheMap(t *testing.T) {
|
|||
wg := new(sync.WaitGroup)
|
||||
var done int32
|
||||
|
||||
for ns, x := range nsx {
|
||||
for ns, x := range params {
|
||||
for i := 0; i < x.concurrent; i++ {
|
||||
wg.Add(1)
|
||||
go func(ns, i, repeat int, objects []int32o, handles []unsafe.Pointer) {
|
||||
|
|
|
@@ -6,7 +6,9 @@
package leveldb

import "github.com/syndtr/goleveldb/leveldb/comparer"
import (
"github.com/syndtr/goleveldb/leveldb/comparer"
)

type iComparer struct {
ucmp comparer.Comparer

@@ -33,43 +35,33 @@ func (icmp *iComparer) Name() string {
}

func (icmp *iComparer) Compare(a, b []byte) int {
x := icmp.ucmp.Compare(iKey(a).ukey(), iKey(b).ukey())
x := icmp.uCompare(internalKey(a).ukey(), internalKey(b).ukey())
if x == 0 {
if m, n := iKey(a).num(), iKey(b).num(); m > n {
x = -1
if m, n := internalKey(a).num(), internalKey(b).num(); m > n {
return -1
} else if m < n {
x = 1
return 1
}
}
return x
}

func (icmp *iComparer) Separator(dst, a, b []byte) []byte {
ua, ub := iKey(a).ukey(), iKey(b).ukey()
dst = icmp.ucmp.Separator(dst, ua, ub)
if dst == nil {
return nil
ua, ub := internalKey(a).ukey(), internalKey(b).ukey()
dst = icmp.uSeparator(dst, ua, ub)
if dst != nil && len(dst) < len(ua) && icmp.uCompare(ua, dst) < 0 {
// Append earliest possible number.
return append(dst, keyMaxNumBytes...)
}
if len(dst) < len(ua) && icmp.uCompare(ua, dst) < 0 {
dst = append(dst, kMaxNumBytes...)
} else {
// Did not close possibilities that n maybe longer than len(ub).
dst = append(dst, a[len(a)-8:]...)
}
return dst
return nil
}

func (icmp *iComparer) Successor(dst, b []byte) []byte {
ub := iKey(b).ukey()
dst = icmp.ucmp.Successor(dst, ub)
if dst == nil {
return nil
ub := internalKey(b).ukey()
dst = icmp.uSuccessor(dst, ub)
if dst != nil && len(dst) < len(ub) && icmp.uCompare(ub, dst) < 0 {
// Append earliest possible number.
return append(dst, keyMaxNumBytes...)
}
if len(dst) < len(ub) && icmp.uCompare(ub, dst) < 0 {
dst = append(dst, kMaxNumBytes...)
} else {
// Did not close possibilities that n maybe longer than len(ub).
dst = append(dst, b[len(b)-8:]...)
}
return dst
return nil
}

@@ -9,12 +9,13 @@ package leveldb
import (
"bytes"
"fmt"
"github.com/syndtr/goleveldb/leveldb/filter"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
"io"
"math/rand"
"testing"

"github.com/syndtr/goleveldb/leveldb/filter"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
)

const ctValSize = 1000

@@ -99,19 +100,17 @@ func (h *dbCorruptHarness) corrupt(ft storage.FileType, fi, offset, n int) {
p := &h.dbHarness
t := p.t

ff, _ := p.stor.GetFiles(ft)
sff := files(ff)
sff.sort()
fds, _ := p.stor.List(ft)
sortFds(fds)
if fi < 0 {
fi = len(sff) - 1
fi = len(fds) - 1
}
if fi >= len(sff) {
if fi >= len(fds) {
t.Fatalf("no such file with type %q with index %d", ft, fi)
}

file := sff[fi]

r, err := file.Open()
fd := fds[fi]
r, err := h.stor.Open(fd)
if err != nil {
t.Fatal("cannot open file: ", err)
}

@@ -149,11 +148,11 @@ func (h *dbCorruptHarness) corrupt(ft storage.FileType, fi, offset, n int) {
buf[offset+i] ^= 0x80
}

err = file.Remove()
err = h.stor.Remove(fd)
if err != nil {
t.Fatal("cannot remove old file: ", err)
}
w, err := file.Create()
w, err := h.stor.Create(fd)
if err != nil {
t.Fatal("cannot create new file: ", err)
}

@@ -165,25 +164,37 @@ func (h *dbCorruptHarness) corrupt(ft storage.FileType, fi, offset, n int) {
}

func (h *dbCorruptHarness) removeAll(ft storage.FileType) {
ff, err := h.stor.GetFiles(ft)
fds, err := h.stor.List(ft)
if err != nil {
h.t.Fatal("get files: ", err)
}
for _, f := range ff {
if err := f.Remove(); err != nil {
for _, fd := range fds {
if err := h.stor.Remove(fd); err != nil {
h.t.Error("remove file: ", err)
}
}
}

func (h *dbCorruptHarness) forceRemoveAll(ft storage.FileType) {
fds, err := h.stor.List(ft)
if err != nil {
h.t.Fatal("get files: ", err)
}
for _, fd := range fds {
if err := h.stor.ForceRemove(fd); err != nil {
h.t.Error("remove file: ", err)
}
}
}

func (h *dbCorruptHarness) removeOne(ft storage.FileType) {
ff, err := h.stor.GetFiles(ft)
fds, err := h.stor.List(ft)
if err != nil {
h.t.Fatal("get files: ", err)
}
f := ff[rand.Intn(len(ff))]
h.t.Logf("removing file @%d", f.Num())
if err := f.Remove(); err != nil {
fd := fds[rand.Intn(len(fds))]
h.t.Logf("removing file @%d", fd.Num)
if err := h.stor.Remove(fd); err != nil {
h.t.Error("remove file: ", err)
}
}

@@ -221,6 +232,7 @@ func (h *dbCorruptHarness) check(min, max int) {

func TestCorruptDB_Journal(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.build(100)
h.check(100, 100)

@@ -230,12 +242,11 @@ func TestCorruptDB_Journal(t *testing.T) {

h.openDB()
h.check(36, 36)

h.close()
}

func TestCorruptDB_Table(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.build(100)
h.compactMem()

@@ -246,12 +257,11 @@ func TestCorruptDB_Table(t *testing.T) {

h.openDB()
h.check(99, 99)

h.close()
}

func TestCorruptDB_TableIndex(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.build(10000)
h.compactMem()

@@ -260,8 +270,6 @@ func TestCorruptDB_TableIndex(t *testing.T) {

h.openDB()
h.check(5000, 9999)

h.close()
}

func TestCorruptDB_MissingManifest(t *testing.T) {

@@ -271,6 +279,7 @@ func TestCorruptDB_MissingManifest(t *testing.T) {
Strict: opt.StrictJournalChecksum,
WriteBuffer: 1000 * 60,
})
defer h.close()

h.build(1000)
h.compactMem()

@@ -286,10 +295,8 @@ func TestCorruptDB_MissingManifest(t *testing.T) {
h.compactMem()
h.closeDB()

h.stor.SetIgnoreOpenErr(storage.TypeManifest)
h.removeAll(storage.TypeManifest)
h.forceRemoveAll(storage.TypeManifest)
h.openAssert(false)
h.stor.SetIgnoreOpenErr(0)

h.recover()
h.check(1000, 1000)

@@ -300,12 +307,11 @@ func TestCorruptDB_MissingManifest(t *testing.T) {

h.recover()
h.check(1000, 1000)

h.close()
}

func TestCorruptDB_SequenceNumberRecovery(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.put("foo", "v1")
h.put("foo", "v2")

@@ -321,12 +327,11 @@ func TestCorruptDB_SequenceNumberRecovery(t *testing.T) {

h.reopenDB()
h.getVal("foo", "v6")

h.close()
}

func TestCorruptDB_SequenceNumberRecoveryTable(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.put("foo", "v1")
h.put("foo", "v2")

@@ -344,12 +349,11 @@ func TestCorruptDB_SequenceNumberRecoveryTable(t *testing.T) {

h.reopenDB()
h.getVal("foo", "v6")

h.close()
}

func TestCorruptDB_CorruptedManifest(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.put("foo", "hello")
h.compactMem()

@@ -360,12 +364,11 @@ func TestCorruptDB_CorruptedManifest(t *testing.T) {

h.recover()
h.getVal("foo", "hello")

h.close()
}

func TestCorruptDB_CompactionInputError(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.build(10)
h.compactMem()

@@ -377,12 +380,11 @@ func TestCorruptDB_CompactionInputError(t *testing.T) {

h.build(10000)
h.check(10000, 10000)

h.close()
}

func TestCorruptDB_UnrelatedKeys(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.build(10)
h.compactMem()

@@ -394,12 +396,11 @@ func TestCorruptDB_UnrelatedKeys(t *testing.T) {
h.getVal(string(tkey(1000)), string(tval(1000, ctValSize)))
h.compactMem()
h.getVal(string(tkey(1000)), string(tval(1000, ctValSize)))

h.close()
}

func TestCorruptDB_Level0NewerFileHasOlderSeqnum(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.put("a", "v1")
h.put("b", "v1")

@@ -421,12 +422,11 @@ func TestCorruptDB_Level0NewerFileHasOlderSeqnum(t *testing.T) {
h.getVal("b", "v3")
h.getVal("c", "v0")
h.getVal("d", "v0")

h.close()
}

func TestCorruptDB_RecoverInvalidSeq_Issue53(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.put("a", "v1")
h.put("b", "v1")

@@ -448,12 +448,11 @@ func TestCorruptDB_RecoverInvalidSeq_Issue53(t *testing.T) {
h.getVal("b", "v3")
h.getVal("c", "v0")
h.getVal("d", "v0")

h.close()
}

func TestCorruptDB_MissingTableFiles(t *testing.T) {
h := newDbCorruptHarness(t)
defer h.close()

h.put("a", "v1")
h.put("b", "v1")

@@ -467,8 +466,6 @@ func TestCorruptDB_MissingTableFiles(t *testing.T) {

h.removeOne(storage.TypeTable)
h.openAssert(false)

h.close()
}

func TestCorruptDB_RecoverTable(t *testing.T) {

@@ -477,6 +474,7 @@ func TestCorruptDB_RecoverTable(t *testing.T) {
CompactionTableSize: 90 * opt.KiB,
Filter: filter.NewBloomFilter(10),
})
defer h.close()

h.build(1000)
h.compactMem()

@@ -495,6 +493,4 @@ func TestCorruptDB_RecoverTable(t *testing.T) {
t.Errorf("invalid seq, want=%d got=%d", seq, h.db.seq)
}
h.check(985, 985)

h.close()
}

File diff suppressed because it is too large

@@ -11,109 +11,79 @@ import (
"time"

"github.com/syndtr/goleveldb/leveldb/errors"
"github.com/syndtr/goleveldb/leveldb/memdb"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/storage"
)

var (
errCompactionTransactExiting = errors.New("leveldb: compaction transact exiting")
)

type cStats struct {
sync.Mutex
type cStat struct {
duration time.Duration
read uint64
write uint64
read int64
write int64
}

func (p *cStats) add(n *cStatsStaging) {
p.Lock()
func (p *cStat) add(n *cStatStaging) {
p.duration += n.duration
p.read += n.read
p.write += n.write
p.Unlock()
}

func (p *cStats) get() (duration time.Duration, read, write uint64) {
p.Lock()
defer p.Unlock()
func (p *cStat) get() (duration time.Duration, read, write int64) {
return p.duration, p.read, p.write
}

type cStatsStaging struct {
type cStatStaging struct {
start time.Time
duration time.Duration
on bool
read uint64
write uint64
read int64
write int64
}

func (p *cStatsStaging) startTimer() {
func (p *cStatStaging) startTimer() {
if !p.on {
p.start = time.Now()
p.on = true
}
}

func (p *cStatsStaging) stopTimer() {
func (p *cStatStaging) stopTimer() {
if p.on {
p.duration += time.Since(p.start)
p.on = false
}
}

type cMem struct {
s *session
level int
rec *sessionRecord
type cStats struct {
lk sync.Mutex
stats []cStat
}

func newCMem(s *session) *cMem {
return &cMem{s: s, rec: &sessionRecord{numLevel: s.o.GetNumLevel()}}
}

func (c *cMem) flush(mem *memdb.DB, level int) error {
s := c.s

// Write memdb to table.
iter := mem.NewIterator(nil)
defer iter.Release()
t, n, err := s.tops.createFrom(iter)
if err != nil {
return err
func (p *cStats) addStat(level int, n *cStatStaging) {
p.lk.Lock()
if level >= len(p.stats) {
newStats := make([]cStat, level+1)
copy(newStats, p.stats)
p.stats = newStats
}
p.stats[level].add(n)
p.lk.Unlock()
}

// Pick level.
if level < 0 {
v := s.version()
level = v.pickLevel(t.imin.ukey(), t.imax.ukey())
v.release()
func (p *cStats) getStat(level int) (duration time.Duration, read, write int64) {
p.lk.Lock()
defer p.lk.Unlock()
if level < len(p.stats) {
return p.stats[level].get()
}
c.rec.addTableFile(level, t)

s.logf("mem@flush created L%d@%d N·%d S·%s %q:%q", level, t.file.Num(), n, shortenb(int(t.size)), t.imin, t.imax)

c.level = level
return nil
}

func (c *cMem) reset() {
c.rec = &sessionRecord{numLevel: c.s.o.GetNumLevel()}
}

func (c *cMem) commit(journal, seq uint64) error {
c.rec.setJournalNum(journal)
c.rec.setSeqNum(seq)

// Commit changes.
return c.s.commit(c.rec)
return
}

func (db *DB) compactionError() {
var (
err error
wlocked bool
)
var err error
noerr:
// No error.
for {

@@ -121,12 +91,12 @@ noerr:
case err = <-db.compErrSetC:
switch {
case err == nil:
case errors.IsCorrupted(err):
case err == ErrReadOnly, errors.IsCorrupted(err):
goto hasperr
default:
goto haserr
}
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}

@@ -139,11 +109,11 @@ haserr:
switch {
case err == nil:
goto noerr
case errors.IsCorrupted(err):
case err == ErrReadOnly, errors.IsCorrupted(err):
goto hasperr
default:
}
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}

@@ -155,9 +125,9 @@ hasperr:
case db.compPerErrC <- err:
case db.writeLockC <- struct{}{}:
// Hold write lock, so that write won't pass-through.
wlocked = true
case _, _ = <-db.closeC:
if wlocked {
db.compWriteLocking = true
case <-db.closeC:
if db.compWriteLocking {
// We should release the lock or Close will hang.
<-db.writeLockC
}

@@ -202,7 +172,7 @@ func (db *DB) compactionTransact(name string, t compactionTransactInterface) {
disableBackoff = db.s.o.GetDisableCompactionBackoff()
)
for n := 0; ; n++ {
// Check wether the DB is closed.
// Check whether the DB is closed.
if db.isClosed() {
db.logf("%s exiting", name)
db.compactionExitTransact()

@@ -225,7 +195,7 @@ func (db *DB) compactionTransact(name string, t compactionTransactInterface) {
db.logf("%s exiting (persistent error %q)", name, perr)
db.compactionExitTransact()
}
case _, _ = <-db.closeC:
case <-db.closeC:
db.logf("%s exiting", name)
db.compactionExitTransact()
}

@@ -254,7 +224,7 @@ func (db *DB) compactionTransact(name string, t compactionTransactInterface) {
}
select {
case <-backoffT.C:
case _, _ = <-db.closeC:
case <-db.closeC:
db.logf("%s exiting", name)
db.compactionExitTransact()
}

@@ -286,22 +256,27 @@ func (db *DB) compactionExitTransact() {
panic(errCompactionTransactExiting)
}

func (db *DB) compactionCommit(name string, rec *sessionRecord) {
db.compCommitLk.Lock()
defer db.compCommitLk.Unlock() // Defer is necessary.
db.compactionTransactFunc(name+"@commit", func(cnt *compactionTransactCounter) error {
return db.s.commit(rec)
}, nil)
}

func (db *DB) memCompaction() {
mem := db.getFrozenMem()
if mem == nil {
mdb := db.getFrozenMem()
if mdb == nil {
return
}
defer mem.decref()
defer mdb.decref()

c := newCMem(db.s)
stats := new(cStatsStaging)

db.logf("mem@flush N·%d S·%s", mem.mdb.Len(), shortenb(mem.mdb.Size()))
db.logf("memdb@flush N·%d S·%s", mdb.Len(), shortenb(mdb.Size()))

// Don't compact empty memdb.
if mem.mdb.Len() == 0 {
db.logf("mem@flush skipping")
// drop frozen mem
if mdb.Len() == 0 {
db.logf("memdb@flush skipping")
// drop frozen memdb
db.dropFrozenMem()
return
}

@@ -313,39 +288,48 @@ func (db *DB) memCompaction() {
case <-db.compPerErrC:
close(resumeC)
resumeC = nil
case _, _ = <-db.closeC:
return
case <-db.closeC:
db.compactionExitTransact()
}

db.compactionTransactFunc("mem@flush", func(cnt *compactionTransactCounter) (err error) {
var (
rec = &sessionRecord{}
stats = &cStatStaging{}
flushLevel int
)

// Generate tables.
db.compactionTransactFunc("memdb@flush", func(cnt *compactionTransactCounter) (err error) {
stats.startTimer()
defer stats.stopTimer()
return c.flush(mem.mdb, -1)
flushLevel, err = db.s.flushMemdb(rec, mdb.DB, db.memdbMaxLevel)
stats.stopTimer()
return
}, func() error {
for _, r := range c.rec.addedTables {
db.logf("mem@flush revert @%d", r.num)
f := db.s.getTableFile(r.num)
if err := f.Remove(); err != nil {
for _, r := range rec.addedTables {
db.logf("memdb@flush revert @%d", r.num)
if err := db.s.stor.Remove(storage.FileDesc{Type: storage.TypeTable, Num: r.num}); err != nil {
return err
}
}
return nil
})

db.compactionTransactFunc("mem@commit", func(cnt *compactionTransactCounter) (err error) {
stats.startTimer()
defer stats.stopTimer()
return c.commit(db.journalFile.Num(), db.frozenSeq)
}, nil)
rec.setJournalNum(db.journalFd.Num)
rec.setSeqNum(db.frozenSeq)

db.logf("mem@flush committed F·%d T·%v", len(c.rec.addedTables), stats.duration)
// Commit.
stats.startTimer()
db.compactionCommit("memdb", rec)
stats.stopTimer()

for _, r := range c.rec.addedTables {
db.logf("memdb@flush committed F·%d T·%v", len(rec.addedTables), stats.duration)

for _, r := range rec.addedTables {
stats.write += r.size
}
db.compStats[c.level].add(stats)
db.compStats.addStat(flushLevel, stats)

// Drop frozen mem.
// Drop frozen memdb.
db.dropFrozenMem()

// Resume table compaction.

@@ -353,13 +337,13 @@ func (db *DB) memCompaction() {
select {
case <-resumeC:
close(resumeC)
case _, _ = <-db.closeC:
return
case <-db.closeC:
db.compactionExitTransact()
}
}

// Trigger table compaction.
db.compSendTrigger(db.tcompCmdC)
db.compTrigger(db.tcompCmdC)
}

type tableCompactionBuilder struct {

@@ -367,7 +351,7 @@ type tableCompactionBuilder struct {
s *session
c *compaction
rec *sessionRecord
stat0, stat1 *cStatsStaging
stat0, stat1 *cStatStaging

snapHasLastUkey bool
snapLastUkey []byte

@@ -394,7 +378,7 @@ func (b *tableCompactionBuilder) appendKV(key, value []byte) error {
select {
case ch := <-b.db.tcompPauseC:
b.db.pauseCompaction(ch)
case _, _ = <-b.db.closeC:
case <-b.db.closeC:
b.db.compactionExitTransact()
default:
}

@@ -421,9 +405,9 @@ func (b *tableCompactionBuilder) flush() error {
if err != nil {
return err
}
b.rec.addTableFile(b.c.level+1, t)
b.rec.addTableFile(b.c.sourceLevel+1, t)
b.stat1.write += t.size
b.s.logf("table@build created L%d@%d N·%d S·%s %q:%q", b.c.level+1, t.file.Num(), b.tw.tw.EntriesLen(), shortenb(int(t.size)), t.imin, t.imax)
b.s.logf("table@build created L%d@%d N·%d S·%s %q:%q", b.c.sourceLevel+1, t.fd.Num, b.tw.tw.EntriesLen(), shortenb(int(t.size)), t.imin, t.imax)
b.tw = nil
return nil
}

@@ -468,7 +452,7 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {
}

ikey := iter.Key()
ukey, seq, kt, kerr := parseIkey(ikey)
ukey, seq, kt, kerr := parseInternalKey(ikey)

if kerr == nil {
shouldStop := !resumed && b.c.shouldStopBefore(ikey)

@@ -494,14 +478,14 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {

hasLastUkey = true
lastUkey = append(lastUkey[:0], ukey...)
lastSeq = kMaxSeq
lastSeq = keyMaxSeq
}

switch {
case lastSeq <= b.minSeq:
// Dropped because newer entry for same user key exist
fallthrough // (A)
case kt == ktDel && seq <= b.minSeq && b.c.baseLevelForKey(lastUkey):
case kt == keyTypeDel && seq <= b.minSeq && b.c.baseLevelForKey(lastUkey):
// For this user key:
// (1) there is no data in higher levels
// (2) data in lower levels will have larger seq numbers

@@ -523,7 +507,7 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {
// Don't drop corrupted keys.
hasLastUkey = false
lastUkey = lastUkey[:0]
lastSeq = kMaxSeq
lastSeq = keyMaxSeq
b.kerrCnt++
}

@@ -546,8 +530,7 @@ func (b *tableCompactionBuilder) run(cnt *compactionTransactCounter) error {
func (b *tableCompactionBuilder) revert() error {
for _, at := range b.rec.addedTables {
b.s.logf("table@build revert @%d", at.num)
f := b.s.getTableFile(at.num)
if err := f.Remove(); err != nil {
if err := b.s.stor.Remove(storage.FileDesc{Type: storage.TypeTable, Num: at.num}); err != nil {
return err
}
}

@@ -557,31 +540,29 @@ func (b *tableCompactionBuilder) revert() error {
func (db *DB) tableCompaction(c *compaction, noTrivial bool) {
defer c.release()

rec := &sessionRecord{numLevel: db.s.o.GetNumLevel()}
rec.addCompPtr(c.level, c.imax)
rec := &sessionRecord{}
rec.addCompPtr(c.sourceLevel, c.imax)

if !noTrivial && c.trivial() {
t := c.tables[0][0]
db.logf("table@move L%d@%d -> L%d", c.level, t.file.Num(), c.level+1)
rec.delTable(c.level, t.file.Num())
rec.addTableFile(c.level+1, t)
db.compactionTransactFunc("table@move", func(cnt *compactionTransactCounter) (err error) {
return db.s.commit(rec)
}, nil)
t := c.levels[0][0]
db.logf("table@move L%d@%d -> L%d", c.sourceLevel, t.fd.Num, c.sourceLevel+1)
rec.delTable(c.sourceLevel, t.fd.Num)
rec.addTableFile(c.sourceLevel+1, t)
db.compactionCommit("table-move", rec)
return
}

var stats [2]cStatsStaging
for i, tables := range c.tables {
var stats [2]cStatStaging
for i, tables := range c.levels {
for _, t := range tables {
stats[i].read += t.size
// Insert deleted tables into record
rec.delTable(c.level+i, t.file.Num())
rec.delTable(c.sourceLevel+i, t.fd.Num)
}
}
sourceSize := int(stats[0].read + stats[1].read)
minSeq := db.minSeq()
db.logf("table@compaction L%d·%d -> L%d·%d S·%s Q·%d", c.level, len(c.tables[0]), c.level+1, len(c.tables[1]), shortenb(sourceSize), minSeq)
db.logf("table@compaction L%d·%d -> L%d·%d S·%s Q·%d", c.sourceLevel, len(c.levels[0]), c.sourceLevel+1, len(c.levels[1]), shortenb(sourceSize), minSeq)

b := &tableCompactionBuilder{
db: db,

@@ -591,49 +572,60 @@ func (db *DB) tableCompaction(c *compaction, noTrivial bool) {
stat1: &stats[1],
minSeq: minSeq,
strict: db.s.o.GetStrict(opt.StrictCompaction),
tableSize: db.s.o.GetCompactionTableSize(c.level + 1),
tableSize: db.s.o.GetCompactionTableSize(c.sourceLevel + 1),
}
db.compactionTransact("table@build", b)

// Commit changes
db.compactionTransactFunc("table@commit", func(cnt *compactionTransactCounter) (err error) {
stats[1].startTimer()
defer stats[1].stopTimer()
return db.s.commit(rec)
}, nil)
// Commit.
stats[1].startTimer()
db.compactionCommit("table", rec)
stats[1].stopTimer()

resultSize := int(stats[1].write)
db.logf("table@compaction committed F%s S%s Ke·%d D·%d T·%v", sint(len(rec.addedTables)-len(rec.deletedTables)), sshortenb(resultSize-sourceSize), b.kerrCnt, b.dropCnt, stats[1].duration)

// Save compaction stats
for i := range stats {
db.compStats[c.level+1].add(&stats[i])
db.compStats.addStat(c.sourceLevel+1, &stats[i])
}
}

func (db *DB) tableRangeCompaction(level int, umin, umax []byte) {
func (db *DB) tableRangeCompaction(level int, umin, umax []byte) error {
db.logf("table@compaction range L%d %q:%q", level, umin, umax)

if level >= 0 {
if c := db.s.getCompactionRange(level, umin, umax); c != nil {
if c := db.s.getCompactionRange(level, umin, umax, true); c != nil {
db.tableCompaction(c, true)
}
} else {
v := db.s.version()
m := 1
for i, t := range v.tables[1:] {
if t.overlaps(db.s.icmp, umin, umax, false) {
m = i + 1
}
}
v.release()
// Retry until nothing to compact.
for {
compacted := false

for level := 0; level < m; level++ {
if c := db.s.getCompactionRange(level, umin, umax); c != nil {
db.tableCompaction(c, true)
// Scan for maximum level with overlapped tables.
v := db.s.version()
m := 1
for i := m; i < len(v.levels); i++ {
tables := v.levels[i]
if tables.overlaps(db.s.icmp, umin, umax, false) {
m = i
}
}
v.release()

for level := 0; level < m; level++ {
if c := db.s.getCompactionRange(level, umin, umax, false); c != nil {
db.tableCompaction(c, true)
compacted = true
}
}

if !compacted {
break
}
}
}

return nil
}

func (db *DB) tableAutoCompaction() {

@@ -651,7 +643,7 @@ func (db *DB) tableNeedCompaction() bool {
func (db *DB) pauseCompaction(ch chan<- struct{}) {
select {
case ch <- struct{}{}:
case _, _ = <-db.closeC:
case <-db.closeC:
db.compactionExitTransact()
}
}

@@ -660,11 +652,11 @@ type cCmd interface {
ack(err error)
}

type cIdle struct {
type cAuto struct {
ackC chan<- error
}

func (r cIdle) ack(err error) {
func (r cAuto) ack(err error) {
if r.ackC != nil {
defer func() {
recover()

@@ -688,38 +680,38 @@ func (r cRange) ack(err error) {
}
}

// This will trigger auto compation and/or wait for all compaction to be done.
func (db *DB) compSendIdle(compC chan<- cCmd) (err error) {
// This will trigger auto compaction but will not wait for it.
func (db *DB) compTrigger(compC chan<- cCmd) {
select {
case compC <- cAuto{}:
default:
}
}

// This will trigger auto compaction and/or wait for all compaction to be done.
func (db *DB) compTriggerWait(compC chan<- cCmd) (err error) {
ch := make(chan error)
defer close(ch)
// Send cmd.
select {
case compC <- cIdle{ch}:
case compC <- cAuto{ch}:
case err = <-db.compErrC:
return
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
// Wait cmd.
select {
case err = <-ch:
case err = <-db.compErrC:
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
return err
}

// This will trigger auto compaction but will not wait for it.
func (db *DB) compSendTrigger(compC chan<- cCmd) {
select {
case compC <- cIdle{}:
default:
}
}

// Send range compaction request.
func (db *DB) compSendRange(compC chan<- cCmd, level int, min, max []byte) (err error) {
func (db *DB) compTriggerRange(compC chan<- cCmd, level int, min, max []byte) (err error) {
ch := make(chan error)
defer close(ch)
// Send cmd.

@@ -727,14 +719,14 @@ func (db *DB) compSendRange(compC chan<- cCmd, level int, min, max []byte) (err
case compC <- cRange{level, min, max, ch}:
case err := <-db.compErrC:
return err
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
// Wait cmd.
select {
case err = <-ch:
case err = <-db.compErrC:
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}
return err

@@ -759,14 +751,14 @@ func (db *DB) mCompaction() {
select {
case x = <-db.mcompCmdC:
switch x.(type) {
case cIdle:
case cAuto:
db.memCompaction()
x.ack(nil)
x = nil
default:
panic("leveldb: unknown command")
}
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}

@@ -799,7 +791,7 @@ func (db *DB) tCompaction() {
case ch := <-db.tcompPauseC:
db.pauseCompaction(ch)
continue
case _, _ = <-db.closeC:
case <-db.closeC:
return
default:
}

@@ -814,17 +806,16 @@ func (db *DB) tCompaction() {
case ch := <-db.tcompPauseC:
db.pauseCompaction(ch)
continue
case _, _ = <-db.closeC:
case <-db.closeC:
return
}
}
if x != nil {
switch cmd := x.(type) {
case cIdle:
case cAuto:
ackQ = append(ackQ, x)
case cRange:
db.tableRangeCompaction(cmd.level, cmd.min, cmd.max)
x.ack(nil)
x.ack(db.tableRangeCompaction(cmd.level, cmd.min, cmd.max))
default:
panic("leveldb: unknown command")
}

@@ -19,7 +19,7 @@ import (
)

var (
errInvalidIkey = errors.New("leveldb: Iterator: invalid internal key")
errInvalidInternalKey = errors.New("leveldb: Iterator: invalid internal key")
)

type memdbReleaser struct {

@@ -33,40 +33,50 @@ func (mr *memdbReleaser) Release() {
})
}

func (db *DB) newRawIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
func (db *DB) newRawIterator(auxm *memDB, auxt tFiles, slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
strict := opt.GetStrict(db.s.o.Options, ro, opt.StrictReader)
em, fm := db.getMems()
v := db.s.version()

ti := v.getIterators(slice, ro)
n := len(ti) + 2
i := make([]iterator.Iterator, 0, n)
emi := em.mdb.NewIterator(slice)
emi.SetReleaser(&memdbReleaser{m: em})
i = append(i, emi)
if fm != nil {
fmi := fm.mdb.NewIterator(slice)
fmi.SetReleaser(&memdbReleaser{m: fm})
i = append(i, fmi)
tableIts := v.getIterators(slice, ro)
n := len(tableIts) + len(auxt) + 3
its := make([]iterator.Iterator, 0, n)

if auxm != nil {
ami := auxm.NewIterator(slice)
ami.SetReleaser(&memdbReleaser{m: auxm})
its = append(its, ami)
}
i = append(i, ti...)
strict := opt.GetStrict(db.s.o.Options, ro, opt.StrictReader)
mi := iterator.NewMergedIterator(i, db.s.icmp, strict)
for _, t := range auxt {
its = append(its, v.s.tops.newIterator(t, slice, ro))
}

emi := em.NewIterator(slice)
emi.SetReleaser(&memdbReleaser{m: em})
its = append(its, emi)
if fm != nil {
fmi := fm.NewIterator(slice)
fmi.SetReleaser(&memdbReleaser{m: fm})
its = append(its, fmi)
}
its = append(its, tableIts...)
mi := iterator.NewMergedIterator(its, db.s.icmp, strict)
mi.SetReleaser(&versionReleaser{v: v})
return mi
}

func (db *DB) newIterator(seq uint64, slice *util.Range, ro *opt.ReadOptions) *dbIter {
func (db *DB) newIterator(auxm *memDB, auxt tFiles, seq uint64, slice *util.Range, ro *opt.ReadOptions) *dbIter {
var islice *util.Range
if slice != nil {
islice = &util.Range{}
if slice.Start != nil {
islice.Start = newIkey(slice.Start, kMaxSeq, ktSeek)
islice.Start = makeInternalKey(nil, slice.Start, keyMaxSeq, keyTypeSeek)
}
if slice.Limit != nil {
islice.Limit = newIkey(slice.Limit, kMaxSeq, ktSeek)
islice.Limit = makeInternalKey(nil, slice.Limit, keyMaxSeq, keyTypeSeek)
}
}
rawIter := db.newRawIterator(islice, ro)
rawIter := db.newRawIterator(auxm, auxt, islice, ro)
iter := &dbIter{
db: db,
icmp: db.s.icmp,

@@ -177,7 +187,7 @@ func (i *dbIter) Seek(key []byte) bool {
return false
}

ikey := newIkey(key, i.seq, ktSeek)
ikey := makeInternalKey(nil, key, i.seq, keyTypeSeek)
if i.iter.Seek(ikey) {
i.dir = dirSOI
return i.next()

@@ -189,15 +199,15 @@ func (i *dbIter) Seek(key []byte) bool {

func (i *dbIter) next() bool {
for {
if ukey, seq, kt, kerr := parseIkey(i.iter.Key()); kerr == nil {
if ukey, seq, kt, kerr := parseInternalKey(i.iter.Key()); kerr == nil {
i.sampleSeek()
if seq <= i.seq {
switch kt {
case ktDel:
case keyTypeDel:
// Skip deleted key.
i.key = append(i.key[:0], ukey...)
i.dir = dirForward
case ktVal:
case keyTypeVal:
if i.dir == dirSOI || i.icmp.uCompare(ukey, i.key) > 0 {
i.key = append(i.key[:0], ukey...)
i.value = append(i.value[:0], i.iter.Value()...)

@@ -240,13 +250,13 @@ func (i *dbIter) prev() bool {
del := true
if i.iter.Valid() {
for {
if ukey, seq, kt, kerr := parseIkey(i.iter.Key()); kerr == nil {
if ukey, seq, kt, kerr := parseInternalKey(i.iter.Key()); kerr == nil {
i.sampleSeek()
if seq <= i.seq {
if !del && i.icmp.uCompare(ukey, i.key) < 0 {
return true
}
del = (kt == ktDel)
del = (kt == keyTypeDel)
if !del {
i.key = append(i.key[:0], ukey...)
i.value = append(i.value[:0], i.iter.Value()...)

@@ -282,7 +292,7 @@ func (i *dbIter) Prev() bool {
return i.Last()
case dirForward:
for i.iter.Prev() {
if ukey, _, _, kerr := parseIkey(i.iter.Key()); kerr == nil {
if ukey, _, _, kerr := parseInternalKey(i.iter.Key()); kerr == nil {
i.sampleSeek()
if i.icmp.uCompare(ukey, i.key) < 0 {
goto cont

@@ -59,7 +59,7 @@ func (db *DB) releaseSnapshot(se *snapshotElement) {
}
}

// Gets minimum sequence that not being snapshoted.
// Gets minimum sequence that not being snapshotted.
func (db *DB) minSeq() uint64 {
db.snapsMu.Lock()
defer db.snapsMu.Unlock()

@@ -110,7 +110,7 @@ func (snap *Snapshot) Get(key []byte, ro *opt.ReadOptions) (value []byte, err er
err = ErrSnapshotReleased
return
}
return snap.db.get(key, snap.elem.seq, ro)
return snap.db.get(nil, nil, key, snap.elem.seq, ro)
}

// Has returns true if the DB does contains the given key.

@@ -127,11 +127,11 @@ func (snap *Snapshot) Has(key []byte, ro *opt.ReadOptions) (ret bool, err error)
err = ErrSnapshotReleased
return
}
return snap.db.has(key, snap.elem.seq, ro)
return snap.db.has(nil, nil, key, snap.elem.seq, ro)
}

// NewIterator returns an iterator for the snapshot of the uderlying DB.
// The returned iterator is not goroutine-safe, but it is safe to use
// NewIterator returns an iterator for the snapshot of the underlying DB.
// The returned iterator is not safe for concurrent use, but it is safe to use
// multiple iterators concurrently, with each in a dedicated goroutine.
// It is also safe to use an iterator concurrently with modifying its
// underlying DB. The resultant key/value pairs are guaranteed to be

@@ -158,7 +158,7 @@ func (snap *Snapshot) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterat
}
// Since iterator already hold version ref, it doesn't need to
// hold snapshot ref.
return snap.db.newIterator(snap.elem.seq, slice, ro)
return snap.db.newIterator(nil, nil, snap.elem.seq, slice, ro)
}

// Release releases the snapshot. This will not release any returned

@@ -7,19 +7,29 @@
package leveldb

import (
"errors"
"sync/atomic"
"time"

"github.com/syndtr/goleveldb/leveldb/journal"
"github.com/syndtr/goleveldb/leveldb/memdb"
"github.com/syndtr/goleveldb/leveldb/storage"
)

var (
errHasFrozenMem = errors.New("has frozen mem")
)

type memDB struct {
db *DB
mdb *memdb.DB
db *DB
*memdb.DB
ref int32
}

func (m *memDB) getref() int32 {
return atomic.LoadInt32(&m.ref)
}

func (m *memDB) incref() {
atomic.AddInt32(&m.ref, 1)
}

@@ -27,12 +37,12 @@ func (m *memDB) incref() {
func (m *memDB) decref() {
if ref := atomic.AddInt32(&m.ref, -1); ref == 0 {
// Only put back memdb with std capacity.
if m.mdb.Capacity() == m.db.s.o.GetWriteBuffer() {
m.mdb.Reset()
m.db.mpoolPut(m.mdb)
if m.Capacity() == m.db.s.o.GetWriteBuffer() {
m.Reset()
m.db.mpoolPut(m.DB)
}
m.db = nil
m.mdb = nil
m.DB = nil
} else if ref < 0 {
panic("negative memdb ref")
}

@@ -48,31 +58,40 @@ func (db *DB) addSeq(delta uint64) {
atomic.AddUint64(&db.seq, delta)
}

func (db *DB) sampleSeek(ikey iKey) {
func (db *DB) setSeq(seq uint64) {
atomic.StoreUint64(&db.seq, seq)
}

func (db *DB) sampleSeek(ikey internalKey) {
v := db.s.version()
if v.sampleSeek(ikey) {
// Trigger table compaction.
db.compSendTrigger(db.tcompCmdC)
db.compTrigger(db.tcompCmdC)
}
v.release()
}

func (db *DB) mpoolPut(mem *memdb.DB) {
defer func() {
recover()
}()
select {
case db.memPool <- mem:
default:
if !db.isClosed() {
select {
case db.memPool <- mem:
default:
}
}
}

func (db *DB) mpoolGet() *memdb.DB {
func (db *DB) mpoolGet(n int) *memDB {
var mdb *memdb.DB
select {
case mem := <-db.memPool:
return mem
case mdb = <-db.memPool:
default:
return nil
}
if mdb == nil || mdb.Capacity() < n {
mdb = memdb.New(db.s.icmp, maxInt(db.s.o.GetWriteBuffer(), n))
}
return &memDB{
db: db,
DB: mdb,
}
}

@@ -85,7 +104,13 @@ func (db *DB) mpoolDrain() {
case <-db.memPool:
default:
}
case _, _ = <-db.closeC:
case <-db.closeC:
ticker.Stop()
// Make sure the pool is drained.
select {
case <-db.memPool:
case <-time.After(time.Second):
}
close(db.memPool)
return
}

@@ -95,11 +120,10 @@ func (db *DB) mpoolDrain() {
// Create new memdb and froze the old one; need external synchronization.
// newMem only called synchronously by the writer.
func (db *DB) newMem(n int) (mem *memDB, err error) {
num := db.s.allocFileNum()
file := db.s.getJournalFile(num)
w, err := file.Create()
fd := storage.FileDesc{Type: storage.TypeJournal, Num: db.s.allocFileNum()}
w, err := db.s.stor.Create(fd)
if err != nil {
db.s.reuseFileNum(num)
db.s.reuseFileNum(fd.Num)
return
}

@@ -107,7 +131,7 @@ func (db *DB) newMem(n int) (mem *memDB, err error) {
defer db.memMu.Unlock()

if db.frozenMem != nil {
panic("still has frozen mem")
return nil, errHasFrozenMem
}

if db.journal == nil {

@@ -115,20 +139,14 @@ func (db *DB) newMem(n int) (mem *memDB, err error) {
} else {
db.journal.Reset(w)
db.journalWriter.Close()
db.frozenJournalFile = db.journalFile
db.frozenJournalFd = db.journalFd
}
db.journalWriter = w
db.journalFile = file
db.journalFd = fd
db.frozenMem = db.mem
mdb := db.mpoolGet()
if mdb == nil || mdb.Capacity() < n {
mdb = memdb.New(db.s.icmp, maxInt(db.s.o.GetWriteBuffer(), n))
}
mem = &memDB{
db: db,
mdb: mdb,
ref: 2,
}
mem = db.mpoolGet(n)
mem.incref() // for self
mem.incref() // for caller
db.mem = mem
// The seq only incremented by the writer. And whoever called newMem
// should hold write lock, so no need additional synchronization here.

@@ -140,24 +158,26 @@ func (db *DB) newMem(n int) (mem *memDB, err error) {
func (db *DB) getMems() (e, f *memDB) {
db.memMu.RLock()
defer db.memMu.RUnlock()
if db.mem == nil {
if db.mem != nil {
db.mem.incref()
} else if !db.isClosed() {
panic("nil effective mem")
}
db.mem.incref()
if db.frozenMem != nil {
db.frozenMem.incref()
}
return db.mem, db.frozenMem
}

// Get frozen memdb.
// Get effective memdb.
func (db *DB) getEffectiveMem() *memDB {
db.memMu.RLock()
defer db.memMu.RUnlock()
if db.mem == nil {
if db.mem != nil {
db.mem.incref()
} else if !db.isClosed() {
panic("nil effective mem")
}
db.mem.incref()
return db.mem
}

@@ -181,17 +201,25 @@ func (db *DB) getFrozenMem() *memDB {
// Drop frozen memdb; assume that frozen memdb isn't nil.
func (db *DB) dropFrozenMem() {
db.memMu.Lock()
if err := db.frozenJournalFile.Remove(); err != nil {
db.logf("journal@remove removing @%d %q", db.frozenJournalFile.Num(), err)
if err := db.s.stor.Remove(db.frozenJournalFd); err != nil {
db.logf("journal@remove removing @%d %q", db.frozenJournalFd.Num, err)
} else {
db.logf("journal@remove removed @%d", db.frozenJournalFile.Num())
db.logf("journal@remove removed @%d", db.frozenJournalFd.Num)
}
db.frozenJournalFile = nil
db.frozenJournalFd = storage.FileDesc{}
db.frozenMem.decref()
db.frozenMem = nil
db.memMu.Unlock()
}

// Clear mems ptr; used by DB.Close().
func (db *DB) clearMems() {
db.memMu.Lock()
db.mem = nil
db.frozenMem = nil
db.memMu.Unlock()
}

// Set closed flag; return true if not already closed.
func (db *DB) setClosed() bool {
return atomic.CompareAndSwapUint32(&db.closed, 0, 1)

File diff suppressed because it is too large

@@ -0,0 +1,325 @@
// Copyright (c) 2016, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

package leveldb

import (
"errors"
"sync"
"time"

"github.com/syndtr/goleveldb/leveldb/iterator"
"github.com/syndtr/goleveldb/leveldb/opt"
"github.com/syndtr/goleveldb/leveldb/util"
)

var errTransactionDone = errors.New("leveldb: transaction already closed")

// Transaction is the transaction handle.
type Transaction struct {
db *DB
lk sync.RWMutex
seq uint64
mem *memDB
tables tFiles
ikScratch []byte
rec sessionRecord
stats cStatStaging
closed bool
}

// Get gets the value for the given key. It returns ErrNotFound if the
// DB does not contains the key.
//
// The returned slice is its own copy, it is safe to modify the contents
// of the returned slice.
// It is safe to modify the contents of the argument after Get returns.
func (tr *Transaction) Get(key []byte, ro *opt.ReadOptions) ([]byte, error) {
tr.lk.RLock()
defer tr.lk.RUnlock()
if tr.closed {
return nil, errTransactionDone
}
return tr.db.get(tr.mem.DB, tr.tables, key, tr.seq, ro)
}

// Has returns true if the DB does contains the given key.
//
// It is safe to modify the contents of the argument after Has returns.
func (tr *Transaction) Has(key []byte, ro *opt.ReadOptions) (bool, error) {
tr.lk.RLock()
defer tr.lk.RUnlock()
if tr.closed {
return false, errTransactionDone
}
return tr.db.has(tr.mem.DB, tr.tables, key, tr.seq, ro)
}

// NewIterator returns an iterator for the latest snapshot of the transaction.
// The returned iterator is not safe for concurrent use, but it is safe to use
// multiple iterators concurrently, with each in a dedicated goroutine.
// It is also safe to use an iterator concurrently while writes to the
// transaction. The resultant key/value pairs are guaranteed to be consistent.
//
// Slice allows slicing the iterator to only contains keys in the given
// range. A nil Range.Start is treated as a key before all keys in the
// DB. And a nil Range.Limit is treated as a key after all keys in
// the DB.
//
// The iterator must be released after use, by calling Release method.
//
// Also read Iterator documentation of the leveldb/iterator package.
func (tr *Transaction) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
tr.lk.RLock()
defer tr.lk.RUnlock()
if tr.closed {
return iterator.NewEmptyIterator(errTransactionDone)
}
tr.mem.incref()
return tr.db.newIterator(tr.mem, tr.tables, tr.seq, slice, ro)
}

func (tr *Transaction) flush() error {
// Flush memdb.
if tr.mem.Len() != 0 {
tr.stats.startTimer()
iter := tr.mem.NewIterator(nil)
t, n, err := tr.db.s.tops.createFrom(iter)
iter.Release()
tr.stats.stopTimer()
if err != nil {
return err
}
if tr.mem.getref() == 1 {
tr.mem.Reset()
} else {
tr.mem.decref()
tr.mem = tr.db.mpoolGet(0)
tr.mem.incref()
}
tr.tables = append(tr.tables, t)
tr.rec.addTableFile(0, t)
tr.stats.write += t.size
tr.db.logf("transaction@flush created L0@%d N·%d S·%s %q:%q", t.fd.Num, n, shortenb(int(t.size)), t.imin, t.imax)
}
return nil
}

func (tr *Transaction) put(kt keyType, key, value []byte) error {
tr.ikScratch = makeInternalKey(tr.ikScratch, key, tr.seq+1, kt)
if tr.mem.Free() < len(tr.ikScratch)+len(value) {
if err := tr.flush(); err != nil {
return err
}
}
if err := tr.mem.Put(tr.ikScratch, value); err != nil {
return err
}
tr.seq++
return nil
}

// Put sets the value for the given key. It overwrites any previous value
// for that key; a DB is not a multi-map.
// Please note that the transaction is not compacted until committed, so if you
// writes 10 same keys, then those 10 same keys are in the transaction.
//
// It is safe to modify the contents of the arguments after Put returns.
func (tr *Transaction) Put(key, value []byte, wo *opt.WriteOptions) error {
tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
return tr.put(keyTypeVal, key, value)
}

// Delete deletes the value for the given key.
// Please note that the transaction is not compacted until committed, so if you
// writes 10 same keys, then those 10 same keys are in the transaction.
//
// It is safe to modify the contents of the arguments after Delete returns.
func (tr *Transaction) Delete(key []byte, wo *opt.WriteOptions) error {
tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
return tr.put(keyTypeDel, key, nil)
}

// Write apply the given batch to the transaction. The batch will be applied
// sequentially.
// Please note that the transaction is not compacted until committed, so if you
// writes 10 same keys, then those 10 same keys are in the transaction.
//
// It is safe to modify the contents of the arguments after Write returns.
func (tr *Transaction) Write(b *Batch, wo *opt.WriteOptions) error {
if b == nil || b.Len() == 0 {
return nil
}

tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
return b.replayInternal(func(i int, kt keyType, k, v []byte) error {
return tr.put(kt, k, v)
})
}

func (tr *Transaction) setDone() {
tr.closed = true
tr.db.tr = nil
tr.mem.decref()
<-tr.db.writeLockC
}

// Commit commits the transaction. If error is not nil, then the transaction is
// not committed, it can then either be retried or discarded.
//
// Other methods should not be called after transaction has been committed.
func (tr *Transaction) Commit() error {
if err := tr.db.ok(); err != nil {
return err
}

tr.lk.Lock()
defer tr.lk.Unlock()
if tr.closed {
return errTransactionDone
}
if err := tr.flush(); err != nil {
// Return error, lets user decide either to retry or discard
// transaction.
return err
}
if len(tr.tables) != 0 {
// Committing transaction.
tr.rec.setSeqNum(tr.seq)
tr.db.compCommitLk.Lock()
tr.stats.startTimer()
var cerr error
for retry := 0; retry < 3; retry++ {
cerr = tr.db.s.commit(&tr.rec)
if cerr != nil {
tr.db.logf("transaction@commit error R·%d %q", retry, cerr)
select {
case <-time.After(time.Second):
case <-tr.db.closeC:
tr.db.logf("transaction@commit exiting")
tr.db.compCommitLk.Unlock()
return cerr
}
} else {
// Success. Set db.seq.
tr.db.setSeq(tr.seq)
break
}
}
tr.stats.stopTimer()
if cerr != nil {
// Return error, lets user decide either to retry or discard
// transaction.
return cerr
}

// Update compaction stats. This is safe as long as we hold compCommitLk.
tr.db.compStats.addStat(0, &tr.stats)

// Trigger table auto-compaction.
tr.db.compTrigger(tr.db.tcompCmdC)
tr.db.compCommitLk.Unlock()

// Additionally, wait compaction when certain threshold reached.
// Ignore error, returns error only if transaction can't be committed.
tr.db.waitCompaction()
}
// Only mark as done if transaction committed successfully.
tr.setDone()
return nil
}

func (tr *Transaction) discard() {
// Discard transaction.
for _, t := range tr.tables {
tr.db.logf("transaction@discard @%d", t.fd.Num)
if err1 := tr.db.s.stor.Remove(t.fd); err1 == nil {
tr.db.s.reuseFileNum(t.fd.Num)
}
}
}

// Discard discards the transaction.
//
// Other methods should not be called after transaction has been discarded.
func (tr *Transaction) Discard() {
tr.lk.Lock()
if !tr.closed {
tr.discard()
tr.setDone()
}
tr.lk.Unlock()
}

func (db *DB) waitCompaction() error {
if db.s.tLen(0) >= db.s.o.GetWriteL0PauseTrigger() {
return db.compTriggerWait(db.tcompCmdC)
}
return nil
}

// OpenTransaction opens an atomic DB transaction. Only one transaction can be
// opened at a time. Subsequent call to Write and OpenTransaction will be blocked
// until in-flight transaction is committed or discarded.
// The returned transaction handle is safe for concurrent use.
//
// Transaction is expensive and can overwhelm compaction, especially if
// transaction size is small. Use with caution.
//
// The transaction must be closed once done, either by committing or discarding
// the transaction.
// Closing the DB will discard open transaction.
func (db *DB) OpenTransaction() (*Transaction, error) {
if err := db.ok(); err != nil {
return nil, err
}

// The write happen synchronously.
select {
case db.writeLockC <- struct{}{}:
case err := <-db.compPerErrC:
return nil, err
case <-db.closeC:
return nil, ErrClosed
}

if db.tr != nil {
panic("leveldb: has open transaction")
}

// Flush current memdb.
if db.mem != nil && db.mem.Len() != 0 {
if _, err := db.rotateMem(0, true); err != nil {
return nil, err
}
}

// Wait compaction when certain threshold reached.
if err := db.waitCompaction(); err != nil {
return nil, err
}

tr := &Transaction{
db: db,
seq: db.seq,
mem: db.mpoolGet(0),
}
tr.mem.incref()
db.tr = tr
return tr, nil
}

@@ -21,14 +21,16 @@ type Reader interface {
NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator
}

type Sizes []uint64
// Sizes is list of size.
type Sizes []int64

// Sum returns sum of the sizes.
func (p Sizes) Sum() (n uint64) {
for _, s := range p {
n += s
func (sizes Sizes) Sum() int64 {
var sum int64
for _, size := range sizes {
sum += size
}
return n
return sum
}

// Logging.

@@ -40,59 +42,59 @@ func (db *DB) checkAndCleanFiles() error {
v := db.s.version()
defer v.release()

tablesMap := make(map[uint64]bool)
for _, tables := range v.tables {
tmap := make(map[int64]bool)
for _, tables := range v.levels {
for _, t := range tables {
tablesMap[t.file.Num()] = false
tmap[t.fd.Num] = false
}
}

files, err := db.s.getFiles(storage.TypeAll)
fds, err := db.s.stor.List(storage.TypeAll)
if err != nil {
return err
}

var nTables int
var rem []storage.File
for _, f := range files {
var nt int
var rem []storage.FileDesc
for _, fd := range fds {
keep := true
switch f.Type() {
switch fd.Type {
case storage.TypeManifest:
keep = f.Num() >= db.s.manifestFile.Num()
keep = fd.Num >= db.s.manifestFd.Num
case storage.TypeJournal:
if db.frozenJournalFile != nil {
keep = f.Num() >= db.frozenJournalFile.Num()
if !db.frozenJournalFd.Zero() {
keep = fd.Num >= db.frozenJournalFd.Num
} else {
keep = f.Num() >= db.journalFile.Num()
keep = fd.Num >= db.journalFd.Num
}
case storage.TypeTable:
_, keep = tablesMap[f.Num()]
_, keep = tmap[fd.Num]
if keep {
tablesMap[f.Num()] = true
nTables++
tmap[fd.Num] = true
nt++
}
}

if !keep {
rem = append(rem, f)
rem = append(rem, fd)
}
}

if nTables != len(tablesMap) {
var missing []*storage.FileInfo
for num, present := range tablesMap {
if nt != len(tmap) {
var mfds []storage.FileDesc
for num, present := range tmap {
if !present {
missing = append(missing, &storage.FileInfo{Type: storage.TypeTable, Num: num})
mfds = append(mfds, storage.FileDesc{storage.TypeTable, num})
db.logf("db@janitor table missing @%d", num)
}
}
return errors.NewErrCorrupted(nil, &errors.ErrMissingFiles{Files: missing})
return errors.NewErrCorrupted(storage.FileDesc{}, &errors.ErrMissingFiles{Fds: mfds})
}

db.logf("db@janitor F·%d G·%d", len(files), len(rem))
for _, f := range rem {
db.logf("db@janitor removing %s-%d", f.Type(), f.Num())
if err := f.Remove(); err != nil {
db.logf("db@janitor F·%d G·%d", len(fds), len(rem))
for _, fd := range rem {
db.logf("db@janitor removing %s-%d", fd.Type, fd.Num)
if err := db.s.stor.Remove(fd); err != nil {
return err
}
}

@ -7,6 +7,7 @@
|
|||
package leveldb
|
||||
|
||||
import (
|
||||
"sync/atomic"
|
||||
"time"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/memdb"
|
||||
|
@ -14,91 +15,95 @@ import (
|
|||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
func (db *DB) writeJournal(b *Batch) error {
|
||||
w, err := db.journal.Next()
|
||||
func (db *DB) writeJournal(batches []*Batch, seq uint64, sync bool) error {
|
||||
wr, err := db.journal.Next()
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
if _, err := w.Write(b.encode()); err != nil {
|
||||
if err := writeBatchesWithHeader(wr, batches, seq); err != nil {
|
||||
return err
|
||||
}
|
||||
if err := db.journal.Flush(); err != nil {
|
||||
return err
|
||||
}
|
||||
if b.sync {
|
||||
if sync {
|
||||
return db.journalWriter.Sync()
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (db *DB) jWriter() {
|
||||
defer db.closeW.Done()
|
||||
for {
|
||||
select {
|
||||
case b := <-db.journalC:
|
||||
if b != nil {
|
||||
db.journalAckC <- db.writeJournal(b)
|
||||
}
|
||||
case _, _ = <-db.closeC:
|
||||
return
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (db *DB) rotateMem(n int) (mem *memDB, err error) {
|
||||
func (db *DB) rotateMem(n int, wait bool) (mem *memDB, err error) {
|
||||
retryLimit := 3
|
||||
retry:
|
||||
// Wait for pending memdb compaction.
|
||||
err = db.compSendIdle(db.mcompCmdC)
|
||||
err = db.compTriggerWait(db.mcompCmdC)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
retryLimit--
|
||||
|
||||
// Create new memdb and journal.
|
||||
mem, err = db.newMem(n)
|
||||
if err != nil {
|
||||
if err == errHasFrozenMem {
|
||||
if retryLimit <= 0 {
|
||||
panic("BUG: still has frozen memdb")
|
||||
}
|
||||
goto retry
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
// Schedule memdb compaction.
|
||||
db.compSendTrigger(db.mcompCmdC)
|
||||
if wait {
|
||||
err = db.compTriggerWait(db.mcompCmdC)
|
||||
} else {
|
||||
db.compTrigger(db.mcompCmdC)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (db *DB) flush(n int) (mem *memDB, nn int, err error) {
|
||||
func (db *DB) flush(n int) (mdb *memDB, mdbFree int, err error) {
|
||||
delayed := false
|
||||
slowdownTrigger := db.s.o.GetWriteL0SlowdownTrigger()
|
||||
pauseTrigger := db.s.o.GetWriteL0PauseTrigger()
|
||||
flush := func() (retry bool) {
|
||||
v := db.s.version()
|
||||
defer v.release()
|
||||
mem = db.getEffectiveMem()
|
||||
mdb = db.getEffectiveMem()
|
||||
if mdb == nil {
|
||||
err = ErrClosed
|
||||
return false
|
||||
}
|
||||
defer func() {
|
||||
if retry {
|
||||
mem.decref()
|
||||
mem = nil
|
||||
mdb.decref()
|
||||
mdb = nil
|
||||
}
|
||||
}()
|
||||
nn = mem.mdb.Free()
|
||||
tLen := db.s.tLen(0)
|
||||
mdbFree = mdb.Free()
|
||||
switch {
|
||||
case v.tLen(0) >= db.s.o.GetWriteL0SlowdownTrigger() && !delayed:
|
||||
case tLen >= slowdownTrigger && !delayed:
|
||||
delayed = true
|
||||
time.Sleep(time.Millisecond)
|
||||
case nn >= n:
|
||||
case mdbFree >= n:
|
||||
return false
|
||||
case v.tLen(0) >= db.s.o.GetWriteL0PauseTrigger():
|
||||
case tLen >= pauseTrigger:
|
||||
delayed = true
|
||||
err = db.compSendIdle(db.tcompCmdC)
|
||||
err = db.compTriggerWait(db.tcompCmdC)
|
||||
if err != nil {
|
||||
return false
|
||||
}
|
||||
default:
|
||||
// Allow memdb to grow if it has no entry.
|
||||
if mem.mdb.Len() == 0 {
|
||||
nn = n
|
||||
if mdb.Len() == 0 {
|
||||
mdbFree = n
|
||||
} else {
|
||||
mem.decref()
|
||||
mem, err = db.rotateMem(n)
|
||||
mdb.decref()
|
||||
mdb, err = db.rotateMem(n, false)
|
||||
if err == nil {
|
||||
nn = mem.mdb.Free()
|
||||
mdbFree = mdb.Free()
|
||||
} else {
|
||||
nn = 0
|
||||
mdbFree = 0
|
||||
}
|
||||
}
|
||||
return false
|
||||
|
@ -113,157 +118,265 @@ func (db *DB) flush(n int) (mem *memDB, nn int, err error) {
|
|||
db.writeDelayN++
|
||||
} else if db.writeDelayN > 0 {
|
||||
db.logf("db@write was delayed N·%d T·%v", db.writeDelayN, db.writeDelay)
|
||||
atomic.AddInt32(&db.cWriteDelayN, int32(db.writeDelayN))
|
||||
atomic.AddInt64(&db.cWriteDelay, int64(db.writeDelay))
|
||||
db.writeDelay = 0
|
||||
db.writeDelayN = 0
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
// Write apply the given batch to the DB. The batch will be applied
|
||||
// sequentially.
|
||||
//
|
||||
// It is safe to modify the contents of the arguments after Write returns.
|
||||
func (db *DB) Write(b *Batch, wo *opt.WriteOptions) (err error) {
|
||||
err = db.ok()
|
||||
if err != nil || b == nil || b.Len() == 0 {
|
||||
return
|
||||
type writeMerge struct {
|
||||
sync bool
|
||||
batch *Batch
|
||||
keyType keyType
|
||||
key, value []byte
|
||||
}
|
||||
|
||||
func (db *DB) unlockWrite(overflow bool, merged int, err error) {
|
||||
for i := 0; i < merged; i++ {
|
||||
db.writeAckC <- err
|
||||
}
|
||||
|
||||
b.init(wo.GetSync())
|
||||
|
||||
// The write happen synchronously.
|
||||
select {
|
||||
case db.writeC <- b:
|
||||
if <-db.writeMergedC {
|
||||
return <-db.writeAckC
|
||||
}
|
||||
case db.writeLockC <- struct{}{}:
|
||||
case err = <-db.compPerErrC:
|
||||
return
|
||||
case _, _ = <-db.closeC:
|
||||
return ErrClosed
|
||||
if overflow {
|
||||
// Pass lock to the next write (that failed to merge).
|
||||
db.writeMergedC <- false
|
||||
} else {
|
||||
// Release lock.
|
||||
<-db.writeLockC
|
||||
}
|
||||
}
|
||||
|
||||
merged := 0
|
||||
danglingMerge := false
|
||||
defer func() {
|
||||
if danglingMerge {
|
||||
db.writeMergedC <- false
|
||||
} else {
|
||||
<-db.writeLockC
|
||||
}
|
||||
for i := 0; i < merged; i++ {
|
||||
db.writeAckC <- err
|
||||
}
|
||||
}()
|
||||
|
||||
mem, memFree, err := db.flush(b.size())
|
||||
// ourBatch if defined should equal with batch.
|
||||
func (db *DB) writeLocked(batch, ourBatch *Batch, merge, sync bool) error {
|
||||
// Try to flush memdb. This method would also trying to throttle writes
|
||||
// if it is too fast and compaction cannot catch-up.
|
||||
mdb, mdbFree, err := db.flush(batch.internalLen)
|
||||
if err != nil {
|
||||
return
|
||||
db.unlockWrite(false, 0, err)
|
||||
return err
|
||||
}
|
||||
defer mem.decref()
|
||||
defer mdb.decref()
|
||||
|
||||
// Calculate maximum size of the batch.
|
||||
m := 1 << 20
|
||||
if x := b.size(); x <= 128<<10 {
|
||||
m = x + (128 << 10)
|
||||
}
|
||||
m = minInt(m, memFree)
|
||||
var (
|
||||
overflow bool
|
||||
merged int
|
||||
batches = []*Batch{batch}
|
||||
)
|
||||
|
||||
// Merge with other batch.
|
||||
drain:
|
||||
for b.size() < m && !b.sync {
|
||||
select {
|
||||
case nb := <-db.writeC:
|
||||
if b.size()+nb.size() <= m {
|
||||
b.append(nb)
|
||||
db.writeMergedC <- true
|
||||
merged++
|
||||
} else {
|
||||
danglingMerge = true
|
||||
break drain
|
||||
}
|
||||
default:
|
||||
break drain
|
||||
if merge {
|
||||
// Merge limit.
|
||||
var mergeLimit int
|
||||
if batch.internalLen > 128<<10 {
|
||||
mergeLimit = (1 << 20) - batch.internalLen
|
||||
} else {
|
||||
mergeLimit = 128 << 10
|
||||
}
|
||||
}
|
||||
|
||||
// Set batch first seq number relative from last seq.
|
||||
b.seq = db.seq + 1
|
||||
|
||||
// Write journal concurrently if it is large enough.
|
||||
if b.size() >= (128 << 10) {
|
||||
// Push the write batch to the journal writer
|
||||
select {
|
||||
case db.journalC <- b:
|
||||
// Write into memdb
|
||||
if berr := b.memReplay(mem.mdb); berr != nil {
|
||||
panic(berr)
|
||||
}
|
||||
case err = <-db.compPerErrC:
|
||||
return
|
||||
case _, _ = <-db.closeC:
|
||||
err = ErrClosed
|
||||
return
|
||||
mergeCap := mdbFree - batch.internalLen
|
||||
if mergeLimit > mergeCap {
|
||||
mergeLimit = mergeCap
|
||||
}
|
||||
// Wait for journal writer
|
||||
select {
|
||||
case err = <-db.journalAckC:
|
||||
if err != nil {
|
||||
// Revert memdb if error detected
|
||||
if berr := b.revertMemReplay(mem.mdb); berr != nil {
|
||||
panic(berr)
|
||||
|
||||
merge:
|
||||
for mergeLimit > 0 {
|
||||
select {
|
||||
case incoming := <-db.writeMergeC:
|
||||
if incoming.batch != nil {
|
||||
// Merge batch.
|
||||
if incoming.batch.internalLen > mergeLimit {
|
||||
overflow = true
|
||||
break merge
|
||||
}
|
||||
batches = append(batches, incoming.batch)
|
||||
mergeLimit -= incoming.batch.internalLen
|
||||
} else {
|
||||
// Merge put.
|
||||
internalLen := len(incoming.key) + len(incoming.value) + 8
|
||||
if internalLen > mergeLimit {
|
||||
overflow = true
|
||||
break merge
|
||||
}
|
||||
if ourBatch == nil {
|
||||
ourBatch = db.batchPool.Get().(*Batch)
|
||||
ourBatch.Reset()
|
||||
batches = append(batches, ourBatch)
|
||||
}
|
||||
// We can use same batch since concurrent write doesn't
|
||||
// guarantee write order.
|
||||
ourBatch.appendRec(incoming.keyType, incoming.key, incoming.value)
|
||||
mergeLimit -= internalLen
|
||||
}
|
||||
return
|
||||
sync = sync || incoming.sync
|
||||
merged++
|
||||
db.writeMergedC <- true
|
||||
|
||||
default:
|
||||
break merge
|
||||
}
|
||||
case _, _ = <-db.closeC:
|
||||
err = ErrClosed
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
// Seq number.
|
||||
seq := db.seq + 1
|
||||
|
||||
// Write journal.
|
||||
if err := db.writeJournal(batches, seq, sync); err != nil {
|
||||
db.unlockWrite(overflow, merged, err)
|
||||
return err
|
||||
}
|
||||
|
||||
// Put batches.
|
||||
for _, batch := range batches {
|
||||
if err := batch.putMem(seq, mdb.DB); err != nil {
|
||||
panic(err)
|
||||
}
|
||||
seq += uint64(batch.Len())
|
||||
}
|
||||
|
||||
// Incr seq number.
|
||||
db.addSeq(uint64(batchesLen(batches)))
|
||||
|
||||
// Rotate memdb if it's reach the threshold.
|
||||
if batch.internalLen >= mdbFree {
|
||||
db.rotateMem(0, false)
|
||||
}
|
||||
|
||||
db.unlockWrite(overflow, merged, nil)
|
||||
return nil
|
||||
}
|
||||
|
||||
// Write apply the given batch to the DB. The batch records will be applied
// sequentially. Write might be used concurrently, when used concurrently and
// batch is small enough, write will try to merge the batches. Set NoWriteMerge
// option to true to disable write merge.
//
// It is safe to modify the contents of the arguments after Write returns but
// not before. Write will not modify content of the batch.
func (db *DB) Write(batch *Batch, wo *opt.WriteOptions) error {
if err := db.ok(); err != nil || batch == nil || batch.Len() == 0 {
return err
}

// If the batch size is larger than write buffer, it may justified to write
// using transaction instead. Using transaction the batch will be written
// into tables directly, skipping the journaling.
if batch.internalLen > db.s.o.GetWriteBuffer() && !db.s.o.GetDisableLargeBatchTransaction() {
tr, err := db.OpenTransaction()
if err != nil {
return err
}
if err := tr.Write(batch, wo); err != nil {
tr.Discard()
return err
}
return tr.Commit()
}

merge := !wo.GetNoWriteMerge() && !db.s.o.GetNoWriteMerge()
sync := wo.GetSync() && !db.s.o.GetNoSync()

// Acquire write lock.
if merge {
select {
case db.writeMergeC <- writeMerge{sync: sync, batch: batch}:
if <-db.writeMergedC {
// Write is merged.
return <-db.writeAckC
}
// Write is not merged, the write lock is handed to us. Continue.
case db.writeLockC <- struct{}{}:
// Write lock acquired.
case err := <-db.compPerErrC:
// Compaction error.
return err
case <-db.closeC:
// Closed
return ErrClosed
}
} else {
err = db.writeJournal(b)
if err != nil {
return
}
if berr := b.memReplay(mem.mdb); berr != nil {
panic(berr)
select {
case db.writeLockC <- struct{}{}:
// Write lock acquired.
case err := <-db.compPerErrC:
// Compaction error.
return err
case <-db.closeC:
// Closed
return ErrClosed
}
}

// Set last seq number.
db.addSeq(uint64(b.Len()))
return db.writeLocked(batch, nil, merge, sync)
}

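
For orientation only (not part of the diff): a minimal sketch of how a caller drives the Write path above, using the goleveldb identifiers that appear in this change (OpenFile, Batch, opt.WriteOptions); the database path and error handling are illustrative.

// Illustrative sketch, not part of the vendored diff.
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func main() {
	// The path is an assumption for the example.
	db, err := leveldb.OpenFile("/tmp/example-db", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Small concurrent batches may be merged unless NoWriteMerge is set;
	// a batch larger than the write buffer goes through a transaction
	// instead (see the GetWriteBuffer check above).
	batch := new(leveldb.Batch)
	batch.Put([]byte("k1"), []byte("v1"))
	batch.Delete([]byte("k2"))
	if err := db.Write(batch, &opt.WriteOptions{Sync: true}); err != nil {
		log.Fatal(err)
	}
}
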
if b.size() >= memFree {
|
||||
db.rotateMem(0)
|
||||
func (db *DB) putRec(kt keyType, key, value []byte, wo *opt.WriteOptions) error {
|
||||
if err := db.ok(); err != nil {
|
||||
return err
|
||||
}
|
||||
return
|
||||
|
||||
merge := !wo.GetNoWriteMerge() && !db.s.o.GetNoWriteMerge()
|
||||
sync := wo.GetSync() && !db.s.o.GetNoSync()
|
||||
|
||||
// Acquire write lock.
|
||||
if merge {
|
||||
select {
|
||||
case db.writeMergeC <- writeMerge{sync: sync, keyType: kt, key: key, value: value}:
|
||||
if <-db.writeMergedC {
|
||||
// Write is merged.
|
||||
return <-db.writeAckC
|
||||
}
|
||||
// Write is not merged, the write lock is handed to us. Continue.
|
||||
case db.writeLockC <- struct{}{}:
|
||||
// Write lock acquired.
|
||||
case err := <-db.compPerErrC:
|
||||
// Compaction error.
|
||||
return err
|
||||
case <-db.closeC:
|
||||
// Closed
|
||||
return ErrClosed
|
||||
}
|
||||
} else {
|
||||
select {
|
||||
case db.writeLockC <- struct{}{}:
|
||||
// Write lock acquired.
|
||||
case err := <-db.compPerErrC:
|
||||
// Compaction error.
|
||||
return err
|
||||
case <-db.closeC:
|
||||
// Closed
|
||||
return ErrClosed
|
||||
}
|
||||
}
|
||||
|
||||
batch := db.batchPool.Get().(*Batch)
|
||||
batch.Reset()
|
||||
batch.appendRec(kt, key, value)
|
||||
return db.writeLocked(batch, batch, merge, sync)
|
||||
}
|
||||
|
||||
// Put sets the value for the given key. It overwrites any previous value
|
||||
// for that key; a DB is not a multi-map.
|
||||
// for that key; a DB is not a multi-map. Write merge also applies for Put, see
|
||||
// Write.
|
||||
//
|
||||
// It is safe to modify the contents of the arguments after Put returns.
|
||||
// It is safe to modify the contents of the arguments after Put returns but not
|
||||
// before.
|
||||
func (db *DB) Put(key, value []byte, wo *opt.WriteOptions) error {
|
||||
b := new(Batch)
|
||||
b.Put(key, value)
|
||||
return db.Write(b, wo)
|
||||
return db.putRec(keyTypeVal, key, value, wo)
|
||||
}
|
||||
|
||||
// Delete deletes the value for the given key. It returns ErrNotFound if
|
||||
// the DB does not contain the key.
|
||||
// Delete deletes the value for the given key. Delete will not returns error if
|
||||
// key doesn't exist. Write merge also applies for Delete, see Write.
|
||||
//
|
||||
// It is safe to modify the contents of the arguments after Delete returns.
|
||||
// It is safe to modify the contents of the arguments after Delete returns but
|
||||
// not before.
|
||||
func (db *DB) Delete(key []byte, wo *opt.WriteOptions) error {
|
||||
b := new(Batch)
|
||||
b.Delete(key)
|
||||
return db.Write(b, wo)
|
||||
return db.putRec(keyTypeDel, key, nil, wo)
|
||||
}
|
||||
|
||||
func isMemOverlaps(icmp *iComparer, mem *memdb.DB, min, max []byte) bool {
|
||||
iter := mem.NewIterator(nil)
|
||||
defer iter.Release()
|
||||
return (max == nil || (iter.First() && icmp.uCompare(max, iKey(iter.Key()).ukey()) >= 0)) &&
|
||||
(min == nil || (iter.Last() && icmp.uCompare(min, iKey(iter.Key()).ukey()) <= 0))
|
||||
return (max == nil || (iter.First() && icmp.uCompare(max, internalKey(iter.Key()).ukey()) >= 0)) &&
|
||||
(min == nil || (iter.Last() && icmp.uCompare(min, internalKey(iter.Key()).ukey()) <= 0))
|
||||
}
|
||||
|
||||
// CompactRange compacts the underlying DB for the given key range.

@ -285,21 +398,24 @@ func (db *DB) CompactRange(r util.Range) error {
case db.writeLockC <- struct{}{}:
case err := <-db.compPerErrC:
return err
case _, _ = <-db.closeC:
case <-db.closeC:
return ErrClosed
}

// Check for overlaps in memdb.
mem := db.getEffectiveMem()
defer mem.decref()
if isMemOverlaps(db.s.icmp, mem.mdb, r.Start, r.Limit) {
mdb := db.getEffectiveMem()
if mdb == nil {
return ErrClosed
}
defer mdb.decref()
if isMemOverlaps(db.s.icmp, mdb.DB, r.Start, r.Limit) {
// Memdb compaction.
if _, err := db.rotateMem(0); err != nil {
if _, err := db.rotateMem(0, false); err != nil {
<-db.writeLockC
return err
}
<-db.writeLockC
if err := db.compSendIdle(db.mcompCmdC); err != nil {
if err := db.compTriggerWait(db.mcompCmdC); err != nil {
return err
}
} else {

@ -307,5 +423,33 @@ func (db *DB) CompactRange(r util.Range) error {
}

// Table compaction.
return db.compSendRange(db.tcompCmdC, -1, r.Start, r.Limit)
return db.compTriggerRange(db.tcompCmdC, -1, r.Start, r.Limit)
}

// SetReadOnly makes DB read-only. It will stay read-only until reopened.
func (db *DB) SetReadOnly() error {
if err := db.ok(); err != nil {
return err
}

// Lock writer.
select {
case db.writeLockC <- struct{}{}:
db.compWriteLocking = true
case err := <-db.compPerErrC:
return err
case <-db.closeC:
return ErrClosed
}

// Set compaction read-only.
select {
case db.compErrSetC <- ErrReadOnly:
case perr := <-db.compPerErrC:
return perr
case <-db.closeC:
return ErrClosed
}

return nil
}

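
For orientation only (not part of the diff): a short sketch of the two entry points above; a zero util.Range (nil Start and Limit) is assumed to cover the whole keyspace, and the path is illustrative.

// Illustrative sketch, not part of the vendored diff.
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

func main() {
	db, err := leveldb.OpenFile("/tmp/example-db", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Force a table compaction over the full key range.
	if err := db.CompactRange(util.Range{}); err != nil {
		log.Fatal(err)
	}
	// From here on the DB rejects writes until it is reopened.
	if err := db.SetReadOnly(); err != nil {
		log.Fatal(err)
	}
}
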
@ -8,6 +8,8 @@
|
|||
//
|
||||
// Create or open a database:
|
||||
//
|
||||
// // The returned DB instance is safe for concurrent use. Which mean that all
|
||||
// // DB's methods may be called concurrently from multiple goroutine.
|
||||
// db, err := leveldb.OpenFile("path/to/db", nil)
|
||||
// ...
|
||||
// defer db.Close()
|
||||
|
|
|
@ -10,8 +10,10 @@ import (
|
|||
"github.com/syndtr/goleveldb/leveldb/errors"
|
||||
)
|
||||
|
||||
// Common errors.
|
||||
var (
|
||||
ErrNotFound = errors.ErrNotFound
|
||||
ErrReadOnly = errors.New("leveldb: read-only mode")
|
||||
ErrSnapshotReleased = errors.New("leveldb: snapshot released")
|
||||
ErrIterReleased = errors.New("leveldb: iterator released")
|
||||
ErrClosed = errors.New("leveldb: closed")
|
||||
|
|
|
@ -15,6 +15,7 @@ import (
|
|||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
// Common errors.
|
||||
var (
|
||||
ErrNotFound = New("leveldb: not found")
|
||||
ErrReleased = util.ErrReleased
|
||||
|
@ -29,21 +30,20 @@ func New(text string) error {
|
|||
// ErrCorrupted is the type that wraps errors that indicate corruption in
|
||||
// the database.
|
||||
type ErrCorrupted struct {
|
||||
File *storage.FileInfo
|
||||
Err error
|
||||
Fd storage.FileDesc
|
||||
Err error
|
||||
}
|
||||
|
||||
func (e *ErrCorrupted) Error() string {
|
||||
if e.File != nil {
|
||||
return fmt.Sprintf("%v [file=%v]", e.Err, e.File)
|
||||
} else {
|
||||
return e.Err.Error()
|
||||
if !e.Fd.Zero() {
|
||||
return fmt.Sprintf("%v [file=%v]", e.Err, e.Fd)
|
||||
}
|
||||
return e.Err.Error()
|
||||
}
|
||||
|
||||
// NewErrCorrupted creates new ErrCorrupted error.
|
||||
func NewErrCorrupted(f storage.File, err error) error {
|
||||
return &ErrCorrupted{storage.NewFileInfo(f), err}
|
||||
func NewErrCorrupted(fd storage.FileDesc, err error) error {
|
||||
return &ErrCorrupted{fd, err}
|
||||
}
|
||||
|
||||
// IsCorrupted returns a boolean indicating whether the error is indicating
|
||||
|
@ -52,24 +52,26 @@ func IsCorrupted(err error) bool {
|
|||
switch err.(type) {
|
||||
case *ErrCorrupted:
|
||||
return true
|
||||
case *storage.ErrCorrupted:
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// ErrMissingFiles is the type that indicating a corruption due to missing
|
||||
// files.
|
||||
// files. ErrMissingFiles always wrapped with ErrCorrupted.
|
||||
type ErrMissingFiles struct {
|
||||
Files []*storage.FileInfo
|
||||
Fds []storage.FileDesc
|
||||
}
|
||||
|
||||
func (e *ErrMissingFiles) Error() string { return "file missing" }
|
||||
|
||||
// SetFile sets 'file info' of the given error with the given file.
|
||||
// SetFd sets 'file info' of the given error with the given file.
|
||||
// Currently only ErrCorrupted is supported, otherwise will do nothing.
|
||||
func SetFile(err error, f storage.File) error {
|
||||
func SetFd(err error, fd storage.FileDesc) error {
|
||||
switch x := err.(type) {
|
||||
case *ErrCorrupted:
|
||||
x.File = storage.NewFileInfo(f)
|
||||
x.Fd = fd
|
||||
return x
|
||||
}
|
||||
return err
|
||||
|
|
|
@ -32,12 +32,12 @@ var _ = testutil.Defer(func() {
|
|||
db := newTestingDB(o, nil, nil)
|
||||
t := testutil.DBTesting{
|
||||
DB: db,
|
||||
Deleted: testutil.KeyValue_Generate(nil, 500, 1, 50, 5, 5).Clone(),
|
||||
Deleted: testutil.KeyValue_Generate(nil, 500, 1, 1, 50, 5, 5).Clone(),
|
||||
}
|
||||
testutil.DoDBTesting(&t)
|
||||
db.TestClose()
|
||||
done <- true
|
||||
}, 20.0)
|
||||
}, 80.0)
|
||||
})
|
||||
|
||||
Describe("read test", func() {
|
||||
|
@ -54,5 +54,64 @@ var _ = testutil.Defer(func() {
|
|||
db.(*testingDB).TestClose()
|
||||
})
|
||||
})
|
||||
|
||||
Describe("transaction test", func() {
|
||||
It("should do transaction correctly", func(done Done) {
|
||||
db := newTestingDB(o, nil, nil)
|
||||
|
||||
By("creating first transaction")
|
||||
var err error
|
||||
tr := &testingTransaction{}
|
||||
tr.Transaction, err = db.OpenTransaction()
|
||||
Expect(err).NotTo(HaveOccurred())
|
||||
t0 := &testutil.DBTesting{
|
||||
DB: tr,
|
||||
Deleted: testutil.KeyValue_Generate(nil, 200, 1, 1, 50, 5, 5).Clone(),
|
||||
}
|
||||
testutil.DoDBTesting(t0)
|
||||
testutil.TestGet(tr, t0.Present)
|
||||
testutil.TestHas(tr, t0.Present)
|
||||
|
||||
By("committing first transaction")
|
||||
err = tr.Commit()
|
||||
Expect(err).NotTo(HaveOccurred())
|
||||
testutil.TestIter(db, nil, t0.Present)
|
||||
testutil.TestGet(db, t0.Present)
|
||||
testutil.TestHas(db, t0.Present)
|
||||
|
||||
By("manipulating DB without transaction")
|
||||
t0.DB = db
|
||||
testutil.DoDBTesting(t0)
|
||||
|
||||
By("creating second transaction")
|
||||
tr.Transaction, err = db.OpenTransaction()
|
||||
Expect(err).NotTo(HaveOccurred())
|
||||
t1 := &testutil.DBTesting{
|
||||
DB: tr,
|
||||
Deleted: t0.Deleted.Clone(),
|
||||
Present: t0.Present.Clone(),
|
||||
}
|
||||
testutil.DoDBTesting(t1)
|
||||
testutil.TestIter(db, nil, t0.Present)
|
||||
|
||||
By("discarding second transaction")
|
||||
tr.Discard()
|
||||
testutil.TestIter(db, nil, t0.Present)
|
||||
|
||||
By("creating third transaction")
|
||||
tr.Transaction, err = db.OpenTransaction()
|
||||
Expect(err).NotTo(HaveOccurred())
|
||||
t0.DB = tr
|
||||
testutil.DoDBTesting(t0)
|
||||
|
||||
By("committing third transaction")
|
||||
err = tr.Commit()
|
||||
Expect(err).NotTo(HaveOccurred())
|
||||
testutil.TestIter(db, nil, t0.Present)
|
||||
|
||||
db.TestClose()
|
||||
done <- true
|
||||
}, 240.0)
|
||||
})
|
||||
})
|
||||
})
|
||||
|
|
|
@ -15,7 +15,7 @@ type iFilter struct {
|
|||
}
|
||||
|
||||
func (f iFilter) Contains(filter, key []byte) bool {
|
||||
return f.Filter.Contains(filter, iKey(key).ukey())
|
||||
return f.Filter.Contains(filter, internalKey(key).ukey())
|
||||
}
|
||||
|
||||
func (f iFilter) NewGenerator() filter.FilterGenerator {
|
||||
|
@ -27,5 +27,5 @@ type iFilterGenerator struct {
|
|||
}
|
||||
|
||||
func (g iFilterGenerator) Add(key []byte) {
|
||||
g.FilterGenerator.Add(iKey(key).ukey())
|
||||
g.FilterGenerator.Add(internalKey(key).ukey())
|
||||
}
|
||||
|
|
|
@ -17,7 +17,7 @@ var _ = testutil.Defer(func() {
|
|||
Describe("Array iterator", func() {
|
||||
It("Should iterates and seeks correctly", func() {
|
||||
// Build key/value.
|
||||
kv := testutil.KeyValue_Generate(nil, 70, 1, 5, 3, 3)
|
||||
kv := testutil.KeyValue_Generate(nil, 70, 1, 1, 5, 3, 3)
|
||||
|
||||
// Test the iterator.
|
||||
t := testutil.IteratorTesting{
|
||||
|
|
|
@ -52,7 +52,7 @@ var _ = testutil.Defer(func() {
|
|||
for _, x := range n {
|
||||
sum += x
|
||||
}
|
||||
kv := testutil.KeyValue_Generate(nil, sum, 1, 10, 4, 4)
|
||||
kv := testutil.KeyValue_Generate(nil, sum, 1, 1, 10, 4, 4)
|
||||
for i, j := 0, 0; i < len(n); i++ {
|
||||
for x := n[i]; x > 0; x-- {
|
||||
key, value := kv.Index(j)
|
||||
|
@ -69,7 +69,7 @@ var _ = testutil.Defer(func() {
|
|||
}
|
||||
testutil.DoIteratorTesting(&t)
|
||||
done <- true
|
||||
}, 1.5)
|
||||
}, 15.0)
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
@ -21,13 +21,13 @@ var (
|
|||
// IteratorSeeker is the interface that wraps the 'seeks method'.
|
||||
type IteratorSeeker interface {
|
||||
// First moves the iterator to the first key/value pair. If the iterator
|
||||
// only contains one key/value pair then First and Last whould moves
|
||||
// only contains one key/value pair then First and Last would moves
|
||||
// to the same key/value pair.
|
||||
// It returns whether such pair exist.
|
||||
First() bool
|
||||
|
||||
// Last moves the iterator to the last key/value pair. If the iterator
|
||||
// only contains one key/value pair then First and Last whould moves
|
||||
// only contains one key/value pair then First and Last would moves
|
||||
// to the same key/value pair.
|
||||
// It returns whether such pair exist.
|
||||
Last() bool
|
||||
|
@ -48,7 +48,7 @@ type IteratorSeeker interface {
|
|||
Prev() bool
|
||||
}
|
||||
|
||||
// CommonIterator is the interface that wraps common interator methods.
|
||||
// CommonIterator is the interface that wraps common iterator methods.
|
||||
type CommonIterator interface {
|
||||
IteratorSeeker
|
||||
|
||||
|
@ -71,14 +71,15 @@ type CommonIterator interface {
|
|||
|
||||
// Iterator iterates over a DB's key/value pairs in key order.
|
||||
//
|
||||
// When encouter an error any 'seeks method' will return false and will
|
||||
// When encounter an error any 'seeks method' will return false and will
|
||||
// yield no key/value pairs. The error can be queried by calling the Error
|
||||
// method. Calling Release is still necessary.
|
||||
//
|
||||
// An iterator must be released after use, but it is not necessary to read
|
||||
// an iterator until exhaustion.
|
||||
// Also, an iterator is not necessarily goroutine-safe, but it is safe to use
|
||||
// multiple iterators concurrently, with each in a dedicated goroutine.
|
||||
// Also, an iterator is not necessarily safe for concurrent use, but it is
|
||||
// safe to use multiple iterators concurrently, with each in a dedicated
|
||||
// goroutine.
|
||||
type Iterator interface {
|
||||
CommonIterator
|
||||
|
||||
|
@ -87,7 +88,7 @@ type Iterator interface {
|
|||
// its contents may change on the next call to any 'seeks method'.
|
||||
Key() []byte
|
||||
|
||||
// Value returns the key of the current key/value pair, or nil if done.
|
||||
// Value returns the value of the current key/value pair, or nil if done.
|
||||
// The caller should not modify the contents of the returned slice, and
|
||||
// its contents may change on the next call to any 'seeks method'.
|
||||
Value() []byte
|
||||
|
@ -98,7 +99,7 @@ type Iterator interface {
|
|||
//
|
||||
// ErrorCallbackSetter implemented by indexed and merged iterator.
|
||||
type ErrorCallbackSetter interface {
|
||||
// SetErrorCallback allows set an error callback of the coresponding
|
||||
// SetErrorCallback allows set an error callback of the corresponding
|
||||
// iterator. Use nil to clear the callback.
|
||||
SetErrorCallback(f func(err error))
|
||||
}
|
||||
|
|
|
@ -24,7 +24,7 @@ var _ = testutil.Defer(func() {
|
|||
|
||||
// Build key/value.
|
||||
filledKV := make([]testutil.KeyValue, filled)
|
||||
kv := testutil.KeyValue_Generate(nil, 100, 1, 10, 4, 4)
|
||||
kv := testutil.KeyValue_Generate(nil, 100, 1, 1, 10, 4, 4)
|
||||
kv.Iterate(func(i int, key, value []byte) {
|
||||
filledKV[rnd.Intn(filled)].Put(key, value)
|
||||
})
|
||||
|
@ -49,7 +49,7 @@ var _ = testutil.Defer(func() {
|
|||
}
|
||||
testutil.DoIteratorTesting(&t)
|
||||
done <- true
|
||||
}, 1.5)
|
||||
}, 15.0)
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
@ -83,6 +83,7 @@ import (
|
|||
"io"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/errors"
|
||||
"github.com/syndtr/goleveldb/leveldb/storage"
|
||||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
|
@ -165,7 +166,7 @@ func (r *Reader) corrupt(n int, reason string, skip bool) error {
|
|||
r.dropper.Drop(&ErrCorrupted{n, reason})
|
||||
}
|
||||
if r.strict && !skip {
|
||||
r.err = errors.NewErrCorrupted(nil, &ErrCorrupted{n, reason})
|
||||
r.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrCorrupted{n, reason})
|
||||
return r.err
|
||||
}
|
||||
return errSkip
|
||||
|
@ -179,34 +180,37 @@ func (r *Reader) nextChunk(first bool) error {
|
|||
checksum := binary.LittleEndian.Uint32(r.buf[r.j+0 : r.j+4])
|
||||
length := binary.LittleEndian.Uint16(r.buf[r.j+4 : r.j+6])
|
||||
chunkType := r.buf[r.j+6]
|
||||
|
||||
unprocBlock := r.n - r.j
|
||||
if checksum == 0 && length == 0 && chunkType == 0 {
|
||||
// Drop entire block.
|
||||
m := r.n - r.j
|
||||
r.i = r.n
|
||||
r.j = r.n
|
||||
return r.corrupt(m, "zero header", false)
|
||||
} else {
|
||||
m := r.n - r.j
|
||||
r.i = r.j + headerSize
|
||||
r.j = r.j + headerSize + int(length)
|
||||
if r.j > r.n {
|
||||
// Drop entire block.
|
||||
r.i = r.n
|
||||
r.j = r.n
|
||||
return r.corrupt(m, "chunk length overflows block", false)
|
||||
} else if r.checksum && checksum != util.NewCRC(r.buf[r.i-1:r.j]).Value() {
|
||||
// Drop entire block.
|
||||
r.i = r.n
|
||||
r.j = r.n
|
||||
return r.corrupt(m, "checksum mismatch", false)
|
||||
}
|
||||
return r.corrupt(unprocBlock, "zero header", false)
|
||||
}
|
||||
if chunkType < fullChunkType || chunkType > lastChunkType {
|
||||
// Drop entire block.
|
||||
r.i = r.n
|
||||
r.j = r.n
|
||||
return r.corrupt(unprocBlock, fmt.Sprintf("invalid chunk type %#x", chunkType), false)
|
||||
}
|
||||
r.i = r.j + headerSize
|
||||
r.j = r.j + headerSize + int(length)
|
||||
if r.j > r.n {
|
||||
// Drop entire block.
|
||||
r.i = r.n
|
||||
r.j = r.n
|
||||
return r.corrupt(unprocBlock, "chunk length overflows block", false)
|
||||
} else if r.checksum && checksum != util.NewCRC(r.buf[r.i-1:r.j]).Value() {
|
||||
// Drop entire block.
|
||||
r.i = r.n
|
||||
r.j = r.n
|
||||
return r.corrupt(unprocBlock, "checksum mismatch", false)
|
||||
}
|
||||
if first && chunkType != fullChunkType && chunkType != firstChunkType {
|
||||
m := r.j - r.i
|
||||
chunkLength := (r.j - r.i) + headerSize
|
||||
r.i = r.j
|
||||
// Report the error, but skip it.
|
||||
return r.corrupt(m+headerSize, "orphan chunk", true)
|
||||
return r.corrupt(chunkLength, "orphan chunk", true)
|
||||
}
|
||||
r.last = chunkType == fullChunkType || chunkType == lastChunkType
|
||||
return nil
|
||||
|
|
|
@ -11,132 +11,133 @@ import (
|
|||
"fmt"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/errors"
|
||||
"github.com/syndtr/goleveldb/leveldb/storage"
|
||||
)
|
||||
|
||||
type ErrIkeyCorrupted struct {
|
||||
// ErrInternalKeyCorrupted records internal key corruption.
|
||||
type ErrInternalKeyCorrupted struct {
|
||||
Ikey []byte
|
||||
Reason string
|
||||
}
|
||||
|
||||
func (e *ErrIkeyCorrupted) Error() string {
|
||||
return fmt.Sprintf("leveldb: iKey %q corrupted: %s", e.Ikey, e.Reason)
|
||||
func (e *ErrInternalKeyCorrupted) Error() string {
|
||||
return fmt.Sprintf("leveldb: internal key %q corrupted: %s", e.Ikey, e.Reason)
|
||||
}
|
||||
|
||||
func newErrIkeyCorrupted(ikey []byte, reason string) error {
|
||||
return errors.NewErrCorrupted(nil, &ErrIkeyCorrupted{append([]byte{}, ikey...), reason})
|
||||
func newErrInternalKeyCorrupted(ikey []byte, reason string) error {
|
||||
return errors.NewErrCorrupted(storage.FileDesc{}, &ErrInternalKeyCorrupted{append([]byte{}, ikey...), reason})
|
||||
}
|
||||
|
||||
type kType int
|
||||
type keyType uint
|
||||
|
||||
func (kt kType) String() string {
|
||||
func (kt keyType) String() string {
|
||||
switch kt {
|
||||
case ktDel:
|
||||
case keyTypeDel:
|
||||
return "d"
|
||||
case ktVal:
|
||||
case keyTypeVal:
|
||||
return "v"
|
||||
}
|
||||
return "x"
|
||||
return fmt.Sprintf("<invalid:%#x>", uint(kt))
|
||||
}
|
||||
|
||||
// Value types encoded as the last component of internal keys.
|
||||
// Don't modify; this value are saved to disk.
|
||||
const (
|
||||
ktDel kType = iota
|
||||
ktVal
|
||||
keyTypeDel = keyType(0)
|
||||
keyTypeVal = keyType(1)
|
||||
)
|
||||
|
||||
// ktSeek defines the kType that should be passed when constructing an
|
||||
// keyTypeSeek defines the keyType that should be passed when constructing an
|
||||
// internal key for seeking to a particular sequence number (since we
|
||||
// sort sequence numbers in decreasing order and the value type is
|
||||
// embedded as the low 8 bits in the sequence number in internal keys,
|
||||
// we need to use the highest-numbered ValueType, not the lowest).
|
||||
const ktSeek = ktVal
|
||||
const keyTypeSeek = keyTypeVal
|
||||
|
||||
const (
|
||||
// Maximum value possible for sequence number; the 8-bits are
|
||||
// used by value type, so its can packed together in single
|
||||
// 64-bit integer.
|
||||
kMaxSeq uint64 = (uint64(1) << 56) - 1
|
||||
keyMaxSeq = (uint64(1) << 56) - 1
|
||||
// Maximum value possible for packed sequence number and type.
|
||||
kMaxNum uint64 = (kMaxSeq << 8) | uint64(ktSeek)
|
||||
keyMaxNum = (keyMaxSeq << 8) | uint64(keyTypeSeek)
|
||||
)
|
||||
|
||||
// Maximum number encoded in bytes.
|
||||
var kMaxNumBytes = make([]byte, 8)
|
||||
var keyMaxNumBytes = make([]byte, 8)
|
||||
|
||||
func init() {
|
||||
binary.LittleEndian.PutUint64(kMaxNumBytes, kMaxNum)
|
||||
binary.LittleEndian.PutUint64(keyMaxNumBytes, keyMaxNum)
|
||||
}
|
||||
|
||||
type iKey []byte
type internalKey []byte

func newIkey(ukey []byte, seq uint64, kt kType) iKey {
if seq > kMaxSeq {
func makeInternalKey(dst, ukey []byte, seq uint64, kt keyType) internalKey {
if seq > keyMaxSeq {
panic("leveldb: invalid sequence number")
} else if kt > ktVal {
} else if kt > keyTypeVal {
panic("leveldb: invalid type")
}

ik := make(iKey, len(ukey)+8)
copy(ik, ukey)
binary.LittleEndian.PutUint64(ik[len(ukey):], (seq<<8)|uint64(kt))
return ik
dst = ensureBuffer(dst, len(ukey)+8)
copy(dst, ukey)
binary.LittleEndian.PutUint64(dst[len(ukey):], (seq<<8)|uint64(kt))
return internalKey(dst)
}

func parseIkey(ik []byte) (ukey []byte, seq uint64, kt kType, err error) {
func parseInternalKey(ik []byte) (ukey []byte, seq uint64, kt keyType, err error) {
if len(ik) < 8 {
return nil, 0, 0, newErrIkeyCorrupted(ik, "invalid length")
return nil, 0, 0, newErrInternalKeyCorrupted(ik, "invalid length")
}
num := binary.LittleEndian.Uint64(ik[len(ik)-8:])
seq, kt = uint64(num>>8), kType(num&0xff)
if kt > ktVal {
return nil, 0, 0, newErrIkeyCorrupted(ik, "invalid type")
seq, kt = uint64(num>>8), keyType(num&0xff)
if kt > keyTypeVal {
return nil, 0, 0, newErrInternalKeyCorrupted(ik, "invalid type")
}
ukey = ik[:len(ik)-8]
return
}

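
As a standalone illustration of the layout the helpers above encode, an internal key is the user key followed by eight little-endian bytes packing (seq<<8)|keyType; a self-contained sketch (not part of the diff):

// Illustrative sketch mirroring makeInternalKey/parseInternalKey above.
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	ukey := []byte("foo")
	seq, kt := uint64(100), uint64(1) // 1 == keyTypeVal in the code above

	// Encode: user key + 8 trailing bytes of (seq<<8)|kt.
	ik := make([]byte, len(ukey)+8)
	copy(ik, ukey)
	binary.LittleEndian.PutUint64(ik[len(ukey):], (seq<<8)|kt)

	// Decode the trailing 8 bytes back into seq and key type.
	num := binary.LittleEndian.Uint64(ik[len(ik)-8:])
	fmt.Printf("ukey=%q seq=%d type=%d\n", ik[:len(ik)-8], num>>8, num&0xff)
	// Prints: ukey="foo" seq=100 type=1
}
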
func validIkey(ik []byte) bool {
|
||||
_, _, _, err := parseIkey(ik)
|
||||
func validInternalKey(ik []byte) bool {
|
||||
_, _, _, err := parseInternalKey(ik)
|
||||
return err == nil
|
||||
}
|
||||
|
||||
func (ik iKey) assert() {
|
||||
func (ik internalKey) assert() {
|
||||
if ik == nil {
|
||||
panic("leveldb: nil iKey")
|
||||
panic("leveldb: nil internalKey")
|
||||
}
|
||||
if len(ik) < 8 {
|
||||
panic(fmt.Sprintf("leveldb: iKey %q, len=%d: invalid length", []byte(ik), len(ik)))
|
||||
panic(fmt.Sprintf("leveldb: internal key %q, len=%d: invalid length", []byte(ik), len(ik)))
|
||||
}
|
||||
}
|
||||
|
||||
func (ik iKey) ukey() []byte {
|
||||
func (ik internalKey) ukey() []byte {
|
||||
ik.assert()
|
||||
return ik[:len(ik)-8]
|
||||
}
|
||||
|
||||
func (ik iKey) num() uint64 {
|
||||
func (ik internalKey) num() uint64 {
|
||||
ik.assert()
|
||||
return binary.LittleEndian.Uint64(ik[len(ik)-8:])
|
||||
}
|
||||
|
||||
func (ik iKey) parseNum() (seq uint64, kt kType) {
|
||||
func (ik internalKey) parseNum() (seq uint64, kt keyType) {
|
||||
num := ik.num()
|
||||
seq, kt = uint64(num>>8), kType(num&0xff)
|
||||
if kt > ktVal {
|
||||
panic(fmt.Sprintf("leveldb: iKey %q, len=%d: invalid type %#x", []byte(ik), len(ik), kt))
|
||||
seq, kt = uint64(num>>8), keyType(num&0xff)
|
||||
if kt > keyTypeVal {
|
||||
panic(fmt.Sprintf("leveldb: internal key %q, len=%d: invalid type %#x", []byte(ik), len(ik), kt))
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (ik iKey) String() string {
|
||||
func (ik internalKey) String() string {
|
||||
if ik == nil {
|
||||
return "<nil>"
|
||||
}
|
||||
|
||||
if ukey, seq, kt, err := parseIkey(ik); err == nil {
|
||||
if ukey, seq, kt, err := parseInternalKey(ik); err == nil {
|
||||
return fmt.Sprintf("%s,%s%d", shorten(string(ukey)), kt, seq)
|
||||
} else {
|
||||
return "<invalid>"
|
||||
}
|
||||
return fmt.Sprintf("<invalid:%#x>", []byte(ik))
|
||||
}
|
||||
|
|
|
@ -15,8 +15,8 @@ import (
|
|||
|
||||
var defaultIComparer = &iComparer{comparer.DefaultComparer}
|
||||
|
||||
func ikey(key string, seq uint64, kt kType) iKey {
|
||||
return newIkey([]byte(key), uint64(seq), kt)
|
||||
func ikey(key string, seq uint64, kt keyType) internalKey {
|
||||
return makeInternalKey(nil, []byte(key), uint64(seq), kt)
|
||||
}
|
||||
|
||||
func shortSep(a, b []byte) []byte {
|
||||
|
@ -37,7 +37,7 @@ func shortSuccessor(b []byte) []byte {
|
|||
return dst
|
||||
}
|
||||
|
||||
func testSingleKey(t *testing.T, key string, seq uint64, kt kType) {
|
||||
func testSingleKey(t *testing.T, key string, seq uint64, kt keyType) {
|
||||
ik := ikey(key, seq, kt)
|
||||
|
||||
if !bytes.Equal(ik.ukey(), []byte(key)) {
|
||||
|
@ -52,7 +52,7 @@ func testSingleKey(t *testing.T, key string, seq uint64, kt kType) {
|
|||
t.Errorf("type does not equal, got %v, want %v", rt, kt)
|
||||
}
|
||||
|
||||
if rukey, rseq, rt, kerr := parseIkey(ik); kerr == nil {
|
||||
if rukey, rseq, rt, kerr := parseInternalKey(ik); kerr == nil {
|
||||
if !bytes.Equal(rukey, []byte(key)) {
|
||||
t.Errorf("user key does not equal, got %v, want %v", string(ik.ukey()), key)
|
||||
}
|
||||
|
@ -67,7 +67,7 @@ func testSingleKey(t *testing.T, key string, seq uint64, kt kType) {
|
|||
}
|
||||
}
|
||||
|
||||
func TestIkey_EncodeDecode(t *testing.T) {
|
||||
func TestInternalKey_EncodeDecode(t *testing.T) {
|
||||
keys := []string{"", "k", "hello", "longggggggggggggggggggggg"}
|
||||
seqs := []uint64{
|
||||
1, 2, 3,
|
||||
|
@ -77,8 +77,8 @@ func TestIkey_EncodeDecode(t *testing.T) {
|
|||
}
|
||||
for _, key := range keys {
|
||||
for _, seq := range seqs {
|
||||
testSingleKey(t, key, seq, ktVal)
|
||||
testSingleKey(t, "hello", 1, ktDel)
|
||||
testSingleKey(t, key, seq, keyTypeVal)
|
||||
testSingleKey(t, "hello", 1, keyTypeDel)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -89,45 +89,45 @@ func assertBytes(t *testing.T, want, got []byte) {
|
|||
}
|
||||
}
|
||||
|
||||
func TestIkeyShortSeparator(t *testing.T) {
|
||||
func TestInternalKeyShortSeparator(t *testing.T) {
|
||||
// When user keys are same
|
||||
assertBytes(t, ikey("foo", 100, ktVal),
|
||||
shortSep(ikey("foo", 100, ktVal),
|
||||
ikey("foo", 99, ktVal)))
|
||||
assertBytes(t, ikey("foo", 100, ktVal),
|
||||
shortSep(ikey("foo", 100, ktVal),
|
||||
ikey("foo", 101, ktVal)))
|
||||
assertBytes(t, ikey("foo", 100, ktVal),
|
||||
shortSep(ikey("foo", 100, ktVal),
|
||||
ikey("foo", 100, ktVal)))
|
||||
assertBytes(t, ikey("foo", 100, ktVal),
|
||||
shortSep(ikey("foo", 100, ktVal),
|
||||
ikey("foo", 100, ktDel)))
|
||||
assertBytes(t, ikey("foo", 100, keyTypeVal),
|
||||
shortSep(ikey("foo", 100, keyTypeVal),
|
||||
ikey("foo", 99, keyTypeVal)))
|
||||
assertBytes(t, ikey("foo", 100, keyTypeVal),
|
||||
shortSep(ikey("foo", 100, keyTypeVal),
|
||||
ikey("foo", 101, keyTypeVal)))
|
||||
assertBytes(t, ikey("foo", 100, keyTypeVal),
|
||||
shortSep(ikey("foo", 100, keyTypeVal),
|
||||
ikey("foo", 100, keyTypeVal)))
|
||||
assertBytes(t, ikey("foo", 100, keyTypeVal),
|
||||
shortSep(ikey("foo", 100, keyTypeVal),
|
||||
ikey("foo", 100, keyTypeDel)))
|
||||
|
||||
// When user keys are misordered
|
||||
assertBytes(t, ikey("foo", 100, ktVal),
|
||||
shortSep(ikey("foo", 100, ktVal),
|
||||
ikey("bar", 99, ktVal)))
|
||||
assertBytes(t, ikey("foo", 100, keyTypeVal),
|
||||
shortSep(ikey("foo", 100, keyTypeVal),
|
||||
ikey("bar", 99, keyTypeVal)))
|
||||
|
||||
// When user keys are different, but correctly ordered
|
||||
assertBytes(t, ikey("g", uint64(kMaxSeq), ktSeek),
|
||||
shortSep(ikey("foo", 100, ktVal),
|
||||
ikey("hello", 200, ktVal)))
|
||||
assertBytes(t, ikey("g", uint64(keyMaxSeq), keyTypeSeek),
|
||||
shortSep(ikey("foo", 100, keyTypeVal),
|
||||
ikey("hello", 200, keyTypeVal)))
|
||||
|
||||
// When start user key is prefix of limit user key
|
||||
assertBytes(t, ikey("foo", 100, ktVal),
|
||||
shortSep(ikey("foo", 100, ktVal),
|
||||
ikey("foobar", 200, ktVal)))
|
||||
assertBytes(t, ikey("foo", 100, keyTypeVal),
|
||||
shortSep(ikey("foo", 100, keyTypeVal),
|
||||
ikey("foobar", 200, keyTypeVal)))
|
||||
|
||||
// When limit user key is prefix of start user key
|
||||
assertBytes(t, ikey("foobar", 100, ktVal),
|
||||
shortSep(ikey("foobar", 100, ktVal),
|
||||
ikey("foo", 200, ktVal)))
|
||||
assertBytes(t, ikey("foobar", 100, keyTypeVal),
|
||||
shortSep(ikey("foobar", 100, keyTypeVal),
|
||||
ikey("foo", 200, keyTypeVal)))
|
||||
}
|
||||
|
||||
func TestIkeyShortestSuccessor(t *testing.T) {
|
||||
assertBytes(t, ikey("g", uint64(kMaxSeq), ktSeek),
|
||||
shortSuccessor(ikey("foo", 100, ktVal)))
|
||||
assertBytes(t, ikey("\xff\xff", 100, ktVal),
|
||||
shortSuccessor(ikey("\xff\xff", 100, ktVal)))
|
||||
func TestInternalKeyShortestSuccessor(t *testing.T) {
|
||||
assertBytes(t, ikey("g", uint64(keyMaxSeq), keyTypeSeek),
|
||||
shortSuccessor(ikey("foo", 100, keyTypeVal)))
|
||||
assertBytes(t, ikey("\xff\xff", 100, keyTypeVal),
|
||||
shortSuccessor(ikey("\xff\xff", 100, keyTypeVal)))
|
||||
}
|
||||
|
|
|
@ -17,6 +17,7 @@ import (
|
|||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
// Common errors.
|
||||
var (
|
||||
ErrNotFound = errors.ErrNotFound
|
||||
ErrIterReleased = errors.New("leveldb/memdb: iterator released")
|
||||
|
@ -206,6 +207,7 @@ func (p *DB) randHeight() (h int) {
|
|||
return
|
||||
}
|
||||
|
||||
// Must hold RW-lock if prev == true, as it use shared prevNode slice.
|
||||
func (p *DB) findGE(key []byte, prev bool) (int, bool) {
|
||||
node := 0
|
||||
h := p.maxHeight - 1
|
||||
|
@ -302,7 +304,7 @@ func (p *DB) Put(key []byte, value []byte) error {
|
|||
node := len(p.nodeData)
|
||||
p.nodeData = append(p.nodeData, kvOffset, len(key), len(value), h)
|
||||
for i, n := range p.prevNode[:h] {
|
||||
m := n + 4 + i
|
||||
m := n + nNext + i
|
||||
p.nodeData = append(p.nodeData, p.nodeData[m])
|
||||
p.nodeData[m] = node
|
||||
}
|
||||
|
@ -327,7 +329,7 @@ func (p *DB) Delete(key []byte) error {
|
|||
|
||||
h := p.nodeData[node+nHeight]
|
||||
for i, n := range p.prevNode[:h] {
|
||||
m := n + 4 + i
|
||||
m := n + nNext + i
|
||||
p.nodeData[m] = p.nodeData[p.nodeData[m]+nNext+i]
|
||||
}
|
||||
|
||||
|
@ -384,7 +386,7 @@ func (p *DB) Find(key []byte) (rkey, value []byte, err error) {
|
|||
}
|
||||
|
||||
// NewIterator returns an iterator of the DB.
|
||||
// The returned iterator is not goroutine-safe, but it is safe to use
|
||||
// The returned iterator is not safe for concurrent use, but it is safe to use
|
||||
// multiple iterators concurrently, with each in a dedicated goroutine.
|
||||
// It is also safe to use an iterator concurrently with modifying its
|
||||
// underlying DB. However, the resultant key/value pairs are not guaranteed
|
||||
|
@ -410,7 +412,7 @@ func (p *DB) Capacity() int {
|
|||
}
|
||||
|
||||
// Size returns sum of keys and values length. Note that deleted
|
||||
// key/value will not be accouted for, but it will still consume
|
||||
// key/value will not be accounted for, but it will still consume
|
||||
// the buffer, since the buffer is append only.
|
||||
func (p *DB) Size() int {
|
||||
p.mu.RLock()
|
||||
|
@ -434,27 +436,32 @@ func (p *DB) Len() int {
|
|||
|
||||
// Reset resets the DB to initial empty state. Allows reuse the buffer.
|
||||
func (p *DB) Reset() {
|
||||
p.mu.Lock()
|
||||
p.rnd = rand.New(rand.NewSource(0xdeadbeef))
|
||||
p.maxHeight = 1
|
||||
p.n = 0
|
||||
p.kvSize = 0
|
||||
p.kvData = p.kvData[:0]
|
||||
p.nodeData = p.nodeData[:4+tMaxHeight]
|
||||
p.nodeData = p.nodeData[:nNext+tMaxHeight]
|
||||
p.nodeData[nKV] = 0
|
||||
p.nodeData[nKey] = 0
|
||||
p.nodeData[nVal] = 0
|
||||
p.nodeData[nHeight] = tMaxHeight
|
||||
for n := 0; n < tMaxHeight; n++ {
|
||||
p.nodeData[4+n] = 0
|
||||
p.nodeData[nNext+n] = 0
|
||||
p.prevNode[n] = 0
|
||||
}
|
||||
p.mu.Unlock()
|
||||
}
|
||||
|
||||
// New creates a new initalized in-memory key/value DB. The capacity
|
||||
// New creates a new initialized in-memory key/value DB. The capacity
|
||||
// is the initial key/value buffer capacity. The capacity is advisory,
|
||||
// not enforced.
|
||||
//
|
||||
// The returned DB instance is goroutine-safe.
|
||||
// This DB is append-only, deleting an entry would remove entry node but not
|
||||
// reclaim KV buffer.
|
||||
//
|
||||
// The returned DB instance is safe for concurrent use.
|
||||
func New(cmp comparer.BasicComparer, capacity int) *DB {
|
||||
p := &DB{
|
||||
cmp: cmp,
|
||||
|
|
|
@ -73,7 +73,7 @@ var _ = testutil.Defer(func() {
|
|||
db := New(comparer.DefaultComparer, 0)
|
||||
t := testutil.DBTesting{
|
||||
DB: db,
|
||||
Deleted: testutil.KeyValue_Generate(nil, 1000, 1, 30, 5, 5).Clone(),
|
||||
Deleted: testutil.KeyValue_Generate(nil, 1000, 1, 1, 30, 5, 5).Clone(),
|
||||
PostFn: func(t *testutil.DBTesting) {
|
||||
Expect(db.Len()).Should(Equal(t.Present.Len()))
|
||||
Expect(db.Size()).Should(Equal(t.Present.Size()))
|
||||
|
|
|
@ -8,10 +8,11 @@
|
|||
package opt
|
||||
|
||||
import (
|
||||
"math"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/cache"
|
||||
"github.com/syndtr/goleveldb/leveldb/comparer"
|
||||
"github.com/syndtr/goleveldb/leveldb/filter"
|
||||
"math"
|
||||
)
|
||||
|
||||
const (
|
||||
|
@ -35,8 +36,6 @@ var (
|
|||
DefaultCompactionTotalSizeMultiplier = 10.0
|
||||
DefaultCompressionType = SnappyCompression
|
||||
DefaultIteratorSamplingRate = 1 * MiB
|
||||
DefaultMaxMemCompationLevel = 2
|
||||
DefaultNumLevel = 7
|
||||
DefaultOpenFilesCacher = LRUCacher
|
||||
DefaultOpenFilesCacheCapacity = 500
|
||||
DefaultWriteBuffer = 4 * MiB
|
||||
|
@ -250,6 +249,11 @@ type Options struct {
|
|||
// The default value (DefaultCompression) uses snappy compression.
|
||||
Compression Compression
|
||||
|
||||
// DisableBufferPool allows disable use of util.BufferPool functionality.
|
||||
//
|
||||
// The default value is false.
|
||||
DisableBufferPool bool
|
||||
|
||||
// DisableBlockCache allows disable use of cache.Cache functionality on
|
||||
// 'sorted table' block.
|
||||
//
|
||||
|
@ -261,6 +265,13 @@ type Options struct {
|
|||
// The default value is false.
|
||||
DisableCompactionBackoff bool
|
||||
|
||||
// DisableLargeBatchTransaction allows disabling switch-to-transaction mode
|
||||
// on large batch write. If enable batch writes large than WriteBuffer will
|
||||
// use transaction.
|
||||
//
|
||||
// The default is false.
|
||||
DisableLargeBatchTransaction bool
|
||||
|
||||
// ErrorIfExist defines whether an error should returned if the DB already
|
||||
// exist.
|
||||
//
|
||||
|
@ -296,18 +307,15 @@ type Options struct {
|
|||
// The default is 1MiB.
|
||||
IteratorSamplingRate int
|
||||
|
||||
// MaxMemCompationLevel defines maximum level a newly compacted 'memdb'
|
||||
// will be pushed into if doesn't creates overlap. This should less than
|
||||
// NumLevel. Use -1 for level-0.
|
||||
// NoSync allows completely disable fsync.
|
||||
//
|
||||
// The default is 2.
|
||||
MaxMemCompationLevel int
|
||||
// The default is false.
|
||||
NoSync bool
|
||||
|
||||
// NumLevel defines number of database level. The level shouldn't changed
|
||||
// between opens, or the database will panic.
|
||||
// NoWriteMerge allows disabling write merge.
|
||||
//
|
||||
// The default is 7.
|
||||
NumLevel int
|
||||
// The default is false.
|
||||
NoWriteMerge bool
|
||||
|
||||
// OpenFilesCacher provides cache algorithm for open files caching.
|
||||
// Specify NoCacher to disable caching algorithm.
|
||||
|
@ -321,6 +329,11 @@ type Options struct {
|
|||
// The default value is 500.
|
||||
OpenFilesCacheCapacity int
|
||||
|
||||
// If true then opens DB in read-only mode.
|
||||
//
|
||||
// The default value is false.
|
||||
ReadOnly bool
|
||||
|
||||
// Strict defines the DB strict level.
|
||||
Strict Strict
|
||||
|
||||
|
@ -425,7 +438,7 @@ func (o *Options) GetCompactionTableSize(level int) int {
|
|||
if o.CompactionTableSize > 0 {
|
||||
base = o.CompactionTableSize
|
||||
}
|
||||
if len(o.CompactionTableSizeMultiplierPerLevel) > level && o.CompactionTableSizeMultiplierPerLevel[level] > 0 {
|
||||
if level < len(o.CompactionTableSizeMultiplierPerLevel) && o.CompactionTableSizeMultiplierPerLevel[level] > 0 {
|
||||
mult = o.CompactionTableSizeMultiplierPerLevel[level]
|
||||
} else if o.CompactionTableSizeMultiplier > 0 {
|
||||
mult = math.Pow(o.CompactionTableSizeMultiplier, float64(level))
|
||||
|
@ -446,7 +459,7 @@ func (o *Options) GetCompactionTotalSize(level int) int64 {
|
|||
if o.CompactionTotalSize > 0 {
|
||||
base = o.CompactionTotalSize
|
||||
}
|
||||
if len(o.CompactionTotalSizeMultiplierPerLevel) > level && o.CompactionTotalSizeMultiplierPerLevel[level] > 0 {
|
||||
if level < len(o.CompactionTotalSizeMultiplierPerLevel) && o.CompactionTotalSizeMultiplierPerLevel[level] > 0 {
|
||||
mult = o.CompactionTotalSizeMultiplierPerLevel[level]
|
||||
} else if o.CompactionTotalSizeMultiplier > 0 {
|
||||
mult = math.Pow(o.CompactionTotalSizeMultiplier, float64(level))
|
||||
|
@ -472,6 +485,20 @@ func (o *Options) GetCompression() Compression {
|
|||
return o.Compression
|
||||
}
|
||||
|
||||
func (o *Options) GetDisableBufferPool() bool {
|
||||
if o == nil {
|
||||
return false
|
||||
}
|
||||
return o.DisableBufferPool
|
||||
}
|
||||
|
||||
func (o *Options) GetDisableBlockCache() bool {
|
||||
if o == nil {
|
||||
return false
|
||||
}
|
||||
return o.DisableBlockCache
|
||||
}
|
||||
|
||||
func (o *Options) GetDisableCompactionBackoff() bool {
|
||||
if o == nil {
|
||||
return false
|
||||
|
@ -479,6 +506,13 @@ func (o *Options) GetDisableCompactionBackoff() bool {
|
|||
return o.DisableCompactionBackoff
|
||||
}
|
||||
|
||||
func (o *Options) GetDisableLargeBatchTransaction() bool {
|
||||
if o == nil {
|
||||
return false
|
||||
}
|
||||
return o.DisableLargeBatchTransaction
|
||||
}
|
||||
|
||||
func (o *Options) GetErrorIfExist() bool {
|
||||
if o == nil {
|
||||
return false
|
||||
|
@ -507,26 +541,18 @@ func (o *Options) GetIteratorSamplingRate() int {
return o.IteratorSamplingRate
}

func (o *Options) GetMaxMemCompationLevel() int {
level := DefaultMaxMemCompationLevel
if o != nil {
if o.MaxMemCompationLevel > 0 {
level = o.MaxMemCompationLevel
} else if o.MaxMemCompationLevel < 0 {
level = 0
}
func (o *Options) GetNoSync() bool {
if o == nil {
return false
}
if level >= o.GetNumLevel() {
return o.GetNumLevel() - 1
}
return level
return o.NoSync
}

func (o *Options) GetNumLevel() int {
if o == nil || o.NumLevel <= 0 {
return DefaultNumLevel
func (o *Options) GetNoWriteMerge() bool {
if o == nil {
return false
}
return o.NumLevel
return o.NoWriteMerge
}

func (o *Options) GetOpenFilesCacher() Cacher {

@ -548,6 +574,13 @@ func (o *Options) GetOpenFilesCacheCapacity() int {
return o.OpenFilesCacheCapacity
}

func (o *Options) GetReadOnly() bool {
if o == nil {
return false
}
return o.ReadOnly
}

func (o *Options) GetStrict(strict Strict) bool {
if o == nil || o.Strict == 0 {
return DefaultStrict&strict != 0

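
For orientation only (not part of the diff): a hedged sketch combining the option fields whose getters are added above; the values and path are illustrative, not recommendations.

// Illustrative sketch, not part of the vendored diff.
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func main() {
	o := &opt.Options{
		NoSync:                       true,  // GetNoSync: disable fsync entirely
		NoWriteMerge:                 false, // GetNoWriteMerge: keep write merging enabled
		ReadOnly:                     false, // GetReadOnly: open the DB read-write
		DisableLargeBatchTransaction: false, // keep the large-batch transaction path
	}
	db, err := leveldb.OpenFile("/tmp/example-db", o)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
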
@ -608,6 +641,11 @@ func (ro *ReadOptions) GetStrict(strict Strict) bool {
|
|||
// WriteOptions holds the optional parameters for 'write operation'. The
|
||||
// 'write operation' includes Write, Put and Delete.
|
||||
type WriteOptions struct {
|
||||
// NoWriteMerge allows disabling write merge.
|
||||
//
|
||||
// The default is false.
|
||||
NoWriteMerge bool
|
||||
|
||||
// Sync is whether to sync underlying writes from the OS buffer cache
|
||||
// through to actual disk, if applicable. Setting Sync can result in
|
||||
// slower writes.
|
||||
|
@ -623,6 +661,13 @@ type WriteOptions struct {
|
|||
Sync bool
|
||||
}
|
||||
|
||||
func (wo *WriteOptions) GetNoWriteMerge() bool {
|
||||
if wo == nil {
|
||||
return false
|
||||
}
|
||||
return wo.NoWriteMerge
|
||||
}
|
||||
|
||||
func (wo *WriteOptions) GetSync() bool {
|
||||
if wo == nil {
|
||||
return false
|
||||
|
|
|
@ -43,6 +43,8 @@ func (s *session) setOptions(o *opt.Options) {
|
|||
s.o.cache()
|
||||
}
|
||||
|
||||
const optCachedLevel = 7
|
||||
|
||||
type cachedOptions struct {
|
||||
*opt.Options
|
||||
|
||||
|
@ -54,15 +56,13 @@ type cachedOptions struct {
|
|||
}
|
||||
|
||||
func (co *cachedOptions) cache() {
|
||||
numLevel := co.Options.GetNumLevel()
|
||||
co.compactionExpandLimit = make([]int, optCachedLevel)
|
||||
co.compactionGPOverlaps = make([]int, optCachedLevel)
|
||||
co.compactionSourceLimit = make([]int, optCachedLevel)
|
||||
co.compactionTableSize = make([]int, optCachedLevel)
|
||||
co.compactionTotalSize = make([]int64, optCachedLevel)
|
||||
|
||||
co.compactionExpandLimit = make([]int, numLevel)
|
||||
co.compactionGPOverlaps = make([]int, numLevel)
|
||||
co.compactionSourceLimit = make([]int, numLevel)
|
||||
co.compactionTableSize = make([]int, numLevel)
|
||||
co.compactionTotalSize = make([]int64, numLevel)
|
||||
|
||||
for level := 0; level < numLevel; level++ {
|
||||
for level := 0; level < optCachedLevel; level++ {
|
||||
co.compactionExpandLimit[level] = co.Options.GetCompactionExpandLimit(level)
|
||||
co.compactionGPOverlaps[level] = co.Options.GetCompactionGPOverlaps(level)
|
||||
co.compactionSourceLimit[level] = co.Options.GetCompactionSourceLimit(level)
|
||||
|
@ -72,21 +72,36 @@ func (co *cachedOptions) cache() {
|
|||
}
|
||||
|
||||
func (co *cachedOptions) GetCompactionExpandLimit(level int) int {
|
||||
return co.compactionExpandLimit[level]
|
||||
if level < optCachedLevel {
|
||||
return co.compactionExpandLimit[level]
|
||||
}
|
||||
return co.Options.GetCompactionExpandLimit(level)
|
||||
}
|
||||
|
||||
func (co *cachedOptions) GetCompactionGPOverlaps(level int) int {
|
||||
return co.compactionGPOverlaps[level]
|
||||
if level < optCachedLevel {
|
||||
return co.compactionGPOverlaps[level]
|
||||
}
|
||||
return co.Options.GetCompactionGPOverlaps(level)
|
||||
}
|
||||
|
||||
func (co *cachedOptions) GetCompactionSourceLimit(level int) int {
|
||||
return co.compactionSourceLimit[level]
|
||||
if level < optCachedLevel {
|
||||
return co.compactionSourceLimit[level]
|
||||
}
|
||||
return co.Options.GetCompactionSourceLimit(level)
|
||||
}
|
||||
|
||||
func (co *cachedOptions) GetCompactionTableSize(level int) int {
|
||||
return co.compactionTableSize[level]
|
||||
if level < optCachedLevel {
|
||||
return co.compactionTableSize[level]
|
||||
}
|
||||
return co.Options.GetCompactionTableSize(level)
|
||||
}
|
||||
|
||||
func (co *cachedOptions) GetCompactionTotalSize(level int) int64 {
|
||||
return co.compactionTotalSize[level]
|
||||
if level < optCachedLevel {
|
||||
return co.compactionTotalSize[level]
|
||||
}
|
||||
return co.Options.GetCompactionTotalSize(level)
|
||||
}
|
||||
|
|
|
@@ -11,16 +11,15 @@ import (
	"io"
	"os"
	"sync"
	"sync/atomic"

	"github.com/syndtr/goleveldb/leveldb/errors"
	"github.com/syndtr/goleveldb/leveldb/iterator"
	"github.com/syndtr/goleveldb/leveldb/journal"
	"github.com/syndtr/goleveldb/leveldb/opt"
	"github.com/syndtr/goleveldb/leveldb/storage"
	"github.com/syndtr/goleveldb/leveldb/util"
)

// ErrManifestCorrupted records manifest corruption. This error will be
// wrapped with errors.ErrCorrupted.
type ErrManifestCorrupted struct {
	Field  string
	Reason string

@@ -30,31 +29,32 @@ func (e *ErrManifestCorrupted) Error() string {
	return fmt.Sprintf("leveldb: manifest corrupted (field '%s'): %s", e.Field, e.Reason)
}

func newErrManifestCorrupted(f storage.File, field, reason string) error {
	return errors.NewErrCorrupted(f, &ErrManifestCorrupted{field, reason})
func newErrManifestCorrupted(fd storage.FileDesc, field, reason string) error {
	return errors.NewErrCorrupted(fd, &ErrManifestCorrupted{field, reason})
}

// session represent a persistent database session.
type session struct {
	// Need 64-bit alignment.
	stNextFileNum    uint64 // current unused file number
	stJournalNum     uint64 // current journal file number; need external synchronization
	stPrevJournalNum uint64 // prev journal file number; no longer used; for compatibility with older version of leveldb
	stNextFileNum    int64 // current unused file number
	stJournalNum     int64 // current journal file number; need external synchronization
	stPrevJournalNum int64 // prev journal file number; no longer used; for compatibility with older version of leveldb
	stTempFileNum    int64
	stSeqNum         uint64 // last mem compacted seq; need external synchronization
	stTempFileNum    uint64

	stor     storage.Storage
	storLock util.Releaser
	storLock storage.Locker
	o        *cachedOptions
	icmp     *iComparer
	tops     *tOps
	fileRef  map[int64]int

	manifest       *journal.Writer
	manifestWriter storage.Writer
	manifestFile   storage.File
	manifestFd     storage.FileDesc

	stCompPtrs []iKey    // compaction pointers; need external synchronization
	stVersion  *version  // current version
	stCompPtrs []internalKey // compaction pointers; need external synchronization
	stVersion  *version      // current version
	vmu        sync.Mutex
}

@@ -68,9 +68,9 @@ func newSession(stor storage.Storage, o *opt.Options) (s *session, err error) {
		return
	}
	s = &session{
		stor:       stor,
		storLock:   storLock,
		stCompPtrs: make([]iKey, o.GetNumLevel()),
		stor:     stor,
		storLock: storLock,
		fileRef:  make(map[int64]int),
	}
	s.setOptions(o)
	s.tops = newTableOps(s)

@@ -90,13 +90,12 @@ func (s *session) close() {
	}
	s.manifest = nil
	s.manifestWriter = nil
	s.manifestFile = nil
	s.stVersion = nil
	s.setVersion(&version{s: s, closing: true})
}

// Release session lock.
func (s *session) release() {
	s.storLock.Release()
	s.storLock.Unlock()
}

// Create a new database session; need external synchronization.

@@ -111,27 +110,31 @@ func (s *session) recover() (err error) {
		if os.IsNotExist(err) {
			// Don't return os.ErrNotExist if the underlying storage contains
			// other files that belong to LevelDB. So the DB won't get trashed.
			if files, _ := s.stor.GetFiles(storage.TypeAll); len(files) > 0 {
				err = &errors.ErrCorrupted{File: &storage.FileInfo{Type: storage.TypeManifest}, Err: &errors.ErrMissingFiles{}}
			if fds, _ := s.stor.List(storage.TypeAll); len(fds) > 0 {
				err = &errors.ErrCorrupted{Fd: storage.FileDesc{Type: storage.TypeManifest}, Err: &errors.ErrMissingFiles{}}
			}
		}
	}()

	m, err := s.stor.GetManifest()
	fd, err := s.stor.GetMeta()
	if err != nil {
		return
	}

	reader, err := m.Open()
	reader, err := s.stor.Open(fd)
	if err != nil {
		return
	}
	defer reader.Close()
	strict := s.o.GetStrict(opt.StrictManifest)
	jr := journal.NewReader(reader, dropper{s, m}, strict, true)

	staging := s.stVersion.newStaging()
	rec := &sessionRecord{numLevel: s.o.GetNumLevel()}
	var (
		// Options.
		strict = s.o.GetStrict(opt.StrictManifest)

		jr      = journal.NewReader(reader, dropper{s, fd}, strict, true)
		rec     = &sessionRecord{}
		staging = s.stVersion.newStaging()
	)
	for {
		var r io.Reader
		r, err = jr.Next()

@@ -140,24 +143,23 @@ func (s *session) recover() (err error) {
				err = nil
				break
			}
			return errors.SetFile(err, m)
			return errors.SetFd(err, fd)
		}

		err = rec.decode(r)
		if err == nil {
			// save compact pointers
			for _, r := range rec.compPtrs {
				s.stCompPtrs[r.level] = iKey(r.ikey)
				s.setCompPtr(r.level, internalKey(r.ikey))
			}
			// commit record to version staging
			staging.commit(rec)
		} else {
			err = errors.SetFile(err, m)
			err = errors.SetFd(err, fd)
			if strict || !errors.IsCorrupted(err) {
				return
			} else {
				s.logf("manifest error: %v (skipped)", errors.SetFile(err, m))
			}
			s.logf("manifest error: %v (skipped)", errors.SetFd(err, fd))
		}
		rec.resetCompPtrs()
		rec.resetAddedTables()

@@ -166,18 +168,18 @@ func (s *session) recover() (err error) {

	switch {
	case !rec.has(recComparer):
		return newErrManifestCorrupted(m, "comparer", "missing")
		return newErrManifestCorrupted(fd, "comparer", "missing")
	case rec.comparer != s.icmp.uName():
		return newErrManifestCorrupted(m, "comparer", fmt.Sprintf("mismatch: want '%s', got '%s'", s.icmp.uName(), rec.comparer))
		return newErrManifestCorrupted(fd, "comparer", fmt.Sprintf("mismatch: want '%s', got '%s'", s.icmp.uName(), rec.comparer))
	case !rec.has(recNextFileNum):
		return newErrManifestCorrupted(m, "next-file-num", "missing")
		return newErrManifestCorrupted(fd, "next-file-num", "missing")
	case !rec.has(recJournalNum):
		return newErrManifestCorrupted(m, "journal-file-num", "missing")
		return newErrManifestCorrupted(fd, "journal-file-num", "missing")
	case !rec.has(recSeqNum):
		return newErrManifestCorrupted(m, "seq-num", "missing")
		return newErrManifestCorrupted(fd, "seq-num", "missing")
	}

	s.manifestFile = m
	s.manifestFd = fd
	s.setVersion(staging.finish())
	s.setNextFileNum(rec.nextFileNum)
	s.recordCommited(rec)
@@ -206,250 +208,3 @@ func (s *session) commit(r *sessionRecord) (err error) {

	return
}

// Pick a compaction based on current state; need external synchronization.
func (s *session) pickCompaction() *compaction {
	v := s.version()

	var level int
	var t0 tFiles
	if v.cScore >= 1 {
		level = v.cLevel
		cptr := s.stCompPtrs[level]
		tables := v.tables[level]
		for _, t := range tables {
			if cptr == nil || s.icmp.Compare(t.imax, cptr) > 0 {
				t0 = append(t0, t)
				break
			}
		}
		if len(t0) == 0 {
			t0 = append(t0, tables[0])
		}
	} else {
		if p := atomic.LoadPointer(&v.cSeek); p != nil {
			ts := (*tSet)(p)
			level = ts.level
			t0 = append(t0, ts.table)
		} else {
			v.release()
			return nil
		}
	}

	return newCompaction(s, v, level, t0)
}

// Create compaction from given level and range; need external synchronization.
func (s *session) getCompactionRange(level int, umin, umax []byte) *compaction {
	v := s.version()

	t0 := v.tables[level].getOverlaps(nil, s.icmp, umin, umax, level == 0)
	if len(t0) == 0 {
		v.release()
		return nil
	}

	// Avoid compacting too much in one shot in case the range is large.
	// But we cannot do this for level-0 since level-0 files can overlap
	// and we must not pick one file and drop another older file if the
	// two files overlap.
	if level > 0 {
		limit := uint64(v.s.o.GetCompactionSourceLimit(level))
		total := uint64(0)
		for i, t := range t0 {
			total += t.size
			if total >= limit {
				s.logf("table@compaction limiting F·%d -> F·%d", len(t0), i+1)
				t0 = t0[:i+1]
				break
			}
		}
	}

	return newCompaction(s, v, level, t0)
}

func newCompaction(s *session, v *version, level int, t0 tFiles) *compaction {
	c := &compaction{
		s:             s,
		v:             v,
		level:         level,
		tables:        [2]tFiles{t0, nil},
		maxGPOverlaps: uint64(s.o.GetCompactionGPOverlaps(level)),
		tPtrs:         make([]int, s.o.GetNumLevel()),
	}
	c.expand()
	c.save()
	return c
}

// compaction represent a compaction state.
type compaction struct {
	s *session
	v *version

	level         int
	tables        [2]tFiles
	maxGPOverlaps uint64

	gp                tFiles
	gpi               int
	seenKey           bool
	gpOverlappedBytes uint64
	imin, imax        iKey
	tPtrs             []int
	released          bool

	snapGPI               int
	snapSeenKey           bool
	snapGPOverlappedBytes uint64
	snapTPtrs             []int
}

func (c *compaction) save() {
	c.snapGPI = c.gpi
	c.snapSeenKey = c.seenKey
	c.snapGPOverlappedBytes = c.gpOverlappedBytes
	c.snapTPtrs = append(c.snapTPtrs[:0], c.tPtrs...)
}

func (c *compaction) restore() {
	c.gpi = c.snapGPI
	c.seenKey = c.snapSeenKey
	c.gpOverlappedBytes = c.snapGPOverlappedBytes
	c.tPtrs = append(c.tPtrs[:0], c.snapTPtrs...)
}

func (c *compaction) release() {
	if !c.released {
		c.released = true
		c.v.release()
	}
}

// Expand compacted tables; need external synchronization.
func (c *compaction) expand() {
	limit := uint64(c.s.o.GetCompactionExpandLimit(c.level))
	vt0, vt1 := c.v.tables[c.level], c.v.tables[c.level+1]

	t0, t1 := c.tables[0], c.tables[1]
	imin, imax := t0.getRange(c.s.icmp)
	// We expand t0 here just incase ukey hop across tables.
	t0 = vt0.getOverlaps(t0, c.s.icmp, imin.ukey(), imax.ukey(), c.level == 0)
	if len(t0) != len(c.tables[0]) {
		imin, imax = t0.getRange(c.s.icmp)
	}
	t1 = vt1.getOverlaps(t1, c.s.icmp, imin.ukey(), imax.ukey(), false)
	// Get entire range covered by compaction.
	amin, amax := append(t0, t1...).getRange(c.s.icmp)

	// See if we can grow the number of inputs in "level" without
	// changing the number of "level+1" files we pick up.
	if len(t1) > 0 {
		exp0 := vt0.getOverlaps(nil, c.s.icmp, amin.ukey(), amax.ukey(), c.level == 0)
		if len(exp0) > len(t0) && t1.size()+exp0.size() < limit {
			xmin, xmax := exp0.getRange(c.s.icmp)
			exp1 := vt1.getOverlaps(nil, c.s.icmp, xmin.ukey(), xmax.ukey(), false)
			if len(exp1) == len(t1) {
				c.s.logf("table@compaction expanding L%d+L%d (F·%d S·%s)+(F·%d S·%s) -> (F·%d S·%s)+(F·%d S·%s)",
					c.level, c.level+1, len(t0), shortenb(int(t0.size())), len(t1), shortenb(int(t1.size())),
					len(exp0), shortenb(int(exp0.size())), len(exp1), shortenb(int(exp1.size())))
				imin, imax = xmin, xmax
				t0, t1 = exp0, exp1
				amin, amax = append(t0, t1...).getRange(c.s.icmp)
			}
		}
	}

	// Compute the set of grandparent files that overlap this compaction
	// (parent == level+1; grandparent == level+2)
	if c.level+2 < c.s.o.GetNumLevel() {
		c.gp = c.v.tables[c.level+2].getOverlaps(c.gp, c.s.icmp, amin.ukey(), amax.ukey(), false)
	}

	c.tables[0], c.tables[1] = t0, t1
	c.imin, c.imax = imin, imax
}

// Check whether compaction is trivial.
func (c *compaction) trivial() bool {
	return len(c.tables[0]) == 1 && len(c.tables[1]) == 0 && c.gp.size() <= c.maxGPOverlaps
}

func (c *compaction) baseLevelForKey(ukey []byte) bool {
	for level, tables := range c.v.tables[c.level+2:] {
		for c.tPtrs[level] < len(tables) {
			t := tables[c.tPtrs[level]]
			if c.s.icmp.uCompare(ukey, t.imax.ukey()) <= 0 {
				// We've advanced far enough.
				if c.s.icmp.uCompare(ukey, t.imin.ukey()) >= 0 {
					// Key falls in this file's range, so definitely not base level.
					return false
				}
				break
			}
			c.tPtrs[level]++
		}
	}
	return true
}

func (c *compaction) shouldStopBefore(ikey iKey) bool {
	for ; c.gpi < len(c.gp); c.gpi++ {
		gp := c.gp[c.gpi]
		if c.s.icmp.Compare(ikey, gp.imax) <= 0 {
			break
		}
		if c.seenKey {
			c.gpOverlappedBytes += gp.size
		}
	}
	c.seenKey = true

	if c.gpOverlappedBytes > c.maxGPOverlaps {
		// Too much overlap for current output; start new output.
		c.gpOverlappedBytes = 0
		return true
	}
	return false
}

// Creates an iterator.
func (c *compaction) newIterator() iterator.Iterator {
	// Creates iterator slice.
	icap := len(c.tables)
	if c.level == 0 {
		// Special case for level-0
		icap = len(c.tables[0]) + 1
	}
	its := make([]iterator.Iterator, 0, icap)

	// Options.
	ro := &opt.ReadOptions{
		DontFillCache: true,
		Strict:        opt.StrictOverride,
	}
	strict := c.s.o.GetStrict(opt.StrictCompaction)
	if strict {
		ro.Strict |= opt.StrictReader
	}

	for i, tables := range c.tables {
		if len(tables) == 0 {
			continue
		}

		// Level-0 is not sorted and may overlaps each other.
		if c.level+i == 0 {
			for _, t := range tables {
				its = append(its, c.s.tops.newIterator(t, nil, ro))
			}
		} else {
			it := iterator.NewIndexedIterator(tables.newIndexIterator(c.s.tops, c.s.icmp, nil, ro), strict)
			its = append(its, it)
		}
	}

	return iterator.NewMergedIterator(its, c.s.icmp, strict)
}
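A minimal standalone sketch of the compaction-pointer round-robin that pickCompaction uses above: within a level, compaction resumes at the first table whose largest key sorts after the stored pointer, and wraps to the first table once the end of the level is reached. Types and data are made up for illustration, not part of this change:

package main

import "fmt"

type table struct{ maxKey string }

// pickAfter returns the first table whose largest key sorts after the stored
// compaction pointer; an empty pointer or a pointer past every table selects
// the first table, so repeated compactions sweep the level round-robin.
func pickAfter(tables []table, cptr string) table {
	for _, t := range tables {
		if cptr == "" || t.maxKey > cptr {
			return t
		}
	}
	return tables[0]
}

func main() {
	level := []table{{"b"}, {"f"}, {"k"}}
	fmt.Println(pickAfter(level, ""))  // {b}: no pointer yet, start at the beginning
	fmt.Println(pickAfter(level, "f")) // {k}: resume after the last compacted range
	fmt.Println(pickAfter(level, "z")) // {b}: wrap around to the start of the level
}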
@ -0,0 +1,302 @@
|
|||
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
|
||||
// All rights reserved.
|
||||
//
|
||||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file.
|
||||
|
||||
package leveldb
|
||||
|
||||
import (
|
||||
"sync/atomic"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/iterator"
|
||||
"github.com/syndtr/goleveldb/leveldb/memdb"
|
||||
"github.com/syndtr/goleveldb/leveldb/opt"
|
||||
)
|
||||
|
||||
func (s *session) pickMemdbLevel(umin, umax []byte, maxLevel int) int {
|
||||
v := s.version()
|
||||
defer v.release()
|
||||
return v.pickMemdbLevel(umin, umax, maxLevel)
|
||||
}
|
||||
|
||||
func (s *session) flushMemdb(rec *sessionRecord, mdb *memdb.DB, maxLevel int) (int, error) {
|
||||
// Create sorted table.
|
||||
iter := mdb.NewIterator(nil)
|
||||
defer iter.Release()
|
||||
t, n, err := s.tops.createFrom(iter)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
}
|
||||
|
||||
// Pick level other than zero can cause compaction issue with large
|
||||
// bulk insert and delete on strictly incrementing key-space. The
|
||||
// problem is that the small deletion markers trapped at lower level,
|
||||
// while key/value entries keep growing at higher level. Since the
|
||||
// key-space is strictly incrementing it will not overlaps with
|
||||
// higher level, thus maximum possible level is always picked, while
|
||||
// overlapping deletion marker pushed into lower level.
|
||||
// See: https://github.com/syndtr/goleveldb/issues/127.
|
||||
flushLevel := s.pickMemdbLevel(t.imin.ukey(), t.imax.ukey(), maxLevel)
|
||||
rec.addTableFile(flushLevel, t)
|
||||
|
||||
s.logf("memdb@flush created L%d@%d N·%d S·%s %q:%q", flushLevel, t.fd.Num, n, shortenb(int(t.size)), t.imin, t.imax)
|
||||
return flushLevel, nil
|
||||
}
|
||||
|
||||
// Pick a compaction based on current state; need external synchronization.
|
||||
func (s *session) pickCompaction() *compaction {
|
||||
v := s.version()
|
||||
|
||||
var sourceLevel int
|
||||
var t0 tFiles
|
||||
if v.cScore >= 1 {
|
||||
sourceLevel = v.cLevel
|
||||
cptr := s.getCompPtr(sourceLevel)
|
||||
tables := v.levels[sourceLevel]
|
||||
for _, t := range tables {
|
||||
if cptr == nil || s.icmp.Compare(t.imax, cptr) > 0 {
|
||||
t0 = append(t0, t)
|
||||
break
|
||||
}
|
||||
}
|
||||
if len(t0) == 0 {
|
||||
t0 = append(t0, tables[0])
|
||||
}
|
||||
} else {
|
||||
if p := atomic.LoadPointer(&v.cSeek); p != nil {
|
||||
ts := (*tSet)(p)
|
||||
sourceLevel = ts.level
|
||||
t0 = append(t0, ts.table)
|
||||
} else {
|
||||
v.release()
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
return newCompaction(s, v, sourceLevel, t0)
|
||||
}
|
||||
|
||||
// Create compaction from given level and range; need external synchronization.
|
||||
func (s *session) getCompactionRange(sourceLevel int, umin, umax []byte, noLimit bool) *compaction {
|
||||
v := s.version()
|
||||
|
||||
if sourceLevel >= len(v.levels) {
|
||||
v.release()
|
||||
return nil
|
||||
}
|
||||
|
||||
t0 := v.levels[sourceLevel].getOverlaps(nil, s.icmp, umin, umax, sourceLevel == 0)
|
||||
if len(t0) == 0 {
|
||||
v.release()
|
||||
return nil
|
||||
}
|
||||
|
||||
// Avoid compacting too much in one shot in case the range is large.
|
||||
// But we cannot do this for level-0 since level-0 files can overlap
|
||||
// and we must not pick one file and drop another older file if the
|
||||
// two files overlap.
|
||||
if !noLimit && sourceLevel > 0 {
|
||||
limit := int64(v.s.o.GetCompactionSourceLimit(sourceLevel))
|
||||
total := int64(0)
|
||||
for i, t := range t0 {
|
||||
total += t.size
|
||||
if total >= limit {
|
||||
s.logf("table@compaction limiting F·%d -> F·%d", len(t0), i+1)
|
||||
t0 = t0[:i+1]
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return newCompaction(s, v, sourceLevel, t0)
|
||||
}
|
||||
|
||||
func newCompaction(s *session, v *version, sourceLevel int, t0 tFiles) *compaction {
|
||||
c := &compaction{
|
||||
s: s,
|
||||
v: v,
|
||||
sourceLevel: sourceLevel,
|
||||
levels: [2]tFiles{t0, nil},
|
||||
maxGPOverlaps: int64(s.o.GetCompactionGPOverlaps(sourceLevel)),
|
||||
tPtrs: make([]int, len(v.levels)),
|
||||
}
|
||||
c.expand()
|
||||
c.save()
|
||||
return c
|
||||
}
|
||||
|
||||
// compaction represent a compaction state.
|
||||
type compaction struct {
|
||||
s *session
|
||||
v *version
|
||||
|
||||
sourceLevel int
|
||||
levels [2]tFiles
|
||||
maxGPOverlaps int64
|
||||
|
||||
gp tFiles
|
||||
gpi int
|
||||
seenKey bool
|
||||
gpOverlappedBytes int64
|
||||
imin, imax internalKey
|
||||
tPtrs []int
|
||||
released bool
|
||||
|
||||
snapGPI int
|
||||
snapSeenKey bool
|
||||
snapGPOverlappedBytes int64
|
||||
snapTPtrs []int
|
||||
}
|
||||
|
||||
func (c *compaction) save() {
|
||||
c.snapGPI = c.gpi
|
||||
c.snapSeenKey = c.seenKey
|
||||
c.snapGPOverlappedBytes = c.gpOverlappedBytes
|
||||
c.snapTPtrs = append(c.snapTPtrs[:0], c.tPtrs...)
|
||||
}
|
||||
|
||||
func (c *compaction) restore() {
|
||||
c.gpi = c.snapGPI
|
||||
c.seenKey = c.snapSeenKey
|
||||
c.gpOverlappedBytes = c.snapGPOverlappedBytes
|
||||
c.tPtrs = append(c.tPtrs[:0], c.snapTPtrs...)
|
||||
}
|
||||
|
||||
func (c *compaction) release() {
|
||||
if !c.released {
|
||||
c.released = true
|
||||
c.v.release()
|
||||
}
|
||||
}
|
||||
|
||||
// Expand compacted tables; need external synchronization.
|
||||
func (c *compaction) expand() {
|
||||
limit := int64(c.s.o.GetCompactionExpandLimit(c.sourceLevel))
|
||||
vt0 := c.v.levels[c.sourceLevel]
|
||||
vt1 := tFiles{}
|
||||
if level := c.sourceLevel + 1; level < len(c.v.levels) {
|
||||
vt1 = c.v.levels[level]
|
||||
}
|
||||
|
||||
t0, t1 := c.levels[0], c.levels[1]
|
||||
imin, imax := t0.getRange(c.s.icmp)
|
||||
// We expand t0 here just incase ukey hop across tables.
|
||||
t0 = vt0.getOverlaps(t0, c.s.icmp, imin.ukey(), imax.ukey(), c.sourceLevel == 0)
|
||||
if len(t0) != len(c.levels[0]) {
|
||||
imin, imax = t0.getRange(c.s.icmp)
|
||||
}
|
||||
t1 = vt1.getOverlaps(t1, c.s.icmp, imin.ukey(), imax.ukey(), false)
|
||||
// Get entire range covered by compaction.
|
||||
amin, amax := append(t0, t1...).getRange(c.s.icmp)
|
||||
|
||||
// See if we can grow the number of inputs in "sourceLevel" without
|
||||
// changing the number of "sourceLevel+1" files we pick up.
|
||||
if len(t1) > 0 {
|
||||
exp0 := vt0.getOverlaps(nil, c.s.icmp, amin.ukey(), amax.ukey(), c.sourceLevel == 0)
|
||||
if len(exp0) > len(t0) && t1.size()+exp0.size() < limit {
|
||||
xmin, xmax := exp0.getRange(c.s.icmp)
|
||||
exp1 := vt1.getOverlaps(nil, c.s.icmp, xmin.ukey(), xmax.ukey(), false)
|
||||
if len(exp1) == len(t1) {
|
||||
c.s.logf("table@compaction expanding L%d+L%d (F·%d S·%s)+(F·%d S·%s) -> (F·%d S·%s)+(F·%d S·%s)",
|
||||
c.sourceLevel, c.sourceLevel+1, len(t0), shortenb(int(t0.size())), len(t1), shortenb(int(t1.size())),
|
||||
len(exp0), shortenb(int(exp0.size())), len(exp1), shortenb(int(exp1.size())))
|
||||
imin, imax = xmin, xmax
|
||||
t0, t1 = exp0, exp1
|
||||
amin, amax = append(t0, t1...).getRange(c.s.icmp)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Compute the set of grandparent files that overlap this compaction
|
||||
// (parent == sourceLevel+1; grandparent == sourceLevel+2)
|
||||
if level := c.sourceLevel + 2; level < len(c.v.levels) {
|
||||
c.gp = c.v.levels[level].getOverlaps(c.gp, c.s.icmp, amin.ukey(), amax.ukey(), false)
|
||||
}
|
||||
|
||||
c.levels[0], c.levels[1] = t0, t1
|
||||
c.imin, c.imax = imin, imax
|
||||
}
|
||||
|
||||
// Check whether compaction is trivial.
|
||||
func (c *compaction) trivial() bool {
|
||||
return len(c.levels[0]) == 1 && len(c.levels[1]) == 0 && c.gp.size() <= c.maxGPOverlaps
|
||||
}
|
||||
|
||||
func (c *compaction) baseLevelForKey(ukey []byte) bool {
|
||||
for level := c.sourceLevel + 2; level < len(c.v.levels); level++ {
|
||||
tables := c.v.levels[level]
|
||||
for c.tPtrs[level] < len(tables) {
|
||||
t := tables[c.tPtrs[level]]
|
||||
if c.s.icmp.uCompare(ukey, t.imax.ukey()) <= 0 {
|
||||
// We've advanced far enough.
|
||||
if c.s.icmp.uCompare(ukey, t.imin.ukey()) >= 0 {
|
||||
// Key falls in this file's range, so definitely not base level.
|
||||
return false
|
||||
}
|
||||
break
|
||||
}
|
||||
c.tPtrs[level]++
|
||||
}
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
func (c *compaction) shouldStopBefore(ikey internalKey) bool {
|
||||
for ; c.gpi < len(c.gp); c.gpi++ {
|
||||
gp := c.gp[c.gpi]
|
||||
if c.s.icmp.Compare(ikey, gp.imax) <= 0 {
|
||||
break
|
||||
}
|
||||
if c.seenKey {
|
||||
c.gpOverlappedBytes += gp.size
|
||||
}
|
||||
}
|
||||
c.seenKey = true
|
||||
|
||||
if c.gpOverlappedBytes > c.maxGPOverlaps {
|
||||
// Too much overlap for current output; start new output.
|
||||
c.gpOverlappedBytes = 0
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
// Creates an iterator.
|
||||
func (c *compaction) newIterator() iterator.Iterator {
|
||||
// Creates iterator slice.
|
||||
icap := len(c.levels)
|
||||
if c.sourceLevel == 0 {
|
||||
// Special case for level-0.
|
||||
icap = len(c.levels[0]) + 1
|
||||
}
|
||||
its := make([]iterator.Iterator, 0, icap)
|
||||
|
||||
// Options.
|
||||
ro := &opt.ReadOptions{
|
||||
DontFillCache: true,
|
||||
Strict: opt.StrictOverride,
|
||||
}
|
||||
strict := c.s.o.GetStrict(opt.StrictCompaction)
|
||||
if strict {
|
||||
ro.Strict |= opt.StrictReader
|
||||
}
|
||||
|
||||
for i, tables := range c.levels {
|
||||
if len(tables) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
// Level-0 is not sorted and may overlaps each other.
|
||||
if c.sourceLevel+i == 0 {
|
||||
for _, t := range tables {
|
||||
its = append(its, c.s.tops.newIterator(t, nil, ro))
|
||||
}
|
||||
} else {
|
||||
it := iterator.NewIndexedIterator(tables.newIndexIterator(c.s.tops, c.s.icmp, nil, ro), strict)
|
||||
its = append(its, it)
|
||||
}
|
||||
}
|
||||
|
||||
return iterator.NewMergedIterator(its, c.s.icmp, strict)
|
||||
}
|
|
@ -13,6 +13,7 @@ import (
|
|||
"strings"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/errors"
|
||||
"github.com/syndtr/goleveldb/leveldb/storage"
|
||||
)
|
||||
|
||||
type byteReader interface {
|
||||
|
@ -35,30 +36,28 @@ const (
|
|||
|
||||
type cpRecord struct {
|
||||
level int
|
||||
ikey iKey
|
||||
ikey internalKey
|
||||
}
|
||||
|
||||
type atRecord struct {
|
||||
level int
|
||||
num uint64
|
||||
size uint64
|
||||
imin iKey
|
||||
imax iKey
|
||||
num int64
|
||||
size int64
|
||||
imin internalKey
|
||||
imax internalKey
|
||||
}
|
||||
|
||||
type dtRecord struct {
|
||||
level int
|
||||
num uint64
|
||||
num int64
|
||||
}
|
||||
|
||||
type sessionRecord struct {
|
||||
numLevel int
|
||||
|
||||
hasRec int
|
||||
comparer string
|
||||
journalNum uint64
|
||||
prevJournalNum uint64
|
||||
nextFileNum uint64
|
||||
journalNum int64
|
||||
prevJournalNum int64
|
||||
nextFileNum int64
|
||||
seqNum uint64
|
||||
compPtrs []cpRecord
|
||||
addedTables []atRecord
|
||||
|
@ -77,17 +76,17 @@ func (p *sessionRecord) setComparer(name string) {
|
|||
p.comparer = name
|
||||
}
|
||||
|
||||
func (p *sessionRecord) setJournalNum(num uint64) {
|
||||
func (p *sessionRecord) setJournalNum(num int64) {
|
||||
p.hasRec |= 1 << recJournalNum
|
||||
p.journalNum = num
|
||||
}
|
||||
|
||||
func (p *sessionRecord) setPrevJournalNum(num uint64) {
|
||||
func (p *sessionRecord) setPrevJournalNum(num int64) {
|
||||
p.hasRec |= 1 << recPrevJournalNum
|
||||
p.prevJournalNum = num
|
||||
}
|
||||
|
||||
func (p *sessionRecord) setNextFileNum(num uint64) {
|
||||
func (p *sessionRecord) setNextFileNum(num int64) {
|
||||
p.hasRec |= 1 << recNextFileNum
|
||||
p.nextFileNum = num
|
||||
}
|
||||
|
@ -97,7 +96,7 @@ func (p *sessionRecord) setSeqNum(num uint64) {
|
|||
p.seqNum = num
|
||||
}
|
||||
|
||||
func (p *sessionRecord) addCompPtr(level int, ikey iKey) {
|
||||
func (p *sessionRecord) addCompPtr(level int, ikey internalKey) {
|
||||
p.hasRec |= 1 << recCompPtr
|
||||
p.compPtrs = append(p.compPtrs, cpRecord{level, ikey})
|
||||
}
|
||||
|
@ -107,13 +106,13 @@ func (p *sessionRecord) resetCompPtrs() {
|
|||
p.compPtrs = p.compPtrs[:0]
|
||||
}
|
||||
|
||||
func (p *sessionRecord) addTable(level int, num, size uint64, imin, imax iKey) {
|
||||
func (p *sessionRecord) addTable(level int, num, size int64, imin, imax internalKey) {
|
||||
p.hasRec |= 1 << recAddTable
|
||||
p.addedTables = append(p.addedTables, atRecord{level, num, size, imin, imax})
|
||||
}
|
||||
|
||||
func (p *sessionRecord) addTableFile(level int, t *tFile) {
|
||||
p.addTable(level, t.file.Num(), t.size, t.imin, t.imax)
|
||||
p.addTable(level, t.fd.Num, t.size, t.imin, t.imax)
|
||||
}
|
||||
|
||||
func (p *sessionRecord) resetAddedTables() {
|
||||
|
@ -121,7 +120,7 @@ func (p *sessionRecord) resetAddedTables() {
|
|||
p.addedTables = p.addedTables[:0]
|
||||
}
|
||||
|
||||
func (p *sessionRecord) delTable(level int, num uint64) {
|
||||
func (p *sessionRecord) delTable(level int, num int64) {
|
||||
p.hasRec |= 1 << recDelTable
|
||||
p.deletedTables = append(p.deletedTables, dtRecord{level, num})
|
||||
}
|
||||
|
@ -139,6 +138,13 @@ func (p *sessionRecord) putUvarint(w io.Writer, x uint64) {
|
|||
_, p.err = w.Write(p.scratch[:n])
|
||||
}
|
||||
|
||||
func (p *sessionRecord) putVarint(w io.Writer, x int64) {
|
||||
if x < 0 {
|
||||
panic("invalid negative value")
|
||||
}
|
||||
p.putUvarint(w, uint64(x))
|
||||
}
|
||||
|
||||
func (p *sessionRecord) putBytes(w io.Writer, x []byte) {
|
||||
if p.err != nil {
|
||||
return
|
||||
|
@ -158,11 +164,11 @@ func (p *sessionRecord) encode(w io.Writer) error {
|
|||
}
|
||||
if p.has(recJournalNum) {
|
||||
p.putUvarint(w, recJournalNum)
|
||||
p.putUvarint(w, p.journalNum)
|
||||
p.putVarint(w, p.journalNum)
|
||||
}
|
||||
if p.has(recNextFileNum) {
|
||||
p.putUvarint(w, recNextFileNum)
|
||||
p.putUvarint(w, p.nextFileNum)
|
||||
p.putVarint(w, p.nextFileNum)
|
||||
}
|
||||
if p.has(recSeqNum) {
|
||||
p.putUvarint(w, recSeqNum)
|
||||
|
@ -176,13 +182,13 @@ func (p *sessionRecord) encode(w io.Writer) error {
|
|||
for _, r := range p.deletedTables {
|
||||
p.putUvarint(w, recDelTable)
|
||||
p.putUvarint(w, uint64(r.level))
|
||||
p.putUvarint(w, r.num)
|
||||
p.putVarint(w, r.num)
|
||||
}
|
||||
for _, r := range p.addedTables {
|
||||
p.putUvarint(w, recAddTable)
|
||||
p.putUvarint(w, uint64(r.level))
|
||||
p.putUvarint(w, r.num)
|
||||
p.putUvarint(w, r.size)
|
||||
p.putVarint(w, r.num)
|
||||
p.putVarint(w, r.size)
|
||||
p.putBytes(w, r.imin)
|
||||
p.putBytes(w, r.imax)
|
||||
}
|
||||
|
@ -196,9 +202,9 @@ func (p *sessionRecord) readUvarintMayEOF(field string, r io.ByteReader, mayEOF
|
|||
x, err := binary.ReadUvarint(r)
|
||||
if err != nil {
|
||||
if err == io.ErrUnexpectedEOF || (mayEOF == false && err == io.EOF) {
|
||||
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, "short read"})
|
||||
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, "short read"})
|
||||
} else if strings.HasPrefix(err.Error(), "binary:") {
|
||||
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, err.Error()})
|
||||
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, err.Error()})
|
||||
} else {
|
||||
p.err = err
|
||||
}
|
||||
|
@ -211,6 +217,14 @@ func (p *sessionRecord) readUvarint(field string, r io.ByteReader) uint64 {
|
|||
return p.readUvarintMayEOF(field, r, false)
|
||||
}
|
||||
|
||||
func (p *sessionRecord) readVarint(field string, r io.ByteReader) int64 {
|
||||
x := int64(p.readUvarintMayEOF(field, r, false))
|
||||
if x < 0 {
|
||||
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, "invalid negative value"})
|
||||
}
|
||||
return x
|
||||
}
|
||||
|
||||
func (p *sessionRecord) readBytes(field string, r byteReader) []byte {
|
||||
if p.err != nil {
|
||||
return nil
|
||||
|
@ -223,7 +237,7 @@ func (p *sessionRecord) readBytes(field string, r byteReader) []byte {
|
|||
_, p.err = io.ReadFull(r, x)
|
||||
if p.err != nil {
|
||||
if p.err == io.ErrUnexpectedEOF {
|
||||
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, "short read"})
|
||||
p.err = errors.NewErrCorrupted(storage.FileDesc{}, &ErrManifestCorrupted{field, "short read"})
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
@ -238,10 +252,6 @@ func (p *sessionRecord) readLevel(field string, r io.ByteReader) int {
|
|||
if p.err != nil {
|
||||
return 0
|
||||
}
|
||||
if x >= uint64(p.numLevel) {
|
||||
p.err = errors.NewErrCorrupted(nil, &ErrManifestCorrupted{field, "invalid level number"})
|
||||
return 0
|
||||
}
|
||||
return int(x)
|
||||
}
|
||||
|
||||
|
@ -266,17 +276,17 @@ func (p *sessionRecord) decode(r io.Reader) error {
|
|||
p.setComparer(string(x))
|
||||
}
|
||||
case recJournalNum:
|
||||
x := p.readUvarint("journal-num", br)
|
||||
x := p.readVarint("journal-num", br)
|
||||
if p.err == nil {
|
||||
p.setJournalNum(x)
|
||||
}
|
||||
case recPrevJournalNum:
|
||||
x := p.readUvarint("prev-journal-num", br)
|
||||
x := p.readVarint("prev-journal-num", br)
|
||||
if p.err == nil {
|
||||
p.setPrevJournalNum(x)
|
||||
}
|
||||
case recNextFileNum:
|
||||
x := p.readUvarint("next-file-num", br)
|
||||
x := p.readVarint("next-file-num", br)
|
||||
if p.err == nil {
|
||||
p.setNextFileNum(x)
|
||||
}
|
||||
|
@ -289,12 +299,12 @@ func (p *sessionRecord) decode(r io.Reader) error {
|
|||
level := p.readLevel("comp-ptr.level", br)
|
||||
ikey := p.readBytes("comp-ptr.ikey", br)
|
||||
if p.err == nil {
|
||||
p.addCompPtr(level, iKey(ikey))
|
||||
p.addCompPtr(level, internalKey(ikey))
|
||||
}
|
||||
case recAddTable:
|
||||
level := p.readLevel("add-table.level", br)
|
||||
num := p.readUvarint("add-table.num", br)
|
||||
size := p.readUvarint("add-table.size", br)
|
||||
num := p.readVarint("add-table.num", br)
|
||||
size := p.readVarint("add-table.size", br)
|
||||
imin := p.readBytes("add-table.imin", br)
|
||||
imax := p.readBytes("add-table.imax", br)
|
||||
if p.err == nil {
|
||||
|
@ -302,7 +312,7 @@ func (p *sessionRecord) decode(r io.Reader) error {
|
|||
}
|
||||
case recDelTable:
|
||||
level := p.readLevel("del-table.level", br)
|
||||
num := p.readUvarint("del-table.num", br)
|
||||
num := p.readVarint("del-table.num", br)
|
||||
if p.err == nil {
|
||||
p.delTable(level, num)
|
||||
}
|
||||
|
|
|
@ -9,8 +9,6 @@ package leveldb
|
|||
import (
|
||||
"bytes"
|
||||
"testing"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/opt"
|
||||
)
|
||||
|
||||
func decodeEncode(v *sessionRecord) (res bool, err error) {
|
||||
|
@ -19,7 +17,7 @@ func decodeEncode(v *sessionRecord) (res bool, err error) {
|
|||
if err != nil {
|
||||
return
|
||||
}
|
||||
v2 := &sessionRecord{numLevel: opt.DefaultNumLevel}
|
||||
v2 := &sessionRecord{}
|
||||
err = v.decode(b)
|
||||
if err != nil {
|
||||
return
|
||||
|
@ -33,9 +31,9 @@ func decodeEncode(v *sessionRecord) (res bool, err error) {
|
|||
}
|
||||
|
||||
func TestSessionRecord_EncodeDecode(t *testing.T) {
|
||||
big := uint64(1) << 50
|
||||
v := &sessionRecord{numLevel: opt.DefaultNumLevel}
|
||||
i := uint64(0)
|
||||
big := int64(1) << 50
|
||||
v := &sessionRecord{}
|
||||
i := int64(0)
|
||||
test := func() {
|
||||
res, err := decodeEncode(v)
|
||||
if err != nil {
|
||||
|
@ -49,16 +47,16 @@ func TestSessionRecord_EncodeDecode(t *testing.T) {
|
|||
for ; i < 4; i++ {
|
||||
test()
|
||||
v.addTable(3, big+300+i, big+400+i,
|
||||
newIkey([]byte("foo"), big+500+1, ktVal),
|
||||
newIkey([]byte("zoo"), big+600+1, ktDel))
|
||||
makeInternalKey(nil, []byte("foo"), uint64(big+500+1), keyTypeVal),
|
||||
makeInternalKey(nil, []byte("zoo"), uint64(big+600+1), keyTypeDel))
|
||||
v.delTable(4, big+700+i)
|
||||
v.addCompPtr(int(i), newIkey([]byte("x"), big+900+1, ktVal))
|
||||
v.addCompPtr(int(i), makeInternalKey(nil, []byte("x"), uint64(big+900+1), keyTypeVal))
|
||||
}
|
||||
|
||||
v.setComparer("foo")
|
||||
v.setJournalNum(big + 100)
|
||||
v.setPrevJournalNum(big + 99)
|
||||
v.setNextFileNum(big + 200)
|
||||
v.setSeqNum(big + 1000)
|
||||
v.setSeqNum(uint64(big + 1000))
|
||||
test()
|
||||
}
|
||||
|
|
|
@ -17,15 +17,15 @@ import (
|
|||
// Logging.
|
||||
|
||||
type dropper struct {
|
||||
s *session
|
||||
file storage.File
|
||||
s *session
|
||||
fd storage.FileDesc
|
||||
}
|
||||
|
||||
func (d dropper) Drop(err error) {
|
||||
if e, ok := err.(*journal.ErrCorrupted); ok {
|
||||
d.s.logf("journal@drop %s-%d S·%s %q", d.file.Type(), d.file.Num(), shortenb(e.Size), e.Reason)
|
||||
d.s.logf("journal@drop %s-%d S·%s %q", d.fd.Type, d.fd.Num, shortenb(e.Size), e.Reason)
|
||||
} else {
|
||||
d.s.logf("journal@drop %s-%d %q", d.file.Type(), d.file.Num(), err)
|
||||
d.s.logf("journal@drop %s-%d %q", d.fd.Type, d.fd.Num, err)
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -34,25 +34,21 @@ func (s *session) logf(format string, v ...interface{}) { s.stor.Log(fmt.Sprintf
|
|||
|
||||
// File utils.
|
||||
|
||||
func (s *session) getJournalFile(num uint64) storage.File {
|
||||
return s.stor.GetFile(num, storage.TypeJournal)
|
||||
func (s *session) newTemp() storage.FileDesc {
|
||||
num := atomic.AddInt64(&s.stTempFileNum, 1) - 1
|
||||
return storage.FileDesc{storage.TypeTemp, num}
|
||||
}
|
||||
|
||||
func (s *session) getTableFile(num uint64) storage.File {
|
||||
return s.stor.GetFile(num, storage.TypeTable)
|
||||
}
|
||||
|
||||
func (s *session) getFiles(t storage.FileType) ([]storage.File, error) {
|
||||
return s.stor.GetFiles(t)
|
||||
}
|
||||
|
||||
func (s *session) newTemp() storage.File {
|
||||
num := atomic.AddUint64(&s.stTempFileNum, 1) - 1
|
||||
return s.stor.GetFile(num, storage.TypeTemp)
|
||||
}
|
||||
|
||||
func (s *session) tableFileFromRecord(r atRecord) *tFile {
|
||||
return newTableFile(s.getTableFile(r.num), r.size, r.imin, r.imax)
|
||||
func (s *session) addFileRef(fd storage.FileDesc, ref int) int {
|
||||
ref += s.fileRef[fd.Num]
|
||||
if ref > 0 {
|
||||
s.fileRef[fd.Num] = ref
|
||||
} else if ref == 0 {
|
||||
delete(s.fileRef, fd.Num)
|
||||
} else {
|
||||
panic(fmt.Sprintf("negative ref: %v", fd))
|
||||
}
|
||||
return ref
|
||||
}
|
||||
|
||||
// Session state.
|
||||
|
@ -62,65 +58,90 @@ func (s *session) tableFileFromRecord(r atRecord) *tFile {
|
|||
func (s *session) version() *version {
|
||||
s.vmu.Lock()
|
||||
defer s.vmu.Unlock()
|
||||
s.stVersion.ref++
|
||||
s.stVersion.incref()
|
||||
return s.stVersion
|
||||
}
|
||||
|
||||
func (s *session) tLen(level int) int {
|
||||
s.vmu.Lock()
|
||||
defer s.vmu.Unlock()
|
||||
return s.stVersion.tLen(level)
|
||||
}
|
||||
|
||||
// Set current version to v.
|
||||
func (s *session) setVersion(v *version) {
|
||||
s.vmu.Lock()
|
||||
v.ref = 1 // Holds by session.
|
||||
if old := s.stVersion; old != nil {
|
||||
v.ref++ // Holds by old version.
|
||||
old.next = v
|
||||
old.releaseNB()
|
||||
defer s.vmu.Unlock()
|
||||
// Hold by session. It is important to call this first before releasing
|
||||
// current version, otherwise the still used files might get released.
|
||||
v.incref()
|
||||
if s.stVersion != nil {
|
||||
// Release current version.
|
||||
s.stVersion.releaseNB()
|
||||
}
|
||||
s.stVersion = v
|
||||
s.vmu.Unlock()
|
||||
}
|
||||
|
||||
// Get current unused file number.
|
||||
func (s *session) nextFileNum() uint64 {
|
||||
return atomic.LoadUint64(&s.stNextFileNum)
|
||||
func (s *session) nextFileNum() int64 {
|
||||
return atomic.LoadInt64(&s.stNextFileNum)
|
||||
}
|
||||
|
||||
// Set current unused file number to num.
|
||||
func (s *session) setNextFileNum(num uint64) {
|
||||
atomic.StoreUint64(&s.stNextFileNum, num)
|
||||
func (s *session) setNextFileNum(num int64) {
|
||||
atomic.StoreInt64(&s.stNextFileNum, num)
|
||||
}
|
||||
|
||||
// Mark file number as used.
|
||||
func (s *session) markFileNum(num uint64) {
|
||||
func (s *session) markFileNum(num int64) {
|
||||
nextFileNum := num + 1
|
||||
for {
|
||||
old, x := s.stNextFileNum, nextFileNum
|
||||
if old > x {
|
||||
x = old
|
||||
}
|
||||
if atomic.CompareAndSwapUint64(&s.stNextFileNum, old, x) {
|
||||
if atomic.CompareAndSwapInt64(&s.stNextFileNum, old, x) {
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Allocate a file number.
|
||||
func (s *session) allocFileNum() uint64 {
|
||||
return atomic.AddUint64(&s.stNextFileNum, 1) - 1
|
||||
func (s *session) allocFileNum() int64 {
|
||||
return atomic.AddInt64(&s.stNextFileNum, 1) - 1
|
||||
}
|
||||
|
||||
// Reuse given file number.
|
||||
func (s *session) reuseFileNum(num uint64) {
|
||||
func (s *session) reuseFileNum(num int64) {
|
||||
for {
|
||||
old, x := s.stNextFileNum, num
|
||||
if old != x+1 {
|
||||
x = old
|
||||
}
|
||||
if atomic.CompareAndSwapUint64(&s.stNextFileNum, old, x) {
|
||||
if atomic.CompareAndSwapInt64(&s.stNextFileNum, old, x) {
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Set compaction ptr at given level; need external synchronization.
|
||||
func (s *session) setCompPtr(level int, ik internalKey) {
|
||||
if level >= len(s.stCompPtrs) {
|
||||
newCompPtrs := make([]internalKey, level+1)
|
||||
copy(newCompPtrs, s.stCompPtrs)
|
||||
s.stCompPtrs = newCompPtrs
|
||||
}
|
||||
s.stCompPtrs[level] = append(internalKey{}, ik...)
|
||||
}
|
||||
|
||||
// Get compaction ptr at given level; need external synchronization.
|
||||
func (s *session) getCompPtr(level int) internalKey {
|
||||
if level >= len(s.stCompPtrs) {
|
||||
return nil
|
||||
}
|
||||
return s.stCompPtrs[level]
|
||||
}
|
||||
|
||||
// Manifest related utils.
|
||||
|
||||
// Fill given session record obj with current states; need external
|
||||
|
@ -149,29 +170,28 @@ func (s *session) fillRecord(r *sessionRecord, snapshot bool) {
|
|||
|
||||
// Mark if record has been committed, this will update session state;
|
||||
// need external synchronization.
|
||||
func (s *session) recordCommited(r *sessionRecord) {
|
||||
if r.has(recJournalNum) {
|
||||
s.stJournalNum = r.journalNum
|
||||
func (s *session) recordCommited(rec *sessionRecord) {
|
||||
if rec.has(recJournalNum) {
|
||||
s.stJournalNum = rec.journalNum
|
||||
}
|
||||
|
||||
if r.has(recPrevJournalNum) {
|
||||
s.stPrevJournalNum = r.prevJournalNum
|
||||
if rec.has(recPrevJournalNum) {
|
||||
s.stPrevJournalNum = rec.prevJournalNum
|
||||
}
|
||||
|
||||
if r.has(recSeqNum) {
|
||||
s.stSeqNum = r.seqNum
|
||||
if rec.has(recSeqNum) {
|
||||
s.stSeqNum = rec.seqNum
|
||||
}
|
||||
|
||||
for _, p := range r.compPtrs {
|
||||
s.stCompPtrs[p.level] = iKey(p.ikey)
|
||||
for _, r := range rec.compPtrs {
|
||||
s.setCompPtr(r.level, internalKey(r.ikey))
|
||||
}
|
||||
}
|
||||
|
||||
// Create a new manifest file; need external synchronization.
|
||||
func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
|
||||
num := s.allocFileNum()
|
||||
file := s.stor.GetFile(num, storage.TypeManifest)
|
||||
writer, err := file.Create()
|
||||
fd := storage.FileDesc{storage.TypeManifest, s.allocFileNum()}
|
||||
writer, err := s.stor.Create(fd)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
|
@ -182,7 +202,7 @@ func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
|
|||
defer v.release()
|
||||
}
|
||||
if rec == nil {
|
||||
rec = &sessionRecord{numLevel: s.o.GetNumLevel()}
|
||||
rec = &sessionRecord{}
|
||||
}
|
||||
s.fillRecord(rec, true)
|
||||
v.fillRecord(rec)
|
||||
|
@ -196,16 +216,16 @@ func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
|
|||
if s.manifestWriter != nil {
|
||||
s.manifestWriter.Close()
|
||||
}
|
||||
if s.manifestFile != nil {
|
||||
s.manifestFile.Remove()
|
||||
if !s.manifestFd.Zero() {
|
||||
s.stor.Remove(s.manifestFd)
|
||||
}
|
||||
s.manifestFile = file
|
||||
s.manifestFd = fd
|
||||
s.manifestWriter = writer
|
||||
s.manifest = jw
|
||||
} else {
|
||||
writer.Close()
|
||||
file.Remove()
|
||||
s.reuseFileNum(num)
|
||||
s.stor.Remove(fd)
|
||||
s.reuseFileNum(fd.Num)
|
||||
}
|
||||
}()
|
||||
|
||||
|
@ -221,7 +241,7 @@ func (s *session) newManifest(rec *sessionRecord, v *version) (err error) {
|
|||
if err != nil {
|
||||
return
|
||||
}
|
||||
err = s.stor.SetManifest(file)
|
||||
err = s.stor.SetMeta(fd)
|
||||
return
|
||||
}
|
||||
|
||||
|
@ -240,9 +260,11 @@ func (s *session) flushManifest(rec *sessionRecord) (err error) {
|
|||
if err != nil {
|
||||
return
|
||||
}
|
||||
err = s.manifestWriter.Sync()
|
||||
if err != nil {
|
||||
return
|
||||
if !s.o.GetNoSync() {
|
||||
err = s.manifestWriter.Sync()
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
}
|
||||
s.recordCommited(rec)
|
||||
return
|
||||
|
|
|
@ -17,11 +17,12 @@ import (
|
|||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
var errFileOpen = errors.New("leveldb/storage: file still open")
|
||||
var (
|
||||
errFileOpen = errors.New("leveldb/storage: file still open")
|
||||
errReadOnly = errors.New("leveldb/storage: storage is read-only")
|
||||
)
|
||||
|
||||
type fileLock interface {
|
||||
release() error
|
||||
|
@ -31,41 +32,53 @@ type fileStorageLock struct {
|
|||
fs *fileStorage
|
||||
}
|
||||
|
||||
func (lock *fileStorageLock) Release() {
|
||||
fs := lock.fs
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.slock == lock {
|
||||
fs.slock = nil
|
||||
func (lock *fileStorageLock) Unlock() {
|
||||
if lock.fs != nil {
|
||||
lock.fs.mu.Lock()
|
||||
defer lock.fs.mu.Unlock()
|
||||
if lock.fs.slock == lock {
|
||||
lock.fs.slock = nil
|
||||
}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
const logSizeThreshold = 1024 * 1024 // 1 MiB
|
||||
|
||||
// fileStorage is a file-system backed storage.
|
||||
type fileStorage struct {
|
||||
path string
|
||||
path string
|
||||
readOnly bool
|
||||
|
||||
mu sync.Mutex
|
||||
flock fileLock
|
||||
slock *fileStorageLock
|
||||
logw *os.File
|
||||
buf []byte
|
||||
mu sync.Mutex
|
||||
flock fileLock
|
||||
slock *fileStorageLock
|
||||
logw *os.File
|
||||
logSize int64
|
||||
buf []byte
|
||||
// Opened file counter; if open < 0 means closed.
|
||||
open int
|
||||
day int
|
||||
}
|
||||
|
||||
// OpenFile returns a new filesytem-backed storage implementation with the given
|
||||
// path. This also hold a file lock, so any subsequent attempt to open the same
|
||||
// path will fail.
|
||||
// path. This also acquire a file lock, so any subsequent attempt to open the
|
||||
// same path will fail.
|
||||
//
|
||||
// The storage must be closed after use, by calling Close method.
|
||||
func OpenFile(path string) (Storage, error) {
|
||||
if err := os.MkdirAll(path, 0755); err != nil {
|
||||
func OpenFile(path string, readOnly bool) (Storage, error) {
|
||||
if fi, err := os.Stat(path); err == nil {
|
||||
if !fi.IsDir() {
|
||||
return nil, fmt.Errorf("leveldb/storage: open %s: not a directory", path)
|
||||
}
|
||||
} else if os.IsNotExist(err) && !readOnly {
|
||||
if err := os.MkdirAll(path, 0755); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
} else {
|
||||
return nil, err
|
||||
}
|
||||
|
||||
flock, err := newFileLock(filepath.Join(path, "LOCK"))
|
||||
flock, err := newFileLock(filepath.Join(path, "LOCK"), readOnly)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
@ -76,23 +89,42 @@ func OpenFile(path string) (Storage, error) {
|
|||
}
|
||||
}()
|
||||
|
||||
rename(filepath.Join(path, "LOG"), filepath.Join(path, "LOG.old"))
|
||||
logw, err := os.OpenFile(filepath.Join(path, "LOG"), os.O_WRONLY|os.O_CREATE, 0644)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
var (
|
||||
logw *os.File
|
||||
logSize int64
|
||||
)
|
||||
if !readOnly {
|
||||
logw, err = os.OpenFile(filepath.Join(path, "LOG"), os.O_WRONLY|os.O_CREATE, 0644)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
logSize, err = logw.Seek(0, os.SEEK_END)
|
||||
if err != nil {
|
||||
logw.Close()
|
||||
return nil, err
|
||||
}
|
||||
}
|
||||
|
||||
fs := &fileStorage{path: path, flock: flock, logw: logw}
|
||||
fs := &fileStorage{
|
||||
path: path,
|
||||
readOnly: readOnly,
|
||||
flock: flock,
|
||||
logw: logw,
|
||||
logSize: logSize,
|
||||
}
|
||||
runtime.SetFinalizer(fs, (*fileStorage).Close)
|
||||
return fs, nil
|
||||
}
|
||||
|
||||
func (fs *fileStorage) Lock() (util.Releaser, error) {
|
||||
func (fs *fileStorage) Lock() (Locker, error) {
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
}
|
||||
if fs.readOnly {
|
||||
return &fileStorageLock{}, nil
|
||||
}
|
||||
if fs.slock != nil {
|
||||
return nil, ErrLocked
|
||||
}
|
||||
|
@ -101,7 +133,7 @@ func (fs *fileStorage) Lock() (util.Releaser, error) {
|
|||
}
|
||||
|
||||
func itoa(buf []byte, i int, wid int) []byte {
|
||||
var u uint = uint(i)
|
||||
u := uint(i)
|
||||
if u == 0 && wid <= 1 {
|
||||
return append(buf, '0')
|
||||
}
|
||||
|
@ -126,6 +158,22 @@ func (fs *fileStorage) printDay(t time.Time) {
|
|||
}
|
||||
|
||||
func (fs *fileStorage) doLog(t time.Time, str string) {
|
||||
if fs.logSize > logSizeThreshold {
|
||||
// Rotate log file.
|
||||
fs.logw.Close()
|
||||
fs.logw = nil
|
||||
fs.logSize = 0
|
||||
rename(filepath.Join(fs.path, "LOG"), filepath.Join(fs.path, "LOG.old"))
|
||||
}
|
||||
if fs.logw == nil {
|
||||
var err error
|
||||
fs.logw, err = os.OpenFile(filepath.Join(fs.path, "LOG"), os.O_WRONLY|os.O_CREATE, 0644)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
// Force printDay on new log file.
|
||||
fs.day = 0
|
||||
}
|
||||
fs.printDay(t)
|
||||
hour, min, sec := t.Clock()
|
||||
msec := t.Nanosecond() / 1e3
|
||||
|
@ -145,65 +193,87 @@ func (fs *fileStorage) doLog(t time.Time, str string) {
|
|||
}
|
||||
|
||||
func (fs *fileStorage) Log(str string) {
|
||||
t := time.Now()
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return
|
||||
if !fs.readOnly {
|
||||
t := time.Now()
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return
|
||||
}
|
||||
fs.doLog(t, str)
|
||||
}
|
||||
fs.doLog(t, str)
|
||||
}
|
||||
|
||||
func (fs *fileStorage) log(str string) {
|
||||
fs.doLog(time.Now(), str)
|
||||
if !fs.readOnly {
|
||||
fs.doLog(time.Now(), str)
|
||||
}
|
||||
}
|
||||
|
||||
func (fs *fileStorage) GetFile(num uint64, t FileType) File {
|
||||
return &file{fs: fs, num: num, t: t}
|
||||
}
|
||||
func (fs *fileStorage) SetMeta(fd FileDesc) (err error) {
|
||||
if !FileDescOk(fd) {
|
||||
return ErrInvalidFile
|
||||
}
|
||||
if fs.readOnly {
|
||||
return errReadOnly
|
||||
}
|
||||
|
||||
func (fs *fileStorage) GetFiles(t FileType) (ff []File, err error) {
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
return ErrClosed
|
||||
}
|
||||
dir, err := os.Open(fs.path)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
fnn, err := dir.Readdirnames(0)
|
||||
// Close the dir first before checking for Readdirnames error.
|
||||
if err := dir.Close(); err != nil {
|
||||
fs.log(fmt.Sprintf("close dir: %v", err))
|
||||
}
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
f := &file{fs: fs}
|
||||
for _, fn := range fnn {
|
||||
if f.parse(fn) && (f.t&t) != 0 {
|
||||
ff = append(ff, f)
|
||||
f = &file{fs: fs}
|
||||
defer func() {
|
||||
if err != nil {
|
||||
fs.log(fmt.Sprintf("CURRENT: %v", err))
|
||||
}
|
||||
}()
|
||||
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), fd.Num)
|
||||
w, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
_, err = fmt.Fprintln(w, fsGenName(fd))
|
||||
if err != nil {
|
||||
fs.log(fmt.Sprintf("write CURRENT.%d: %v", fd.Num, err))
|
||||
return
|
||||
}
|
||||
if err = w.Sync(); err != nil {
|
||||
fs.log(fmt.Sprintf("flush CURRENT.%d: %v", fd.Num, err))
|
||||
return
|
||||
}
|
||||
if err = w.Close(); err != nil {
|
||||
fs.log(fmt.Sprintf("close CURRENT.%d: %v", fd.Num, err))
|
||||
return
|
||||
}
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
if err = rename(path, filepath.Join(fs.path, "CURRENT")); err != nil {
|
||||
fs.log(fmt.Sprintf("rename CURRENT.%d: %v", fd.Num, err))
|
||||
return
|
||||
}
|
||||
// Sync root directory.
|
||||
if err = syncDir(fs.path); err != nil {
|
||||
fs.log(fmt.Sprintf("syncDir: %v", err))
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (fs *fileStorage) GetManifest() (f File, err error) {
|
||||
func (fs *fileStorage) GetMeta() (fd FileDesc, err error) {
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
return FileDesc{}, ErrClosed
|
||||
}
|
||||
dir, err := os.Open(fs.path)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
fnn, err := dir.Readdirnames(0)
|
||||
names, err := dir.Readdirnames(0)
|
||||
// Close the dir first before checking for Readdirnames error.
|
||||
if err := dir.Close(); err != nil {
|
||||
fs.log(fmt.Sprintf("close dir: %v", err))
|
||||
if ce := dir.Close(); ce != nil {
|
||||
fs.log(fmt.Sprintf("close dir: %v", ce))
|
||||
}
|
||||
if err != nil {
|
||||
return
|
||||
|
@ -212,55 +282,64 @@ func (fs *fileStorage) GetManifest() (f File, err error) {
|
|||
var rem []string
|
||||
var pend bool
|
||||
var cerr error
|
||||
for _, fn := range fnn {
|
||||
if strings.HasPrefix(fn, "CURRENT") {
|
||||
pend1 := len(fn) > 7
|
||||
for _, name := range names {
|
||||
if strings.HasPrefix(name, "CURRENT") {
|
||||
pend1 := len(name) > 7
|
||||
var pendNum int64
|
||||
// Make sure it is valid name for a CURRENT file, otherwise skip it.
|
||||
if pend1 {
|
||||
if fn[7] != '.' || len(fn) < 9 {
|
||||
fs.log(fmt.Sprintf("skipping %s: invalid file name", fn))
|
||||
if name[7] != '.' || len(name) < 9 {
|
||||
fs.log(fmt.Sprintf("skipping %s: invalid file name", name))
|
||||
continue
|
||||
}
|
||||
if _, e1 := strconv.ParseUint(fn[8:], 10, 0); e1 != nil {
|
||||
fs.log(fmt.Sprintf("skipping %s: invalid file num: %v", fn, e1))
|
||||
var e1 error
|
||||
if pendNum, e1 = strconv.ParseInt(name[8:], 10, 0); e1 != nil {
|
||||
fs.log(fmt.Sprintf("skipping %s: invalid file num: %v", name, e1))
|
||||
continue
|
||||
}
|
||||
}
|
||||
path := filepath.Join(fs.path, fn)
|
||||
path := filepath.Join(fs.path, name)
|
||||
r, e1 := os.OpenFile(path, os.O_RDONLY, 0)
|
||||
if e1 != nil {
|
||||
return nil, e1
|
||||
return FileDesc{}, e1
|
||||
}
|
||||
b, e1 := ioutil.ReadAll(r)
|
||||
if e1 != nil {
|
||||
r.Close()
|
||||
return nil, e1
|
||||
return FileDesc{}, e1
|
||||
}
|
||||
f1 := &file{fs: fs}
|
||||
if len(b) < 1 || b[len(b)-1] != '\n' || !f1.parse(string(b[:len(b)-1])) {
|
||||
fs.log(fmt.Sprintf("skipping %s: corrupted or incomplete", fn))
|
||||
var fd1 FileDesc
|
||||
if len(b) < 1 || b[len(b)-1] != '\n' || !fsParseNamePtr(string(b[:len(b)-1]), &fd1) {
|
||||
fs.log(fmt.Sprintf("skipping %s: corrupted or incomplete", name))
|
||||
if pend1 {
|
||||
rem = append(rem, fn)
|
||||
rem = append(rem, name)
|
||||
}
|
||||
if !pend1 || cerr == nil {
|
||||
cerr = fmt.Errorf("leveldb/storage: corrupted or incomplete %s file", fn)
|
||||
metaFd, _ := fsParseName(name)
|
||||
cerr = &ErrCorrupted{
|
||||
Fd: metaFd,
|
||||
Err: errors.New("leveldb/storage: corrupted or incomplete meta file"),
|
||||
}
|
||||
}
|
||||
} else if f != nil && f1.Num() < f.Num() {
|
||||
fs.log(fmt.Sprintf("skipping %s: obsolete", fn))
|
||||
} else if pend1 && pendNum != fd1.Num {
|
||||
fs.log(fmt.Sprintf("skipping %s: inconsistent pending-file num: %d vs %d", name, pendNum, fd1.Num))
|
||||
rem = append(rem, name)
|
||||
} else if fd1.Num < fd.Num {
|
||||
fs.log(fmt.Sprintf("skipping %s: obsolete", name))
|
||||
if pend1 {
|
||||
rem = append(rem, fn)
|
||||
rem = append(rem, name)
|
||||
}
|
||||
} else {
|
||||
f = f1
|
||||
fd = fd1
|
||||
pend = pend1
|
||||
}
|
||||
if err := r.Close(); err != nil {
|
||||
fs.log(fmt.Sprintf("close %s: %v", fn, err))
|
||||
fs.log(fmt.Sprintf("close %s: %v", name, err))
|
||||
}
|
||||
}
|
||||
}
|
||||
// Don't remove any files if there is no valid CURRENT file.
|
||||
if f == nil {
|
||||
if fd.Zero() {
|
||||
if cerr != nil {
|
||||
err = cerr
|
||||
} else {
|
||||
|
@ -268,52 +347,140 @@ func (fs *fileStorage) GetManifest() (f File, err error) {
|
|||
}
|
||||
return
|
||||
}
|
||||
// Rename pending CURRENT file to an effective CURRENT.
|
||||
if pend {
|
||||
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), f.Num())
|
||||
if err := rename(path, filepath.Join(fs.path, "CURRENT")); err != nil {
|
||||
fs.log(fmt.Sprintf("CURRENT.%d -> CURRENT: %v", f.Num(), err))
|
||||
if !fs.readOnly {
|
||||
// Rename pending CURRENT file to an effective CURRENT.
|
||||
if pend {
|
||||
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), fd.Num)
|
||||
if err := rename(path, filepath.Join(fs.path, "CURRENT")); err != nil {
|
||||
fs.log(fmt.Sprintf("CURRENT.%d -> CURRENT: %v", fd.Num, err))
|
||||
}
|
||||
}
|
||||
}
|
||||
// Remove obsolete or incomplete pending CURRENT files.
|
||||
for _, fn := range rem {
|
||||
path := filepath.Join(fs.path, fn)
|
||||
if err := os.Remove(path); err != nil {
|
||||
fs.log(fmt.Sprintf("remove %s: %v", fn, err))
|
||||
// Remove obsolete or incomplete pending CURRENT files.
|
||||
for _, name := range rem {
|
||||
path := filepath.Join(fs.path, name)
|
||||
if err := os.Remove(path); err != nil {
|
||||
fs.log(fmt.Sprintf("remove %s: %v", name, err))
|
||||
}
|
||||
}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (fs *fileStorage) SetManifest(f File) (err error) {
|
||||
func (fs *fileStorage) List(ft FileType) (fds []FileDesc, err error) {
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
}
|
||||
dir, err := os.Open(fs.path)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
names, err := dir.Readdirnames(0)
|
||||
// Close the dir first before checking for Readdirnames error.
|
||||
if cerr := dir.Close(); cerr != nil {
|
||||
fs.log(fmt.Sprintf("close dir: %v", cerr))
|
||||
}
|
||||
if err == nil {
|
||||
for _, name := range names {
|
||||
if fd, ok := fsParseName(name); ok && fd.Type&ft != 0 {
|
||||
fds = append(fds, fd)
|
||||
}
|
||||
}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (fs *fileStorage) Open(fd FileDesc) (Reader, error) {
|
||||
if !FileDescOk(fd) {
|
||||
return nil, ErrInvalidFile
|
||||
}
|
||||
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
}
|
||||
of, err := os.OpenFile(filepath.Join(fs.path, fsGenName(fd)), os.O_RDONLY, 0)
|
||||
if err != nil {
|
||||
if fsHasOldName(fd) && os.IsNotExist(err) {
|
||||
of, err = os.OpenFile(filepath.Join(fs.path, fsGenOldName(fd)), os.O_RDONLY, 0)
|
||||
if err == nil {
|
||||
goto ok
|
||||
}
|
||||
}
|
||||
return nil, err
|
||||
}
|
||||
ok:
|
||||
fs.open++
|
||||
return &fileWrap{File: of, fs: fs, fd: fd}, nil
|
||||
}
|
||||
|
||||
func (fs *fileStorage) Create(fd FileDesc) (Writer, error) {
|
||||
if !FileDescOk(fd) {
|
||||
return nil, ErrInvalidFile
|
||||
}
|
||||
if fs.readOnly {
|
||||
return nil, errReadOnly
|
||||
}
|
||||
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
}
|
||||
of, err := os.OpenFile(filepath.Join(fs.path, fsGenName(fd)), os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
fs.open++
|
||||
return &fileWrap{File: of, fs: fs, fd: fd}, nil
|
||||
}
|
||||
|
||||
func (fs *fileStorage) Remove(fd FileDesc) error {
|
||||
if !FileDescOk(fd) {
|
||||
return ErrInvalidFile
|
||||
}
|
||||
if fs.readOnly {
|
||||
return errReadOnly
|
||||
}
|
||||
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return ErrClosed
|
||||
}
|
||||
f2, ok := f.(*file)
|
||||
if !ok || f2.t != TypeManifest {
|
||||
err := os.Remove(filepath.Join(fs.path, fsGenName(fd)))
|
||||
if err != nil {
|
||||
if fsHasOldName(fd) && os.IsNotExist(err) {
|
||||
if e1 := os.Remove(filepath.Join(fs.path, fsGenOldName(fd))); !os.IsNotExist(e1) {
|
||||
fs.log(fmt.Sprintf("remove %s: %v (old name)", fd, err))
|
||||
err = e1
|
||||
}
|
||||
} else {
|
||||
fs.log(fmt.Sprintf("remove %s: %v", fd, err))
|
||||
}
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
func (fs *fileStorage) Rename(oldfd, newfd FileDesc) error {
|
||||
if !FileDescOk(oldfd) || !FileDescOk(newfd) {
|
||||
return ErrInvalidFile
|
||||
}
|
||||
defer func() {
|
||||
if err != nil {
|
||||
fs.log(fmt.Sprintf("CURRENT: %v", err))
|
||||
}
|
||||
}()
|
||||
path := fmt.Sprintf("%s.%d", filepath.Join(fs.path, "CURRENT"), f2.Num())
|
||||
w, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
|
||||
if err != nil {
|
||||
return err
|
||||
if oldfd == newfd {
|
||||
return nil
|
||||
}
|
||||
_, err = fmt.Fprintln(w, f2.name())
|
||||
// Close the file first.
|
||||
if err := w.Close(); err != nil {
|
||||
fs.log(fmt.Sprintf("close CURRENT.%d: %v", f2.num, err))
|
||||
if fs.readOnly {
|
||||
return errReadOnly
|
||||
}
|
||||
if err != nil {
|
||||
return err
|
||||
|
||||
fs.mu.Lock()
|
||||
defer fs.mu.Unlock()
|
||||
if fs.open < 0 {
|
||||
return ErrClosed
|
||||
}
|
||||
return rename(path, filepath.Join(fs.path, "CURRENT"))
|
||||
return rename(filepath.Join(fs.path, fsGenName(oldfd)), filepath.Join(fs.path, fsGenName(newfd)))
|
||||
}
|
||||
|
||||
func (fs *fileStorage) Close() error {
|
||||
|
@ -326,209 +493,107 @@ func (fs *fileStorage) Close() error {
|
|||
runtime.SetFinalizer(fs, nil)
|
||||
|
||||
if fs.open > 0 {
|
||||
fs.log(fmt.Sprintf("refuse to close, %d files still open", fs.open))
|
||||
return fmt.Errorf("leveldb/storage: cannot close, %d files still open", fs.open)
|
||||
fs.log(fmt.Sprintf("close: warning, %d files still open", fs.open))
|
||||
}
|
||||
fs.open = -1
|
||||
e1 := fs.logw.Close()
|
||||
err := fs.flock.release()
|
||||
if err == nil {
|
||||
err = e1
|
||||
if fs.logw != nil {
|
||||
fs.logw.Close()
|
||||
}
|
||||
return err
|
||||
return fs.flock.release()
|
||||
}
|
||||
|
||||
type fileWrap struct {
|
||||
*os.File
|
||||
f *file
|
||||
fs *fileStorage
|
||||
fd FileDesc
|
||||
closed bool
|
||||
}
|
||||
|
||||
func (fw fileWrap) Sync() error {
|
||||
func (fw *fileWrap) Sync() error {
|
||||
if err := fw.File.Sync(); err != nil {
|
||||
return err
|
||||
}
|
||||
if fw.f.Type() == TypeManifest {
|
||||
if fw.fd.Type == TypeManifest {
|
||||
// Also sync parent directory if file type is manifest.
|
||||
// See: https://code.google.com/p/leveldb/issues/detail?id=190.
|
||||
if err := syncDir(fw.f.fs.path); err != nil {
|
||||
if err := syncDir(fw.fs.path); err != nil {
|
||||
fw.fs.log(fmt.Sprintf("syncDir: %v", err))
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
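The Sync path above also fsyncs the containing directory when the file is a manifest. A standalone sketch of that "sync the parent directory" pattern, assuming a hypothetical syncParentDir helper (not goleveldb code):

package main

import (
	"io/ioutil"
	"os"
	"path/filepath"
)

// syncParentDir fsyncs the directory that contains path, so the directory
// entry created by a new file or rename is durable. Some filesystems return
// EINVAL when a directory is synced, which is what the isErrInvalid helper
// later in this diff tolerates.
func syncParentDir(path string) error {
	d, err := os.Open(filepath.Dir(path))
	if err != nil {
		return err
	}
	defer d.Close()
	return d.Sync()
}

func main() {
	dir, err := ioutil.TempDir("", "syncdir-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	f, err := os.Create(filepath.Join(dir, "MANIFEST-000001"))
	if err != nil {
		panic(err)
	}
	f.Sync()
	f.Close()

	if err := syncParentDir(f.Name()); err != nil {
		panic(err)
	}
}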
|
||||
|
||||
func (fw fileWrap) Close() error {
|
||||
f := fw.f
|
||||
f.fs.mu.Lock()
|
||||
defer f.fs.mu.Unlock()
|
||||
if !f.open {
|
||||
func (fw *fileWrap) Close() error {
|
||||
fw.fs.mu.Lock()
|
||||
defer fw.fs.mu.Unlock()
|
||||
if fw.closed {
|
||||
return ErrClosed
|
||||
}
|
||||
f.open = false
|
||||
f.fs.open--
|
||||
fw.closed = true
|
||||
fw.fs.open--
|
||||
err := fw.File.Close()
|
||||
if err != nil {
|
||||
f.fs.log(fmt.Sprintf("close %s.%d: %v", f.Type(), f.Num(), err))
|
||||
fw.fs.log(fmt.Sprintf("close %s: %v", fw.fd, err))
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
type file struct {
|
||||
fs *fileStorage
|
||||
num uint64
|
||||
t FileType
|
||||
open bool
|
||||
}
|
||||
|
||||
func (f *file) Open() (Reader, error) {
|
||||
f.fs.mu.Lock()
|
||||
defer f.fs.mu.Unlock()
|
||||
if f.fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
}
|
||||
if f.open {
|
||||
return nil, errFileOpen
|
||||
}
|
||||
of, err := os.OpenFile(f.path(), os.O_RDONLY, 0)
|
||||
if err != nil {
|
||||
if f.hasOldName() && os.IsNotExist(err) {
|
||||
of, err = os.OpenFile(f.oldPath(), os.O_RDONLY, 0)
|
||||
if err == nil {
|
||||
goto ok
|
||||
}
|
||||
}
|
||||
return nil, err
|
||||
}
|
||||
ok:
|
||||
f.open = true
|
||||
f.fs.open++
|
||||
return fileWrap{of, f}, nil
|
||||
}
|
||||
|
||||
func (f *file) Create() (Writer, error) {
|
||||
f.fs.mu.Lock()
|
||||
defer f.fs.mu.Unlock()
|
||||
if f.fs.open < 0 {
|
||||
return nil, ErrClosed
|
||||
}
|
||||
if f.open {
|
||||
return nil, errFileOpen
|
||||
}
|
||||
of, err := os.OpenFile(f.path(), os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0644)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
f.open = true
|
||||
f.fs.open++
|
||||
return fileWrap{of, f}, nil
|
||||
}
|
||||
|
||||
func (f *file) Replace(newfile File) error {
|
||||
f.fs.mu.Lock()
|
||||
defer f.fs.mu.Unlock()
|
||||
if f.fs.open < 0 {
|
||||
return ErrClosed
|
||||
}
|
||||
newfile2, ok := newfile.(*file)
|
||||
if !ok {
|
||||
return ErrInvalidFile
|
||||
}
|
||||
if f.open || newfile2.open {
|
||||
return errFileOpen
|
||||
}
|
||||
return rename(newfile2.path(), f.path())
|
||||
}
|
||||
|
||||
func (f *file) Type() FileType {
|
||||
return f.t
|
||||
}
|
||||
|
||||
func (f *file) Num() uint64 {
|
||||
return f.num
|
||||
}
|
||||
|
||||
func (f *file) Remove() error {
|
||||
f.fs.mu.Lock()
|
||||
defer f.fs.mu.Unlock()
|
||||
if f.fs.open < 0 {
|
||||
return ErrClosed
|
||||
}
|
||||
if f.open {
|
||||
return errFileOpen
|
||||
}
|
||||
err := os.Remove(f.path())
|
||||
if err != nil {
|
||||
f.fs.log(fmt.Sprintf("remove %s.%d: %v", f.Type(), f.Num(), err))
|
||||
}
|
||||
// Also try remove file with old name, just in case.
|
||||
if f.hasOldName() {
|
||||
if e1 := os.Remove(f.oldPath()); !os.IsNotExist(e1) {
|
||||
f.fs.log(fmt.Sprintf("remove %s.%d: %v (old name)", f.Type(), f.Num(), err))
|
||||
err = e1
|
||||
}
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
func (f *file) hasOldName() bool {
|
||||
return f.t == TypeTable
|
||||
}
|
||||
|
||||
func (f *file) oldName() string {
|
||||
switch f.t {
|
||||
case TypeTable:
|
||||
return fmt.Sprintf("%06d.sst", f.num)
|
||||
}
|
||||
return f.name()
|
||||
}
|
||||
|
||||
func (f *file) oldPath() string {
|
||||
return filepath.Join(f.fs.path, f.oldName())
|
||||
}
|
||||
|
||||
func (f *file) name() string {
|
||||
switch f.t {
|
||||
func fsGenName(fd FileDesc) string {
|
||||
switch fd.Type {
|
||||
case TypeManifest:
|
||||
return fmt.Sprintf("MANIFEST-%06d", f.num)
|
||||
return fmt.Sprintf("MANIFEST-%06d", fd.Num)
|
||||
case TypeJournal:
|
||||
return fmt.Sprintf("%06d.log", f.num)
|
||||
return fmt.Sprintf("%06d.log", fd.Num)
|
||||
case TypeTable:
|
||||
return fmt.Sprintf("%06d.ldb", f.num)
|
||||
return fmt.Sprintf("%06d.ldb", fd.Num)
|
||||
case TypeTemp:
|
||||
return fmt.Sprintf("%06d.tmp", f.num)
|
||||
return fmt.Sprintf("%06d.tmp", fd.Num)
|
||||
default:
|
||||
panic("invalid file type")
|
||||
}
|
||||
}
|
||||
|
||||
func (f *file) path() string {
|
||||
return filepath.Join(f.fs.path, f.name())
|
||||
func fsHasOldName(fd FileDesc) bool {
|
||||
return fd.Type == TypeTable
|
||||
}
|
||||
|
||||
func (f *file) parse(name string) bool {
|
||||
var num uint64
|
||||
func fsGenOldName(fd FileDesc) string {
|
||||
switch fd.Type {
|
||||
case TypeTable:
|
||||
return fmt.Sprintf("%06d.sst", fd.Num)
|
||||
}
|
||||
return fsGenName(fd)
|
||||
}
|
||||
|
||||
func fsParseName(name string) (fd FileDesc, ok bool) {
|
||||
var tail string
|
||||
_, err := fmt.Sscanf(name, "%d.%s", &num, &tail)
|
||||
_, err := fmt.Sscanf(name, "%d.%s", &fd.Num, &tail)
|
||||
if err == nil {
|
||||
switch tail {
|
||||
case "log":
|
||||
f.t = TypeJournal
|
||||
fd.Type = TypeJournal
|
||||
case "ldb", "sst":
|
||||
f.t = TypeTable
|
||||
fd.Type = TypeTable
|
||||
case "tmp":
|
||||
f.t = TypeTemp
|
||||
fd.Type = TypeTemp
|
||||
default:
|
||||
return false
|
||||
return
|
||||
}
|
||||
f.num = num
|
||||
return true
|
||||
return fd, true
|
||||
}
|
||||
n, _ := fmt.Sscanf(name, "MANIFEST-%d%s", &num, &tail)
|
||||
n, _ := fmt.Sscanf(name, "MANIFEST-%d%s", &fd.Num, &tail)
|
||||
if n == 1 {
|
||||
f.t = TypeManifest
|
||||
f.num = num
|
||||
return true
|
||||
fd.Type = TypeManifest
|
||||
return fd, true
|
||||
}
|
||||
|
||||
return false
|
||||
return
|
||||
}
|
||||
|
||||
func fsParseNamePtr(name string, fd *FileDesc) bool {
|
||||
_fd, ok := fsParseName(name)
|
||||
if fd != nil {
|
||||
*fd = _fd
|
||||
}
|
||||
return ok
|
||||
}
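As an illustration of the naming scheme that fsGenName and fsParseName implement above, here is a standalone sketch (not the library code; names like genName/parseName are hypothetical) showing the generate/parse round-trip for "MANIFEST-%06d" and "%06d.log/.ldb/.sst/.tmp":

package main

import "fmt"

type kind int

const (
	kindManifest kind = iota
	kindJournal
	kindTable
	kindTemp
)

// genName mirrors fsGenName: render a file name from its type and number.
func genName(k kind, num int64) string {
	switch k {
	case kindManifest:
		return fmt.Sprintf("MANIFEST-%06d", num)
	case kindJournal:
		return fmt.Sprintf("%06d.log", num)
	case kindTable:
		return fmt.Sprintf("%06d.ldb", num)
	default:
		return fmt.Sprintf("%06d.tmp", num)
	}
}

// parseName mirrors fsParseName: recover type and number from a file name.
func parseName(name string) (k kind, num int64, ok bool) {
	var tail string
	if _, err := fmt.Sscanf(name, "%d.%s", &num, &tail); err == nil {
		switch tail {
		case "log":
			return kindJournal, num, true
		case "ldb", "sst": // current .ldb tables and legacy .sst names
			return kindTable, num, true
		case "tmp":
			return kindTemp, num, true
		}
		return 0, 0, false
	}
	if n, _ := fmt.Sscanf(name, "MANIFEST-%d%s", &num, &tail); n == 1 {
		return kindManifest, num, true
	}
	return 0, 0, false
}

func main() {
	fmt.Println(genName(kindTable, 123)) // 000123.ldb
	k, num, ok := parseName("MANIFEST-000007")
	fmt.Println(k == kindManifest, num, ok) // true 7 true
}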
|
||||
|
|
vendor/github.com/syndtr/goleveldb/leveldb/storage/file_storage_nacl.go (new file, 34 lines, generated, vendored)
|
@@ -0,0 +1,34 @@
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
// All rights reserved.
//
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// +build nacl

package storage

import (
	"os"
	"syscall"
)

func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
	return nil, syscall.ENOTSUP
}

func setFileLock(f *os.File, readOnly, lock bool) error {
	return syscall.ENOTSUP
}

func rename(oldpath, newpath string) error {
	return syscall.ENOTSUP
}

func isErrInvalid(err error) bool {
	return false
}

func syncDir(name string) error {
	return syscall.ENOTSUP
}
|
|
@ -19,8 +19,21 @@ func (fl *plan9FileLock) release() error {
|
|||
return fl.f.Close()
|
||||
}
|
||||
|
||||
func newFileLock(path string) (fl fileLock, err error) {
|
||||
f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, os.ModeExclusive|0644)
|
||||
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
|
||||
var (
|
||||
flag int
|
||||
perm os.FileMode
|
||||
)
|
||||
if readOnly {
|
||||
flag = os.O_RDONLY
|
||||
} else {
|
||||
flag = os.O_RDWR
|
||||
perm = os.ModeExclusive
|
||||
}
|
||||
f, err := os.OpenFile(path, flag, perm)
|
||||
if os.IsNotExist(err) {
|
||||
f, err = os.OpenFile(path, flag|os.O_CREATE, perm|0644)
|
||||
}
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
|
|
|
@ -18,18 +18,27 @@ type unixFileLock struct {
|
|||
}
|
||||
|
||||
func (fl *unixFileLock) release() error {
|
||||
if err := setFileLock(fl.f, false); err != nil {
|
||||
if err := setFileLock(fl.f, false, false); err != nil {
|
||||
return err
|
||||
}
|
||||
return fl.f.Close()
|
||||
}
|
||||
|
||||
func newFileLock(path string) (fl fileLock, err error) {
|
||||
f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0644)
|
||||
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
|
||||
var flag int
|
||||
if readOnly {
|
||||
flag = os.O_RDONLY
|
||||
} else {
|
||||
flag = os.O_RDWR
|
||||
}
|
||||
f, err := os.OpenFile(path, flag, 0)
|
||||
if os.IsNotExist(err) {
|
||||
f, err = os.OpenFile(path, flag|os.O_CREATE, 0644)
|
||||
}
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
err = setFileLock(f, true)
|
||||
err = setFileLock(f, readOnly, true)
|
||||
if err != nil {
|
||||
f.Close()
|
||||
return
|
||||
|
@ -38,7 +47,7 @@ func newFileLock(path string) (fl fileLock, err error) {
|
|||
return
|
||||
}
|
||||
|
||||
func setFileLock(f *os.File, lock bool) error {
|
||||
func setFileLock(f *os.File, readOnly, lock bool) error {
|
||||
flock := syscall.Flock_t{
|
||||
Type: syscall.F_UNLCK,
|
||||
Start: 0,
|
||||
|
@ -46,7 +55,11 @@ func setFileLock(f *os.File, lock bool) error {
|
|||
Whence: 1,
|
||||
}
|
||||
if lock {
|
||||
flock.Type = syscall.F_WRLCK
|
||||
if readOnly {
|
||||
flock.Type = syscall.F_RDLCK
|
||||
} else {
|
||||
flock.Type = syscall.F_WRLCK
|
||||
}
|
||||
}
|
||||
return syscall.FcntlFlock(f.Fd(), syscall.F_SETLK, &flock)
|
||||
}
|
||||
|
|
|
@ -17,14 +17,14 @@ var cases = []struct {
|
|||
oldName []string
|
||||
name string
|
||||
ftype FileType
|
||||
num uint64
|
||||
num int64
|
||||
}{
|
||||
{nil, "000100.log", TypeJournal, 100},
|
||||
{nil, "000000.log", TypeJournal, 0},
|
||||
{[]string{"000000.sst"}, "000000.ldb", TypeTable, 0},
|
||||
{nil, "MANIFEST-000002", TypeManifest, 2},
|
||||
{nil, "MANIFEST-000007", TypeManifest, 7},
|
||||
{nil, "18446744073709551615.log", TypeJournal, 18446744073709551615},
|
||||
{nil, "9223372036854775807.log", TypeJournal, 9223372036854775807},
|
||||
{nil, "000100.tmp", TypeTemp, 100},
|
||||
}
|
||||
|
||||
|
@ -55,9 +55,8 @@ var invalidCases = []string{
|
|||
|
||||
func TestFileStorage_CreateFileName(t *testing.T) {
|
||||
for _, c := range cases {
|
||||
f := &file{num: c.num, t: c.ftype}
|
||||
if f.name() != c.name {
|
||||
t.Errorf("invalid filename got '%s', want '%s'", f.name(), c.name)
|
||||
if name := fsGenName(FileDesc{c.ftype, c.num}); name != c.name {
|
||||
t.Errorf("invalid filename got '%s', want '%s'", name, c.name)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -65,16 +64,16 @@ func TestFileStorage_CreateFileName(t *testing.T) {
|
|||
func TestFileStorage_ParseFileName(t *testing.T) {
|
||||
for _, c := range cases {
|
||||
for _, name := range append([]string{c.name}, c.oldName...) {
|
||||
f := new(file)
|
||||
if !f.parse(name) {
|
||||
fd, ok := fsParseName(name)
|
||||
if !ok {
|
||||
t.Errorf("cannot parse filename '%s'", name)
|
||||
continue
|
||||
}
|
||||
if f.Type() != c.ftype {
|
||||
t.Errorf("filename '%s' invalid type got '%d', want '%d'", name, f.Type(), c.ftype)
|
||||
if fd.Type != c.ftype {
|
||||
t.Errorf("filename '%s' invalid type got '%d', want '%d'", name, fd.Type, c.ftype)
|
||||
}
|
||||
if f.Num() != c.num {
|
||||
t.Errorf("filename '%s' invalid number got '%d', want '%d'", name, f.Num(), c.num)
|
||||
if fd.Num != c.num {
|
||||
t.Errorf("filename '%s' invalid number got '%d', want '%d'", name, fd.Num, c.num)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -82,32 +81,25 @@ func TestFileStorage_ParseFileName(t *testing.T) {
|
|||
|
||||
func TestFileStorage_InvalidFileName(t *testing.T) {
|
||||
for _, name := range invalidCases {
|
||||
f := new(file)
|
||||
if f.parse(name) {
|
||||
if fsParseNamePtr(name, nil) {
|
||||
t.Errorf("filename '%s' should be invalid", name)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func TestFileStorage_Locking(t *testing.T) {
|
||||
path := filepath.Join(os.TempDir(), fmt.Sprintf("goleveldbtestfd-%d", os.Getuid()))
|
||||
|
||||
_, err := os.Stat(path)
|
||||
if err == nil {
|
||||
err = os.RemoveAll(path)
|
||||
if err != nil {
|
||||
t.Fatal("RemoveAll: got error: ", err)
|
||||
}
|
||||
path := filepath.Join(os.TempDir(), fmt.Sprintf("goleveldb-testrwlock-%d", os.Getuid()))
|
||||
if err := os.RemoveAll(path); err != nil && !os.IsNotExist(err) {
|
||||
t.Fatal("RemoveAll: got error: ", err)
|
||||
}
|
||||
defer os.RemoveAll(path)
|
||||
|
||||
p1, err := OpenFile(path)
|
||||
p1, err := OpenFile(path, false)
|
||||
if err != nil {
|
||||
t.Fatal("OpenFile(1): got error: ", err)
|
||||
}
|
||||
|
||||
defer os.RemoveAll(path)
|
||||
|
||||
p2, err := OpenFile(path)
|
||||
p2, err := OpenFile(path, false)
|
||||
if err != nil {
|
||||
t.Logf("OpenFile(2): got error: %s (expected)", err)
|
||||
} else {
|
||||
|
@ -118,7 +110,7 @@ func TestFileStorage_Locking(t *testing.T) {
|
|||
|
||||
p1.Close()
|
||||
|
||||
p3, err := OpenFile(path)
|
||||
p3, err := OpenFile(path, false)
|
||||
if err != nil {
|
||||
t.Fatal("OpenFile(3): got error: ", err)
|
||||
}
|
||||
|
@ -134,9 +126,51 @@ func TestFileStorage_Locking(t *testing.T) {
|
|||
} else {
|
||||
t.Logf("storage lock got error: %s (expected)", err)
|
||||
}
|
||||
l.Release()
|
||||
l.Unlock()
|
||||
_, err = p3.Lock()
|
||||
if err != nil {
|
||||
t.Fatal("storage lock failed(2): ", err)
|
||||
}
|
||||
}
|
||||
|
||||
func TestFileStorage_ReadOnlyLocking(t *testing.T) {
|
||||
path := filepath.Join(os.TempDir(), fmt.Sprintf("goleveldb-testrolock-%d", os.Getuid()))
|
||||
if err := os.RemoveAll(path); err != nil && !os.IsNotExist(err) {
|
||||
t.Fatal("RemoveAll: got error: ", err)
|
||||
}
|
||||
defer os.RemoveAll(path)
|
||||
|
||||
p1, err := OpenFile(path, false)
|
||||
if err != nil {
|
||||
t.Fatal("OpenFile(1): got error: ", err)
|
||||
}
|
||||
|
||||
_, err = OpenFile(path, true)
|
||||
if err != nil {
|
||||
t.Logf("OpenFile(2): got error: %s (expected)", err)
|
||||
} else {
|
||||
t.Fatal("OpenFile(2): expect error")
|
||||
}
|
||||
|
||||
p1.Close()
|
||||
|
||||
p3, err := OpenFile(path, true)
|
||||
if err != nil {
|
||||
t.Fatal("OpenFile(3): got error: ", err)
|
||||
}
|
||||
|
||||
p4, err := OpenFile(path, true)
|
||||
if err != nil {
|
||||
t.Fatal("OpenFile(4): got error: ", err)
|
||||
}
|
||||
|
||||
_, err = OpenFile(path, false)
|
||||
if err != nil {
|
||||
t.Logf("OpenFile(5): got error: %s (expected)", err)
|
||||
} else {
|
||||
t.Fatal("OpenFile(2): expect error")
|
||||
}
|
||||
|
||||
p3.Close()
|
||||
p4.Close()
|
||||
}
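A hedged usage sketch of the read-only locking behavior exercised by the test above, from a client's point of view (storage.OpenFile with the new readOnly argument, as introduced by this vendored revision): multiple read-only opens share the lock, while a read-write open is refused as long as any handle is held.

package main

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/syndtr/goleveldb/leveldb/storage"
)

func main() {
	dir := filepath.Join(os.TempDir(), "goleveldb-rolock-demo")
	defer os.RemoveAll(dir)

	// Open read-write once to create the directory and its LOCK file.
	rw, err := storage.OpenFile(dir, false)
	if err != nil {
		panic(err)
	}
	rw.Close()

	// Two read-only handles can coexist (shared lock).
	ro1, err := storage.OpenFile(dir, true)
	if err != nil {
		panic(err)
	}
	ro2, err := storage.OpenFile(dir, true)
	if err != nil {
		panic(err)
	}

	// A read-write open is expected to fail while the shared locks are held.
	if _, err := storage.OpenFile(dir, false); err != nil {
		fmt.Println("read-write open rejected:", err)
	}

	ro1.Close()
	ro2.Close()
}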
|
||||
|
|
|
@ -18,18 +18,27 @@ type unixFileLock struct {
|
|||
}
|
||||
|
||||
func (fl *unixFileLock) release() error {
|
||||
if err := setFileLock(fl.f, false); err != nil {
|
||||
if err := setFileLock(fl.f, false, false); err != nil {
|
||||
return err
|
||||
}
|
||||
return fl.f.Close()
|
||||
}
|
||||
|
||||
func newFileLock(path string) (fl fileLock, err error) {
|
||||
f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0644)
|
||||
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
|
||||
var flag int
|
||||
if readOnly {
|
||||
flag = os.O_RDONLY
|
||||
} else {
|
||||
flag = os.O_RDWR
|
||||
}
|
||||
f, err := os.OpenFile(path, flag, 0)
|
||||
if os.IsNotExist(err) {
|
||||
f, err = os.OpenFile(path, flag|os.O_CREATE, 0644)
|
||||
}
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
err = setFileLock(f, true)
|
||||
err = setFileLock(f, readOnly, true)
|
||||
if err != nil {
|
||||
f.Close()
|
||||
return
|
||||
|
@ -38,10 +47,14 @@ func newFileLock(path string) (fl fileLock, err error) {
|
|||
return
|
||||
}
|
||||
|
||||
func setFileLock(f *os.File, lock bool) error {
|
||||
func setFileLock(f *os.File, readOnly, lock bool) error {
|
||||
how := syscall.LOCK_UN
|
||||
if lock {
|
||||
how = syscall.LOCK_EX
|
||||
if readOnly {
|
||||
how = syscall.LOCK_SH
|
||||
} else {
|
||||
how = syscall.LOCK_EX
|
||||
}
|
||||
}
|
||||
return syscall.Flock(int(f.Fd()), how|syscall.LOCK_NB)
|
||||
}
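The flock(2) semantics used by setFileLock above can be seen in isolation with the following sketch (Linux/BSD only; the file path is illustrative): LOCK_SH admits many read-only holders, LOCK_EX demands exclusivity, and LOCK_NB turns blocking into an immediate error.

package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	f1, err := os.OpenFile("/tmp/flock-demo", os.O_RDWR|os.O_CREATE, 0644)
	if err != nil {
		panic(err)
	}
	defer f1.Close()

	// Take a shared lock: other shared lockers would still be admitted.
	if err := syscall.Flock(int(f1.Fd()), syscall.LOCK_SH|syscall.LOCK_NB); err != nil {
		panic(err)
	}

	// A second open of the same file gets its own file description, so its
	// exclusive lock request conflicts with the shared lock above.
	f2, err := os.OpenFile("/tmp/flock-demo", os.O_RDWR, 0)
	if err != nil {
		panic(err)
	}
	defer f2.Close()

	if err := syscall.Flock(int(f2.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		fmt.Println("exclusive lock refused:", err) // EWOULDBLOCK
	}
}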
|
||||
|
@ -50,13 +63,23 @@ func rename(oldpath, newpath string) error {
|
|||
return os.Rename(oldpath, newpath)
|
||||
}
|
||||
|
||||
func isErrInvalid(err error) bool {
|
||||
if err == os.ErrInvalid {
|
||||
return true
|
||||
}
|
||||
if syserr, ok := err.(*os.SyscallError); ok && syserr.Err == syscall.EINVAL {
|
||||
return true
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func syncDir(name string) error {
|
||||
f, err := os.Open(name)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer f.Close()
|
||||
if err := f.Sync(); err != nil {
|
||||
if err := f.Sync(); err != nil && !isErrInvalid(err) {
|
||||
return err
|
||||
}
|
||||
return nil
|
||||
|
|
|
@ -29,12 +29,22 @@ func (fl *windowsFileLock) release() error {
|
|||
return syscall.Close(fl.fd)
|
||||
}
|
||||
|
||||
func newFileLock(path string) (fl fileLock, err error) {
|
||||
func newFileLock(path string, readOnly bool) (fl fileLock, err error) {
|
||||
pathp, err := syscall.UTF16PtrFromString(path)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
fd, err := syscall.CreateFile(pathp, syscall.GENERIC_READ|syscall.GENERIC_WRITE, 0, nil, syscall.CREATE_ALWAYS, syscall.FILE_ATTRIBUTE_NORMAL, 0)
|
||||
var access, shareMode uint32
|
||||
if readOnly {
|
||||
access = syscall.GENERIC_READ
|
||||
shareMode = syscall.FILE_SHARE_READ
|
||||
} else {
|
||||
access = syscall.GENERIC_READ | syscall.GENERIC_WRITE
|
||||
}
|
||||
fd, err := syscall.CreateFile(pathp, access, shareMode, nil, syscall.OPEN_EXISTING, syscall.FILE_ATTRIBUTE_NORMAL, 0)
|
||||
if err == syscall.ERROR_FILE_NOT_FOUND {
|
||||
fd, err = syscall.CreateFile(pathp, access, shareMode, nil, syscall.OPEN_ALWAYS, syscall.FILE_ATTRIBUTE_NORMAL, 0)
|
||||
}
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
|
@ -47,9 +57,8 @@ func moveFileEx(from *uint16, to *uint16, flags uint32) error {
|
|||
if r1 == 0 {
|
||||
if e1 != 0 {
|
||||
return error(e1)
|
||||
} else {
|
||||
return syscall.EINVAL
|
||||
}
|
||||
return syscall.EINVAL
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
|
|
@ -10,8 +10,6 @@ import (
|
|||
"bytes"
|
||||
"os"
|
||||
"sync"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
const typeShift = 3
|
||||
|
@ -20,7 +18,7 @@ type memStorageLock struct {
|
|||
ms *memStorage
|
||||
}
|
||||
|
||||
func (lock *memStorageLock) Release() {
|
||||
func (lock *memStorageLock) Unlock() {
|
||||
ms := lock.ms
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
|
@ -32,10 +30,10 @@ func (lock *memStorageLock) Release() {
|
|||
|
||||
// memStorage is a memory-backed storage.
|
||||
type memStorage struct {
|
||||
mu sync.Mutex
|
||||
slock *memStorageLock
|
||||
files map[uint64]*memFile
|
||||
manifest *memFilePtr
|
||||
mu sync.Mutex
|
||||
slock *memStorageLock
|
||||
files map[uint64]*memFile
|
||||
meta FileDesc
|
||||
}
|
||||
|
||||
// NewMemStorage returns a new memory-backed storage implementation.
|
||||
|
@ -45,7 +43,7 @@ func NewMemStorage() Storage {
|
|||
}
|
||||
}
|
||||
|
||||
func (ms *memStorage) Lock() (util.Releaser, error) {
|
||||
func (ms *memStorage) Lock() (Locker, error) {
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
if ms.slock != nil {
|
||||
|
@ -57,147 +55,164 @@ func (ms *memStorage) Lock() (util.Releaser, error) {
|
|||
|
||||
func (*memStorage) Log(str string) {}
|
||||
|
||||
func (ms *memStorage) GetFile(num uint64, t FileType) File {
|
||||
return &memFilePtr{ms: ms, num: num, t: t}
|
||||
}
|
||||
|
||||
func (ms *memStorage) GetFiles(t FileType) ([]File, error) {
|
||||
ms.mu.Lock()
|
||||
var ff []File
|
||||
for x, _ := range ms.files {
|
||||
num, mt := x>>typeShift, FileType(x)&TypeAll
|
||||
if mt&t == 0 {
|
||||
continue
|
||||
}
|
||||
ff = append(ff, &memFilePtr{ms: ms, num: num, t: mt})
|
||||
}
|
||||
ms.mu.Unlock()
|
||||
return ff, nil
|
||||
}
|
||||
|
||||
func (ms *memStorage) GetManifest() (File, error) {
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
if ms.manifest == nil {
|
||||
return nil, os.ErrNotExist
|
||||
}
|
||||
return ms.manifest, nil
|
||||
}
|
||||
|
||||
func (ms *memStorage) SetManifest(f File) error {
|
||||
fm, ok := f.(*memFilePtr)
|
||||
if !ok || fm.t != TypeManifest {
|
||||
func (ms *memStorage) SetMeta(fd FileDesc) error {
|
||||
if !FileDescOk(fd) {
|
||||
return ErrInvalidFile
|
||||
}
|
||||
|
||||
ms.mu.Lock()
|
||||
ms.manifest = fm
|
||||
ms.meta = fd
|
||||
ms.mu.Unlock()
|
||||
return nil
|
||||
}
|
||||
|
||||
func (*memStorage) Close() error { return nil }
|
||||
|
||||
type memReader struct {
|
||||
*bytes.Reader
|
||||
m *memFile
|
||||
}
|
||||
|
||||
func (mr *memReader) Close() error {
|
||||
return mr.m.Close()
|
||||
}
|
||||
|
||||
type memFile struct {
|
||||
bytes.Buffer
|
||||
ms *memStorage
|
||||
open bool
|
||||
}
|
||||
|
||||
func (*memFile) Sync() error { return nil }
|
||||
func (m *memFile) Close() error {
|
||||
m.ms.mu.Lock()
|
||||
m.open = false
|
||||
m.ms.mu.Unlock()
|
||||
return nil
|
||||
}
|
||||
|
||||
type memFilePtr struct {
|
||||
ms *memStorage
|
||||
num uint64
|
||||
t FileType
|
||||
}
|
||||
|
||||
func (p *memFilePtr) x() uint64 {
|
||||
return p.Num()<<typeShift | uint64(p.Type())
|
||||
}
|
||||
|
||||
func (p *memFilePtr) Open() (Reader, error) {
|
||||
ms := p.ms
|
||||
func (ms *memStorage) GetMeta() (FileDesc, error) {
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
if m, exist := ms.files[p.x()]; exist {
|
||||
if ms.meta.Zero() {
|
||||
return FileDesc{}, os.ErrNotExist
|
||||
}
|
||||
return ms.meta, nil
|
||||
}
|
||||
|
||||
func (ms *memStorage) List(ft FileType) ([]FileDesc, error) {
|
||||
ms.mu.Lock()
|
||||
var fds []FileDesc
|
||||
for x := range ms.files {
|
||||
fd := unpackFile(x)
|
||||
if fd.Type&ft != 0 {
|
||||
fds = append(fds, fd)
|
||||
}
|
||||
}
|
||||
ms.mu.Unlock()
|
||||
return fds, nil
|
||||
}
|
||||
|
||||
func (ms *memStorage) Open(fd FileDesc) (Reader, error) {
|
||||
if !FileDescOk(fd) {
|
||||
return nil, ErrInvalidFile
|
||||
}
|
||||
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
if m, exist := ms.files[packFile(fd)]; exist {
|
||||
if m.open {
|
||||
return nil, errFileOpen
|
||||
}
|
||||
m.open = true
|
||||
return &memReader{Reader: bytes.NewReader(m.Bytes()), m: m}, nil
|
||||
return &memReader{Reader: bytes.NewReader(m.Bytes()), ms: ms, m: m}, nil
|
||||
}
|
||||
return nil, os.ErrNotExist
|
||||
}
|
||||
|
||||
func (p *memFilePtr) Create() (Writer, error) {
|
||||
ms := p.ms
|
||||
func (ms *memStorage) Create(fd FileDesc) (Writer, error) {
|
||||
if !FileDescOk(fd) {
|
||||
return nil, ErrInvalidFile
|
||||
}
|
||||
|
||||
x := packFile(fd)
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
m, exist := ms.files[p.x()]
|
||||
m, exist := ms.files[x]
|
||||
if exist {
|
||||
if m.open {
|
||||
return nil, errFileOpen
|
||||
}
|
||||
m.Reset()
|
||||
} else {
|
||||
m = &memFile{ms: ms}
|
||||
ms.files[p.x()] = m
|
||||
m = &memFile{}
|
||||
ms.files[x] = m
|
||||
}
|
||||
m.open = true
|
||||
return m, nil
|
||||
return &memWriter{memFile: m, ms: ms}, nil
|
||||
}
|
||||
|
||||
func (p *memFilePtr) Replace(newfile File) error {
|
||||
p1, ok := newfile.(*memFilePtr)
|
||||
if !ok {
|
||||
func (ms *memStorage) Remove(fd FileDesc) error {
|
||||
if !FileDescOk(fd) {
|
||||
return ErrInvalidFile
|
||||
}
|
||||
ms := p.ms
|
||||
|
||||
x := packFile(fd)
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
m1, exist := ms.files[p1.x()]
|
||||
if !exist {
|
||||
return os.ErrNotExist
|
||||
}
|
||||
m0, exist := ms.files[p.x()]
|
||||
if (exist && m0.open) || m1.open {
|
||||
return errFileOpen
|
||||
}
|
||||
delete(ms.files, p1.x())
|
||||
ms.files[p.x()] = m1
|
||||
return nil
|
||||
}
|
||||
|
||||
func (p *memFilePtr) Type() FileType {
|
||||
return p.t
|
||||
}
|
||||
|
||||
func (p *memFilePtr) Num() uint64 {
|
||||
return p.num
|
||||
}
|
||||
|
||||
func (p *memFilePtr) Remove() error {
|
||||
ms := p.ms
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
if _, exist := ms.files[p.x()]; exist {
|
||||
delete(ms.files, p.x())
|
||||
if _, exist := ms.files[x]; exist {
|
||||
delete(ms.files, x)
|
||||
return nil
|
||||
}
|
||||
return os.ErrNotExist
|
||||
}
|
||||
|
||||
func (ms *memStorage) Rename(oldfd, newfd FileDesc) error {
|
||||
if !FileDescOk(oldfd) || !FileDescOk(newfd) {
|
||||
return ErrInvalidFile
|
||||
}
|
||||
if oldfd == newfd {
|
||||
return nil
|
||||
}
|
||||
|
||||
oldx := packFile(oldfd)
|
||||
newx := packFile(newfd)
|
||||
ms.mu.Lock()
|
||||
defer ms.mu.Unlock()
|
||||
oldm, exist := ms.files[oldx]
|
||||
if !exist {
|
||||
return os.ErrNotExist
|
||||
}
|
||||
newm, exist := ms.files[newx]
|
||||
if (exist && newm.open) || oldm.open {
|
||||
return errFileOpen
|
||||
}
|
||||
delete(ms.files, oldx)
|
||||
ms.files[newx] = oldm
|
||||
return nil
|
||||
}
|
||||
|
||||
func (*memStorage) Close() error { return nil }
|
||||
|
||||
type memFile struct {
|
||||
bytes.Buffer
|
||||
open bool
|
||||
}
|
||||
|
||||
type memReader struct {
|
||||
*bytes.Reader
|
||||
ms *memStorage
|
||||
m *memFile
|
||||
closed bool
|
||||
}
|
||||
|
||||
func (mr *memReader) Close() error {
|
||||
mr.ms.mu.Lock()
|
||||
defer mr.ms.mu.Unlock()
|
||||
if mr.closed {
|
||||
return ErrClosed
|
||||
}
|
||||
mr.m.open = false
|
||||
return nil
|
||||
}
|
||||
|
||||
type memWriter struct {
|
||||
*memFile
|
||||
ms *memStorage
|
||||
closed bool
|
||||
}
|
||||
|
||||
func (*memWriter) Sync() error { return nil }
|
||||
|
||||
func (mw *memWriter) Close() error {
|
||||
mw.ms.mu.Lock()
|
||||
defer mw.ms.mu.Unlock()
|
||||
if mw.closed {
|
||||
return ErrClosed
|
||||
}
|
||||
mw.memFile.open = false
|
||||
return nil
|
||||
}
|
||||
|
||||
func packFile(fd FileDesc) uint64 {
|
||||
return uint64(fd.Num)<<typeShift | uint64(fd.Type)
|
||||
}
|
||||
|
||||
func unpackFile(x uint64) FileDesc {
|
||||
return FileDesc{FileType(x) & TypeAll, int64(x >> typeShift)}
|
||||
}
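A hedged client-side sketch of the FileDesc-based Storage API implemented above, using the in-memory backend (the identifiers are the exported ones shown in this diff; error handling is kept minimal):

package main

import (
	"bytes"
	"fmt"

	"github.com/syndtr/goleveldb/leveldb/storage"
)

func main() {
	stor := storage.NewMemStorage()
	defer stor.Close()

	fd := storage.FileDesc{Type: storage.TypeTable, Num: 1}

	// Create and write a table file.
	w, err := stor.Create(fd)
	if err != nil {
		panic(err)
	}
	w.Write([]byte("abc"))
	w.Close()

	// Read it back.
	r, err := stor.Open(fd)
	if err != nil {
		panic(err)
	}
	buf := new(bytes.Buffer)
	buf.ReadFrom(r)
	r.Close()
	fmt.Println(buf.String()) // abc

	// List all file descriptors currently stored.
	fds, _ := stor.List(storage.TypeAll)
	fmt.Println(len(fds)) // 1
}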
|
||||
|
|
|
@ -24,24 +24,23 @@ func TestMemStorage(t *testing.T) {
|
|||
} else {
|
||||
t.Logf("storage lock got error: %s (expected)", err)
|
||||
}
|
||||
l.Release()
|
||||
l.Unlock()
|
||||
_, err = m.Lock()
|
||||
if err != nil {
|
||||
t.Fatal("storage lock failed(2): ", err)
|
||||
}
|
||||
|
||||
f := m.GetFile(1, TypeTable)
|
||||
if f.Num() != 1 && f.Type() != TypeTable {
|
||||
t.Fatal("invalid file number and type")
|
||||
w, err := m.Create(FileDesc{TypeTable, 1})
|
||||
if err != nil {
|
||||
t.Fatal("Storage.Create: ", err)
|
||||
}
|
||||
w, _ := f.Create()
|
||||
w.Write([]byte("abc"))
|
||||
w.Close()
|
||||
if ff, _ := m.GetFiles(TypeAll); len(ff) != 1 {
|
||||
if fds, _ := m.List(TypeAll); len(fds) != 1 {
|
||||
t.Fatal("invalid GetFiles len")
|
||||
}
|
||||
buf := new(bytes.Buffer)
|
||||
r, err := f.Open()
|
||||
r, err := m.Open(FileDesc{TypeTable, 1})
|
||||
if err != nil {
|
||||
t.Fatal("Open: got error: ", err)
|
||||
}
|
||||
|
@ -50,17 +49,17 @@ func TestMemStorage(t *testing.T) {
|
|||
if got := buf.String(); got != "abc" {
|
||||
t.Fatalf("Read: invalid value, want=abc got=%s", got)
|
||||
}
|
||||
if _, err := f.Open(); err != nil {
|
||||
if _, err := m.Open(FileDesc{TypeTable, 1}); err != nil {
|
||||
t.Fatal("Open: got error: ", err)
|
||||
}
|
||||
if _, err := m.GetFile(1, TypeTable).Open(); err == nil {
|
||||
if _, err := m.Open(FileDesc{TypeTable, 1}); err == nil {
|
||||
t.Fatal("expecting error")
|
||||
}
|
||||
f.Remove()
|
||||
if ff, _ := m.GetFiles(TypeAll); len(ff) != 0 {
|
||||
t.Fatal("invalid GetFiles len", len(ff))
|
||||
m.Remove(FileDesc{TypeTable, 1})
|
||||
if fds, _ := m.List(TypeAll); len(fds) != 0 {
|
||||
t.Fatal("invalid GetFiles len", len(fds))
|
||||
}
|
||||
if _, err := f.Open(); err == nil {
|
||||
if _, err := m.Open(FileDesc{TypeTable, 1}); err == nil {
|
||||
t.Fatal("expecting error")
|
||||
}
|
||||
}
|
||||
|
|
|
@ -11,12 +11,12 @@ import (
|
|||
"errors"
|
||||
"fmt"
|
||||
"io"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
type FileType uint32
|
||||
// FileType represents a file type.
|
||||
type FileType int
|
||||
|
||||
// File types.
|
||||
const (
|
||||
TypeManifest FileType = 1 << iota
|
||||
TypeJournal
|
||||
|
@ -40,12 +40,28 @@ func (t FileType) String() string {
|
|||
return fmt.Sprintf("<unknown:%d>", t)
|
||||
}
|
||||
|
||||
// Common error.
|
||||
var (
|
||||
ErrInvalidFile = errors.New("leveldb/storage: invalid file for argument")
|
||||
ErrLocked = errors.New("leveldb/storage: already locked")
|
||||
ErrClosed = errors.New("leveldb/storage: closed")
|
||||
)
|
||||
|
||||
// ErrCorrupted is the type that wraps errors that indicate corruption of
|
||||
// a file. Package storage has its own type instead of using
|
||||
// errors.ErrCorrupted to prevent circular import.
|
||||
type ErrCorrupted struct {
|
||||
Fd FileDesc
|
||||
Err error
|
||||
}
|
||||
|
||||
func (e *ErrCorrupted) Error() string {
|
||||
if !e.Fd.Zero() {
|
||||
return fmt.Sprintf("%v [file=%v]", e.Err, e.Fd)
|
||||
}
|
||||
return e.Err.Error()
|
||||
}
|
||||
|
||||
// Syncer is the interface that wraps basic Sync method.
|
||||
type Syncer interface {
|
||||
// Sync commits the current contents of the file to stable storage.
|
||||
|
@ -67,91 +83,97 @@ type Writer interface {
|
|||
Syncer
|
||||
}
|
||||
|
||||
// File is the file. A file instance must be goroutine-safe.
|
||||
type File interface {
|
||||
// Open opens the file for read. Returns os.ErrNotExist error
|
||||
// if the file does not exist.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Open() (r Reader, err error)
|
||||
|
||||
// Create creates the file for writing. Truncates the file if it
|
||||
// already exists.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Create() (w Writer, err error)
|
||||
|
||||
// Replace replaces file with newfile.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Replace(newfile File) error
|
||||
|
||||
// Type returns the file type
|
||||
Type() FileType
|
||||
|
||||
// Num returns the file number.
|
||||
Num() uint64
|
||||
|
||||
// Remove removes the file.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Remove() error
|
||||
// Locker is the interface that wraps Unlock method.
|
||||
type Locker interface {
|
||||
Unlock()
|
||||
}
|
||||
|
||||
// Storage is the storage. A storage instance must be goroutine-safe.
|
||||
// FileDesc is a 'file descriptor'.
|
||||
type FileDesc struct {
|
||||
Type FileType
|
||||
Num int64
|
||||
}
|
||||
|
||||
func (fd FileDesc) String() string {
|
||||
switch fd.Type {
|
||||
case TypeManifest:
|
||||
return fmt.Sprintf("MANIFEST-%06d", fd.Num)
|
||||
case TypeJournal:
|
||||
return fmt.Sprintf("%06d.log", fd.Num)
|
||||
case TypeTable:
|
||||
return fmt.Sprintf("%06d.ldb", fd.Num)
|
||||
case TypeTemp:
|
||||
return fmt.Sprintf("%06d.tmp", fd.Num)
|
||||
default:
|
||||
return fmt.Sprintf("%#x-%d", fd.Type, fd.Num)
|
||||
}
|
||||
}
|
||||
|
||||
// Zero returns true if fd == (FileDesc{}).
|
||||
func (fd FileDesc) Zero() bool {
|
||||
return fd == (FileDesc{})
|
||||
}
|
||||
|
||||
// FileDescOk returns true if fd is a valid 'file descriptor'.
|
||||
func FileDescOk(fd FileDesc) bool {
|
||||
switch fd.Type {
|
||||
case TypeManifest:
|
||||
case TypeJournal:
|
||||
case TypeTable:
|
||||
case TypeTemp:
|
||||
default:
|
||||
return false
|
||||
}
|
||||
return fd.Num >= 0
|
||||
}
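A small illustration of the FileDesc helpers defined above (the values are arbitrary; output noted in comments follows the String, Zero and FileDescOk definitions shown in this file):

package main

import (
	"fmt"

	"github.com/syndtr/goleveldb/leveldb/storage"
)

func main() {
	fd := storage.FileDesc{Type: storage.TypeTable, Num: 123}
	fmt.Println(fd)                     // 000123.ldb
	fmt.Println(fd.Zero())              // false
	fmt.Println(storage.FileDescOk(fd)) // true

	var zero storage.FileDesc
	fmt.Println(zero.Zero())              // true
	fmt.Println(storage.FileDescOk(zero)) // false: unknown file type
}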
|
||||
|
||||
// Storage is the storage. A storage instance must be safe for concurrent use.
|
||||
type Storage interface {
|
||||
// Lock locks the storage. Any subsequent attempt to call Lock will fail
|
||||
// until the last lock released.
|
||||
// After use the caller should call the Release method.
|
||||
Lock() (l util.Releaser, err error)
|
||||
// Caller should call Unlock method after use.
|
||||
Lock() (Locker, error)
|
||||
|
||||
// Log logs a string. This is used for logging. An implementation
|
||||
// may write to a file, stdout or simply do nothing.
|
||||
// Log logs a string. This is used for logging.
|
||||
// An implementation may write to a file, stdout or simply do nothing.
|
||||
Log(str string)
|
||||
|
||||
// GetFile returns a file for the given number and type. GetFile will never
|
||||
// returns nil, even if the underlying storage is closed.
|
||||
GetFile(num uint64, t FileType) File
|
||||
// SetMeta stores the 'file descriptor' that can later be acquired using the GetMeta
|
||||
// method. The 'file descriptor' should point to a valid file.
|
||||
// SetMeta should be implemented in such a way that the change happens
|
||||
// atomically.
|
||||
SetMeta(fd FileDesc) error
|
||||
|
||||
// GetFiles returns a slice of files that match the given file types.
|
||||
// GetMeta returns 'file descriptor' stored in meta. The 'file descriptor'
|
||||
// can be updated using SetMeta method.
|
||||
// Returns os.ErrNotExist if meta doesn't store any 'file descriptor', or
|
||||
// the 'file descriptor' points to a nonexistent file.
|
||||
GetMeta() (FileDesc, error)
|
||||
|
||||
// List returns file descriptors that match the given file types.
|
||||
// The file types may be OR'ed together.
|
||||
GetFiles(t FileType) ([]File, error)
|
||||
List(ft FileType) ([]FileDesc, error)
|
||||
|
||||
// GetManifest returns a manifest file. Returns os.ErrNotExist if manifest
|
||||
// file does not exist.
|
||||
GetManifest() (File, error)
|
||||
// Open opens file with the given 'file descriptor' read-only.
|
||||
// Returns os.ErrNotExist error if the file does not exist.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Open(fd FileDesc) (Reader, error)
|
||||
|
||||
// SetManifest sets the given file as manifest file. The given file should
|
||||
// be a manifest file type or error will be returned.
|
||||
SetManifest(f File) error
|
||||
// Create creates a file with the given 'file descriptor', truncates it if it already
|
||||
// exists, and opens it write-only.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Create(fd FileDesc) (Writer, error)
|
||||
|
||||
// Close closes the storage. It is valid to call Close multiple times.
|
||||
// Other methods should not be called after the storage has been closed.
|
||||
// Remove removes file with the given 'file descriptor'.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Remove(fd FileDesc) error
|
||||
|
||||
// Rename renames file from oldfd to newfd.
|
||||
// Returns ErrClosed if the underlying storage is closed.
|
||||
Rename(oldfd, newfd FileDesc) error
|
||||
|
||||
// Close closes the storage.
|
||||
// It is valid to call Close multiple times. Other methods should not be
|
||||
// called after the storage has been closed.
|
||||
Close() error
|
||||
}
|
||||
|
||||
// FileInfo wraps basic file info.
|
||||
type FileInfo struct {
|
||||
Type FileType
|
||||
Num uint64
|
||||
}
|
||||
|
||||
func (fi FileInfo) String() string {
|
||||
switch fi.Type {
|
||||
case TypeManifest:
|
||||
return fmt.Sprintf("MANIFEST-%06d", fi.Num)
|
||||
case TypeJournal:
|
||||
return fmt.Sprintf("%06d.log", fi.Num)
|
||||
case TypeTable:
|
||||
return fmt.Sprintf("%06d.ldb", fi.Num)
|
||||
case TypeTemp:
|
||||
return fmt.Sprintf("%06d.tmp", fi.Num)
|
||||
default:
|
||||
return fmt.Sprintf("%#x-%d", fi.Type, fi.Num)
|
||||
}
|
||||
}
|
||||
|
||||
// NewFileInfo creates a new FileInfo from the given File. It returns nil
|
||||
// if File is nil.
|
||||
func NewFileInfo(f File) *FileInfo {
|
||||
if f == nil {
|
||||
return nil
|
||||
}
|
||||
return &FileInfo{f.Type(), f.Num()}
|
||||
}
|
||||
|
|
|
@ -1,539 +0,0 @@
|
|||
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
|
||||
// All rights reserved.
|
||||
//
|
||||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file.
|
||||
|
||||
package leveldb
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
"io"
|
||||
"io/ioutil"
|
||||
"math/rand"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"sync"
|
||||
"testing"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/storage"
|
||||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
const typeShift = 4
|
||||
|
||||
var (
|
||||
tsErrInvalidFile = errors.New("leveldb.testStorage: invalid file for argument")
|
||||
tsErrFileOpen = errors.New("leveldb.testStorage: file still open")
|
||||
)
|
||||
|
||||
var (
|
||||
tsFSEnv = os.Getenv("GOLEVELDB_USEFS")
|
||||
tsTempdir = os.Getenv("GOLEVELDB_TEMPDIR")
|
||||
tsKeepFS = tsFSEnv == "2"
|
||||
tsFS = tsKeepFS || tsFSEnv == "" || tsFSEnv == "1"
|
||||
tsMU = &sync.Mutex{}
|
||||
tsNum = 0
|
||||
)
|
||||
|
||||
type tsOp uint
|
||||
|
||||
const (
|
||||
tsOpOpen tsOp = iota
|
||||
tsOpCreate
|
||||
tsOpRead
|
||||
tsOpReadAt
|
||||
tsOpWrite
|
||||
tsOpSync
|
||||
|
||||
tsOpNum
|
||||
)
|
||||
|
||||
type tsLock struct {
|
||||
ts *testStorage
|
||||
r util.Releaser
|
||||
}
|
||||
|
||||
func (l tsLock) Release() {
|
||||
l.r.Release()
|
||||
l.ts.t.Log("I: storage lock released")
|
||||
}
|
||||
|
||||
type tsReader struct {
|
||||
tf tsFile
|
||||
storage.Reader
|
||||
}
|
||||
|
||||
func (tr tsReader) Read(b []byte) (n int, err error) {
|
||||
ts := tr.tf.ts
|
||||
ts.countRead(tr.tf.Type())
|
||||
if tr.tf.shouldErrLocked(tsOpRead) {
|
||||
return 0, errors.New("leveldb.testStorage: emulated read error")
|
||||
}
|
||||
n, err = tr.Reader.Read(b)
|
||||
if err != nil && err != io.EOF {
|
||||
ts.t.Errorf("E: read error, num=%d type=%v n=%d: %v", tr.tf.Num(), tr.tf.Type(), n, err)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (tr tsReader) ReadAt(b []byte, off int64) (n int, err error) {
|
||||
ts := tr.tf.ts
|
||||
ts.countRead(tr.tf.Type())
|
||||
if tr.tf.shouldErrLocked(tsOpReadAt) {
|
||||
return 0, errors.New("leveldb.testStorage: emulated readAt error")
|
||||
}
|
||||
n, err = tr.Reader.ReadAt(b, off)
|
||||
if err != nil && err != io.EOF {
|
||||
ts.t.Errorf("E: readAt error, num=%d type=%v off=%d n=%d: %v", tr.tf.Num(), tr.tf.Type(), off, n, err)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (tr tsReader) Close() (err error) {
|
||||
err = tr.Reader.Close()
|
||||
tr.tf.close("reader", err)
|
||||
return
|
||||
}
|
||||
|
||||
type tsWriter struct {
|
||||
tf tsFile
|
||||
storage.Writer
|
||||
}
|
||||
|
||||
func (tw tsWriter) Write(b []byte) (n int, err error) {
|
||||
if tw.tf.shouldErrLocked(tsOpWrite) {
|
||||
return 0, errors.New("leveldb.testStorage: emulated write error")
|
||||
}
|
||||
n, err = tw.Writer.Write(b)
|
||||
if err != nil {
|
||||
tw.tf.ts.t.Errorf("E: write error, num=%d type=%v n=%d: %v", tw.tf.Num(), tw.tf.Type(), n, err)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (tw tsWriter) Sync() (err error) {
|
||||
ts := tw.tf.ts
|
||||
ts.mu.Lock()
|
||||
for ts.emuDelaySync&tw.tf.Type() != 0 {
|
||||
ts.cond.Wait()
|
||||
}
|
||||
ts.mu.Unlock()
|
||||
if tw.tf.shouldErrLocked(tsOpSync) {
|
||||
return errors.New("leveldb.testStorage: emulated sync error")
|
||||
}
|
||||
err = tw.Writer.Sync()
|
||||
if err != nil {
|
||||
tw.tf.ts.t.Errorf("E: sync error, num=%d type=%v: %v", tw.tf.Num(), tw.tf.Type(), err)
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (tw tsWriter) Close() (err error) {
|
||||
err = tw.Writer.Close()
|
||||
tw.tf.close("writer", err)
|
||||
return
|
||||
}
|
||||
|
||||
type tsFile struct {
|
||||
ts *testStorage
|
||||
storage.File
|
||||
}
|
||||
|
||||
func (tf tsFile) x() uint64 {
|
||||
return tf.Num()<<typeShift | uint64(tf.Type())
|
||||
}
|
||||
|
||||
func (tf tsFile) shouldErr(op tsOp) bool {
|
||||
return tf.ts.shouldErr(tf, op)
|
||||
}
|
||||
|
||||
func (tf tsFile) shouldErrLocked(op tsOp) bool {
|
||||
tf.ts.mu.Lock()
|
||||
defer tf.ts.mu.Unlock()
|
||||
return tf.shouldErr(op)
|
||||
}
|
||||
|
||||
func (tf tsFile) checkOpen(m string) error {
|
||||
ts := tf.ts
|
||||
if writer, ok := ts.opens[tf.x()]; ok {
|
||||
if writer {
|
||||
ts.t.Errorf("E: cannot %s file, num=%d type=%v: a writer still open", m, tf.Num(), tf.Type())
|
||||
} else {
|
||||
ts.t.Errorf("E: cannot %s file, num=%d type=%v: a reader still open", m, tf.Num(), tf.Type())
|
||||
}
|
||||
return tsErrFileOpen
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (tf tsFile) close(m string, err error) {
|
||||
ts := tf.ts
|
||||
ts.mu.Lock()
|
||||
defer ts.mu.Unlock()
|
||||
if _, ok := ts.opens[tf.x()]; !ok {
|
||||
ts.t.Errorf("E: %s: redudant file closing, num=%d type=%v", m, tf.Num(), tf.Type())
|
||||
} else if err == nil {
|
||||
ts.t.Logf("I: %s: file closed, num=%d type=%v", m, tf.Num(), tf.Type())
|
||||
}
|
||||
delete(ts.opens, tf.x())
|
||||
if err != nil {
|
||||
ts.t.Errorf("E: %s: cannot close file, num=%d type=%v: %v", m, tf.Num(), tf.Type(), err)
|
||||
}
|
||||
}
|
||||
|
||||
func (tf tsFile) Open() (r storage.Reader, err error) {
|
||||
ts := tf.ts
|
||||
ts.mu.Lock()
|
||||
defer ts.mu.Unlock()
|
||||
err = tf.checkOpen("open")
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
if tf.shouldErr(tsOpOpen) {
|
||||
err = errors.New("leveldb.testStorage: emulated open error")
|
||||
return
|
||||
}
|
||||
r, err = tf.File.Open()
|
||||
if err != nil {
|
||||
if ts.ignoreOpenErr&tf.Type() != 0 {
|
||||
ts.t.Logf("I: cannot open file, num=%d type=%v: %v (ignored)", tf.Num(), tf.Type(), err)
|
||||
} else {
|
||||
ts.t.Errorf("E: cannot open file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
|
||||
}
|
||||
} else {
|
||||
ts.t.Logf("I: file opened, num=%d type=%v", tf.Num(), tf.Type())
|
||||
ts.opens[tf.x()] = false
|
||||
r = tsReader{tf, r}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (tf tsFile) Create() (w storage.Writer, err error) {
|
||||
ts := tf.ts
|
||||
ts.mu.Lock()
|
||||
defer ts.mu.Unlock()
|
||||
err = tf.checkOpen("create")
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
if tf.shouldErr(tsOpCreate) {
|
||||
err = errors.New("leveldb.testStorage: emulated create error")
|
||||
return
|
||||
}
|
||||
w, err = tf.File.Create()
|
||||
if err != nil {
|
||||
ts.t.Errorf("E: cannot create file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
|
||||
} else {
|
||||
ts.t.Logf("I: file created, num=%d type=%v", tf.Num(), tf.Type())
|
||||
ts.opens[tf.x()] = true
|
||||
w = tsWriter{tf, w}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (tf tsFile) Replace(newfile storage.File) (err error) {
|
||||
ts := tf.ts
|
||||
ts.mu.Lock()
|
||||
defer ts.mu.Unlock()
|
||||
err = tf.checkOpen("replace")
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
err = tf.File.Replace(newfile.(tsFile).File)
|
||||
if err != nil {
|
||||
ts.t.Errorf("E: cannot replace file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
|
||||
} else {
|
||||
ts.t.Logf("I: file replace, num=%d type=%v", tf.Num(), tf.Type())
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (tf tsFile) Remove() (err error) {
|
||||
ts := tf.ts
|
||||
ts.mu.Lock()
|
||||
defer ts.mu.Unlock()
|
||||
err = tf.checkOpen("remove")
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
err = tf.File.Remove()
|
||||
if err != nil {
|
||||
ts.t.Errorf("E: cannot remove file, num=%d type=%v: %v", tf.Num(), tf.Type(), err)
|
||||
} else {
|
||||
ts.t.Logf("I: file removed, num=%d type=%v", tf.Num(), tf.Type())
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
type testStorage struct {
|
||||
t *testing.T
|
||||
storage.Storage
|
||||
closeFn func() error
|
||||
|
||||
mu sync.Mutex
|
||||
cond sync.Cond
|
||||
// Open files, true=writer, false=reader
|
||||
opens map[uint64]bool
|
||||
emuDelaySync storage.FileType
|
||||
ignoreOpenErr storage.FileType
|
||||
readCnt uint64
|
||||
readCntEn storage.FileType
|
||||
|
||||
emuErr [tsOpNum]storage.FileType
|
||||
emuErrOnce [tsOpNum]storage.FileType
|
||||
emuRandErr [tsOpNum]storage.FileType
|
||||
emuRandErrProb int
|
||||
emuErrOnceMap map[uint64]uint
|
||||
emuRandRand *rand.Rand
|
||||
}
|
||||
|
||||
func (ts *testStorage) shouldErr(tf tsFile, op tsOp) bool {
|
||||
if ts.emuErr[op]&tf.Type() != 0 {
|
||||
return true
|
||||
} else if ts.emuRandErr[op]&tf.Type() != 0 || ts.emuErrOnce[op]&tf.Type() != 0 {
|
||||
sop := uint(1) << op
|
||||
eop := ts.emuErrOnceMap[tf.x()]
|
||||
if eop&sop == 0 && (ts.emuRandRand.Int()%ts.emuRandErrProb == 0 || ts.emuErrOnce[op]&tf.Type() != 0) {
|
||||
ts.emuErrOnceMap[tf.x()] = eop | sop
|
||||
ts.t.Logf("I: emulated error: file=%d type=%v op=%v", tf.Num(), tf.Type(), op)
|
||||
return true
|
||||
}
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
||||
func (ts *testStorage) SetEmuErr(t storage.FileType, ops ...tsOp) {
|
||||
ts.mu.Lock()
|
||||
for _, op := range ops {
|
||||
ts.emuErr[op] = t
|
||||
}
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) SetEmuErrOnce(t storage.FileType, ops ...tsOp) {
|
||||
ts.mu.Lock()
|
||||
for _, op := range ops {
|
||||
ts.emuErrOnce[op] = t
|
||||
}
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) SetEmuRandErr(t storage.FileType, ops ...tsOp) {
|
||||
ts.mu.Lock()
|
||||
for _, op := range ops {
|
||||
ts.emuRandErr[op] = t
|
||||
}
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) SetEmuRandErrProb(prob int) {
|
||||
ts.mu.Lock()
|
||||
ts.emuRandErrProb = prob
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) DelaySync(t storage.FileType) {
|
||||
ts.mu.Lock()
|
||||
ts.emuDelaySync |= t
|
||||
ts.cond.Broadcast()
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) ReleaseSync(t storage.FileType) {
|
||||
ts.mu.Lock()
|
||||
ts.emuDelaySync &= ^t
|
||||
ts.cond.Broadcast()
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) ReadCounter() uint64 {
|
||||
ts.mu.Lock()
|
||||
defer ts.mu.Unlock()
|
||||
return ts.readCnt
|
||||
}
|
||||
|
||||
func (ts *testStorage) ResetReadCounter() {
|
||||
ts.mu.Lock()
|
||||
ts.readCnt = 0
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) SetReadCounter(t storage.FileType) {
|
||||
ts.mu.Lock()
|
||||
ts.readCntEn = t
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) countRead(t storage.FileType) {
|
||||
ts.mu.Lock()
|
||||
if ts.readCntEn&t != 0 {
|
||||
ts.readCnt++
|
||||
}
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func (ts *testStorage) SetIgnoreOpenErr(t storage.FileType) {
|
||||
ts.ignoreOpenErr = t
|
||||
}
|
||||
|
||||
func (ts *testStorage) Lock() (r util.Releaser, err error) {
|
||||
r, err = ts.Storage.Lock()
|
||||
if err != nil {
|
||||
ts.t.Logf("W: storage locking failed: %v", err)
|
||||
} else {
|
||||
ts.t.Log("I: storage locked")
|
||||
r = tsLock{ts, r}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
func (ts *testStorage) Log(str string) {
|
||||
ts.t.Log("L: " + str)
|
||||
ts.Storage.Log(str)
|
||||
}
|
||||
|
||||
func (ts *testStorage) GetFile(num uint64, t storage.FileType) storage.File {
|
||||
return tsFile{ts, ts.Storage.GetFile(num, t)}
|
||||
}
|
||||
|
||||
func (ts *testStorage) GetFiles(t storage.FileType) (ff []storage.File, err error) {
|
||||
ff0, err := ts.Storage.GetFiles(t)
|
||||
if err != nil {
|
||||
ts.t.Errorf("E: get files failed: %v", err)
|
||||
return
|
||||
}
|
||||
ff = make([]storage.File, len(ff0))
|
||||
for i, f := range ff0 {
|
||||
ff[i] = tsFile{ts, f}
|
||||
}
|
||||
ts.t.Logf("I: get files, type=0x%x count=%d", int(t), len(ff))
|
||||
return
|
||||
}
|
||||
|
||||
func (ts *testStorage) GetManifest() (f storage.File, err error) {
|
||||
f0, err := ts.Storage.GetManifest()
|
||||
if err != nil {
|
||||
if !os.IsNotExist(err) {
|
||||
ts.t.Errorf("E: get manifest failed: %v", err)
|
||||
}
|
||||
return
|
||||
}
|
||||
f = tsFile{ts, f0}
|
||||
ts.t.Logf("I: get manifest, num=%d", f.Num())
|
||||
return
|
||||
}
|
||||
|
||||
func (ts *testStorage) SetManifest(f storage.File) error {
|
||||
tf, ok := f.(tsFile)
|
||||
if !ok {
|
||||
ts.t.Error("E: set manifest failed: type assertion failed")
|
||||
return tsErrInvalidFile
|
||||
} else if tf.Type() != storage.TypeManifest {
|
||||
ts.t.Errorf("E: set manifest failed: invalid file type: %s", tf.Type())
|
||||
return tsErrInvalidFile
|
||||
}
|
||||
err := ts.Storage.SetManifest(tf.File)
|
||||
if err != nil {
|
||||
ts.t.Errorf("E: set manifest failed: %v", err)
|
||||
} else {
|
||||
ts.t.Logf("I: set manifest, num=%d", tf.Num())
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
func (ts *testStorage) Close() error {
|
||||
ts.CloseCheck()
|
||||
err := ts.Storage.Close()
|
||||
if err != nil {
|
||||
ts.t.Errorf("E: closing storage failed: %v", err)
|
||||
} else {
|
||||
ts.t.Log("I: storage closed")
|
||||
}
|
||||
if ts.closeFn != nil {
|
||||
if err := ts.closeFn(); err != nil {
|
||||
ts.t.Errorf("E: close function: %v", err)
|
||||
}
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
||||
func (ts *testStorage) CloseCheck() {
|
||||
ts.mu.Lock()
|
||||
if len(ts.opens) == 0 {
|
||||
ts.t.Log("I: all files are closed")
|
||||
} else {
|
||||
ts.t.Errorf("E: %d files still open", len(ts.opens))
|
||||
for x, writer := range ts.opens {
|
||||
num, tt := x>>typeShift, storage.FileType(x)&storage.TypeAll
|
||||
ts.t.Errorf("E: * num=%d type=%v writer=%v", num, tt, writer)
|
||||
}
|
||||
}
|
||||
ts.mu.Unlock()
|
||||
}
|
||||
|
||||
func newTestStorage(t *testing.T) *testStorage {
|
||||
var stor storage.Storage
|
||||
var closeFn func() error
|
||||
if tsFS {
|
||||
for {
|
||||
tsMU.Lock()
|
||||
num := tsNum
|
||||
tsNum++
|
||||
tsMU.Unlock()
|
||||
tempdir := tsTempdir
|
||||
if tempdir == "" {
|
||||
tempdir = os.TempDir()
|
||||
}
|
||||
path := filepath.Join(tempdir, fmt.Sprintf("goleveldb-test%d0%d0%d", os.Getuid(), os.Getpid(), num))
|
||||
if _, err := os.Stat(path); err != nil {
|
||||
stor, err = storage.OpenFile(path)
|
||||
if err != nil {
|
||||
t.Fatalf("F: cannot create storage: %v", err)
|
||||
}
|
||||
t.Logf("I: storage created: %s", path)
|
||||
closeFn = func() error {
|
||||
for _, name := range []string{"LOG.old", "LOG"} {
|
||||
f, err := os.Open(filepath.Join(path, name))
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
if log, err := ioutil.ReadAll(f); err != nil {
|
||||
t.Logf("---------------------- %s ----------------------", name)
|
||||
t.Logf("cannot read log: %v", err)
|
||||
t.Logf("---------------------- %s ----------------------", name)
|
||||
} else if len(log) > 0 {
|
||||
t.Logf("---------------------- %s ----------------------\n%s", name, string(log))
|
||||
t.Logf("---------------------- %s ----------------------", name)
|
||||
}
|
||||
f.Close()
|
||||
}
|
||||
if t.Failed() {
|
||||
t.Logf("testing failed, test DB preserved at %s", path)
|
||||
return nil
|
||||
}
|
||||
if tsKeepFS {
|
||||
return nil
|
||||
}
|
||||
return os.RemoveAll(path)
|
||||
}
|
||||
|
||||
break
|
||||
}
|
||||
}
|
||||
} else {
|
||||
stor = storage.NewMemStorage()
|
||||
}
|
||||
ts := &testStorage{
|
||||
t: t,
|
||||
Storage: stor,
|
||||
closeFn: closeFn,
|
||||
opens: make(map[uint64]bool),
|
||||
emuErrOnceMap: make(map[uint64]uint),
|
||||
emuRandErrProb: 0x999,
|
||||
emuRandRand: rand.New(rand.NewSource(0xfacedead)),
|
||||
}
|
||||
ts.cond.L = &ts.mu
|
||||
return ts
|
||||
}
|
|
@ -21,10 +21,10 @@ import (
|
|||
|
||||
// tFile holds basic information about a table.
|
||||
type tFile struct {
|
||||
file storage.File
|
||||
fd storage.FileDesc
|
||||
seekLeft int32
|
||||
size uint64
|
||||
imin, imax iKey
|
||||
size int64
|
||||
imin, imax internalKey
|
||||
}
|
||||
|
||||
// Returns true if given key is after largest key of this table.
|
||||
|
@ -48,9 +48,9 @@ func (t *tFile) consumeSeek() int32 {
|
|||
}
|
||||
|
||||
// Creates new tFile.
|
||||
func newTableFile(file storage.File, size uint64, imin, imax iKey) *tFile {
|
||||
func newTableFile(fd storage.FileDesc, size int64, imin, imax internalKey) *tFile {
|
||||
f := &tFile{
|
||||
file: file,
|
||||
fd: fd,
|
||||
size: size,
|
||||
imin: imin,
|
||||
imax: imax,
|
||||
|
@ -77,6 +77,10 @@ func newTableFile(file storage.File, size uint64, imin, imax iKey) *tFile {
|
|||
return f
|
||||
}
|
||||
|
||||
func tableFileFromRecord(r atRecord) *tFile {
|
||||
return newTableFile(storage.FileDesc{storage.TypeTable, r.num}, r.size, r.imin, r.imax)
|
||||
}
|
||||
|
||||
// tFiles hold multiple tFile.
|
||||
type tFiles []*tFile
|
||||
|
||||
|
@ -89,7 +93,7 @@ func (tf tFiles) nums() string {
|
|||
if i != 0 {
|
||||
x += ", "
|
||||
}
|
||||
x += fmt.Sprint(f.file.Num())
|
||||
x += fmt.Sprint(f.fd.Num)
|
||||
}
|
||||
x += " ]"
|
||||
return x
|
||||
|
@ -101,7 +105,7 @@ func (tf tFiles) lessByKey(icmp *iComparer, i, j int) bool {
|
|||
a, b := tf[i], tf[j]
|
||||
n := icmp.Compare(a.imin, b.imin)
|
||||
if n == 0 {
|
||||
return a.file.Num() < b.file.Num()
|
||||
return a.fd.Num < b.fd.Num
|
||||
}
|
||||
return n < 0
|
||||
}
|
||||
|
@ -109,7 +113,7 @@ func (tf tFiles) lessByKey(icmp *iComparer, i, j int) bool {
|
|||
// Returns true if i file number is greater than j.
|
||||
// This used for sort by file number in descending order.
|
||||
func (tf tFiles) lessByNum(i, j int) bool {
|
||||
return tf[i].file.Num() > tf[j].file.Num()
|
||||
return tf[i].fd.Num > tf[j].fd.Num
|
||||
}
|
||||
|
||||
// Sorts tables by key in ascending order.
|
||||
|
@ -123,7 +127,7 @@ func (tf tFiles) sortByNum() {
|
|||
}
|
||||
|
||||
// Returns sum of all tables size.
|
||||
func (tf tFiles) size() (sum uint64) {
|
||||
func (tf tFiles) size() (sum int64) {
|
||||
for _, t := range tf {
|
||||
sum += t.size
|
||||
}
|
||||
|
@ -132,7 +136,7 @@ func (tf tFiles) size() (sum uint64) {
|
|||
|
||||
// Searches smallest index of tables whose its smallest
|
||||
// key is after or equal with given key.
|
||||
func (tf tFiles) searchMin(icmp *iComparer, ikey iKey) int {
|
||||
func (tf tFiles) searchMin(icmp *iComparer, ikey internalKey) int {
|
||||
return sort.Search(len(tf), func(i int) bool {
|
||||
return icmp.Compare(tf[i].imin, ikey) >= 0
|
||||
})
|
||||
|
@ -140,7 +144,7 @@ func (tf tFiles) searchMin(icmp *iComparer, ikey iKey) int {
|
|||
|
||||
// Searches smallest index of tables whose its largest
|
||||
// key is after or equal with given key.
|
||||
func (tf tFiles) searchMax(icmp *iComparer, ikey iKey) int {
|
||||
func (tf tFiles) searchMax(icmp *iComparer, ikey internalKey) int {
|
||||
return sort.Search(len(tf), func(i int) bool {
|
||||
return icmp.Compare(tf[i].imax, ikey) >= 0
|
||||
})
|
||||
|
@ -162,7 +166,7 @@ func (tf tFiles) overlaps(icmp *iComparer, umin, umax []byte, unsorted bool) boo
|
|||
i := 0
|
||||
if len(umin) > 0 {
|
||||
// Find the earliest possible internal key for min.
|
||||
i = tf.searchMax(icmp, newIkey(umin, kMaxSeq, ktSeek))
|
||||
i = tf.searchMax(icmp, makeInternalKey(nil, umin, keyMaxSeq, keyTypeSeek))
|
||||
}
|
||||
if i >= len(tf) {
|
||||
// Beginning of range is after all files, so no overlap.
|
||||
|
@ -205,7 +209,7 @@ func (tf tFiles) getOverlaps(dst tFiles, icmp *iComparer, umin, umax []byte, ove
|
|||
}
|
||||
|
||||
// Returns tables key range.
|
||||
func (tf tFiles) getRange(icmp *iComparer) (imin, imax iKey) {
|
||||
func (tf tFiles) getRange(icmp *iComparer) (imin, imax internalKey) {
|
||||
for i, t := range tf {
|
||||
if i == 0 {
|
||||
imin, imax = t.imin, t.imax
|
||||
|
@ -227,10 +231,10 @@ func (tf tFiles) newIndexIterator(tops *tOps, icmp *iComparer, slice *util.Range
|
|||
if slice != nil {
|
||||
var start, limit int
|
||||
if slice.Start != nil {
|
||||
start = tf.searchMax(icmp, iKey(slice.Start))
|
||||
start = tf.searchMax(icmp, internalKey(slice.Start))
|
||||
}
|
||||
if slice.Limit != nil {
|
||||
limit = tf.searchMin(icmp, iKey(slice.Limit))
|
||||
limit = tf.searchMin(icmp, internalKey(slice.Limit))
|
||||
} else {
|
||||
limit = tf.Len()
|
||||
}
|
||||
|
@ -255,7 +259,7 @@ type tFilesArrayIndexer struct {
|
|||
}
|
||||
|
||||
func (a *tFilesArrayIndexer) Search(key []byte) int {
|
||||
return a.searchMax(a.icmp, iKey(key))
|
||||
return a.searchMax(a.icmp, internalKey(key))
|
||||
}
|
||||
|
||||
func (a *tFilesArrayIndexer) Get(i int) iterator.Iterator {
|
||||
|
@ -287,6 +291,7 @@ func (x *tFilesSortByNum) Less(i, j int) bool {
|
|||
// Table operations.
|
||||
type tOps struct {
|
||||
s *session
|
||||
noSync bool
|
||||
cache *cache.Cache
|
||||
bcache *cache.Cache
|
||||
bpool *util.BufferPool
|
||||
|
@ -294,16 +299,16 @@ type tOps struct {

// Creates an empty table and returns table writer.
func (t *tOps) create() (*tWriter, error) {
file := t.s.getTableFile(t.s.allocFileNum())
fw, err := file.Create()
fd := storage.FileDesc{storage.TypeTable, t.s.allocFileNum()}
fw, err := t.s.stor.Create(fd)
if err != nil {
return nil, err
}
return &tWriter{
t: t,
file: file,
w: fw,
tw: table.NewWriter(fw, t.s.o.Options),
t: t,
fd: fd,
w: fw,
tw: table.NewWriter(fw, t.s.o.Options),
}, nil
}
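A minimal sketch (not part of the diff) of the FileDesc-based storage calls used above. Here `stor` stands in for the session's storage.Storage and error handling is elided; the Create/Open/Remove shapes are assumed from how they are called in create, open and remove.

fd := storage.FileDesc{storage.TypeTable, 42}
w, _ := stor.Create(fd) // storage.Writer for the new table file
w.Close()
r, _ := stor.Open(fd)   // storage.Reader over the same file
r.Close()
_ = stor.Remove(fd)     // drop it again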
@ -339,21 +344,20 @@ func (t *tOps) createFrom(src iterator.Iterator) (f *tFile, n int, err error) {
|
|||
// Opens table. It returns a cache handle, which should
|
||||
// be released after use.
|
||||
func (t *tOps) open(f *tFile) (ch *cache.Handle, err error) {
|
||||
num := f.file.Num()
|
||||
ch = t.cache.Get(0, num, func() (size int, value cache.Value) {
|
||||
ch = t.cache.Get(0, uint64(f.fd.Num), func() (size int, value cache.Value) {
|
||||
var r storage.Reader
|
||||
r, err = f.file.Open()
|
||||
r, err = t.s.stor.Open(f.fd)
|
||||
if err != nil {
|
||||
return 0, nil
|
||||
}
|
||||
|
||||
var bcache *cache.CacheGetter
|
||||
var bcache *cache.NamespaceGetter
|
||||
if t.bcache != nil {
|
||||
bcache = &cache.CacheGetter{Cache: t.bcache, NS: num}
|
||||
bcache = &cache.NamespaceGetter{Cache: t.bcache, NS: uint64(f.fd.Num)}
|
||||
}
|
||||
|
||||
var tr *table.Reader
|
||||
tr, err = table.NewReader(r, int64(f.size), storage.NewFileInfo(f.file), bcache, t.bpool, t.s.o.Options)
|
||||
tr, err = table.NewReader(r, f.size, f.fd, bcache, t.bpool, t.s.o.Options)
|
||||
if err != nil {
|
||||
r.Close()
|
||||
return 0, nil
|
||||
|
@ -389,14 +393,13 @@ func (t *tOps) findKey(f *tFile, key []byte, ro *opt.ReadOptions) (rkey []byte,
|
|||
}
|
||||
|
||||
// Returns approximate offset of the given key.
|
||||
func (t *tOps) offsetOf(f *tFile, key []byte) (offset uint64, err error) {
|
||||
func (t *tOps) offsetOf(f *tFile, key []byte) (offset int64, err error) {
|
||||
ch, err := t.open(f)
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
defer ch.Release()
|
||||
offset_, err := ch.Value().(*table.Reader).OffsetOf(key)
|
||||
return uint64(offset_), err
|
||||
return ch.Value().(*table.Reader).OffsetOf(key)
|
||||
}
|
||||
|
||||
// Creates an iterator from the given table.
|
||||
|
@ -413,15 +416,14 @@ func (t *tOps) newIterator(f *tFile, slice *util.Range, ro *opt.ReadOptions) ite
|
|||
// Removes table from persistent storage. It waits until
|
||||
// no one use the the table.
|
||||
func (t *tOps) remove(f *tFile) {
|
||||
num := f.file.Num()
|
||||
t.cache.Delete(0, num, func() {
|
||||
if err := f.file.Remove(); err != nil {
|
||||
t.s.logf("table@remove removing @%d %q", num, err)
|
||||
t.cache.Delete(0, uint64(f.fd.Num), func() {
|
||||
if err := t.s.stor.Remove(f.fd); err != nil {
|
||||
t.s.logf("table@remove removing @%d %q", f.fd.Num, err)
|
||||
} else {
|
||||
t.s.logf("table@remove removed @%d", num)
|
||||
t.s.logf("table@remove removed @%d", f.fd.Num)
|
||||
}
|
||||
if t.bcache != nil {
|
||||
t.bcache.EvictNS(num)
|
||||
t.bcache.EvictNS(uint64(f.fd.Num))
|
||||
}
|
||||
})
|
||||
}
|
||||
|
@ -432,7 +434,7 @@ func (t *tOps) close() {
|
|||
t.bpool.Close()
|
||||
t.cache.Close()
|
||||
if t.bcache != nil {
|
||||
t.bcache.Close()
|
||||
t.bcache.CloseWeak()
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -441,22 +443,27 @@ func newTableOps(s *session) *tOps {
|
|||
var (
|
||||
cacher cache.Cacher
|
||||
bcache *cache.Cache
|
||||
bpool *util.BufferPool
|
||||
)
|
||||
if s.o.GetOpenFilesCacheCapacity() > 0 {
|
||||
cacher = cache.NewLRU(s.o.GetOpenFilesCacheCapacity())
|
||||
}
|
||||
if !s.o.DisableBlockCache {
|
||||
if !s.o.GetDisableBlockCache() {
|
||||
var bcacher cache.Cacher
|
||||
if s.o.GetBlockCacheCapacity() > 0 {
|
||||
bcacher = cache.NewLRU(s.o.GetBlockCacheCapacity())
|
||||
}
|
||||
bcache = cache.NewCache(bcacher)
|
||||
}
|
||||
if !s.o.GetDisableBufferPool() {
|
||||
bpool = util.NewBufferPool(s.o.GetBlockSize() + 5)
|
||||
}
|
||||
return &tOps{
|
||||
s: s,
|
||||
noSync: s.o.GetNoSync(),
|
||||
cache: cache.NewCache(cacher),
|
||||
bcache: bcache,
|
||||
bpool: util.NewBufferPool(s.o.GetBlockSize() + 5),
|
||||
bpool: bpool,
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -465,9 +472,9 @@ func newTableOps(s *session) *tOps {
|
|||
type tWriter struct {
|
||||
t *tOps
|
||||
|
||||
file storage.File
|
||||
w storage.Writer
|
||||
tw *table.Writer
|
||||
fd storage.FileDesc
|
||||
w storage.Writer
|
||||
tw *table.Writer
|
||||
|
||||
first, last []byte
|
||||
}
|
||||
|
@ -501,20 +508,21 @@ func (w *tWriter) finish() (f *tFile, err error) {
|
|||
if err != nil {
|
||||
return
|
||||
}
|
||||
err = w.w.Sync()
|
||||
if err != nil {
|
||||
return
|
||||
if !w.t.noSync {
|
||||
err = w.w.Sync()
|
||||
if err != nil {
|
||||
return
|
||||
}
|
||||
}
|
||||
f = newTableFile(w.file, uint64(w.tw.BytesLen()), iKey(w.first), iKey(w.last))
|
||||
f = newTableFile(w.fd, int64(w.tw.BytesLen()), internalKey(w.first), internalKey(w.last))
|
||||
return
|
||||
}
|
||||
|
||||
// Drops the table.
|
||||
func (w *tWriter) drop() {
|
||||
w.close()
|
||||
w.file.Remove()
|
||||
w.t.s.reuseFileNum(w.file.Num())
|
||||
w.file = nil
|
||||
w.t.s.stor.Remove(w.fd)
|
||||
w.t.s.reuseFileNum(w.fd.Num)
|
||||
w.tw = nil
|
||||
w.first = nil
|
||||
w.last = nil
|
||||
|
|
|
@ -14,7 +14,7 @@ import (
|
|||
"strings"
|
||||
"sync"
|
||||
|
||||
"github.com/syndtr/gosnappy/snappy"
|
||||
"github.com/golang/snappy"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/cache"
|
||||
"github.com/syndtr/goleveldb/leveldb/comparer"
|
||||
|
@ -26,12 +26,15 @@ import (
|
|||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
||||
// Reader errors.
|
||||
var (
|
||||
ErrNotFound = errors.ErrNotFound
|
||||
ErrReaderReleased = errors.New("leveldb/table: reader released")
|
||||
ErrIterReleased = errors.New("leveldb/table: iterator released")
|
||||
)
|
||||
|
||||
// ErrCorrupted describes error due to corruption. This error will be wrapped
|
||||
// with errors.ErrCorrupted.
|
||||
type ErrCorrupted struct {
|
||||
Pos int64
|
||||
Size int64
|
||||
|
@ -61,7 +64,7 @@ type block struct {
|
|||
func (b *block) seek(cmp comparer.Comparer, rstart, rlimit int, key []byte) (index, offset int, err error) {
|
||||
index = sort.Search(b.restartsLen-rstart-(b.restartsLen-rlimit), func(i int) bool {
|
||||
offset := int(binary.LittleEndian.Uint32(b.data[b.restartsOffset+4*(rstart+i):]))
|
||||
offset += 1 // shared always zero, since this is a restart point
|
||||
offset++ // shared always zero, since this is a restart point
|
||||
v1, n1 := binary.Uvarint(b.data[offset:]) // key length
|
||||
_, n2 := binary.Uvarint(b.data[offset+n1:]) // value length
|
||||
m := offset + n1 + n2
|
||||
|
@ -356,7 +359,7 @@ func (i *blockIter) Prev() bool {
|
|||
i.value = nil
|
||||
offset := i.block.restartOffset(ri)
|
||||
if offset == i.offset {
|
||||
ri -= 1
|
||||
ri--
|
||||
if ri < 0 {
|
||||
i.dir = dirSOI
|
||||
return false
|
||||
|
@ -507,9 +510,9 @@ func (i *indexIter) Get() iterator.Iterator {
|
|||
// Reader is a table reader.
|
||||
type Reader struct {
|
||||
mu sync.RWMutex
|
||||
fi *storage.FileInfo
|
||||
fd storage.FileDesc
|
||||
reader io.ReaderAt
|
||||
cache *cache.CacheGetter
|
||||
cache *cache.NamespaceGetter
|
||||
err error
|
||||
bpool *util.BufferPool
|
||||
// Options
|
||||
|
@ -539,7 +542,7 @@ func (r *Reader) blockKind(bh blockHandle) string {
|
|||
}
|
||||
|
||||
func (r *Reader) newErrCorrupted(pos, size int64, kind, reason string) error {
|
||||
return &errors.ErrCorrupted{File: r.fi, Err: &ErrCorrupted{Pos: pos, Size: size, Kind: kind, Reason: reason}}
|
||||
return &errors.ErrCorrupted{Fd: r.fd, Err: &ErrCorrupted{Pos: pos, Size: size, Kind: kind, Reason: reason}}
|
||||
}
|
||||
|
||||
func (r *Reader) newErrCorruptedBH(bh blockHandle, reason string) error {
|
||||
|
@ -551,7 +554,7 @@ func (r *Reader) fixErrCorruptedBH(bh blockHandle, err error) error {
|
|||
cerr.Pos = int64(bh.offset)
|
||||
cerr.Size = int64(bh.length)
|
||||
cerr.Kind = r.blockKind(bh)
|
||||
return &errors.ErrCorrupted{File: r.fi, Err: cerr}
|
||||
return &errors.ErrCorrupted{Fd: r.fd, Err: cerr}
|
||||
}
|
||||
return err
|
||||
}
|
||||
|
@ -578,6 +581,7 @@ func (r *Reader) readRawBlock(bh blockHandle, verifyChecksum bool) ([]byte, erro
|
|||
case blockTypeSnappyCompression:
|
||||
decLen, err := snappy.DecodedLen(data[:bh.length])
|
||||
if err != nil {
|
||||
r.bpool.Put(data)
|
||||
return nil, r.newErrCorruptedBH(bh, err.Error())
|
||||
}
|
||||
decData := r.bpool.Get(decLen)
|
||||
|
@ -783,8 +787,8 @@ func (r *Reader) getDataIterErr(dataBH blockHandle, slice *util.Range, verifyChe
|
|||
// table. And a nil Range.Limit is treated as a key after all keys in
|
||||
// the table.
|
||||
//
|
||||
// The returned iterator is not goroutine-safe and should be released
|
||||
// when not used.
|
||||
// The returned iterator is not safe for concurrent use and should be released
|
||||
// after use.
|
||||
//
|
||||
// Also read Iterator documentation of the leveldb/iterator package.
|
||||
func (r *Reader) NewIterator(slice *util.Range, ro *opt.ReadOptions) iterator.Iterator {
|
||||
|
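As an illustration of the Range semantics described in the comment above (a sketch, not part of the diff; `r` is assumed to be an already opened *Reader): a nil Start is treated as before all keys and a nil Limit as after all keys.

all := r.NewIterator(nil, nil)                              // every key in the table
tail := r.NewIterator(&util.Range{Start: []byte("m")}, nil) // keys >= "m"
for tail.Next() {
	_ = tail.Key() // not safe for concurrent use; release after use
}
tail.Release()
all.Release()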
@ -826,18 +830,21 @@ func (r *Reader) find(key []byte, filtered bool, ro *opt.ReadOptions, noValue bo
|
|||
|
||||
index := r.newBlockIter(indexBlock, nil, nil, true)
|
||||
defer index.Release()
|
||||
|
||||
if !index.Seek(key) {
|
||||
err = index.Error()
|
||||
if err == nil {
|
||||
if err = index.Error(); err == nil {
|
||||
err = ErrNotFound
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
dataBH, n := decodeBlockHandle(index.Value())
|
||||
if n == 0 {
|
||||
r.err = r.newErrCorruptedBH(r.indexBH, "bad data block handle")
|
||||
return
|
||||
return nil, nil, r.err
|
||||
}
|
||||
|
||||
// The filter should only used for exact match.
|
||||
if filtered && r.filter != nil {
|
||||
filterBlock, frel, ferr := r.getFilterBlock(true)
|
||||
if ferr == nil {
|
||||
|
@ -847,30 +854,53 @@ func (r *Reader) find(key []byte, filtered bool, ro *opt.ReadOptions, noValue bo
|
|||
}
|
||||
frel.Release()
|
||||
} else if !errors.IsCorrupted(ferr) {
|
||||
err = ferr
|
||||
return nil, nil, ferr
|
||||
}
|
||||
}
|
||||
|
||||
data := r.getDataIter(dataBH, nil, r.verifyChecksum, !ro.GetDontFillCache())
|
||||
if !data.Seek(key) {
|
||||
data.Release()
|
||||
if err = data.Error(); err != nil {
|
||||
return
|
||||
}
|
||||
|
||||
// The nearest greater-than key is the first key of the next block.
|
||||
if !index.Next() {
|
||||
if err = index.Error(); err == nil {
|
||||
err = ErrNotFound
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
dataBH, n = decodeBlockHandle(index.Value())
|
||||
if n == 0 {
|
||||
r.err = r.newErrCorruptedBH(r.indexBH, "bad data block handle")
|
||||
return nil, nil, r.err
|
||||
}
|
||||
|
||||
data = r.getDataIter(dataBH, nil, r.verifyChecksum, !ro.GetDontFillCache())
|
||||
if !data.Next() {
|
||||
data.Release()
|
||||
if err = data.Error(); err == nil {
|
||||
err = ErrNotFound
|
||||
}
|
||||
return
|
||||
}
|
||||
}
|
||||
data := r.getDataIter(dataBH, nil, r.verifyChecksum, !ro.GetDontFillCache())
|
||||
defer data.Release()
|
||||
if !data.Seek(key) {
|
||||
err = data.Error()
|
||||
if err == nil {
|
||||
err = ErrNotFound
|
||||
}
|
||||
return
|
||||
}
|
||||
// Don't use block buffer, no need to copy the buffer.
|
||||
|
||||
// Key doesn't use block buffer, no need to copy the buffer.
|
||||
rkey = data.Key()
|
||||
if !noValue {
|
||||
if r.bpool == nil {
|
||||
value = data.Value()
|
||||
} else {
|
||||
// Use block buffer, and since the buffer will be recycled, the buffer
|
||||
// need to be copied.
|
||||
// Value does use block buffer, and since the buffer will be
|
||||
// recycled, it need to be copied.
|
||||
value = append([]byte{}, data.Value()...)
|
||||
}
|
||||
}
|
||||
data.Release()
|
||||
return
|
||||
}
|
||||
|
||||
|
@ -888,7 +918,7 @@ func (r *Reader) Find(key []byte, filtered bool, ro *opt.ReadOptions) (rkey, val
|
|||
return r.find(key, filtered, ro, false)
|
||||
}
|
||||
|
||||
// Find finds key that is greater than or equal to the given key.
|
||||
// FindKey finds key that is greater than or equal to the given key.
|
||||
// It returns ErrNotFound if the table doesn't contain such key.
|
||||
// If filtered is true then the nearest 'block' will be checked against
|
||||
// 'filter data' (if present) and will immediately return ErrNotFound if
|
||||
|
@ -987,14 +1017,14 @@ func (r *Reader) Release() {
|
|||
// NewReader creates a new initialized table reader for the file.
|
||||
// The fi, cache and bpool is optional and can be nil.
|
||||
//
|
||||
// The returned table reader instance is goroutine-safe.
|
||||
func NewReader(f io.ReaderAt, size int64, fi *storage.FileInfo, cache *cache.CacheGetter, bpool *util.BufferPool, o *opt.Options) (*Reader, error) {
|
||||
// The returned table reader instance is safe for concurrent use.
|
||||
func NewReader(f io.ReaderAt, size int64, fd storage.FileDesc, cache *cache.NamespaceGetter, bpool *util.BufferPool, o *opt.Options) (*Reader, error) {
|
||||
if f == nil {
|
||||
return nil, errors.New("leveldb/table: nil file")
|
||||
}
|
||||
|
||||
r := &Reader{
|
||||
fi: fi,
|
||||
fd: fd,
|
||||
reader: f,
|
||||
cache: cache,
|
||||
bpool: bpool,
|
||||
|
@ -1039,9 +1069,8 @@ func NewReader(f io.ReaderAt, size int64, fi *storage.FileInfo, cache *cache.Cac
|
|||
if errors.IsCorrupted(err) {
|
||||
r.err = err
|
||||
return r, nil
|
||||
} else {
|
||||
return nil, err
|
||||
}
|
||||
return nil, err
|
||||
}
|
||||
|
||||
// Set data end.
|
||||
|
@ -1086,9 +1115,8 @@ func NewReader(f io.ReaderAt, size int64, fi *storage.FileInfo, cache *cache.Cac
|
|||
if errors.IsCorrupted(err) {
|
||||
r.err = err
|
||||
return r, nil
|
||||
} else {
|
||||
return nil, err
|
||||
}
|
||||
return nil, err
|
||||
}
|
||||
if r.filter != nil {
|
||||
r.filterBlock, err = r.readFilterBlock(r.filterBH)
|
||||
|
|
|
@ -14,6 +14,7 @@ import (
|
|||
|
||||
"github.com/syndtr/goleveldb/leveldb/iterator"
|
||||
"github.com/syndtr/goleveldb/leveldb/opt"
|
||||
"github.com/syndtr/goleveldb/leveldb/storage"
|
||||
"github.com/syndtr/goleveldb/leveldb/testutil"
|
||||
"github.com/syndtr/goleveldb/leveldb/util"
|
||||
)
|
||||
|
@ -59,7 +60,7 @@ var _ = testutil.Defer(func() {
|
|||
It("Should be able to approximate offset of a key correctly", func() {
|
||||
Expect(err).ShouldNot(HaveOccurred())
|
||||
|
||||
tr, err := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), nil, nil, nil, o)
|
||||
tr, err := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), storage.FileDesc{}, nil, nil, o)
|
||||
Expect(err).ShouldNot(HaveOccurred())
|
||||
CheckOffset := func(key string, expect, threshold int) {
|
||||
offset, err := tr.OffsetOf([]byte(key))
|
||||
|
@ -96,7 +97,7 @@ var _ = testutil.Defer(func() {
|
|||
tw.Close()
|
||||
|
||||
// Opening the table.
|
||||
tr, _ := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), nil, nil, nil, o)
|
||||
tr, _ := NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()), storage.FileDesc{}, nil, nil, o)
|
||||
return tableWrapper{tr}
|
||||
}
|
||||
Test := func(kv *testutil.KeyValue, body func(r *Reader)) func() {
|
||||
|
@ -110,7 +111,7 @@ var _ = testutil.Defer(func() {
|
|||
}
|
||||
|
||||
testutil.AllKeyValueTesting(nil, Build, nil, nil)
|
||||
Describe("with one key per block", Test(testutil.KeyValue_Generate(nil, 9, 1, 10, 512, 512), func(r *Reader) {
|
||||
Describe("with one key per block", Test(testutil.KeyValue_Generate(nil, 9, 1, 1, 10, 512, 512), func(r *Reader) {
|
||||
It("should have correct blocks number", func() {
|
||||
indexBlock, err := r.readBlock(r.indexBH, true)
|
||||
Expect(err).To(BeNil())
|
||||
|
|
|
@ -12,7 +12,7 @@ import (
"fmt"
"io"

"github.com/syndtr/gosnappy/snappy"
"github.com/golang/snappy"

"github.com/syndtr/goleveldb/leveldb/comparer"
"github.com/syndtr/goleveldb/leveldb/filter"

@ -167,11 +167,7 @@ func (w *Writer) writeBlock(buf *util.Buffer, compression opt.Compression) (bh b
if n := snappy.MaxEncodedLen(buf.Len()) + blockTrailerLen; len(w.compressionScratch) < n {
w.compressionScratch = make([]byte, n)
}
var compressed []byte
compressed, err = snappy.Encode(w.compressionScratch, buf.Bytes())
if err != nil {
return
}
compressed := snappy.Encode(w.compressionScratch, buf.Bytes())
n := len(compressed)
b = compressed[:n+blockTrailerLen]
b[n] = blockTypeSnappyCompression
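The switch from gosnappy to github.com/golang/snappy changes Encode to return only the encoded slice (no error), which is why the error check above disappears. A small standalone sketch of the block API after the switch:

src := []byte("level-0 tables compress well when keys repeat, repeat, repeat")
compressed := snappy.Encode(nil, src) // new API: no error return
decoded, err := snappy.Decode(nil, compressed)
if err != nil || !bytes.Equal(decoded, src) {
	panic("snappy round-trip failed")
}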
@ -353,7 +349,7 @@ func (w *Writer) Close() error {
|
|||
|
||||
// NewWriter creates a new initialized table writer for the file.
|
||||
//
|
||||
// Table writer is not goroutine-safe.
|
||||
// Table writer is not safe for concurrent use.
|
||||
func NewWriter(f io.Writer, o *opt.Options) *Writer {
|
||||
w := &Writer{
|
||||
writer: f,
|
||||
|
|
|
@ -61,3 +61,31 @@ func newTestingDB(o *opt.Options, ro *opt.ReadOptions, wo *opt.WriteOptions) *te
|
|||
stor: stor,
|
||||
}
|
||||
}
|
||||
|
||||
type testingTransaction struct {
|
||||
*Transaction
|
||||
ro *opt.ReadOptions
|
||||
wo *opt.WriteOptions
|
||||
}
|
||||
|
||||
func (t *testingTransaction) TestPut(key []byte, value []byte) error {
|
||||
return t.Put(key, value, t.wo)
|
||||
}
|
||||
|
||||
func (t *testingTransaction) TestDelete(key []byte) error {
|
||||
return t.Delete(key, t.wo)
|
||||
}
|
||||
|
||||
func (t *testingTransaction) TestGet(key []byte) (value []byte, err error) {
|
||||
return t.Get(key, t.ro)
|
||||
}
|
||||
|
||||
func (t *testingTransaction) TestHas(key []byte) (ret bool, err error) {
|
||||
return t.Has(key, t.ro)
|
||||
}
|
||||
|
||||
func (t *testingTransaction) TestNewIterator(slice *util.Range) iterator.Iterator {
|
||||
return t.NewIterator(slice, t.ro)
|
||||
}
|
||||
|
||||
func (t *testingTransaction) TestClose() {}
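These helpers wrap the transaction type for the shared test harness; a hedged sketch of the calls they exercise (not part of the diff; `db` is assumed to be an open *DB and OpenTransaction/Commit/Discard the usual entry points at this revision):

tr, err := db.OpenTransaction()
if err != nil {
	return err
}
if err := tr.Put([]byte("key"), []byte("value"), nil); err != nil {
	tr.Discard() // roll back on failure
	return err
}
return tr.Commit() // atomically apply the batched writes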
|
||||
|
|
|
@ -72,20 +72,27 @@ func maxInt(a, b int) int {
return b
}

type files []storage.File
type fdSorter []storage.FileDesc

func (p files) Len() int {
func (p fdSorter) Len() int {
return len(p)
}

func (p files) Less(i, j int) bool {
return p[i].Num() < p[j].Num()
func (p fdSorter) Less(i, j int) bool {
return p[i].Num < p[j].Num
}

func (p files) Swap(i, j int) {
func (p fdSorter) Swap(i, j int) {
p[i], p[j] = p[j], p[i]
}

func (p files) sort() {
sort.Sort(p)
func sortFds(fds []storage.FileDesc) {
sort.Sort(fdSorter(fds))
}

func ensureBuffer(b []byte, n int) []byte {
if cap(b) < n {
return make([]byte, n)
}
return b[:n]
}
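A short usage sketch (not part of the diff) of the descriptor sorter defined above, using the positional FileDesc form that appears elsewhere in this change:

fds := []storage.FileDesc{
	{storage.TypeTable, 9},
	{storage.TypeTable, 3},
	{storage.TypeTable, 7},
}
sortFds(fds) // ascending by Num: 3, 7, 9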
@ -201,6 +201,7 @@ func (p *BufferPool) String() string {
|
|||
|
||||
func (p *BufferPool) drain() {
|
||||
ticker := time.NewTicker(2 * time.Second)
|
||||
defer ticker.Stop()
|
||||
for {
|
||||
select {
|
||||
case <-ticker.C:
|
||||
|
|
|
@ -7,38 +7,38 @@
package util

import (
"bytes"
"encoding/binary"
)

// Hash return hash of the given data.
func Hash(data []byte, seed uint32) uint32 {
// Similar to murmur hash
var m uint32 = 0xc6a4a793
var r uint32 = 24
h := seed ^ (uint32(len(data)) * m)
const (
m = uint32(0xc6a4a793)
r = uint32(24)
)
var (
h = seed ^ (uint32(len(data)) * m)
i int
)

buf := bytes.NewBuffer(data)
for buf.Len() >= 4 {
var w uint32
binary.Read(buf, binary.LittleEndian, &w)
h += w
for n := len(data) - len(data)%4; i < n; i += 4 {
h += binary.LittleEndian.Uint32(data[i:])
h *= m
h ^= (h >> 16)
}

rest := buf.Bytes()
switch len(rest) {
switch len(data) - i {
default:
panic("not reached")
case 3:
h += uint32(rest[2]) << 16
h += uint32(data[i+2]) << 16
fallthrough
case 2:
h += uint32(rest[1]) << 8
h += uint32(data[i+1]) << 8
fallthrough
case 1:
h += uint32(rest[0])
h += uint32(data[i])
h *= m
h ^= (h >> r)
case 0:
@ -0,0 +1,46 @@
|
|||
// Copyright (c) 2012, Suryandaru Triandana <syndtr@gmail.com>
|
||||
// All rights reserved.
|
||||
//
|
||||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file.
|
||||
|
||||
package util
|
||||
|
||||
import (
|
||||
"testing"
|
||||
)
|
||||
|
||||
var hashTests = []struct {
|
||||
data []byte
|
||||
seed uint32
|
||||
hash uint32
|
||||
}{
|
||||
{nil, 0xbc9f1d34, 0xbc9f1d34},
|
||||
{[]byte{0x62}, 0xbc9f1d34, 0xef1345c4},
|
||||
{[]byte{0xc3, 0x97}, 0xbc9f1d34, 0x5b663814},
|
||||
{[]byte{0xe2, 0x99, 0xa5}, 0xbc9f1d34, 0x323c078f},
|
||||
{[]byte{0xe1, 0x80, 0xb9, 0x32}, 0xbc9f1d34, 0xed21633a},
|
||||
{[]byte{
|
||||
0x01, 0xc0, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00,
|
||||
0x14, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x04, 0x00,
|
||||
0x00, 0x00, 0x00, 0x14,
|
||||
0x00, 0x00, 0x00, 0x18,
|
||||
0x28, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00,
|
||||
0x02, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00,
|
||||
}, 0x12345678, 0xf333dabb},
|
||||
}
|
||||
|
||||
func TestHash(t *testing.T) {
|
||||
for i, x := range hashTests {
|
||||
h := Hash(x.data, x.seed)
|
||||
if h != x.hash {
|
||||
t.Fatalf("test-%d: invalid hash, %#x vs %#x", i, h, x.hash)
|
||||
}
|
||||
}
|
||||
}
|
|
@ -1,21 +0,0 @@
|
|||
// Copyright (c) 2014, Suryandaru Triandana <syndtr@gmail.com>
|
||||
// All rights reserved.
|
||||
//
|
||||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file.
|
||||
|
||||
// +build go1.3
|
||||
|
||||
package util
|
||||
|
||||
import (
|
||||
"sync"
|
||||
)
|
||||
|
||||
type Pool struct {
|
||||
sync.Pool
|
||||
}
|
||||
|
||||
func NewPool(cap int) *Pool {
|
||||
return &Pool{}
|
||||
}
|
|
@ -1,33 +0,0 @@
|
|||
// Copyright (c) 2014, Suryandaru Triandana <syndtr@gmail.com>
|
||||
// All rights reserved.
|
||||
//
|
||||
// Use of this source code is governed by a BSD-style license that can be
|
||||
// found in the LICENSE file.
|
||||
|
||||
// +build !go1.3
|
||||
|
||||
package util
|
||||
|
||||
type Pool struct {
|
||||
pool chan interface{}
|
||||
}
|
||||
|
||||
func (p *Pool) Get() interface{} {
|
||||
select {
|
||||
case x := <-p.pool:
|
||||
return x
|
||||
default:
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
func (p *Pool) Put(x interface{}) {
|
||||
select {
|
||||
case p.pool <- x:
|
||||
default:
|
||||
}
|
||||
}
|
||||
|
||||
func NewPool(cap int) *Pool {
|
||||
return &Pool{pool: make(chan interface{}, cap)}
|
||||
}
|
|
@ -19,7 +19,7 @@ var (
// Releaser is the interface that wraps the basic Release method.
type Releaser interface {
// Release releases associated resources. Release should always success
// and can be called multipe times without causing error.
// and can be called multiple times without causing error.
Release()
}
@ -7,6 +7,7 @@
|
|||
package leveldb
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"sync/atomic"
|
||||
"unsafe"
|
||||
|
||||
|
@ -23,7 +24,7 @@ type tSet struct {
|
|||
type version struct {
|
||||
s *session
|
||||
|
||||
tables []tFiles
|
||||
levels []tFiles
|
||||
|
||||
// Level that should be compacted next and its compaction score.
|
||||
// Score < 1 means compaction is not strictly needed. These fields
|
||||
|
@ -33,43 +34,48 @@ type version struct {
|
|||
|
||||
cSeek unsafe.Pointer
|
||||
|
||||
ref int
|
||||
// Succeeding version.
|
||||
next *version
|
||||
closing bool
|
||||
ref int
|
||||
released bool
|
||||
}
|
||||
|
||||
func newVersion(s *session) *version {
|
||||
return &version{s: s, tables: make([]tFiles, s.o.GetNumLevel())}
|
||||
return &version{s: s}
|
||||
}
|
||||
|
||||
func (v *version) incref() {
|
||||
if v.released {
|
||||
panic("already released")
|
||||
}
|
||||
|
||||
v.ref++
|
||||
if v.ref == 1 {
|
||||
// Incr file ref.
|
||||
for _, tt := range v.levels {
|
||||
for _, t := range tt {
|
||||
v.s.addFileRef(t.fd, 1)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (v *version) releaseNB() {
|
||||
v.ref--
|
||||
if v.ref > 0 {
|
||||
return
|
||||
}
|
||||
if v.ref < 0 {
|
||||
} else if v.ref < 0 {
|
||||
panic("negative version ref")
|
||||
}
|
||||
|
||||
tables := make(map[uint64]bool)
|
||||
for _, tt := range v.next.tables {
|
||||
for _, tt := range v.levels {
|
||||
for _, t := range tt {
|
||||
num := t.file.Num()
|
||||
tables[num] = true
|
||||
}
|
||||
}
|
||||
|
||||
for _, tt := range v.tables {
|
||||
for _, t := range tt {
|
||||
num := t.file.Num()
|
||||
if _, ok := tables[num]; !ok {
|
||||
if v.s.addFileRef(t.fd, -1) == 0 {
|
||||
v.s.tops.remove(t)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
v.next.releaseNB()
|
||||
v.next = nil
|
||||
v.released = true
|
||||
}
|
||||
|
||||
func (v *version) release() {
|
||||
|
@ -78,11 +84,26 @@ func (v *version) release() {
|
|||
v.s.vmu.Unlock()
|
||||
}
|
||||
|
||||
func (v *version) walkOverlapping(ikey iKey, f func(level int, t *tFile) bool, lf func(level int) bool) {
|
||||
func (v *version) walkOverlapping(aux tFiles, ikey internalKey, f func(level int, t *tFile) bool, lf func(level int) bool) {
|
||||
ukey := ikey.ukey()
|
||||
|
||||
// Aux level.
|
||||
if aux != nil {
|
||||
for _, t := range aux {
|
||||
if t.overlaps(v.s.icmp, ukey, ukey) {
|
||||
if !f(-1, t) {
|
||||
return
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if lf != nil && !lf(-1) {
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
// Walk tables level-by-level.
|
||||
for level, tables := range v.tables {
|
||||
for level, tables := range v.levels {
|
||||
if len(tables) == 0 {
|
||||
continue
|
||||
}
|
||||
|
@ -114,7 +135,11 @@ func (v *version) walkOverlapping(ikey iKey, f func(level int, t *tFile) bool, l
|
|||
}
|
||||
}
|
||||
|
||||
func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byte, tcomp bool, err error) {
|
||||
func (v *version) get(aux tFiles, ikey internalKey, ro *opt.ReadOptions, noValue bool) (value []byte, tcomp bool, err error) {
|
||||
if v.closing {
|
||||
return nil, false, ErrClosed
|
||||
}
|
||||
|
||||
ukey := ikey.ukey()
|
||||
|
||||
var (
|
||||
|
@ -124,16 +149,16 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
|
|||
// Level-0.
|
||||
zfound bool
|
||||
zseq uint64
|
||||
zkt kType
|
||||
zkt keyType
|
||||
zval []byte
|
||||
)
|
||||
|
||||
err = ErrNotFound
|
||||
|
||||
// Since entries never hope across level, finding key/value
|
||||
// Since entries never hop across level, finding key/value
|
||||
// in smaller level make later levels irrelevant.
|
||||
v.walkOverlapping(ikey, func(level int, t *tFile) bool {
|
||||
if !tseek {
|
||||
v.walkOverlapping(aux, ikey, func(level int, t *tFile) bool {
|
||||
if level >= 0 && !tseek {
|
||||
if tset == nil {
|
||||
tset = &tSet{level, t}
|
||||
} else {
|
||||
|
@ -150,6 +175,7 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
|
|||
} else {
|
||||
fikey, fval, ferr = v.s.tops.find(t, ikey, ro)
|
||||
}
|
||||
|
||||
switch ferr {
|
||||
case nil:
|
||||
case ErrNotFound:
|
||||
|
@ -159,9 +185,10 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
|
|||
return false
|
||||
}
|
||||
|
||||
if fukey, fseq, fkt, fkerr := parseIkey(fikey); fkerr == nil {
|
||||
if fukey, fseq, fkt, fkerr := parseInternalKey(fikey); fkerr == nil {
|
||||
if v.s.icmp.uCompare(ukey, fukey) == 0 {
|
||||
if level == 0 {
|
||||
// Level <= 0 may overlaps each-other.
|
||||
if level <= 0 {
|
||||
if fseq >= zseq {
|
||||
zfound = true
|
||||
zseq = fseq
|
||||
|
@ -170,12 +197,12 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
|
|||
}
|
||||
} else {
|
||||
switch fkt {
|
||||
case ktVal:
|
||||
case keyTypeVal:
|
||||
value = fval
|
||||
err = nil
|
||||
case ktDel:
|
||||
case keyTypeDel:
|
||||
default:
|
||||
panic("leveldb: invalid iKey type")
|
||||
panic("leveldb: invalid internalKey type")
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
@ -189,12 +216,12 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
|
|||
}, func(level int) bool {
|
||||
if zfound {
|
||||
switch zkt {
|
||||
case ktVal:
|
||||
case keyTypeVal:
|
||||
value = zval
|
||||
err = nil
|
||||
case ktDel:
|
||||
case keyTypeDel:
|
||||
default:
|
||||
panic("leveldb: invalid iKey type")
|
||||
panic("leveldb: invalid internalKey type")
|
||||
}
|
||||
return false
|
||||
}
|
||||
|
@ -209,46 +236,40 @@ func (v *version) get(ikey iKey, ro *opt.ReadOptions, noValue bool) (value []byt
|
|||
return
|
||||
}
|
||||
|
||||
func (v *version) sampleSeek(ikey iKey) (tcomp bool) {
|
||||
func (v *version) sampleSeek(ikey internalKey) (tcomp bool) {
|
||||
var tset *tSet
|
||||
|
||||
v.walkOverlapping(ikey, func(level int, t *tFile) bool {
|
||||
v.walkOverlapping(nil, ikey, func(level int, t *tFile) bool {
|
||||
if tset == nil {
|
||||
tset = &tSet{level, t}
|
||||
return true
|
||||
} else {
|
||||
if tset.table.consumeSeek() <= 0 {
|
||||
tcomp = atomic.CompareAndSwapPointer(&v.cSeek, nil, unsafe.Pointer(tset))
|
||||
}
|
||||
return false
|
||||
}
|
||||
if tset.table.consumeSeek() <= 0 {
|
||||
tcomp = atomic.CompareAndSwapPointer(&v.cSeek, nil, unsafe.Pointer(tset))
|
||||
}
|
||||
return false
|
||||
}, nil)
|
||||
|
||||
return
|
||||
}
|
||||
|
||||
func (v *version) getIterators(slice *util.Range, ro *opt.ReadOptions) (its []iterator.Iterator) {
|
||||
// Merge all level zero files together since they may overlap
|
||||
for _, t := range v.tables[0] {
|
||||
it := v.s.tops.newIterator(t, slice, ro)
|
||||
its = append(its, it)
|
||||
}
|
||||
|
||||
strict := opt.GetStrict(v.s.o.Options, ro, opt.StrictReader)
|
||||
for _, tables := range v.tables[1:] {
|
||||
if len(tables) == 0 {
|
||||
continue
|
||||
for level, tables := range v.levels {
|
||||
if level == 0 {
|
||||
// Merge all level zero files together since they may overlap.
|
||||
for _, t := range tables {
|
||||
its = append(its, v.s.tops.newIterator(t, slice, ro))
|
||||
}
|
||||
} else if len(tables) != 0 {
|
||||
its = append(its, iterator.NewIndexedIterator(tables.newIndexIterator(v.s.tops, v.s.icmp, slice, ro), strict))
|
||||
}
|
||||
|
||||
it := iterator.NewIndexedIterator(tables.newIndexIterator(v.s.tops, v.s.icmp, slice, ro), strict)
|
||||
its = append(its, it)
|
||||
}
|
||||
|
||||
return
|
||||
}
|
||||
|
||||
func (v *version) newStaging() *versionStaging {
|
||||
return &versionStaging{base: v, tables: make([]tablesScratch, v.s.o.GetNumLevel())}
|
||||
return &versionStaging{base: v}
|
||||
}
|
||||
|
||||
// Spawn a new version based on this version.
|
||||
|
@ -259,19 +280,22 @@ func (v *version) spawn(r *sessionRecord) *version {
|
|||
}
|
||||
|
||||
func (v *version) fillRecord(r *sessionRecord) {
|
||||
for level, ts := range v.tables {
|
||||
for _, t := range ts {
|
||||
for level, tables := range v.levels {
|
||||
for _, t := range tables {
|
||||
r.addTableFile(level, t)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (v *version) tLen(level int) int {
|
||||
return len(v.tables[level])
|
||||
if level < len(v.levels) {
|
||||
return len(v.levels[level])
|
||||
}
|
||||
return 0
|
||||
}
|
||||
|
||||
func (v *version) offsetOf(ikey iKey) (n uint64, err error) {
|
||||
for level, tables := range v.tables {
|
||||
func (v *version) offsetOf(ikey internalKey) (n int64, err error) {
|
||||
for level, tables := range v.levels {
|
||||
for _, t := range tables {
|
||||
if v.s.icmp.Compare(t.imax, ikey) <= 0 {
|
||||
// Entire file is before "ikey", so just add the file size
|
||||
|
@ -287,12 +311,11 @@ func (v *version) offsetOf(ikey iKey) (n uint64, err error) {
|
|||
} else {
|
||||
// "ikey" falls in the range for this table. Add the
|
||||
// approximate offset of "ikey" within the table.
|
||||
var nn uint64
|
||||
nn, err = v.s.tops.offsetOf(t, ikey)
|
||||
if err != nil {
|
||||
if m, err := v.s.tops.offsetOf(t, ikey); err == nil {
|
||||
n += m
|
||||
} else {
|
||||
return 0, err
|
||||
}
|
||||
n += nn
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -300,37 +323,50 @@ func (v *version) offsetOf(ikey iKey) (n uint64, err error) {
|
|||
return
|
||||
}
|
||||
|
||||
func (v *version) pickLevel(umin, umax []byte) (level int) {
|
||||
if !v.tables[0].overlaps(v.s.icmp, umin, umax, true) {
|
||||
var overlaps tFiles
|
||||
maxLevel := v.s.o.GetMaxMemCompationLevel()
|
||||
for ; level < maxLevel; level++ {
|
||||
if v.tables[level+1].overlaps(v.s.icmp, umin, umax, false) {
|
||||
break
|
||||
}
|
||||
overlaps = v.tables[level+2].getOverlaps(overlaps, v.s.icmp, umin, umax, false)
|
||||
if overlaps.size() > uint64(v.s.o.GetCompactionGPOverlaps(level)) {
|
||||
break
|
||||
func (v *version) pickMemdbLevel(umin, umax []byte, maxLevel int) (level int) {
|
||||
if maxLevel > 0 {
|
||||
if len(v.levels) == 0 {
|
||||
return maxLevel
|
||||
}
|
||||
if !v.levels[0].overlaps(v.s.icmp, umin, umax, true) {
|
||||
var overlaps tFiles
|
||||
for ; level < maxLevel; level++ {
|
||||
if pLevel := level + 1; pLevel >= len(v.levels) {
|
||||
return maxLevel
|
||||
} else if v.levels[pLevel].overlaps(v.s.icmp, umin, umax, false) {
|
||||
break
|
||||
}
|
||||
if gpLevel := level + 2; gpLevel < len(v.levels) {
|
||||
overlaps = v.levels[gpLevel].getOverlaps(overlaps, v.s.icmp, umin, umax, false)
|
||||
if overlaps.size() > int64(v.s.o.GetCompactionGPOverlaps(level)) {
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return
|
||||
}
|
||||
|
||||
func (v *version) computeCompaction() {
|
||||
// Precomputed best level for next compaction
|
||||
var bestLevel int = -1
|
||||
var bestScore float64 = -1
|
||||
bestLevel := int(-1)
|
||||
bestScore := float64(-1)
|
||||
|
||||
for level, tables := range v.tables {
|
||||
statFiles := make([]int, len(v.levels))
|
||||
statSizes := make([]string, len(v.levels))
|
||||
statScore := make([]string, len(v.levels))
|
||||
statTotSize := int64(0)
|
||||
|
||||
for level, tables := range v.levels {
|
||||
var score float64
|
||||
size := tables.size()
|
||||
if level == 0 {
|
||||
// We treat level-0 specially by bounding the number of files
|
||||
// instead of number of bytes for two reasons:
|
||||
//
|
||||
// (1) With larger write-buffer sizes, it is nice not to do too
|
||||
// many level-0 compactions.
|
||||
// many level-0 compaction.
|
||||
//
|
||||
// (2) The files in level-0 are merged on every read and
|
||||
// therefore we wish to avoid too many files when the individual
|
||||
|
@ -339,17 +375,24 @@ func (v *version) computeCompaction() {
|
|||
// overwrites/deletions).
|
||||
score = float64(len(tables)) / float64(v.s.o.GetCompactionL0Trigger())
|
||||
} else {
|
||||
score = float64(tables.size()) / float64(v.s.o.GetCompactionTotalSize(level))
|
||||
score = float64(size) / float64(v.s.o.GetCompactionTotalSize(level))
|
||||
}
|
||||
|
||||
if score > bestScore {
|
||||
bestLevel = level
|
||||
bestScore = score
|
||||
}
|
||||
|
||||
statFiles[level] = len(tables)
|
||||
statSizes[level] = shortenb(int(size))
|
||||
statScore[level] = fmt.Sprintf("%.2f", score)
|
||||
statTotSize += size
|
||||
}
|
||||
|
||||
v.cLevel = bestLevel
|
||||
v.cScore = bestScore
|
||||
|
||||
v.s.logf("version@stat F·%v S·%s%v Sc·%v", statFiles, shortenb(int(statTotSize)), statSizes, statScore)
|
||||
}
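To make the scoring rule above concrete, a small worked example (not part of the diff; the trigger and size values are made-up stand-ins for whatever GetCompactionL0Trigger and GetCompactionTotalSize return): level-0 is scored by file count while higher levels are scored by total byte size, so a level full of tiny tables can stay far below a score of 1.

l0Score := float64(8) / float64(4)          // 8 level-0 tables vs a trigger of 4 files -> 2.0, compaction due
l1Score := float64(3<<20) / float64(10<<20) // 3 MiB in level-1 vs a 10 MiB budget      -> 0.3, no compaction
needsCompaction := l0Score >= 1 || l1Score >= 1
_ = needsCompaction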
|
||||
|
||||
func (v *version) needCompaction() bool {
|
||||
|
@ -357,43 +400,48 @@ func (v *version) needCompaction() bool {
|
|||
}
|
||||
|
||||
type tablesScratch struct {
|
||||
added map[uint64]atRecord
|
||||
deleted map[uint64]struct{}
|
||||
added map[int64]atRecord
|
||||
deleted map[int64]struct{}
|
||||
}
|
||||
|
||||
type versionStaging struct {
|
||||
base *version
|
||||
tables []tablesScratch
|
||||
levels []tablesScratch
|
||||
}
|
||||
|
||||
func (p *versionStaging) getScratch(level int) *tablesScratch {
|
||||
if level >= len(p.levels) {
|
||||
newLevels := make([]tablesScratch, level+1)
|
||||
copy(newLevels, p.levels)
|
||||
p.levels = newLevels
|
||||
}
|
||||
return &(p.levels[level])
|
||||
}
|
||||
|
||||
func (p *versionStaging) commit(r *sessionRecord) {
|
||||
// Deleted tables.
|
||||
for _, r := range r.deletedTables {
|
||||
tm := &(p.tables[r.level])
|
||||
|
||||
if len(p.base.tables[r.level]) > 0 {
|
||||
if tm.deleted == nil {
|
||||
tm.deleted = make(map[uint64]struct{})
|
||||
scratch := p.getScratch(r.level)
|
||||
if r.level < len(p.base.levels) && len(p.base.levels[r.level]) > 0 {
|
||||
if scratch.deleted == nil {
|
||||
scratch.deleted = make(map[int64]struct{})
|
||||
}
|
||||
tm.deleted[r.num] = struct{}{}
|
||||
scratch.deleted[r.num] = struct{}{}
|
||||
}
|
||||
|
||||
if tm.added != nil {
|
||||
delete(tm.added, r.num)
|
||||
if scratch.added != nil {
|
||||
delete(scratch.added, r.num)
|
||||
}
|
||||
}
|
||||
|
||||
// New tables.
|
||||
for _, r := range r.addedTables {
|
||||
tm := &(p.tables[r.level])
|
||||
|
||||
if tm.added == nil {
|
||||
tm.added = make(map[uint64]atRecord)
|
||||
scratch := p.getScratch(r.level)
|
||||
if scratch.added == nil {
|
||||
scratch.added = make(map[int64]atRecord)
|
||||
}
|
||||
tm.added[r.num] = r
|
||||
|
||||
if tm.deleted != nil {
|
||||
delete(tm.deleted, r.num)
|
||||
scratch.added[r.num] = r
|
||||
if scratch.deleted != nil {
|
||||
delete(scratch.deleted, r.num)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -401,39 +449,62 @@ func (p *versionStaging) commit(r *sessionRecord) {
|
|||
func (p *versionStaging) finish() *version {
|
||||
// Build new version.
|
||||
nv := newVersion(p.base.s)
|
||||
for level, tm := range p.tables {
|
||||
btables := p.base.tables[level]
|
||||
|
||||
n := len(btables) + len(tm.added) - len(tm.deleted)
|
||||
if n < 0 {
|
||||
n = 0
|
||||
}
|
||||
nt := make(tFiles, 0, n)
|
||||
|
||||
// Base tables.
|
||||
for _, t := range btables {
|
||||
if _, ok := tm.deleted[t.file.Num()]; ok {
|
||||
continue
|
||||
}
|
||||
if _, ok := tm.added[t.file.Num()]; ok {
|
||||
continue
|
||||
}
|
||||
nt = append(nt, t)
|
||||
}
|
||||
|
||||
// New tables.
|
||||
for _, r := range tm.added {
|
||||
nt = append(nt, p.base.s.tableFileFromRecord(r))
|
||||
}
|
||||
|
||||
// Sort tables.
|
||||
if level == 0 {
|
||||
nt.sortByNum()
|
||||
} else {
|
||||
nt.sortByKey(p.base.s.icmp)
|
||||
}
|
||||
nv.tables[level] = nt
|
||||
numLevel := len(p.levels)
|
||||
if len(p.base.levels) > numLevel {
|
||||
numLevel = len(p.base.levels)
|
||||
}
|
||||
nv.levels = make([]tFiles, numLevel)
|
||||
for level := 0; level < numLevel; level++ {
|
||||
var baseTabels tFiles
|
||||
if level < len(p.base.levels) {
|
||||
baseTabels = p.base.levels[level]
|
||||
}
|
||||
|
||||
if level < len(p.levels) {
|
||||
scratch := p.levels[level]
|
||||
|
||||
var nt tFiles
|
||||
// Prealloc list if possible.
|
||||
if n := len(baseTabels) + len(scratch.added) - len(scratch.deleted); n > 0 {
|
||||
nt = make(tFiles, 0, n)
|
||||
}
|
||||
|
||||
// Base tables.
|
||||
for _, t := range baseTabels {
|
||||
if _, ok := scratch.deleted[t.fd.Num]; ok {
|
||||
continue
|
||||
}
|
||||
if _, ok := scratch.added[t.fd.Num]; ok {
|
||||
continue
|
||||
}
|
||||
nt = append(nt, t)
|
||||
}
|
||||
|
||||
// New tables.
|
||||
for _, r := range scratch.added {
|
||||
nt = append(nt, tableFileFromRecord(r))
|
||||
}
|
||||
|
||||
if len(nt) != 0 {
|
||||
// Sort tables.
|
||||
if level == 0 {
|
||||
nt.sortByNum()
|
||||
} else {
|
||||
nt.sortByKey(p.base.s.icmp)
|
||||
}
|
||||
|
||||
nv.levels[level] = nt
|
||||
}
|
||||
} else {
|
||||
nv.levels[level] = baseTabels
|
||||
}
|
||||
}
|
||||
|
||||
// Trim levels.
|
||||
n := len(nv.levels)
|
||||
for ; n > 0 && nv.levels[n-1] == nil; n-- {
|
||||
}
|
||||
nv.levels = nv.levels[:n]
|
||||
|
||||
// Compute compaction score for new version.
|
||||
nv.computeCompaction()
|
||||
|
|
|
@ -0,0 +1,181 @@
|
|||
package leveldb
|
||||
|
||||
import (
|
||||
"encoding/binary"
|
||||
"reflect"
|
||||
"testing"
|
||||
|
||||
"github.com/onsi/gomega"
|
||||
|
||||
"github.com/syndtr/goleveldb/leveldb/testutil"
|
||||
)
|
||||
|
||||
type testFileRec struct {
|
||||
level int
|
||||
num int64
|
||||
}
|
||||
|
||||
func TestVersionStaging(t *testing.T) {
|
||||
gomega.RegisterTestingT(t)
|
||||
stor := testutil.NewStorage()
|
||||
defer stor.Close()
|
||||
s, err := newSession(stor, nil)
|
||||
if err != nil {
|
||||
t.Fatal(err)
|
||||
}
|
||||
|
||||
v := newVersion(s)
|
||||
v.newStaging()
|
||||
|
||||
tmp := make([]byte, 4)
|
||||
mik := func(i uint64) []byte {
|
||||
binary.BigEndian.PutUint32(tmp, uint32(i))
|
||||
return []byte(makeInternalKey(nil, tmp, 0, keyTypeVal))
|
||||
}
|
||||
|
||||
for i, x := range []struct {
|
||||
add, del []testFileRec
|
||||
levels [][]int64
|
||||
}{
|
||||
{
|
||||
add: []testFileRec{
|
||||
{1, 1},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{},
|
||||
{1},
|
||||
},
|
||||
},
|
||||
{
|
||||
add: []testFileRec{
|
||||
{1, 1},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{},
|
||||
{1},
|
||||
},
|
||||
},
|
||||
{
|
||||
del: []testFileRec{
|
||||
{1, 1},
|
||||
},
|
||||
levels: [][]int64{},
|
||||
},
|
||||
{
|
||||
add: []testFileRec{
|
||||
{0, 1},
|
||||
{0, 3},
|
||||
{0, 2},
|
||||
{2, 5},
|
||||
{1, 4},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{3, 2, 1},
|
||||
{4},
|
||||
{5},
|
||||
},
|
||||
},
|
||||
{
|
||||
add: []testFileRec{
|
||||
{1, 6},
|
||||
{2, 5},
|
||||
},
|
||||
del: []testFileRec{
|
||||
{0, 1},
|
||||
{0, 4},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{3, 2},
|
||||
{4, 6},
|
||||
{5},
|
||||
},
|
||||
},
|
||||
{
|
||||
del: []testFileRec{
|
||||
{0, 3},
|
||||
{0, 2},
|
||||
{1, 4},
|
||||
{1, 6},
|
||||
{2, 5},
|
||||
},
|
||||
levels: [][]int64{},
|
||||
},
|
||||
{
|
||||
add: []testFileRec{
|
||||
{0, 1},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{1},
|
||||
},
|
||||
},
|
||||
{
|
||||
add: []testFileRec{
|
||||
{1, 2},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{1},
|
||||
{2},
|
||||
},
|
||||
},
|
||||
{
|
||||
add: []testFileRec{
|
||||
{0, 3},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{3, 1},
|
||||
{2},
|
||||
},
|
||||
},
|
||||
{
|
||||
add: []testFileRec{
|
||||
{6, 9},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{3, 1},
|
||||
{2},
|
||||
{},
|
||||
{},
|
||||
{},
|
||||
{},
|
||||
{9},
|
||||
},
|
||||
},
|
||||
{
|
||||
del: []testFileRec{
|
||||
{6, 9},
|
||||
},
|
||||
levels: [][]int64{
|
||||
{3, 1},
|
||||
{2},
|
||||
},
|
||||
},
|
||||
} {
|
||||
rec := &sessionRecord{}
|
||||
for _, f := range x.add {
|
||||
ik := mik(uint64(f.num))
|
||||
rec.addTable(f.level, f.num, 1, ik, ik)
|
||||
}
|
||||
for _, f := range x.del {
|
||||
rec.delTable(f.level, f.num)
|
||||
}
|
||||
vs := v.newStaging()
|
||||
vs.commit(rec)
|
||||
v = vs.finish()
|
||||
if len(v.levels) != len(x.levels) {
|
||||
t.Fatalf("#%d: invalid level count: want=%d got=%d", i, len(x.levels), len(v.levels))
|
||||
}
|
||||
for j, want := range x.levels {
|
||||
tables := v.levels[j]
|
||||
if len(want) != len(tables) {
|
||||
t.Fatalf("#%d.%d: invalid tables count: want=%d got=%d", i, j, len(want), len(tables))
|
||||
}
|
||||
got := make([]int64, len(tables))
|
||||
for k, t := range tables {
|
||||
got[k] = t.fd.Num
|
||||
}
|
||||
if !reflect.DeepEqual(want, got) {
|
||||
t.Fatalf("#%d.%d: invalid tables: want=%v got=%v", i, j, want, got)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
|
@ -1,30 +0,0 @@
|
|||
syntax:glob
|
||||
.DS_Store
|
||||
.git
|
||||
.gitignore
|
||||
*.[568ao]
|
||||
*.ao
|
||||
*.so
|
||||
*.pyc
|
||||
._*
|
||||
.nfs.*
|
||||
[568a].out
|
||||
*~
|
||||
*.orig
|
||||
*.rej
|
||||
*.exe
|
||||
.*.swp
|
||||
core
|
||||
*.cgo*.go
|
||||
*.cgo*.c
|
||||
_cgo_*
|
||||
_obj
|
||||
_test
|
||||
_testmain.go
|
||||
build.out
|
||||
snappy/testdata
|
||||
test.out
|
||||
y.tab.[ch]
|
||||
|
||||
syntax:regexp
|
||||
^.*/core.[0-9]*$
|
|
@ -1,12 +0,0 @@
|
|||
# This is the official list of Snappy-Go authors for copyright purposes.
|
||||
# This file is distinct from the CONTRIBUTORS files.
|
||||
# See the latter for an explanation.
|
||||
|
||||
# Names should be added to this file as
|
||||
# Name or Organization <email address>
|
||||
# The email address is not required for organizations.
|
||||
|
||||
# Please keep the list sorted.
|
||||
|
||||
Google Inc.
|
||||
Jan Mercl <0xjnml@gmail.com>
|
|
@ -1,34 +0,0 @@
|
|||
# This is the official list of people who can contribute
|
||||
# (and typically have contributed) code to the Snappy-Go repository.
|
||||
# The AUTHORS file lists the copyright holders; this file
|
||||
# lists people. For example, Google employees are listed here
|
||||
# but not in AUTHORS, because Google holds the copyright.
|
||||
#
|
||||
# The submission process automatically checks to make sure
|
||||
# that people submitting code are listed in this file (by email address).
|
||||
#
|
||||
# Names should be added to this file only after verifying that
|
||||
# the individual or the individual's organization has agreed to
|
||||
# the appropriate Contributor License Agreement, found here:
|
||||
#
|
||||
# http://code.google.com/legal/individual-cla-v1.0.html
|
||||
# http://code.google.com/legal/corporate-cla-v1.0.html
|
||||
#
|
||||
# The agreement for individuals can be filled out on the web.
|
||||
#
|
||||
# When adding J Random Contributor's name to this file,
|
||||
# either J's name or J's organization's name should be
|
||||
# added to the AUTHORS file, depending on whether the
|
||||
# individual or corporate CLA was used.
|
||||
|
||||
# Names should be added to this file like so:
|
||||
# Name <email address>
|
||||
|
||||
# Please keep the list sorted.
|
||||
|
||||
Jan Mercl <0xjnml@gmail.com>
|
||||
Kai Backman <kaib@golang.org>
|
||||
Marc-Antoine Ruel <maruel@chromium.org>
|
||||
Nigel Tao <nigeltao@golang.org>
|
||||
Rob Pike <r@golang.org>
|
||||
Russ Cox <rsc@golang.org>
|
|
@ -1,27 +0,0 @@
|
|||
Copyright (c) 2011 The Snappy-Go Authors. All rights reserved.
|
||||
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
modification, are permitted provided that the following conditions are
|
||||
met:
|
||||
|
||||
* Redistributions of source code must retain the above copyright
|
||||
notice, this list of conditions and the following disclaimer.
|
||||
* Redistributions in binary form must reproduce the above
|
||||
copyright notice, this list of conditions and the following disclaimer
|
||||
in the documentation and/or other materials provided with the
|
||||
distribution.
|
||||
* Neither the name of Google Inc. nor the names of its
|
||||
contributors may be used to endorse or promote products derived from
|
||||
this software without specific prior written permission.
|
||||
|
||||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
|
||||
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
|
||||
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
|
||||
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
|
||||
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
|
||||
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
|
||||
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
|
||||
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
|
||||
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
|
||||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
|
||||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
|
@ -1,11 +0,0 @@
This is a Snappy library for the Go programming language.

To download and install from source:
$ go get code.google.com/p/snappy-go/snappy

Unless otherwise noted, the Snappy-Go source files are distributed
under the BSD-style license found in the LICENSE file.

Contributions should follow the same procedure as for the Go project:
http://golang.org/doc/contribute.html
@ -1,292 +0,0 @@
|
|||
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
|
||||
// Use of this source code is governed by a BSD-style
|
||||
// license that can be found in the LICENSE file.
|
||||
|
||||
package snappy
|
||||
|
||||
import (
|
||||
"encoding/binary"
|
||||
"errors"
|
||||
"io"
|
||||
)
|
||||
|
||||
var (
|
||||
// ErrCorrupt reports that the input is invalid.
|
||||
ErrCorrupt = errors.New("snappy: corrupt input")
|
||||
// ErrUnsupported reports that the input isn't supported.
|
||||
ErrUnsupported = errors.New("snappy: unsupported input")
|
||||
)
|
||||
|
||||
// DecodedLen returns the length of the decoded block.
|
||||
func DecodedLen(src []byte) (int, error) {
|
||||
v, _, err := decodedLen(src)
|
||||
return v, err
|
||||
}
|
||||
|
||||
// decodedLen returns the length of the decoded block and the number of bytes
|
||||
// that the length header occupied.
|
||||
func decodedLen(src []byte) (blockLen, headerLen int, err error) {
|
||||
v, n := binary.Uvarint(src)
|
||||
if n == 0 {
|
||||
return 0, 0, ErrCorrupt
|
||||
}
|
||||
if uint64(int(v)) != v {
|
||||
return 0, 0, errors.New("snappy: decoded block is too large")
|
||||
}
|
||||
return int(v), n, nil
|
||||
}
|
||||
|
||||
// Decode returns the decoded form of src. The returned slice may be a sub-
|
||||
// slice of dst if dst was large enough to hold the entire decoded block.
|
||||
// Otherwise, a newly allocated slice will be returned.
|
||||
// It is valid to pass a nil dst.
|
||||
func Decode(dst, src []byte) ([]byte, error) {
|
||||
dLen, s, err := decodedLen(src)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
if len(dst) < dLen {
|
||||
dst = make([]byte, dLen)
|
||||
}
|
||||
|
||||
var d, offset, length int
|
||||
for s < len(src) {
|
||||
switch src[s] & 0x03 {
|
||||
case tagLiteral:
|
||||
x := uint(src[s] >> 2)
|
||||
switch {
|
||||
case x < 60:
|
||||
s += 1
|
||||
case x == 60:
|
||||
s += 2
|
||||
if s > len(src) {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
x = uint(src[s-1])
|
||||
case x == 61:
|
||||
s += 3
|
||||
if s > len(src) {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
x = uint(src[s-2]) | uint(src[s-1])<<8
|
||||
case x == 62:
|
||||
s += 4
|
||||
if s > len(src) {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
x = uint(src[s-3]) | uint(src[s-2])<<8 | uint(src[s-1])<<16
|
||||
case x == 63:
|
||||
s += 5
|
||||
if s > len(src) {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
x = uint(src[s-4]) | uint(src[s-3])<<8 | uint(src[s-2])<<16 | uint(src[s-1])<<24
|
||||
}
|
||||
length = int(x + 1)
|
||||
if length <= 0 {
|
||||
return nil, errors.New("snappy: unsupported literal length")
|
||||
}
|
||||
if length > len(dst)-d || length > len(src)-s {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
copy(dst[d:], src[s:s+length])
|
||||
d += length
|
||||
s += length
|
||||
continue
|
||||
|
||||
case tagCopy1:
|
||||
s += 2
|
||||
if s > len(src) {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
length = 4 + int(src[s-2])>>2&0x7
|
||||
offset = int(src[s-2])&0xe0<<3 | int(src[s-1])
|
||||
|
||||
case tagCopy2:
|
||||
s += 3
|
||||
if s > len(src) {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
length = 1 + int(src[s-3])>>2
|
||||
offset = int(src[s-2]) | int(src[s-1])<<8
|
||||
|
||||
case tagCopy4:
|
||||
return nil, errors.New("snappy: unsupported COPY_4 tag")
|
||||
}
|
||||
|
||||
end := d + length
|
||||
if offset > d || end > len(dst) {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
for ; d < end; d++ {
|
||||
dst[d] = dst[d-offset]
|
||||
}
|
||||
}
|
||||
if d != dLen {
|
||||
return nil, ErrCorrupt
|
||||
}
|
||||
return dst[:d], nil
|
||||
}
// NewReader returns a new Reader that decompresses from r, using the framing
// format described at
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
func NewReader(r io.Reader) *Reader {
	return &Reader{
		r:       r,
		decoded: make([]byte, maxUncompressedChunkLen),
		buf:     make([]byte, MaxEncodedLen(maxUncompressedChunkLen)+checksumSize),
	}
}

// Reader is an io.Reader than can read Snappy-compressed bytes.
type Reader struct {
	r       io.Reader
	err     error
	decoded []byte
	buf     []byte
	// decoded[i:j] contains decoded bytes that have not yet been passed on.
	i, j       int
	readHeader bool
}

// Reset discards any buffered data, resets all state, and switches the Snappy
// reader to read from r. This permits reusing a Reader rather than allocating
// a new one.
func (r *Reader) Reset(reader io.Reader) {
	r.r = reader
	r.err = nil
	r.i = 0
	r.j = 0
	r.readHeader = false
}

func (r *Reader) readFull(p []byte) (ok bool) {
	if _, r.err = io.ReadFull(r.r, p); r.err != nil {
		if r.err == io.ErrUnexpectedEOF {
			r.err = ErrCorrupt
		}
		return false
	}
	return true
}

// Read satisfies the io.Reader interface.
func (r *Reader) Read(p []byte) (int, error) {
	if r.err != nil {
		return 0, r.err
	}
	for {
		if r.i < r.j {
			n := copy(p, r.decoded[r.i:r.j])
			r.i += n
			return n, nil
		}
		if !r.readFull(r.buf[:4]) {
			return 0, r.err
		}
		chunkType := r.buf[0]
		if !r.readHeader {
			if chunkType != chunkTypeStreamIdentifier {
				r.err = ErrCorrupt
				return 0, r.err
			}
			r.readHeader = true
		}
		chunkLen := int(r.buf[1]) | int(r.buf[2])<<8 | int(r.buf[3])<<16
		if chunkLen > len(r.buf) {
			r.err = ErrUnsupported
			return 0, r.err
		}

		// The chunk types are specified at
		// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
		switch chunkType {
		case chunkTypeCompressedData:
			// Section 4.2. Compressed data (chunk type 0x00).
			if chunkLen < checksumSize {
				r.err = ErrCorrupt
				return 0, r.err
			}
			buf := r.buf[:chunkLen]
			if !r.readFull(buf) {
				return 0, r.err
			}
			checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
			buf = buf[checksumSize:]

			n, err := DecodedLen(buf)
			if err != nil {
				r.err = err
				return 0, r.err
			}
			if n > len(r.decoded) {
				r.err = ErrCorrupt
				return 0, r.err
			}
			if _, err := Decode(r.decoded, buf); err != nil {
				r.err = err
				return 0, r.err
			}
			if crc(r.decoded[:n]) != checksum {
				r.err = ErrCorrupt
				return 0, r.err
			}
			r.i, r.j = 0, n
			continue

		case chunkTypeUncompressedData:
			// Section 4.3. Uncompressed data (chunk type 0x01).
			if chunkLen < checksumSize {
				r.err = ErrCorrupt
				return 0, r.err
			}
			buf := r.buf[:checksumSize]
			if !r.readFull(buf) {
				return 0, r.err
			}
			checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
			// Read directly into r.decoded instead of via r.buf.
			n := chunkLen - checksumSize
			if !r.readFull(r.decoded[:n]) {
				return 0, r.err
			}
			if crc(r.decoded[:n]) != checksum {
				r.err = ErrCorrupt
				return 0, r.err
			}
			r.i, r.j = 0, n
			continue

		case chunkTypeStreamIdentifier:
			// Section 4.1. Stream identifier (chunk type 0xff).
			if chunkLen != len(magicBody) {
				r.err = ErrCorrupt
				return 0, r.err
			}
			if !r.readFull(r.buf[:len(magicBody)]) {
				return 0, r.err
			}
			for i := 0; i < len(magicBody); i++ {
				if r.buf[i] != magicBody[i] {
					r.err = ErrCorrupt
					return 0, r.err
				}
			}
			continue
		}

		if chunkType <= 0x7f {
			// Section 4.5. Reserved unskippable chunks (chunk types 0x02-0x7f).
			r.err = ErrUnsupported
			return 0, r.err

		} else {
			// Section 4.4 Padding (chunk type 0xfe).
			// Section 4.6. Reserved skippable chunks (chunk types 0x80-0xfd).
			if !r.readFull(r.buf[:chunkLen]) {
				return 0, r.err
			}
		}
	}
}
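The framing entry points compose into a plain stream round trip, essentially what TestFramingFormat near the end of this change exercises. A minimal sketch (illustrative only, not from the removed files; same assumed import path as above):

package main

import (
	"bytes"
	"fmt"
	"io/ioutil"
	"log"

	"github.com/syndtr/gosnappy/snappy" // assumed import path
)

func main() {
	var buf bytes.Buffer

	// The Writer emits the stream identifier on the first Write, then one
	// compressed (or uncompressed) chunk per 64 KiB of input. Each Write call
	// emits complete chunks, so no Close or Flush step is needed here.
	w := snappy.NewWriter(&buf)
	if _, err := w.Write(bytes.Repeat([]byte("perkeep "), 4096)); err != nil {
		log.Fatal(err)
	}

	// The Reader verifies the stream identifier and each chunk's CRC while decoding.
	out, err := ioutil.ReadAll(snappy.NewReader(&buf))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(out)) // 32768
}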
@@ -1,258 +0,0 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package snappy

import (
	"encoding/binary"
	"io"
)

// We limit how far copy back-references can go, the same as the C++ code.
const maxOffset = 1 << 15

// emitLiteral writes a literal chunk and returns the number of bytes written.
func emitLiteral(dst, lit []byte) int {
	i, n := 0, uint(len(lit)-1)
	switch {
	case n < 60:
		dst[0] = uint8(n)<<2 | tagLiteral
		i = 1
	case n < 1<<8:
		dst[0] = 60<<2 | tagLiteral
		dst[1] = uint8(n)
		i = 2
	case n < 1<<16:
		dst[0] = 61<<2 | tagLiteral
		dst[1] = uint8(n)
		dst[2] = uint8(n >> 8)
		i = 3
	case n < 1<<24:
		dst[0] = 62<<2 | tagLiteral
		dst[1] = uint8(n)
		dst[2] = uint8(n >> 8)
		dst[3] = uint8(n >> 16)
		i = 4
	case int64(n) < 1<<32:
		dst[0] = 63<<2 | tagLiteral
		dst[1] = uint8(n)
		dst[2] = uint8(n >> 8)
		dst[3] = uint8(n >> 16)
		dst[4] = uint8(n >> 24)
		i = 5
	default:
		panic("snappy: source buffer is too long")
	}
	if copy(dst[i:], lit) != len(lit) {
		panic("snappy: destination buffer is too short")
	}
	return i + len(lit)
}

// emitCopy writes a copy chunk and returns the number of bytes written.
func emitCopy(dst []byte, offset, length int) int {
	i := 0
	for length > 0 {
		x := length - 4
		if 0 <= x && x < 1<<3 && offset < 1<<11 {
			dst[i+0] = uint8(offset>>8)&0x07<<5 | uint8(x)<<2 | tagCopy1
			dst[i+1] = uint8(offset)
			i += 2
			break
		}

		x = length
		if x > 1<<6 {
			x = 1 << 6
		}
		dst[i+0] = uint8(x-1)<<2 | tagCopy2
		dst[i+1] = uint8(offset)
		dst[i+2] = uint8(offset >> 8)
		i += 3
		length -= x
	}
	return i
}

// Encode returns the encoded form of src. The returned slice may be a sub-
// slice of dst if dst was large enough to hold the entire encoded block.
// Otherwise, a newly allocated slice will be returned.
// It is valid to pass a nil dst.
func Encode(dst, src []byte) ([]byte, error) {
	if n := MaxEncodedLen(len(src)); len(dst) < n {
		dst = make([]byte, n)
	}

	// The block starts with the varint-encoded length of the decompressed bytes.
	d := binary.PutUvarint(dst, uint64(len(src)))

	// Return early if src is short.
	if len(src) <= 4 {
		if len(src) != 0 {
			d += emitLiteral(dst[d:], src)
		}
		return dst[:d], nil
	}

	// Initialize the hash table. Its size ranges from 1<<8 to 1<<14 inclusive.
	const maxTableSize = 1 << 14
	shift, tableSize := uint(32-8), 1<<8
	for tableSize < maxTableSize && tableSize < len(src) {
		shift--
		tableSize *= 2
	}
	var table [maxTableSize]int

	// Iterate over the source bytes.
	var (
		s   int // The iterator position.
		t   int // The last position with the same hash as s.
		lit int // The start position of any pending literal bytes.
	)
	for s+3 < len(src) {
		// Update the hash table.
		b0, b1, b2, b3 := src[s], src[s+1], src[s+2], src[s+3]
		h := uint32(b0) | uint32(b1)<<8 | uint32(b2)<<16 | uint32(b3)<<24
		p := &table[(h*0x1e35a7bd)>>shift]
		// We need to to store values in [-1, inf) in table. To save
		// some initialization time, (re)use the table's zero value
		// and shift the values against this zero: add 1 on writes,
		// subtract 1 on reads.
		t, *p = *p-1, s+1
		// If t is invalid or src[s:s+4] differs from src[t:t+4], accumulate a literal byte.
		if t < 0 || s-t >= maxOffset || b0 != src[t] || b1 != src[t+1] || b2 != src[t+2] || b3 != src[t+3] {
			s++
			continue
		}
		// Otherwise, we have a match. First, emit any pending literal bytes.
		if lit != s {
			d += emitLiteral(dst[d:], src[lit:s])
		}
		// Extend the match to be as long as possible.
		s0 := s
		s, t = s+4, t+4
		for s < len(src) && src[s] == src[t] {
			s++
			t++
		}
		// Emit the copied bytes.
		d += emitCopy(dst[d:], s-t, s-s0)
		lit = s
	}

	// Emit any final pending literal bytes and return.
	if lit != len(src) {
		d += emitLiteral(dst[d:], src[lit:])
	}
	return dst[:d], nil
}

// MaxEncodedLen returns the maximum length of a snappy block, given its
// uncompressed length.
func MaxEncodedLen(srcLen int) int {
	// Compressed data can be defined as:
	//    compressed := item* literal*
	//    item       := literal* copy
	//
	// The trailing literal sequence has a space blowup of at most 62/60
	// since a literal of length 60 needs one tag byte + one extra byte
	// for length information.
	//
	// Item blowup is trickier to measure. Suppose the "copy" op copies
	// 4 bytes of data. Because of a special check in the encoding code,
	// we produce a 4-byte copy only if the offset is < 65536. Therefore
	// the copy op takes 3 bytes to encode, and this type of item leads
	// to at most the 62/60 blowup for representing literals.
	//
	// Suppose the "copy" op copies 5 bytes of data. If the offset is big
	// enough, it will take 5 bytes to encode the copy op. Therefore the
	// worst case here is a one-byte literal followed by a five-byte copy.
	// That is, 6 bytes of input turn into 7 bytes of "compressed" data.
	//
	// This last factor dominates the blowup, so the final estimate is:
	return 32 + srcLen + srcLen/6
}
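To make the bound concrete: a full 64 KiB chunk has srcLen = 65536, so the worst case is 32 + 65536 + 65536/6 = 76490 bytes, which is also how NewWriter below sizes its scratch buffer. A trivial check (illustrative only, same assumed import path):

package main

import (
	"fmt"

	"github.com/syndtr/gosnappy/snappy" // assumed import path
)

func main() {
	// Worst case for one framing chunk: 32 + 65536 + 65536/6 (integer division).
	fmt.Println(snappy.MaxEncodedLen(65536)) // 76490
}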
// NewWriter returns a new Writer that compresses to w, using the framing
// format described at
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
func NewWriter(w io.Writer) *Writer {
	return &Writer{
		w:   w,
		enc: make([]byte, MaxEncodedLen(maxUncompressedChunkLen)),
	}
}

// Writer is an io.Writer than can write Snappy-compressed bytes.
type Writer struct {
	w           io.Writer
	err         error
	enc         []byte
	buf         [checksumSize + chunkHeaderSize]byte
	wroteHeader bool
}

// Reset discards the writer's state and switches the Snappy writer to write to
// w. This permits reusing a Writer rather than allocating a new one.
func (w *Writer) Reset(writer io.Writer) {
	w.w = writer
	w.err = nil
	w.wroteHeader = false
}

// Write satisfies the io.Writer interface.
func (w *Writer) Write(p []byte) (n int, errRet error) {
	if w.err != nil {
		return 0, w.err
	}
	if !w.wroteHeader {
		copy(w.enc, magicChunk)
		if _, err := w.w.Write(w.enc[:len(magicChunk)]); err != nil {
			w.err = err
			return n, err
		}
		w.wroteHeader = true
	}
	for len(p) > 0 {
		var uncompressed []byte
		if len(p) > maxUncompressedChunkLen {
			uncompressed, p = p[:maxUncompressedChunkLen], p[maxUncompressedChunkLen:]
		} else {
			uncompressed, p = p, nil
		}
		checksum := crc(uncompressed)

		// Compress the buffer, discarding the result if the improvement
		// isn't at least 12.5%.
		chunkType := uint8(chunkTypeCompressedData)
		chunkBody, err := Encode(w.enc, uncompressed)
		if err != nil {
			w.err = err
			return n, err
		}
		if len(chunkBody) >= len(uncompressed)-len(uncompressed)/8 {
			chunkType, chunkBody = chunkTypeUncompressedData, uncompressed
		}

		chunkLen := 4 + len(chunkBody)
		w.buf[0] = chunkType
		w.buf[1] = uint8(chunkLen >> 0)
		w.buf[2] = uint8(chunkLen >> 8)
		w.buf[3] = uint8(chunkLen >> 16)
		w.buf[4] = uint8(checksum >> 0)
		w.buf[5] = uint8(checksum >> 8)
		w.buf[6] = uint8(checksum >> 16)
		w.buf[7] = uint8(checksum >> 24)
		if _, err = w.w.Write(w.buf[:]); err != nil {
			w.err = err
			return n, err
		}
		if _, err = w.w.Write(chunkBody); err != nil {
			w.err = err
			return n, err
		}
		n += len(uncompressed)
	}
	return n, nil
}
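The 12.5% rule above means that incompressible input costs only the framing overhead, because each chunk falls back to chunkTypeUncompressedData. A small sketch (illustrative only, same assumed import path; the printed overhead is the expected value for random input, not a guarantee):

package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
	"log"

	"github.com/syndtr/gosnappy/snappy" // assumed import path
)

func main() {
	// 64 KiB of random bytes: Encode cannot save 12.5%, so the chunk is
	// stored uncompressed and only the framing overhead is added.
	src := make([]byte, 65536)
	if _, err := rand.Read(src); err != nil {
		log.Fatal(err)
	}

	var buf bytes.Buffer
	if _, err := snappy.NewWriter(&buf).Write(src); err != nil {
		log.Fatal(err)
	}

	// Expected overhead: 10-byte stream identifier + 4-byte chunk header + 4-byte CRC.
	fmt.Println(buf.Len() - len(src)) // 18
}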
@@ -1,68 +0,0 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Package snappy implements the snappy block-based compression format.
// It aims for very high speeds and reasonable compression.
//
// The C++ snappy implementation is at http://code.google.com/p/snappy/
package snappy

import (
	"hash/crc32"
)

/*
Each encoded block begins with the varint-encoded length of the decoded data,
followed by a sequence of chunks. Chunks begin and end on byte boundaries. The
first byte of each chunk is broken into its 2 least and 6 most significant bits
called l and m: l ranges in [0, 4) and m ranges in [0, 64). l is the chunk tag.
Zero means a literal tag. All other values mean a copy tag.

For literal tags:
  - If m < 60, the next 1 + m bytes are literal bytes.
  - Otherwise, let n be the little-endian unsigned integer denoted by the next
    m - 59 bytes. The next 1 + n bytes after that are literal bytes.

For copy tags, length bytes are copied from offset bytes ago, in the style of
Lempel-Ziv compression algorithms. In particular:
  - For l == 1, the offset ranges in [0, 1<<11) and the length in [4, 12).
    The length is 4 + the low 3 bits of m. The high 3 bits of m form bits 8-10
    of the offset. The next byte is bits 0-7 of the offset.
  - For l == 2, the offset ranges in [0, 1<<16) and the length in [1, 65).
    The length is 1 + m. The offset is the little-endian unsigned integer
    denoted by the next 2 bytes.
  - For l == 3, this tag is a legacy format that is no longer supported.
*/
const (
	tagLiteral = 0x00
	tagCopy1   = 0x01
	tagCopy2   = 0x02
	tagCopy4   = 0x03
)

const (
	checksumSize    = 4
	chunkHeaderSize = 4
	magicChunk      = "\xff\x06\x00\x00" + magicBody
	magicBody       = "sNaPpY"
	// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt says
	// that "the uncompressed data in a chunk must be no longer than 65536 bytes".
	maxUncompressedChunkLen = 65536
)

const (
	chunkTypeCompressedData   = 0x00
	chunkTypeUncompressedData = 0x01
	chunkTypePadding          = 0xfe
	chunkTypeStreamIdentifier = 0xff
)

var crcTable = crc32.MakeTable(crc32.Castagnoli)

// crc implements the checksum specified in section 3 of
// https://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
func crc(b []byte) uint32 {
	c := crc32.Update(0, crcTable, b)
	return uint32(c>>15|c<<17) + 0xa282ead8
}
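The tag layout described in the comment above can be checked by hand: a 5-byte literal such as "hello" encodes as the varint block length 0x05, followed by one literal chunk whose tag byte is (5-1)<<2 | tagLiteral = 0x10, followed by the raw bytes. A minimal sketch (illustrative only, same assumed import path):

package main

import (
	"fmt"

	"github.com/syndtr/gosnappy/snappy" // assumed import path
)

func main() {
	// Hand-assembled block: varint decoded length (5), then a single literal
	// chunk: tag byte 0x10 = (length-1)<<2 | tagLiteral, then the 5 literal bytes.
	block := []byte{0x05, 0x10, 'h', 'e', 'l', 'l', 'o'}

	out, err := snappy.Decode(nil, block)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", out) // hello
}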
@@ -1,364 +0,0 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package snappy

import (
	"bytes"
	"flag"
	"fmt"
	"io"
	"io/ioutil"
	"math/rand"
	"net/http"
	"os"
	"path/filepath"
	"strings"
	"testing"
)

var (
	download = flag.Bool("download", false, "If true, download any missing files before running benchmarks")
	testdata = flag.String("testdata", "testdata", "Directory containing the test data")
)

func roundtrip(b, ebuf, dbuf []byte) error {
	e, err := Encode(ebuf, b)
	if err != nil {
		return fmt.Errorf("encoding error: %v", err)
	}
	d, err := Decode(dbuf, e)
	if err != nil {
		return fmt.Errorf("decoding error: %v", err)
	}
	if !bytes.Equal(b, d) {
		return fmt.Errorf("roundtrip mismatch:\n\twant %v\n\tgot %v", b, d)
	}
	return nil
}

func TestEmpty(t *testing.T) {
	if err := roundtrip(nil, nil, nil); err != nil {
		t.Fatal(err)
	}
}

func TestSmallCopy(t *testing.T) {
	for _, ebuf := range [][]byte{nil, make([]byte, 20), make([]byte, 64)} {
		for _, dbuf := range [][]byte{nil, make([]byte, 20), make([]byte, 64)} {
			for i := 0; i < 32; i++ {
				s := "aaaa" + strings.Repeat("b", i) + "aaaabbbb"
				if err := roundtrip([]byte(s), ebuf, dbuf); err != nil {
					t.Errorf("len(ebuf)=%d, len(dbuf)=%d, i=%d: %v", len(ebuf), len(dbuf), i, err)
				}
			}
		}
	}
}

func TestSmallRand(t *testing.T) {
	rng := rand.New(rand.NewSource(27354294))
	for n := 1; n < 20000; n += 23 {
		b := make([]byte, n)
		for i := range b {
			b[i] = uint8(rng.Uint32())
		}
		if err := roundtrip(b, nil, nil); err != nil {
			t.Fatal(err)
		}
	}
}

func TestSmallRegular(t *testing.T) {
	for n := 1; n < 20000; n += 23 {
		b := make([]byte, n)
		for i := range b {
			b[i] = uint8(i%10 + 'a')
		}
		if err := roundtrip(b, nil, nil); err != nil {
			t.Fatal(err)
		}
	}
}

func cmp(a, b []byte) error {
	if len(a) != len(b) {
		return fmt.Errorf("got %d bytes, want %d", len(a), len(b))
	}
	for i := range a {
		if a[i] != b[i] {
			return fmt.Errorf("byte #%d: got 0x%02x, want 0x%02x", i, a[i], b[i])
		}
	}
	return nil
}

func TestFramingFormat(t *testing.T) {
	// src is comprised of alternating 1e5-sized sequences of random
	// (incompressible) bytes and repeated (compressible) bytes. 1e5 was chosen
	// because it is larger than maxUncompressedChunkLen (64k).
	src := make([]byte, 1e6)
	rng := rand.New(rand.NewSource(1))
	for i := 0; i < 10; i++ {
		if i%2 == 0 {
			for j := 0; j < 1e5; j++ {
				src[1e5*i+j] = uint8(rng.Intn(256))
			}
		} else {
			for j := 0; j < 1e5; j++ {
				src[1e5*i+j] = uint8(i)
			}
		}
	}

	buf := new(bytes.Buffer)
	if _, err := NewWriter(buf).Write(src); err != nil {
		t.Fatalf("Write: encoding: %v", err)
	}
	dst, err := ioutil.ReadAll(NewReader(buf))
	if err != nil {
		t.Fatalf("ReadAll: decoding: %v", err)
	}
	if err := cmp(dst, src); err != nil {
		t.Fatal(err)
	}
}

func TestReaderReset(t *testing.T) {
	gold := bytes.Repeat([]byte("All that is gold does not glitter,\n"), 10000)
	buf := new(bytes.Buffer)
	if _, err := NewWriter(buf).Write(gold); err != nil {
		t.Fatalf("Write: %v", err)
	}
	encoded, invalid, partial := buf.String(), "invalid", "partial"
	r := NewReader(nil)
	for i, s := range []string{encoded, invalid, partial, encoded, partial, invalid, encoded, encoded} {
		if s == partial {
			r.Reset(strings.NewReader(encoded))
			if _, err := r.Read(make([]byte, 101)); err != nil {
				t.Errorf("#%d: %v", i, err)
				continue
			}
			continue
		}
		r.Reset(strings.NewReader(s))
		got, err := ioutil.ReadAll(r)
		switch s {
		case encoded:
			if err != nil {
				t.Errorf("#%d: %v", i, err)
				continue
			}
			if err := cmp(got, gold); err != nil {
				t.Errorf("#%d: %v", i, err)
				continue
			}
		case invalid:
			if err == nil {
				t.Errorf("#%d: got nil error, want non-nil", i)
				continue
			}
		}
	}
}

func TestWriterReset(t *testing.T) {
	gold := bytes.Repeat([]byte("Not all those who wander are lost;\n"), 10000)
	var gots, wants [][]byte
	const n = 20
	w, failed := NewWriter(nil), false
	for i := 0; i <= n; i++ {
		buf := new(bytes.Buffer)
		w.Reset(buf)
		want := gold[:len(gold)*i/n]
		if _, err := w.Write(want); err != nil {
			t.Errorf("#%d: Write: %v", i, err)
			failed = true
			continue
		}
		got, err := ioutil.ReadAll(NewReader(buf))
		if err != nil {
			t.Errorf("#%d: ReadAll: %v", i, err)
			failed = true
			continue
		}
		gots = append(gots, got)
		wants = append(wants, want)
	}
	if failed {
		return
	}
	for i := range gots {
		if err := cmp(gots[i], wants[i]); err != nil {
			t.Errorf("#%d: %v", i, err)
		}
	}
}

func benchDecode(b *testing.B, src []byte) {
	encoded, err := Encode(nil, src)
	if err != nil {
		b.Fatal(err)
	}
	// Bandwidth is in amount of uncompressed data.
	b.SetBytes(int64(len(src)))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Decode(src, encoded)
	}
}

func benchEncode(b *testing.B, src []byte) {
	// Bandwidth is in amount of uncompressed data.
	b.SetBytes(int64(len(src)))
	dst := make([]byte, MaxEncodedLen(len(src)))
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		Encode(dst, src)
	}
}

func readFile(b testing.TB, filename string) []byte {
	src, err := ioutil.ReadFile(filename)
	if err != nil {
		b.Fatalf("failed reading %s: %s", filename, err)
	}
	if len(src) == 0 {
		b.Fatalf("%s has zero length", filename)
	}
	return src
}

// expand returns a slice of length n containing repeated copies of src.
func expand(src []byte, n int) []byte {
	dst := make([]byte, n)
	for x := dst; len(x) > 0; {
		i := copy(x, src)
		x = x[i:]
	}
	return dst
}

func benchWords(b *testing.B, n int, decode bool) {
	// Note: the file is OS-language dependent so the resulting values are not
	// directly comparable for non-US-English OS installations.
	data := expand(readFile(b, "/usr/share/dict/words"), n)
	if decode {
		benchDecode(b, data)
	} else {
		benchEncode(b, data)
	}
}

func BenchmarkWordsDecode1e3(b *testing.B) { benchWords(b, 1e3, true) }
func BenchmarkWordsDecode1e4(b *testing.B) { benchWords(b, 1e4, true) }
func BenchmarkWordsDecode1e5(b *testing.B) { benchWords(b, 1e5, true) }
func BenchmarkWordsDecode1e6(b *testing.B) { benchWords(b, 1e6, true) }
func BenchmarkWordsEncode1e3(b *testing.B) { benchWords(b, 1e3, false) }
func BenchmarkWordsEncode1e4(b *testing.B) { benchWords(b, 1e4, false) }
func BenchmarkWordsEncode1e5(b *testing.B) { benchWords(b, 1e5, false) }
func BenchmarkWordsEncode1e6(b *testing.B) { benchWords(b, 1e6, false) }

// testFiles' values are copied directly from
// https://raw.githubusercontent.com/google/snappy/master/snappy_unittest.cc
// The label field is unused in snappy-go.
var testFiles = []struct {
	label    string
	filename string
}{
	{"html", "html"},
	{"urls", "urls.10K"},
	{"jpg", "fireworks.jpeg"},
	{"jpg_200", "fireworks.jpeg"},
	{"pdf", "paper-100k.pdf"},
	{"html4", "html_x_4"},
	{"txt1", "alice29.txt"},
	{"txt2", "asyoulik.txt"},
	{"txt3", "lcet10.txt"},
	{"txt4", "plrabn12.txt"},
	{"pb", "geo.protodata"},
	{"gaviota", "kppkn.gtb"},
}

// The test data files are present at this canonical URL.
const baseURL = "https://raw.githubusercontent.com/google/snappy/master/testdata/"

func downloadTestdata(basename string) (errRet error) {
	filename := filepath.Join(*testdata, basename)
	if stat, err := os.Stat(filename); err == nil && stat.Size() != 0 {
		return nil
	}

	if !*download {
		return fmt.Errorf("test data not found; skipping benchmark without the -download flag")
	}
	// Download the official snappy C++ implementation reference test data
	// files for benchmarking.
	if err := os.Mkdir(*testdata, 0777); err != nil && !os.IsExist(err) {
		return fmt.Errorf("failed to create testdata: %s", err)
	}

	f, err := os.Create(filename)
	if err != nil {
		return fmt.Errorf("failed to create %s: %s", filename, err)
	}
	defer f.Close()
	defer func() {
		if errRet != nil {
			os.Remove(filename)
		}
	}()
	url := baseURL + basename
	resp, err := http.Get(url)
	if err != nil {
		return fmt.Errorf("failed to download %s: %s", url, err)
	}
	defer resp.Body.Close()
	if s := resp.StatusCode; s != http.StatusOK {
		return fmt.Errorf("downloading %s: HTTP status code %d (%s)", url, s, http.StatusText(s))
	}
	_, err = io.Copy(f, resp.Body)
	if err != nil {
		return fmt.Errorf("failed to download %s to %s: %s", url, filename, err)
	}
	return nil
}

func benchFile(b *testing.B, n int, decode bool) {
	if err := downloadTestdata(testFiles[n].filename); err != nil {
		b.Fatalf("failed to download testdata: %s", err)
	}
	data := readFile(b, filepath.Join(*testdata, testFiles[n].filename))
	if decode {
		benchDecode(b, data)
	} else {
		benchEncode(b, data)
	}
}

// Naming convention is kept similar to what snappy's C++ implementation uses.
func Benchmark_UFlat0(b *testing.B)  { benchFile(b, 0, true) }
func Benchmark_UFlat1(b *testing.B)  { benchFile(b, 1, true) }
func Benchmark_UFlat2(b *testing.B)  { benchFile(b, 2, true) }
func Benchmark_UFlat3(b *testing.B)  { benchFile(b, 3, true) }
func Benchmark_UFlat4(b *testing.B)  { benchFile(b, 4, true) }
func Benchmark_UFlat5(b *testing.B)  { benchFile(b, 5, true) }
func Benchmark_UFlat6(b *testing.B)  { benchFile(b, 6, true) }
func Benchmark_UFlat7(b *testing.B)  { benchFile(b, 7, true) }
func Benchmark_UFlat8(b *testing.B)  { benchFile(b, 8, true) }
func Benchmark_UFlat9(b *testing.B)  { benchFile(b, 9, true) }
func Benchmark_UFlat10(b *testing.B) { benchFile(b, 10, true) }
func Benchmark_UFlat11(b *testing.B) { benchFile(b, 11, true) }
func Benchmark_ZFlat0(b *testing.B)  { benchFile(b, 0, false) }
func Benchmark_ZFlat1(b *testing.B)  { benchFile(b, 1, false) }
func Benchmark_ZFlat2(b *testing.B)  { benchFile(b, 2, false) }
func Benchmark_ZFlat3(b *testing.B)  { benchFile(b, 3, false) }
func Benchmark_ZFlat4(b *testing.B)  { benchFile(b, 4, false) }
func Benchmark_ZFlat5(b *testing.B)  { benchFile(b, 5, false) }
func Benchmark_ZFlat6(b *testing.B)  { benchFile(b, 6, false) }
func Benchmark_ZFlat7(b *testing.B)  { benchFile(b, 7, false) }
func Benchmark_ZFlat8(b *testing.B)  { benchFile(b, 8, false) }
func Benchmark_ZFlat9(b *testing.B)  { benchFile(b, 9, false) }
func Benchmark_ZFlat10(b *testing.B) { benchFile(b, 10, false) }
func Benchmark_ZFlat11(b *testing.B) { benchFile(b, 11, false) }