/*
Copyright 2011 Google Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package blobserver

import (
	"errors"
	"io"
	"net/http"
	"os"
	"time"

	"camlistore.org/pkg/blob"
	"camlistore.org/pkg/constants"
	"camlistore.org/pkg/context"
)

// MaxBlobSize is the maximum size of a single blob in Camlistore, in bytes.
const MaxBlobSize = constants.MaxBlobSize
var ErrCorruptBlob = errors.New("corrupt blob; digest doesn't match")
// ErrNotImplemented should be returned by methods whose functionality
// is not implemented.
var ErrNotImplemented = errors.New("not implemented")
// BlobReceiver is the interface for receiving blobs.
type BlobReceiver interface {
	// ReceiveBlob accepts a newly uploaded blob and writes it to
	// permanent storage.
	//
	// Implementations of BlobReceiver downstream of the HTTP
	// server can trust that the source isn't larger than
	// MaxBlobSize and that its digest matches the provided blob
	// ref. (If not, the read of the source will fail before EOF.)
	//
	// To ensure those guarantees, callers of ReceiveBlob should
	// not call ReceiveBlob directly but instead use either
	// blobserver.Receive or blobserver.ReceiveString, which also
	// take care of notifying the BlobReceiver's "BlobHub"
	// notification bus for observers.
	ReceiveBlob(br blob.Ref, source io.Reader) (blob.SizedRef, error)
}
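
// A hypothetical usage sketch (the variable names here are illustrative,
// not part of this package): callers go through the blobserver.Receive
// wrapper, which provides the guarantees described above, rather than
// calling ReceiveBlob directly:
//
//	ref := blob.SHA1FromString(contents)
//	sb, err := blobserver.Receive(dst, ref, strings.NewReader(contents))
//	if err != nil {
//		log.Fatalf("receive failed: %v", err)
//	}
//	log.Printf("stored %v, %d bytes", sb.Ref, sb.Size)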
type BlobStatter interface {
	// StatBlobs checks for the existence of blobs, writing their sizes
	// (if found) back to the dest channel, and returning an error
	// or nil. StatBlobs should NOT close the channel.
	//
	// TODO(bradfitz): redefine this to close the channel? Or document
	// better what the synchronization rules are.
	StatBlobs(dest chan<- blob.SizedRef, blobs []blob.Ref) error
}
// StatBlob calls bs.StatBlobs to stat a single blob.
// If the blob is not found, the returned error is os.ErrNotExist.
func StatBlob(bs BlobStatter, br blob.Ref) (sb blob.SizedRef, err error) {
	c := make(chan blob.SizedRef, 1)
	err = bs.StatBlobs(c, []blob.Ref{br})
	if err != nil {
		return
	}
	select {
	case sb = <-c:
	default:
		err = os.ErrNotExist
	}
	return
}
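
// A hypothetical caller sketch: StatBlob lets callers distinguish a
// missing blob (os.ErrNotExist) from a failed stat:
//
//	sb, err := blobserver.StatBlob(bs, ref)
//	switch {
//	case err == os.ErrNotExist:
//		// not on the server yet; upload it
//	case err != nil:
//		// the stat itself failed; retry or report
//	default:
//		// already present, with size sb.Size
//	}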
// StatReceiver combines the receive and stat operations.
type StatReceiver interface {
	BlobReceiver
	BlobStatter
}
type BlobEnumerator interface {
	// EnumerateBlobs sends at most limit SizedRef values to dest,
	// in sorted order, as long as they are lexicographically greater
	// than after (if provided).
	// limit will be supplied and sanity checked by the caller.
	// EnumerateBlobs must close the channel, even if limit
	// was hit and more blobs remain, an error is returned, or
	// the ctx is canceled.
	EnumerateBlobs(ctx *context.Context,
		dest chan<- blob.SizedRef,
		after string,
		limit int) error
}
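
// A minimal implementation sketch (memStore and its sorted helper are
// assumptions for illustration, not part of this package), showing the
// contract that dest must be closed on every return path:
//
//	func (s *memStore) EnumerateBlobs(ctx *context.Context, dest chan<- blob.SizedRef, after string, limit int) error {
//		defer close(dest) // close unconditionally: on success, error, or cancelation
//		n := 0
//		for _, sb := range s.sorted() { // SizedRefs in sorted (lexicographic) order
//			if n == limit {
//				break
//			}
//			if sb.Ref.String() <= after {
//				continue
//			}
//			select {
//			case dest <- sb:
//				n++
//			case <-ctx.Done():
//				return context.ErrCanceled
//			}
//		}
//		return nil
//	}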
// BlobStreamer is an optional interface that may be implemented by
// Storage implementations.
type BlobStreamer interface {
	// StreamBlobs sends blobs to dest in an unspecified order. It is
	// expected that a Storage implementation implementing
	// BlobStreamer will send blobs to dest in the most efficient
	// order possible.
	//
	// The provided continuation token resumes the stream from a
	// point. To start from the beginning, send the empty string.
	// The token is opaque and must never be interpreted; its
	// format may change between versions of the server.
	//
	// If the context is canceled, the error value is
	// context.ErrCanceled and the nextContinueToken is a
	// continuation token to resume exactly _at_ (not after) the
	// last value sent. This lets callers receive a blob, decide
	// its size crosses a threshold, and resume at that blob at a
	// later point. Callers should thus usually pass an unbuffered
	// channel, although it is not an error to do otherwise, if
	// the caller is careful.
	//
	// StreamBlobs must unconditionally close dest before
	// returning, and it must return context.ErrCanceled if
	// ctx.Done() becomes readable.
	//
	// When StreamBlobs reaches the end, the return value is ("", nil).
	// The nextContinueToken must only ever be non-empty if err is
	// context.ErrCanceled.
	StreamBlobs(ctx *context.Context, dest chan<- *blob.Blob, contToken string) (nextContinueToken string, err error)
}
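
// A hypothetical consumer sketch (process and the token plumbing are
// illustrative assumptions): an unbuffered dest keeps the continuation
// token exact, per the comment above:
//
//	dest := make(chan *blob.Blob)
//	go func() {
//		for b := range dest {
//			process(b)
//		}
//	}()
//	nextToken, err := bs.StreamBlobs(ctx, dest, token)
//	if err == context.ErrCanceled {
//		// nextToken resumes exactly at (not after) the last blob sent
//	} else if err == nil {
//		// nextToken is ""; the stream reached the end
//	}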
// Cache is the minimal interface expected of a blob cache.
type Cache interface {
	blob.Fetcher
	BlobReceiver
	BlobStatter
}
type BlobReceiveConfiger interface {
	BlobReceiver
	Configer
}
type Config struct {
	Writable    bool
	Readable    bool
	Deletable   bool
	CanLongPoll bool

	// URLBase is the "http://host:port" and optional path (but without
	// a trailing slash) to which "/camli/*" is appended.
	URLBase       string
	HandlerFinder FindHandlerByTyper
}
type BlobRemover interface {
	// RemoveBlobs removes 0 or more blobs. Removal of
	// non-existent items isn't an error. Returns failure if any
	// items existed but failed to be deleted.
	// ErrNotImplemented may be returned for storage types not
	// implementing removal.
	RemoveBlobs(blobs []blob.Ref) error
}
// Storage is the interface that must be implemented by a blobserver
// storage type. (e.g. localdisk, s3, encrypt, shard, replica, remote)
type Storage interface {
	blob.Fetcher
	BlobReceiver
	BlobStatter
	BlobEnumerator
	BlobRemover
}
// FetcherEnumerator combines the fetch and enumerate operations.
type FetcherEnumerator interface {
	blob.Fetcher
	BlobEnumerator
}
// StorageHandler is a storage implementation that also exports an HTTP
// status page.
type StorageHandler interface {
	Storage
	http.Handler
}
// ShutdownStorage is an optional interface for storage implementations
// which can be asked to shut down cleanly. Regardless, all
// implementations should be able to survive crashes without data loss.
type ShutdownStorage interface {
	Storage
	io.Closer
}
// A GenerationNotSupportedError explains why a Storage
// value implemented the Generationer interface but failed due
// to a wrapped Storage value not implementing the interface.
type GenerationNotSupportedError string

func (s GenerationNotSupportedError) Error() string { return string(s) }
/*
The optional Generationer interface is an optimization and paranoia
facility for clients which can be implemented by Storage
implementations.

If the client sees the same random string in multiple upload sessions,
it assumes that the blobserver still has all the same blobs, and also
that it's the same server. This mechanism is not fundamental to
Camlistore's operation: the client could also check each blob before
uploading, or enumerate all blobs from the server too. This is purely
an optimization so clients can mix this value into their "is this file
uploaded?" local cache keys.
*/
type Generationer interface {
	// StorageGeneration returns a Storage's initialization time
	// and its unique random string (or UUID). Implementations
	// should call ResetStorageGeneration on demand if no
	// information is known.
	// The error will be of type GenerationNotSupportedError if an
	// underlying storage target doesn't support the Generationer
	// interface.
	StorageGeneration() (initTime time.Time, random string, err error)

	// ResetStorageGeneration deletes the information returned by
	// StorageGeneration and re-generates it.
	ResetStorageGeneration() error
}
type Configer interface {
	Config() *Config
}
type StorageConfiger interface {
	Storage
	Configer
}
// MaxEnumerateConfig is an optional interface implemented by Storage
// implementations to advertise the maximum number of items they can
// enumerate at once.
type MaxEnumerateConfig interface {
	Storage

	// MaxEnumerate returns the max that this storage interface is
	// capable of enumerating at once.
	MaxEnumerate() int
}
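
// A hypothetical caller sketch: clamp an enumeration limit to what the
// storage advertises, when it advertises one (sto and wanted are
// illustrative names):
//
//	limit := wanted
//	if mec, ok := sto.(MaxEnumerateConfig); ok && mec.MaxEnumerate() < limit {
//		limit = mec.MaxEnumerate()
//	}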