mirror of https://github.com/perkeep/perkeep.git
<h1>Camlistore Overview</h1>
<p>Camlistore is your <b>personal storage system for life</b>.</p>
<p>What does that mean?</p>
<p>Throughout our lives, we all continue to generate content, whether
that's writing documents, taking photos, writing comments online,
liking our friends' posts on social networks, etc. Our content is
typically spread between a mix of different companies' servers ("The
Cloud") and our own hardware (laptops, phones, etc.). All of these
things are prone to failure: companies go out of business, change
ownership, or kill products. Personal hard drives fail, and laptops and
phones get dropped.</p>
<p>It would be nice if we were a bit more in control. At the very least,
it would be nice if we had a reliable backup of all our content. Once we
have all our content, it's then nice to search it, view it, and
directly serve it or share it out to others (publicly or with select
ACLs), regardless of the original host's policies.</p>
<p>Camlistore is a system to do all that.</p>
<p>While Camlistore can store files like a traditional filesystem
(think: "directories", "files", "filenames"), it's specialized in
storing higher-level objects, which can represent anything.</p>
<p>In addition to an implementation, Camlistore is also a schema for
how to represent many types of content. Much JSON is used.</p>
<p>Because every type of content in Camlistore is represented using
content-addressable blobs (even metadata), it's impossible to
"overwrite" things. It also means it's easy for Camlistore to sync in
any direction between your devices and Camlistore storage servers, without
versioning or conflict resolution issues.</p>
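<p>To make the content-addressing idea concrete, here is a minimal Go
sketch. It is not Camlistore's code: the <code>store</code> type, the
<code>blobRef</code> and <code>syncInto</code> helpers, the choice of
SHA-256, and the <code>sha256-</code> name prefix are all assumptions made
for illustration. It shows why identical bytes always get the same name and
why syncing in either direction only ever means copying blobs the other
side lacks.</p>

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// blobRef names a blob by the digest of its bytes. The hash function and
// the "sha256-" prefix are assumptions for this sketch.
func blobRef(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256-" + hex.EncodeToString(sum[:])
}

// store is a toy in-memory blob store keyed by blobRef (hypothetical type).
type store map[string][]byte

// put saves data under its content-derived name and returns that name.
// Putting the same bytes again changes nothing, so nothing is overwritten.
func (s store) put(data []byte) string {
	ref := blobRef(data)
	if _, ok := s[ref]; !ok {
		s[ref] = append([]byte(nil), data...)
	}
	return ref
}

// syncInto copies every blob that dst lacks from src and reports how many
// blobs it copied. No conflict resolution is needed: a name either exists
// on both sides with identical bytes, or it is missing on one side.
func syncInto(dst, src store) int {
	copied := 0
	for ref, data := range src {
		if _, ok := dst[ref]; !ok {
			dst[ref] = data
			copied++
		}
	}
	return copied
}

func main() {
	laptop, server := store{}, store{}
	laptop.put([]byte("photo bytes"))
	laptop.put([]byte("shared note"))
	server.put([]byte("shared note")) // same bytes, same name on both sides
	server.put([]byte("old document"))

	fmt.Println(syncInto(server, laptop)) // prints 1: only the photo was missing
	fmt.Println(syncInto(laptop, server)) // prints 1: only the old document was missing
	fmt.Println(len(laptop), len(server)) // prints 3 3
}
```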
<p>Camlistore can represent immutable information (like snapshots
of filesystem trees), but it can also represent mutable
information. Mutable information is represented by storing immutable,
timestamped, GPG-signed blobs, each representing a mutation request. The
current state of an object is just the application of all mutation
blobs up until that point in time. Thus all history is recorded and
you can look at an object as it existed at any point in time, just by
ignoring mutations after a certain point.</p>
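<p>The replay idea can be sketched in a few lines of Go. This is an
illustration under stated assumptions, not Camlistore's schema: the
<code>mutation</code> struct, its fields, the <code>stateAt</code> and
<code>day</code> helpers, and the sample dates are all hypothetical, and
signature verification is left out entirely.</p>

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// mutation is a hypothetical, simplified stand-in for a signed mutation
// blob. Real blobs would also carry a signature, which this sketch omits.
type mutation struct {
	When  time.Time
	Attr  string
	Value string // an empty value means "delete the attribute" in this sketch
}

// stateAt replays, in timestamp order, every mutation made at or before
// the given time and returns the resulting attributes.
func stateAt(muts []mutation, at time.Time) map[string]string {
	sorted := append([]mutation(nil), muts...) // never reorder the caller's slice
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].When.Before(sorted[j].When)
	})
	state := map[string]string{}
	for _, m := range sorted {
		if m.When.After(at) {
			break // ignore everything after the requested point in time
		}
		if m.Value == "" {
			delete(state, m.Attr)
		} else {
			state[m.Attr] = m.Value
		}
	}
	return state
}

// day builds a made-up timestamp for the example.
func day(d int) time.Time {
	return time.Date(2013, time.January, d, 0, 0, 0, 0, time.UTC)
}

func main() {
	muts := []mutation{
		{day(1), "title", "Draft"},
		{day(3), "title", "Final"},
		{day(2), "tag", "vacation"},
	}
	fmt.Println(stateAt(muts, day(2))["title"]) // prints Draft
	fmt.Println(stateAt(muts, day(5))["title"]) // prints Final
}
```

<p>Because the mutations themselves are never modified, asking for an
earlier time simply stops the replay sooner.</p>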
<p>Although Camlistore uses parts of the OpenPGP spec, users don't need
to use the GnuPG tools or go to key signing events or anything dorky like
that.</p>
<p>You are in control of your Camlistore server(s), whether you run
your own copy or use a hosted version. In the latter case, you're at
least logically in control, analogous to how you're in charge of your
email (and it's your private repository of all your email), even if a
big company runs your email for you. Of course, you can store all
your email in Camlistore too, but Gmail's interface and search are much
better.</p>
<p>Responsible (or paranoid) users would set up their Camlistore
servers to cross-replicate and mirror between different big companies'
cloud platforms if they're not able to run their own servers in
different geographical areas (e.g. cross-replicating between
big disks kept by different members of a family).</p>
<p>A Camlistore server comprises several parts, all of which are
optional and can be turned on or off per-instance:</p>
<ul>
<li><b>Storage</b>: the most basic part of a Camlistore server is
storage. This is anything which can Get or Put a blob (named by its
content-addressable digest), and enumerate those blobs, sorted by
their digest. The only metadata a storage server needs to track
per-blob is its size. (No other metadata is permitted, as it's
stored elsewhere.) Implementations are trivial and exist for local
disk, Amazon S3, Google Storage, etc.</li>
<li><b>Index</b>: the index is implemented in terms of the Storage
interface, so it can be replicated to, synchronously or asynchronously,
from other storage types. Putting a blob indexes it, enumerating
returns what has been indexed, and getting isn't supported. An
abstraction within Camlistore similar to the storage abstractions
means that any underlying system which can store keys and values and
can scan in sorted order from a point can be used to store
Camlistore's indexes. Implementations are likewise trivial and exist
for memory (for development), SQLite, LevelDB, MySQL, Postgres,
MongoDB, App Engine, etc. Dynamo and others would be trivial to add.</li>
<li><b>Search</b>: pointing Camlistore's search handlers at an index
means you can search for your things. It's worth pointing out that
you can afford to lose your index at any time. If the database holding
your index becomes corrupt, just delete it all and re-replicate from your
storage to your index: it'll be re-indexed and search will work again.</li>
<li><b>User Interface</b>: the web user interface lets you click
around and view your content, and do searches. Of course, you could
also just use the command-line tools or API.</li>
</ul>
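<p>The Storage and Index parts above can be sketched together in Go.
Everything named here is an assumption for illustration and not
Camlistore's actual API: the <code>Storage</code> interface, the
<code>memStorage</code> and <code>wordIndex</code> types, the
<code>replicate</code> helper, and the use of SHA-256 for blob names. The
toy index only splits blobs into words, but it shows the shape described:
Put indexes, Enumerate lists what was indexed, Get is unsupported, and a
lost index is rebuilt by replicating from storage again.</p>

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// Storage is a hypothetical interface for this sketch, not Camlistore's real API.
type Storage interface {
	Put(data []byte) (ref string)
	Get(ref string) (data []byte, ok bool)
	Enumerate() []string // refs sorted by digest
}

// refOf names a blob by its digest (SHA-256 is an assumption here).
func refOf(data []byte) string {
	sum := sha256.Sum256(data)
	return "sha256-" + hex.EncodeToString(sum[:])
}

// memStorage keeps blobs in memory, keyed by their digest-based name.
type memStorage map[string][]byte

func (m memStorage) Put(data []byte) string {
	ref := refOf(data)
	m[ref] = append([]byte(nil), data...)
	return ref
}

func (m memStorage) Get(ref string) ([]byte, bool) {
	data, ok := m[ref]
	return data, ok
}

func (m memStorage) Enumerate() []string {
	refs := make([]string, 0, len(m))
	for ref := range m {
		refs = append(refs, ref)
	}
	sort.Strings(refs)
	return refs
}

// wordIndex is a toy index exposed through the same Storage interface:
// Put indexes a blob, Enumerate lists indexed refs, Get is unsupported.
type wordIndex struct {
	seen  memStorage          // refs already indexed (bytes are not kept)
	words map[string][]string // lower-cased word -> refs containing it
}

func newWordIndex() *wordIndex {
	return &wordIndex{seen: memStorage{}, words: map[string][]string{}}
}

func (ix *wordIndex) Put(data []byte) string {
	ref := refOf(data)
	if _, done := ix.seen[ref]; done {
		return ref
	}
	ix.seen[ref] = nil // remember the ref only
	for _, w := range strings.Fields(strings.ToLower(string(data))) {
		// A ref is indexed once, so a repeated word shows up as the last entry.
		if refs := ix.words[w]; len(refs) == 0 || refs[len(refs)-1] != ref {
			ix.words[w] = append(refs, ref)
		}
	}
	return ref
}

func (ix *wordIndex) Get(string) ([]byte, bool) { return nil, false }
func (ix *wordIndex) Enumerate() []string       { return ix.seen.Enumerate() }

// Search returns the refs of blobs containing word.
func (ix *wordIndex) Search(word string) []string { return ix.words[strings.ToLower(word)] }

// replicate copies every blob from src to dst in digest order.
func replicate(src, dst Storage) {
	for _, ref := range src.Enumerate() {
		if data, ok := src.Get(ref); ok {
			dst.Put(data)
		}
	}
}

func main() {
	store := memStorage{}
	store.Put([]byte("holiday photos from Lisbon"))
	store.Put([]byte("tax documents"))

	index := newWordIndex()
	replicate(store, index)
	fmt.Println(len(index.Search("photos"))) // prints 1

	index = newWordIndex() // simulate losing the index
	replicate(store, index)
	fmt.Println(len(index.Search("photos"))) // prints 1 again
}
```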