Start of an overview page, describing the project, plus more misc doc/website cleanup

Change-Id: Ic456e73ab3adf549e7280c8ee1c9d6668719720e
2013-06-12 07:22:29 -07:00 · 2013-06-12 07:22:29 -07:00 · fc2597e01f
parent 11c32c142b
commit fc2597e01f
6 changed files with 132 additions and 20 deletions
--- a/doc/schema/objects/keep.txt
+++ b/doc/schema/objects/keep.txt
@@ -1,4 +1,4 @@
-A signed "keep" edge for GC/indexing purposes.  Expressed a user's
+A signed "keep" edge for GC/indexing purposes.  Expresses a user's
 intent to keep an object.

 This is not the only way to keep an object alive for the purposes of
--- a/website/content/docs/index.html
+++ b/website/content/docs/index.html
@@ -4,6 +4,12 @@
 directory</a> or in the source itself.  It's now being promoted to
 HTML, though:</p>

+<h2>Intro</h2>
+
+<ul>
+   <li><a href="/docs/overview">Overview</a>: What is Camlistore? Motivation, background.</li>
+</ul>
+
 <h2>For Users</h2>

 <ul>
--- a/website/content/docs/overview
+++ b/website/content/docs/overview
@@ -0,0 +1,98 @@
+<h1>Camlistore Overview</h1>
+
+<p>Camlistore is your <b>personal storage system for life</b>.</p>
+
+<p>What does that mean?</p>
+
+<p>Throughout our lives, we all continue to generate content, whether
+that's writing documents, taking photos, writing comments online,
+liking our friends' posts on social networks, etc. Our content is
+typically spread between a mix of different companies' servers ("The
+Cloud") and our own hardware (laptops, phones, etc).  All of these
+things are prone to failure: companies go out of business, change
+ownership, or kill products. Personal hard drives fail, laptops and
+phones are dropped.</p>
+
+<p>It would be nice if we were a bit more in control. At least, it
+would be nice if we had a reliable backup of all our content. Once we
+have all our content, it's then nice to search it, view it, and
+directly serve it or share it out to others (public or with select
+ACLs), regardless of the original host's policies.</p>
+
+<p>Camlistore is a system to do all that.</p>
+
+<p>While Camlistore can store files like a traditional filesystem
+(think: "directories", "files", "filenames"), it's specialized in
+storing higher-level objects, which can represent anything.</p>
+
+<p>In addition to an implementation, Camlistore is also a schema for
+how to represent many types of content. Much JSON is used.</p>
+
+<p>Because every type of content in Camlistore is represented using
+content-addressable blobs (even metadata), it's impossible to
+"overwrite" things. It also means it's easy for Camlistore to sync in
+any direction between your devices and Camlistore storage servers, without
+versioning or conflict resolution issues.</p>
+
+<p>Camlistore can represent both immutable information (like snapshots
+of filesystem trees) and mutable information. Mutable information is
+represented by storing immutable,
+timestamped, GPG-signed blobs representing a mutation request. The
+current state of an object is just the application of all mutation
+blobs up until that point in time. Thus all history is recorded and
+you can look at an object as it existed at any point in time, just by
+ignoring mutations after a certain point.</p>
+
+<p>Despite using parts of the OpenPGP spec, users don't need to use
+the GnuPG tools or go to key signing events or anything dorky like
+that.</p>
+
+<p>You are in control of your Camlistore server(s), whether you run
+your own copy or use a hosted version. In the latter case, you're at
+least logically in control, analogous to how you're in charge of your
+email (and it's your private repository of all your email), even if a
+big company runs your email for you. Of course, you can also store all
+your email in Camlistore too, but Gmail's interface and search is much
+better.</p>
+
+<p>Responsible (or paranoid) users would set up their Camlistore
+servers to cross-replicate and mirror between different big companies'
+cloud platforms if they're not able to run their own servers in
+different geographical areas (e.g. cross-replicating between
+different big disks stored within a family).</p>
+
+<p>A Camlistore server comprises several parts, all of which are
+optional and can be turned on or off per-instance:</p>
+
+<ul>
+
+ <li><b>Storage</b>: the most basic part of a Camlistore server is
+  storage. This is anything which can Get or Put a blob (named by its
+  content-addressable digest), and enumerate those blobs, sorted by
+  their digest. The only metadata a storage server needs to track
+  per-blob is its size. (No other metadata is permitted, as it's
+  stored elsewhere.) Implementations are trivial and exist for local
+  disk, Amazon S3, Google Storage, etc.</li>
+
+  <li><b>Index</b>: index is implemented in terms of the Storage
+  interface, so it can be replicated to, synchronously or
+  asynchronously, from other storage types. Putting a blob indexes it, enumerating
+  returns what has been indexed, and getting isn't supported. An
+  abstraction within Camlistore similar to the storage abstractions
+  means that any underlying system which can store keys &amp; values and
+  can scan in sorted order from a point can be used to store
+  Camlistore's indexes. Implementations are likewise trivial and exist
+  for memory (for development), SQLite, LevelDB, MySQL, Postgres,
+  MongoDB, App Engine, etc. Dynamo and others would be trivial.</li>
+
+  <li><b>Search</b>: pointing Camlistore's search handlers at an index
+  means you can search for your things.  It's worth pointing out that
+  you can lose your index at any time. If the database holding your index
+  becomes corrupt, just delete it all and re-replicate from your storage
+  to your index: it'll be re-indexed and search will work again.</li>
+
+  <li><b>User Interface</b>: the web user interface lets you click
+  around and view your content, and do searches. Of course, you could
+  also just use the command-line tools or API.</li>
+
+</ul>
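
Note on the Storage bullet in the overview above: the contract it describes (Put and Get a blob named by its content digest, enumerate blobs sorted by digest, track only the size per blob) can be sketched in a few lines of Go. This is an illustrative sketch only, not Camlistore's actual code or API: the names MemStorage, SizedRef, Put, Get, and Enumerate, and the "sha1-<hex>" ref format, are assumptions made for this example.

```go
package main

import (
	"crypto/sha1"
	"errors"
	"fmt"
	"sort"
)

// ErrNotFound is returned by Get for an unknown ref (hypothetical name).
var ErrNotFound = errors.New("blob not found")

// SizedRef pairs a blob's ref with its size, the only per-blob
// metadata the overview says a storage server needs to track.
type SizedRef struct {
	Ref  string
	Size int
}

// MemStorage is a hypothetical in-memory blob store keyed by digest.
type MemStorage struct {
	blobs map[string][]byte
}

func NewMemStorage() *MemStorage {
	return &MemStorage{blobs: make(map[string][]byte)}
}

// Put stores data under the digest of its own content and returns the ref.
// The "sha1-<hex>" format is an assumption for this sketch.
// Storing the same bytes twice is a no-op, so nothing is ever overwritten.
func (s *MemStorage) Put(data []byte) string {
	ref := fmt.Sprintf("sha1-%x", sha1.Sum(data))
	if _, ok := s.blobs[ref]; !ok {
		s.blobs[ref] = append([]byte(nil), data...) // keep a private copy
	}
	return ref
}

// Get returns a copy of the blob named by ref.
func (s *MemStorage) Get(ref string) ([]byte, error) {
	b, ok := s.blobs[ref]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), b...), nil
}

// Enumerate lists every stored blob, sorted by ref.
func (s *MemStorage) Enumerate() []SizedRef {
	out := make([]SizedRef, 0, len(s.blobs))
	for ref, b := range s.blobs {
		out = append(out, SizedRef{Ref: ref, Size: len(b)})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Ref < out[j].Ref })
	return out
}

func main() {
	s := NewMemStorage()
	a := s.Put([]byte("hello"))
	b := s.Put([]byte("hello")) // same content, same ref
	s.Put([]byte("world"))
	fmt.Println(a == b, len(s.Enumerate())) // prints "true 2"
}
```

Because a ref is derived from the bytes themselves, two stores that hold the same content agree on its name, which is what makes syncing in any direction conflict-free in this model.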
--- a/website/content/docs/principles
+++ b/website/content/docs/principles
@@ -3,15 +3,22 @@
 <p><b>TODO:</b> elaborate.  This is just a quick sketch.</p>

 <ul>
-<li>Disk is cheap and getting cheaper</li>
-<li>Put the user in control.  Own your data.</li>
-<li>Privacy and paranoia</li>
-<li>Decentralization is important, but..</li>
-<li>End users won't be dorks.  Must also be possible to be easy, hosted.</li>
-<li>Content-Addressability has so many awesome properties (validation, cachability, etc).  Use it as much as possible.</li>
+  <li>Disk is cheap and getting cheaper. No need to delete in general.</li>

-<li>Redundancy and over-explicitness is fine.  Compression will help.
-Redundancy and over-explicitness will be convenient for future digital
-archeologists, too.</li>
+  <li>Put the user in control.  Own your data.</li>
+
+  <li>Privacy and paranoia</li>
+
+  <li>Decentralization is important, but..</li>
+
+  <li>End users won't be dorks.  Must also be possible to be easy,
+  hosted.</li>
+
+  <li>Content-Addressability has so many great properties (validation,
+  cachability, etc).  Use it as much as possible.</li>
+
+  <li>Redundancy and over-explicitness is fine.  Compression will
+  help.  Redundancy and over-explicitness will be convenient for
+  future digital archeologists, too.</li>

 </ul>
--- a/website/content/index.html
+++ b/website/content/index.html
@@ -1,6 +1,6 @@
 <h1 class='bustTitleRegexp'>What is Camlistore?</h1>

-<p>Camlistore is your <b>personal storage system for life</b>.</p>
+<p>Camlistore is your <b>personal storage system for life</b>. See <a href="/docs/overview">the overview</a>.</p>

 <p>Note that it's a "storage system", not just a "file system".  It
 can store and be accessed like a traditional filesystem, but it
@@ -12,23 +12,18 @@ you can access via a FUSE filesystem. Whatever.</p>
 <p>It is:</p>
 <ul>
  <li>a way to store, sync, share, model and back up content</li>
-  <li>everything private by default</li>
+  <li>paranoid about privacy, with everything private by default</li>
  <li>entirely under your control</li>
  <li>Open Source (Apache licensed)</li>
-  <li>an acronym for <i>"Content-Addressable Multi-Layer Indexed Storage"</i>, saying that Camlistore is about:
-      <ul>
+  <li>an acronym for <i>"Content-Addressable Multi-Layer Indexed Storage"</i>, saying that Camlistore is about:<ul>
        <li>content-addressable storage, at the lowest layer ("Like git for all content in your life")</li>
        <li>separate interoperable parts (<a href="/docs/arch">storage</a>, <a href="/docs/terms#graphsync">sync</a>, <a href="/docs/sharing">sharing</a>,
        <a href="/docs/schema">modeling</a>), with well-defined protocols and roles</li>
        <li>indexing and searching your content</li>
      </ul></li>
-  <li>your "home directory for the web"</li>
-  <li>pro-JSON (yet aggressively format agnostic)</li>
-  <li>pro-OpenPGP (for <a href="/docs/json-signing">signing claims</a>)</li>
-  <li>pro-paranoia and privacy</li>
  <li><a href="/docs/uses">ambitious</a>, but ...</li>
  <li>simple!</li>
-  <li>programming language-agnostic (parts and different implementations in <a href="http://golang.org/">Go</a>, Python, Java, Perl, Bash, ... the language doesn't matter.)  What matters is well-defined, simple HTTP interfaces.</li>
+  <li>programming language-agnostic (parts and different implementations in <a href="http://golang.org/">Go</a>, Python, Java, Perl, Bash, ... the language doesn't matter.)  What matters is simple, well-defined formats and HTTP interfaces.</li>
  <li>neither "Cloud" nor "Local". happily both. Run it on your own machine (any OS, any architecture), your phone, EC2, App Engine, Heroku, whatever.</li>
  <li>a "20% project" from a few Google employees (and non-Googlers),
  but not Google-centric nor endorsed by Google (other than them
--- a/website/static/all.css
+++ b/website/static/all.css
@@ -42,8 +42,14 @@ p,
 pre,
 ul,
 ol {
-	margin: 20px;
+	margin: 15px;
 }
+
+li > ul {
+      margin-top: 10px;
+      margin-bottom: 10px;
+}
+
 pre {
 	background: #e9e9e9;
 	padding: 10px;