diff --git a/doc/schema/README.md b/doc/schema/README.md new file mode 100644 index 000000000..49ce3c3d3 --- /dev/null +++ b/doc/schema/README.md @@ -0,0 +1,34 @@ +# Schema + +At the lowest layer, Camlistore doesn't care what you put in it (everything is +just dumb bytes) and you're free to adopt your own data model. However, the +upper layers of Camlistore standardize on a common schema to represent various +classes of data. + +Schema blobs are JSON objects with at least two attributes always set: +`camliVersion`, which is always 1, and `camliType`, which tells you the type of +metadata the blob contains. + +Here are some of the data types we've started to formalize a +[JSON](http://json.org/) schema for: + +* [Files](files/): traditional filesystems. Files, directories, inodes, + symlinks, etc. Uses the `file`, `directory`, `symlink`, and `inode` + camliTypes. + +* [Permanodes](permanode.md): the immutable root "anchor" of mutable Camlistore + objects (see [terminology](terms.md)). Users create signed + [claim](permanode.md#claim) schema blobs which reference a permanode and + define some mutation for the permanode. + + Permanodes are used to model many kinds of mutable data, including + mutable files, dynamic directories, and more. + + Uses the `permanode` and `claim` camliTypes. + +* [Static Sets](objects/static-set.md): Immutable lists of other blobs by + their refs. Indicated by the `static-set` camliType. + +* ["Keep" claims](objects/keep.md): Normally, any object that isn't referenced + by a permanode could theoretically be garbage collected. Keep claims prevent + that from happening. Indicated by the `keep` camliType. diff --git a/doc/schema/blob-magic.md b/doc/schema/blob-magic.md new file mode 100644 index 000000000..19a17867e --- /dev/null +++ b/doc/schema/blob-magic.md @@ -0,0 +1,23 @@ +# Camli Blob Magic + +[Note: not totally happy with this yet...] 
+ +Ideal Camli JSON blobs should begin with the following 15 bytes: + + {"camliVersion" + +However, it's acknowledged that some JSON serialization libraries will format +things differently, so additional whitespace should be tolerated. + +An ideal camli serializer will strive for the above header, though, by doing +something like: + +- removing the "camliVersion" from the object, noting its value (and requiring + it to be present) + +- serializing the JSON with an existing JSON serialization library, + +- removing the serialized JSON's leading "{" character and prepending the 15 + byte header above, as well as the colon and saved version and comma (which can + have whitespace as desired) + diff --git a/doc/schema/blob-magic.txt b/doc/schema/blob-magic.txt deleted file mode 100644 index 10571c80c..000000000 --- a/doc/schema/blob-magic.txt +++ /dev/null @@ -1,26 +0,0 @@ --=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- -Camli Blob Magic --=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- - -[Note: not totally happy with this yet...] - -Ideal Camli JSON blobs should begin with the following 15 bytes: - - {"camliVersion" - -However, it's acknowledged that some JSON serialization libraries will -format things differently, so additional whitespace should be -tolerated. - -An ideal camli serializer will strive for the above header, though, by -doing something like: - - -- removing the "camliVersion" from the object, noting its value - (and requiring it to be present) - - -- serializing the JSON with an existing JSON serialization library, - - -- removing the serialized JSON's leading "{" character and prepending - the 15 byte header above, as well as the colon and saved version - and comma (which can have whitespace as desired) - diff --git a/doc/schema/bytes.md b/doc/schema/bytes.md new file mode 100644 index 000000000..72fd8cd84 --- /dev/null +++ b/doc/schema/bytes.md @@ -0,0 +1,39 @@ +# Bytes Blob + +Description of a series of bytes. 
+ +A "bytes" is a metadata (JSON) blob to describe blobs. It's a recursive +definition that's able to describe a hash tree, describing very large +blobs (or "files"). + +A "bytes" blob can be used on its own, but is also used by things like +a "file" schema blob. + + {"camliVersion": 1, + "camliType": "bytes", + + // Required. Array of contiguous regions of bytes. Zero or more elements. + // + // Each element must have: + // "size": the number of bytes that this element contributes to array of bytes. + // Required, and must be greater than zero. + // + // At most one of: + // "blobRef": where to get the raw bytes from. if this and "bytesRef" + // are missing, the bytes are all zero (e.g. a sparse file hole) + // "bytesRef": alternative to blobRef, where to get the range's bytes + // from, but pointing recursively at a "bytes" schema blob + // describing the range, recursively. large files are made of + // these in a hash tree. it is an error if both "bytesRef" + // and "blobRef" are specified. + // + // Optional: + // "offset": the number of bytes into blobRef or bytesRef to skip to + // get the necessary bytes for the range. usually zero (unspecified) + "parts": [ + {"blobRef": "digalg-blobref", "size": 1024}, + {"bytesRef": "digalg-blobref", "size": 5000000, "offset": 492 }, + {"size": 1000000}, + {"blobRef": "digalg-blobref", "size": 10}, + ] + } diff --git a/doc/schema/bytes.txt b/doc/schema/bytes.txt deleted file mode 100644 index 41afbc9c0..000000000 --- a/doc/schema/bytes.txt +++ /dev/null @@ -1,38 +0,0 @@ -Description of a series of bytes. - -A "bytes" is a metadata (JSON) blob to describe blobs. It's a recursive -definition that's able to describe a hash tree, describing very large -blobs (or "files"). - -A "bytes" blob can be used on its own, but is also used by things like -a "file" schema blob. - - -{"camliVersion": 1, - "camliType": "bytes", - - // Required. Array of contiguous regions of bytes. Zero or more elements. 
- // - // Each element must have: - // "size": the number of bytes that this element contributes to array of bytes. - // Required, and must be greater than zero. - // - // At most one of: - // "blobRef": where to get the raw bytes from. if this and "bytesRef" - // are missing, the bytes are all zero (e.g. a sparse file hole) - // "bytesRef": alternative to blobRef, where to get the range's bytes - // from, but pointing recursively at a "bytes" schema blob - // describing the range, recursively. large files are made of - // these in a hash tree. it is an error if both "bytesRef" - // and "blobRef" are specified. - // - // Optional: - // "offset": the number of bytes into blobRef or bytesRef to skip to - // get the necessary bytes for the range. usually zero (unspecified) - "parts": [ - {"blobRef": "digalg-blobref", "size": 1024}, - {"bytesRef": "digalg-blobref", "size": 5000000, "offset": 492 }, - {"size": 1000000}, - {"blobRef": "digalg-blobref", "size": 10}, - ] -} diff --git a/doc/schema/claims/README.md b/doc/schema/claims/README.md new file mode 100644 index 000000000..0ffbdfac1 --- /dev/null +++ b/doc/schema/claims/README.md @@ -0,0 +1,5 @@ +# Claims + +* [Permanode Attributes](attributes.md) +* [Delete Claim](delete.md) +* [Share Claim](share.md) diff --git a/doc/schema/claims/TODO b/doc/schema/claims/TODO index 85ce49322..4c3e006a5 100644 --- a/doc/schema/claims/TODO +++ b/doc/schema/claims/TODO @@ -1,90 +1,99 @@ -TODO: ------ +# TODO + Clean this up and/or break into separate files. -{"camliVersion": 1, - "camliType": "claim", - "camliSigner": "....", - "claimDate": "2010-07-10T17:20:03.9212Z", // redundant with data in ascii armored "camliSig", - // but required. more legible. 
takes precedence over - // any date inferred from camliSig - "permaNode": "dig-xxxxxxx", // what is being modified - "claimType": "set-attribute", - "attribute": "camliContent", - "value": "dig-yyyyyyy", - "camliSig": .........} + {"camliVersion": 1, + "camliType": "claim", + "camliSigner": "....", + "claimDate": "2010-07-10T17:20:03.9212Z", // redundant with data in ascii armored "camliSig", + // but required. more legible. takes precedence over + // any date inferred from camliSig + "permaNode": "dig-xxxxxxx", // what is being modified + "claimType": "set-attribute", + "attribute": "camliContent", + "value": "dig-yyyyyyy", + "camliSig": .........} claimTypes: ----------- -"add-attribute" (adds a value to a multi-valued attribute (e.g. "tag")) -"set-attribute" (set a single-valued attribute. equivalent to "del-attribute" of "attribute" and then add-attribute) -"del-attribute" (deletes all values of "attribute", if no "value" given, or just the provided "value" if multi-valued) +* "add-attribute" (adds a value to a multi-valued attribute (e.g. "tag")) +* "set-attribute" (set a single-valued attribute. equivalent to "del-attribute" + of "attribute" and then add-attribute) +* "del-attribute" (deletes all values of "attribute", if no "value" given, or + just the provided "value" if multi-valued) -"multi".. atomically do multiple add/set/del from above on potentially different permanodes. looks like: +"multi".. atomically do multiple add/set/del from above on potentially different +permanodes. 
looks like: - {"camliVersion": 1, - "camliType": "claim", - "claimType": "multi", - "claimDate": "2013-02-24T17:20:03.9212Z", - "claims": [ - {"claimType": "set-attribute", - "permanode": "dig-xxxxxx", - "attribute": "foo", - "value": "fooValue"}, - {"claimType": "add-attribute", - "permanode": "dig-yyyyy", - "attribute": "tag", - "value": "funny"} - ], - "camliSig": .........} + {"camliVersion": 1, + "camliType": "claim", + "claimType": "multi", + "claimDate": "2013-02-24T17:20:03.9212Z", + "claims": [ + {"claimType": "set-attribute", + "permanode": "dig-xxxxxx", + "attribute": "foo", + "value": "fooValue"}, + {"claimType": "add-attribute", + "permanode": "dig-yyyyy", + "attribute": "tag", + "value": "funny"} + ], + "camliSig": .........} -Attribute names: ----------------- -camliContent: a permanode "becoming" something. value is pointer to what it is now. +## Attribute names + +camliContent: a permanode "becoming" something. value is pointer to what it is +now. -Old notes from July 2010 doc: ------------------------------ +## Old notes from July 2010 doc + Claim types: -permanode-become: - -- implies either: - 1) switching from typeless/lifeless virgin pnode into something (dynamic set, filesystem tree, etc) - 2) changing versions of that base metadata (new filesystem snapshot) - -- ‘permaNode’ is the thing that is changing - -- ‘contents’ is the current node that represents what permaNode changes to -set-membership: add a blobref to a dynamic set - -- "permaNode" is blobref of the dynamic set -delete-claim: delete another claim (target is claim to delete) - -- "contents" is the claim blobref you’re deleting -{set,add}-attribute: - -- attach a piece of metadata to something. - -- use set-attribute for single-valued attributes only: highest dated claim wins (of trusted person) e.g. "title", "description" - -- use add-attribute for multi-valued things. e.g. "tag" + +* permanode-become: + * implies either: + 1. 
switching from typeless/lifeless virgin pnode into something (dynamic + set, filesystem tree, etc) + 2. changing versions of that base metadata (new filesystem snapshot) + * 'permaNode' is the thing that is changing + * 'contents' is the current node that represents what permaNode changes to +* set-membership: add a blobref to a dynamic set + * "permaNode" is blobref of the dynamic set +* delete-claim: delete another claim (target is claim to delete) + * "contents" is the claim blobref you’re deleting +* {set,add}-attribute: + * attach a piece of metadata to something. + * use set-attribute for single-valued attributes only: highest dated claim + wins (of trusted person) e.g. "title", "description" + * use add-attribute for multi-valued things. e.g. "tag" Tagging something: -{"claimType": "add-attribute", // - "attribute": "tag", // utf-8, should have list of valid attributes names, preferrably not made up by us (open social spec?) - "value": "funny", // value that doesn’t have lasting value - "valueRef": "sha1-blobref", // hefty reference to a lasting value - - "claimer?": "sha1-of-the-dude-who’s-signing", - "claimDate": "2010-07-10T17:20:03.9212Z", - "claimType", "permanode-become", - "permaNode": "sha1-pnode", -} + + {"claimType": "add-attribute", // + "attribute": "tag", // utf-8, should have a list of valid attribute names, preferably not made up by us (open social spec?) + "value": "funny", // value that doesn’t have lasting value + "valueRef": "sha1-blobref", // hefty reference to a lasting value + + "claimer?": "sha1-of-the-dude-who’s-signing", + "claimDate": "2010-07-10T17:20:03.9212Z", + "claimType", "permanode-become", + "permaNode": "sha1-pnode", + } filesystem root claim: -{ - "camliVersion": 1, - "camliType": "claim", - // Stuff for camliType "claim": - "claimDate": "2010-07-10T17:20:03.9212Z", // redundant with data in ascii armored "camliSig". 
TODO: resolve - "claimType", "permanode-become", + { + "camliVersion": 1, + "camliType": "claim", - // Stuff for "permanode-become": - "permaNode": "sha1-pnode", - "contents": "sha1-fs-node" + // Stuff for camliType "claim": + "claimDate": "2010-07-10T17:20:03.9212Z", // redundant with data in ascii armored "camliSig". TODO: resolve + "claimType", "permanode-become", -,"camliSigner": "digalg-blobref-to-ascii-armor-public-key-of-signer", -"camliSig": "......"} + // Stuff for "permanode-become": + "permaNode": "sha1-pnode", + "contents": "sha1-fs-node" + + ,"camliSigner": "digalg-blobref-to-ascii-armor-public-key-of-signer", + "camliSig": "......"} diff --git a/doc/schema/claims/attributes.txt b/doc/schema/claims/attributes.md similarity index 99% rename from doc/schema/claims/attributes.txt rename to doc/schema/claims/attributes.md index 89d5ea4b0..8fc480766 100644 --- a/doc/schema/claims/attributes.txt +++ b/doc/schema/claims/attributes.md @@ -1,4 +1,4 @@ -Permanode Attributes +# Permanode Attributes While a permanode can have any arbitrary attributes and values (and each value can be single-valued or multi-valued), the following are diff --git a/doc/schema/claims/delete.md b/doc/schema/claims/delete.md new file mode 100644 index 000000000..a8e5941b1 --- /dev/null +++ b/doc/schema/claims/delete.md @@ -0,0 +1,18 @@ +# Delete Claim + +A claim can delete a permanode or another claim. + +(Un)Deletions are not considered as modifications, so the claimDate of a delete +claim is never considered as a modtime in the context of time constrained +searches. + +Example: + + {"camliVersion": 1, + "camliType": "claim", + "camliSigner": "sha1-f2b0b7da718b97ce8c31591d8ed4645c777f3ef4", + "claimDate": "2010-07-10T17:20:03.9212Z", + "claimType": "delete", + "target": "sha1-ab6dacb972eeee72df2a846aab7d751b5856a1a0", // the permanode or claim being deleted. 
+ + } diff --git a/doc/schema/claims/delete.txt b/doc/schema/claims/delete.txt deleted file mode 100644 index 6d44e99fc..000000000 --- a/doc/schema/claims/delete.txt +++ /dev/null @@ -1,13 +0,0 @@ -A claim can delete a permanode or another claim. -(Un)Deletions are not considered as modifications, so the claimDate of a delete claim -is never considered as a modtime in the context of time constrained searches. ------ - -{"camliVersion": 1, - "camliType": "claim", - "camliSigner": "sha1-f2b0b7da718b97ce8c31591d8ed4645c777f3ef4", - "claimDate": "2010-07-10T17:20:03.9212Z", - "claimType": "delete", - "target": "sha1-ab6dacb972eeee72df2a846aab7d751b5856a1a0", // the permanode or claim being deleted. - -} diff --git a/doc/schema/claims/share.md b/doc/schema/claims/share.md new file mode 100644 index 000000000..c3e0f124b --- /dev/null +++ b/doc/schema/claims/share.md @@ -0,0 +1,34 @@ +# Share Claim + +A share claim makes blob(s) available to others. (that is, parties who are not +the owner of the Camlistore instance). + +Example: + + {"camliVersion": 1, + + // Type of authentication required to access the share. Currently only haveref + // is supported, which means that anyone with the claim blobref can access. + "authType": "haveref", + + "camliType": "claim", + "camliSigner": "sha1-f2b0b7da718b97ce8c31591d8ed4645c777f3ef4", + "claimDate": "2014-09-04T20:04:09.193945801Z", + "claimType": "share", + + // The blob or search to share. Exactly one of these must be present. It is an + // error to set neither or both. + "target": "sha1-543fbdfdbcb1297af8a4dc7d299c0cb90e2bea0f", + "search": , + + // If true, anything recursively reachable from target or search is also + // shared. Edges that are guaranteed to be followed for purposes of + // reachability are: + // - blobRef and bytesRef values of camliType="blob|file" + // - members of camliType="static-set" + // Currently reachability is implemented more loosely, but clients should not + // depend on that. 
+ "transitive": false, + + + } diff --git a/doc/schema/claims/share.txt b/doc/schema/claims/share.txt deleted file mode 100644 index 99e41ee58..000000000 --- a/doc/schema/claims/share.txt +++ /dev/null @@ -1,31 +0,0 @@ -A share claim makes blob(s) available to others. (that is, parties who are not -the owner of the Camlistore instance). ------ - -{"camliVersion": 1, - - // Type of authentication required to access the share. Currently only haveref - // is supported, which means that anyone with the claim blobref can access. - "authType": "haveref", - - "camliType": "claim", - "camliSigner": "sha1-f2b0b7da718b97ce8c31591d8ed4645c777f3ef4", - "claimDate": "2014-09-04T20:04:09.193945801Z", - "claimType": "share", - - // The blob or search to share. Exactly one of these must be present. It is an - // error to set neither or both. - "target": "sha1-543fbdfdbcb1297af8a4dc7d299c0cb90e2bea0f", - "search": , - - // If true, anything recursively reachable from target or search is also - // shared. Edges that are guaranteed to be followed for purposes of - // reachability are: - // - blobRef and bytesRef values of camliType="blob|file" - // - members of camliType="static-set" - // Currently reachability is implemented more loosely, but clients should not - // depend on that. - "transitive": false, - - -} diff --git a/doc/schema/files/README.md b/doc/schema/files/README.md new file mode 100644 index 000000000..1a572238f --- /dev/null +++ b/doc/schema/files/README.md @@ -0,0 +1,9 @@ +# Files + +* [Common Attributes](common.md) +* [Directory](directory.md) +* [FIFO](fifo.md) +* [File](file.md) +* [Inode](inode.md) +* [Socket](socket.md) +* [Symlink](symlink.md) diff --git a/doc/schema/files/common.md b/doc/schema/files/common.md new file mode 100644 index 000000000..750af795f --- /dev/null +++ b/doc/schema/files/common.md @@ -0,0 +1,26 @@ +# Common Schema Fields + +Fields common to files, directories, symlinks, FIFOs and sockets. 
+ + {"camliVersion": 1, + "camliType": "...", // one of "file", "directory", "symlink", "fifo", "socket" + + // At most one of these may be set. (zero may be present only for large files' subranges, + // represented as a tree of file schemas) But exactly one of these is required for + // top-level files, directories, symlinks, FIFOs, sockets, etc. + "fileName": "if-it-is-utf8.txt", // only for utf-8 + "fileNameBytes": [65, 234, 234, 192, 23, 123], // if unknown charset (not recommended) + + // Optional: + "unixPermission": "0755", // no octal in JSON, so octal as string + "unixOwnerId": 1000, + "unixOwner": "bradfitz", + "unixGroupId": 500, + "unixGroup": "camliteam", + "unixXattrs": [....], // TBD + "unixMtime": "2010-07-10T17:14:51.5678Z", // UTC-- ISO 8601, as many significant digits as known + "unixCtime": "2010-07-10T17:20:03.9212Z", // UTC-- ISO 8601, best-effort to match unix meaning + + // Not recommended to include, but if you must: (atime is a bit silly) + "unixAtime": "2010-07-10T17:14:22.1234Z", // UTC-- ISO 8601 + } diff --git a/doc/schema/files/directory.md b/doc/schema/files/directory.md new file mode 100644 index 000000000..3263e6284 --- /dev/null +++ b/doc/schema/files/directory.md @@ -0,0 +1,12 @@ +# Directory Schema + + {"camliVersion": 1, + "camliType": "directory", + + // + // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM common.md + // + + // Required: + "entries": "digalg-blobref-to-static-set", + } diff --git a/doc/schema/files/directory.txt b/doc/schema/files/directory.txt deleted file mode 100644 index 917bb7d73..000000000 --- a/doc/schema/files/directory.txt +++ /dev/null @@ -1,13 +0,0 @@ -Directory schema - -{"camliVersion": 1, - "camliType": "directory", - - // - // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM file-common.txt - // - - // Required: - "entries": "digalg-blobref-to-static-set", -} - diff --git a/doc/schema/files/fifo.md b/doc/schema/files/fifo.md new file mode 100644 index 000000000..e04ec9f7b --- /dev/null +++ 
b/doc/schema/files/fifo.md @@ -0,0 +1,9 @@ +# FIFO Schema + + {"camliVersion": 1, + "camliType": "fifo", + + // + // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM common.md + // + } diff --git a/doc/schema/files/fifo.txt b/doc/schema/files/fifo.txt deleted file mode 100644 index 160f4a84e..000000000 --- a/doc/schema/files/fifo.txt +++ /dev/null @@ -1,9 +0,0 @@ -fifo schema - -{"camliVersion": 1, - "camliType": "fifo", - - // - // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM file-common.txt - // -} diff --git a/doc/schema/files/file-common.txt b/doc/schema/files/file-common.txt deleted file mode 100644 index 7475ce079..000000000 --- a/doc/schema/files/file-common.txt +++ /dev/null @@ -1,24 +0,0 @@ -Fields common to files, directories, symlinks, FIFOs and sockets - -{"camliVersion": 1, - "camliType": "...", // one of "file", "directory", "symlink", "fifo", "socket" - - // At most one of these may be set. (zero may be present only for large files' subranges, - // represented as a tree of file schemas) But exactly one of these is required for - // top-level files, directories, symlinks, FIFOs, sockets, e.t.c. 
- "fileName": "if-it-is-utf8.txt", // only for utf-8 - "fileNameBytes": [65, 234, 234, 192, 23, 123], // if unknown charset (not recommended) - - // Optional: - "unixPermission": "0755", // no octal in JSON, so octal as string - "unixOwnerId": 1000, - "unixOwner": "bradfitz", - "unixGroupId": 500, - "unixGroup": "camliteam", - "unixXattrs": [....], // TBD - "unixMtime": "2010-07-10T17:14:51.5678Z", // UTC-- ISO 8601, as many significant digits as known - "unixCtime": "2010-07-10T17:20:03.9212Z", // UTC-- ISO 8601, best-effort to match unix meaning - - // Not recommended to include, but if you must: (atime is a bit silly) - "unixAtime": "2010-07-10T17:14:22.1234Z", // UTC-- ISO 8601 -} diff --git a/doc/schema/files/file.md b/doc/schema/files/file.md new file mode 100644 index 000000000..040339816 --- /dev/null +++ b/doc/schema/files/file.md @@ -0,0 +1,14 @@ +# File Schema + + {"camliVersion": 1, + "camliType": "file", + + // #include "common.md" # metadata about the file + // #include "../bytes.md" # describes the bytes of the file + + // Optional, if linkcount > 1, for representing hardlinks properly. + "inodeRef": "digalg-blobref", // to "inode" blobref, when the link count > 1 + } + +TODO: Mac/NTFS-style resource forks? perhaps just a "streams" array of +recursive file objects? diff --git a/doc/schema/files/file.txt b/doc/schema/files/file.txt deleted file mode 100644 index c1389ba5a..000000000 --- a/doc/schema/files/file.txt +++ /dev/null @@ -1,15 +0,0 @@ -File schema - -{"camliVersion": 1, - "camliType": "file", - - // #include "file-common.txt" # metadata about the file - // #include "../bytes.txt" # describes the bytes of the file - - // Optional, if linkcount > 1, for representing hardlinks properly. - "inodeRef": "digalg-blobref", // to "inode" blobref, when the link count > 1 -} - -// TODO: Mac/NTFS-style resource forks? perhaps just a "streams" -// array of recursive file objects? 
- diff --git a/doc/schema/files/inode.txt b/doc/schema/files/inode.md similarity index 63% rename from doc/schema/files/inode.txt rename to doc/schema/files/inode.md index 26274cdec..40ac85ca4 100644 --- a/doc/schema/files/inode.txt +++ b/doc/schema/files/inode.md @@ -1,11 +1,11 @@ -Inode schema. +# Inode Schema -{"camliVersion": 1, - "camliType": "inode", - "inodeId": 12345 // st_ino - "deviceId": 53, // st_dev - "numLinks": 3, // st_nlink -} + {"camliVersion": 1, + "camliType": "inode", + "inodeId": 12345, // st_ino + "deviceId": 53, // st_dev + "numLinks": 3, // st_nlink + } This is optional and probably rarely used, but lets two+ files be represented as hardlinks with each other. If both files point to the diff --git a/doc/schema/files/socket.md b/doc/schema/files/socket.md new file mode 100644 index 000000000..4dc1c6168 --- /dev/null +++ b/doc/schema/files/socket.md @@ -0,0 +1,9 @@ +# Socket Schema + + {"camliVersion": 1, + "camliType": "socket", + + // + // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM common.md + // + } diff --git a/doc/schema/files/socket.txt b/doc/schema/files/socket.txt deleted file mode 100644 index b13d35682..000000000 --- a/doc/schema/files/socket.txt +++ /dev/null @@ -1,9 +0,0 @@ -socket schema - -{"camliVersion": 1, - "camliType": "socket", - - // - // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM file-common.txt - // -} diff --git a/doc/schema/files/symlink.md b/doc/schema/files/symlink.md new file mode 100644 index 000000000..e35adf19e --- /dev/null +++ b/doc/schema/files/symlink.md @@ -0,0 +1,18 @@ +# Symlink Schema + + {"camliVersion": 1, + "camliType": "symlink", + + // + // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM common.md + // + + // Exactly one of: + + // If UTF-8: + "symlinkTarget": "../foo/blah", + + // If unknown charset & have raw 8-bit filenames and can't convert + // to UTF-8. The array is a mix of UTF-8 and/or non-UTF-8 bytes (0-255). + "symlinkTargetBytes": ["../foo/Am", 233, "lie.jpg"], // e.g. 
Amélie in ISO-8859-1 when charset unknown + } diff --git a/doc/schema/files/symlink.txt b/doc/schema/files/symlink.txt deleted file mode 100644 index 3969f7395..000000000 --- a/doc/schema/files/symlink.txt +++ /dev/null @@ -1,18 +0,0 @@ -Symlink schema - -{"camliVersion": 1, - "camliType": "symlink", - - // - // INCLUDE ALL REQUIRED & ANY OPTIONAL FIELDS FROM file-common.txt - // - - // Exactly one of: - - // If UTF-8: - "symlinkTarget": "../foo/blah", - - // If unknown charset & have raw 8-bit filenames and can't convert - // to UTF-8. The array is a mix of UTF-8 and/or non-UTF-8 bytes (0-255). - "symlinkTargetBytes": ["../foo/Am", 233, "lie.jpg"], // e.g. Amélie in ISO-8859-1 when charset unknown -} diff --git a/doc/schema/index.html b/doc/schema/index.html deleted file mode 100644 index 5d9758339..000000000 --- a/doc/schema/index.html +++ /dev/null @@ -1,57 +0,0 @@ -

Schema

- -

- At the lowest layer, Camlistore doesn't care what you put in it - (everything is just dumb bytes) and you're free to adopt your own - data model. However, the upper layers of Camlistore standardize on - a common schema to represent various - classes of data. -

- -

- Schema blobs are JSON objects with at least two attributes always - set: camliVersion, which is always 1, - and camliType, which tells you the type of metadata the - blob contains. -

- -

- Here are some of the data types we've started to formalize - a JSON schema for: -

- -
    -
  • - Files: - traditional filesystems. Files, directories, inodes, symlinks, - etc. Uses the file, directory, symlink, - and inode camliTypes. -
  • - -
  • - Permanodes: the immutable root - "anchor" of mutable Camlistore objects - (see terminology). Users create - signed claim schema - blobs which reference a permanode and define some mutation for the - permanode. -
    - Permanodes are used to model many kinds of mutable data, including - mutable files, dynamic directories, and more. -
    - Uses the permanode and claim camliTypes. -
  • - -
  • - Static Sets: - Immutable lists of other blobs by their refs. Indicated by - the static-set camliType. -
  • - -
  • - "Keep" claims: - Normally, any object that isn't referenced by a permanode could - theoretically be garbage collected. Keep claims prevent that from - happening. Indicated by the keep camliType. -
  • -
diff --git a/doc/schema/objects/README.md b/doc/schema/objects/README.md new file mode 100644 index 000000000..83d484fb6 --- /dev/null +++ b/doc/schema/objects/README.md @@ -0,0 +1,4 @@ +# Objects + +* [Keep Claims](keep.md) +* [Static Set](static-set.md) diff --git a/doc/schema/objects/keep.txt b/doc/schema/objects/keep.md similarity index 73% rename from doc/schema/objects/keep.txt rename to doc/schema/objects/keep.md index 655585387..221abe218 100644 --- a/doc/schema/objects/keep.txt +++ b/doc/schema/objects/keep.md @@ -1,3 +1,5 @@ +# Keep Object + A signed "keep" edge for GC/indexing purposes. Expresses a user's intent to keep an object. @@ -9,8 +11,8 @@ those permanodes) This is just the most explicit way when you're not modeling the data with permanodes. -{"camliVersion": 1, - "camliType": "keep", - "target": "digalg-blobref-of-thing-to-keep", -} + {"camliVersion": 1, + "camliType": "keep", + "target": "digalg-blobref-of-thing-to-keep", + } diff --git a/doc/schema/objects/permanode.txt b/doc/schema/objects/permanode.txt deleted file mode 100644 index 0bcfd863d..000000000 --- a/doc/schema/objects/permanode.txt +++ /dev/null @@ -1,15 +0,0 @@ -The idea of a permanode is that it's the anchor from which you build -mutable objects. To serve as a reliable (consistently nameable) -object it must have no mutable state itself. - -{"camliVersion": 1, - "camliType": "permanode", - - // Required. Any random string, to force the sha1 of this - // node to be unique. Note that the date in the ASCII-armored - // GPG JSON signature will already help it be unique, so this - // doesn't need to be a great random. - "random": "615e05c68c8411df81a2001b639d041f" - -} - diff --git a/doc/schema/objects/static-set.md b/doc/schema/objects/static-set.md new file mode 100644 index 000000000..751cb5695 --- /dev/null +++ b/doc/schema/objects/static-set.md @@ -0,0 +1,25 @@ +# Static Set schema + +Example: + + {"camliVersion": 1, + "camliType": "static-set", + + // Required. 
+ // May be ordered or unordered, depending on context/needs. If unordered, + // it's recommended but not required to sort the blobrefs. + "members": [ + "digalg-blobref-item1", // maybe a file? + "digalg-blobref-item2", // maybe a directory? + "digalg-blobref-item3", // maybe a symlink? + "digalg-blobref-item4", // maybe a permanode? + "digalg-blobref-item5", // ... don't know until you fetch it + "digalg-blobref-item6", // ... and what's valid depends on context + "digalg-blobref-item7", // ... a permanode in a directory would + "digalg-blobref-item8" // ... be invalid, for instance. + ] + } + +Note: dynamic sets are structured differently, using a permanode and + membership claim nodes. The above is just for presenting a snapshot + of members. diff --git a/doc/schema/objects/static-set.txt b/doc/schema/objects/static-set.txt deleted file mode 100644 index b6a06de05..000000000 --- a/doc/schema/objects/static-set.txt +++ /dev/null @@ -1,23 +0,0 @@ -Static set schema - -{"camliVersion": 1, - "camliType": "static-set", - - // Required. - // May be ordered to unordered, depending on context/needs. If unordered, - // it's recommended but not required to sort the blobrefs. - "members": [ - "digalg-blobref-item1", // maybe a file? - "digalg-blobref-item2", // maybe a directory? - "digalg-blobref-item3", // maybe a symlink? - "digalg-blobref-item4", // maybe a permanode? - "digalg-blobref-item5", // ... don't know until you fetch it - "digalg-blobref-item6", // ... and what's valid depends on context - "digalg-blobref-item7", // ... a permanode in a directory would - "digalg-blobref-item8" // ... be invalid, for instance. - ] -} - -Note: dynamic sets are structured differently, using a permanode and - membership claim nodes. The above is just for presenting a snapshot - of members. 
diff --git a/doc/schema/permanode.md b/doc/schema/permanode.md index b5a2f8337..c57a05b22 100644 --- a/doc/schema/permanode.md +++ b/doc/schema/permanode.md @@ -1,146 +1,114 @@ -

Permanodes

+# Permanodes -

- Permanodes are how Camlistore models mutable data on top of an - immutable, content-addressable datastore. The data is modeled using - nodes with two camliTypes: permanode and claim. -

+Permanodes are how Camlistore models mutable data on top of an immutable, +content-addressable datastore. The data is modeled using nodes with two +camliTypes: `permanode` and `claim`. -

Permanode

-

- A permanode is an anchor from which you build mutable objects. To - serve as a reliable (consistently nameable) object, it must have no - mutable state itself. In fact, a permanode is really just a - signed random - number. -

-
-{"camliVersion": 1,
- "camliType": "permanode",
+## Permanode
 
- // Required.  Any random string, to force the digest of this
- // node to be unique.  Note that the date in the ASCII-armored
- // GPG JSON signature will already help it be unique, so this
- // doesn't need to be a great random.
- "random": "615e05c68c8411df81a2001b639d041f"
+A permanode is an anchor from which you build mutable objects.  To serve as a
+reliable (consistently nameable) object, it must have no mutable state itself.
+In fact, a permanode is really just a [signed](../json-signing/) random number.
 
-<REQUIRED-JSON-SIGNATURE>}
-
+ {"camliVersion": 1, + "camliType": "permanode", -

Claim

-

- A claim is any signed JSON schema blob. One common use is modifying - "attributes" on a permanode. The state of a permanode is the result - of combining all attribute-modifying claims which reference it, in - order. Claim nodes look something like this: -

-
-{"camliVersion": 1,
- "camliType": "claim",
- "camliSigner": "....",
- "claimDate": "2010-07-10T17:20:03.9212Z", // redundant with data in ascii armored "camliSig",
-                                           // but required. more legible. takes precedence over
-                                           // any date inferred from camliSig
- "permaNode": "sha1-xxxxxxx",        // what is being modified
- "claimType": "set-attribute",
- "attribute": "camliContent",
- "value": "sha1-yyyyyyy",
- "camliSig": .........}
-
-

- All claims must be signed. -

- The anagrammatical property claimType defines what the - claim does, and is one of the following: -

-
    -
  • - add-attribute: adds a value to a multi-valued attribute - (e.g. "tag") -
  • + // Required. Any random string, to force the digest of this + // node to be unique. Note that the date in the ASCII-armored + // GPG JSON signature will already help it be unique, so this + // doesn't need to be a great random. + "random": "615e05c68c8411df81a2001b639d041f" -
  • - set-attribute: set a single-valued attribute. equivalent - to "del-attribute" of "attribute" and then add-attribute. -
  • + } -
  • - del-attribute: deletes all values of "attribute", if no - "value" given, or just the provided "value" if multi-valued -
  • +## Claim -
  • - multi: atomically do multiple add/set/del from above on - potentially different permanodes. looks like: -
    -   {"camliVersion": 1,
    -    "camliType": "claim",
    -    "claimType": "multi",
    -    "claimDate": "2013-02-24T17:20:03.9212Z",
    -    "claims": [
    -         {"claimType": "set-attribute",
    -          "permanode": "sha1-xxxxxx",
    -          "attribute": "foo",
    -          "value": "fooValue"},
    -         {"claimType": "add-attribute",
    -          "permanode": "sha1-yyyyy",
    -          "attribute": "tag",
    -          "value": "funny"}
    -    ],
    -    "camliSig": .........}
    -    
    -
  • -
+A claim is any signed JSON schema blob. One common use is modifying +"attributes" on a permanode. The state of a permanode is the result +of combining all attribute-modifying claims which reference it, in +order. Claim nodes look something like this: -

Attributes

-

- A permanode can have any attribute you like, but here are the ones - that currently mean something to Camlistore: -

-
    -
  • - tag: A set of zero or more keywords (or phrases) indexed - completely, for searching by tag. No HTML. -
  • -
  • - title: A name given to the permanode. No HTML. -
  • -
  • - description: An account of the permanode. It may include but is - not limited to: an abstract, a table of contents, or a free-text - account of the resource. No HTML. -
  • -
  • - camliContent: A reference to another blob. If a permanode - has this attribute, it's considered a pointer to its camliContent - value. -
  • -
  • - camliMember: A reference to another permanode. This - indicates that the referenced permanode is a dynamic set, and - we're a part of it. -
  • -
  • - camliPath*: camliPath attributes are set on permanodes - which represent dynamic directories. If a permanode has attributes: -
    -    camliPath:dir2 = $blobref_dir2_permandode
    -    camliPath:bar.txt = $blobref_bartxt_permanode
    -    
    + {"camliVersion": 1, + "camliType": "claim", + "camliSigner": "....", + "claimDate": "2010-07-10T17:20:03.9212Z", // redundant with data in ascii armored "camliSig", + // but required. more legible. takes precedence over + // any date inferred from camliSig + "permaNode": "sha1-xxxxxxx", // what is being modified + "claimType": "set-attribute", + "attribute": "camliContent", + "value": "sha1-yyyyyyy", + "camliSig": .........} + +All claims must be [signed](../json-signing/). + +The anagrammatical property `claimType` defines what the claim does, and is one +of the following: + +* `add-attribute`: adds a value to a multi-valued attribute (e.g. "tag") + +* `set-attribute`: set a single-valued attribute. equivalent to "del-attribute" + of "attribute" and then add-attribute. + +* `del-attribute`: deletes all values of "attribute", if no "value" given, or + just the provided "value" if multi-valued + +* `multi`: atomically do multiple add/set/del from above on potentially + different permanodes. looks like: + + {"camliVersion": 1, + "camliType": "claim", + "claimType": "multi", + "claimDate": "2013-02-24T17:20:03.9212Z", + "claims": [ + {"claimType": "set-attribute", + "permanode": "sha1-xxxxxx", + "attribute": "foo", + "value": "fooValue"}, + {"claimType": "add-attribute", + "permanode": "sha1-yyyyy", + "attribute": "tag", + "value": "funny"} + ], + "camliSig": .........} + +## Attributes + +A permanode can have any attribute you like, but here are the ones that +currently mean something to Camlistore: + +* `tag`: A set of zero or more keywords (or phrases) indexed completely, for + searching by tag. No HTML. + +* `title`: A name given to the permanode. No HTML. + +* `description`: An account of the permanode. It may include but is not limited + to: an abstract, a table of contents, or a free-text account of the resource. + No HTML. + +* `camliContent`: A reference to another blob. 
If a permanode has this + attribute, it's considered a pointer to its camliContent value. + +* `camliMember`: A reference to another permanode. This indicates that the + referenced permanode is a dynamic set, and we're a part of it. + +* `camliPath`: camliPath attributes are set on permanodes which represent + dynamic directories. If a permanode has attributes: + + camliPath:dir2 = $blobref_dir2_permanode + camliPath:bar.txt = $blobref_bartxt_permanode + It will appear as a directory containing "dir2" and "bar.txt". -
    + These are used by a few things, including the web UI, the "publish" code (declaring you want a photo at a URL and then the HTTP front end resolving each directory link in - http://myhostname.com/pics/x/y/x/funny.jpg), and the FUSE + http://myhostname.com/pics/x/y/x/funny.jpg), and the FUSE read/write filesystem code. -
  • -
  • - camliRoot: A root name for the permanode. This will cause it to - show up as a named folder in the FUSE filesystem under roots/. - Creating a directory in roots/ will cause a new permanode to be - created with this attr set. You can also browse roots in the web UI. -
  • -
+ +* `camliRoot`: A root name for the permanode. This will cause it to show up as a + named folder in the FUSE filesystem under roots/. Creating a + directory in roots/ will cause a new permanode to be created with + this attr set. You can also browse roots in the web UI.
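
The claim rules in `permanode.md` ("the state of a permanode is the result of combining all attribute-modifying claims which reference it, in order") can be illustrated with a minimal sketch. This is not Camlistore's implementation: `fold_claims` and its helpers are hypothetical names, signature verification is skipped entirely, and several points the text leaves open are assumptions here. Claims are ordered by sorting `claimDate` as a string (which only works if every timestamp uses the same format), sub-claims of a `multi` claim are applied at the outer claim's date, duplicate values are ignored by `add-attribute`, and both the `permaNode` and `permanode` spellings seen in the examples are accepted.

```python
from collections import defaultdict


def _target(claim):
    # The examples spell this key both "permaNode" and "permanode";
    # accepting either is an assumption of this sketch.
    ref = claim.get("permaNode") or claim.get("permanode")
    if ref is None:
        raise ValueError("claim does not reference a permanode")
    return ref


def _apply(state, claim):
    """Apply one add/set/del claim to the per-permanode attribute map."""
    attrs = state[_target(claim)]
    kind = claim["claimType"]
    attr = claim.get("attribute")
    value = claim.get("value")

    if kind == "add-attribute":
        values = attrs.setdefault(attr, [])
        if value not in values:  # assumption: multi-valued attributes act as sets
            values.append(value)
    elif kind == "set-attribute":
        # Equivalent to del-attribute of the attribute, then add-attribute.
        attrs[attr] = [value]
    elif kind == "del-attribute":
        if value is None:
            attrs.pop(attr, None)  # no value given: drop every value
        else:
            remaining = [v for v in attrs.get(attr, []) if v != value]
            if remaining:
                attrs[attr] = remaining
            else:
                attrs.pop(attr, None)
    else:
        raise ValueError("unsupported claimType: %r" % kind)


def fold_claims(claims):
    """Return {permanode ref: {attribute: [values]}} after applying all claims."""
    state = defaultdict(dict)
    # Assumption: uniform timestamp format, so string order matches time order.
    for claim in sorted(claims, key=lambda c: c["claimDate"]):
        if claim["claimType"] == "multi":
            for sub in claim["claims"]:
                _apply(state, sub)
        else:
            _apply(state, claim)
    return dict(state)
```

Representing every attribute as a list keeps single-valued and multi-valued attributes uniform; a caller that wants the single value of `camliContent` or `title` would read the only element of the list.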