db2604f981
The current maximum size for a schema blob is 1MB. For a large enough directory (~20000 children), the resulting static-set JSON schema exceeds that maximum.

We could increase that maximum, but we would eventually hit the maximum blob size (16MB), which would only allow for ~300000 children. Even if such a size is uncommon, directories that large are technically possible, so I don't think it would be reasonable to restrict users to such a limit. So it does not seem like enough of a solution.

The solution proposed in this CL is to spread the children of a directory (when they are more numerous than a given maximum, here set to 10000) onto several static-sets, recursively if needed. These static-sets (subsets of the whole lot of children) are stored in the new "mergeSets" field of their parent static-set schema. The actual fileRefs or dirRefs are still stored in the "members" field of the subset they were spread in. The "mergeSets" and "members" fields of a static-set are therefore mutually exclusive.

Fixes #924

Change-Id: Ibe47b50795d5288fe904d3cce0cc7f780d313408
- README.md
- TODO
- attributes.md
- blob-magic.md
- bytes.md
- common.md
- delete.md
- directory.md
- fifo.md
- file.md
- inode.md
- keep.md
- permanode.md
- share.md
- socket.md
- static-set.md
- symlink.md
README.md
Schema
At the lowest layer, Perkeep doesn't care what you put in it (everything is just dumb bytes) and you're free to adopt your own data model. However, the upper layers of Perkeep standardize on a common schema to represent various classes of data.
Schema blobs are JSON objects with at least two attributes always set: `camliVersion`, which is always 1, and `camliType`, which tells you the type of metadata the blob contains.
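As a rough illustration of the two mandatory attributes, the following Go sketch decodes just those fields from a schema blob and rejects blobs that lack them. The `header` type and `parseHeader` function are hypothetical names chosen for this example and are not Perkeep's own API. The sample blob is made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// header holds the two attributes that every schema blob sets.
// Other attributes in the blob are ignored when decoding into it.
type header struct {
	Version int    `json:"camliVersion"`
	Type    string `json:"camliType"`
}

// parseHeader is a hypothetical helper: it decodes the common attributes
// and checks that camliVersion is 1 and that camliType is present.
func parseHeader(blob []byte) (header, error) {
	var h header
	if err := json.Unmarshal(blob, &h); err != nil {
		return header{}, err
	}
	if h.Version != 1 {
		return header{}, fmt.Errorf("unexpected camliVersion %d", h.Version)
	}
	if h.Type == "" {
		return header{}, errors.New("missing camliType")
	}
	return h, nil
}

func main() {
	// A made-up blob for illustration only.
	blob := []byte(`{"camliVersion": 1, "camliType": "static-set", "members": []}`)
	h, err := parseHeader(blob)
	if err != nil {
		panic(err)
	}
	fmt.Println(h.Type)
}
```

A reader of schema blobs would typically dispatch on the returned type to decide how to interpret the remaining attributes.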
Here are some of the data types we've started to formalize a JSON schema for:
- Files: traditional filesystems. Files, directories, inodes, symlinks, etc. Uses the `file`, `directory`, `symlink`, and `inode` camliTypes.
- "Keep" claims: Normally, any object that isn't referenced by a permanode could theoretically be garbage collected. Keep claims prevent that from happening. Indicated by the `keep` camliType.
- Permanodes: the immutable root "anchor" of mutable Perkeep objects (see terminology). Users create signed claim schema blobs which reference a permanode and define some mutation for the permanode.
  Permanodes are used to model many kinds of mutable data, including mutable files, dynamic directories, and more.
  Uses the `permanode` and `claim` camliTypes.
- Static Sets: Immutable lists of other blobs by their refs. Indicated by the `static-set` camliType.