4.6 KiB

Raw Blame History

Blob Upload Protocol

Uploading a single blob is done in two parts:

Optional: see if the server already has it with a HEAD request to its blob URL, or do a multi-stat request with a single blob. If the server has it, you're done. Blobs are content-addressable and have no metadata or version, so if the server has it, it has the right version.
PUT that blob to its blob URL (or do a multipart batch put of a single blob)

When uploading multiple blobs (the common case), the fastest option depends on whether or not you're using HTTP/2.0. If your client and server don't support HTTP/2.0, you want to use the batch stat and batch upload endpoints, which hopefully can die when the future finishes arriving.

If you have HTTP/2.0, uploading 100 blobs is just like uploading 100 single blobs, but all at once. Send all your 100 HEAD requests at once, wait 1 RTT for all 100 replies, and then send then the <= 100 PUT requests with the blobs that the server didn't have.

If you DON'T have HTTP/2.0 on both sides, you want to use the batch stat and batch upload endpoints, described below.

Preupload request:

see blob-stat

Batch upload request:

Things to note about the request:

You do a POST to $BLOB_ROOT/camli/upload where $BLOB_ROOT is the blobserver's root path. You can get the path from performing "discovery" on the server and getting the "blobRoot" value. See discovery.
You MUST provide a "name" parameter in each multipart part's Content-Disposition value. The part's name matters and is the blobref ("digest-hexhexhexhex") of your blob. The bytes MUST match the blobref and the server MUST reject it if they don't match.
You (currently) MUST provide a Content-Type for each multipart part. It doesn't matter what it is (it's thrown away), but it's necessary to satisfy various HTTP libraries. Easiest is to just set it to "application/octet-stream" Server implementions SHOULD fail if you clients forget it, to encourage clients to remember it for compatibility with all blob servers.
You (currently) MUST provide a "filename" parameter in each multipart's Content-Disposition value, unique per blob, but it will also be thrown away and exists purely to satisfy various HTTP libraries (mostly App Engine). It's recommended to either set this to an increasing number (e.g. "blob1", "blob2") or just repeat the blobref value here.
The total size of a batch upload HTTP request, including headers and body (including MIME bits) should not exceed 32 MB. (A single blob can be at most 16 MB, but will in practice be much smaller: claims will be at most ~1 KB, and file chunks are typically at most 64 KB or 256 KB)

Some of these requirements may be relaxed in the future.

Example:

POST /$BLOB_SERVER_PATH_ROOT/camli/upload HTTP/1.1
Host: upload-server.example.com
Content-Type: multipart/form-data; boundary=randomboundaryXYZ

--randomboundaryXYZ
Content-Disposition: form-data; name="sha1-9b03f7aca1ac60d40b5e570c34f79a3e07c918e8"; filename="blob1"
Content-Type: application/octet-stream

(binary or text blob data)
--randomboundaryXYZ
Content-Disposition: form-data; name="sha1-deadbeefdeadbeefdeadbeefdeadbeefdeadbeef"; filename="blob2"
Content-Type: application/octet-stream

(binary or text blob data)
--randomboundaryXYZ--

Response (status may be a 200 or a 303 to this data):

HTTP/1.1 200 OK
Content-Type: text/plain

{
   "received": [
      {"blobRef": "sha1-9b03f7aca1ac60d40b5e570c34f79a3e07c918e8",
       "size": 12312},
      {"blobRef": "sha1-deadbeefdeadbeefdeadbeefdeadbeefdeadbeef",
       "size": 29384933}
   ]
}

Response keys:

received         required   Array of {"blobRef": BLOBREF, "size": INT_bytes}
                            for blobs that were successfully saved. Empty
                            list in the case nothing was received.
errorText        optional   String error message for protocol errors
                            not relating to a particular blob.
                            Mostly for debugging clients.

If connection drops during a POST to an upload URL, you should re-do a stat request to verify which objects were received by the server and which were not. Also, the URL you received from stat before might no longer work, so stat is required to a get a valid upload URL.

For information on resuming truncated uploads, read blob-upload-resume.

4.6 KiB Raw Blame History

Blob Upload Protocol

Preupload request:

Batch upload request:

4.6 KiB

Raw Blame History