From 98a9e063b695c020fde8b876ec97fe914723fd0b Mon Sep 17 00:00:00 2001
From: Ines Montani
Date: Sat, 22 Aug 2020 17:15:05 +0200
Subject: [PATCH] Update docs [ci skip]

---
 website/docs/usage/embeddings-transformers.md | 25 ++++++++++---------
 website/docs/usage/layers-architectures.md    | 13 ++++++++++
 2 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/website/docs/usage/embeddings-transformers.md b/website/docs/usage/embeddings-transformers.md
index 4f40104c4..7648a5d45 100644
--- a/website/docs/usage/embeddings-transformers.md
+++ b/website/docs/usage/embeddings-transformers.md
@@ -71,10 +71,10 @@ of performance.
 ## Shared embedding layers {#embedding-layers}
 
 spaCy lets you share a single transformer or other token-to-vector ("tok2vec")
-embedding layer between multiple components. You can even update the shared layer,
-performing **multi-task learning**. Reusing the tok2vec layer between components
-can make your pipeline run a lot faster and result in much
-smaller models. However, it can make the pipeline less modular and make it more
+embedding layer between multiple components. You can even update the shared
+layer, performing **multi-task learning**. Reusing the tok2vec layer between
+components can make your pipeline run a lot faster and result in much smaller
+models. However, it can make the pipeline less modular and make it more
 difficult to swap components or retrain parts of the pipeline. Multi-task
 learning can affect your accuracy (either positively or negatively), and may
 require some retuning of your hyper-parameters.
@@ -87,11 +87,11 @@ require some retuning of your hyper-parameters.
 | ✅ **faster:** embed the documents once for your whole pipeline | ❌ **slower:** rerun the embedding for each component |
 | ❌ **less composable:** all components require the same embedding component in the pipeline | ✅ **modular:** components can be moved and swapped freely |
 
-You can share a single transformer or other tok2vec model between multiple components
-by adding a [`Transformer`](/api/transformer) or [`Tok2Vec`](/api/tok2vec) component
-near the start of your pipeline. Components later in the pipeline can "connect"
-to it by including a **listener layer** like [Tok2VecListener](/api/architectures#Tok2VecListener)
-within their model.
+You can share a single transformer or other tok2vec model between multiple
+components by adding a [`Transformer`](/api/transformer) or
+[`Tok2Vec`](/api/tok2vec) component near the start of your pipeline. Components
+later in the pipeline can "connect" to it by including a **listener layer** like
+[Tok2VecListener](/api/architectures#Tok2VecListener) within their model.
 
 ![Pipeline components listening to shared embedding component](../images/tok2vec-listener.svg)
 
@@ -102,9 +102,10 @@ listeners, allowing the listeners to **reuse the predictions** when they are
 eventually called. A similar mechanism is used to pass gradients from the
 listeners back to the model. The [`Transformer`](/api/transformer) component and
 [TransformerListener](/api/architectures#TransformerListener) layer do the same
-thing for transformer models, but the `Transformer` component will also save the
-transformer outputs to the `doc._.trf_data` extension attribute, giving you
-access to them after the pipeline has finished running.
+thing for transformer models, but the `Transformer` component will also save the
+transformer outputs to the
+[`Doc._.trf_data`](/api/transformer#custom_attributes) extension attribute,
+giving you access to them after the pipeline has finished running.
diff --git a/website/docs/usage/layers-architectures.md b/website/docs/usage/layers-architectures.md
index 4f91b1595..aa398f752 100644
--- a/website/docs/usage/layers-architectures.md
+++ b/website/docs/usage/layers-architectures.md
@@ -170,3 +170,16 @@ to the host device unnecessarily.
 - Interaction with `predict`, `get_loss` and `set_annotations`
 - Initialization life-cycle with `begin_training`.
 - Link to relation extraction notebook.
+
+```python
+def update(self, examples):
+    docs = [ex.predicted for ex in examples]
+    refs = [ex.reference for ex in examples]
+    predictions, backprop = self.model.begin_update(docs)
+    gradient = self.get_loss(predictions, refs)
+    backprop(gradient)
+
+def __call__(self, doc):
+    predictions = self.model([doc])
+    self.set_annotations(predictions)
+```
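
The first file's changes describe components listening to a shared `Transformer` component and the `Doc._.trf_data` extension attribute. A minimal usage sketch of that behavior (not part of the patch; it assumes `spacy-transformers` and a trained transformer pipeline such as `en_core_web_trf` are installed):

```python
import spacy

# Load a transformer-based pipeline. Its tagger, parser etc. listen to the
# shared "transformer" component, so each text is embedded only once.
nlp = spacy.load("en_core_web_trf")

doc = nlp("Shared embedding layers keep pipelines fast and small.")

# The Transformer component also saves its raw outputs to the custom extension
# attribute, so they remain accessible after the pipeline has finished running.
print(type(doc._.trf_data))
```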
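
The `update`/`__call__` pseudocode added to `layers-architectures.md` leaves out where the gradient comes from and how the weights are actually updated. Below is a self-contained Thinc sketch of the same begin_update/backprop cycle; the `ToyComponent` class, the squared-error gradient, and the array shapes are illustrative assumptions, not spaCy's actual `Pipe` API:

```python
import numpy
from thinc.api import Adam, Linear

class ToyComponent:
    """Illustrative stand-in for a trainable component wrapping a Thinc model."""

    def __init__(self, n_in: int, n_out: int):
        self.model = Linear(n_out, n_in)
        self.optimizer = Adam(0.001)

    def update(self, X, Y):
        # Run the model and get a callback to backpropagate through it.
        predictions, backprop = self.model.begin_update(X)
        # Squared-error gradient, standing in for a real get_loss().
        gradient = predictions - Y
        backprop(gradient)
        # Apply the accumulated gradients to the model's weights.
        self.model.finish_update(self.optimizer)
        return float((gradient ** 2).sum())

    def __call__(self, X):
        return self.model.predict(X)

X = numpy.random.uniform(-1, 1, (4, 5)).astype("f")
Y = numpy.zeros((4, 2), dtype="f")
component = ToyComponent(n_in=5, n_out=2)
component.model.initialize(X=X, Y=Y)
print(component.update(X, Y))
```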