mirror of https://github.com/explosion/spaCy.git
Commit 98a9e063b6 (parent: 8dfc4cbfe7): Update docs [ci skip]

## Shared embedding layers {#embedding-layers}

spaCy lets you share a single transformer or other token-to-vector ("tok2vec")
embedding layer between multiple components. You can even update the shared
layer, performing **multi-task learning**. Reusing the tok2vec layer between
components can make your pipeline run a lot faster and result in much smaller
models. However, it can make the pipeline less modular and make it more
difficult to swap components or retrain parts of the pipeline. Multi-task
learning can affect your accuracy (either positively or negatively), and may
require some retuning of your hyper-parameters.

| Shared                                                                                      | Independent                                                |
| -------------------------------------------------------------------------------------------- | ------------------------------------------------------------ |
| ✅ **faster:** embed the documents once for your whole pipeline                              | ❌ **slower:** rerun the embedding for each component        |
| ❌ **less composable:** all components require the same embedding component in the pipeline  | ✅ **modular:** components can be moved and swapped freely   |
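
If you start out shared and later need the modularity, newer spaCy v3 releases
let you copy the shared weights into a listening component so that it becomes
independent. A minimal sketch, assuming a pipeline with components named
"tok2vec" and "tagger":

```python
import spacy

# Hypothetical pipeline directory with a shared "tok2vec" component
# that the "tagger" listens to; both names are assumptions.
nlp = spacy.load("./my_pipeline")
# Copy the shared embedding weights into the tagger's own model,
# replacing its Tok2VecListener so it no longer depends on "tok2vec".
nlp.replace_listeners("tok2vec", "tagger", ["model.tok2vec"])
```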

You can share a single transformer or other tok2vec model between multiple
components by adding a [`Transformer`](/api/transformer) or
[`Tok2Vec`](/api/tok2vec) component near the start of your pipeline. Components
later in the pipeline can "connect" to it by including a **listener layer** like
[Tok2VecListener](/api/architectures#Tok2VecListener) within their model.

![Pipeline components listening to shared embedding component](../images/tok2vec-listener.svg)

A similar mechanism is used to pass gradients from the
listeners back to the model. The [`Transformer`](/api/transformer) component and
[TransformerListener](/api/architectures#TransformerListener) layer do the same
thing for transformer models, but the `Transformer` component will also save the
transformer outputs to the
[`Doc._.trf_data`](/api/transformer#custom_attributes) extension attribute,
giving you access to them after the pipeline has finished running.
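
For instance, a minimal sketch, assuming the `spacy-transformers` package and
the `en_core_web_trf` pipeline are installed:

```python
import spacy

nlp = spacy.load("en_core_web_trf")
doc = nlp("spaCy can share one transformer across the whole pipeline.")
# The Transformer component has saved the raw transformer output
# (tensors plus alignment to spaCy tokens) on the extension attribute:
print(doc._.trf_data)
```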

<!-- TODO: show example of implementation via config, side by side -->
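
Until that example is filled in, here is a minimal sketch of the shared setup
in the training config. The component names, architecture versions and sizes
below are illustrative assumptions, not documented defaults:

```ini
# Sketch only: names, versions and sizes are illustrative assumptions.
[components.tok2vec]
factory = "tok2vec"

[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v2"

[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"
width = 96
attrs = ["NORM", "PREFIX", "SUFFIX", "SHAPE"]
rows = [5000, 1000, 2500, 2500]
include_static_vectors = false

[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v2"
width = 96
depth = 4
window_size = 1
maxout_pieces = 3

[components.tagger]
factory = "tagger"

[components.tagger.model]
@architectures = "spacy.Tagger.v2"

[components.tagger.model.tok2vec]
# The listener connects the tagger to the shared "tok2vec" component above.
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode.width}
upstream = "tok2vec"
```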

- Interaction with `predict`, `get_loss` and `set_annotations`
- Initialization life-cycle with `begin_training`.
- Link to relation extraction notebook.
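
A simplified sketch of how `update` and `__call__` drive the component's model
(optimizer updates, the `drop` rate and `losses` bookkeeping are omitted):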

```python
def update(self, examples):
    # Get the predicted and gold-standard docs from the training examples.
    docs = [ex.predicted for ex in examples]
    refs = [ex.reference for ex in examples]
    # Run the model and get a callback to backpropagate through it.
    predictions, backprop = self.model.begin_update(docs)
    # Compute the gradient of the loss against the gold references
    # and propagate it back through the model.
    gradient = self.get_loss(predictions, refs)
    backprop(gradient)

def __call__(self, doc):
    # At inference time, predict without gradient bookkeeping, write
    # the results back to the Doc and return it for the next component.
    predictions = self.model.predict([doc])
    self.set_annotations(predictions)
    return doc
```
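
In spaCy's real trainable components, `update` also receives `drop`, `sgd` and
`losses` keyword arguments, and `set_annotations` receives the docs together
with the predicted scores; the sketch above only shows the core data flow.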