example wrapped Torch model and chaining with Thinc

@@ -118,7 +118,7 @@ code.

If no model is specified for the [`TextCategorizer`](/api/textcategorizer), the
[TextCatEnsemble](/api/architectures#TextCatEnsemble) architecture is used by
default. This architecture combines a simple bag-of-words model with a neural
network, usually resulting in the most accurate results, but at the cost of
speed. The config file for this model would look something like this:
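
A minimal sketch of such a `[model]` block, assuming the
`spacy.TextCatEnsemble.v1` registered architecture with its documented default
settings (adjust the values for your own pipeline):

```ini
[model]
@architectures = "spacy.TextCatEnsemble.v1"
exclusive_classes = false
pretrained_vectors = null
width = 64
embed_size = 2000
conv_depth = 2
window_size = 1
ngram_size = 1
dropout = null
nO = null
```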

@@ -225,10 +225,59 @@ you'll be able to try it out in any of the spaCy components.

Thinc allows you to [wrap models](https://thinc.ai/docs/usage-frameworks)
written in other machine learning frameworks like PyTorch, TensorFlow and MXNet
using a unified [`Model`](https://thinc.ai/docs/api-model) API.

For example, let's use PyTorch to define a very simple neural network
consisting of two hidden `Linear` layers with `ReLU` activation and dropout,
and a softmax-activated output layer.

```python
from torch import nn

# width, hidden_width, nO and dropout are hyperparameters defined elsewhere
torch_model = nn.Sequential(
    nn.Linear(width, hidden_width),
    nn.ReLU(),
    nn.Dropout2d(dropout),
    nn.Linear(hidden_width, nO),
    nn.ReLU(),
    nn.Dropout2d(dropout),
    nn.Softmax(dim=1)
)
```

This PyTorch model can be wrapped as a Thinc `Model` using Thinc's
`PyTorchWrapper`:

```python
from thinc.api import PyTorchWrapper

wrapped_pt_model = PyTorchWrapper(torch_model)
```
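
As a quick sanity check, the wrapped model can then be called like any other
Thinc layer. This is a hypothetical sketch, assuming `width` is the input size
used above; `get_current_ops` returns Thinc's current array backend:

```python
from thinc.api import get_current_ops

ops = get_current_ops()
X = ops.alloc2f(8, width)        # dummy batch: 8 rows of `width` features
Y = wrapped_pt_model.predict(X)  # array of shape (8, nO)
```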

The resulting wrapped `Model` can be used as a **custom architecture** on its
own, or as a **subcomponent of a larger model**. For instance, we can use
Thinc's [`chain`](https://thinc.ai/docs/api-layers#chain) combinator, which
works like `Sequential` in PyTorch, to combine the wrapped model with other
components in a larger network. This effectively means that you can easily wrap
components from different frameworks and "glue" them together with Thinc:

```python
from thinc.api import chain, with_array
from spacy.ml import CharacterEmbed

# width, embed_size, nM and nC are hyperparameters defined elsewhere
embed = CharacterEmbed(width, embed_size, nM, nC)
model = chain(embed, with_array(wrapped_pt_model))
```

In the above example, we have combined our custom PyTorch model with a
character embedding layer defined by spaCy.
[CharacterEmbed](/api/architectures#CharacterEmbed) returns a `Model` that
takes a `List[Doc]` as input, and outputs a `List[Floats2d]`. To make sure that
the wrapped PyTorch model receives valid inputs, we use Thinc's
[`with_array`](https://thinc.ai/docs/api-layers#with_array) helper.
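
For illustration, a hypothetical end-to-end call on a few `Doc` objects might
look like the following, assuming a blank English pipeline and the
hyperparameters defined above:

```python
import spacy

nlp = spacy.blank("en")
docs = [nlp("This is great!"), nlp("Utterly disappointing.")]

model.initialize(X=docs)      # infer any missing dimensions from sample data
scores = model.predict(docs)  # List[Floats2d], one array per Doc
```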

As another example, you could have a model where you use PyTorch just for the
transformer layers, and use "native" Thinc layers to do fiddly input and output
transformations and add on task-specific "heads", as efficiency is less of a
consideration for those parts of the network.
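
A minimal sketch of that pattern, assuming a toy `TransformerEncoder` and using
Thinc's built-in [`Softmax`](https://thinc.ai/docs/api-layers#softmax) layer as
an illustrative task-specific head (the sizes here are arbitrary):

```python
from torch import nn
from thinc.api import PyTorchWrapper, chain, Softmax

# PyTorch handles the heavy transformer layers...
encoder_layer = nn.TransformerEncoderLayer(d_model=64, nhead=8)
torch_encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# ...while a cheap "native" Thinc layer adds the classification head
model = chain(PyTorchWrapper(torch_encoder), Softmax(nO=5, nI=64))
```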