Merge pull request #6409 from svlandeg/feature/trf-docs

Ines Montani 2020-12-08 06:32:10 +01:00 committed by GitHub
commit ee2ec52f48
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 14 additions and 11 deletions


@@ -125,8 +125,9 @@ class Warnings:
 class Errors:
     E001 = ("No component '{name}' found in pipeline. Available names: {opts}")
     E002 = ("Can't find factory for '{name}' for language {lang} ({lang_code}). "
-            "This usually happens when spaCy calls `nlp.{method}` with custom "
+            "This usually happens when spaCy calls `nlp.{method}` with a custom "
             "component name that's not registered on the current language class. "
+            "If you're using a Transformer, make sure to install 'spacy-transformers'. "
             "If you're using a custom component, make sure you've added the "
             "decorator `@Language.component` (for function components) or "
             "`@Language.factory` (for class components).\n\nAvailable "
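
For context, a minimal sketch of the registration this error message points to, using the standard spaCy v3 decorator (the component name and pipeline below are hypothetical, not part of this commit):

```python
import spacy
from spacy.language import Language

@Language.component("my_component")  # hypothetical name; registers a function component
def my_component(doc):
    # ... inspect or modify the Doc here ...
    return doc

nlp = spacy.blank("en")
nlp.add_pipe("my_component")  # the name resolves, so E002 is not raised
```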


@@ -143,10 +143,10 @@ argument that connects to the shared `tok2vec` component in the pipeline.

 Construct an embedding layer that separately embeds a number of lexical
 attributes using hash embedding, concatenates the results, and passes it through
-a feed-forward subnetwork to build a mixed representation. The features used
-can be configured with the `attrs` argument. The suggested attributes are
-`NORM`, `PREFIX`, `SUFFIX` and `SHAPE`. This lets the model take into account
-some subword information, without construction a fully character-based
+a feed-forward subnetwork to build a mixed representation. The features used can
+be configured with the `attrs` argument. The suggested attributes are `NORM`,
+`PREFIX`, `SUFFIX` and `SHAPE`. This lets the model take into account some
+subword information, without construction a fully character-based
 representation. If pretrained vectors are available, they can be included in the
 representation as well, with the vectors table will be kept static (i.e. it's
 not updated).
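
To make the rewrapped paragraph concrete, a hedged sketch of a config block for this architecture (the exact v1 parameter set is assumed, not shown in this diff; the `width` and `rows` values are illustrative placeholders):

```python
import spacy
from thinc.api import Config

# Assumed v1 parameters; width/rows values are placeholders, and attrs uses
# the suggested attributes from the paragraph above.
cfg = Config().from_str("""
[model]
@architectures = "spacy.MultiHashEmbed.v1"
width = 64
attrs = ["NORM", "PREFIX", "SUFFIX", "SHAPE"]
rows = [5000, 2500, 2500, 2500]
include_static_vectors = false
""")
model = spacy.registry.resolve(cfg)["model"]
```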
@@ -394,9 +394,10 @@ tokens. The layer therefore requires a reduction operation in order to calculate
 a single token vector given zero or more wordpiece vectors.

 | Name          | Description |
-| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `pooling`     | A reduction layer used to calculate the token vectors based on zero or more wordpiece vectors. If in doubt, mean pooling (see [`reduce_mean`](https://thinc.ai/docs/api-layers#reduce_mean)) is usually a good choice. ~~Model[Ragged, Floats2d]~~ |
 | `grad_factor` | Reweight gradients from the component before passing them upstream. You can set this to `0` to "freeze" the transformer weights with respect to the component, or use it to make some components more significant than others. Leaving it at `1.0` is usually fine. ~~float~~ |
+| `upstream`    | A string to identify the "upstream" `Transformer` component to communicate with. By default, the upstream name is the wildcard string `"*"`, but you could also specify the name of the `Transformer` component. You'll almost never have multiple upstream `Transformer` components, so the wildcard string will almost always be fine. ~~str~~ |
 | **CREATES**   | The model using the architecture. ~~Model[List[Doc], List[Floats2d]]~~ |
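
To illustrate the newly documented `upstream` argument in context, a hedged sketch of a listener config using the defaults from the table above (this assumes `spacy-transformers` is installed so the architecture is registered):

```python
import spacy
from thinc.api import Config

# grad_factor and upstream use the defaults described in the table;
# requires spacy-transformers to be installed.
cfg = Config().from_str("""
[model]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
upstream = "*"

[model.pooling]
@layers = "reduce_mean.v1"
""")
model = spacy.registry.resolve(cfg)["model"]
```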
### spacy-transformers.Tok2VecTransformer.v1 {#Tok2VecTransformer}
@@ -563,7 +564,8 @@ from the linear model, where it is stored in `model.attrs["multi_label"]`.

 <Accordion title="spacy.TextCatEnsemble.v1 definition" spaced>

-The v1 was functionally similar, but used an internal `tok2vec` instead of taking it as argument.
+The v1 was functionally similar, but used an internal `tok2vec` instead of
+taking it as argument.

 | Name                 | Description |
 | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |