mirror of https://github.com/explosion/spaCy.git
Various docs updates for v3.1 (#8406)
* Update for Catalan/Italian lemmatizer changes
* Add warning about relevance of section
parent 7abfa25035
commit e39d1bd4ab
@@ -64,11 +64,13 @@ libraries (`pymorphy2`).
 | Language | Default Mode |
 | -------- | ------------ |
 | `bn` | `rule` |
+| `ca` | `pos_lookup` |
 | `el` | `rule` |
 | `en` | `rule` |
 | `es` | `rule` |
 | `fa` | `rule` |
 | `fr` | `rule` |
+| `it` | `pos_lookup` |
 | `mk` | `rule` |
 | `nb` | `rule` |
 | `nl` | `rule` |
@@ -97,9 +97,10 @@ In the `sm`/`md`/`lg` models:
 tagger. For English, the attribute ruler can improve its mapping from
 `token.tag` to `token.pos` if dependency parses from a `parser` are present,
 but the parser is not required.
-- The `lemmatizer` component for many languages (Dutch, English, French, Greek,
-Macedonian, Norwegian, Polish and Spanish) requires `token.pos` annotation
-from either `tagger`+`attribute_ruler` or `morphologizer`.
+- The `lemmatizer` component for many languages (Catalan, Dutch, English,
+French, Greek, Italian, Macedonian, Norwegian, Polish and Spanish) requires
+`token.pos` annotation from either `tagger`+`attribute_ruler` or
+`morphologizer`.
 - The `ner` component is independent with its own internal tok2vec layer.
 
 ### Transformer pipeline design {#design-trf}
@@ -133,9 +134,9 @@ nlp = spacy.load("en_core_web_trf", disable=["tagger", "attribute_ruler", "lemma
 Token.pos">
 
 The lemmatizer depends on `tagger`+`attribute_ruler` or `morphologizer` for
-Dutch, English, French, Greek, Macedonian, Norwegian, Polish and Spanish. If you
-disable any of these components, you'll see lemmatizer warnings unless the
-lemmatizer is also disabled.
+Catalan, Dutch, English, French, Greek, Italian, Macedonian, Norwegian, Polish
+and Spanish. If you disable any of these components, you'll see lemmatizer
+warnings unless the lemmatizer is also disabled.
 
 </Infobox>
 
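Note on the hunk above: the dependency rule it documents (the lemmatizer needs `token.pos` from either `tagger`+`attribute_ruler` or `morphologizer`) can be sketched as a small check over component names. This is only an illustration of the rule as worded in the docs text. `lemmatizer_missing_pos_source` is a hypothetical helper, not a spaCy function, and the component list in the usage lines is an assumed example rather than the real contents of any pipeline package.

```python
# Hypothetical helper, NOT part of spaCy: it only encodes the rule quoted in
# the Infobox text (lemmatizer needs tagger+attribute_ruler or morphologizer).
def lemmatizer_missing_pos_source(pipe_names, disabled):
    """Return True if the lemmatizer stays active while its POS source is off."""
    disabled = set(disabled)
    active = [name for name in pipe_names if name not in disabled]
    if "lemmatizer" not in active:
        # Lemmatizer disabled (or absent): nothing left to warn about.
        return False
    has_pos_source = "morphologizer" in active or (
        "tagger" in active and "attribute_ruler" in active
    )
    return not has_pos_source


# Assumed, illustrative component names; a real pipeline may differ.
pipes = ["transformer", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner"]

print(lemmatizer_missing_pos_source(pipes, ["tagger", "attribute_ruler"]))
print(lemmatizer_missing_pos_source(pipes, ["tagger", "attribute_ruler", "lemmatizer"]))
print(lemmatizer_missing_pos_source(pipes, ["parser"]))
```

The first call corresponds to the case the Infobox warns about; the second disables the lemmatizer together with its POS source, and the third leaves the POS source untouched.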
@@ -184,6 +185,12 @@ nlp = spacy.load("en_core_web_trf", disable=["tagger", "parser", "attribute_rule
 
 #### Move NER to the end of the pipeline
 
+<Infobox title="For v3.0.x models only" variant="warning">
+
+As of v3.1, the NER component is at the end of the pipeline by default.
+
+</Infobox>
+
 For access to `POS` and `LEMMA` features in an `entity_ruler`, move `ner` to the
 end of the pipeline after `attribute_ruler` and `lemmatizer`:
 