mirror of https://github.com/explosion/spaCy.git
Add infobox
parent 114cb18892
commit 1d5ff3e455
@@ -1019,6 +1019,15 @@ above:
 - The dictionary `b2a_multi` shows that there are no tokens in `spacy_tokens`
 that map to multiple tokens in `other_tokens`.

+<Infobox title="Important note" variant="warning">
+
+The current implementation of the alignment algorithm assumes that both
+tokenizations add up to the same string. For example, you'll be able to align
+`["I", "'", "m"]` and `["I", "'m"]`, which both add up to `"I'm"`, but not
+`["I", "'m"]` and `["I", "am"]`.
+
+</Infobox>
+
 ## Merging and splitting {#retokenization new="2.1"}

 The [`Doc.retokenize`](/api/doc#retokenize) context manager lets you merge and
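
Note on the added infobox: the constraint it describes can be sketched in plain Python. The helper `same_underlying_text` below is a hypothetical name used only for illustration. It is not part of spaCy and is not the library's actual check, which may also normalize whitespace or case. It only shows what "add up to the same string" means for the two examples in the infobox.

```python
def same_underlying_text(tokens_a, tokens_b):
    """Return True if both tokenizations concatenate to the same string.

    Hypothetical helper for illustration; not a spaCy API.
    """
    return "".join(tokens_a) == "".join(tokens_b)


# Both tokenizations join to "I'm", so they satisfy the assumption.
alignable = same_underlying_text(["I", "'", "m"], ["I", "'m"])

# "I'm" versus "Iam": the underlying strings differ, so they do not.
not_alignable = same_underlying_text(["I", "'m"], ["I", "am"])

print(alignable, not_alignable)
```

A pre-check along these lines could be run before attempting an alignment, so that mismatched inputs are caught early with a clear message.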