mirror of https://github.com/explosion/spaCy.git
Adjust wording [ci skip]
parent 57d7076a72
commit c3ead02ea5
@@ -970,9 +970,10 @@ optimized for compatibility with treebank annotations. Other tools and resources
 can sometimes tokenize things differently – for example, `"I'm"` →
 `["I", "'", "m"]` instead of `["I", "'m"]`.
 
-In cases like that, you often want to align the tokenization so that you can
-merge annotations from different sources together, or take vectors predicted by
-a [pre-trained BERT model](https://github.com/huggingface/pytorch-transformers)
+In situations like that, you often want to align the tokenization so that you
+can merge annotations from different sources together, or take vectors predicted
+by a
+[pre-trained BERT model](https://github.com/huggingface/pytorch-transformers)
 and apply them to spaCy tokens. spaCy's [`gold.align`](/api/goldparse#align)
 helper returns a `(cost, a2b, b2a, a2b_multi, b2a_multi)` tuple describing the
 number of misaligned tokens, the one-to-one mappings of token indices in both
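Note on the hunk above: the documented paragraph describes a tuple of the shape `(cost, a2b, b2a, a2b_multi, b2a_multi)`. The following is a minimal pure-Python sketch of what such a tuple could contain for the `"I'm"` example. It is not spaCy's implementation: `char_spans` and `align_by_offsets` are hypothetical helpers, the use of `-1` for "no one-to-one match" is an assumption, and so is the way `cost` is counted here (tokens on either side without a one-to-one match).

```python
def char_spans(tokens):
    """Return (start, end) character offsets of each token in the joined text."""
    spans, pos = [], 0
    for tok in tokens:
        spans.append((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def align_by_offsets(tokens_a, tokens_b):
    """Hypothetical helper: align two tokenizations of the same text.

    Returns (cost, a2b, b2a, a2b_multi, b2a_multi), mirroring the tuple shape
    named in the docs. Assumes both token lists spell out the same string.
    """
    if "".join(tokens_a) != "".join(tokens_b):
        raise ValueError("both tokenizations must spell out the same text")

    spans_a, spans_b = char_spans(tokens_a), char_spans(tokens_b)
    index_a = {span: i for i, span in enumerate(spans_a)}
    index_b = {span: j for j, span in enumerate(spans_b)}

    # One-to-one mappings: identical character spans; -1 marks "no match".
    a2b = [index_b.get(span, -1) for span in spans_a]
    b2a = [index_a.get(span, -1) for span in spans_b]

    def many_to_one(fine, coarse, one_to_one):
        # Map each unmatched token to the token on the other side that
        # fully contains its character span, if there is one.
        multi = {}
        for i, (start, end) in enumerate(fine):
            if one_to_one[i] != -1:
                continue
            for j, (c_start, c_end) in enumerate(coarse):
                if c_start <= start and end <= c_end:
                    multi[i] = j
                    break
        return multi

    a2b_multi = many_to_one(spans_a, spans_b, a2b)
    b2a_multi = many_to_one(spans_b, spans_a, b2a)

    # Assumption: cost counts tokens on either side without a one-to-one match.
    cost = a2b.count(-1) + b2a.count(-1)
    return cost, a2b, b2a, a2b_multi, b2a_multi


other_tokens = ["I", "'", "m"]
spacy_tokens = ["I", "'m"]
cost, a2b, b2a, a2b_multi, b2a_multi = align_by_offsets(other_tokens, spacy_tokens)
print(a2b)        # [0, -1, -1]
print(a2b_multi)  # {1: 1, 2: 1}
```

In this sketch, `"'"` and `"m"` have no one-to-one partner, so both appear in `a2b_multi` pointing at the single token `"'m"`. The real helper may count `cost` and represent unmatched tokens differently.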