spaCy/spacy/training
Adriane Boyd 1c4df8fd09
Replace pytokenizations with internal alignment (#6293)
* Replace pytokenizations with internal alignment

Replace pytokenizations with an internal alignment algorithm that only
allows the two tokenizations to differ in whitespace and capitalization.

* Rename `spacy.training.align` to `spacy.training.alignment` to contain
the `Alignment` dataclass
* Implement `get_alignments` in `spacy.training.align`
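
For illustration, the restricted alignment described above can be sketched in plain Python. This is a minimal sketch and not the code in `align.pyx`: the names `align_tokens_sketch` and `_chars_with_owners` are hypothetical, and the return shape (two lists of token-index lists, one per direction) is an assumption made for this example. It lowercases both tokenizations, drops whitespace, requires the remaining characters to match exactly, and then links tokens that share a character position.

```python
from typing import List, Tuple


def _chars_with_owners(tokens: List[str]) -> Tuple[str, List[int]]:
    """Return the lowercased, whitespace-free text and, for each
    remaining character, the index of the token it came from."""
    chars: List[str] = []
    owners: List[int] = []
    for idx, tok in enumerate(tokens):
        for ch in tok.lower():
            if not ch.isspace():
                chars.append(ch)
                owners.append(idx)
    return "".join(chars), owners


def align_tokens_sketch(
    a: List[str], b: List[str]
) -> Tuple[List[List[int]], List[List[int]]]:
    """Hypothetical helper: align two tokenizations that may differ
    only in whitespace and capitalization."""
    text_a, owners_a = _chars_with_owners(a)
    text_b, owners_b = _chars_with_owners(b)
    if text_a != text_b:
        raise ValueError("texts differ in more than whitespace/capitalization")
    a2b: List[List[int]] = [[] for _ in a]
    b2a: List[List[int]] = [[] for _ in b]
    # Characters at the same position belong to aligned tokens.
    for i, j in zip(owners_a, owners_b):
        if not a2b[i] or a2b[i][-1] != j:
            a2b[i].append(j)
        if not b2a[j] or b2a[j][-1] != i:
            b2a[j].append(i)
    return a2b, b2a


a2b, b2a = align_tokens_sketch(["a", "b", "c"], ["AB", "c"])
print(a2b)  # [[0], [0], [1]]
print(b2a)  # [[0, 1], [2]]
```

In this sketch a whitespace-only token aligns to nothing, so `align_tokens_sketch([" "], [])` returns `([[]], [])` instead of raising, and any difference beyond whitespace and capitalization raises `ValueError`.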

* Refactor trailing whitespace handling

* Remove unnecessary exception for empty docs

Allow a non-empty whitespace-only doc to be aligned with an empty doc

* Remove empty docs exceptions completely
2020-11-03 16:24:38 +01:00
converters
__init__.pxd
__init__.py
align.pyx
alignment.py
augment.py
batchers.py
corpus.py
example.pxd
example.pyx
gold_io.pyx
initialize.py
iob_utils.py
loggers.py
loop.py
pretrain.py