3 Commits

Author SHA1 Message Date
Adriane Boyd 28ba31e793
Add whitespace and combined augmenters (#10170)
Add whitespace augmenter that inserts a single whitespace token into a
doc containing annotation used in core trained pipelines.

Add a combined augmenter that handles lowercasing, orth variants and
whitespace augmentation.
2022-02-17 15:54:09 +01:00
Adriane Boyd 3f3e8110dc
Fix lowercase augmentation (#7336)
* Fix aborted/skipped augmentation for `spacy.orth_variants.v1` if
lowercasing was enabled for an example
* Simplify `spacy.orth_variants.v1` for `Example` vs. `GoldParse`
* Preserve reference tokenization in `spacy.lower_case.v1`
2021-03-09 14:02:32 +11:00
Ines Montani 3c36a57e84
Update data augmenters (#6196)
* Draft lower-case augmenter
* Make warning a debug log
* Update lowercase augmenter, docs and tests

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-10-04 17:46:29 +02:00