Ines Montani
|
b507f61629
|
Tidy up and move noun_chunks, token_match, url_match
|
2020-07-22 22:18:46 +02:00 |
Ines Montani
|
24f72c669c
|
Merge branch 'develop' into master-tmp
|
2020-05-21 18:39:06 +02:00 |
adrianeboyd
|
f4ef64a526
|
Improve tokenization for UD Dutch corpora (#5259)
* Improve tokenization for UD Dutch corpora
Improve tokenization for UD Dutch Alpino and LassySmall.
* Format Dutch tokenizer exceptions
|
2020-04-06 13:18:07 +02:00 |
Ines Montani
|
db55577c45
|
Drop Python 2.7 and 3.5 (#4828)
* Remove unicode declarations
* Remove Python 3.5 and 2.7 from CI
* Don't require pathlib
* Replace compat helpers
* Remove OrderedDict
* Use f-strings
* Set Cython compiler language level
* Fix typo
* Re-add OrderedDict for Table
* Update setup.cfg
* Revert CONTRIBUTING.md
* Revert lookups.md
* Revert top-level.md
* Small adjustments and docs [ci skip]
|
2019-12-22 01:53:56 +01:00 |
Ines Montani
|
145c0b7e88
|
Tidy up and auto-format
|
2019-04-09 11:40:19 +02:00 |
Yves Peirsman
|
951825532c
|
Improved Dutch language resources and Dutch lemmatization (#3409)
* Improved Dutch language resources and Dutch lemmatization
* Fix conftest
* Update punctuation.py
* Auto-format
* Format and fix tests
* Remove unused test file
* Re-add deleted test
* removed redundant infix regex pattern for ','; note: brackets + simple hyphen remains
* Cleaner lemmatization files
|
2019-04-03 14:13:26 +02:00 |