Commit Graph

21 Commits

Author SHA1 Message Date
Wannaphong Phatthiyaphaibun 5cba67146c add thai in spacy2 2017-09-26 21:36:27 +07:00
ines ece30c28a8 Don't split hyphenated words in German. This way, the tokenizer matches the tokenization in German treebanks. 2017-09-16 20:40:15 +02:00
Matthew Honnibal d5fbf27335 Fix test 2017-09-04 16:45:11 +02:00
Matthew Honnibal 644d6c9e1a Improve lemmatization tests, re #1296 2017-09-04 15:17:44 +02:00
Jim Geovedi fbc62a09c7 added {pre,suf,in}fix tests 2017-08-20 13:43:00 +07:00
Jim Geovedi cc4772cac2 reworks 2017-08-03 13:08:38 +07:00
Jim Geovedi 783f7d8b86 added test set for Indonesian language 2017-07-29 18:21:07 +07:00
ines cc9c5dc7a3 Fix noun chunks test 2017-06-05 16:39:04 +02:00
ines a0f4592f0a Update tests 2017-06-05 02:26:13 +02:00
ines 3e105bcd36 Update tests 2017-06-05 02:09:27 +02:00
Matthew Honnibal 58be0e1f6f Update tests 2017-06-04 16:35:06 -05:00
Ines Montani 112c5787eb Merge pull request #1101 from oroszgy/hu_tokenizer_fix: More robust Hungarian tokenizer. 2017-06-04 22:37:51 +02:00
ines e47eef5e03 Update German tokenizer exceptions and tests 2017-06-03 21:07:44 +02:00
ines d77c2cc8bb Add tests for English norm exceptions 2017-06-03 20:59:50 +02:00
Gyorgy Orosz f0c3b09242 More robust Hungarian tokenizer. 2017-05-31 22:28:40 +02:00
ines 20a7003c0d Update model fixtures and reorganise tests 2017-05-29 22:14:31 +02:00
ines d0c6d4f76d Fix formatting 2017-05-23 11:32:00 +02:00
ines 2c3bdd09b1 Add English test for like_num 2017-05-09 11:06:34 +02:00
ines 22375eafb0 Fix and merge attrs and lex_attrs tests 2017-05-09 11:06:25 +02:00
ines c714841cc8 Move language-specific tests to tests/lang 2017-05-09 00:02:37 +02:00
ines 3c0f85de8e Remove imports in /lang/__init__.py 2017-05-08 23:58:07 +02:00