8 Commits

Author SHA1 Message Date
Jens Dahl Møllerhøj e5055e3cf6 Add Danish lemmatizer (#2184) 2018-04-07 19:07:28 +02:00
    * add danish lemmatizer
    * fill contributor agreement
Søren Lind Kristiansen 15d13efafd Tune Danish tokenizer to more closely match tokenization in Universal Dependencies. 2017-12-20 17:36:52 +01:00
Søren Lind Kristiansen 0ffd27b0f6 Add several Danish alternative spellings 2017-11-27 13:35:41 +01:00
Søren Lind Kristiansen 6aa241bcec Add day of month tokenizer exceptions for Danish. 2017-11-24 15:03:24 +01:00
Søren Lind Kristiansen 0c276ed020 Add weekday abbreviations and remove ambiguous month abbreviations for Danish. 2017-11-24 14:43:29 +01:00
Søren Lind Kristiansen 056547e989 Add multiple tokenizer exceptions for Danish. 2017-11-24 11:51:26 +01:00
Søren Lind Kristiansen 8dc265ac0c Add test for tokenization of 'i.' for Danish. 2017-11-24 11:29:37 +01:00
mollerhoj e840077601 Add some basic tests for Danish 2017-07-03 15:49:51 +02:00