Jens Dahl Møllerhøj
|
e5055e3cf6
|
Add Danish lemmatizer (#2184)
* add danish lemmatizer
* fill contributor agreement
|
2018-04-07 19:07:28 +02:00 |
Kit
|
9bc524982e
|
Find lowercased forms of numeric words
|
2018-01-08 03:25:08 +01:00 |
Søren Lind Kristiansen
|
bef735aef7
|
Fix Danish abbreviation 'm.h.t.'
|
2017-12-21 09:24:31 +01:00 |
Ines Montani
|
a3dd167d7f
|
Merge branch 'master' into da_ud_tokenization
|
2017-12-20 21:05:34 +00:00 |
Søren Lind Kristiansen
|
7a2f2f6f94
|
Fix formatting.
|
2017-12-20 18:37:37 +01:00 |
Søren Lind Kristiansen
|
15d13efafd
|
Tune Danish tokenizer to more closely match tokenization in Universal Dependencies.
|
2017-12-20 17:36:52 +01:00 |
Kim FalkJørgensen
|
648dc60755
|
Remove the incorrect exception 'm.h.t'
|
2017-12-20 10:02:39 +01:00 |
Kim FalkJørgensen
|
9c9f4ef84a
|
Fixing a translation error in examples.py
Adding an exception in the tokenizer_exceptions.py
|
2017-12-19 15:26:50 +01:00 |
Søren Lind Kristiansen
|
d86b537a38
|
Enable morph rules for Danish
|
2017-11-30 15:58:02 +01:00 |
Søren Lind Kristiansen
|
13a988adc3
|
Remove 'Number[psor]'
|
2017-11-30 15:55:04 +01:00 |
Søren Lind Kristiansen
|
dd6fde18a9
|
Add more Danish morph rules and clean up existing ones
|
2017-11-30 11:17:19 +01:00 |
Ines Montani
|
9052643e2c
|
Merge pull request #1653 from sorenlind/da_example_typo
Fix typo
|
2017-11-27 14:47:42 +00:00 |
Søren Lind Kristiansen
|
5fe58b885b
|
Fix typo
|
2017-11-27 15:36:18 +01:00 |
Ines Montani
|
d52b1ab245
|
Add unicode_literals (hopefully fixes test failure on Python 2)
|
2017-11-27 15:16:54 +01:00 |
Søren Lind Kristiansen
|
0ffd27b0f6
|
Add several Danish alternative spellings
|
2017-11-27 13:35:41 +01:00 |
Søren Lind Kristiansen
|
ef03e9ea53
|
Remove unused import.
|
2017-11-25 13:04:02 +01:00 |
Søren Lind Kristiansen
|
6aa241bcec
|
Add day of month tokenizer exceptions for Danish.
|
2017-11-24 15:03:24 +01:00 |
Søren Lind Kristiansen
|
0c276ed020
|
Add weekday abbreviations and remove ambiguous month abbreviations for Danish.
|
2017-11-24 14:43:29 +01:00 |
Søren Lind Kristiansen
|
056547e989
|
Add multiple tokenizer exceptions for Danish.
|
2017-11-24 11:51:26 +01:00 |
Søren Lind Kristiansen
|
ac8116510d
|
Fix tokenization of 'i.' for Danish.
|
2017-11-24 11:16:53 +01:00 |
ines
|
acb9bdb852
|
Fix PRON_LEMMA imports
|
2017-11-06 17:41:53 +01:00 |
ines
|
819e30a26e
|
Tidy up tokenizer exceptions
|
2017-11-01 23:02:45 +01:00 |
ines
|
7e424a1804
|
Don't copy exception dicts if not necessary and tidy up
|
2017-10-31 21:05:29 +01:00 |
Ines Montani
|
facf77e541
|
Merge branch 'develop' into support-danish
|
2017-10-24 11:53:19 +02:00 |
ines
|
8ce6f96180
|
Don't make copies of language data components
|
2017-10-11 15:34:55 +02:00 |
ines
|
0c2343d73a
|
Tidy up language data
|
2017-10-11 02:22:49 +02:00 |
ines
|
1fe5e1a4d1
|
Add language example sentences (see #1107)
da, de, en, es, fr, he, it, nb, pl, pt, sv
|
2017-08-19 12:22:29 +02:00 |
mollerhoj
|
85144835da
|
Add Tag_map for Danish
|
2017-07-03 15:52:55 +02:00 |
mollerhoj
|
64c732918a
|
Add Morph_rules. (TODO: Not working?)
|
2017-07-03 15:52:55 +02:00 |
mollerhoj
|
3b2cb107a3
|
Add like_num functionality to Danish
|
2017-07-03 15:49:51 +02:00 |
mollerhoj
|
e8f40ceed8
|
Add short names of months to tokenizer_exceptions
|
2017-07-03 15:49:51 +02:00 |
mollerhoj
|
dc5be7d2f3
|
Cleanup list of Danish stopwords
|
2017-07-03 15:40:58 +02:00 |
ines
|
4c643d74c5
|
Add norm exceptions to other Language classes
|
2017-06-03 22:29:21 +02:00 |
ines
|
924e8506de
|
Move Defaults subclass to module scope (necessary for pickling)
|
2017-05-20 19:02:27 +02:00 |
ines
|
48177c4f92
|
Add missing tokenizer exceptions
|
2017-05-12 09:25:24 +02:00 |
ines
|
bb8be3d194
|
Add Danish language data
|
2017-05-10 21:15:12 +02:00 |