ar | Additions to Arabic stop words. (#2422) | 2018-06-08 02:33:23 +02:00
bn | Update morph_rules.py (#3283) | 2019-02-17 12:21:47 +01:00
ca | Typo error fixed (#3284) | 2019-02-17 17:51:02 +01:00
da | Add Danish lemmatizer (#2184) | 2018-04-07 19:07:28 +02:00
de | Also include lowercase norm exceptions | 2018-10-13 15:37:30 +02:00
el | Optimize Greek language support (#2658) | 2018-08-14 02:31:32 +02:00
en | quick typo fix | 2018-03-24 17:26:35 +01:00
es | Fix Spanish noun_chunks (resolves #2210) | 2018-04-18 18:44:01 -04:00
fa | Add Persian(Farsi) language support (#2797) | 2018-10-13 15:31:49 +02:00
fi | Enhancement/lang fi examples (#2547) | 2018-07-15 09:50:27 +02:00
fr | Improving the French lookup dictionnary for ambiguous words (#3185) | 2019-01-31 23:53:45 +01:00
ga | Remove comma that caused list to wrap in tuple! | 2017-10-31 20:13:16 +01:00
he | Don't make copies of language data components | 2017-10-11 15:34:55 +02:00
hi | Fix missing comma | 2018-10-28 00:09:16 +02:00
hr | Update stop_words.py | 2018-03-24 17:31:24 +01:00
hu | Don't copy exception dicts if not necessary and tidy up | 2017-10-31 21:05:29 +01:00
id | Update Indonesian model (#2752) | 2018-09-14 12:30:32 +02:00
it | Fix syntax error in italian lemmatizer | 2018-04-03 23:13:22 +02:00
ja | Making `lang/th/test_tokenizer.py` pass by creating `ThaiTokenizer` (#3078) | 2019-01-10 15:40:37 +01:00
kn | Update stop_words.py | 2019-02-14 12:25:19 +01:00
nb | Updated wordforms for Norwegian lemmatizer (#3007) | 2018-12-06 15:46:18 +01:00
nl | Fix typo [ci skip] | 2018-07-24 18:45:40 +02:00
pl | Improved polish tokenizer and stop words. (#2974) | 2019-02-08 14:27:21 +11:00
pt | Update Portuguese Language (#2790) | 2018-09-29 09:51:45 +02:00
ro | Updates to Romanian support (#2354) | 2018-05-24 11:40:00 +02:00
ru | Ukrainian language added. Small fixes in Russian (#3241) | 2019-02-07 21:05:11 +01:00
si | Adding "This is a sentence" example to Sinhala (#2846) | 2018-10-14 00:06:40 +02:00
sv | Fixed tag map for Swedish Talbanken (#3186) | 2019-02-08 14:28:59 +11:00
ta | Tamil (#3194) | 2019-01-27 06:02:04 +01:00
te | Basic support for Telugu language (#2751) | 2018-09-10 11:53:18 +02:00
th | Making `lang/th/test_tokenizer.py` pass by creating `ThaiTokenizer` (#3078) | 2019-01-10 15:40:37 +01:00
tl | Added alpha support for Tagalog language (#3062) | 2018-12-18 13:08:38 +01:00
tr | trilyon forgotten (#3083) | 2018-12-27 14:44:23 +01:00
tt | Add Tatar Language Support (#2444) | 2018-06-19 10:17:53 +02:00
uk | Ukrainian language added. Small fixes in Russian (#3241) | 2019-02-07 21:05:11 +01:00
ur | Add Urdu Language Support (#2430) | 2018-06-22 11:14:03 +02:00
vi | Add support for Vietnamese in spaCy by leveraging Pyvi, an external Vietnamese tokenizer (#2155) | 2018-03-29 12:19:51 +02:00
xx | Tidy up language data | 2017-10-11 02:22:49 +02:00
zh | Fix Chinese language related bugs (#2634) | 2018-08-07 11:26:31 +02:00
__init__.py | Remove imports in /lang/__init__.py | 2017-05-08 23:58:07 +02:00
char_classes.py | Ukrainian language added. Small fixes in Russian (#3241) | 2019-02-07 21:05:11 +01:00
entity_rules.py | Reorganise entity rules | 2017-05-09 01:37:10 +02:00
lex_attrs.py | Merge pull request #1891 from fucking-signup/master | 2018-02-18 13:47:47 +01:00
norm_exceptions.py | Update base norm exceptions with more unicode characters | 2017-10-14 14:58:52 +02:00
punctuation.py | Add symbols class to punctuation rules to handle emoji (see #1088) | 2017-05-27 17:57:10 +02:00
tag_map.py | Fix formatting | 2017-05-09 11:08:14 +02:00
tokenizer_exceptions.py | Tidy up tokenizer exceptions | 2017-11-01 23:02:45 +01:00