spaCy/spacy/lang
Jan Jessewitsch c7e4fe9c5c
Fix/Improve german stop words (#5024)
* Fix german stop words

Two stop words ("einige" and "einigen") were run together as a single entry.
Remove three nouns that may serve as stop words in specific contexts (e.g. religious or news text) but are not suitable for general use.

* Create Jan-711.md
2020-02-17 18:59:22 +01:00
af 💫 Add base Language classes for more languages (#3276) 2019-02-15 01:31:19 +11:00
ar
bg Update examples and languages.json [ci skip] 2019-09-15 17:56:40 +02:00
bn
ca
cs 💫 Add base Language classes for more languages (#3276) 2019-02-15 01:31:19 +11:00
da Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
de Fix/Improve german stop words (#5024) 2020-02-17 18:59:22 +01:00
el Standardize Greek tag map setup (#4997) 2020-02-11 17:44:56 -05:00
en Tidy up and auto-format [ci skip] 2019-10-24 16:20:48 +02:00
es
et 💫 Add base Language classes for more languages (#3276) 2019-02-15 01:31:19 +11:00
fa Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
fi Improvements for Finnish tokenizer (#4985) 2020-02-10 20:32:43 -05:00
fr
ga
he Auto-format [ci skip] 2019-03-11 17:10:50 +01:00
hi
hr Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
hu Improve URL_PATTERN and handling in tokenizer (#4374) 2019-10-05 13:00:09 +02:00
id Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
is
it
ja Tidy up and auto-format 2019-12-21 19:04:17 +01:00
kn Enhancing Kannada language Resources (#3755) 2019-05-20 12:56:10 +02:00
ko
lb
lt Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
lv 💫 Add base Language classes for more languages (#3276) 2019-02-15 01:31:19 +11:00
mr Tidy up [ci skip] 2019-06-12 13:38:23 +02:00
nb Tidy up and auto-format 2019-12-21 19:04:17 +01:00
nl Refactor lemmatizer and data table integration (#4353) 2019-10-01 21:36:03 +02:00
pl Tidy up and auto-format 2019-08-20 17:36:34 +02:00
pt Add missing tags to el/es/pt tag maps (#4696) 2019-11-23 14:57:21 +01:00
ro
ru
si
sk Add Slovak language tools implementation (#4943) 2020-02-03 13:03:59 +01:00
sl
sq
sr Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
sv Adding noun_chunks to the Swedish language model (sv) (#4422) 2019-10-21 12:57:06 +02:00
ta
te 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
th fix thai bug (#3693) 2019-05-10 14:21:34 +02:00
tl Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
tr
tt Tidy up and auto-format 2019-08-20 17:36:34 +02:00
uk Update Ukrainian lemmatizer with new lookups (#4359) 2019-10-02 12:04:06 +02:00
ur Move lookup tables out of the core library (#4346) 2019-10-01 00:01:27 +02:00
vi 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
xx
yo
zh Auto-format 2019-11-20 13:15:24 +01:00
__init__.py
char_classes.py
lex_attrs.py
norm_exceptions.py
punctuation.py
tag_map.py
tokenizer_exceptions.py
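
The stop-word fix described in the latest commit (#5024) can be illustrated with a minimal sketch. It assumes the stop-word list is kept as a whitespace-separated block of text that is split into a set; the two word lists and the helper `parse_stop_words` below are hypothetical illustrations, not excerpts from `de/stop_words.py`. Under that assumption, a missing space silently merges two entries into one token that matches neither word.

```python
# Hypothetical excerpts for illustration only; not copied from the repository.
BROKEN_RAW = """
ein eine einem
einigeeinigen einiger
"""

FIXED_RAW = """
ein eine einem
einige einigen einiger
"""


def parse_stop_words(raw):
    """Split a whitespace-separated block of text into a set of stop words."""
    return set(raw.split())


broken = parse_stop_words(BROKEN_RAW)
fixed = parse_stop_words(FIXED_RAW)

# With the space missing, neither word is recognised on its own.
print("einige" in broken)      # False
print("einige" in fixed)       # True

# Entries gained by restoring the missing space.
print(sorted(fixed - broken))  # ['einige', 'einigen']
```

Comparing the two sets, as in the last line, is one simple way to spot this kind of merge, since the fused entry shows up as a single unusually long token.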