Matthew Honnibal
|
2ab0f2d186
|
Merge pull request #1664 from jimregan/italian-lemmatizer
BOM in Italian lemmatiser
|
2017-12-06 11:09:04 +01:00 |
Matthew Honnibal
|
3f247119d3
|
Merge pull request #1668 from sorenlind/da_morph
Add more Danish morph rules and clean up existing ones
|
2017-12-06 11:08:09 +01:00 |
ines
|
f2ea6d4713
|
Add Dutch example sentences (see #1107)
|
2017-12-01 23:36:05 +01:00 |
Søren Lind Kristiansen
|
d86b537a38
|
Enable morph rules for Danish
|
2017-11-30 15:58:02 +01:00 |
Søren Lind Kristiansen
|
13a988adc3
|
Remove 'Number[psor]'
|
2017-11-30 15:55:04 +01:00 |
Søren Lind Kristiansen
|
dd6fde18a9
|
Add more Danish morph rules and clean up existing ones
|
2017-11-30 11:17:19 +01:00 |
Vadim Mazaev
|
4ba7ddf651
|
Bugfixes
|
2017-11-30 12:29:38 +03:00 |
Matthew Honnibal
|
f9ed9ea529
|
Merge pull request #1624 from GreenRiverRUS/russian
Add support for Russian
|
2017-11-29 23:10:01 +01:00 |
Jim O'Regan
|
ba6a23fd11
|
BOM in Italian lemmatiser
|
2017-11-29 17:40:07 +00:00 |
Ines Montani
|
9052643e2c
|
Merge pull request #1653 from sorenlind/da_example_typo
Fix typo
|
2017-11-27 14:47:42 +00:00 |
Søren Lind Kristiansen
|
5fe58b885b
|
Fix typo
|
2017-11-27 15:36:18 +01:00 |
Ines Montani
|
d52b1ab245
|
Add unicode_literals (hopefully fixes test failure on Python 2)
|
2017-11-27 15:16:54 +01:00 |
Søren Lind Kristiansen
|
0ffd27b0f6
|
Add several Danish alternative spellings
|
2017-11-27 13:35:41 +01:00 |
Vadim Mazaev
|
cacd859dcd
|
Added tag map, fixed test failures, added more exceptions
|
2017-11-26 20:54:48 +03:00 |
Søren Lind Kristiansen
|
ef03e9ea53
|
Remove unused import.
|
2017-11-25 13:04:02 +01:00 |
Søren Lind Kristiansen
|
6aa241bcec
|
Add day of month tokenizer exceptions for Danish.
|
2017-11-24 15:03:24 +01:00 |
Søren Lind Kristiansen
|
0c276ed020
|
Add weekday abbreviations and remove ambiguous month abbreviations for Danish.
|
2017-11-24 14:43:29 +01:00 |
Søren Lind Kristiansen
|
056547e989
|
Add multiple tokenizer exceptions for Danish.
|
2017-11-24 11:51:26 +01:00 |
Søren Lind Kristiansen
|
ac8116510d
|
Fix tokenization of 'i.' for Danish.
|
2017-11-24 11:16:53 +01:00 |
Vadim Mazaev
|
81314f8659
|
Fixed tokenizer: added char classes; added first lemmatizer and
tokenizer tests
|
2017-11-21 22:23:59 +03:00 |
Vadim Mazaev
|
52ee1f9bf9
|
Updated Russian Language, added lemmatizer, norm exceptions and lex
attrs
|
2017-11-21 11:44:46 +03:00 |
Vadim Mazaev
|
a0739a06d4
|
Returned russian support from v1.10 branch
|
2017-11-17 17:06:15 +03:00 |
ines
|
c9d72de0fb
|
Add dummy serialization methods for Japanese and missing lang getter (resolves #1557)
|
2017-11-15 12:44:02 +01:00 |
Mathias Deschamps
|
c0691b2ab4
|
Add tokenizer exceptions for ing verbs
Extend list of tokenizing exceptions introduced in 123810b
|
2017-11-13 17:46:05 +01:00 |
Mathias Deschamps
|
288298ead9
|
Add norm exception for ing verbs
Some ing verbs are sometimes written in or in'. Make the NORM form correct
|
2017-11-13 17:46:05 +01:00 |
Abhinav Sharma
|
59f5740ede
|
improved upon the list of included stop_words
|
2017-11-13 17:13:49 +05:30 |
ines
|
123810b6de
|
Add "lovin'" to tokenizer exceptions (see #1248)
|
2017-11-09 17:09:30 +01:00 |
Ines Montani
|
42b241ccd0
|
Update language code in usage example in comment
|
2017-11-08 11:36:38 +01:00 |
Abhinav Sharma
|
84edade82d
|
Create examples.py
Populated the file with the translations of English example sentences
|
2017-11-08 13:23:08 +05:30 |
ines
|
bcf42b8846
|
Fix typo
|
2017-11-08 01:06:37 +01:00 |
ines
|
acb9bdb852
|
Fix PRON_LEMMA imports
|
2017-11-06 17:41:53 +01:00 |
ines
|
baa231745c
|
Fix Dutch tag map
|
2017-11-05 21:41:50 +01:00 |
ines
|
507ecb67af
|
Fix Spanish tag map
|
2017-11-05 19:23:34 +01:00 |
ines
|
975e1042ff
|
Fix Italian tag map
|
2017-11-05 18:34:09 +01:00 |
ines
|
6b2d6e4937
|
Fix Portuguese tag map
|
2017-11-05 18:31:00 +01:00 |
ines
|
fa2687fded
|
Fix Dutch tag map
|
2017-11-05 17:57:59 +01:00 |
ines
|
fb8990d916
|
Fix Spanish tag map
|
2017-11-05 17:48:46 +01:00 |
ines
|
9d13288f73
|
Fix French tag map
|
2017-11-05 17:47:59 +01:00 |
ines
|
54579805c5
|
Fix French tag map
|
2017-11-05 17:44:05 +01:00 |
Matthew Honnibal
|
0d4bd6414e
|
Fix Italian tag map
|
2017-11-05 14:11:03 +01:00 |
ines
|
ef597622a6
|
Add Portuguese tag map
|
2017-11-05 13:58:34 +01:00 |
ines
|
793c62dfda
|
Add Dutch tag map
|
2017-11-05 13:48:07 +01:00 |
ines
|
f7485a09c8
|
Fix Italian tag map
|
2017-11-05 13:12:58 +01:00 |
ines
|
3cef901834
|
Add tag map for French and Italian
|
2017-11-04 23:32:51 +01:00 |
ines
|
6c15aafebd
|
Fix formatting
|
2017-11-04 23:07:02 +01:00 |
ines
|
9baab241b4
|
Add skeleton language data for Turkish
|
2017-11-02 16:32:24 +01:00 |
ines
|
c6fea3e5f6
|
Add Romanian and Croatian skeletons (experimental)
Add language data templates to make it easier for others to contribute to the language support
|
2017-11-01 23:04:28 +01:00 |
ines
|
18c859500b
|
Add missing imports
|
2017-11-01 23:02:51 +01:00 |
ines
|
819e30a26e
|
Tidy up tokenizer exceptions
|
2017-11-01 23:02:45 +01:00 |
ines
|
9659391944
|
Update deprecated methods and add warnings
|
2017-11-01 16:49:42 +01:00 |