12 Commits

Author SHA1 Message Date
ines acb9bdb852 Fix PRON_LEMMA imports 2017-11-06 17:41:53 +01:00
ines 819e30a26e Tidy up tokenizer exceptions 2017-11-01 23:02:45 +01:00
ines 7e424a1804 Don't copy exception dicts if not necessary and tidy up 2017-10-31 21:05:29 +01:00
mayukh18 1bc07758fa added few bengali pronouns 2017-10-25 22:24:40 +05:30
ines 8ce6f96180 Don't make copies of language data components 2017-10-11 15:34:55 +02:00
ines 0c2343d73a Tidy up language data 2017-10-11 02:22:49 +02:00
ines a8e58e04ef Add symbols class to punctuation rules to handle emoji (see #1088) 2017-05-27 17:57:10 +02:00
    Currently doesn't work for Hungarian, because of conflicts with the
    custom punctuation rules. Also doesn't take multi-character emoji like
    👩🏽‍💻 into account.
ines 924e8506de Move Defaults subclass to module scope (necessary for pickling) 2017-05-20 19:02:27 +02:00
ines fc0d793360 Reorganise Bengali punctuation rules 2017-05-09 00:01:52 +02:00
ines 614aa09582 Tidy up Bengali tokenizer exceptions 2017-05-08 22:29:49 +02:00
ines 73b577cb01 Fix relative imports 2017-05-08 22:29:04 +02:00
ines f46ffe3e89 Move language data to /lang module 2017-05-08 20:00:40 +02:00