Commit Graph

14 Commits

Author SHA1 Message Date
Aristo Rinjuang 432ede04af adding more words and rephrasing (#2351)
* adding more words and rephrasing

* adding a contributor

* tokenizer bugs solved
2018-05-24 11:40:57 +02:00
ines 7e424a1804 Don't copy exception dicts if not necessary and tidy up 2017-10-31 21:05:29 +01:00
Jim Geovedi f77443ab68 reworked 2017-08-20 13:43:21 +07:00
Jim Geovedi b7d83f37c8 indonesian abbr. 2017-08-20 12:16:50 +07:00
Jim Geovedi 30fd068d42 hashtag prefix should be handled somewhere else 2017-08-03 13:03:02 +07:00
Jim Geovedi e9af79a803 added u-\d+ rules (sports team) 2017-07-30 21:23:01 +07:00
Jim Geovedi e5adc26c72 simplified rules 2017-07-29 18:21:32 +07:00
Jim Geovedi 4d04898dea updated regexp 2017-07-29 17:44:57 +07:00
Jim Geovedi 63f14ba46b added hyphen-suffix rules 2017-07-26 19:28:57 +07:00
Jim Geovedi 6eee7a7411 updated tokenizer exceptions 2017-07-26 19:13:47 +07:00
Jim Geovedi 68454c40bf added missing import 2017-07-24 14:12:34 +07:00
Jim Geovedi eaf9cbd708 cursed of copy & paste 2017-07-24 14:11:51 +07:00
Jim Geovedi 7aad6718bc enable tokenizer exceptions 2017-07-24 14:11:10 +07:00
Jim Geovedi bed8162d00 added tokenizer_exceptions 2017-07-23 22:55:05 +07:00