Commit Graph

69 Commits

Author       SHA1        Message                               Date
Jim Geovedi  c97f5ae0bb  updated tokenizer exceptions          2017-07-26 19:12:52 +07:00
Jim Geovedi  73f6ac9d9b  added hyhen                           2017-07-24 15:56:31 +07:00
Jim Geovedi  68454c40bf  added missing import                  2017-07-24 14:12:34 +07:00
Jim Geovedi  eaf9cbd708  cursed of copy & paste                2017-07-24 14:11:51 +07:00
Jim Geovedi  7aad6718bc  enable tokenizer exceptions           2017-07-24 14:11:10 +07:00
Jim Geovedi  ad56c9179a  added tokenizer exceptions list       2017-07-24 14:10:16 +07:00
Jim Geovedi  c1f3fe99fe  updated punctuation rules             2017-07-24 13:57:21 +07:00
Jim Geovedi  37fa2c8c80  punctution rules                      2017-07-24 06:17:18 +07:00
Jim Geovedi  082e94ac1c  added inflix rules                    2017-07-24 06:17:07 +07:00
Jim Geovedi  0e590c711f  added prefix & suffix rules           2017-07-23 23:46:40 +07:00
Jim Geovedi  d5fd32a572  added known currencies                2017-07-23 22:56:48 +07:00
Jim Geovedi  f6f15678fb  added lex_attrs                       2017-07-23 22:55:22 +07:00
Jim Geovedi  bed8162d00  added tokenizer_exceptions            2017-07-23 22:55:05 +07:00
Jim Geovedi  b80c35bc9a  added norm_exceptions                 2017-07-23 22:54:49 +07:00
Jim Geovedi  b5de329ea3  added norm_exceptions                 2017-07-23 22:54:19 +07:00
Jim Geovedi  082e9ade46  fixed typo                            2017-07-23 21:30:34 +07:00
Jim Geovedi  e2efeb186e  added stopwords                       2017-07-23 20:52:37 +07:00
Jim Geovedi  da98676839  use template                          2017-07-23 20:51:31 +07:00
Jim Geovedi  c2b4dd7809  start working on Indonesian language  2017-07-23 20:50:56 +07:00