6025 Commits

Author SHA1 Message Date
Jim Geovedi f288964441 removed -el from suffix rules 2017-07-26 19:28:38 +07:00
Jim Geovedi 6eee7a7411 updated tokenizer exceptions 2017-07-26 19:13:47 +07:00
Jim Geovedi edec51b1b1 update punctuation rules 2017-07-26 19:13:36 +07:00
Jim Geovedi 62443d495a enable token match 2017-07-26 19:13:14 +07:00
Jim Geovedi c97f5ae0bb updated tokenizer exceptions 2017-07-26 19:12:52 +07:00
Jim Geovedi 73f6ac9d9b added hyhen 2017-07-24 15:56:31 +07:00
Jim Geovedi 68454c40bf added missing import 2017-07-24 14:12:34 +07:00
Jim Geovedi eaf9cbd708 cursed of copy & paste 2017-07-24 14:11:51 +07:00
Jim Geovedi 7aad6718bc enable tokenizer exceptions 2017-07-24 14:11:10 +07:00
Jim Geovedi ad56c9179a added tokenizer exceptions list 2017-07-24 14:10:16 +07:00
Jim Geovedi c1f3fe99fe updated punctuation rules 2017-07-24 13:57:21 +07:00
Jim Geovedi 37fa2c8c80 punctution rules 2017-07-24 06:17:18 +07:00
Jim Geovedi 082e94ac1c added inflix rules 2017-07-24 06:17:07 +07:00
Jim Geovedi d0ec484725 reverted 2017-07-24 06:16:29 +07:00
Jim Geovedi 0e590c711f added prefix & suffix rules 2017-07-23 23:46:40 +07:00
Jim Geovedi ba922e30e8 added ampere hour unit 2017-07-23 23:46:18 +07:00
Jim Geovedi 3b17eba27b added frequency units 2017-07-23 23:10:52 +07:00
Jim Geovedi d5fd32a572 added known currencies 2017-07-23 22:56:48 +07:00
Jim Geovedi f6f15678fb added lex_attrs 2017-07-23 22:55:22 +07:00
Jim Geovedi bed8162d00 added tokenizer_exceptions 2017-07-23 22:55:05 +07:00
Jim Geovedi b80c35bc9a added norm_exceptions 2017-07-23 22:54:49 +07:00
Jim Geovedi b5de329ea3 added norm_exceptions 2017-07-23 22:54:19 +07:00
Jim Geovedi 082e9ade46 fixed typo 2017-07-23 21:30:34 +07:00
Jim Geovedi e2efeb186e added stopwords 2017-07-23 20:52:37 +07:00
Jim Geovedi da98676839 use template 2017-07-23 20:51:31 +07:00
Jim Geovedi c2b4dd7809 start working on Indonesian language 2017-07-23 20:50:56 +07:00
Matthew Honnibal 5771bd1ff8 Increment version 2017-07-23 14:18:38 +02:00
Matthew Honnibal c4a81a47a4 Fix deserialization 2017-07-23 14:11:07 +02:00
Matthew Honnibal 2df563ad24 Remove optimization for textcat that caused loading problem 2017-07-23 14:10:51 +02:00
Matthew Honnibal 4fe77bced2 Add cfg attr to pipeline components 2017-07-23 00:52:47 +02:00
Matthew Honnibal d8aa721664 Compute Language.meta with a property 2017-07-23 00:50:18 +02:00
Matthew Honnibal 54a539a113 Finish text classifier example 2017-07-23 00:34:12 +02:00
Matthew Honnibal a88a7deffe Five save/load of textcat config 2017-07-23 00:33:43 +02:00
Matthew Honnibal c27fdaef6f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-07-22 20:15:55 +02:00
Matthew Honnibal 2bc7d87c70 Add example for training text classifier 2017-07-22 20:15:32 +02:00
Matthew Honnibal 9bae0ddc50 Fix minibatching 2017-07-22 20:14:49 +02:00
Matthew Honnibal ded0df5e2f Expose hyper-param as keyword arg 2017-07-22 20:14:37 +02:00
Matthew Honnibal f5de8deeec Increment version 2017-07-22 20:04:53 +02:00
Matthew Honnibal b55714d5d1 Make gold_tuples arg optional in begin_training 2017-07-22 20:04:43 +02:00
Matthew Honnibal ed6c85fa3c Fix loading of text categories in GoldParse 2017-07-22 20:04:03 +02:00
Matthew Honnibal 6ffec9dfea Update _ml, for textcat model 2017-07-22 20:03:40 +02:00
ines ab8ffbaab7 Add text classification to v2 overview 2017-07-22 17:56:51 +02:00
ines f085b88f9d Add TextCategorizer API docs stub 2017-07-22 17:56:33 +02:00
ines ab1a4e8b3c Add Tensorizer API docs stub 2017-07-22 17:56:25 +02:00
ines 0fb89dd204 Add text classification usage guide template 2017-07-22 17:56:07 +02:00
ines d05ab1b3a0 Add text classification to 101 overview and change order 2017-07-22 17:55:53 +02:00
ines d2a7e5b8e5 Add GoldParse.cats attribute 2017-07-22 17:55:35 +02:00
ines 23d976ed00 Add Doc.cats attribute and missing v2 tag 2017-07-22 17:55:14 +02:00
Ines Montani 1ddbeddca2 Fix typo 2017-07-22 15:00:58 +02:00
Matthew Honnibal d6a5c2c85a Add test for NER 2017-07-22 01:48:58 +02:00