334 Commits

Author SHA1 Message Date
Matthew Honnibal 8c2938fe01 * Rename Lexicon._dict to Lexicon._map 2014-12-02 23:46:59 +11:00
Matthew Honnibal 2ee8a1e61f * Make intro chattier, explain philosophy better 2014-12-02 15:20:18 +11:00
Matthew Honnibal ea19850a69 * Add tokenizer section 2014-12-02 04:39:12 +11:00
Matthew Honnibal 3430d5f629 * Revise intro copy. Add NLTK comparison 2014-12-01 22:55:13 +11:00
Matthew Honnibal 33dfb4933c * Remove taggers from Language class. Work on doc strings 2014-11-26 19:53:55 +11:00
Matthew Honnibal cf55b48ba6 * Switch to predict label on shift. Big increase in accuracy. 2014-11-12 23:50:12 +11:00
Matthew Honnibal 8f84e8a78b * Neaten oracle 2014-11-12 23:38:07 +11:00
Matthew Honnibal 66cb4f96e1 * Upd gitignore 2014-11-12 23:25:27 +11:00
Matthew Honnibal 60c1e78596 * Commit outstanding tests 2014-11-12 23:24:32 +11:00
Matthew Honnibal 7e0a9077dd * Add context files 2014-11-12 23:22:36 +11:00
Matthew Honnibal 9b13392ac7 * Add conll experiments 2014-11-12 23:22:05 +11:00
Matthew Honnibal b934bf1c69 * Compile IOB 2014-11-12 23:21:40 +11:00
Matthew Honnibal 3b0b902384 * IOB-style parsing working. Accuracy down from BILOU, from 87-88 to 85-86 2014-11-12 23:21:09 +11:00
Matthew Honnibal e6bb8aa3a9 * Move moves to bilou_moves. Refactor context, returning to the simpler giant-enum style 2014-11-12 00:54:50 +11:00
Matthew Honnibal c788633429 * Add tokens_from_list method to Language 2014-11-11 23:43:14 +11:00
Matthew Honnibal da70b6bd60 * Upd tokenization special-cases 2014-11-11 22:10:15 +11:00
Matthew Honnibal 95282d4993 * Use the dynamic oracle 'follow' strategy 2014-11-11 21:11:17 +11:00
Matthew Honnibal 60ffdc2eb7 * Upd fabfile 2014-11-11 21:10:40 +11:00
Matthew Honnibal d5e9dce039 * Compile ner NER code 2014-11-11 21:10:22 +11:00
Matthew Honnibal b01604b303 * Upd NER tests 2014-11-11 21:10:04 +11:00
Matthew Honnibal 5aaf7a024d * Move ner features to ner subdir 2014-11-11 21:09:03 +11:00
Matthew Honnibal ff8989b63c * Use greedy NER parser 2014-11-11 21:08:35 +11:00
Matthew Honnibal 0d943ab358 * Fixed greedy NER parsing. With static oracle, replicates accuracy from tagger. 2014-11-11 17:17:54 +11:00
Matthew Honnibal 399239760b * Fix moves for new State struct 2014-11-10 22:16:05 +11:00
Matthew Honnibal 82247169f2 * Implement validation and oracle on pystate, for testing 2014-11-10 22:15:32 +11:00
Matthew Honnibal 3709ed9d6d * Add curr field to State, to handle entity being built 2014-11-10 22:14:36 +11:00
Matthew Honnibal 10e9e14c4f * Add tests for NER oracle 2014-11-10 22:13:46 +11:00
Matthew Honnibal af9ed18cf1 * Bug fixes to NER 2014-11-10 17:39:23 +11:00
Matthew Honnibal d7b2843643 * Add some tests for ner 2014-11-10 16:29:19 +11:00
Matthew Honnibal 9f2587f5ec * Work on shift-reduce NER 2014-11-10 16:28:56 +11:00
Matthew Honnibal f307eb2e36 * Refactor context extraction, and start breaking out gold standards into their own functions 2014-11-09 15:43:07 +11:00
Matthew Honnibal 602f993af9 * Moving tagger to accept multiple correct answers 2014-11-09 15:18:33 +11:00
Matthew Honnibal 10a33ec725 * Upd fabfile for experiments 2014-11-07 04:44:14 +11:00
Matthew Honnibal f37d896a42 * Upd NER feats. With adadelta learner, getting 76.9 on NER 2014-11-07 04:43:54 +11:00
Matthew Honnibal a42321bd4e * Upd shape test 2014-11-07 04:42:54 +11:00
Matthew Honnibal 68d1cdad62 * When encoding POS/NER tags, accept '-' as a missing value 2014-11-07 04:42:31 +11:00
Matthew Honnibal 949a6245f9 * Increase default number of iterations from 5 to 10 2014-11-07 04:42:04 +11:00
Matthew Honnibal 3cab1d9a29 * Refine word_shape feature, by trimming the max sequence length 2014-11-07 04:41:29 +11:00
Matthew Honnibal b4454cf036 * Add extra context tokens 2014-11-07 04:40:36 +11:00
Matthew Honnibal 50309e6e49 * Fix context vector, importing all features 2014-11-05 22:11:39 +11:00
Matthew Honnibal 07a23768de * Play with NER feats a bit. Up to 82.00 training on MUC7. 2014-11-05 21:47:17 +11:00
Matthew Honnibal edf739134c * Make make quiet by default, and add a vmake option for verbose make 2014-11-05 20:46:29 +11:00
Matthew Honnibal dbbb914480 * Upd setup 2014-11-05 20:45:44 +11:00
Matthew Honnibal 4ecbe8c893 * Complete refactor of Tagger features, to use a generic list of context names. 2014-11-05 20:45:29 +11:00
Matthew Honnibal 0a8c84625d * Moving feature context stuff to a generalized place 2014-11-05 19:55:10 +11:00
Matthew Honnibal 3733444101 * Generalize tagger code, in preparation for NER and supersense tagging. 2014-11-05 03:42:14 +11:00
Matthew Honnibal 81da61f3cf * Remove out-dated POS data test 2014-11-05 02:04:12 +11:00
Matthew Honnibal 0de700b566 * Comment out tests of hyphenation, while we decide what hyphenation policy should be. 2014-11-05 02:03:22 +11:00
Matthew Honnibal abbe3e44b0 * Move spacy.pos tagger to spacy.tagger, and generalize it so that it can take on other tagging tasks, given a different set of feature templates. 2014-11-05 00:37:59 +11:00
Matthew Honnibal 2420d944cb * Upd sales copy 2014-11-04 17:01:54 +11:00