Commit Graph

213 Commits

Author SHA1 Message Date
Matthew Honnibal 50ddc9fc45 Fix infinite loop bug 2017-05-08 07:54:26 -05:00
Matthew Honnibal 6782eedf9b Tmp GPU code 2017-05-07 11:04:24 -05:00
Matthew Honnibal e420e5a809 Tmp 2017-05-07 07:31:09 -05:00
Matthew Honnibal 700979fb3c CPU/GPU compat 2017-05-07 04:01:11 +02:00
Matthew Honnibal b439e04f8d Learning smoothly 2017-05-06 20:38:12 +02:00
Matthew Honnibal 08bee76790 Learns things 2017-05-06 18:24:38 +02:00
Matthew Honnibal bcf4cd0a5f Learns things 2017-05-06 17:37:36 +02:00
Matthew Honnibal 8e48b58cd6 Gradients look correct 2017-05-06 16:47:15 +02:00
Matthew Honnibal 7e04260d38 Data running through, likely errors in model 2017-05-06 14:22:20 +02:00
Matthew Honnibal ef4fa594aa Draft of NN parser, to be tested 2017-05-05 19:20:39 +02:00
Matthew Honnibal ccaf26206b Pseudocode for parser 2017-05-04 12:17:59 +02:00
Matthew Honnibal 2da16adcc2 Add dropout option for parser and NER
Dropout can now be specified in the `Parser.update()` method via
the `drop` keyword argument, e.g.

    nlp.entity.update(doc, gold, drop=0.4)

This will randomly drop 40% of features, and multiply the value of the
others by 1. / 0.4. This may be useful for generalising from small data
sets.

This commit also patches the examples/training/train_new_entity_type.py
example, to use dropout and fix the output (previously it did not output
the learned entity).
2017-04-27 13:18:39 +02:00
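As a rough illustration of the dropout behaviour described in commit 2da16adcc2, here is a minimal Python sketch. It is not spaCy's implementation: `drop_features` and its `scale` and `rng` parameters are hypothetical names introduced only for this example. By default the surviving values are multiplied by 1. / drop, as the commit message states. The more common inverted-dropout convention, 1. / (1. - drop), can be passed explicitly through `scale`.

```python
import random


def drop_features(values, drop, scale=None, rng=None):
    """Zero each value with probability `drop` and rescale the survivors.

    Hypothetical helper, not part of spaCy. `scale` defaults to 1. / drop,
    following the wording of the commit message.
    """
    if not 0.0 < drop < 1.0:
        raise ValueError("drop must be strictly between 0 and 1")
    if scale is None:
        scale = 1.0 / drop
    rng = rng or random.Random()
    # Each value is dropped independently of the others.
    return [0.0 if rng.random() < drop else v * scale for v in values]


features = [1.0, 1.0, 1.0, 1.0, 1.0]

# Scaling as described in the commit message.
dropped = drop_features(features, drop=0.4, rng=random.Random(0))

# Alternative: the usual inverted-dropout scaling (an assumption, not from the commit).
inverted = drop_features(features, drop=0.4, scale=1.0 / (1.0 - 0.4), rng=random.Random(0))
```

Passing a seeded `random.Random` makes the dropped positions reproducible, which can help when comparing the two scaling conventions on the same mask.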
Matthew Honnibal d2436dc17b Update fix for Issue #999 2017-04-23 18:14:37 +02:00
Matthew Honnibal 4eef200bab Persist the actions within spacy.parser.cfg 2017-04-20 17:02:44 +02:00
Matthew Honnibal 137b210bcf Restore use of FTRL training 2017-04-16 18:02:42 +02:00
Matthew Honnibal c76cb8af35 Fix training for new labels 2017-04-15 16:11:26 +02:00
Matthew Honnibal 4884b2c113 Refix StepwiseState 2017-04-15 16:00:28 +02:00
Matthew Honnibal 1a98e48b8e Fix StepwiseState 2017-04-15 13:35:01 +02:00
ines 0739ae7b76 Tidy up and fix formatting and imports 2017-04-15 13:05:15 +02:00
Matthew Honnibal 354458484c WIP on add_label bug during NER training
Currently when a new label is introduced to NER during training,
it causes the labels to be read in in an unexpected order. This
invalidates the model.
2017-04-14 23:52:17 +02:00
Matthew Honnibal 49e2de900e Add costs property to StepwiseState, to show which moves are gold. 2017-04-10 11:37:04 +02:00
Matthew Honnibal 1bb7b4ca71 Add comment 2017-03-31 13:59:19 +02:00
Matthew Honnibal a9b1f23c7d Enable regression loss for parser 2017-03-26 09:26:30 -05:00
Matthew Honnibal a46933a8fe Clean up FTRL parsing stuff. 2017-03-16 11:58:20 -05:00
Matthew Honnibal 8543db8a5b Use ftrl optimizer in parser 2017-03-15 11:56:37 -05:00
Matthew Honnibal d719f8e77e Use nogil in parser, and set L1 to 0.0 by default 2017-03-15 09:31:01 -05:00
Matthew Honnibal ca9c8c57c0 Add iteration argument to parser.update 2017-03-11 07:00:47 -06:00
Matthew Honnibal 318b9e32ff WIP on beam parser. Currently segfaults. 2017-03-11 06:19:52 -06:00
Matthew Honnibal ecf91a2dbb Support beam parser 2017-03-10 11:21:21 -06:00
Matthew Honnibal c62da02344 Use ftrl training, to learn compressed model. 2017-03-09 18:43:21 -06:00
Matthew Honnibal 40703988bc Use FTRL training in parser 2017-03-08 01:38:51 +01:00
Matthew Honnibal 97a1286129 Revert changes to tagger and parser for thinc 6 2017-01-09 10:08:34 -06:00
Matthew Honnibal af81ac8bb0 Use thinc 6.0 2016-12-29 11:58:42 +01:00
Matthew Honnibal 159e8c46e1 Merge old training fixes with newer state 2016-11-25 09:16:36 -06:00
Matthew Honnibal 608d8f5421 Pass cfg through parser, and have is_valid default to 1, not 0 when resetting state 2016-11-25 09:00:21 -06:00
Matthew Honnibal b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal 03a520ec4f Change signature of Parser.parseC, so that nr_class is read from the transition system. This allows the transition system to modify the number of actions in initialize_state. 2016-10-27 17:58:56 +02:00
Matthew Honnibal 3e688e6d4b Fix issue #514 -- serializer fails when new entity type has been added. The fix here is quite ugly. It's best to add the entities ASAP after loading the NLP pipeline, to mitigate the brittleness. 2016-10-23 17:45:44 +02:00
Matthew Honnibal 59038f7efa Restore support for prior data format -- specifically, the labels field of the config. 2016-10-17 00:53:26 +02:00
Matthew Honnibal 7887ab3b36 Fix default use of feature_templates in parser 2016-10-16 21:41:56 +02:00
Matthew Honnibal f787cd29fe Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor. 2016-10-16 21:34:57 +02:00
Matthew Honnibal e8c8aa08ce Make action_name optional in StepwiseState 2016-10-16 17:04:16 +02:00
Matthew Honnibal 4fc56d4a31 Rename 'labels' to 'actions' in parser options 2016-10-16 11:42:26 +02:00
Matthew Honnibal d9ae2d68af Load features by string-name for backwards compatibility. 2016-10-12 20:15:11 +02:00
Matthew Honnibal 3a03c668c3 Fix message in ParserStateError 2016-10-12 14:44:31 +02:00
Matthew Honnibal 6bf505e865 Fix error on ParserStateError 2016-10-12 14:35:55 +02:00
Matthew Honnibal ea23b64cc8 Refactor training, with new spacy.train module. Defaults still a little awkward. 2016-10-09 12:24:24 +02:00
Matthew Honnibal e3285f6f30 Revert "Fix report of ParserStateError"
This reverts commit 78f19baafa.
2016-09-30 20:11:33 +02:00
Matthew Honnibal 78f19baafa Fix report of ParserStateError 2016-09-30 19:59:22 +02:00
Matthew Honnibal 4cbf0d3bb6 Handle errors when no valid actions are available, pointing users to the issue tracker. 2016-09-27 19:19:53 +02:00