118 Commits

Author SHA1 Message Date
ines 8bc05c2ba9 Delete old training scripts (resolves #911) 2017-03-23 11:07:59 +01:00
Raphaël Bournhonesque 08346dba1a Use specific language class instead of base Language class 2017-03-21 23:18:54 +01:00
Raphaël Bournhonesque 7568cd6bf8 Split CONLLX file using tabs and not default split separators 2017-03-21 23:00:13 +01:00
Matthew Honnibal ef6bd08e6c Update train_ud for Universal Dependencies 2 2017-03-16 17:08:15 -05:00
Matthew Honnibal a155482fda Improve printing in train_ud script 2017-03-11 11:11:05 -06:00
Matthew Honnibal 35124b144a Add L1 penalty option to parser 2017-03-09 18:44:53 -06:00
Matthew Honnibal 04a51dab62 Print active parser features during training 2017-03-08 01:37:19 +01:00
Matthew Honnibal 4ff92184f1 Improve train_ud script 2017-01-09 09:53:46 -06:00
Matthew Honnibal c1ef07788c Update train_ud.py: Create deps folder if it doesn't exist. 2017-01-09 10:55:44 +11:00
Matthew Honnibal c7889492f9 Fix model saving error for Python 3 2016-11-25 18:04:30 -06:00
Matthew Honnibal 22189e60db Use unicode literals in train_ud 2016-11-25 17:45:45 -06:00
Matthew Honnibal da5f0cce36 Fix train_ud script, which trains models from the Universal Dependencies format. 2016-11-25 11:19:33 -06:00
Matthew Honnibal 314bc8d34f Fix train script for 1.0 2016-11-25 08:57:37 -06:00
Matthew Honnibal bd1bfcca61 Update train.py 2016-10-13 03:23:48 +02:00
Matthew Honnibal ea23b64cc8 Refactor training, with new spacy.train module. Defaults still a little awkward. 2016-10-09 12:24:24 +02:00
Matthew Honnibal 53fbd3dd1c Fix train.py for v1.0.0-rc1 2016-10-05 01:11:46 +02:00
Matthew Honnibal 8036368d96 * Fix model saving 2016-05-23 12:01:46 +00:00
Matthew Honnibal 35214053fd * Work around get_lex_attr bug introduced during German parsing 2016-05-23 10:53:00 +00:00
Wolfgang Seeker dae6bc05eb define German dummy lemmatizer until morphology is done 2016-05-02 16:04:53 +02:00
Matthew Honnibal 8569dbc2d0 * Add initial stuff for Chinese parsing 2016-04-24 18:44:24 +02:00
Matthew Honnibal d249e2f7f3 * Improve error message in bin/parser/train.py 2016-03-29 13:04:33 +11:00
Wolfgang Seeker 690c5acabf adjust train.py to train both english and german models 2016-03-03 15:21:00 +01:00
Matthew Honnibal e2ed6251d7 * Fancy up the CLI for the conll train script 2016-02-02 22:58:06 +01:00
Matthew Honnibal a676d66807 * Update the CoNLL train script, to get working on other languages 2016-02-02 22:29:34 +01:00
Matthew Honnibal 6e68b344c1 * Train after parsing, not before. 2015-11-12 04:43:52 +11:00
Matthew Honnibal 4fb038a9eb * Update conll_train.py script for spaCy v0.97 2015-10-31 00:53:51 +11:00
Matthew Honnibal cfaa4bde5d * Add train and parse scripts that use CoNLL formatted data 2015-10-30 12:54:49 +11:00
Matthew Honnibal 83dccf0fd7 * Use io module insteads of deprecated codecs module 2015-10-10 14:13:01 +11:00
Matthew Honnibal f35632e2e5 * Remove SBD print statement in train, after SBD evaluation was removed from Scorer 2015-10-09 11:08:58 +02:00
Matthew Honnibal 6ea1601e93 * Add script to train models off the UD treebanks. Note that the UD data is restricted to research purposes only, and should only be used to train models for academic experiments. 2015-10-08 12:01:08 +11:00
Matthew Honnibal c503654ec1 * Update bin/parser/train for printing output. 2015-10-06 10:35:22 +11:00
alvations 764bdc62e7 caught another codecs.open 2015-09-30 20:16:52 +02:00
Matthew Honnibal b2e82e55f6 * Create POS model dir in training script 2015-09-08 15:36:23 +02:00
Matthew Honnibal d1eea2d865 * Update train.py for language-generic spaCy 2015-09-06 17:51:48 +02:00
Matthew Honnibal ddc1a5cfe5 * Fix training under python3 2015-07-28 14:09:30 +02:00
Matthew Honnibal c52179f5fa * Use print function in train.py, for py 2/3 compatibility 2015-07-24 04:52:35 +02:00
Matthew Honnibal 4729200dfc * Whitespace 2015-07-23 01:19:26 +02:00
Matthew Honnibal 317cbbc015 * Serialization round trip now working with decent API, but with rough spots in the organisation and requiring vocabulary to be fixed ahead of time. 2015-07-19 15:18:17 +02:00
Matthew Honnibal a6ff7e6ca4 * Fix redundant options in train.py 2015-07-17 22:38:05 +02:00
Matthew Honnibal 31b5e58aeb * Begin reorganizing neuralnet work 2015-06-30 14:26:53 +02:00
Matthew Honnibal 1135cfe50a * Tidy nn_train a bit 2015-06-29 16:45:14 +02:00
Matthew Honnibal df8179ca4f * Add separate Param and AdadeltaParam classes. AdadeltaParam seems broken. 2015-06-29 16:39:16 +02:00
Matthew Honnibal 1dff04acb5 * Apply regularization to the softmax, not the bias 2015-06-29 11:45:38 +02:00
Matthew Honnibal ca30fe1582 * Use He initialization trick 2015-06-29 10:56:02 +02:00
Matthew Honnibal fc34e1b6e4 * Move Theano functions into nn_train.py script 2015-06-29 07:09:16 +02:00
Matthew Honnibal fe7b24ecef * whitespace 2015-06-28 11:37:17 +02:00
Matthew Honnibal 7b8275fcc4 * Wire hyperparameters to script interface 2015-06-28 11:37:17 +02:00
Matthew Honnibal 897dd0dd0b * Merge changes, and adjust Example to use memoryview 2015-06-28 11:36:11 +02:00
Matthew Honnibal ef97b90833 * Fix token scoring 2015-06-28 06:22:18 +02:00
Matthew Honnibal 34c0ef2ee8 * Don't compile the orig_arc_eager and tree_arc_eager modules used for the EMNLP paper 2015-06-23 05:38:17 +02:00