109 Commits

Author SHA1 Message Date
Matthew Honnibal c7889492f9 Fix model saving error for Python 3 2016-11-25 18:04:30 -06:00
Matthew Honnibal 22189e60db Use unicode literals in train_ud 2016-11-25 17:45:45 -06:00
Matthew Honnibal da5f0cce36 Fix train_ud script, which trains models from the Universal Dependencies format. 2016-11-25 11:19:33 -06:00
Matthew Honnibal 314bc8d34f Fix train script for 1.0 2016-11-25 08:57:37 -06:00
Matthew Honnibal bd1bfcca61 Update train.py 2016-10-13 03:23:48 +02:00
Matthew Honnibal ea23b64cc8 Refactor training, with new spacy.train module. Defaults still a little awkward. 2016-10-09 12:24:24 +02:00
Matthew Honnibal 53fbd3dd1c Fix train.py for v1.0.0-rc1 2016-10-05 01:11:46 +02:00
Matthew Honnibal 8036368d96 * Fix model saving 2016-05-23 12:01:46 +00:00
Matthew Honnibal 35214053fd * Work around get_lex_attr bug introduced during German parsing 2016-05-23 10:53:00 +00:00
Wolfgang Seeker dae6bc05eb define German dummy lemmatizer until morphology is done 2016-05-02 16:04:53 +02:00
Matthew Honnibal 8569dbc2d0 * Add initial stuff for Chinese parsing 2016-04-24 18:44:24 +02:00
Matthew Honnibal d249e2f7f3 * Improve error message in bin/parser/train.py 2016-03-29 13:04:33 +11:00
Wolfgang Seeker 690c5acabf adjust train.py to train both english and german models 2016-03-03 15:21:00 +01:00
Matthew Honnibal e2ed6251d7 * Fancy up the CLI for the conll train script 2016-02-02 22:58:06 +01:00
Matthew Honnibal a676d66807 * Update the CoNLL train script, to get working on other languages 2016-02-02 22:29:34 +01:00
Matthew Honnibal 6e68b344c1 * Train after parsing, not before. 2015-11-12 04:43:52 +11:00
Matthew Honnibal 4fb038a9eb * Update conll_train.py script for spaCy v0.97 2015-10-31 00:53:51 +11:00
Matthew Honnibal cfaa4bde5d * Add train and parse scripts that use CoNLL formatted data 2015-10-30 12:54:49 +11:00
Matthew Honnibal 83dccf0fd7 * Use io module instead of deprecated codecs module 2015-10-10 14:13:01 +11:00
Matthew Honnibal f35632e2e5 * Remove SBD print statement in train, after SBD evaluation was removed from Scorer 2015-10-09 11:08:58 +02:00
Matthew Honnibal 6ea1601e93 * Add script to train models off the UD treebanks. Note that the UD data is restricted to research purposes only, and should only be used to train models for academic experiments. 2015-10-08 12:01:08 +11:00
Matthew Honnibal c503654ec1 * Update bin/parser/train for printing output. 2015-10-06 10:35:22 +11:00
alvations 764bdc62e7 caught another codecs.open 2015-09-30 20:16:52 +02:00
Matthew Honnibal b2e82e55f6 * Create POS model dir in training script 2015-09-08 15:36:23 +02:00
Matthew Honnibal d1eea2d865 * Update train.py for language-generic spaCy 2015-09-06 17:51:48 +02:00
Matthew Honnibal ddc1a5cfe5 * Fix training under python3 2015-07-28 14:09:30 +02:00
Matthew Honnibal c52179f5fa * Use print function in train.py, for py 2/3 compatibility 2015-07-24 04:52:35 +02:00
Matthew Honnibal 4729200dfc * Whitespace 2015-07-23 01:19:26 +02:00
Matthew Honnibal 317cbbc015 * Serialization round trip now working with decent API, but with rough spots in the organisation and requiring vocabulary to be fixed ahead of time. 2015-07-19 15:18:17 +02:00
Matthew Honnibal a6ff7e6ca4 * Fix redundant options in train.py 2015-07-17 22:38:05 +02:00
Matthew Honnibal 31b5e58aeb * Begin reorganizing neuralnet work 2015-06-30 14:26:53 +02:00
Matthew Honnibal 1135cfe50a * Tidy nn_train a bit 2015-06-29 16:45:14 +02:00
Matthew Honnibal df8179ca4f * Add separate Param and AdadeltaParam classes. AdadeltaParam seems broken. 2015-06-29 16:39:16 +02:00
Matthew Honnibal 1dff04acb5 * Apply regularization to the softmax, not the bias 2015-06-29 11:45:38 +02:00
Matthew Honnibal ca30fe1582 * Use He initialization trick 2015-06-29 10:56:02 +02:00
Matthew Honnibal fc34e1b6e4 * Move Theano functions into nn_train.py script 2015-06-29 07:09:16 +02:00
Matthew Honnibal fe7b24ecef * whitespace 2015-06-28 11:37:17 +02:00
Matthew Honnibal 7b8275fcc4 * Wire hyperparameters to script interface 2015-06-28 11:37:17 +02:00
Matthew Honnibal 897dd0dd0b * Merge changes, and adjust Example to use memoryview 2015-06-28 11:36:11 +02:00
Matthew Honnibal ef97b90833 * Fix token scoring 2015-06-28 06:22:18 +02:00
Matthew Honnibal 34c0ef2ee8 * Don't compile the orig_arc_eager and tree_arc_eager modules used for the EMNLP paper 2015-06-23 05:38:17 +02:00
Matthew Honnibal 59e9f9153c * Remove projectivity constraint in train.py, but raise Exception if non-projective sentence is encountered, since we've told GoldParse to projectivize 2015-06-23 05:04:46 +02:00
Matthew Honnibal 839e5038b7 * Raise exception on non-projective input 2015-06-23 00:01:55 +02:00
Matthew Honnibal 4dad4058c3 * Uncomment NER training 2015-06-16 23:36:54 +02:00
Matthew Honnibal 5699585278 * Use tree_arc_eager system as baseline in experiments 2015-06-15 08:23:43 +02:00
Matthew Honnibal 4841f8ad5e * Set transition system early 2015-06-15 02:54:12 +02:00
Matthew Honnibal bcfdf126a4 * Add toggle for OrigArcEager system 2015-06-14 20:28:14 +02:00
Matthew Honnibal c500d72dc2 * Temporarily disable NER, and wire up the verbose flag during training 2015-06-14 17:45:31 +02:00
Matthew Honnibal ac422492cf * Fix write_parses mode of bin/parser/train.py 2015-06-07 19:08:48 +02:00
Matthew Honnibal 1736fc5a67 * Add more options to bin/parser/train 2015-06-05 23:49:26 +02:00