Commit Graph

169 Commits

Author SHA1 Message Date
Matthew Honnibal c617d288d8 Update pipeline component names in spaCy train 2017-10-02 17:20:19 +02:00
Matthew Honnibal ac8481a7b0 Print NER loss 2017-09-28 08:05:31 -05:00
Matthew Honnibal 542ebfa498 Improve defaults 2017-09-27 18:54:37 -05:00
Matthew Honnibal dcb86bdc43 Default batch size to 32 2017-09-27 11:48:19 -05:00
ines 1ff62eaee7 Fix option shortcut to avoid conflict 2017-09-26 17:59:34 +02:00
ines 7fdfb78141 Add version option to cli.train 2017-09-26 17:34:52 +02:00
Matthew Honnibal 698fc0d016 Remove merge artefact 2017-09-26 08:31:37 -05:00
Matthew Honnibal defb68e94f Update feature/noshare with recent develop changes 2017-09-26 08:15:14 -05:00
ines edf7e4881d Add meta.json option to cli.train and add relevant properties
Add accuracy scores to meta.json instead of accuracy.json and replace
all relevant properties like lang, pipeline, spacy_version in existing
meta.json. If not present, also add name and version placeholders to
make it packagable.
2017-09-25 19:00:47 +02:00
Matthew Honnibal 204b58c864 Fix evaluation during training 2017-09-24 05:01:03 -05:00
Matthew Honnibal dc3a623d00 Remove unused update_shared argument 2017-09-24 05:00:37 -05:00
Matthew Honnibal 4348c479fc Merge pre-trained vectors and noshare patches 2017-09-22 20:07:28 -05:00
Matthew Honnibal e93d43a43a Fix training with preset vectors 2017-09-22 20:00:40 -05:00
Matthew Honnibal a2357cce3f Set random seed in train script 2017-09-23 02:57:31 +02:00
Matthew Honnibal 0a9016cade Fix serialization during training 2017-09-21 13:06:45 -05:00
Matthew Honnibal 20193371f5 Don't share CNN, to reduce complexities 2017-09-21 14:59:48 +02:00
Matthew Honnibal 1d73dec8b1 Refactor train script 2017-09-20 19:17:10 -05:00
Matthew Honnibal a0c4b33d03 Support resuming a model during spacy train 2017-09-18 18:04:47 -05:00
Matthew Honnibal 8496d76224 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-09-14 09:21:20 -05:00
Matthew Honnibal 24ff6b0ad9 Fix parsing and tok2vec models 2017-09-06 05:50:58 -05:00
Matthew Honnibal e920885676 Fix pickle during train 2017-09-02 12:46:01 -05:00
Matthew Honnibal 7a6edeea68 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-20 12:55:39 -05:00
Matthew Honnibal f2f9229964 Fix name of update_shared flag 2017-08-20 18:19:06 +02:00
Matthew Honnibal 84bb543e4d Add gold_preproc flag to cli/train 2017-08-20 11:07:00 -05:00
Matthew Honnibal 11c31d285c Restore changes from nn-beam-parser 2017-08-18 22:26:12 +02:00
Matthew Honnibal 52c180ecf5 Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
2017-08-14 13:00:23 +02:00
Matthew Honnibal 8870d491f1 Remove redundant pickling during training 2017-08-12 08:55:53 -05:00
Matthew Honnibal 0a566dc320 Add update_tensors flag to Language.update. Experimental, re #1182 2017-08-06 02:18:12 +02:00
Matthew Honnibal c52fde40f4 Improve train CLI 2017-06-04 20:18:37 -05:00
Matthew Honnibal 21eef90dbc Support specifying which GPU 2017-06-03 16:10:23 -05:00
Matthew Honnibal 43353b5413 Improve train CLI script 2017-06-03 13:28:20 -05:00
Matthew Honnibal 8a693c2605 Write binary file during training 2017-05-31 02:59:18 +02:00
Matthew Honnibal 49235017bf Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-27 16:34:28 -05:00
Matthew Honnibal 5e4312feed Evaluate loaded class, to ensure save/load works 2017-05-27 15:47:02 -05:00
ines 086a06e7d7 Fix CLI docstrings and add command as first argument
Workaround for Plac
2017-05-27 20:01:46 +02:00
Matthew Honnibal de13fe0305 Remove length cap on sentences 2017-05-27 08:20:32 -05:00
Matthew Honnibal d65f99a720 Improve model saving in train script 2017-05-26 05:52:09 -05:00
Matthew Honnibal df8015f05d Tweaks to train script 2017-05-25 17:15:24 -05:00
Matthew Honnibal 702fe74a4d Clean up spacy.cli.train 2017-05-25 16:16:30 -05:00
Matthew Honnibal 135a13790c Disable gold preprocessing 2017-05-24 20:10:20 -05:00
Matthew Honnibal 3959d778ac Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal 532afef4a8 Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
Matthew Honnibal 6e8dce2c05 Fix train command line args 2017-05-22 10:41:39 -05:00
Matthew Honnibal ae8cf70dc1 Fix CLI train signature 2017-05-22 06:13:39 -05:00
ines fc3ec733ea Reduce complexity in CLI
Remove now redundant model command and move plac annotations to cli
files
2017-05-22 12:28:58 +02:00
Matthew Honnibal bc2294d7f1 Add support for fiddly hyper-parameters to train func 2017-05-22 04:51:08 -05:00
Matthew Honnibal 4e0988605a Pass through non-projective=True 2017-05-22 04:51:08 -05:00
Matthew Honnibal e14533757b Use averaged params for evaluation 2017-05-22 04:51:08 -05:00
Matthew Honnibal 4c9202249d Refactor training, to fix memory leak 2017-05-21 09:07:06 -05:00
Matthew Honnibal 3376d4d6e8 Update the train script, fixing GPU memory leak 2017-05-19 18:15:50 -05:00
Matthew Honnibal ca70b08661 Fix GPU training and evaluation 2017-05-18 08:30:33 -05:00
Matthew Honnibal fc8d3a112c Add util.env_opt support: Can set hyper params through environment variables. 2017-05-18 04:36:53 -05:00
Matthew Honnibal 793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal 8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal 5211645af3 Get data flowing through pipeline. Needs redesign 2017-05-16 11:21:59 +02:00
Matthew Honnibal a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
ines 59c3b9d4dd Tidy up CLI and fix print functions 2017-05-07 23:25:29 +02:00
Matthew Honnibal 4f9657b42b Fix reporting if no dev data with train 2017-04-23 22:27:10 +02:00
ines 3a9710f356 Pass dev_scores to print_progress correctly (resolves #1008)
Only read scores attribute if command is used with dev_data, otherwise
default dev_scores to empty dict.
2017-04-23 15:58:40 +02:00
Matthew Honnibal 89a4f262fc Fix training methods 2017-04-16 13:00:37 -05:00
ines d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines 9952d3b08a Fix whitespace 2017-04-07 13:02:05 +02:00
Matthew Honnibal 2efdbc08ff Make training work with directories 2017-03-26 08:46:44 -05:00
Matthew Honnibal 9dcb58aaaf Merge CLI changes 2017-03-26 07:30:45 -05:00
Matthew Honnibal 6b7f7a2060 Connect parser L1 option to train CLI 2017-03-26 07:24:07 -05:00
Matthew Honnibal dec5571bf3 Update train CLI 2017-03-26 07:16:52 -05:00
ines 53cf2f1c0e Make dev data optional 2017-03-26 11:48:17 +02:00
ines 0035fd9efe Add spacy train work in progress 2017-03-23 11:08:41 +01:00