272 Commits

Author SHA1 Message Date
Matthew Honnibal 2acc907d55 Improve profiling 2017-11-23 12:33:03 +00:00
Matthew Honnibal 8d692771f6 Improve profiling 2017-11-15 13:51:25 +01:00
ines 4c5d2c80d5 Re-add python -m to commands, too brittle :( (see #1536) 2017-11-10 02:30:55 +01:00
Matthew Honnibal de45702bbe Strip dev suffixes from version for compatibility check 2017-11-08 18:40:21 +01:00
Matthew Honnibal a2f980de4e Exclude .devN versioning from compatibility check 2017-11-08 18:03:52 +01:00
ines a4662a31a9 Move model package templates to cli.package and update docs 2017-11-07 12:15:35 +01:00
Matthew Honnibal c2bbf076a4 Add document length cap for training 2017-11-03 01:54:54 +01:00
Matthew Honnibal eca41f0cf6 Fix filename conversion for conllu 2017-11-01 21:26:49 +01:00
Matthew Honnibal e237472cdc Fix tag and filename conversion for conllu 2017-11-01 21:25:33 +01:00
ines affd3404ab Remove old model command (now "vocab") 2017-11-01 13:14:03 +01:00
ines 37e62ab0e2 Update vector meta in meta.json 2017-11-01 01:25:09 +01:00
Matthew Honnibal c390f2d745 Make it easier to pass explicit no-pruning to vocab 2017-10-31 20:14:47 +01:00
Matthew Honnibal 3659a807b0 Remove vector pruning arg from train CLI 2017-10-31 19:21:05 +01:00
Matthew Honnibal 59203a2e8a Move vector pruning command into spacy vocab cli tool 2017-10-31 19:10:01 +01:00
ines 803e41bc66 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-10-30 18:39:51 +01:00
ines abf8aa05d3 Populate --create-meta defaults from file if available
If meta.json is found in directory and user chooses to overwrite it, show existing data as defaults.
2017-10-30 18:39:38 +01:00
ines ce98fa7934 Fix formatting 2017-10-30 18:38:55 +01:00
ines 98c35d2585 Fix spacy vocab command 2017-10-30 18:38:41 +01:00
Matthew Honnibal e98451b5f7 Add -prune-vectors argument to spacy.cly.train 2017-10-30 18:00:10 +01:00
Explosion Bot 05a1dd570e Fix vocab script 2017-10-30 16:19:22 +01:00
Explosion Bot b46bdce8d2 Add missing import 2017-10-30 16:18:10 +01:00
Explosion Bot 0fc1209421 Wire up new vocab command 2017-10-30 16:14:50 +01:00
Matthew Honnibal 64e4ff7c4b Merge 'tidy-up' changes into branch. Resolve conflicts 2017-10-28 13:16:06 +02:00
ines d941fc3667 Tidy up CLI 2017-10-27 14:38:39 +02:00
Matthew Honnibal 531142a933 Merge remote-tracking branch 'origin/develop' into feature/better-parser 2017-10-27 12:34:48 +00:00
Matthew Honnibal b9616419e1 Add try/except around bz2 import 2017-10-27 01:18:05 +00:00
ines 11e3f19764 Fix vectors data added after training (see #1457) 2017-10-25 16:08:26 +02:00
ines 057954695b Read pipeline and vector data off model in --generate-meta 2017-10-25 16:03:26 +02:00
ines 273e638183 Add vector data to model meta after training (see #1457) 2017-10-25 16:03:05 +02:00
ines 95f6174516 Remove tensorizer from model pipeline example in spacy package 2017-10-24 16:00:56 +02:00
ines 24512420b1 Show error if data_path does not exist or is None (see #1102) 2017-10-19 00:53:49 +02:00
Matthew Honnibal dc01acd821 Escape encoding in validate function 2017-10-12 22:23:21 +02:00
ines fff1028391 Add validate CLI command 2017-10-12 20:05:06 +02:00
Matthew Honnibal a955843684 Increase default number of epochs 2017-10-12 13:13:01 +02:00
Matthew Honnibal acba2e1051 Fix metadata in training 2017-10-11 08:55:52 +02:00
Matthew Honnibal 74c2c6a58c Add default name and lang to meta 2017-10-11 08:49:12 +02:00
Matthew Honnibal 5156074df1 Make loading code more consistent in train command 2017-10-10 12:51:20 -05:00
Matthew Honnibal 97c9b5db8b Patch spacy.train for new pipeline management 2017-10-09 23:41:16 -05:00
Matthew Honnibal a635240398 Add conll_ner2json converter 2017-10-09 22:03:26 -05:00
Matthew Honnibal 735d18654d Add NER converter for CoNLL 2003 data 2017-10-09 20:06:28 -05:00
Matthew Honnibal 808d8740d6 Remove print statement 2017-10-09 08:45:20 -05:00
Matthew Honnibal 0f41b25f60 Add speed benchmarks to metadata 2017-10-09 08:05:37 -05:00
Matthew Honnibal be4f0b6460 Update defaults 2017-10-08 02:08:12 -05:00
Matthew Honnibal 9d66a915da Update training defaults 2017-10-07 21:02:38 -05:00
Matthew Honnibal 09442d25ec Merge remote-tracking branch 'origin/develop' into feature/parser-history-model 2017-10-07 07:05:04 -05:00
Matthew Honnibal f4c9a98166 Fix spacy evaluate command on non-GPU 2017-10-06 13:17:47 -05:00
Matthew Honnibal c6cd81f192 Wrap try/except around model saving 2017-10-05 08:14:24 -05:00
Matthew Honnibal 5743b06e36 Wrap model saving in try/except 2017-10-05 08:12:50 -05:00
ines 73ac0aa0b5 Update spacy evaluate and add displaCy option 2017-10-04 00:03:15 +02:00
Matthew Honnibal f24c2e3a8a Fix evaluate for non-GPU 2017-10-03 22:47:31 +02:00
Matthew Honnibal 1289187279 Fix circular import 2017-10-03 09:33:21 -05:00
Matthew Honnibal a44c4c3a5b Add timer to evaluate 2017-10-03 09:15:35 -05:00
Matthew Honnibal 8902df44de Fix component disabling during training 2017-10-02 21:07:23 +02:00
Matthew Honnibal c617d288d8 Update pipeline component names in spaCy train 2017-10-02 17:20:19 +02:00
Matthew Honnibal f942903429 Improve sentence merging in iob2json 2017-10-02 17:02:10 +02:00
Matthew Honnibal 31681d20e0 Fix concatenation in iob2json converter 2017-10-02 16:50:26 +02:00
Matthew Honnibal 4896ce3320 Remove misleading comment 2017-10-02 00:09:14 +02:00
Matthew Honnibal 94df115a81 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-10-01 14:06:23 -05:00
Matthew Honnibal 69c7c642c2 Add spacy evaluate 2017-10-01 14:05:04 -05:00
ines fd1a9225d8 Handle conversion of pipeline components correctly
Allow both comma and comma + whitespace as separators
2017-09-29 20:52:56 +02:00
Matthew Honnibal ac8481a7b0 Print NER loss 2017-09-28 08:05:31 -05:00
Matthew Honnibal 542ebfa498 Improve defaults 2017-09-27 18:54:37 -05:00
Matthew Honnibal dcb86bdc43 Default batch size to 32 2017-09-27 11:48:19 -05:00
ines 1ff62eaee7 Fix option shortcut to avoid conflict 2017-09-26 17:59:34 +02:00
ines 7fdfb78141 Add version option to cli.train 2017-09-26 17:34:52 +02:00
Matthew Honnibal 698fc0d016 Remove merge artefact 2017-09-26 08:31:37 -05:00
Matthew Honnibal defb68e94f Update feature/noshare with recent develop changes 2017-09-26 08:15:14 -05:00
ines edf7e4881d Add meta.json option to cli.train and add relevant properties
Add accuracy scores to meta.json instead of accuracy.json and replace
all relevant properties like lang, pipeline, spacy_version in existing
meta.json. If not present, also add name and version placeholders to
make it packagable.
2017-09-25 19:00:47 +02:00
Matthew Honnibal 204b58c864 Fix evaluation during training 2017-09-24 05:01:03 -05:00
Matthew Honnibal dc3a623d00 Remove unused update_shared argument 2017-09-24 05:00:37 -05:00
Matthew Honnibal 4348c479fc Merge pre-trained vectors and noshare patches 2017-09-22 20:07:28 -05:00
Matthew Honnibal e93d43a43a Fix training with preset vectors 2017-09-22 20:00:40 -05:00
Matthew Honnibal a2357cce3f Set random seed in train script 2017-09-23 02:57:31 +02:00
Matthew Honnibal 0a9016cade Fix serialization during training 2017-09-21 13:06:45 -05:00
Matthew Honnibal 20193371f5 Don't share CNN, to reduce complexities 2017-09-21 14:59:48 +02:00
Matthew Honnibal 1d73dec8b1 Refactor train script 2017-09-20 19:17:10 -05:00
Matthew Honnibal a0c4b33d03 Support resuming a model during spacy train 2017-09-18 18:04:47 -05:00
Matthew Honnibal 8496d76224 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-09-14 09:21:20 -05:00
Matthew Honnibal 24ff6b0ad9 Fix parsing and tok2vec models 2017-09-06 05:50:58 -05:00
Matthew Honnibal e920885676 Fix pickle during train 2017-09-02 12:46:01 -05:00
ines 7e04b7f89c Fix info text on pipeline in package cli 2017-08-26 18:30:59 +02:00
Matthew Honnibal 876f38c548 Merge pull request #1279 from oroszgy/model_cli_v2
Added vector loading to model cli
2017-08-26 15:57:50 +02:00
ines bb1abbeba5 Only link model if download was successfull 2017-08-23 12:36:31 +02:00
Matthew Honnibal 7be5f30f17 Add profile function 2017-08-21 23:22:49 +02:00
Gyorgy Orosz b3576bfc86 Added vector leading to model cli 2017-08-20 23:16:12 +02:00
Matthew Honnibal 7a6edeea68 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-20 12:55:39 -05:00
Matthew Honnibal f2f9229964 Fix name of update_shared flag 2017-08-20 18:19:06 +02:00
Matthew Honnibal 80a5146ec2 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-20 11:07:08 -05:00
Matthew Honnibal 84bb543e4d Add gold_preproc flag to cli/train 2017-08-20 11:07:00 -05:00
Gyorgy Orosz e5344b83a3 Ported model cli from v1 2017-08-19 21:45:23 +02:00
Matthew Honnibal 11c31d285c Restore changes from nn-beam-parser 2017-08-18 22:26:12 +02:00
Matthew Honnibal 52c180ecf5 Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop"
This reverts commit ea8de11ad5, reversing
changes made to 08e443e083.
2017-08-14 13:00:23 +02:00
Matthew Honnibal 4ae0d5e1e6 Set defaults for convert command 2017-08-13 09:03:38 +02:00
ines d4f2baf7dd Add create_meta option to package command
Re-create meta.json in model directory, even if it exists. Especially
useful when updating existing spaCy models or training with Prodigy.
Ensures user won't end up with multiple "en_core_web_sm" models, and
offers easy way to change the model's name and settings without having
to edit the meta.json file.
2017-08-12 21:44:18 +02:00
Matthew Honnibal 8870d491f1 Remove redundant pickling during training 2017-08-12 08:55:53 -05:00
ines 28e2fec23b Fix autolinking failure on fresh model install (resolves #1138)
On fresh install via subprocess, pip.get_installed_distributions()
won't show new model, so is_package check in link command fails.
Solution for now is to get model package path explicitly and pass it to
link command.
2017-08-09 11:52:38 +02:00
Matthew Honnibal 0a566dc320 Add update_tensors flag to Language.update. Experimental, re #1182 2017-08-06 02:18:12 +02:00
György Orosz 62dbf9025c Fixed conllu converter 2017-06-09 22:53:56 +02:00
ines 03db56f48c Detect spaCy version and add package title
Package title allows customised package names (like spacy-nightly)
2017-06-05 20:11:02 +02:00
Matthew Honnibal c52fde40f4 Improve train CLI 2017-06-04 20:18:37 -05:00