Commit Graph

4420 Commits

Author SHA1 Message Date
ines 718f1c50fb Add regression test for #1491 2017-11-03 21:11:20 +01:00
Matthew Honnibal 144a93c2a5 Back-off to tensor for similarity if no vectors 2017-11-03 20:56:33 +01:00
Matthew Honnibal 1e9634691a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-03 20:21:15 +01:00
Matthew Honnibal 13c8881d2f Expose parser's tok2vec model component 2017-11-03 20:20:59 +01:00
Matthew Honnibal 17c63906f9 Update tensorizer component 2017-11-03 20:20:26 +01:00
Matthew Honnibal 2bf21cbe29 Update model after optimising it instead of waiting 2017-11-03 20:20:01 +01:00
Matthew Honnibal d6e831bf89 Fix lemmatizer tests 2017-11-03 19:46:34 +01:00
ines eef930c73e Assert instead of print 2017-11-03 18:50:57 +01:00
ines f0986df94b Add test for #1488 (passes on v2.0.0a18?) 2017-11-03 14:44:36 +01:00
Matthew Honnibal 711278b667 Make test less flakey 2017-11-03 14:36:08 +01:00
Matthew Honnibal 7fea845374 Remove print statement 2017-11-03 14:04:51 +01:00
Matthew Honnibal 0a534ae96a Fix test for backprop d_pad 2017-11-03 14:04:16 +01:00
Matthew Honnibal 33bd2428db Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-03 13:29:56 +01:00
Matthew Honnibal 6681058abd Fix tensor extending in tagger 2017-11-03 13:29:36 +01:00
Matthew Honnibal bd2cbdfa85 Make Morphology not fail on unknown tags 2017-11-03 13:29:09 +01:00
Matthew Honnibal c9b118a7e9 Set softmax attr in tagger model 2017-11-03 11:22:01 +01:00
Matthew Honnibal a5b05f85f0 Set Doc.tensor attribute in parser 2017-11-03 11:21:00 +01:00
Matthew Honnibal 62ed58935a Add Doc.extend_tensor() method 2017-11-03 11:20:31 +01:00
Matthew Honnibal d6fc39c8a6 Set Doc.tensor from Tagger 2017-11-03 11:20:05 +01:00
Matthew Honnibal b3264aa5f0 Expose the softmax layer in the tagger model, to allow setting tensors 2017-11-03 11:19:51 +01:00
Matthew Honnibal c2bbf076a4 Add document length cap for training 2017-11-03 01:54:54 +01:00
Matthew Honnibal 6771780d3f Fix backprop of padding variable 2017-11-03 01:54:34 +01:00
Matthew Honnibal 54a716f2ec Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-03 00:55:20 +01:00
Matthew Honnibal 260e6ee3fb Improve efficiency of backprop of padding variable 2017-11-03 00:49:11 +01:00
Matthew Honnibal a22f96c3f1 Add test for backpropagating padding 2017-11-03 00:48:54 +01:00
ines 9baab241b4 Add skeleton language data for Turkish 2017-11-02 16:32:24 +01:00
ines c6fea3e5f6 Add Romanian and Croatian skeletons (experimental); Add language data templates to make it easier for others to contribute to the language support 2017-11-01 23:04:28 +01:00
ines 18c859500b Add missing imports 2017-11-01 23:02:51 +01:00
ines 819e30a26e Tidy up tokenizer exceptions 2017-11-01 23:02:45 +01:00
ines 3af281a334 Update test model name 2017-11-01 23:02:00 +01:00
Matthew Honnibal b30dd36179 Allow Tagger.add_label() before training 2017-11-01 21:49:24 +01:00
Matthew Honnibal eca41f0cf6 Fix filename conversion for conllu 2017-11-01 21:26:49 +01:00
Matthew Honnibal e237472cdc Fix tag and filename conversion for conllu 2017-11-01 21:25:33 +01:00
Matthew Honnibal b84d99b281 Revert tagger.add_label() changes, to fix model 2017-11-01 21:10:45 +01:00
Matthew Honnibal f5855e539b Fix tagger model loading 2017-11-01 20:42:36 +01:00
Matthew Honnibal 624644adfe Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 20:26:41 +01:00
ines 5f661a1b3a Remove tensorizer from pre-set pipe_names 2017-11-01 19:48:33 +01:00
Matthew Honnibal 190522efd3 Fix tagger when some tags aren't in Morphology 2017-11-01 19:27:49 +01:00
Matthew Honnibal e85e31cfbd Fix backprop of d_pad 2017-11-01 19:27:26 +01:00
Matthew Honnibal 759cc79185 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 19:00:19 +01:00
Matthew Honnibal 1ae40b50b4 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 17:07:02 +01:00
Matthew Honnibal 7ae1aacdb8 Fix add_label methods 2017-11-01 17:06:43 +01:00
ines 8c2260e18c Move span tests to /doc 2017-11-01 16:56:35 +01:00
Matthew Honnibal 2ef7b59eb0 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 16:51:41 +01:00
ines 1d1f91a041 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 16:49:44 +01:00
ines 9659391944 Update deprecated methods and add warnings 2017-11-01 16:49:42 +01:00
ines 260cb37224 Catch deprecation warning 2017-11-01 16:49:18 +01:00
ines 5914faafbb Fix .merge tests to not use deprecated API 2017-11-01 16:49:11 +01:00
ines 705a4e3e4a Fix formatting 2017-11-01 16:44:08 +01:00
Matthew Honnibal d17a12c71d Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 16:38:26 +01:00
Matthew Honnibal 9f9439667b Don't create low-data text classifier if no vectors 2017-11-01 16:34:09 +01:00
Matthew Honnibal e7a9174877 Add add_label methods to Tagger and TextCategorizer 2017-11-01 16:32:44 +01:00
ines 39e0586192 Add deprecated helper; Uses warning to show DeprecationWarning and custom stack trace 2017-11-01 16:32:36 +01:00
Matthew Honnibal a7bf38bf31 Remove misleading comment on util.get_cuda_stream() 2017-11-01 13:57:25 +01:00
Matthew Honnibal 273e96b63f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 13:27:35 +01:00
Matthew Honnibal 9e0ebee81c Add Token.is_sent_start property, so can deprecate Token.sent_start 2017-11-01 13:27:14 +01:00
Matthew Honnibal 7e7116cdf7 Fix Doc.to_array when only one string attr provided 2017-11-01 13:26:43 +01:00
Matthew Honnibal 301fb2bb60 Implement Span.n_lefts and Span.n_rights 2017-11-01 13:25:12 +01:00
Matthew Honnibal c047498f87 Fix vectors test 2017-11-01 13:24:47 +01:00
ines 9a5e7c6fe2 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 13:14:45 +01:00
ines bfe17b7df1 Fix begin_training if get_gold_tuples is None 2017-11-01 13:14:31 +01:00
ines affd3404ab Remove old model command (now "vocab") 2017-11-01 13:14:03 +01:00
Matthew Honnibal fdb4b8e456 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 02:07:17 +01:00
Matthew Honnibal c48dd0e1d3 Fix vector pruning 2017-11-01 02:06:58 +01:00
ines 37e62ab0e2 Update vector meta in meta.json 2017-11-01 01:25:09 +01:00
ines 96b4aef0bf Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 01:10:53 +01:00
Matthew Honnibal 86eba61fae Fix token.vector when vectors are missing 2017-11-01 00:47:35 +01:00
ines 5683fd65ed Update docstrings 2017-11-01 00:42:39 +01:00
Matthew Honnibal 44bce8e53f Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-11-01 00:35:16 +01:00
Matthew Honnibal c16310d156 Update vectors with find method 2017-11-01 00:34:55 +01:00
Ines Montani d11659463b Merge pull request #1152 from jimregan/develop-irish; [WIP] attempt a port from #1147 2017-11-01 00:23:43 +01:00
ines 2ad2f09d12 Update docstrings and simplify most_similar 2017-11-01 00:18:08 +01:00
Jim O'Regan 08b0bfd153 merge 2017-10-31 22:55:59 +00:00
Jim O'Regan 00ecfa5417 Ó, not O 2017-10-31 22:54:42 +00:00
ines ba2e6c8c6f Update docstrings and formatting 2017-10-31 23:23:34 +01:00
Matthew Honnibal 0de8d213a3 Merge pull request #1475 from explosion/feature/sm-vectors; Improve and simplify Vectors class 2017-10-31 22:59:50 +01:00
Ines Montani 25b1d6cd91 Fix syntax error 2017-10-31 22:36:03 +01:00
Matthew Honnibal 92dc127569 Fix test for Python 3 2017-10-31 22:21:55 +01:00
Jim O'Regan fe4b10346a replace example sentence until I get around to adding a punctuation.py 2017-10-31 20:24:53 +00:00
Matthew Honnibal c5799ecc7b Remove print statement 2017-10-31 21:12:33 +01:00
ines 7e424a1804 Don't copy exception dicts if not necessary and tidy up 2017-10-31 21:05:29 +01:00
Matthew Honnibal c390f2d745 Make it easier to pass explicit no-pruning to vocab 2017-10-31 20:14:47 +01:00
Ines Montani 06c25a8882 Remove comma that caused list to wrap in tuple! Also removed extra dict wrappings for performance (we used to have them in there, but they should only really exist if copying the dict is absolutely necessary) 2017-10-31 20:13:16 +01:00
Matthew Honnibal d90a22afe6 Fix loading previous vectors models 2017-10-31 19:58:35 +01:00
Ines Montani 147448b65b Add missing symbols 2017-10-31 19:34:45 +01:00
Matthew Honnibal 997a61557a Add vectors.n_keys property 2017-10-31 19:30:52 +01:00
Matthew Honnibal 8075726838 Restore vector usage in models 2017-10-31 19:21:17 +01:00
Matthew Honnibal 3659a807b0 Remove vector pruning arg from train CLI 2017-10-31 19:21:05 +01:00
Ines Montani 9b0de9fb43 Fix import of symbols (now nested one level lower) 2017-10-31 19:17:58 +01:00
Matthew Honnibal 59203a2e8a Move vector pruning command into spacy vocab cli tool 2017-10-31 19:10:01 +01:00
Matthew Honnibal 77d8f5de9a Revise and simplify Vectors class 2017-10-31 18:25:08 +01:00
Jim O'Regan d4a8160c36 change quotes 2017-10-31 15:15:44 +00:00
Jim O'Regan 34ca59691b no idea what is wrong here 2017-10-31 14:50:13 +00:00
Jim O'Regan 41dd29e48e merge 2017-10-31 14:07:45 +00:00
Matthew Honnibal cb5217012f Fix vector remapping 2017-10-31 11:40:46 +01:00
Matthew Honnibal 9c11ee4a1c WIP on vectors fixes 2017-10-31 11:22:56 +01:00
Matthew Honnibal ce876c551e Fix GPU usage 2017-10-31 02:33:34 +01:00
Matthew Honnibal 7698903617 Fix GPU usage 2017-10-31 02:33:16 +01:00
Matthew Honnibal 368fdb389a WIP on refactoring and fixing vectors 2017-10-31 02:00:26 +01:00
Matthew Honnibal 4e3006cec7 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-10-30 19:44:58 +01:00