189 Commits

Author SHA1 Message Date
Matthew Honnibal 64e4ff7c4b Merge 'tidy-up' changes into branch. Resolve conflicts 2017-10-28 13:16:06 +02:00
Explosion Bot b22e42af7f Merge changes to parser and _ml 2017-10-28 11:52:10 +02:00
ines d96e72f656 Tidy up rest 2017-10-27 21:07:59 +02:00
ines e33b7e0b3c Tidy up parser and ML 2017-10-27 14:39:30 +02:00
Matthew Honnibal 531142a933 Merge remote-tracking branch 'origin/develop' into feature/better-parser 2017-10-27 12:34:48 +00:00
Matthew Honnibal c9987cf131 Avoid use of numpy.tensordot 2017-10-27 10:18:36 +00:00
Matthew Honnibal f6fef30adc Remove dead code from spacy._ml 2017-10-27 10:16:41 +00:00
ines 4eb5bd02e7 Update textcat pre-processing after to_array change 2017-10-27 00:32:12 +02:00
Matthew Honnibal 35977bdbb9 Update better-parser branch with develop 2017-10-26 00:55:53 +00:00
Matthew Honnibal 075e8118ea Update from develop 2017-10-25 12:45:21 +02:00
ines 0b1dcbac14 Remove unused function 2017-10-25 12:08:46 +02:00
Matthew Honnibal 3faf9189a2 Make parser hidden shape consistent even if maxout==1 2017-10-20 16:23:31 +02:00
Matthew Honnibal b101736555 Fix precomputed layer 2017-10-20 12:14:52 +02:00
Matthew Honnibal 64658e02e5 Implement fancier initialisation for precomputed layer 2017-10-20 03:07:45 +02:00
Matthew Honnibal a17a1b60c7 Clean up redundant PrecomputableMaxouts class 2017-10-19 20:26:37 +02:00
Matthew Honnibal b00d0a2c97 Fix bias in parser 2017-10-19 18:42:11 +02:00
Matthew Honnibal 03a215c5fd Make PrecomputableAffines work 2017-10-19 13:44:49 +02:00
Matthew Honnibal 76fe24f44d Improve embedding defaults 2017-10-11 09:44:17 +02:00
Matthew Honnibal b2b8506f2c Remove whitespace 2017-10-09 03:35:57 +02:00
Matthew Honnibal d163115e91 Add non-linearity after history features 2017-10-07 21:00:43 -05:00
Matthew Honnibal 5c750a9c2f Reserve 0 for 'missing' in history features 2017-10-06 06:10:13 -05:00
Matthew Honnibal fbba7c517e Pass dropout through to embed tables 2017-10-06 06:09:18 -05:00
Matthew Honnibal 3db0a32fd6 Fix dropout for history features 2017-10-05 22:21:30 -05:00
Matthew Honnibal fc06b0a333 Fix training when hist_size==0 2017-10-05 21:52:28 -05:00
Matthew Honnibal dcdfa071aa Disable LayerNorm hack 2017-10-04 20:06:52 -05:00
Matthew Honnibal bfabc333be Merge remote-tracking branch 'origin/develop' into feature/parser-history-model 2017-10-04 20:00:36 -05:00
Matthew Honnibal 92066b04d6 Fix Embed and HistoryFeatures 2017-10-04 19:55:34 -05:00
Matthew Honnibal bd8e84998a Add nO attribute to TextCategorizer model 2017-10-04 16:07:30 +02:00
Matthew Honnibal f8a0614527 Improve textcat model slightly 2017-10-04 15:15:53 +02:00
Matthew Honnibal 39798b0172 Uncomment layernorm adjustment hack 2017-10-04 15:12:09 +02:00
Matthew Honnibal 774f5732bd Fix dimensionality of textcat when no vectors available 2017-10-04 14:55:15 +02:00
Matthew Honnibal af75b74208 Unset LayerNorm backwards compat hack 2017-10-03 20:47:10 -05:00
Matthew Honnibal 246612cb53 Merge remote-tracking branch 'origin/develop' into feature/parser-history-model 2017-10-03 16:56:42 -05:00
Matthew Honnibal 5cbefcba17 Set backwards compatibility flag 2017-10-03 20:29:58 +02:00
Matthew Honnibal 5454b20cd7 Update thinc imports for 6.9 2017-10-03 20:07:17 +02:00
Matthew Honnibal e514d6aa0a Import thinc modules more explicitly, to avoid cycles 2017-10-03 18:49:25 +02:00
Matthew Honnibal b770f4e108 Fix embed class in history features 2017-10-03 13:26:55 +02:00
Matthew Honnibal 6aa6a5bc25 Add a layer type for history features 2017-10-03 12:43:09 +02:00
Matthew Honnibal f6330d69e6 Default embed size to 7000 2017-09-28 08:07:41 -05:00
Matthew Honnibal 1a37a2c0a0 Update training defaults 2017-09-27 11:48:07 -05:00
Matthew Honnibal e34e70673f Allow tagger models to be built with pre-defined tok2vec layer 2017-09-26 05:51:52 -05:00
Matthew Honnibal 63bd87508d Don't use iterated convolutions 2017-09-23 04:39:17 -05:00
Matthew Honnibal 4348c479fc Merge pre-trained vectors and noshare patches 2017-09-22 20:07:28 -05:00
Matthew Honnibal 4bd6a12b1f Fix Tok2Vec 2017-09-23 02:58:54 +02:00
Matthew Honnibal 980fb6e854 Refactor Tok2Vec 2017-09-22 09:38:36 -05:00
Matthew Honnibal d9124f1aa3 Add link_vectors_to_models function 2017-09-22 09:38:22 -05:00
Matthew Honnibal a186596307 Add 'reapply' combinator, for iterated CNN 2017-09-22 09:37:03 -05:00
Matthew Honnibal 40a4873b70 Fix serialization of model options 2017-09-21 13:07:26 -05:00
Matthew Honnibal 20193371f5 Don't share CNN, to reduce complexities 2017-09-21 14:59:48 +02:00
Matthew Honnibal f5144f04be Add argument for CNN maxout pieces 2017-09-20 19:14:41 -05:00
Matthew Honnibal 78301b2d29 Avoid comparison to None in Tok2Vec 2017-09-20 00:19:34 +02:00
Matthew Honnibal 3fa76c17d1 Refactor Tok2Vec 2017-09-18 15:00:05 -05:00
Matthew Honnibal 7b3f391f80 Try dropping the Affine layer, conditionally 2017-09-18 11:35:59 -05:00
Matthew Honnibal 2148ae605b Dont use iterated convolutions 2017-09-17 17:36:04 -05:00
Matthew Honnibal 8f42f8d305 Remove unused 'preprocess' argument in Tok2Vec' 2017-09-17 12:30:16 -05:00
Matthew Honnibal 8f913a74ca Fix defaults and args to build_tagger_model 2017-09-17 05:46:36 -05:00
Matthew Honnibal 2a93404da6 Support optional pre-trained vectors in tensorizer model 2017-09-16 12:45:37 -05:00
Matthew Honnibal 24ff6b0ad9 Fix parsing and tok2vec models 2017-09-06 05:50:58 -05:00
Matthew Honnibal 16e25ce3b5 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-09-04 09:26:53 -05:00
Matthew Honnibal 9f512e657a Fix drop_layer calculation 2017-09-04 09:26:38 -05:00
Matthew Honnibal c0eaba8b28 Fix low-data textcat 2017-09-02 15:17:32 +02:00
Matthew Honnibal a3b69bcb3d Add low_data mode in textcat 2017-09-02 14:56:30 +02:00
Matthew Honnibal a824cf8f9a Adjust text classification model 2017-09-02 11:41:00 +02:00
Matthew Honnibal ac040b99bb Add support for pre-trained vectors in text classifier 2017-09-01 16:39:55 +02:00
Matthew Honnibal 6d4e8e14ca Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-25 12:37:16 -05:00
Matthew Honnibal 4ce5531389 Use layer norm instead of batch norm 2017-08-25 12:37:10 -05:00
Matthew Honnibal 1c5c256e58 Fix fine_tune when optimizer is None 2017-08-23 10:51:33 +02:00
Matthew Honnibal 9c580ad28a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-22 17:02:04 -05:00
Matthew Honnibal a4633fff6f Restore use of batch norm in model 2017-08-22 17:01:58 -05:00
Matthew Honnibal df2745eb08 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-22 19:00:43 +02:00
Matthew Honnibal 18b64e79ec Fix fine tuning 2017-08-21 19:18:26 -05:00
Matthew Honnibal a21d8f3f0b Add predict paths to _ml models 2017-08-21 23:23:45 +02:00
Matthew Honnibal 80acbc5f1f Fix fine-tune weight mixture 2017-08-21 14:15:29 -05:00
Matthew Honnibal c10f63bf10 Initialize fine tuning to 0.5 2017-08-20 15:59:48 -05:00
Matthew Honnibal 8a59718fd6 Fix fine-tuning 2017-08-20 18:17:35 +02:00
Matthew Honnibal bae59bf92f Remove BiLSTM import 2017-08-18 22:46:59 +02:00
Matthew Honnibal fe90dfc390 Restore changes from nn-beam-parser to spacy/_ml 2017-08-18 22:38:28 +02:00
Matthew Honnibal ce321b0322 Restore changes from nn-beam-parser to spacy/_ml 2017-08-18 22:24:46 +02:00
Matthew Honnibal 931509d96a Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-08-18 21:57:15 +02:00
Matthew Honnibal 263366729e Don't import BiLSTM 2017-08-18 21:56:31 +02:00
Matthew Honnibal 85794c1167 Restore state of _ml.py 2017-08-18 14:55:23 -05:00
Matthew Honnibal 426f84937f Resolve conflicts when merging new beam parsing stuff 2017-08-18 13:38:32 -05:00
Matthew Honnibal 5181e8bedb Fix merge conflict in _ml 2017-08-18 13:35:51 -05:00
Matthew Honnibal 4b1e7bd6d8 Improve tensorizer model 2017-08-16 18:25:20 -05:00
Matthew Honnibal 6259490347 Fix mixture weights in fine_tune 2017-08-14 17:55:18 -05:00
Matthew Honnibal 335fa8b05c Fix gradient in fine_tune 2017-08-14 14:55:47 -05:00
Matthew Honnibal 52c180ecf5 Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" This reverts commit ea8de11ad5, reversing changes made to 08e443e083. 2017-08-14 13:00:23 +02:00
Matthew Honnibal ac6c25f762 Check SGD is not None in update 2017-08-14 12:09:18 +02:00
Matthew Honnibal 4ab0c8c8e9 Try different drop_layer structure in Tok2Vec 2017-08-12 08:56:57 -05:00
Matthew Honnibal ebe0f7f641 Pass embed size correctly in tagger, and cache embeddings for efficiency 2017-08-12 05:45:20 -05:00
Matthew Honnibal f93f2bed58 Revert use of layer normalization in Tok2Vec 2017-08-09 17:47:03 -05:00
Matthew Honnibal ac2de6dced Switch to ReLu layers in Tok2Vec 2017-08-09 16:41:25 -05:00
Matthew Honnibal 88bf1cf87c Update parser for fine tuning 2017-08-08 15:34:17 -05:00
Matthew Honnibal 5d837c3776 Add mix weights on fine_tune 2017-08-07 06:32:59 -05:00
Matthew Honnibal 3ed203de25 Use LayerNorm and SELU in Tok2Vec 2017-08-06 18:33:18 +02:00
Matthew Honnibal 4a5cc89138 Fix tagger 'fine_tune', to keep private CNN weights 2017-08-06 14:15:48 +02:00
Matthew Honnibal 4cfb7a54e7 Fix tagger 2017-08-06 01:53:31 +02:00
Matthew Honnibal e9ab800e15 Fix tagging model 2017-08-06 01:50:08 +02:00
Matthew Honnibal 468c138ab3 WIP: Add fine-tuning logic to tagger model, re #1182 2017-08-06 01:13:23 +02:00
Matthew Honnibal 523b0df2c9 Update text classification model 2017-07-25 18:57:59 +02:00