spaCy

Commit Graph

Author	SHA1	Message	Date
Matthew Honnibal	39798b0172	Uncomment layernorm adjustment hack	2017-10-04 15:12:09 +02:00
Matthew Honnibal	774f5732bd	Fix dimensionality of textcat when no vectors available	2017-10-04 14:55:15 +02:00
Matthew Honnibal	af75b74208	Unset LayerNorm backwards compat hack	2017-10-03 20:47:10 -05:00
Matthew Honnibal	246612cb53	Merge remote-tracking branch 'origin/develop' into feature/parser-history-model	2017-10-03 16:56:42 -05:00
Matthew Honnibal	5cbefcba17	Set backwards compatibility flag	2017-10-03 20:29:58 +02:00
Matthew Honnibal	5454b20cd7	Update thinc imports for 6.9	2017-10-03 20:07:17 +02:00
Matthew Honnibal	e514d6aa0a	Import thinc modules more explicitly, to avoid cycles	2017-10-03 18:49:25 +02:00
Matthew Honnibal	b770f4e108	Fix embed class in history features	2017-10-03 13:26:55 +02:00
Matthew Honnibal	6aa6a5bc25	Add a layer type for history features	2017-10-03 12:43:09 +02:00
Matthew Honnibal	f6330d69e6	Default embed size to 7000	2017-09-28 08:07:41 -05:00
Matthew Honnibal	1a37a2c0a0	Update training defaults	2017-09-27 11:48:07 -05:00
Matthew Honnibal	e34e70673f	Allow tagger models to be built with pre-defined tok2vec layer	2017-09-26 05:51:52 -05:00
Matthew Honnibal	63bd87508d	Don't use iterated convolutions	2017-09-23 04:39:17 -05:00
Matthew Honnibal	4348c479fc	Merge pre-trained vectors and noshare patches	2017-09-22 20:07:28 -05:00
Matthew Honnibal	4bd6a12b1f	Fix Tok2Vec	2017-09-23 02:58:54 +02:00
Matthew Honnibal	980fb6e854	Refactor Tok2Vec	2017-09-22 09:38:36 -05:00
Matthew Honnibal	d9124f1aa3	Add link_vectors_to_models function	2017-09-22 09:38:22 -05:00
Matthew Honnibal	a186596307	Add 'reapply' combinator, for iterated CNN	2017-09-22 09:37:03 -05:00
Matthew Honnibal	40a4873b70	Fix serialization of model options	2017-09-21 13:07:26 -05:00
Matthew Honnibal	20193371f5	Don't share CNN, to reduce complexities	2017-09-21 14:59:48 +02:00
Matthew Honnibal	f5144f04be	Add argument for CNN maxout pieces	2017-09-20 19:14:41 -05:00
Matthew Honnibal	78301b2d29	Avoid comparison to None in Tok2Vec	2017-09-20 00:19:34 +02:00
Matthew Honnibal	3fa76c17d1	Refactor Tok2Vec	2017-09-18 15:00:05 -05:00
Matthew Honnibal	7b3f391f80	Try dropping the Affine layer, conditionally	2017-09-18 11:35:59 -05:00
Matthew Honnibal	2148ae605b	Dont use iterated convolutions	2017-09-17 17:36:04 -05:00
Matthew Honnibal	8f42f8d305	Remove unused 'preprocess' argument in Tok2Vec'	2017-09-17 12:30:16 -05:00
Matthew Honnibal	8f913a74ca	Fix defaults and args to build_tagger_model	2017-09-17 05:46:36 -05:00
Matthew Honnibal	2a93404da6	Support optional pre-trained vectors in tensorizer model	2017-09-16 12:45:37 -05:00
Matthew Honnibal	24ff6b0ad9	Fix parsing and tok2vec models	2017-09-06 05:50:58 -05:00
Matthew Honnibal	16e25ce3b5	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-09-04 09:26:53 -05:00
Matthew Honnibal	9f512e657a	Fix drop_layer calculation	2017-09-04 09:26:38 -05:00
Matthew Honnibal	c0eaba8b28	Fix low-data textcat	2017-09-02 15:17:32 +02:00
Matthew Honnibal	a3b69bcb3d	Add low_data mode in textcat	2017-09-02 14:56:30 +02:00
Matthew Honnibal	a824cf8f9a	Adjust text classification model	2017-09-02 11:41:00 +02:00
Matthew Honnibal	ac040b99bb	Add support for pre-trained vectors in text classifier	2017-09-01 16:39:55 +02:00
Matthew Honnibal	6d4e8e14ca	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-25 12:37:16 -05:00
Matthew Honnibal	4ce5531389	Use layer norm instead of batch norm	2017-08-25 12:37:10 -05:00
Matthew Honnibal	1c5c256e58	Fix fine_tune when optimizer is None	2017-08-23 10:51:33 +02:00
Matthew Honnibal	9c580ad28a	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-22 17:02:04 -05:00
Matthew Honnibal	a4633fff6f	Restore use of batch norm in model	2017-08-22 17:01:58 -05:00
Matthew Honnibal	df2745eb08	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-22 19:00:43 +02:00
Matthew Honnibal	18b64e79ec	Fix fine tuning	2017-08-21 19:18:26 -05:00
Matthew Honnibal	a21d8f3f0b	Add predict paths to _ml models	2017-08-21 23:23:45 +02:00
Matthew Honnibal	80acbc5f1f	Fix fine-tune weight mixture	2017-08-21 14:15:29 -05:00
Matthew Honnibal	c10f63bf10	Initialize fine tuning to 0.5	2017-08-20 15:59:48 -05:00
Matthew Honnibal	8a59718fd6	Fix fine-tuning	2017-08-20 18:17:35 +02:00
Matthew Honnibal	bae59bf92f	Remove BiLSTM import	2017-08-18 22:46:59 +02:00
Matthew Honnibal	fe90dfc390	Restore changes from nn-beam-parser to spacy/_ml	2017-08-18 22:38:28 +02:00
Matthew Honnibal	ce321b0322	Restore changes from nn-beam-parser to spacy/_ml	2017-08-18 22:24:46 +02:00
Matthew Honnibal	931509d96a	Merge branch 'develop' of https://github.com/explosion/spaCy into develop	2017-08-18 21:57:15 +02:00
Matthew Honnibal	263366729e	Don't import BiLSTM	2017-08-18 21:56:31 +02:00
Matthew Honnibal	85794c1167	Restore state of _ml.py	2017-08-18 14:55:23 -05:00
Matthew Honnibal	426f84937f	Resolve conflicts when merging new beam parsing stuff	2017-08-18 13:38:32 -05:00
Matthew Honnibal	5181e8bedb	Fix merge conflict in _ml	2017-08-18 13:35:51 -05:00
Matthew Honnibal	4b1e7bd6d8	Improve tensorizer model	2017-08-16 18:25:20 -05:00
Matthew Honnibal	6259490347	Fix mixture weights in fine_tune	2017-08-14 17:55:18 -05:00
Matthew Honnibal	335fa8b05c	Fix gradient in fine_tune	2017-08-14 14:55:47 -05:00
Matthew Honnibal	52c180ecf5	Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" This reverts commit `ea8de11ad5`, reversing changes made to `08e443e083`.	2017-08-14 13:00:23 +02:00
Matthew Honnibal	ac6c25f762	Check SGD is not None in update	2017-08-14 12:09:18 +02:00
Matthew Honnibal	4ab0c8c8e9	Try different drop_layer structure in Tok2Vec	2017-08-12 08:56:57 -05:00
Matthew Honnibal	ebe0f7f641	Pass embed size correctly in tagger, and cache embeddings for efficiency	2017-08-12 05:45:20 -05:00
Matthew Honnibal	f93f2bed58	Revert use of layer normalization in Tok2Vec	2017-08-09 17:47:03 -05:00
Matthew Honnibal	ac2de6dced	Switch to ReLu layers in Tok2Vec	2017-08-09 16:41:25 -05:00
Matthew Honnibal	88bf1cf87c	Update parser for fine tuning	2017-08-08 15:34:17 -05:00
Matthew Honnibal	5d837c3776	Add mix weights on fine_tune	2017-08-07 06:32:59 -05:00
Matthew Honnibal	3ed203de25	Use LayerNorm and SELU in Tok2Vec	2017-08-06 18:33:18 +02:00
Matthew Honnibal	4a5cc89138	Fix tagger 'fine_tune', to keep private CNN weights	2017-08-06 14:15:48 +02:00
Matthew Honnibal	4cfb7a54e7	Fix tagger	2017-08-06 01:53:31 +02:00
Matthew Honnibal	e9ab800e15	Fix tagging model	2017-08-06 01:50:08 +02:00
Matthew Honnibal	468c138ab3	WIP: Add fine-tuning logic to tagger model, re #1182	2017-08-06 01:13:23 +02:00
Matthew Honnibal	523b0df2c9	Update text classification model	2017-07-25 18:57:59 +02:00
Matthew Honnibal	2df563ad24	Remove optimization for textcat that caused loading problem	2017-07-23 14:10:51 +02:00
Matthew Honnibal	ded0df5e2f	Expose hyper-param as keyword arg	2017-07-22 20:14:37 +02:00
Matthew Honnibal	6ffec9dfea	Update _ml, for textcat model	2017-07-22 20:03:40 +02:00
Matthew Honnibal	727481377e	Add text-classifer thinc models	2017-07-20 00:17:17 +02:00
Matthew Honnibal	8a17b99b1c	Use NORM attribute, not LOWER	2017-06-03 15:30:16 -05:00
Matthew Honnibal	b92a89f87b	Make it easier to reference embedding tables	2017-05-29 17:53:29 -05:00
Matthew Honnibal	c91b121aeb	Move serialization functions to util	2017-05-29 10:13:42 +02:00
Matthew Honnibal	1fa2bfb600	Add model_to_bytes and model_from_bytes helpers. Probably belong in thinc.	2017-05-29 09:27:04 +02:00
Matthew Honnibal	6dad4117ad	Work on serialization for models	2017-05-29 01:37:57 +02:00
Matthew Honnibal	8de9829f09	Don't overwrite model in initialization, when loading	2017-05-27 15:50:40 -05:00
Matthew Honnibal	b27c587800	Fix pieces argument to PrecomputedMaxout	2017-05-25 06:46:59 -05:00
Matthew Honnibal	c998776c25	Make single array for features, to reduce GPU copies	2017-05-22 04:51:08 -05:00
Matthew Honnibal	8904814c0e	Add missing import	2017-05-21 09:07:56 -05:00
Matthew Honnibal	3b7c108246	Pass tokvecs through as a list, instead of concatenated. Also fix padding	2017-05-20 13:23:32 -05:00
Matthew Honnibal	b272890a8c	Try to move parser to simpler PrecomputedAffine class. Currently broken -- maybe the previous change	2017-05-20 06:40:10 -05:00
Matthew Honnibal	a438cef8c5	Fix significant bug in feature calculation -- off by 1	2017-05-18 06:21:32 -05:00
Matthew Honnibal	711ad5edc4	Cache features in doc2feats	2017-05-18 04:22:20 -05:00
Matthew Honnibal	5211645af3	Get data flowing through pipeline. Needs redesign	2017-05-16 11:21:59 +02:00
Matthew Honnibal	a9edb3aa1d	Improve integration of NN parser, to support unified training API	2017-05-15 21:53:27 +02:00
Matthew Honnibal	827b5af697	Update draft of parser neural network model Model is good, but code is messy. Currently requires Chainer, which may cause the build to fail on machines without a GPU. Outline of the model: We first predict context-sensitive vectors for each word in the input: (embed_lower \| embed_prefix \| embed_suffix \| embed_shape) >> Maxout(token_width) >> convolution ** 4 This convolutional layer is shared between the tagger and the parser. This prevents the parser from needing tag features. To boost the representation, we make a "super tag" with POS, morphology and dependency label. The tagger predicts this by adding a softmax layer onto the convolutional layer --- so, we're teaching the convolutional layer to give us a representation that's one affine transform from this informative lexical information. This is obviously good for the parser (which backprops to the convolutions too). The parser model makes a state vector by concatenating the vector representations for its context tokens. Current results suggest few context tokens works well. Maybe this is a bug. The current context tokens: * S0, S1, S2: Top three words on the stack * B0, B1: First two words of the buffer * S0L1, S0L2: Leftmost and second leftmost children of S0 * S0R1, S0R2: Rightmost and second rightmost children of S0 * S1L1, S1L2, S1R2, S1R, B0L1, B0L2: Likewise for S1 and B0 This makes the state vector quite long: 13T, where T is the token vector width (128 is working well). Fortunately, there's a way to structure the computation to save some expense (and make it more GPU friendly). The parser typically visits 2N states for a sentence of length N (although it may visit more, if it back-tracks with a non-monotonic transition). A naive implementation would require 2N (B, 13T) @ (13T, H) matrix multiplications for a batch of size B. We can instead perform one (BN, T) @ (T, 13*H) multiplication, to pre-compute the hidden weights for each positional feature wrt the words in the batch. (Note that our token vectors come from the CNN -- so we can't play this trick over the vocabulary. That's how Stanford's NN parser works --- and why its model is so big.) This pre-computation strategy allows a nice compromise between GPU-friendliness and implementation simplicity. The CNN and the wide lower layer are computed on the GPU, and then the precomputed hidden weights are moved to the CPU, before we start the transition-based parsing process. This makes a lot of things much easier. We don't have to worry about variable-length batch sizes, and we don't have to implement the dynamic oracle in CUDA to train. Currently the parser's loss function is multilabel log loss, as the dynamic oracle allows multiple states to be 0 cost. This is defined as: (exp(score) / Z) - (exp(score) / gZ) Where gZ is the sum of the scores assigned to gold classes. I'm very interested in regressing on the cost directly, but so far this isn't working well. Machinery is in place for beam-search, which has been working well for the linear model. Beam search should benefit greatly from the pre-computation trick.	2017-05-12 16:09:15 -05:00
Matthew Honnibal	bef89ef23d	Mergery	2017-05-08 08:29:36 -05:00
Matthew Honnibal	56073a11ef	Don't use tags when calculating token vectors	2017-05-08 07:52:24 -05:00
Matthew Honnibal	a66a4a4d0f	Replace einsums	2017-05-08 14:46:50 +02:00
Matthew Honnibal	807cb2e370	Add PretrainableMaxouts	2017-05-08 14:24:43 +02:00
Matthew Honnibal	2e2268a442	Precomputable hidden now working	2017-05-08 11:36:37 +02:00
Matthew Honnibal	10682d35ab	Get pre-computed version working	2017-05-08 00:38:35 +02:00
Matthew Honnibal	12039e80ca	Switch to single matmul for state layer	2017-05-07 14:26:34 +02:00
Matthew Honnibal	f99f5b75dc	working residual net	2017-05-07 03:57:26 +02:00
Matthew Honnibal	bdf2dba9fb	WIP on refactor, with hidde pre-computing	2017-05-07 02:02:43 +02:00

1 2 3 4

160 Commits