Commit Graph

170 Commits

Author SHA1 Message Date
Matthew Honnibal 7b06bb896e Fix for serialization 2017-05-29 13:42:55 +02:00
Matthew Honnibal 74235587ef Fix to serialization 2017-05-29 13:40:31 +02:00
Matthew Honnibal 59f355d525 Fixes for serialization 2017-05-29 13:38:20 +02:00
Matthew Honnibal ff26aa6c37 Work on to/from bytes/disk serialization methods 2017-05-29 11:45:45 +02:00
Matthew Honnibal 8a24c60c1e Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-28 08:12:05 -05:00
Matthew Honnibal bc97bc292c Fix __call__ method 2017-05-28 08:11:58 -05:00
Matthew Honnibal b082f76494 Randomize pipeline order during training 2017-05-27 18:32:21 -05:00
Matthew Honnibal 73a643d32a Don't randomise pipeline for training, and don't update if no gradient 2017-05-27 08:20:13 -05:00
Matthew Honnibal 8af3100143 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-26 11:31:41 -05:00
ines 353f0ef8d7 Use disable argument (list) for serialization 2017-05-26 12:33:54 +02:00
Matthew Honnibal dbf2a4cf57 Update all models on each epoch 2017-05-25 19:46:56 -05:00
Matthew Honnibal 82b11b0320 Remove print statement 2017-05-25 17:15:59 -05:00
Matthew Honnibal f403c2cd5f Add env opts for optimizer 2017-05-25 11:19:26 -05:00
Matthew Honnibal 8500d9b1da Only train one task per iter, holding grads 2017-05-25 06:47:42 -05:00
Matthew Honnibal e6cc927ab1 Rearrange multi-task learning 2017-05-24 20:10:54 -05:00
Matthew Honnibal 9adfe9e8fc Don't hold gradient updates in language -- let the parser decide how to batch the updates. 2017-05-23 04:29:10 -05:00
Matthew Honnibal 3959d778ac Revert "Revert "WIP on improving parser efficiency""
This reverts commit 532afef4a8.
2017-05-23 03:06:53 -05:00
Matthew Honnibal 532afef4a8 Revert "WIP on improving parser efficiency"
This reverts commit bdaac7ab44.
2017-05-23 03:05:25 -05:00
Matthew Honnibal bdaac7ab44 WIP on improving parser efficiency 2017-05-23 02:59:31 -05:00
ines 54f04a9fe0 Update API docs with changes in spacy.gold and spacy.language 2017-05-22 12:29:30 +02:00
Matthew Honnibal 9262fc4829 Fix syntax error 2017-05-22 05:14:59 -05:00
Matthew Honnibal 2a5eb9f61e Make nonproj methods top-level functions, instead of class methods 2017-05-22 04:51:08 -05:00
Matthew Honnibal 5738d373d5 Add deprojectivize to pipeline 2017-05-22 04:51:08 -05:00
Matthew Honnibal 8d1e64be69 Add experimental NeuralLabeller 2017-05-22 04:51:08 -05:00
Matthew Honnibal 5db89053aa Merge docstrings 2017-05-21 13:46:23 -05:00
Matthew Honnibal 432b3499b3 Fix memory leak 2017-05-21 13:38:46 -05:00
Matthew Honnibal 4c9202249d Refactor training, to fix memory leak 2017-05-21 09:07:06 -05:00
ines d82ae9a585 Change "function" to "callable" in docs 2017-05-21 13:17:40 +02:00
Matthew Honnibal 3b7c108246 Pass tokvecs through as a list, instead of concatenated. Also fix padding 2017-05-20 13:23:32 -05:00
Matthew Honnibal 66ea9aebe7 Remove the state argument from Language 2017-05-19 13:25:42 -05:00
ines 2c8c9dc0c9 Update docstrings and API docs for Language 2017-05-19 18:47:24 +02:00
ines d42bc16868 Update docstrings and API docs for Language class 2017-05-18 23:57:38 +02:00
Matthew Honnibal c2c825127a Fix use_params and pipe methods 2017-05-18 08:30:59 -05:00
Matthew Honnibal 2713041571 Fix GPU usage in Language 2017-05-18 04:25:19 -05:00
Matthew Honnibal 793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
Matthew Honnibal 8cf097ca88 Redesign training to integrate NN components
* Obsolete .parser, .entity etc names in favour of .pipeline
* Components no longer create models on initialization
* Models created by loading method (from_disk(), from_bytes() etc), or
    .begin_training()
* Add .predict(), .set_annotations() methods in components
* Pass state through pipeline, to allow components to share information
    more flexibly.
2017-05-16 16:17:30 +02:00
Matthew Honnibal 5211645af3 Get data flowing through pipeline. Needs redesign 2017-05-16 11:21:59 +02:00
Matthew Honnibal a9edb3aa1d Improve integration of NN parser, to support unified training API 2017-05-15 21:53:27 +02:00
Matthew Honnibal 9e167b7bb6 Strip serializer from code 2017-05-09 17:28:50 +02:00
ines ea5fa46475 Import LEX_ATTRS from lang.lex_attrs 2017-05-09 00:58:10 +02:00
ines 6eb6306843 Fix language data imports 2017-05-08 23:58:31 +02:00
Matthew Honnibal d0e19267e8 Create directory if missing in save_to_directory 2017-04-23 21:24:43 +02:00
Matthew Honnibal 4d2a659c52 Fix json dump for Python3 2017-04-23 17:05:53 +02:00
ines ddd5194088 Update Language docs and docstrings 2017-04-17 01:52:13 +02:00
ines f62b740961 Use compat.json_dumps 2017-04-17 01:46:14 +02:00
ines 8e83f8e2fa Update docstrings 2017-04-17 01:40:26 +02:00
ines e2299dc389 Ensure path in save_to_directory 2017-04-17 01:40:14 +02:00
Matthew Honnibal 4efd6fb9d6 Fix training 2017-04-16 15:28:27 -05:00
Matthew Honnibal 89a4f262fc Fix training methods 2017-04-16 13:00:37 -05:00
ines c05ec4b89a Add compat functions and remove old workarounds
Add ensure_path util function to handle checking instance of path
2017-04-15 12:11:16 +02:00