Commit Graph

5793 Commits

Author SHA1 Message Date
Matthew Honnibal 15f6efc127 Remove vectors from vocab 2017-05-28 11:45:32 +02:00
Matthew Honnibal c1263a844b Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-27 18:32:57 -05:00
Matthew Honnibal 9e711c3476 Divide d_loss by batch size 2017-05-27 18:32:46 -05:00
Matthew Honnibal b082f76494 Randomize pipeline order during training 2017-05-27 18:32:21 -05:00
ines 10d05c2b92 Fix typos, wording and formatting 2017-05-28 01:30:12 +02:00
ines eb5a8be9ad Update language overview and add section on 'xx' lang class 2017-05-28 01:15:44 +02:00
Matthew Honnibal a1d4c97fb7 Improve correctness of minibatching 2017-05-27 17:59:00 -05:00
ines 84189c1cab Add 'xx' language ID for multi-language support. Allows models to specify their language ID as 'xx'. 2017-05-28 00:58:59 +02:00
ines 33e332e67c Remove unused export 2017-05-28 00:57:59 +02:00
ines 01a7b10319 Add fallback fonts to illustrations 2017-05-28 00:32:54 +02:00
ines eb703f7656 Update API docs 2017-05-28 00:32:43 +02:00
ines c1983621fb Update util functions for model loading 2017-05-28 00:22:40 +02:00
ines c8543c8237 Fix formatting and docstrings and remove deprecated function 2017-05-28 00:22:40 +02:00
ines db116cbeda Update tokenization 101 and add illustration 2017-05-28 00:22:40 +02:00
ines b03fb2d7b0 Update 101 and usage docs 2017-05-28 00:22:40 +02:00
Matthew Honnibal 49235017bf Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-27 16:34:28 -05:00
Matthew Honnibal 7ebd26b8aa Use ordered dict to specify transitions 2017-05-27 15:52:20 -05:00
Matthew Honnibal 3eea5383a1 Add move_names property to parser 2017-05-27 15:51:55 -05:00
Matthew Honnibal 8de9829f09 Don't overwrite model in initialization, when loading 2017-05-27 15:50:40 -05:00
Matthew Honnibal 99316fa631 Use ordered dict to specify actions 2017-05-27 15:50:21 -05:00
Matthew Honnibal 655ca58c16 Clarifying change to StateC.clone 2017-05-27 15:49:37 -05:00
Matthew Honnibal 5e4312feed Evaluate loaded class, to ensure save/load works 2017-05-27 15:47:02 -05:00
Matthew Honnibal 34bbad8e0e Add __reduce__ methods on parser subclasses. Fixes pickling. 2017-05-27 15:46:06 -05:00
Matthew Honnibal 7cc9c3e9a6 Fix convert CLI 2017-05-27 15:44:42 -05:00
ines ae11c8d60f Add emoji sentiment to lightning tour matcher example 2017-05-27 20:02:20 +02:00
ines 1203959625 Add pipeline setting to meta.json generator 2017-05-27 20:02:01 +02:00
ines 086a06e7d7 Fix CLI docstrings and add command as first argument. Workaround for Plac. 2017-05-27 20:01:46 +02:00
ines 22bf5f63bf Update Matcher docs and add social media analysis example 2017-05-27 17:58:18 +02:00
ines 0d33ead507 Fix initialisation of Doc in lightning tour example 2017-05-27 17:58:06 +02:00
ines e05bcd6aa8 Update docs to reflect flattened model meta.json. Don't use "setup" key and instead, keep "lang" on root level and add "pipeline". 2017-05-27 17:57:46 +02:00
ines a8e58e04ef Add symbols class to punctuation rules to handle emoji (see #1088). Currently doesn't work for Hungarian, because of conflicts with the custom punctuation rules. Also doesn't take multi-character emoji like 👩🏽‍💻 into account. 2017-05-27 17:57:10 +02:00
Matthew Honnibal dc07d72d80 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-27 08:20:40 -05:00
Matthew Honnibal de13fe0305 Remove length cap on sentences 2017-05-27 08:20:32 -05:00
Matthew Honnibal 73a643d32a Don't randomise pipeline for training, and don't update if no gradient 2017-05-27 08:20:13 -05:00
Matthew Honnibal 3d22fcaf0b Return None from parser if there are no annotations 2017-05-26 14:02:59 -05:00
Matthew Honnibal d06f235fc9 Fix conflict on convert.py 2017-05-26 11:33:29 -05:00
Matthew Honnibal 2e587c6417 Export iob_to_biluo utility 2017-05-26 11:32:55 -05:00
Matthew Honnibal 2b3b937a04 Fix converter CLI 2017-05-26 11:32:41 -05:00
Matthew Honnibal 5a87bcf35f Fix converters 2017-05-26 11:32:34 -05:00
Matthew Honnibal 8af3100143 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2017-05-26 11:31:41 -05:00
Matthew Honnibal 3d5a536eaa Improve efficiency of parser batching 2017-05-26 11:31:23 -05:00
Matthew Honnibal daac3e3573 Always shuffle gold data, and support length cap 2017-05-26 11:30:52 -05:00
ines 70afcfec3e Update defaults and example 2017-05-26 14:04:31 +02:00
ines 1b982f0838 Update train command and add docs on hyperparameters 2017-05-26 14:02:38 +02:00
ines 1b9c6ded71 Update API docs and add "source" button to GH source 2017-05-26 13:40:32 +02:00
ines 93ee5c4a52 Update serialization info 2017-05-26 13:22:45 +02:00
ines f122d82f29 Update usage docs and add "under construction" 2017-05-26 13:17:48 +02:00
Matthew Honnibal d65f99a720 Improve model saving in train script 2017-05-26 05:52:09 -05:00
ines 286c3d0719 Update usage and 101 docs 2017-05-26 12:46:29 +02:00
ines 6d76c1ea16 Add 101 for Vocab, Lexeme and StringStore 2017-05-26 12:45:01 +02:00