Commit Graph

116 Commits

Author SHA1 Message Date
Matthew Honnibal 47a3ef06a6 Unhack deprojetivization, moving it into pipeline
Previously the deprojectivize() call was attached to the transition
system, and only called for German. Instead it should be a separate
process, called after the parser. This makes it available for any
language. Closes #898.
2017-03-31 12:31:50 +02:00
Matthew Honnibal 83ba6c247c Fix init of Language without model 2017-03-26 16:46:00 +02:00
Raphaël Bournhonesque f332bf05be Remove unused import statements 2017-03-21 21:08:54 +01:00
ines 9605cf39cc Handle default path in Language classes 2017-03-18 12:58:45 +01:00
Matthew Honnibal 8843b84bd1 Merge remote-tracking branch 'origin/develop-downloads' 2017-03-16 12:00:42 -05:00
ines 618ce3b425 Add .meta to Language object
Allows getting the current model's meta data, e.g.:
nlp = spacy.load('my-model')
print(nlp.meta)
2017-03-16 17:14:56 +01:00
Matthew Honnibal b382dc902c Add morph rules in Language 2017-03-15 09:24:40 -05:00
Matthew Honnibal f70be44746 Use lemmatizer in code, not from downloaded model. 2017-03-15 04:52:50 -05:00
Matthew Honnibal f71eeef9bb Pass path argument to end_training 2017-03-09 18:42:40 -06:00
Matthew Honnibal cd33b39a04 Fix 2/3 problem for json save/load 2017-03-08 01:39:13 +01:00
Ines Montani aa876884f0 Revert "Revert "Merge remote-tracking branch 'origin/master'""
This reverts commit fb9d3bb022.
2017-01-09 13:28:13 +01:00
Matthew Honnibal 3679fb43a3 Fix loading of lemmatizer 2016-12-18 17:34:09 +01:00
Ines Montani b11d8cd3db Merge remote-tracking branch 'origin/organize-language-data' into organize-language-data 2016-12-18 16:57:12 +01:00
Ines Montani 753068f1d5 Use base language data as default 2016-12-18 16:55:25 +01:00
Ines Montani bcc1d50d09 Remove trailing whitespace 2016-12-18 16:54:52 +01:00
Matthew Honnibal 44f4f008bd Wire up lemmatizer rules for English 2016-12-18 15:50:09 +01:00
Matthew Honnibal 296d33a4fc Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-11-26 12:36:18 +01:00
Matthew Honnibal 1f6c37c6f5 Fix create_tokenizer when nlp is None 2016-11-26 12:36:04 +01:00
Matthew Honnibal c7889492f9 Fix model saving error for Python 3 2016-11-25 18:04:30 -06:00
Matthew Honnibal 159e8c46e1 Merge old training fixes with newer state 2016-11-25 09:16:36 -06:00
Matthew Honnibal a2f55e7015 Pass cfg through loading, for training. 2016-11-25 09:01:20 -06:00
Matthew Honnibal 09f68bc641 Fix Issue #639: stop words in language class not used. This patch is messy, but it's better not to change too much until the language data loading can be properly refactored. 2016-11-24 00:13:55 +01:00
Matthew Honnibal 48e1dc29d4 Fix default path loading. 2016-11-23 23:48:55 +01:00
ExplodingCabbage 6c4f488e89 Fix syntax mistake 2016-11-23 15:12:45 +00:00
Matthew Honnibal 60eb2343ce Only try to load vectors if they exist. 2016-11-23 13:50:24 +01:00
Matthew Honnibal 618ac36093 Fix use of path argument in Language.__init__. Needs to be keyword arg, not positional. 2016-11-23 13:26:34 +01:00
Mark Amery fbe19680a6 Fix another bug related to Language.__init__'s path parameter 2016-11-20 20:31:34 +00:00
Mark Amery b0a07c21a0 Fix `path` param of `Language.__init__` always being ignored
There was an explicitly-declared `path` keyword argument, so 'path'
would never be present in `**overrides`. This line just overwrote
any manually-specified value the user might've passed to the `path`
parameter.
2016-11-20 16:29:57 +00:00
Matthew Honnibal 22647c2423 Check that patterns aren't null before compiling regex for tokenizer 2016-11-02 20:35:29 +01:00
Matthew Honnibal f7fee6c24b Check for class-defined make_docs method before assigning one provided as an argument 2016-11-02 19:57:13 +01:00
Matthew Honnibal b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal cb49189477 Remove dead code 2016-10-26 13:11:07 +02:00
Matthew Honnibal 150e02d72e Fix Issue #566 2016-10-23 20:19:01 +02:00
Matthew Honnibal 739213a8af Fix create_pipeline keyword argument. 2016-10-23 14:24:16 +02:00
Matthew Honnibal 5ec32f5d97 Fix loading of GloVe vectors, to address Issue #541 2016-10-20 18:27:48 +02:00
Matthew Honnibal d4aaf2752c Fix issue #535: Pipeline elements added even when data not installed. 2016-10-19 19:55:19 +02:00
Matthew Honnibal 1b651db9c5 Fix parser creation in Language class. 2016-10-18 19:36:44 +02:00
Matthew Honnibal 45a6f9b9c7 Fix loading of tagger. 2016-10-18 19:33:04 +02:00
Matthew Honnibal 7d5212f131 Refactor defaults 2016-10-18 16:18:25 +02:00
Matthew Honnibal f787cd29fe Refactor the pipeline classes to make them more consistent, and remove the redundant blank() constructor. 2016-10-16 21:34:57 +02:00
Matthew Honnibal ca51f3b77e Use DependencyParser and EntityRecognizer in the Language class. 2016-10-16 17:58:12 +02:00
Matthew Honnibal a81c5a7abf Fix name of labels keyword to 'actions'. 2016-10-16 12:00:27 +02:00
Matthew Honnibal 8a6b35d266 Delay binding in MakeDoc 2016-10-16 11:41:55 +02:00
Matthew Honnibal 08e9134760 Change default value of path to True 2016-10-15 14:12:54 +02:00
Matthew Honnibal 6d8cb515ac Break the tokenization stage out of the pipeline into a function 'make_doc'. This allows all pipeline methods to have the same signature. 2016-10-14 17:38:29 +02:00
Matthew Honnibal 41f88ce938 Fix dep model loading in parser 2016-10-12 20:26:38 +02:00
Matthew Honnibal 0e2bedc373 Fix default labels for parser and NER 2016-10-12 19:12:40 +02:00
Matthew Honnibal 847a4a4182 Refactor Language, dropping Language.blank() method. 2016-10-12 13:45:58 +02:00
Matthew Honnibal ea23b64cc8 Refactor training, with new spacy.train module. Defaults still a little awkward. 2016-10-09 12:24:24 +02:00
Matthew Honnibal eceeaefe53 Fix defaults for Parser and Entity, adding a blank= argument. 2016-09-30 19:56:06 +02:00