Commit Graph

2068 Commits

Author SHA1 Message Date
Gyorgy Orosz 366b3f8685 Merge branch 'master' into hu_tokenizer 2016-12-20 20:53:31 +01:00
Gyorgy Orosz c035928156 Partial Hungarian number tokenization is added. 2016-12-20 20:46:20 +01:00
JM 70ff0639b5 Fixed missing vec_path declaration that was failing if 'add_vectors' was set. Added vec_path variable declaration to avoid accessing it before assignment in case 'add_vectors' is in overrides. 2016-12-20 18:21:05 +01:00
Magnus Burton db5a077d2b Initial commit for Swedish 2016-12-20 11:05:06 +01:00
Matthew Honnibal 3f5747a9b2 Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-12-18 23:44:22 +01:00
Matthew Honnibal 40e71586d6 Fix Issue #683: Add 'SP' to tag_map, if it's not there already, within the Morphology class. 2016-12-18 23:44:05 +01:00
Matthew Honnibal fa1d23e10d Merge branch 'master' of https://github.com/explosion/spaCy 2016-12-18 23:32:03 +01:00
Matthew Honnibal f38eb25fe1 Fix test for word vector 2016-12-18 23:31:55 +01:00
Matthew Honnibal 4e68abebc4 Merge branch 'master' of ssh://github.com/explosion/spaCy 2016-12-18 23:19:45 +01:00
Matthew Honnibal 5a6328a5a4 Increment version 2016-12-18 23:19:19 +01:00
Matthew Honnibal 13a0b31279 Another tweak to GloVe path hackery. 2016-12-18 23:12:49 +01:00
Matthew Honnibal 2c6228565e Fix vector loading re glove hack 2016-12-18 23:06:44 +01:00
Matthew Honnibal 618b50a064 Fix issue #684: GloVe vectors not loaded in spacy.en.English. 2016-12-18 22:46:31 +01:00
Matthew Honnibal 404019ad2f Fix issue #672: ent_iob_ was a string, not unicode, due to missing unicode_literals statement. 2016-12-18 22:33:53 +01:00
Matthew Honnibal 2ef9d53117 Untested fix for issue #684: GloVe vectors hack should be inserted in English, not in spacy.load. 2016-12-18 22:29:31 +01:00
Matthew Honnibal c065359459 Fix path-override bug in spacy.load 2016-12-18 22:15:29 +01:00
Matthew Honnibal 813249f826 Work on morphology class. Still not fully consistent with rest of library. 2016-12-18 17:35:22 +01:00
Matthew Honnibal 3679fb43a3 Fix loading of lemmatizer 2016-12-18 17:34:09 +01:00
Matthew Honnibal 3980f1b0cb Ignore more morphology attributes in deprecated mode of intify_attrs 2016-12-18 17:33:46 +01:00
Matthew Honnibal 7a98ee5e5a Merge language data change 2016-12-18 17:03:52 +01:00
Matthew Honnibal e4c951c153 Merge branch 'organize-language-data' of ssh://github.com/explosion/spaCy into organize-language-data 2016-12-18 17:01:08 +01:00
Ines Montani b99d683a93 Fix formatting 2016-12-18 16:58:28 +01:00
Ines Montani b11d8cd3db Merge remote-tracking branch 'origin/organize-language-data' into organize-language-data 2016-12-18 16:57:12 +01:00
Ines Montani d1c1d3f9cd Fix tokenizer test 2016-12-18 16:55:32 +01:00
Ines Montani 753068f1d5 Use base language data as default 2016-12-18 16:55:25 +01:00
Ines Montani bcc1d50d09 Remove trailing whitespace 2016-12-18 16:54:52 +01:00
Ines Montani 4e95737c6c Add base tag map 2016-12-18 16:54:28 +01:00
Ines Montani 2b2ea8ca11 Reorganise language data 2016-12-18 16:54:19 +01:00
Matthew Honnibal 1b31c05bf8 Whitespace 2016-12-18 16:51:40 +01:00
Matthew Honnibal bdcecb3c96 Add import in regression test 2016-12-18 16:51:31 +01:00
Matthew Honnibal 6ee1df93c5 Set tag_map to None if it's not seen in the data by vocab 2016-12-18 16:51:10 +01:00
Matthew Honnibal 33996e770b Update header for morphology class 2016-12-18 16:50:42 +01:00
Matthew Honnibal d58187ffa7 Filter out morphology keys in deprecated attrs 2016-12-18 16:50:26 +01:00
Matthew Honnibal 837a5d4100 Update morphology class so that exceptions can be added one-by-one, and so that arbitrary attributes can be referenced. 2016-12-18 16:49:46 +01:00
Matthew Honnibal 44f4f008bd Wire up lemmatizer rules for English 2016-12-18 15:50:09 +01:00
Matthew Honnibal e6fc4afb04 Whitespace 2016-12-18 15:48:00 +01:00
Ines Montani 32b36c3882 Break language data components into their own files 2016-12-18 15:40:22 +01:00
Ines Montani 1bff59a8db Update English language data 2016-12-18 15:36:53 +01:00
Ines Montani 2eb163c5dd Add lemma rules 2016-12-18 15:36:53 +01:00
Ines Montani 29ad8143d8 Add morph rules 2016-12-18 15:36:53 +01:00
Ines Montani bc40dad7d9 Add entity rules 2016-12-18 15:36:53 +01:00
Ines Montani eaa3b1319d Fix formatting 2016-12-18 15:36:53 +01:00
Ines Montani 704c7442e0 Break language data components into their own files 2016-12-18 15:36:53 +01:00
Ines Montani 62655fd36f Add ENT_ID constant 2016-12-18 15:36:53 +01:00
Matthew Honnibal fa272fdf12 Merge branch 'organize-language-data' of ssh://github.com/explosion/spaCy into organize-language-data 2016-12-18 15:00:21 +01:00
Matthew Honnibal 57c4341453 Refactor loading of morphology exceptions, adding a method add_special_case. 2016-12-18 14:59:44 +01:00
Ines Montani 77cf2fb0f6 Remove unnecessary argument in test 2016-12-18 14:06:27 +01:00
Ines Montani 121c310566 Remove trailing whitespace 2016-12-18 14:06:27 +01:00
Ines Montani 0fc4e45cb3 Fix tag map for German 2016-12-18 13:30:03 +01:00
Ines Montani 28326649f3 Fix typo 2016-12-18 13:30:03 +01:00