Commit Graph

8113 Commits

Author SHA1 Message Date
Motoki Wu 54062b7326 added tests for issue #1915 2018-01-30 18:30:19 -08:00
Motoki Wu f4a7d1a423 make sure to pass in **cfg to each component when training 2018-01-30 18:29:54 -08:00
ines 4046823699 Only check component in factories if string (see #1911) 2018-01-30 16:29:07 +01:00
ines ce10d320c4 Fix component check in self.factories (see #1911) 2018-01-30 16:09:37 +01:00
ines 8901814248 Improve error handling if pipeline component is not callable (resolves #1911)
Also add help message if user accidentally calls nlp.add_pipe() with a string of a built-in component name.
2018-01-30 15:43:03 +01:00
Ines Montani 0d8a3c9b59
Merge pull request #1905 from Kimahriman/download-auto-link (fixes #1904)
Fixed auto linking after download and added simple test to check
2018-01-29 19:56:06 +00:00
Adam Binford 9238749aaf Removed test to avoid network requests 2018-01-29 14:48:20 -05:00
Adam Binford 1a2c2f7d7f Fixed auto linking after download and added simple test to check 2018-01-29 14:25:21 -05:00
Matthew Honnibal cb7110c22e
Merge pull request #1882 from ohenrik/nb_lemma_and_tag_map
Add Norwegian Bokmål ('nb') lemmatizer and tag_map
2018-01-29 18:18:50 +01:00
Matthew Honnibal 0c1e7f0c86
Merge pull request #1893 from azarezade/master
Add Persian language
2018-01-29 18:18:33 +01:00
Matthew Honnibal cbdab75b36 Increment version 2018-01-28 23:46:22 +01:00
Matthew Honnibal 512e6adb08
Merge pull request #1896 from thomasopsomer/fix-sent
Fix sentence boundaries serialization (issue #1834)
2018-01-28 21:18:51 +01:00
Matthew Honnibal f5b1ad4100 Limit parser model size, to hopefully reduce memory during CI tests 2018-01-28 21:00:32 +01:00
Thomas Opsomer f35895d81b add contributor agreement 2018-01-28 20:12:05 +01:00
Thomas Opsomer 515e25910e fix sent_start in serialization 2018-01-28 19:50:42 +01:00
Thomas Opsomer 45d62561f7 add test for the issue 2018-01-28 19:49:56 +01:00
ines 6d978e5c35 Don't use deprecated Doc.merge call in displaCy
As reported here: https://stackoverflow.com/a/48464412/6400719
2018-01-27 11:25:05 +01:00
Ali Zarezade bb6bd3d8ae add Persian language 2018-01-27 13:27:26 +03:30
Ali Zarezade d195675db5 add Persian language 2018-01-27 13:21:38 +03:30
Ole Henrik Skogstrøm 8e2c9f2475 Cleaned up nb tag_map comments 2018-01-25 11:09:28 +01:00
Ole Henrik Skogstrøm 1107e89fcf Updated doc string on nb tag_map module 2018-01-25 11:08:28 +01:00
Ole Henrik Skogstrøm bbc758526c Added contributors agreement 2018-01-25 11:05:29 +01:00
Matthew Honnibal 6a8cb905aa
Merge pull request #1876 from GregDubbin/master
Pattern matcher fixes
2018-01-24 16:38:11 +01:00
Matthew Honnibal 38b260e0c3
Merge pull request #1879 from azarezade/master
Add Persian character and symbols
2018-01-24 16:34:22 +01:00
Matthew Honnibal edb71a280e Add test for #1883: Unpickling Matcher 2018-01-24 15:42:33 +01:00
Matthew Honnibal 2ad050e668 Fix unpickling of Matcher. Also store correct data in matcher._patterns 2018-01-24 15:42:11 +01:00
Ole Henrik Skogstrøm 4058a7d579 Fix æøå characters in lemmatizer 2018-01-24 14:03:14 +01:00
Ole Henrik Skogstrøm 42248f423f Updated tag map 2018-01-24 13:50:33 +01:00
Ole Henrik Skogstrøm 74b430b49a Correct Lemmatizer 2018-01-24 13:26:33 +01:00
Ole Henrik Skogstrøm b9b3a40c78 Add Norwegian lemmatizer and tag_map 2018-01-24 12:28:29 +01:00
Matthew Honnibal 42a18ef903 Add test for #1868: Vocab.__contains__ with ints 2018-01-23 23:27:05 +01:00
Matthew Honnibal 43f381ce36 Make Vocab.__contains__ work with ints. Fixes #1868 2018-01-23 23:26:47 +01:00
greg 85ab99e692 Correct test examples 2018-01-23 15:00:14 -05:00
greg f50bb1aafc Restructure StateC to eliminate dependency on unordered_map 2018-01-23 14:40:03 -05:00
Matthew Honnibal f3753c2453 Further model deserialization fixes re #1727 2018-01-23 19:16:05 +01:00
Matthew Honnibal 91e916cb67 Add comment to new test 2018-01-23 19:11:53 +01:00
Matthew Honnibal fd187d71ad Add test for #1727 2018-01-23 19:11:01 +01:00
Matthew Honnibal 85c942a6e3 Don't overwrite pretrained_dims setting from cfg. Fixes #1727 2018-01-23 19:10:49 +01:00
Ali Zarezade 42349471bc add ٪ as punctuation 2018-01-23 18:11:33 +03:30
Ali Zarezade c27c7bf0e0 add contributors.md 2018-01-23 13:47:30 +03:30
Ali Zarezade 2bda582135
Add Persian character and symbols
Add Persian characters and the following:
- ٪ used instead of %
- ؟ used instead of ?
- ﷼ used instead of $
- ، used instead of ,
- ؛ used instead of ;
2018-01-23 13:20:36 +03:30
Matthew Honnibal 7e6dc283db Fix unicode import in test 2018-01-22 23:55:44 +01:00
greg 686735b94e Fix matcher import 2018-01-22 16:53:05 -05:00
greg 3a491093ee Import libcpp.map if libcpp.unordered_map doesn't exist 2018-01-22 16:46:25 -05:00
greg daefed0a34 Correct documentation of '+' and '*' ops 2018-01-22 15:55:44 -05:00
greg d55992bdf0 Switch match dictionary to use final state pointer rather than ID 2018-01-22 15:36:47 -05:00
Matthew Honnibal 4ce7d24fd5 Add test for #1799: Set left and right edges (and thus sentences) in non-projective parses. 2018-01-22 20:18:38 +01:00
Matthew Honnibal 56164ab688 Set l_edge and r_edge correctly for non-projective parses. Fixes #1799 2018-01-22 20:18:04 +01:00
Matthew Honnibal 964aa1b384 Merge branch 'master' of https://github.com/explosion/spaCy 2018-01-22 19:18:46 +01:00
Matthew Honnibal 29897ed1b3 Allow vector loading to work on 1d data files. Fixes #1831 2018-01-22 19:18:26 +01:00