Commit Graph

3083 Commits

Author SHA1 Message Date
kengz 17b7832419 mark test as needing models 2016-10-16 14:39:07 -04:00
kengz f046e0d7c8 add parse_tree method to language, separate from __call__ for efficiency, but will use __call__ to get the doc 2016-10-16 14:20:23 -04:00
Ines Montani 74a7a3fa16 Update notification settings 2016-10-16 01:53:26 +02:00
Ines Montani 15c012054c Update notification settings 2016-10-16 01:46:22 +02:00
Ines Montani 4e90604b96 Update badges 2016-10-16 01:11:35 +02:00
Ines Montani 5473a0026f Add "Where to ask questions" section to README 2016-10-16 00:51:31 +02:00
Ines Montani 59cba1be7f Fix Travis badge 2016-10-15 23:51:16 +02:00
Ines Montani 7c4d347b34 Update README.md for website 2016-10-03 20:21:33 +02:00
Ines Montani 9e8f333763 Update website 2016-10-03 20:19:13 +02:00
Ines Montani 504b80b6da Update gitignore 2016-10-03 20:19:05 +02:00
Matthew Honnibal eae2d723d1 Update LICENSE 2016-10-03 21:33:58 +11:00
Ines Montani 1359cfe32f Update README.rst 2016-10-03 12:33:22 +02:00
Matthew Honnibal d3dc5718b2 Fix syntax error in Doc 2016-09-28 11:39:49 +02:00
Matthew Honnibal 1b520e7bab Improve docstrings for Doc object 2016-09-28 11:15:13 +02:00
Matthew Honnibal 81a47c01d8 Fix test for empty sentence string. 2016-09-27 19:21:22 +02:00
Matthew Honnibal 4cbf0d3bb6 Handle errors when no valid actions are available, pointing users to the issue tracker. 2016-09-27 19:19:53 +02:00
Matthew Honnibal 430473bd98 Raise errors when no actions are available, re Issue #429 2016-09-27 19:09:37 +02:00
Matthew Honnibal fc4a7ad794 Test and fix Issue #411: IndexError when .sents property is used on empty string. 2016-09-27 18:49:14 +02:00
Matthew Honnibal 3d370b7d45 Add test for Issue #445, fixed in 3cb4d455d, with improved lemmatizer logic 2016-09-27 18:39:46 +02:00
Matthew Honnibal a2f3510d6d Fix lemmatizer 2016-09-27 17:47:05 +02:00
Matthew Honnibal 07776d8096 Fix pos name conflict in lemmatize 2016-09-27 17:35:58 +02:00
Matthew Honnibal 35cd953f9e Fix pos name conflict with morphology 2016-09-27 14:16:22 +02:00
Matthew Honnibal 8e7df3c4ca Expect the parser data, if parser.load() is called. 2016-09-27 14:02:12 +02:00
Matthew Honnibal bb4f201ad2 Pass morphological features from tag map into the lemmatizer. 2016-09-27 14:01:43 +02:00
Matthew Honnibal 40509e8bca Tweak the new is_base_form logic, because we can expect the 'pos' key in the morphology we're passed. 2016-09-27 14:01:16 +02:00
Matthew Honnibal 9c8ac91d72 Add test for Issue #435 2016-09-27 13:52:38 +02:00
Matthew Honnibal 3cb4d455d2 Pass lemmatizer morphological features, so that rules are sensitive to base/inflected distinction, which is how the WordNet data is designed. See Issue #435 2016-09-27 13:52:11 +02:00
Matthew Honnibal e233328d38 Fix Issue #371: Lexeme objects were unhashable. 2016-09-27 13:22:30 +02:00
Matthew Honnibal e382e48d9f Temporarily patch handling of defaul templates for tagger. Need to move these to language_data. 2016-09-27 13:21:28 +02:00
Matthew Honnibal a44763af0e Fix Issue #469: Incorrectly cased root label in noun chunk iterator 2016-09-27 13:13:01 +02:00
Matthew Honnibal b14b9b096b Return None if /deps directory not present, instead of trying to load the parser. 2016-09-26 18:48:03 +02:00
Matthew Honnibal e07b9665f7 Don't expect parser model 2016-09-26 18:09:33 +02:00
Matthew Honnibal ee6fa106da Fix parser features 2016-09-26 17:57:32 +02:00
Matthew Honnibal e607e4b598 Fix parser loading 2016-09-26 17:51:11 +02:00
Matthew Honnibal 0b2d7ae9d6 Fix Entity creation 2016-09-26 15:41:22 +02:00
Matthew Honnibal 2debc4e0a2 Add .blank() method to Parser. Start housing default dep labels and entity types within the Defaults class. 2016-09-26 11:57:54 +02:00
Matthew Honnibal 722199acb8 Add spacy.blank() method, that doesn't load data. Don't try to load data if path is falsey 2016-09-26 11:07:46 +02:00
Matthew Honnibal ae202e7a60 Fix init_model.py 2016-09-25 15:58:51 +02:00
Matthew Honnibal e56653f848 Add language data for German 2016-09-25 15:44:45 +02:00
Matthew Honnibal 7db956133e Move tokenizer data for German into spacy.de.language_data 2016-09-25 15:37:33 +02:00
Matthew Honnibal 95aaea0d3f Refactor so that the tokenizer data is read from Python data, rather than from disk 2016-09-25 14:49:53 +02:00
Matthew Honnibal d7e9acdcdf Add English language data, so that the tokenizer doesn't require the data download 2016-09-25 14:49:00 +02:00
Matthew Honnibal 82b8cc5efb Whitespace 2016-09-24 22:17:01 +02:00
Matthew Honnibal fd58f7655a Python 3 compatible basestring 2016-09-24 22:16:43 +02:00
Matthew Honnibal 082e95b19e Python 3 compatible basestring 2016-09-24 22:09:21 +02:00
Matthew Honnibal f19af6cb2c Python 3 compatible basestring 2016-09-24 22:08:43 +02:00
Matthew Honnibal 3ed4cdfe32 Handle pathlib.Path objects in CFile 2016-09-24 22:01:46 +02:00
Matthew Honnibal df88690177 Fix encoding of path variable 2016-09-24 21:13:15 +02:00
Matthew Honnibal af847e07fc Fix usage of pathlib for Python3 -- turning paths to strings. 2016-09-24 21:05:27 +02:00
Matthew Honnibal 453683aaf0 Fix spacy/vocab.pyx 2016-09-24 20:50:31 +02:00