Commit Graph

109 Commits

Author SHA1 Message Date
Ines Montani cae4457c38 💫 Add .similarity warnings for no vectors and option to exclude warnings (#2197)
* Add logic to filter out warning IDs via environment variable

Usage: SPACY_WARNING_EXCLUDE=W001,W007

* Add warnings for empty vectors

* Add warning if no word vectors are used in .similarity methods

For example, if only tensors are available in small models – should hopefully clear up some confusion around this

* Capture warnings in tests

* Rename SPACY_WARNING_EXCLUDE to SPACY_WARNING_IGNORE
2018-05-21 01:22:38 +02:00
Ines Montani 3141e04822
💫 New system for error messages and warnings (#2163)
* Add spacy.errors module

* Update deprecation and user warnings

* Replace errors and asserts with new error message system

* Remove redundant asserts

* Fix whitespace

* Add messages for print/util.prints statements

* Fix typo

* Fix typos

* Move CLI messages to spacy.cli._messages

* Add decorator to display error code with message

An implementation like this is nice because it only modifies the string when it's retrieved from the containing class – so we don't have to worry about manipulating tracebacks etc.

* Remove unused link in spacy.about

* Update errors for invalid pipeline components

* Improve error for unknown factories

* Add displaCy warnings

* Update formatting consistency

* Move error message to spacy.errors

* Update errors and check if doc returned by component is None
2018-04-03 15:50:31 +02:00
4altinok ed1ac2969e added new lexical feat to lexeme 2018-02-11 18:51:48 +01:00
Matthew Honnibal ccb51a9f36 Make .similarity() return 1.0 if all orth attrs match 2018-01-15 16:29:48 +01:00
Matthew Honnibal 4b09616b58 Add test for #1757: Comparison against None 2018-01-15 15:55:01 +01:00
Explosion Bot 41d0f1665a Fix add_attrs for cluster 2017-10-30 16:07:50 +01:00
Explosion Bot 5ede7cec9b Improve Lexeme.set_attrs method 2017-10-30 11:49:11 +01:00
ines a8e10f94e4 Tidy up Lexeme and update docs 2017-10-27 21:07:50 +02:00
Matthew Honnibal 6ceb0f0518 Allow Lexeme.rank to be set 2017-08-24 21:43:00 +02:00
Matthew Honnibal 6cd5730ee7 Fix lex struct setters for strings 2017-05-29 01:05:09 +02:00
Matthew Honnibal f51e6a6c16 Adjust lexeme sizing for attr_t being 64 bit 2017-05-28 12:51:09 +02:00
Matthew Honnibal 6863d01361 Remove vectors from lexeme 2017-05-28 11:45:48 +02:00
ines 27de0834b2 Update docstrings and API docs for Lexeme 2017-05-20 15:13:42 +02:00
Matthew Honnibal 793430aa7a Get spaCy train command working with neural network
* Integrate models into pipeline
* Add basic serialization (maybe incorrect)
* Fix pickle on vocab
2017-05-17 12:04:50 +02:00
ines 9d85cda8e4 Fix models error message and use about.__docs_models__ (see #1051) 2017-05-13 13:05:47 +02:00
ines 564939391a Remove spacy.orth 2017-05-09 01:21:47 +02:00
ines d24589aa72 Clean up imports, unused code, whitespace, docstrings 2017-04-15 12:05:47 +02:00
ines 561f2a3eb4 Use consistent formatting for docstrings 2017-04-15 11:59:21 +02:00
ines 3b667a24d4 Remove whitespace 2017-04-01 10:21:08 +02:00
ines e71a1f4bd0 Fix download commands in error messages (see #946) 2017-04-01 10:20:57 +02:00
Matthew Honnibal b86f8af0c1 Fix doc strings 2016-11-01 12:25:36 +01:00
Matthew Honnibal bea44bd3c4 Fix vector_norm when vector is assigned to Lexeme. 2016-10-23 14:23:56 +02:00
Matthew Honnibal ed5e178817 Add sentiment property on lexeme object 2016-10-19 20:52:52 +02:00
Matthew Honnibal e233328d38 Fix Issue #371: Lexeme objects were unhashable. 2016-09-27 13:22:30 +02:00
Matthew Honnibal 17137f5c0c * Fix issue #372: mistake in Lexeme rich comparison 2016-05-12 12:58:57 +02:00
Matthew Honnibal e31df66d26 * Fix Issue #361: Lexemes didn't have rich comparison. 2016-05-05 01:32:26 +02:00
Wolfgang Seeker d65ef41d08 make error messages language independent 2016-03-24 11:47:09 +01:00
Wolfgang Seeker 03fb498dbe introduce lang field for LexemeC to hold language id
put noun_chunk logic into iterators.py for each language separately
2016-03-10 13:01:34 +01:00
Matthew Honnibal 419edfab50 * Use generic flags for the new attributes until they're added 2016-02-04 15:50:54 +01:00
Matthew Honnibal 11810be33e * Add Python hooks for is_bracket/is_quote/is_left_punct/is_right_punct 2016-02-04 13:04:16 +01:00
Matthew Honnibal ab5aac5b2f * Add .rank property to Token and Lexeme, for frequency rank 2015-11-08 16:18:25 +01:00
Matthew Honnibal 1e99fcd413 * Rename .repvec to .vector in C API 2015-11-03 23:47:59 +11:00
Matthew Honnibal f7283a5067 * Fix vectors bugs for OOV words 2015-09-22 02:10:25 +02:00
Matthew Honnibal 44aecba701 * Fix Token.has_vector and Lexeme.has_vector 2015-09-22 01:43:16 +02:00
Matthew Honnibal 596fde8daa * Add has_vector attribute to Token and Lexeme 2015-09-21 19:52:43 +10:00
Matthew Honnibal f32927efbf * Raise exceptions if attempt to access parse, but data is not installed. This partly but not fully addresses Issue #97. Still need exceptions on the various Token attributes that access the parse tree, e.g. token.head, token.lefts, token.rights, etc. Exceptions should be centralized, too. 2015-09-21 18:35:40 +10:00
Matthew Honnibal 191d593e03 * Fix vectors bug in lexeme 2015-09-15 19:05:11 +10:00
Matthew Honnibal dd4d64b235 * Support setting of word vectors on Lexeme object. 2015-09-15 14:42:27 +10:00
Matthew Honnibal 193f127f81 * Fix ugly py_check_flag and py_set_flag functions in Lexeme 2015-09-15 13:06:18 +10:00
Matthew Honnibal 9561d88529 * Add is_stop to Python API 2015-09-14 18:25:40 +10:00
Matthew Honnibal 65dc0d1dfb * Extend word vectors support, with .similarity() function, vector_norm property, and rename repvec to vector. Keep repvec name as well for now for backwards compatibility. 2015-09-14 17:49:58 +10:00
Matthew Honnibal 07c09a0e1b * Fix attribute getters and setters in Lexeme 2015-09-09 14:29:22 +02:00
Matthew Honnibal 86c888667f * Merge in changes from de branch 2015-09-06 19:49:28 +02:00
Matthew Honnibal d2fc104a26 * Begin merge of Gazetteer and DE branches 2015-09-06 19:45:15 +02:00
Matthew Honnibal 7cc56ada6e * Temporarily add py_set_flag attribute in Lexeme 2015-09-06 17:52:51 +02:00
Matthew Honnibal 3acf60df06 * Add missing properties in Lexeme class 2015-08-26 19:16:28 +02:00
Matthew Honnibal 6f1743692a * Work on language-independent refactoring 2015-08-23 20:49:18 +02:00
Matthew Honnibal cad0cca4e3 * Tmp 2015-08-22 22:04:34 +02:00
Matthew Honnibal 8e4c69ee8c * Add is_oov property, and fix up handling of attributes 2015-07-27 01:50:06 +02:00
Matthew Honnibal 6bb96c122d * Host IS_ flags in attrs.pxd, and add properties for them on Token and Lexeme objects 2015-07-26 16:37:16 +02:00