spaCy/spacy
Ines Montani 483dddc9bc 💫 Add token match pattern validation via JSON schemas (#3244)
* Add custom MatchPatternError

* Improve validators and add validation option to Matcher

* Adjust formatting

* Never validate in Matcher within PhraseMatcher

If we do decide to make validate default to True, the PhraseMatcher's internal Matcher should still never validate. Its patterns are created automatically anyway, and it's currently unclear whether validation affects performance at a very large scale.
2019-02-13 01:47:26 +11:00
cli 💫 Add token match pattern validation via JSON schemas (#3244) 2019-02-13 01:47:26 +11:00
data
displacy Tidy up and fix small bugs and typos 2019-02-08 14:14:49 +01:00
lang Tidy up and fix small bugs and typos 2019-02-08 14:14:49 +01:00
matcher 💫 Add token match pattern validation via JSON schemas (#3244) 2019-02-13 01:47:26 +11:00
pipeline 💫 Break up large pipeline.pyx (#3246) 2019-02-10 12:14:51 +01:00
syntax 💫 Prevent parser from predicting unseen classes (#3075) 2018-12-20 16:12:22 +01:00
tests 💫 Add token match pattern validation via JSON schemas (#3244) 2019-02-13 01:47:26 +11:00
tokens Only run noun chunks iterator in Span if available (closes #3199) 2019-02-08 18:33:16 +01:00
__init__.pxd
__init__.py Tidy up and format remaining files 2018-11-30 17:43:08 +01:00
__main__.py 💫 New JSON helpers, training data internals & CLI rewrite (#2932) 2018-11-30 20:16:14 +01:00
_align.pyx Improve alignment around quotes 2018-08-16 01:04:34 +02:00
_ml.py 💫 Better support for semi-supervised learning (#3035) 2018-12-10 16:25:33 +01:00
about.py Set version to v2.1.0a7.dev1 2019-02-08 01:54:01 +11:00
attrs.pxd Fix LANG symbol 2018-02-17 18:10:50 +01:00
attrs.pyx Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
compat.py 💫 Replace ujson, msgpack and dill/pickle/cloudpickle with srsly (#3003) 2018-12-03 01:28:22 +01:00
errors.py 💫 Add token match pattern validation via JSON schemas (#3244) 2019-02-13 01:47:26 +11:00
glossary.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
gold.pxd
gold.pyx Add gold.spans_from_biluo_tags helper (#3227) 2019-02-06 21:50:26 +11:00
language.py Improve entry points and allow custom language classes via entry points (#3080) 2018-12-20 23:58:43 +01:00
lemmatizer.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
lexeme.pxd
lexeme.pyx 💫 Add .similarity warnings for no vectors and option to exclude warnings (#2197) 2018-05-21 01:22:38 +02:00
morphology.pxd Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
morphology.pyx Fix lemmatization 2018-07-05 13:56:02 +02:00
parts_of_speech.pxd
parts_of_speech.pyx
scorer.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
strings.pxd
strings.pyx Add get_string_id helper to spacy.strings 2018-12-10 16:09:26 +01:00
structs.pxd Make NORM a token attribute (#3029) 2018-12-08 10:49:10 +01:00
symbols.pxd Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
symbols.pyx Revert "Merge branch 'develop' of https://github.com/explosion/spaCy into develop" 2018-03-27 19:23:02 +02:00
tokenizer.pxd
tokenizer.pyx Replacing regex library with re to increase tokenization speed (#3218) 2019-02-01 18:05:22 +11:00
typedefs.pxd
typedefs.pyx
util.py 💫 Add token match pattern validation via JSON schemas (#3244) 2019-02-13 01:47:26 +11:00
vectors.pyx Fix KeyError in Vectors.most_similar. Fixes #2648 2018-12-10 16:19:18 +01:00
vocab.pxd 💫 Small efficiency fixes to tokenizer (#2587) 2018-07-24 23:35:54 +02:00
vocab.pyx Prevent exceptions from setting POS but not TAG. Closes #1773 2018-12-30 13:16:05 +01:00