spaCy/spacy
Adriane Boyd 0e7f94b247
Update Tokenizer.explain with special matches (#7749)
* Update Tokenizer.explain with special matches

Update `Tokenizer.explain` and the pseudo-code in the docs to include
the processing of special cases that contain affixes or whitespace.

* Handle optional settings in explain

* Add test for special matches in explain

Add test for `Tokenizer.explain` for special cases containing affixes.
2021-04-19 19:08:20 +10:00

| Name | Last commit | Date |
| --- | --- | --- |
| `cli` | assemble CLI command (#7783) | 2021-04-19 18:39:11 +10:00 |
| `displacy` | Also exclude user hooks in displacy conversion (#7419) | 2021-03-12 09:41:59 +01:00 |
| `lang` | Added more exception to the italian language from https://forum.wordr… (#7246) | 2021-03-30 10:23:32 +02:00 |
| `matcher` | Support match alignments (#7321) | 2021-04-08 18:10:14 +10:00 |
| `ml` | Register CharEmbed layer (#7805) | 2021-04-19 18:39:34 +10:00 |
| `pipeline` | Bugfix/nel crossing sentence (#7630) | 2021-04-12 18:08:01 +10:00 |
| `tests` | Update Tokenizer.explain with special matches (#7749) | 2021-04-19 19:08:20 +10:00 |
| `tokens` | Fix/update extension copying in Span.as_doc and Doc.from_docs (#7574) | 2021-03-30 09:49:12 +02:00 |
| `training` | Fix parser sourcing in NER converter (#7631) | 2021-04-08 12:25:03 +02:00 |
| `__init__.pxd` | | |
| `__init__.py` | Add vocab kwarg back to spacy.load | 2021-03-11 10:58:59 +01:00 |
| `__main__.py` | | |
| `about.py` | Update thinc pin and set version to v3.0.5 (#7389) | 2021-03-10 11:10:53 +01:00 |
| `attrs.pxd` | | |
| `attrs.pyx` | | |
| `compat.py` | | |
| `default_config.cfg` | Support large/infinite training corpora (#7208) | 2021-04-08 18:08:04 +10:00 |
| `default_config_pretraining.cfg` | pretrain architectures (#6451) | 2020-12-08 14:41:03 +08:00 |
| `errors.py` | Improve checks for sourced components (#7490) | 2021-04-19 18:36:32 +10:00 |
| `glossary.py` | | |
| `kb.pxd` | Revert added_strings change (#6236) | 2020-10-10 18:55:07 +02:00 |
| `kb.pyx` | Replace links to nightly docs [ci skip] | 2021-01-30 20:09:38 +11:00 |
| `language.py` | Improve checks for sourced components (#7490) | 2021-04-19 18:36:32 +10:00 |
| `lexeme.pxd` | | |
| `lexeme.pyx` | reduce memory load when reading all vectors from file (#6945) | 2021-02-07 08:05:43 +08:00 |
| `lookups.py` | Replace links to nightly docs [ci skip] | 2021-01-30 20:09:38 +11:00 |
| `morphology.pxd` | | |
| `morphology.pyx` | Prevent 0-length mem alloc (#6653) | 2021-01-06 12:50:17 +11:00 |
| `parts_of_speech.pxd` | | |
| `parts_of_speech.pyx` | | |
| `pipe_analysis.py` | Tidy up and auto-format | 2020-09-29 21:39:28 +02:00 |
| `py.typed` | Add py.typed | 2021-03-16 09:48:31 +01:00 |
| `schemas.py` | Support env vars and CLI overrides for project.yml | 2021-02-10 13:45:27 +11:00 |
| `scorer.py` | Extend score_spans for overlapping & non-labeled spans (#7209) | 2021-04-08 12:19:17 +02:00 |
| `strings.pxd` | | |
| `strings.pyx` | Make vocab update in get_docs deterministic (#7603) | 2021-04-09 11:53:13 +02:00 |
| `structs.pxd` | Add SpanGroup and Graph container types to represent arbitrary annotations (#6696) | 2021-01-14 17:30:41 +11:00 |
| `symbols.pxd` | introduce token.has_head and refer to MISSING_DEP_ (WIP) | 2021-01-12 17:17:06 +01:00 |
| `symbols.pyx` | introduce token.has_head and refer to MISSING_DEP_ (WIP) | 2021-01-12 17:17:06 +01:00 |
| `tokenizer.pxd` | | |
| `tokenizer.pyx` | Update Tokenizer.explain with special matches (#7749) | 2021-04-19 19:08:20 +10:00 |
| `typedefs.pxd` | Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master | 2020-11-25 11:49:34 +01:00 |
| `typedefs.pyx` | | |
| `util.py` | Set catalogue lower pin to v2.0.3 (#7762) | 2021-04-19 18:37:17 +10:00 |
| `vectors.pyx` | Fix vectors data on GPU (#7626) | 2021-04-19 18:30:03 +10:00 |
| `vocab.pxd` | Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master | 2020-11-25 11:49:34 +01:00 |
| `vocab.pyx` | Fix vectors data on GPU (#7626) | 2021-04-19 18:30:03 +10:00 |