spaCy/spacy/matcher
Adriane Boyd 0d9740e826 Replace PhraseMatcher with Aho-Corasick
Replace PhraseMatcher with the Aho-Corasick algorithm over numpy arrays
of the hash values for the relevant attribute. The implementation is
based on FlashText.

The speed should be similar to the previous PhraseMatcher. Match IDs can
now be removed easily, and matches no longer go missing with large
keyword lists / vocabularies.

Fixes #4308.
2019-09-19 16:49:05 +02:00
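
The idea in the commit message above (match phrases by walking a keyword trie over per-token hash values, and allow match IDs to be removed) can be illustrated with a small standalone sketch. This is not the code in phrasematcher.pyx: the class name `TokenTrieMatcher` and its methods are hypothetical, Python's built-in `hash()` stands in for spaCy's attribute hashes, plain dicts stand in for the numpy arrays, and the scan restarts the trie walk at every token (FlashText-style) rather than using Aho-Corasick failure links.

```python
from typing import Dict, List, Sequence, Set, Tuple

_TERMINAL = object()  # sentinel dict key: "a pattern ends at this node"


class TokenTrieMatcher:
    """Hypothetical keyword-trie matcher over token hashes (illustration only)."""

    def __init__(self) -> None:
        self._root: dict = {}
        # match ID -> set of hashed patterns, kept so that remove() is cheap
        self._patterns: Dict[str, Set[Tuple[int, ...]]] = {}

    def add(self, key: str, pattern: Sequence[str]) -> None:
        # Assumption: hash() stands in for the hash of the matched attribute.
        hashed = tuple(hash(tok) for tok in pattern)
        node = self._root
        for h in hashed:
            node = node.setdefault(h, {})
        node.setdefault(_TERMINAL, set()).add(key)
        self._patterns.setdefault(key, set()).add(hashed)

    def remove(self, key: str) -> None:
        for hashed in self._patterns.pop(key, set()):
            # Collect the nodes along the pattern's path, root first.
            path = [self._root]
            for h in hashed:
                path.append(path[-1][h])
            terminal = path[-1]
            terminal[_TERMINAL].discard(key)
            if not terminal[_TERMINAL]:
                del terminal[_TERMINAL]
            # Prune nodes that became empty, from the leaf upwards.
            for i in range(len(hashed) - 1, -1, -1):
                if path[i + 1]:
                    break
                del path[i][hashed[i]]

    def find(self, tokens: Sequence[str]) -> List[Tuple[str, int, int]]:
        hashed = [hash(tok) for tok in tokens]
        matches = []
        for start in range(len(hashed)):
            node = self._root
            for end in range(start, len(hashed)):
                node = node.get(hashed[end])
                if node is None:
                    break
                for key in sorted(node.get(_TERMINAL, ())):
                    matches.append((key, start, end + 1))  # end is exclusive
        return matches


matcher = TokenTrieMatcher()
matcher.add("CITY", ["new", "york"])
matcher.add("CITY", ["new", "york", "city"])
matcher.add("FRUIT", ["apple"])
tokens = "i ate an apple in new york city".split()
print(matcher.find(tokens))
matcher.remove("CITY")
print(matcher.find(tokens))
```

Storing the patterns per match ID is what makes removal straightforward in this sketch: `remove` walks each stored path once and prunes any branch that no longer leads to a pattern.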
__init__.py            Dependency tree pattern matcher (#3465)                        2019-06-16 13:25:32 +02:00
_schemas.py            Tidy up and auto-format [ci skip]                              2019-08-31 13:39:06 +02:00
dependencymatcher.pyx  Dependency tree pattern matcher (#3465)                        2019-06-16 13:25:32 +02:00
matcher.pxd            adding double match for optional operator at the end (#4166)   2019-08-21 22:46:56 +02:00
matcher.pyx            add return_matches and as_tuples back to Matcher.pipe (#4303)  2019-09-18 22:00:33 +02:00
phrasematcher.pxd      Replace PhraseMatcher with Aho-Corasick                        2019-09-19 16:49:05 +02:00
phrasematcher.pyx      Replace PhraseMatcher with Aho-Corasick                        2019-09-19 16:49:05 +02:00