mirror of https://github.com/explosion/spaCy.git
0d9740e826
Replace PhraseMatcher with the Aho-Corasick algorithm over numpy arrays of the hash values for the relevant attribute. The implementation is based on FlashText. The speed should be similar to the previous PhraseMatcher. Match IDs can now be removed easily, and matches no longer go missing with large keyword lists or vocabularies. Fixes #4308. |
||
---|---|---|
__init__.py | ||
_schemas.py | ||
dependencymatcher.pyx | ||
matcher.pxd | ||
matcher.pyx | ||
phrasematcher.pxd | ||
phrasematcher.pyx |
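
The commit message above describes keyword matching over sequences of attribute hash values, with support for removing match IDs. The following is a minimal pure-Python sketch of that idea, not the code in `phrasematcher.pyx`: `TrieMatcher`, `encode`, and the use of Python's built-in `hash` in place of spaCy's vocabulary hashes are all hypothetical, and the lookup is a plain trie walk from each start position (FlashText-style), without the failure links of full Aho-Corasick.

```python
from typing import Dict, List, Sequence, Tuple

_TERMINAL = object()  # sentinel key marking the end of a stored pattern


class TrieMatcher:
    """Toy keyword matcher over sequences of token hashes (illustrative only)."""

    def __init__(self) -> None:
        self.root: Dict = {}

    def add(self, key: str, pattern: Sequence[int]) -> None:
        """Store `pattern` (a sequence of token hashes) under match ID `key`."""
        node = self.root
        for token_hash in pattern:
            node = node.setdefault(token_hash, {})
        node.setdefault(_TERMINAL, set()).add(key)

    def remove(self, key: str) -> None:
        """Remove every pattern stored under `key` and prune empty branches."""
        self._prune(self.root, key)

    def _prune(self, node: Dict, key: str) -> bool:
        """Drop `key` at and below `node`; return True if `node` ends up empty."""
        for child_key in list(node):
            if child_key is _TERMINAL:
                node[_TERMINAL].discard(key)
                if not node[_TERMINAL]:
                    del node[_TERMINAL]
            elif self._prune(node[child_key], key):
                del node[child_key]
        return not node

    def find(self, tokens: Sequence[int]) -> List[Tuple[str, int, int]]:
        """Return (key, start, end) for every stored pattern found in `tokens`."""
        matches = []
        for start in range(len(tokens)):
            node = self.root
            for end in range(start, len(tokens)):
                node = node.get(tokens[end])
                if node is None:
                    break
                for key in sorted(node.get(_TERMINAL, ())):
                    matches.append((key, start, end + 1))
        return matches


def encode(text: str) -> List[int]:
    """Stand-in for a lowercase-attribute hash lookup: hash each lowercased word."""
    return [hash(word) for word in text.lower().split()]


matcher = TrieMatcher()
matcher.add("CITY", encode("new york"))
matcher.add("CITY", encode("new york city"))
matcher.add("ORG", encode("york university"))

doc = encode("She left New York City for York University")
print(matcher.find(doc))  # [('CITY', 2, 4), ('CITY', 2, 5), ('ORG', 6, 8)]

matcher.remove("CITY")
print(matcher.find(doc))  # [('ORG', 6, 8)]
```

Because every pattern lives in a single trie keyed by integer hashes, removing a match ID only requires discarding it from the terminal sets and pruning branches that become empty; the other stored patterns are left untouched.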