mirror of https://github.com/explosion/spaCy.git
1139247532
* Revert changes to priority of `token_match` so that it has priority over all other tokenizer patterns
* Add lookahead and potentially slow lookbehind back to the default URL pattern
* Expand character classes in URL pattern to improve matching around lookaheads and lookbehinds related to #4882
* Revert changes to Hungarian tokenizer
* Revert (xfail) several URL tests to their status before #4374
* Update `tokenizer.explain()` and docs accordingly

| File |
|---|
| `__init__.py` |
| `examples.py` |
| `punctuation.py` |
| `stop_words.py` |
| `tokenizer_exceptions.py` |