spaCy/spacy/tests/regression/test_issue2671.py

# coding: utf-8
from __future__ import unicode_literals

from spacy.lang.en import English
from spacy.matcher import Matcher


def test_issue2671():
    """Ensure the correct rule ID is returned for matches with quantifiers.
    See also #2675
    """
    nlp = English()
    matcher = Matcher(nlp.vocab)
    pattern_id = "test_pattern"
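    # The middle token is optional ("?"), so the pattern can match with or
    # without a punctuation token between "high" and "adrenaline".
    pattern = [
        {"LOWER": "high"},
        {"IS_PUNCT": True, "OP": "?"},
        {"LOWER": "adrenaline"},
    ]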
    matcher.add(pattern_id, None, pattern)
    # doc1 has a hyphen that can fill the optional slot; doc2 omits it.
    doc1 = nlp("This is a high-adrenaline situation.")
    doc2 = nlp("This is a high adrenaline situation.")
    # Every match must resolve back to the ID the pattern was added under.
    matches1 = matcher(doc1)
    for match_id, start, end in matches1:
        assert nlp.vocab.strings[match_id] == pattern_id
    matches2 = matcher(doc2)
    for match_id, start, end in matches2:
        assert nlp.vocab.strings[match_id] == pattern_id