spaCy/spacy/tests/lang/ca/test_exception.py

import pytest


@pytest.mark.parametrize(
    "text,lemma",
    [("aprox.", "aproximadament"), ("pàg.", "pàgina"), ("p.ex.", "per exemple")],
)
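def test_ca_tokenizer_handles_abbr(ca_tokenizer, text, lemma):
    # Each abbreviation must stay a single token; `lemma` is parametrized
    # but not asserted in this test.
    tokens = ca_tokenizer(text)
    assert len(tokens) == 1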


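def test_ca_tokenizer_handles_exc_in_text(ca_tokenizer):
    # Abbreviations inside running text stay single tokens, "dels" is split
    # into "d" + "els", and the sentence-final period is its own token.
    text = "La Dra. Puig viu a la pl. dels Til·lers."
    doc = ca_tokenizer(text)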
    assert [t.text for t in doc] == [
        "La",
        "Dra.",
        "Puig",
        "viu",
        "a",
        "la",
        "pl.",
        "d",
        "els",
        "Til·lers",
        ".",
    ]

Commit history:
2018-11-26 14:25:47 +00:00  Catalan Language Support (#2940): Catalan language support; adding Catalan to documentation
2019-08-20 15:36:34 +00:00  Tidy up and auto-format
2021-09-27 12:42:30 +00:00  Update Catalan tokenizer (#9297): new tokenization changes; changed the failing test in test/lang/ca; added final punctuation to ensure consistency; updated the test to check all tokens. Co-authored-by: cayorodriguez, Sofie Van Landeghem