spaCy/lang_data/en
Matthew Honnibal fe9299a118 * Fix long-standing issue with coarse-grained tags: proper nouns weren't receiving the PROPN tag, and personal pronouns weren't receiving the PRON tag. This should fix Issue #191, and also Issue #325, which reported that proper nouns were being lemmatized using the common noun policies. This lemmatization will be prevented if the universal tag is PROPN, not NOUN, as no lemmatization rules are loaded for the PROPN tag. 2016-04-14 12:46:43 +02:00
gazetteer.json Fix Issue #243: Incorrect gazetteer entry 2016-01-30 06:58:29 +11:00
generate_specials.py Fix inconsistencies in generate_specials.py 2016-04-07 11:21:52 +10:00
infix.txt * Fix infixed commas in tokenizer, re Issue #326. Need to benchmark on empirical data, to make sure this doesn't break other cases. 2016-04-14 11:36:03 +02:00
lemma_rules.json * Fix quote marks in lemma_rules 2015-10-10 15:03:36 +11:00
morphs.json * Whitespace 2015-10-10 16:03:48 +11:00
prefix.txt * Add en language data, for tokenizer etc 2015-02-25 17:10:32 -05:00
specials.json * Fix Issue #201: Tokenization of there'll 2015-12-29 18:09:09 +01:00
suffix.txt * Add smart-quote possessive marker to tokenizer 2015-07-30 05:12:48 +02:00
tag_map.json * Fix long-standing issue with coarse-grained tags: proper nouns weren't receiving the PROPN tag, and personal pronouns weren't receiving the PRON tag. This should fix Issue #191, and also Issue #325, which reported that proper nouns were being lemmatized using the common noun policies. This lemmatization will be prevented if the universal tag is PROPN, not NOUN, as no lemmatization rules are loaded for the PROPN tag. 2016-04-14 12:46:43 +02:00
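
The tag_map.json commit message above describes a mechanism: once a fine-grained tag maps to the universal tag PROPN rather than NOUN, no lemmatization rules are found for it, so the word is left unchanged. The following is a minimal sketch of that idea only. The dictionaries `TAG_MAP` and `LEMMA_RULES` and the function `lemmatize` are hypothetical stand-ins, not the actual contents of tag_map.json or lemma_rules.json and not spaCy's API.

```python
# Hypothetical miniature of a tag map: fine-grained tag -> universal POS.
# (Assumed layout for illustration; not copied from tag_map.json.)
TAG_MAP = {
    "NN": {"pos": "NOUN"},
    "NNS": {"pos": "NOUN"},
    "NNP": {"pos": "PROPN"},
    "NNPS": {"pos": "PROPN"},
    "PRP": {"pos": "PRON"},
}

# Hypothetical miniature of suffix-rewrite lemma rules, keyed by universal POS.
# Note that there is deliberately no "propn" entry.
LEMMA_RULES = {
    "noun": [["ies", "y"], ["ses", "s"], ["s", ""]],
}


def lemmatize(word, tag):
    """Apply the first matching suffix rule for the tag's universal POS.

    If no rules are loaded for that POS (as for PROPN here), the word
    is returned unchanged.
    """
    pos = TAG_MAP[tag]["pos"]
    rules = LEMMA_RULES.get(pos.lower(), [])
    for old_suffix, new_suffix in rules:
        if word.endswith(old_suffix):
            return word[: len(word) - len(old_suffix)] + new_suffix
    return word


if __name__ == "__main__":
    # Tagged as a common noun, the trailing "s" is stripped.
    print(lemmatize("Williams", "NN"))
    # Tagged as a proper noun, no rules apply and the word is kept.
    print(lemmatize("Williams", "NNP"))
```

In this sketch the behaviour reported in Issue #325 corresponds to a proper noun reaching the `"noun"` rules. Mapping NNP and NNPS to PROPN avoids that without any special-casing in the lemmatizer itself.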