spaCy/spacy
Matthew Honnibal 04395ffa49 Bring English tag_map in line with UD Treebank
I wrote a small script to read the UD English training data and check
that our tag map and morph rules produce the best POS mapping.
This hadn't been done for some time, and there have been various changes
to the UD schema since it was last done. After these changes we should
see much better agreement between our POS assignments and the UD POS
tags.
2019-03-21 13:53:44 +01:00
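The check described in the commit message could look roughly like the sketch below. It is not the script the commit refers to, which is not shown here. It assumes the treebank is in CoNLL-U format (UPOS in column 4, XPOS in column 5) and that the tag map is simplified to a plain dict from fine-grained tag to coarse POS string. The names `best_pos_map`, `find_disagreements`, and `SAMPLE` are hypothetical.

```python
from collections import Counter, defaultdict


def best_pos_map(conllu_lines):
    """Map each fine-grained tag (XPOS) to the UPOS it most often co-occurs with."""
    counts = defaultdict(Counter)
    for line in conllu_lines:
        line = line.rstrip("\n")
        # Skip blank sentence separators and comment lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        token_id, upos, xpos = cols[0], cols[3], cols[4]
        # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in token_id or "." in token_id:
            continue
        counts[xpos][upos] += 1
    return {xpos: c.most_common(1)[0][0] for xpos, c in counts.items()}


def find_disagreements(tag_map, best):
    """Return {tag: (mapped_pos, treebank_pos)} for tags where the two differ."""
    return {
        tag: (tag_map[tag], best[tag])
        for tag in best
        if tag in tag_map and tag_map[tag] != best[tag]
    }


# Tiny hand-written sample in CoNLL-U layout (assumption: illustration only,
# not taken from the UD English treebank).
SAMPLE = [
    "# text = The dog wants to run",
    "1\tThe\tthe\tDET\tDT\t_\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\tNN\t_\t3\tnsubj\t_\t_",
    "3\twants\twant\tVERB\tVBZ\t_\t0\troot\t_\t_",
    "4\tto\tto\tPART\tTO\t_\t5\tmark\t_\t_",
    "5\trun\trun\tVERB\tVB\t_\t3\txcomp\t_\t_",
    "",
]
```

Calling `find_disagreements({"DT": "DET", "NN": "NOUN", "TO": "ADP"}, best_pos_map(SAMPLE))` would flag only `TO`, since the sample tags it as `PART`. A real check would stream lines from the treebank file and would also need to account for the morph rules, which this sketch leaves out.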
cli Merge pull request #3441 from explosion/fix/cli-ud-scripts 2019-03-20 12:19:15 +01:00
data
displacy 💫 Fix displaCy support for RTL languages (#3393) 2019-03-11 18:52:50 +01:00
lang Bring English tag_map in line with UD Treebank 2019-03-21 13:53:44 +01:00
matcher Add actual deprecation warning for n_threads (resolves #3410) 2019-03-15 16:38:44 +01:00
pipeline Tidy up references to n_threads and fix default 2019-03-15 16:24:26 +01:00
syntax Improve beam search defaults 2019-03-17 21:47:45 +01:00
tests Merging conversion scripts for conll formats (#3405) 2019-03-15 18:14:46 +01:00
tokens Fix similarity calculation if vectors are on GPU (#3440) 2019-03-20 12:09:59 +01:00
__init__.pxd
__init__.py Fix formatting (hopefully also restarts build properly) 2019-03-20 09:55:45 +01:00
__main__.py Update __main__.py 2019-03-20 09:43:26 +01:00
_align.pyx Improve alignment around quotes 2018-08-16 01:04:34 +02:00
_ml.py Revert changes to optimizer default hyper-params (WIP) (#3415) 2019-03-16 21:39:02 +01:00
about.py Set version to 2.1.1 2019-03-20 00:59:45 +01:00
attrs.pxd
attrs.pyx
compat.py Tidy up and improve docs and docstrings (#3370) 2019-03-08 11:42:26 +01:00
errors.py Fix formatting (hopefully also restarts build properly) 2019-03-20 09:55:45 +01:00
glossary.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
gold.pxd
gold.pyx Fix jsonl to json conversion (#3419) 2019-03-17 22:12:54 +01:00
language.py Merge pull request #3416 from explosion/feature/improve-beam 2019-03-16 18:42:18 +01:00
lemmatizer.py Tidy up and improve docs and docstrings (#3370) 2019-03-08 11:42:26 +01:00
lexeme.pxd 💫 Support lexical attributes in retokenizer attrs (closes #2390) (#3325) 2019-02-24 21:13:51 +01:00
lexeme.pyx Tidy up property code style (#3391) 2019-03-11 15:59:09 +01:00
morphology.pxd
morphology.pyx 💫 Fix interaction of lemmatizer and tokenizer exceptions (#3388) 2019-03-11 01:31:21 +01:00
parts_of_speech.pxd
parts_of_speech.pyx
scorer.py 💫 Tidy up and auto-format .py files (#2983) 2018-11-30 17:03:03 +01:00
strings.pxd
strings.pyx 💫 Make serialization methods consistent (#3385) 2019-03-10 19:16:45 +01:00
structs.pxd Make NORM a token attribute (#3029) 2018-12-08 10:49:10 +01:00
symbols.pxd
symbols.pyx
tokenizer.pxd
tokenizer.pyx Add actual deprecation warning for n_threads (resolves #3410) 2019-03-15 16:38:44 +01:00
typedefs.pxd
typedefs.pyx
util.py Auto-format [ci skip] 2019-03-11 17:10:50 +01:00
vectors.pyx Update Vectors.find docs [ci skip] 2019-03-16 17:10:57 +01:00
vocab.pxd 💫 Small efficiency fixes to tokenizer (#2587) 2018-07-24 23:35:54 +02:00
vocab.pyx Tidy up property code style (#3391) 2019-03-11 15:59:09 +01:00