spaCy/spacy/tests/pipeline
Adriane Boyd a4b32b9552
Handle missing reference values in scorer (#6286)
* Handle missing reference values in scorer

Handle missing values in reference doc during scoring where it is
possible to detect an unset state for the attribute. If no reference
docs contain annotation, `None` is returned instead of a score. `spacy
evaluate` displays `-` for missing scores and the missing scores are
saved as `None`/`null` in the metrics.
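
As an illustration of the behaviour described above, here is a minimal sketch in plain Python. It is not spaCy's scorer code. The helper names `score_attr` and `format_score` are hypothetical, and using `None` to mark an unset reference value is an assumption made for this example.

```python
import json
from typing import Iterable, Optional, Tuple


def score_attr(
    pairs: Iterable[Tuple[Optional[str], Optional[str]]]
) -> Optional[float]:
    """Accuracy over (reference, predicted) pairs.

    Hypothetical helper: a reference value of None stands for an unset
    attribute and is skipped instead of being counted as an error.
    """
    correct = 0
    total = 0
    for ref, pred in pairs:
        if ref is None:
            # Missing reference annotation: nothing to compare against.
            continue
        total += 1
        if ref == pred:
            correct += 1
    if total == 0:
        # No reference annotation at all: report "no score", not 0.0.
        return None
    return correct / total


def format_score(score: Optional[float]) -> str:
    """Render a score for display, using "-" when the score is missing."""
    return "-" if score is None else f"{score * 100:.2f}"


metrics = {
    "tag_acc": score_attr([("NN", "NN"), (None, "VB"), ("VB", "NN")]),
    "pos_acc": score_attr([(None, "NOUN"), (None, "VERB")]),
}
# Missing scores are serialized as null in JSON.
metrics_json = json.dumps(metrics)
```

Returning `None` rather than `0.0` keeps "no annotation to score against" distinct from "every prediction was wrong".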

Attributes without unset states:

* `token.head`: relies on `token.dep` to recognize unset values
* `doc.cats`: unable to handle missing annotation

Additional changes:

* add optional `has_annotation` check to `score_spans` to replace
  `doc.sents` hack
* update `score_token_attr_per_feat` to handle missing and empty morph
  representations
* fix bug in `Doc.has_annotation` for normalization of `IS_SENT_START`
  vs. `SENT_START`
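
The difference between a missing and an empty morph representation can be sketched as follows. This is a simplified illustration, not the implementation of `score_token_attr_per_feat`. The names `parse_morph` and `score_per_feat` are hypothetical, and the example assumes that `None` means "missing" while `""` or `"_"` means "annotated, but no features".

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

EMPTY_MORPH = "_"  # assumed marker for "annotated, but no features"

Feats = Set[Tuple[str, str]]


def parse_morph(value: Optional[str]) -> Optional[Feats]:
    """Parse "Case=Nom|Number=Sing" into {("Case", "Nom"), ("Number", "Sing")}."""
    if value is None:
        return None  # missing: no annotation available
    if value in ("", EMPTY_MORPH):
        return set()  # empty: annotated as having no features
    feats: Feats = set()
    for feat in value.split("|"):
        name, _, val = feat.partition("=")
        feats.add((name, val))
    return feats


def score_per_feat(
    pairs: Iterable[Tuple[Optional[str], Optional[str]]]
) -> Optional[Dict[str, Dict[str, float]]]:
    """Per-feature precision/recall/F over (reference, predicted) morph strings."""
    counts: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"tp": 0, "fp": 0, "fn": 0}
    )
    annotated = False
    for ref_str, pred_str in pairs:
        ref = parse_morph(ref_str)
        if ref is None:
            continue  # skip tokens without reference annotation
        annotated = True
        pred = parse_morph(pred_str) or set()
        for name, _ in ref & pred:
            counts[name]["tp"] += 1
        for name, _ in pred - ref:
            counts[name]["fp"] += 1
        for name, _ in ref - pred:
            counts[name]["fn"] += 1
    if not annotated:
        return None  # no reference annotation anywhere
    scores: Dict[str, Dict[str, float]] = {}
    for name, c in counts.items():
        p = c["tp"] / (c["tp"] + c["fp"]) if c["tp"] + c["fp"] else 0.0
        r = c["tp"] / (c["tp"] + c["fn"]) if c["tp"] + c["fn"] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[name] = {"p": p, "r": r, "f": f}
    return scores
```

In this sketch, a token whose reference morph is empty still counts as annotated, so any features predicted for it are recorded as false positives, whereas a token with a missing reference is ignored entirely.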

* Fix import

* Update return types
2020-11-03 15:47:18 +01:00
__init__.py
test_analysis.py Simplify pipe analysis 2020-08-01 13:40:06 +02:00
test_attributeruler.py Handle missing reference values in scorer (#6286) 2020-11-03 15:47:18 +01:00
test_entity_linker.py adding tests for trained models to ensure predict reproducibility 2020-10-13 21:07:13 +02:00
test_entity_ruler.py Clear rule-based components on initialize 2020-10-08 09:51:31 +02:00
test_functions.py Tidy up tests and docs 2020-09-21 20:43:54 +02:00
test_initialize.py Test with default value 2020-09-29 17:00:40 +02:00
test_lemmatizer.py Make lemmatizers use initialize logic (#6182) 2020-10-02 15:42:36 +02:00
test_models.py call NumpyOps instead of get_current_ops() 2020-10-14 16:55:00 +02:00
test_morphologizer.py adding tests for trained models to ensure predict reproducibility 2020-10-13 21:07:13 +02:00
test_pipe_factories.py TextCat updates and fixes (#6263) 2020-10-18 14:50:41 +02:00
test_pipe_methods.py Fix typo in test 2020-10-09 18:00:21 +02:00
test_sentencizer.py Refactor Docs.is_ flags (#6044) 2020-09-17 00:14:01 +02:00
test_senter.py adding tests for trained models to ensure predict reproducibility 2020-10-13 21:07:13 +02:00
test_tagger.py adding tests for trained models to ensure predict reproducibility 2020-10-13 21:07:13 +02:00
test_textcat.py TextCat updates and fixes (#6263) 2020-10-18 14:50:41 +02:00
test_tok2vec.py Also rename to include_static_vectors in CharEmbed 2020-10-09 11:54:48 +02:00