Commit Graph

7864 Commits

Author SHA1 Message Date
Adriane Boyd 50f20cf722 Revert changes to Scorer.score_spans 2020-09-25 08:21:47 +02:00
Matthew Honnibal 93d7ff309f Remove print 2020-09-24 21:05:27 +02:00
Matthew Honnibal 16475528f7
Fix skipped documents in entity scorer (#6137)
* Fix skipped documents in entity scorer

* Add back the skipping of unannotated entities

* Update spacy/scorer.py

* Use more specific NER scorer

* Fix import

* Fix get_ner_prf

* Add scorer

* Fix scorer

Co-authored-by: Ines Montani <ines@ines.io>
2020-09-24 20:38:57 +02:00
Matthew Honnibal 2abb4ba9db
Make a pre-check to speed up alignment cache (#6139)
* Dirty trick to fast-track alignment cache

* Improve alignment cache check

* Fix header

* Fix align cache

* Fix align logic
2020-09-24 18:13:39 +02:00
Ines Montani 26e28ed413 Fix combined scores if multiple components report it 2020-09-24 17:11:13 +02:00
Ines Montani 0b52b6904c Update entity_linker.py 2020-09-24 17:10:35 +02:00
Ines Montani 20b89a9717 Increment version [ci skip] 2020-09-24 16:57:02 +02:00
Adriane Boyd 3c062b3911
Add MORPH handling to Matcher (#6107)
* Add MORPH handling to Matcher

* Add `MORPH` to `Matcher` schema
* Rename `_SetMemberPredicate` to `_SetPredicate`
* Add `ISSUBSET` and `ISSUPERSET` operators to `_SetPredicate`
  * Add special handling for normalization and conversion of morph
    values into sets
  * For other attrs, `ISSUBSET` acts like `IN` and `ISSUPERSET` only
    matches for 0 or 1 values

* Update test

* Rename to IS_SUBSET and IS_SUPERSET
2020-09-24 16:55:09 +02:00
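The set-operator semantics described in commit 3c062b3911 above can be sketched in plain Python. This is an illustration of the behaviour stated in the commit message, not the code from spaCy's `Matcher`: the helper names `to_value_set`, `is_subset` and `is_superset` and the `is_morph` flag are hypothetical, and the assumption that a morph value is a `|`-separated feature string is ours.

```python
from typing import FrozenSet, Iterable


def to_value_set(value: str, is_morph: bool) -> FrozenSet[str]:
    # Hypothetical helper (not spaCy's API): a morph string such as
    # "Case=Nom|Number=Sing" is split into a set of features; any other
    # attribute value is treated as a one-element set.
    if is_morph:
        return frozenset(part for part in value.split("|") if part)
    return frozenset([value])


def is_subset(value: str, pattern_values: Iterable[str], is_morph: bool = False) -> bool:
    # IS_SUBSET: every value on the token is listed in the pattern.
    return to_value_set(value, is_morph) <= frozenset(pattern_values)


def is_superset(value: str, pattern_values: Iterable[str], is_morph: bool = False) -> bool:
    # IS_SUPERSET: every value listed in the pattern is present on the token.
    return to_value_set(value, is_morph) >= frozenset(pattern_values)


morph = "Case=Nom|Number=Sing"
print(is_subset(morph, ["Case=Nom", "Number=Sing", "Person=3"], is_morph=True))
print(is_superset(morph, ["Number=Sing"], is_morph=True))
# For single-valued attributes, IS_SUBSET behaves like IN ...
print(is_subset("NOUN", ["NOUN", "PROPN"]))
# ... and IS_SUPERSET can only match a pattern with 0 or 1 values.
print(is_superset("NOUN", ["NOUN", "PROPN"]))
```

Under these assumptions, treating a non-morph attribute as a one-element set is what makes `IS_SUBSET` act like `IN` and restricts `IS_SUPERSET` to patterns with zero or one value, as the commit message notes.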
Adriane Boyd 59340606b7
Add option to disable Matcher errors (#6125)
* Add option to disable Matcher errors

* Add option to disable Matcher errors when a doc doesn't contain a
particular type of annotation

Minor additional change:

* Update `AttributeRuler.load_from_morph_rules` to allow direct `MORPH`
values

* Rename suppress_errors to allow_missing

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>

* Refactor annotation checks in Matcher and PhraseMatcher

Co-authored-by: Matthew Honnibal <honnibal+gh@gmail.com>
2020-09-24 16:54:39 +02:00
Sofie Van Landeghem c7eedd3534
updates to NEL functionality (#6132)
* NEL: read sentences and ents from reference

* fiddling with sent_start annotations

* add KB serialization test

* KB write additional file with strings.json

* score_links function to calculate NEL P/R/F

* formatting

* documentation
2020-09-24 16:53:59 +02:00
Ines Montani d0ef4a4cf5 Prevent division by zero in score weights 2020-09-24 16:42:13 +02:00
Matthew Honnibal 74ee456374 Merge branch 'develop' of https://github.com/explosion/spaCy into develop 2020-09-24 16:11:47 +02:00
Matthew Honnibal 0bc214c102 Fix pull 2020-09-24 16:11:33 +02:00
Ines Montani 3f751e68f5 Increment version [ci skip] 2020-09-24 14:45:41 +02:00
Ines Montani 58dde293ce
Merge pull request #6089 from adrianeboyd/feature/doc-ents-v3-2 2020-09-24 14:44:42 +02:00
Ines Montani 74e1f192b4
Merge pull request #6134 from explosion/feature/training_before_to_disk 2020-09-24 14:44:11 +02:00
Ines Montani 24e7ac3f2b Fix download CLI [ci skip] 2020-09-24 14:43:56 +02:00
Ines Montani 88e54caa12 accuracy -> performance 2020-09-24 14:32:35 +02:00
Ines Montani 92f8b6959a Fix typo 2020-09-24 13:48:41 +02:00
Adriane Boyd 5c13e0cf1b Remove unused error 2020-09-24 13:41:55 +02:00
Ines Montani be56c0994b Add [training.before_to_disk] callback 2020-09-24 12:40:25 +02:00
Adriane Boyd 8eaacaae97 Refactor Doc.ents setter to use Doc.set_ents
Additional changes:

* Entity spans with missing labels are ignored
* Fix ent_kb_id setting in `Doc.set_ents`
2020-09-24 12:36:51 +02:00
Ines Montani c6c67b606e
Merge pull request #6133 from explosion/fix/score_weights 2020-09-24 12:00:57 +02:00
Ines Montani f69fea8b25 Improve error handling around non-number scores 2020-09-24 11:29:07 +02:00
Ines Montani 4eb39b5c43 Fix logging 2020-09-24 11:04:35 +02:00
Ines Montani 4bbe41f017 Fix combined scores and update test 2020-09-24 10:42:47 +02:00
Sofie Van Landeghem c645c4e7ce
fix micro PRF for textcat (#6130)
* fix micro PRF for textcat

* small fix
2020-09-24 10:31:17 +02:00
Matthew Honnibal 17a6b0a173
Make project pull order insensitive (#6131) 2020-09-24 10:30:42 +02:00
Ines Montani ae51f580c1 Fix handling of score_weights 2020-09-24 10:27:33 +02:00
Ines Montani f25f05c503 Adjust sort order [ci skip] 2020-09-23 20:03:04 +02:00
Ines Montani 3f77eb749c Increment version [ci skip] 2020-09-23 19:50:15 +02:00
svlandeg b816ace4bb format 2020-09-23 17:33:13 +02:00
svlandeg 5a9fdbc8ad state_type as Literal 2020-09-23 17:32:14 +02:00
svlandeg 35dbc63578 Merge remote-tracking branch 'upstream/develop' into fix/nr_features
# Conflicts:
#	spacy/ml/models/parser.py
#	spacy/tests/serialize/test_serialize_config.py
#	website/docs/api/architectures.md
2020-09-23 17:01:13 +02:00
svlandeg 25b34bba94 throw custom error when state_type is invalid 2020-09-23 16:57:14 +02:00
Ines Montani 916050bf2f
Merge pull request #6127 from explosion/feature/literal-nr_feature_tokens 2020-09-23 16:56:08 +02:00
Ines Montani 3c3863654e Increment version [ci skip] 2020-09-23 16:54:43 +02:00
svlandeg dd2292793f 'parser' instead of 'deps' for state_type 2020-09-23 16:53:49 +02:00
Ines Montani 50a4425cda Adjust docs 2020-09-23 16:03:32 +02:00
Ines Montani 76bbed3466 Use Literal type for nr_feature_tokens 2020-09-23 16:00:03 +02:00
svlandeg 6c85fab316 state_type and extra_state_tokens instead of nr_feature_tokens 2020-09-23 13:35:09 +02:00
Ines Montani 7745d77a38 Fix whitespace in template [ci skip] 2020-09-23 13:21:42 +02:00
svlandeg 6435458d51 simplify expression 2020-09-23 12:12:38 +02:00
svlandeg 20b0ec5dcf avoid logging performance of frozen components 2020-09-23 10:37:12 +02:00
Ines Montani ae5dacf75f Tidy up and add types 2020-09-23 10:14:34 +02:00
Ines Montani 6ca06cb62c Update docs and formatting [ci skip] 2020-09-23 10:14:27 +02:00
Ines Montani 888f936a73
Merge pull request #6106 from svlandeg/feature/textcat-quickstart 2020-09-23 10:11:45 +02:00
Ines Montani 60a317520a
Merge pull request #6109 from svlandeg/feature/2rename 2020-09-23 09:47:12 +02:00
Ines Montani f976bab710 Remove empty file [ci skip] 2020-09-23 09:30:09 +02:00
svlandeg 556f3e4652 add pooling to NEL's TransformerListener 2020-09-23 09:24:28 +02:00