spaCy/spacy
Raphael Mitsch c0fd8a2e71
find-threshold: CLI command for multi-label classifier threshold tuning (#11280)
* Add foundation for find-threshold CLI functionality.

* Finish first draft for find-threshold.

* Add tests.

* Revert adjusted import statements.

* Fix mypy errors.

* Fix imports.

* Harmonize arguments with spacy evaluate command.

* Generalize component and threshold handling. Harmonize arguments with 'spacy evaluate' CLI.

* Fix Spancat test.

* Add beta parameter to Scorer and PRFScore.

* Make beta a component scorer setting.

* Remove beta.

* Update nlp.config (workaround).

* Reload pipeline on threshold change. Adjust tests. Remove confection reference.

* Remove assumption of component being a Pipe object or having a .cfg attribute.

* Adjust test output and reference values.

* Remove beta references. Delete universe.json.

* Reverting unnecessary changes. Removing unused default values. Renaming variables in find-cli tests.

* Update spacy/cli/find_threshold.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Remove adding labels in tests.

* Remove unused error

* Undo changes to PRFScorer

* Change default value for n_trials. Log table iteratively.

* Add warnings for pointless applications of find_threshold().

* Fix imports.

* Adjust type check of TextCategorizer to exclude subclasses.

* Change check of if there's only one unique value in scores.

* Update spacy/cli/find_threshold.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Incorporate feedback.

* Fix test issue. Update docstring.

* Update docs & docstring.

* Update spacy/tests/test_cli.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Add examples to docs. Rename _nlp to nlp in tests.

* Update spacy/cli/find_threshold.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Update spacy/cli/find_threshold.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2022-11-25 11:44:55 +01:00
cli find-threshold: CLI command for multi-label classifier threshold tuning (#11280) 2022-11-25 11:44:55 +01:00
displacy Docs: displaCy documentation - data types, `parse_{deps,ents,spans}`, spans example (#10950) 2022-08-16 11:23:34 -04:00
kb Refactor KB for easier customization (#11268) 2022-09-08 10:38:07 +02:00
lang Update Russian and Ukrainian lemmatizers (#11811) 2022-11-25 11:12:46 +01:00
matcher Fix Matcher cython profile=True header (#11867) 2022-11-24 16:03:42 +01:00
ml Handle Docs with no entities in EntityLinker (#11640) 2022-10-28 10:25:34 +02:00
pipeline find-threshold: CLI command for multi-label classifier threshold tuning (#11280) 2022-11-25 11:44:55 +01:00
tests find-threshold: CLI command for multi-label classifier threshold tuning (#11280) 2022-11-25 11:44:55 +01:00
tokens Fix types for Span.id and Span.id_ (#11744) 2022-11-07 08:11:13 +01:00
training Add `training.before_update` callback (#11739) 2022-11-23 17:54:58 +01:00
__init__.pxd
__init__.py Simplify and clarify enable/disable behavior of spacy.load() (#11459) 2022-09-27 14:22:36 +02:00
__main__.py
about.py Set version to v3.4.2 (#11672) 2022-10-19 17:33:55 +02:00
attrs.pxd
attrs.pyx Intify IOB (#9738) 2022-01-20 13:19:38 +01:00
compat.py Custom component types in spacy.ty (#9469) 2021-10-21 15:31:06 +02:00
default_config.cfg Add `training.before_update` callback (#11739) 2022-11-23 17:54:58 +01:00
default_config_pretraining.cfg Add new parameter for saving every n epoch in pretraining (#8912) 2021-08-12 11:14:48 +02:00
errors.py find-threshold: CLI command for multi-label classifier threshold tuning (#11280) 2022-11-25 11:44:55 +01:00
glossary.py Add glossary entry for root (#10821) 2022-05-20 09:56:32 +02:00
language.py Remove unused error object (#11837) 2022-11-23 10:51:31 +01:00
lexeme.pxd
lexeme.pyi fix type of lexeme.rank (#9979) 2022-01-04 13:15:25 +01:00
lexeme.pyx Bugfix for similarity return types (#10051) 2022-01-20 11:40:46 +01:00
lookups.py Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786) 2022-05-25 09:33:54 +02:00
morphology.pxd
morphology.pyx
parts_of_speech.pxd
parts_of_speech.pyx
pipe_analysis.py 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
py.typed
schemas.py Add `training.before_update` callback (#11739) 2022-11-23 17:54:58 +01:00
scorer.py Update textcat scorer threshold behavior (#11696) 2022-11-02 15:35:04 +01:00
strings.pxd `StringStore`-related optimizations (#10938) 2022-07-04 15:04:03 +02:00
strings.pyi Fix StringStore.__getitem__ return type depending on parameter types (#10741) 2022-05-03 17:57:07 +02:00
strings.pyx `StringStore`-related optimizations (#10938) 2022-07-04 15:04:03 +02:00
structs.pxd
symbols.pxd
symbols.pyx
tokenizer.pxd Add tokenizer option to allow Matcher handling for all rules (#10452) 2022-03-24 13:21:32 +01:00
tokenizer.pyx Add tokenizer option to allow Matcher handling for all rules (#10452) 2022-03-24 13:21:32 +01:00
ty.py Custom component types in spacy.ty (#9469) 2021-10-21 15:31:06 +02:00
typedefs.pxd
typedefs.pyx
util.py Fix default parameters for load functions (fix #11706) (#11713) 2022-11-03 10:52:59 +01:00
vectors.pyx Add equality definition for vectors (#11806) 2022-11-16 09:44:42 +01:00
vocab.pxd Add support for floret vectors (#8909) 2021-10-27 14:08:31 +02:00
vocab.pyi Add vector deduplication (#10551) 2022-03-30 08:54:23 +02:00
vocab.pyx fix comparison of constants (#11834) 2022-11-21 08:12:03 +01:00