spaCy/spacy/tests/pipeline
kadarakos c003aac29a
SpanFinder into spaCy from experimental (#12507)
* span finder integrated into spacy from experimental

* black

* isort

* black

* default spankey constant

* black

* Update spacy/pipeline/spancat.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* rename

* rename

* max_length and min_length as Optional[int] and strict checking

* black

* mypy fix for integer type infinity

* revert line order

* implement all comparison operators for inf int

* avoid two for loops over all docs by not precomputing

* interleave thresholding with span creation

* black

* revert to not interleaving (relized its faster)

* black

* Update spacy/errors.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* update dosctring

* enforce that the gold and predicted documents have the same text

* new error for ensuring reference and predicted texts are the same

* remove todo

* adjust test

* black

* handle misaligned tokenization

* return correct variable

* failing overfit test

* only use a single spans_key like in spancat

* black

* remove debug lines

* typo

* remove comment

* remove near duplicate reduntant method

* use the 'spans_key' variable name everywhere

* Update spacy/pipeline/span_finder.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* flaky test fix suggestion, hand set bias terms

* only test suggester and test result exhaustively

* make it clear that the span_finder_suggester is more general (not specific to span_finder)

* Update spacy/tests/pipeline/test_span_finder.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Apply suggestions from code review

* remove question comment

* move preset_spans_suggester test to spancat tests

* Add docs and unify default configs for spancat and span finder

* Add `allow_overlap=True` to span finder scorer

* Fix offset bug in set_annotations

* Ignore labels in span finder scorer

* Format

* Add span_finder to quickstart template

* Move settings to self.cfg, store min/max unset as None

* Remove debugging

* Update docstrings and docs

* Update spacy/pipeline/span_finder.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix imports

---------

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-06-07 15:52:28 +02:00
..
__init__.py
test_analysis.py
test_annotates_on_update.py Tidy up and auto-format 2021-07-18 15:44:56 +10:00
test_attributeruler.py Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
test_edit_tree_lemmatizer.py Fix speed problem with `top_k>1` on CPU in edit tree lemmatizer (#12017) 2023-01-20 19:34:11 +01:00
test_entity_linker.py Fix EL failure with sentence-crossing entities (#12398) 2023-03-14 22:02:49 +01:00
test_entity_ruler.py Enable fuzzy text matching in Matcher (#11359) 2023-01-10 10:36:17 +01:00
test_functions.py Add doc_cleaner component (#9659) 2021-11-23 15:33:33 +01:00
test_initialize.py Test with default value 2020-09-29 17:00:40 +02:00
test_lemmatizer.py Tidy up and auto-format 2021-07-18 15:44:56 +10:00
test_models.py Tidy up code 2021-06-28 12:08:15 +02:00
test_morphologizer.py Convert values to numpy for label smoothing tests (#12472) 2023-03-31 13:41:41 +02:00
test_pipe_factories.py Auto-format code with black (#10795) 2022-05-13 19:02:08 +02:00
test_pipe_methods.py Revert disable/disabled merging behavior (#11745) 2022-11-08 14:58:10 +01:00
test_sentencizer.py
test_senter.py Add Pipe.hide_labels to omit labels from pipeline meta (#10175) 2022-02-05 17:59:24 +01:00
test_span_finder.py SpanFinder into spaCy from experimental (#12507) 2023-06-07 15:52:28 +02:00
test_span_ruler.py Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
test_spancat.py SpanFinder into spaCy from experimental (#12507) 2023-06-07 15:52:28 +02:00
test_tagger.py Convert values to numpy for label smoothing tests (#12472) 2023-03-31 13:41:41 +02:00
test_textcat.py Improve score_cats for use with multiple textcat components (#11820) 2023-01-09 11:43:48 +01:00
test_tok2vec.py Auto-format code with black (#11687) 2022-10-21 11:54:17 +02:00