spaCy/website/docs/api
kadarakos c003aac29a
SpanFinder into spaCy from experimental (#12507)
* span finder integrated into spacy from experimental

* black

* isort

* black

* default spankey constant

* black

* Update spacy/pipeline/spancat.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* rename

* rename

* max_length and min_length as Optional[int] and strict checking

* black

* mypy fix for integer type infinity

* revert line order

* implement all comparison operators for inf int

* avoid two for loops over all docs by not precomputing

* interleave thresholding with span creation

* black

* revert to not interleaving (realized it's faster)

* black

* Update spacy/errors.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* update docstring

* enforce that the gold and predicted documents have the same text

* new error for ensuring reference and predicted texts are the same

* remove todo

* adjust test

* black

* handle misaligned tokenization

* return correct variable

* failing overfit test

* only use a single spans_key like in spancat

* black

* remove debug lines

* typo

* remove comment

* remove near-duplicate redundant method

* use the 'spans_key' variable name everywhere

* Update spacy/pipeline/span_finder.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* flaky test fix suggestion, hand set bias terms

* only test suggester and test result exhaustively

* make it clear that the span_finder_suggester is more general (not specific to span_finder)

* Update spacy/tests/pipeline/test_span_finder.py

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>

* Apply suggestions from code review

* remove question comment

* move preset_spans_suggester test to spancat tests

* Add docs and unify default configs for spancat and span finder

* Add `allow_overlap=True` to span finder scorer

* Fix offset bug in set_annotations

* Ignore labels in span finder scorer

* Format

* Add span_finder to quickstart template

* Move settings to self.cfg, store min/max unset as None

* Remove debugging

* Update docstrings and docs

* Update spacy/pipeline/span_finder.py

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>

* Fix imports

---------

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
2023-06-07 15:52:28 +02:00
architectures.mdx Make generation of empty `KnowledgeBase` instances configurable in `EntityLinker` (#12320) 2023-03-01 16:02:55 +01:00
attributeruler.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
attributes.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
cli.mdx Remove shorthand for output-file in spacy apply (#12636) 2023-05-17 12:36:29 +02:00
coref.mdx corrected example code (#12466) 2023-03-27 11:32:49 +02:00
corpus.mdx Add `spacy.PlainTextCorpusReader.v1` (#12122) 2023-01-26 11:33:22 +01:00
cython-classes.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
cython-structs.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
cython.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
data-formats.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
dependencymatcher.mdx docs(REL_OP): modify docs for REL_OPs to match Semgrex's update on CoreNLP v4.5.2 (#12531) 2023-04-17 13:14:01 +02:00
dependencyparser.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
doc.mdx Backslash fixes in docs (#12213) 2023-02-01 10:15:38 +01:00
docbin.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
edittreelemmatizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
entitylinker.mdx Fix new tags in docs for v3.5.x (#12629) 2023-05-15 12:06:58 +02:00
entityrecognizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
entityruler.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
example.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
index.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
inmemorylookupkb.mdx Update inmemorylookupkb.mdx (#12586) 2023-05-02 12:51:13 +02:00
kb.mdx API docs: Rename kb_in_memory to inmemorylookupkb, add to sidebar (#12128) 2023-01-19 13:29:17 +01:00
language.mdx Add scorer option to return per-component scores (#12540) 2023-05-12 15:36:54 +02:00
legacy.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
lemmatizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
lexeme.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
lookups.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
matcher.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
morphologizer.mdx Tagger label smoothing (#12293) 2023-03-22 12:17:56 +01:00
morphology.mdx Fix new tags in docs for v3.5.x (#12629) 2023-05-15 12:06:58 +02:00
phrasematcher.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
pipe.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
pipeline-functions.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
scorer.mdx Add scorer option to return per-component scores (#12540) 2023-05-12 15:36:54 +02:00
sentencerecognizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
sentencizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
span-resolver.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
span.mdx Add span_id to Span.char_span, update Doc/Span.char_span docs (#12196) 2023-01-27 15:09:17 +01:00
spancategorizer.mdx SpanFinder into spaCy from experimental (#12507) 2023-06-07 15:52:28 +02:00
spanfinder.mdx SpanFinder into spaCy from experimental (#12507) 2023-06-07 15:52:28 +02:00
spangroup.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
spanruler.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
stringstore.mdx Add info to stringstore and vocab (#12471) 2023-03-27 13:15:14 +02:00
tagger.mdx Tagger label smoothing (#12293) 2023-03-22 12:17:56 +01:00
textcategorizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
tok2vec.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
token.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
tokenizer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
top-level.mdx Add scorer option to return per-component scores (#12540) 2023-05-12 15:36:54 +02:00
transformer.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
vectors.mdx Website migration from Gatsby to Next (#12058) 2023-01-11 17:30:07 +01:00
vocab.mdx Add info to stringstore and vocab (#12471) 2023-03-27 13:15:14 +02:00