spaCy

Paul O'Leary McCann 6be09bbd07 Fix Entity Linker with tokenization mismatches (fix #9575) (#10457)	2022-05-23 20:42:26 +02:00

* Add failing test
* Partial fix for issue
  This kind of works. The issue with token length mismatches is gone. The problem is that when you get empty lists of encodings to compare, it fails because the sizes are not the same, even though they're both zero: (0, 3) vs (0,). Not sure why that happens...
* Short circuit on empties
* Remove spurious check
  The check here isn't needed now that the short circuit is fixed.
* Update spacy/tests/pipeline/test_entity_linker.py
  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
* Use "eg", not "example"
  Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>
_edit_tree_internals	Refactor error messages to remove hardcoded strings (#10729)	2022-05-02 13:38:46 +02:00
_parser_internals	Refactor error messages to remove hardcoded strings (#10729)	2022-05-02 13:38:46 +02:00
legacy	Fix entity linker batching (#9669)	2022-03-04 09:17:36 +01:00
__init__.py	Add edit tree lemmatizer (#10231)	2022-03-28 11:13:50 +02:00
attributeruler.py	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1	2021-10-26 11:53:50 +02:00
dep_parser.pyx	Document scorers in registry and components from #8766 (#8929)	2021-08-12 12:50:03 +02:00
edit_tree_lemmatizer.py	Add edit tree lemmatizer (#10231)	2022-03-28 11:13:50 +02:00
entity_linker.py	Fix Entity Linker with tokenization mismatches (fix #9575) (#10457)	2022-05-23 20:42:26 +02:00
entityruler.py	Entity ruler remove pattern (#9685)	2021-12-06 15:32:49 +01:00
functions.py	Add doc_cleaner component (#9659)	2021-11-23 15:33:33 +01:00
lemmatizer.py	Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1	2021-10-26 11:53:50 +02:00
morphologizer.pyx	Tagger: use unnormalized probabilities for inference (#10197)	2022-03-15 14:15:31 +01:00
multitask.pyx	Replace negative rows with 0 in StaticVectors (#7674)	2021-04-22 18:04:15 +10:00
ner.pyx	Document scorers in registry and components from #8766 (#8929)	2021-08-12 12:50:03 +02:00
pipe.pxd	TrainablePipe (#6213)	2020-10-08 21:33:49 +02:00
pipe.pyi	Add Pipe.hide_labels to omit labels from pipeline meta (#10175)	2022-02-05 17:59:24 +01:00
pipe.pyx	Add Pipe.hide_labels to omit labels from pipeline meta (#10175)	2022-02-05 17:59:24 +01:00
sentencizer.pyx	Add overwrite settings for more components (#9050)	2021-09-30 15:35:55 +02:00
senter.pyx	Tagger: use unnormalized probabilities for inference (#10197)	2022-03-15 14:15:31 +01:00
spancat.py	Save span candidates produced by spancat suggesters (#10413)	2022-03-14 16:46:58 +01:00
tagger.pyx	Tagger: use unnormalized probabilities for inference (#10197)	2022-03-15 14:15:31 +01:00
textcat.py	Bugfixes and test for rehearse (#10347)	2022-02-23 16:10:05 +01:00
textcat_multilabel.py	Fix Scorer.score_cats for missing labels (#9443)	2021-12-29 11:04:39 +01:00
tok2vec.py	Fix Tok2Vec for empty batches (#10324)	2022-02-21 10:22:36 +01:00
trainable_pipe.pxd	Refactor scoring methods to use registered functions (#8766)	2021-08-10 15:13:39 +02:00
trainable_pipe.pyx	Pass excludes when serializing vocab (#8824)	2021-08-03 14:42:44 +02:00
transition_parser.pxd	TrainablePipe (#6213)	2020-10-08 21:33:49 +02:00
transition_parser.pyx	Document scorers in registry and components from #8766 (#8929)	2021-08-12 12:50:03 +02:00