spaCy/spacy/pipeline
kadarakos 1bb87f35bc
Detect cycle during projectivize (#10877)
* detect cycle during projectivize

* not complete test to detect cycle in projectivize

* boolean to int type to propagate error

* use unordered_set instead of set

* moved error message to errors

* removed cycle from test case

* use find instead of count

* cycle check: only perform one lookup

* Return bool again from _has_head_as_ancestor

Communicate presence of cycles through an output argument.

* Switch to returning std::pair to encode presence of a cycle

The has_cycle pointer is too easy to misuse. Ideally, we would have a
sum type like Rust's `Result` here, but C++ is not there yet.

* _is_non_proj_arc: clarify what we are returning

* _has_head_as_ancestor: remove count

We are now explicitly checking for cycles, so the algorithm must always
terminate. Either we encounter the head, we find a root, or a cycle.

* _is_nonproj_arc: simplify condition

* Another refactor using C++ exceptions

* Remove unused error code

* Print graph with cycle on exception

* Include .hh files in source package

* Add FIXME comment

* cycle detection test

* find cycle when starting from problematic vertex

Co-authored-by: Daniël de Kok <me@danieldk.eu>
2022-06-08 19:34:11 +02:00
..
_edit_tree_internals Refactor error messages to remove hardcoded strings (#10729) 2022-05-02 13:38:46 +02:00
_parser_internals Detect cycle during projectivize (#10877) 2022-06-08 19:34:11 +02:00
legacy Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786) 2022-05-25 09:33:54 +02:00
__init__.py Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
attributeruler.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
dep_parser.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
edit_tree_lemmatizer.py Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786) 2022-05-25 09:33:54 +02:00
entity_linker.py Minor NEL type fixes (#10860) 2022-06-01 00:41:28 +02:00
entityruler.py Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
functions.py Add doc_cleaner component (#9659) 2021-11-23 15:33:33 +01:00
lemmatizer.py Merge remote-tracking branch 'upstream/master' into chore/update-develop-from-master-v3.2-1 2021-10-26 11:53:50 +02:00
morphologizer.pyx Tagger: use unnormalized probabilities for inference (#10197) 2022-03-15 14:15:31 +01:00
multitask.pyx Replace negative rows with 0 in StaticVectors (#7674) 2021-04-22 18:04:15 +10:00
ner.pyx Document scorers in registry and components from #8766 (#8929) 2021-08-12 12:50:03 +02:00
pipe.pxd TrainablePipe (#6213) 2020-10-08 21:33:49 +02:00
pipe.pyi Add Pipe.hide_labels to omit labels from pipeline meta (#10175) 2022-02-05 17:59:24 +01:00
pipe.pyx fix typo + CI slow testing (#10835) 2022-06-02 00:10:16 +02:00
sentencizer.pyx Add overwrite settings for more components (#9050) 2021-09-30 15:35:55 +02:00
senter.pyx Tagger: use unnormalized probabilities for inference (#10197) 2022-03-15 14:15:31 +01:00
span_ruler.py Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
spancat.py Fix issues for Mypy 0.950 and Pydantic 1.9.0 (#10786) 2022-05-25 09:33:54 +02:00
tagger.pyx Tagger: use unnormalized probabilities for inference (#10197) 2022-03-15 14:15:31 +01:00
textcat.py Bugfixes and test for rehearse (#10347) 2022-02-23 16:10:05 +01:00
textcat_multilabel.py Fix Scorer.score_cats for missing labels (#9443) 2021-12-29 11:04:39 +01:00
tok2vec.py Fix Tok2Vec for empty batches (#10324) 2022-02-21 10:22:36 +01:00
trainable_pipe.pxd Refactor scoring methods to use registered functions (#8766) 2021-08-10 15:13:39 +02:00
trainable_pipe.pyx Pass excludes when serializing vocab (#8824) 2021-08-03 14:42:44 +02:00
transition_parser.pxd Parser: use C saxpy/sgemm provided by the Ops implementation (#10773) 2022-05-27 11:20:52 +02:00
transition_parser.pyx Parser: use C saxpy/sgemm provided by the Ops implementation (#10773) 2022-05-27 11:20:52 +02:00