spaCy/spacy/tokens
Madeesh Kannan 41389ffe1e
Avoid pickling `Doc` inputs passed to `Language.pipe()` (#10864)
* `Language.pipe()`: Serialize `Doc` objects to bytes when using multiprocessing to avoid pickling overhead

* `Doc.to_dict()`: Serialize `_context` attribute (keeping in line with `(un)pickle_doc()`)

* Correct type annotations

* Fix typo

* `Doc`: Do not serialize `_context`

* `Language.pipe`: Send context objects to child processes, simplify `as_tuples` handling

* Fix type annotation

* `Language.pipe`: Simplify `as_tuples` multiprocessing handling

* Cleanup code, fix typos

* MyPy fixes

* Move doc preparation function into `_multiprocessing_pipe`; whitespace changes

* Remove superfluous comma

* Rename `prepare_doc` to `prepare_input`

* Update spacy/errors.py

* Undo renaming for error

Co-authored-by: Adriane Boyd <adrianeboyd@gmail.com>
2022-06-02 20:06:49 +02:00
__init__.pxd * Break up tokens.pyx into tokens/doc.pyx, tokens/token.pyx, tokens/spans.pyx 2015-07-13 20:20:58 +02:00
__init__.py
_dict_proxies.py Fix: De/Serialize `SpanGroups` including the SpanGroup keys (#10707) 2022-06-02 15:56:27 +02:00
_retokenize.pyi 🏷 Add Mypy check to CI and ignore all existing Mypy errors (#9167) 2021-10-14 15:21:40 +02:00
_retokenize.pyx
_serialize.py
doc.pxd
doc.pyi Add Doc.from_json() (#10688) 2022-06-02 14:03:47 +02:00
doc.pyx Avoid pickling `Doc` inputs passed to `Language.pipe()` (#10864) 2022-06-02 20:06:49 +02:00
graph.pxd
graph.pyx
morphanalysis.pxd
morphanalysis.pyi
morphanalysis.pyx
span.pxd
span.pyi Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
span.pyx Add SpanRuler component (#9880) 2022-06-02 13:12:53 +02:00
span_group.pxd
span_group.pyi Fix: De/Serialize `SpanGroups` including the SpanGroup keys (#10707) 2022-06-02 15:56:27 +02:00
span_group.pyx
token.pxd
token.pyi
token.pyx
underscore.py