Add note on merging speed in v2.1 (see #3300) [ci skip]

2019-02-21 12:34:18 +01:00 · 0fc908d7a5
parent 236aa94ded
commit 0fc908d7a5
1 changed file with 16 additions and 0 deletions
--- a/website/docs/usage/v2-1.md
+++ b/website/docs/usage/v2-1.md
@@ -215,6 +215,22 @@ if all of your models are up to date, you can run the
  means that the `Matcher` in v2.1.x may produce different results compared to
  the `Matcher` in v2.0.x.

+- The deprecated [`Doc.merge`](/api/doc#merge) and
+  [`Span.merge`](/api/span#merge) methods still work, but you may notice that
+  they now run more slowly when you merge many spans one after another. That's
+  because the merging engine was rewritten to be more reliable and to support
+  more efficient merging **in bulk**. To take advantage of this, rewrite your
+  logic to use the [`Doc.retokenize`](/api/doc#retokenize) context manager and
+  perform as many merges as possible together in a single `with` block.
+
+  ```diff
+  - doc[1:5].merge()
+  - doc[6:8].merge()
+  + with doc.retokenize() as retokenizer:
+  +     retokenizer.merge(doc[1:5])
+  +     retokenizer.merge(doc[6:8])
+  ```
+
 - For better compatibility with the Universal Dependencies data, the lemmatizer
  now preserves capitalization, e.g. for proper nouns. See
  [this issue](https://github.com/explosion/spaCy/issues/3256) for details.
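
The bulk-merging pattern described in the added note can be sketched as a small runnable script. This is only an illustration, assuming spaCy v2.1 or later is installed; the blank English pipeline, the example sentence, and the token indices are not part of the commit and were chosen for demonstration.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only), so no model download is needed.
nlp = spacy.blank("en")
doc = nlp("I live in New York City and work in San Francisco")

# Collect all spans to merge up front. They must not overlap.
# Token indices: New=3, York=4, City=5, San=9, Francisco=10 (illustrative).
spans = [doc[3:6], doc[9:11]]

# Queue every merge inside one `with` block; the merges are applied
# together when the block exits, rather than one at a time.
with doc.retokenize() as retokenizer:
    for span in spans:
        retokenizer.merge(span)

tokens = [token.text for token in doc]
print(tokens)
```

Creating the spans before entering the `with` block keeps their indices tied to the original tokenization, which is what the retokenizer expects while merges are still queued.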