From 0dd127bb00c111613bd0a23a39c7220bf17d2b12 Mon Sep 17 00:00:00 2001
From: Ines Montani
Date: Tue, 1 Oct 2019 21:37:06 +0200
Subject: [PATCH] Update v2-2.md [ci skip]

---
 website/docs/usage/v2-2.md | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/website/docs/usage/v2-2.md b/website/docs/usage/v2-2.md
index 3941b046c..ef616825a 100644
--- a/website/docs/usage/v2-2.md
+++ b/website/docs/usage/v2-2.md
@@ -334,6 +334,11 @@ check if all of your models are up to date, you can run the
   the `Vocab` and serialized with it. This means that serialized objects
   (`nlp`, pipeline components, vocab) will now include additional data, and
   models written to disk will include additional files.
+- The [`Lemmatizer`](/api/lemmatizer) class is now initialized with an instance
+  of [`Lookups`](/api/lookups) containing the rules and tables, instead of dicts
+  as separate arguments. This makes it easier to share data tables and modify
+  them at runtime. This is mostly internals, but if you've been implementing a
+  custom `Lemmatizer`, you'll need to update your code.
 - The [Dutch model](/models/nl) has been trained on a new NER corpus (custom
   labelled UD instead of WikiNER), so its predictions may be very different
   compared to the previous version. The results should be significantly better
@@ -399,6 +404,29 @@ don't explicitly install the lookups data, that `nlp` object won't have any
 lemmatization rules available. spaCy will now show you a warning when you train
 a new part-of-speech tagger and the vocab has no lookups available.
 
+#### Lemmatizer initialization
+
+This is mainly internals and should hopefully not affect your code. But if
+you've been creating custom [`Lemmatizers`](/api/lemmatizer), you'll need to
+update how they're initialized and pass in an instance of
+[`Lookups`](/api/lookups) with the (optional) tables `lemma_index`, `lemma_exc`,
+`lemma_rules` and `lemma_lookup`.
+
+```diff
+from spacy.lemmatizer import Lemmatizer
++ from spacy.lookups import Lookups
+
+lemma_index = {"verb": ("cope", "cop")}
+lemma_exc = {"verb": {"coping": ("cope",)}}
+lemma_rules = {"verb": [["ing", ""]]}
+- lemmatizer = Lemmatizer(lemma_index, lemma_exc, lemma_rules)
++ lookups = Lookups()
++ lookups.add_table("lemma_index", lemma_index)
++ lookups.add_table("lemma_exc", lemma_exc)
++ lookups.add_table("lemma_rules", lemma_rules)
++ lemmatizer = Lemmatizer(lookups)
+```
+
 #### Converting entity offsets to BILUO tags
 
 If you've been using the