format

2020-08-18 19:43:20 +02:00 · 2020-08-18 19:43:20 +02:00 · 6ed67d495a
parent f9fe5eb323
commit 6ed67d495a
1 changed file with 7 additions and 7 deletions
--- a/website/docs/usage/training.md
+++ b/website/docs/usage/training.md
@@ -665,18 +665,18 @@ can create and register a custom function that generates
 using this dataset for training, stopping criteria such as maximum number of
 steps, or stopping when the loss does not decrease further, can be used.

-In this example we assume a custom function `read_custom_data()`
-which loads or generates texts with relevant textcat annotations. Then, small
-lexical variations of the input text are created before generating the final
-`Example` objects.
+In this example we assume a custom function `read_custom_data()` which loads or
+generates texts with relevant textcat annotations. Then, small lexical
+variations of the input text are created before generating the final `Example`
+objects.

 We can also customize the batching strategy by registering a new "batcher" which
 turns a stream of items into a stream of batches. spaCy has several useful
 built-in batching strategies with customizable sizes<!-- TODO: link  -->, but
 it's also easy to implement your own. For instance, the following function takes
-the stream of generated `Example` objects, and removes those which have the exact
-same underlying raw text, to avoid duplicates in the final training data. Note
-that in a more realistic implementation, you'd also want to check whether the
+the stream of generated `Example` objects, and removes those which have the
+exact same underlying raw text, to avoid duplicates within each batch. Note that
+in a more realistic implementation, you'd also want to check whether the
 annotations are exactly the same.

 > ```ini