diff --git a/website/docs/api/data-formats.md b/website/docs/api/data-formats.md
index 87f3ecbf2..8b67aa263 100644
--- a/website/docs/api/data-formats.md
+++ b/website/docs/api/data-formats.md
@@ -5,7 +5,7 @@ menu:
- ['Training Config', 'config']
- ['Training Data', 'training']
- ['Pretraining Data', 'pretraining']
- - ['Vocabulary', 'vocab']
+ - ['Vocabulary', 'vocab-jsonl']
- ['Model Meta', 'meta']
---
@@ -135,7 +135,7 @@ process that are used when you run [`spacy train`](/api/cli#train).
| `dropout` | The dropout rate. Defaults to `0.1`. ~~float~~ |
| `accumulate_gradient` | Whether to divide the batch up into substeps. Defaults to `1`. ~~int~~ |
| `init_tok2vec` | Optional path to pretrained tok2vec weights created with [`spacy pretrain`](/api/cli#pretrain). Defaults to variable `${paths.init_tok2vec}`. ~~Optional[str]~~ |
-| `raw_text` | TODO: ... Defaults to variable `${paths.raw}`. ~~Optional[str]~~ |
+| `raw_text` | Optional path to a JSONL file with unlabelled text documents for a [rehearsal](/api/language#rehearse) step. Defaults to variable `${paths.raw}`. ~~Optional[str]~~ |
| `vectors` | Model name or path to model containing pretrained word vectors to use, e.g. created with [`init model`](/api/cli#init-model). Defaults to `null`. ~~Optional[str]~~ |
| `patience` | How many steps to continue without improvement in evaluation score. Defaults to `1600`. ~~int~~ |
| `max_epochs` | Maximum number of epochs to train for. Defaults to `0`. ~~int~~ |
@@ -391,10 +391,10 @@ tokenization can be provided.
> srsly.write_jsonl("/path/to/text.jsonl", data)
> ```
-| Key | Description |
-| -------- | ------------------------------------------------------------------ |
-| `text` | The raw input text. Is not required if `tokens` available. ~~str~~ |
-| `tokens` | Optional tokenization, one string per token. ~~List[str]~~ |
+| Key | Description |
+| -------- | --------------------------------------------------------------------- |
+| `text` | The raw input text. Is not required if `tokens` is available. ~~str~~ |
+| `tokens` | Optional tokenization, one string per token. ~~List[str]~~ |
```json
### Example
@@ -407,7 +407,7 @@ tokenization can be provided.
## Lexical data for vocabulary {#vocab-jsonl new="2"}
To populate a model's vocabulary, you can use the
-[`spacy init-model`](/api/cli#init-model) command and load in a
+[`spacy init model`](/api/cli#init-model) command and load in a
[newline-delimited JSON](http://jsonlines.org/) (JSONL) file containing one
lexical entry per line via the `--jsonl-loc` option. The first line defines the
language and vocabulary settings. All other lines are expected to be JSON
@@ -510,23 +510,23 @@ of truth** used for loading a model.
> }
> ```
-| Name | Description |
-| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `lang` | Model language [ISO code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). Defaults to `"en"`. ~~str~~ |
-| `name` | Model name, e.g. `"core_web_sm"`. The final model package name will be `{lang}_{name}`. Defaults to `"model"`. ~~str~~ |
-| `version` | Model version. Will be used to version a Python package created with [`spacy package`](/api/cli#package). Defaults to `"0.0.0"`. ~~str~~ |
-| `spacy_version` | spaCy version range the model is compatible with. Defaults to spaCy version used to create the model, up to next minor version, which is the default compatibility for the available [pretrained models](/models). For instance, a model trained with v3.0.0 will have the version range `">=3.0.0,<3.1.0"`. ~~str~~ |
-| `parent_package` | Name of the spaCy package. Typically `"spacy"` or `"spacy_nightly"`. Defaults to `"spacy"`. ~~str~~ |
-| `description` | Model description. Also used for Python package. Defaults to `""`. ~~str~~ |
-| `author` | Model author name. Also used for Python package. Defaults to `""`. ~~str~~ |
-| `email` | Model author email. Also used for Python package. Defaults to `""`. ~~str~~ |
-| `url` | Model author URL. Also used for Python package. Defaults to `""`. ~~str~~ |
-| `license` | Model license. Also used for Python package. Defaults to `""`. ~~str~~ |
-| `sources` | Data sources used to train the model. Typically a list of dicts with the keys `"name"`, `"url"`, `"author"` and `"license"`. [See here](https://github.com/explosion/spacy-models/tree/master/meta) for examples. Defaults to `None`. ~~Optional[List[Dict[str, str]]]~~ |
-| `vectors` | Information about the word vectors included with the model. Typically a dict with the keys `"width"`, `"vectors"` (number of vectors), `"keys"` and `"name"`. ~~Dict[str, Any]~~ |
-| `pipeline` | Names of pipeline component names in the model, in order. Corresponds to [`nlp.pipe_names`](/api/language#pipe_names). Only exists for reference and is not used to create the components. This information is defined in the [`config.cfg`](/api/data-formats#config). Defaults to `[]`. ~~List[str]~~ |
-| `labels` | Label schemes of the trained pipeline components, keyed by component name. Corresponds to [`nlp.pipe_labels`](/api/language#pipe_labels). [See here](https://github.com/explosion/spacy-models/tree/master/meta) for examples. Defaults to `{}`. ~~Dict[str, Dict[str, List[str]]]~~ |
-| `accuracy` | Training accuracy, added automatically by [`spacy train`](/api/cli#train). Dictionary of [score names](/usage/training#metrics) mapped to scores. Defaults to `{}`. ~~Dict[str, Union[float, Dict[str, float]]]~~ |
-| `speed` | Model speed, added automatically by [`spacy train`](/api/cli#train). Typically a dictionary with the keys `"cpu"`, `"gpu"` and `"nwords"` (words per second). Defaults to `{}`. ~~Dict[str, Optional[Union[float, str]]]~~ |
-| `spacy_git_version` 3 | Git commit of [`spacy`](https://github.com/explosion/spaCy) used to create model. ~~str~~ |
-| other | Any other custom meta information you want to add. The data is preserved in [`nlp.meta`](/api/language#meta). ~~Any~~ |
+| Name | Description |
+| ---------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `lang` | Model language [ISO code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). Defaults to `"en"`. ~~str~~ |
+| `name` | Model name, e.g. `"core_web_sm"`. The final model package name will be `{lang}_{name}`. Defaults to `"model"`. ~~str~~ |
+| `version` | Model version. Will be used to version a Python package created with [`spacy package`](/api/cli#package). Defaults to `"0.0.0"`. ~~str~~ |
+| `spacy_version` | spaCy version range the model is compatible with. Defaults to the spaCy version used to create the model, up to the next minor version, which is the default compatibility for the available [pretrained models](/models). For instance, a model trained with v3.0.0 will have the version range `">=3.0.0,<3.1.0"`. ~~str~~ |
+| `parent_package` | Name of the spaCy package. Typically `"spacy"` or `"spacy_nightly"`. Defaults to `"spacy"`. ~~str~~ |
+| `description` | Model description. Also used for Python package. Defaults to `""`. ~~str~~ |
+| `author` | Model author name. Also used for Python package. Defaults to `""`. ~~str~~ |
+| `email` | Model author email. Also used for Python package. Defaults to `""`. ~~str~~ |
+| `url` | Model author URL. Also used for Python package. Defaults to `""`. ~~str~~ |
+| `license` | Model license. Also used for Python package. Defaults to `""`. ~~str~~ |
+| `sources` | Data sources used to train the model. Typically a list of dicts with the keys `"name"`, `"url"`, `"author"` and `"license"`. [See here](https://github.com/explosion/spacy-models/tree/master/meta) for examples. Defaults to `None`. ~~Optional[List[Dict[str, str]]]~~ |
+| `vectors` | Information about the word vectors included with the model. Typically a dict with the keys `"width"`, `"vectors"` (number of vectors), `"keys"` and `"name"`. ~~Dict[str, Any]~~ |
+| `pipeline` | Names of the pipeline components in the model, in order. Corresponds to [`nlp.pipe_names`](/api/language#pipe_names). Only exists for reference and is not used to create the components. This information is defined in the [`config.cfg`](/api/data-formats#config). Defaults to `[]`. ~~List[str]~~ |
+| `labels` | Label schemes of the trained pipeline components, keyed by component name. Corresponds to [`nlp.pipe_labels`](/api/language#pipe_labels). [See here](https://github.com/explosion/spacy-models/tree/master/meta) for examples. Defaults to `{}`. ~~Dict[str, Dict[str, List[str]]]~~ |
+| `accuracy` | Training accuracy, added automatically by [`spacy train`](/api/cli#train). Dictionary of [score names](/usage/training#metrics) mapped to scores. Defaults to `{}`. ~~Dict[str, Union[float, Dict[str, float]]]~~ |
+| `speed` | Model speed, added automatically by [`spacy train`](/api/cli#train). Typically a dictionary with the keys `"cpu"`, `"gpu"` and `"nwords"` (words per second). Defaults to `{}`. ~~Dict[str, Optional[Union[float, str]]]~~ |
+| `spacy_git_version` 3 | Git commit of [`spacy`](https://github.com/explosion/spaCy) used to create the model. ~~str~~ |
+| other | Any other custom meta information you want to add. The data is preserved in [`nlp.meta`](/api/language#meta). ~~Any~~ |
diff --git a/website/docs/api/doc.md b/website/docs/api/doc.md
index e8ce7343d..3c4825f0d 100644
--- a/website/docs/api/doc.md
+++ b/website/docs/api/doc.md
@@ -317,9 +317,7 @@ array of attributes.
| `exclude` | String names of [serialization fields](#serialization-fields) to exclude. ~~Iterable[str]~~ |
| **RETURNS** | The `Doc` itself. ~~Doc~~ |
-## Doc.from_docs {#from_docs tag="staticmethod"}
-
-
+## Doc.from_docs {#from_docs tag="staticmethod" new="3"}
Concatenate multiple `Doc` objects to form a new one. Raises an error if the
`Doc` objects do not all share the same `Vocab`.
diff --git a/website/docs/api/top-level.md b/website/docs/api/top-level.md
index 325a94f5c..61fca6ec5 100644
--- a/website/docs/api/top-level.md
+++ b/website/docs/api/top-level.md
@@ -282,17 +282,18 @@ concept of function registries. spaCy also uses the function registry for
language subclasses, model architecture, lookups and pipeline component
factories.
-
-
> #### Example
>
> ```python
+> from typing import Iterator
> import spacy
-> from thinc.api import Model
>
-> @spacy.registry.architectures("CustomNER.v1")
-> def custom_ner(n0: int) -> Model:
->     return Model("custom", forward, dims={"nO": nO})
+> @spacy.registry.schedules("waltzing.v1")
+> def waltzing() -> Iterator[float]:
+>     i = 0
+>     while True:
+>         yield i % 3 + 1
+>         i += 1
> ```
| Registry name | Description |