remove non-existing link

svlandeg 2020-12-29 14:59:39 +01:00
parent 543073bf9d
commit 43cc6aea93
1 changed files with 7 additions and 7 deletions

@@ -807,15 +807,16 @@ $ python -m spacy train [config_path] [--output] [--code] [--verbose] [--gpu-id]
 ## pretrain {#pretrain new="2.1" tag="command,experimental"}
 Pretrain the "token to vector" ([`Tok2vec`](/api/tok2vec)) layer of pipeline
-components on [raw text](/api/data-formats#pretrain), using an approximate
-language-modeling objective. Specifically, we load pretrained vectors, and train
-a component like a CNN, BiLSTM, etc to predict vectors which match the
-pretrained ones. The weights are saved to a directory after each epoch. You can
-then include a **path to one of these pretrained weights files** in your
+components on raw text, using an approximate language-modeling objective.
+Specifically, we load pretrained vectors, and train a component like a CNN,
+BiLSTM, etc to predict vectors which match the pretrained ones. The weights are
+saved to a directory after each epoch. You can then include a **path to one of
+these pretrained weights files** in your
 [training config](/usage/training#config) as the `init_tok2vec` setting when you
 train your pipeline. This technique may be especially helpful if you have little
 labelled data. See the usage docs on
-[pretraining](/usage/embeddings-transformers#pretraining) for more info.
+[pretraining](/usage/embeddings-transformers#pretraining) for more info. To read
+the raw text, a [`JsonlCorpus`](/api/top-level#JsonlCorpus) is typically used.
 <Infobox title="Changed in v3.0" variant="warning">
@@ -835,7 +836,6 @@ auto-generated by setting `--pretraining` on
 > $ python -m spacy pretrain config.cfg output_pretrain --paths.raw_text="data.jsonl"
 > ```
 ```cli
 $ python -m spacy pretrain [config_path] [output_dir] [--code] [--resume-path] [--epoch-resume] [--gpu-id] [overrides]
 ```
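
The changed paragraph says the raw text is typically read with a `JsonlCorpus`, and the example command passes `data.jsonl` as `paths.raw_text`. As a minimal sketch of preparing such a file, assuming it is newline-delimited JSON with one object per line and the text stored under a `"text"` key: the helper name `write_raw_text_jsonl` and the sample sentences are hypothetical, and only the Python standard library is used, not spaCy itself.

```python
import json
from pathlib import Path
from typing import Iterable


def write_raw_text_jsonl(texts: Iterable[str], path: Path) -> int:
    """Write one JSON object per line, each with a "text" key.

    Hypothetical helper for illustration. Blank entries are skipped.
    Returns the number of lines written.
    """
    count = 0
    with path.open("w", encoding="utf8") as f:
        for text in texts:
            text = text.strip()
            if not text:
                continue  # skip empty or whitespace-only entries
            # Assumption: the corpus reader expects the raw text under "text".
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    samples = ["First raw sentence.", "", "Second raw sentence."]
    written = write_raw_text_jsonl(samples, Path("data.jsonl"))
    print(f"wrote {written} lines to data.jsonl")
```

A file produced this way could then be supplied through the `--paths.raw_text="data.jsonl"` override shown in the example command above.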