mirror of https://github.com/explosion/spaCy.git
title | teaser | tag | source | new |
---|---|---|---|---|
Corpus | An annotated corpus | class | spacy/gold/corpus.py | 3 |
This class manages annotated corpora and can read training and development
datasets in the DocBin (.spacy) format.
Corpus.__init__
Create a Corpus. The input data can be a file or a directory of files.
Name | Type | Description |
---|---|---|
train | str / Path | Training data (.spacy file or directory of .spacy files). |
dev | str / Path | Development data (.spacy file or directory of .spacy files). |
limit | int | Maximum number of examples returned. |
RETURNS | Corpus | The newly constructed object. |
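
The train and dev arguments each accept either a single .spacy file or a directory of .spacy files. As a rough illustration of what those two input shapes mean, the sketch below expands either one into a list of files using only the standard library. The function name resolve_spacy_files is hypothetical and not part of spaCy, and the recursive directory walk is an assumption; the library's own handling may differ.

```python
from pathlib import Path
from typing import List, Union


def resolve_spacy_files(location: Union[str, Path]) -> List[Path]:
    """Hypothetical helper (not part of spaCy): expand a location into .spacy files.

    Accepts the same two shapes the table describes: a single .spacy file,
    or a directory containing .spacy files.
    """
    path = Path(location)
    if path.is_dir():
        # Assumption: subdirectories are searched too; sorted for a stable order.
        return sorted(p for p in path.rglob("*.spacy") if p.is_file())
    if path.is_file() and path.suffix == ".spacy":
        return [path]
    raise ValueError(f"Expected a .spacy file or a directory, got: {path}")
```

The constructor takes the file or directory path directly, so a helper like this is not needed to use the class. It can be handy for checking which files a given train or dev path would cover before training.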