spaCy/website/docs/api/goldcorpus.md

| title | teaser | tag | source | new |
| --- | --- | --- | --- | --- |
| GoldCorpus | An annotated corpus, using the JSON file format | class | spacy/gold.pyx | 2 |

This class manages annotations for tagging, dependency parsing and NER.

GoldCorpus.__init__

Create a `GoldCorpus`. If the input data is an iterable, each item should be a `(text, paragraphs)` tuple, where each paragraph is a `(sentences, brackets)` tuple and each sentence is an `(ids, words, tags, heads, ner)` tuple. See the implementation of `gold.read_json_file` for further details.

| Name | Type | Description |
| --- | --- | --- |
| `train` | unicode / Path / iterable | Training data, as a path (file or directory) or iterable. |
| `dev` | unicode / Path / iterable | Development data, as a path (file or directory) or iterable. |
| **RETURNS** | `GoldCorpus` | The newly constructed object. |
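
The nested iterable shape described for `GoldCorpus.__init__` can be hard to picture from prose alone, so the following is a minimal sketch of one item built in plain Python. The sentence, tag and entity values are made up for illustration, `check_item` is a hypothetical helper rather than part of spaCy, and the use of absolute token indices for `heads` is an assumption; `gold.read_json_file` remains the authoritative reference for the exact layout.

```python
# One sentence: five parallel sequences, one entry per token.
# All values below are illustrative, not taken from a real corpus.
ids = [0, 1, 2]
words = ["I", "like", "London"]
tags = ["PRP", "VBP", "NNP"]
heads = [1, 1, 1]          # assumption: absolute index of each token's head
ner = ["O", "O", "U-GPE"]  # assumption: BILUO-style entity labels

sentence = (ids, words, tags, heads, ner)
paragraph = ([sentence], [])            # (sentences, brackets)
item = ("I like London", [paragraph])   # (text, paragraphs)
train_data = [item]


def check_item(item):
    """Hypothetical helper: check that an item has the documented nesting
    and that every per-token sequence in a sentence has the same length."""
    text, paragraphs = item
    for sentences, brackets in paragraphs:
        for ids, words, tags, heads, ner in sentences:
            n = len(ids)
            if not all(len(col) == n for col in (words, tags, heads, ner)):
                return False
    return True


# With a spaCy version that provides this class, the iterables would
# presumably be passed as the two documented arguments, e.g.:
#   from spacy.gold import GoldCorpus
#   corpus = GoldCorpus(train_data, dev_data)
```

Validating the shape before constructing the corpus is a design choice in this sketch, not something the class requires; it just makes length mismatches between `words`, `tags`, `heads` and `ner` visible early.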