2.3 KiB
layout | title | parent | nav_order | permalink |
---|---|---|---|---|
default | Corpora | Advanced topics | 3 | /advanced-topics/corpora/ |
Accessing Corpora
{: .no_toc}
If you want to access the corpora that we are using for your fuzz targets (synthesized by the fuzzing engines), follow these steps.
- TOC {:toc}
Obtain access
To get access to a project's corpora, you must be listed as the
primary contact or as an auto cc in the project's project.yaml
file, as described
in the [New Project Guide]({{ site.baseurl }}/getting-started/new-project-guide/#projectyaml).
If you don't do this, most of the links below won't work.
Install Google Cloud SDK
The corpora for fuzz targets are stored on Google Cloud
Storage. To access them, you need to
install the gsutil
tool, which is part of
the Google Cloud SDK. Follow the instructions on the installation page to
login with the Google account listed in your project's project.yaml
file.
Viewing the corpus for a fuzz target
The fuzzer statistics page for your project on [ClusterFuzz]({{ site.baseurl }}/further-reading/clusterfuzz) contains a link to the Google Cloud console for your corpus under the corpus_size column. Click the link to browse and download individual test inputs in the corpus.
Downloading the corpus
If you want to download the entire corpus, click the link in the corpus_size column, then copy the Buckets path at the top of the page:
Copy the corpus to a directory on your machine by running the following command:
$ gsutil -m cp -r gs://<bucket_path> <local_directory>
Using the expat example above, this would be:
$ gsutil -m cp -r \
gs://expat-corpus.clusterfuzz-external.appspot.com/libFuzzer/expat_parse_fuzzer \
<local_directory>
Corpus backups
We keep daily zipped backups of your corpora. These can be accessed from the
corpus_backup column of the fuzzer statistics page. Downloading these can
be significantly faster than running gsutil -m cp -r
on the corpus bucket.