---
layout: default
title: Integrating a Python project
parent: Setting up a new project
grand_parent: Getting started
nav_order: 3
permalink: /getting-started/new-project-guide/python-lang/
---

# Integrating a Python project
{: .no_toc}

- TOC
{:toc}
---

Integrating a project written in Python with OSS-Fuzz is very similar to the
general
[Setting up a new project]({{ site.baseurl }}/getting-started/new-project-guide/)
process. The Python-specific steps are outlined below.

## Atheris

Python fuzzing in OSS-Fuzz depends on
[Atheris](https://github.com/google/atheris). Fuzzers depend on the `atheris`
package; it and its dependencies are pre-installed on the OSS-Fuzz base Docker
images.

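As a minimal sketch of what a plain-Atheris fuzzer file might look like, the
example below uses the standard library's `json` module as a stand-in target
and assumes a hypothetical filename such as `json_fuzzer.py`. `atheris.Setup`
and `atheris.Fuzz` are the Atheris entry points; the import is deferred into
`main()` here only so that `TestOneInput` can also be called from a regular
test without Atheris installed. Depending on the Atheris version, Python code
may need explicit coverage instrumentation, which this sketch leaves out (see
the Atheris README).

```python
import json
import sys


def TestOneInput(data: bytes) -> None:
    """Fuzz entry point: feed raw bytes to the parser under test."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return  # not valid text; uninteresting for this stand-in target
    try:
        json.loads(text)
    except (ValueError, RecursionError):
        return  # expected rejections of malformed or too deeply nested input
    # Any other exception propagates and is reported by the fuzzer as a crash.


def main() -> None:
    try:
        import atheris  # deferred so TestOneInput stays importable without it
    except ImportError:
        print("atheris is not installed; skipping fuzzing", file=sys.stderr)
        return
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()


if __name__ == "__main__":
    main()
```

A real harness would replace `json.loads` with a call into the project being
fuzzed and would catch only the exceptions that project documents as normal
error handling.
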
## Project files

### Example project

We recommend viewing [ujson](https://github.com/google/oss-fuzz/tree/master/projects/ujson) as an
example of a simple Python fuzzing project, with both plain-Atheris and
Atheris + Hypothesis harnesses.

### project.yaml

The `language` attribute must be specified.

```yaml
language: python
```

The only supported fuzzing engine is libFuzzer (`libfuzzer`). The supported
sanitizers are AddressSanitizer (`address`) and
UndefinedBehaviorSanitizer (`undefined`). Both the engine and the sanitizers
must be specified explicitly.

```yaml
fuzzing_engines:
  - libfuzzer
sanitizers:
  - address
  - undefined
```

### Dockerfile

Because most dependencies are already pre-installed on the images, no
significant changes are needed in the Dockerfile for Python fuzzing projects.
Simply clone the project, set a `WORKDIR`, copy any necessary files, and
install any project-specific dependencies as you normally would.

### build.sh

For Python projects, `build.sh` needs more significant modifications than for
other projects. The following is an annotated example build script that
explains why each step is necessary and when it can be omitted.

```sh
# Build and install project (using current CFLAGS, CXXFLAGS). This is required
# for projects with C extensions so that they're built with the proper flags.
pip3 install .

# Build fuzzers into $OUT. These could be detected in other ways.
for fuzzer in $(find $SRC -name '*_fuzzer.py'); do
  fuzzer_basename=$(basename -s .py $fuzzer)
  fuzzer_package=${fuzzer_basename}.pkg

  # To avoid issues with Python version conflicts, or changes in environment
  # over time on the OSS-Fuzz bots, we use pyinstaller to create a standalone
  # package. Though not necessarily required for reproducing issues, this is
  # required to keep fuzzers working properly in OSS-Fuzz.
  pyinstaller --distpath $OUT --onefile --name $fuzzer_package $fuzzer

  # Create execution wrapper. Atheris requires that certain libraries are
  # preloaded, so this is also done here to ensure compatibility and simplify
  # test case reproduction. Since this helper script is what OSS-Fuzz will
  # actually execute, it is also always required.
  # NOTE: If you are fuzzing Python-only code and do not have native C/C++
  # extensions, then remove the LD_PRELOAD line below, as preloading the
  # sanitizer library is not required and can lead to unexpected startup
  # crashes.
  echo "#!/bin/sh
# LLVMFuzzerTestOneInput for fuzzer detection.
this_dir=\$(dirname \"\$0\")
LD_PRELOAD=\$this_dir/sanitizer_with_fuzzer.so \
ASAN_OPTIONS=\$ASAN_OPTIONS:symbolize=1:external_symbolizer_path=\$this_dir/llvm-symbolizer:detect_leaks=0 \
\$this_dir/$fuzzer_package \$@" > $OUT/$fuzzer_basename
  chmod +x $OUT/$fuzzer_basename
done
```

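The naming convention used by the loop above can be previewed locally with a
small helper. This is only an illustrative sketch: `planned_outputs` is a
hypothetical function, not part of OSS-Fuzz, and it mirrors the
`find`/`basename` logic of the example script without building anything.

```python
from pathlib import Path
from typing import Dict, Tuple


def planned_outputs(src: Path, out: Path) -> Dict[str, Tuple[Path, Path]]:
    """Map each fuzzer name to its (pyinstaller package, wrapper script) path."""
    plan = {}
    # Same match as: find $SRC -name '*_fuzzer.py'
    for fuzzer in sorted(src.rglob("*_fuzzer.py")):
        basename = fuzzer.stem             # basename -s .py $fuzzer
        package = out / f"{basename}.pkg"  # $OUT/$fuzzer_package
        wrapper = out / basename           # $OUT/$fuzzer_basename
        plan[basename] = (package, wrapper)
    return plan
```

For a source tree containing `proj/json_fuzzer.py`, the plan holds one entry
named `json_fuzzer`, with the package at `json_fuzzer.pkg` and the wrapper at
`json_fuzzer` under the output directory. The wrapper, not the package, is the
file OSS-Fuzz executes.
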
## Hypothesis

Using [Hypothesis](https://hypothesis.readthedocs.io/), the Python library for
[property-based testing](https://hypothesis.works/articles/what-is-property-based-testing/),
makes it easy to generate complex inputs, whether in traditional test suites
or [by using test functions as fuzz harnesses](https://hypothesis.readthedocs.io/en/latest/details.html#use-with-external-fuzzers).

> Property based testing is the construction of tests such that, when these tests are fuzzed,
> failures in the test reveal problems with the system under test that could not have been
> revealed by direct fuzzing of that system.

We recommend using the [`hypothesis write`](https://hypothesis.readthedocs.io/en/latest/ghostwriter.html)
command to generate a starter fuzz harness. This "ghostwritten" code may be usable as-is,
or may serve as a useful template for writing more specific tests.

See the [core "strategies"](https://hypothesis.readthedocs.io/en/latest/data.html)
for arbitrary data, the [NumPy + pandas support](https://hypothesis.readthedocs.io/en/latest/numpy.html),
or the [variety of third-party extensions](https://hypothesis.readthedocs.io/en/latest/strategies.html),
which support everything from protobufs and JSON schemas to NetworkX graphs, GeoJSON,
and valid Python source code.
Hypothesis' integrated test-case reduction also makes it straightforward to report a canonical minimal
example for each distinct failure discovered while fuzzing: just run the test function!

To use Hypothesis in OSS-Fuzz, install it in your Dockerfile with:

```dockerfile
RUN pip3 install hypothesis
```

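Once Hypothesis is installed, a test function can be exposed to a fuzzer
through its `hypothesis.fuzz_one_input` attribute. The sketch below is an
illustration under stated assumptions, not the ujson harness: it uses the
standard library's `json` module as a stand-in target, and `check_roundtrip`
and `make_fuzz_target` are hypothetical names. Hypothesis is imported inside
the function only so that the property itself can be exercised without it.

```python
import json


def check_roundtrip(obj) -> None:
    """Property: encoding then decoding returns the original value."""
    assert json.loads(json.dumps(obj)) == obj


def make_fuzz_target():
    """Build a bytes-consuming entry point from a Hypothesis test.

    Requires Hypothesis; it is imported lazily so that check_roundtrip
    stays usable on its own.
    """
    from hypothesis import given, strategies as st

    # JSON-like values. Floats are left out because NaN != NaN would make
    # the round-trip property fail for reasons unrelated to the target.
    json_values = st.recursive(
        st.none() | st.booleans() | st.integers() | st.text(),
        lambda children: st.lists(children)
        | st.dictionaries(st.text(), children),
    )

    @given(json_values)
    def test_roundtrip(obj):
        check_roundtrip(obj)

    # fuzz_one_input takes raw bytes and drives the test with them.
    return test_roundtrip.hypothesis.fuzz_one_input
```

The returned callable can be passed to `atheris.Setup` in place of a
hand-written `TestOneInput`. In a real harness the `@given` test would
normally live at module level so that `pytest` can also collect it; it is
nested here only to keep the Hypothesis import optional.
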
See [the `ujson` structured fuzzer](https://github.com/google/oss-fuzz/blob/master/projects/ujson/hypothesis_structured_fuzzer.py)
for an example "polyglot" that can be run either with `pytest` as a standard test function
or with OSS-Fuzz as a fuzz harness.