Add script to generate fuzzer seed corpus from tests

This adds a script which runs the end2end_tests, captures a wire trace,
and then minimizes the corpus with the fuzzer. Minimizing the corpus
requires libfuzzer, so this only works in a Chromium checkout.

Unseeded, the fuzzer starts with coverage of about 600 features. Using
a seed corpus captured from the tests, the fuzzer quickly increases
coverage to about 10,000 features.

Change-Id: I8d0db5121745bd5ee4a350cf46fb37cfa434e3dc
Bug: dawn:295
Reviewed-on: https://dawn-review.googlesource.com/c/dawn/+/14242
Commit-Queue: Austin Eng <enga@chromium.org>
Reviewed-by: Kai Ninomiya <kainino@chromium.org>

commit: 97fb51f4af11f672f69f3cb502ad5ca7a86b835d
author: Austin Eng <enga@chromium.org> Fri Dec 13 01:27:31 2019 +0000
committer: Commit Bot service account <commit-bot@chromium.org> Fri Dec 13 01:27:31 2019 +0000
tree: c9a5a8d1ece0898809354ba12817f32427d32551
parent: 5c413afdc7f8ae44071b7520e828bc75f0e4a4ed
diff --git a/docs/fuzzing.md b/docs/fuzzing.md
new file mode 100644
index 0000000..2777b1e
--- /dev/null
+++ b/docs/fuzzing.md

@@ -0,0 +1,20 @@
+# Fuzzing Dawn
+
+## `dawn_wire_server_and_frontend_fuzzer`
+
+The `dawn_wire_server_and_frontend_fuzzer` sets up Dawn using the Null backend, and passes inputs to the wire server. This fuzzes the `dawn_wire` deserialization, as well as Dawn's frontend validation.
+
+## Updating the Seed Corpus
+
+Using a seed corpus significantly improves the efficiency of fuzzing. Dawn's fuzzers use interesting testcases discovered in previous fuzzing runs to seed future runs. Fuzzing can be further improved by using Dawn tests as an example of API usage, which allows the fuzzer to quickly discover and use new API entrypoints and usage patterns.
+
+The script [update_fuzzer_seed_corpus.sh](../scripts/update_fuzzer_seed_corpus.sh) can be used to capture a trace while running Dawn tests, and merge it with all existing interesting fuzzer inputs.
+
+To run the script:
+1. Make sure gcloud is installed: https://g3doc.corp.google.com/cloud/sdk/g3doc/index.md?cl=head
+2. Login with @google.com credentials: `gcloud auth login`
+3. You must be in a Chromium checkout using the GN arg `use_libfuzzer=true`
+4. Run `./third_party/dawn/scripts/update_fuzzer_seed_corpus.sh <out_dir> <fuzzer> <test>`.
+
+   Example: `./third_party/dawn/scripts/update_fuzzer_seed_corpus.sh out/fuzz dawn_wire_server_and_frontend_fuzzer dawn_end2end_tests`
+5. The script will print instructions for testing and then uploading the new inputs. Please upload inputs only after testing the fuzzer with them and verifying that there is a meaningful change in coverage.
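
The last step asks for a check that coverage actually changed. One way to compare runs, sketched here under assumptions, is to read the feature count that libFuzzer prints on its status lines (the `ft:` field, as in `INITED cov: 12 ft: 13 ...`). The helper name `last_feature_count` is hypothetical and is not part of this change.

```shell
#!/bin/bash

# Hypothetical helper (not part of this commit): read a libFuzzer log on
# stdin and print the last reported "ft:" (features) value.
last_feature_count() {
  grep -o 'ft: [0-9]*' | tail -n 1 | cut -d' ' -f2
}

# Possible usage, assuming the fuzzer was built as described above.
# libFuzzer writes its status lines to stderr; -runs=0 is assumed here to
# load the corpus without fuzzing further.
#   out/fuzz/dawn_wire_server_and_frontend_fuzzer <corpus_dir> -runs=0 2>&1 | last_feature_count
```

Comparing the value for the existing corpus with the value for the minimized corpus gives a rough indication of whether the new inputs add coverage.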

diff --git a/scripts/update_fuzzer_seed_corpus.sh b/scripts/update_fuzzer_seed_corpus.sh
new file mode 100755
index 0000000..6243e4b
--- /dev/null
+++ b/scripts/update_fuzzer_seed_corpus.sh

@@ -0,0 +1,87 @@
+#!/bin/bash
+
+# Copyright 2019 The Dawn Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Generates a seed corpus for fuzzing based on dumping wire traces
+# from running Dawn tests
+
+# Exit if anything fails
+set -e
+
+if [ "$#" -ne 3 ]; then
+cat << EOF
+
+Usage:
+  $0 <out_dir> <fuzzer_name> <test_name>
+
+Example:
+  $0 out/fuzz dawn_wire_server_and_frontend_fuzzer dawn_end2end_tests
+
+EOF
+    exit 1
+fi
+
+out_dir=$1
+fuzzer_name=$2
+test_name=$3
+
+testcase_dir="/tmp/testcases/${fuzzer_name}/"
+minimized_testcase_dir="/tmp/testcases/${fuzzer_name}_minimized/"
+
+# Make a directory for temporarily storing testcases
+mkdir -p "$testcase_dir"
+
+# Make an empty directory for temporarily storing minimized testcases
+rm -rf "$minimized_testcase_dir"
+mkdir -p "$minimized_testcase_dir"
+
+# Download the existing corpus. First argument is src, second is dst.
+gsutil -m rsync "gs://clusterfuzz-corpus/libfuzzer/${fuzzer_name}/" "$testcase_dir"
+
+# Build the fuzzer and test
+autoninja -C "$out_dir" "$fuzzer_name" "$test_name"
+
+fuzzer_binary="${out_dir}/${fuzzer_name}"
+test_binary="${out_dir}/${test_name}"
+
+# Run the test binary
+"$test_binary" --use-wire --wire-trace-dir="$testcase_dir"
+
+# Run the fuzzer to minimize the corpus
+"$fuzzer_binary" -merge=1 "$minimized_testcase_dir" "$testcase_dir"
+
+if [ -z "$(ls -A "$minimized_testcase_dir")" ]; then
+cat << EOF
+
+Minimized testcase directory is empty!
+Are you building with use_libfuzzer=true ?
+
+EOF
+    exit 1
+fi
+
+cat << EOF
+
+Please test the corpus in $minimized_testcase_dir with $fuzzer_name and confirm it works as expected.
+
+    $fuzzer_binary $minimized_testcase_dir
+
+Then, run the following command to upload new testcases to the seed corpus:
+
+    gsutil -m rsync $minimized_testcase_dir gs://clusterfuzz-corpus/libfuzzer/${fuzzer_name}/
+
+    WARNING: Add [-d] argument to delete all GCS files that are not also in $minimized_testcase_dir
+
+EOF
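
The script above stores testcases under fixed `/tmp/testcases/...` paths and checks whether the minimized directory is empty. The following is a minimal sketch of the same two steps written as reusable functions, assuming a private `mktemp` directory is acceptable in place of the fixed path. The names `make_testcase_dir` and `dir_is_empty` are hypothetical and do not appear in the commit.

```shell
#!/bin/bash
set -e

# Hypothetical helper: create a private scratch directory for a fuzzer's
# testcases and print its path. Unlike the fixed /tmp/testcases/ path in the
# script, each call produces a fresh directory.
make_testcase_dir() {
  local fuzzer_name="$1"
  mktemp -d "${TMPDIR:-/tmp}/${fuzzer_name}.XXXXXX"
}

# Hypothetical helper: succeed when the directory has no entries, including
# hidden ones. This is the same `ls -A` test the script uses.
dir_is_empty() {
  local dir="$1"
  [ -z "$(ls -A "$dir")" ]
}
```

A caller could then write `if dir_is_empty "$minimized_testcase_dir"; then ... fi` to report a failed merge, as the script does.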