Add script to generate fuzzer seed corpus from tests

This adds a script which runs the end2end_tests, captures a wire trace,
and then minimizes the corpus with the fuzzer. Minimizing the corpus
requires libFuzzer, so this only works in a Chromium checkout.

Unseeded, the fuzzer starts with coverage of about 600 features.
Using a seed corpus captured from the tests, the fuzzer quickly
increases coverage to about 10,000 features.
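
For anyone wanting to compare feature counts themselves: libFuzzer reports
features in the `ft:` field of the status lines it writes to stderr. The
following is a minimal sketch of pulling out the most recent value, assuming
that log format. The helper name `last_feature_count` and the sample log
values are hypothetical and only illustrate the parsing; they are not
measurements from this change.

```shell
#!/bin/bash

# Print the last "ft:" (features) value found in a libFuzzer log on stdin.
# Prints nothing if no line contains an "ft:" field.
last_feature_count() {
    awk '
        {
            # Scan each whitespace-separated field for the "ft:" label;
            # the value is the field that follows it.
            for (i = 1; i < NF; i++) {
                if ($i == "ft:") { last = $(i + 1) }
            }
        }
        END { if (last != "") print last }
    '
}

# Hypothetical log excerpt; the numbers are illustrative only.
sample_log='#2	INITED cov: 210 ft: 612 corp: 1/1b exec/s: 0 rss: 30Mb
#64	NEW    cov: 215 ft: 640 corp: 2/5b lim: 4 exec/s: 0 rss: 30Mb'

printf '%s\n' "$sample_log" | last_feature_count   # prints 640
```

With a real build, the fuzzer's stderr can be piped in the same way, for
example `"$fuzzer_binary" -max_total_time=60 "$corpus_dir" 2>&1 | last_feature_count`,
where `$corpus_dir` is a placeholder for either an empty directory or the
seeded corpus.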

Change-Id: I8d0db5121745bd5ee4a350cf46fb37cfa434e3dc
Bug: dawn:295
Reviewed-on: https://dawn-review.googlesource.com/c/dawn/+/14242
Commit-Queue: Austin Eng <enga@chromium.org>
Reviewed-by: Kai Ninomiya <kainino@chromium.org>
diff --git a/docs/fuzzing.md b/docs/fuzzing.md
new file mode 100644
index 0000000..2777b1e
--- /dev/null
+++ b/docs/fuzzing.md
@@ -0,0 +1,20 @@
+# Fuzzing Dawn
+
+## `dawn_wire_server_and_frontend_fuzzer`
+
+The `dawn_wire_server_and_frontend_fuzzer` sets up Dawn using the Null backend, and passes inputs to the wire server. This fuzzes the `dawn_wire` deserialization, as well as Dawn's frontend validation.
+
+## Updating the Seed Corpus
+
+Using a seed corpus significantly improves the efficiency of fuzzing. Dawn's fuzzers use interesting testcases discovered in previous fuzzing runs to seed future runs. Fuzzing can be further improved by using Dawn tests as an example of API usage, which allows the fuzzer to quickly discover and use new API entrypoints and usage patterns.
+
+The script [update_fuzzer_seed_corpus.sh](../scripts/update_fuzzer_seed_corpus.sh) can be used to capture a trace while running Dawn tests, and merge it with all existing interesting fuzzer inputs.
+
+To run the script:
+1. Make sure gcloud is installed: https://g3doc.corp.google.com/cloud/sdk/g3doc/index.md?cl=head
+2. Log in with @google.com credentials: `gcloud auth login`
+3. You must be in a Chromium checkout using the GN arg `use_libfuzzer=true`
+4. Run `./third_party/dawn/scripts/update_fuzzer_seed_corpus.sh <out_dir> <fuzzer> <test>`.
+
+   Example: `./third_party/dawn/scripts/update_fuzzer_seed_corpus.sh out/fuzz dawn_wire_server_and_frontend_fuzzer dawn_end2end_tests`
+5. The script will print instructions for testing and then uploading the new inputs. Only upload inputs after testing the fuzzer with them and verifying that there is a meaningful change in coverage.
diff --git a/scripts/update_fuzzer_seed_corpus.sh b/scripts/update_fuzzer_seed_corpus.sh
new file mode 100755
index 0000000..6243e4b
--- /dev/null
+++ b/scripts/update_fuzzer_seed_corpus.sh
@@ -0,0 +1,87 @@
+#!/bin/bash
+
+# Copyright 2019 The Dawn Authors
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Generates a seed corpus for fuzzing based on dumping wire traces
+# from running Dawn tests
+
+# Exit if anything fails
+set -e
+
+if [ "$#" -ne 3 ]; then
+cat << EOF
+
+Usage:
+  $0 <out_dir> <fuzzer_name> <test_name>
+
+Example:
+  $0 out/fuzz dawn_wire_server_and_frontend_fuzzer dawn_end2end_tests
+
+EOF
+    exit 1
+fi
+
+out_dir=$1
+fuzzer_name=$2
+test_name=$3
+
+testcase_dir="/tmp/testcases/${fuzzer_name}/"
+minimized_testcase_dir="/tmp/testcases/${fuzzer_name}_minimized/"
+
+# Make a directory for temporarily storing testcases
+mkdir -p "$testcase_dir"
+
+# Make an empty directory for temporarily storing minimized testcases
+rm -rf "$minimized_testcase_dir"
+mkdir -p "$minimized_testcase_dir"
+
+# Download the existing corpus. First argument is src, second is dst.
+gsutil -m rsync "gs://clusterfuzz-corpus/libfuzzer/${fuzzer_name}/" "$testcase_dir"
+
+# Build the fuzzer and test
+autoninja -C "$out_dir" "$fuzzer_name" "$test_name"
+
+fuzzer_binary="${out_dir}/${fuzzer_name}"
+test_binary="${out_dir}/${test_name}"
+
+# Run the test binary
+"$test_binary" --use-wire --wire-trace-dir="$testcase_dir"
+
+# Run the fuzzer to minimize the corpus
+"$fuzzer_binary" -merge=1 "$minimized_testcase_dir" "$testcase_dir"
+
+if [ -z "$(ls -A "$minimized_testcase_dir")" ]; then
+cat << EOF
+
+Minimized testcase directory is empty!
+Are you building with use_libfuzzer=true ?
+
+EOF
+    exit 1
+fi
+
+cat << EOF
+
+Please test the corpus in $minimized_testcase_dir with $fuzzer_name and confirm it works as expected.
+
+    $fuzzer_binary $minimized_testcase_dir
+
+Then, run the following command to upload new testcases to the seed corpus:
+
+    gsutil -m rsync $minimized_testcase_dir gs://clusterfuzz-corpus/libfuzzer/${fuzzer_name}/
+
+    WARNING: Adding the -d argument deletes all GCS files that are not also in $minimized_testcase_dir
+
+EOF