=================
SanitizerCoverage
=================

.. contents::
   :local:

Introduction
============

Sanitizer tools have a very simple code coverage tool built in. It lets you
collect function-level, basic-block-level, and edge-level coverage at a very
low cost.

How to build and run
====================

SanitizerCoverage can be used with :doc:`AddressSanitizer`,
:doc:`LeakSanitizer`, :doc:`MemorySanitizer`,
UndefinedBehaviorSanitizer, or without any sanitizer. Pass one of the
following compile-time flags:

* ``-fsanitize-coverage=func`` for function-level coverage (very fast).
* ``-fsanitize-coverage=bb`` for basic-block-level coverage (may add up to 30%
  **extra** slowdown).
* ``-fsanitize-coverage=edge`` for edge-level coverage (up to 40% slowdown).

You may also specify ``-fsanitize-coverage=indirect-calls`` for
additional `caller-callee coverage`_.

At run time, pass ``coverage=1`` in ``ASAN_OPTIONS``,
``LSAN_OPTIONS``, ``MSAN_OPTIONS`` or ``UBSAN_OPTIONS``, as
appropriate. For the standalone coverage mode, use ``UBSAN_OPTIONS``.

To get `Coverage counters`_, add ``-fsanitize-coverage=8bit-counters``
to one of the above compile-time flags. At run time, use
``*SAN_OPTIONS=coverage=1:coverage_counters=1``.

Example:

.. code-block:: console

    % cat -n cov.cc
         1  #include <stdio.h>
         2  __attribute__((noinline))
         3  void foo() { printf("foo\n"); }
         4
         5  int main(int argc, char **argv) {
         6    if (argc == 2)
         7      foo();
         8    printf("main\n");
         9  }
    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=func
    % ASAN_OPTIONS=coverage=1 ./a.out; ls -l *sancov
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    % ASAN_OPTIONS=coverage=1 ./a.out foo ; ls -l *sancov
    foo
    main
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Every time you run an executable instrumented with SanitizerCoverage,
one ``*.sancov`` file is created during process shutdown.
If the executable is dynamically linked against instrumented DSOs,
one ``*.sancov`` file will also be created for every DSO.

Postprocessing
==============

The format of ``*.sancov`` files is very simple: the first 8 bytes are the
magic, one of ``0xC0BFFFFFFFFFFF64`` and ``0xC0BFFFFFFFFFFF32``. The last byte
of the magic defines the size of the following offsets. The rest of the data is
the offsets in the corresponding binary/DSO that were executed during the run.

A simple script
``$LLVM/projects/compiler-rt/lib/sanitizer_common/scripts/sancov.py`` is
provided to dump these offsets.

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov
    sancov.py: read 2 PCs from a.out.22679.sancov
    sancov.py: read 1 PCs from a.out.22673.sancov
    sancov.py: 2 files merged; 2 PCs total
    0x465250
    0x4652a0

You can then filter the output of ``sancov.py`` through ``addr2line --exe
ObjectFile`` or ``llvm-symbolizer --obj ObjectFile`` to get file names and line
numbers:

.. code-block:: console

    % sancov.py print a.out.22679.sancov a.out.22673.sancov 2> /dev/null | llvm-symbolizer --obj a.out
    cov.cc:3
    cov.cc:5

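The ``*.sancov`` layout described in this section (an 8-byte magic followed by
fixed-width offsets) can also be decoded by hand. The following is a minimal
sketch and not the ``sancov.py`` implementation: ``ParseSancov`` is a
hypothetical helper name, and the code assumes that the file is read on a host
with the same byte order as the one that wrote it.

.. code-block:: c++

    #include <cstdint>
    #include <cstring>
    #include <stdexcept>
    #include <vector>

    // Hypothetical helper: decodes the raw bytes of one *.sancov file into a
    // list of offsets. Assumes reader and writer share the same byte order.
    std::vector<uint64_t> ParseSancov(const std::vector<uint8_t> &data) {
      const uint64_t kMagic64 = 0xC0BFFFFFFFFFFF64ULL;
      const uint64_t kMagic32 = 0xC0BFFFFFFFFFFF32ULL;
      if (data.size() < sizeof(uint64_t))
        throw std::runtime_error("file too short for the magic");
      uint64_t magic;
      std::memcpy(&magic, data.data(), sizeof(magic));
      // The last byte of the magic (0x64 or 0x32) selects the offset width.
      size_t width;
      if (magic == kMagic64)
        width = 8;
      else if (magic == kMagic32)
        width = 4;
      else
        throw std::runtime_error("unknown magic");
      if ((data.size() - sizeof(magic)) % width != 0)
        throw std::runtime_error("truncated offset");
      std::vector<uint64_t> offsets;
      for (size_t pos = sizeof(magic); pos < data.size(); pos += width) {
        uint64_t value = 0;
        if (width == 8) {
          std::memcpy(&value, &data[pos], 8);
        } else {
          uint32_t value32;
          std::memcpy(&value32, &data[pos], 4);
          value = value32;
        }
        offsets.push_back(value);
      }
      return offsets;
    }

Reading the whole file into a byte vector first keeps the parsing logic free of
I/O concerns; the returned offsets can then be printed or symbolized as shown
above.
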
Sancov Tool
===========

A new experimental ``sancov`` tool is being developed to process coverage
files. The tool is part of the LLVM project and is currently supported only on
Linux. It can handle symbolization tasks autonomously without any extra support
from the environment. You need to pass .sancov files (named
``<module_name>.<pid>.sancov``) and paths to all corresponding binary ELF files.
Sancov matches these files using module names and binary file names.

.. code-block:: console

    USAGE: sancov [options] <action> (<binary file>|<.sancov file>)...

    Action (required)
      -print                    - Print coverage addresses
      -covered-functions        - Print all covered functions.
      -not-covered-functions    - Print all not covered functions.
      -html-report              - Print HTML coverage report.

    Options
      -blacklist=<string>         - Blacklist file (sanitizer blacklist format).
      -demangle                   - Print demangled function name.
      -strip_path_prefix=<string> - Strip this prefix from file paths in reports

Automatic HTML Report Generation
================================

If ``*SAN_OPTIONS`` contains the ``html_cov_report=1`` option, an HTML
coverage report is automatically generated alongside the coverage files.
The ``sancov`` binary should be present in ``PATH``, or the
``sancov_path=<path_to_sancov>`` option can be used to specify the tool
location.

How good is the coverage?
=========================

It is possible to find out which PCs are not covered by subtracting the covered
set from the set of all instrumented PCs. The latter can be obtained by listing
all callsites of ``__sanitizer_cov()`` in the binary. On Linux, ``sancov.py``
can do this for you. Just supply the path to the binary and a list of covered
PCs:

.. code-block:: console

    % sancov.py print a.out.12345.sancov > covered.txt
    sancov.py: read 2 64-bit PCs from a.out.12345.sancov
    sancov.py: 1 file merged; 2 PCs total
    % sancov.py missing a.out < covered.txt
    sancov.py: found 3 instrumented PCs in a.out
    sancov.py: read 2 PCs from stdin
    sancov.py: 1 PCs missing from coverage
    0x4cc61c

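The subtraction described in this section is an ordinary set difference. As a
rough sketch (``MissingPCs`` is a hypothetical helper, not something
``sancov.py`` exposes), it could be written like this, assuming both PC lists
have already been loaded into memory:

.. code-block:: c++

    #include <algorithm>
    #include <cstdint>
    #include <iterator>
    #include <vector>

    // Hypothetical helper: returns the instrumented PCs that never showed up
    // in the covered set. Both inputs may be unsorted and contain duplicates.
    std::vector<uint64_t> MissingPCs(std::vector<uint64_t> instrumented,
                                     std::vector<uint64_t> covered) {
      auto normalize = [](std::vector<uint64_t> &v) {
        std::sort(v.begin(), v.end());
        v.erase(std::unique(v.begin(), v.end()), v.end());
      };
      normalize(instrumented);
      normalize(covered);
      std::vector<uint64_t> missing;
      // std::set_difference requires both ranges to be sorted.
      std::set_difference(instrumented.begin(), instrumented.end(),
                          covered.begin(), covered.end(),
                          std::back_inserter(missing));
      return missing;
    }

For instance, with three instrumented PCs of which two are covered, the helper
returns the single remaining PC, which mirrors the ``1 PCs missing from
coverage`` line in the console output above.
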
Edge coverage
=============

Consider this code:

.. code-block:: c++

    void foo(int *a) {
      if (a)
        *a = 0;
    }

It contains 3 basic blocks, let's name them A, B, C:

.. code-block:: none

    A
    |\
    | \
    |  B
    | /
    |/
    C

If blocks A, B, and C are all covered we know for certain that the edges A=>B
and B=>C were executed, but we still don't know if the edge A=>C was executed.
Such edges of the control flow graph are called
`critical <http://en.wikipedia.org/wiki/Control_flow_graph#Special_edges>`_. The
edge-level coverage (``-fsanitize-coverage=edge``) simply splits all critical
edges by introducing new dummy blocks and then instruments those blocks:

.. code-block:: none

    A
    |\
    | \
    D  B
    | /
    |/
    C

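To make the notion of a critical edge concrete, here is a small sketch that
finds such edges in a control flow graph given as a list of edges. It uses the
usual textbook definition (the source block has several successors and the
destination block has several predecessors); ``FindCriticalEdges`` is a
hypothetical name, and this is not how the compiler pass is implemented.

.. code-block:: c++

    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    using Edge = std::pair<std::string, std::string>;  // (source, destination)

    // An edge is critical when its source has more than one successor and its
    // destination has more than one predecessor. Assumes no duplicate edges.
    std::vector<Edge> FindCriticalEdges(const std::vector<Edge> &edges) {
      std::map<std::string, int> successors, predecessors;
      for (const Edge &e : edges) {
        ++successors[e.first];
        ++predecessors[e.second];
      }
      std::vector<Edge> critical;
      for (const Edge &e : edges)
        if (successors[e.first] > 1 && predecessors[e.second] > 1)
          critical.push_back(e);
      return critical;
    }

For the graph of ``foo`` above (edges A=>B, A=>C and B=>C) only A=>C qualifies,
which is exactly the edge that receives the dummy block D.
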
Bitset
======

When the ``coverage_bitset=1`` run-time flag is given, the coverage will also be
dumped as a bitset (a text file with 1 for blocks that have been executed and 0
for blocks that were not).

.. code-block:: console

    % clang++ -fsanitize=address -fsanitize-coverage=edge cov.cc
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out
    main
    % ASAN_OPTIONS="coverage=1:coverage_bitset=1" ./a.out 1
    foo
    main
    % head *bitset*
    ==> a.out.38214.bitset-sancov <==
    01101
    ==> a.out.6128.bitset-sancov <==
    11011%

For a given executable the length of the bitset is always the same (well,
unless dlopen/dlclose come into play), so the bitset coverage can be
easily used for bitset-based corpus distillation.

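As an illustration of bitset-based corpus distillation, the sketch below keeps
only those inputs whose bitset covers a block that no previously kept input
covered. ``DistillCorpus`` is a hypothetical helper, and the greedy
first-come-first-kept strategy is an assumption made for brevity; real fuzzers
may use different selection rules.

.. code-block:: c++

    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical helper: given one bitset string per input (the contents of
    // the *.bitset-sancov files), returns the indices of the inputs that cover
    // at least one block not covered by any earlier kept input.
    std::vector<size_t> DistillCorpus(const std::vector<std::string> &bitsets) {
      std::vector<size_t> kept;
      std::string seen;  // Union of the bitsets kept so far.
      for (size_t i = 0; i < bitsets.size(); ++i) {
        const std::string &bits = bitsets[i];
        if (seen.size() < bits.size()) seen.resize(bits.size(), '0');
        bool adds_coverage = false;
        for (size_t j = 0; j < bits.size(); ++j) {
          if (bits[j] == '1' && seen[j] != '1') {
            seen[j] = '1';
            adds_coverage = true;
          }
        }
        if (adds_coverage) kept.push_back(i);
      }
      return kept;
    }

With the two bitsets from the console output above, ``01101`` and ``11011``,
both inputs are kept because the second one covers block 0, which the first one
does not.
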
Caller-callee coverage
======================

(Experimental!)
Every indirect function call is instrumented with a run-time function call that
captures the caller and the callee. At shutdown, the process dumps a separate
file called ``caller-callee.PID.sancov`` which contains caller/callee pairs as
pairs of lines (odd lines are callers, even lines are callees):

.. code-block:: console

    a.out 0x4a2e0c
    a.out 0x4a6510
    a.out 0x4a2e0c
    a.out 0x4a87f0

Current limitations:

* Only the first 14 callees for every caller are recorded; the rest are silently
  ignored.
* The output format is not very compact since caller and callee may reside in
  different modules and we need to spell out the module names.
* The routine that dumps the output is not optimized for speed.
* Only Linux x86_64 is tested so far.
* Sandboxes are not supported.

Coverage counters
=================

This experimental feature is inspired by
`AFL <http://lcamtuf.coredump.cx/afl/technical_details.txt>`__'s coverage
instrumentation. With additional compile-time and run-time flags you can get
more sensitive coverage information. In addition to boolean values assigned to
every basic block (edge) the instrumentation will collect imprecise counters.
On exit, every counter will be mapped to an 8-bit bitset representing counter
ranges: ``1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+`` and those 8-bit bitsets will
be dumped to disk.

.. code-block:: console

    % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=edge,8bit-counters
    % ASAN_OPTIONS="coverage=1:coverage_counters=1" ./a.out
    % ls -l *counters-sancov
    ... a.out.17110.counters-sancov
    % xxd *counters-sancov
    0000000: 0001 0100 01

These counters may also be used for in-process coverage-guided fuzzers. See
``include/sanitizer/coverage_interface.h``:

.. code-block:: c++

    // The coverage instrumentation may optionally provide imprecise counters.
    // Rather than exposing the counter values to the user we instead map
    // the counters to a bitset.
    // Every counter is associated with 8 bits in the bitset.
    // We define 8 value ranges: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+
    // The i-th bit is set to 1 if the counter value is in the i-th range.
    // This counter-based coverage implementation is *not* thread-safe.

    // Returns the number of registered coverage counters.
    uintptr_t __sanitizer_get_number_of_counters();
    // Updates the counter 'bitset', clears the counters and returns the number of
    // new bits in 'bitset'.
    // If 'bitset' is nullptr, only clears the counters.
    // Otherwise 'bitset' should be at least
    // __sanitizer_get_number_of_counters bytes long and 8-aligned.
    uintptr_t
    __sanitizer_update_counter_bitset_and_clear_counters(uint8_t *bitset);

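The mapping from a counter value to its range bit can be sketched as follows.
This is an illustration rather than the run-time's code: ``CounterToRangeBit``
is a hypothetical name, and numbering the ranges from the least significant bit
is an assumption (it is consistent with the ``01`` bytes in the ``xxd`` dump
above for edges executed once).

.. code-block:: c++

    #include <cstdint>

    // Maps a raw counter value to a one-hot 8-bit value whose set bit names
    // the range the counter falls into: 1, 2, 3, 4-7, 8-15, 16-31, 32-127,
    // 128+. A counter of 0 (never executed) maps to 0.
    // Assumption: range i corresponds to bit i, counted from the lowest bit.
    uint8_t CounterToRangeBit(uint8_t counter) {
      if (counter >= 128) return 1 << 7;
      if (counter >= 32) return 1 << 6;
      if (counter >= 16) return 1 << 5;
      if (counter >= 8) return 1 << 4;
      if (counter >= 4) return 1 << 3;
      if (counter == 3) return 1 << 2;
      if (counter == 2) return 1 << 1;
      if (counter == 1) return 1 << 0;
      return 0;
    }

OR-ing these values over several runs yields, per counter, the set of ranges
that have been observed, which is the kind of information the bitset interface
above accumulates.
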
Tracing basic blocks
====================

Experimental support for basic block (or edge) tracing.
With ``-fsanitize-coverage=trace-bb`` the compiler will insert
``__sanitizer_cov_trace_basic_block(s32 *id)`` before every function, basic
block, or edge (depending on the value of
``-fsanitize-coverage=[func,bb,edge]``).
Example:

.. code-block:: console

    % clang -g -fsanitize=address -fsanitize-coverage=edge,trace-bb foo.cc
    % ASAN_OPTIONS=coverage=1 ./a.out

This will produce two files after the process exits:
``trace-points.PID.sancov`` and ``trace-events.PID.sancov``.
The first file will contain a textual description of all the instrumented
points in the program in a form that you can feed into llvm-symbolizer
(e.g. ``a.out 0x4dca89``), one per line.
The second file will contain the actual execution trace as a sequence of 4-byte
integers; these integers are indices into the array of instrumented points
(the first file).

Basic block tracing is currently supported only for single-threaded
applications.

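Combining the two files amounts to a table lookup. The sketch below shows one
way to do it; ``ExpandTrace`` is a hypothetical helper, and it assumes that the
indices are 0-based, non-negative, and stored in the byte order of the host
that reads them.

.. code-block:: c++

    #include <cstdint>
    #include <cstring>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // Hypothetical helper: 'points' holds the lines of trace-points.PID.sancov
    // and 'events' holds the raw bytes of trace-events.PID.sancov. Returns the
    // executed points in execution order.
    std::vector<std::string> ExpandTrace(const std::vector<std::string> &points,
                                         const std::vector<uint8_t> &events) {
      if (events.size() % sizeof(uint32_t) != 0)
        throw std::runtime_error("trace is not a sequence of 4-byte integers");
      std::vector<std::string> trace;
      for (size_t pos = 0; pos < events.size(); pos += sizeof(uint32_t)) {
        uint32_t index;
        std::memcpy(&index, &events[pos], sizeof(index));
        if (index >= points.size())
          throw std::runtime_error("index outside of the trace-points table");
        trace.push_back(points[index]);
      }
      return trace;
    }

Each returned line has the ``module offset`` form and can be passed on to
llvm-symbolizer as described above.
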
Tracing PCs
===========

*Experimental* feature similar to tracing basic blocks, but with a different
API.
With ``-fsanitize-coverage=trace-pc`` the compiler will insert
``__sanitizer_cov_trace_pc()`` on every edge.
With an additional ``...=trace-pc,indirect-calls`` flag
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every
indirect call.
These callbacks are not implemented in the Sanitizer run-time and should be
defined by the user, so these flags do not require any other sanitizer to be
used.
This mechanism is used for fuzzing the Linux kernel
(https://github.com/google/syzkaller)
and can be used with `AFL <http://lcamtuf.coredump.cx/afl>`__.

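A user-provided ``__sanitizer_cov_trace_pc()`` can be very small. The following
sketch records every distinct return address it is called from; ``SeenPCs`` is
a hypothetical helper, the code is not thread-safe, and it assumes that the file
defining the callback is itself compiled *without*
``-fsanitize-coverage=trace-pc`` so that the callback does not recurse into
itself.

.. code-block:: c++

    #include <cstdint>
    #include <set>

    // Distinct return addresses observed so far; each one identifies an
    // instrumented edge. Hypothetical helper, not thread-safe.
    static std::set<uintptr_t> &SeenPCs() {
      static std::set<uintptr_t> pcs;
      return pcs;
    }

    // User-provided definition of the callback that the compiler inserts on
    // every edge when building with -fsanitize-coverage=trace-pc.
    // Build this file without that flag to avoid infinite recursion.
    extern "C" void __sanitizer_cov_trace_pc() {
      SeenPCs().insert(
          reinterpret_cast<uintptr_t>(__builtin_return_address(0)));
    }

The size of the set is then a simple edge-coverage metric that a fuzzer can
compare before and after running an input.
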
Tracing PCs with guards
=======================

Another *experimental* feature that tries to combine the functionality of
``trace-pc``, ``8bit-counters`` and boolean coverage.

With ``-fsanitize-coverage=trace-pc-guard`` the compiler will insert the
following code on every edge:

.. code-block:: none

    if (guard_variable)
      __sanitizer_cov_trace_pc_guard(&guard_variable)

Every edge will have its own ``guard_variable`` (uint32_t).

The compiler will also insert a module constructor that will call

.. code-block:: c++

    // The guards are [start, stop).
    // This function may be called multiple times with the same values of start/stop.
    __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop);

Similarly to ``trace-pc,indirect-calls``, with
``trace-pc-guard,indirect-calls``
``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every
indirect call.

The functions ``__sanitizer_cov_trace_pc_*`` should be defined by the user.

Example:

.. code-block:: c++

    // trace-pc-guard-cb.cc
    #include <stdint.h>
    #include <stdio.h>
    #include <sanitizer/coverage_interface.h>

    // This callback is inserted by the compiler as a module constructor
    // into every compilation unit. 'start' and 'stop' correspond to the
    // beginning and end of the section with the guards for the entire
    // binary (executable or DSO) and so it will be called multiple times
    // with the same parameters.
    extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                        uint32_t *stop) {
      static uint64_t N;  // Counter for the guards.
      if (start == stop || *start) return;  // Initialize only once.
      printf("INIT: %p %p\n", start, stop);
      for (uint32_t *x = start; x < stop; x++)
        *x = ++N;  // Guards should start from 1.
    }

    // This callback is inserted by the compiler on every edge in the
    // control flow (some optimizations apply).
    // Typically, the compiler will emit the code like this:
    //    if(*guard)
    //      __sanitizer_cov_trace_pc_guard(guard);
    // But for large functions it will emit a simple call:
    //    __sanitizer_cov_trace_pc_guard(guard);
    extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
      if (!*guard) return;  // Duplicate the guard check.
      // If you set *guard to 0 this code will not be called again for this edge.
      // Now you can get the PC and do whatever you want:
      //   store it somewhere or symbolize it and print right away.
      // The values of `*guard` are as you set them in
      // __sanitizer_cov_trace_pc_guard_init, so you can make them consecutive
      // and use them to index an array or a bit vector.
      void *PC = __builtin_return_address(0);
      char PcDescr[1024];
      // This function is a part of the sanitizer run-time.
      // To use it, link with AddressSanitizer or other sanitizer.
      __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
      printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
    }

.. code-block:: c++

    // trace-pc-guard-example.cc
    void foo() { }
    int main(int argc, char **argv) {
      if (argc > 1) foo();
    }

.. code-block:: console

    clang++ -g  -fsanitize-coverage=trace-pc-guard trace-pc-guard-example.cc -c
    clang++ trace-pc-guard-cb.cc trace-pc-guard-example.o -fsanitize=address
    ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out

.. code-block:: console

    INIT: 0x71bcd0 0x71bce0
    guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:2
    guard: 0x71bcd8 3 PC 0x4ecd9e in main trace-pc-guard-example.cc:3:7

Tracing data flow
=================

Support for data-flow-guided fuzzing.
With ``-fsanitize-coverage=trace-cmp`` the compiler will insert extra
instrumentation around comparison instructions and switch statements.
Similarly, with ``-fsanitize-coverage=trace-div`` the compiler will instrument
integer division instructions (to capture the right argument of division)
and with ``-fsanitize-coverage=trace-gep`` it will instrument
the `LLVM GEP instructions <http://llvm.org/docs/GetElementPtr.html>`_
(to capture array indices).

.. code-block:: c++

    // Called before a comparison instruction.
    // Arg1 and Arg2 are arguments of the comparison.
    void __sanitizer_cov_trace_cmp1(uint8_t Arg1, uint8_t Arg2);
    void __sanitizer_cov_trace_cmp2(uint16_t Arg1, uint16_t Arg2);
    void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2);
    void __sanitizer_cov_trace_cmp8(uint64_t Arg1, uint64_t Arg2);

    // Called before a switch statement.
    // Val is the switch operand.
    // Cases[0] is the number of case constants.
    // Cases[1] is the size of Val in bits.
    // Cases[2:] are the case constants.
    void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases);

    // Called before a division statement.
    // Val is the second argument of division.
    void __sanitizer_cov_trace_div4(uint32_t Val);
    void __sanitizer_cov_trace_div8(uint64_t Val);

    // Called before a GetElementPtr (GEP) instruction
    // for every non-constant array index.
    void __sanitizer_cov_trace_gep(uintptr_t Idx);

This interface is subject to change.
The current implementation is not thread-safe and thus can be safely used only
for single-threaded targets.

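As an example of what a consumer of this interface might do, the sketch below
defines ``__sanitizer_cov_trace_cmp4`` so that it remembers the operands of
comparisons that did not match, which a fuzzer could later use as hints for
mutations. ``MismatchedCmps`` is a hypothetical helper, the storage policy is an
assumption made for illustration, and the file defining the callback is assumed
to be compiled without ``-fsanitize-coverage=trace-cmp`` so that its own
comparison is not instrumented.

.. code-block:: c++

    #include <cstdint>
    #include <utility>
    #include <vector>

    // Operand pairs of 4-byte comparisons whose operands differed.
    // Hypothetical helper; not thread-safe, in line with the limitation
    // stated above.
    static std::vector<std::pair<uint32_t, uint32_t>> &MismatchedCmps() {
      static std::vector<std::pair<uint32_t, uint32_t>> cmps;
      return cmps;
    }

    // User-provided definition of one of the comparison callbacks.
    // Build this file without -fsanitize-coverage=trace-cmp.
    extern "C" void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2) {
      if (Arg1 != Arg2) MismatchedCmps().emplace_back(Arg1, Arg2);
    }

The other ``cmp`` widths can be handled in the same way; an unbounded vector is
used here only to keep the sketch short.
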
Output directory
================

By default, .sancov files are created in the current working directory.
This can be changed with ``ASAN_OPTIONS=coverage_dir=/path``:

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_dir=/tmp/cov" ./a.out foo
    % ls -l /tmp/cov/*sancov
    -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
    -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov

Sudden death
============

Normally, coverage data is collected in memory and saved to disk when the
program exits (with an ``atexit()`` handler), when a SIGSEGV is caught, or when
``__sanitizer_cov_dump()`` is called.

If the program ends with a signal that ASan does not handle (or cannot handle
at all, like SIGKILL), coverage data will be lost. This is a big problem on
Android, where SIGKILL is a normal way of evicting applications from memory.

With ``ASAN_OPTIONS=coverage=1:coverage_direct=1`` coverage data is written to a
memory-mapped file as soon as it is collected.

.. code-block:: console

    % ASAN_OPTIONS="coverage=1:coverage_direct=1" ./a.out
    main
    % ls
    7036.sancov.map  7036.sancov.raw  a.out
    % sancov.py rawunpack 7036.sancov.raw
    sancov.py: reading map 7036.sancov.map
    sancov.py: unpacking 7036.sancov.raw
    writing 1 PCs to a.out.7036.sancov
    % sancov.py print a.out.7036.sancov
    sancov.py: read 1 PCs from a.out.7036.sancov
    sancov.py: 1 files merged; 1 PCs total
    0x4b2bae

Note that on 64-bit platforms, this method writes 2x more data than the default,
because it stores full PC values instead of 32-bit offsets.

In-process fuzzing
==================

Coverage data can be useful for fuzzers, and sometimes it is preferable to run
a fuzzer in the same process as the code being fuzzed (in-process fuzzer).

You can use ``__sanitizer_get_total_unique_coverage()`` from
``<sanitizer/coverage_interface.h>`` which returns the number of currently
covered entities in the program. This will tell the fuzzer if the coverage has
increased after testing every new input.

If a fuzzer finds a bug in the ASan run, you will need to save the reproducer
before exiting the process. Use ``__asan_set_death_callback`` from
``<sanitizer/asan_interface.h>`` to do that.

An example of such a fuzzer can be found in `the LLVM tree
<http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Fuzzer/README.txt?view=markup>`_.

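The "has coverage increased?" check can be sketched as a small driver loop. To
keep the example independent of the sanitizer run-time, the coverage query is
passed in as a callable; in an instrumented build it would presumably wrap
``__sanitizer_get_total_unique_coverage()``. ``KeepInterestingInputs`` and
``Input`` are hypothetical names, and this is not the fuzzer from the LLVM tree.

.. code-block:: c++

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <vector>

    using Input = std::vector<uint8_t>;

    // Hypothetical driver: runs every candidate through 'target' and keeps the
    // ones after which 'total_coverage' reports a larger value than before.
    std::vector<Input> KeepInterestingInputs(
        const std::vector<Input> &candidates,
        const std::function<void(const Input &)> &target,
        const std::function<size_t()> &total_coverage) {
      std::vector<Input> corpus;
      size_t best = total_coverage();
      for (const Input &input : candidates) {
        target(input);
        size_t now = total_coverage();
        if (now > best) {  // The input reached something new.
          best = now;
          corpus.push_back(input);
        }
      }
      return corpus;
    }

Because the total only grows within one process, comparing it before and after
each input is enough to decide whether the input is worth keeping.
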
Performance
===========

This coverage implementation is **fast**. With function-level coverage
(``-fsanitize-coverage=func``) the overhead is not measurable. With
basic-block-level coverage (``-fsanitize-coverage=bb``) the overhead varies
between 0 and 25%.

==============  =========  =========  =========  =========  =========  =========
benchmark       cov0       cov1       diff 0-1   cov2       diff 0-2   diff 1-2
==============  =========  =========  =========  =========  =========  =========
400.perlbench   1296.00    1307.00    1.01       1465.00    1.13       1.12
401.bzip2       858.00     854.00     1.00       1010.00    1.18       1.18
403.gcc         613.00     617.00     1.01       683.00     1.11       1.11
429.mcf         605.00     582.00     0.96       610.00     1.01       1.05
445.gobmk       896.00     880.00     0.98       1050.00    1.17       1.19
456.hmmer       892.00     892.00     1.00       918.00     1.03       1.03
458.sjeng       995.00     1009.00    1.01       1217.00    1.22       1.21
462.libquantum  497.00     492.00     0.99       534.00     1.07       1.09
464.h264ref     1461.00    1467.00    1.00       1543.00    1.06       1.05
471.omnetpp     575.00     590.00     1.03       660.00     1.15       1.12
473.astar       658.00     652.00     0.99       715.00     1.09       1.10
483.xalancbmk   471.00     491.00     1.04       582.00     1.24       1.19
433.milc        616.00     627.00     1.02       627.00     1.02       1.00
444.namd        602.00     601.00     1.00       654.00     1.09       1.09
447.dealII      630.00     634.00     1.01       653.00     1.04       1.03
450.soplex      365.00     368.00     1.01       395.00     1.08       1.07
453.povray      427.00     434.00     1.02       495.00     1.16       1.14
470.lbm         357.00     375.00     1.05       370.00     1.04       0.99
482.sphinx3     927.00     928.00     1.00       1000.00    1.08       1.08
==============  =========  =========  =========  =========  =========  =========

Why another coverage?
=====================

Why did we implement yet another code coverage?

* We needed something that is lightning fast, plays well with
  AddressSanitizer, and does not significantly increase the binary size.
* Traditional coverage implementations based on global counters
  `suffer from contention on counters
  <https://groups.google.com/forum/#!topic/llvm-dev/cDqYgnxNEhY>`_.