=======================================================
Hardware-assisted AddressSanitizer Design Documentation
=======================================================

This page is a design document for
**hardware-assisted AddressSanitizer** (or **HWASAN**),
a tool similar to :doc:`AddressSanitizer`,
but based on partial hardware assistance.

The document is a draft; suggestions are welcome.

Introduction
============

:doc:`AddressSanitizer`
tags every 8 bytes of the application memory with a 1-byte tag (using *shadow memory*),
uses *redzones* to find buffer overflows, and
*quarantine* to find use-after-free.
The redzones, the quarantine, and, to a lesser extent, the shadow are the
sources of AddressSanitizer's memory overhead.
See the `AddressSanitizer paper`_ for details.

AArch64 has `Address Tagging`_, a hardware feature that allows
software to use the 8 most significant bits of a 64-bit pointer as
a tag. HWASAN uses `Address Tagging`_
to implement a memory safety tool, similar to :doc:`AddressSanitizer`,
but with smaller memory overhead and slightly different (mostly better)
accuracy guarantees.

Algorithm
=========

* Every heap/stack/global memory object is forcibly aligned by `N` bytes
  (`N` is e.g. 16 or 64).
* For every such object a random `K`-bit tag `T` is chosen (`K` is e.g. 4 or 8).
* The pointer to the object is tagged with `T`.
* The memory for the object is also tagged with `T`
  (using a `N=>1` shadow memory).
* Every load and store is instrumented to read the memory tag and compare it
  with the pointer tag; an exception is raised on tag mismatch.

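The steps above can be illustrated with a small portable simulation. This is only a sketch under stated assumptions, not the HWASAN runtime: addresses are plain offsets into a simulated region, `N` is 16, `K` is 8, the tag lives in the top byte, and the tag is passed in by the caller instead of being chosen at random so that the behaviour is reproducible. All names (`tag_object`, `check_access`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed parameters for this sketch: N = 16-byte granules, tag in bits 56..63. */
enum { GRANULE = 16, TAG_SHIFT = 56, REGION_SIZE = 1024 };

/* N=>1 shadow: one tag byte per GRANULE bytes of the simulated region. */
static uint8_t shadow[REGION_SIZE / GRANULE];

/* Put tag T into the top byte of an address. */
static uint64_t tag_pointer(uint64_t addr, uint8_t tag) {
    return (addr & ~(0xffULL << TAG_SHIFT)) | ((uint64_t)tag << TAG_SHIFT);
}

static uint8_t pointer_tag(uint64_t p) { return (uint8_t)(p >> TAG_SHIFT); }

static uint64_t untag(uint64_t p) { return p & ~(0xffULL << TAG_SHIFT); }

/* Tag the memory of the object [offset, offset + size) and return a tagged pointer. */
static uint64_t tag_object(uint64_t offset, size_t size, uint8_t tag) {
    assert(size > 0);
    assert(offset % GRANULE == 0);          /* objects are N-aligned */
    assert(offset + size <= REGION_SIZE);
    for (uint64_t g = offset / GRANULE; g <= (offset + size - 1) / GRANULE; g++)
        shadow[g] = tag;
    return tag_pointer(offset, tag);
}

/* The check done before every load or store: 1 if the tags match, 0 otherwise. */
static int check_access(uint64_t p) {
    uint64_t addr = untag(p);
    assert(addr < REGION_SIZE);
    return shadow[addr / GRANULE] == pointer_tag(p);
}
```

With two adjacent objects tagged differently, an access through the first pointer that runs past the end of the first object lands on a granule carrying the second object's tag, so `check_access` reports a mismatch.
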
Instrumentation
===============

Memory Accesses
---------------

All memory accesses are prefixed with a call to a run-time function.
The function encodes the type and the size of access in its name;
it receives the address as a parameter, e.g. `__hwasan_load4(void *ptr)`;
it loads the memory tag, compares it with the
pointer tag, and executes `__builtin_trap` (or calls `__hwasan_error_load4(void *ptr)`) on mismatch.

It's possible to inline this callback too.

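A callback of this shape might look roughly like the sketch below. It is an assumption-laden illustration, not the actual run-time: the `sketch_` names are hypothetical, the pointer is modelled as a 64-bit integer offset into a simulated region, the shadow is a plain array, and only the first granule touched by the access is checked.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: 16-byte granules, tag in the top byte. */
enum { SHADOW_SCALE = 4 /* log2(16) */, TAG_SHIFT = 56, REGION_SIZE = 4096 };

/* Hypothetical shadow for a simulated region of REGION_SIZE bytes. */
static uint8_t sketch_shadow[REGION_SIZE >> SHADOW_SCALE];

/* Load the memory tag and compare it with the pointer tag. */
static int sketch_tags_match(uint64_t ptr) {
    uint8_t ptr_tag = (uint8_t)(ptr >> TAG_SHIFT);
    uint64_t addr = ptr & ((1ULL << TAG_SHIFT) - 1);
    assert(addr < REGION_SIZE);
    return sketch_shadow[addr >> SHADOW_SCALE] == ptr_tag;
}

/* Access type and size are encoded in the name, as with __hwasan_load4. */
void sketch_check_load4(uint64_t ptr) {
    if (!sketch_tags_match(ptr))
        __builtin_trap();   /* tag mismatch: stop the program */
}

void sketch_check_store8(uint64_t ptr) {
    if (!sketch_tags_match(ptr))
        __builtin_trap();
}
```

Instrumented code would call `sketch_check_load4(p)` immediately before a 4-byte load through `p`; inlining the callback amounts to emitting the body of `sketch_tags_match` and the trap at the access site.
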
Heap
----

Tagging the heap memory/pointers is done by `malloc`.
This can be based on any malloc that forces all objects to be N-aligned.

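As a rough illustration, a tagging allocator could be layered on C11 `aligned_alloc`. The sketch below assumes `N` is 16 and an 8-bit tag in the top byte; `sketch_allocation` and `sketch_tagged_malloc` are hypothetical names, `rand()` stands in for a proper random source, and the shadow update is only indicated by a comment. The tagged value is kept as an integer next to the raw pointer because a tagged pointer can only be dereferenced on hardware that ignores the top byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed parameters for this sketch: N = 16, tag in bits 56..63. */
enum { GRANULE = 16, TAG_SHIFT = 56 };

typedef struct {
    void *raw;        /* untagged, GRANULE-aligned pointer from the allocator */
    uint64_t tagged;  /* the same address with the tag placed in the top byte */
    uint8_t tag;      /* the tag chosen for this object */
} sketch_allocation;

/* Round a size up to a multiple of GRANULE so that no two objects share a granule. */
static size_t sketch_round_up(size_t size) {
    return (size + GRANULE - 1) / GRANULE * GRANULE;
}

static sketch_allocation sketch_tagged_malloc(size_t size) {
    sketch_allocation a = {0};
    a.raw = aligned_alloc(GRANULE, sketch_round_up(size ? size : 1));
    if (!a.raw)
        return a;
    a.tag = (uint8_t)(rand() & 0xff);
    a.tagged = ((uint64_t)(uintptr_t)a.raw & ((1ULL << TAG_SHIFT) - 1))
             | ((uint64_t)a.tag << TAG_SHIFT);
    /* A real runtime would also store a.tag in the shadow for every granule of the object. */
    assert((uintptr_t)a.raw % GRANULE == 0);
    return a;
}
```

The matching `free` would release `raw` and, in a real tool, retag the freed granules so that stale pointers no longer match.
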
Stack
-----

Special compiler instrumentation is required to align the local variables
by N, tag the memory and the pointers.
Stack instrumentation is expected to be a major source of overhead,
but could be optional.
TODO: details.

Globals
-------

TODO: details.

Error reporting
---------------

Errors are generated by `__builtin_trap` and are handled by a signal handler.

Attribute
---------

HWASAN uses its own LLVM IR Attribute `sanitize_hwaddress` and a matching
C function attribute. An alternative would be to re-use ASAN's attribute
`sanitize_address`. The reasons to use a separate attribute are:

* Users may need to disable ASAN but not HWASAN, or vice versa,
  because the tools have different trade-offs and compatibility issues.
* LLVM (ideally) does not use flags to decide which pass is being used:
  ASAN or HWASAN is applied based on the function attributes.

This does mean that users of HWASAN may need to add the new attribute
to the code that already uses the old attribute.

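For code that already opts a function out of ASAN, adding the HWASAN counterpart could look like the sketch below. It assumes the Clang spelling `no_sanitize("hwaddress")` for the C-level attribute; the macro and function names are hypothetical, and other compilers simply get empty macros.

```c
#include <stdint.h>

/* Assumption: Clang accepts no_sanitize("address") and no_sanitize("hwaddress"). */
#if defined(__clang__)
#define SKETCH_NO_ASAN   __attribute__((no_sanitize("address")))
#define SKETCH_NO_HWASAN __attribute__((no_sanitize("hwaddress")))
#else
#define SKETCH_NO_ASAN
#define SKETCH_NO_HWASAN
#endif

/* The old attribute alone would leave this function instrumented by HWASAN,
   so both attributes are listed. */
SKETCH_NO_ASAN SKETCH_NO_HWASAN
uint32_t sketch_read_raw(const uint32_t *p) {
    return *p;
}
```

Keeping the two attributes separate lets a project disable one tool for a function while leaving the other enabled.
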
Comparison with AddressSanitizer
================================

HWASAN:

* Is less portable than :doc:`AddressSanitizer`
  as it relies on hardware `Address Tagging`_ (AArch64).
  Address Tagging can be emulated with compiler instrumentation,
  but it will require the instrumentation to remove the tags before
  any load or store, which is infeasible in any realistic environment
  that contains non-instrumented code.
* May have compatibility problems if the target code uses higher
  pointer bits for other purposes.
* May require changes in the OS kernels (e.g. Linux seems to dislike
  tagged pointers passed from user space:
  https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt).
* **Does not require redzones to detect buffer overflows**,
  but the buffer overflow detection is probabilistic, with roughly
  `(2**K-1)/(2**K)` probability of catching a bug.
* **Does not require quarantine to detect heap-use-after-free,
  or stack-use-after-return**.
  The detection is similarly probabilistic.

The memory overhead of HWASAN is expected to be much smaller
than that of AddressSanitizer:
`1/N` extra memory for the shadow
and some overhead due to `N`-aligning all objects.

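To make the two figures concrete, the sketch below evaluates `(2**K-1)/(2**K)` and `1/N` for given parameters. The function names are hypothetical, and the probability assumes that the tag of the wrongly accessed memory is independent of the pointer's tag and uniformly distributed.

```c
#include <assert.h>

/* Chance that a bad access hits memory whose K-bit tag differs from the pointer tag. */
static double sketch_detection_probability(unsigned k) {
    assert(k >= 1 && k < 32);
    double tags = (double)(1u << k);   /* 2**K possible tag values */
    return (tags - 1.0) / tags;
}

/* Shadow bytes needed per application byte with an N=>1 shadow. */
static double sketch_shadow_overhead(unsigned n) {
    assert(n > 0);
    return 1.0 / (double)n;
}
```

For example, `K = 4` gives 15/16 = 0.9375 and `K = 8` gives 255/256, while `N = 16` costs 1/16 = 6.25% extra memory for the shadow, before counting the padding caused by `N`-alignment.
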
Related Work
============

* `SPARC ADI`_ implements a similar tool mostly in hardware.
* `Effective and Efficient Memory Protection Using Dynamic Tainting`_ discusses
  similar approaches ("lock & key").
* `Watchdog`_ discusses a heavier, but still somewhat similar
  "lock & key" approach.
* *TODO: add more "related work" links. Suggestions are welcome.*

.. _Watchdog: http://www.cis.upenn.edu/acg/papers/isca12_watchdog.pdf
.. _Effective and Efficient Memory Protection Using Dynamic Tainting: https://www.cc.gatech.edu/~orso/papers/clause.doudalis.orso.prvulovic.pdf
.. _SPARC ADI: https://lazytyped.blogspot.com/2017/09/getting-started-with-adi.html
.. _AddressSanitizer paper: https://www.usenix.org/system/files/conference/atc12/atc12-final39.pdf
.. _Address Tagging: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch12s05s01.html