=======================================================
Hardware-assisted AddressSanitizer Design Documentation
=======================================================

This page is a design document for
**hardware-assisted AddressSanitizer** (or **HWASAN**),
a tool similar to :doc:`AddressSanitizer`
but based on partial hardware assistance.

This document is a draft; suggestions are welcome.


Introduction
============

:doc:`AddressSanitizer`
tags every 8 bytes of the application memory with a 1 byte tag (using *shadow memory*),
uses *redzones* to find buffer-overflows and
*quarantine* to find use-after-free.
The redzones, the quarantine, and, to a lesser extent, the shadow are the
sources of AddressSanitizer's memory overhead.
See the `AddressSanitizer paper`_ for details.

AArch64 has `Address Tagging`_, a hardware feature that allows
software to use the 8 most significant bits of a 64-bit pointer as
a tag. HWASAN uses `Address Tagging`_
to implement a memory safety tool, similar to :doc:`AddressSanitizer`,
but with smaller memory overhead and slightly different (mostly better)
accuracy guarantees.

Algorithm
=========
* Every heap/stack/global memory object is forcibly aligned to `N` bytes
  (`N` is e.g. 16 or 64).
* For every such object a random `K`-bit tag `T` is chosen (`K` is e.g. 4 or 8).
* The pointer to the object is tagged with `T`.
* The memory for the object is also tagged with `T`
  (using a `N=>1` shadow memory).
* Every load and store is instrumented to read the memory tag and compare it
  with the pointer tag; an exception is raised on tag mismatch.
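
The steps above can be sketched in plain C. This is only an emulation under
stated assumptions, not the HWASAN runtime: `N` is 16, `K` is 8, the tag lives
in bits 56..63, "addresses" are offsets into a hypothetical 1 KiB arena, and
the names `with_tag`, `tag_object` and `tags_match` are invented for
illustration.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  /* Illustrative parameters: N = 16-byte granules, K = 8-bit tags in bits 56..63. */
  enum { GRANULE = 16, TAG_SHIFT = 56, ARENA_SIZE = 1024 };
  #define TAG_MASK (0xFFull << TAG_SHIFT)

  /* N=>1 shadow: one tag byte per GRANULE bytes of a hypothetical arena.
     "Addresses" in this sketch are offsets into that arena. */
  static uint8_t shadow[ARENA_SIZE / GRANULE];

  /* Store a tag in the top byte of a 64-bit pointer value. */
  static uint64_t with_tag(uint64_t addr, uint8_t tag) {
      return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
  }

  static uint8_t tag_of(uint64_t ptr) { return (uint8_t)(ptr >> TAG_SHIFT); }

  static uint64_t address_of(uint64_t ptr) { return ptr & ~TAG_MASK; }

  /* Tag the memory of an object that starts at a GRANULE-aligned offset
     and return the tagged pointer to it. */
  static uint64_t tag_object(size_t offset, size_t size, uint8_t tag) {
      size_t first = offset / GRANULE;
      size_t end = (offset + size + GRANULE - 1) / GRANULE; /* one past the last granule */
      for (size_t g = first; g < end; g++)
          shadow[g] = tag;
      return with_tag(offset, tag);
  }

  /* Returns 1 when the pointer tag equals the memory tag, 0 on mismatch
     (or when the address falls outside the arena). */
  static int tags_match(uint64_t ptr) {
      uint64_t addr = address_of(ptr);
      if (addr >= ARENA_SIZE)
          return 0;
      return shadow[addr / GRANULE] == tag_of(ptr);
  }

For example, after `a = tag_object(0, 32, 0x2A)` and
`b = tag_object(32, 16, 0x7C)`, the check `tags_match(a + 16)` succeeds, while
`tags_match(a + 32)` fails because the access lands in the granule owned by
the neighbouring object.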

Instrumentation
===============

Memory Accesses
---------------
All memory accesses are prefixed with a call to a run-time function.
The function encodes the type and the size of access in its name;
it receives the address as a parameter, e.g. `__hwasan_load4(void *ptr)`;
it loads the memory tag, compares it with the
pointer tag, and executes `__builtin_trap` (or calls `__hwasan_error_load4(void *ptr)`) on mismatch.

It's possible to inline this callback too.
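
A minimal sketch of such a callback and of an instrumented 4-byte load is
shown here. It is an assumption-laden emulation: `sketch_hwasan_load4` is a
hypothetical stand-in for `__hwasan_load4`, it takes an integer instead of
`void *`, it counts mismatches instead of executing `__builtin_trap`, and it
addresses a small emulated memory by offset so that it runs on hosts without
Address Tagging.

.. code-block:: c

  #include <stdint.h>
  #include <string.h>

  enum { GRANULE = 16, TAG_SHIFT = 56, MEM_SIZE = 256 };
  #define ADDR_MASK (~(0xFFull << TAG_SHIFT))

  static uint8_t memory[MEM_SIZE];           /* emulated application memory */
  static uint8_t shadow[MEM_SIZE / GRANULE]; /* one memory tag per granule */
  static int mismatch_count;                 /* stands in for the trap */

  /* Hypothetical stand-in for __hwasan_load4: returns 1 if the access may
     proceed. Only the first granule is checked, which is enough for an
     aligned 4-byte access. */
  static int sketch_hwasan_load4(uint64_t ptr) {
      uint64_t addr = ptr & ADDR_MASK;
      uint8_t pointer_tag = (uint8_t)(ptr >> TAG_SHIFT);
      if (addr > MEM_SIZE - 4 || shadow[addr / GRANULE] != pointer_tag) {
          mismatch_count++; /* a real runtime would trap or report here */
          return 0;
      }
      return 1;
  }

  /* What an instrumented 4-byte load could look like: check first, then load.
     The tag is stripped by hand because this host does not ignore it. */
  static uint32_t instrumented_load4(uint64_t ptr) {
      uint32_t value = 0;
      if (sketch_hwasan_load4(ptr))
          memcpy(&value, &memory[ptr & ADDR_MASK], sizeof value);
      return value;
  }

Inlining the callback would amount to emitting the body of
`sketch_hwasan_load4` directly before each load instead of a call.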

Heap
----

Tagging the heap memory/pointers is done by `malloc`.
This can be based on any malloc that forces all objects to be N-aligned.
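
As an illustration only, a tagging allocator could look like the bump
allocator sketched here. Everything in it is an assumption made for the
example: the names `sketch_malloc` and `sketch_free`, the 16-byte granule,
the 8-bit tag in the top byte, the use of `rand` as the tag source, reserving
tag 0, and retagging memory on free (which is one way a stale pointer can be
made to mismatch). Addresses are offsets into a hypothetical arena.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>
  #include <stdlib.h>

  enum { GRANULE = 16, TAG_SHIFT = 56, ARENA_SIZE = 4096 };
  #define ADDR_MASK (~(0xFFull << TAG_SHIFT))

  static uint8_t shadow[ARENA_SIZE / GRANULE]; /* one tag per granule */
  static size_t next_free;                     /* bump pointer (arena offset) */

  /* Tags 1..255; tag 0 is left for untagged memory (an assumption). */
  static uint8_t random_tag(void) { return (uint8_t)(rand() % 255 + 1); }

  static size_t round_up(size_t size) {
      size_t rounded = (size + GRANULE - 1) / GRANULE * GRANULE;
      return rounded ? rounded : GRANULE;
  }

  static void set_tags(size_t offset, size_t rounded, uint8_t tag) {
      for (size_t g = offset / GRANULE; g < (offset + rounded) / GRANULE; g++)
          shadow[g] = tag;
  }

  /* Returns a tagged pointer (arena offset plus tag), or 0 when out of space. */
  static uint64_t sketch_malloc(size_t size) {
      if (size > ARENA_SIZE)
          return 0;
      size_t rounded = round_up(size);
      if (rounded > ARENA_SIZE - next_free)
          return 0;
      size_t offset = next_free; /* always a multiple of GRANULE */
      next_free += rounded;
      uint8_t tag = random_tag();
      set_tags(offset, rounded, tag);
      return ((uint64_t)tag << TAG_SHIFT) | offset;
  }

  /* Retag the freed memory so that the old pointer no longer matches.
     The size is passed in because this sketch keeps no metadata. */
  static void sketch_free(uint64_t ptr, size_t size) {
      uint8_t old_tag = (uint8_t)(ptr >> TAG_SHIFT);
      uint8_t fresh;
      do {
          fresh = random_tag();
      } while (fresh == old_tag); /* forced to differ to keep the demo deterministic */
      set_tags((size_t)(ptr & ADDR_MASK), round_up(size), fresh);
  }

  static int tags_match(uint64_t ptr) {
      return shadow[(ptr & ADDR_MASK) / GRANULE] == (uint8_t)(ptr >> TAG_SHIFT);
  }

With this sketch, a pointer returned by `sketch_malloc(20)` matches the tags
of both granules it covers, and stops matching after `sketch_free`. If the
new tag were chosen without excluding the old one, the stale pointer would
still match with a probability of roughly one in the number of tag values.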

Stack
-----

Special compiler instrumentation is required to align the local variables
to `N` bytes and to tag both the memory and the pointers.
Stack instrumentation is expected to be a major source of overhead,
but could be optional.
TODO: details.

Globals
-------

TODO: details.

Error reporting
---------------

Errors are generated by `__builtin_trap` and are handled by a signal handler.

Attribute
---------

HWASAN uses its own LLVM IR Attribute `sanitize_hwaddress` and a matching
C function attribute. An alternative would be to re-use ASAN's attribute
`sanitize_address`. The reasons to use a separate attribute are:

  * Users may need to disable ASAN but not HWASAN, or vice versa,
    because the tools have different trade-offs and compatibility issues.
  * LLVM (ideally) does not use flags to decide which pass is being used;
    whether ASAN or HWASAN is applied is determined by the function attributes.

This does mean that users of HWASAN may need to add the new attribute
to the code that already uses the old attribute.
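
A sketch of how user code might opt a function out of HWASAN
instrumentation follows. It assumes the Clang spellings
`__has_feature(hwaddress_sanitizer)` and
`__attribute__((no_sanitize("hwaddress")))`; the macro name `NO_HWASAN` and
the function `sum_raw` are hypothetical. Code that already carries an ASAN
opt-out would need this second annotation as well.

.. code-block:: c

  /* NO_HWASAN expands to the opt-out attribute only when the compiler
     reports that HWASAN is enabled; otherwise it expands to nothing. */
  #if defined(__has_feature)
  #  if __has_feature(hwaddress_sanitizer)
  #    define NO_HWASAN __attribute__((no_sanitize("hwaddress")))
  #  endif
  #endif
  #ifndef NO_HWASAN
  #  define NO_HWASAN
  #endif

  /* Hypothetical function that should not be instrumented by HWASAN. */
  NO_HWASAN
  static int sum_raw(const int *values, int count) {
      int total = 0;
      for (int i = 0; i < count; i++)
          total += values[i];
      return total;
  }

Keeping the attribute behind a macro lets the same source build with
compilers or configurations that do not know the attribute.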


Comparison with AddressSanitizer
================================

HWASAN:
  * Is less portable than :doc:`AddressSanitizer`
    as it relies on hardware `Address Tagging`_ (AArch64).
    Address Tagging can be emulated with compiler instrumentation,
    but it will require the instrumentation to remove the tags before
    any load or store, which is infeasible in any realistic environment
    that contains non-instrumented code.
  * May have compatibility problems if the target code uses higher
    pointer bits for other purposes.
  * May require changes in OS kernels (e.g. Linux seems to dislike
    tagged pointers passed from user space:
    https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt).
  * **Does not require redzones to detect buffer overflows**,
    but the buffer overflow detection is probabilistic, with roughly
    `(2**K-1)/(2**K)` probability of catching a bug.
  * **Does not require quarantine to detect heap-use-after-free,
    or stack-use-after-return**.
    The detection is similarly probabilistic.
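
The detection probability quoted above can be derived as follows, assuming
that tags are chosen uniformly and independently, so that the memory touched
by a bad access carries the same tag as the pointer with probability
`1/2**K`:

.. math::

   P_{\text{detect}} = 1 - \frac{1}{2^{K}} = \frac{2^{K} - 1}{2^{K}}

For the example tag widths used in this document, this gives
`15/16 = 0.9375` for `K = 4` and `255/256`, about `0.996`, for `K = 8`.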

The memory overhead of HWASAN is expected to be much smaller
than that of AddressSanitizer:
`1/N` extra memory for the shadow
and some overhead due to `N`-aligning all objects.
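
As a worked example of the shadow overhead, using the example granule sizes
from the Algorithm section and AddressSanitizer's 1-byte tag per 8 bytes of
application memory for comparison:

.. math::

   \frac{1}{N}\Big|_{N=16} = 6.25\%, \qquad
   \frac{1}{N}\Big|_{N=64} = 1.5625\%, \qquad
   \frac{1}{8} = 12.5\% \ \text{(AddressSanitizer shadow)}

The alignment overhead depends on the object sizes of the application;
rounding an object up to a multiple of `N` adds at most `N - 1` bytes of
padding per object.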


Related Work
============
* `SPARC ADI`_ implements a similar tool mostly in hardware.
* `Effective and Efficient Memory Protection Using Dynamic Tainting`_ discusses
  similar approaches ("lock & key").
* `Watchdog`_ discusses a heavier, but still somewhat similar
  "lock & key" approach.
* *TODO: add more "related work" links. Suggestions are welcome.*


.. _Watchdog: http://www.cis.upenn.edu/acg/papers/isca12_watchdog.pdf
.. _Effective and Efficient Memory Protection Using Dynamic Tainting: https://www.cc.gatech.edu/~orso/papers/clause.doudalis.orso.prvulovic.pdf
.. _SPARC ADI: https://lazytyped.blogspot.com/2017/09/getting-started-with-adi.html
.. _AddressSanitizer paper: https://www.usenix.org/system/files/conference/atc12/atc12-final39.pdf
.. _Address Tagging: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch12s05s01.html