Commit Graph

246 Commits

Author SHA1 Message Date
Andrew Browne 31d12df3b9 [DFSan] Remove deprecated flag from build-libc-list.py
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126429
2022-06-01 11:00:13 -07:00
Andrew Browne 15d5db276c [DFSan] build-libc-list.py no longer provides a list of default files.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126430
2022-05-31 11:25:56 -07:00
Andrew Browne b2b0322a81 [DFSan] Add option to specify individual library files, and an option to exit with an error code if any library file was not found.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D126336
2022-05-24 16:15:46 -07:00
Andrew Browne 204c12eef9 [DFSan] Print an error before calling null extern_weak functions, incase dfsan instrumentation optimized out a null check.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D124051
2022-04-19 17:01:41 -07:00
Andrew Browne dbf8c00b09 [DFSan] Remove trampolines to unblock opaque pointers. (Reland with fix)
https://github.com/llvm/llvm-project/issues/54172

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D121250
2022-03-14 16:03:25 -07:00
Andrew Browne edc33fa569 Revert "[DFSan] Remove trampolines to unblock opaque pointers."
This reverts commit 84af90336f.
2022-03-14 13:47:41 -07:00
Andrew Browne 84af90336f [DFSan] Remove trampolines to unblock opaque pointers.
https://github.com/llvm/llvm-project/issues/54172

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D121250
2022-03-14 13:39:49 -07:00
Andrew Browne 7607ddd981 [NFC][DFSan] Cleanup code to use align functions.
Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D116761
2022-01-06 14:42:38 -08:00
Andrew Browne 32167bfe64 [DFSan] Refactor dfsan_mem_shadow_transfer.
Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D116704
2022-01-06 09:33:19 -08:00
Andrew Browne 4e173585f6 [DFSan] Add option for conditional callbacks.
This allows DFSan to find tainted values used to control program behavior.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D116207
2022-01-05 15:07:09 -08:00
Andrew Browne d39d2acfdd [DFSan] Make dfsan_read_origin_of_first_taint public.
Makes origins easier to use with dfsan_read_label(addr, size).

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D116197
2021-12-22 23:45:30 -08:00
Andrew Browne ed6c757d5c [DFSan] Add functions to print origin trace from origin id instead of address.
dfsan_print_origin_id_trace
dfsan_sprint_origin_id_trace

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D116184
2021-12-22 16:45:54 -08:00
Vitaly Buka 6318001209 [sanitizer] Support IsRssLimitExceeded in all sanitizers
Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D115000
2021-12-03 12:45:44 -08:00
Vitaly Buka cb0e14ce6d [sanitizer] Switch dlsym hack to internal_allocator
Since glibc 2.34, dlsym does
  1. malloc 1
  2. malloc 2
  3. free pointer from malloc 1
  4. free pointer from malloc 2
These sequence was not handled by trivial dlsym hack.

This fixes https://bugs.llvm.org/show_bug.cgi?id=52278

Reviewed By: eugenis, morehouse

Differential Revision: https://reviews.llvm.org/D112588
2021-11-12 16:11:10 -08:00
Vitaly Buka ffd9c123e7 [dfsan] Dfsan version of D113328
Depends on D113328.

Differential Revision: https://reviews.llvm.org/D113454
2021-11-09 18:23:55 -08:00
Vitaly Buka 63886c21ec [NFC][dfsan] Split Init and ThreadStart 2021-11-08 19:16:55 -08:00
Martin Liska 13a442ca49 Enable -Wformat-pedantic and fix fallout.
Differential Revision: https://reviews.llvm.org/D113172
2021-11-05 13:12:35 +01:00
David CARLIER bb168f3207 [compiler-rt] update detect_write_exec option for apple devices.
Reviewed By: yln, vitalybuka

Differential Revision: https://reviews.llvm.org/D111390
2021-10-28 17:08:23 +01:00
Vitaly Buka df43d419de [NFC][sanitizer] Remove includes from header 2021-10-08 14:27:05 -07:00
Andrew Browne d81723c99b [DFSan] Optimize code for writing to shadow. Move SetShadow to namespace.
Writing zeros to shadow (including checking for existing zero) is now ~2x
faster on one example.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D110733
2021-09-30 12:42:21 -07:00
Kazuaki Ishizaki a1e7e401d2 [compiler-rt] NFC: Fix trivial typo
Reviewed By: xgupta

Differential Revision: https://reviews.llvm.org/D77457
2021-09-04 14:12:58 +05:30
Andrew Browne befb384484 [DFSan][NFC] Fix comment formatting. 2021-08-31 15:35:08 -07:00
Andrew Browne 76777b216b [DFSan] Add wrapper for getentropy().
Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D108604
2021-08-24 15:10:13 -07:00
Kostya Serebryany 8103b0700d [sanitizer coverage] add a basic default implementation of callbacks for -fsanitize-coverage=inline-8bit-counters,pc-table
[sanitizer coverage] add a basic default implementation of callbacks for -fsanitize-coverage=inline-8bit-counters,pc-table

Reviewed By: kostik

Differential Revision: https://reviews.llvm.org/D108405
2021-08-24 14:56:15 -07:00
Dmitry Vyukov 123c58ea26 sanitizer_common: enable format string checking
Enable -Wformat in sanitizer_common now that it's
cleaned up from existing warnings.
But disable it in all sanitizers for now since
they are not cleaned up yet, but inherit sanitizer_common CFLAGS.

Depends on D107980.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D107981
2021-08-13 13:44:52 +02:00
Vitaly Buka 44c83eccf9 [sanitizer] Remove cpplint annotations
cpplint was removed by D107197

Differential Revision: https://reviews.llvm.org/D107198
2021-07-30 18:20:40 -07:00
Dmitry Vyukov 440e936c47 Revert "sanitizers: increase .clang-format columns to 100"
This reverts commit 5d1df6d220.

There is a strong objection to this change:
https://reviews.llvm.org/D106436#2905618

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D106847
2021-07-28 09:40:21 +02:00
George Balatsouras 228bea6a36 Revert D106195 "[dfsan] Add wrappers for v*printf functions"
This reverts commit bf281f3647.

This commit causes dfsan to segfault.
2021-07-24 08:53:48 +00:00
George Balatsouras bf281f3647 [dfsan] Add wrappers for v*printf functions
Functions `vsnprintf`, `vsprintf` and `vfprintf` commonly occur in DFSan warnings.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D106195
2021-07-22 15:39:17 -07:00
Jianzhou Zhao a806f933a2 [dfsan] Make warn_unimplemented off by default
Because almost all internal use cases need to turn warn_unimplemented off.
2021-07-22 21:45:41 +00:00
Dmitry Vyukov 5d1df6d220 sanitizers: increase .clang-format columns to 100
The current (default) line length is 80 columns.
That's based on old hardware and historical conventions.
There are no existent reasons to keep line length that small,
especially provided that our coding style uses quite lengthy
identifiers. The Linux kernel recently switched to 100,
let's start with 100 as well.

This change intentionally does not re-format code.
Re-formatting is intended to happen incrementally,
or on dir-by-dir basis separately.

Reviewed By: vitalybuka, melver, MaskRay

Differential Revision: https://reviews.llvm.org/D106436
2021-07-22 11:15:02 +02:00
John Ericson 1e03c37b97 Prepare Compiler-RT for GnuInstallDirs, matching libcxx, document all
This is a second attempt at D101497, which landed as
9a9bc76c0e but had to be reverted in
8cf7ddbdd4.

This issue was that in the case that `COMPILER_RT_INSTALL_PATH` is
empty, expressions like "${COMPILER_RT_INSTALL_PATH}/bin" evaluated to
"/bin" not "bin" as intended and as was originally.

One solution is to make `COMPILER_RT_INSTALL_PATH` always non-empty,
defaulting it to `CMAKE_INSTALL_PREFIX`. D99636 adopted that approach.
But, I think it is more ergonomic to allow those project-specific paths
to be relative the global ones. Also, making install paths absolute by
default inhibits the proper behavior of functions like
`GNUInstallDirs_get_absolute_install_dir` which make relative install
paths absolute in a more complicated way.

Given all this, I will define a function like the one asked for in
https://gitlab.kitware.com/cmake/cmake/-/issues/19568 (and needed for a
similar use-case).

---

Original message:

Instead of using `COMPILER_RT_INSTALL_PATH` through the CMake for
complier-rt, just use it to define variables for the subdirs which
themselves are used.

This preserves compatibility, but later on we might consider getting rid
of `COMPILER_RT_INSTALL_PATH` and just changing the defaults for the
subdir variables directly.

---

There was a seaming bug where the (non-Apple) per-target libdir was
`${target}` not `lib/${target}`. I suspect that has to do with the docs
on `COMPILER_RT_INSTALL_PATH` saying was the library dir when that's no
longer true, so I just went ahead and fixed it, allowing me to define
fewer and more sensible variables.

That last part should be the only behavior changes; everything else
should be a pure refactoring.

---

I added some documentation of these variables too. In particular, I
wanted to highlight the gotcha where `-DSomeCachePath=...` without the
`:PATH` will lead CMake to make the path absolute. See [1] for
discussion of the problem, and [2] for the brief official documentation
they added as a result.

[1]: https://cmake.org/pipermail/cmake/2015-March/060204.html

[2]: https://cmake.org/cmake/help/latest/manual/cmake.1.html#options

In 38b2dec37e the problem was somewhat
misidentified and so `:STRING` was used, but `:PATH` is better as it
sets the correct type from the get-go.

---

D99484 is the main thrust of the `GnuInstallDirs` work. Once this lands,
it should be feasible to follow both of these up with a simple patch for
compiler-rt analogous to the one for libcxx.

Reviewed By: phosek, #libc_abi, #libunwind

Differential Revision: https://reviews.llvm.org/D105765
2021-07-13 15:21:41 +00:00
Martin Storsjö 8cf7ddbdd4 Revert "Prepare Compiler-RT for GnuInstallDirs, matching libcxx"
This reverts commit 9a9bc76c0e.

That commit broke "ninja install" when building compiler-rt for mingw
targets, building standalone (pointing cmake at the compiler-rt
directory) with cmake 3.16.3 (the one shipped in ubuntu 20.04), with
errors like this:

-- Install configuration: "Release"
CMake Error at cmake_install.cmake:44 (file):
  file cannot create directory: /include/sanitizer.  Maybe need
  administrative privileges.
Call Stack (most recent call first):
  /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/cmake_install.cmake:37 (include)

FAILED: include/CMakeFiles/install-compiler-rt-headers
cd /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/include && /usr/bin/cmake -DCMAKE_INSTALL_COMPONENT="compiler-rt-headers" -P /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/cmake_install.cmake
ninja: build stopped: subcommand failed.
2021-07-10 10:45:54 +03:00
John Ericson 9a9bc76c0e Prepare Compiler-RT for GnuInstallDirs, matching libcxx
Instead of using `COMPILER_RT_INSTALL_PATH` through the CMake for
complier-rt, just use it to define variables for the subdirs which
themselves are used.

This preserves compatibility, but later on we might consider getting rid
of `COMPILER_RT_INSTALL_PATH` and just changing the defaults for the
subdir variables directly.

---

There was a seaming bug where the (non-Apple) per-target libdir was
`${target}` not `lib/${target}`. I suspect that has to do with the docs
on `COMPILER_RT_INSTALL_PATH` saying was the library dir when that's no
longer true, so I just went ahead and fixed it, allowing me to define
fewer and more sensible variables.

That last part should be the only behavior changes; everything else
should be a pure refactoring.

---

D99484 is the main thrust of the `GnuInstallDirs` work. Once this lands,
it should be feasible to follow both of these up with a simple patch for
compiler-rt analogous to the one for libcxx.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D101497
2021-07-09 20:41:53 +00:00
Jianzhou Zhao ae6648cee0 [dfsan] Expose dfsan_get_track_origins to get origin tracking status
This allows application code checks if origin tracking is on before
printing out traces.

-dfsan-track-origins can be 0,1,2.
The current code only distinguishes 1 and 2 in compile time, but not at runtime.
Made runtime distinguish 1 and 2 too.

Reviewed By: browneee

Differential Revision: https://reviews.llvm.org/D105128
2021-06-29 20:32:39 +00:00
Andrew Browne 45f6d5522f [DFSan] Change shadow and origin memory layouts to match MSan.
Previously on x86_64:

  +--------------------+ 0x800000000000 (top of memory)
  | application memory |
  +--------------------+ 0x700000008000 (kAppAddr)
  |                    |
  |       unused       |
  |                    |
  +--------------------+ 0x300000000000 (kUnusedAddr)
  |       origin       |
  +--------------------+ 0x200000008000 (kOriginAddr)
  |       unused       |
  +--------------------+ 0x200000000000
  |   shadow memory    |
  +--------------------+ 0x100000008000 (kShadowAddr)
  |       unused       |
  +--------------------+ 0x000000010000
  | reserved by kernel |
  +--------------------+ 0x000000000000

  MEM_TO_SHADOW(mem) = mem & ~0x600000000000
  SHADOW_TO_ORIGIN(shadow) = kOriginAddr - kShadowAddr + shadow

Now for x86_64:

  +--------------------+ 0x800000000000 (top of memory)
  |    application 3   |
  +--------------------+ 0x700000000000
  |      invalid       |
  +--------------------+ 0x610000000000
  |      origin 1      |
  +--------------------+ 0x600000000000
  |    application 2   |
  +--------------------+ 0x510000000000
  |      shadow 1      |
  +--------------------+ 0x500000000000
  |      invalid       |
  +--------------------+ 0x400000000000
  |      origin 3      |
  +--------------------+ 0x300000000000
  |      shadow 3      |
  +--------------------+ 0x200000000000
  |      origin 2      |
  +--------------------+ 0x110000000000
  |      invalid       |
  +--------------------+ 0x100000000000
  |      shadow 2      |
  +--------------------+ 0x010000000000
  |    application 1   |
  +--------------------+ 0x000000000000

  MEM_TO_SHADOW(mem) = mem ^ 0x500000000000
  SHADOW_TO_ORIGIN(shadow) = shadow + 0x100000000000

Reviewed By: stephan.yichao.zhao, gbalats

Differential Revision: https://reviews.llvm.org/D104896
2021-06-25 17:00:38 -07:00
Andrew Browne 759e797767 [DFSan][NFC] Refactor Origin Address Alignment code.
Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104565
2021-06-21 14:52:02 -07:00
Andrew Browne 14407332de [DFSan] Cleanup code for platforms other than Linux x86_64.
These other platforms are unsupported and untested.
They could be re-added later based on MSan code.

Reviewed By: gbalats, stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104481
2021-06-18 11:21:46 -07:00
Andrew Browne 39295e92f7 Revert "[DFSan] Cleanup code for platforms other than Linux x86_64."
This reverts commit 8441b993bd.

Buildbot failures.
2021-06-17 14:19:18 -07:00
Andrew Browne 8441b993bd [DFSan] Cleanup code for platforms other than Linux x86_64.
These other platforms are unsupported and untested.
They could be re-added later based on MSan code.

Reviewed By: gbalats, stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104481
2021-06-17 14:08:40 -07:00
George Balatsouras 98504959a6 [dfsan] Add stack-trace printing functions to dfsan interface
Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104165
2021-06-14 14:09:00 -07:00
George Balatsouras 5b4dda550e [dfsan] Add full fast8 support
Complete support for fast8:
- amend shadow size and mapping in runtime
- remove fast16 mode and -dfsan-fast-16-labels flag
- remove legacy mode and make fast8 mode the default
- remove dfsan-fast-8-labels flag
- remove functions in dfsan interface only applicable to legacy
- remove legacy-related instrumentation code and tests
- update documentation.

Reviewed By: stephan.yichao.zhao, browneee

Differential Revision: https://reviews.llvm.org/D103745
2021-06-07 17:20:54 -07:00
Jianzhou Zhao 2c82588dac [dfsan] Use the sanitizer allocator to reduce memory cost
dfsan does not use sanitizer allocator as others. In practice,
we let it use glibc's allocator since tcmalloc needs more work
to be working with dfsan well. With glibc, we observe large
memory leakage. This could relate to two things:

1) glibc allocator has limitation: for example, tcmalloc can reduce memory footprint 2x easily

2) glibc may call unmmap directly as an internal system call by using system call number. so DFSan has no way to release shadow spaces for those unmmap.

Using sanitizer allocator addresses the above issues
1) its memory management is close to tcmalloc

2) we can register callback when sanitizer allocator calls unmmap, so dfsan can release shadow spaces correctly.

Our experiment with internal server-based application proved that with the change, in a-few-day run, memory usage leakage is close to what tcmalloc does w/o dfsan.

This change mainly follows MSan's code.

1) define allocator callbacks at dfsan_allocator.h|cpp

2) mark allocator APIs to be discard

3) intercept allocator APIs

4) make dfsan_set_label consistent with MSan's SetShadow when setting 0 labels, define dfsan_release_meta_memory when unmap is called

5) add flags about whether zeroing memory after malloc/free. dfsan works at byte-level, so bit-level oparations can cause reading undefined shadow. See D96842. zeroing memory after malloc helps this. About zeroing after free, reading after free is definitely UB, but if user code does so, it is hard to debug an overtainting caused by this w/o running MSan. So we add the flag to help debugging.

This change will be split to small changes for review. Before that, a question is
"this code shares a lot of with MSan, for example, dfsan_allocator.* and dfsan_new_delete.*.
Does it make sense to unify the code at sanitizer_common? will that introduce some
maintenance issue?"

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101204
2021-06-06 22:09:31 +00:00
George Balatsouras a11cb10a36 [dfsan] Add function that prints origin stack trace to buffer
Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D102451
2021-05-24 11:09:03 -07:00
Jianzhou Zhao 1fb612d060 [dfsan] Add a DFSan allocator
This is a part of https://reviews.llvm.org/D101204

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101666
2021-05-05 00:51:45 +00:00
Jianzhou Zhao 36cec26b38 [dfsan] move dfsan_flags.h to cc files
D101666 needs this change.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101857
2021-05-04 22:54:02 +00:00
Fangrui Song 2fec8860d8 [sanitizer] Set IndentPPDirectives: AfterHash in .clang-format
Code patterns like this are common, `#` at the line beginning
(https://google.github.io/styleguide/cppguide.html#Preprocessor_Directives),
one space indentation for if/elif/else directives.
```
#if SANITIZER_LINUX
# if defined(__aarch64__)
# endif
#endif
```

However, currently clang-format wants to reformat the code to
```
#if SANITIZER_LINUX
#if defined(__aarch64__)
#endif
#endif
```

This significantly harms readability in my review.  Use `IndentPPDirectives:
AfterHash` to defeat the diagnostic. clang-format will now suggest:

```
#if SANITIZER_LINUX
#  if defined(__aarch64__)
#  endif
#endif
```

Unfortunately there is no clang-format option using indent with 1 for
just preprocessor directives. However, this is still one step forward
from the current behavior.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D100238
2021-05-03 13:49:41 -07:00
Jianzhou Zhao 7fdf270965 [dfsan] Track origin at loads
The first version of origin tracking tracks only memory stores. Although
    this is sufficient for understanding correct flows, it is hard to figure
    out where an undefined value is read from. To find reading undefined values,
    we still have to do a reverse binary search from the last store in the chain
    with printing and logging at possible code paths. This is
    quite inefficient.

    Tracking memory load instructions can help this case. The main issues of
    tracking loads are performance and code size overheads.

    With tracking only stores, the code size overhead is 38%,
    memory overhead is 1x, and cpu overhead is 3x. In practice #load is much
    larger than #store, so both code size and cpu overhead increases. The
    first blocker is code size overhead: link fails if we inline tracking
    loads. The workaround is using external function calls to propagate
    metadata. This is also the workaround ASan uses. The cpu overhead
    is ~10x. This is a trade off between debuggability and performance,
    and will be used only when debugging cases that tracking only stores
    is not enough.

Reviewed By: gbalats

Differential Revision: https://reviews.llvm.org/D100967
2021-04-22 16:25:24 +00:00
Jianzhou Zhao e4701471d6 [dfsan] Set sigemptyset's return label to be 0
This was not set from when the wrapper was introduced.

Reviewed By: gbalats

Differential Revision: https://reviews.llvm.org/D99678
2021-03-31 21:07:44 +00:00
Jianzhou Zhao aaab444179 [dfsan] Ignore dfsan origin wrappers when instrumenting code 2021-03-29 00:15:26 +00:00