Commit Graph

123 Commits

Author SHA1 Message Date
Andrew Browne 5748219fd2 [DFSan] Add dfsan-combine-taint-lookup-table option as work around for
false negatives when dfsan-combine-pointer-labels-on-load=0 and
dfsan-combine-offset-labels-on-gep=0 miss data flows through lookup tables.

Example case:
628a2825f8/absl/strings/ascii.h (L182)

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D122787
2022-04-05 11:05:10 -07:00
Andrew Browne 18564095a7 [DFSan] Remove use of setarch in dfsan test.
Use of setarch Was added by
f93c2b64ed

Running the test now it doesn't seem necessary because:

1) Explicitly only x86_64 is supported for dfsan.

2) https://reviews.llvm.org/D111522 makes it less flakey.

Differential Revision: https://reviews.llvm.org/D121439
2022-03-14 10:03:51 -07:00
Andrew Browne 12bfea58b8 [DFSan] Fix several bugs in dfsan custom callbacks test.
Reviewed By: kda

Differential Revision: https://reviews.llvm.org/D121249
2022-03-08 14:26:28 -08:00
Andrew Browne 4e173585f6 [DFSan] Add option for conditional callbacks.
This allows DFSan to find tainted values used to control program behavior.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D116207
2022-01-05 15:07:09 -08:00
Andrew Browne d39d2acfdd [DFSan] Make dfsan_read_origin_of_first_taint public.
Makes origins easier to use with dfsan_read_label(addr, size).

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D116197
2021-12-22 23:45:30 -08:00
Andrew Browne ed6c757d5c [DFSan] Add functions to print origin trace from origin id instead of address.
dfsan_print_origin_id_trace
dfsan_sprint_origin_id_trace

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D116184
2021-12-22 16:45:54 -08:00
Vitaly Buka b6169e231e [nfc][dfsan] Remove obsolete comment 2021-11-18 18:37:13 -08:00
Andrew Browne 50a08e2c6d [DFSan] Fix flakey release_shadow_space.c accounting for Origin chains.
Test sometimes fails on buildbot (after two non-Origins executions):

/usr/bin/ld: warning: Cannot export local symbol 'dfsan_flush'
RSS at start: 4620, after mmap: 107020, after mmap+set label: 209424, after fixed map: 4624, after another mmap+set label: 209424, after munmap: 4624
/usr/bin/ld: warning: Cannot export local symbol 'dfsan_flush'
RSS at start: 4620, after mmap: 107020, after mmap+set label: 209424, after fixed map: 4624, after another mmap+set label: 209424, after munmap: 4624
/usr/bin/ld: warning: Cannot export local symbol 'dfsan_flush'
RSS at start: 4620, after mmap: 107020, after mmap+set label: 317992, after fixed map: 10792, after another mmap+set label: 317992, after munmap: 10792
release_shadow_space.c.tmp: /b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/test/dfsan/release_shadow_space.c:91: int main(int, char **): Assertion `after_fixed_mmap <= before + delta' failed.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D111522
2021-10-11 00:35:12 -07:00
Andrew Browne 61ec2148c5 [DFSan] Remove -dfsan-args-abi support in favor of TLS.
ArgsABI was originally added in https://reviews.llvm.org/D965

Current benchmarking does not show a significant difference.
There is no need to maintain both ABIs.

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D111097
2021-10-08 11:18:36 -07:00
Andrew Browne c533b88a6d [DFSan] Add force_zero_label abilist option to DFSan. This can be used as a work-around for overtainting.
Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D109847
2021-09-17 12:57:40 -07:00
Andrew Browne 76777b216b [DFSan] Add wrapper for getentropy().
Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D108604
2021-08-24 15:10:13 -07:00
George Balatsouras 228bea6a36 Revert D106195 "[dfsan] Add wrappers for v*printf functions"
This reverts commit bf281f3647.

This commit causes dfsan to segfault.
2021-07-24 08:53:48 +00:00
George Balatsouras bf281f3647 [dfsan] Add wrappers for v*printf functions
Functions `vsnprintf`, `vsprintf` and `vfprintf` commonly occur in DFSan warnings.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D106195
2021-07-22 15:39:17 -07:00
Jianzhou Zhao a806f933a2 [dfsan] Make warn_unimplemented off by default
Because almost all internal use cases need to turn warn_unimplemented off.
2021-07-22 21:45:41 +00:00
Jianzhou Zhao ae6648cee0 [dfsan] Expose dfsan_get_track_origins to get origin tracking status
This allows application code checks if origin tracking is on before
printing out traces.

-dfsan-track-origins can be 0,1,2.
The current code only distinguishes 1 and 2 in compile time, but not at runtime.
Made runtime distinguish 1 and 2 too.

Reviewed By: browneee

Differential Revision: https://reviews.llvm.org/D105128
2021-06-29 20:32:39 +00:00
Andrew Browne 45f6d5522f [DFSan] Change shadow and origin memory layouts to match MSan.
Previously on x86_64:

  +--------------------+ 0x800000000000 (top of memory)
  | application memory |
  +--------------------+ 0x700000008000 (kAppAddr)
  |                    |
  |       unused       |
  |                    |
  +--------------------+ 0x300000000000 (kUnusedAddr)
  |       origin       |
  +--------------------+ 0x200000008000 (kOriginAddr)
  |       unused       |
  +--------------------+ 0x200000000000
  |   shadow memory    |
  +--------------------+ 0x100000008000 (kShadowAddr)
  |       unused       |
  +--------------------+ 0x000000010000
  | reserved by kernel |
  +--------------------+ 0x000000000000

  MEM_TO_SHADOW(mem) = mem & ~0x600000000000
  SHADOW_TO_ORIGIN(shadow) = kOriginAddr - kShadowAddr + shadow

Now for x86_64:

  +--------------------+ 0x800000000000 (top of memory)
  |    application 3   |
  +--------------------+ 0x700000000000
  |      invalid       |
  +--------------------+ 0x610000000000
  |      origin 1      |
  +--------------------+ 0x600000000000
  |    application 2   |
  +--------------------+ 0x510000000000
  |      shadow 1      |
  +--------------------+ 0x500000000000
  |      invalid       |
  +--------------------+ 0x400000000000
  |      origin 3      |
  +--------------------+ 0x300000000000
  |      shadow 3      |
  +--------------------+ 0x200000000000
  |      origin 2      |
  +--------------------+ 0x110000000000
  |      invalid       |
  +--------------------+ 0x100000000000
  |      shadow 2      |
  +--------------------+ 0x010000000000
  |    application 1   |
  +--------------------+ 0x000000000000

  MEM_TO_SHADOW(mem) = mem ^ 0x500000000000
  SHADOW_TO_ORIGIN(shadow) = shadow + 0x100000000000

Reviewed By: stephan.yichao.zhao, gbalats

Differential Revision: https://reviews.llvm.org/D104896
2021-06-25 17:00:38 -07:00
George Balatsouras c6b5a25eeb [dfsan] Replace dfs$ prefix with .dfsan suffix
The current naming scheme adds the `dfs$` prefix to all
DFSan-instrumented functions.  This breaks mangling and prevents stack
trace printers and other tools from automatically demangling function
names.

This new naming scheme is mangling-compatible, with the `.dfsan`
suffix being a vendor-specific suffix:
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling-structure

With this fix, demangling utils would work out-of-the-box.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104494
2021-06-17 22:42:47 -07:00
George Balatsouras 98504959a6 [dfsan] Add stack-trace printing functions to dfsan interface
Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D104165
2021-06-14 14:09:00 -07:00
George Balatsouras 5b4dda550e [dfsan] Add full fast8 support
Complete support for fast8:
- amend shadow size and mapping in runtime
- remove fast16 mode and -dfsan-fast-16-labels flag
- remove legacy mode and make fast8 mode the default
- remove dfsan-fast-8-labels flag
- remove functions in dfsan interface only applicable to legacy
- remove legacy-related instrumentation code and tests
- update documentation.

Reviewed By: stephan.yichao.zhao, browneee

Differential Revision: https://reviews.llvm.org/D103745
2021-06-07 17:20:54 -07:00
Jianzhou Zhao a82747fafe [dfsan] Fix internal build errors because of more strict warning checks 2021-06-07 16:55:56 +00:00
Jianzhou Zhao 2c82588dac [dfsan] Use the sanitizer allocator to reduce memory cost
dfsan does not use sanitizer allocator as others. In practice,
we let it use glibc's allocator since tcmalloc needs more work
to be working with dfsan well. With glibc, we observe large
memory leakage. This could relate to two things:

1) glibc allocator has limitation: for example, tcmalloc can reduce memory footprint 2x easily

2) glibc may call unmmap directly as an internal system call by using system call number. so DFSan has no way to release shadow spaces for those unmmap.

Using sanitizer allocator addresses the above issues
1) its memory management is close to tcmalloc

2) we can register callback when sanitizer allocator calls unmmap, so dfsan can release shadow spaces correctly.

Our experiment with internal server-based application proved that with the change, in a-few-day run, memory usage leakage is close to what tcmalloc does w/o dfsan.

This change mainly follows MSan's code.

1) define allocator callbacks at dfsan_allocator.h|cpp

2) mark allocator APIs to be discard

3) intercept allocator APIs

4) make dfsan_set_label consistent with MSan's SetShadow when setting 0 labels, define dfsan_release_meta_memory when unmap is called

5) add flags about whether zeroing memory after malloc/free. dfsan works at byte-level, so bit-level oparations can cause reading undefined shadow. See D96842. zeroing memory after malloc helps this. About zeroing after free, reading after free is definitely UB, but if user code does so, it is hard to debug an overtainting caused by this w/o running MSan. So we add the flag to help debugging.

This change will be split to small changes for review. Before that, a question is
"this code shares a lot of with MSan, for example, dfsan_allocator.* and dfsan_new_delete.*.
Does it make sense to unify the code at sanitizer_common? will that introduce some
maintenance issue?"

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101204
2021-06-06 22:09:31 +00:00
Jianzhou Zhao fc1d39849e [dfsan] Add a flag about whether to propagate offset labels at gep
DFSan has flags to control flows between pointers and objects referred
by pointers. For example,

a = *p;
L(a) = L(*p)        when -dfsan-combine-pointer-labels-on-load = false
L(a) = L(*p) + L(p) when -dfsan-combine-pointer-labels-on-load = true

*p = b;
L(*p) = L(b)        when -dfsan-combine-pointer-labels-on-store = false
L(*p) = L(b) + L(p) when -dfsan-combine-pointer-labels-on-store = true
The question is what to do with p += c.

In practice we found many confusing flows if we propagate labels from c
to p. So a new flag works like this

p += c;
L(p) = L(p)        when -dfsan-propagate-via-pointer-arithmetic = false
L(p) = L(p) + L(c) when -dfsan-propagate-via-pointer-arithmetic = true

Reviewed-by: gbalats

Differential Revision: https://reviews.llvm.org/D103176
2021-05-28 00:06:19 +00:00
George Balatsouras a11cb10a36 [dfsan] Add function that prints origin stack trace to buffer
Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D102451
2021-05-24 11:09:03 -07:00
Jianzhou Zhao 87a6325fbe [dfsan] Rename and fix an internal test issue for mmap+calloc
The linker suggests using -Wl,-z,notext.

Replaced assert by exit also fixed this.

After renaming, interceptor.c would be used to test interceptors in general by D101204.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101649
2021-05-07 00:57:21 +00:00
Jianzhou Zhao f3e3a1d79e [dfsan] extend a test case to measure origin memory usage
This is to support D101204.

Reviewed By: gbalats

Differential Revision: https://reviews.llvm.org/D101877
2021-05-06 00:19:44 +00:00
Jianzhou Zhao 79debe8d7b [dfsan] Turn off all dfsan test cases on non x86_64 OSs
https://reviews.llvm.org/D101666 enables sanitizer allocator.
This broke all test cases on non x86-64.
2021-05-05 05:30:53 +00:00
Nico Weber d7ec48d71b [clang] accept -fsanitize-ignorelist= in addition to -fsanitize-blacklist=
Use that for internal names (including the default ignorelists of the
sanitizers).

Differential Revision: https://reviews.llvm.org/D101832
2021-05-04 10:24:00 -04:00
Jianzhou Zhao 7fdf270965 [dfsan] Track origin at loads
The first version of origin tracking tracks only memory stores. Although
    this is sufficient for understanding correct flows, it is hard to figure
    out where an undefined value is read from. To find reading undefined values,
    we still have to do a reverse binary search from the last store in the chain
    with printing and logging at possible code paths. This is
    quite inefficient.

    Tracking memory load instructions can help this case. The main issues of
    tracking loads are performance and code size overheads.

    With tracking only stores, the code size overhead is 38%,
    memory overhead is 1x, and cpu overhead is 3x. In practice #load is much
    larger than #store, so both code size and cpu overhead increases. The
    first blocker is code size overhead: link fails if we inline tracking
    loads. The workaround is using external function calls to propagate
    metadata. This is also the workaround ASan uses. The cpu overhead
    is ~10x. This is a trade off between debuggability and performance,
    and will be used only when debugging cases that tracking only stores
    is not enough.

Reviewed By: gbalats

Differential Revision: https://reviews.llvm.org/D100967
2021-04-22 16:25:24 +00:00
George Balatsouras 98b114d480 [dfsan] Remove hard-coded constant in release_shadow_space.c
Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D100608
2021-04-15 17:24:35 -07:00
George Balatsouras b2b59f622e [dfsan] Add test for origin tracking stack traces
Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D100518
2021-04-15 16:22:47 -07:00
Jianzhou Zhao af9f461298 [dfsan] test flush on only x86 2021-03-25 02:45:43 +00:00
Jianzhou Zhao f9a135b652 [dfsan] Test dfsan_flush with origins
This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D99295
2021-03-25 00:12:53 +00:00
Jianzhou Zhao 4950695eba [dfsan] Add Origin ABI Wrappers
Supported ctime_r, fgets, getcwd, get_current_dir_name, gethostname,
getrlimit, getrusage, strcpy, time, inet_pton, localtime_r,
getpwuid_r, epoll_wait, poll, select, sched_getaffinity

Most of them work as calling their non-origin verision directly.

This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D98966
2021-03-24 18:22:03 +00:00
Jianzhou Zhao 91516925dd [dfsan] Add Origin ABI Wrappers
Supported strrchr, strrstr, strto*, recvmmsg, recrmsg, nanosleep,
    memchr, snprintf, socketpair, sprintf, getocketname, getsocketopt,
    gettimeofday, getpeername.

    strcpy was added because the test of sprintf need it. It will be
    committed by D98966. Please ignore it when reviewing.

    This is a part of https://reviews.llvm.org/D95835.

    Reviewed By: gbalats

    Differential Revision: https://reviews.llvm.org/D99109
2021-03-24 16:13:09 +00:00
Jianzhou Zhao 1fe042041c [dfsan] Add origin ABI wrappers
supported: dl_get_tls_static_info, calloc, clock_gettime,
dfsan_set_write_callback, dl_iterato_phdr, dlopen, memcpy,
memmove, memset, pread, read, strcat, strdup, strncpy

This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D98790
2021-03-19 16:23:25 +00:00
Jianzhou Zhao ec5ed66cee [dfsan] Add origin ABI wrappers
supported: bcmp, fstat, memcmp, stat, strcasecmp, strchr, strcmp,
strncasecmp, strncp, strpbrk

This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D98636
2021-03-17 02:22:35 +00:00
Jianzhou Zhao 4e67ae7b6b [dfsan] Add origin ABI wrappers for thread/signal/fork
This is a part of https://reviews.llvm.org/D95835.

See bb91e02efd about the similar issue of fork in MSan's origin tracking.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D98359
2021-03-15 16:18:00 +00:00
Jianzhou Zhao 37520a0b2b [dfsan] Disable testing origin tracking on non x86_64 arch
Fix test cases related to https://reviews.llvm.org/D95835.
2021-03-11 21:22:43 +00:00
Jianzhou Zhao 6a9a686ce7 [dfsan] Tracking origins at phi nodes
This is a part of https://reviews.llvm.org/D95835.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D98268
2021-03-10 17:02:58 +00:00
Jianzhou Zhao 8506fe5b41 [dfsan] Tracking origins at memory transfer
This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D98192
2021-03-09 22:15:07 +00:00
Jianzhou Zhao 469d5462fa [dfsan] Re-enable origin tracking test cases 2021-03-06 02:41:56 +00:00
Jianzhou Zhao d02e0ba070 [dfsan] Disable origin test cases temporarily 2021-03-06 01:12:54 +00:00
Jianzhou Zhao c20db7ea6a [dfsan] Add utils to get and print origin paths and some test cases
This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse, gbalats

Differential Revision: https://reviews.llvm.org/D97962
2021-03-06 00:11:35 +00:00
Jianzhou Zhao c5c316f6d9 [dfsan] Do not test origin-tracking in atomic.cpp
This would cause linking errors after https://reviews.llvm.org/D97483
that introduced new prefixes for ABI wrappers with origin tracking mode.
We will renable this after the full origin tracking is checked in.
2021-02-26 19:44:18 +00:00
Jianzhou Zhao c88fedef2a [dfsan] Conservative solution to atomic load/store
DFSan at store does store shadow data; store app data; and at load does
load shadow data; load app data.

When an application data is atomic, one overtainting case is

thread A: load shadow
thread B: store shadow
thread B: store app
thread A: load app

If the application address had been used by other flows, thread A reads
previous shadow, causing overtainting.

The change is similar to MSan's solution.
1) enforce ordering of app load/store
2) load shadow after load app; store shadow before shadow app
3) do not track atomic store by reseting its shadow to be 0.
The last one is to address a case like this.

Thread A: load app
Thread B: store shadow
Thread A: load shadow
Thread B: store app

This approach eliminates overtainting as a trade-off between undertainting
flows via shadow data race.

Note that this change addresses only native atomic instructions, but
does not support builtin libcalls yet.
   https://llvm.org/docs/Atomics.html#libcalls-atomic

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D97310
2021-02-25 23:34:58 +00:00
Jianzhou Zhao 0f3fd3b281 [dfsan] Add thread registration
This is a part of https://reviews.llvm.org/D95835.

This change is to address two problems
1) When recording stacks in origin tracking, libunwind is not async signal safe. Inside signal callbacks, we need
to use fast unwind. Fast unwind needs threads
2) StackDepot used by origin tracking is not async signal safe, we set a flag per thread inside
a signal callback to prevent from using it.

The thread registration is similar to ASan and MSan.

Related MSan changes are
* 98f5ea0dba
* f653cda269
* 5a7c364343

Some changes in the diff are used in the next diffs
1) The test case pthread.c is not very interesting for now. It will be
  extended to test origin tracking later.
2) DFsanThread::InSignalHandler will be used by origin tracking later.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D95963
2021-02-05 17:38:59 +00:00
Jianzhou Zhao 15f26c5f51 [dfsan] Wrap strcat
Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D95923
2021-02-03 18:50:29 +00:00
Jianzhou Zhao eb5c0a90e7 [dfsan] Test IGN and DFL for sigaction
Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D95957
2021-02-03 18:46:49 +00:00
Jianzhou Zhao 93afc3452c [dfsan] Clean TLS after signal callbacks
Similar to https://reviews.llvm.org/D95642, this diff fixes signal.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D95896
2021-02-03 17:21:28 +00:00
Jianzhou Zhao 3f568e1fbb [dfsan] Wrap memmove
Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D95883
2021-02-03 05:15:56 +00:00