Commit Graph

72 Commits

Author SHA1 Message Date
Matt Morehouse 7bc7501ac1 [DFSan] Add custom wrapper for recvmmsg.
Uses the recvmsg wrapper logic in a loop.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D93059
2020-12-11 06:24:56 -08:00
Matt Morehouse 5ff35356f1 [DFSan] Appease the custom wrapper lint script. 2020-12-10 14:12:26 -08:00
Matt Morehouse 009931644a [DFSan] Add custom wrapper for pthread_join.
The wrapper clears shadow for retval.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D93047
2020-12-10 13:41:24 -08:00
Matt Morehouse fa4bd4b338 [DFSan] Add custom wrapper for getpeername.
The wrapper clears shadow for addr and addrlen when written to.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D93046
2020-12-10 12:26:06 -08:00
Matt Morehouse 72fd47b93d [DFSan] Add custom wrapper for _dl_get_tls_static_info.
Implementation is here:
https://code.woboq.org/userspace/glibc/elf/dl-tls.c.html#307

We use weak symbols to avoid linking issues with glibcs older than 2.27.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D93053
2020-12-10 11:03:28 -08:00
Matt Morehouse bdaeb82a5f [DFSan] Add custom wrapper for sigaltstack.
The wrapper clears shadow for old_ss.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D93041
2020-12-10 10:16:36 -08:00
Matt Morehouse 8a874a4277 [DFSan] Add custom wrapper for getsockname.
The wrapper clears shadow for any bytes written to addr or addrlen.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D92964
2020-12-10 08:13:05 -08:00
Matt Morehouse 4eedc2e3af [DFSan] Add custom wrapper for getsockopt.
The wrapper clears shadow for optval and optlen when written.

Reviewed By: stephan.yichao.zhao, vitalybuka

Differential Revision: https://reviews.llvm.org/D92961
2020-12-09 14:29:38 -08:00
Matt Morehouse a3eb2fb247 [DFSan] Add custom wrapper for recvmsg.
The wrapper clears shadow for anything written by recvmsg.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D92949
2020-12-09 13:07:51 -08:00
Jianzhou Zhao ea981165a4 [dfsan] Track field/index-level shadow values in variables
*************
* The problem
*************
See motivation examples in compiler-rt/test/dfsan/pair.cpp. The current
DFSan always uses a 16bit shadow value for a variable with any type by
combining all shadow values of all bytes of the variable. So it cannot
distinguish two fields of a struct: each field's shadow value equals the
combined shadow value of all fields. This introduces an overtaint issue.

Consider a parsing function

   std::pair<char*, int> get_token(char* p);

where p points to a buffer to parse, the returned pair includes the next
token and the pointer to the position in the buffer after the token.

If the token is tainted, then both the returned pointer and int ar
tainted. If the parser keeps on using get_token for the rest parsing,
all the following outputs are tainted because of the tainted pointer.

The CL is the first change to address the issue.

**************************
* The proposed improvement
**************************
Eventually all fields and indices have their own shadow values in
variables and memory.

For example, variables with type {i1, i3}, [2 x i1], {[2 x i4], i8},
[2 x {i1, i1}] have shadow values with type {i16, i16}, [2 x i16],
{[2 x i16], i16}, [2 x {i16, i16}] correspondingly; variables with
primary type still have shadow values i16.

***************************
* An potential implementation plan
***************************

The idea is to adopt the change incrementially.

1) This CL
Support field-level accuracy at variables/args/ret in TLS mode,
load/store/alloca still use combined shadow values.

After the alloca promotion and SSA construction phases (>=-O1), we
assume alloca and memory operations are reduced. So if struct
variables do not relate to memory, their tracking is accurate at
field level.

2) Support field-level accuracy at alloca
3) Support field-level accuracy at load/store

These two should make O0 and real memory access work.

4) Support vector if necessary.
5) Support Args mode if necessary.
6) Support passing more accurate shadow values via custom functions if
necessary.

***************
* About this CL.
***************
The CL did the following

1) extended TLS arg/ret to work with aggregate types. This is similar
to what MSan does.

2) implemented how to map between an original type/value/zero-const to
its shadow type/value/zero-const.

3) extended (insert|extract)value to use field/index-level progagation.

4) for other instructions, propagation rules are combining inputs by or.
The CL converts between aggragate and primary shadow values at the
cases.

5) Custom function interfaces also need such a conversion because
all existing custom functions use i16. It is unclear whether custome
functions need more accurate shadow propagation yet.

6) Added test cases for aggregate type related cases.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D92261
2020-12-09 19:38:35 +00:00
Matt Morehouse 6f13445fb6 [DFSan] Add custom wrapper for epoll_wait.
The wrapper clears shadow for any events written.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D92891
2020-12-09 06:05:29 -08:00
Jianzhou Zhao 6fa06628a7 [dfsan] Add test cases for struct/pair
This is a child diff of D92261.

This locks down the behavior before the change.
2020-12-02 21:25:23 +00:00
Adhemerval Zanella f93c2b64ed [sanitizer] Disable ASLR for release_shadow_space
On aarch64 with kernel 4.12.13 the test sporadically fails with

RSS at start: 1564, after mmap: 103964, after mmap+set label: 308768, \
after fixed map: 206368, after another mmap+set label: 308768, after \
munmap: 206368
release_shadow_space.c.tmp: [...]/release_shadow_space.c:80: int \
main(int, char **): Assertion `after_fixed_mmap <= before + delta' failed.

It seems on some executions the memory is not fully released, even
after munmap.  And it also seems that ASLR is hurting it by adding
some fragmentation, by disabling it I could not reproduce the issue
in multiple runs.
2020-10-29 16:09:03 -03:00
Jianzhou Zhao 91dc545bf2 Set Huge Page mode on shadow regions based on no_huge_pages_for_shadow
It turned out that at dynamic shared library mode, the memory access
pattern can increase memory footprint significantly on OS when transparent
hugepages (THP) are enabled. This could cause >70x memory overhead than
running a static linked binary. For example, a static binary with RSS
overhead 300M can use > 23G RSS if it is built dynamically.
/proc/../smaps shows in 6204552 kB RSS 6141952 kB relates to
AnonHugePages.

Also such a high RSS happens in some rate: around 25% runs may use > 23G RSS, the
rest uses in between 6-23G. I guess this may relate to how user memory
is allocated and distributted across huge pages.

THP is a trade-off between time and space. We have a flag
no_huge_pages_for_shadow for sanitizer. It is true by default but DFSan
did not follow this. Depending on if a target is built statically or
dynamically, maybe Clang can set no_huge_pages_for_shadow accordingly
after this change. But it still seems fine to follow the default setting of
no_huge_pages_for_shadow. If time is an issue, and users are fine with
high RSS, this flag can be set to false selectively.
2020-10-20 16:50:59 +00:00
Jianzhou Zhao 4d1d8ae710 Replace shadow space zero-out by madvise at mmap
After D88686, munmap uses MADV_DONTNEED to ensure zero-out before the
next access. Because the entire shadow space is created by MAP_PRIVATE
and MAP_ANONYMOUS, the first access is also on zero-filled values.

So it is fine to not zero-out data, but use madvise(MADV_DONTNEED) at
mmap. This reduces runtime
overhead.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D88755
2020-10-06 21:29:50 +00:00
Jianzhou Zhao 88c9162c9d Fix the test case in D88686
Adjusted when to check RSS.
2020-10-03 00:23:39 +00:00
Jianzhou Zhao 3847986fd2 Fix the test case from D88686
It seems that one buildnot RSS value is much higher after munmap than
local run.
2020-10-02 22:59:55 +00:00
Jianzhou Zhao 045a620c45 Release the shadow memory used by the mmap range at munmap
When an application does a lot of pairs of mmap and munmap, if we did
not release shadoe memory used by mmap addresses, this would increase
memory usage.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D88686
2020-10-02 20:17:22 +00:00
Matt Morehouse 23bab1eb43 [DFSan] Add strpbrk wrapper.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87849
2020-09-18 08:54:14 -07:00
Matt Morehouse 50dd545b00 [DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87801
2020-09-17 09:23:49 -07:00
Matt Morehouse df017fd906 Revert "[DFSan] Add bcmp wrapper."
This reverts commit 559f919812 due to bot
failure.
2020-09-17 08:43:45 -07:00
Matt Morehouse 559f919812 [DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87801
2020-09-17 08:23:09 -07:00
Matt Morehouse 2df6efedef [DFSan] Re-enable event_callbacks test.
Mark the dest pointers for memcpy and memmove as volatile, to avoid dead
store elimination.  Fixes https://bugs.llvm.org/show_bug.cgi?id=47488.
2020-09-11 09:15:05 -07:00
Jeremy Morse 82390454f0 [DFSan] XFail a test that's suffering too much optimization
See https://bugs.llvm.org/show_bug.cgi?id=47488 , rGfb109c42d9 is
optimizing out part of this test.
2020-09-11 11:25:24 +01:00
Matt Morehouse 4deda57106 [DFSan] Handle mmap() calls before interceptors are installed.
InitializeInterceptors() calls dlsym(), which calls calloc().  Depending
on the allocator implementation, calloc() may invoke mmap(), which
results in a segfault since REAL(mmap) is still being resolved.

We fix this by doing a direct syscall if interceptors haven't been fully
resolved yet.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D86168
2020-08-19 15:07:41 -07:00
Matt Morehouse 69721fc9d1 [DFSan] Support fast16labels mode in dfsan_union.
While the instrumentation never calls dfsan_union in fast16labels mode,
the custom wrappers do.  We detect fast16labels mode by checking whether
any labels have been created.  If not, we must be using fast16labels
mode.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D86012
2020-08-17 11:27:28 -07:00
Matt Morehouse bb3a3da38d [DFSan] Don't unmap during dfsan_flush().
Unmapping and remapping is dangerous since another thread could touch
the shadow memory while it is unmapped.  But there is really no need to
unmap anyway, since mmap(MAP_FIXED) will happily clobber the existing
mapping with zeroes.  This is thread-safe since the mmap() is done under
the same kernel lock as page faults are done.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D85947
2020-08-14 11:43:49 -07:00
Matt Morehouse c1f9c1c13c [DFSan] Fix parameters to strtoull wrapper.
base and nptr_label were swapped, which meant we were passing nptr's
shadow as the base to the operation.  Usually, the shadow is 0, which
causes strtoull to guess the correct base from the string prefix (e.g.,
0x means base-16 and 0 means base-8), hiding this bug.  Adjust the test
case to expose the bug.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D85935
2020-08-14 08:02:30 -07:00
Matt Morehouse e2d0b44a7c [DFSan] Add efficient fast16labels instrumentation mode.
Adds the -fast-16-labels flag, which enables efficient instrumentation
for DFSan when the user needs <=16 labels.  The instrumentation
eliminates most branches and most calls to __dfsan_union or
__dfsan_union_load.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D84371
2020-07-29 18:58:47 +00:00
Matt Morehouse c6f2142428 Reland "[DFSan] Handle fast16labels for all API functions."
Support fast16labels in `dfsan_has_label`, and print an error for all
other API functions.  For `dfsan_dump_labels` we return silently rather
than crashing since it is also called from the atexit handler where it
is undefined behavior to call exit() again.

Reviewed By: kcc

Differential Revision: https://reviews.llvm.org/D84215
2020-07-23 21:19:39 +00:00
Matt Morehouse df441c9015 Revert "[DFSan] Handle fast16labels for all API functions."
This reverts commit 19d9c0397e due to
buildbot failure.
2020-07-23 17:49:55 +00:00
Matt Morehouse 84980b1395 [DFSan] Print more debugging info on test failure. 2020-07-23 15:47:56 +00:00
Matt Morehouse 19d9c0397e [DFSan] Handle fast16labels for all API functions.
Summary:
Support fast16labels in `dfsan_has_label`, and print an error for all
other API functions.

Reviewers: kcc, vitalybuka, pcc

Reviewed By: kcc

Subscribers: jfb, llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D84215
2020-07-22 23:54:26 +00:00
Sam Kerner e5ce95c660 [dfsan] Fix a bug in strcasecmp() and strncasecmp(): Compare the lowercase versions of the characters when choosing a return value.
Summary:
Resolves this bug:

  https://bugs.llvm.org/show_bug.cgi?id=38369

Reviewers: morehouse, pcc

Reviewed By: morehouse

Subscribers: #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D78490
2020-04-20 17:13:40 -07:00
Sam Kerner 10070e31a5 Fix DataFlowSanitizer implementation of strchr() so that strchr(..., '\0') returns a pointer to '\0'.
Summary:

Fixes https://bugs.llvm.org/show_bug.cgi?id=22392

Reviewers: pcc, morehouse

Reviewed By: morehouse

Subscribers: morehouse, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D77996
2020-04-15 13:08:47 -07:00
Matt Morehouse 30bb737a75 [DFSan] Add __dfsan_cmp_callback.
Summary:
When -dfsan-event-callbacks is specified, insert a call to
__dfsan_cmp_callback on every CMP instruction.

Reviewers: vitalybuka, pcc, kcc

Reviewed By: kcc

Subscribers: hiraditya, #sanitizers, eugenis, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D75389
2020-02-28 15:49:44 -08:00
Matt Morehouse f668baa459 [DFSan] Add __dfsan_mem_transfer_callback.
Summary:
When -dfsan-event-callbacks is specified, insert a call to
__dfsan_mem_transfer_callback on every memcpy and memmove.

Reviewers: vitalybuka, kcc, pcc

Reviewed By: kcc

Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D75386
2020-02-28 15:48:25 -08:00
Matt Morehouse 52f889abec [DFSan] Add __dfsan_load_callback.
Summary:
When -dfsan-event-callbacks is specified, insert a call to
__dfsan_load_callback() on every load.

Reviewers: vitalybuka, pcc, kcc

Reviewed By: vitalybuka, kcc

Subscribers: hiraditya, #sanitizers, llvm-commits, eugenis, kcc

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D75363
2020-02-28 14:26:09 -08:00
Matt Morehouse 470db54cbd [DFSan] Add flag to insert event callbacks.
Summary:
For now just insert the callback for stores, similar to how MSan tracks
origins.  In the future we may want to add callbacks for loads, memcpy,
function calls, CMPs, etc.

Reviewers: pcc, vitalybuka, kcc, eugenis

Reviewed By: vitalybuka, kcc, eugenis

Subscribers: eugenis, hiraditya, #sanitizers, llvm-commits, kcc

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D75312
2020-02-27 17:14:19 -08:00
Nico Weber c4310f921d compiler-rt: Rename .cc file in test/dfsan to cpp
See r367849 et al.

llvm-svn: 367854
2019-08-05 13:19:28 +00:00
Reid Kleckner 8007ff1ab1 [compiler-rt] Rename lit.*.cfg.* -> lit.*.cfg.py.*
These lit configuration files are really Python source code. Using the
.py file extension helps editors and tools use the correct language
mode. LLVM and Clang already use this convention for lit configuration,
this change simply applies it to all of compiler-rt.

Reviewers: vitalybuka, dberris

Differential Revision: https://reviews.llvm.org/D63658

llvm-svn: 364591
2019-06-27 20:56:04 +00:00
Kostya Serebryany 6b936d88a4 [dfsan] Introduce dfsan_flush().
Summary:
dfsan_flush() allows to restart tain tracking from scratch in the same process.
The primary purpose right now is to allow more efficient data flow tracing
for DFT fuzzing: https://github.com/google/oss-fuzz/issues/1632

Reviewers: pcc

Reviewed By: pcc

Subscribers: delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D63037

llvm-svn: 363321
2019-06-13 20:11:06 +00:00
Kostya Serebryany 300c0c79de Experimantal dfsan mode "fast16labels=1"
Summary:
dfsan mode "fast16labels=1".
In this mode the labels are treated as 16-bit bit masks.

Reviewers: pcc

Reviewed By: pcc

Subscribers: delcypher, #sanitizers, llvm-commits

Tags: #llvm, #sanitizers

Differential Revision: https://reviews.llvm.org/D62870

llvm-svn: 362859
2019-06-08 00:22:23 +00:00
Kostya Serebryany d74d04a6c5 Add weak definitions of trace-cmp hooks to dfsan
Summary:
This allows to build and link the code with e.g.
-fsanitize=dataflow -fsanitize-coverage=trace-pc-guard,pc-table,func,trace-cmp
w/o providing (all) the definitions of trace-cmp hooks.

This is similar to dummy hooks provided by asan/ubsan/msan for the same purpose,
except that some of the hooks need to have the __dfsw_ prefix
since we need dfsan to replace them.

Reviewers: pcc

Reviewed By: pcc

Subscribers: delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D47605

llvm-svn: 333796
2018-06-01 21:59:25 +00:00
Simon Dardis a1b7447dfd [compiler-rt][dfsan][mips] UnXPASS a consistently passing test
llvm-svn: 329422
2018-04-06 17:03:36 +00:00
Simon Dardis f570c76c5c [mips] XFAIL dfsan's custom.cc test on mips64.
Test was already marked as failing for mips64el. Now that it's being
tested on mips64, it has to be XFAILed there as well.

llvm-svn: 302570
2017-05-09 19:17:16 +00:00
Renato Golin 3bdc0f165b [DFSAN] Another unstable test in AArch64 breaking bots unnecessarily
llvm-svn: 289253
2016-12-09 19:02:04 +00:00
Kuba Mracek ff1bd20ded [sanitizer] Add macOS minimum deployment target to all compiler invocations in lit tests
The Clang driver on macOS decides the deployment target based on various things, like your host OS version, the SDK version and some environment variables, which makes lit tests pass or fail based on your environment. Let's make sure we run all lit tests with `-mmacosx-version-min=${SANITIZER_MIN_OSX_VERSION}` (10.9 unless overriden).

Differential Revision: https://reviews.llvm.org/D26929

llvm-svn: 288186
2016-11-29 19:25:53 +00:00
Daniel Sanders 6a540c1f38 [mips] XFAIL the new mips64el compiler-rt tests that fail on clang-cmake-mipsel.
The mips64el compiler-rt build has recently been enabled. XFAIL the failing
tests to make the buildbot green again.

The two asan tests require the integrated assembler. This will be fixed soon
for Debian mips64el but not for any other mips64el targets since doing so
requires triple-related issues to be fixed..
The msan tests are largely failing because caused by a kernel update (a patch
has already been posted for this).
I'm not sure why the dfsan test fails yet.

llvm-svn: 278504
2016-08-12 11:56:36 +00:00
Etienne Bergeron ab42f4ddba [compiler-rt] Fix VisualStudio virtual folders layout
Summary:
This patch is a refactoring of the way cmake 'targets' are grouped.
It won't affect non-UI cmake-generators.

Clang/LLVM are using a structured way to group targets which ease
navigation through Visual Studio UI. The Compiler-RT projects
differ from the way Clang/LLVM are grouping targets.

This patch doesn't contain behavior changes.

Reviewers: kubabrecka, rnk

Subscribers: wang0109, llvm-commits, kubabrecka, chrisha

Differential Revision: http://reviews.llvm.org/D21952

llvm-svn: 275111
2016-07-11 21:51:56 +00:00