llvm-project

Commit Graph

Author	SHA1	Message	Date
Jianzhou Zhao	af9f461298	[dfsan] test flush on only x86	2021-03-25 02:45:43 +00:00
Jianzhou Zhao	f9a135b652	[dfsan] Test dfsan_flush with origins This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D99295	2021-03-25 00:12:53 +00:00
Jianzhou Zhao	4950695eba	[dfsan] Add Origin ABI Wrappers Supported ctime_r, fgets, getcwd, get_current_dir_name, gethostname, getrlimit, getrusage, strcpy, time, inet_pton, localtime_r, getpwuid_r, epoll_wait, poll, select, sched_getaffinity Most of them work by calling their non-origin version directly. This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D98966	2021-03-24 18:22:03 +00:00
Jianzhou Zhao	91516925dd	[dfsan] Add Origin ABI Wrappers Supported strrchr, strstr, strto*, recvmmsg, recvmsg, nanosleep, memchr, snprintf, socketpair, sprintf, getsockname, getsockopt, gettimeofday, getpeername. strcpy was added because the test of sprintf needs it. It will be committed by D98966. Please ignore it when reviewing. This is a part of https://reviews.llvm.org/D95835. Reviewed By: gbalats Differential Revision: https://reviews.llvm.org/D99109	2021-03-24 16:13:09 +00:00
Jianzhou Zhao	1fe042041c	[dfsan] Add origin ABI wrappers supported: dl_get_tls_static_info, calloc, clock_gettime, dfsan_set_write_callback, dl_iterate_phdr, dlopen, memcpy, memmove, memset, pread, read, strcat, strdup, strncpy This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D98790	2021-03-19 16:23:25 +00:00
Jianzhou Zhao	ec5ed66cee	[dfsan] Add origin ABI wrappers supported: bcmp, fstat, memcmp, stat, strcasecmp, strchr, strcmp, strncasecmp, strncmp, strpbrk This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D98636	2021-03-17 02:22:35 +00:00
Jianzhou Zhao	4e67ae7b6b	[dfsan] Add origin ABI wrappers for thread/signal/fork This is a part of https://reviews.llvm.org/D95835. See `bb91e02efd` about the similar issue of fork in MSan's origin tracking. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D98359	2021-03-15 16:18:00 +00:00
Jianzhou Zhao	37520a0b2b	[dfsan] Disable testing origin tracking on non x86_64 arch Fix test cases related to https://reviews.llvm.org/D95835.	2021-03-11 21:22:43 +00:00
Jianzhou Zhao	6a9a686ce7	[dfsan] Tracking origins at phi nodes This is a part of https://reviews.llvm.org/D95835. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D98268	2021-03-10 17:02:58 +00:00
Jianzhou Zhao	8506fe5b41	[dfsan] Tracking origins at memory transfer This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D98192	2021-03-09 22:15:07 +00:00
Jianzhou Zhao	469d5462fa	[dfsan] Re-enable origin tracking test cases	2021-03-06 02:41:56 +00:00
Jianzhou Zhao	d02e0ba070	[dfsan] Disable origin test cases temporarily	2021-03-06 01:12:54 +00:00
Jianzhou Zhao	c20db7ea6a	[dfsan] Add utils to get and print origin paths and some test cases This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse, gbalats Differential Revision: https://reviews.llvm.org/D97962	2021-03-06 00:11:35 +00:00
Jianzhou Zhao	c5c316f6d9	[dfsan] Do not test origin-tracking in atomic.cpp This would cause linking errors after https://reviews.llvm.org/D97483, which introduced new prefixes for ABI wrappers with origin tracking mode. We will re-enable this after the full origin tracking is checked in.	2021-02-26 19:44:18 +00:00
Jianzhou Zhao	c88fedef2a	[dfsan] Conservative solution to atomic load/store DFSan at store does store shadow data; store app data; and at load does load shadow data; load app data. When application data is atomic, one overtainting case is thread A: load shadow thread B: store shadow thread B: store app thread A: load app If the application address had been used by other flows, thread A reads the previous shadow, causing overtainting. The change is similar to MSan's solution. 1) enforce ordering of app load/store 2) load shadow after load app; store shadow before store app 3) do not track atomic store, by resetting its shadow to 0. The last one is to address a case like this. Thread A: load app Thread B: store shadow Thread A: load shadow Thread B: store app This approach eliminates overtainting at the cost of undertainting flows via shadow data race. Note that this change addresses only native atomic instructions, but does not support builtin libcalls yet. https://llvm.org/docs/Atomics.html#libcalls-atomic Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D97310	2021-02-25 23:34:58 +00:00
Jianzhou Zhao	0f3fd3b281	[dfsan] Add thread registration This is a part of https://reviews.llvm.org/D95835. This change is to address two problems 1) When recording stacks in origin tracking, libunwind is not async signal safe. Inside signal callbacks, we need to use fast unwind. Fast unwind needs threads. 2) StackDepot used by origin tracking is not async signal safe, so we set a flag per thread inside a signal callback to prevent using it. The thread registration is similar to ASan and MSan. Related MSan changes are * `98f5ea0dba` * `f653cda269` * `5a7c364343` Some changes in the diff are used in the next diffs 1) The test case pthread.c is not very interesting for now. It will be extended to test origin tracking later. 2) DFsanThread::InSignalHandler will be used by origin tracking later. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95963	2021-02-05 17:38:59 +00:00
Jianzhou Zhao	15f26c5f51	[dfsan] Wrap strcat Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95923	2021-02-03 18:50:29 +00:00
Jianzhou Zhao	eb5c0a90e7	[dfsan] Test IGN and DFL for sigaction Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95957	2021-02-03 18:46:49 +00:00
Jianzhou Zhao	93afc3452c	[dfsan] Clean TLS after signal callbacks Similar to https://reviews.llvm.org/D95642, this diff fixes signal. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95896	2021-02-03 17:21:28 +00:00
Jianzhou Zhao	3f568e1fbb	[dfsan] Wrap memmove Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95883	2021-02-03 05:15:56 +00:00
Jianzhou Zhao	e1a4322f81	[dfsan] Clean TLS after sigaction callbacks DFSan uses TLS to pass metadata of arguments and return values. When an instrumented function accesses the TLS, if a signal callback happens, and the callback calls other instrumented functions that update the same TLS, the TLS is in an inconsistent state after the callback ends. This may cause either under-tainting or over-tainting. This fix follows MSan's workaround. `cb22c67a21` It simply resets TLS at restore. This prevents over-tainting. Although under-tainting may still happen, a taint flow can be found eventually if we run a DFSan-instrumented program multiple times. The alternative option is saving the entire TLS. However the TLS storage takes 2k bytes, and signal calls could be nested. So it does not seem worth it. This diff fixes sigaction. A following diff will be fixing signal. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95642	2021-02-02 22:07:17 +00:00
Matt Morehouse	7bc7501ac1	[DFSan] Add custom wrapper for recvmmsg. Uses the recvmsg wrapper logic in a loop. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93059	2020-12-11 06:24:56 -08:00
Matt Morehouse	5ff35356f1	[DFSan] Appease the custom wrapper lint script.	2020-12-10 14:12:26 -08:00
Matt Morehouse	009931644a	[DFSan] Add custom wrapper for pthread_join. The wrapper clears shadow for retval. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93047	2020-12-10 13:41:24 -08:00
Matt Morehouse	fa4bd4b338	[DFSan] Add custom wrapper for getpeername. The wrapper clears shadow for addr and addrlen when written to. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93046	2020-12-10 12:26:06 -08:00
Matt Morehouse	72fd47b93d	[DFSan] Add custom wrapper for _dl_get_tls_static_info. Implementation is here: https://code.woboq.org/userspace/glibc/elf/dl-tls.c.html#307 We use weak symbols to avoid linking issues with glibcs older than 2.27. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93053	2020-12-10 11:03:28 -08:00
Matt Morehouse	bdaeb82a5f	[DFSan] Add custom wrapper for sigaltstack. The wrapper clears shadow for old_ss. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93041	2020-12-10 10:16:36 -08:00
Matt Morehouse	8a874a4277	[DFSan] Add custom wrapper for getsockname. The wrapper clears shadow for any bytes written to addr or addrlen. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92964	2020-12-10 08:13:05 -08:00
Matt Morehouse	4eedc2e3af	[DFSan] Add custom wrapper for getsockopt. The wrapper clears shadow for optval and optlen when written. Reviewed By: stephan.yichao.zhao, vitalybuka Differential Revision: https://reviews.llvm.org/D92961	2020-12-09 14:29:38 -08:00
Matt Morehouse	a3eb2fb247	[DFSan] Add custom wrapper for recvmsg. The wrapper clears shadow for anything written by recvmsg. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92949	2020-12-09 13:07:51 -08:00
Jianzhou Zhao	ea981165a4	[dfsan] Track field/index-level shadow values in variables ************* * The problem ************* See motivation examples in compiler-rt/test/dfsan/pair.cpp. The current DFSan always uses a 16bit shadow value for a variable with any type by combining all shadow values of all bytes of the variable. So it cannot distinguish two fields of a struct: each field's shadow value equals the combined shadow value of all fields. This introduces an overtaint issue. Consider a parsing function std::pair<char*, int> get_token(char* p); where p points to a buffer to parse, the returned pair includes the next token and the pointer to the position in the buffer after the token. If the token is tainted, then both the returned pointer and int are tainted. If the parser keeps on using get_token for the rest of the parsing, all the following outputs are tainted because of the tainted pointer. The CL is the first change to address the issue. ************************** * The proposed improvement ************************ Eventually all fields and indices have their own shadow values in variables and memory. For example, variables with type {i1, i3}, [2 x i1], {[2 x i4], i8}, [2 x {i1, i1}] have shadow values with type {i16, i16}, [2 x i16], {[2 x i16], i16}, [2 x {i16, i16}] correspondingly; variables with primary type still have shadow values i16. ************************* * A potential implementation plan ************************* The idea is to adopt the change incrementally. 1) This CL Support field-level accuracy at variables/args/ret in TLS mode, load/store/alloca still use combined shadow values. After the alloca promotion and SSA construction phases (>=-O1), we assume alloca and memory operations are reduced. So if struct variables do not relate to memory, their tracking is accurate at field level. 2) Support field-level accuracy at alloca 3) Support field-level accuracy at load/store These two should make O0 and real memory access work. 
4) Support vector if necessary. 5) Support Args mode if necessary. 6) Support passing more accurate shadow values via custom functions if necessary. ************* * About this CL. *************** The CL did the following 1) extended TLS arg/ret to work with aggregate types. This is similar to what MSan does. 2) implemented how to map between an original type/value/zero-const to its shadow type/value/zero-const. 3) extended (insert|extract)value to use field/index-level propagation. 4) for other instructions, propagation rules are combining inputs by or. The CL converts between aggregate and primary shadow values at the cases. 5) Custom function interfaces also need such a conversion because all existing custom functions use i16. It is unclear whether custom functions need more accurate shadow propagation yet. 6) Added test cases for aggregate type related cases. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D92261	2020-12-09 19:38:35 +00:00
Matt Morehouse	6f13445fb6	[DFSan] Add custom wrapper for epoll_wait. The wrapper clears shadow for any events written. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92891	2020-12-09 06:05:29 -08:00
Jianzhou Zhao	6fa06628a7	[dfsan] Add test cases for struct/pair This is a child diff of D92261. This locks down the behavior before the change.	2020-12-02 21:25:23 +00:00
Adhemerval Zanella	f93c2b64ed	[sanitizer] Disable ASLR for release_shadow_space On aarch64 with kernel 4.12.13 the test sporadically fails with RSS at start: 1564, after mmap: 103964, after mmap+set label: 308768, \ after fixed map: 206368, after another mmap+set label: 308768, after \ munmap: 206368 release_shadow_space.c.tmp: [...]/release_shadow_space.c:80: int \ main(int, char **): Assertion `after_fixed_mmap <= before + delta' failed. It seems on some executions the memory is not fully released, even after munmap. And it also seems that ASLR is hurting it by adding some fragmentation, by disabling it I could not reproduce the issue in multiple runs.	2020-10-29 16:09:03 -03:00
Jianzhou Zhao	91dc545bf2	Set Huge Page mode on shadow regions based on no_huge_pages_for_shadow It turned out that in dynamic shared library mode, the memory access pattern can increase memory footprint significantly on OS when transparent hugepages (THP) are enabled. This could cause >70x memory overhead compared with running a statically linked binary. For example, a static binary with RSS overhead 300M can use > 23G RSS if it is built dynamically. /proc/../smaps shows in 6204552 kB RSS 6141952 kB relates to AnonHugePages. Also such a high RSS happens in some rate: around 25% runs may use > 23G RSS, the rest uses in between 6-23G. I guess this may relate to how user memory is allocated and distributed across huge pages. THP is a trade-off between time and space. We have a flag no_huge_pages_for_shadow for sanitizer. It is true by default but DFSan did not follow this. Depending on if a target is built statically or dynamically, maybe Clang can set no_huge_pages_for_shadow accordingly after this change. But it still seems fine to follow the default setting of no_huge_pages_for_shadow. If time is an issue, and users are fine with high RSS, this flag can be set to false selectively.	2020-10-20 16:50:59 +00:00
Jianzhou Zhao	4d1d8ae710	Replace shadow space zero-out by madvise at mmap After D88686, munmap uses MADV_DONTNEED to ensure zero-out before the next access. Because the entire shadow space is created by MAP_PRIVATE and MAP_ANONYMOUS, the first access is also on zero-filled values. So it is fine to not zero-out data, but use madvise(MADV_DONTNEED) at mmap. This reduces runtime overhead. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D88755	2020-10-06 21:29:50 +00:00
Jianzhou Zhao	88c9162c9d	Fix the test case in D88686 Adjusted when to check RSS.	2020-10-03 00:23:39 +00:00
Jianzhou Zhao	3847986fd2	Fix the test case from D88686 It seems that on one buildbot the RSS value after munmap is much higher than in a local run.	2020-10-02 22:59:55 +00:00
Jianzhou Zhao	045a620c45	Release the shadow memory used by the mmap range at munmap When an application does a lot of pairs of mmap and munmap, if we did not release shadow memory used by mmap addresses, this would increase memory usage. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D88686	2020-10-02 20:17:22 +00:00
Matt Morehouse	23bab1eb43	[DFSan] Add strpbrk wrapper. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D87849	2020-09-18 08:54:14 -07:00
Matt Morehouse	50dd545b00	[DFSan] Add bcmp wrapper. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D87801	2020-09-17 09:23:49 -07:00
Matt Morehouse	df017fd906	Revert "[DFSan] Add bcmp wrapper." This reverts commit `559f919812` due to bot failure.	2020-09-17 08:43:45 -07:00
Matt Morehouse	559f919812	[DFSan] Add bcmp wrapper. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D87801	2020-09-17 08:23:09 -07:00
Matt Morehouse	2df6efedef	[DFSan] Re-enable event_callbacks test. Mark the dest pointers for memcpy and memmove as volatile, to avoid dead store elimination. Fixes https://bugs.llvm.org/show_bug.cgi?id=47488.	2020-09-11 09:15:05 -07:00
Jeremy Morse	82390454f0	[DFSan] XFail a test that's suffering too much optimization See https://bugs.llvm.org/show_bug.cgi?id=47488 , rGfb109c42d9 is optimizing out part of this test.	2020-09-11 11:25:24 +01:00
Matt Morehouse	4deda57106	[DFSan] Handle mmap() calls before interceptors are installed. InitializeInterceptors() calls dlsym(), which calls calloc(). Depending on the allocator implementation, calloc() may invoke mmap(), which results in a segfault since REAL(mmap) is still being resolved. We fix this by doing a direct syscall if interceptors haven't been fully resolved yet. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D86168	2020-08-19 15:07:41 -07:00
Matt Morehouse	69721fc9d1	[DFSan] Support fast16labels mode in dfsan_union. While the instrumentation never calls dfsan_union in fast16labels mode, the custom wrappers do. We detect fast16labels mode by checking whether any labels have been created. If not, we must be using fast16labels mode. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D86012	2020-08-17 11:27:28 -07:00
Matt Morehouse	bb3a3da38d	[DFSan] Don't unmap during dfsan_flush(). Unmapping and remapping is dangerous since another thread could touch the shadow memory while it is unmapped. But there is really no need to unmap anyway, since mmap(MAP_FIXED) will happily clobber the existing mapping with zeroes. This is thread-safe since the mmap() is done under the same kernel lock as page faults are done. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D85947	2020-08-14 11:43:49 -07:00
Matt Morehouse	c1f9c1c13c	[DFSan] Fix parameters to strtoull wrapper. base and nptr_label were swapped, which meant we were passing nptr's shadow as the base to the operation. Usually, the shadow is 0, which causes strtoull to guess the correct base from the string prefix (e.g., 0x means base-16 and 0 means base-8), hiding this bug. Adjust the test case to expose the bug. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D85935	2020-08-14 08:02:30 -07:00
Matt Morehouse	e2d0b44a7c	[DFSan] Add efficient fast16labels instrumentation mode. Adds the -fast-16-labels flag, which enables efficient instrumentation for DFSan when the user needs <=16 labels. The instrumentation eliminates most branches and most calls to __dfsan_union or __dfsan_union_load. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D84371	2020-07-29 18:58:47 +00:00
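
The `c1f9c1c13c` entry above says a shadow value of 0 passed as the base hid the swapped-argument bug, because strtoull then guesses the base from the string prefix. A minimal sketch of that standard C behavior follows; `parse_ull` and `demo_base_guessing` are hypothetical names used only for illustration and are not part of DFSan.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not DFSan code): parse s in the given base. */
static unsigned long long parse_ull(const char *s, int base) {
    return strtoull(s, NULL, base);
}

/* Base 0 asks strtoull to infer the base from the string prefix. */
static void demo_base_guessing(void) {
    assert(parse_ull("0x1f", 0) == 31); /* "0x" prefix: base 16 */
    assert(parse_ull("017", 0) == 15);  /* leading "0": base 8 */
    assert(parse_ull("17", 0) == 17);   /* no prefix: base 10 */
    /* A nonzero value in the base slot changes the result, which is
       how a wrongly passed argument would become visible. */
    assert(parse_ull("17", 8) == 15);
}
```

Calling `demo_base_guessing()` from a test program exercises all four cases; with base 0 the three prefixed and unprefixed inputs parse as a caller would normally expect, so a mistaken 0 in the base slot goes unnoticed.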

93 Commits