llvm-project

Commit Graph

Author	SHA1	Message	Date
Vitaly Buka	2fcd872d8a	[dfsan] Remove dfsan_get_origin from done_abilist.txt Followup for D95835	2021-03-05 17:59:39 -08:00
Jianzhou Zhao	c20db7ea6a	[dfsan] Add utils to get and print origin paths and some test cases This is a part of https://reviews.llvm.org/D95835. Reviewed By: morehouse, gbalats Differential Revision: https://reviews.llvm.org/D97962	2021-03-06 00:11:35 +00:00
Vitaly Buka	657a58a571	[dfsan,NFC] Suppress cpplint warning	2021-03-04 20:42:18 -08:00
Jianzhou Zhao	a47d435bc4	[dfsan] Propagate origins for callsites This is a part of https://reviews.llvm.org/D95835. Each customized function has two wrappers. The first one dfsw is for the normal shadow propagation. The second one dfso is used when origin tracking is on. It calls the first one, and does additional origin propagation. Which one to use can be decided at instrumentation time. This is to ensure minimal additional overhead when origin tracking is off. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D97483	2021-02-26 19:12:03 +00:00
Jianzhou Zhao	a05aa0dd5e	[dfsan] Update memset and dfsan_(set|add)_label with origin tracking This is a part of https://reviews.llvm.org/D95835. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D97302	2021-02-23 23:16:33 +00:00
Jianzhou Zhao	063a6fa87e	[dfsan] Add origin tls/move/read APIs This is a part of https://reviews.llvm.org/D95835. Added 1) TLS storage 2) a weak global used to set by instrumented code 3) move origins These APIs are similar to MSan's APIs https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/msan/msan_poisoning.cpp We first improved MSan's by https://reviews.llvm.org/D94572 and https://reviews.llvm.org/D94552. So the correctness has been verified by MSan. After the DFSan instrument code is ready, we will be adding more test cases 4) read To reduce origin tracking cost, some of the read APIs return only the origin from the first taint data. Note that we did not add origin set APIs here because they are related to code instrumentation, will be added later with IR transformation code. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D96564	2021-02-18 17:48:20 +00:00
Jianzhou Zhao	a7538fee3a	[dfsan] Comment out ChainOrigin temporarily It was added by D96160, will be used by D96564. Some OS got errors if it is not used. Comment it out for the time being.	2021-02-12 18:13:24 +00:00
Jianzhou Zhao	7590c0078d	[dfsan] Turn off THP at dfsan_flush https://reviews.llvm.org/D89662 turned this off at dfsan_init. dfsan_flush also needs to turn it off. W/o this a program may get more and more memory usage after hours. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D96569	2021-02-12 17:10:09 +00:00
Jianzhou Zhao	083d45b21c	[dfsan] Fix building OriginAddr at non-linux OS Fix the broken build by D96545	2021-02-12 05:02:14 +00:00
Jianzhou Zhao	5ebbc5802f	[dfsan] Introduce memory mapping for origin tracking Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D96545	2021-02-11 22:33:16 +00:00
Jianzhou Zhao	2d9c6e10e9	[dfsan] Add origin chain utils This is a part of https://reviews.llvm.org/D95835. The design is based on MSan origin chains. A 4-byte origin is a hash of an origin chain. An origin chain is a pair of a stack hash id and a hash to its previous origin chain. 0 means no previous origin chains exist. We limit the length of a chain to be 16. With origin_history_size = 0, the limit is removed. The change does not have any test cases yet. The following change will be adding test cases when the APIs are used. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D96160	2021-02-11 19:10:11 +00:00
Jianzhou Zhao	0f3fd3b281	[dfsan] Add thread registration This is a part of https://reviews.llvm.org/D95835. This change is to address two problems 1) When recording stacks in origin tracking, libunwind is not async signal safe. Inside signal callbacks, we need to use fast unwind. Fast unwind needs threads 2) StackDepot used by origin tracking is not async signal safe, we set a flag per thread inside a signal callback to prevent from using it. The thread registration is similar to ASan and MSan. Related MSan changes are * `98f5ea0dba` * `f653cda269` * `5a7c364343` Some changes in the diff are used in the next diffs 1) The test case pthread.c is not very interesting for now. It will be extended to test origin tracking later. 2) DFsanThread::InSignalHandler will be used by origin tracking later. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95963	2021-02-05 17:38:59 +00:00
Jianzhou Zhao	15f26c5f51	[dfsan] Wrap strcat Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95923	2021-02-03 18:50:29 +00:00
Jianzhou Zhao	93afc3452c	[dfsan] Clean TLS after signal callbacks Similar to https://reviews.llvm.org/D95642, this diff fixes signal. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95896	2021-02-03 17:21:28 +00:00
Jianzhou Zhao	3f568e1fbb	[dfsan] Wrap memmove Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95883	2021-02-03 05:15:56 +00:00
Jianzhou Zhao	e1a4322f81	[dfsan] Clean TLS after sigaction callbacks DFSan uses TLS to pass metadata of arguments and return values. When an instrumented function accesses the TLS, if a signal callback happens, and the callback calls other instrumented functions with updating the same TLS, the TLS is in an inconsistent state after the callback ends. This may cause either under-tainting or over-tainting. This fix follows MSan's workaround. `cb22c67a21` It simply resets TLS at restore. This prevents from over-tainting. Although under-tainting may still happen, a taint flow can be found eventually if we run a DFSan-instrumented program multiple times. The alternative option is saving the entire TLS. However the TLS storage takes 2k bytes, and signal calls could be nested. So it does not seem worth. This diff fixes sigaction. A following diff will be fixing signal. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D95642	2021-02-02 22:07:17 +00:00
Matt Morehouse	7bc7501ac1	[DFSan] Add custom wrapper for recvmmsg. Uses the recvmsg wrapper logic in a loop. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93059	2020-12-11 06:24:56 -08:00
Matt Morehouse	009931644a	[DFSan] Add custom wrapper for pthread_join. The wrapper clears shadow for retval. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93047	2020-12-10 13:41:24 -08:00
Matt Morehouse	fa4bd4b338	[DFSan] Add custom wrapper for getpeername. The wrapper clears shadow for addr and addrlen when written to. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93046	2020-12-10 12:26:06 -08:00
Matt Morehouse	72fd47b93d	[DFSan] Add custom wrapper for _dl_get_tls_static_info. Implementation is here: https://code.woboq.org/userspace/glibc/elf/dl-tls.c.html#307 We use weak symbols to avoid linking issues with glibcs older than 2.27. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93053	2020-12-10 11:03:28 -08:00
Matt Morehouse	bdaeb82a5f	[DFSan] Add custom wrapper for sigaltstack. The wrapper clears shadow for old_ss. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D93041	2020-12-10 10:16:36 -08:00
Matt Morehouse	8a874a4277	[DFSan] Add custom wrapper for getsockname. The wrapper clears shadow for any bytes written to addr or addrlen. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92964	2020-12-10 08:13:05 -08:00
Matt Morehouse	4eedc2e3af	[DFSan] Add custom wrapper for getsockopt. The wrapper clears shadow for optval and optlen when written. Reviewed By: stephan.yichao.zhao, vitalybuka Differential Revision: https://reviews.llvm.org/D92961	2020-12-09 14:29:38 -08:00
Matt Morehouse	a3eb2fb247	[DFSan] Add custom wrapper for recvmsg. The wrapper clears shadow for anything written by recvmsg. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92949	2020-12-09 13:07:51 -08:00
Matt Morehouse	6f13445fb6	[DFSan] Add custom wrapper for epoll_wait. The wrapper clears shadow for any events written. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92891	2020-12-09 06:05:29 -08:00
Matt Morehouse	483fb33360	[DFSan] Add pthread and other functions to ABI list. The non-pthread functions are all clear discard functions. Some of the pthread ones could clear shadow, but aren't worth writing custom wrappers for. I can't think of any reasonable scenario where we would pass tainted memory to these pthread functions. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92877	2020-12-08 13:55:35 -08:00
Matt Morehouse	3bd2ad5a08	[DFSan] Add several math functions to ABI list. These are all straightforward functional entries. Reviewed By: stephan.yichao.zhao Differential Revision: https://reviews.llvm.org/D92791	2020-12-08 10:51:05 -08:00
Jianzhou Zhao	80e326a8c4	[dfsan] Support passing non-i16 shadow values in TLS mode This is a child diff of D92261. It extended TLS arg/ret to work with aggregate types. For a function t foo(t1 a1, t2 a2, ... tn an) Its arguments shadow are saved in TLS args like a1_s, a2_s, ..., an_s TLS ret simply includes r_s. By calculating the type size of each shadow value, we can get their offset. This is similar to what MSan does. See __msan_retval_tls and __msan_param_tls from llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp. Note that this change does not add test cases for overflowed TLS arg/ret because this is hard to test w/o supporting aggregate shadow types. We will be adding them after supporting that. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D92440	2020-12-04 02:45:07 +00:00
Jianzhou Zhao	b4ac05d763	Replace the equivalent code by UnionTableAddr UnionTableAddr is always inlined. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/DD91758	2020-11-19 20:15:25 +00:00
Jianzhou Zhao	3597fba4e5	Add a simple stack trace printer for DFSan Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D91235	2020-11-11 19:00:59 +00:00
Jianzhou Zhao	91dc545bf2	Set Huge Page mode on shadow regions based on no_huge_pages_for_shadow It turned out that at dynamic shared library mode, the memory access pattern can increase memory footprint significantly on OS when transparent hugepages (THP) are enabled. This could cause >70x memory overhead than running a static linked binary. For example, a static binary with RSS overhead 300M can use > 23G RSS if it is built dynamically. /proc/../smaps shows in 6204552 kB RSS 6141952 kB relates to AnonHugePages. Also such a high RSS happens in some rate: around 25% runs may use > 23G RSS, the rest uses in between 6-23G. I guess this may relate to how user memory is allocated and distributed across huge pages. THP is a trade-off between time and space. We have a flag no_huge_pages_for_shadow for sanitizer. It is true by default but DFSan did not follow this. Depending on if a target is built statically or dynamically, maybe Clang can set no_huge_pages_for_shadow accordingly after this change. But it still seems fine to follow the default setting of no_huge_pages_for_shadow. If time is an issue, and users are fine with high RSS, this flag can be set to false selectively.	2020-10-20 16:50:59 +00:00
Jianzhou Zhao	cc07fbe37d	Release pages to OS when setting 0 label This is a follow up patch of https://reviews.llvm.org/D88755. When set 0 label for an address range, we can release pages within the corresponding shadow address range to OS, and set only addresses outside the pages to be 0. Reviewed-by: morehouse, eugenis Differential Revision: https://reviews.llvm.org/D89199	2020-10-20 16:22:11 +00:00
Jianzhou Zhao	4d1d8ae710	Replace shadow space zero-out by madvise at mmap After D88686, munmap uses MADV_DONTNEED to ensure zero-out before the next access. Because the entire shadow space is created by MAP_PRIVATE and MAP_ANONYMOUS, the first access is also on zero-filled values. So it is fine to not zero-out data, but use madvise(MADV_DONTNEED) at mmap. This reduces runtime overhead. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D88755	2020-10-06 21:29:50 +00:00
Jianzhou Zhao	045a620c45	Release the shadow memory used by the mmap range at munmap When an application does a lot of pairs of mmap and munmap, if we did not release shadow memory used by mmap addresses, this would increase memory usage. Reviewed-by: morehouse Differential Revision: https://reviews.llvm.org/D88686	2020-10-02 20:17:22 +00:00
Matt Morehouse	23bab1eb43	[DFSan] Add strpbrk wrapper. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D87849	2020-09-18 08:54:14 -07:00
Matt Morehouse	50dd545b00	[DFSan] Add bcmp wrapper. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D87801	2020-09-17 09:23:49 -07:00
Matt Morehouse	df017fd906	Revert "[DFSan] Add bcmp wrapper." This reverts commit `559f919812` due to bot failure.	2020-09-17 08:43:45 -07:00
Matt Morehouse	559f919812	[DFSan] Add bcmp wrapper. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D87801	2020-09-17 08:23:09 -07:00
Matt Morehouse	4deda57106	[DFSan] Handle mmap() calls before interceptors are installed. InitializeInterceptors() calls dlsym(), which calls calloc(). Depending on the allocator implementation, calloc() may invoke mmap(), which results in a segfault since REAL(mmap) is still being resolved. We fix this by doing a direct syscall if interceptors haven't been fully resolved yet. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D86168	2020-08-19 15:07:41 -07:00
Matt Morehouse	69721fc9d1	[DFSan] Support fast16labels mode in dfsan_union. While the instrumentation never calls dfsan_union in fast16labels mode, the custom wrappers do. We detect fast16labels mode by checking whether any labels have been created. If not, we must be using fast16labels mode. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D86012	2020-08-17 11:27:28 -07:00
Matt Morehouse	bb3a3da38d	[DFSan] Don't unmap during dfsan_flush(). Unmapping and remapping is dangerous since another thread could touch the shadow memory while it is unmapped. But there is really no need to unmap anyway, since mmap(MAP_FIXED) will happily clobber the existing mapping with zeroes. This is thread-safe since the mmap() is done under the same kernel lock as page faults are done. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D85947	2020-08-14 11:43:49 -07:00
Matt Morehouse	c1f9c1c13c	[DFSan] Fix parameters to strtoull wrapper. base and nptr_label were swapped, which meant we were passing nptr's shadow as the base to the operation. Usually, the shadow is 0, which causes strtoull to guess the correct base from the string prefix (e.g., 0x means base-16 and 0 means base-8), hiding this bug. Adjust the test case to expose the bug. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D85935	2020-08-14 08:02:30 -07:00
Matt Morehouse	005991a3fe	[DFSan] Remove dfsan_use_fast16labels from abilist. Its implementation was scrapped in the final fast16labels instrumentation patch.	2020-07-29 23:18:07 +00:00
Matt Morehouse	e2d0b44a7c	[DFSan] Add efficient fast16labels instrumentation mode. Adds the -fast-16-labels flag, which enables efficient instrumentation for DFSan when the user needs <=16 labels. The instrumentation eliminates most branches and most calls to __dfsan_union or __dfsan_union_load. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D84371	2020-07-29 18:58:47 +00:00
Matt Morehouse	c6f2142428	Reland "[DFSan] Handle fast16labels for all API functions." Support fast16labels in `dfsan_has_label`, and print an error for all other API functions. For `dfsan_dump_labels` we return silently rather than crashing since it is also called from the atexit handler where it is undefined behavior to call exit() again. Reviewed By: kcc Differential Revision: https://reviews.llvm.org/D84215	2020-07-23 21:19:39 +00:00
Matt Morehouse	df441c9015	Revert "[DFSan] Handle fast16labels for all API functions." This reverts commit `19d9c0397e` due to buildbot failure.	2020-07-23 17:49:55 +00:00
Matt Morehouse	19d9c0397e	[DFSan] Handle fast16labels for all API functions. Summary: Support fast16labels in `dfsan_has_label`, and print an error for all other API functions. Reviewers: kcc, vitalybuka, pcc Reviewed By: kcc Subscribers: jfb, llvm-commits, #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D84215	2020-07-22 23:54:26 +00:00
Vitaly Buka	d059d01c23	[dfsan] Remove realloc from done_abilist.txt Summary: Currently, realloc is marked as "discard" in done_abilist.txt. As discussed in PR#45583, this is probably not the expected behavior; a custom wrapper seems to be required. Since this wrapper has not been implemented yet, realloc should not be in the done_abilist.txt file so that a warning is displayed when it is called. Reviewers: kcc, pcc, vitalybuka Reviewed By: vitalybuka Subscribers: #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D78379	2020-05-05 22:32:45 -07:00
Sam Kerner	e5ce95c660	[dfsan] Fix a bug in strcasecmp() and strncasecmp(): Compare the lowercase versions of the characters when choosing a return value. Summary: Resolves this bug: https://bugs.llvm.org/show_bug.cgi?id=38369 Reviewers: morehouse, pcc Reviewed By: morehouse Subscribers: #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D78490	2020-04-20 17:13:40 -07:00
Sam Kerner	10070e31a5	Fix DataFlowSanitizer implementation of strchr() so that strchr(..., '\0') returns a pointer to '\0'. Summary: Fixes https://bugs.llvm.org/show_bug.cgi?id=22392 Reviewers: pcc, morehouse Reviewed By: morehouse Subscribers: morehouse, #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D77996	2020-04-15 13:08:47 -07:00
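
The range splitting described in commit cc07fbe37d ("Release pages to OS when setting 0 label") can be illustrated with a small sketch. This is not the runtime's code: the function name `split_for_release` and the 4096-byte page size are assumptions made for illustration. It only shows the arithmetic of dividing a shadow address range into a page-aligned middle that could be handed back to the OS and the unaligned head and tail that would still need to be zeroed by hand.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def split_for_release(begin, end, page_size=PAGE_SIZE):
    """Split the half-open shadow range [begin, end) into three parts.

    Returns (head, pages, tail), where:
      - pages is the largest page-aligned sub-range, or None if the
        range does not cover any whole page;
      - head and tail are the leftover bytes outside those pages,
        which would have to be set to 0 explicitly.
    This helper is hypothetical and only models the address arithmetic.
    """
    # Round begin up and end down to page boundaries.
    page_begin = (begin + page_size - 1) // page_size * page_size
    page_end = end // page_size * page_size

    if page_begin >= page_end:
        # No whole page inside the range: zero everything manually.
        return (begin, end), None, (end, end)

    return (begin, page_begin), (page_begin, page_end), (page_end, end)


if __name__ == "__main__":
    head, pages, tail = split_for_release(100, 3 * PAGE_SIZE + 50)
    print("zero by hand:", head, tail)
    print("release to OS:", pages)
```

In a real runtime the middle range would be passed to something like `madvise(MADV_DONTNEED)`, as the neighbouring commits describe, while the head and tail would be cleared with an ordinary memory write.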

189 Commits