Summary:
Handling callbr is very similar to handling an inline assembly call:
MSan must checks the instruction's inputs.
callbr doesn't (yet) have outputs, so there's nothing to unpoison,
and conservative assembly handling doesn't apply either.
Fixes PR42479.
Reviewers: eugenis
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64072
llvm-svn: 365008
The 3-field form was introduced by D3499 in 2014 and the legacy 2-field
form was planned to be removed in LLVM 4.0
For the textual format, this patch migrates the existing 2-field form to
use the 3-field form and deletes the compatibility code.
test/Verifier/global-ctors-2.ll checks we have a friendly error message.
For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the
2-field form (add i8* null as the third field).
Reviewed By: rnk, dexonsmith
Differential Revision: https://reviews.llvm.org/D61547
llvm-svn: 360742
Summary:
When a variable goes into scope several times within a single function
or when two variables from different scopes share a stack slot it may
be incorrect to poison such scoped locals at the beginning of the
function.
In the former case it may lead to false negatives (see
https://github.com/google/sanitizers/issues/590), in the latter - to
incorrect reports (because only one origin remains on the stack).
If Clang emits lifetime intrinsics for such scoped variables we insert
code poisoning them after each call to llvm.lifetime.start().
If for a certain intrinsic we fail to find a corresponding alloca, we
fall back to poisoning allocas for the whole function, as it's now
impossible to tell which alloca was missed.
The new instrumentation may slow down hot loops containing local
variables with lifetime intrinsics, so we allow disabling it with
-mllvm -msan-handle-lifetime-intrinsics=false.
Reviewers: eugenis, pcc
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60617
llvm-svn: 359536
Just as as llvm IR supports explicitly specifying numeric value ids
for instructions, and emits them by default in textual output, now do
the same for blocks.
This is a slightly incompatible change in the textual IR format.
Previously, llvm would parse numeric labels as string names. E.g.
define void @f() {
br label %"55"
55:
ret void
}
defined a label *named* "55", even without needing to be quoted, while
the reference required quoting. Now, if you intend a block label which
looks like a value number to be a name, you must quote it in the
definition too (e.g. `"55":`).
Previously, llvm would print nameless blocks only as a comment, and
would omit it if there was no predecessor. This could cause confusion
for readers of the IR, just as unnamed instructions did prior to the
addition of "%5 = " syntax, back in 2008 (PR2480).
Now, it will always print a label for an unnamed block, with the
exception of the entry block. (IMO it may be better to print it for
the entry-block as well. However, that requires updating many more
tests.)
Thus, the following is supported, and is the canonical printing:
define i32 @f(i32, i32) {
%3 = add i32 %0, %1
br label %4
4:
ret i32 %3
}
New test cases covering this behavior are added, and other tests
updated as required.
Differential Revision: https://reviews.llvm.org/D58548
llvm-svn: 356789
Summary:
They simply shuffle bits. MSan needs to do the same with shadow bits,
after making sure that the shuffle mask is fully initialized.
Reviewers: pcc, vitalybuka
Subscribers: hiraditya, #sanitizers, llvm-commits
Tags: #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D58858
llvm-svn: 355348
Summary: To avoid adding an extern function to the global ctors list, apply the changes of D56538 also to MSan.
Reviewers: chandlerc, vitalybuka, fedor.sergeev, leonardchan
Subscribers: hiraditya, bollu, llvm-commits
Differential Revision: https://reviews.llvm.org/D56734
llvm-svn: 351322
Summary:
Keeping msan a function pass requires replacing the module level initialization:
That means, don't define a ctor function which calls __msan_init, instead just
declare the init function at the first access, and add that to the global ctors
list.
Changes:
- Pull the actual sanitizer and the wrapper pass apart.
- Add a newpm msan pass. The function pass inserts calls to runtime
library functions, for which it inserts declarations as necessary.
- Update tests.
Caveats:
- There is one test that I dropped, because it specifically tested the
definition of the ctor.
Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka
Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji
Differential Revision: https://reviews.llvm.org/D55647
llvm-svn: 350305
MSan used to report false positives in the case the argument of
llvm.is.constant intrinsic was uninitialized.
In fact checking this argument is unnecessary, as the intrinsic is only
used at compile time, and its value doesn't depend on the value of the
argument.
llvm-svn: 350173
Those intrinsics will be autoupgraded soon to @llvm.sadd.sat generics (D55894), so to keep a x86-specific case I'm replacing it with @llvm.x86.sse2.pmulhu.w
llvm-svn: 349739
LLVM treats void* pointers passed to assembly routines as pointers to
sized types.
We used to emit calls to __msan_instrument_asm_load() for every such
void*, which sometimes led to false positives.
A less error-prone (and truly "conservative") approach is to unpoison
only assembly output arguments.
llvm-svn: 349734
This change enables conservative assembly instrumentation in KMSAN builds
by default.
It's still possible to disable it with -msan-handle-asm-conservative=0
if something breaks. It's now impossible to enable conservative
instrumentation for userspace builds, but it's not used anyway.
llvm-svn: 348112
Turns out it's not always possible to figure out whether an asm()
statement argument points to a valid memory region.
One example would be per-CPU objects in the Linux kernel, for which the
addresses are calculated using the FS register and a small offset in the
.data..percpu section.
To avoid pulling all sorts of checks into the instrumentation, we replace
actual checking/unpoisoning code with calls to
msan_instrument_asm_load(ptr, size) and
msan_instrument_asm_store(ptr, size) functions in the runtime.
This patch doesn't implement the runtime hooks in compiler-rt, as there's
been no demand in assembly instrumentation for userspace apps so far.
llvm-svn: 345702
Introduce the -msan-kernel flag, which enables the kernel instrumentation.
The main differences between KMSAN and MSan instrumentations are:
- KMSAN implies msan-track-origins=2, msan-keep-going=true;
- there're no explicit accesses to shadow and origin memory.
Shadow and origin values for a particular X-byte memory location are
read and written via pointers returned by
__msan_metadata_ptr_for_load_X(u8 *addr) and
__msan_store_shadow_origin_X(u8 *addr, uptr shadow, uptr origin);
- TLS variables are stored in a single struct in per-task storage. A call
to a function returning that struct is inserted into every instrumented
function before the entry block;
- __msan_warning() takes a 32-bit origin parameter;
- local variables are poisoned with __msan_poison_alloca() upon function
entry and unpoisoned with __msan_unpoison_alloca() before leaving the
function;
- the pass doesn't declare any global variables or add global constructors
to the translation unit.
llvm-svn: 341637
Add the __msan_va_arg_origin_tls TLS array to keep the origins for variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.
This is a reland of r341528.
test/msan/vararg.cc doesn't work on Mips, PPC and AArch64 (because this
patch doesn't touch them), XFAIL these arches.
Also turned out Clang crashed on i80 vararg arguments because of
incorrect origin type returned by getOriginPtrForVAArgument() - fixed it
and added a test.
llvm-svn: 341554
Add the __msan_va_arg_origin_tls TLS array to keep the origins for
variadic function parameters.
Change the instrumentation pass to store parameter origins in this array.
llvm-svn: 341528
Turns out that calling a variadic function with too many (e.g. >100 i64's)
arguments overflows __msan_va_arg_tls, which leads to smashing other TLS
data with function argument shadow values.
getShadow() already checks for kParamTLSSize and returns clean shadow if
the argument does not fit, so just skip storing argument shadow for such
arguments.
llvm-svn: 341525
If code is compiled for X86 without SSE support, the register save area
doesn't contain FPU registers, so `AMD64FpEndOffset` should be equal to
`AMD64GpEndOffset`.
llvm-svn: 339414
When pointer checking is enabled, it's important that every pointer is
checked before its value is used.
For stores MSan used to generate code that calculates shadow/origin
addresses from a pointer before checking it.
For userspace this isn't a problem, because the shadow calculation code
is quite simple and compiler is able to move it after the check on -O2.
But for KMSAN getShadowOriginPtr() creates a runtime call, so we want the
check to be performed strictly before that call.
Swapping materializeChecks() and materializeStores() resolves the issue:
both functions insert code before the given IR location, so the new
insertion order guarantees that the code calculating shadow address is
between the address check and the memory access.
llvm-svn: 337571
See https://reviews.llvm.org/D47106 for details.
Reviewed By: probinson
Differential Revision: https://reviews.llvm.org/D47171
This commit drops that patch's changes to:
llvm/test/CodeGen/NVPTX/f16x2-instructions.ll
llvm/test/CodeGen/NVPTX/param-load-store.ll
For some reason, the dos line endings there prevent me from commiting
via the monorepo. A follow-up commit (not via the monorepo) will
finish the patch.
llvm-svn: 336843
Summary:
Floating point division by zero or even undef does not have undefined
behavior and may occur due to optimizations.
Fixes https://bugs.llvm.org/show_bug.cgi?id=37523.
Reviewers: kcc
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D47085
llvm-svn: 332761
In order to set breakpoints on labels and list source code around
labels, we need collect debug information for labels, i.e., label
name, the function label belong, line number in the file, and the
address label located. In order to keep these information in LLVM
IR and to allow backend to generate debug information correctly.
We create a new kind of metadata for labels, DILabel. The format
of DILabel is
!DILabel(scope: !1, name: "foo", file: !2, line: 3)
We hope to keep debug information as much as possible even the
code is optimized. So, we create a new kind of intrinsic for label
metadata to avoid the metadata is eliminated with basic block.
The intrinsic will keep existing if we keep it from optimized out.
The format of the intrinsic is
llvm.dbg.label(metadata !1)
It has only one argument, that is the DILabel metadata. The
intrinsic will follow the label immediately. Backend could get the
label metadata through the intrinsic's parameter.
We also create DIBuilder API for labels to be used by Frontend.
Frontend could use createLabel() to allocate DILabel objects, and use
insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR.
Differential Revision: https://reviews.llvm.org/D45024
Patch by Hsiangkai Wang.
llvm-svn: 331841
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.
Patch by tkrupa
Differential Revision: https://reviews.llvm.org/D44785
llvm-svn: 330322
The default assembly handling mode may introduce false positives in the
cases when MSan doesn't understand that the assembly call initializes
the memory pointed to by one of its arguments.
We introduce the conservative mode, which initializes the first
|sizeof(type)| bytes for every |type*| pointer passed into the
assembly statement.
llvm-svn: 329054
MSan used to insert the shadow check of the store pointer operand
_after_ the shadow of the value operand has been written.
This happens to work in the userspace, as the whole shadow range is
always mapped. However in the kernel the shadow page may not exist, so
the bug may cause a crash.
This patch moves the address check in front of the shadow access.
llvm-svn: 318901
Summary:
Add canary tests to verify that MSAN currently does nothing with the element atomic memory intrinsics for memcpy, memmove, and memset.
Placeholder tests that will fail once element atomic @llvm.mem[cpy|move|set] instrinsics have been added to the MemIntrinsic class hierarchy. These will act as a reminder to verify that MSAN handles these intrinsics properly once they have been added to that class hierarchy.
Reviewers: reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35510
llvm-svn: 308251
It turned out that MSan was incorrectly calculating the shadow for int comparisons: it was done by truncating the result of (Shadow1 OR Shadow2) to i1, effectively rendering all bits except LSB useless.
This approach doesn't work e.g. in the case where the values being compared are even (i.e. have the LSB of the shadow equal to zero).
Instead, if CreateShadowCast() has to cast a bigger int to i1, we replace the truncation with an ICMP to 0.
This patch doesn't affect the code generated for SPEC 2006 binaries, i.e. there's no performance impact.
For the test case reported in PR32842 MSan with the patch generates a slightly more efficient code:
orq %rcx, %rax
jne .LBB0_6
, instead of:
orl %ecx, %eax
testb $1, %al
jne .LBB0_6
llvm-svn: 302787
Fix incorrect calculation of the type size for __msan_maybe_warning_N
call that resulted in an invalid (narrowing) zext instruction and
"Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed."
Only happens in very large functions (with more than 3500 MSan
checks) operating on integer types that are not power-of-two.
llvm-svn: 274395
CodeGen has hooks that allow targets to emit specialized code instead
of calls to memcmp, memchr, strcpy, stpcpy, strcmp, strlen, strnlen.
When ASan/MSan/TSan/ESan is in use, this sidesteps its interceptors, resulting
in uninstrumented memory accesses. To avoid that, make these sanitizers
mark the calls as nobuiltin.
Differential Revision: http://reviews.llvm.org/D19781
llvm-svn: 273083