This change extends r251438 to handle more narrow load promotions
including byte type, unscaled, and signed. For example, this change will
convert :
ldursh w1, [x0, #-2]
ldurh w2, [x0, #-4]
into
ldur w2, [x0, #-4]
asr w1, w2, #16
and w2, w2, #0xffff
llvm-svn: 253577
Summary:
Newer libstdc++ has global pool, which is filled with objects
allocated during libstdc++ initialization, and never released.
Using use_globals=0 in the lit tests results in these objects
being treated as leaks.
Fix this by porting several tests to plain C, and introducing
a simple sanity test case for __lsan::ScopedDisabler.
Reviewers: kcc, ygribov
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14798
llvm-svn: 253576
This is another step towards allowing SimplifyCFG to speculate harder, but then have
CGP clean things up if the target doesn't like it.
Previous patches in this series:
http://reviews.llvm.org/D12882http://reviews.llvm.org/D13297
D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special
handling because of weirdness in the intrinsic definition for handling a zero input
(that definition can probably be blamed on x86).
For example, if we have the usual speculated-by-select expensive op pattern like this:
%tobool = icmp eq i64 %A, 0
%0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true
%cond = select i1 %tobool, i64 64, i64 %0
ret i64 %cond
There's an instcombine that will turn it into:
%0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false) ; is_zero_undef == false
This CGP patch is looking for that case and despeculating it back into:
entry:
%tobool = icmp eq i64 %A, 0
br i1 %tobool, label %cond.end, label %cond.true
cond.true:
%0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true
br label %cond.end
cond.end:
%cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
ret i64 %cond
This unfortunately may lead to poorer codegen (see the changes in the existing x86 test),
but if we increase speculation in SimplifyCFG (the next step in this patch series), then
we should avoid those kinds of cases in the first place.
The need for this patch was originally mentioned here:
http://reviews.llvm.org/D7506
with follow-up here:
http://reviews.llvm.org/D7554
Differential Revision: http://reviews.llvm.org/D14630
llvm-svn: 253573
When dumping function samples or writing them out as text format, it
helps if the samples are emitted sorted by source location. The sorting
of the maps is a bit slow, so we only do it on demand.
llvm-svn: 253568
Copying one mask register to another under BW should be done with kmovq instruction, otherwise we can loose some bits.
Copying 8 bits under DQ may be done with kmovb.
Differential Revision: http://reviews.llvm.org/D14812
llvm-svn: 253563
The lowering patterns for X86ISD::VZEXT_MOVL for 128-bit to 256-bit vectors were just copying the lower xmm instead of actually masking off the first scalar using a blend.
Fix for PR25320.
Differential Revision: http://reviews.llvm.org/D14151
llvm-svn: 253561
On OS X, the thread finalization is fragile due to thread-local variables destruction order. I've seen cases where the we destroy the ThreadState too early and subsequent thread-local values' destructors call interceptors again. Let's replace the TLV-based thread finalization method with libpthread hooks. The notification PTHREAD_INTROSPECTION_THREAD_TERMINATE is called *after* all TLVs have been destroyed.
Differential Revision: http://reviews.llvm.org/D14777
llvm-svn: 253560
On OS X, we build a dylib of the TSan runtime, which doesn't necessarily need to contain debugging symbols (and file and line information), so llvm-symbolizer might not be able to find file names for TSan internal frames. FrameIsInternal currently only considers filenames, but we should simply treat all frames within `libclang_rt.tsan_osx_dynamic.dylib` as internal. This patch treats all modules starting with `libclang_rt.tsan_` as internal, because there may be more runtimes for other platforms in the future.
Differential Revision: http://reviews.llvm.org/D14813
llvm-svn: 253559
Several testcases need pthread barriers (e.g. all bench_*.cc which use test/tsan/bench.h) which are not available on OS X. Let's mark them with "UNSUPPORTED: darwin".
Differential Revision: http://reviews.llvm.org/D14636
llvm-svn: 253558
Make X86AsmBackend generate smarter nops instead of a bunch of 0x90 for code alignment for CPUs which don't support long nop instructions.
Differential Revision: http://reviews.llvm.org/D14178
llvm-svn: 253557
the script when running a ShTest with an external or internal shell.
This bug is caused by use of the ``map`` function in Python 3 which
returns an iterable (rather than a list in Python 2). After the iterable
is exhausted it won't return any more output and consequently when
``_runShTest()`` tries to access the ``script`` which has already been
iterated over it is empty. Converting to a list immediatley after
calling ``map()`` fixes this.
This fixes the ``tests/shtest-format.py`` test when running under
Python3 which was previously failing.
llvm-svn: 253556
Reimplement dispatch_once in an interceptor to solve these issues that may produce false positives with TSan on OS X:
1) there is a racy load inside an inlined part of dispatch_once,
2) the fast path in dispatch_once doesn't perform an acquire load, so we don't properly synchronize the initialization and subsequent uses of whatever is initialized,
3) dispatch_once is already used in a lot of already-compiled code, so TSan doesn't see the inlined fast-path.
This patch uses a trick to avoid ever taking the fast path (by never storing ~0 into the predicate), which means the interceptor will always be called even from already-compiled code. Within the interceptor, our own atomic reads and writes are not written into shadow cells, so the race in the inlined part is not reported (because the accesses are only loads).
Differential Revision: http://reviews.llvm.org/D14811
llvm-svn: 253552
Add support for vector mode attributes like "attribute((mode(V4SF)))". Also add warning about deprecated vector modes like GCC does.
Differential Revision: http://reviews.llvm.org/D14744
llvm-svn: 253551
This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc).
The syntax is -force-attribute=function_name:attribute_name
All function attributes are parsed except alignstack as it requires an argument.
llvm-svn: 253550
The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.
Differential Revision: http://reviews.llvm.org/D14150
llvm-svn: 253544
The LLVMContext was only used for Diagnostic. Pass a DiagnosticHandler
instead.
Differential Revision: http://reviews.llvm.org/D14794
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 253540
Ubsan detected undefined behavior in the MathExtras SaturatingMultiply test.
This change disables the test while it is being investigated.
llvm-svn: 253539
Optimizations like LoadPRE in GVN will insert new instructions.
If the insertion point is in a already processed BB, they should
get a value number explicitly. If the insertion point is after
current instruction, then just leave it. However, current GVN framework
has no support for it.
In this patch, we just bail out if a VN can't be found.
Dfferential Revision: http://reviews.llvm.org/D14670
A test/Transforms/GVN/pr25440.ll
M lib/Transforms/Scalar/GVN.cpp
llvm-svn: 253536
driving a canonical difference between that and an unqualified
type is a really bad idea when both are valid. Instead, remember
that it was there in a non-canonical way, then look for that in
the one place we really care about it: block captures. The net
effect closely resembles the behavior of a decl attribute, except
still closely following ARC's standard qualifier parsing rules.
llvm-svn: 253534
to start at the offset of the first ivar instead of the rounded-up
end of the superclass. The latter could include a large amount of
tail padding because of a highly-aligned ivar, and subclass ivars
can be laid out within that.
llvm-svn: 253533
Conversions between unrelated pointer types (e.g. char * and void *) involve
bitcasts which were not properly modeled in case of static initializers. The
patch fixes this problem.
The problem was originally spotted by Artem Dergachev. Patched by Yuri Gribov!
Differential Revision: http://reviews.llvm.org/D14652
llvm-svn: 253532
Summary:
dlopen(NULL, ...) is intended to give you back a handle to the
executable for use with dlsym. Casting it to link_map and using it with
ForEachMappedRegion results in a crash.
We also shouldn't unpoison the globals of a DSO that is already in
memory. This ensures that we don't do it for the executable, but in
general, MSan may have false negatives if the DSO is already loaded.
Reviewers: eugenis
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14795
llvm-svn: 253530