Summary:
Separate the check for blend shuffle_vector masks into isBlendMask.
This function will also be used to check if a vector shuffle is legal. No
change in functionality was intended, but we ended up improving codegen on
two tests, which were being (more) optimized only if the resulting shuffle
was legal.
Reviewers: nadav, delena, andreadb
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3964
llvm-svn: 209923
This reapplies r209910 with a fix for the assertion failures hit on the
buildbots.
original commit message:
I thought we could get away without this, but it means that the
FileEntry objects actually refer to the wrong files, since pcms are not
updated inplace, they are atomically renamed into place after compiling
a module.
So we are close to the original behaviour of invalidating the cache for
all modules being removed, but now we should only invalidate the ones
that depend on whichever module failed to load.
Unfortunately I haven't come up with a new test that didn't require
a race between parallel invocations of clang.
<rdar://problem/17038180>
llvm-svn: 209922
I switched the lldb_private::FileSpec code over to use "llvm::StringRef llvm::sys::path::filename(llvm::StringRef)" for basename() and "llvm::StringRef llvm::sys::path::parent_path(llvm::StringRef)" for dirname().
<rdar://problem/16870083>
llvm-svn: 209917
Learned that MacOSX only accepts signal delivery on a thread that is
already signal handling. Reworked the test exe to cause a SIGSEGV
and recover if either nothing intercepts the SIGSEGV handler, or
if a SIGUSR1 is inserted. The test uses the latter part to test
signal delivery on continue using the SIGUSR1.
I still don't have this working on MacOSX. I'm seeing the
signal get delivered to a different thread than the one I'm
specifying with $Hc{thread-id} + $C{signo}, or with
$vCont;C{signo}:{thread-id};c. I'll come back to this
after getting it working on the llgs branch on Linux x86_64.
llvm-svn: 209912
I thought we could get away without this, but it means that the
FileEntry objects actually refer to the wrong files, since pcms are not
updated inplace, they are atomically renamed into place after compiling
a module.
So we are close to the original behaviour of invalidating the cache for
all modules being removed, but now we should only invalidate the ones
that depend on whichever module failed to load.
Unfortunately I haven't come up with a new test that didn't require
a race between parallel invocations of clang.
<rdar://problem/17038180>
llvm-svn: 209910
There was a single problem in cxa_demangle.cpp, where gcc would complain
`error: changes meaning of 'String'` about the line `typedef String String;`.
According to 3.3.7p2, this diagnostic is allowed (but not required, so clang
does not have to report this).
As a fix, make string_pair a template and pass String as template parameter.
This fixes the error with gcc and also removes some repetition from the code.
No behavior change.
llvm-svn: 209909
This implements the central part of support for dllimport/dllexport on
classes: allowing the attribute on class declarations, inheriting it
to class members, and forcing emission of exported members. It's based
on Nico Rieck's patch from http://reviews.llvm.org/D1099.
This patch doesn't propagate dllexport to bases that are template
specializations, which is an interesting problem. It also doesn't
look at the rules when redeclaring classes with different attributes,
I'd like to do that separately.
Differential Revision: http://reviews.llvm.org/D3877
llvm-svn: 209908
For MIPS, we have to encode the personality routine with
an indirect pointer to absptr; otherwise, some link warning
warning will be raised, and the program might crash in some
early MIPS Android device.
llvm-svn: 209907
Unordered is strictly weaker than monotonic, so if the latter doesn't have any
barriers then the former certainly shouldn't.
rdar://problem/16548260
llvm-svn: 209901
There shouldn't be any difference in behaviour here, at least not in
any configurations people care about and possibly not in any reachable
configurations.
llvm-svn: 209899
Add a script that is used to deflake inherently flaky tsan tests.
It is invoked from lit tests as:
%deflake %run %t
The script runs the target program up to 10 times,
until it produces a tsan warning.
llvm-svn: 209898
The optimization is two-fold:
First, the algorithm now uses SSE instructions to
handle all 4 shadow slots at once. This makes processing
faster.
Second, if shadow contains the same access, we do not
store the event into trace. This increases effective
trace size, that is, tsan can remember up to 10x more
previous memory accesses.
Perofrmance impact:
Before:
[ OK ] DISABLED_BENCH.Mop8Read (2461 ms)
[ OK ] DISABLED_BENCH.Mop8Write (1836 ms)
After:
[ OK ] DISABLED_BENCH.Mop8Read (1204 ms)
[ OK ] DISABLED_BENCH.Mop8Write (976 ms)
But this measures only fast-path.
On large real applications the speedup is ~20%.
Trace size impact:
On app1:
Memory accesses : 1163265870
Including same : 791312905 (68%)
on app2:
Memory accesses : 166875345
Including same : 150449689 (90%)
90% of filtered events means that trace size is effectively 10x larger.
llvm-svn: 209897
This breaks with MSVC.
With IsLateTemplateParsed, FunctionDecl::doesThisDeclarationHaveABody() returns true regardless of Body.
This reinstates what was fixed in r208985.
llvm-svn: 209896
Darwin prologues save their GPRs in two stages: a narrow push of r0-r7 & lr,
followed by a wide push of the remaining registers if there are any. AAPCS uses
a single push.w instruction.
It turns out that, on average, enough registers get pushed that code is smaller
in the AAPCS prologue, which is a nice property for M-class programmers. They
also have other options available for back-traces, so can hopefully deal with
the fact that FP & LR aren't adjacent in memory.
rdar://problem/15909583
llvm-svn: 209895
(clang doesn't complain about this, but gcc does. This is necessary for a
follow-up patch that will enable _LIBCPP_CONSTEXPR for gcc.)
llvm-svn: 209888
The C and C++ semantics for compare_exchange require it to return a bool
indicating success. This gets mapped to LLVM IR which follows each cmpxchg with
an icmp of the value loaded against the desired value.
When lowered to ldxr/stxr loops, this extra comparison is redundant: its
results are implicit in the control-flow of the function.
This commit makes two changes: it replaces that icmp with appropriate PHI
nodes, and then makes sure earlyCSE is called after expansion to actually make
use of the opportunities revealed.
I've also added -{arm,aarch64}-enable-atomic-tidy options, so that
existing fragile tests aren't perturbed too much by the change. Many
of them either rely on undef/unreachable too pervasively to be
restored to something well-defined (particularly while making sure
they test the same obscure assert from many years ago), or depend on a
particular CFG shape, which is disrupted by SimplifyCFG.
rdar://problem/16227836
llvm-svn: 209883
The main problem is in the predicate passed to the `std::stable_sort()`.
This predicate always returns false if **both** section's names do not
start with `.init_array` or `.fini_array` prefixes. In short, it does not
define a strict weak orderng. Suppose we have the following sections:
.A .init_array.1 .init_array.2
The predicate states that:
not .init_array.1 < .A
not .A < .init_array.2
but .init_array.1 < .init_array.2 !!!
The second problem is that `.init_array` section without number should
go last in the list. Not it has the lowest priority '0' and goes first.
The patch fixes both of the problems.
llvm-svn: 209875