The 64/128-bit vector types are legal if NEON instructions are
available. However, there was no matching patterns for @llvm.cttz.*()
intrinsics and result in fatal error.
This commit fixes the problem by lowering cttz to:
a. ctpop((x & -x) - 1)
b. width - ctlz(x & -x) - 1
llvm-svn: 242037
Summary:
The iteration order within a member of DepCands is deterministic
and therefore we don't have to sort the accesses within a member.
We also don't have to copy the indices of the pointers into a
vector, since we can iterate over the members of the class.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11145
llvm-svn: 242033
Attribute names usually support an alternate spelling that uses double
underscores before and after the attribute name, like e.g. attribute
((__aligned__)) for attribute ((aligned)). This is necessary to allow
use of attributes in system headers without polluting the name space.
However, for attribute ((enable_if)) that alternate spelling does not
work correctly. This is because of code in Parser::ParseGNUAttributeArgs
(ParseDecl.cpp) that specifically checks for the "enable_if" spelling
without allowing the alternate spelling.
Similar code in ParseDecl.cpp uses the normalizeAttrName helper to allow
both spellings. This patch adds use of that helper for the "enable_if"
check as well, which fixes attribute ((__enable_if__)).
Differential Revision: http://reviews.llvm.org/D11142
llvm-svn: 242029
In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch.
I temporary removed the old intrinsics test (just to split this patch).
Half types are not covered here.
Differential Revision: http://reviews.llvm.org/D11134
llvm-svn: 242023
Summary:
r241964 has added a dependency on uuid.h, which (on linux at least) necessitates instalation of a
new package. Since the only thing we need from that file is uuid_t (and this is already defined
in UuidCompatibility.h), we can avoid this dependency by making this include __APPLE__ specific.
If in future, we need more from this library, we can revisit this decision.
Reviewers: jasonmolenda
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D11135
llvm-svn: 242022
The size of a long double was hardcoded in DataExtractor for x86 and
x86_64 architectures. This CL removes the hard coded values and use the
actual size based on the floating point semantics specified.
Differential revision: http://reviews.llvm.org/D8417
llvm-svn: 242019
Summary:
This is the first part of our effort to make llgs single threaded. Currently, llgs consists of
about three threads and the synchronisation between them is a major source of latency when
debugging linux and android applications.
In order to be able to go single threaded, we must have the ability to listen for events from
multiple sources (primarily, client commands coming over the network and debug events from the
inferior) and perform necessary actions. For this reason I introduce the concept of a MainLoop.
A main loop has the ability to register callback's which will be invoked upon receipt of certain
events. MainLoopPosix has the ability to listen for file descriptors and signals.
For the moment, I have merely made the GDBRemoteCommunicationServerLLGS class use MainLoop
instead of waiting on the network socket directly, but the other threads still remain. In the
followup patches I indend to migrate NativeProcessLinux to this class and remove the remaining
threads.
Reviewers: ovyalov, clayborg, amccarth, zturner, emaste
Subscribers: tberghammer, lldb-commits
Differential Revision: http://reviews.llvm.org/D11066
llvm-svn: 242018
On Android the oatdata and the oatexec symbols in
system@framework@boot.oat covers the full .text section what causes
issues with displaying unusable symbol name to the user and very slow
unwinding speed because the instruction emulation based unwind plans
try to emulate all instructions in these symbols. Don't add these
symbols to the symbol list as they have no use for the debugger and
they are causing a lot of trouble.
Differential revision: http://reviews.llvm.org/D11065
llvm-svn: 242017
This patch:
- Allows mips32 cores to match with any mips32/mips64 cores.
- Allows mips32r2 cores to match with core only up-to mips32r2/mips64r2.
- Allows mips32r3 cores to match with core only up-to mips32r3/mips64r3.
- Allows mips32r5 cores to match with core only up-to mips32r3/mips64r5.
- Allows mips32r6 core to match with only mips32r6/mips64r6 or mips32/mips64.
Reviewers: emaste, jaydeep, clayborg
Subscribers: mohit.bhakkad, nitesh.jain, bhushan, lldb-commits
Differential Revision: http://reviews.llvm.org/D10921
llvm-svn: 242016
It exists for compatibility with GCC which requires it to print MSA registers
for the 'f' constraint. Although LLVM doesn't need it, the 'w' modifier should
still be used for portability between the two compilers.
llvm-svn: 242015
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.
Reviewers: atrick, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11115
llvm-svn: 242011
Three things:
- The atomic intrinsics mandate memory barriers, let's start emitting
some.
- We don't need to manually create RMW operations, we can just do
__atomic_fetch_foo instead of performing __atomic_foo_fetch and
undoing foo.
- Don't use inline assembly, we don't need it for these intrinsics.
This fixes PR24101.
llvm-svn: 242009
clang-cl doesn't compile std::atomic_flag correctly (PR24101). Since the COFF
linker doesn't use threads yet, just revert r241420 and r241481 for now to
work around this clang-cl bug.
llvm-svn: 242006
With LLVM_ENABLE_THREADS disabled, all the llvm code assumes that it runs on
a single thread and doesn't use any mutexes. lld still spawned lots of threads
in that case and called into llvm, assuming that llvm is thread-safe.
As fix, let lld use only a single thread if LLVM_ENABLE_THREADS is disabled.
I left in all the mutexes in lld. That means lld is a bit slower than
necessary in single-thread mode, but that's probably worth the simpler code.
llvm-svn: 242004
before the first imported declaration.
We don't need to track all formerly-canonical declarations of an entity; it's sufficient to track those ones for which no other formerly-canonical declaration was imported into the same module. We call those ones "key declarations", and use them as our starting points for collecting redeclarations and performing namespace lookups.
llvm-svn: 241999
In the test, y1 is not reference compatible to y2 and we currently assume
the cast is ill-formed so we emit a diagnostic. Instead, in order to honour
the standard, if y1 it's not reference-compatible to y2 then it can't be
converted using a static_cast, and a reinterpret_cast should be tried instead.
Richard Smith provided the correct interpretation of the standard and
explanation about the subtle difference between "can't be cast" and "the cast
is ill-formed". The former applies in this case.
PR: 23802
llvm-svn: 241998
Register r12 ('ip') is used by GCC for this purpose
and hence is used here. As discussed on the GCC mailing
list, the register choice is an ABI issue and so
choosing the same register as GCC means
__builtin_call_with_static_chain is compatible.
A similar patch has just gone in the AArch64 backend,
so this is just the ARM counterpart, following the same
discussion.
Patch by Stephen Cross.
llvm-svn: 241996
While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr opeations are still scalarized.
This patch adds vectorization support for non-uniform v4i32 shift operations - it splats constant shift amounts to allow them to use the immediate sse shift instructions, or extracts/zero-extends non-constant shift amounts. The individual results are then blended together.
Differential Revision: http://reviews.llvm.org/D11063
llvm-svn: 241989
The function uses parallel_for() and then writes error messages from the
parallel loop's body. This produces nondetermistic error messages. Instead,
copy error messages to a vector and sort it by the atom's file offsets before
printing all error messages after the parallel_for(). This results in a few
string copies, but only in the error case. (And passing tests seem more
important than performance.)
This makes tests elf/AArch64/rel-prel16-overflow.test and
elf/AArch64/rel-prel32-overflow.test pass on Windows: Both tests check that
atom error messages are emitted in a certain order, and on Windows they
happened to be emitted in a different order before this patch.
llvm-svn: 241988
There is no suitable basic block to sink instructions in loops without
exits. The only way an instruction in a loop without exits can be used
is as an incoming value to a PHI. In such cases, the incoming block for
the corresponding value is unreachable.
This fixes PR24013.
Differential Revision: http://reviews.llvm.org/D10903
llvm-svn: 241987
r238842 added the TargetRecip system for controlling use of reciprocal
estimates for sqrt and division using a set of parameters that can be set by
the frontend. Clang now supports a sophisticated -mrecip option, and this will
allow that option to effectively control the relevant code-generation
functionality of the PPC backend.
llvm-svn: 241985