Commit Graph

419 Commits

Author SHA1 Message Date
Akira Hatanaka b8d2873d93 [AArch64][Inline-Asm] Return the 32-bit floating point register class
when constraint "w" is used on a 32-bit operand.

This enables compiling the following code, which used to error out in
the backend:

void foo1(int a) {
  asm volatile ("sqxtn h0, %s0\n" : : "w"(a):);
}

Fixes PR28633.

llvm-svn: 276344
2016-07-21 21:39:05 +00:00
David Majnemer 5d26127752 Revert "Disable this-return argument forwarding on ARM/AArch64"
Inference of the 'returned' attribute was fixed in r276008, lets try
turning the backend support back on.

This reverts commit r275677.

llvm-svn: 276081
2016-07-20 04:13:01 +00:00
Evandro Menezes 238fa76574 [AArch64] Properly validate the reciprocal estimation.
Add check for legal data types when expanding into a Newton series.

Differential Revision: https://reviews.llvm.org/D22267

llvm-svn: 276041
2016-07-19 22:31:11 +00:00
Hal Finkel 04b5330ccd Disable this-return argument forwarding on ARM/AArch64
r275042 reverted function-attribute inference for the 'returned' attribute
because the feature triggered self-hosting failures on ARM and AArch64. James
Molloy determined that the this-return argument forwarding feature, which
directly ties the returned input argument to the returned value, was the cause.
It seems likely that this forwarding code contains, or triggers, a subtle bug.
Disabling for now until we can track that down.

llvm-svn: 275677
2016-07-16 07:07:29 +00:00
Justin Lebar 9c375817ac [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Balaram Makam d4acd7ed10 Revert r259387: "AArch64: Implement missed conditional compare sequences."
This reverts commit r259387 because it inserts illegal code after legalization
    in some backends where i64 OR type is illegal for example.

llvm-svn: 274573
2016-07-05 20:24:05 +00:00
Benjamin Kramer 3bc1edf95b Use arrays or initializer lists to feed ArrayRefs instead of SmallVector where possible.
No functionality change intended.

llvm-svn: 274431
2016-07-02 11:41:39 +00:00
Craig Topper 2bd8b4b180 [CodeGen,Target] Remove the version of DAG.getVectorShuffle that takes a pointer to a mask array. Convert all callers to use the ArrayRef version. No functional change intended.
For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array.

llvm-svn: 274337
2016-07-01 06:54:47 +00:00
Duncan P. N. Exon Smith e4f5e4f4d1 CodeGen: Use MachineInstr& in TargetLowering, NFC
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr.  In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.

As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.

llvm-svn: 274287
2016-06-30 22:52:52 +00:00
Craig Topper bc56e3ba53 Use ShuffleVectorSDNode::isSplat member method instead of static method isSplatMask where the mask came directly from getMask() on a shuffle node.
llvm-svn: 274208
2016-06-30 04:38:51 +00:00
Krzysztof Parzyszek e116d500a7 [SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee
The setCallee function will set the number of fixed arguments based
on the size of the argument list. The FixedArgs parameter was often
explicitly set to 0, leading to a lack of consistent value for non-
vararg functions.

Differential Revision: http://reviews.llvm.org/D20376

llvm-svn: 273403
2016-06-22 12:54:25 +00:00
David Majnemer e61e4bfd87 Replace silly uses of 'signed' with 'int'
llvm-svn: 273244
2016-06-21 05:10:24 +00:00
Benjamin Kramer bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Evandro Menezes a3a0a60cff [AArch64] Add preferred alignments for Exynos M1
Differential Revision: http://reviews.llvm.org/D21203

llvm-svn: 272400
2016-06-10 16:00:18 +00:00
Benjamin Kramer 46e38f3678 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

llvm-svn: 272126
2016-06-08 10:01:20 +00:00
Geoff Berry 486f49cc63 Reapply [AArch64] Fix isLegalAddImmediate() to return true for valid negative values.
Originally reviewed here: http://reviews.llvm.org/D17463

llvm-svn: 272023
2016-06-07 16:48:43 +00:00
Matthias Braun 651cff42c4 AArch64: Do not test for CPUs, use SubtargetFeatures
Testing for specific CPUs has a number of problems, better use subtarget
features:
- When some tweak is added for a specific CPU it is often desirable for
  the next version of that CPU as well, yet we often forget to add it.
- It is hard to keep track of checks scattered around the target code;
  Declaring all target specifics together with the CPU in the tablegen
  file is a clear representation.
- Subtarget features can be tweaked from the command line.

To discourage people from using CPU checks in the future I removed the
isCortexXX(), isCyclone(), ... functions. I added an getProcFamily()
function for exceptional circumstances but made it clear in the comment
that usage is discouraged.

Reformat feature list in AArch64.td to have 1 feature per line in
alphabetical order to simplify merging and sorting for out of tree
tweaks.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D20762

llvm-svn: 271555
2016-06-02 18:03:53 +00:00
Rafael Espindola 4d29099f7f Delete AArch64II::MO_CONSTPOOL.
A constant pool holding the address of a variable in equivalent to
a got entry. It produces exactly the same instruction sequence as a
got use and unlike a got use this is not uniqued by the linker.

llvm-svn: 271311
2016-05-31 18:31:14 +00:00
Chad Rosier 14aa2ad1f4 [AArch64] Generate rev16/rev32 from bswap + srl when upper bits are known zero.
Canonicalize (srl (bswap i32 x), 16) to (rotr (bswap i32 x), 16), if the high
16-bits of x are zero. Similarly, canonicalize (srl (bswap i64 x), 32) to
(rotr (bswap i64 x), 32), if the high 32-bits of x are zero.

test_rev_w_srl16:            test_rev_w_srl16:
  and w8, w0, #0xffff          and     w8, w0, #0xffff
  rev w8, w8           --->    rev16   w0, w8
  lsr     w0, w8, #16

test_rev_x_srl32:            test_rev_x_srl32:
  rev x8, x8           --->    rev32   x0, x8
  lsr x0, x8, #32

llvm-svn: 270896
2016-05-26 19:41:33 +00:00
Chad Rosier 39481ace40 [AArch64] Remove command-line option use for testing.
The EXTR combine has been in tree for over 2 years without complain, so go ahead
and remove the option.

llvm-svn: 269292
2016-05-12 13:27:24 +00:00
Weiming Zhao 095c271131 [AArch64] Fix DAG selection for cmps for fp16 type
Summary: When emitting comparison for fp16, in addition to promote the LHS and RHS to fp32, we need to change the VT as well.

Reviewers: t.p.northover

Subscribers: t.p.northover, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19922

llvm-svn: 269151
2016-05-11 01:26:32 +00:00
Tim Northover 9508a70adc AArch64: allow vN to represent 64-bit registers in inline asm.
Unlike xN/wN, the size of vN is genuinely ambiguous in the assembly, so we
should try to infer what was intended from the type. But only down to 64-bits
(vN can never represent sN, hN or bN).

llvm-svn: 269132
2016-05-10 22:26:45 +00:00
Silviu Baranga f60be28ed8 [AArch64] Implement lowering of the X constraint on AArch64
Summary:
This implements the lowering of the X constraint on
AArch64.

The default behaviour of the X constraint lowering is to
restrict it to "f". This is a problem because the "f"
constraint is not implemented on AArch64 and would be too
restrictive anyway. Therefore, the AArch64 hook will
lower this to "w" (if the operand is a floating point or
vector) or "r" otherwise.

The implementation is similar with the one added for
ARM (r267411).

This is the AArch64 side of the fix for http://llvm.org/PR26493

Reviewers: rengolin

Subscribers: aemerson, rengolin, llvm-commits, t.p.northover

Differential Revision: http://reviews.llvm.org/D19967

llvm-svn: 268907
2016-05-09 11:10:44 +00:00
Evandro Menezes bcb95cd0ed [AArch64] Use the reciprocal estimation machinery
This patch adds support for estimating the square root, its reciprocal and
division or reciprocal using the combiner generic reciprocal machinery.

llvm-svn: 268539
2016-05-04 20:18:27 +00:00
Craig Topper 33772c5375 [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
2016-04-28 03:34:31 +00:00
Craig Topper 3b4842b56f [AArch64] Expand CTTZ for all vector types.
llvm-svn: 267837
2016-04-28 01:58:21 +00:00
Ahmed Bougacha 128f8732a5 [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.
Differential Revision: http://reviews.llvm.org/D17176

llvm-svn: 267606
2016-04-26 21:15:30 +00:00
Craig Topper c5551bfc26 [AArch64] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267522
2016-04-26 05:26:51 +00:00
Marcin Koscielnicki 3fdc257d6a [AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic.
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer.  I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.

This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.

Differential Revision: http://reviews.llvm.org/D19098

llvm-svn: 266818
2016-04-19 20:51:05 +00:00
Tim Shen e885d5e4d3 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
Craig Topper 18e69f4f63 Use MVT instead of EVT to remove a bunch of unnecessary calls to getSimpleVT.
llvm-svn: 266414
2016-04-15 06:20:21 +00:00
Tim Northover cdf1529c01 AArch64: expand cmpxchg after regalloc at -O0.
FastRegAlloc works only at the basic-block level and spills all live-out
registers. Unfortunately for a stack-based cmpxchg near the spill slots, this
can perpetually clear the exclusive monitor, which means the cmpxchg will never
succeed.

I believe the only way to handle this within LLVM is by expanding the loop
post-regalloc. We don't want this in general because it severely limits the
optimisations that can be done, so we limit this to -O0 compilations.

It's an ugly hack, and about the one good point in the whole mess is that we
can treat all cmpxchg operations in the most naive way possible (seq_cst, no
clrex faff) without affecting correctness.

Should fix PR25526.

llvm-svn: 266339
2016-04-14 17:03:29 +00:00
Matthias Braun 46b0f03e12 TargetLowering: Factor out common code for tail call eligibility checking; NFC
llvm-svn: 266270
2016-04-14 01:10:42 +00:00
Matthias Braun 74a0bd319a AArch64: Use a callee save registers for swiftself parameters
It is very likely that the swiftself parameter is alive throughout most
functions function so putting it into a callee save register should
avoid spills for the callers with only a minimum amount of extra spills
in the callees.

Currently the generated code is correct but unnecessarily spills and
reloads arguments passed in callee save registers, I will address this
in upcoming patches.

This also adds a missing check that for tail calls the preserved value
of the caller must be the same as the callees parameter.

Differential Revision: http://reviews.llvm.org/D19007

llvm-svn: 266251
2016-04-13 21:43:16 +00:00
Matthias Braun dff243e597 AArch64: Drive-by cleanup
llvm-svn: 266035
2016-04-12 02:16:13 +00:00
Tim Shen 0012756489 [SSP] Remove llvm.stackprotectorcheck.
This is a cleanup patch for SSP support in LLVM. There is no functional change.
llvm.stackprotectorcheck is not needed, because SelectionDAG isn't
actually lowering it in SelectBasicBlock; rather, it adds check code in
FinishBasicBlock, ignoring the position where the intrinsic is inserted
(See FindSplitPointForStackProtector()).

llvm-svn: 265851
2016-04-08 21:26:31 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Evgeniy Stepanov dde29e2799 Faster stack-protector for Android/AArch64.
Bionic has a defined thread-local location for the stack protector
cookie. Emit a direct load instead of going through __stack_chk_guard.

llvm-svn: 265481
2016-04-05 22:41:50 +00:00
Matthias Braun 870c34f0cf ARM, AArch64, X86: Check preserved registers for tail calls.
We can only perform a tail call to a callee that preserves all the
registers that the caller needs to preserve.

This situation happens with calling conventions like preserver_mostcc or
cxx_fast_tls. It was explicitely handled for fast_tls and failing for
preserve_most. This patch generalizes the check to any calling
convention.

Related to rdar://24207743

Differential Revision: http://reviews.llvm.org/D18680

llvm-svn: 265329
2016-04-04 18:56:13 +00:00
Matthias Braun cc7fba40fe AArch64ISelLowering: Remove unused variables/arguments; NFC
llvm-svn: 265098
2016-04-01 02:49:17 +00:00
Matthias Braun 8d41436004 CodeGen: Factor out code for tail call result compatibility check; NFC
llvm-svn: 264959
2016-03-30 22:46:04 +00:00
Haicheng Wu 6a6bc750d5 [AArch64] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize
Mimic what x86 does when optimizing sdiv/udiv for minsize.

llvm-svn: 264606
2016-03-28 18:17:07 +00:00
Manman Ren 4865d89653 [CXX_FAST_TLS] Disable tail call when calling conventions are mismatched.
Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller
and callee have mismatched calling conventions.

llvm-svn: 263856
2016-03-18 23:41:51 +00:00
Manman Ren 2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Ahmed Bougacha 171f7b9986 [AArch64] Don't blindly lower f16/f128 FCCMPs.
Instead, extend f16 (like we do when lowering a standalone SETCC),
and let f128 be legalized to the RT calls.

Fixes PR26803.

llvm-svn: 263301
2016-03-11 22:02:58 +00:00
Tim Northover 6092de5075 AArch64: only try to use scaled fcvt ops on legal vector types.
Before we ended up calling getSimpleVectorType on a <3 x float>, which
asserted.

llvm-svn: 263169
2016-03-10 23:02:21 +00:00
Roman Levenstein 2792b3f02f Add support for a preserve_most calling convention to the AArch64 backend.
This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64.

There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention.

Differential Revision: http://reviews.llvm.org/D18016

llvm-svn: 263092
2016-03-10 04:35:09 +00:00
Sanjay Patel d6cb4ec2a2 [AArch64] fold 'isPositive' vector integer operations (PR26819)
This is one of the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26819

Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that
in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might
come from autovectorization even if it's unlikely to be produced from C NEON intrinsics.

The patch is based on the x86 equivalent:
http://reviews.llvm.org/rL262036

Differential Revision: http://reviews.llvm.org/D17834

llvm-svn: 262623
2016-03-03 15:56:08 +00:00