Commit Graph

8654 Commits

Author SHA1 Message Date
Benjamin Kramer 4fed928f53 Avoid some copies by using const references.
clang-tidy's performance-unnecessary-copy-initialization with some manual
fixes. No functional changes intended.

llvm-svn: 270988
2016-05-27 12:30:51 +00:00
Benjamin Kramer 3e9a5d3468 Apply clang-tidy's misc-static-assert where it makes sense.
Also fold conditions into assert(0) where it makes sense. No functional
change intended.

llvm-svn: 270982
2016-05-27 11:36:04 +00:00
Ranjeet Singh c520e93d9a Test commit.
llvm-svn: 270056
2016-05-19 12:44:39 +00:00
Rafael Espindola 8c34dd8257 Delete Reloc::Default.
Having an enum member named Default is quite confusing: Is it distinct
from the others?

This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.

llvm-svn: 269988
2016-05-18 22:04:49 +00:00
Rafael Espindola 38af4d6347 Trivial cleanups.
This just clang formats and cleans comments in an area I am about to
post a patch for review.

llvm-svn: 269946
2016-05-18 16:00:24 +00:00
Rafael Espindola 712f957cae Simplify handling of hidden stub.
Since r207518 they are printed exactly like non-hidden stubs on x86 and
since r207517 on ARM.

This means we can use a single set for all stubs in those platforms.

llvm-svn: 269776
2016-05-17 16:01:32 +00:00
Renato Golin 57bfb69aa4 [ARM] ARM mov InstAlias for MOVW lacks HasV6T2
The movw instruction is only available in ARM state for V6T2 and above.
The MOVi16 instruction has requirement HasV6T2 but the InstAlias
for mov rd, imm where the operand is imm0_65535_expr:$imm does not.

This means that movw can incorrectly be used in ARMv4 and ARMv5 by
writing mov rd, 0x1234. The simple fix is to the requirement HasV6T2
to the InstAlias. Tests added to not-armv4.s.

Patch by Peter Smith.

llvm-svn: 269761
2016-05-17 13:05:28 +00:00
Saleem Abdulrasool 8df2f49889 ARM: support export directives for Windows
It seems that cl will emit the export directives for Windows ARM targets.  The
fact that it did this had originally been missed and this functionality was
never implemented.  This makes it possible to rely solely on the source code for
indicating what the exported interfaces are and brings us more compatibility
with cl.

llvm-svn: 269574
2016-05-14 18:58:34 +00:00
Tim Northover f8b0a7af52 ARM: use callee-saved list in the order they're actually saved.
When setting the frame pointer, the offset from SP is calculated based on the
stack slot it gets allocated, but this slot is in turn based on the order of
the CSR list so that list should match the order we actually save the registers
in. Mostly it did, but in the edge-case of MachO AAPCS targets it was wrong.

llvm-svn: 269459
2016-05-13 19:16:14 +00:00
Renato Golin 608cb5def6 [ARM] Support and tests for transform of LDR rt, = to MOV
This change implements the transformation in processInstruction() for the
LDR rt, =expression to MOV rt, expression when the expression can be evaluated
and can fit into the immediate field of the MOV or a MVN.

Across the ARM and Thumb instruction sets there are several cases to consider,
each with a different range of representatble constants.

In ARM we have:
 * Modified immediate (All ARM architectures)
 * MOVW (v6t2 and above)

In Thumb we have:
 * Modified immediate (v6t2, v7m and v8m.mainline)
 * MOVW (v6t2, v7m, v8.mainline and v8m.baseline)
 * Narrow Thumb MOV that can be used in an IT block (non flag-setting)

If the immediate fits any of the available alternatives then we make the transformation.

Fixes 25722.

Patch by Peter Smith.

llvm-svn: 269354
2016-05-12 21:22:42 +00:00
Renato Golin 3f126138a1 [ARM] Delay ARM constant pool creation. NFC.
This change adds a new constant pool kind to ARMOperand. When parsing the
operand for =immediate we create an instance of this operand rather than
creating a constant pool entry and rewriting the operand.

As the new operand kind is only created for ldr rt,= we can make ldr rt,=
an explicit pseudo instruction in ARM, Thumb and Thumb2

The pseudo instruction is expanded in processInstruction(). This creates the
constant pool and transforms the pseudo instruction into a pc-relative ldr to
the constant pool.

There are no functional changes and no modifications needed to existing tests.

Required by the patch that fixes PR25722.

Patch by Peter Smith.

llvm-svn: 269352
2016-05-12 21:22:31 +00:00
Renato Golin f6ed8bbf46 [scan-build] fix warnings emitted on LLVM ARM code base
Fix "Logic error" warnings of the type "Called C++ object pointer is
null" reported by Clang Static Analyzer.

Patch by Apelete Seketeli.

llvm-svn: 269285
2016-05-12 12:33:33 +00:00
Justin Bogner 4557136645 SDAG: Implement Select instead of SelectImpl in ARMDAGToDAGISel
This is a large change, but it's pretty mechanical:
- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
  the method to try* and return a bool for success.
- Where we were calling SelectNodeTo, just return afterwards.

Part of llvm.org/pr26808.

llvm-svn: 269258
2016-05-12 00:31:09 +00:00
Justin Bogner ed4f37850e SDAG: Clean up dangling nodes in ARMISelDAGToDAG::SelectImpl
When we convert to the void Select interface, leaving unreferenced
nodes around won't be allowed anymore.

Part of llvm.org/pr26808.

llvm-svn: 269256
2016-05-12 00:20:19 +00:00
Tim Northover 56048d5c2c ARM: report an error when attempting to target a misalgined BLX
The CodeGen problem was fixed in r269101, but we still miscompiled assembly
that tried the same thing.

llvm-svn: 269126
2016-05-10 21:48:48 +00:00
Tim Northover b5ece527a1 ARM: stop emitting blx instructions for most calls on MachO.
I'm really not sure why we were in the first place, it's the linker's job to
convert between BL/BLX as necessary. Even worse, using BLX left Thumb calls
that could be locally resolved completely unencodable since all offsets to BLX
are multiples of 4.

rdar://26182344

llvm-svn: 269101
2016-05-10 19:17:47 +00:00
Matthias Braun 31d19d43c7 CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.

llvm-svn: 269011
2016-05-10 03:21:59 +00:00
Weiming Zhao 5b5501e817 [ARM] Fix Scavenger assert due to underestimated stack size
(re-apply r268810 as it exposed an uninitialized variable in ARM MFI.
 Patch 268868 should fix that.)

Summary:
Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure.

Reviewers: rengolin

Subscribers: vitalybuka, aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D19896

llvm-svn: 268869
2016-05-08 05:11:54 +00:00
Weiming Zhao 453b79013e Fix use-of-uninitialized-value of ARMMachineFunctionInfo
Summary: Explicitly initialize ArgumentStackSize to prevent the msan failure.

Reviewers: rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D20051

llvm-svn: 268868
2016-05-08 05:04:47 +00:00
Vitaly Buka e81d96be6f Revert r268810 becase it brakes msan bot.
16802==WARNING: MemorySanitizer: use-of-uninitialized-value
    lib/Target/ARM/ARMFrameLowering.cpp:1632

llvm-svn: 268833
2016-05-07 01:54:00 +00:00
Weiming Zhao 74f12d31c1 [ARM] Fix Scavenger assert due to underestimated stack size
(this is resubmit of r268529 with minor refactoring. r268529 was reverted
 at r268536 due a memory sanitizer failure.  I have not been able to
 reproduce that failure and I checked all the variable used in my change
 but I could not spot an issue. I did some refactoring and see if it will
 give a clearer hint)

Summary:
Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure.

Reviewers: rengolin

Subscribers: vitalybuka, aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D19896

llvm-svn: 268810
2016-05-06 22:20:13 +00:00
Justin Bogner b012699741 SDAG: Rename Select->SelectImpl and repurpose Select as returning void
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.

We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.

Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.

llvm-svn: 268693
2016-05-05 23:19:08 +00:00
Tim Northover df43264cf7 ARM: don't attempt to merge litpools referencing different PC-anchors.
Given something like:

    ldr r0, .LCPI0_0 (== pc-rel var)
    add r0, pc

    ldr r1, .LCPI0_1 (== pc-rel var)
    add r1, pc

we cannot combine the 2 ldr instructions and litpools because they get added to
a different pc to form the correct address. I think the original logic came
from a time when we fused the LDRpci/PICADD instructions into one
pseudo-instruction so the PC was always immediately at-hand. That's no longer
the case.

Should fix general-dynamic TLS access on Linux, and quite possibly other -fPIC
code that relies on litpools (e.g. v6m and -Oz compilations) though trivial
tweaks of the .ll test didn't provoke anything.

llvm-svn: 268662
2016-05-05 18:38:53 +00:00
Justin Bogner 8752be775c ARM: Use a Handle to track SDNodes in case they're CSE'd. NFC
The code here is recursively Select-ing a new Node to avoid issues
where N is CSE'd during replaceDAGValue and stops being valid. We can
accomplish the same goal in a more principled way by using a
HandleSDNode.

This is essentially a less dodgy fix for PR25733 than the original
attempt back in r255120.

llvm-svn: 268590
2016-05-05 01:43:49 +00:00
Vitaly Buka 6b5c89262a Revert r268529 because it caused use-of-uninitialized-value
Summary: This reverts commit d88cc0862bf7da64850b89e9bb5ea9f95e7f1184.

#0 0xfed467 in llvm::ARMFrameLowering::determineCalleeSaves(llvm::MachineFunction&, llvm::BitVector&, llvm::RegScavenger*) const /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Target/ARM/ARMFrameLowering.cpp:1625:52
#1 0x330d4cc in (anonymous namespace)::PEI::runOnMachineFunction(llvm::MachineFunction&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/PrologEpilogInserter.cpp:186:3
#2 0x3193e12 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:60:13
#3 0x396237d in llvm::FPPassManager::runOnFunction(llvm::Function&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1526:23
#4 0x3962a23 in llvm::FPPassManager::runOnModule(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1547:16
#5 0x3963d52 in runOnModule /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1603:23
#6 0x3963d52 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/LegacyPassManager.cpp:1706
#7 0x6bb910 in compileModule(char**, llvm::LLVMContext&) /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:412:5
#8 0x6b3c25 in main /mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm/tools/llc/llc.cpp:218:22
#9 0x7fd4a7d37ec4 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
#10 0x625c93 in _start (/mnt/b/sanitizer-buildbot2/sanitizer-x86_64-linux-bootstrap/build/llvm_build_msan/bin/llc+0x625c93)

Reviewers:

Subscribers:

llvm-svn: 268536
2016-05-04 19:44:11 +00:00
Weiming Zhao 2373f769ce [ARM] Fix Scavenger assert due to underestimated stack size
Summary:
Currently, when checking if a stack is "BigStack" or not, it doesn't count into spills and arguments. Therefore, LLVM won't reserve spill slot for this actually "BigStack". This may cause scavenger failure.

Reviewers: rengolin

Subscribers: aemerson, rengolin, tberghammer, danalbert, srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D19896

llvm-svn: 268529
2016-05-04 18:19:33 +00:00
Matthias Braun d1aabb2813 livePhysRegs: Pass MBB by reference in addLive{Ins|Outs}(); NFC
The block must no be nullptr for the addLiveIns()/addLiveOuts()
function.

llvm-svn: 268340
2016-05-03 00:24:32 +00:00
Matthias Braun 24f26e6d91 LivePhysRegs: Automatically determine presence of pristine regs.
Remove the AddPristinesAndCSRs parameters from
addLiveIns()/addLiveOuts().

We need to respect pristine registers after prologue epilogue insertion,
Seeing that we got this wrong in at least two commits already, we should
rather pay the small price to query MachineFrameInfo for it.

There are three cases that did not set AddPristineAndCSRs to true even
after register allocation:
- ExecutionDepsFix: live-out registers are used as a hint that the
  register is used soon. This is not true for pristine registers so
  use the new addLiveOutsNoPristines() to maintain this behaviour.
- SystemZShortenInst: Not setting AddPristineAndCSRs to true looks like
  a bug, should do the right thing automatically now.
- StackMapLivenessAnalysis: Not adding pristine registers looks like a
  bug to me. Added a FIXME comment but maintain the current behaviour
  as a change may need to get coordinated with GC runtimes.

llvm-svn: 268336
2016-05-03 00:08:46 +00:00
Tim Northover c08db1840c ARM: fix handling of SUB immediates in peephole opt.
We were negating an immediate that was going to be used in a SUBri form
unnecessarily. Since ADD/SUB are very similar we *can* do that, but we have to
change the SUB to an ADD at the same time. This also applies to ADD, and allows
us to handle a slightly larger range of immediates for those two operations.

rdar://25992245

llvm-svn: 268276
2016-05-02 18:30:08 +00:00
Filipe Cabecinhas 0da9937517 Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them.
Summary:
Historically, we had a switch in the Makefiles for turning on "expensive
checks". This has never been ported to the cmake build, but the
(dead-ish) code is still around.

This will also make it easier to turn it on in buildbots.

Reviewers: chandlerc

Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D19723

llvm-svn: 268050
2016-04-29 15:22:48 +00:00
Craig Topper 33772c5375 [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior.
llvm-svn: 267853
2016-04-28 03:34:31 +00:00
Ahmed Bougacha 65572afea8 [ARM] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs.
We run after PEI.
Found via inspection; no obvious testcase.

Follow-up to r266679.

llvm-svn: 267781
2016-04-27 20:33:07 +00:00
Ahmed Bougacha b4af107239 [ARM] Set correct successors in CMPXCHG pseudo expansion.
transferSuccessors() would LoadCmpBB a successor of DoneBB, whereas
it should be a successor of the original MBB.

The testcase changes are caused by Thumb2SizeReduction, which
was previously confused by the broken CFG.

Follow-up to r266679.

Unfortunately, it's tricky to catch this in the verifier.

llvm-svn: 267778
2016-04-27 20:32:54 +00:00
Ahmed Bougacha 128f8732a5 [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.
Differential Revision: http://reviews.llvm.org/D17176

llvm-svn: 267606
2016-04-26 21:15:30 +00:00
Craig Topper d8d6be4f99 [ARM] Expand vector ctlz_zero_undef so it becomes ctlz.
The default is Legal, which results in 'Cannot select' errors.

llvm-svn: 267521
2016-04-26 05:04:37 +00:00
Craig Topper edb4a6ba98 [ARM] Expand v1i64 and v2i64 ctlz.
The default is legal, which results in 'Cannot select' errors.

llvm-svn: 267520
2016-04-26 05:04:33 +00:00
Andrew Kaylor 1aa3cf7d18 Reverting Thumb2SizeReduction opt bisect change to fix failing buildbots.
llvm-svn: 267506
2016-04-26 00:56:36 +00:00
Junmo Park 3c65acf87e Remove MinLatency in SchedMachineModel. NFC.
Summary:
We don't use MinLatency any more since r184032.

Reviewers: atrick, hfinkel, mcrosier

Differential Revision: http://reviews.llvm.org/D19474

llvm-svn: 267502
2016-04-26 00:37:46 +00:00
Andrew Kaylor 736efc894d Fix build warning
llvm-svn: 267487
2016-04-25 22:27:30 +00:00
Andrew Kaylor a2b9111ef7 Add optimization bisect opt-in calls for ARM passes
Differential Revision: http://reviews.llvm.org/D19449

llvm-svn: 267480
2016-04-25 22:01:04 +00:00
Tim Northover 5c3140f745 ARM: put extern __thread stubs in a special section.
The linker needs to know that the symbols are thread-local to do its job
properly.

llvm-svn: 267473
2016-04-25 21:12:04 +00:00
Silviu Baranga 82d04260b7 [ARM] Add support for the X asm constraint
Summary:
This patch adds support for the X asm constraint.

To do this, we lower the constraint to either a "w" or "r" constraint
depending on the operand type (both constraints are supported on ARM).

Fixes PR26493

Reviewers: t.p.northover, echristo, rengolin

Subscribers: joker.eph, jgreenhalgh, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D19061

llvm-svn: 267411
2016-04-25 14:29:18 +00:00
Saleem Abdulrasool 9611518646 ARM: fix __chkstk Frame Setup on WoA
This corrects the MI annotations for the stack adjustment following the __chkstk
invocation.  We were marking the original SP usage as a Def rather than Kill.
The (new) assigned value is the definition, the original reference is killed.

Adjust the ISelLowering to mark Kills and FrameSetup as well.

This partially resolves PR27480.

llvm-svn: 267361
2016-04-24 20:12:48 +00:00
Peter Collingbourne 265ebd7d70 CodeGen: Use PLT relocations for relative references to unnamed_addr functions.
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211
2016-04-22 20:40:10 +00:00
David Majnemer 68318e0414 Fix some spelling mistakes
llvm-svn: 267112
2016-04-22 06:37:48 +00:00
Saleem Abdulrasool a028853540 ARM: restrict register class for WIN__DBZCHK
WIN__DBZCHK will insert a CBZ instruction into the stream.  This instruction
reserves 3 bits for the condition register (rn).  As such, we must ensure that
we restrict the register to a low register.  Use the tGPR class instead of GPR
to ensure that this is properly constrained.  In debug builds, we would attempt
to use lr as a condition register which would silently get truncated with no
hint that the register selection was incorrect.

llvm-svn: 267080
2016-04-21 23:53:19 +00:00
Tim Northover c52c74efdf MachO: enable .data_region directives everywhere
We'd disabled them on x86 because back in the early days some host tools
couldn't handle the new load commands. This no longer holds: anyone capable of
deploying Clang should be able to deploy its copies of ar/ranlib/etc.

rdar://25254790

llvm-svn: 267075
2016-04-21 23:00:17 +00:00
Tim Northover 1ee27c74cb ARM: fix assertion failure on -O0 cmpxchg.
Because lowering of CMP_SWAP_64 occurs during type legalization, there can be
i64 types produced by more than just a BUILD_PAIR or similar. My initial tests
used just incoming function args.

llvm-svn: 266828
2016-04-19 22:25:02 +00:00
Marcin Koscielnicki 3fdc257d6a [AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic.
Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that
just return the thread pointer.  I have a pending patch that does the same
for SystemZ (D19054), and there are many more targets that could benefit
from one.

This patch merges the ARM and AArch64 intrinsics into a single target
independent one that will also be used by subsequent targets.

Differential Revision: http://reviews.llvm.org/D19098

llvm-svn: 266818
2016-04-19 20:51:05 +00:00
Tim Northover b629c77692 ARM: use a pseudo-instruction for cmpxchg at -O0.
The fast register-allocator cannot cope with inter-block dependencies without
spilling. This is fine for ldrex/strex loops coming from atomicrmw instructions
where any value produced within a block is dead by the end, but not for
cmpxchg. So we lower a cmpxchg at -O0 via a pseudo-inst that gets expanded
after regalloc.

Fortunately this is at -O0 so we don't have to care about performance. This
simplifies the various axes of expansion considerably: we assume a strong
seq_cst operation and ensure ordering via the always-present DMB instructions
rather than v8 acquire/release instructions.

Should fix the 32-bit part of PR25526.

llvm-svn: 266679
2016-04-18 21:48:55 +00:00
Renato Golin 4b18a510a2 [ARM] AArch32 v8 NEON is still not IEEE-754 compliant
llvm-svn: 266603
2016-04-18 12:06:47 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
Tim Northover 903f81ba18 ARM: don't try to hoist constant RHS out of a division.
Divisions by a constant can be converted into multiplies which are usually
cheaper, but this isn't possible if the constant gets separated (particularly
in loops). Fix this by telling ConstantHoisting that the immediate in a DIV is
cheap.

I considered making the check generic, but neither AArch64 (strangely) nor x86
showed any benefit on the tests I had.

llvm-svn: 266464
2016-04-15 18:17:18 +00:00
Renato Golin 5cb666add7 [ARM] Adding IEEE-754 SIMD detection to loop vectorizer
Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON.

This patch teaches the loop vectorizer to only allow transformations of loops
that either contain no floating-point operations or have enough allowance
flags supporting lack of precision (ex. -ffast-math, Darwin).

For that, the target description now has a method which tells us if the
vectorizer is allowed to handle FP math without falling into unsafe
representations, plus a check on every FP instruction in the candidate loop
to check for the safety flags.

This commit makes LLVM behave like GCC with respect to ARM NEON support, but
it stops short of fixing the underlying problem: sub-normals. Neither GCC
nor LLVM have a flag for allowing sub-normal operations. Before this patch,
GCC only allows it using unsafe-math flags and LLVM allows it by default with
no way to turn it off (short of not using NEON at all).

As a first step, we push this change to make it safe and in sync with GCC.
The second step is to discuss a new sub-normal's flag on both communitues
and come up with a common solution. The third step is to improve the FastMath
flags in LLVM to encode sub-normals and use those flags to restrict NEON FP.

Fixes PR16275.

llvm-svn: 266363
2016-04-14 20:42:18 +00:00
Matthias Braun 46b0f03e12 TargetLowering: Factor out common code for tail call eligibility checking; NFC
llvm-svn: 266270
2016-04-14 01:10:42 +00:00
Tim Northover 5c02f9ad28 ARM: override cost function to re-enable ConstantHoisting (& fix it).
At some point, ARM stopped getting any benefit from ConstantHoisting because
the pass called a different variant of getIntImmCost. Reimplementing the
correct variant revealed some problems, however:

  + ConstantHoisting was modifying switch statements. This is simply invalid,
    the cases must remain integer constants no matter the notional cost.
  + ConstantHoisting was mangling alloca instructions in the entry block. These
    should be handled by FrameLowering, so constants actually have a cost of 0.
    Worse, the resulting bitcasts meant they became dynamic allocas.

rdar://25707382

llvm-svn: 266260
2016-04-13 23:08:27 +00:00
Matthias Braun 707e02c273 ARM: Use a callee save register for the swiftself parameter.
It is very likely that the swiftself parameter is alive throughout most
functions function so putting it into a callee save register should
avoid spills for the callers with only a minimum amount of extra spills
in the callees.

Currently the generated code is correct but unnecessarily spills and
reloads arguments passed in callee save registers, I will address this
in upcoming patches.

This also adds a missing check that for tail calls the preserved value
of the caller must be the same as the callees parameter.

Differential Revision: http://reviews.llvm.org/D18901

llvm-svn: 266253
2016-04-13 21:43:25 +00:00
Tim Northover a6dea06fe3 ARM: use r7 as the frame-pointer on all MachO targets.
This is better for a few reasons:
  + It matches the other tooling for iOS.
  + It matches EABI in more cases (i.e. Thumb-mode, and in practice we don't
    use ARM mode).
  + It leads to infinitesimally smaller code (0.2%, yay!).

rdar://25369506

llvm-svn: 266003
2016-04-11 22:27:40 +00:00
Manman Ren 5751814eda Swift Calling Convention: swifterror target support.
Differential Revision: http://reviews.llvm.org/D18716

llvm-svn: 265997
2016-04-11 21:08:06 +00:00
Oliver Stannard c869e9158d [ARM] Avoid switching ARM/Thumb mode on .arch/.cpu directive
When we see a .arch or .cpu directive, we should try to avoid switching
ARM/Thumb mode if possible.

If we do have to switch modes, we also need to emit the correct mapping
symbol for the new ISA. We did not do this previously, so could emit
ARM code with Thumb mapping symbols (or vice-versa).

The GAS behaviour is to always stay in the same mode, and to emit an
error on any instructions seen when the current mode is not available on
the current target. We can't represent that situation easily (we assume
that Thumb mode is available if ModeThumb is set), so we differ from the
GAS behaviour when switching to a target that can't support the old
mode. I've added a warning for when this implicit mode-switch occurs.

Differential Revision: http://reviews.llvm.org/D18955

llvm-svn: 265936
2016-04-11 13:06:28 +00:00
Sam Parker 2d5126cdf5 [ARM] Enable SMLAW[B|T] and SMLUW[B|T] instruction selection
Added ISelDAGToDAG functions to enable selection of the smlawb, smlawt,
smulwb and smulwt instructions for the ARM backend. Also updated the smul
CodeGen test and removed the smulw one.

Differential Revision: http://reviews.llvm.org/D18892

llvm-svn: 265793
2016-04-08 16:02:53 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Manman Ren 802cd6f9d7 Swift Calling Convention: swiftcc for ARM.
Differential Revision: http://reviews.llvm.org/D18769

llvm-svn: 265482
2016-04-05 22:44:44 +00:00
Sam Parker 0d3a3a537c [ARM] Cleanup of smul and smla instruction descriptions
Removed the SDNode argument passed to the AI_smul and AI_smla multiclass
definitions as they are always mul.

Differential Revision: http://reviews.llvm.org/D18791

llvm-svn: 265409
2016-04-05 16:01:25 +00:00
Matthias Braun 870c34f0cf ARM, AArch64, X86: Check preserved registers for tail calls.
We can only perform a tail call to a callee that preserves all the
registers that the caller needs to preserve.

This situation happens with calling conventions like preserver_mostcc or
cxx_fast_tls. It was explicitely handled for fast_tls and failing for
preserve_most. This patch generalizes the check to any calling
convention.

Related to rdar://24207743

Differential Revision: http://reviews.llvm.org/D18680

llvm-svn: 265329
2016-04-04 18:56:13 +00:00
Derek Schuff 1dbf7a571f Add MachineFunctionProperty checks for AllVRegsAllocated for target passes
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.

Reviewers: qcolombet

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18525

llvm-svn: 265313
2016-04-04 17:09:25 +00:00
James Y Knight e6a4646372 Remove useless check for ThreadModel==Single in ARMISelLowering. NFC.
ThreadModel::Single is already handled already by ARMPassConfig adding
LowerAtomicPass to the pass list, which lowers all atomics to non-atomic
ops and deletes fences.

So by the time we get to ISel, there's no atomic fences left, so they
don't need special handling.

llvm-svn: 265178
2016-04-01 19:33:19 +00:00
James Molloy b876c72bcc Fix for pr24346: arm asm label calculation error in sub
Some ARM instructions encode 32-bit immediates as a 8-bit integer (0-255)
and a 4-bit rotation (0-30, even) in its least significant 12 bits. The
original fixup, FK_Data_4, patches the instruction by the value bit-to-bit,
regardless of the encoding. For example, assuming the label L1 and L2 are
0x0 and 0x104 respectively, the following instruction:

  add r0, r0, #(L2 - L1) ; expects 0x104, i.e., 260

would be assembled to the following, which adds 1 to r0, instead of 260:

  e2800104 add r0, r0, #4, 2 ; equivalently 1

The new fixup kind fixup_arm_mod_imm takes care of the encoding:

  e2800f41 add r0, r0, #260

Patch by Ting-Yuan Huang!

llvm-svn: 265122
2016-04-01 09:40:47 +00:00
Benjamin Kramer 569efd2cfd [ARM] Expand v1i64 and v2i64 ctpop.
The default is legal, which results in 'Cannot select' errors. This is
triggered during selfhost due to a recent cost model change.

llvm-svn: 265040
2016-03-31 19:42:04 +00:00
Hans Wennborg e1a2e90ffa Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.

It also greatly simplifies the calls to this function from Prolog-
EpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.

Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.

Differential Revision: http://reviews.llvm.org/D18627

llvm-svn: 265036
2016-03-31 18:33:38 +00:00
Matthias Braun 8d41436004 CodeGen: Factor out code for tail call result compatibility check; NFC
llvm-svn: 264959
2016-03-30 22:46:04 +00:00
Aaron Ballman ef0fe1eed8 Silencing warnings from MSVC 2015 Update 2. All of these changes silence "C4334 '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)". NFC.
llvm-svn: 264929
2016-03-30 21:30:00 +00:00
Nirav Dave 8dd66e5753 Remove HasFnAttribute guards to getFnAttribute calls
These checks are redundant and can be removed

Reviewers: hans

Subscribers: llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D18564

llvm-svn: 264872
2016-03-30 15:41:12 +00:00
Manman Ren f46262e0b7 Swift Calling Convention: add swiftself attribute.
Differential Revision: http://reviews.llvm.org/D17866

llvm-svn: 264754
2016-03-29 17:37:21 +00:00
Saleem Abdulrasool 750a90df6a ARM: maintain BB ordering when expanding WIN__DBZCHK
It is possible to have a fallthrough MBB prior to MBB placement.  The original
addition of the BB would result in reordering the BB as not preceding the
successor.  Because of the fallthrough nature of the BB, we could end up
executing incorrect code or even a constant pool island!  Insert the spliced BB
into the same location to avoid that.

Thanks to Tim Northover for invaluable hints and Fiora for the discussion on
what may have been occurring!

llvm-svn: 264454
2016-03-25 19:48:06 +00:00
Saleem Abdulrasool 0dab98d926 ARM: fix optimised division on WoA
We did not have an explicit branch to the continuation BB.  When the check was
hoisted, this could permit control follow to fall through into the division
trap.  Add the explicit branch to the continuation basic block to ensure that
code execution is correct.

llvm-svn: 264370
2016-03-25 00:34:11 +00:00
Artyom Skrobov e6f1b7f094 Replace a string comparison in ARMSubtarget.h with a tablegen entry in ARM.td (NFC)
Reviewers: rengolin, t.p.northover

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D18393

llvm-svn: 264165
2016-03-23 16:18:13 +00:00
Peter Collingbourne 86b9fbe980 ARM: Better codegen for 64-bit compares.
This introduces a custom lowering for ISD::SETCCE (introduced in r253572)
that allows us to emit a short code sequence for 64-bit compares.

Before:

	push	{r7, lr}
	cmp	r0, r2
	mov.w	r0, #0
	mov.w	r12, #0
	it	hs
	movhs	r0, #1
	cmp	r1, r3
	it	ge
	movge.w	r12, #1
	it	eq
	moveq	r12, r0
	cmp.w	r12, #0
	bne	.LBB1_2
@ BB#1:                                 @ %bb1
	bl	f
	pop	{r7, pc}
.LBB1_2:                                @ %bb2
	bl	g
	pop	{r7, pc}

After:

	push	{r7, lr}
	subs	r0, r0, r2
	sbcs.w	r0, r1, r3
	bge	.LBB1_2
@ BB#1:                                 @ %bb1
	bl	f
	pop	{r7, pc}
.LBB1_2:                                @ %bb2
	bl	g
	pop	{r7, pc}

Saves around 80KB in Chromium's libchrome.so.

Some notes on this patch:

- I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I
  introduced (nothing else needs them). However, they are necessary in
  order to avoid poor codegen, and they seem similar to existing combines
  in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to
  (brcond Compare)).

- No support for Thumb-1. This is in principle possible, but we'd need
  to implement ARMISD::SUBE for Thumb-1.

Differential Revision: http://reviews.llvm.org/D15256

llvm-svn: 263962
2016-03-21 18:00:02 +00:00
Renato Golin 2b6b7ffd6c [ARM] Add Cortex-A32 support
Adding Cortex-A32 as an available target in the ARM backend.

Patch by Sam Parker.

llvm-svn: 263956
2016-03-21 17:29:01 +00:00
Manman Ren a3a019cf90 [CXX_FAST_TLS] Fix issues in ARM.
We need to be careful on which registers can be explicitly handled
via copies. Prologue, Epilogue use physical registers and if one belongs
to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites
it after the explicit copies.

llvm-svn: 263857
2016-03-18 23:44:37 +00:00
Manman Ren 4865d89653 [CXX_FAST_TLS] Disable tail call when calling conventions are mismatched.
Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller
and callee have mismatched calling conventions.

llvm-svn: 263856
2016-03-18 23:41:51 +00:00
Manman Ren 2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Tim Northover 498c56c240 ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering.
llvm-svn: 263741
2016-03-17 20:10:28 +00:00
Saleem Abdulrasool 071a099102 ARM: Revert SVN r253865, 254158, fix windows division
The two changes together weakened the test and caused a regression with division
handling in MSVC mode.  They were applied to avoid an assertion being triggered
in the block frequency analysis.  However, the underlying problem was simply
being masked rather than solved properly.  Address the actual underlying problem
and revert the changes.  Rather than analyze the cause of the assertion, the
division failure was assumed to be an overflow.

The underlying issue was a subtle bug in the BB construction in the emission of
the div-by-zero check (WIN__DBZCHK).  We did not construct the proper successor
information in the basic blocks, nor did we update the PHIs associated with the
basic block when we split them.  This would result in assertions being triggered
in the block frequency analysis pass.

Although the original tests are being removed, the tests themselves performed
very little in terms of validation but merely tested that we did not assert when
generating code.  Update this with new tests that actually ensure that we do not
regress on the code generation.

llvm-svn: 263714
2016-03-17 14:10:49 +00:00
James Y Knight f44fc5219f Tweak some atomics functions in preparation for larger changes; NFC.
- Rename getATOMIC to getSYNC, as llvm will soon be able to emit both
  '__sync' libcalls and '__atomic' libcalls, and this function is for
  the '__sync' ones.

- getInsertFencesForAtomic() has been replaced with
  shouldInsertFencesForAtomic(Instruction), so that the decision can be
  made per-instruction. This functionality will be used soon.

- emitLeadingFence/emitTrailingFence are no longer called if
  shouldInsertFencesForAtomic returns false, and thus don't need to
  check the condition themselves.

llvm-svn: 263665
2016-03-16 22:12:04 +00:00
Davide Italiano dfdf278ebf [MC] Rename TLSDESC as it's not ARM specific.
Similarly to what was done for TLSCALL in r263515.

llvm-svn: 263564
2016-03-15 17:29:52 +00:00
Davide Italiano 249c45d92e [MC] Rename TLSCALL as it's not ARM specific.
`MCSymbolRefExpr` variant kind for TLSCALL is prefixed with 
_ARM_ since this is how it was originally implemented.
The X86_64 version is exactly the same so there's no reason
to create a new variant, we can just rename the existing
one to be machine-independent.
This generalization is the first step to implement support
for GNU2 TLS dialect in MC.

Differential Revision:  http://reviews.llvm.org/D18160

llvm-svn: 263515
2016-03-15 00:25:22 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Peter Collingbourne aba16fca5d ARM: Support relative references using the PREL31 symbol variant.
Differential Revision: http://reviews.llvm.org/D17937

llvm-svn: 263156
2016-03-10 19:30:18 +00:00
Alexandros Lamprineas 843164242e [ARM] Cortex-R8 support
This patch adds Cortex-R8 to Target Parser and TableGen.
It also adds CodeGen tests for the build attributes.

Patch by Pablo Barrio.

Differential Revision: http://reviews.llvm.org/D17925

llvm-svn: 263132
2016-03-10 17:38:41 +00:00
Saleem Abdulrasool 1632fe1f77 ARM: follow up improvements for SVN r263118
The initial change was insufficiently complete for always getting the semantics
of __builtin_longjmp correct.  The builtin is translated into a
`tInt_eh_sjlj_longjmp` DAG node.  This node set R7 as clobbered.  However, the
code would then follow up with a clobber of R11.  I had failed to notice the
imp-def,kill on R7 in the isel.  Unfortunately, it seems that it is not possible
to conditionalise the Defs list via an !if.  Instead, construct a new parallel
WIN node and prefer that when targeting windows.  This ensures that we now both
correctly model the __builtin_longjmp as well as construct the frame in a more
ABI conformant manner.

llvm-svn: 263123
2016-03-10 16:26:37 +00:00
Saleem Abdulrasool 8b30f9854e ARM: correct __builtin_longjmp on WoA
WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast
to AAPCS which states r7.  This adjusts __builtin_longjmp to not clobber r7 and
to properly restore the frame pointer on execution.

llvm-svn: 263118
2016-03-10 15:11:09 +00:00
Roman Levenstein 2792b3f02f Add support for a preserve_most calling convention to the AArch64 backend.
This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64.

There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention.

Differential Revision: http://reviews.llvm.org/D18016

llvm-svn: 263092
2016-03-10 04:35:09 +00:00
Artyom Skrobov 5ddea6a8e9 [ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC)
Reviewers: t.p.northover, grosbach, resistor

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17636

llvm-svn: 262936
2016-03-08 16:23:54 +00:00
Renato Golin 175c6d6d95 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Renato Golin 3d78271eac Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

llvm-svn: 262594
2016-03-03 08:57:44 +00:00
Renato Golin 93e42d9934 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262507
2016-03-02 19:35:45 +00:00
Matthias Braun f290912d22 ARM: Introduce conservative load/store optimization mode
Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which
means that unaligned loads/stores do not trap and even extensive testing
will not catch these bugs. However the multi/double variants are not
affected by this bit and will still trap. In effect a more aggressive
load/store optimization will break existing (bad) code.

These bugs do not necessarily manifest in the broken code where the
misaligned pointer is formed but often later in perfectly legal code
where it is accessed. This means recompiling system libraries (which
have no alignment bugs) with a newer compiler will break existing
applications (with alignment bugs) that worked before.

So (under protest) I implemented this safe mode which limits the
formation of multi/double operations to cases that are not affected by
user code (stack operations like spills/reloads) or cases where the
normal operations trap anyway (floating point load/stores). It is
disabled by default.

Differential Revision: http://reviews.llvm.org/D17015

llvm-svn: 262504
2016-03-02 19:20:00 +00:00
Matthias Braun 17cb57995e TableGen: Check scheduling models for completeness
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:

- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model

Typical steps necessary to complete a model:

- Ensure all pseudo instructions that are expanded before machine
  scheduling (usually everything handled with EmitYYY() functions in
  XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
  resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.

Differential Revision: http://reviews.llvm.org/D17747

llvm-svn: 262384
2016-03-01 20:03:21 +00:00
Duncan P. N. Exon Smith fd8cc23220 CodeGen: Change MachineInstr to use MachineInstr&, NFC
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null.  Slowly inching
toward being able to fix PR26753.

llvm-svn: 262149
2016-02-27 20:01:33 +00:00
Tim Northover aa35bd26c7 ARM: disallow pc as a base register in Thumb2 memory ops.
These should all be deferring to the "OP (literal)" variant according to the
ARM ARM.

llvm-svn: 261895
2016-02-25 16:54:52 +00:00
Tim Northover b9edc5ca21 ARM: fix handling of movw/movt relocations with addend.
We were emitting only one half of a the paired relocations needed for these
instructions because we decided that an offset needed a scattered relocation.
In fact, movw/movt relocations can be paired without being scattered.

llvm-svn: 261679
2016-02-23 20:20:23 +00:00
Weiming Zhao 5e0c3ebe9e Fix PR25339: ARM Constant Island
Summary:
Currently, the ARM Constant Island may not converge (or not converge quickly).
This patch let it move to the closest water after the user if it doesn't converge after 15 iterations.

This address https://llvm.org/bugs/show_bug.cgi?id=25339

Reviewers: t.p.northover, srhines, kristof.beyls, aadg, rengolin

Subscribers: weimingz, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D16890

llvm-svn: 261665
2016-02-23 18:39:19 +00:00
Junmo Park 453f4aa4dd [ARM] fix initialization of PredictableSelectIsExpensive
Summary:
If we want classify OoO or not, using getSchedModel().isOutOfOrder()
could be more proper way than using Subtarget->isLikeA9().

Reviewers: jmolloy, rengolin

Differential Revision: http://reviews.llvm.org/D17433

llvm-svn: 261623
2016-02-23 09:56:58 +00:00
Duncan P. N. Exon Smith 6307eb5518 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00
Duncan P. N. Exon Smith d84f600653 CodeGen: Bring back MachineBasicBlock::iterator::getInstrIterator()...
This is a little embarrassing.

When I reverted r261504 (getIterator() => getInstrIterator()) in
r261567, I did a `git grep` to see if there were new calls to
`getInstrIterator()` that I needed to migrate.  There were 10-20 hits,
and I blindly did a `sed ...` before calling `ninja check`.

However, these were `MachineInstrBundleIterator::getInstrIterator()`,
which predated r261567.  Perhaps coincidentally, these had an identical
name and return type.

This commit undoes my careless sed and restores
`MachineBasicBlock::iterator::getInstrIterator()`.

llvm-svn: 261577
2016-02-22 21:30:15 +00:00
Duncan P. N. Exon Smith c5b668deb8 Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"
This reverts commit r261504, since it's not obvious the new name is
better:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html

I'll recommit if we get consensus that it's the right direction.

llvm-svn: 261567
2016-02-22 20:49:58 +00:00
Duncan P. N. Exon Smith dc0848c029 CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.

- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
  that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator().  This matches the
  naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator().  This is explicitly called
  "bundle" (not matching MachineBasicBlock) to disintinguish it clearly
  from ilist_node::getIterator().
- Update all calls.  Some of these I switched to `auto` to remove
  boiler-plate, since the new name is clear about the type.

There was one call I updated that looked fishy, but it wasn't clear what
the right answer was.  This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp.  I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.

llvm-svn: 261504
2016-02-21 22:58:35 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Junmo Park 1108ab059c Minor code cleanups. NFC.
llvm-svn: 261294
2016-02-19 01:46:04 +00:00
Ahmed Bougacha 93cff7fb82 [CodeGen] Document and use getConstant's splat-building feature. NFC.
Differential Revision: http://reviews.llvm.org/D17229

llvm-svn: 260901
2016-02-15 18:07:29 +00:00
Ahmed Bougacha f8dfb47c02 [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
llvm-svn: 260316
2016-02-09 22:54:12 +00:00
Saleem Abdulrasool f36005a358 ARM: support TLS for WoA
Add support for TLS access for Windows on ARM.  This generates a similar access
to MSVC for ARM.

The changes to the tablegen data is needed to support loading an external symbol
global that is not for a call.  The adjustments to the DAG to DAG transforms are
needed to preserve the 32-bit move.

llvm-svn: 259676
2016-02-03 18:21:59 +00:00
Renato Golin 6027dd38ef [ARM] Move GNUEABI divmod to __aeabi_divmod*
The GNU toolchain emits __aeabi_divmod for soft-divide on ARM cores
which happens to be a lot faster than __divsi3/__modsi3 when the core
has hardware divide instructions. Do the same here.

Fixes PR26450.

llvm-svn: 259657
2016-02-03 16:10:54 +00:00
Sjoerd Meijer ffe19f5245 Removed FeatureVFPOnlySP from the Cortex-R7 processor model
description and changed the regression test accordingly.
The default configuration of a Cortex-R7 is to implement the
VFPv3-D16 architecture and the feature line as it was is too
restrictive.

llvm-svn: 259480
2016-02-02 09:28:20 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Yaron Keren eb2a25467e Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.
clang part in r259232, this is the LLVM part of the patch.

llvm-svn: 259240
2016-01-29 20:50:44 +00:00
Tim Northover c4093c3ced ARM: don't mangle DAG constant if it has more than one use
The basic optimisation was to convert (mul $LHS, $complex_constant) into
roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected
to be cheaper. The original logic checks that the mul only has one use (since
we're mangling $complex_constant), but when used in even more complex
addressing modes there may be an outer addition that can pick up the wrong
value too.

I *think* the ARM addressing-mode problem is actually unreachable at the
moment, but that depends on complex assessments of the profitability of
pre-increment addressing modes so I've put a real check in there instead of an
assertion.

llvm-svn: 259228
2016-01-29 19:18:46 +00:00
Alexandros Lamprineas 8c26e7c647 [ARM] Emit trap instruction using .inst directive
The trap instruction is emitted as a data-in-text rather
than an instruction. This patch uses the .inst directive
for emitting trap.

Differential Revision: http://reviews.llvm.org/D16684

llvm-svn: 259182
2016-01-29 10:23:32 +00:00
Tim Northover 042a6c1fe1 ARMv7k: base ABI decision on v7k Arch rather than watchos OS.
Various bits we want to use the new ABI actually compile with "-arch armv7k
-miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how
slices work.

llvm-svn: 258975
2016-01-27 19:32:29 +00:00
Benjamin Kramer 391be792f2 One more batch of self-containing headers.
llvm-svn: 258974
2016-01-27 19:29:56 +00:00
Benjamin Kramer b32a5042bd Don't put classes in headers into anonymous namespaces.
You want ODR violations? That's how you get ODR violations.

llvm-svn: 258973
2016-01-27 19:29:42 +00:00
Benjamin Kramer 45275a4d3c Make more headers self-contained.
A lot of this comes from the new complete type requirement of DenseMap.

llvm-svn: 258956
2016-01-27 18:03:37 +00:00
Benjamin Kramer f9172fd4ac Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/
It's a SelectionDAG thing, not a Target thing.

llvm-svn: 258939
2016-01-27 16:32:26 +00:00
Benjamin Kramer b3e8a6d2b8 Move MCTargetAsmParser.h to llvm/MC/MCParser where it belongs.
llvm-svn: 258917
2016-01-27 10:01:28 +00:00
Chris Bieneman e49730d4ba Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Benjamin Kramer f57c1977c1 Reflect the MC/MCDisassembler split on the include/ level.
No functional change, just moving code around.

llvm-svn: 258818
2016-01-26 16:44:37 +00:00
Bradley Smith d27a6a7072 [ARM] Add DSP build attribute and extension targeting
This patch was originally committed as r257885, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258683
2016-01-25 11:26:11 +00:00
Bradley Smith f277c8a5ea [ARM] Add new system registers to ARMv8-M Baseline/Mainline
This patch was originally committed as r257884, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258682
2016-01-25 11:25:36 +00:00
Bradley Smith fed3e4ac00 [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline
This patch was originally committed as r257883, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258681
2016-01-25 11:24:47 +00:00
Oliver Stannard 65b85382f6 [ARM] Add ARMv8.2-A FP16 scalar instructions
This was originally committed as r255762, but reverted as it broke windows
bots. Re-commitiing the exact same patch, as the underlying cause was fixed by
r258677.

ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

The assembly for these instructions uses S registers (AArch32 does not
have H registers), but the instructions have ".f16" type specifiers
rather than ".f32" or ".f64". The top 16 bits of each source register
are ignored, and the top 16 bits of the destination register are set to
zero.

These instructions are mostly the same as the 32- and 64-bit versions,
but they use coprocessor 9 rather than 10 and 11.

Two new instructions, VMOVX and VINS, have been added to allow packing
and extracting two 16-bit floats stored in the top and bottom halves of
an S register.

New fixup kinds have been added for the PC-relative load and store
instructions, but no ELF relocations have been added as they have a
range of 512 bytes.

Differential Revision: http://reviews.llvm.org/D15038

llvm-svn: 258678
2016-01-25 10:26:26 +00:00
Oliver Stannard 9f68749eba [ARM] Operands for PKHTB alias should be swapped
When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of
the input operands needs to be swapped.

Differential Revision: http://reviews.llvm.org/D16288

llvm-svn: 258044
2016-01-18 11:56:35 +00:00
Manuel Jacob 190577ac81 [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: dsanders, llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16273

llvm-svn: 258023
2016-01-17 22:37:39 +00:00
Manman Ren e5f807f928 CXX_FAST_TLS calling convention: fix issue on ARM.
When we have a single basic block, the explicit copy-back instructions should
be inserted right before the terminator. Before this fix, they were wrongly
placed at the beginning of the basic block.

PR26136

llvm-svn: 257930
2016-01-15 20:24:11 +00:00
Reid Kleckner d4a0d18899 Revert "[ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline"
This reverts commit r257883.

Somehow this didn't make it into r257916.

llvm-svn: 257919
2016-01-15 18:55:12 +00:00
Reid Kleckner 47f2452da8 # This is a combination of 2 commits.
# The first commit's message is:

Revert "[ARM] Add DSP build attribute and extension targeting"

This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc.

# This is the 2nd commit message:

Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline"

This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5.

llvm-svn: 257916
2016-01-15 18:31:29 +00:00
Bradley Smith 48b93e1f21 [ARM] Add DSP build attribute and extension targeting
llvm-svn: 257885
2016-01-15 10:28:25 +00:00
Bradley Smith 42f6e90a43 [ARM] Add new system registers to ARMv8-M Baseline/Mainline
llvm-svn: 257884
2016-01-15 10:28:03 +00:00
Bradley Smith 618712df04 [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257883
2016-01-15 10:27:14 +00:00
Bradley Smith 433c22e35c [ARM] Add ARMv8-A semaphore/atomic instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257882
2016-01-15 10:26:51 +00:00
Bradley Smith a1189106d5 [ARM] Add B.W and CBZ instructions to ARMv8-M Baseline
llvm-svn: 257881
2016-01-15 10:26:17 +00:00
Bradley Smith 519563e371 [ARM] Add SDIV/UDIV instructions to ARMv8-M Baseline
llvm-svn: 257880
2016-01-15 10:25:35 +00:00
Bradley Smith d9a99ce53d [ARM] Add MOVW/MOVT instructions to ARMv8-M Baseline/Mainline
llvm-svn: 257879
2016-01-15 10:25:14 +00:00
Bradley Smith e26f799422 [ARM] Add ARMv8-M Baseline/Mainline LLVM targeting
llvm-svn: 257878
2016-01-15 10:24:39 +00:00
Bradley Smith 4c21cba72b [ARM] Split out ARMv8-A semaphores and atomics and ARMv7 clrex as separate features
llvm-svn: 257877
2016-01-15 10:23:46 +00:00
Rui Ueyama da00f2fdf4 Update to use new name alignTo().
llvm-svn: 257804
2016-01-14 21:06:47 +00:00
Benjamin Kramer fc1f7d893e [ARM] Use the efficient version of BitVector::set and a static_assert.
No functional change intended.

llvm-svn: 257766
2016-01-14 14:33:04 +00:00
Rafael Espindola 8340f94df1 Convert a few assert failures into proper errors.
Fixes PR25944.

llvm-svn: 257697
2016-01-13 22:56:57 +00:00
Ana Pazos 359cab3bb3 Guard fabs to bfc convert with V6T2 flag
Summary:
BFC instructions are available in ARMv6T2 and above.


Reviewers: t.p.northover

Subscribers: aemerson

Differential Revision: http://reviews.llvm.org/D16076

llvm-svn: 257546
2016-01-13 00:03:35 +00:00