Commit Graph

2298 Commits

Author SHA1 Message Date
Tim Northover 9ac3e42211 AArch64: remove all kill flags when extending register liveness.
When we forward a stored value to a load and eliminate it entirely we need to
make sure the liveness of the register is maintained all the way to its use.
Previously we only cleared liveness on the store doing the forwarding, but
there could be other killing uses in between.

We already do the right thing when the load has to be converted into something
else, it was just this one path that skipped it.

llvm-svn: 306318
2017-06-26 18:49:25 +00:00
Hiroshi Inoue a85d24b73d fix trivial typos in comment, NFC
llvm-svn: 306211
2017-06-24 16:00:26 +00:00
Rafael Espindola 6418856127 Simplify the processFixupValue interface. NFC.
llvm-svn: 306202
2017-06-24 05:22:28 +00:00
Rafael Espindola f351292141 Remove redundant argument.
llvm-svn: 306189
2017-06-24 00:26:57 +00:00
Rafael Espindola 801b42de31 ARM: move some logic from processFixupValue to applyFixup.
processFixupValue is called on every relaxation iteration. applyFixup
is only called once at the very end. applyFixup is then the correct
place to do last minute changes and value checks.

While here, do proper range checks again for fixup_arm_thumb_bl. We
used to do it, but dropped because of thumb2. We now do it again, but
use the thumb2 range.

llvm-svn: 306177
2017-06-23 22:52:36 +00:00
Geoff Berry dd239718bd [AArch64][Falkor] Remove some non-existent opcodes from sched detail regexes. NFC.
llvm-svn: 306170
2017-06-23 21:59:09 +00:00
Chad Rosier 6db9ff64a8 [AArch64] Prefer Bcc to CBZ/CBNZ/TBZ/TBNZ when NZCV flags can be set for "free".
This patch contains a pass that transforms CBZ/CBNZ/TBZ/TBNZ instructions into a
conditional branch (Bcc), when the NZCV flags can be set for "free". This is
preferred on targets that have more flexibility when scheduling Bcc
instructions as compared to CBZ/CBNZ/TBZ/TBNZ (assuming all other variables are
equal). This can reduce register pressure and is also the default behavior for
GCC.

A few examples:

 add w8, w0, w1  -> cmn w0, w1             ; CMN is an alias of ADDS.
 cbz w8, .LBB_2  -> b.eq .LBB0_2           ; single def/use of w8 removed.

 add w8, w0, w1  -> adds w8, w0, w1        ; w8 has multiple uses.
 cbz w8, .LBB1_2 -> b.eq .LBB1_2

 sub w8, w0, w1       -> subs w8, w0, w1   ; w8 has multiple uses.
 tbz w8, #31, .LBB6_2 -> b.ge .LBB6_2

In looking at all current sub-target machine descriptions, this transformation
appears to be either positive or neutral.

Differential Revision: https://reviews.llvm.org/D34220.

llvm-svn: 306144
2017-06-23 19:20:12 +00:00
Tim Northover 4b4eec7009 GlobalISel: remove G_SEQUENCE instruction.
It was trying to do too many things. The basic lumping together of values for
legalization purposes is now handled by G_MERGE_VALUES. More complex things
involving gaps and odd sizes are handled by G_INSERT sequences.

llvm-svn: 306120
2017-06-23 16:15:55 +00:00
Rafael Espindola 88d9e37ec8 Use a MutableArrayRef. NFC.
llvm-svn: 305968
2017-06-21 23:06:53 +00:00
Christof Douma c1c28051d2 [AARCH64][LSE] Preliminary support for ARMv8.1 LSE Atomics.
Implemented support to AArch64 codegen for ARMv8.1 Large System
Extensions atomic instructions. Where supported, these instructions can
provide atomic operations with higher performance.

Currently supported operations include: fetch_add, fetch_or, fetch_xor,
fetch_smin, fetch_min/max (signed and unsigned), swap, and
compare_exchange.

This implementation implies sequential-consistency ordering, more
relaxed ordering is under development.

Subtarget->hasLSE is currently supported for Cavium ThunderX2T99.

Patch by Ananth Jasty.

Differential Revision: https://reviews.llvm.org/D33586

Change-Id: I82f6d3d64255622791ceb0715b7ab9f4dc4d4b2c
llvm-svn: 305893
2017-06-21 10:58:31 +00:00
Florian Hahn 8552e591a1 [AArch64] Add early exit to promoteLoadFromStore.
There should be at most a single kill flag for the
promoted operand between the store/load pair.
Discussed in https://reviews.llvm.org/D34402.

llvm-svn: 305889
2017-06-21 09:51:52 +00:00
Florian Hahn 80e485179e [AArch64] Preserve register flags when promoting a load from store.
Summary:
This patch updates promoteLoadFromStore to use the store MachineOperand as the
source operand of the of the new instruction instead of creating a new
register MachineOperand. This way, the existing register flags are
preserved. 

This fixes PR33468 (https://bugs.llvm.org/show_bug.cgi?id=33468). 


Reviewers: MatzeB, t.p.northover, junbuml

Reviewed By: MatzeB

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34402

llvm-svn: 305885
2017-06-21 08:47:23 +00:00
Rafael Espindola 3ac4c09daf clang-format a region.
It will make a followup patch easier to read.

llvm-svn: 305865
2017-06-20 22:53:29 +00:00
Geoff Berry 5e46600e3a [AArch64][Falkor] Fix MOVZ sched predicate to not assert on non-imm operands (e.g. blockaddress).
llvm-svn: 305752
2017-06-19 21:57:44 +00:00
Geoff Berry e9972cabbd [AArch64][Kryo] Add missing write latency for LDAXP, LDXP second destination.
Fixes PR33491 and PR33512.

llvm-svn: 305751
2017-06-19 21:57:42 +00:00
Geoff Berry 3cc4b9f780 [AArch64][Falkor] Refine load/store increment latencies.
Also fix LDXP & LDAXP write latency to avoid similar assert as PR33491 and PR33512.

llvm-svn: 305750
2017-06-19 21:56:21 +00:00
Florian Hahn fd44ca6c76 [AArch64] Fix order of checks in shouldScheduleAdjacent.
We need to check the opcode of FirstMI before accessing the operands. This
caused a buildbot failure during bootstrapping on AArch64.

llvm-svn: 305694
2017-06-19 13:45:41 +00:00
Florian Hahn 5f746c8e27 Recommit rL305677: [CodeGen] Add generic MacroFusion pass
Use llvm::make_unique to avoid ambiguity with MSVC.

This patch adds a generic MacroFusion pass, that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.

Differential Revision: https://reviews.llvm.org/D34144

llvm-svn: 305690
2017-06-19 12:53:31 +00:00
Florian Hahn e16d3106f3 Revert r305677 [CodeGen] Add generic MacroFusion pass.
This causes Windows buildbot failures do an ambiguous call.

llvm-svn: 305681
2017-06-19 11:26:15 +00:00
Florian Hahn ee1b096f8a [CodeGen] Add generic MacroFusion pass.
Summary:
This patch adds a generic MacroFusion pass, that is used on X86 and
AArch64, which both define target-specific shouldScheduleAdjacent
functions. This generic pass should make it easier for other targets to
implement macro fusion and I intend to add macro fusion for ARM shortly.

Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB

Reviewed By: MatzeB

Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34144

llvm-svn: 305677
2017-06-19 10:51:38 +00:00
Nirav Dave 85e92223b4 [AArch64] Add indexed check to splitStores. NFC.
Add explicit check for unhandled cases in preparation for delaying
splitStores to post-legalization.

llvm-svn: 305471
2017-06-15 14:47:44 +00:00
Florian Hahn 0a26d2c298 [AArch64] Enable FeatureFuseAES for the generic processor model.
Summary:
Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back
gives a double digit speedup on benchmarks using those instructions on
Cortex-A processors. In GCC, this optimization is part of the generic
processor model as well.

This change should not have a major performance impact on processors
that do not optimize AES instruction pairs, although I only had access
to Cortex-A processors for benchmarking.


Reviewers: rengolin, kristof.beyls, javed.absar, evandro, silviu.baranga, MatzeB, mcrosier, joelkevinjones, joel_k_jones, bmakam, t.p.northover

Reviewed By: evandro

Subscribers: sbaranga, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D33836

llvm-svn: 305457
2017-06-15 09:31:23 +00:00
Geoff Berry 13d5dcb093 [AArch64][Falkor] Fix sched details for FDIV, FSQRT, SDIV, UDIV
llvm-svn: 305310
2017-06-13 17:43:39 +00:00
Tim Northover 7a61316e89 AArch64: don't try to emit an add (shifted reg) for SP.
The "Add/sub (shifted reg)" instructions use the 31 encoding for xzr and wzr
rather than the SP, so we need to use different variants.

Situations where this actually comes up are rare enough (see test-case) that I
think falling back to DAG is fine.

llvm-svn: 305230
2017-06-12 20:49:53 +00:00
Haicheng Wu ef790ffd56 [Falkor] Enable SW Prefetch.
SW prefetch is good for Falkor.

Differential Revision: http://reviews.llvm.org/D34084

llvm-svn: 305199
2017-06-12 16:34:19 +00:00
Daniel Neilson c0112ae8da Const correctness for TTI::getRegisterBitWidth
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.

Reviewers: chandlerc, rnk, reames

Reviewed By: reames

Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits

Differential Revision: https://reviews.llvm.org/D33903

llvm-svn: 305189
2017-06-12 14:22:21 +00:00
I-Jui (Ray) Sung 21fde385fa [AArch64] Add fallback in FastISel fp16 conversions
Summary:
- Fix assertion failures on F16 to/from int types in FastISel by falling
  back to regular ISel
- Add a testcase of various conversion cases with FastISel (-O0)

Reviewers: kristof.beyls, jmolloy, SjoerdMeijer

Reviewed By: SjoerdMeijer

Subscribers: SjoerdMeijer, llvm-commits, srhines, pirama, aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D33734

llvm-svn: 305127
2017-06-09 22:40:50 +00:00
Zachary Turner 264b5d9e88 Move Object format code to lib/BinaryFormat.
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.

Differential Revision: https://reviews.llvm.org/D33843

llvm-svn: 304864
2017-06-07 03:48:56 +00:00
Chandler Carruth 6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Mandeep Singh Grang 5e1697ef28 [llvm] Remove double semicolons
Reviewers: craig.topper, arsenm, mehdi_amini

Reviewed By: mehdi_amini

Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33924

llvm-svn: 304767
2017-06-06 05:08:36 +00:00
Geoff Berry 57d8a417e7 [AArch64][Falkor] Model immediate forwarding.
llvm-svn: 304552
2017-06-02 14:27:41 +00:00
Eugene Zelenko 7ea692373c [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 304495
2017-06-01 23:25:02 +00:00
Florian Hahn ff25b6d8f6 [AArch64] Enable FeatureFuseAES on Cortex-A53.
It improves performance on Cortex-A53.

llvm-svn: 304307
2017-05-31 15:50:03 +00:00
Florian Hahn 064a2f9222 [AArch64] Enable FeatureFuseAES on Cortex-A73.
It improves performance on Cortex-A73.

llvm-svn: 304304
2017-05-31 15:25:25 +00:00
Abderrazek Zaafrani 855411566b Add latency info for Exynos interleaved Load/Store instructions.
llvm-svn: 304259
2017-05-31 00:20:55 +00:00
Matthias Braun 5e394c3d6f TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFC
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.

While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.

llvm-svn: 304247
2017-05-30 21:36:41 +00:00
Craig Topper f6d4dc5b4a [SelectionDAG] Set ISD::FPOWI to Expand by default
Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".

This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits

Differential Revision: https://reviews.llvm.org/D33530

llvm-svn: 304215
2017-05-30 15:27:55 +00:00
Kristof Beyls 2af1e90eb2 Fix PR33031: correct the estimate of maximum offset for instructions spilling/filling the stack.
llvm-svn: 304196
2017-05-30 06:58:41 +00:00
Geoff Berry 2739ebafb6 [AArch64][Falkor] Combine sched details files into one. NFC.
llvm-svn: 304109
2017-05-28 22:20:44 +00:00
Geoff Berry b542fb3817 [AArch64][Falkor] Fix some sched details.
- Remove all uses of base sched model entries and set them all to
  Unsupported so all the opcodes are described in
  AArch64SchedFalkorDetails.td.
- Remove entries for unsupported half-float opcodes.
- Remove entries for unsupported LSE extension opcodes.
- Add entry for MOVbaseTLS (and set Sched in base td file entry to
  WriteSys) and a few other pseudo ops.
- Fix a few FP load/store with reg offset entries to use the LSLfast
  predicates.
- Add Q size BIF/BIT/BSL entries.
- Fix swapped Q/D sized CLS/CLZ/CNT/RBIT entires.
- Fix pre/post increment address register latency (this operand is
  always dest 0).
- Fix swapped FCVTHD/FCVTHS/FCVTDH/FCVTDS entries.
- Fix XYZ resource over usage on LD[1-4] opcodes.

llvm-svn: 304108
2017-05-28 21:48:31 +00:00
Matthias Braun 88c8c9847d AArch64/PEI: Do not add reserved regs to liveins
We do not track liveness for reserved registers. It is unnecessary to
add them to block livein lists.

llvm-svn: 304059
2017-05-27 03:38:02 +00:00
Quentin Colombet 7a43eddf28 [AArch64][GlobalISel] Add the Localizer pass for the O0 pipeline
This should fix most of the issue we have right now with constants being
spilled all over the place.

llvm-svn: 304052
2017-05-27 01:34:07 +00:00
Matthias Braun b4f74224ff AArch64: Fix cmpxchg O0 expansion
- Rewrite livein calculation to use the computeLiveIns() helper
  function. This is slightly less efficient but easier to reason about
  and doesn't unnecessarily add pristine and reserved registers[1]
- Zero the status register at the beginning of the loop to make sure it
  has a defined value.
- Remove kill flags of values that need to stay alive throughout the loop.

[1] An upcoming commit of mine will tighten the MachineVerifier to catch
    these.

llvm-svn: 304048
2017-05-26 23:48:59 +00:00
Matthias Braun ac4307c41e LivePhysRegs: Rework constructor + documentation; NFC
- Take reference instead of pointer to a TRI that cannot be nullptr.
- Improve documentation comments.

llvm-svn: 304038
2017-05-26 21:51:00 +00:00
Nirav Dave 6ff50bf242 Fix signedness of constant. NFC.
llvm-svn: 303980
2017-05-26 12:53:10 +00:00
Manoj Gupta d536180fdc [AArch64]: add 'a' inline asm operand modifier.
Summary:
This is used in the Linux kernel, and effectively just means "print an
address". This brings back r193593.

Reviewed by: Renato Golin

Reviewers: t.p.northover, rengolin, richard.barton.arm, kristof.beyls

Subscribers: aemerson, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D33558

llvm-svn: 303901
2017-05-25 19:07:57 +00:00
Nirav Dave bb20b5d5c3 [AArch64] Prevent nested ADDs from address calc in splitStoreSplat. NFC
In preparation for late-stage store merging.

llvm-svn: 303800
2017-05-24 19:55:49 +00:00
Matthew Simpson 6349380fa4 Revert r291254: [AArch64] Reduce vector insert/extract cost for Falkor
The default vector insert/extract cost is more profitable on Falkor than the
reduced cost.

llvm-svn: 303771
2017-05-24 16:48:39 +00:00
Geoff Berry d6ac96f953 [AArch64][Falkor] Refine sched details for LSLfast/ASRfast.
llvm-svn: 303682
2017-05-23 19:57:45 +00:00
Geoff Berry e6366f505f [AArch64][Falkor] Fix sched details for FMOV of WZR/XZR.
llvm-svn: 303680
2017-05-23 19:54:28 +00:00