Commit Graph

1713 Commits

Author SHA1 Message Date
Martin Storsjo eacf4e408b [AArch64] Rewrite stack frame handling for win64 vararg functions
The previous attempt, which made do with a single offset in
computeCalleeSaveRegisterPairs, wasn't quite enough. The previous
attempt only worked as long as CombineSPBump == true (since the
offset would be adjusted later in fixupCalleeSaveRestoreStackOffset).

Instead include the size for the fixed stack area used for win64
varargs in calculations in emitPrologue/emitEpilogue. The stack
consists of mainly three parts;
- AFI->getLocalStackSize()
- AFI->getCalleeSavedStackSize()
- FixedObject

Most of the places in the code which previously used the CSStackSize
now use PrologueSaveSize instead, which is the sum of the latter
two, while some cases which need exactly the middle one use
AFI->getCalleeSavedStackSize() explicitly instead of a local variable.

In addition to moving the offsetting into emitPrologue/emitEpilogue
(which fixes functions with CombineSPBump == false), also set the
frame pointer to point to the right location, where the frame pointer
and link register actually are stored. In addition to the prologue/epilogue,
this also requires changes to resolveFrameIndexReference.

Add tests for a function that keeps a frame pointer and another one
that uses a VLA.

Differential Revision: https://reviews.llvm.org/D35919

llvm-svn: 309744
2017-08-01 21:13:54 +00:00
Nirav Dave 27a6605bdc Revert "[DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector."
This reverts commit r309680 which appears to be raising an assertion
in the test-suite.

llvm-svn: 309717
2017-08-01 18:09:25 +00:00
Nirav Dave b5cb48c6ae [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Summary:
Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

llvm-svn: 309680
2017-08-01 13:45:35 +00:00
Aditya Nandakumar 02c602e18c [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

llvm-svn: 309579
2017-07-31 17:00:16 +00:00
Florian Hahn f63a5e91db [AArch64] Tie source and destination operands for AESMC/AESIMC.
Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
    AESE Vn, _
    AESMC Vn, Vn
and
    AESD Vn, _
    AESIMC Vn, Vn

The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constraint source and destination registers to be the same.

A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.

I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.

Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga

Reviewed By: t.p.northover, evandro

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35299

llvm-svn: 309495
2017-07-29 20:35:28 +00:00
Florian Hahn 2f86e3d494 [AArch64] Use 8 bytes as preferred function alignment on Cortex-A53.
Summary:
This change gives a 0.25% speedup on execution time, a 0.82% improvement
in benchmark scores and a 0.20% increase in binary size on a Cortex-A53.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite and a range of proprietary suites.

Reviewers: t.p.northover, aadg, silviu.baranga, mcrosier, rengolin

Reviewed By: rengolin

Subscribers: grimar, davide, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35568

llvm-svn: 309494
2017-07-29 20:04:54 +00:00
Tim Northover a7f583e33b GlobalISel: map 128-bit values to an FPR by default.
Eventually we may want to allow a pair of GPRs but absolutely nothing in the
entire world is ready for that yet.

llvm-svn: 309404
2017-07-28 17:11:01 +00:00
Ahmed Bougacha 87807c5a86 [AArch64] Fix legality info passed to demanded bits for TBI opt.
The (seldom-used) TBI-aware optimization had a typo lying dormant since
it was first introduced, in r252573:  when asking for demanded bits, it
told TLI that it was running after legalize, where the opposite was
true.

This is an important piece of information, that the demanded bits
analysis uses to make assumptions about the node.  r301019 added such an
assumption, which was broken by the TBI combine.

Instead, pass the correct flags to TLO.

llvm-svn: 309323
2017-07-27 21:27:25 +00:00
Martin Storsjo 84cda2d779 [AArch64] Add a test for float argument passing to win64 vararg functions
The existing tests only tested how a va_start is lowered.

Differential Revision: https://reviews.llvm.org/D35540

llvm-svn: 309015
2017-07-25 19:57:22 +00:00
Martin Storsjo 8cb3667541 [AArch64] Reserve a 16 byte aligned amount of fixed stack for win64 varargs
Create a dummy 8 byte fixed object for the unused slot below the first
stored vararg.

Alternative ideas tested but skipped: One could try to align the whole
fixed object to 16, but I haven't found how to add an offset to the stack
frame used in LowerWin64_VASTART.

If only the size of the fixed stack object size is padded but not the offset, via
MFI.CreateFixedObject(alignTo(GPRSaveSize, 16), -(int)GPRSaveSize, false),
PrologEpilogInserter crashes due to "Attempted to reset backwards range!".

This fixes misconceptions about where registers are spilled, since
AArch64FrameLowering.cpp assumes the offset from fixed objects is
aligned to 16 bytes (and the Win64 case there already manually aligns
the offset to 16 bytes).

This fixes cases where local stack allocations could overwrite callee
saved registers on the stack.

Differential Revision: https://reviews.llvm.org/D35720

llvm-svn: 308950
2017-07-25 05:20:01 +00:00
Florian Hahn 57ffb2c9d8 [AArch64] Add test for function alignment for a optsize function (NFC).
Reviewers: dblaikie, t.p.northover, rengolin

Reviewed By: rengolin

Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D35620

llvm-svn: 308852
2017-07-23 21:15:10 +00:00
Chad Rosier 9b2b4c961a [AArch64] Redundant Copy Elimination - remove more zero copies.
This patch removes unnecessary zero copies in BBs that are targets of b.eq/b.ne
and we know the result of the compare instruction is zero.  For example,

BB#0:
  subs w0, w1, w2
  str w0, [x1]
  b.ne .LBB0_2
BB#1:
  mov w0, wzr  ; <-- redundant
  str w0, [x2]
.LBB0_2

Differential Revision: https://reviews.llvm.org/D35075

llvm-svn: 308849
2017-07-23 16:38:08 +00:00
Tim Northover 7b6d66c0c9 Recommit: GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
It revealed a bug in the Localizer pass which has now been fixed.

This includes the fix for SUBREG_TO_REG committed separately last time.

llvm-svn: 308688
2017-07-20 22:58:38 +00:00
Tim Northover 071d77a51f GlobalISel: stop localizer putting constants before EH_LABELs
If the localizer pass puts one of its constants before the label that tells the
unwinder "jump here to handle your exception" then control-flow will skip it,
leaving uninitialized registers at runtime. That's bad.

llvm-svn: 308687
2017-07-20 22:58:26 +00:00
Matt Arsenault db78273b6e Add an ID field to StackObjects
On AMDGPU SGPR spills are really spilled to another register.
The spiller creates the spills to new frame index objects,
which is used as a placeholder.

This will eventually be replaced with a reference to a position
in a VGPR to write to and the frame index deleted. It is
most likely not a real stack location that can be shared
with another stack object.

This is a problem when StackSlotColoring decides it should
combine a frame index used for a normal VGPR spill with
a real stack location and a frame index used for an SGPR.

Add an ID field so that StackSlotColoring has a way
of knowing the different frame index types are
incompatible.

llvm-svn: 308673
2017-07-20 21:03:45 +00:00
Diana Picus 7534b28291 Revert "GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64."
This reverts commit 36c6a2ea9669bc3bb695928529a85d12d1d3e3f9 because it
broke the test-suite on the GlobalISel bot.

llvm-svn: 308603
2017-07-20 11:36:03 +00:00
Francis Visoiu Mistrih 52042aa21e [PEI] Add basic opt-remarks support
Add optimization remarks support to the PrologueEpilogueInserter. For
now, emit the stack size as an analysis remark, but more additions wrt
shrink-wrapping may be added.

https://reviews.llvm.org/D35645

llvm-svn: 308556
2017-07-19 23:47:32 +00:00
Tim Northover 0e0b3c97dd GlobalISel: fix SUBREG_TO_REG implementation.
The first argument needs to be an immediate rather than a register. Should fix
some crashes in the verifier bot.

llvm-svn: 308540
2017-07-19 22:08:08 +00:00
Tim Northover d59fbec8e2 GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
llvm-svn: 308493
2017-07-19 16:47:07 +00:00
Balaram Makam b05a55787a [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary:
When simplifying unconditional branches from empty blocks, we pre-test if the
BB belongs to a set of loop headers and keep the block to prevent passes from
destroying canonical loop structure. However, the current algorithm fails if
the destination of the branch is a loop header. Especially when such a loop's
latch block is folded into loop header it results in additional backedges and
LoopSimplify turns it into a nested loop which prevent later optimizations
from being applied (e.g., loop  unrolling and loop interleaving).

This patch augments the existing algorithm by further checking if the
destination of the branch belongs to a set of loop headers and defer
eliminating it if yes to LateSimplifyCFG.

Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl

Reviewed By: efriedma

Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35411

llvm-svn: 308422
2017-07-19 08:53:34 +00:00
Mandeep Singh Grang d857b4ca98 [COFF, ARM64] Reserve X18 register by default
Reviewers: compnerd, rnk, ruiu, mstorsjo

Reviewed By: mstorsjo

Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35531

llvm-svn: 308358
2017-07-18 20:41:33 +00:00
Nirav Dave d839749ae8 [DAG] Improve Aliasing of operations to static alloca
Re-recommiting after landing DAG extension-crash fix.

Recommiting after adding check to avoid miscomputing alias information
on addresses of the same base but different subindices.

Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.

Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.

Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.

The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.

Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand

Reviewed By: rnk

Subscribers: sdardis, nemanjai, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33345

llvm-svn: 308350
2017-07-18 20:06:24 +00:00
Geoff Berry 9962faed2b [AArch64][Falkor] Avoid HW prefetcher tag collisions (step 2)
Summary:
Avoid HW prefetcher instruction tag collisions in loops by inserting
MOVs to change the base address register of strided loads.

Reviewers: t.p.northover, mcrosier

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D35366

llvm-svn: 308324
2017-07-18 16:14:22 +00:00
Daniel Sanders 40b66d646e [globalisel][tablegen] Enable the import of rules involving fma.
Summary:
G_FMA was recently added to GlobalISel which enables the import of rules
involving fma. Add the mapping to allow it.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35130

llvm-svn: 308308
2017-07-18 14:10:07 +00:00
Florian Hahn 3530094de6 [AArch64] Use 16 bytes as preferred function alignment on Cortex-A73.
Summary:
Using 16 byte alignment is beneficial on Cortex-A73, similar to
Cortex-A72 (added in D34961).

Reviewers: mcrosier, t.p.northover, aadg, silviu.baranga

Reviewed By: t.p.northover

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35493

llvm-svn: 308283
2017-07-18 09:31:18 +00:00
Chandler Carruth a15e080b05 Revert r308025 due to uncovering a crash in SelectionDAG. This is filed
with a minimal test case in http://llvm.org/PR33833.

Original commit message:
  Improve Aliasing of operations to static alloca

llvm-svn: 308271
2017-07-18 07:53:47 +00:00
Martin Storsjo 2f24e93481 [AArch64] Extend CallingConv::X86_64_Win64 to AArch64 as well
Rename the enum value from X86_64_Win64 to plain Win64.

The symbol exposed in the textual IR is changed from 'x86_64_win64cc'
to 'win64cc', but the numeric value is kept, keeping support for
old bitcode.

Differential Revision: https://reviews.llvm.org/D34474

llvm-svn: 308208
2017-07-17 20:05:19 +00:00
Mandeep Singh Grang ed64963f1e [llvm] Remove redundant check-prefix=CHECK from tests. NFC.
Reviewers: t.p.northover, oren_ben_simhon, niravd, mcrosier

Reviewed By: oren_ben_simhon, mcrosier

Subscribers: nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35466

llvm-svn: 308193
2017-07-17 17:32:45 +00:00
Yi Kong 3b680d8d81 [AArch64] Avoid selecting XZR inline ASM memory operand
Restricting register class to PointerRegClass for memory operands.

Also fix the PointerRegClass for AArch64 from GPR64 to GPR64sp, since
XZR cannot hold a memory pointer while SP is.

Fixes PR33134.

Differential Revision: https://reviews.llvm.org/D34999

llvm-svn: 308060
2017-07-14 21:46:16 +00:00
Geoff Berry b1e8714af9 [AArch64][Falkor] Avoid HW prefetcher tag collisions (step 1)
Summary:
This patch is the first step in reducing HW prefetcher instruction tag
collisions in inner loops for Falkor.  It adds a pass that annotates IR
loads with metadata to indicate that they are known to be strided loads,
and adds a target lowering hook that translates this metadata to a
target-specific MachineMemOperand flag.

A follow on change will use this MachineMemOperand flag to re-write
instructions to reduce tag collisions.

Reviewers: mcrosier, t.p.northover

Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34963

llvm-svn: 308059
2017-07-14 21:44:12 +00:00
Nirav Dave a8f63af9d1 Improve Aliasing of operations to static alloca
Recommiting after adding check to avoid miscomputing alias information
on addresses of the same base but different subindices.

Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.

Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.

Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.

The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.

Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand

Reviewed By: rnk

Subscribers: sdardis, nemanjai, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33345

llvm-svn: 308025
2017-07-14 13:56:21 +00:00
Martin Storsjo 68266faa31 [AArch64] Implement support for windows style vararg functions
Pass parameters properly in calls to such functions (pass all
floats in integer registers), and handle va_start properly (allocate
stack immediately below the arguments on the stack, to save the
register arguments into a single continuous array).

Differential Revision: https://reviews.llvm.org/D35006

llvm-svn: 307928
2017-07-13 17:03:12 +00:00
Matthew Simpson 06e6a6bdff [AArch64] Add preliminary support for ARMv8.1 SUB/AND atomics
This patch is a follow-up to r305893 and adds preliminary support for the
fetch_sub and fetch_and operations.

llvm-svn: 307913
2017-07-13 15:01:23 +00:00
Justin Bogner 4fc696635d GlobalISel: Handle selection of G_IMPLICIT_DEF in AArch64
A generic variant of IMPLICIT_DEF was added in r306875, but this
survives to selection and hits a `Cannot Select`. Add handling that
converts the note to a regular IMPLICIT_DEF.

llvm-svn: 307817
2017-07-12 17:32:32 +00:00
Evandro Menezes 14ba3d7730 [CodeGen] Add dependency printer
Add SDep printer to make debugging sessions more productive.

Differential revision: https://reviews.llvm.org/D35144

llvm-svn: 307799
2017-07-12 15:30:59 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Daniel Sanders fe12c0fa56 [globalisel][tablegen] Correct matching of intrinsic ID's.
TreePatternNode considers them to be plain integers but MachineInstr considers
them to be a distinct kind of operand.

The tweak to AArch64InstrInfo.td to produce a simple test case is a NFC for
everything except GlobalISelEmitter (confirmed by diffing the tablegenerated
files). GlobalISelEmitter is currently unable to infer the type of operands in
the Dst pattern from the operands in the Src pattern.

llvm-svn: 307634
2017-07-11 08:57:29 +00:00
Matthias Braun b38736706e Revert "[DAG] Improve Aliasing of operations to static alloca"
Reverting as it breaks tramp3d-v4 in the llvm test-suite. I added some
comments to https://reviews.llvm.org/D33345 about it.

This reverts commit r307546.

llvm-svn: 307589
2017-07-10 20:51:30 +00:00
Nirav Dave 163e1ad9dc [DAG] Improve Aliasing of operations to static alloca
Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.

Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.

Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.

The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.

Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand

Reviewed By: rnk

Subscribers: sdardis, nemanjai, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33345

llvm-svn: 307546
2017-07-10 15:39:41 +00:00
Florian Hahn d4550baf3b [AArch64] Use 16 bytes as preferred function alignment on Cortex-A57.
Summary:
This change gives a 0.89% speed on execution time, a 0.94% improvement
in benchmark scores and a 0.62% increase in binary size on a Cortex-A57.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite, SPEC2000, SPEC2006 and a range of proprietary suites.

The software optimization guide for the Cortex-A57 recommends 16 byte
branch alignment.

Reviewers: t.p.northover, mcrosier, javed.absar, kristof.beyls, sbaranga

Reviewed By: kristof.beyls

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D34954

llvm-svn: 307389
2017-07-07 10:43:01 +00:00
Florian Hahn e3666ec9d6 [AArch64] Use 16 bytes as preferred function alignment on Cortex-A72.
Summary:
This change gives a 0.34% speed on execution time, a 0.61% improvement
in benchmark scores and a 0.57% increase in binary size on a Cortex-A72.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite, SPEC2000, SPEC2006 and a range of proprietary suites.

The software optimization guide for the Cortex-A72 recommends 16 byte
branch alignment.


Reviewers: t.p.northover, kristof.beyls, rengolin, sbaranga, mcrosier, javed.absar

Reviewed By: kristof.beyls

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D34961

llvm-svn: 307380
2017-07-07 10:15:49 +00:00
Florian Hahn 9872a6aaad [AArch64] Add test case for preferred function alignment (NFC).
Reviewers: evandro, joelkevinjones, mcrosier

Reviewed By: joelkevinjones, mcrosier

Subscribers: mcrosier, aemerson, llvm-commits, rengolin, evandro, javed.absar, joelkevinjones, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34951

llvm-svn: 307369
2017-07-07 09:17:53 +00:00
Brian Gesiak 4ef3daafef [ORE] Add diagnostics hotness threshold
Summary:
Add an option to prevent diagnostics that do not meet a minimum hotness
threshold from being output. When generating optimization remarks for
large codebases with a ton of cold code paths, this option can be used
to limit the optimization remark output at a reasonable size. Discussion of
this change can be read here:
http://lists.llvm.org/pipermail/llvm-dev/2017-June/114377.html

Reviewers: anemet, davidxl, hfinkel

Reviewed By: anemet

Subscribers: qcolombet, javed.absar, fhahn, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D34867

llvm-svn: 306912
2017-06-30 23:14:53 +00:00
Tim Northover ff5e7e1295 GlobalISel: add G_IMPLICIT_DEF instruction.
It looks like there are two target-independent but not GISel instructions that
need legalization, IMPLICIT_DEF and PHI. These are already anomalies since
their operands have important LLTs attached, so to make things more uniform it
seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF.

llvm-svn: 306875
2017-06-30 20:27:36 +00:00
Aditya Nandakumar 20f6207013 [GISel]: New Opcode G_FLOG/G_FLOG2
https://reviews.llvm.org/D34837

llvm-svn: 306766
2017-06-29 23:43:44 +00:00
Alexandros Lamprineas c0432d86aa [AArch64] AArch64CondBrTuningPass generates wrong branch instructions
Some conditional branch instructions generated by this pass are checking
the wrong condition code. The instructions TBZ and TBNZ are transformed
into B.GE and B.LT instead of B.PL and B.MI respectively. They should
only be checking the Negative bit.

Differential Revision: https://reviews.llvm.org/D34743

llvm-svn: 306550
2017-06-28 15:09:11 +00:00
Aditya Nandakumar cca75d2406 [GISel]: Add G_FEXP, G_FEXP2 opcodes
Also add IRTranslator support.
https://reviews.llvm.org/D34710

llvm-svn: 306475
2017-06-27 22:19:32 +00:00
Tim Northover 849fcca090 GlobalISel: verify that a COPY is trivial when created.
Without this check, COPY instructions can actually be one of the generic casts
in disguise. That's confusing and bad.

At some point during ISel this restriction has to be relaxed since the fully
selected instructions will usually use COPY for those purposes. Right now I
think it's possible that relaxation occurs during RegBankSelect (hence the
change there). I'm not convinced that's where it belongs long-term though.

llvm-svn: 306470
2017-06-27 21:41:40 +00:00
Matthew Simpson 0bd79f416a [AArch64] Update successor probabilities after ccmp-conversion
This patch modifies the conditional compares pass so that it keeps successor
probabilities up-to-date after the conversion. Previously, successor
probabilities were being normalized to a uniform distribution, even though they
may have been heavily biased prior to the conversion (e.g., if one of the edges
was the back edge of a loop). This loss of information affected passes later in
the pipeline.

Differential Revision: https://reviews.llvm.org/D34109

llvm-svn: 306412
2017-06-27 15:00:22 +00:00
Daniel Sanders cc36dbf55d [globalisel][tablegen] Add support for EXTRACT_SUBREG.
Summary:
After this patch, we finally have test cases that require multiple
instruction emission.

Depends on D33590

Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls

Subscribers: javed.absar, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D33596

llvm-svn: 306388
2017-06-27 10:11:39 +00:00