Commit Graph

20957 Commits

Author SHA1 Message Date
Petr Hosek 710479cede [CodeGen][X86] Fuchsia supports sincos* libcalls and sin+cos->sincos optimization
Patch by Roland McGrath

Differential Revision: https://reviews.llvm.org/D35748

llvm-svn: 308854
2017-07-23 22:30:00 +00:00
Florian Hahn 57ffb2c9d8 [AArch64] Add test for function alignment for a optsize function (NFC).
Reviewers: dblaikie, t.p.northover, rengolin

Reviewed By: rengolin

Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D35620

llvm-svn: 308852
2017-07-23 21:15:10 +00:00
Chad Rosier 9b2b4c961a [AArch64] Redundant Copy Elimination - remove more zero copies.
This patch removes unnecessary zero copies in BBs that are targets of b.eq/b.ne
and we know the result of the compare instruction is zero.  For example,

BB#0:
  subs w0, w1, w2
  str w0, [x1]
  b.ne .LBB0_2
BB#1:
  mov w0, wzr  ; <-- redundant
  str w0, [x2]
.LBB0_2

Differential Revision: https://reviews.llvm.org/D35075

llvm-svn: 308849
2017-07-23 16:38:08 +00:00
Craig Topper 6912d7faa3 [X86] Add patterns for memory forms of SARX/SHLX/SHRX with careful complexity adjustment to keep shift by immediate using the legacy instructions.
These patterns were only missing to favor using the legacy instructions when the shift was a constant. With careful adjustment of the pattern complexity we can make sure the immediate instructions still have priority over these patterns.

llvm-svn: 308834
2017-07-23 03:59:37 +00:00
Nirav Dave 4e6dcf73f9 [DAG] Fix typo preventing some stores merges to truncated stores.
Check the actual memory type stored and not the extended value size
when considering if truncated store merge is worthwhile.

Reviewers: efriedma, RKSimon, spatel, jyknight

Reviewed By: efriedma

Subscribers: llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D35623

llvm-svn: 308833
2017-07-23 02:06:28 +00:00
Matt Arsenault c5d1e503e1 RA: Remove another assert on empty intervals
This case is similar to the one fixed in r308808,
except when rematerializing.

Fixes bug 33884.

llvm-svn: 308813
2017-07-22 00:24:01 +00:00
Matt Arsenault 6a963f76ca RA: Remove assert on empty live intervals
This is possible if there is an undef use when
splitting the vreg during spilling.

Fixes bug 33620.

llvm-svn: 308808
2017-07-21 23:56:13 +00:00
Erich Keane d8f61f8f7e Remove Bitrig: LLVM Changes
Bitrig code has been merged back to OpenBSD, thus the OS has been abandoned.

Differential Revision: https://reviews.llvm.org/D35707

llvm-svn: 308799
2017-07-21 22:48:47 +00:00
Konstantin Zhuravlyov e9a5a77ee3 AMDGPU: Implement memory model
llvm-svn: 308781
2017-07-21 21:19:23 +00:00
Krzysztof Parzyszek 3ad0d01e9e [Hexagon] Add inline-asm constraint 'a' for modifier register class
For example
  asm ("memw(%0++%1) = %2" : : "r"(addr),"a"(mod),"r"(val) : "memory")

llvm-svn: 308761
2017-07-21 17:51:27 +00:00
Simon Dardis 0310eb7a67 [mips] Support -membedded-data and fix a related bug
-membedded-data changes the location of constant data from the .sdata to
the .rodata section. Previously it was (incorrectly) always located in the
.rodata section.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D35686

llvm-svn: 308758
2017-07-21 17:19:00 +00:00
Jonas Paulsson be7a7e4979 [SystemZ] test update
test/CodeGen/SystemZ/loop-01.ll was incorrectly updated by r308729.

llvm-svn: 308736
2017-07-21 13:14:17 +00:00
Jonas Paulsson 024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Simon Pilgrim 84cbd8e750 [X86][SSE] Add extra (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) test case
We should be able to handle the case where some c1+c2 elements exceed max shift and some don't by performing a clamp after the sum

llvm-svn: 308724
2017-07-21 10:22:49 +00:00
Simon Pilgrim 32c377a1cf [X86][SSE] Add pre-AVX2 support for (i32 bitcast(v32i1)) -> 2xMOVMSK
Currently we only support (i32 bitcast(v32i1)) using the AVX2 VPMOVMSKB ymm instruction.

This patch adds support for splitting pre-AVX2 targets into 2 x (V)PMOVMSKB xmm instructions and merging the integer results.

In future we could probably generalize this to handle more cases.

Differential Revision: https://reviews.llvm.org/D35303

llvm-svn: 308723
2017-07-21 09:58:50 +00:00
Craig Topper 31140ade70 [AVX-512] Fix a bug that prevented some non-temporal loads from using the movntdqa instruction.
The bitconverts here had an input type of 128-bits and an output type of 256 bits. The input type should also have been 256 bits.

llvm-svn: 308702
2017-07-21 00:40:42 +00:00
Tim Northover 7b6d66c0c9 Recommit: GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
It revealed a bug in the Localizer pass which has now been fixed.

This includes the fix for SUBREG_TO_REG committed separately last time.

llvm-svn: 308688
2017-07-20 22:58:38 +00:00
Tim Northover 071d77a51f GlobalISel: stop localizer putting constants before EH_LABELs
If the localizer pass puts one of its constants before the label that tells the
unwinder "jump here to handle your exception" then control-flow will skip it,
leaving uninitialized registers at runtime. That's bad.

llvm-svn: 308687
2017-07-20 22:58:26 +00:00
Artem Belevich d7a73824e4 [NVPTX] Add lowering of i128 params.
The patch adds support of i128 params lowering. The changes are quite trivial to
support i128 as a "special case" of integer type. With this patch, we lower i128
params the same way as aggregates of size 16 bytes: .param .b8 _ [16].

Currently, NVPTX can't deal with the 128 bit integers:
* in some cases because of failed assertions like
  ValVTs.size() == OutVals.size() && "Bad return value decomposition"
* in other cases emitting PTX with .i128 or .u128 types (which are not valid [1])
  [1] http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#fundamental-types

Differential Revision: https://reviews.llvm.org/D34555
Patch by: Denys Zariaiev (denys.zariaiev@gmail.com)

llvm-svn: 308675
2017-07-20 21:16:03 +00:00
Matt Arsenault db78273b6e Add an ID field to StackObjects
On AMDGPU SGPR spills are really spilled to another register.
The spiller creates the spills to new frame index objects,
which is used as a placeholder.

This will eventually be replaced with a reference to a position
in a VGPR to write to and the frame index deleted. It is
most likely not a real stack location that can be shared
with another stack object.

This is a problem when StackSlotColoring decides it should
combine a frame index used for a normal VGPR spill with
a real stack location and a frame index used for an SGPR.

Add an ID field so that StackSlotColoring has a way
of knowing the different frame index types are
incompatible.

llvm-svn: 308673
2017-07-20 21:03:45 +00:00
Zvi Rackover eac8e7c08a [X86] Adding ISel tests for strided-shuffles with non-zero offset. NFC.
llvm-svn: 308672
2017-07-20 21:03:36 +00:00
James Y Knight bb76d48d59 [SPARC] Clean up the support for disabling fsmuld and fmuls instructions.
Summary:
Also enable no-fsmuld for sparcv7 (which doesn't have the
instruction).

The previous code which used a post-processing pass to do this was
unnecessary; disabling the instruction is entirely sufficient.

Reviewers: jacob_hansen, ekedaigle

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35576

llvm-svn: 308661
2017-07-20 20:09:11 +00:00
Craig Topper 27c12e088e [X86] Allow masks with more than 6 bits set on the x << (y & mask) optimization for the 64-bit memory shifts.
llvm-svn: 308657
2017-07-20 19:29:58 +00:00
Craig Topper 02959b3d05 [X86] Add test case to demonstrate that we don't allow masks wider than 6 bits in the (shift x, (and y, mask)) patterns for the 64-bit memory form.
We allow wider than 5 bits in the 16 and 32 bit store forms. And we allow wider than 6 bits on the 64-bit regsiter form.:w

I'm assuming this was a mistake made back in r148024.

llvm-svn: 308656
2017-07-20 19:29:56 +00:00
Nirav Dave df86d2d008 [DAG] Handle missing transform in fold of value extension case.
Summary:
When pushing an extension of a constant bitwise operator on a load
into the load, change other uses of the load value if they exist to
prevent the old load from persisting.

Reviewers: spatel, RKSimon, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35030

llvm-svn: 308618
2017-07-20 13:57:32 +00:00
Nirav Dave 77cc6f23b9 [DAG] Optimize away degenerate INSERT_VECTOR_ELT nodes.
Summary:
Add missing vector write of vector read reduction, i.e.:

(insert_vector_elt x (extract_vector_elt x idx) idx) to x

Reviewers: spatel, RKSimon, efriedma

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35563

llvm-svn: 308617
2017-07-20 13:48:17 +00:00
Stefan Maksimovic be0bc71e02 Reland r308585
Builder clang-x86_64-linux-abi-test apparently failed due
to a spurious error unrelated to the changes r308585
introduced.

llvm-svn: 308612
2017-07-20 13:08:18 +00:00
Simon Pilgrim b6485252aa [X86][AVX512] Improve vector rotation constant folding tests
Test constant folding both on node creation (which already works) and once the input nodes have been folded themselves (not working yet).

llvm-svn: 308611
2017-07-20 13:07:37 +00:00
Simon Atanasyan fb953926b1 [mips] Support `long_call/far/near` attributes passed by front-end
This patch adds handling of the `long_call`, `far`, and `near`
attributes passed by front-end. The patch depends on D35479.

Differential revision: https://reviews.llvm.org/D35480.

llvm-svn: 308606
2017-07-20 12:19:26 +00:00
Diana Picus 7534b28291 Revert "GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64."
This reverts commit 36c6a2ea9669bc3bb695928529a85d12d1d3e3f9 because it
broke the test-suite on the GlobalISel bot.

llvm-svn: 308603
2017-07-20 11:36:03 +00:00
Simon Pilgrim 2911296f10 [DAGCombiner] Match ISD::SRL non-uniform constant vectors patterns using predicates.
Use predicate matchers introduced in D35492 to match more ISD::SRL constant folds

llvm-svn: 308602
2017-07-20 11:03:30 +00:00
Simon Pilgrim 7ff0e49d8c [DAGCombiner] Match ISD::SRA non-uniform constant vectors patterns using predicates.
Use predicate matchers introduced in D35492 to match more ISD::SRA constant folds

llvm-svn: 308600
2017-07-20 10:43:05 +00:00
Simon Pilgrim 9d7863b935 [DAGCombiner] Match non-uniform constant vectors using predicates.
Most combines currently recognise scalar and splat-vector constants, but not non-uniform vector constants.

This patch introduces a matching mechanism that uses predicates to check against BUILD_VECTOR of ConstantSDNode, as well as scalar ConstantSDNode cases.

I've changed a couple of predicates to demonstrate - the combine-shl changes add currently unsupported cases, while the MatchRotate replaces an existing mechanism.

Differential Revision: https://reviews.llvm.org/D35492

llvm-svn: 308598
2017-07-20 10:13:40 +00:00
Stefan Maksimovic 3793a82b28 Revert r308585
Builder clang-x86_64-linux-abi-test seems to fail after this change

llvm-svn: 308597
2017-07-20 09:57:14 +00:00
Stefan Maksimovic 8539f77bc3 [mips] Fix fp select machine verifier errors
Introduced FSELECT node necesary when lowering ISD::SELECT
which has i32, f64, f64 as its operands.
SEL_D instruction required that its output and first operand
of a SELECT node, which it used, have matching types.
MTC1_D64 node introduced to aid FSELECT lowering.

This fixes machine verifier errors on following tests:
CodeGen/Mips/llvm-ir/select-dbl.ll
CodeGen/Mips/llvm-ir/select-flt.ll
CodeGen/Mips/select.ll

Differential Revision: https://reviews.llvm.org/D35408

llvm-svn: 308595
2017-07-20 09:21:10 +00:00
Craig Topper 33225ef314 [X86] Use SARX/SHLX/SHLX instructions for (shift x (and y, (BitWidth-1)))
Fixes PR33841.

llvm-svn: 308591
2017-07-20 06:19:55 +00:00
Craig Topper bdd114ef9d [X86] Add test cases for (shift x (and y, (BitWidth-1))) to the BMI2 shift test.
We should use SHLX and similar instructions for these patterns, but we currently don't.

llvm-svn: 308590
2017-07-20 06:19:54 +00:00
Craig Topper a774ecc7f5 [X86] Regenerate shift-and.ll and shift-bmi2.ll using update_llc_test_checks.py.
I've stripped the checks for 64-bit types in 32-bit mode to match the existing tests.

llvm-svn: 308589
2017-07-20 06:19:53 +00:00
Craig Topper 01d4ca3916 [X86] Remove outdated bug comment from a test.
The test issue was fixed and the test was updated in r244577, but the comment wasn't removed.

llvm-svn: 308588
2017-07-20 06:19:52 +00:00
Francis Visoiu Mistrih 52042aa21e [PEI] Add basic opt-remarks support
Add optimization remarks support to the PrologueEpilogueInserter. For
now, emit the stack size as an analysis remark, but more additions wrt
shrink-wrapping may be added.

https://reviews.llvm.org/D35645

llvm-svn: 308556
2017-07-19 23:47:32 +00:00
Tim Northover 0e0b3c97dd GlobalISel: fix SUBREG_TO_REG implementation.
The first argument needs to be an immediate rather than a register. Should fix
some crashes in the verifier bot.

llvm-svn: 308540
2017-07-19 22:08:08 +00:00
Wolfgang Pieb 3610942c12 Forgot to add triple to test in r308513.
llvm-svn: 308527
2017-07-19 21:45:21 +00:00
Wolfgang Pieb e018bbd835 Fixing an issue with the initialization of LexicalScopes objects when mixing debug
and non-debug units.

Patch by Andrea DiBiagio.

Differential Revision:  https://reviews.llvm.org/D35637

llvm-svn: 308513
2017-07-19 19:36:40 +00:00
Krzysztof Parzyszek ac01994db9 [Hexagon] Fix a bug in r308502: post-inc offset is always 0
llvm-svn: 308510
2017-07-19 19:17:32 +00:00
Davide Italiano 5fc5d0a406 [X86] Don't try to scale down if that exceeds the bitwidth.
Fixes the crash reported in PR33844.

llvm-svn: 308503
2017-07-19 18:09:46 +00:00
Tim Northover d59fbec8e2 GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
llvm-svn: 308493
2017-07-19 16:47:07 +00:00
Javed Absar 2cb0c95031 [ARM] Unify handling of M-Class system registers
This patch cleans up and fixes issues in the M-Class system register handling:

1. It defines the system registers and the encoding (SYSm values) in one place:
   a new ARMSystemRegister.td using SearchableTable, thereby removing the
   hand-coded values which existed in multiple places.

2. Some system registers e.g. BASEPRI_MAX_NS which do not exist were being allowed!
   Ref: ARMv6/7/8M architecture reference manual.

Reviewed by: @t.p.northover, @olist01, @john.brawn
Differential Revision: https://reviews.llvm.org/D35209

llvm-svn: 308456
2017-07-19 12:57:16 +00:00
Simon Pilgrim e5c7925c5e [X86][XOP] Use default AVX2 lowering for v4i64 ashr by splat constants
XOP shifts only support 128-bit vectors, so we were ending up with less optimal codegen requiring constants

llvm-svn: 308430
2017-07-19 10:29:31 +00:00
Balaram Makam b05a55787a [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary:
When simplifying unconditional branches from empty blocks, we pre-test if the
BB belongs to a set of loop headers and keep the block to prevent passes from
destroying canonical loop structure. However, the current algorithm fails if
the destination of the branch is a loop header. Especially when such a loop's
latch block is folded into loop header it results in additional backedges and
LoopSimplify turns it into a nested loop which prevent later optimizations
from being applied (e.g., loop  unrolling and loop interleaving).

This patch augments the existing algorithm by further checking if the
destination of the branch belongs to a set of loop headers and defer
eliminating it if yes to LateSimplifyCFG.

Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605

Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl

Reviewed By: efriedma

Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35411

llvm-svn: 308422
2017-07-19 08:53:34 +00:00
Chandler Carruth bb83558f00 Revert r308273 to reinstate part of r308100.
That part was reverted because the underlying change necessitating it
(r308025) was reverted in r308271.

Nirav re-landed r308025 again in r308350, so re-landing this fix.

llvm-svn: 308418
2017-07-19 04:15:30 +00:00