Commit Graph

156544 Commits

Author SHA1 Message Date
Matt Arsenault 88efb9ff8e MI: Print ranges on MMO
llvm-svn: 318020
2017-11-13 07:09:20 +00:00
Craig Topper ca8abedb2a [X86] Use sse_load_f32/f64 to improve load folding for scalar VFPCLASS intrinsics.
llvm-svn: 318019
2017-11-13 06:46:48 +00:00
Craig Topper bf328f263e [X86] Add tests for missed opportunities to fold a 128-bit vector load into vfpclassss and vpfpclasssd.
llvm-svn: 318018
2017-11-13 06:46:46 +00:00
Matt Arsenault e5e0c742df AMDGPU: Preserve nuw in shl add ptr combine
llvm-svn: 318017
2017-11-13 05:33:35 +00:00
Craig Topper d4f6094091 [X86] Fix SQRTSS/SQRTSD/RCPSS/RCPSD intrinsics to use sse_load_f32/sse_load_f64 to increase load folding opportunities.
llvm-svn: 318016
2017-11-13 05:25:24 +00:00
Craig Topper 24389c6746 [X86] Add tests for full vector loads to fold-load-unops.ll.
We should be able to fold a full vector load into a scalar intrinsic. Since it's legal to narrow a load.

llvm-svn: 318015
2017-11-13 05:25:23 +00:00
Craig Topper a95a1fd42d [X86] Regenerate fold-load-unops.ll and add and avx512f command line.
llvm-svn: 318014
2017-11-13 05:25:21 +00:00
Matt Arsenault fbe9533509 AMDGPU: Fix multi-use shl/add combine
This was using a custom function that didn't handle the
addressing modes properly for private. Use
isLegalAddressingMode to avoid duplicating this.

Additionally, skip the combine if there is only one use
since the standard combine will handle it.

llvm-svn: 318013
2017-11-13 05:11:54 +00:00
Craig Topper 23493f3777 [X86] Attempt to fix signed and unsigned comparison warning.
llvm-svn: 318010
2017-11-13 02:19:13 +00:00
Craig Topper deee24b83c [X86] Use sse_load_f32/f64 in patterns for the memory forms of VRNDSCALESS/SD.
llvm-svn: 318009
2017-11-13 02:03:01 +00:00
Craig Topper 63157c4784 [X86] Use EVEX encoded VRNDSCALE instructions to implement the legacy round intrinsics.
The VRNDSCALE instructions implement a superset of the (V)ROUND instructions. They are equivalent if the upper 4-bits of the immediate are 0.

This patch lowers the legacy intrinsics to the VRNDSCALE ISD node and masks the upper bits of the immediate to 0. This allows us to take advantage of the larger register encoding space.

We should maybe consider converting VRNDSCALE back to VROUND in the EVEX to VEX pass if the extended registers are not being used.

I notice some load folding opportunities being missed for the VRNDSCALESS/SD instructions that I'll try to fix in future patches.

llvm-svn: 318008
2017-11-13 02:03:00 +00:00
Craig Topper 0af48f1ad4 [X86] Split VRNDSCALE/VREDUCE/VGETMANT/VRANGE ISD nodes into versions with and without the rounding operand. NFCI
I want to reuse the VRNDSCALE node for the legacy SSE rounding intrinsics so that those intrinsics can use EVEX instructions. All of these nodes share tablegen multiclasses so I split them all so that they all remain similar in their implementations.

llvm-svn: 318007
2017-11-13 02:02:58 +00:00
Matt Arsenault 90e4f719e1 Fix some misc. -enable-var-scope violations
llvm-svn: 318006
2017-11-13 01:47:52 +00:00
Matt Arsenault e1cd482fda AMDGPU: Select d16 loads into low component of register
llvm-svn: 318005
2017-11-13 00:22:09 +00:00
Matt Arsenault 70b9282015 AMDGPU: Fix -enable-var-scope violations
llvm-svn: 318004
2017-11-12 23:53:44 +00:00
Matt Arsenault cf9b6d8d57 AMDGPU: Fix missing gfx9 atomic inc/dec tests
The global instructions weren't tested. Plus there
were also some -enable-var-scope violations and
broken check prefixes.

llvm-svn: 318003
2017-11-12 23:40:12 +00:00
Craig Topper b42a23ff8f [X86] Add an X86ISD::RANGES opcode to use for the scalar intrinsics.
This fixes a bug where we selected packed instructions for scalar intrinsics.

llvm-svn: 317999
2017-11-12 18:51:09 +00:00
Craig Topper 6b53c4a982 [X86] Add test cases and command lines demonstrating how we accidentally select vrangeps/vrangepd from vrangess/vrangesd instrinsics when the rounding mode is CUR_DIRECTION
llvm-svn: 317998
2017-11-12 18:51:08 +00:00
Craig Topper 1382932c12 [X86] Remove some no longer needed intrinsic lowering code.
llvm-svn: 317997
2017-11-12 18:51:06 +00:00
Mandeep Singh Grang d104673257 [llvm] Remove redundant return [NFC]
Reviewers: davidxl, olista01, Eugene.Zelenko

Reviewed By: Eugene.Zelenko

Subscribers: sdardis, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39917

llvm-svn: 317995
2017-11-12 03:47:50 +00:00
Craig Topper d3e5781e53 [InstCombine] Teach visitICmpInst to not break integer absolute value idioms
Summary:
This patch adds an early out to visitICmpInst if we are looking at a compare as part of an integer absolute value idiom. Similar is already done for min/max.

In the particular case I observed in a benchmark we had an absolute value of a load from an indexed global. We simplified the compare using foldCmpLoadFromIndexedGlobal into a magic bit vector, a shift, and an and. But the load result was still used for the select and the negate part of the absolute valute idiom. So we overcomplicated the code and lost the ability to recognize it as an absolute value.

I've chosen a simpler case for the test here.

Reviewers: spatel, davide, majnemer

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39766

llvm-svn: 317994
2017-11-12 02:28:21 +00:00
Craig Topper ac250825c6 [X86] Use vrndscaleps/pd for 128/256 ffloor/ftrunc/fceil/fnearbyint/frint when avx512vl is enabled.
This matches what we do for scalar and 512-bit types.

llvm-svn: 317991
2017-11-11 21:44:51 +00:00
Craig Topper ae9ffa1f5a [X86] Remove avx512-round.ll. The 512-bit rounding tests are now in vec_floor.ll with 128/256 sizes.
llvm-svn: 317990
2017-11-11 21:44:50 +00:00
Craig Topper e44fc7836e [X86] Add avx512vl command line to vec_floor.ll. Add 512-bit test cases.
llvm-svn: 317989
2017-11-11 21:44:49 +00:00
Craig Topper a9f48803d7 [X86] Add avx512f command line to rounding-ops.ll
llvm-svn: 317988
2017-11-11 21:44:48 +00:00
Craig Topper 1a20db2108 [X86] Regenerate rounding-ops.ll with update_llc_test_checks.py
llvm-svn: 317987
2017-11-11 21:44:47 +00:00
Simon Pilgrim 294b87b432 [X86] Attempt to match multiple binary reduction ops at once. NFCI
matchBinOpReduction currently matches against a single opcode, but we already have a case where we repeat calls to try to match against AND/OR and I'll be shortly adding another case for SMAX/SMIN/UMAX/UMIN (D39729).

This NFCI patch alters matchBinOpReduction to try and pattern match against any of the provided list of candidate bin ops at once to save time.

Differential Revision: https://reviews.llvm.org/D39726

llvm-svn: 317985
2017-11-11 18:16:55 +00:00
Craig Topper 0ccec70ff5 [X86] Add scalar register class versions of VRNDSCALE instructions and rename the existing versions to _Int.
This is consistent with out normal implementation of scalar instructions.

While there disable load folding for the patterns with IMPLICIT_DEF unless optimizing for size which is also our standard practice.

llvm-svn: 317977
2017-11-11 08:24:15 +00:00
Craig Topper 4d80c5dafc [X86] Regenerate avx512-round.ll test.
llvm-svn: 317976
2017-11-11 08:24:13 +00:00
Craig Topper 80405076b0 [X86] Inline some SDNode operand multiclass operands that don't vary. NFC
llvm-svn: 317975
2017-11-11 08:24:12 +00:00
Craig Topper 4a63843706 [X86] Set the execution domain for VFPCLASS to SSEPackedSingle/Double.
llvm-svn: 317974
2017-11-11 06:57:44 +00:00
Craig Topper 1a093934a9 [X86] Set the execution domain for vptest instruction to the integer domain.
llvm-svn: 317973
2017-11-11 06:19:12 +00:00
Daniel Sanders 7e52367398 [globalisel][tablegen] Import signextload and zeroextload.
Allow a pattern rewriter to be installed in CodeGenDAGPatterns and use it to
correct situations where SelectionDAG and GlobalISel disagree on
representation. For example, it would rewrite:
  (sextload:i32 $ptr)<<unindexedload>><<sextload>><<sextloadi16>
to:
  (sext:i32 (load:i16 $ptr)<<unindexedload>>)

I'd have preferred to replace the fragments and have the expansion happen
naturally as part of PatFrag expansion but the type inferencing system can't
cope with loads of types narrower than those mentioned in register classes.
This is because the SDTCisInt's on the sext constrain both the result and
operand to the 'legal' integer types (where legal is defined as 'a register
class can contain the type') which immediately rules the narrower types out.
Several targets (those with only one legal integer type) would then go on to
crash on the SDTCisOpSmallerThanOp<> when it removes all the possible types
for the result of the extend.

Also, improve isObviouslySafeToFold() slightly to automatically return true for
neighbouring instructions. There can't be any re-ordering problems if
re-ordering isn't happenning. We'll need to improve it further to handle
sign/zero-extending loads when the extend and load aren't immediate neighbours
though.

llvm-svn: 317971
2017-11-11 03:23:44 +00:00
Craig Topper 0eb4a43384 [X86] Correct the execution domain on ROUND/VROUND instructions.
llvm-svn: 317968
2017-11-11 02:26:05 +00:00
Craig Topper bf9b944ea7 [X86] Remove the default for one of the arguments to some tablegen multiclasses. NFC
No one ever uses this default and probably shouldn't since it sets the execution domain to generic.

llvm-svn: 317967
2017-11-11 02:26:02 +00:00
NAKAMURA Takumi 9f65a1ffc8 llvm/Support/TargetParser.h: Fix -fmodules build in rL317900.
llvm-svn: 317966
2017-11-11 02:05:47 +00:00
Tony Tye 3507750063 [AMDGPU] Correct targets that support XNACK
Differential Revision: https://reviews.llvm.org/D39887

llvm-svn: 317955
2017-11-11 00:50:32 +00:00
Craig Topper bdb8db4589 [SelectionDAG] Make getUniformBase in SelectionDAGBuilder fail if any of the middle GEP indices are non-constant.
This is a fix for a bug in r317947. We were supposed to check that all the indices are are constant 0, but instead we're only make sure that indices that are constant are 0. Non-constant indices are being ignored.

llvm-svn: 317950
2017-11-10 23:36:56 +00:00
Zachary Turner c631ba5f52 Update test_debuginfo.pl script to point to new tree location.
llvm-svn: 317949
2017-11-10 23:13:14 +00:00
Craig Topper ffd48e3c27 [SelectionDAG] Teach SelectionDAGBuilder's getUniformBase for gather/scatter handling to accept GEPs with more than 2 operands if the middle operands are all 0s
Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows.

This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted.

We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example.

Differential Revision: https://reviews.llvm.org/D39911

llvm-svn: 317947
2017-11-10 22:50:50 +00:00
Evgeniy Stepanov 989299c42b [asan] Use dynamic shadow on 32-bit Android.
Summary:
The following kernel change has moved ET_DYN base to 0x4000000 on arm32:
https://marc.info/?l=linux-kernel&m=149825162606848&w=2

Switch to dynamic shadow base to avoid such conflicts in the future.

Reserve shadow memory in an ifunc resolver, but don't use it in the instrumentation
until PR35221 is fixed. This will eventually let use save one load per function.

Reviewers: kcc

Subscribers: aemerson, srhines, kubamracek, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D39393

llvm-svn: 317943
2017-11-10 22:27:48 +00:00
Martin Storsjo ba664c1d04 [llvm-cvtres] Add support for ARM64
Also change some default cases into llvm_unreachable in
WindowsResourceCOFFWriter, to make it easier to find if they
are triggerd from within e.g. lld, which supported ARM64 earlier
than llvm-cvtres did.

Differential Revision: https://reviews.llvm.org/D39892

llvm-svn: 317942
2017-11-10 22:27:41 +00:00
Mitch Phillips 3b9ea32ef8 [cfi-verify] Made FileAnalysis operate on a GraphResult rather than build one and validate it.
Refactors the behaviour of building graphs out of FileAnalysis, allowing for analysis of the GraphResult by the callee without having to rebuild the graph. Means when we want to analyse the constructed graph (planned for later revisions), we don't do repeated work.

Also makes CFI verification in FileAnalysis now return an enum that allows us to differentiate why something failed, not just that it did/didn't fail.

Reviewers: vlad.tsyrklevich

Subscribers: kcc, pcc, llvm-commits

Differential Revision: https://reviews.llvm.org/D39764

llvm-svn: 317927
2017-11-10 21:00:22 +00:00
Amaury Sechet 3f0f650f49 [DAGcombine] Do not replace truncate node by itself when doing constant folding, this trigger needless extra rounds of combine for nothing. NFC
llvm-svn: 317926
2017-11-10 20:59:53 +00:00
Zachary Turner 0f2ce11df7 [debuginfo-tests] Make debuginfo-tests work in a standard configuration.
Previously, debuginfo-tests was expected to be checked out into
clang/test and then the tests would automatically run as part of
check-clang.  This is not a standard workflow for handling
external projects, and it brings with it some serious drawbacks
such as the inability to depend on things other than clang, which
we will need going forward.

The goal of this patch is to migrate towards a more standard
workflow.  To ease the transition for build bot maintainers,
this patch tries not to break the existing workflow, but instead
simply deprecate it to give maintainers a chance to update
the build infrastructure.

Differential Revision: https://reviews.llvm.org/D39605

llvm-svn: 317925
2017-11-10 20:57:57 +00:00
Tony Tye f59d0715b1 [AMDGPU] AMDGPUUsage.rst minor corrections
Differential Revision: https://reviews.llvm.org/D39887

llvm-svn: 317924
2017-11-10 20:51:43 +00:00
Davide Italiano acf6065183 [SimplifyCFG] Use auto * when the type is obvious. NFCI.
llvm-svn: 317923
2017-11-10 20:46:21 +00:00
Krzysztof Parzyszek e8926438a9 Recommit r317904: [Hexagon] Create HexagonISelDAGToDAG.h, NFC
The Windows builder did not reconstruct the HexagonGenDAGISel.inc file
after the TableGen binary has changed.

llvm-svn: 317921
2017-11-10 20:09:46 +00:00
Konstantin Zhuravlyov 27b0a033d8 AMDGPU/NFC: Split Processors.td into GCNProcessors.td and R600Processors.td
Differential Revision: https://reviews.llvm.org/D39880

llvm-svn: 317920
2017-11-10 20:01:58 +00:00
Daniel Neilson 6e4aa1e481 Expand IRBuilder interface for atomic memcpy to require pointer alignments. (NFC)
Summary:
 The specification of the @llvm.memcpy.element.unordered.atomic intrinsic requires
that the pointer arguments have alignments of at least the element size. The existing
IRBuilder interface to create a call to this intrinsic does not allow for providing
the alignment of these pointer args. Having an interface that makes it easy to
construct invalid intrinsic calls doesn't seem sensible, so this patch simply
adds the requirement that one provide the argument alignments when using IRBuilder
to create atomic memcpy calls.

llvm-svn: 317918
2017-11-10 19:38:12 +00:00