41083 Commits

Author SHA1 Message Date
Eugene Zelenko 926883e1c2 [Mips] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293729
2017-02-01 01:22:51 +00:00
Matt Arsenault da7a656542 AMDGPU: Cleanup fmin/fmax legacy function
Use a more specific subtarget check and combine hasOneUse checks

llvm-svn: 293726
2017-02-01 00:42:40 +00:00
Matt Arsenault 1575cb893c AMDGPU: Fix warning
llvm-svn: 293717
2017-01-31 23:48:37 +00:00
Justin Lebar 06fcea4cd9 [NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x).
x*rsqrt(x) returns NaN for x == 0, whereas 1/rsqrt(x) returns 0, as
desired.

Verified that the particular nvptx approximate instructions here do in
fact return 0 for x = 0.

llvm-svn: 293713
2017-01-31 23:08:57 +00:00
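
The zero-input behavior described in the [NVPTX] sqrt commit above follows from IEEE-754 arithmetic, and can be checked with a small sketch. This is not the NVPTX lowering itself: `rsqrt` here is a hypothetical stand-in that models an exact reciprocal square root returning +inf at 0, and the two helper names are invented for illustration.

```python
import math


def rsqrt(x: float) -> float:
    """Hypothetical model of a reciprocal square root: rsqrt(0) is +inf."""
    if x == 0.0:
        return math.inf
    return 1.0 / math.sqrt(x)


def sqrt_via_mul(x: float) -> float:
    # Old form: x * rsqrt(x). At x == 0 this is 0 * inf, which is NaN.
    return x * rsqrt(x)


def sqrt_via_recip(x: float) -> float:
    # New form: 1 / rsqrt(x). At x == 0 this is 1 / inf, which is 0.
    return 1.0 / rsqrt(x)


if __name__ == "__main__":
    for value in (0.0, 4.0):
        print(value, sqrt_via_mul(value), sqrt_via_recip(value))
```

Both forms agree for positive inputs in this model; they differ only at zero, which is the case the commit addresses.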
Michael Kuperstein e18aad39ab Shut up GCC warning about operator precedence. NFC.
Technically, this actually changes the expression and the original
assert was "wrong", but since the conjunction is with true, it doesn't
matter in this case.

llvm-svn: 293709
2017-01-31 22:48:45 +00:00
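
The precedence point in the commit above can be illustrated with a short sketch. The commit does not show the assert, so the shape `a or b and MSG` is an assumption, as are all names below; Python is used because its `and` binds tighter than `or`, just as `&&` binds tighter than `||` in C++.

```python
from itertools import product

MSG = "assertion message"  # a non-empty string is truthy, like a C string literal


def unparenthesized(a: bool, b: bool, tail=MSG) -> bool:
    # No parentheses: the parser groups this as a or (b and tail).
    return bool(a or b and tail)


def grouped_by_precedence(a: bool, b: bool, tail=MSG) -> bool:
    # The grouping the parser picks when no parentheses are written.
    return bool(a or (b and tail))


def grouped_as_intended(a: bool, b: bool, tail=MSG) -> bool:
    # The grouping obtained by parenthesizing the disjunction first.
    return bool((a or b) and tail)


if __name__ == "__main__":
    for a, b in product((False, True), repeat=2):
        print(a, b, grouped_by_precedence(a, b), grouped_as_intended(a, b))
```

With a truthy conjunct the two groupings agree on every input, which is why the change is behavior-neutral here. With a falsy conjunct they would differ whenever `a` is true.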
Peter Collingbourne d763c4cc85 MC: Introduce the ABS8 symbol modifier.
@ABS8 can be applied to symbols which appear as immediate operands to
instructions that have an 8-bit immediate form for that operand. It causes
the assembler to use the 8-bit form and an 8-bit relocation (e.g. R_386_8
or R_X86_64_8) for the symbol.

Differential Revision: https://reviews.llvm.org/D28688

llvm-svn: 293667
2017-01-31 18:28:44 +00:00
Matt Arsenault d5d78510c7 AMDGPU: Use source mods with fcanonicalize
llvm-svn: 293654
2017-01-31 17:28:40 +00:00
Nirav Dave a7c041d147 [X86] Implement -mfentry
Summary: Insert calls to __fentry__ at function entry.

Reviewers: hfinkel, craig.topper

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D28000

llvm-svn: 293648
2017-01-31 17:00:27 +00:00
Tom Stellard 124f5cc8c2 AMDGPU/SI: Fix inst-select-load-smrd.mir on some builds
Summary:
For some reason instructions are being inserted in the wrong order with some
builds.  I'm not sure why this is happening.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D29325

llvm-svn: 293639
2017-01-31 15:24:11 +00:00
Simon Pilgrim 1b39d5db7b [X86][SSE] Add support for combining PINSRB into a target shuffle.
llvm-svn: 293637
2017-01-31 14:59:44 +00:00
Sam Parker 9bf658d5fe [ARM] Avoid using ARM instructions in Thumb mode
The Requires class overrides the target requirements of an instruction,
rather than adding to them, so all ARM instructions need to include the
IsARM predicate when they have overwritten requirements.

This caused the swp and swpb instructions to be allowed in thumb mode
assembly, and the ARM encoding of CDP to be selected in codegen (which
is different for conditional instructions).

Differential Revision: https://reviews.llvm.org/D29283

llvm-svn: 293634
2017-01-31 14:35:01 +00:00
Benjamin Kramer 94a833962c [X86] Silence unused variable warning in Release builds.
llvm-svn: 293631
2017-01-31 14:13:53 +00:00
Simon Pilgrim 4eab18f6b8 [X86][SSE] Detect unary PBLEND shuffles.
These can appear during shuffle combining.

llvm-svn: 293628
2017-01-31 13:58:01 +00:00
Simon Pilgrim c29eab52e8 [X86][SSE] Add support for combining PINSRW into a target shuffle.
Also add the ability to recognise PINSR(Vex, 0, Idx).

Target shuffle combines won't replace multiple insertions with a bit mask until a depth of 3 or more, so we avoid codesize bloat.

The unnecessary vpblendw in clearupper8xi16a will be fixed in an upcoming patch.

llvm-svn: 293627
2017-01-31 13:51:10 +00:00
Nemanja Ivanovic 2f2a6ab991 [PowerPC][Altivec] Add vmr extended mnemonic
Just adds the vmr (Vector Move Register) mnemonic for the VOR instruction in
the PPC back end.

Committing on behalf of brunoalr (Bruno Rosa).

Differential Revision: https://reviews.llvm.org/D29133

llvm-svn: 293626
2017-01-31 13:43:11 +00:00
Simon Dardis 12850eeac5 [mips] Addition of the immediate cases for the instructions [d]div, [d]divu
Related to http://reviews.llvm.org/D15772

Depends on http://reviews.llvm.org/D16888

Adds support for immediate operand for [D]DIV[U] instructions.

Patch By: Srdjan Obucina

Reviewers: zoran.jovanovic, vkalintiris, dsanders, obucina

Differential Revision: https://reviews.llvm.org/D16889

llvm-svn: 293614
2017-01-31 10:49:24 +00:00
Craig Topper 2cfa2071bd [AVX-512] Don't bother looking into the AVX512DQ execution domain fixing tables if AVX512DQ isn't supported since we can't do any conversion anyway.
llvm-svn: 293608
2017-01-31 06:49:55 +00:00
Craig Topper 797e32dd98 [X86] Add AVX and SSE2 version of MOVSDmr to execution domain fixing table. AVX-512 already did this for the EVEX version.
llvm-svn: 293607
2017-01-31 06:49:53 +00:00
Craig Topper 779e4c5bb4 [AVX-512] Fix copy and paste bug in execution domain fixing tables so that we can convert 256-bit movnt instructions.
llvm-svn: 293606
2017-01-31 06:49:50 +00:00
Justin Lebar 1c9692a46f [NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate.
Summary:

This lets us lower to sqrt.approx and rsqrt.approx under more
circumstances.

* Now we emit sqrt.approx and rsqrt.approx for calls to @llvm.sqrt.f32,
  when fast-math is enabled.  Previously, we only would emit it for
  calls to @llvm.nvvm.sqrt.f.  (With this patch we no longer emit
  sqrt.approx for calls to @llvm.nvvm.sqrt.f; we rely on instcombine to
  simplify llvm.nvvm.sqrt.f into llvm.sqrt.f32.)

* Now we emit the ftz version of rsqrt.approx when ftz is enabled.
  Previously, we only emitted rsqrt.approx when ftz was disabled.

Reviewers: hfinkel

Subscribers: llvm-commits, tra, jholewinski

Differential Revision: https://reviews.llvm.org/D28508

llvm-svn: 293605
2017-01-31 05:58:22 +00:00
Craig Topper 06e038c6de [X86] Update the broadcast fallback patterns to use shuffle instructions from the appropriate execution domain.
llvm-svn: 293603
2017-01-31 05:18:29 +00:00
Craig Topper e9e84c8284 [AVX-512] Fix the ExeDomain for VMOVDDUP, VMOVSLDUP, and VMOVSHDUP.
llvm-svn: 293601
2017-01-31 05:18:24 +00:00
Matt Arsenault f84e5d9a27 AMDGPU: Generalize matching of v_med3_f32
I think this is safe as long as the inputs are known never to
be NaNs.

Also add an intrinsic for fmed3 to be able to handle all safe
math cases.

llvm-svn: 293598
2017-01-31 03:07:46 +00:00
Craig Topper d064cc93b2 [X86] Remove patterns for X86VPermilpi with integer types. I don't think we've formed these since the shuffle lowering rewrite.
llvm-svn: 293592
2017-01-31 02:09:53 +00:00
Craig Topper 85935f69fb [X86] Remove duplicate patterns for X86VPermilpv that already exist in the instructions themselves.
llvm-svn: 293591
2017-01-31 02:09:51 +00:00
Craig Topper ced68315ce [X86] Remove patterns for selecting PSHUFD with FP types. We don't seem to do this anymore and the AVX case definitely should be using VPERMILPS anyway.
llvm-svn: 293590
2017-01-31 02:09:49 +00:00
Craig Topper b76494e017 [X86] Remove 'else' after 'return'. NFC
llvm-svn: 293589
2017-01-31 02:09:46 +00:00
Craig Topper f9d901f0ea [X86] Use integer broadcast instructions for integer broadcast patterns.
I'm not sure why we were using an FP instruction before and had to have a comment calling attention to it, but not justifying it.

llvm-svn: 293588
2017-01-31 02:09:43 +00:00
Matt Arsenault b6491cc854 AMDGPU: Implement hook for InferAddressSpaces
For now just port some of the existing NVPTX tests
and some from an old HSAIL optimization pass which
did approximately the same thing.

Don't enable the pass yet until more testing is done.

llvm-svn: 293580
2017-01-31 01:20:54 +00:00
Matt Arsenault 850657a439 NVPTX: Move InferAddressSpaces to generic code
llvm-svn: 293579
2017-01-31 01:10:58 +00:00
Eugene Zelenko 342257ea92 [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293578
2017-01-31 00:56:17 +00:00
Matt Arsenault 9f432ec24c NVPTX: Trivial cleanups of NVPTXInferAddressSpaces
- Move DEBUG_TYPE below includes
- Change unknown address space constant to be consistent with other
  passes
- Grammar fixes in debug output

llvm-svn: 293567
2017-01-30 23:27:11 +00:00
Eugene Zelenko dde94e4c4f [Mips] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 293565
2017-01-30 23:21:32 +00:00
Matt Arsenault 42b6478344 NVPTX: Refactor NVPTXInferAddressSpaces to check TTI
Add a new TTI hook for getting the generic address space value.

llvm-svn: 293563
2017-01-30 23:02:12 +00:00
Simon Pilgrim 3905e03a47 [X86][SSE] Fix unsigned <= 0 warning in assert. NFCI.
Thanks to @mkuper

llvm-svn: 293561
2017-01-30 22:58:44 +00:00
Simon Pilgrim a80a47afef [X86][SSE] Generalize the number of decoded shuffle inputs. NFCI.
combineX86ShufflesRecursively can still only handle a maximum of 2 shuffle inputs but everything before it now supports any number of shuffle inputs.

This will be necessary for combining OR(SHUFFLE, SHUFFLE) patterns.

llvm-svn: 293560
2017-01-30 22:48:49 +00:00
Eli Friedman 2345733246 Fix line endings.
llvm-svn: 293554
2017-01-30 22:04:23 +00:00
Tom Stellard 887a2562b7 AMDGPU: Fix release build broken by r293551
llvm-svn: 293553
2017-01-30 22:02:58 +00:00
Tom Stellard ca16621b2a Re-commit AMDGPU/GlobalISel: Add support for simple shaders
Fix build when global-isel is disabled and fix a warning.

Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.

Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm

Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D26730

llvm-svn: 293551
2017-01-30 21:56:46 +00:00
Stanislav Mekhanoshin a3b72798af [AMDGPU] Internalize non-kernel symbols
Since we have no call support and late linking we can produce code
only for used symbols. This saves compilation time, size of the final
executable, and size of any intermediate dumps.

Run Internalize pass early in the opt pipeline followed by global
DCE pass. To enable it RT can pass -amdgpu-internalize-symbols option.

Differential Revision: https://reviews.llvm.org/D29214

llvm-svn: 293549
2017-01-30 21:05:18 +00:00
Matt Arsenault af635240d5 AMDGPU: Undo sub x, c -> add x, -c canonicalization
This is worse if the original constant is an inline immediate.

This should also be done for 64-bit adds, but requires fixing
operand folding bugs first.

llvm-svn: 293540
2017-01-30 19:30:24 +00:00
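
The reasoning in the commit above can be sketched as a simple check. This is an illustration, not the backend's code: the integer inline-immediate range of -16 through 64 is an assumption about the target, and both function names are hypothetical.

```python
# Assumed range of integers encodable as inline immediates (an assumption
# for this sketch, not taken from the commit message).
INLINE_INT_MIN = -16
INLINE_INT_MAX = 64


def is_inline_imm(value: int) -> bool:
    """True if the value fits the assumed inline-immediate range."""
    return INLINE_INT_MIN <= value <= INLINE_INT_MAX


def should_undo_to_sub(add_constant: int) -> bool:
    """Given `add x, add_constant`, decide whether `sub x, -add_constant`
    is the better form: the negated constant is inline, the current one is not."""
    return not is_inline_imm(add_constant) and is_inline_imm(-add_constant)


if __name__ == "__main__":
    for c in (-64, -17, -16, 64, -65):
        print(c, should_undo_to_sub(c))
```

Under this assumption, `add x, -64` needs a literal constant while `sub x, 64` does not, so undoing the canonicalization pays off only for a narrow band of constants.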
Krzysztof Parzyszek 3695d06a10 [RDF] Add support for regmasks
llvm-svn: 293538
2017-01-30 19:16:30 +00:00
Matt Arsenault 0c3293844b AMDGPU: Run AMDGPUCodeGenPrepare after inlining
With leaf functions, this makes nonsensical decisions
based on the uniformity of the arguments.

llvm-svn: 293525
2017-01-30 18:40:29 +00:00
Matt Arsenault ee3f0acf20 AMDGPU: Make i32 uaddo/usubo legal
llvm-svn: 293514
2017-01-30 18:11:38 +00:00
Krzysztof Parzyszek 49ffff12e5 [RDF] Extract the physical register information into a separate class
llvm-svn: 293510
2017-01-30 17:46:56 +00:00
Tom Stellard 7a19d56f73 Revert "AMDGPU/GlobalISel: Add support for simple shaders"
This reverts commit r293503.

Revert while I investigate some of the buildbot failures.

llvm-svn: 293509
2017-01-30 17:42:41 +00:00
Matt Arsenault 41c1499504 AMDGPU: Fix atomic_inc/atomic_dec + ds_swizzle not being divergent
llvm-svn: 293504
2017-01-30 17:09:47 +00:00
Tom Stellard e48f60aec8 AMDGPU/GlobalISel: Add support for simple shaders
Summary: We can select constant/global G_LOAD, global G_STORE, and G_GEP.

Reviewers: qcolombet, MatzeB, t.p.northover, ab, arsenm

Subscribers: mehdi_amini, vkalintiris, kzhuravl, wdng, nhaehnle, mgorny, yaxunl, tony-tye, modocache, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D26730

llvm-svn: 293503
2017-01-30 17:09:15 +00:00
Simon Pilgrim 098998aef0 [X86][SSE] Add support for combining PINSRW+ASSERTZEXT+PEXTRW patterns with target shuffles
llvm-svn: 293500
2017-01-30 16:58:34 +00:00
Krzysztof Parzyszek b561cf953a [RDF] Add phis for entry block live-ins (in addition to function live-ins)
llvm-svn: 293491
2017-01-30 16:20:30 +00:00