Commit Graph

43284 Commits

Author SHA1 Message Date
Konstantin Zhuravlyov e9a5a77ee3 AMDGPU: Implement memory model
llvm-svn: 308781
2017-07-21 21:19:23 +00:00
Guozhi Wei e0094ce22e [PPC] Add Defs = [CARRY] to MIR SRADI_32
MIR SRADI uses instruction template XSForm_1rc which declares Defs = [CARRY]. But MIR SRADI_32 uses instruction template XSForm_1, and it doesn't declare such implicit definition. With patch D33720 it causes wrong code generation for perl.

This patch adds the implicit definition.

Differential Revision: https://reviews.llvm.org/D35699

llvm-svn: 308780
2017-07-21 21:06:08 +00:00
Konstantin Zhuravlyov 070d88e335 AMDGPU: Introduce maybeAtomic instruction flag
Testing is in the follow up change

llvm-svn: 308779
2017-07-21 21:05:45 +00:00
Matt Arsenault f014d7cbde AMDGPU: Preserve undef flag in eliminateFrameIndex
Fixes verifier errors in some call tests.
Not sure why we haven't run into this before.

Test split into separate patch for once
call support is committed.

llvm-svn: 308774
2017-07-21 19:31:44 +00:00
Matt Arsenault 0ed39d329d AMDGPU: Partially fix improper reliance on memoperands
There are 2 more places doing this, but I'm not sure
what they are doing and don't make any sense to me

llvm-svn: 308770
2017-07-21 18:54:54 +00:00
Matt Arsenault 6ab9ea9614 AMDGPU: Don't track lgkmcnt for global_/scratch_ instructions
llvm-svn: 308766
2017-07-21 18:34:51 +00:00
Matt Arsenault 37a58e03c7 AMDGPU: Fix getMemOpBaseRegImmOfs for flat with offsets
llvm-svn: 308762
2017-07-21 18:06:36 +00:00
Krzysztof Parzyszek 3ad0d01e9e [Hexagon] Add inline-asm constraint 'a' for modifier register class
For example
  asm ("memw(%0++%1) = %2" : : "r"(addr),"a"(mod),"r"(val) : "memory")

llvm-svn: 308761
2017-07-21 17:51:27 +00:00
Simon Dardis 0310eb7a67 [mips] Support -membedded-data and fix a related bug
-membedded-data changes the location of constant data from the .sdata to
the .rodata section. Previously it was (incorrectly) always located in the
.rodata section.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D35686

llvm-svn: 308758
2017-07-21 17:19:00 +00:00
Matt Arsenault ca7b0a1777 AMDGPU: Add instruction definitions for some scratch_* instructions
Omit atomics for now since they probably aren't useful.

llvm-svn: 308747
2017-07-21 15:36:16 +00:00
Petar Jovanovic 9494258223 [mips] Enable IAS by default for Android MIPS64
Follow up to r306280 in Clang.
Enable IAS by default for Android MIPS64 (uses N64 ABI).

Differential Revision: https://reviews.llvm.org/D35482

llvm-svn: 308742
2017-07-21 14:25:42 +00:00
Dmitry Preobrazhensky abf2839478 [AMDGPU][MC][GFX9] Added support of VOP3 'op_sel' modifier
See bug 33591: https://bugs.llvm.org//show_bug.cgi?id=33591

Reviewers: vpykhtin, artem.tamazov, SamWot, arsenm

Differential Revision: https://reviews.llvm.org/D35424

llvm-svn: 308740
2017-07-21 13:54:11 +00:00
Jonas Paulsson 024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Simon Pilgrim 32c377a1cf [X86][SSE] Add pre-AVX2 support for (i32 bitcast(v32i1)) -> 2xMOVMSK
Currently we only support (i32 bitcast(v32i1)) using the AVX2 VPMOVMSKB ymm instruction.

This patch adds support for splitting pre-AVX2 targets into 2 x (V)PMOVMSKB xmm instructions and merging the integer results.

In future we could probably generalize this to handle more cases.

Differential Revision: https://reviews.llvm.org/D35303

llvm-svn: 308723
2017-07-21 09:58:50 +00:00
Craig Topper 31140ade70 [AVX-512] Fix a bug that prevented some non-temporal loads from using the movntdqa instruction.
The bitconverts here had an input type of 128-bits and an output type of 256 bits. The input type should also have been 256 bits.

llvm-svn: 308702
2017-07-21 00:40:42 +00:00
Evandro Menezes 55459609c8 [AArch64] Adjust the cost model for Exynos M1 and M2
Add the cost for the EXT instructions and explicitly add the cost for a few
instructions that were implied by the coarse model.

llvm-svn: 308697
2017-07-20 23:41:50 +00:00
Tim Northover 7b6d66c0c9 Recommit: GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
It revealed a bug in the Localizer pass which has now been fixed.

This includes the fix for SUBREG_TO_REG committed separately last time.

llvm-svn: 308688
2017-07-20 22:58:38 +00:00
Artem Belevich d7a73824e4 [NVPTX] Add lowering of i128 params.
The patch adds support of i128 params lowering. The changes are quite trivial to
support i128 as a "special case" of integer type. With this patch, we lower i128
params the same way as aggregates of size 16 bytes: .param .b8 _ [16].

Currently, NVPTX can't deal with the 128 bit integers:
* in some cases because of failed assertions like
  ValVTs.size() == OutVals.size() && "Bad return value decomposition"
* in other cases emitting PTX with .i128 or .u128 types (which are not valid [1])
  [1] http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#fundamental-types

Differential Revision: https://reviews.llvm.org/D34555
Patch by: Denys Zariaiev (denys.zariaiev@gmail.com)

llvm-svn: 308675
2017-07-20 21:16:03 +00:00
Matt Arsenault e5456ce5e5 AMDGPU: Rename _RTN atomic instructions
Move the _RTN to the end of the name. It reads
better if the other addressing mode components
line up with the non-RTN version. It is also
more convenient to define saddr variants of
FLAT atomics to have the RTN last, and it is
good to have a consistent naming scheme.

llvm-svn: 308674
2017-07-20 21:06:04 +00:00
Matt Arsenault db78273b6e Add an ID field to StackObjects
On AMDGPU SGPR spills are really spilled to another register.
The spiller creates the spills to new frame index objects,
which is used as a placeholder.

This will eventually be replaced with a reference to a position
in a VGPR to write to and the frame index deleted. It is
most likely not a real stack location that can be shared
with another stack object.

This is a problem when StackSlotColoring decides it should
combine a frame index used for a normal VGPR spill with
a real stack location and a frame index used for an SGPR.

Add an ID field so that StackSlotColoring has a way
of knowing the different frame index types are
incompatible.

llvm-svn: 308673
2017-07-20 21:03:45 +00:00
Artem Belevich fef0804e35 Changed EOL back to LF. NFC.
llvm-svn: 308671
2017-07-20 20:57:51 +00:00
Mandeep Singh Grang d41ac895bb [COFF, ARM64, CodeView] Add support to emit CodeView debug info for ARM64 COFF
Reviewers: compnerd, ruiu, rnk, zturner

Reviewed By: rnk

Subscribers: majnemer, aemerson, aprantl, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D35518

llvm-svn: 308665
2017-07-20 20:20:00 +00:00
James Y Knight bb76d48d59 [SPARC] Clean up the support for disabling fsmuld and fmuls instructions.
Summary:
Also enable no-fsmuld for sparcv7 (which doesn't have the
instruction).

The previous code which used a post-processing pass to do this was
unnecessary; disabling the instruction is entirely sufficient.

Reviewers: jacob_hansen, ekedaigle

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D35576

llvm-svn: 308661
2017-07-20 20:09:11 +00:00
Krzysztof Parzyszek f3a778d757 Implement LaneBitmask::getNumLanes and LaneBitmask::getHighestLane
This should eliminate most uses of countPopulation and Log2_32 on
the lane mask values.

llvm-svn: 308658
2017-07-20 19:43:19 +00:00
Craig Topper 27c12e088e [X86] Allow masks with more than 6 bits set on the x << (y & mask) optimization for the 64-bit memory shifts.
llvm-svn: 308657
2017-07-20 19:29:58 +00:00
Krzysztof Parzyszek e9f0c1e031 Use LaneBitmask::getLane in a few more places
llvm-svn: 308655
2017-07-20 19:15:56 +00:00
Matt Arsenault c37fe66ec5 AMDGPU: Add encoding for carryless add/sub instructions
llvm-svn: 308639
2017-07-20 17:42:47 +00:00
Matt Arsenault f65c5ac9c9 AMDGPU: Add encodings for global atomics
llvm-svn: 308638
2017-07-20 17:31:56 +00:00
Stefan Maksimovic be0bc71e02 Reland r308585
Builder clang-x86_64-linux-abi-test apparently failed due
to a spurious error unrelated to the changes r308585
introduced.

llvm-svn: 308612
2017-07-20 13:08:18 +00:00
Javed Absar e9599e39fe [ARM] Simplify ExpandPseudoInst. NFC.
Remove headers not required and convert to range-loop

Reviewed by: @mcrosier
Differential Revision: https://reviews.llvm.org/D35626

llvm-svn: 308607
2017-07-20 12:35:37 +00:00
Simon Atanasyan fb953926b1 [mips] Support `long_call/far/near` attributes passed by front-end
This patch adds handling of the `long_call`, `far`, and `near`
attributes passed by front-end. The patch depends on D35479.

Differential revision: https://reviews.llvm.org/D35480.

llvm-svn: 308606
2017-07-20 12:19:26 +00:00
Diana Picus 7534b28291 Revert "GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64."
This reverts commit 36c6a2ea9669bc3bb695928529a85d12d1d3e3f9 because it
broke the test-suite on the GlobalISel bot.

llvm-svn: 308603
2017-07-20 11:36:03 +00:00
Stefan Maksimovic 3793a82b28 Revert r308585
Builder clang-x86_64-linux-abi-test seems to fail after this change

llvm-svn: 308597
2017-07-20 09:57:14 +00:00
Stefan Maksimovic 8539f77bc3 [mips] Fix fp select machine verifier errors
Introduced FSELECT node necesary when lowering ISD::SELECT
which has i32, f64, f64 as its operands.
SEL_D instruction required that its output and first operand
of a SELECT node, which it used, have matching types.
MTC1_D64 node introduced to aid FSELECT lowering.

This fixes machine verifier errors on following tests:
CodeGen/Mips/llvm-ir/select-dbl.ll
CodeGen/Mips/llvm-ir/select-flt.ll
CodeGen/Mips/select.ll

Differential Revision: https://reviews.llvm.org/D35408

llvm-svn: 308595
2017-07-20 09:21:10 +00:00
Craig Topper 33225ef314 [X86] Use SARX/SHLX/SHLX instructions for (shift x (and y, (BitWidth-1)))
Fixes PR33841.

llvm-svn: 308591
2017-07-20 06:19:55 +00:00
Matt Arsenault 04004716ff AMDGPU: Correct encoding for global instructions
The soffset field needs to be be set to 0x7f to disable it,
not 0. 0 is interpreted as an SGPR offset.

This should be enough to get basic usage of the global instructions
working. Technically it is possible to use an SGPR_32 offset,
but I'm not sure if it's correct with 64-bit pointers, but
that is not handled now. This should also be cleaned up
to be more similar to how different MUBUF modes are handled,
and to have InstrMappings between the different types.

llvm-svn: 308583
2017-07-20 05:17:54 +00:00
Tim Northover 967d4aa7a0 GlobalISel: partially revert r308540.
An unfinished and untested implementation of ISel for G_UNMERGE_VALUES crept in
by mistake.

llvm-svn: 308542
2017-07-19 22:11:08 +00:00
Tim Northover 0e0b3c97dd GlobalISel: fix SUBREG_TO_REG implementation.
The first argument needs to be an immediate rather than a register. Should fix
some crashes in the verifier bot.

llvm-svn: 308540
2017-07-19 22:08:08 +00:00
Martin Storsjo b2e9fcfca4 [AArch64] Force relocations for all ADRP instructions
This generalizes an existing fix from ELF to MachO and COFF.

Test that an ADRP to a local symbol whose offset is known at assembly
time still produces relocations, both for MachO and COFF. Test that
an ADRP without a @page modifier on MachO fails (previously it
didn't).

Differential Revision: https://reviews.llvm.org/D35544

llvm-svn: 308518
2017-07-19 20:14:32 +00:00
Martin Storsjo 2ff5f5d681 [AArch64, COFF] Interpret .align as power of two for COFF as well
Differential Revision: https://reviews.llvm.org/D35545

llvm-svn: 308517
2017-07-19 20:14:24 +00:00
Krzysztof Parzyszek ac01994db9 [Hexagon] Fix a bug in r308502: post-inc offset is always 0
llvm-svn: 308510
2017-07-19 19:17:32 +00:00
Davide Italiano 5fc5d0a406 [X86] Don't try to scale down if that exceeds the bitwidth.
Fixes the crash reported in PR33844.

llvm-svn: 308503
2017-07-19 18:09:46 +00:00
Krzysztof Parzyszek 3fce9d9c49 [Hexagon] Handle subregisters in areMemAccessesTriviallyDisjoint
llvm-svn: 308502
2017-07-19 18:03:46 +00:00
Tim Northover d59fbec8e2 GlobalISel: select G_EXTRACT and G_INSERT instructions on AArch64.
llvm-svn: 308493
2017-07-19 16:47:07 +00:00
Krzysztof Parzyszek b449dc189a [Hexagon] Handle subregisters and non-immediates in getBaseAndOffset
llvm-svn: 308485
2017-07-19 15:39:28 +00:00
Javed Absar 2cb0c95031 [ARM] Unify handling of M-Class system registers
This patch cleans up and fixes issues in the M-Class system register handling:

1. It defines the system registers and the encoding (SYSm values) in one place:
   a new ARMSystemRegister.td using SearchableTable, thereby removing the
   hand-coded values which existed in multiple places.

2. Some system registers e.g. BASEPRI_MAX_NS which do not exist were being allowed!
   Ref: ARMv6/7/8M architecture reference manual.

Reviewed by: @t.p.northover, @olist01, @john.brawn
Differential Revision: https://reviews.llvm.org/D35209

llvm-svn: 308456
2017-07-19 12:57:16 +00:00
Simon Pilgrim e5c7925c5e [X86][XOP] Use default AVX2 lowering for v4i64 ashr by splat constants
XOP shifts only support 128-bit vectors, so we were ending up with less optimal codegen requiring constants

llvm-svn: 308430
2017-07-19 10:29:31 +00:00
Jonas Paulsson 4690193dec [SystemZ] Minor fixing in SystemZScheduleZ14.td
Some minor corrections for recently added instructions.

Review: Ulrich Weigand
llvm-svn: 308429
2017-07-19 10:19:21 +00:00
James Y Knight 0e4ce61d2a [SPARC] Add missing variable initialization after r308343.
llvm-svn: 308415
2017-07-19 04:08:42 +00:00
Craig Topper 106b5b6856 AMD znver1 Initial Scheduler model
Summary:
This patch adds the following
1. Adds a skeleton scheduler model for AMD Znver1.
2. Introduces the znver1 execution units and pipes.
3. Caters the instructions based on the generic scheduler classes.
4. Further additions to the scheduler model with instruction itineraries will be carried out incrementally based on
        a. Instructions types
        b. Registers used
5. Since itineraries are not added based on instructions, throughput information are bound to change when incremental changes are added.
6. Scheduler testcases are modified accordingly to suit the new model.

Patch by Ganesh Gopalasubramanian. With minor formatting tweaks from me.

Reviewers: craig.topper, RKSimon

Subscribers: javed.absar, shivaram, ddibyend, vprasad

Differential Revision: https://reviews.llvm.org/D35293

llvm-svn: 308411
2017-07-19 02:45:14 +00:00