Commit Graph

52919 Commits

Author SHA1 Message Date
Djordje Todorovic 0739ccd3b5 Revert "[DwarfDebug] Dump call site debug info"
A build failure was found on the SystemZ platform.

This reverts commit 9e7e73578e54cd22b3c7af4b54274d743b6607cc.

llvm-svn: 365886
2019-07-12 09:45:12 +00:00
Sam Elliott fafec5155e [RISCV] Allow parsing dot '.' in assembly
Summary:
Useful for jumps, such as `j .`.

I am not sure who should review this. Do not hesitate to change the reviewers if needed.

Reviewers: asb, jrtc27, lenary

Reviewed By: lenary

Subscribers: MaskRay, lenary, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63669

Patch by John LLVM (JohnLLVM)

llvm-svn: 365881
2019-07-12 08:36:07 +00:00
Bryant Wong 7ba838d29c Test commit. NFC.
Formatting fix.

llvm-svn: 365878
2019-07-12 08:25:59 +00:00
Simon Atanasyan ee5af50eb0 [mips] Fix JmpLink to texternalsym and tglobaladdr on mcroMIPS R6
There is not match for the `MipsJmpLink texternalsym` and `MipsJmpLink
tglobaladdr` patterns for microMIPS R6. As a result LLVM incorrectly
selects the `JALRC16` compact 2-byte instruction which takes a target
instruction address from a register only and assign `R_MIPS_32` relocation
for this instruction. This relocation completely overwrites `JALRC16`
and nearby instructions.

This patch adds missed matching patterns, selects `BALC` instruction and
assign a correct `R_MICROMIPS_PC26_S1` relocation.

Differential Revision: https://reviews.llvm.org/D64552

llvm-svn: 365870
2019-07-12 04:58:45 +00:00
Michael Liao 16d3c1ac03 [AMDGPU] Skip calculating callee saved registers for entry function.
Reviewers: arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64596

llvm-svn: 365846
2019-07-11 23:53:30 +00:00
Matt Arsenault e5fb434d92 AMDGPU: s_waitcnt field should be treated as unsigned
Also make it an ImmLeaf, so it should work with global isel as well,
which was part of the point of moving it in the first place.

llvm-svn: 365842
2019-07-11 23:42:57 +00:00
Stanislav Mekhanoshin 28550c8680 [AMDGPU] Fixed asan error with agpr spilling
Instruction was used after it was erased.

llvm-svn: 365837
2019-07-11 22:30:11 +00:00
Stanislav Mekhanoshin 937ff6e701 [AMDGPU] gfx908 agpr spilling
Differential Revision: https://reviews.llvm.org/D64594

llvm-svn: 365833
2019-07-11 21:54:13 +00:00
Stanislav Mekhanoshin 7d2019bb96 [AMDGPU] gfx908 hazard recognizer
Differential Revision: https://reviews.llvm.org/D64593

llvm-svn: 365829
2019-07-11 21:30:34 +00:00
Stanislav Mekhanoshin b83e283e65 [AMDGPU] gfx908 scheduling
Differential Revision: https://reviews.llvm.org/D64590

llvm-svn: 365826
2019-07-11 21:25:00 +00:00
Stanislav Mekhanoshin e67cc380a8 [AMDGPU] gfx908 mfma support
Differential Revision: https://reviews.llvm.org/D64584

llvm-svn: 365824
2019-07-11 21:19:33 +00:00
Wouter van Oortmerssen a617967d68 [WebAssembly] Assembler: support negative float constants.
Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64367

llvm-svn: 365802
2019-07-11 18:18:07 +00:00
Benjamin Kramer fa1a4e4de5 [NVPTX] Use atomicrmw fadd instead of intrinsics
AutoUpgrade the old intrinsics to atomicrmw fadd.

llvm-svn: 365796
2019-07-11 17:11:25 +00:00
Sanjay Patel 5cc7c9ab93 [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483)
Follow up to D58597, where it was noted that the commuted ISD::SUB variant
was having problems with lack of combines.

See also D63958 where we untangled setcc/sub pairs.

Differential Revision: https://reviews.llvm.org/D58875

llvm-svn: 365791
2019-07-11 15:56:33 +00:00
Matt Arsenault b725d27350 AMDGPU/GlobalISel: Move kernel argument handling to separate function
llvm-svn: 365782
2019-07-11 14:18:25 +00:00
Tim Northover 67828edbbd OpaquePtr: switch to GlobalValue::getValueType in a few places. NFC.
llvm-svn: 365770
2019-07-11 13:13:02 +00:00
Fangrui Song f9ca13cb5f [X86] -fno-plt: use GOT __tls_get_addr only if GOTPCRELX is enabled
Summary:
As of binutils 2.32, ld has a bogus TLS relaxation error when the GD/LD
code sequence using R_X86_64_GOTPCREL (instead of R_X86_64_GOTPCRELX) is
attempted to be relaxed to IE/LE (binutils PR24784). gold and lld are good.

In gcc/config/i386/i386.md, there is a configure-time check of as/ld
support and the GOT relaxation will not be used if as/ld doesn't support
it:

    if (flag_plt || !HAVE_AS_IX86_TLS_GET_ADDR_GOT)
      return "call\t%P2";
    return "call\t{*%p2@GOT(%1)|[DWORD PTR %p2@GOT[%1]]}";

In clang, -DENABLE_X86_RELAX_RELOCATIONS=OFF is the default. The ld.bfd
bogus error can be reproduced with:

    thread_local int a;
    int main() { return a; }

clang -fno-plt -fpic a.cc -fuse-ld=bfd

GOTPCRELX gained relative good support in 2016, which is considered
relatively new.  It is even difficult to conditionally default to
-DENABLE_X86_RELAX_RELOCATIONS=ON due to cross compilation reasons. So
work around the ld.bfd bug by only using GOT when GOTPCRELX is enabled.

Reviewers: dalias, hjl.tools, nikic, rnk

Reviewed By: nikic

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64304

llvm-svn: 365752
2019-07-11 10:10:09 +00:00
Sam Parker 08b4a8da07 [ARM][LowOverheadLoops] Correct offset checking
This patch addresses a couple of problems:
1) The maximum supported offset of LE is -4094.
2) The offset of WLS also needs to be checked, this uses a
   maximum positive offset of 4094.
    
The use of BasicBlockUtils has been changed because the block offsets
weren't being initialised, but the isBBInRange checks both positive
and negative offsets.
    
ARMISelLowering has been tweaked because the test case presented
another pattern that we weren't supporting.

llvm-svn: 365749
2019-07-11 09:56:15 +00:00
Simon Tatham 7916198a41 [ARM] Remove nonexistent unsigned forms of MVE VQDMLAH.
The VQDMLAH.U8, VQDMLAH.U16 and VQDMLAH.U32 instructions don't
actually exist: the Armv8.1-M architecture spec only lists signed
forms of that instruction. The unsigned ones were added in error: they
existed in an early draft of the spec, but they were removed before
the public version, and we missed that particular spec change.

Also affects the variant forms VQDMLASH, VQRDMLAH and VQRDMLASH.

Reviewers: miyuki

Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64502

llvm-svn: 365747
2019-07-11 09:52:15 +00:00
Petar Avramovic 962524070a [MIPS GlobalISel] Skip copies in addUseDef and addDefUses
Skip copies between virtual registers during search for UseDefs
and DefUses.
Since each operand has one def search for UseDefs is straightforward.
But since operand can have many uses, we have to check all uses of
each copy we traverse during search for DefUses.

Differential Revision: https://reviews.llvm.org/D64486

llvm-svn: 365744
2019-07-11 09:28:34 +00:00
Petar Avramovic e3bb0a72b6 [MIPS GlobalISel] RegBankSelect for chains of ambiguous instructions
When one of the uses/defs of ambiguous instruction is also ambiguous
visit it recursively and search its uses/defs for instruction with
only one mapping available.
When all instruction in a chain are ambiguous arbitrary mapping can
be selected. For s64 operands in ambiguous chain fprb is selected since
it results in less instructions then having to narrow scalar s64 to s32.
For s32 both gprb and fprb result in same number of instructions and
gprb is selected like a general purpose option.

At the moment we always avoid cross register bank copies.
TODO: Implement a model for costs calculations of different mappings
on same instruction and cross bank copies. Allow cross bank copies
when appropriate according to cost model.

Differential Revision: https://reviews.llvm.org/D64485

llvm-svn: 365743
2019-07-11 09:22:49 +00:00
Jay Foad c1b7db9eda Remove some redundant code from r290372 and improve a comment.
llvm-svn: 365741
2019-07-11 08:49:52 +00:00
Sam Parker 85ad78b1cf [ARM][ParallelDSP] Change the search for smlads
Two functional changes have been made here:
- Now search up from any add instruction to find the chains of
  operations that we may turn into a smlad. This allows the
  generation of a smlad which doesn't accumulate into a phi.
- The search function has been corrected to stop it falsely searching
  up through an invalid path.
    
The bulk of the changes have been making the Reduction struct a class
and making it more C++y with getters and setters.

Differential Revision: https://reviews.llvm.org/D61780

llvm-svn: 365740
2019-07-11 07:47:50 +00:00
Heejin Ahn 54c136bbdf [WebAssembly] Print error message for llvm.clear_cache intrinsic
Summary:
Wasm does not currently support `llvm.clear_cache` intrinsic, and this
prints a proper error message instead of segfault.

Reviewers: dschuff, sbc100, sunfish

Subscribers: jgravelle-google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64322

llvm-svn: 365731
2019-07-11 05:55:47 +00:00
Craig Topper 88729e3dec [X86] Don't convert 8 or 16 bit ADDs to LEAs on Atom in FixupLEAPass.
We use the functions that convert to three address to do the
conversion, but changing an 8 or 16 bit will cause it to create
a virtual register. This can't be done after register allocation
where this pass runs.

I've switched the pass completely to a white list of instructions
that can be converted to LEA instead of a blacklist that was
incorrect. This will avoid surprises if we enhance the three
address conversion function to include additional instructions
in the future.

Fixes PR42565.

llvm-svn: 365720
2019-07-11 01:01:39 +00:00
Stanislav Mekhanoshin e93279fd1b [AMDGPU] gfx908 atomic fadd and atomic pk_fadd
Differential Revision: https://reviews.llvm.org/D64435

llvm-svn: 365717
2019-07-11 00:10:17 +00:00
Stanislav Mekhanoshin c0ae1be066 [AMDGPU] gfx908 dot instruction support
Differential Revision: https://reviews.llvm.org/D64431

llvm-svn: 365715
2019-07-11 00:00:27 +00:00
Craig Topper 1c327c7e0a [X86] Add patterns with and_flag_nocf for BLSI and TBM instructions.
Fixes similar issues to r352306.

llvm-svn: 365705
2019-07-10 22:44:32 +00:00
Craig Topper d916f23b83 [X86] Add BLSR and BLSMSK to isUseDefConvertible.
Unfortunately subo formation in CGP prevents obvious ways of
testing this.

But we already have BLSI in here and the flag behavior is
well understood.

Might become more useful if we improve PR42571.

llvm-svn: 365702
2019-07-10 22:14:39 +00:00
David Tenty a2681296e0 [NFC]Fix IR/MC depency issue for function descriptor SDAG implementation
Summary: llvm/IR/GlobalValue.h can't be included in MC, that creates a circular dependency between MC and IR libraries. This circular dependency is causing an issue for build system that enforce layering.

Author: Xiangling_L

Reviewers: sfertile, jasonliu, hubert.reinterpretcast, gribozavr

Reviewed By: gribozavr

Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64445

llvm-svn: 365701
2019-07-10 22:13:55 +00:00
Craig Topper 021ba49b31 [X86] Remove unused variable. NFC
llvm-svn: 365697
2019-07-10 21:01:34 +00:00
Amara Emerson 7a4d2df04a [AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values.
Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes
obstruct attempts to find constant source values. These usually come about when
try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough
about this operation allows the CBZ/CBNZ optimization to catch more cases.

This change also improves the case where we can't find a constant source at all.
Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit
a cmp and conditional branch, saving an instruction.

The cumulative code size improvement of this change plus D64354 is 5.5% geomean
on arm64 CTMark -O0.

Differential Revision: https://reviews.llvm.org/D64377

llvm-svn: 365690
2019-07-10 19:21:43 +00:00
Jessica Paquette 7c95925b13 [GlobalISel][AArch64] Use getOpcodeDef instead of findMIFromReg
Some minor cleanup.

This function in Utils does the same thing as `findMIFromReg`. It also looks
through copies, which `findMIFromReg` didn't.

Delete `findMIFromReg` and use `getOpcodeDef` instead. This only happens in
`tryOptVectorDup` right now.

Update opt-shuffle-splat to show that we can look through the copies now, too.

Differential Revision: https://reviews.llvm.org/D64520

llvm-svn: 365684
2019-07-10 18:46:56 +00:00
Jessica Paquette 3132968ae9 [GlobalISel][AArch64][NFC] Use getDefIgnoringCopies from Utils where we can
There are a few places where we walk over copies throughout
AArch64InstructionSelector.cpp. In Utils, there's a function that does exactly
this which we can use instead.

Note that the utility function works with the case where we run into a COPY
from a physical register. We've run into bugs with this a couple times, so using
it should defend us from similar future bugs.

Also update opt-fold-compare.mir to show that we still handle physical registers
properly.

Differential Revision: https://reviews.llvm.org/D64513

llvm-svn: 365683
2019-07-10 18:44:57 +00:00
David Greene d300a493df Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces"
This broke some PPC prefetching tests.

This reverts commit 9fdfb045ae.

llvm-svn: 365680
2019-07-10 18:25:58 +00:00
David Greene 9fdfb045ae [System Model] [TTI] Update cache and prefetch TTI interfaces
Rework the TTI cache and software prefetching APIs to prepare for the
introduction of a general system model.  Changes include:

- Marking existing interfaces const and/or override as appropriate
- Adding comments
- Adding BasicTTIImpl interfaces that delegate to a subtarget
  implementation
- Adding a default "no information" subtarget implementation

Only a handful of targets use these interfaces currently: AArch64,
Hexagon, PPC and SystemZ.  AArch64 already has a custom subtarget
implementation, so its custom TTI implementation is migrated to use
the new facilities in BasicTTIImpl to invoke its custom subtarget
implementation.  The custom TTI implementations continue to exist for
the other targets with this change.  They are not moved over to
subtarget-based implementations.

The end goal is to have the default subtarget implementation defer to
the system model defined by the target.  With this change, the default
subtarget implementation essentially returns "no information" for
these interfaces.  None of the existing users of TTI will hit that
implementation because they define their own custom TTI
implementations and won't use the BasicTTIImpl implementations.

Once system models are in place for the targets that use these
interfaces, their custom TTI implementations can be removed.

Differential Revision: https://reviews.llvm.org/D63614

llvm-svn: 365676
2019-07-10 18:07:01 +00:00
Simon Pilgrim 5dd2af5248 [X86] EltsFromConsecutiveLoads - clean up element size calcs. NFCI.
Determine the element/load size calculations earlier and assert that they are whole bytes in size.

llvm-svn: 365674
2019-07-10 17:49:27 +00:00
Peter Collingbourne 893f8d719c MC: AArch64: Add support for pg_hi21_nc relocation specifier.
Differential Revision: https://reviews.llvm.org/D64455

llvm-svn: 365661
2019-07-10 16:36:46 +00:00
Matt Arsenault 6ce1b4fec5 GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM
llvm-svn: 365658
2019-07-10 16:31:19 +00:00
Simon Pilgrim 093f4aa72f [X86] EltsFromConsecutiveLoads - remove duplicate check for element size. NFCI.
We've already checked that each element is the correct contributory size for VT when we inspect the elements for Undef/Zero/Load.

llvm-svn: 365656
2019-07-10 16:22:31 +00:00
Simon Pilgrim 893448a3e4 [X86] EltsFromConsecutiveLoads - ensure element reg/store sizes are the same size. NFCI.
This renames the type so it doesn't sound like its based off the load size - as we're moving towards supporting combining loads of different sizes.

llvm-svn: 365655
2019-07-10 16:14:26 +00:00
Matt Arsenault 58426a3707 AMDGPU: Serialize mode from MachineFunctionInfo
llvm-svn: 365653
2019-07-10 16:09:26 +00:00
Jay Foad bba37e89a5 [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32
Summary:
D59191 added support for these modifiers in the assembler and
disassembler. This patch just teaches instruction selection that it can
use them.

Reviewers: arsenm, tstellar

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64497

llvm-svn: 365640
2019-07-10 14:53:47 +00:00
Simon Pilgrim 0a9479ef39 [X86] EltsFromConsecutiveLoads - cleanup Zero/Undef/Load element collection. NFCI.
llvm-svn: 365628
2019-07-10 13:28:13 +00:00
Petar Avramovic 7d0778ea6b [MIPS GlobalISel] Select float and double phi
Select float and double phi for MIPS32.

Differential Revision: https://reviews.llvm.org/D64420

llvm-svn: 365627
2019-07-10 13:18:13 +00:00
Petar Avramovic 7b31491ae2 [MIPS GlobalISel] Select float and double load and store
Select float and double load and store for MIPS32.

Differential Revision: https://reviews.llvm.org/D64419

llvm-svn: 365626
2019-07-10 12:55:21 +00:00
Sam Parker 775b2f598a [NFC][ARM] Convert lambdas to static helpers
Break up and convert some of the lambdas in ARMLowOverheadLoops into
static functions. 

llvm-svn: 365623
2019-07-10 12:29:43 +00:00
Simon Pilgrim ef1aac3191 [X86] EltsFromConsecutiveLoads - LDBase is non-null. NFCI.
Don't bother checking for LDBase != null - it should be (and we assert that it is).

llvm-svn: 365622
2019-07-10 12:22:59 +00:00
Simon Pilgrim c972193583 [X86] EltsFromConsecutiveLoads - store Loads on a per-element basis. NFCI.
Cache the LoadSDNode nodes so we can easily map to/from the element index instead of packing them together - this will be useful for future patches for PR16739 etc.

llvm-svn: 365620
2019-07-10 11:26:57 +00:00
Simon Pilgrim 6a58583951 [X86][SSE] EltsFromConsecutiveLoads - add basic dereferenceable support
This patch checks to see if the vector element loads are based off a dereferenceable pointer that covers the entire vector width, in which case we don't need to have element loads at both extremes of the vector width - just the start (base pointer) of it.

Another step towards partial vector loads......

Differential Revision: https://reviews.llvm.org/D64205

llvm-svn: 365614
2019-07-10 10:46:36 +00:00