Commit Graph

53062 Commits

Author SHA1 Message Date
Pavel Labath 2a6afe5f87 [CodeGen/AccelTable]: Handle -dwarf-linkage-names=Abstract correctly
Summary:
If we are not emitting a linkage name in the .debug_info sections, we
should not add it into the index either. This makes sure our index is
consistent with the actual debug info.

I am also explicitly setting the --dwarf-linkage-names=All in the
name-collisions test as that one would now fail on targets where this
defaults to "Abstract" (in fact, it would have failed already if there
wasn't a bug in the DWARF verifier, which I fix as well).

Reviewers: probinson, aprantl, JDevlieghere

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46748

llvm-svn: 332246
2018-05-14 14:13:20 +00:00
Sanjay Patel bf55e6dee1 [AggressiveInstCombine] avoid crashing on unsimplified code (PR37446)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37446
...raises another question: why do we run aggressive-instcombine before 
regular instcombine?

llvm-svn: 332243
2018-05-14 13:43:32 +00:00
Simon Dardis bb95dea8e7 [mips] Add missing test case from r332227
I did not commit this test from D46689.

llvm-svn: 332241
2018-05-14 13:18:51 +00:00
Simon Dardis fba0362096 [mips] Correct the predicates of indexed floating point stores and loads.
Also, fix the register class for microMIPS.

Reviewers: atanasyan, abeserminji, smaksimovic

Differential Revision: https://reviews.llvm.org/D46689

llvm-svn: 332227
2018-05-14 10:53:15 +00:00
Robert Widmann bce36770b7 [LLVM-C] Add Bindings For Module Flags
Summary:
The first foray into merging debug info into the echo tests.

- Add bindings to Module::getModuleFlagsMetadata() in the form of LLVMCopyModuleFlagsMetadata
- Add the opaque type LLVMModuleFlagEntry to represent Module::ModuleFlagEntry
- Add accessors for LLVMModuleFlagEntry's behavior, key, and metadata node.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: aprantl, JDevlieghere, llvm-commits, harlanhaskins

Differential Revision: https://reviews.llvm.org/D46792

llvm-svn: 332219
2018-05-14 08:09:00 +00:00
Bill Wendling 2a302210d0 Correct compatibility with the GNU Assembler's handling of comparison ops
GAS returns -1 for a comparison operator if the result is true and 0 if false.

  https://www.sourceware.org/binutils/docs-2.12/as.info/Infix-Ops.html#Infix%20Ops

llvm-svn: 332215
2018-05-14 05:25:36 +00:00
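
The convention described in the commit above (a true comparison yields -1, a false one yields 0) can be illustrated with a minimal sketch. This is not the LLVM or GAS implementation; the function name `gas_compare` and the set of operator spellings in the table are assumptions chosen for illustration.

```python
import operator

# Hypothetical lookup table; the operator spellings are illustrative only.
_COMPARISONS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def gas_compare(op: str, lhs: int, rhs: int) -> int:
    """Return -1 if the comparison holds and 0 otherwise, as the commit describes."""
    return -1 if _COMPARISONS[op](lhs, rhs) else 0


if __name__ == "__main__":
    print(gas_compare("<", 1, 2))   # comparison is true
    print(gas_compare("==", 1, 2))  # comparison is false
```

Using -1 rather than 1 for "true" means the result has every bit set in two's complement, so it can serve directly as a mask in later bitwise expressions.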
Craig Topper f633f3eb67 [X86] Add fast isel test cases for the clang output for 512-bit cvtps2pd related intrinsics.
llvm-svn: 332214
2018-05-14 05:09:41 +00:00
Craig Topper 0e71c6d5ca [X86] Remove and autoupgrade the cvtusi2sd intrinsic. Use uitofp+insertelement instead.
llvm-svn: 332206
2018-05-14 00:06:49 +00:00
Craig Topper 97e74b05ef [X86] Add patterns for combining movss+uint_to_fp into the intrinsic instructions under AVX512.
This matches what we do for sint_to_fp.

llvm-svn: 332205
2018-05-13 23:24:21 +00:00
Craig Topper 12067185d4 [X86] Add fast-isel test cases for _mm_cvtu32_sd, _mm_cvtu64_sd, _mm_cvtu32_ss, and _mm_cvtu64_ss.
llvm-svn: 332204
2018-05-13 23:24:19 +00:00
Craig Topper 911025b1cd [X86] Extend instcombine folds for pclmulqdq intrinsics to the 256 and 512 bit versions.
llvm-svn: 332202
2018-05-13 21:56:32 +00:00
Craig Topper f170b85d40 [X86] Add missing test for the InstCombines of pclmulqdq.
Apparently this test was lost when r293151 was committed. It was present in the review, but not the commit.

llvm-svn: 332199
2018-05-13 18:26:06 +00:00
Craig Topper 85906cf041 [X86] Remove and autoupgrade masked vpermd/vpermps intrinsics.
llvm-svn: 332198
2018-05-13 18:03:59 +00:00
Dimitry Andric a39c409619 Follow-up to rL332176 by adding a test case for PR37264.
Noticed by Simon Pilgrim.

llvm-svn: 332197
2018-05-13 14:32:23 +00:00
Matt Arsenault dfb88dfe30 AMDGPU: Make undef legal for v2i16/v2f16
This is apparently necessary to stop undef from being
turned into a build_vector of 0s.

llvm-svn: 332195
2018-05-13 10:04:38 +00:00
Puyan Lotfi 380a6f55ff [NFC] MIR-Canon: switching to a stable string sorting of instructions.
llvm-svn: 332191
2018-05-13 06:07:20 +00:00
Craig Topper 38b713d4a7 [X86] Add some load folding patterns for cvtsi2ss/sd into intrinsic instructions.
llvm-svn: 332189
2018-05-13 01:54:33 +00:00
Craig Topper 28b85caea8 [X86] Remove some unused CHECK lines from tests.
llvm-svn: 332188
2018-05-13 00:58:23 +00:00
Craig Topper df3a9cedff [X86] Remove and autoupgrade legacy cvtss2sd intrinsics.
llvm-svn: 332187
2018-05-13 00:29:40 +00:00
Craig Topper 38ad7ddabc [X86] Remove and autoupgrade cvtsi2ss/cvtsi2sd intrinsics to match what clang has used for a very long time.
llvm-svn: 332186
2018-05-12 23:14:39 +00:00
Craig Topper a288f241cd [X86] Remove some unused masked conversion intrinsics that can be replaced with an older intrinsic and a select.
This is what clang already uses.

llvm-svn: 332170
2018-05-12 02:34:28 +00:00
Stanislav Mekhanoshin 7012c246c1 [AMDGPU] Fix amdgpu-waves-per-eu accounting in scheduler
We cannot query this attribute from a subtarget given a machine function.
At this point the attribute itself is already unavailable and can only be
obtained through MFI.

Differential Revision: https://reviews.llvm.org/D46781

llvm-svn: 332166
2018-05-12 01:41:56 +00:00
Tom Stellard 655fdd3f82 AMDGPU/GlobalISel: Implement select() for >32-bit G_STORE
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D46153

llvm-svn: 332154
2018-05-11 23:12:49 +00:00
Sergey Dmitriev 69c9cd277d [CodeExtractor] Allow extracting blocks with exception handling
This is a CodeExtractor improvement which adds support for extracting blocks
which have exception handling constructs if that is legal to do. CodeExtractor
performs validation checks to ensure that extraction is legal when it finds
invoke instructions or EH pads (landingpad, catchswitch, or cleanuppad) in
blocks to be extracted.

I have also added an option to allow extraction of blocks with alloca
instructions, but no validation is done for allocas. The caller of CodeExtractor
has to validate them itself before allowing alloca instructions to be extracted.
By default allocas are still not allowed in extraction blocks.

Differential Revision: https://reviews.llvm.org/D45904

llvm-svn: 332151
2018-05-11 22:49:49 +00:00
Changpeng Fang f094885a9e AMDGPU/SI: Don't promote alloca to vector for AddrSpaceCast instruction.
Summary:
  We have no logic to promote alloca to vector for an AddrSpaceCast instruction.

Reviewer:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D45993

llvm-svn: 332147
2018-05-11 22:17:57 +00:00
Craig Topper a17d627abb [X86] Remove and autoupgrade a bunch of FMA intrinsics that are no longer used by clang.
llvm-svn: 332146
2018-05-11 21:59:34 +00:00
Artem Belevich c2cd5d5ce0 [Split GEP] handle trunc() in separate-const-offset-from-gep pass.
Let separate-const-offset-from-gep pass handle trunc() when it calculates
constant offset relative to base. The pass itself may insert trunc()
instructions when it canonicalises array indices to pointer-size integers
and needs to handle trunc() in order to evaluate the offset.

Differential Revision: https://reviews.llvm.org/D46732

llvm-svn: 332142
2018-05-11 21:13:19 +00:00
Yaxun Liu deba150c27 [AMDGPU] Fix compilation failure when IR contains comdat
Remove a useless SwitchSection which also causes compilation failure
when IR contains comdat.

The SwitchSection is useless because the current section is already the
correct text section for the function, so there is no need to switch.

It causes compilation failure for comdat because functions with comdat
have a specific text section, not the default .text section.

Since HIP uses comdat, this bug caused failures for HIP.

Differential Revision: https://reviews.llvm.org/D46770

llvm-svn: 332137
2018-05-11 20:40:14 +00:00
Daniel Neilson f6651d4d94 [InstCombine] Handle atomic memset in the same way as regular memset
Summary:
This change adds handling of the atomic memset intrinsic to the
code path that simplifies the regular memset. In practice this means
that we will now also expand a small constant-length atomic memset
into a single unordered atomic store.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46660

llvm-svn: 332132
2018-05-11 20:04:50 +00:00
Vedant Kumar 99d5c072f0 [DAGCombiner] Set the right SDLoc on extended SETCC uses (7/N)
ExtendSetCCUses updates SETCC nodes which use a load (OriginalLoad) to
reflect a simplification to the load (ExtLoad).

Based on my reading, ExtendSetCCUses may create new nodes to extend a
constant attached to a SETCC. It also creates fresh SETCC nodes which
refer to any updated operands.

ISTM that the location applied to the new constant and SETCC nodes
should be the same as the location of the ExtLoad.

This was suggested by Adrian in https://reviews.llvm.org/D45995.

Part of: llvm.org/PR37262

Differential Revision: https://reviews.llvm.org/D46216

llvm-svn: 332119
2018-05-11 18:40:10 +00:00
Vedant Kumar fd340a4047 [DAGCombiner] Set the right SDLoc on a newly-created sextload (6/N)
This teaches tryToFoldExtOfLoad to set the right location on a
newly-created extload. With that in place, the logic for performing a
certain ([s|z]ext (load ...)) combine becomes identical for sexts and
zexts, and we can get rid of one copy of the logic.

The test case churn is due to dependencies on IROrders inherited from
the wrong SDLoc.

Part of: llvm.org/PR37262

Differential Revision: https://reviews.llvm.org/D46158

llvm-svn: 332118
2018-05-11 18:40:08 +00:00
David Bolvansky cd93c4ef1a [InstCombine] snprintf optimizations
Reviewers: spatel, efriedma, majnemer, rja, bkramer

Reviewed By: rja, bkramer

Subscribers: mstorsjo, rja, llvm-commits

Differential Revision: https://reviews.llvm.org/D46285

llvm-svn: 332110
2018-05-11 17:50:49 +00:00
Simon Pilgrim 661ae7778d [X86][BtVer2] Model ymm move as double pumped instructions
We still need to handle mmx/xmm moves as 'decode-only' no-pipe instructions

llvm-svn: 332109
2018-05-11 17:38:36 +00:00
Alex Bradbury bca0c3cdb6 [RISCV] Support .option rvc and norvc assembler directives
These directives allow the 'C' (compressed) extension to be enabled/disabled 
within a single file.

Differential Revision: https://reviews.llvm.org/D45864
Patch by Kito Cheng

llvm-svn: 332107
2018-05-11 17:30:28 +00:00
Martin Storsjo 0d7c37756b [Analysis] Validate the return type of s(n)printf like libcalls
If the sprintf function is static (as on mingw-w64, where many stdio
functions are static inline wrappers), earlier optimization passes
could optimize out the return value altogether, and make it void,
which could break optimizations of this libcall that touch the
return value.

This fixes the issue discussed in PR37408 for the sprintf function.

Differential Revision: https://reviews.llvm.org/D46752

llvm-svn: 332106
2018-05-11 16:53:56 +00:00
Simon Pilgrim 706403bab8 [X86][MMX] Tag MMX Move/Load/Store as WriteVec schedule classes
Fixes an issue on SLM/Btver2 where instructions were being treated as scalar loads/stores

llvm-svn: 332104
2018-05-11 16:38:59 +00:00
Geoff Berry 60460268c0 [AArch64] Fix performPostLD1Combine to check for constant lane index.
Summary:
performPostLD1Combine in AArch64ISelLowering looks for vector
insert_vector_elt of a loaded value which it can optimize into a single
LD1LANE instruction.  The code checking for the pattern was not checking
if the lane index was a constant which could cause two problems:

- an assert when lowering the LD1LANE ISD node since it assumes a
  constant operand

- an assert in isel if the lane index value depends on the
  post-incremented base register

Both of these issues are avoided by simply checking that the lane index
is a constant.

Fixes bug 35822.

Reviewers: t.p.northover, javed.absar

Subscribers: rengolin, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D46591

llvm-svn: 332103
2018-05-11 16:25:06 +00:00
Sanjoy Das 82105e2a7d Use iteration instead of recursion in CFIInserter
Summary: This recursive step can overflow the stack.

Reviewers: djokov, petarj

Subscribers: mcrosier, jlebar, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D46671

llvm-svn: 332101
2018-05-11 15:54:46 +00:00
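
The general technique named in the commit above, replacing a recursive walk with an explicit worklist so that traversal depth no longer depends on the call stack, can be sketched as follows. This is not the CFIInserter code; the function name `visit_all` and the dictionary-based graph representation are assumptions made for illustration.

```python
def visit_all(successors, entry):
    """Visit every node reachable from `entry` using an explicit stack.

    `successors` is a hypothetical mapping from a node to a list of its
    successor nodes; nodes without an entry are treated as having none.
    """
    visited = []
    seen = {entry}
    stack = [entry]
    while stack:
        node = stack.pop()
        visited.append(node)
        # Push in reverse so successors are visited in their listed order.
        for succ in reversed(successors.get(node, [])):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return visited


if __name__ == "__main__":
    # A chain far deeper than Python's default recursion limit.
    chain = {i: [i + 1] for i in range(100_000)}
    print(len(visit_all(chain, 0)))
```

The explicit stack lives on the heap, so a very long chain of blocks only grows a list instead of nesting one call frame per block.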
Davide Italiano 6e1f7bf316 [Reassociate] Prevent infinite loops when processing PHIs.
Phi nodes can reside in live blocks but one of their incoming
arguments can come from a dead block. Dead blocks and reassociate
don't play nice together. In fact, reassociate performs an RPO
as a first step to avoid processing dead blocks.

The reason why Reassociate might not fixpoint when examining
dead blocks is that the following:

  %xor0 = xor i16 %xor1, undef
  %xor1 = xor i16 %xor0, undef

is perfectly valid LLVM IR (if it appears in a dead block),
so the worklist algorithm keeps pushing the two instructions for
reexamination. Note that this is not Reassociate's fault, at least
not entirely. It's LLVM that has a weird definition of dominance.

Fixes PR37390.

llvm-svn: 332100
2018-05-11 15:45:36 +00:00
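
The commit above notes that Reassociate performs an RPO as a first step to avoid processing dead blocks. A minimal sketch of why that works: a reverse post-order computed from the entry block only ever contains blocks reachable from the entry. This is not LLVM's implementation; the function name `reverse_post_order`, the block names, and the dictionary-based CFG are assumptions for illustration.

```python
def reverse_post_order(successors, entry):
    """Return the blocks reachable from `entry` in reverse post-order.

    `successors` is a hypothetical mapping from a block name to a list of
    successor block names. Blocks not reachable from `entry` never appear.
    """
    seen = {entry}
    post_order = []
    stack = [(entry, iter(successors.get(entry, [])))]
    while stack:
        node, succ_iter = stack[-1]
        for succ in succ_iter:
            if succ not in seen:
                seen.add(succ)
                stack.append((succ, iter(successors.get(succ, []))))
                break
        else:
            # All successors handled: the node is finished.
            post_order.append(node)
            stack.pop()
    return post_order[::-1]


if __name__ == "__main__":
    # "dead" has an edge into "live" but is itself unreachable from "entry".
    cfg = {"entry": ["live"], "live": [], "dead": ["live"]}
    print(reverse_post_order(cfg, "entry"))
```

In this sketch the dead block feeds a value into a live block, much like the PHI case in the commit, yet it is absent from the ordering, so a pass that iterates only over the RPO never visits instructions inside it directly.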
Simon Dardis d4169ad7c1 [mips] Enable disassembly of fused (negative) multiply add/sub instructions
Reviewers: atanasyan, smaksimovic, abeserminji

Differential Revision: https://reviews.llvm.org/D46392

llvm-svn: 332097
2018-05-11 15:21:40 +00:00
Simon Pilgrim 032a01f74a [X86][SLM] Vector stores only use the MEC port.
Confirmed by both Agner and Intel's AOM - the IEC/FPC are not required for pure load/stores (even if it's a partial update).

Can't fix WriteStore until all RMW instructions are cleaned up though....

llvm-svn: 332096
2018-05-11 15:16:15 +00:00
Simon Pilgrim 22dd72b995 [X86] Split WriteF/WriteVec Move/Load/Store scheduler classes by vector width
Fixes a SNB issue that was missing vlddqu/vmovntdqa ymm instructions

llvm-svn: 332094
2018-05-11 14:30:54 +00:00
Daniel Neilson 8f30ec65b0 [InstCombine] Unify handling of atomic memtransfer with non-atomic memtransfer
Summary:
This change reworks the handling of atomic memcpy within the instcombine pass.
Previously, a constant length atomic memcpy would be lowered into loads & stores
as long as no more than 16 load/store pairs are created. This is quite different
from the lowering done for a non-atomic memcpy, which only ever lowers into a single
load/store pair of no more than 8 bytes. Larger constant-sized memcpy calls are
expanded to load/stores in later passes, such as SelectionDAG lowering.

In this change the behaviour for atomic memcpy is unified with non-atomic memcpy;
atomic memcpy is now treated in the same way as non-atomic memcpy has always been.
We leave it to later passes to lower longer-length atomic memcpy calls.

Due to the structure of the pass's handling of memtransfer intrinsics, this change
also gives us handling of atomic memmove that we did not previously have.

Reviewers: apilipenko, skatkov, mkazantsev, anna, reames

Reviewed By: reames

Subscribers: reames, llvm-commits

Differential Revision: https://reviews.llvm.org/D46658

llvm-svn: 332093
2018-05-11 14:30:02 +00:00
Tom Stellard dcc95e9385 AMDGPU/GlobalISel: Implement select() for 32-bit G_FPTOUI
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45883

llvm-svn: 332082
2018-05-11 05:44:16 +00:00
Alexander Shaposhnikov 18b5fb7b84 [llvm-strip] Add support for -remove-section
This diff adds support for -remove-section to llvm-strip.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D46567

llvm-svn: 332081
2018-05-11 05:27:06 +00:00
Alexander Shaposhnikov 191913e3e7 [llvm-objcopy] Update remove-section.test
Verify that the input binary is not getting modified
and add an invocation which uses -remove-section instead of -R.

Test plan: make check-all

llvm-svn: 332078
2018-05-11 04:30:57 +00:00
Brian Gesiak c651113439 [Coroutines] PR34897: Fix incorrect elisions
Summary:
https://bugs.llvm.org/show_bug.cgi?id=34897 demonstrates an incorrect
coroutine frame allocation elision in the coro-elide pass. The elision
is performed on the basis that the SSA variables from all llvm.coro.begin
are directly referenced in subsequent llvm.coro.destroy instructions.

However, this ignores the fact that the function may exit through paths
that do not run these destroy instructions. In the sample program from
PR34897, for example, the llvm.coro.destroy instruction is only
executed in exception handling code. When the coroutine function exits
normally, llvm.coro.destroy is not called. Eliding the allocation in
this case causes a subsequent reference to the coroutine handle from
outside of the function to access freed memory.

To fix the issue, when finding an llvm.coro.destroy for each llvm.coro.begin,
only consider llvm.coro.destroy that are executed along non-exceptional paths.

Test Plan:
1. Download the sample program from
   https://bugs.llvm.org/show_bug.cgi?id=34897, compile it with
   `clang++ -fcoroutines-ts -stdlib=libc++ -std=c++1z -O2`, and run it.
   It should print `"run1\ncheck1\nrun2\ncheck2"` and then exit
   successfully.
2. Compile https://godbolt.org/g/mCKfnr and confirm it is still
   optimized to a single instruction, 'return 1190'.
3. `check-llvm`

Reviewers: rsmith, GorNishanov, eric_niebler

Reviewed By: GorNishanov

Subscribers: andrewrk, lewissbaker, EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D43242

llvm-svn: 332077
2018-05-11 03:12:28 +00:00
Kostya Serebryany a2759327fd [sanitizer-coverage] don't instrument a function if its entry block ends with 'unreachable'
llvm-svn: 332072
2018-05-11 01:09:39 +00:00
Craig Topper 4b026e5ebd [InstCombine] Add tests for cases where we don't recognize type promoted rotate idioms.
These rotates take the form

(x << (n & mask)) | (x >> (-n & mask)) where mask is bitwidth - 1.

If x has been widened beyond its original bit width by type promotion, we fail to narrow it and therefore don't recognize it as a rotate.

llvm-svn: 332068
2018-05-11 00:46:09 +00:00
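
The rotate idiom quoted in the commit above, (x << (n & mask)) | (x >> (-n & mask)) with mask = bitwidth - 1, can be checked with a small sketch. Python integers are unbounded, so the sketch adds an explicit truncation to the bit width that fixed-width hardware would perform implicitly; the function name `rotate_left` and the name `value_mask` are assumptions for illustration.

```python
def rotate_left(x: int, n: int, bitwidth: int) -> int:
    """Rotate `x` left by `n` bits within `bitwidth` bits.

    Assumes `bitwidth` is a power of two, so that `bitwidth - 1` works as a
    mask on the shift amount.
    """
    mask = bitwidth - 1
    value_mask = (1 << bitwidth) - 1  # emulates fixed-width truncation
    x &= value_mask
    return ((x << (n & mask)) | (x >> (-n & mask))) & value_mask


if __name__ == "__main__":
    print(hex(rotate_left(0x81, 1, 8)))
    print(hex(rotate_left(0x12345678, 8, 32)))
```

Masking both shift amounts keeps them in the range 0 to bitwidth - 1, so a rotate by 0 shifts by 0 in both directions and returns x unchanged instead of shifting by the full width.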
Wei Mi 0c2f6be662 [SampleFDO] Don't treat warm callsite with inline instance in the profile as cold
We found current sampleFDO had a performance issue when triaging a regression.
For a callsite with inline instance in the profile, even if hot callsite inliner
cannot inline it, it may still execute enough times and should not be treated as
cold in regular inliner later. However, currently if such callsite is not inlined
by the hot callsite inliner, and the BB where the callsite is located doesn't get
samples from other instructions inside of it, the callsite will have no profile
metadata annotated. In regular inliner cost analysis, if the callsite has no
profile annotated and its caller has profile information, it will be treated as
cold.

The fix changes the isCallsiteHot check and chooses to compare
CallsiteTotalSamples with hot cutoff value computed by ProfileSummaryInfo.

Differential Revision: https://reviews.llvm.org/D45377

llvm-svn: 332058
2018-05-10 23:02:27 +00:00