Commit Graph

162722 Commits

Author SHA1 Message Date
Andrea Di Biagio b24953bbfb [llvm-mca] Let the Scheduler notify dispatch stall events caused by the lack of scheduling resources.
This patch moves part of the logic that notifies dispatch stall events from the
DispatchUnit to the Scheduler.

The main goal of this patch is to remove (yet another) dependency between the
DispatchUnit and the Scheduler. Before this patch, the DispatchUnit had to know
about `Scheduler::Event` and how to classify stalls due to the lack of scheduling
resources. This patch removes that knowledge and simplifies the logic in
DispatchUnit::checkScheduler.

This is another change done in preparation for the work to fix PR36663.

No functional change intended.

llvm-svn: 329835
2018-04-11 18:05:23 +00:00
Simon Pilgrim 7f321d8c24 [X86] Generalize X86PadShortFunction to work with TargetSchedModel
Pre-commit for D45486, don't rely on itinerary scheduler model to determine latencies for padding, use the generic TargetSchedModel::computeInstrLatency call.

Also, replace hard coded (atom specific) 2*uop creation per padding cycle with a version based on the scheduler model's issue width.

Differential Revision: https://reviews.llvm.org/D45486

llvm-svn: 329834
2018-04-11 18:05:17 +00:00
Artem Belevich 2f8efcf3ca [NVPTX] Removed 'satom' feature which is no longer used.
Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329830
2018-04-11 17:51:33 +00:00
Artem Belevich 24e8a680e5 [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.
When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature,
consider those features available if we're compiling for GPU >= sm_XX or have
enabled PTX version >= ptxYY.

Differential Revision: https://reviews.llvm.org/D45061

llvm-svn: 329829
2018-04-11 17:51:19 +00:00
Tim Renouf fd8d4af3bc [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Re-landed after noticing that the buildbot failure from 329808 seemed to
be unrelated.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329826
2018-04-11 17:18:36 +00:00
Daniel Neilson 7e2e5c3c58 [DSE] Regenerate tests with update_test_checks.py (NFC)
Summary:
In preparation for a future commit, this regenerates the test checks for
test/Transforms/DeadStoreElimination/simple.ll
test/Transforms/DeadStoreElimination/memintrinsics.ll

llvm-svn: 329824
2018-04-11 16:50:04 +00:00
Reid Kleckner 0828699488 [FastISel] Disable local value sinking by default
This is causing compilation timeouts on code with long sequences of
local values and calls (i.e. foo(1); foo(2); foo(3); ...).  It turns out
that code coverage instrumentation is a great way to create sequences
like this, which how our users ran into the issue in practice.

Intel has a tool that detects these kinds of non-linear compile time
issues, and Andy Kaylor reported it as PR37010.

The current sinking code scans the whole basic block once per local
value sink, which happens before emitting each call. In theory, local
values should only be introduced to be used by instructions between the
current flush point and the last flush point, so we should only need to
scan those instructions.

llvm-svn: 329822
2018-04-11 16:03:07 +00:00
Sanjay Patel ff98682c9c [InstCombine] limit X - (cast(-Y) --> X + cast(Y) with hasOneUse()
llvm-svn: 329821
2018-04-11 15:57:18 +00:00
Paul Robinson 0195469a23 [DWARFv5] Fuss with asm syntax for conveying MD5 checksum.
Previously the MD5 option of the .file directive provided the checksum
as a quoted hex string; now it's a normal hex number with 0x prefix,
same as the .octa directive accepts.

Differential Revision: https://reviews.llvm.org/D45459

llvm-svn: 329820
2018-04-11 15:14:05 +00:00
Petar Jovanovic 366857a23a [MIPS GlobalISel] Select add i32, i32
Add the minimal support necessary to lower a function that returns the
sum of two i32 values.
Support argument/return lowering of i32 values through registers only.
Add tablegen for regbankselect and instructionselect.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D44304

llvm-svn: 329819
2018-04-11 15:12:32 +00:00
Haicheng Wu 5ba379557d [SLP] update a test case. NFC.
llvm-svn: 329818
2018-04-11 15:09:49 +00:00
Yaxun Liu 9381ae9791 [AMDGPU] Fix lowering enqueue_kernel
Two issues were fixed:

runtime has difficulty to allocate memory for an external symbol of a
kernel and set the address of the external symbol, therefore make the runtime
handle of an enqueued kernel an ordinary global variable. Runtime only needs
to store the address of the loaded kernel to the handle and has verified
that this approach works.

handle the situation where __enqueue_kernel* gets inlined therefore
the enqueued kernel may be used through a constant expr instead
of an instruction.

Differential Revision: https://reviews.llvm.org/D45187

llvm-svn: 329815
2018-04-11 14:46:15 +00:00
Andrea Di Biagio b15737e07c Revert "[llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS"
It caused a buildbot failure (clang-ppc64le-linux-multistage - build #6424)

llvm-svn: 329812
2018-04-11 14:35:23 +00:00
Tim Renouf 8ca33bfcf3 Revert "[AMDGPU] Ensure there are enough registers for wave dispatch"
This reverts 329808. That change caused a report of a failure in
test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect
it is an expensive-check-only error.

Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0
llvm-svn: 329811
2018-04-11 14:27:41 +00:00
Sander de Smalen c88f9a1a57 [AArch64][AsmParser] Split index parsing from vector list.
Summary:
Place parsing of a vector index into a separate function to reduce
duplication, since the code is duplicated in both the parsing of a
Neon vector register operand and a Neon vector list.

This is patch [2/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: rengolin

Subscribers: kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45428

llvm-svn: 329809
2018-04-11 14:10:37 +00:00
Tim Renouf f26b723491 [AMDGPU] Ensure there are enough registers for wave dispatch
Summary:
This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to
allow for registers set up in wave dispatch, even if those registers are
not used in the shader.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45503

Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771
llvm-svn: 329808
2018-04-11 14:02:41 +00:00
Andrea Di Biagio 5782ec29ab [llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS.
llvm-svn: 329807
2018-04-11 13:52:42 +00:00
Simon Pilgrim 89c8a10f7c [X86] Add variable shuffle schedule classes
Split variable index shuffles from immediate index shuffles

WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.)
WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.)

WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.)
WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.)

Differential Revision: https://reviews.llvm.org/D45404

llvm-svn: 329806
2018-04-11 13:49:19 +00:00
Francis Visoiu Mistrih 7bcb5720fd [AArch64] Add test case for r329797
Forgot to add a test case in the previous commit.

llvm-svn: 329805
2018-04-11 13:37:25 +00:00
Simon Pilgrim 6f97328b1f [X86][SSE] Tweak cmpps schedule test so that it works properly with just sse1
movhps/movlps test are still broken so we can't disable sse2 yet

llvm-svn: 329802
2018-04-11 13:15:36 +00:00
Dmitry Preobrazhensky fc715551a3 [AMDGPU][MC][GFX9] Added v_screen_partition_4se_b32
See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845

Differential Revision: https://reviews.llvm.org/D45443

Reviewers: artem.tamazov, arsenm, timcorringham
llvm-svn: 329801
2018-04-11 13:13:30 +00:00
Francis Visoiu Mistrih 6463922e3a [AArch64] Fix regression after r329691
In r329691, we would choose FP even if the offset wouldn't fit, just
because the offset is smaller than the one from BP. This made many
accesses through FP need to scavenge a register, which resulted in
slower and bigger code for no good reason.

This patch now always picks the offset that fits first, even if FP is
preferred.

llvm-svn: 329797
2018-04-11 12:36:55 +00:00
Andrea Di Biagio 074ff7c5b6 [llvm-mca] Minor code cleanup. NFC
llvm-svn: 329796
2018-04-11 12:31:44 +00:00
Andrea Di Biagio f41ad5c59e [llvm-mca] Renamed BackendStatistics to RetireControlUnitStatistics.
Also, removed flag -verbose in favor of flag -retire-stats.

llvm-svn: 329794
2018-04-11 12:12:53 +00:00
Andrea Di Biagio 1cc29c045e [llvm-mca] Move the logic that prints scheduler statistics from BackendStatistics to its own view.
Added flag -scheduler-stats to print scheduler related statistics.

llvm-svn: 329792
2018-04-11 11:37:46 +00:00
Artur Gainullin d928201ac5 Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max.
Bitwise 'not' of the min/max could be eliminated in the pattern:

%notx = xor i32 %x, -1
%cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y
%smax = select i1 %cmp1, i32 %notx, i32 %y
%res = xor i32 %smax, -1

https://rise4fun.com/Alive/lCN

Reviewers: spatel

Reviewed by: spatel

Subscribers: a.elovikov, llvm-commits

Differential Revision: https://reviews.llvm.org/D45317

llvm-svn: 329791
2018-04-11 10:29:37 +00:00
Sjoerd Meijer ac96d7c4b3 [ARM] FP16 VSEL codegen
This is a follow up of rL327695 to instruction select more variants of VSELGT
and VSELGE, for which it is necessary to custom lower SELECT.

More work is required in this area, which will be addressed soon:
- more variants need to be regression tested, but this depends on the next point.
- first LowerConstantFP need to be adjusted for fp16 values.

Differential Revision: https://reviews.llvm.org/D45205

llvm-svn: 329788
2018-04-11 09:28:04 +00:00
Clement Courbet 33922a511d [Build][NFC] Split off libpfm detection to a separate module.
llvm-svn: 329783
2018-04-11 07:39:00 +00:00
Sander de Smalen 73937b7c9d [AArch64][AsmParser] Unify code for parsing Neon/SVE vectors.
Summary:
Merged 'tryMatchVectorRegister' (specific to Neon) and
'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and
created a generic 'parseVectorKind()' function that returns the #Elements
and ElementWidth of a vector suffix. This reduces the duplication of
this functionality between two the vector implementations.

This is patch [1/6] in a series to add assembler/disassembler support for
SVE's contiguous ST1 (scalar+imm) instructions.

Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro

Reviewed By: fhahn

Subscribers: tschuett, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D45427

llvm-svn: 329782
2018-04-11 07:36:10 +00:00
Clement Courbet 23db1744f1 [llvm-exegesis] Add a flag to disable libpfm even if present.
Summary: Fixes PR37053.

Reviewers: uabelho, gchatelet

Subscribers: mgorny, tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D45436

llvm-svn: 329781
2018-04-11 07:32:43 +00:00
Petr Hosek 9b4035a85a [CMake][runtimes] Process common options in runtimes build
This was removed in D39932 but turned out this is actually needed
because runtimes such as compiler-rt and libc++ rely on common options
processing for setting certain flags such as -ffunction-sections and
-fdata-sections.

Differential Revision: https://reviews.llvm.org/D45507

llvm-svn: 329778
2018-04-11 05:18:03 +00:00
Craig Topper 9507fa358c [X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select.
The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency.

llvm-svn: 329774
2018-04-11 04:55:04 +00:00
Craig Topper ee2c1dea4d [X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction.
Previously the code only knew how to handle setcc to a register.

This should fix a crash in the chromium build.

llvm-svn: 329771
2018-04-11 01:09:10 +00:00
Craig Topper 72fa9f12a7 [X86] Switch a test from grep to FileCheck. NFC
llvm-svn: 329769
2018-04-11 01:05:32 +00:00
Sriraman Tallam 182f2df7c5 Simplification of libcall like printf->puts must check for RtLibUseGOT metadata.
With -fno-plt, for example, calls to printf when getting converted to puts
still use the PLT. This patch checks for the metadata "RtLibUseGOT" and
annotates the declaration with the right attributes.

Differential Revision: https://reviews.llvm.org/D45180

llvm-svn: 329768
2018-04-10 23:32:36 +00:00
Rui Ueyama eb820c3aac Use contains_lower() instead of find_lower() != StringRef::npos. NFC.
llvm-svn: 329767
2018-04-10 22:58:08 +00:00
Sriraman Tallam d693093a65 GOTPCREL references must always use RIP.
With -fno-plt, global value references can use GOTPCREL and RIP must be used.

Differential Revision: https://reviews.llvm.org/D45460

llvm-svn: 329765
2018-04-10 22:50:05 +00:00
Marek Olsak a9a58fa236 AMDGPU: enable 128-bit for local addr space under an option
Author: Samuel Pitoiset

ds_read_b128 and ds_write_b128 have been recently enabled
under the amdgpu-ds128 option because the performance benefit
is unclear.

Though, using 128-bit loads/stores for the local address space
appears to introduce regressions in tessellation shaders. Not
sure what is broken, but as ds_read_b128/ds_write_b128 are not
enabled by default, just introduce a global option and enable
128-bit only if requested (until it's fixed/used correctly).

v2: - fix regressions in merge-stores.ll and multiple_tails.ll

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
llvm-svn: 329764
2018-04-10 22:48:23 +00:00
Galina Kistanova 3dc27f1a69 Disable flaky tests till they get fixed.
llvm-svn: 329763
2018-04-10 22:07:29 +00:00
Geoff Berry 5696e075c3 [AArch64][Falkor] Fix bug in Falkor HWPF collision avoidance pass.
Summary:
When inserting MOVs to avoid Falkor HWPF collisions, the non-base
register operand of load instructions (e.g. a register offset) was not
being considered live, so it could potentially have been used as a
scratch register, clobbering the actual offset value.

Reviewers: mcrosier

Subscribers: rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D45502

llvm-svn: 329761
2018-04-10 21:43:03 +00:00
Sanjay Patel 3b6d46761f [CVP] simplify phi with constant incoming values that match common variable edge values
This is based on an example that was recently posted on llvm-dev:

void *propagate_null(void* b, int* g) {
  if (!b) {
    return 0;
  }
  (*g)++;
  return b;
}

https://godbolt.org/g/xYk3qG

The original code or constant propagation in other passes has obscured the fact 
that the phi can be removed completely.

Differential Revision: https://reviews.llvm.org/D45448

llvm-svn: 329755
2018-04-10 20:42:39 +00:00
Daniel Neilson 5e10637a3b [Verifier] Refactor duplicate code for atomic mem intrinsic verification (NFC)
Summary:
The verification rules for the intrinsics for atomic memcpy, atomic memmove,
and atomic memset are basically code clones. This change merges their verification
rules into a single block to remove duplication.

llvm-svn: 329753
2018-04-10 20:23:50 +00:00
Steven Wu d0804aa6dc [MachO] Emit Weak ReadOnlyWithRel to ConstDataSection
Summary:
Darwin dynamic linker can handle weak symbols in ConstDataSection.
ReadonReadOnlyWithRel symbols should be emitted in ConstDataSection
instead of normal DataSection.

rdar://problem/39298457

Reviewers: dexonsmith, kledzik

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45472

llvm-svn: 329752
2018-04-10 20:16:35 +00:00
Daniel Neilson 5eae06f21d [IR] Refactor memset inst classes (NFC)
Summary:
A simple refactor to remove duplicate code in the definitions of
MemSetInst, AtomicMemSetInst, and AnyMemSetInst. Introduce a
templated base class that contains all of the methods unique to
a memset intrinsic, and derive these three classes from that.

llvm-svn: 329747
2018-04-10 19:51:44 +00:00
Jessica Paquette a450ed2352 Recommit r329716 "Add missing nullptr check before getSection() to AArch64MachObjectWriter::recordRelocation"
This commit fixes the bot failures that were coming up before with r329716.

The fix was to move the check for "isInSection()" inside of the if condition
and emit the error there instead of waiting to get past the unreachable statement.

This should work in debug and release builds now.

llvm-svn: 329746
2018-04-10 19:46:43 +00:00
Daniel Neilson 08a930a9c2 [IR] Refactor memtransfer inst classes (NFC)
Summary:
A simple refactor to remove duplicate code in the definitions of
MemTransferInst, AtomicMemTransferInst, and AnyMemTransferInst.
Introduce a templated base class that contains all of the methods
unique to a memory transfer intrinsic, and derive these three
classes from that.

llvm-svn: 329744
2018-04-10 19:23:11 +00:00
Amara Emerson e27d5016ef [AArch64] Fix isel failure when BUILD_PAIR nodes are left over.
rdar://39175175

llvm-svn: 329743
2018-04-10 19:01:58 +00:00
Gabor Buella 213edc4a15 [X86] Split up -march=icelake to -client & -server
Reviewers: craig.topper, zvi, echristo

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D45055

llvm-svn: 329742
2018-04-10 18:59:13 +00:00
Sanjay Patel 5da361a0b0 [InstSimplify] fix formatting; NFC
llvm-svn: 329736
2018-04-10 18:38:19 +00:00
Craig Topper 442428540a [X86] Change the name string for the newly add DF flag register to 'dirflag' to match the clobber name supported by clang for MS inline assembly.
This should fix the failure found by Chromium reported here https://bugs.chromium.org/p/chromium/issues/detail?id=831158

The test case will be added in clang.

llvm-svn: 329734
2018-04-10 18:21:04 +00:00