llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrea Di Biagio	b24953bbfb	[llvm-mca] Let the Scheduler notify dispatch stall events caused by the lack of scheduling resources. This patch moves part of the logic that notifies dispatch stall events from the DispatchUnit to the Scheduler. The main goal of this patch is to remove (yet another) dependency between the DispatchUnit and the Scheduler. Before this patch, the DispatchUnit had to know about `Scheduler::Event` and how to classify stalls due to the lack of scheduling resources. This patch removes that knowledge and simplifies the logic in DispatchUnit::checkScheduler. This is another change done in preparation for the work to fix PR36663. No functional change intended. llvm-svn: 329835	2018-04-11 18:05:23 +00:00
Simon Pilgrim	7f321d8c24	[X86] Generalize X86PadShortFunction to work with TargetSchedModel Pre-commit for D45486, don't rely on itinerary scheduler model to determine latencies for padding, use the generic TargetSchedModel::computeInstrLatency call. Also, replace hard coded (atom specific) 2*uop creation per padding cycle with a version based on the scheduler model's issue width. Differential Revision: https://reviews.llvm.org/D45486 llvm-svn: 329834	2018-04-11 18:05:17 +00:00
Artem Belevich	2f8efcf3ca	[NVPTX] Removed 'satom' feature which is no longer used. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329830	2018-04-11 17:51:33 +00:00
Artem Belevich	24e8a680e5	[NVPTX, CUDA] Improved feature constraints on NVPTX target builtins. When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature, consider those features available if we're compiling for GPU >= sm_XX or have enabled PTX version >= ptxYY. Differential Revision: https://reviews.llvm.org/D45061 llvm-svn: 329829	2018-04-11 17:51:19 +00:00
Tim Renouf	fd8d4af3bc	[AMDGPU] Ensure there are enough registers for wave dispatch Summary: This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to allow for registers set up in wave dispatch, even if those registers are not used in the shader. Re-landed after noticing that the buildbot failure from 329808 seemed to be unrelated. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45503 Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771 llvm-svn: 329826	2018-04-11 17:18:36 +00:00
Daniel Neilson	7e2e5c3c58	[DSE] Regenerate tests with update_test_checks.py (NFC) Summary: In preparation for a future commit, this regenerates the test checks for test/Transforms/DeadStoreElimination/simple.ll test/Transforms/DeadStoreElimination/memintrinsics.ll llvm-svn: 329824	2018-04-11 16:50:04 +00:00
Reid Kleckner	0828699488	[FastISel] Disable local value sinking by default This is causing compilation timeouts on code with long sequences of local values and calls (i.e. foo(1); foo(2); foo(3); ...). It turns out that code coverage instrumentation is a great way to create sequences like this, which is how our users ran into the issue in practice. Intel has a tool that detects these kinds of non-linear compile time issues, and Andy Kaylor reported it as PR37010. The current sinking code scans the whole basic block once per local value sink, which happens before emitting each call. In theory, local values should only be introduced to be used by instructions between the current flush point and the last flush point, so we should only need to scan those instructions. llvm-svn: 329822	2018-04-11 16:03:07 +00:00
Sanjay Patel	ff98682c9c	[InstCombine] limit X - (cast(-Y) --> X + cast(Y) with hasOneUse() llvm-svn: 329821	2018-04-11 15:57:18 +00:00
Paul Robinson	0195469a23	[DWARFv5] Fuss with asm syntax for conveying MD5 checksum. Previously the MD5 option of the .file directive provided the checksum as a quoted hex string; now it's a normal hex number with 0x prefix, same as the .octa directive accepts. Differential Revision: https://reviews.llvm.org/D45459 llvm-svn: 329820	2018-04-11 15:14:05 +00:00
Petar Jovanovic	366857a23a	[MIPS GlobalISel] Select add i32, i32 Add the minimal support necessary to lower a function that returns the sum of two i32 values. Support argument/return lowering of i32 values through registers only. Add tablegen for regbankselect and instructionselect. Patch by Petar Avramovic. Differential Revision: https://reviews.llvm.org/D44304 llvm-svn: 329819	2018-04-11 15:12:32 +00:00
Haicheng Wu	5ba379557d	[SLP] update a test case. NFC. llvm-svn: 329818	2018-04-11 15:09:49 +00:00
Yaxun Liu	9381ae9791	[AMDGPU] Fix lowering enqueue_kernel Two issues were fixed: (1) The runtime has difficulty allocating memory for an external symbol of a kernel and setting the address of the external symbol, so the runtime handle of an enqueued kernel is now an ordinary global variable. The runtime only needs to store the address of the loaded kernel to the handle, and it has been verified that this approach works. (2) Handle the situation where __enqueue_kernel* gets inlined, so the enqueued kernel may be used through a constant expr instead of an instruction. Differential Revision: https://reviews.llvm.org/D45187 llvm-svn: 329815	2018-04-11 14:46:15 +00:00
Andrea Di Biagio	b15737e07c	Revert "[llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS" It caused a buildbot failure (clang-ppc64le-linux-multistage - build #6424) llvm-svn: 329812	2018-04-11 14:35:23 +00:00
Tim Renouf	8ca33bfcf3	Revert "[AMDGPU] Ensure there are enough registers for wave dispatch" This reverts 329808. That change caused a report of a failure in test/CodeGen/MIR/AMDGPU/mir-canon-multi.mir that I didn't see. I suspect it is an expensive-check-only error. Change-Id: I8133f26f15e7d5ec2b09c687c12cd70e918461b0 llvm-svn: 329811	2018-04-11 14:27:41 +00:00
Sander de Smalen	c88f9a1a57	[AArch64][AsmParser] Split index parsing from vector list. Summary: Place parsing of a vector index into a separate function to reduce duplication, since the code is duplicated in both the parsing of a Neon vector register operand and a Neon vector list. This is patch [2/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: rengolin Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45428 llvm-svn: 329809	2018-04-11 14:10:37 +00:00
Tim Renouf	f26b723491	[AMDGPU] Ensure there are enough registers for wave dispatch Summary: This fixes the number of SGPRs and VGPRs in the *_RSRC1 register to allow for registers set up in wave dispatch, even if those registers are not used in the shader. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45503 Change-Id: I6575f0e0d2a528d1319d0b289f0ebe4510fa5771 llvm-svn: 329808	2018-04-11 14:02:41 +00:00
Andrea Di Biagio	5782ec29ab	[llvm-mca][CMake] Remove unused libraries from set LLVM_LINK_COMPONENTS. llvm-svn: 329807	2018-04-11 13:52:42 +00:00
Simon Pilgrim	89c8a10f7c	[X86] Add variable shuffle schedule classes Split variable index shuffles from immediate index shuffles WriteFVarShuffle - variable 'in-lane' shuffles (VPERMILPS/VPERMIL2PS etc.) WriteVarShuffle - variable 'in-lane' shuffles (PSHUFB/VPPERM etc.) WriteFVarShuffle256 - variable 'cross-lane' shuffles (VPERMPS etc.) WriteVarShuffle256 - variable 'cross-lane' shuffles (VPERMD etc.) Differential Revision: https://reviews.llvm.org/D45404 llvm-svn: 329806	2018-04-11 13:49:19 +00:00
Francis Visoiu Mistrih	7bcb5720fd	[AArch64] Add test case for r329797 Forgot to add a test case in the previous commit. llvm-svn: 329805	2018-04-11 13:37:25 +00:00
Simon Pilgrim	6f97328b1f	[X86][SSE] Tweak cmpps schedule test so that it works properly with just sse1 movhps/movlps test are still broken so we can't disable sse2 yet llvm-svn: 329802	2018-04-11 13:15:36 +00:00
Dmitry Preobrazhensky	fc715551a3	[AMDGPU][MC][GFX9] Added v_screen_partition_4se_b32 See bug 36845: https://bugs.llvm.org/show_bug.cgi?id=36845 Differential Revision: https://reviews.llvm.org/D45443 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 329801	2018-04-11 13:13:30 +00:00
Francis Visoiu Mistrih	6463922e3a	[AArch64] Fix regression after r329691 In r329691, we would choose FP even if the offset wouldn't fit, just because the offset is smaller than the one from BP. This made many accesses through FP need to scavenge a register, which resulted in slower and bigger code for no good reason. This patch now always picks the offset that fits first, even if FP is preferred. llvm-svn: 329797	2018-04-11 12:36:55 +00:00
Andrea Di Biagio	074ff7c5b6	[llvm-mca] Minor code cleanup. NFC llvm-svn: 329796	2018-04-11 12:31:44 +00:00
Andrea Di Biagio	f41ad5c59e	[llvm-mca] Renamed BackendStatistics to RetireControlUnitStatistics. Also, removed flag -verbose in favor of flag -retire-stats. llvm-svn: 329794	2018-04-11 12:12:53 +00:00
Andrea Di Biagio	1cc29c045e	[llvm-mca] Move the logic that prints scheduler statistics from BackendStatistics to its own view. Added flag -scheduler-stats to print scheduler related statistics. llvm-svn: 329792	2018-04-11 11:37:46 +00:00
Artur Gainullin	d928201ac5	Eliminate a bitwise 'not' op of 'not' min/max by inverting the min/max. Bitwise 'not' of the min/max could be eliminated in the pattern: %notx = xor i32 %x, -1 %cmp1 = icmp sgt[slt/ugt/ult] i32 %notx, %y %smax = select i1 %cmp1, i32 %notx, i32 %y %res = xor i32 %smax, -1 https://rise4fun.com/Alive/lCN Reviewers: spatel Reviewed by: spatel Subscribers: a.elovikov, llvm-commits Differential Revision: https://reviews.llvm.org/D45317 llvm-svn: 329791	2018-04-11 10:29:37 +00:00
Sjoerd Meijer	ac96d7c4b3	[ARM] FP16 VSEL codegen This is a follow up of rL327695 to instruction select more variants of VSELGT and VSELGE, for which it is necessary to custom lower SELECT. More work is required in this area, which will be addressed soon: - more variants need to be regression tested, but this depends on the next point. - first LowerConstantFP need to be adjusted for fp16 values. Differential Revision: https://reviews.llvm.org/D45205 llvm-svn: 329788	2018-04-11 09:28:04 +00:00
Clement Courbet	33922a511d	[Build][NFC] Split off libpfm detection to a separate module. llvm-svn: 329783	2018-04-11 07:39:00 +00:00
Sander de Smalen	73937b7c9d	[AArch64][AsmParser] Unify code for parsing Neon/SVE vectors. Summary: Merged 'tryMatchVectorRegister' (specific to Neon) and 'tryParseSVERegister' into a single 'tryParseVectorRegister' function, and created a generic 'parseVectorKind()' function that returns the #Elements and ElementWidth of a vector suffix. This reduces the duplication of this functionality between the two vector implementations. This is patch [1/6] in a series to add assembler/disassembler support for SVE's contiguous ST1 (scalar+imm) instructions. Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro Reviewed By: fhahn Subscribers: tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D45427 llvm-svn: 329782	2018-04-11 07:36:10 +00:00
Clement Courbet	23db1744f1	[llvm-exegesis] Add a flag to disable libpfm even if present. Summary: Fixes PR37053. Reviewers: uabelho, gchatelet Subscribers: mgorny, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D45436 llvm-svn: 329781	2018-04-11 07:32:43 +00:00
Petr Hosek	9b4035a85a	[CMake][runtimes] Process common options in runtimes build This was removed in D39932 but turned out this is actually needed because runtimes such as compiler-rt and libc++ rely on common options processing for setting certain flags such as -ffunction-sections and -fdata-sections. Differential Revision: https://reviews.llvm.org/D45507 llvm-svn: 329778	2018-04-11 05:18:03 +00:00
Craig Topper	9507fa358c	[X86] Remove 128/256-bit masked pmaddubsw and pmaddwd intrinsics. Replace 512-bit masked intrinsic with unmasked intrinsic and a select. The 128/256-bit versions were no longer used by clang. It uses the legacy SSE/AVX2 version and a select. The 512-bit was changed to the same for consistency. llvm-svn: 329774	2018-04-11 04:55:04 +00:00
Craig Topper	ee2c1dea4d	[X86] In X86FlagsCopyLowering, when rewriting a memory setcc we need to emit an explicit MOV8mr instruction. Previously the code only knew how to handle setcc to a register. This should fix a crash in the chromium build. llvm-svn: 329771	2018-04-11 01:09:10 +00:00
Craig Topper	72fa9f12a7	[X86] Switch a test from grep to FileCheck. NFC llvm-svn: 329769	2018-04-11 01:05:32 +00:00
Sriraman Tallam	182f2df7c5	Simplification of libcall like printf->puts must check for RtLibUseGOT metadata. With -fno-plt, for example, calls to printf when getting converted to puts still use the PLT. This patch checks for the metadata "RtLibUseGOT" and annotates the declaration with the right attributes. Differential Revision: https://reviews.llvm.org/D45180 llvm-svn: 329768	2018-04-10 23:32:36 +00:00
Rui Ueyama	eb820c3aac	Use contains_lower() instead of find_lower() != StringRef::npos. NFC. llvm-svn: 329767	2018-04-10 22:58:08 +00:00
Sriraman Tallam	d693093a65	GOTPCREL references must always use RIP. With -fno-plt, global value references can use GOTPCREL and RIP must be used. Differential Revision: https://reviews.llvm.org/D45460 llvm-svn: 329765	2018-04-10 22:50:05 +00:00
Marek Olsak	a9a58fa236	AMDGPU: enable 128-bit for local addr space under an option Author: Samuel Pitoiset ds_read_b128 and ds_write_b128 have been recently enabled under the amdgpu-ds128 option because the performance benefit is unclear. Though, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. Not sure what is broken, but as ds_read_b128/ds_write_b128 are not enabled by default, just introduce a global option and enable 128-bit only if requested (until it's fixed/used correctly). v2: - fix regressions in merge-stores.ll and multiple_tails.ll Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464 llvm-svn: 329764	2018-04-10 22:48:23 +00:00
Galina Kistanova	3dc27f1a69	Disable flaky tests till they get fixed. llvm-svn: 329763	2018-04-10 22:07:29 +00:00
Geoff Berry	5696e075c3	[AArch64][Falkor] Fix bug in Falkor HWPF collision avoidance pass. Summary: When inserting MOVs to avoid Falkor HWPF collisions, the non-base register operand of load instructions (e.g. a register offset) was not being considered live, so it could potentially have been used as a scratch register, clobbering the actual offset value. Reviewers: mcrosier Subscribers: rengolin, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D45502 llvm-svn: 329761	2018-04-10 21:43:03 +00:00
Sanjay Patel	3b6d46761f	[CVP] simplify phi with constant incoming values that match common variable edge values This is based on an example that was recently posted on llvm-dev: void *propagate_null(void* b, int* g) { if (!b) { return 0; } (*g)++; return b; } https://godbolt.org/g/xYk3qG The original code or constant propagation in other passes has obscured the fact that the phi can be removed completely. Differential Revision: https://reviews.llvm.org/D45448 llvm-svn: 329755	2018-04-10 20:42:39 +00:00
Daniel Neilson	5e10637a3b	[Verifier] Refactor duplicate code for atomic mem intrinsic verification (NFC) Summary: The verification rules for the intrinsics for atomic memcpy, atomic memmove, and atomic memset are basically code clones. This change merges their verification rules into a single block to remove duplication. llvm-svn: 329753	2018-04-10 20:23:50 +00:00
Steven Wu	d0804aa6dc	[MachO] Emit Weak ReadOnlyWithRel to ConstDataSection Summary: Darwin dynamic linker can handle weak symbols in ConstDataSection. ReadOnlyWithRel symbols should be emitted in ConstDataSection instead of normal DataSection. rdar://problem/39298457 Reviewers: dexonsmith, kledzik Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45472 llvm-svn: 329752	2018-04-10 20:16:35 +00:00
Daniel Neilson	5eae06f21d	[IR] Refactor memset inst classes (NFC) Summary: A simple refactor to remove duplicate code in the definitions of MemSetInst, AtomicMemSetInst, and AnyMemSetInst. Introduce a templated base class that contains all of the methods unique to a memset intrinsic, and derive these three classes from that. llvm-svn: 329747	2018-04-10 19:51:44 +00:00
Jessica Paquette	a450ed2352	Recommit r329716 "Add missing nullptr check before getSection() to AArch64MachObjectWriter::recordRelocation" This commit fixes the bot failures that were coming up before with r329716. The fix was to move the check for "isInSection()" inside of the if condition and emit the error there instead of waiting to get past the unreachable statement. This should work in debug and release builds now. llvm-svn: 329746	2018-04-10 19:46:43 +00:00
Daniel Neilson	08a930a9c2	[IR] Refactor memtransfer inst classes (NFC) Summary: A simple refactor to remove duplicate code in the definitions of MemTransferInst, AtomicMemTransferInst, and AnyMemTransferInst. Introduce a templated base class that contains all of the methods unique to a memory transfer intrinsic, and derive these three classes from that. llvm-svn: 329744	2018-04-10 19:23:11 +00:00
Amara Emerson	e27d5016ef	[AArch64] Fix isel failure when BUILD_PAIR nodes are left over. rdar://39175175 llvm-svn: 329743	2018-04-10 19:01:58 +00:00
Gabor Buella	213edc4a15	[X86] Split up -march=icelake to -client & -server Reviewers: craig.topper, zvi, echristo Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45055 llvm-svn: 329742	2018-04-10 18:59:13 +00:00
Sanjay Patel	5da361a0b0	[InstSimplify] fix formatting; NFC llvm-svn: 329736	2018-04-10 18:38:19 +00:00
Craig Topper	442428540a	[X86] Change the name string for the newly added DF flag register to 'dirflag' to match the clobber name supported by clang for MS inline assembly. This should fix the failure found by Chromium reported here https://bugs.chromium.org/p/chromium/issues/detail?id=831158 The test case will be added in clang. llvm-svn: 329734	2018-04-10 18:21:04 +00:00
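
Note on commit d928201ac5 (eliminating a bitwise 'not' of a 'not' min/max): the pattern quoted in that message computes ~max(~x, y). Because bitwise 'not' reverses the ordering of values, this should be equal to min(x, ~y), which needs one fewer 'not'. The sketch below only checks that arithmetic identity on a handful of 32-bit values. It is not the InstCombine implementation; the function names and sample values are made up for illustration, and only the smax and umax variants are shown.

```python
MASK32 = 0xFFFFFFFF  # all-ones mask for a 32-bit value


def not_of_smax(x: int, y: int) -> int:
    """Signed form of the pattern in the commit message: ~smax(~x, y)."""
    notx = ~x                        # %notx = xor i32 %x, -1
    smax = notx if notx > y else y   # icmp sgt + select
    return ~smax                     # %res = xor i32 %smax, -1


def smin_form(x: int, y: int) -> int:
    """Assumed equivalent form: smin(x, ~y)."""
    noty = ~y
    return x if x < noty else noty


def not_of_umax(x: int, y: int) -> int:
    """Unsigned variant, with values kept in [0, 2**32)."""
    notx = x ^ MASK32
    umax = notx if notx > y else y
    return umax ^ MASK32


def umin_form(x: int, y: int) -> int:
    """Assumed equivalent form: umin(x, ~y) on unsigned 32-bit values."""
    noty = y ^ MASK32
    return x if x < noty else noty


# Illustrative sample values, including the 32-bit extremes.
SIGNED_SAMPLES = [-2**31, -2, -1, 0, 1, 7, 2**31 - 1]
UNSIGNED_SAMPLES = [0, 1, 7, 2**31, MASK32 - 1, MASK32]


def check_all() -> bool:
    """Return True if both identities hold on every sample pair."""
    signed_ok = all(
        not_of_smax(x, y) == smin_form(x, y)
        for x in SIGNED_SAMPLES for y in SIGNED_SAMPLES
    )
    unsigned_ok = all(
        not_of_umax(x, y) == umin_form(x, y)
        for x in UNSIGNED_SAMPLES for y in UNSIGNED_SAMPLES
    )
    return signed_ok and unsigned_ok
```

Calling check_all() compares the two forms over every sample pair. The slt and ult cases named in the commit message would mirror this with min and max swapped.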


162722 Commits