llvm-project

Author	SHA1	Message	Date
Matt Arsenault	752579736e	RegBankSelect: Handle slightly more complex value mappings Try to use concat_vectors. Also remove unnecessary assert on pointers. Fixes asserting for <4 x s16> operations and 64-bit pointers for AMDGPU. llvm-svn: 354828	2019-02-25 22:24:13 +00:00
Matt Arsenault	f4bfe4cd17	AMDGPU/GlobalISel: Fix bit ops for non-power-of-2 sizes llvm-svn: 354825	2019-02-25 21:32:48 +00:00
Matt Arsenault	82b103998b	AMDGPU/GlobalISel: Clamp max implicit_def elements llvm-svn: 354818	2019-02-25 20:46:06 +00:00
Matt Arsenault	f97ace5639	AMDGPU: Remove IntrReadMem from memtime/memrealtime intrinsics EarlyCSE with MemorySSA was able to use this to merge multiple calls with no intervening store. llvm-svn: 354814	2019-02-25 20:16:11 +00:00
Matt Arsenault	fd6fd00773	AMDGPU: Correct definitions for bitset instructions These really read and write the result register, so these need a tied input. llvm-svn: 354809	2019-02-25 19:24:46 +00:00
Konstantin Zhuravlyov	9a278bf6b5	Revert "AMDGPU/NFC: Cleanup subtarget predicates" It breaks one of our downstream merges, so revert it temporarily while investigating failures downstream llvm-svn: 354700	2019-02-22 23:21:06 +00:00
Matt Arsenault	476e26b5d3	AMDGPU: Use removeAllRegUnitsForPhysReg llvm-svn: 354686	2019-02-22 19:03:36 +00:00
Matt Arsenault	aa6fb4c45e	AMDGPU: Remove debugger related subtarget features As far as I know these aren't needed anymore. llvm-svn: 354634	2019-02-21 23:27:46 +00:00
Konstantin Zhuravlyov	c2650178a1	AMDGPU/NFC: Cleanup subtarget predicates Differential Revision: https://reviews.llvm.org/D58522 llvm-svn: 354620	2019-02-21 20:43:43 +00:00
Mark Searles	599ce44d3f	[AMDGPU] remove unused AssemblerPredicates An internal build is hitting asserts complaining about too many subtarget features: llvm/utils/TableGen/Types.cpp:42: const char* llvm::getMinimalTypeForEnumBitfield(uint64_t): Assertion `MaxIndex <= 64 && "Too many bits"' failed. llvm/utils/TableGen/AsmMatcherEmitter.cpp:1476: void {anonymous}::AsmMatcherInfo::buildInfo(): Assertion `SubtargetFeatures.size() <= 64 && "Too many subtarget features!"' failed. The short-term solution is to remove a few unused AssemblerPredicates to get under the limit. The long-term solution seems to be to revisit these asserts. E.g., rather than hardcoded '64', use the standard sized std::bitset like the other places that track subtarget features. Differential Revision: https://reviews.llvm.org/D58516 llvm-svn: 354604	2019-02-21 18:19:54 +00:00
Matt Arsenault	2e0ee47712	AMDGPU/GlobalISel: Make phis legal llvm-svn: 354592	2019-02-21 15:48:13 +00:00
Matt Arsenault	b10fa8df3f	AMDGPU/GlobalISel: Fix bit count ops for non-power-of-2 types llvm-svn: 354587	2019-02-21 15:22:20 +00:00
Stanislav Mekhanoshin	42e229e130	[AMDGPU] fix commuted case of sub combine Differential Revision: https://reviews.llvm.org/D58481 llvm-svn: 354543	2019-02-21 02:58:00 +00:00
Tom Stellard	79b5c3842b	AMDGPU/GlobalISel: Move SMRD selection logic to TableGen Reviewers: arsenm Reviewed By: arsenm Subscribers: volkan, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52922 llvm-svn: 354516	2019-02-20 21:02:37 +00:00
Matt Arsenault	75e30c4d5d	GlobalISel: Fix fewerElementsVector for ctlz with different result type Also complete the set of related operations. llvm-svn: 354480	2019-02-20 16:42:52 +00:00
Matt Arsenault	c4d07554e4	GlobalISel: Implement moreElementsVector for g_insert results llvm-svn: 354477	2019-02-20 16:11:22 +00:00
Matt Arsenault	b4c95b338b	GlobalISel: Implement moreElementsVector for select llvm-svn: 354354	2019-02-19 17:03:09 +00:00
Matt Arsenault	4d88427a58	GlobalISel: Implement moreElementsVector for G_EXTRACT source llvm-svn: 354348	2019-02-19 16:44:22 +00:00
Matt Arsenault	26b7e859ef	GlobalISel: Implement moreElementsVector for bit ops llvm-svn: 354345	2019-02-19 16:30:19 +00:00
Changpeng Fang	4cabf6d3b5	AMDGPU: Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint in LoadStoreOptimizer pass. Summary: This is to fix a memory dependence bug in LoadStoreOptimizer. Reviewers: arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58295 llvm-svn: 354295	2019-02-18 23:00:26 +00:00
Matt Arsenault	fbe92a53d0	GlobalISel: Implement widenScalar for g_extract scalar results llvm-svn: 354293	2019-02-18 22:39:27 +00:00
Konstantin Zhuravlyov	1e126c503b	AMDGPU: Set ABI version to 1 for code object v3 Differential Revision: https://reviews.llvm.org/D57811 llvm-svn: 354085	2019-02-14 23:56:04 +00:00
Matt Arsenault	530d05e94a	GlobalISel: Add alignment to LegalityQuery MMOs This allows targets to specify the minimum alignment required for the load/store. llvm-svn: 354071	2019-02-14 22:41:09 +00:00
Matt Arsenault	9e5e868d95	AMDGPU/GlobalISel: Fix RegBankSelect for GEP. This is basically a pointer typed add, so shouldn't be any different. This was assuming everything was an SGPR, which is not true. Also clean up legality for GEP. I don't seem to be seeing the problem mentioned in the comment on the hack marking s64 as a legal pointer type. llvm-svn: 354067	2019-02-14 22:24:28 +00:00
Stanislav Mekhanoshin	871821f786	[AMDGPU] Reassociate 'add (add x, y), z' to use SALU Reassociate adds to collect scalar operands in a single instruction when possible. That will result in a scalar add followed by a vector add instead of two vector adds, thus better utilizing SALU. Differential Revision: https://reviews.llvm.org/D58220 llvm-svn: 354066	2019-02-14 22:11:25 +00:00
Matt Arsenault	d3d496338e	AMDGPU/GlobalISel: Handle split for 64-bit VALU select llvm-svn: 354065	2019-02-14 21:58:12 +00:00
Matt Arsenault	4cd9509e1d	AMDGPU: Try to use function specific ST Subtargets are a function level property, so ideally we would eliminate everywhere that needs to check the global one. Rename the function to try avoiding confusion. llvm-svn: 353900	2019-02-12 23:44:13 +00:00
Matt Arsenault	d24296e282	AMDGPU: Ignore CodeObjectV3 when inlining This was inhibiting inlining of library functions when clang was invoking the inliner directly. This is covering a bit of a mess with subtarget feature handling, and this shouldn't be a subtarget feature. The behavior is different depending on whether you are using a -mattr flag in clang, or llc, opt. llvm-svn: 353899	2019-02-12 23:30:11 +00:00
Konstantin Zhuravlyov	6220d62e5c	AMDGPU/NFC: Remove SubtargetFeatureISAVersion since it is not used anywhere llvm-svn: 353892	2019-02-12 22:49:49 +00:00
Konstantin Zhuravlyov	acb231c8d8	AMDGPU: Remove duplicate processor (gfx900) llvm-svn: 353889	2019-02-12 22:29:25 +00:00
Matt Arsenault	00ccd13c73	AMDGPU/GlobalISel: Only make f16 constants legal on f16 targets We could deal with it, but there's no real point. llvm-svn: 353845	2019-02-12 14:54:55 +00:00
Matt Arsenault	18ec382698	GlobalISel: Implement moreElementsVector for implicit_def llvm-svn: 353754	2019-02-11 22:00:39 +00:00
Matt Arsenault	9dba67f431	GlobalISel: Add G_FCANONICALIZE instruction llvm-svn: 353719	2019-02-11 17:05:20 +00:00
Benjamin Kramer	582c16013d	[AMDGPU] Remove unused variable llvm-svn: 353704	2019-02-11 14:49:54 +00:00
Neil Henning	8c10fa1a90	[AMDGPU] Fix DPP sequence in atomic optimizer. This commit fixes the DPP sequence in the atomic optimizer (which was previously missing the row_shr:3 step), and works around a read_register exec bug by using a ballot instead. Differential Revision: https://reviews.llvm.org/D57737 llvm-svn: 353703	2019-02-11 14:44:14 +00:00
Valery Pykhtin	ded96df01e	[AMDGPU] Enable DPP combiner pass by default. Related revisions: https://reviews.llvm.org/D55444, https://reviews.llvm.org/D55314 llvm-svn: 353691	2019-02-11 11:15:03 +00:00
Stanislav Mekhanoshin	0e858b028d	[AMDGPU] Split dot-insts feature Differential Revision: https://reviews.llvm.org/D57971 llvm-svn: 353587	2019-02-09 00:34:21 +00:00
Craig Topper	784929d045	Implementation of asm-goto support in LLVM This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563	2019-02-08 20:48:56 +00:00
Matt Arsenault	564f0f832c	AMDGPU: Eliminate GPU specific SubtargetFeatures Inline compatibility is determined from the individual feature bits. These are just sets of the separate features, but will always be treated as incompatible unless they are specifically ignored. Defining the ISA version number here in tablegen would be nice, but it turns out this wasn't actually used. llvm-svn: 353558	2019-02-08 19:59:32 +00:00
Matt Arsenault	d7047276ec	AMDGPU: Remove GCN features and predicates These are no longer necessary since the R600 tablegen files are split out now. llvm-svn: 353548	2019-02-08 19:18:01 +00:00
Carl Ritson	494b8ac95a	[AMDGPU] Fix CS scratch setup on pre-GCN3 ASICs Summary: Prior to GCN3 s_load_dword offsets are in dwords rather than bytes. Thus the scratch buffer descriptor offset must be adjusted for pre-GCN3 ASICs. Reviewers: nhaehnle, tpr Reviewed By: nhaehnle Subscribers: sheredom, arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D56496 llvm-svn: 353530	2019-02-08 15:41:11 +00:00
Matt Arsenault	b0a227049f	AMDGPU/GlobalISel: Fix shift legalization for non-power-of-2 clampScalar doesn't do anything for non-power-of-2 in range. There should probably be a combination rule to reduce the number of matching rules. llvm-svn: 353526	2019-02-08 15:06:24 +00:00
Dmitry Preobrazhensky	942c273d64	[AMDGPU][MC] Added support of lds_direct operand See bug 39293: https://bugs.llvm.org/show_bug.cgi?id=39293 Reviewers: artem.tamazov, rampitec Differential Revision: https://reviews.llvm.org/D57889 llvm-svn: 353524	2019-02-08 14:57:37 +00:00
Matt Arsenault	0f2debb1c2	AMDGPU/GlobalISel: Fix non-power-of-2 implicit_def llvm-svn: 353522	2019-02-08 14:46:27 +00:00
Matt Arsenault	dc88a2ce35	AMDGPU/GlobalISel: Don't use a copy in addrspacecast lowering llvm-svn: 353516	2019-02-08 14:16:11 +00:00
Dmitry Preobrazhensky	62a0318dff	[AMDGPU][MC][CODEOBJECT] Added predefined symbols to access GPU minor and stepping numbers Added the following Code Object v3 symbols: .amdgcn.gfx_generation_minor .amdgcn.gfx_generation_stepping Reviewers: artem.tamazov, kzhuravl Differential Revision: https://reviews.llvm.org/D57826 llvm-svn: 353515	2019-02-08 13:51:31 +00:00
Valery Pykhtin	7fe97f8c7c	[AMDGPU] Fix DPP combiner Differential revision: https://reviews.llvm.org/D55444 dpp move with uses and old reg initializer should be in the same BB. bound_ctrl:0 is only considered when bank_mask and row_mask are fully enabled (0xF). Otherwise the old register value is checked for identity. Added add, subrev, and, or instructions to the old folding function. Kill flag is cleared for the src0 (DPP register) as it may be copied into more than one user. The pass is still disabled by default. llvm-svn: 353513	2019-02-08 11:59:48 +00:00
Matt Arsenault	a8b4339c2f	AMDGPU/GlobalISel: Legalize addrspacecast Use a placeholder constant for now on targets that need the load from the queue ptr. llvm-svn: 353497	2019-02-08 02:40:47 +00:00
Matt Arsenault	fbec8fe93b	GlobalISel: Implement narrowScalar for shift main type This is pretty much directly ported from SelectionDAG. Doesn't include the shift by non-constant but known bits version, since there isn't a globalisel version of computeKnownBits yet. This shows a disadvantage of targets not specifying which type should be used for the shift amount. If type 0 is legalized before type 1, the operations on the shift amount type use the wider type (which are also less likely to legalize). This can be avoided by targets specifying legalization actions on type 1 earlier than for type 0. llvm-svn: 353455	2019-02-07 19:37:44 +00:00
Matt Arsenault	d914189a2e	AMDGPU/GlobalISel: Restrict g_implicit_def legality llvm-svn: 353452	2019-02-07 19:10:15 +00:00

3161 Commits