llvm-project

Commit Graph

Author	SHA1	Message	Date
Stanislav Mekhanoshin	ad8131bb03	[AMDGPU] Fix VC warning about singed/unsigned comparison. NFC. This is the warning reported in https://reviews.llvm.org/D89599	2020-10-26 11:55:57 -07:00
Stanislav Mekhanoshin	611959f004	[AMDGPU] Fixed v_swap_b32 match 1. Fixed liveness issue with implicit kills. 2. Fixed potential problem with an indirect mov. Fixes: SWDEV-256848 Differential Revision: https://reviews.llvm.org/D89599	2020-10-21 10:14:24 -07:00
Austin Kerbow	ebdcef20ce	[AMDGPU] Avoid inserting noops during scheduling Passes that are run after the post-RA scheduler may insert instructions like waitcnt which eliminate the need for certain noops. After this patch the scheduler is still aware of possible latency from hazards but noops will not be inserted until the dedicated hazard recognizer pass is run. Depends on D89753. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D89754	2020-10-20 17:11:36 -07:00
Stanislav Mekhanoshin	91f503c3af	[AMDGPU] gfx1030 RT support Differential Revision: https://reviews.llvm.org/D87782	2020-09-16 11:40:58 -07:00
Michael Liao	bf41c4d29e	[codegen] Ensure target flags are cleared/set properly. NFC. - When an operand is changed into an immediate value or like, ensure their target flags being cleared or set properly. Differential Revision: https://reviews.llvm.org/D87109	2020-09-03 18:37:39 -04:00
Jay Foad	3497860203	[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister ... in favour of the isPhysical/isVirtual methods.	2020-08-20 17:59:11 +01:00
Matt Arsenault	3e52667433	AMDGPU: Fix verifier error with undef source producing s_bitset* This needs to preserve the undef flag.	2020-08-05 14:42:20 -04:00
Jay Foad	1658b8d7dd	[AMDGPU] Avoid using s_cmpk when src0 is not register The hardware spec require src0 of s_cmpk should be a register. So, we should not optimize s_cmp to s_cmpk if src0 is not register. Patch by Ruiling Song!	2020-07-14 09:05:53 +01:00
Stanislav Mekhanoshin	9ee272f13d	[AMDGPU] Add gfx1030 target Differential Revision: https://reviews.llvm.org/D81886	2020-06-15 16:18:05 -07:00
Matt Arsenault	f596ab4066	AMDGPU: Use early return	2020-04-07 13:48:00 -04:00
Changpeng Fang	6370c7c13e	AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. Summary: Current implementation of matchSwap in SIShrinkInstructions searches the entire use_nodbg_operands set to find the possible pattern to generate v_swap instruction. This approach will lead to a O(N^3) in compile time for SIShrinkInstructions. But in reality, the matching pattern only exists within nearby instructions in the same basic block. This work limits the search to a maximum of 16 instructions, and has a linear compile time comsumption. Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D74180	2020-02-07 11:06:33 -08:00
Stanislav Mekhanoshin	cacc3b7a55	[AMDGPU] Cleanup assumptions about generated subregs We are using countPopulation on a LaneBitmask to determine a number of registers it covers. This is the assumption which does not necessarily need to be true. It is not changed but factored into a single call SIRegisterInfo::getNumCoveredRegs(). Some other places are cleaned up with respect to assumptions about subreg indexes values and tablegen behavior. Differential Revision: https://reviews.llvm.org/D74177	2020-02-06 17:39:24 -08:00
Stanislav Mekhanoshin	2863c26968	Revert "AMDGPU: Limit the search in finding the instruction pattern for v_swap generation." This reverts commit `9827806481`.	2020-02-06 17:38:55 -08:00
Changpeng Fang	9827806481	AMDGPU: Limit the search in finding the instruction pattern for v_swap generation. Summary: Current implementation of matchSwap in SIShrinkInstructions searches the entire use_nodbg_operands set to find the possible pattern to generate v_swap instruction. This approach will lead to a O(N^3) in compile time for SIShrinkInstructions. But in reality, the matching pattern only exists within nearby instructions in the same basic block. This work limits the search to a maximum of 16 instructions, and has a linear compile time comsumption. Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D74180	2020-02-06 16:40:21 -08:00
Matt Arsenault	edca9ac0de	AMDGPU: Don't fold S_NOPs with implicit operands	2019-10-30 14:40:56 -07:00
Daniel Sanders	0c47611131	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041	2019-08-15 19:22:08 +00:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Nicolai Haehnle	2710171a15	AMDGPU: Write LDS objects out as global symbols in code generation Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then expected to resolve relocations, which are also emitted. Initially disabled for HSA and PAL environments until they have caught up in terms of linker and runtime loader. Some notes: - The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered to a constant at compile times, which means some tests can no longer be applied. The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now. - We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check. Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275 Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61494 llvm-svn: 364297	2019-06-25 11:52:30 +00:00
Stanislav Mekhanoshin	5250021672	[AMDGPU] gfx10 conditional registers handling This is cpp source part of wave32 support, excluding overriden getRegClass(). Differential Revision: https://reviews.llvm.org/D63351 llvm-svn: 363513	2019-06-16 17:13:09 +00:00
Stanislav Mekhanoshin	692560dc98	[AMDGPU] gfx1010 MIMG implementation Differential Revision: https://reviews.llvm.org/D61339 llvm-svn: 359698	2019-05-01 16:32:58 +00:00
Matt Arsenault	fd6fd00773	AMDGPU: Correct definitions for bitset instructions These really read and write the result register, so these need a tied input. llvm-svn: 354809	2019-02-25 19:24:46 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Graham Sellers	b297379ef0	[AMDGPU] Shrink scalar AND, OR, XOR instructions This change attempts to shrink scalar AND, OR and XOR instructions which take an immediate that isn't inlineable. It performs: AND s0, s0, ~(1 << n) -> BITSET0 s0, n OR s0, s0, (1 << n) -> BITSET1 s0, n AND s0, s1, x -> ANDN2 s0, s1, ~x OR s0, s1, x -> ORN2 s0, s1, ~x XOR s0, s1, x -> XNOR s0, s1, ~x In particular, this catches setting and clearing the sign bit for fabs (and x, 0x7ffffffff -> bitset0 x, 31 and or x, 0x80000000 -> bitset1 x, 31). llvm-svn: 348601	2018-12-07 15:33:21 +00:00
Stanislav Mekhanoshin	6b1c6548bd	[AMDGPU] Fixed return value causing warning and regression llvm-svn: 345518	2018-10-29 17:53:23 +00:00
Stanislav Mekhanoshin	79080ecd82	[AMDGPU] Match v_swap_b32 Differential Revision: https://reviews.llvm.org/D52677 llvm-svn: 345514	2018-10-29 17:26:01 +00:00
Matt Arsenault	de6c421cc8	AMDGPU: Shrink insts to fold immediates This needs to be done in the SSA fold operands pass to be effective, so there is a bit of overlap with SIShrinkInstructions but I don't think this is practically avoidable. llvm-svn: 340859	2018-08-28 18:34:24 +00:00
Matt Arsenault	35b1902bce	AMDGPU: Move canShrink into TII llvm-svn: 340855	2018-08-28 18:22:34 +00:00
Matt Arsenault	5ae765e68c	AMDGPU: Use existing function to check for VGPRs llvm-svn: 337621	2018-07-20 21:20:36 +00:00
Tom Stellard	5bfbae5cb1	AMDGPU: Refactor Subtarget classes Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851	2018-07-11 20:59:01 +00:00
Tom Stellard	79fffe3515	AMDGPU: Remove AMDGPUMCInstLower.h Summary: The AMDGPUMCInstLower class is not used outside AMDGPUMCInstLower.cpp, so we don't need a header file. Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47264 llvm-svn: 333254	2018-05-25 04:57:02 +00:00
Tom Stellard	44b30b4537	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction and register defintions, which are huge so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930	2018-05-22 02:03:23 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Adrian Prantl	5f8f34e459	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272	2018-05-01 15:54:18 +00:00
Matt Arsenault	0084adc516	AMDGPU: Add Vega12 and Vega20 Changes by Matt Arsenault Konstantin Zhuravlyov llvm-svn: 331215	2018-04-30 19:08:16 +00:00
Hiroshi Inoue	372ffa15cb	[NFC] fix trivial typos in comments "the the" -> "the", "we we" -> "we", etc llvm-svn: 330006	2018-04-13 11:37:06 +00:00
Stanislav Mekhanoshin	fa48c496e2	[AMDGPU] Shrinking V_SUBBREV_U32 V_SUBBREV_U32 is a commute opcode for V_SUBB_U32. However, when we try to commute V_SUBB_U32 in order to shrink it we do not then process V_SUBBREV_U32 and it stay VOP3. This is fixed. Differential Revision: https://reviews.llvm.org/D43699 llvm-svn: 326011	2018-02-24 01:32:32 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Matt Arsenault	9cff06f37b	AMDGPU: Allow SIShrinkInstructions to fold FrameIndexes llvm-svn: 307576	2017-07-10 20:04:35 +00:00
Matt Arsenault	6c29c5acfe	AMDGPU: Allow SIShrinkInstructions to work in non-SSA Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. llvm-svn: 307575	2017-07-10 19:53:57 +00:00
Matt Arsenault	fda5318204	AMDGPU: Remove unnecessary check for constant operands An instruction that has an immediate operand can't reach this point. This is only called for a freshly shrunk instruction, which prevously couldn't have had a literal constant operand. This was also not conservative enough since it woudl also have had to filter other constant-like inputs like frame indexes. llvm-svn: 307574	2017-07-10 19:33:38 +00:00
Matt Arsenault	a81198d82d	AMDGPU: Minor cleanup of shrinking logic llvm-svn: 307312	2017-07-06 20:56:59 +00:00
Stanislav Mekhanoshin	a9d846c6ef	[AMDGPU] Fix illegal shrink of V_SUBB_U32 and V_ADDC_U32 If there is an immediate operand we shall not shrink V_SUBB_U32 and V_ADDC_U32, it does not fit e32 encoding. Differential Revison: https://reviews.llvm.org/D34291 llvm-svn: 305840	2017-06-20 20:33:44 +00:00
Diana Picus	116bbab4e4	[CodeGen] Rename MachineInstrBuilder::addOperand. NFC Rename from addOperand to just add, to match the other method that has been added to MachineInstrBuilder for adding more than just 1 operand. See https://reviews.llvm.org/D28057 for the whole discussion. Differential Revision: https://reviews.llvm.org/D28556 llvm-svn: 291891	2017-01-13 09:58:52 +00:00
Matt Arsenault	24a1273ae1	AMDGPU: Fix shrinking of addc/subb. To shrink to VOP2 the input carry must also be VCC. llvm-svn: 291720	2017-01-11 22:58:12 +00:00
Matt Arsenault	28bd4cbeaf	AMDGPU: Fix breaking VOP3 v_add_i32s This was shrinking the instruction even though the carry output register was a virtual register, not known VCC. llvm-svn: 291716	2017-01-11 22:35:17 +00:00
Matt Arsenault	4bd7236193	AMDGPU: Fix handling of 16-bit immediates Since 32-bit instructions with 32-bit input immediate behavior are used to materialize 16-bit constants in 32-bit registers for 16-bit instructions, determining the legality based on the size is incorrect. Change operands to have the size specified in the type. Also adds a workaround for a disassembler bug that produces an immediate MCOperand for an operand that is supposed to be OPERAND_REGISTER. The assembler appears to accept out of bounds immediates and truncates them, but this seems to be an issue for 32-bit already. llvm-svn: 289306	2016-12-10 00:39:12 +00:00
Konstantin Zhuravlyov	f86e4b7266	[AMDGPU] Add f16 support (VI+) Differential Revision: https://reviews.llvm.org/D25975 llvm-svn: 286753	2016-11-13 07:01:11 +00:00
Matt Arsenault	663ab8c119	AMDGPU: Use brev for materializing SGPR constants This is already done with VGPR immediates and saves 4 bytes. llvm-svn: 285765	2016-11-01 23:14:20 +00:00
Mehdi Amini	117296c0a0	Use StringRef in Pass/PassManager APIs (NFC) llvm-svn: 283004	2016-10-01 02:56:57 +00:00
Matt Arsenault	5d8eb25e78	AMDGPU: Use unsigned compare for eq/ne For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for the one bit inputs we don't use), but for consistency always using the unsigned one. llvm-svn: 282832	2016-09-30 01:50:20 +00:00

1 2

76 Commits