llvm-project

Author	SHA1	Message	Date
Fangrui Song	96b0b9a5e3	[X86] Enable shrink-wrapping for no-frame-pointer non-nounwind functions on platforms not using compact unwind The current compact unwind scheme does not work when the prologue is not at the start (the instructions before the prologue cannot be described). (Technically this is fixable, but it requires multiple compact unwind descriptors for one function.) rL255175 chose to not perform shrink-wrapping for no-frame-pointer functions not marked as nounwind to work around PR25614. This is overly restrictive, as platforms not supporting compact unwind (all non-Darwin) do not need the workaround. This patch restricts the limitation to compact unwind platforms. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D89930	2020-11-04 16:51:48 -08:00
Arthur Eubanks	ab0ddbc38a	Reland [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 13:11:40 -08:00
Arthur Eubanks	9173b5a99d	Revert "[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback" This reverts commit `7a83aa0520`. Causing buildbot failures.	2020-11-04 12:57:32 -08:00
Arthur Eubanks	7a83aa0520	[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 12:53:30 -08:00
Eric Astor	07c4f1d10b	[ms] [llvm-ml] Lex MASM strings, including escaping Allow single-quoted strings and double-quoted character values, as well as doubled-quote escaping. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D89731	2020-11-04 15:28:43 -05:00
Cameron McInally	c126eb7529	[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247. Differential Revision: https://reviews.llvm.org/D90644	2020-11-04 14:20:31 -06:00
Mircea Trofin	5dc47541f9	[NFC] Use Register/MCRegister Differential Revision: https://reviews.llvm.org/D90724	2020-11-04 12:20:17 -08:00
Craig Topper	cc3bf27077	[RISCV] Remove assertsexti32 from fslw/fsrw isel patterns. The operations in these patterns shouldn't be affected by sign bits. And the pattern is starting from a sign_extend_inreg so we aren't expecting sign bits to be passed through either. Differential Revision: https://reviews.llvm.org/D90739	2020-11-04 11:37:58 -08:00
Craig Topper	d47300f503	[RISCV] Correct the operand order for fshl/fshr to fsl/fsr instructions. fsl/fsr take their shift amount in $rs2 or an immediate. The sources are $rs1 and $rs3. fshl/fshr ISD opcodes both concatenate operand 0 in the high bits and operand 1 in the lower bits. fshl returns the high bits after shifting and fshr returns the low bits. So a shift amount of 0 returns operand 0 for fshl and operand 1 for fshr. fsl/fsr concatenate their operands in different orders such that $rs1 will be returned for a shift amount of 0. So $rs1 needs to come from operand 0 of fshl and operand 1 of fshr. Differential Revision: https://reviews.llvm.org/D90735	2020-11-04 11:13:25 -08:00
Craig Topper	0122a4ea66	[RISCV] Remove assertsexti32 from inputs to riscv_sllw/srlw nodes in B extension isel patterns. riscv_sllw/srlw only read the lower 32 bits of the first operand and the lower 5 bits of the second operand. Whether the upper 32 bits of the input are sign bits or not doesn't matter. Also use ineg and not to shorten the patterns. Differential Revision: https://reviews.llvm.org/D90668	2020-11-04 10:35:05 -08:00
Craig Topper	857563eaf0	[RISCV] Check all 64-bits of the mask in SelectRORIW. We need to ensure the upper 32 bits of the mask are zero so that the srl shifts zeroes into the lower 32 bits. Differential Revision: https://reviews.llvm.org/D90585	2020-11-04 10:15:30 -08:00
Christopher Tetreault	900ec97bbe	[UBSan] Cannot negate smallest negative signed integer Silence Undefined Behavior Sanitizer warning: runtime error: negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long'); cast to an unsigned type to negate this value to itself Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D90710	2020-11-04 10:07:52 -08:00
Craig Topper	3701e33a22	[RISCV] Remove custom isel for (srl (shl val, 32), imm). Use pattern instead. NFCI We don't need custom matching, we just need a predicate to check the immediate is greater than 32. We can use the existing ImmSub32 to adjust the immediate. I've also used the new predicate in the other location that used ImmSub32. I tried to create a test case where we would break without the greater than 32 check on that pattern, but DAG combine defeated me. Still seemed safer to have it. Differential Revision: https://reviews.llvm.org/D90546	2020-11-04 09:59:14 -08:00
Joe Nash	58adab34c4	[AMDGPU] Resolve pseudo registers at encoding uses Pseudo-registers allow different register encodings between gpu generations. Make sure we resolve the pseudo regs to real regs whenever we get their hardware encoding. Using the correct encodings revealed a register bank conflict and an unnecessary write dependency. Tests have been updated to match. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D90721 Change-Id: I73c154cd24aecc820993b50bebaf4df97a5710ca	2020-11-04 12:52:32 -05:00
Sebastian Neubauer	31a0b2834f	[AMDGPU] Fix iterating in SIFixSGPRCopies The insertion of waterfall loops splits the current basic block into three blocks. So the basic block that we iterate over must be updated. This failed assert(!NodePtr->isKnownSentinel()) in ilist_iterator for divergent calls in branches before. Differential Revision: https://reviews.llvm.org/D90596	2020-11-04 18:43:19 +01:00
Paul C. Anagnostopoulos	d56cd4291e	[TableGen] Add !interleave operator to concatenate a list of values with delimiters Add a test. Use it in some TableGen files. Differential Revision: https://reviews.llvm.org/D90469	2020-11-04 09:23:54 -05:00
Simon Moll	351c10cc72	[VE] Add +vpu attribute `+vpu` controls whether VEISelLowering adds any vregs. This defaults to `-vpu` to have scalar code generation out of the box. We bring up vector isel under the `+vpu` flag. Once vector isel is stable we switch to `+vpu` and advertise vregs and vops in TTI. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D90465	2020-11-04 12:42:00 +01:00
Kerry McLaughlin	f2412d372d	[SVE][CodeGen] Lower scalable integer vector reductions This patch uses the existing LowerFixedLengthReductionToSVE function to also lower scalable vector reductions. A separate function has been added to lower VECREDUCE_AND & VECREDUCE_OR operations with predicate types using ptest. Lowering scalable floating-point reductions will be addressed in a follow up patch, for now these will hit the assertion added to expandVecReduce() in TargetLowering. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D89382	2020-11-04 11:38:49 +00:00
Sebastian Neubauer	1124bf4ab7	[AMDGPU] Set rsrc1 flags for graphics shaders Before they were only set for compute kernels and compute shaders but not for other shaders. Differential Revision: https://reviews.llvm.org/D89399	2020-11-04 12:25:41 +01:00
Sebastian Neubauer	76313288cd	[AMDGPU] Fix ieee mode default value Previously, the default value for ieee mode was - on for compute kernels and compute shaders, - off for all shaders except compute shaders. This commit changes the default to be - on for compute kernels, - off for shaders. This aligns the default value with the settings that are actually in use. To my knowledge, all users of shader calling conventions (mesa and llpc) disable the ieee mode by default. Differential Revision: https://reviews.llvm.org/D89388	2020-11-04 12:25:38 +01:00
David Green	eb611930b6	[ARM] Remove unused variable. NFC	2020-11-04 09:00:03 +00:00
Sander de Smalen	73b6cb67dc	[NFCI] Replace AArch64StackOffset by StackOffset. This patch replaces the AArch64StackOffset class by the generic one defined in TypeSize.h. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D88983	2020-11-04 08:49:00 +00:00
Amara Emerson	393b55380a	[AArch64][GlobalISel] Add combine for G_EXTRACT_VECTOR_ELT to allow selection of pairwise FADD. For the <2 x float> case, instead of adding another combine or legalization to get it into a <4 x float> form, I'm just adding a GISel specific selection pattern to cover it. Differential Revision: https://reviews.llvm.org/D90699	2020-11-03 17:25:14 -08:00
Julien Jorge	0fca651711	[WebAssembly] Don't fold frame offset for global addresses When machine instructions are in the form of ``` %0 = CONST_I32 @str %1 = ADD_I32 %stack.0, %0 %2 = LOAD 0, 0, %1 ``` In the `ADD_I32` instruction, it is possible to fold it if `%0` is a `CONST_I32` from an immediate number. But in this case it is a global address, so we shouldn't do that. But we haven't checked if the operand of `ADD` is an immediate so far. This fixes the problem. (The case applies the same for `ADD_I64` and `CONST_I64` instructions.) Fixes https://bugs.llvm.org/show_bug.cgi?id=47944. Patch by Julien Jorge (jjorge@quarkslab.com) Reviewed By: dschuff Differential Revision: https://reviews.llvm.org/D90577	2020-11-03 14:56:25 -08:00
Sanjay Patel	c40126e740	[ARM] remove cost-kind predicate for most math op costs This is based on the same idea that I am using for the basic model implementation and what I have partly already done for x86: throughput cost is number of instructions/uops, so size/blended costs are identical except in special cases (for example, fdiv or other known-expensive machine instructions or things like MVE that may require cracking into >1 uop)). Differential Revision: https://reviews.llvm.org/D90692	2020-11-03 17:23:46 -05:00
Jordan Rupprecht	980bf1d5d1	[NFC] Inline wasm assertion-only variable	2020-11-03 13:06:59 -08:00
Andy Wingo	107c3a12d6	[WebAssembly] Implement ref.null This patch adds a new "heap type" operand kind to the WebAssembly MC layer, used by ref.null. Currently the possible values are "extern" and "func"; when typed function references come, though, this operand may be a type index. Note that the "heap type" production is still known as "refedtype" in the draft proposal; changing its name in the spec is ongoing (https://github.com/WebAssembly/reference-types/issues/123). The register form of ref.null is still untested. Differential Revision: https://reviews.llvm.org/D90608	2020-11-03 10:46:23 -08:00
Craig Topper	00eff96e1d	[RISCV] Add missing patterns for rotr with immediate for Zbb/Zbp extensions. DAGCombine doesn't canonicalize rotl/rotr with immediate so we need patterns for both. Remove the custom matcher for rotl to RORI and just use a SDNodeXForm to convert the immediate instead. Doing this gives priority to the rev32/rev16 versions of grevi over rori since an explicit immediate is more precise than any immediate. I also added rotr patterns for rev32/rev16. And removed the (or (shl), (shr)) patterns that should be combined to rotl by DAG combine. There is at least one other grev pattern that probably needs another rotr pattern, but we need more test coverage first. Differential Revision: https://reviews.llvm.org/D90575	2020-11-03 10:04:52 -08:00
Esme-Yi	5053eab890	Revert "[PowerPC] Extend folding RLWINM + RLWINM to post-RA." This reverts commit `119ab2181e`.	2020-11-03 16:34:02 +00:00
Tim Renouf	89d41f3a2b	[AMDGPU] Add gfx1033 target Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761	2020-11-03 16:27:48 +00:00
Tim Renouf	ee3e642627	[AMDGPU] Add gfx90c target This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-11-03 16:27:43 +00:00
Jay Foad	040c50278c	[AMDGPU] Fix ds_read2/write2 with unaligned offsets These instructions use a scaled offset. We were wrongly selecting them even when the required offset was not a multiple of the scale factor. Differential Revision: https://reviews.llvm.org/D90607	2020-11-03 15:16:10 +00:00
Jameson Nash	a0ad066ce4	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed; this just cleans up the code so that it can be exposed publicly. Replaces https://reviews.llvm.org/D74158 Differential Revision: https://reviews.llvm.org/D89613	2020-11-03 10:02:09 -05:00
Sanjay Patel	9af561ec99	[x86] update cost table comments for maxnum; NFC Follow-up suggested in D90613.	2020-11-03 08:09:59 -05:00
David Green	bd32386410	[ARM] Remove unused variable. NFC	2020-11-03 12:58:10 +00:00
David Green	e474499402	[ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops If an instruction will be lowered to a call there is no advantage of using a low overhead loop as the LR register will need to be spilled and reloaded around the call, and the low overhead will end up being reverted. This teaches our hardware loop lowering that these memory intrinsics will be calls under certain situations. Differential Revision: https://reviews.llvm.org/D90439	2020-11-03 11:53:09 +00:00
Nicholas Guy	54d8627852	[AArch64] Redundant masks in downcast long multiply Adds patterns to catch masks preceding a long multiply, and generate a single umull/smull instruction instead. Differential revision: https://reviews.llvm.org/D89956	2020-11-03 10:12:28 +00:00
Petar Avramovic	0031418dce	AMDGPU/GlobalISel: Use same builder/observer in post-legalizer-combiner Change match/apply functions into methods of a new target-specific combiner helper class. Use a reference to the MachineIRBuilder from the helper instead of constructing a new MachineIRBuilder each time a new instruction needs to be made. Allows correct tracking of newly created instructions. Differential Revision: https://reviews.llvm.org/D90623	2020-11-03 09:24:50 +01:00
Esme-Yi	119ab2181e	[PowerPC] Extend folding RLWINM + RLWINM to post-RA. Summary: This patch depends on D89846. We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINMs will be generated after RA, for example rGc4690b007743. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too. Reviewed By: shchenz, steven.zhang Differential Revision: https://reviews.llvm.org/D89855	2020-11-03 07:44:11 +00:00
Craig Topper	46e91f6701	[RISCV] Remove isel patterns for fshl/fshr with same inputs. NFC These were being selected to ROL/ROR, but DAG combine should canonicalize fshl/fshr with same inputs to rotl/rotr which we also have patterns for.	2020-11-02 23:12:18 -08:00
Esme-Yi	b969dfe26f	[NFC][PowerPC] Move the folding RLWINMs from ppc-mi-peephole to PPCInstrInfo. Summary: We have the patterns to fold 2 RLWINMs in ppc-mi-peephole, while some RLWINMs will be generated after RA, for example D88274. If an RLWINM generated after RA is followed by another RLWINM, we expect to perform the optimization after RA, too. This is an NFC patch to move the folding patterns to PPCInstrInfo, and the follow-up work will call it in pre-emit-peephole and expand the patterns to handle more cases. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D89846	2020-11-03 06:28:56 +00:00
Jessica Clarke	7601a21738	[RISCV] Only return DestSourcePair from isCopyInstrImpl for registers ADDI often has a frameindex in operand 1, but consumers of this interface, such as MachineSink, tend to call getReg() on the Destination and Source operands, leading to the following crash when building FreeBSD after this implementation was added in 8cf6778d30: ``` clang: llvm/include/llvm/CodeGen/MachineOperand.h:359: llvm::Register llvm::MachineOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: #0 0x00007f4286f9b4d0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) llvm/lib/Support/Unix/Signals.inc:563:0 #1 0x00007f4286f9b587 PrintStackTraceSignalHandler(void) llvm/lib/Support/Unix/Signals.inc:630:0 #2 0x00007f4286f9926b llvm::sys::RunSignalHandlers() llvm/lib/Support/Signals.cpp:71:0 #3 0x00007f4286f9ae52 SignalHandler(int) llvm/lib/Support/Unix/Signals.inc:405:0 #4 0x00007f428646ffd0 (/lib/x86_64-linux-gnu/libc.so.6+0x3efd0) #5 0x00007f428646ff47 raise /build/glibc-2ORdQG/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0 #6 0x00007f42864718b1 abort /build/glibc-2ORdQG/glibc-2.27/stdlib/abort.c:81:0 #7 0x00007f428646142a __assert_fail_base /build/glibc-2ORdQG/glibc-2.27/assert/assert.c:89:0 #8 0x00007f42864614a2 (/lib/x86_64-linux-gnu/libc.so.6+0x304a2) #9 0x00007f428d4078e2 llvm::MachineOperand::getReg() const llvm/include/llvm/CodeGen/MachineOperand.h:359:0 #10 0x00007f428d8260e7 attemptDebugCopyProp(llvm::MachineInstr&, llvm::MachineInstr&) llvm/lib/CodeGen/MachineSink.cpp:862:0 #11 0x00007f428d826442 performSink(llvm::MachineInstr&, llvm::MachineBasicBlock&, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::SmallVectorImpl<llvm::MachineInstr>&) llvm/lib/CodeGen/MachineSink.cpp:918:0 #12 0x00007f428d826e27 (anonymous namespace)::MachineSinking::SinkInstruction(llvm::MachineInstr&, bool&, std::map<llvm::MachineBasicBlock, llvm::SmallVector<llvm::MachineBasicBlock, 4u>, std::less<llvm::MachineBasicBlock>, std::allocator<std::pair<llvm::MachineBasicBlock const, llvm::SmallVector<llvm::MachineBasicBlock*, 4u> > > >&) llvm/lib/CodeGen/MachineSink.cpp:1073:0 #13 0x00007f428d824a2c (anonymous namespace)::MachineSinking::ProcessBlock(llvm::MachineBasicBlock&) llvm/lib/CodeGen/MachineSink.cpp:410:0 #14 0x00007f428d824513 (anonymous namespace)::MachineSinking::runOnMachineFunction(llvm::MachineFunction&) llvm/lib/CodeGen/MachineSink.cpp:340:0 ``` Thus, check that operand 1 is also a register in the condition. Reviewed By: arichardson, luismarques Differential Revision: https://reviews.llvm.org/D89090	2020-11-03 03:55:47 +00:00
Qiu Chaofan	d14e51806b	[PowerPC] Skip IEEE 128-bit FP type in FastISel Vector types, quadword integers and f128 currently cannot be handled in FastISel. We did not skip f128 type in lowering arguments, which causes a crash. This patch will fix it. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D90206	2020-11-03 11:17:11 +08:00
Qiu Chaofan	3204ffeade	[PowerPC] [NFC] Rename VCMPo to VCMP_rec Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D90581	2020-11-03 11:10:59 +08:00
Fangrui Song	ca01a6b3ac	[PowerPC] Parse and ignore .machine ppc64 In the wild, kexec-tools purgatory/arch/ppc64/v2wrap.S and hvcall.S use this directive.	2020-11-02 16:49:57 -08:00
Krzysztof Parzyszek	b26a2755dc	[Hexagon] Move isTypeForHVX from Hexagon TTI to HexagonSubtarget, NFC It's useful outside of Hexagon TTI, and with how TTI is implemented, it is not accessible outside of TTI.	2020-11-02 14:00:45 -06:00
Stanislav Mekhanoshin	c9d6fe6f7d	[AMDGPU] Improve FLAT scratch detection We were using too broad a check for isFLATScratch() which also includes FLAT global. Differential Revision: https://reviews.llvm.org/D90505	2020-11-02 11:37:33 -08:00
Craig Topper	9ac2910093	[RISCV] Make SelectRORIW handle the commutability of OR. The SHL and SRL could be in opposite order so account for that. Differential Revision: https://reviews.llvm.org/D90586	2020-11-02 09:32:54 -08:00
Sanjay Patel	35fa3c474f	[x86] add AVX2 cost model entries for maxnum of 256-bit vectors As noticed in D90554 , the AVX2 costs for 256-bit vectors did not include FMAXNUM entries, so we fell back to AVX1 which assumes those ops will be split into 128-bit halves or something close to that. Differential Revision: https://reviews.llvm.org/D90613	2020-11-02 12:20:17 -05:00
Craig Topper	7142ec3aaf	[RISCV] When matching RORIW, make sure the same input is given to both shifts. The code is looking for (sext_inreg (or (shl X, C2), (shr (and Y, C3), C1))). We need to ensure X and Y are the same. Differential Revision: https://reviews.llvm.org/D90580	2020-11-02 09:12:40 -08:00
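
Note on the fshl/fshr semantics described in d47300f503 and 46e91f6701: the following is a minimal Python sketch of the funnel-shift behavior as those commit messages state it (operand 0 in the high bits, operand 1 in the low bits; fshl returns the high half after shifting, fshr the low half). It is not LLVM code. The function names, the default 32-bit width, and the modulo reduction of the shift amount are assumptions made for illustration.

```python
def fshl(op0: int, op1: int, shamt: int, bits: int = 32) -> int:
    """Funnel shift left: concatenate op0 (high) and op1 (low), shift left,
    and return the high half. Width and modulo behavior are assumptions."""
    mask = (1 << bits) - 1
    shamt %= bits
    concat = ((op0 & mask) << bits) | (op1 & mask)
    return ((concat << shamt) >> bits) & mask


def fshr(op0: int, op1: int, shamt: int, bits: int = 32) -> int:
    """Funnel shift right: same concatenation, shift right, return the low half."""
    mask = (1 << bits) - 1
    shamt %= bits
    concat = ((op0 & mask) << bits) | (op1 & mask)
    return (concat >> shamt) & mask


def rotl(x: int, shamt: int, bits: int = 32) -> int:
    """A funnel shift with identical inputs is a rotate."""
    return fshl(x, x, shamt, bits)


if __name__ == "__main__":
    a, b = 0x12345678, 0x9ABCDEF0
    # A shift amount of 0 returns operand 0 for fshl and operand 1 for fshr.
    print(hex(fshl(a, b, 0)), hex(fshr(a, b, 0)))
    print(hex(fshl(a, b, 8)), hex(fshr(a, b, 8)))
```

With a shift amount of 0, fshl yields operand 0 and fshr yields operand 1, which is the asymmetry the operand-order fix has to account for when mapping to fsl/fsr.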

59979 Commits