llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	4b4496312e	AMDGPU: Start adding MODE register uses to instructions This is the groundwork required to implement strictfp. For now, this should be NFC for regular instructoins (many instructions just gain an extra use of a reserved register). Regalloc won't rematerialize instructions with reads of physical registers, but we were suffering from that anyway with the exec reads. Should add it for all the related FP uses (possibly with some extras). I did not add it to either the gpr index mode instructions (or every single VALU instruction) since it's a ridiculous feature already modeled as an arbitrary side effect. Also work towards marking instructions with FP exceptions. This doesn't actually set the bit yet since this would start to change codegen. It seems nofpexcept is currently not implied from the regular IR FP operations. Add it to some MIR tests where I think it might matter.	2020-05-27 14:47:00 -04:00
John Fastabend	13f6c81c5d	[BPF] simplify zero extension with MOV_32_64 The current pattern matching for zext results in the following code snippet being produced, w1 = w0 r1 <<= 32 r1 >>= 32 Because BPF implementations require zero extension on 32bit loads this both adds a few extra unneeded instructions but also makes it a bit harder for the verifier to track the r1 register bounds. For example in this verifier trace we see at the end of the snippet R2 offset is unknown. However, if we track this correctly we see w1 should have the same bounds as r8. R8 smax is less than U32 max value so a zero extend load should keep the same value. Adding a max value of 800 (R8=inv(id=0,smax_value=800)) to an off=0, as seen in R7 should create a max offset of 800. However at the end of the snippet we note the R2 max offset is 0xffffFFFF. R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=2147483647,var_off=(0x0; 0x7fffffff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R9=inv800 R10=fp0 fp-8=mmmm???? 58: (1c) w9 -= w8 59: (bc) w1 = w8 60: (67) r1 <<= 32 61: (77) r1 >>= 32 62: (bf) r2 = r7 63: (0f) r2 += r1 64: (bf) r1 = r6 65: (bc) w3 = w9 66: (b7) r4 = 0 67: (85) call bpf_get_stack#67 R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R3_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R4_w=inv0 R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R9_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R10=fp0 fp-8=mmmm???? After this patch R1 bounds are not smashed by the <<=32 >>=32 shift and we get correct bounds on R2 umax_value=800. Further it reduces 3 insns to 1. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Differential Revision: https://reviews.llvm.org/D73985	2020-05-27 11:26:39 -07:00
Lei Huang	2368bf52cd	[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm Summary: This patch simply adds support for the new CPU in anticipation of Power10. There isn't really any functionality added so there are no associated test cases at this time. Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc Reviewed By: stefanp, nemanjai, amyk, #powerpc Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo Tags: #clang, #powerpc, #llvm Differential Revision: https://reviews.llvm.org/D80020	2020-05-27 13:14:25 -05:00
Matt Arsenault	07cd19efa2	AMDGPU: Fix dropping MI flags when rewriting instructions All 3 passes that change instruction encodings were dropping MI flags. This avoids scheduling regressions caused by setting mayRaiseFPExceptions on FP instructions for non-strictfp functions.	2020-05-27 13:27:06 -04:00
Fangrui Song	5b4cd2d4c4	[X86] Assemble movzb 1280(%rbx, %r12), %r12 after D80608 ffmpeg/libavcodec/x86/h264_cabac.c inline assembly may produce movzb 1280(%rbx, %r12), %r12 After D80608, llvm-mc errors: error: unknown use of instruction mnemonic without a size suffix	2020-05-27 09:55:55 -07:00
Philip Reames	1af3705c7f	Start migrating away from statepoint's inline length prefixed argument bundles In the current statepoint design, we have four distinct groups of operands to the call: call args, gc transition args, deopt args, and gc args. This format prexisted the support in IR for operand bundles and was in fact one of the inspirations for the extension. However, we never went back and rearchitected statepoints to fully leverage bundles. This change is the first in a small sequence to do so. All this does is extend the SelectionDAG lowering code to allow deopt and gc transition operands to be specified in either inline argument bundles or operand bundles. Differential Revision: https://reviews.llvm.org/D8059	2020-05-27 09:16:10 -07:00
Alex Richardson	3be5e53f20	[FileCheck] Allow parenthesized expressions With this change it is be possible to write FileCheck expressions such as [[#(VAR+1)-2]]. Currently, the only supported arithmetic operators are plus and minus, so this is not particularly useful yet. However, it our CHERI fork we have tests that benefit from having multiplication in FileCheck expressions. Allowing parenthesized expressions is the simplest way for us to work around the current lack of operator precedence in FileCheck expressions. Reviewed By: thopre, jhenderson Differential Revision: https://reviews.llvm.org/D77383	2020-05-27 16:31:39 +01:00
Lei Huang	559845f8fe	Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm" This reverts commit `7eb666b155`.	2020-05-27 09:40:21 -05:00
Ties Stuij	78bd0c0e5e	[AArch64][BFloat] add BFloat instruction support for AArch64 Summary: Add support for lowering various BFloat related SelDAG nodes: - load/store (ldrh/strh) - concat - dup/duplane - bitconvert/bitcast - insert_subvector/insert_subreg This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile Reviewers: ab, t.p.northover, john.brawn, fpetrogalli, sdesmalen, LukeGeeson Reviewed By: fpetrogalli Subscribers: LukeGeeson, pbarrio, kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79712	2020-05-27 15:36:54 +01:00
Georgii Rymar	4ab03e62fd	[llvm-readobj] - Do not crash when an invalid .eh_frame_hdr is dumped using --unwind. When the p_offset/p_filesz of the PT_GNU_EH_FRAME is invalid (e.g larger than the file size) then llvm-readobj might crash. This patch fixes the issue. I've introduced `ELFFile<ELFT>::getSegmentContent` method, which is very similar to `ELFFile<ELFT>::getSectionContentsAsArray` one. Differential revision: https://reviews.llvm.org/D80380	2020-05-27 16:41:09 +03:00
David Green	70d4a20299	[UnJ] Update LI for inner nested loops This makes sure to correctly register the loop info of the children of unroll and jammed loops. It re-uses some code from the unroller for registering subloops. Differential Revision: https://reviews.llvm.org/D80619	2020-05-27 14:36:38 +01:00
Matt Arsenault	833996cef1	AMDGPU: Fix backwards s_cselect_* operands The vector equivalent has backwards operands, but the scalar version does not. The passes that use these hooks aren't enabled by default, so this doesn't really change anything.	2020-05-27 09:26:09 -04:00
Guillaume Chatelet	5b84ee4f61	[Alignment] Fix misaligned interleaved loads Summary: Tentatively fixing https://bugs.llvm.org/show_bug.cgi?id=45957 Reviewers: craig.topper, nlopes Subscribers: hiraditya, llvm-commits, RKSimon, jdoerfert, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D80276	2020-05-27 12:12:22 +00:00
Victor Campos	c7593b0f0d	[ARM] Fix rewrite of frame index in Thumb2's address mode i8s4 Summary: In Thumb2's frame index rewriting process, the address mode i8s4, which is used by LDRD and STRD instructions, is handled by taking the immediate offset operand and multiplying it by 4. This behaviour is wrong, however. In this specific address mode, the MachineInstr's immediate operand is already in the expected form. By consequence of that, multiplying it once more by 4 yields a flawed offset value, four times greater than it should be. Differential Revision: https://reviews.llvm.org/D80557	2020-05-27 13:09:13 +01:00
Guillaume Chatelet	6e1eff7858	[NFC] Updating tests Summary: Updating IR now that alignment is explicitly set. This is a prerequisite to D80276. Reviewers: efriedma Subscribers: llvm-commits, craig.topper Tags: #llvm Differential Revision: https://reviews.llvm.org/D80549	2020-05-27 12:02:46 +00:00
Daniil Suchkov	706b22e3e4	[SimpleLoopUnswitch] Drop uses of instructions before block deletion Currently if instructions defined in a block are used in unreachable blocks and SimpleLoopUnswitch attempts deleting the block, it triggers assertion "Uses remain when a value is destroyed!". This patch fixes it by replacing all uses of instructions from BB with undefs before BB deletion. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D80551	2020-05-27 18:25:18 +07:00
Georgii Rymar	fc98447af6	[llvm-readobj] - Do not skip building of the GNU hash table histogram. When the `--elf-hash-histogram` is used, the code first tries to build a histogram for the .hash table and then for the .gnu.hash table. The problem is that dumper might return early when unable or do not need to build a histogram for the .hash. This patch reorders the code slightly to fix the issue and adds a test case. Differential revision: https://reviews.llvm.org/D80204	2020-05-27 13:46:41 +03:00
Simon Pilgrim	410667f1b7	[X86][SSE] Convert PTEST to MOVMSK for allsign bits vector results If we are using PTEST to check 'allsign bits' vector elements we can use MOVMSK to extract the signbits directly and perform the comparison on the scalar value. For vXi16 cases, as we don't have a MOVMSK for this type, we must mask each signbit out of a PMOVMSKB v2Xi8 result, which folds into the TEST comparison. If this allows us to remove a vector op (via the SimplifyMultipleUseDemandedBits call) this is consistently faster than a PTEST (https://godbolt.org/z/ziJUst). I'm investigating whether we ever get regressions without the SimplifyMultipleUseDemandedBits call, even if this means we don't remove a vector op, but that has exposed some other poor codegen issues that I'm still investigating and would have to wait for a later patch. Suggested on PR42035 to avoid unnecessary ashr(x,bw-1)/pcmpgt(0,x) sign splat patterns feeding into ptest. Differential Revision: https://reviews.llvm.org/D80563	2020-05-27 11:06:16 +01:00
Konstantin Schwarz	f2fad3f703	[GlobalISel][InlineAsm] Add missing EarlyClobber flag to inline asm output operands Summary: Previously, we only added early-clobber flags to the 'group' immediate flag operand of an inline asm operand. However, we also have to add the EarlyClobber flag to the MachineOperand itself. This fixes PR46028 Reviewers: arsenm, leonardchan Reviewed By: arsenm, leonardchan Subscribers: phosek, wdng, rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80467	2020-05-27 12:04:18 +02:00
Vitaly Buka	f6383643d9	[StackSafety] Bailout on some function calls Don't miss values used in calls outside regular argument list.	2020-05-27 02:48:42 -07:00
Vitaly Buka	06a07dd608	[StackSafety] Fix formatting in the test	2020-05-27 02:48:41 -07:00
Vitaly Buka	b101c6251a	[StackSafety] Ignore some use of values We should ignore value used in MemTransferInst as other then src/dst argument.	2020-05-27 02:48:41 -07:00
Kazushi (Jam) Marukawa	dedaf3a2ac	[VE] Dynamic stack allocation Summary: This patch implements dynamic stack allocation for the VE target. Changes: * compiler-rt: `__ve_grow_stack` to request stack allocation on the VE. * VE: base pointer support, dynamic stack allocation. Differential Revision: https://reviews.llvm.org/D79084	2020-05-27 10:11:06 +02:00
Daniil Suchkov	fc44da746f	Add test exposing a bug in SimpleLoopUnswitch.	2020-05-27 15:02:28 +07:00
Wang, Pengfei	6565b58584	[X86][llvm-mc] Make the suffix matcher more accurate. Summary: Some instruction like VPMULDQ is NOT the variant of VPMULD but a new one. So we should make sure the suffix matcher only works for memory variant that has the same size with the suffix. Currently we only check for SSE/AVX* instructions, because many legacy instructions didn't declare the alias instructions of their variants. Differential Revision: https://reviews.llvm.org/D80608	2020-05-27 14:45:17 +08:00
Vitaly Buka	32a1f60d11	[StackSafety] Use SCEV to find mem operation length	2020-05-26 23:22:37 -07:00
Vitaly Buka	d0f1f5adfa	[StackSafety] Use getSignedRange for offsets	2020-05-26 23:22:36 -07:00
Kang Zhang	23a2f45214	[NFC][PowerPC] Modify the test case two-address-crash.mir	2020-05-27 02:35:45 +00:00
Matt Arsenault	ef3e831226	GlobalISel: Basic legalization for G_PTRMASK	2020-05-26 21:20:30 -04:00
Vitaly Buka	b5ae70046b	[StackSafety] Simplify SCEVRewriteVisitor Probably NFC.	2020-05-26 18:09:43 -07:00
Jessica Paquette	ae597a771e	[AArch64][GlobalISel] Do not modify predicate when optimizing G_ICMP This fixes a bug in `tryOptArithImmedIntegerCompare`. It is unsafe to update the predicate on a MachineOperand when optimizing a G_ICMP, because it may be used in more than one place. For example, when we are optimizing G_SELECT, we allow compares which are used in more than one G_SELECT. If we modify the G_ICMP, then we'll break one of the G_SELECTs. Since the compare is being produced to either 1) Select a G_ICMP 2) Fold a G_ICMP into an instruction when profitable there's no reason to actually modify it. The change is local to the specific compare. Instead, pass a `CmpInst::Predicate` to `tryOptArithImmedIntegerCompare` which can be modified by reference. Differential Revision: https://reviews.llvm.org/D80585	2020-05-26 17:51:08 -07:00
Philip Reames	bed6624ac4	Split a test file so that most of it can be autogened	2020-05-26 17:33:32 -07:00
Philip Reames	b90eb0f23b	Autogen a couple of test files to make a future diff easier to read	2020-05-26 17:33:32 -07:00
Alexander Shaposhnikov	842a8cc10c	[llvm-objcopy][MachO] Add support for removing Swift symbols cctools strip has the option "-T" which removes Swift symbols. This diff implements this option in llvm-strip for MachO. Test plan: make check-all Differential revision: https://reviews.llvm.org/D80099	2020-05-26 16:49:56 -07:00
Arthur Eubanks	9a0b0855a9	Modify verifier checks to support musttail + preallocated Summary: preallocated and musttail can work together, but we don't want to call @llvm.call.preallocated.setup() to modify the stack in musttail calls. So we shouldn't have the "preallocated" operand bundle when a preallocated call is musttail. Also disallow use of preallocated on calls without preallocated. Codegen not yet implemented. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80581	2020-05-26 15:20:20 -07:00
Chris Jackson	9eacda51fa	[debuginfo] Fix broken tests from MachineLICM salvaging fix Previous commit: `bd7ff5d94f` - Added missing x86 triples - Added missing asserts	2020-05-26 22:46:07 +01:00
Vitaly Buka	6a74ad6baa	[sancov] Accommodate sancov and coverage report server for use under Windows Summary: This patch makes the following changes to SanCov and its complementary Python script in order to resolve issues pertaining to non-UNIX file paths in JSON symbolization information: * Convert all paths to use forward slash. * Update `coverage-report-server.py` to correctly handle paths to sources which contain spaces. * Remove Linux platform restriction for all SanCov unit tests. All SanCov tests passed when ran on my local Windows machine. Patch by Douglas Gliner. Reviewers: kcc, filcab, phosek, morehouse, vitalybuka, metzman Reviewed By: vitalybuka Subscribers: vsk, Dor1s, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D51018	2020-05-26 14:36:44 -07:00
Vedant Kumar	6e39379bbb	[DwarfExpression] Support entry values for indirect parameters Summary: A struct argument can be passed-by-value to a callee via a pointer to a temporary stack copy. Add support for emitting an entry value DBG_VALUE when an indirect parameter DBG_VALUE becomes unavailable. This is done by omitting DW_OP_stack_value from the entry value expression, to make the expression describe the location of an object. rdar://63373691 Reviewers: djtodoro, aprantl, dstenb Subscribers: hiraditya, lldb-commits, llvm-commits Tags: #lldb, #llvm Differential Revision: https://reviews.llvm.org/D80345	2020-05-26 14:22:28 -07:00
Stanislav Mekhanoshin	512e806a33	[AMDGPU] Bail alloca vectorization if GEP not found Differential Revision: https://reviews.llvm.org/D80587	2020-05-26 13:59:49 -07:00
Davide Italiano	01fee8aa24	[MLICM] Remove unneeded option so the test doesn't fail.	2020-05-26 13:53:56 -07:00
Matt Arsenault	bb10fa3a53	AMDGPU: Fix wrong null value for private address space I'm guessing this was a holdover from when 0 was an invalid stack pointer, but surprised nobody has discovered this before. Also don't allow offset folding for -1 pointers, since it looks weird to partially fold this.	2020-05-26 16:35:13 -04:00
Chris Jackson	bd7ff5d94f	[DebugInfo] Correct debuginfo for post-ra hoist and sink in Machine LICM Reviewers: vsk, aprantl Differential Revision: https://reviews.llvm.org/D79868	2020-05-26 21:07:10 +01:00
Stanislav Mekhanoshin	42725aeed8	Process gep (select ptr1, ptr2) in SROA Differential Revision: https://reviews.llvm.org/D79217	2020-05-26 12:56:02 -07:00
Sanjay Patel	1a2bffaf8b	[InstCombine] reassociate sub+add to increase adds and throughput The -reassociate pass tends to transform this kind of pattern into something that is worse for vectorization and codegen. See PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953 Follows-up the FP version of the same transform: rGa0ce2338a083	2020-05-26 14:49:17 -04:00
Sanjay Patel	f5cfcc4b06	[LoopVectorize] regenerate full test checks; NFC	2020-05-26 14:49:17 -04:00
Sanjay Patel	0788392637	[InstCombine] add tests for reassociative sub/add expressions; NFC	2020-05-26 14:49:16 -04:00
Lei Huang	7eb666b155	[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm Summary: This patch simply adds support for the new CPU in anticipation of Power10. There isn't really any functionality added so there are no associated test cases at this time. Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc Reviewed By: stefanp, nemanjai, amyk, #powerpc Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo Tags: #clang, #powerpc, #llvm Differential Revision: https://reviews.llvm.org/D80020	2020-05-26 13:48:22 -05:00
Nemanja Ivanovic	6e9223a2c6	[PowerPC][NFC] Update test to prevent DCE from causing failures The test case provided in PR45709 can be simplified by DCE to an empty function. To prevent this from happening if DCE is run prior to ISEL in the back end, just add optnone to the function. The behaviour it is testing for is in the SDAG legalization and is not sensitive to optnone so the test case still achieves its desired objective.	2020-05-26 13:37:48 -05:00
Hiroshi Yamauchi	106ec64fbc	[PGO] Add memcmp/bcmp size value profiling. Summary: This adds support for memcmp/bcmp to the existing memcpy/memset value profiling. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79751	2020-05-26 10:28:04 -07:00
Sanjay Patel	a0ce2338a0	[InstCombine] reassociate fsub+fadd with FMF to increase adds and throughput The -reassociate pass tends to transform this kind of pattern into something that is worse for vectorization and codegen. See PR43953: https://bugs.llvm.org/show_bug.cgi?id=43953	2020-05-26 13:17:15 -04:00

1 2 3 4 5 ...

71533 Commits