llvm-project

Commit Graph

Author	SHA1	Message	Date
Arthur Eubanks	9ccf13c36d	[NewPM][NVPTX] Port NVPTX opt passes There are only two used in the IR optimization pipeline. Port these and add them to the default pipeline. Similar to https://reviews.llvm.org/D93863. I added -mtriple to some tests since under the new PM, the passes are only available when the TargetMachine is specified. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D93930	2021-01-07 15:12:35 -08:00
Matt Arsenault	2cbbc6e87c	GlobalISel: Fail legalization on narrowing extload below memory size	2021-01-07 17:40:34 -05:00
Matt Arsenault	1f9b6ef91f	GlobalISel: Add combine for G_UREM by power of 2 Really I want this in the legalizer, but this is a start.	2021-01-07 16:36:35 -05:00
Wouter van Oortmerssen	5c38ae36c5	[WebAssembly] Fixed byval args missing DWARF DW_AT_LOCATION A struct in C passed by value did not get debug information. Such values are currently lowered to a Wasm local even in -O0 (not to an alloca like on other archs), which becomes a Target Index operand (TI_LOCAL). The DWARF writing code was not emitting locations for TIs specifically if the location is a single range (not a list). In addition, the ExplicitLocals pass which removes the ARGUMENT pseudo instructions did not update the associated DBG_VALUEs, and couldn't even find these values since the code assumed such instructions are adjacent, which is not the case here. Also fixed asm printing of TIs needed by a test. Differential Revision: https://reviews.llvm.org/D94140	2021-01-07 10:31:38 -08:00
Mircea Trofin	ee57d30f44	[NFC] Removed unused prefixes from CodeGen/AMDGPU Last bulk batch. Differential Revision: https://reviews.llvm.org/D94236	2021-01-07 09:48:14 -08:00
Mircea Trofin	e881a25f1e	[NFC] Removed unused prefixes in CodeGen/AMDGPU This covers tests starting with s. Differential Revision: https://reviews.llvm.org/D94184	2021-01-07 08:00:11 -08:00
Cameron McInally	f4013359b3	[SVE] Add unpacked scalable floating point ZIP/UZP/TRN patterns Differential Revision: https://reviews.llvm.org/D94193	2021-01-07 09:56:53 -06:00
Matt Arsenault	6b7d5a928f	AMDGPU/GlobalISel: Start cleaning up calling convention lowering There are various hacks working around limitations in handleAssignments, and the logical split between different parts isn't correct. Start separating the type legalization to satisfy going through the DAG infrastructure from the code required to split into register types. The type splitting should be moved to generic code.	2021-01-07 10:36:45 -05:00
Simon Pilgrim	350ab7aa1c	[DAG] Simplify OR(X,SHL(Y,BW/2)) eq/ne 0/-1 'all/any-of' style patterns Attempt to simplify all/any-of style patterns that concatenate 2 smaller integers together into an and(x,y)/or(x,y) + icmp 0/-1 instead. This is mainly to help some bool predicate reduction patterns where we end up concatenating bool vectors that have been bitcasted to integers. Differential Revision: https://reviews.llvm.org/D93599	2021-01-07 12:03:19 +00:00
Fraser Cormack	c9154e8fa3	[RISCV] Add vector mask arithmetic ISel patterns The patterns that want to use 'vnot' use a custom PatFrag. This is because 'vnot' uses immAllOnesV which implicitly uses BUILD_VECTOR rather than SPLAT_VECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94078	2021-01-07 09:43:25 +00:00
Nikita Popov	7d48eff8ba	[PowerPC] Avoid call to undef in test (NFC) Replace call to undef with a dummy function, to avoid affecting this change by changes to call undef folding.	2021-01-06 21:09:02 +01:00
Arthur Eubanks	a515342de9	[test] Pin AMDGPU/opt-pipeline.ll to legacy PM The pipeline being tested is specifically the legacy PM pipeline.	2021-01-06 11:44:16 -08:00
Mircea Trofin	90347ab96f	[NFC] Removed unused prefixes in CodeGen/AMDGPU This covers tests starting with m-r. Differential Revision: https://reviews.llvm.org/D94181	2021-01-06 10:32:44 -08:00
Simon Pilgrim	3f8c2520c0	[X86] Add commuted patterns test coverage for D93599 Suggested by @spatel	2021-01-06 18:03:20 +00:00
Reid Kleckner	08e5e91e45	[X86] Remove [ER]SP from all CSR lists The CSR lists control which registers are spilled and reloaded in the prologue and epilogue. The stack pointer is managed explicitly, and should never be pushed or popped. Remove it from these lists. This affected regcall and preserves all / most. Differential Revision: https://reviews.llvm.org/D94118	2021-01-06 09:50:46 -08:00
Mircea Trofin	b470630913	[NFC] Removed unused prefixes from CodeGen/AMDGPU All the 'l'-starting tests. Differential Revision: https://reviews.llvm.org/D94151	2021-01-06 09:34:11 -08:00
Matt Arsenault	ab3a3f543b	AMDGPU/GlobalISel: Update fdiv lowering for denormal/ulp interaction Change the GlobalISel fast fdiv handling to match the changes in `2531535984` and `884acbb9e1`	2021-01-06 12:32:01 -05:00
Peter Waller	3e357ecd44	[llvm][NFC] Disallow all warnings in TypeSize tests This is a follow-up to a request from a reviewer [0]. The text may change in the future and these tests should not produce any warning output. [0] https://reviews.llvm.org/D91806#inline-879243 Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D94161	2021-01-06 17:17:07 +00:00
Matt Arsenault	0a3cf7f476	AMDGPU/GlobalISel: Add baseline IR tests for fdiv The fdiv lowering is currently split between an IR pass and codegen, so make sure this works end to end. We also currently differ from the DAG on some edge cases, which this will show in a future change.	2021-01-06 11:37:00 -05:00
Matt Arsenault	136f498919	AMDGPU: Explicitly use SelectionDAG in legacy intrinsic tests GlobalISel will probably not support the legacy buffer intrinsics, so don't fail when the default is switched.	2021-01-06 11:37:00 -05:00
Simon Pilgrim	1307e3f6c4	[TargetLowering] Add icmp ne/eq (srl (ctlz x), log2(bw)) vector support.	2021-01-06 16:13:51 +00:00
Nicholas Guy	350247a93c	[AArch64] Rearrange mul(dup(sext/zext)) to mul(sext/zext(dup)) Performing this rearrangement allows for existing patterns to match cases where the vector may be built after an extend, instead of before. Differential Revision: https://reviews.llvm.org/D91255	2021-01-06 16:02:16 +00:00
Simon Pilgrim	b69fe6a85d	[X86] Add icmp ne/eq (srl (ctlz x), log2(bw)) test coverage. Add vector coverage as well (which isn't currently supported).	2021-01-06 15:50:29 +00:00
Simon Pilgrim	37ac4f865f	[Hexagon] Regenerate zext-v4i1.ll tests This will be improved by part of the work for D86578	2021-01-06 12:56:06 +00:00
Simon Pilgrim	dfcb872c3e	[X86] Add scalar/vector test coverage for D93599 This expands the test coverage beyond just the boolvector/movmsk concat pattern	2021-01-06 11:58:27 +00:00
Stefan Pintilie	cb0c034edc	[PowerPC] Fix issue where vsrq is given incorrect shift vector The new Power10 instruction vsrq was being given the wrong shift vector. The original code assumed that the shift would be found in bits 121 to 127. This is not correct. The shift is found in bits 57 to 63. This can be fixed by swapping the first and second double words. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D94113	2021-01-06 05:56:09 -06:00
Sander de Smalen	aa280c99f7	[AArch64][SVE] Emit DWARF location expr for SVE (dbg.declare) When using dbg.declare, the debug-info is generated from a list of locals rather than through DBG_VALUE instructions in the MIR. This patch is different from D90020 because it emits the DWARF location expressions from that list of locals directly. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D90044	2021-01-06 11:45:05 +00:00
Sander de Smalen	84a1120943	[LiveDebugValues] Handle spill locations with a fixed and scalable component. This patch fixes the two LiveDebugValues implementations (InstrRef/VarLoc)Based to handle cases where the StackOffset contains both a fixed and scalable component. This depends on the `TargetRegisterInfo::prependOffsetExpression` being added in D90020. Feel free to leave comments on that patch if you have them. Reviewed By: djtodoro, jmorse Differential Revision: https://reviews.llvm.org/D90046	2021-01-06 11:30:13 +00:00
David Green	63dce70b79	[ARM] Handle any extend whilst lowering addw/addl/subw/subl Same as `a9b6440edd`, use zanyext to treat any_extends as zero extends during lowering to create addw/addl/subw/subl nodes. Differential Revision: https://reviews.llvm.org/D93835	2021-01-06 11:26:39 +00:00
Ben Shi	351a45ca73	[RISCV][NFC] Add new test cases for mul	2021-01-06 18:55:56 +08:00
David Green	ddb82fc76c	[ARM] Handle any extend whilst lowering mull Similar to `78d8a821e2` but for ARM, this handles any_extend whilst creating MULL nodes, treating them as zextends. Differential Revision: https://reviews.llvm.org/D93834	2021-01-06 10:51:12 +00:00
Sander de Smalen	e4cda13d5a	Fix test failure in `a7e3339f3b` Set the target-triple to aarch64 in debug-info-sve-dbg-value.mir to avoid "'+sve' is not a recognized feature for this target" diagnostic.	2021-01-06 10:43:48 +00:00
David Green	a9b6440edd	[AArch64] Handle any extend whilst lowering addw/addl/subw/subl This adds an extra tablegen PatFrag, zanyext, which matches either any extend or zext and uses that in the aarch64 backend to handle any extends in addw/addl/subw/subl patterns. Differential Revision: https://reviews.llvm.org/D93833	2021-01-06 10:35:23 +00:00
David Green	78d8a821e2	[AArch64] Handle any extend whilst lowering mull Demanded bits may turn a sext or zext into an anyext if the top bits are not needed. This currently prevents the lowering to instructions like mull, addl and addw. This patch fixes the mull generation by keeping it simple and treating them like zextends. Differential Revision: https://reviews.llvm.org/D93832	2021-01-06 10:08:43 +00:00
Sander de Smalen	a7e3339f3b	[AArch64][SVE] Emit DWARF location expression for SVE stack objects. Extend PEI to emit a DWARF expression for StackOffsets that have a fixed and scalable component. This means the expression that needs to be added is either: <base> + offset or: <base> + offset + scalable_offset * scalereg where for SVE, the scale reg is the Vector Granule Dwarf register, which encodes the number of 64bit 'granules' in an SVE vector and which the debugger can evaluate at runtime. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D90020	2021-01-06 09:40:53 +00:00
Sander de Smalen	a9f5e4375b	[AArch64] Use faddp to implement fadd reductions. Custom-expand legal VECREDUCE_FADD SDNodes to benefit from pair-wise faddp instructions. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D59259	2021-01-06 09:36:51 +00:00
Fraser Cormack	e130dea92a	[RISCV] Add vector integer mul/mulh/div/rem ISel patterns There is no test coverage for the mulhs or mulhu patterns as I can't get the DAGCombiner to generate them for scalable vectors. There are a few places in that still need updating for that to work. I left the patterns in regardless as they are correct. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94073	2021-01-06 09:24:07 +00:00
Mircea Trofin	c1cd42d698	[NFC] Removed unused prefixes in CodeGen/AMDGPU This covers the tests starting with h-k. Differential Revision: https://reviews.llvm.org/D94147	2021-01-05 20:22:40 -08:00
Mircea Trofin	cdfd4c5c1a	[NFC] Removed unused prefixes in test/CodeGen/AMDGPU More patches to follow. This covers the pertinent tests starting with e, f, and g. Differential Revision: https://reviews.llvm.org/D94124	2021-01-05 19:18:30 -08:00
Changpeng Fang	cb5b52a06e	AMDGPU: Annotate amdgpu.noclobber for global loads only Summary: This is to avoid unnecessary analysis since amdgpu.noclobber is only used for globals. Reviewers: arsenm Fixes: SWDEV-239161 Differential Revision: https://reviews.llvm.org/D94107	2021-01-05 14:47:19 -08:00
Mircea Trofin	1ebe86adf5	[NFC] Removed unused prefixes in test/CodeGen/AMDGPU More patches to follow. Differential Revision: https://reviews.llvm.org/D94121	2021-01-05 14:16:52 -08:00
Mircea Trofin	bec987ea67	[NFC] Removed unused prefixes in CodeGen/AMDGPU This is part of the pertinent tests, more to follow in subsequent patches. Differential Revision: https://reviews.llvm.org/D94114	2021-01-05 14:10:03 -08:00
Mircea Trofin	a9543469d5	[NFC] Removed unused prefixes in CodeGen/AMDGPU/GlobalISel Differential Revision: https://reviews.llvm.org/D94099	2021-01-05 12:57:17 -08:00
Thomas Lively	497026c902	[WebAssembly] Prototype prefetch instructions As proposed in https://github.com/WebAssembly/simd/pull/352 and using the opcodes used in the V8 prototype: https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions are only usable via intrinsics and clang builtins to make them opt-in while they are being benchmarked. Differential Revision: https://reviews.llvm.org/D93883	2021-01-05 11:32:03 -08:00
Craig Topper	249d7de119	[RISCV] Don't print zext.b alias. This alias for andi x, 255 was recently added to the spec. If we print it, code we output can't be compiled with -fno-integrated-as unless the GNU assembler is also a version that supports alias. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D93826	2021-01-05 10:41:08 -08:00
Craig Topper	c707716c04	[RISCV] Match vmslt(u).vx intrinsics with a small immediate to vmsle(u).vx. There are vmsle(u).vx and vmsle(u).vi instructions, but there is only vmslt(u).vx and no vmslt(u).vi. vmslt(u).vi can be emulated for some immediates by decrementing the immediate and using vmsle(u).vi. To avoid the user needing to know about this, this patch does this conversion. The assembler does the same thing for vmslt(u).vi and vmsge(u).vi pseudoinstructions. There is no vmsge(u).vx intrinsic or instruction so this patch is limited to vmslt(u). Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D94070	2021-01-05 10:20:21 -08:00
David Green	0c59a4da59	[ARM][AArch64] Some extra test to show anyextend lowering. NFC	2021-01-05 17:34:23 +00:00
Jinsong Ji	f26bc0ddd5	[RegisterClassInfo] Return non-zero for RC without allocatable reg In some cases, the RC may have 0 allocatable regs, e.g. VRSAVERC in PowerPC, which has only 1 reg, but it is also reserved. The current implementation will keep calling computePSetLimit because getRegPressureSetLimit assumes computePSetLimit will return a non-zero value. The fix simply returns the value from TableGen early for such special cases. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92907	2021-01-05 16:18:34 +00:00
Bradley Smith	c73ae747cb	[AArch64][SVE] Add optimization to remove redundant ptest instructions Co-Authored-by: Graham Hunter <graham.hunter@arm.com> Co-Authored-by: Paul Walker <paul.walker@arm.com> Differential Revision: https://reviews.llvm.org/D93292	2021-01-05 15:28:36 +00:00
Simon Pilgrim	73a44f437b	[X86][AVX] combineVectorSignBitsTruncation - use PACKSS/PACKUS in more AVX cases AVX512 has fast truncation ops, but if the truncation source is a concatenation of subvectors then it's likely that we can use PACK more efficiently. This is only guaranteed to work for truncations to 128/256-bit vectors as the PACK works across 128-bit sub-lanes. For now I've just disabled 512-bit truncation cases but we need to get them working eventually for D61129.	2021-01-05 15:01:45 +00:00


37103 Commits