llvm-project

Commit Graph

Author	SHA1	Message	Date
David Blaikie	5146fc15fc	llvm-dwarfdump: Include unit count in DWP index header dumping And add comma separators (to be consistent with recent changes/improvements to the dumping of other section headers) while I'm here.	2020-06-12 12:40:02 -07:00
Michael Liao	ec02635d10	[amdgpu] Skip OR combining on 64-bit integer before legalizing ops. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81710	2020-06-12 15:22:38 -04:00
Amara Emerson	1cbebd95de	[AArch64][GlobalISel] Legalize vector G_PTR_ADD and enable selection. Differential Revision: https://reviews.llvm.org/D81419	2020-06-12 11:25:17 -07:00
David Green	46529978bf	[ARM] Always use reductions intrinsics under MVE Similar to a recent change to the X86 backend, this changes things so that we always produce reduction intrinsics for all reduction types, not just the legal ones. This gives a better chance in the backend to custom lower them to something more suitable for MVE. Especially for something like fadd the in-order reduction produced during DAG lowering is already better than the shuffles produced in the midend, and we can do even better with a bit of custom lowering. Differential Revision: https://reviews.llvm.org/D81398	2020-06-12 19:21:17 +01:00
Michael Liao	e7b920e6fe	[DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2)) Reviewers: arsenm Subscribers: sdardis, wdng, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, ecnelises, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81708	2020-06-12 13:53:08 -04:00
Jessica Paquette	d3a56f062b	[AArch64][GlobalISel] Allow G_DUP for elements smaller than 32 B. We select all of these via patterns now, so there's no reason to disallow this. Update select-dup.mir to show that we correctly select the smaller types. Differential Revision: https://reviews.llvm.org/D81322	2020-06-12 09:40:34 -07:00
Jessica Paquette	305862a5a6	[AArch64][GlobalISel] Set hasSideEffects = 0 on custom shuffle opcodes This was making it so that the instructions weren't eliminated in select-rev.mir and select-trn.mir despite not being used. Update the tests accordingly. Differential Revision: https://reviews.llvm.org/D81492	2020-06-12 09:39:46 -07:00
Matt Arsenault	350ee7fb3f	GlobalISel: Fix not erasing old instruction in sitofp/uitofp lowering	2020-06-12 10:33:23 -04:00
Masoud Ataei	2d038370bb	DAGCombiner optimization for pow(x,0.75) and pow(x,0.25) on double and single precision even in case massv function is asked Here, I am proposing to add an special case for massv powf4/powd2 function (SIMD counterpart of powf/pow function in MASSV library) in MASSV pass to get later optimizations like conversion from pow(x,0.75) and pow(x,0.25) for double and single precision to sequence of sqrt's in the DAGCombiner in vector float case. My reason for doing this is: the optimized pow(x,0.75) and pow(x,0.25) for double and single precision to sequence of sqrt's is faster than powf4/powd2 on P8 and P9. In case MASSV functions is called, and if the exponent of pow is 0.75 or 0.25, we will get the sequence of sqrt's and if exponent is not 0.75 or 0.25 we will get the appropriate MASSV function. Reviewed By: steven.zhang Tags: #LLVM #PowerPC Differential Revision: https://reviews.llvm.org/D80744	2020-06-12 10:02:16 -04:00
Simon Pilgrim	5509e2cc2e	[DAG] foldAddSubOfSignBit - add support for non-uniform vector constants	2020-06-12 14:58:15 +01:00
Simon Pilgrim	a5a00155a2	[X86] Add non-uniform vector signbit test cases	2020-06-12 14:58:15 +01:00
Simon Pilgrim	8d30945ab9	[X86][SSE] combineX86ShuffleChain - combine INSERT_VECTOR_ELT patterns to INSERTPS Noticed while trying to cleanup D66004 - if a shuffle operand came from a scalar, we're better off using INSERTPS vs UNPCKLPS as this is more likely to load fold later on. It also matches our existing BUILD_VECTOR lowering. We can extend this to other PINSRB/D/Q/W cases in the future as the need arises.	2020-06-12 11:59:01 +01:00
Florian Hahn	4495a6b141	[BreakCritEdges] Add option to opt-out of preserving loop-simplify. This patch adds a new option to CriticalEdgeSplittingOptions to control whether loop-simplify form must be preserved. It is then used by GVN to indicate that loop-simplify form does not have to be preserved. This fixes a crash exposed by `189efe295b`. If the critical edge we are splitting goes from a block inside a loop to a block outside the loop, splitting the edge will create a new exit block. As a result, the new block will branch to the original exit block, which will add a non-loop predecessor, breaking loop-simplify form. To preserve loop-simplify form, the predecessor blocks of the original exit are split, but that does not work for blocks with indirectbr terminators. If preserving loop-simplify form is requested, bail out before making any changes. Reviewers: reames, hfinkel, davide, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D81582	2020-06-12 11:47:13 +01:00
Xing GUO	7e0827e86f	[ObjectYAML][test] Use a single test file to test the empty 'DWARF' entry. This patch addresses comments in [D81450](https://reviews.llvm.org/D81450#inline-748745) Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D81529	2020-06-12 17:56:22 +08:00
Florian Hahn	3a846d4d92	[VPlan] Reject loops without computable backedge taken counts getOrCreateTripCount is used to generate code for the outer loop, but it requires computable backedge taken counts. Check that in the VPlan native path. Reviewers: Ayal, gilr, rengolin, sguggill Reviewed By: sguggill Differential Revision: https://reviews.llvm.org/D81088	2020-06-12 10:31:18 +01:00
Sebastian Neubauer	29a6ad94fd	[AMDGPU] Add G16 support to image instructions Add G16 feature for GFX10 and support A16 and G16 in GlobalISel. Differential Revision: https://reviews.llvm.org/D76836	2020-06-12 11:26:31 +02:00
Georgii Rymar	d95f8e7aef	[yaml2obj][MachO] - Fix PubName/PubType handling. `PubName` and `PubType` are optional fields since D80722. They are defined as: Optional<PubSection> PubNames; Optional<PubSection> PubTypes; And initialized in the following way: IO.mapOptional("debug_pubnames", DWARF.PubNames); IO.mapOptional("debug_pubtypes", DWARF.PubTypes); But problem is that because of the issue in `YAMLTraits.cpp`, when there are no `debug_pubnames`/`debug_pubtypes` keys in a YAML description, they are not initialized to `Optional::None` as the code expects, but they are initialized to default `PubSection()` instances. Because of this, the `if` condition in the following code is always true: if (Obj.DWARF.PubNames) Err = DWARFYAML::emitPubSection(OS, *Obj.DWARF.PubNames, Obj.IsLittleEndian); What means `emitPubSection` is always called and it writes few values. This patch fixes the issue. I've reduced `sizeofcmds` by size of data previously written because of this bug. Differential revision: https://reviews.llvm.org/D81686	2020-06-12 12:03:51 +03:00
EgorBo	012909dcaf	[InstCombine] "X - (X / C) * C == 0" to "X & C-1 == 0" Summary: "X % C == 0" is optimized to "X & C-1 == 0" (where C is a power-of-two) However, "X % Y" can also be represented as "X - (X / Y) * Y" so if I rewrite the initial expression: "X - (X / C) * C == 0" it's not currently optimized to "X & C-1 == 0", see godbolt: https://godbolt.org/z/KzuXUj This is my first contribution to LLVM so I hope I didn't mess things up Reviewers: lebedev.ri, spatel Reviewed By: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79369	2020-06-12 10:20:06 +03:00
EgorBo	6538b3adbe	[NFC][InstCombine] Tests for "X - (X / C) * C == 0" pattern See https://reviews.llvm.org/D79369	2020-06-12 10:20:06 +03:00
Kristof Beyls	c35ed40f4f	[AArch64] Extend AArch64SLSHardeningPass to harden BLR instructions. To make sure that no barrier gets placed on the architectural execution path, each BLR x<N> instruction gets transformed to a BL __llvm_slsblr_thunk_x<N> instruction, with __llvm_slsblr_thunk_x<N> a thunk that contains __llvm_slsblr_thunk_x<N>: BR x<N> <speculation barrier> Therefore, the BLR instruction gets split into 2; one BL and one BR. This transformation results in not inserting a speculation barrier on the architectural execution path. The mitigation is off by default and can be enabled by the harden-sls-blr subtarget feature. As a linker is allowed to clobber X16 and X17 on function calls, the above code transformation would not be correct in case a linker does so when N=16 or N=17. Therefore, when the mitigation is enabled, generation of BLR x16 or BLR x17 is avoided. As BLRA* indirect calls are not produced by LLVM currently, this does not aim to implement support for those. Differential Revision: https://reviews.llvm.org/D81402	2020-06-12 07:34:33 +01:00
Vitaly Buka	999307323a	[StackSafety] Fix byval handling We don't need to process parameters which are marked as byval, as we are not going to pass allocas of interest without copying. If we pass a value into a byval argument, we just handle that as a Load of the corresponding type and stop that branch of analysis.	2020-06-11 20:58:36 -07:00
Alexander Shaposhnikov	c966ed8dc7	[llvm-objcopy][MachO] Fix cmdsize of LC_RPATH Fix the calculation of the field cmdsize (in the function buildRPathLoadCommand) to account for the null byte terminator. Patch by Sameer Arora! Test plan: make check-all Differential revision: https://reviews.llvm.org/D81575	2020-06-11 19:55:04 -07:00
Yonghong Song	4db1878158	[BPF] fix incorrect type in BPFISelDAGToDAG readonly load optimization In BPF Instruction Selection DAGToDAG transformation phase, BPF backend had an optimization to turn load from readonly data section to direct load of the values. This phase is implemented before libbpf has readonly section support and before alu32 is supported. This phase however may generate incorrect type when alu32 is enabled. The following is an example, -bash-4.4$ cat ~/tmp2/t.c struct t { unsigned char a; unsigned char b; unsigned char c; }; extern void foo(void ); int test() { struct t v = { .b = 2, }; foo(&v); return 0; } The compiler will turn local variable "v" into a readonly section. During instruction selection phase, the compiler generates two loads from readonly section, one 2 byte load or 1 byte load, e.g., for 2 loads, t8: i32,ch = load<(dereferenceable load 2 from `i8 getelementptr inbounds (%struct.t, %struct.t* @__const.test.v, i64 0, i32 0)`, align 1), anyext from i16> t3, GlobalAddress:i64<%struct.t* @__const.test.v> 0, undef:i64 t9: ch = store<(store 2 into %ir.v1.sub1), trunc to i16> t3, t8, FrameIndex:i64<0>, undef:i64 BPF backend changed t8 to i64 = Constant<2> and eventually the generated machine IR: t10: i64 = MOV_ri TargetConstant:i64<2> t40: i32 = SLL_ri_32 t10, TargetConstant:i32<8> t41: i32 = OR_ri_32 t40, TargetConstant:i64<0> t9: ch = STH32<Mem:(store 2 into %ir.v1.sub1)> t41, TargetFrameIndex:i64<0>, TargetConstant:i64<0>, t3 Note that t10 in the above is not correct. The type should be i32 and instruction should be MOV_ri_32. The reason for incorrect insn selection is BPF insn selection generated an i64 constant instead of an i32 constant as specified in the original load instruction. Such incorrect insn sequence eventually caused the following fatal error when a COPY insn tries to copy a 64bit register to a 32bit subregister. Impossible reg-to-reg copy UNREACHABLE executed at ../lib/Target/BPF/BPFInstrInfo.cpp:42! 
This patch fixed the issue by using the load result type instead of always i64 when doing readonly load optimization. Differential Revision: https://reviews.llvm.org/D81630	2020-06-11 19:31:06 -07:00
Esme-Yi	af9f8c24a0	Revert "[PowerPC][NFC] Testing ROTL of v1i128." This reverts commit `174192af01`.	2020-06-12 02:23:52 +00:00
Cyndy Ishida	28fefcc83c	[llvm][llvm-nm] add TextAPI/MachO support Summary: This completes the needed glueing to support reading tbd files from nm. This includes specifying which slice filtering with `--arch` and a new option specifically for tbd files `--add-inlinedinfo` which will show the reexported libraries that are appended in the tbd file. Reviewers: ributzka, steven_wu, JDevlieghere, jhenderson Reviewed By: JDevlieghere Subscribers: hiraditya, MaskRay, dexonsmith, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81614	2020-06-11 18:54:16 -07:00
Alina Sbirlea	519b019a0a	Verify MemorySSA after all updates. Verify after completing all updates. Resolves PR46275.	2020-06-11 18:48:41 -07:00
Matt Arsenault	7d913becfc	AMDGPU/GlobalISel: Fix select of private <2 x s16> load	2020-06-11 19:25:25 -04:00
Matt Arsenault	27f8bd94cb	AMDGPU/GlobalISel: Fix select of <8 x s64> scalar load	2020-06-11 19:09:43 -04:00
Matt Arsenault	2247072b65	AMDGPU/GlobalISel: Set insert point when emitting control flow pseudos This was implicitly assuming the branch instruction was the next after the pseudo. It's possible for another non-terminator instruction to be inserted between the intrinsic and the branch, so adjust the insertion point. Fixes a non-terminator after terminator verifier error (which without the verifier, manifested itself as an infinite loop in analyzeBranch much later on).	2020-06-11 18:53:26 -04:00
Kirill Naumov	1022b5eb5b	[InlineCost] Preparational patch for creation of Printer pass. - Renaming the printer class, flag - Refactoring - Changing some tests This patch is a preparational stage for introducing a new printing pass and new functionality to the existing Annotation Writer. I plan to extend this functionality for this tool to be more useful when looking at the inline process.	2020-06-11 22:29:03 +00:00
Stanislav Mekhanoshin	a98d618f6e	Fixed assertion in SROA if block has no successors BasicBlock::isLegalToHoistInto() asserts if block does not have successors. The case is degenerate but the assertion still needs to be avoided. https://bugs.llvm.org/show_bug.cgi?id=46280 Differential Revision: https://reviews.llvm.org/D81674	2020-06-11 15:15:19 -07:00
Thomas Lively	c5d012341e	[WebAssembly] Make BR_TABLE non-duplicable Summary: After their range checks were removed in `7f50c15be5`, br_tables started being duplicated into their predecessors by tail folding. Unfortunately, when the br_tables were in loops this transformation introduced bad irreducible control flow which was later expanded into even more br_tables. This commit abuses the `isNotDuplicable` property to prevent this irreducible control flow from being introduced. This change saves a few dozen bytes of code size and has a negligible effect on performance for most of the large Emscripten benchmarks, but can improve performance significantly on microbenchmarks of switches in loops. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81628	2020-06-11 15:11:45 -07:00
Fangrui Song	432f20bc18	[GlobalISel][test] Add REQUIRES: asserts after D76934	2020-06-11 13:50:56 -07:00
Craig Topper	8fa3e8fa14	[X86] Force VIA PadLock crypto instructions to emit a 0xF3 prefix when they encode to match what GNU as does. The spec for these says they need 0xf3 but also mentions REP before the mnemonic. But I don't think it's fair to users to make them write REP first. And gas doesn't make them. objdump seems to disassemble with or without the prefix and just prints any 0xf3 as REP.	2020-06-11 12:59:21 -07:00
Eli Friedman	12459ec926	[AArch64] Regenerate SVE test llvm-ir-to-intrinsic.ll.	2020-06-11 12:14:24 -07:00
Stanislav Mekhanoshin	59491b208f	Regenerated SROA phi-gep.ll test. NFC.	2020-06-11 10:51:06 -07:00
Sanjay Patel	d386297c67	[VectorCombine] add tests for compare scalarization; NFC	2020-06-11 12:29:00 -04:00
Petar Avramovic	bd3d951b8b	AMDGPU/GlobalISel: Fix lower for f64->f16 G_FPTRUNC Put AND before ADD in LegalizerHelper::lowerFPTRUNC_F64_TO_F16 in order to match algorithm from AMDGPUTargetLowering::LowerFP_TO_FP16. Differential Revision: https://reviews.llvm.org/D81666	2020-06-11 18:19:27 +02:00
Fangrui Song	5ee571735d	[llvm-objdump] Decrease instruction indentation for non-x86 Place the instruction at the 24th column (0-based indexing), matching GNU objdump ARM/AArch64/powerpc/etc when the address is low. This is beneficial for non-x86 targets which have short instruction lengths. ``` // GNU objdump AArch64 0: 91001062 add x2, x3, #0x4 400078: 91001062 add x2, x3, #0x4 // llvm-objdump, with this patch 0: 62 10 00 91 add x2, x3, #4 400078: 62 10 00 91 add x2, x3, #4 // llvm-objdump, if we change to print a word instead of bytes in the future 0: 91001062 add x2, x3, #4 400078: 91001062 add x2, x3, #4 // GNU objdump Thumb 0: bf00 nop // GNU objdump Power ISA 3.1 64-bit instruction // 0: 00 00 10 04 plwa r3,0 // 4: 00 00 60 a4 ``` Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D81590	2020-06-11 09:10:50 -07:00
Simon Pilgrim	7706c7af74	[X86] Fold vXi1 OR(KSHIFTL(X,NumElts/2),Y) -> KUNPCK Convert shift+or bool vector patterns into CONCAT_VECTORS if we know this will be lowered to KUNPCK (which requires 16+ vector elements). Fixes PR32547	2020-06-11 15:47:20 +01:00
Jay Foad	69bdfb075b	[IR] Clean up dead instructions after simplifying a conditional branch Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Reapply with a bug fix: don't drop the "!KeepOneInputPHIs" argument when removePredecessor calls PHINode::removeIncomingValue. Differential Revision: https://reviews.llvm.org/D80206	2020-06-11 14:53:01 +01:00
Sam Parker	3d5f7c8531	[IR] Remove assert from ShuffleVectorInst Which triggers on valid, but not useful, IR such as an undef mask. Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46276 Differential Revision: https://reviews.llvm.org/D81634	2020-06-11 14:52:17 +01:00
Jay Foad	f45c65aa41	Revert "[IR] Clean up dead instructions after simplifying a conditional branch" This reverts commit `4494e45316`. It caused problems for sanitizer buildbots.	2020-06-11 14:22:16 +01:00
Simon Pilgrim	8824913e93	[X86][AVX512] Add second test case for PR32547 Demonstrate missing support for OR(X,KSHIFTL(Y,8)) -> KUNPCKBW as well as the existing OR(KSHIFTL(X,8),Y) -> KUNPCKBW test.	2020-06-11 13:37:44 +01:00
Jay Foad	4494e45316	[IR] Clean up dead instructions after simplifying a conditional branch Change BasicBlock::removePredecessor to optionally return a vector of instructions which might be dead. Use this in ConstantFoldTerminator to delete them if they are dead. Differential Revision: https://reviews.llvm.org/D80206	2020-06-11 13:28:10 +01:00
Pavel Labath	9ed452f370	[llvm/DWARFDebugLine] Remove spurious full stop from warning messages Other warning messages don't have a trailing full stop.	2020-06-11 13:14:21 +02:00
Pavel Labath	fccaa89e23	[llvm/DWARFDebugLine] Fix a typo in one warning message	2020-06-11 13:04:52 +02:00
Chris Jackson	4707bc2177	[DebugInfo] Refactor SalvageDebugInfo and SalvageDebugInfoForDbgValues - Simplify the salvaging interface and the algorithm in InstCombine Reviewers: vsk, aprantl, Orlando, jmorse, TWeaver Reviewed by: Orlando Differential Revision: https://reviews.llvm.org/D79863	2020-06-11 11:13:46 +01:00
Georgii Rymar	818ab3d654	[yaml2obj] - Allocate the file space for SHT_NOBITS sections in some cases. This teaches yaml2obj to allocate file space for a no-bits section when there is a non-nobits section in the same segment that follows it. It was discussed in D78005 thread and matches GNU linkers and LLD behavior. Differential revision: https://reviews.llvm.org/D80629	2020-06-11 12:54:53 +03:00
Simon Pilgrim	5cca9828ff	[X86][AVX512] Avoid bitcasts between scalar and vXi1 bool vectors AVX512 mask types are often bitcasted to scalar integers for various ops before being bitcast back to be used as a predicate. In many cases we can avoid these KMASK<->GPR transfers and perform equivalent operations on the mask unit. If the destination mask type is legal, and we can confirm that the scalar op originally came from a mask/vector/float/double type then we should try to avoid the scalar entirely. This avoids some codegen issues noticed while working on PTEST/MOVMSK improvements. Partially fixes PR32547 - we don't create a KUNPCK yet, but OR(X,KSHIFTL(Y)) can be handled in a separate patch. Differential Revision: https://reviews.llvm.org/D81548	2020-06-11 10:22:55 +01:00


72055 Commits