llvm-project

Commit Graph

Author	SHA1	Message	Date
Valery Pykhtin	cb8de55f47	[AMDGPU] Constrain the AMDGPU inliner on maximum number of basic blocks in a caller function (compile time performance) Differential revision: https://reviews.llvm.org/D62917 llvm-svn: 362789	2019-06-07 12:16:46 +00:00
Cullen Rhodes	1f0d251244	[AArch64][AsmParser] error on unexpected SVE predicate type suffix Summary: This patch fixes a bug in the assembler that permitted a type suffix on predicate registers when not expected. For instance, the following was previously valid: faddv h0, p0.q, z1.h This bug was present in all SVE instructions containing predicates with no type suffix and no predication form qualifier, i.e. /z or /m. The latter instructions are already caught with an appropriate error message by the assembler, e.g.: .text <stdin>:1:13: error: not expecting size suffix cmpne p1.s, p0.b/z, z2.s, 0 ^ A similar issue for SVE vector registers was fixed in: https://reviews.llvm.org/D59636 Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62942 llvm-svn: 362780	2019-06-07 08:46:56 +00:00
Cullen Rhodes	f730548484	[AArch64][AsmParser] Provide better diagnostics for SVE predicates Patch by Sander de Smalen (sdesmalen) Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62941 llvm-svn: 362779	2019-06-07 08:37:00 +00:00
George Rimar	33044a7ae2	[llvm-objcopy] - Emit error and don't crash if program header reaches past end of file. This is https://bugs.llvm.org/show_bug.cgi?id=42122. If an object file has a size less than program header's file [offset + size] (i.e. if we have overflow), llvm-objcopy crashes instead of reporting an error. The patch fixes this issue. Differential revision: https://reviews.llvm.org/D62898 llvm-svn: 362778	2019-06-07 08:34:18 +00:00
Pengfei Wang	f8b28931a7	[X86] -march=cooperlake (llvm) Support intel -march=cooperlake in llvm Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62836 llvm-svn: 362776	2019-06-07 08:31:35 +00:00
Sam Parker	c5ef502ee8	[CodeGen] Generic Hardware Loop Support Patch which introduces a target-independent framework for generating hardware loops at the IR level. Most of the code has been taken from PowerPC CTRLoops and PowerPC has been ported over to use this generic pass. The target dependent parts have been moved into TargetTransformInfo, via isHardwareLoopProfitable, with HardwareLoopInfo introduced to transfer information from the backend. Three generic intrinsics have been introduced: - void @llvm.set_loop_iterations Takes as a single operand, the number of iterations to be executed. - i1 @llvm.loop_decrement(anyint) Takes the maximum number of elements processed in an iteration of the loop body and subtracts this from the total count. Returns false when the loop should exit. - anyint @llvm.loop_decrement_reg(anyint, anyint) Takes the number of elements remaining to be processed as well as the maximum number of elements processed in an iteration of the loop body. Returns the updated number of elements remaining. llvm-svn: 362774	2019-06-07 07:35:30 +00:00
Michael Pozulp	65d1ff8e7e	[NFC] Delete trailing whitespace character. llvm-svn: 362772	2019-06-07 06:28:43 +00:00
Michael Pozulp	767bdd55e1	[llvm-objdump] Print source when subsequent lines in the translation unit come from the same line in two different headers. Reviewers: grimar, rupprecht, jhenderson Reviewed By: grimar, jhenderson Subscribers: llvm-commits, jhenderson Tags: #llvm Differential Revision: https://reviews.llvm.org/D62461 llvm-svn: 362771	2019-06-07 06:23:54 +00:00
Michael Pozulp	50f61af3f3	[llvm-objdump] Add warning if --disassemble-functions specifies an unknown symbol Summary: Fixes Bug 41904 https://bugs.llvm.org/show_bug.cgi?id=41904 Reviewers: jhenderson, rupprecht, grimar, MaskRay Reviewed By: jhenderson, rupprecht, MaskRay Subscribers: dexonsmith, rupprecht, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62275 llvm-svn: 362768	2019-06-07 05:11:13 +00:00
Fangrui Song	c841b9abf0	[MC][ELF] Don't create relocations with section symbols for STB_LOCAL ifunc We should keep the symbol type (STT_GNU_IFUNC) for a local ifunc because it may result in an IRELATIVE reloc that the dynamic loader will use to resolve the address at startup time. There is another problem that is not fixed by this patch: a PC relative relocation should also create a relocation with the ifunc symbol. llvm-svn: 362767	2019-06-07 03:47:22 +00:00
Michael Pozulp	c7029e4ef4	[NFC] Test commit. llvm-svn: 362763	2019-06-07 01:55:59 +00:00
Matt Arsenault	c0edb8f5cf	AMDGPU: Don't count mask branch pseudo towards skip threshold llvm-svn: 362761	2019-06-07 00:14:55 +00:00
Matt Arsenault	99ee81b183	AMDGPU: Insert skips for blocks with FLAT This already forced a skip for VMEM, so it should also be done for flat. I'm somewhat skeptical about the benefit of this though. llvm-svn: 362760	2019-06-07 00:14:45 +00:00
Nemanja Ivanovic	ef4a3aa549	[PowerPC] Exploit the vector min/max instructions Use the PPC vector min/max instructions for computing the corresponding operation as these should be faster than the compare/select sequences we currently emit. Differential revision: https://reviews.llvm.org/D47332 llvm-svn: 362759	2019-06-06 23:49:01 +00:00
Matt Arsenault	b6cfa129cc	AMDGPU: Insert skip branches over return blocks SIInsertSkips really doesn't understand the control flow, and makes very stupid assumptions about the block layout. This was able to get away with not skipping return blocks, since usually after structurization there is only one placed at the end of the function. Tail duplication can break this assumption. llvm-svn: 362754	2019-06-06 22:51:51 +00:00
Cameron McInally	66f286845c	[NFC][CodeGen] Add unary fneg tests to X86/fma4-intrinsics-x86.ll llvm-svn: 362752	2019-06-06 21:49:59 +00:00
Alexey Lapshin	b9f1e7b16e	[DebugInfo] Incorrect debug info record generated for loop counter. Incorrect Debug Variable Range was calculated while "COMPUTING LIVE DEBUG VARIABLES" stage. Range for Debug Variable("i") computed according to current state of instructions inside of basic block. But Register Allocator creates new instructions which were not taken into account when Live Debug Variables computed. In the result DBG_VALUE instruction for the "i" variable was put after these newly inserted instructions. This is incorrect. Debug Value for the loop counter should be inserted before any loop instruction. Differential Revision: https://reviews.llvm.org/D62650 llvm-svn: 362750	2019-06-06 21:19:39 +00:00
Alexander Timofeev	37bd9bd137	[AMDGPU] Partial revert for the `ba447bae74` "Divergence driven ISel. Assign register class for cross block values according to the divergence." that discovered the design flaw leading to several issues that required to be solved before. This change reverts AMDGPU specific changes and keeps common part unaffected. llvm-svn: 362749	2019-06-06 21:13:02 +00:00
Cameron McInally	169fc2b020	[NFC][CodeGen] Add unary fneg tests to X86/fma-intrinsics-x86.ll llvm-svn: 362748	2019-06-06 21:12:22 +00:00
Craig Topper	f320f26716	[X86] Make a bunch of merge masked binops commutable for loading folding. This primarily affects add/fadd/mul/fmul/and/or/xor/pmuludq/pmuldq/max/min/fmaxc/fminc/pmaddwd/pavg. We already commuted the unmasked and zero masked versions. I've added 512-bit stack folding tests for most of the instructions affected. I've tested needing commuting and not commuting across unmasked, merged masked, and zero masked. The 128/256 bit instructions should behave similarly. llvm-svn: 362746	2019-06-06 21:00:04 +00:00
Sanjay Patel	38c5ee1802	[InstSimplify] add tests for fcmp with known-never-nan operands; NFC llvm-svn: 362742	2019-06-06 20:14:06 +00:00
Cameron McInally	3d2ee0053a	[NFC][CodeGen] Add unary fneg tests to X86/fma-scalar-combine.ll llvm-svn: 362741	2019-06-06 20:11:30 +00:00
Craig Topper	ca541b20d0	[CFLGraph] Add support for unary fneg instruction. Differential Revision: https://reviews.llvm.org/D62791 llvm-svn: 362737	2019-06-06 19:21:23 +00:00
Jason Liu	60ec248148	[AIX] Implement function descriptor on SDAG Summary: (1) Function descriptor on AIX On AIX, a called routine may have 2 distinct symbols associated with it: * A function descriptor (Name) * A function entry point (.Name) The descriptor structure on AIX is the same as those in the ELF V1 ABI: * The address of the entry point of the function. * The TOC base address for the function. * The environment pointer. The descriptor symbol uses the same name as the source level function in C. The function entry point is analogous to the symbol we would generate for a function in a non-descriptor-based ABI, except that it is renamed by prepending a ".". Which symbol gets referenced depends on the context: * Taking the address of the function references the descriptor symbol. * Calling the function references the entry point symbol. (2) Speaking of implementation on AIX, for direct function call target, we create proper MCSymbol SDNode(e.g . ".foo") while constructing SDAG to replace original TargetGlobalAddress SDNode. Then down the path, we can take advantage of this MCSymbol. Patch by: Xiangling_L Reviewed by: sfertile, hubert.reinterpretcast, jasonliu, syzaara Differential Revision: https://reviews.llvm.org/D62532 llvm-svn: 362735	2019-06-06 19:13:36 +00:00
Cameron McInally	f288a0685f	[NFC][CodeGen] Add unary fneg tests to X86/fma4-fneg-combine.ll llvm-svn: 362733	2019-06-06 19:02:46 +00:00
Craig Topper	6cda33ba36	[InlineCost] Add support for unary fneg. This adds support for unary fneg based on the implementation of BinaryOperator without the soft float FP cost. Previously we would just delegate to visitUnaryInstruction. I think the only real change is that we will pass the FastMath flags to SimplifyFNeg now. Differential Revision: https://reviews.llvm.org/D62699 llvm-svn: 362732	2019-06-06 19:02:18 +00:00
Cameron McInally	06de52674d	[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns.ll llvm-svn: 362730	2019-06-06 18:41:18 +00:00
Philip Reames	101915cfda	[LoopPred] Fix a bug in unconditional latch bailout introduced in r362284 This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops. llvm-svn: 362727	2019-06-06 18:02:36 +00:00
Simon Pilgrim	842c7792aa	[DAGCombine] MergeConsecutiveStores - improve non-temporal load\store handling (PR42123) This patch is the first step towards ensuring MergeConsecutiveStores correctly handles non-temporal loads\stores: 1 - When merging load\stores we must ensure that they all have the same non-temporal flag. This is unlikely to occur, but can in strange cases where we're storing at the end of one page and the beginning of another. 2 - The merged load\store node must retain the non-temporal flag. Differential Revision: https://reviews.llvm.org/D62910 llvm-svn: 362723	2019-06-06 17:04:13 +00:00
Cameron McInally	f1b8c6ac4f	[NFC][CodeGen] Add unary fneg tests to X86/fma_patterns_wide.ll llvm-svn: 362720	2019-06-06 16:55:51 +00:00
Craig Topper	6b67dfa54c	[X86] Make masked floating point equality/ordered compares commutable for load folding purposes. Same as what is supported for the unmasked form. llvm-svn: 362717	2019-06-06 16:39:04 +00:00
Cameron McInally	5c01140581	[NFC][CodeGen] Add unary fneg tests to fmul-combines.ll fnabs.ll llvm-svn: 362715	2019-06-06 16:13:23 +00:00
Cameron McInally	1d85a7518c	[NFC][CodeGen] Add unary fneg tests to fp-fast.ll fp-fold.ll fp-in-intregs.ll fp-stack-compare-cmov.ll fp-stack-compare.ll fsxor-alignment.ll llvm-svn: 362712	2019-06-06 15:29:11 +00:00
Cameron McInally	0924f44859	[NFC][CodeGen] Remove duplicate test in fp-fast.ll @test10 is the same as @test11. llvm-svn: 362710	2019-06-06 14:52:16 +00:00
Jason Liu	0338b88861	[AIX] Implement call lowering with parameters could pass onto GPRs Summary: This patch implements SDAG call lowering on AIX for functions which only have parameters that could fit into GPRs. Reviewers: hubert.reinterpretcast, syzaara Differential Revision: https://reviews.llvm.org/D62823 llvm-svn: 362708	2019-06-06 14:36:43 +00:00
Thomas Preud'homme	71d3f227a7	FileCheck [6/12]: Introduce numeric variable definition Summary: This patch is part of a patch series to add support for FileCheck numeric expressions. This specific patch introduces support for defining a numeric variable in a CHECK directive. This commit introduces support for defining a numeric variable from a literal value in the input text. Numeric expressions can then use the variable provided it is on a later line. Copyright: - Linaro (changes up to diff 183612 of revision D55940) - GraphCore (changes in later versions of revision D55940 and in new revision created off D55940) Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D60386 llvm-svn: 362705	2019-06-06 13:21:06 +00:00
Owen Reynolds	bf5bca5bea	[llvm-ar] Create thin archives with MRI scripts This patch implements the "CREATE_THIN" MRI script command, allowing thin archives to be created via MRI scripts. Differential Revision: https://reviews.llvm.org/D62919 llvm-svn: 362704	2019-06-06 13:19:50 +00:00
Sanjay Patel	dd2d1a168f	[InstCombine] add tests for loads of bitcasted vector pointer; NFC llvm-svn: 362703	2019-06-06 13:18:20 +00:00
Adhemerval Zanella	559e69a821	[AArch64] Handle ISD::LRINT and ISD::LLRINT for float16 This patch is a follow up for D62018 to add lrint/llrint support for float16. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62863 llvm-svn: 362700	2019-06-06 12:38:11 +00:00
Benjamin Kramer	f1249442cf	Revert "[SCEV] Use wrap flags in InsertBinop" This reverts commit r362687. Miscompiles llvm-profdata during selfhost. llvm-svn: 362699	2019-06-06 12:35:46 +00:00
Adhemerval Zanella	bce9e11a7b	[AArch64] Handle ISD::LROUND and ISD::LLROUND for float16 This patch is a follow up for D61391 to add lround/llround support for float16. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62861 llvm-svn: 362698	2019-06-06 11:53:26 +00:00
Simon Pilgrim	dc8affe607	[X86][SSE] Add nonuniform constant vector test for PR42105 llvm-svn: 362697	2019-06-06 11:15:36 +00:00
Luis Marques	711f361596	[RISCV] Disable test/Analysis/CostModel/RISCV tests if RISCV backend not built Adds missing lit.local.cfg. Fixes rL362691. llvm-svn: 362693	2019-06-06 10:12:28 +00:00
Petar Avramovic	81132ce0e9	[MIPS GlobalISel] Select sqrt Select G_FSQRT for MIPS32. Differential Revision: https://reviews.llvm.org/D62905 llvm-svn: 362692	2019-06-06 10:00:41 +00:00
Luis Marques	cff7d2fdc9	[RISCV] Add CostModel GEP tests Differential Revision: https://reviews.llvm.org/D61185 llvm-svn: 362691	2019-06-06 09:47:53 +00:00
Petar Avramovic	0a1fd355b2	[MIPS GlobalISel] Select fabs Select G_FABS for MIPS32. Differential Revision: https://reviews.llvm.org/D62903 llvm-svn: 362690	2019-06-06 09:22:37 +00:00
Petar Avramovic	a7d0006447	[MIPS GlobalISel] Select fpext and fptrunc Select G_FPEXT and G_FPTRUNC for MIPS32. Differential Revision: https://reviews.llvm.org/D62902 llvm-svn: 362689	2019-06-06 09:16:58 +00:00
Petar Avramovic	faaa2b5d21	[MIPS GlobalISel] Select floor and ceil Select G_FFLOOR and G_FCEIL for MIPS32. Differential Revision: https://reviews.llvm.org/D62901 llvm-svn: 362688	2019-06-06 09:02:24 +00:00
Sam Parker	7cc580f5e9	[SCEV] Use wrap flags in InsertBinop If the given SCEVExpr has no (un)signed flags attached to it, transfer these to the resulting instruction or use them to find an existing instruction. Differential Revision: https://reviews.llvm.org/D61934 llvm-svn: 362687	2019-06-06 08:56:26 +00:00
Dylan McKay	3c82c57d2b	[AVR] Fix the 'load.ll' test after r362351 In that commit, the 'load.ll' test was modified, but still failed. This commit updates the test so that it now passes. llvm-svn: 362684	2019-06-06 08:06:50 +00:00
Amara Emerson	d3144a4abc	[AArch64][GlobalISel] Add manual selection support for G_ZEXTLOADs to s64. We already get support for G_ZEXTLOAD to s32 from the importer, but it can't deal with the SUBREG_TO_REG in the pattern. Tweaking the existing manual selection code for G_LOAD to handle an additional SUBREG_TO_REG when dealing with G_ZEXTLOAD isn't much work. Also add tests to check the imported pattern selections to s32 work. llvm-svn: 362681	2019-06-06 07:58:37 +00:00
Amara Emerson	d940e20051	[AArch64][GlobalISel] Add the new changes to fix PR42129 that were supposed to go into r362666. The changes weren't staged so ended up just re-committing the unmodified reverted change. llvm-svn: 362677	2019-06-06 07:33:47 +00:00
Craig Topper	9226ba6b37	[X86] Don't turn avx masked.load with constant mask into masked.load+vselect when passthru value is all zeroes. This is intended to enable the use of an immediate blend or more optimal instruction. But if the passthru is zero we don't need any additional instructions. llvm-svn: 362675	2019-06-06 05:41:27 +00:00
Craig Topper	cf44372137	[X86] Add test case for masked load with constant mask and all zeros passthru. avx/avx2 masked loads only support all zeros for passthru in hardware. So we have to emit a blend for all other values. We have an optimization that tries to optimize this blend if the mask is constant. But we don't need to perform this optimization if the passthru value is zero which doesn't need the blend at all. llvm-svn: 362674	2019-06-06 05:41:22 +00:00
Amara Emerson	c37ff0d138	Revert "Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fp"" When looking through copies, make sure to not try to find the vreg def of a physreg. Normally getVRegDef will return nullptr in this case, but if there happens to be multiple defs then it will assert. This fixes PR42129. llvm-svn: 362666	2019-06-05 23:46:16 +00:00
Matt Arsenault	34c8b835b1	AMDGPU: Don't fix emergency stack slot at offset 0 This forced the caller to be aware of this, which is an ugly ABI feature. Partially reverts r295877. The original reasons for doing this are mostly fixed. Alloca is now in a non-0 address space, so it should be OK to have 0 as a valid pointer. Since we treat the absolute address as the pointer value, this part only really needed to apply to kernels. Since r357093, we avoid the need to increment/decrement the offset register in more cases, and since r354816 the scavenger can fail without spilling, so it's less critical that we try to avoid an offset that fits in the MUBUF offset. Restrict to callable functions for now to split this into 2 steps to limit the number of test updates and in case anything breaks. llvm-svn: 362665	2019-06-05 22:37:50 +00:00
Cameron McInally	c72fbe5dc1	[MSAN] Add unary FNeg visitor to the MemorySanitizer Differential Revision: https://reviews.llvm.org/D62909 llvm-svn: 362664	2019-06-05 22:37:05 +00:00
Ulrich Weigand	6c5d5ce551	Allow target to handle STRICT floating-point nodes The ISD::STRICT_ nodes used to implement the constrained floating-point intrinsics are currently never passed to the target back-end, which makes it impossible to handle them correctly (e.g. mark instructions as depending on a floating-point status and control register, or mark instructions as possibly trapping). This patch allows the target to use setOperationAction to switch the action on ISD::STRICT_ nodes to Legal. If this is done, the SelectionDAG common code will stop converting the STRICT nodes to regular floating-point nodes, but instead pass the STRICT nodes to the target using normal SelectionDAG matching rules. To avoid having the back-end duplicate all the floating-point instruction patterns to handle both strict and non-strict variants, we make the MI codegen explicitly aware of the floating-point exceptions by introducing two new concepts: - A new MCID flag "mayRaiseFPException" that the target should set on any instruction that possibly can raise FP exception according to the architecture definition. - A new MI flag FPExcept that CodeGen/SelectionDAG will set on any MI instruction resulting from expansion of any constrained FP intrinsic. Any MI instruction that is both marked as mayRaiseFPException and FPExcept then needs to be considered as raising exceptions by MI-level codegen (e.g. scheduling). Setting those two new flags is straightforward. The mayRaiseFPException flag is simply set via TableGen by marking all relevant instruction patterns in the .td files. The FPExcept flag is set in SDNodeFlags when creating the STRICT_ nodes in the SelectionDAG, and gets inherited in the MachineSDNode nodes created from it during instruction selection. The flag is then transferred to an MIFlag when creating the MI from the MachineSDNode. This is handled just like fast-math flags like no-nans are handled today. This patch includes both common code changes required to implement the new features, and the SystemZ implementation. Reviewed By: andrew.w.kaylor Differential Revision: https://reviews.llvm.org/D55506 llvm-svn: 362663	2019-06-05 22:33:10 +00:00
Petr Hosek	2f94203e23	Revert "[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fp" This reverts commit r362435 as this triggers ICE, see PR42129 for details. llvm-svn: 362662	2019-06-05 22:27:31 +00:00
Matt Arsenault	b812b7a45e	AMDGPU: Invert frame index offset interpretation Since the beginning, the offset of a frame index has been consistently interpreted backwards. It was treating it as an offset from the scratch wave offset register as a frame register. The correct interpretation is the offset from the SP on entry to the function, before the prolog. Frame index elimination then should select either SP or another register as an FP. Treat the scratch wave offset on kernel entry as the pre-incremented SP. Rely more heavily on the standard hasFP and frame pointer elimination logic, and clean up the private reservation code. This saves a copy in most callee functions. The kernel prolog emission code is still kind of a mess relying on checking the uses of physical registers, which I would prefer to eliminate. Currently selection directly emits MUBUF instructions, which require using a reference to some register. Use the register chosen for SP, and then ignore this later. This should probably be cleaned up to use pseudos that don't refer to any specific base register until frame index elimination. Add a workaround for shaders using large numbers of SGPRs. I'm not sure these cases were ever working correctly, since as far as I can tell the logic for figuring out which SGPR is the scratch wave offset doesn't match up with the shader input initialization in the shader programming guide. llvm-svn: 362661	2019-06-05 22:20:47 +00:00
Joseph Tremoulet	acb5609063	[EarlyCSE] Add tests for negated min/max/abs [NFC] Summary: I'm planning to update the hashing logic to recognize their equivalence in a subsequent change (D62644). Reviewers: spatel Reviewed By: spatel Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62918 llvm-svn: 362657	2019-06-05 21:30:10 +00:00
Matt Arsenault	663d762c9a	NewGVN: Handle addrspacecast The AllConstant check needs to be moved out of the if/else if chain to avoid a test regression. The "there is no SimplifyZExt" comment puzzles me, since there is SimplifyCastInst. Additionally, the Simplify* calls seem to not see the operand as constant, so this needs to be tried if the simplify failed. llvm-svn: 362653	2019-06-05 21:15:52 +00:00
Tim Northover	8d7f118ab2	InstCombine: correctly change byval type attribute alongside call args. When the byval attribute has a type, it must match the pointee type of any parameter; but InstCombine was not updating the attribute when folding casts of various kinds away. llvm-svn: 362643	2019-06-05 20:38:17 +00:00
Matt Arsenault	4fb580c314	AMDGPU: Remove amdgpu-max-work-group-size attribute This has been deprecated for a long time, and mesa recently switched to amdgpu-flat-work-group-size. llvm-svn: 362641	2019-06-05 20:32:32 +00:00
Dan Gohman	53572d0470	[WebAssembly] Limit PIC support to the Emscripten target The current PIC support currently only works with Emscripten, so disable it for other targets. This is the PIC portion of https://reviews.llvm.org/D62542. Reviewed By: dschuff, sbc100 llvm-svn: 362638	2019-06-05 20:01:01 +00:00
Simon Pilgrim	036fa5346f	[X86][SSE] Add vector tests to cover more isNegatibleForFree/GetNegatedExpression cases (PR42105) Some already combine correctly, but vector constant analysis is weak. llvm-svn: 362633	2019-06-05 18:55:54 +00:00
Cameron McInally	8b83a9c6b1	[NFC][Reassociate] Fix mistake in 468b2ad Missed 2 'fast fsub(0.0,X) -> fneg(X)' changes. llvm-svn: 362631	2019-06-05 18:50:07 +00:00
Cameron McInally	5162266515	[NFC][Reassociate] Add unary fneg tests to fast-basictest.ll llvm-svn: 362630	2019-06-05 18:35:54 +00:00
Craig Topper	d0fff89b81	[X86] Add the vector integer min/max instructions to isAssociativeAndCommutative. As far as I know these should be freely reassociatable just like the floating point MAXC/MINC instructions. The reduce test changes are largely regressions and caused by the "generic" CPU we default to not having a scheduler model. The machine-combiner-int-vec.ll test shows the positive benefits of this change. Differential Revision: https://reviews.llvm.org/D62787 llvm-svn: 362629	2019-06-05 18:25:09 +00:00
Philip Reames	13dd125043	[Tests] Add poison inference tests for indvars showing both existing transforms, and some room for improvement llvm-svn: 362628	2019-06-05 18:00:59 +00:00
Cameron McInally	0a31726d20	[NFC][Reassociate] Regenerate CHECKs for fast-basictest.ll llvm-svn: 362627	2019-06-05 18:00:27 +00:00
Sanjay Patel	2bf82879bd	[x86] split more 256-bit stores of concatenated vectors As suggested in D62498 - collectConcatOps() matches both concat_vectors and insert_subvector patterns, and we see more test improvements by using the more general match. llvm-svn: 362620	2019-06-05 16:40:57 +00:00
Simon Pilgrim	a0e350e640	[X86][SSE] Add additional nt-load test cases as discussed on D62910 llvm-svn: 362616	2019-06-05 16:11:57 +00:00
George Rimar	5da702308c	[llvm-readobj] - Remove TODOs from gnu-hash-symbols.test and demangle.test test cases. We can remove these TODOs now. Differential revision: https://reviews.llvm.org/D62846 llvm-svn: 362614	2019-06-05 15:29:50 +00:00
Dinar Temirbulatov	15c657d13d	[SLP] Fix regression in broadcasts caused by operand reordering patch D59973. This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 . The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found. Please see the lit test for some examples. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D62427 llvm-svn: 362613	2019-06-05 15:26:28 +00:00
Roman Lebedev	253086230f	[NFC][Codegen][X86] Add AVX2 runline for '(X & (C l>> Y)) ==/!= 0' tests llvm-svn: 362606	2019-06-05 14:08:11 +00:00
Roman Lebedev	54bd6c840e	UpdateTestChecks: hexagon support Summary: These tests are being affected by an upcoming patch, so having an understandable (autogenerated) diff is helpful. This target, again, prefers `-march`: ``` llvm/test/CodeGen/Hexagon$ grep -r triple \| wc -l 467 llvm/test/CodeGen/Hexagon$ grep -r march \| wc -l 1167 ``` Reviewers: RKSimon, kparzysz Reviewed By: kparzysz Subscribers: xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62867 llvm-svn: 362605	2019-06-05 14:08:01 +00:00
Petar Avramovic	22e99c434f	[MIPS GlobalISel] Select fcmp Select floating point compare for MIPS32. Differential Revision: https://reviews.llvm.org/D62721 llvm-svn: 362603	2019-06-05 14:03:13 +00:00
George Rimar	66296dc3e4	[yaml2obj] - Change how we handle implicit sections. We have a few sections that can be added implicitly to the output: ".dynsym", ".dynstr", ".symtab", ".strtab" and ".shstrtab". Problem appears when such section is listed explicitly in YAML. In that case its content is written twice: first time during writing of regular sections listed in the document and second time during special handling. Because of that their file offsets can become unexpectedly broken: (yaml file for sample below lists .dynsym explicitly before .text.foo) Before patch: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .dynsym DYNSYM 0000000000000100 00000250 0000000000000030 0000000000000018 A 6 0 8 [ 2] .text.foo PROGBITS 0000000000000200 00000200 0000000000000000 0000000000000000 AX 0 0 0 After patch: Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .dynsym DYNSYM 0000000000000100 00000200 0000000000000030 0000000000000018 A 6 0 8 [ 2] .text.foo PROGBITS 0000000000000200 00000230 0000000000000000 0000000000000000 AX 0 0 0 This patch reorganizes our code and fixes the issue described. Differential revision: https://reviews.llvm.org/D62809 llvm-svn: 362602	2019-06-05 13:16:53 +00:00
Simon Pilgrim	886a55eaa0	[X86][AVX] combineX86ShuffleChain - combine shuffle(extractsubvector(x),extractsubvector(y)) We already handle the case where we combine shuffle(extractsubvector(x),extractsubvector(x)), this relaxes the requirement to permit different sources as long as they have the same value type. This causes a couple of cases where the VPERMV3 binary shuffles occur at a wider width than before, which I intend to improve in future commits - but as only the subvector's mask indices are defined, these will broadcast so we don't see any increase in constant size. llvm-svn: 362599	2019-06-05 12:56:53 +00:00
George Rimar	b42196661b	[llvm-objdump] - Disassemble non-executable sections if specifically requested. This is https://bugs.llvm.org/show_bug.cgi?id=41897. Previously -d + -j .data had no effect, which wasn't consistent with GNU, which processes .data in that case. With this patch we follow this behavior. Differential revision: https://reviews.llvm.org/D62848 llvm-svn: 362596	2019-06-05 11:37:53 +00:00
Simon Pilgrim	ddfbfd6172	[X86][SSE] Add some nt-store test cases inspired by PR42123 llvm-svn: 362594	2019-06-05 10:55:55 +00:00
Yevgeny Rouban	a3e16719c4	Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst" This reverts commit `5b32f60ec3`. The fix is in commit `4f9e68148b`. This patch fixes the CorrelatedValuePropagation pass to keep prof branch_weights metadata of SwitchInst consistent. It makes use of SwitchInstProfUpdateWrapper. New tests are added. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D62126 llvm-svn: 362583	2019-06-05 05:46:40 +00:00
Johannes Doerfert	aade782a98	[Attributor] Pass infrastructure and fixpoint framework NOTE: Note that no attributes are derived yet. This patch will not go in alone but only with others that derive attributes. The framework is split for review purposes. This commit introduces the Attributor pass infrastructure and fixpoint iteration framework. Further patches will introduce abstract attributes into this framework. In a nutshell, the Attributor will update instances of abstract arguments until a fixpoint, or a "timeout", is reached. Communication between the Attributor and the abstract attributes that are derived is restricted to the AbstractState and AbstractAttribute interfaces. Please see the file comment in Attributor.h for detailed information including design decisions and typical use case. Also consider the class documentation for Attributor, AbstractState, and AbstractAttribute. Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59918 llvm-svn: 362578	2019-06-05 03:02:24 +00:00
Johannes Doerfert	76467c4d7f	[NFC][FnAttrs] Stress tests for attribute deduction This commit is a preparation of upcoming patches on attribute deduction. It will shorten the diffs and make it clear what we inferred before. Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes Subscribers: bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59903 llvm-svn: 362577	2019-06-05 03:00:06 +00:00
Nemanja Ivanovic	7c842fadf1	[PowerPC] Collapse RLDICL/RLDICR into RLDIC when possible Generally speaking, we lower to an optimal rotate sequence for nodes visible in the SDAG. However, there are instances where the two rotates are not visible at ISEL time - most notably those in a very common sequence when lowering switch statements to jump tables. A common situation is a switch on a 32-bit integer. This value has to have the upper 32 bits cleared and because jump table offsets are word offsets, the value needs to be shifted left by 2 bits. We currently emit the clear and the left shift as two separate instructions, but this is not needed as we can lower it to a single RLDIC. This patch just cleans that up. Differential revision: https://reviews.llvm.org/D60402 llvm-svn: 362576	2019-06-05 02:36:40 +00:00
Nemanja Ivanovic	cfb6c82172	[PowerPC][NFC] Add codegen test for consecutive stores of vector elements NFC commit of a test case in order for the subsequent review to show differences in codegen. Differential revision: https://reviews.llvm.org/D62843 llvm-svn: 362573	2019-06-05 02:09:03 +00:00
Fangrui Song	f090e6f7b6	[llvm-objdump/llvm-readobj/obj2yaml/yaml2obj] Support DT_PPC_GOT and DT_PPC_OPT In glibc, DT_PPC_GOT indicates that PowerPC32 Secure PLT ABI is used. I plan to use it in D62464. DT_PPC_OPT currently indicates if a TLSDESC inspired TLS optimization is enabled. Reviewed By: grimar, jhenderson, rupprecht Differential Revision: https://reviews.llvm.org/D62851 llvm-svn: 362569	2019-06-05 01:36:48 +00:00
Nemanja Ivanovic	fe97754acf	Initial support for IBM MASS vector library This is the LLVM portion of patch https://reviews.llvm.org/D59881. The clang portion is to follow. llvm-svn: 362568	2019-06-05 01:31:43 +00:00
Amara Emerson	5e312be0fa	[AArch64] FastISel: fix test to specify -fast-isel when -fast-isel-abort=1 is used. This test has inadvertently been using GISel, and now asserts due to incompatible flags. llvm-svn: 362559	2019-06-04 23:11:42 +00:00
Cameron McInally	5c7245b830	[Scalarizer] Add UnaryOperator visitor to scalarization pass Differential Revision: https://reviews.llvm.org/D62858 llvm-svn: 362558	2019-06-04 23:01:36 +00:00
Alex Brachet	375d5fb9ca	[test][llvm-objcopy] Test llvm-objcopy with standard streams Differential Revision: https://reviews.llvm.org/D62817 llvm-svn: 362556	2019-06-04 22:17:27 +00:00
Amara Emerson	2d37cb82f0	[AArch64][GlobalISel] Make extloads to i64 legal. Although we had the support in the prelegalizer combiner to generate the G_SEXTLOAD or G_ZEXTLOAD ops, the legalizer definitions for arm64 had them as lowering back to separate ops. llvm-svn: 362553	2019-06-04 21:51:34 +00:00
Craig Topper	1648cb17e4	[X86] Add avx512bw to the avx512 machine-combiner-int-vec.ll to ensure we use zmm for v32i16/v64i8. llvm-svn: 362552	2019-06-04 21:47:50 +00:00
Craig Topper	8362518c6e	[X86] Add vector min/max reassociation tests to machine-combiner-int-vec.ll. NFC llvm-svn: 362550	2019-06-04 21:26:46 +00:00
Craig Topper	2fb7306f82	[X86] Add 512-bit test cases to machine-combiner-int-vec.ll. NFC llvm-svn: 362549	2019-06-04 21:26:36 +00:00
Thomas Lively	3d9ca00e74	[WebAssembly] Fix ISel crash on sext_inreg/extract type mismatch Summary: Adjusts the index and adds a bitcast around the vector operand of EXTRACT_VECTOR_ELT so that its lane type matches the source type of its parent sext_inreg. Without this bitcast the ISel patterns do not match and ISel fails. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62646 llvm-svn: 362547	2019-06-04 21:08:20 +00:00
Johannes Doerfert	6b432dca5d	[SelectionDAG][FIX] Allow "returned" arguments to be bit-casted Summary: An argument that is returned by a function but bit-casted before can still be annotated as "returned". Make sure we do not crash for this case. Reviewers: sunfish, stephenwlin, niravd, arsenm Subscribers: wdng, hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59917 llvm-svn: 362546	2019-06-04 20:34:43 +00:00
Nico Weber	1dce82636c	llvm-undname: Correctly demangle vararg parameters FunctionSignatureNode already had an IsVariadic field, but it wasn't used anywhere yet. Set it and use it. llvm-svn: 362541	2019-06-04 19:10:08 +00:00
Nico Weber	4638548468	llvm-undname: More coverage-related cleanups - The loop in demangleFunctionParameterList() only exits on Error, @, and Z. All 3 cases were handled, so the rest of the function is DEMANGLE_UNREACHABLE. - The loop in demangleTemplateParameterList() always returns on Error, so there's no need to check for that in the loop header and after the loop. - Add test cases for invalid function parameter manglings. - Add a (redundant) test case for a simple template parameter list mangling. - Add a test case pointing out that varargs functions aren't demangled correctly. llvm-svn: 362540	2019-06-04 18:49:05 +00:00
Nemanja Ivanovic	aed7227b71	Revert r362472 as it is breaking PPC build bots The patch https://reviews.llvm.org/rL362472 broke PPC LNT buildbots. Reverting it to bring the bots back to green. llvm-svn: 362539	2019-06-04 18:48:43 +00:00
Alina Sbirlea	bfceed49ce	[Utils] Clean another duplicated util method. Summary: Following the cleanup in D48202, method foldBlockIntoPredecessor has the same behavior. Replace its uses with MergeBlockIntoPredecessor. Remove foldBlockIntoPredecessor. Reviewers: chandlerc, dmgreen Subscribers: jlebar, javed.absar, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62751 llvm-svn: 362538	2019-06-04 18:45:15 +00:00
Nico Weber	878df1c2a9	llvm-undname: Add test coverage for demangleInitFiniStub() llvm-svn: 362536	2019-06-04 18:06:28 +00:00
Craig Topper	09a4415803	[DAGCombiner][X86] Fold (not (neg X)) -> (add X, -1) This is a special case of a more general transform (not (sub Y, X)) -> (add X, ~Y). InstCombine knows the general form. I've restricted it to the special case to fix the motivating case PR42118. I tried handling any case where Y was constant, but got some changes on some Mips tests that I couldn't quickly prove were beneficial. Fixes PR42118 Differential Revision: https://reviews.llvm.org/D62828 llvm-svn: 362533	2019-06-04 17:44:18 +00:00
Philip Reames	0cdaf3a09f	[Tests] Autogen a test so future changes are visible Oddly, I had to change a value name from "tmp0" to "bc0" to get the autogened test to pass. I'm putting this down to an oddity of update_test_checks or FileCheck, but don't understand it. llvm-svn: 362532	2019-06-04 17:29:55 +00:00
Roman Lebedev	925553ec91	[NFC][Codegen][PowerPC] Autogenerate shift-cmp.ll test Being affected by upcoming patch llvm-svn: 362529	2019-06-04 17:05:34 +00:00
Roman Lebedev	78ec94e4ec	[NFC][Codegen][AMDGPU] Autogenerate commute-shifts.ll test Being affected by upcoming patch llvm-svn: 362528	2019-06-04 17:05:06 +00:00
Sanjay Patel	606eb2367f	[x86] split 256-bit store of concatenated vectors This shows up as a side issue to the main problem for the AVX target example from PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 - https://godbolt.org/z/7tpRa3 But as we can see in the pile of existing test diffs, it's actually a widespread problem that affects any AVX or later target. Apart from a couple of oddballs, I think these are all improvements for the reasons stated in the code comment: we do not want to enable YMM unnecessarily (avoid vzeroupper and frequency throttling) and some cores split 256-bit stores anyway. We could say that MergeConsecutiveStores() is going overboard on some of these examples, but that won't solve the problem completely. But that is a reason I'm proposing this as a lowering rather than a combine: we will infinite loop fighting the merge code if we try this earlier. Differential Revision: https://reviews.llvm.org/D62498 llvm-svn: 362524	2019-06-04 16:40:04 +00:00
Peter Smith	f15e3d856f	[AArch64][ELF] Add support for PLT decoding with BTI instructions present Arm Architecture v8.5a introduces Branch Target Identification (BTI). When enabled all indirect branches must target a bti instruction of the appropriate form. As PLT sequences may sometimes be the target of an indirect branch and PLT[0] always is, a static linker may need to generate PLT sequences that contain "bti c" as the first instruction. In effect: bti c adrp x16, page offset to .got.plt ... Instead of: adrp x16, page offset to .got.plt ... At present the PLT decoding assumes the adrp will always be the first instruction. This patch adds support for a single "bti c" to prefix it. A test binary has been uploaded with such a PLT sequence. A forthcoming LLD patch will make heavy use of the PLT decoding code. Differential Revision: https://reviews.llvm.org/D62598 llvm-svn: 362523	2019-06-04 16:35:40 +00:00
Nico Weber	d98a0a362f	llvm-undname: Yet more coverage for error paths - For error returns in demangleSpecialTableNode(), demangleLocalStaticGuard(), RTTITypeDescriptor, demangleRttiBaseClassDescriptorNode(), demangleUnsigned(), demangleUntypedVariable() (via RttiBaseClassArray) - For ?_A and ?_P which are handled at early levels of the demangler but are not implemented in a later stage; this is now more obvious - Replace a "default:" with an explicit list of cases, to get -Wswitch check we list all cases llvm-svn: 362520	2019-06-04 16:25:28 +00:00
Nikita Popov	df621bdfc8	[LVI][CVP] Add support for urem, srem and sdiv The underlying ConstantRange functionality has been added in D60952, D61207 and D61238, this just exposes it for LVI. I'm switching the code from using a whitelist to a blacklist, as we're down to one unsupported operation here (xor) and writing it this way seems more obvious :) Differential Revision: https://reviews.llvm.org/D62822 llvm-svn: 362519	2019-06-04 16:24:09 +00:00
Philip Reames	af11a4376c	[Tests] Update a test to consistently use new pass manager and FileCheck the result llvm-svn: 362518	2019-06-04 16:19:34 +00:00
Philip Reames	78e71c4d09	[Tests] Autogen tests so that diffs for a future change are understandable llvm-svn: 362516	2019-06-04 16:15:19 +00:00
Nico Weber	dc2a8c7d7f	llvm-undname: Add coverage for startsWithLocalScopePattern() llvm-svn: 362515	2019-06-04 15:47:25 +00:00
Nico Weber	c1a0e6fe6b	llvm-undname: More no-op changes to increase test coverage - Add test coverage around invalid anon namespaces and for error paths in demanglePrimitiveType() and in demangleFullyQualifiedTypeName() - Use DEMANGLE_UNREACHABLE in two more unreachable places llvm-svn: 362514	2019-06-04 15:38:00 +00:00
James Henderson	7f3135037d	[llvm-symbolizer] Flush output on bad input One way of using llvm-symbolizer is to interactively within a process write a line from a parent process to llvm-symbolizer's stdin, and then read the output, then write the next line, read, etc. This worked as long as all the lines were good. However, this didn't work prior to this patch if any of the inputs were bad inputs, because the output is not flushed after a bad input, meaning the parent process is sat waiting for output, whilst llvm-symbolizer is sat waiting for input. This patch flushes the output after every invocation of symbolizeInput when reading from stdin. It also removes unnecessary flushing when llvm-symbolizer is not reading addresses from stdin, which should give a slight performance boost in these situations. Reviewed by: ikudrin Differential Revision: https://reviews.llvm.org/D62371 llvm-svn: 362511	2019-06-04 15:34:58 +00:00
Jinsong Ji	3144d7a2da	[PowerPC] P9 Scheduling Model: dispatching rule fixes This is to address some of the problems in existing P9 resource modeling, especially about the dispatching rules. Instead of using a hypothetical DISPATCHER, we try to use the number of actual dispatch slots, and define SchedWriteRes to model dispatch rules, then update instruction classes according to dispatch rules. All the dispatch rule and instruction class updates are made according to the POWER9 User Manual. Differential Revision: https://reviews.llvm.org/D61873 llvm-svn: 362509	2019-06-04 15:22:23 +00:00
Sanjay Patel	1e63dd0b44	[SelectionDAG][x86] limit post-legalization store merging by type The proposal in D62498 showed that x86 would benefit from vector store splitting, but that may conflict with the generic DAG combiner's store merging transforms. Add memory type to the existing TLI hook that enables the merging transforms, so we can limit those changes to scalars only for x86. llvm-svn: 362507	2019-06-04 15:15:59 +00:00
Nico Weber	880d21d3cb	llvm-undname: Several behavior-preserving changes to increase coverage - Replace `Error = true` in a few branches that are truly unreachable with DEMANGLE_UNREACHABLE - Remove an early return in startsWithLocalScopePattern() because it's redundant with the next two early returns - Remove unreachable `case '0'` (it's handled in the branch below) - Remove an unused bool return - Add test coverage for several early error returns, mostly in array type parsing llvm-svn: 362506	2019-06-04 15:13:30 +00:00
Sanjay Patel	d6de9426ee	[x86] add test for store merging/splitting; NFC This is a reduction of a test that would infinite loop with D62498. llvm-svn: 362502	2019-06-04 14:40:37 +00:00
Shawn Landden	2ee9a827ad	[SimplifyCFG] fix last commit llvm-svn: 362501	2019-06-04 14:32:52 +00:00
Shawn Landden	7f22fecac2	[SimplifyCFG] NFC; remove bogus test case Even if one bit is defined, the code is not clear what it is supposed to do. The test wants to assert that some bits are undef, but that's not what the IR does and I don't think it's even possible to do that in any meaningful way. It was added in D12497, so @reames might want to double check. Differential Revision: https://reviews.llvm.org/D60859 llvm-svn: 362499	2019-06-04 14:17:46 +00:00
Roman Lebedev	2e49e8196d	[NFC][Codegen] D62818 - also add tests with X being constant For X86, these may be a 'BT' pattern, and in general, can cause the transform to deadlock. llvm-svn: 362494	2019-06-04 11:44:50 +00:00
Peter Smith	49d7221f71	[AArch64][ELF][llvm-readobj] Add support for BTI and PAC dynamic tags ELF for the 64-bit Arm Architecture defines two processor-specific dynamic tags: DT_AARCH64_BTI_PLT 0x70000001, d_val DT_AARCH64_PAC_PLT 0x70000003, d_val The presence of these tags indicates that PLT sequences have been protected using Branch Target Identification and Pointer Authentication respectively. The presence of both indicates that the PLT sequences have been protected with both Branch Target Identification and Pointer Authentication. This patch adds the tags and tests for llvm-readobj and yaml2obj. As some of the processor specific dynamic tags overlap, this patch splits them up, keeping their original default value if they were not previously mentioned explicitly in a switch case. Differential Revision: https://reviews.llvm.org/D62596 llvm-svn: 362493	2019-06-04 11:44:33 +00:00
Peter Smith	580c6d31c0	[AARCH64][ELF][llvm-readobj] Support for AArch64 .note.gnu.property ELF for the 64-bit Arm Architecture defines a processor specific property type GNU_PROPERTY_AARCH64_FEATURE_1_AND as GNU_PROPERTY_LOPROC. This property works in a similar way to the existing X86 processor specific property GNU_PROPERTY_GNU_X86_FEATURE_1_AND. Two feature bits are defined for GNU_PROPERTY_AARCH64_FEATURE_1_AND: - GNU_PROPERTY_AARCH64_FEATURE_1_BTI 0x1 - GNU_PROPERTY_AARCH64_FEATURE_1_PAC 0x2 This patch defines the property, feature bits and implements support for printing in llvm-readobj. Differential Revision: https://reviews.llvm.org/D62595 llvm-svn: 362490	2019-06-04 11:28:22 +00:00
Roman Lebedev	3dce0326fe	[DAGCombine][X86][AArch64][MIPS][LANAI] (C - x) - y -> C - (x + y) fold (PR41952) Summary: This might be the last fold for `sink-addsub-of-const.ll`, but I'm not sure yet. As far as I can tell, there are no regressions here (ignoring x86-32), all changes are either good or neutral. This, almost surprisingly to me, fixes the motivational tests (in `shift-amount-mod.ll`) `@reg32_lshr_by_sub_from_negated` from [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]]. https://rise4fun.com/Alive/vMd3 Reviewers: RKSimon, t.p.northover, craig.topper, spatel, efriedma Reviewed By: RKSimon Subscribers: sdardis, javed.absar, arichardson, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62774 llvm-svn: 362488	2019-06-04 11:06:21 +00:00
Roman Lebedev	be6ce7b3f2	[DAGCombine][X86][AArch64][ARM] (C - x) + y -> (y - x) + C fold Summary: All changes except ARM look great. https://rise4fun.com/Alive/R2M The regression `test/CodeGen/ARM/addsubcarry-promotion.ll` is recovered fully by D62392 + D62450. Reviewers: RKSimon, craig.topper, spatel, rogfer01, efriedma Reviewed By: efriedma Subscribers: dmgreen, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62266 llvm-svn: 362487	2019-06-04 11:06:08 +00:00
Simon Pilgrim	ad298f86b7	[SelectionDAG] ComputeNumSignBits - support constant pool values from target As I mentioned on D61887 we don't get many hits on ComputeNumSignBits as we did on computeKnownBits. The case we do get is interesting though - it allows us to use the 'ConditionalNegate' combine in combineLogicBlendIntoPBLENDV to remove a select. It comes too late for SSE41 (BLENDV) cases, but SSE2 tests can hit it now. We should probably try to make use of this for SSE41+ targets as well - avoiding variable blends is usually a good idea. I'll investigate as a followup. Differential Revision: https://reviews.llvm.org/D62777 llvm-svn: 362486	2019-06-04 10:49:06 +00:00
Owen Reynolds	5d5078e341	[llvm-ar] Reapply Fix relative thin archive path handling Includes a fix for an introduced build failure due to a post c++11 use of std::mismatch. This fixes some thin archive relative path issues, paths are shortened where possible and paths are output correctly when using the display table command. Differential Revision: https://reviews.llvm.org/D59491 llvm-svn: 362484	2019-06-04 10:13:03 +00:00
Simon Pilgrim	3018d505a3	[SelectionDAG] Add fpto[us]i(undef) --> undef constant fold Follow up to D62807. Differential Revision: https://reviews.llvm.org/D62811 llvm-svn: 362483	2019-06-04 10:04:55 +00:00
Mikhail Maltsev	08da01b496	[ARM] Add FP16 vector insert/extract patterns This change adds two FP16 extraction and two insertion patterns (one per possible vector length). Extractions are handled by copying a Q/D register into one of VFP2 class registers, where single FP32 sub-registers can be accessed. Then the extraction of even lanes are simple sub-register extractions (because we don't care about the top parts of registers for FP16 operations). Odd lanes need an additional VMOVX instruction. Unfortunately, insertions cannot be handled in the same way, because: * There is no instruction to insert FP16 into an even lane (VINS only works with odd lanes) * The patterns for odd lanes will have a form of a DAG (not a tree), and will not be implementable in pure tablegen Because of this insertions are handled in the same way as 16-bit integer insertions (with conversions between FP registers and GPRs using VMOVHR instructions). Without these patterns the ARM backend would sometimes fail during instruction selection. This patch also adds patterns which combine: * an FP16 element extraction and a store into a single VST1 instruction * an FP16 load and insertion into a single VLD1 instruction Differential Revision: https://reviews.llvm.org/D62651 llvm-svn: 362482	2019-06-04 09:39:55 +00:00
QingShan Zhang	11de0e71b0	[DAGCombine] Match a pattern where a wide type scalar value is stored by several narrow stores This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c static void store64(u64 x, unsigned char* y) { for(int i = 0; i != 8; ++i) y[i] = (x >> ((7-i) * 8)) & 255; } static u64 load64(const unsigned char* y) { u64 res = 0; for(int i = 0; i != 8; ++i) res |= (u64)(y[i]) << ((7-i) * 8); return res; } The load64 has been implemented by https://reviews.llvm.org/D26149 This patch is trying to implement the store pattern. Match a pattern where a wide type scalar value is stored by several narrow stores. Fold it into a single store or a BSWAP and a store if the target supports it. Assuming little endian target: i8 p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; > ((i32)p) = val; i8 p = ... i32 val = ... p[0] = (val >> 24) & 0xFF; p[1] = (val >> 16) & 0xFF; p[2] = (val >> 8) & 0xFF; p[3] = (val >> 0) & 0xFF; > ((i32)p) = BSWAP(val); Differential Revision: https://reviews.llvm.org/D61843 llvm-svn: 362472	2019-06-04 08:53:53 +00:00
QingShan Zhang	72667b4e48	[NFC] Update the test to check the endianness after the CodeGenPrepare instead of checking the assembly instructions. llvm-svn: 362471	2019-06-04 08:45:07 +00:00
Simon Tatham	ac02445524	[ARM] Turn some undefined encoding bits into 0s. The family of 32-bit Thumb instruction encodings that include t2ORR, t2AND and t2EOR are all listed in the ArmARM as having (0) in bit 15. The Tablegen descriptions of those instructions listed them as ?. This change tightens that up by making them into 0 + Unpredictable. In the specific case of t2ORR, we tighten it up still further by making the zero bit mandatory. This change comes from Arm v8.1-M, in which encodings with that bit equal to 1 will now be used for different instructions. Reviewers: dmgreen, samparker, SjoerdMeijer, efriedma Reviewed By: dmgreen, efriedma Subscribers: efriedma, javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60705 llvm-svn: 362470	2019-06-04 08:28:48 +00:00
Chen Zheng	a050b25544	[PowerPC] add testcases for reordering LSR and PPCCTRLoops - NFC llvm-svn: 362468	2019-06-04 06:48:14 +00:00
Roman Lebedev	b3650868f6	[NFC][X86] Fixup FileCheck prefixes - drop duplicates llvm-svn: 362460	2019-06-03 23:00:51 +00:00
Craig Topper	ac062bbad8	[X86] Add test cases for 32 and 64 bit versions of PR42118. NFC llvm-svn: 362457	2019-06-03 22:34:15 +00:00
Roman Lebedev	6dc8ce323e	[NFC][Codegen] Add tests for hoisting and-by-const from "logical shift", when then eq-comparing with 0 This was initially reported as: https://reviews.llvm.org/D62818 https://rise4fun.com/Alive/oPH llvm-svn: 362455	2019-06-03 22:30:18 +00:00
Craig Topper	099f4a9fa8	Revert r362451 "foo" and r362452 "[X86] Add test cases for 32 and 64 bit versions of PR42118. NFC" I failed to squash these properly llvm-svn: 362453	2019-06-03 22:14:54 +00:00
Craig Topper	17728e7c15	[X86] Add test cases for 32 and 64 bit versions of PR42118. NFC llvm-svn: 362452	2019-06-03 22:11:40 +00:00
Craig Topper	27a546610c	foo llvm-svn: 362451	2019-06-03 22:11:30 +00:00
Cameron McInally	89f9af5487	[SCCP] Add UnaryOperator visitor to SCCP for unary FNeg Differential Revision: https://reviews.llvm.org/D62819 llvm-svn: 362449	2019-06-03 21:53:56 +00:00
Michael Berg	6ff978ee05	Propagate fmf for setcc in SDAG for select folds llvm-svn: 362448	2019-06-03 21:53:26 +00:00
Matt Arsenault	0ceda9fb5c	AMDGPU: Disable stack realignment for kernels This is something of a workaround, and the state of stack realignment controls is kind of a mess. Ideally, we would be able to specify the stack is infinitely aligned on entry to a kernel. TargetFrameLowering provides multiple controls which apply at different points. The StackRealignable field is used during SelectionDAG, and for some reason distinct from this hook. StackAlignment is a single field not dependent on the function. It would probably be better to make that dependent on the calling convention, and the maximum value for kernels. Currently this doesn't really change anything, since the frame lowering mostly does its own thing. This helps avoid regressions in a future change which will rely more heavily on hasFP. llvm-svn: 362447	2019-06-03 21:33:22 +00:00
Jessica Paquette	7500c97ce4	[AArch64][GlobalISel] Optimize G_FCMP + G_SELECT pairs when G_SELECT is fp Instead of emitting all of the test stuff for a compare when it's only used by a select, instead, just emit the compare + select. The select will use the value of NZCV correctly, so we don't need to emit all of the test instructions etc. For now, only support fp selects which use G_FCMP. Also only support condition codes which will only require one select to represent. Also add a test. Differential Revision: https://reviews.llvm.org/D62695 llvm-svn: 362446	2019-06-03 20:47:20 +00:00
Craig Topper	dcf865f0ca	[X86] Fix the pattern for merge masked vcvtps2pd. r362199 fixed it for zero masking, but not merge masking. The load folding in the peephole pass hid the bug. This patch turns off the peephole pass on the relevant test to ensure coverage. llvm-svn: 362440	2019-06-03 19:29:14 +00:00
Michael Berg	0b7f98da65	Propagate fmf for setcc/select folds Summary: This change facilitates propagating fmf which was placed on setcc from fcmp through folds with selects so that back ends can model this path for arithmetic folds on selects in SDAG. Reviewers: qcolombet, spatel Reviewed By: qcolombet Subscribers: nemanjai, jsji Differential Revision: https://reviews.llvm.org/D62552 llvm-svn: 362439	2019-06-03 19:12:15 +00:00
Nemanja Ivanovic	bad43d8f49	[PowerPC] Look through copies for compare elimination We currently miss the opportunities for optimizing comparisons in the peephole optimizer if the input is the result of a COPY since we look for record-form versions of the producing instruction. This patch simply lets the optimization peek through copies. Differential revision: https://reviews.llvm.org/D59633 llvm-svn: 362438	2019-06-03 19:09:15 +00:00
Matt Arsenault	8dbeb9256c	TTI: Improve default costs for addrspacecast For some reason multiple places need to do this, and the variant the loop unroller and inliner use was not handling it. Also, introduce a new wrapper to be slightly more precise, since on AMDGPU some addrspacecasts are free, but not no-ops. llvm-svn: 362436	2019-06-03 18:41:34 +00:00
Andrew Kaylor	4172dbab5d	Fix a crash when the default of a switch is removed This patch fixes a problem that occurs in LowerSwitch when a switch statement has a PHI node as its condition, and the PHI node only has two incoming blocks, and one of those incoming blocks is through an unreachable default in the switch statement. When this condition occurs, LowerSwitch holds a pointer to the condition value, but removes the switch block as a predecessor of the PHI block, causing the PHI node to be replaced. LowerSwitch then tries to use its stale pointer to the original condition value, causing a crash. Differential Revision: https://reviews.llvm.org/D62560 llvm-svn: 362427	2019-06-03 17:54:15 +00:00
Philip Reames	83645d214d	[Tests] Add LFTR tests for multiple exit loops (try 2) (Recommit after fixing a keymash in the run line. Sorry for breakage.) This is preparation for D62625 <https://reviews.llvm.org/D62625> llvm-svn: 362426	2019-06-03 17:41:12 +00:00
Dmitri Gribenko	b46934eeb8	Revert "[Tests] Add LFTR tests for multiple exit loops" This reverts commit r362417. There's a syntax error in the RUN line. llvm-svn: 362418	2019-06-03 16:58:11 +00:00
Philip Reames	2fcd2bd0df	[Tests] Add LFTR tests for multiple exit loops This is preparation for D62625 llvm-svn: 362417	2019-06-03 16:46:03 +00:00
Simon Pilgrim	985f2f48bd	[WebAssembly] Remove fptosi(undef) and fptoui(undef) from reduced test case. Pre-commit for D62811 - which adds DAG fpto[us]i(undef) --> undef constant fold llvm-svn: 362414	2019-06-03 16:21:58 +00:00
Dmitri Gribenko	857de979a7	Revert "[llvm-ar] Fix relative thin archive path handling" This reverts commit r362407. It broke compilation of llvm/lib/Object/ArchiveWriter.cpp: error: type 'llvm::sys::path::const_iterator' does not provide a call operator llvm-svn: 362413	2019-06-03 16:21:37 +00:00
Owen Reynolds	fade9cbed7	[llvm-ar] Fix relative thin archive path handling This fixes some thin archive relative path issues, paths are shortened where possible and paths are output correctly when using the display table command. Differential Revision: https://reviews.llvm.org/D59491 llvm-svn: 362407	2019-06-03 15:26:07 +00:00
Michal Gorny	9158d57d19	[llvm] [test] Remove non-portable EISDIR test from macho-disassemble-g-dsym.test Remove the test checking error message for 'is a directory'. It does not seem to serve any real purpose, and it relies on matching platform error strings, which are unpredictable and make the test fragile. Furthermore, it fails on NetBSD where read() works on directories, and therefore does not return EISDIR at all. Fixes r362141. Differential Revision: https://reviews.llvm.org/D62773 llvm-svn: 362404	2019-06-03 14:50:03 +00:00
Dmitry Preobrazhensky	9111f35f02	[AMDGPU][MC] Added support of SCC, VCCZ and EXECZ operands See bug 39292: https://bugs.llvm.org/show_bug.cgi?id=39292 Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D62660 llvm-svn: 362400	2019-06-03 13:51:24 +00:00
Simon Pilgrim	cb7e4e8193	[SelectionDAG] Add [us]itofp(undef) --> 0 constant fold (PR39205) We were missing this fold in the DAG, which I've copied directly from llvm::ConstantFoldCastInstruction Differential Revision: https://reviews.llvm.org/D62807 llvm-svn: 362397	2019-06-03 13:02:07 +00:00
Simon Pilgrim	74467814f2	[SystemZ] Remove sitofp(undef) from reduced test case. Pre-commit for D62807 - which adds DAG [us]itofp(undef) --> 0 constant fold llvm-svn: 362396	2019-06-03 12:58:36 +00:00
Cullen Rhodes	3901dd3e41	[AArch64][SVE2] Add CPU and arch directive tests Summary: This patch adds tests for directives .arch, .arch_extension and .cpu for all features defined in Arm SVE2 architecture extension. Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62602 llvm-svn: 362378	2019-06-03 10:42:02 +00:00
George Rimar	ab93e6e0fe	[llvm-readobj] - Convert gnu-sections.test to use YAML. gnu-sections.test currently uses relocs.obj.elf-x86_64 and relocs.obj.elf-i386 precompiled objects as inputs. These inputs were initially introduced to test the dump of relocations and have almost nothing in common with dumping sections. Patch converts the test to use yaml2obj. That allows removing the relocs.obj.elf-i386 binary. (relocs.obj.elf-x86_64 is still used by another test and can't be removed atm). Differential revision: https://reviews.llvm.org/D62659 llvm-svn: 362377	2019-06-03 09:58:41 +00:00
George Rimar	1115a199aa	[llvm-readobj/llvm-readelf] - Remove gnu-relocations.test completely. rL362089 introduced a set of yaml based reloc-types-*.test test cases (instead of huge reloc-types.test that used a lot of precompiled binaries) These test cases check LLVM-styled dumping of the relocations. gnu-relocations.test was a test case to check GNU styled relocations dumping. It did that only for elf-x86 and elf-x86_64 targets. It did not test all of the relocations though. Now, after rL362089, it does not make sense to keep it. This patch updates reloc-types-elf-i386.test and reloc-types-elf-x64.test tests with llvm-readelf calls to check GNU styled output in one place. It removes gnu-relocations.test completely. One of the intentions of doing this is also to get rid of relocs.obj.elf-i386 and relocs.obj.elf-x86_64 precompiled objects completely (they are used in other tests still). Differential revision: https://reviews.llvm.org/D62655 llvm-svn: 362374	2019-06-03 09:52:32 +00:00
Nikola Prica	2d0106a110	[LiveDebugValues] Close range for previous variable's location when adding newly deduced location When LiveDebugValues deduces a new variable's location from a spill, restore or register copy instruction it should close the old variable's location. Otherwise we can have multiple block output locations for the same variable. That could lead to inserting two DBG_VALUEs for the same variable at the beginning of the successor block, which results in the first DBG_VALUE being ignored. Reviewers: aprantl, jmorse, wolfgangp, dstenb Reviewed By: aprantl Subscribers: probinson, asowda, ivanbaev, petarj, djtodoro Tags: #debug-info Differential Revision: https://reviews.llvm.org/D62196 llvm-svn: 362373	2019-06-03 09:48:29 +00:00
Diogo N. Sampaio	df92f84110	[ARM][FIX] Ran out of registers due to tail recursion Summary: - pr42062 When compiling for MinSize, ARMTargetLowering::LowerCall decides to indirect multiple calls to the same function. However, it disregards the limitation that thumb1 indirect calls require the callee to be in a register from r0 to r3 (llvm limitation). If all those registers are used by arguments, the compiler dies with "error: run out of registers during register allocation". This patch tells the function IsEligibleForTailCallOptimization if we intend to perform indirect calls, so as to avoid tail call optimization. Reviewers: dmgreen, efriedma Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62683 llvm-svn: 362366	2019-06-03 08:58:05 +00:00
Sam Parker	a0bd6f8a1a	[AArch64] Check for simple type in FPToUInt DAGCombiner was hitting a SimpleType assertion when trying to combine a v3f32 before type legalization. bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41916 Differential Revision: https://reviews.llvm.org/D62734 llvm-svn: 362365	2019-06-03 08:49:17 +00:00
Roman Lebedev	bcd542881d	[NFC][X86] extract-{low,}bits.ll: one more pattern c with truncation llvm-svn: 362364	2019-06-03 08:44:09 +00:00
Jim Lin	20b14dacbb	[AVR] Fix incorrect source regclass of LDWRdPtr Summary: LDWRdPtr would be expanded to ld+ldd. ldd only accepts Y or Z as the pointer register. So the register class of the pointer of LDWRdPtr should be PTRDISPREGS instead of PTRREGS. Reviewers: dylanmckay Reviewed By: dylanmckay Subscribers: dylanmckay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62300 llvm-svn: 362351	2019-06-03 02:31:07 +00:00
Florian Hahn	e71963c850	Recommit r360171: [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor. If we hit the limit, we do expand the outstanding tokenfactors. Otherwise, we might drop nodes with users in the unexpanded tokenfactors. This fixes the crashes reported by Jordan Rupprecht. Reviewers: niravd, spatel, craig.topper, rupprecht Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D62633 llvm-svn: 362350	2019-06-03 01:30:19 +00:00
Nico Weber	3cbb8b8391	llvm-undname: Add coverage for some error paths llvm-svn: 362346	2019-06-02 23:48:28 +00:00
Nico Weber	54362477c7	llvm-undname: Add more test coverage for demangleFunctionClass() Also add two FC_Far that seem to be missing, by symmetry from the public and protected cases. (But FC_Far isn't really a thing anymore, so this doesn't really have an observable effect.) llvm-svn: 362344	2019-06-02 23:26:57 +00:00
Craig Topper	50b35caf30	[DAGCombiner][X86] Fold away masked store and scatter with all zeroes mask. Similar to what was done for masked load and gather. llvm-svn: 362342	2019-06-02 22:52:38 +00:00
Craig Topper	5f79d74946	[X86] Add test cases for masked store and masked scatter with an all zeroes mask. Fix bug in ScalarizeMaskedMemIntrin Need to cast only to Constant instead of ConstantVector to allow ConstantAggregateZero. llvm-svn: 362341	2019-06-02 22:52:34 +00:00
Simon Pilgrim	8a32ca381d	[CostModel][X86] Improve masked load/store AVX1/AVX2 costs A mixture of internal tests and review of the scheduler models indicates we're overestimating the cost of a masked load, which we're estimating at 4x regular memory ops - more realistic values indicate that it's closer to 2x. Masked store costs are a lot more diverse but 8x is roughly in the middle of the range. e.g. SandyBridge defm : X86WriteRes<WriteFMaskedLoad, [SBPort23,SBPort05], 8, [1,2], 3>; defm : X86WriteRes<WriteFMaskedLoadY, [SBPort23,SBPort05], 9, [1,2], 3>; defm : X86WriteRes<WriteFMaskedStore, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>; defm : X86WriteRes<WriteFMaskedStoreY, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>; e.g. Btver2 defm : X86WriteRes<WriteFMaskedLoad, [JLAGU, JFPU01, JFPX], 6, [1, 2, 2], 1>; defm : X86WriteRes<WriteFMaskedLoadY, [JLAGU, JFPU01, JFPX], 6, [2, 4, 4], 2>; defm : X86WriteRes<WriteFMaskedStore, [JSAGU, JFPU01, JFPX], 6, [1, 1, 4], 1>; defm : X86WriteRes<WriteFMaskedStoreY, [JSAGU, JFPU01, JFPX], 6, [2, 2, 4], 2>; Differential Revision: https://reviews.llvm.org/D61257 llvm-svn: 362338	2019-06-02 20:37:02 +00:00
Craig Topper	a7bc31ebc6	[DAGCombiner] Replace masked loads with a zero mask with the passthru value Similar to what was recently done for gathers in r362015. llvm-svn: 362337	2019-06-02 18:58:46 +00:00
Nico Weber	869308dd55	Add demangling test coverage for unsigned short, unsigned long llvm-svn: 362332	2019-06-02 17:29:26 +00:00
Nico Weber	dfe02bc4e9	Add mangling test coverage for non-volatile const member pointers llvm-svn: 362331	2019-06-02 17:23:53 +00:00
Roman Lebedev	420f5df1c3	[NFC][X86] extract-{low,}bits.ll: one more pattern a with truncation llvm-svn: 362330	2019-06-02 17:11:21 +00:00
Nico Weber	d0d32c35d9	Add test coverage for __pascal mangling llvm-svn: 362329	2019-06-02 16:47:07 +00:00
Simon Pilgrim	71a39bcf68	[X86] isHorizontalBinOp - add extract_subvector(shuffle(x)) handling (PR39921) Lets us match horizontal op patterns on fast-variable-shuffle targets (Haswell etc.) llvm-svn: 362327	2019-06-02 15:47:49 +00:00
Simon Pilgrim	b0dc262ffb	[X86] Add AVX2 'fast-variable-shuffle' PHADD tests (PR39921) Haswell etc. will combine shuffles to an extract_subvector(permd(x)) before isHorizontalBinOp can match it. llvm-svn: 362326	2019-06-02 15:33:28 +00:00
Roman Lebedev	2065ddfd79	[NFC][X86] extract-lowbits.ll: add one more pattern a with truncation We are also free to interpret this as 'BZHI'/'BEXTR'. https://rise4fun.com/Alive/dD6 llvm-svn: 362325	2019-06-02 15:07:49 +00:00
Simon Pilgrim	ffb4d2bff7	[DAG] isBitwiseNot / isConstOrConstSplat - add support for build vector undefs + truncation (PR41020) Add (opt-in) support for implicit truncation to isConstOrConstSplat, which allows us to match truncated 'all ones' cases in isBitwiseNot. PR41020 compares against using ISD::isBuildVectorAllOnes() instead, but that predicate silently accepts any UNDEF elements in the build vector which might not be what we want in isBitwiseNot - so I've added an opt-in 'AllowUndefs' flag that is set to false by default but will allow us to enable it on individual cases where its safe. Differential Revision: https://reviews.llvm.org/D62783 llvm-svn: 362323	2019-06-02 11:56:39 +00:00
Nikita Popov	eb37509832	[IndVarSimplify] Add tests for saturating math on IV; NFC These saturating math ops can be replaced with simple math. llvm-svn: 362320	2019-06-02 08:49:35 +00:00
Roman Lebedev	0bfa9359b0	[NFC][X86] extract-lowbits.ll: add patterns with truncation too If we look past truncations of X too eagerly (D62786), we may end up with 64-bit 'BEXTR', even though 32-bit-one would suffice. llvm-svn: 362319	2019-06-02 08:05:24 +00:00
Craig Topper	fe699c32a2	[X86] Simplify the CHECK lines in vector-reduce-and/or/xor-widen.ll in similar way to r362308. Forgot to do the widen forms when I was doing the others. llvm-svn: 362310	2019-06-02 00:43:02 +00:00
Craig Topper	396a915c26	[X86] Add the SSE versions of PMULLW and PMULLD to isAssociativeAndCommutative. llvm-svn: 362309	2019-06-02 00:42:58 +00:00
Craig Topper	4721fad972	[X86] Simplify the CHECK lines in vector-reduce-and/or/xor. The AVX512BW and AVX512VL checks were never used. And AVX512 is the same as AVX on all tests that weren't already split for AVX1 and AVX2. llvm-svn: 362308	2019-06-02 00:07:52 +00:00
Craig Topper	eeaecc63e9	[X86] Add avx512 command lines and test cases to machine-combiner.ll llvm-svn: 362307	2019-06-02 00:07:48 +00:00
Craig Topper	7cebf0af40	[InlineCost] Don't add the soft float function call cost for the fneg idiom, fsub -0.0, %x Summary: Fneg can be implemented with an xor rather than a function call so we don't need to add the function call overhead. This was pointed out in D62699 Reviewers: efriedma, cameron.mcinally Reviewed By: efriedma Subscribers: javed.absar, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62747 llvm-svn: 362304	2019-06-01 19:40:07 +00:00
Simon Pilgrim	cd1878d0f9	[AMDGPU] Regenerate SDIV tests for an upcoming patch llvm-svn: 362303	2019-06-01 18:27:06 +00:00
Simon Pilgrim	0d4a040510	[X86][AVX] Add tests for CONCAT(MOVDDUP(x),MOVDDUP(y)) llvm-svn: 362300	2019-06-01 14:05:46 +00:00
Simon Atanasyan	25694e0084	[mips] Extend range of register indexes accepted by cfcmsa/ctcmsa The `cfcmsa` and `ctcmsa` instructions accept the index of an MSA control register. The MIPS64 SIMD Architecture defines eight MSA control registers. But the register index for `cfcmsa` and `ctcmsa` instructions might be any number in the 0..31 range. If the index is greater than 7, `cfcmsa` writes zero to the destination registers and `ctcmsa` does nothing [1]. [1] MIPS Architecture for Programmers Volume IV-j: The MIPS64 SIMD Architecture Module https://www.mips.com/?do-download=the-mips64-simd-architecture-module Differential Revision: https://reviews.llvm.org/D62597 llvm-svn: 362299	2019-06-01 13:55:18 +00:00
Dylan McKay	45eb4c7e55	[AVR] Disable register coalescing to the PTRDISPREGS class If we allowed register coalescing on the PTRDISPREGS class, the register allocator could lock the Z register to some virtual register. Larger instructions requiring a memory access then fail during the register allocation phase since there is no available register to hold a pointer if the Y register was already taken for a stack frame. This patch prevents it by keeping the Z register spillable. It does so by not allowing the coalescer to lock it. Original discussion on https://github.com/avr-rust/rust/issues/128. llvm-svn: 362298	2019-06-01 12:38:56 +00:00
Simon Pilgrim	e6d1a80370	[SLPVectorizer][X86] Add other tests described in PR28474 llvm-svn: 362297	2019-06-01 12:35:03 +00:00
Simon Pilgrim	2ef83571f2	[SLPVectorizer][X86] This test was from PR28474 llvm-svn: 362296	2019-06-01 12:10:29 +00:00
Roman Lebedev	1aaa23c0fc	[NFC][Codegen] shift-amount-mod.ll: drop innermost operation I have initially added it in for test to display both whether the binop w/ constant is sinked or hoisted. But as it can be seen from the 'sub (sub C, %x), %y' test, that actually conceals the issues it is supposed to test. At least two more patterns are unhandled: * 'add (sub C, %x), %y' - D62266 * 'sub (sub C, %x), %y' llvm-svn: 362295	2019-06-01 11:08:29 +00:00
Nikita Popov	46d4dba6e6	[IndVarSimplify] Fixup nowrap flags during LFTR (PR31181) Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix for LFTR poison handling issues in general. When LFTR moves a condition from pre-inc to post-inc, it may now depend on value that is poison due to nowrap flags. To avoid this, we clear any nowrap flag that SCEV cannot prove for the post-inc addrec. Additionally, LFTR may switch to a different IV that is dynamically dead and as such may be arbitrarily poison. This patch will correct nowrap flags in some but not all cases where this happens. This is related to the adoption of IR nowrap flags for the pre-inc addrec. (See some of the switch_to_different_iv tests, where flags are not dropped or insufficiently dropped.) Finally, there are likely similar issues with the handling of GEP inbounds, but we don't have a test case for this yet. Differential Revision: https://reviews.llvm.org/D60935 llvm-svn: 362292	2019-06-01 09:40:18 +00:00
Nikita Popov	2b1d799a59	[IndVarSimplify] Add additional PR33181 tests; NFC Two more tests with a switch to a dynamically dead IV, with poison occurring on the first or second iteration. llvm-svn: 362291	2019-06-01 09:40:09 +00:00
Dylan McKay	038e3b9f57	Extend the DWARFExpression address handling to support 16-bit addresses This allows the DWARFExpression class to handle addresses without crashing on targets with 16-bit pointers like AVR. This is required in order to generate assembly from clang via the '-S' flag. This fixes an error with the following message: clang: llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h:132: llvm::DWARFExpression::DWARFExpression(llvm::DataExtractor, uint16_t, uint8_t): Assertion `AddressSize == 8 || AddressSize == 4' failed. llvm-svn: 362290	2019-06-01 09:18:26 +00:00
Craig Topper	c288a19bb7	[X86] Add AVX512BF16 and AVX512VP2INTERSECT instructions to the loading folding tables. llvm-svn: 362288	2019-06-01 06:20:59 +00:00
Tom Tan	2258ecc2aa	[COFF, ARM64] Fix location of ARM64 CodeView test ARM64 CodeView test was incorrectly put under the test/DebugInfo/COFF folder, which runs for all architectures. This fix moves it to a subfolder AArch64 with a lit.local.cfg which specifies that it supports AArch64 only. llvm-svn: 362283	2019-06-01 02:38:08 +00:00
Philip Reames	099eca832e	[LoopPred] Handle a subset of NE comparison based latches At the moment, LoopPredication completely bails out if it sees a latch of the form: %cmp = icmp ne %iv, %N br i1 %cmp, label %loop, label %exit OR %cmp = icmp ne %iv.next, %NPlus1 br i1 %cmp, label %loop, label %exit This is unfortunate since this is exactly the form that LFTR likes to produce. So, go ahead and recognize simple cases where we can. For pre-increment loops, we leverage the fact that LFTR likes canonical counters (i.e. those starting at zero) and a (presumed) range fact on RHS to discharge the check trivially. For post-increment forms, the key insight is in remembering that LFTR had to insert a (N+1) for the RHS. CVP can hopefully prove that add nsw/nuw (if there's appropriate range on N to start with). This leaves us both with the post-inc IV and the RHS involving an nsw/nuw add, and SCEV can discharge that with no problem. This does still need to be extended to handle non-one steps, or other harder patterns of variable (but range restricted) starting values. That'll come later. Differential Revision: https://reviews.llvm.org/D62748 llvm-svn: 362282	2019-06-01 00:31:58 +00:00
Tom Tan	eb4d6142dc	[COFF, ARM64] Add CodeView register mapping CodeView has its own register map which is defined in cvconst.h. Missing this mapping before saving a register to CodeView causes the debugger to show incorrect values for all register based variables, like variables in registers and local variables addressed by register (stack pointer + offset). This change adds a mapping between LLVM registers and CodeView registers so the correct register number will be stored to CodeView/PDB. It also fixes the mapping from CodeView register number to register name based on the current CPUType, but printing PDB to yaml still assumes an X86 CPU and needs to be fixed. Differential Revision: https://reviews.llvm.org/D62608 llvm-svn: 362280	2019-05-31 23:43:31 +00:00
Reid Kleckner	eddd6c25b5	[codeview] Revert inline line table change of r362264 Testing with debuggers shows that our previous behavior was correct. The reason I thought MSVC did things differently is that MSVC prefers to use the 0xB combined code offset and code length update opcode when inline sites are discontiguous. Keep the test changes, and update the llvm-pdbutil inline line table dumper to account for this new interpretation of the opcodes. llvm-svn: 362277	2019-05-31 22:55:03 +00:00
Matt Arsenault	302eedcbfa	AMDGPU: Fix not adding ImplicitBufferPtr as a live-in Fixes missing test from r293000. llvm-svn: 362275	2019-05-31 22:47:36 +00:00
Erik Pilkington	abb2a93c53	[SimplifyLibCalls] Fold more fortified functions into non-fortified variants When the object size argument is -1, no checking can be done, so calling the _chk variant is unnecessary. We already did this for a bunch of these functions. rdar://50797197 Differential revision: https://reviews.llvm.org/D62358 llvm-svn: 362272	2019-05-31 22:41:36 +00:00
Philip Reames	fa6bcd0b96	[Tests] Better represent the postinc form produced by LFTR in LoopPred tests llvm-svn: 362270	2019-05-31 22:22:29 +00:00
Reid Kleckner	e98cf5fe47	[codeview] Fix inline line table accuracy for discontiguous segments After improving the inline line table dumper in llvm-pdbutil and looking at MSVC's inline line tables, it is clear that setting the length of the inlined code region does not update the code offset. This means that the delta to the beginning of a new discontiguous inlined code region should be calculated relative to the last code offset, excluding the length. Implementing this is a one line fix for MC: simply don't update LastLabel. While I'm updating these test cases, switch them to use llvm-objdump -d and llvm-pdbutil. This allows us to show offsets of each instruction and correlate the line table offsets to the actual code. llvm-svn: 362264	2019-05-31 20:55:31 +00:00
Nikita Popov	7bafae55c0	Reapply [CVP] Simplify non-overflowing saturating add/sub If we can determine that a saturating add/sub will not overflow based on range analysis, convert it into a simple binary operation. This is a sibling transform to the existing with.overflow handling. Reapplying this with an additional check that the saturating intrinsic has integer type, as LVI currently does not support vector types. Differential Revision: https://reviews.llvm.org/D62703 llvm-svn: 362263	2019-05-31 20:48:26 +00:00
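
The CVP entry above turns a saturating add/sub into a plain binary operation when range analysis shows it cannot overflow. The following is a minimal Python sketch of that idea for an unsigned 8-bit add only. The helper names (`uadd_sat`, `never_overflows`) and the simple upper-bound check are illustrative assumptions; they are not LLVM's LazyValueInfo API or the code in D62703.

```python
def uadd_sat(a, b, bits=8):
    """Unsigned saturating add: clamp the sum to the largest bits-wide value."""
    return min(a + b, (1 << bits) - 1)


def never_overflows(max_a, max_b, bits=8):
    """Hypothetical range check: True if a + b fits in bits bits whenever
    a <= max_a and b <= max_b."""
    return max_a + max_b <= (1 << bits) - 1


def simplified_add(a, b, max_a, max_b, bits=8):
    """Use a plain add when the ranges prove saturation can never trigger."""
    if never_overflows(max_a, max_b, bits):
        return a + b  # same result as uadd_sat on this range
    return uadd_sat(a, b, bits)
```

With both operands known to be at most 100, the sum is at most 200, so the saturating form and the plain add agree on every input in range. With an operand that may reach 200, the check fails and the saturating form is kept.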
Nikita Popov	d435093056	[CVP] Add vector saturating add test; NFC Extra test for the assertion failure from D62703. llvm-svn: 362262	2019-05-31 20:42:13 +00:00
Nikita Popov	23a02f6a5f	[CVP] Fix assertion failure on vector with.overflow Noticed on D62703. LVI only handles plain integers, not vectors of integers. This was previously not an issue, because vector support for with.overflow is only a relatively recent addition. llvm-svn: 362261	2019-05-31 20:42:07 +00:00
Philip Reames	f711d59427	[Tests] Add ne icmp tests w/preinc forms for LoopPredication Turns out this is substantially easier to match than the post increment form, so let's start there. llvm-svn: 362260	2019-05-31 20:34:57 +00:00
Cameron McInally	5594ee0a3e	[NFC][InstCombine] Add unary FNeg tests to AMDGPU/amdgcn-intrinsics.ll llvm-svn: 362255	2019-05-31 19:12:59 +00:00
Nikita Popov	ccb63e0bfe	Revert "[CVP] Simplify non-overflowing saturating add/sub" This reverts commit `1e692d1777`. Causes assertion failure in builtins-wasm.c clang test. llvm-svn: 362254	2019-05-31 19:04:47 +00:00
Cameron McInally	51e0de6954	[NFC][InstCombine] Add unary FNeg to cos-1.ll cos-2.ll cos-sin-intrinsic.ll llvm-svn: 362253	2019-05-31 18:54:44 +00:00
Puyan Lotfi	3ea6b24f41	[MIR-Canon] Don't do vreg skip for independent instructions if there are none. We don't want to create vregs if there is nothing to use them for. That causes verifier errors. Differential Revision: https://reviews.llvm.org/D62740 llvm-svn: 362247	2019-05-31 17:34:25 +00:00
Philip Reames	8dda4a1675	[Tests] Add tests for loop predication of loops w/ne latch conditions llvm-svn: 362244	2019-05-31 16:54:38 +00:00
Nikita Popov	1e692d1777	[CVP] Simplify non-overflowing saturating add/sub If we can determine that a saturating add/sub will not overflow based on range analysis, convert it into a simple binary operation. This is a sibling transform to the existing with.overflow handling. Differential Revision: https://reviews.llvm.org/D62703 llvm-svn: 362242	2019-05-31 16:46:05 +00:00
Kevin P. Neal	ac79007205	Revert revert of r362112 with minor SystemZ test file corrections. [FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes This change creates UnrollVectorOp_StrictFP. The purpose of this is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect that the vector that it will unroll will have a chain, so it has an assert that prevents it from running if this is the case. This new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widening, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully. Submitted by: Drew Wock <drew.wock@sas.com> Reviewed by: Cameron McInally, Kevin P. Neal Approved by: Cameron McInally Differential Revision: https://reviews.llvm.org/D62546 llvm-svn: 362241	2019-05-31 16:32:12 +00:00
Stanislav Mekhanoshin	fbbe5230f4	[AMDGPU] Use InliningThresholdMultiplier for inline hint AMDGPU uses multiplier 9 for the inline cost. It is taken into account everywhere except for inline hint threshold. As a result we are penalizing functions with the inline hint making them less probable to be inlined than those without the hint. Defaults are 225 for a normal function and 325 for a function with an inline hint. Currently we have effective threshold 225 * 9 = 2025 for normal functions and just 325 for those with the hint. That is fixed by this patch. Differential Revision: https://reviews.llvm.org/D62707 llvm-svn: 362239	2019-05-31 16:19:26 +00:00
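
The arithmetic in the AMDGPU entry above can be checked with a short Python sketch. The constants 9, 225, and 325 come from the commit message. The function name and the `scale_hint` flag are hypothetical, and the "after" value of 325 * 9 is inferred from the commit title rather than quoted from the patch.

```python
MULTIPLIER = 9           # AMDGPU inline cost multiplier (from the commit message)
DEFAULT_THRESHOLD = 225  # default threshold for a normal function
HINT_THRESHOLD = 325     # default threshold for a function with an inline hint


def effective_thresholds(scale_hint):
    """Return (normal, hinted) effective thresholds.

    scale_hint=False models the behavior described as the bug: the
    multiplier is applied everywhere except the inline-hint threshold.
    scale_hint=True models the hint threshold also being multiplied.
    """
    normal = DEFAULT_THRESHOLD * MULTIPLIER
    hinted = HINT_THRESHOLD * (MULTIPLIER if scale_hint else 1)
    return normal, hinted
```

Before the change, a hinted function gets a lower effective threshold (325) than an unhinted one (2025), which is the penalty the message describes. Once the hint threshold is scaled as well, the hinted threshold is the higher of the two again.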
Cameron McInally	8ff009a461	[NFC][InstCombine] Add unary FNeg tests to fabs.ll llvm-svn: 362238	2019-05-31 16:17:04 +00:00
Guozhi Wei	c3a24e93d5	[PPC] Correctly adjust branch probability in PPCReduceCRLogicals In PPCReduceCRLogicals after splitting the original MBB into 2, the 2 impacted branches still use the original branch probability. This is unreasonable. Suppose we have the following code, and the probability of each successor is 50%. condc = conda || condb br condc, label %target, label %fallthrough It can be transformed to the following, br conda, label %target, label %newbb newbb: br condb, label %target, label %fallthrough Since each branch has a probability of 50% to each successor, the total probability to %fallthrough is 25% now, and the total probability to %target is 75%. This actually changes the original profiling data. A more reasonable probability can be set to 70% to the false side for each branch instruction, so the total probability to %fallthrough is close to 50%. This patch assumes the branch target with two incoming edges has the same edge frequency on both edges, computes a new probability for each target, and keeps the total probability to the original targets unchanged. Differential Revision: https://reviews.llvm.org/D62430 llvm-svn: 362237	2019-05-31 16:11:17 +00:00
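
The probability argument in the PPCReduceCRLogicals entry above can be worked through numerically. The Python sketch below is one reading of the message's description (equal frequency on the two edges into `%target`, total probabilities preserved) for the `conda || condb` case only. The function names are hypothetical and this is not the code from D62430.

```python
from fractions import Fraction


def naive_split(p_target):
    """Both new branches reuse the original probability to %target."""
    p_fall = 1 - p_target
    total_fall = p_fall * p_fall  # must fall through both branches
    return 1 - total_fall, total_fall


def equal_edge_split(p_target):
    """Give each of the two edges into %target half of the original
    frequency, keeping the total probability to %target unchanged."""
    half = p_target / 2
    first_taken = half                  # P(first branch -> %target)
    second_taken = half / (1 - half)    # P(second branch -> %target | reached)
    total_target = first_taken + (1 - first_taken) * second_taken
    return first_taken, second_taken, total_target
```

For the 50% example, `naive_split` gives 75% to `%target` and 25% to `%fallthrough`, matching the skew described in the message. `equal_edge_split` gives 1/4 on the first branch and 1/3 on the second, and the total to `%target` stays at 1/2. The message's rough figure of 70% to the false side on each branch gives 0.7 * 0.7 = 49% to `%fallthrough`.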
Cameron McInally	6d2a4712f3	[NFC][InstCombine] Add unary FNeg tests to fcmp.ll llvm-svn: 362234	2019-05-31 15:40:03 +00:00
Cameron McInally	aea3149e6c	[NFC][InstCombine] Add unary FNeg tests to fdiv.ll llvm-svn: 362231	2019-05-31 15:10:34 +00:00
Simon Pilgrim	db6a1d4f24	[AMDGPU] Regenerate add/sub shrink constant tests for an upcoming patch llvm-svn: 362230	2019-05-31 15:06:51 +00:00
Simon Pilgrim	27d6ea9698	[AMDGPU] Regenerate CTLZ tests for an upcoming patch llvm-svn: 362229	2019-05-31 15:06:14 +00:00
Cameron McInally	66c25def00	[NFC][InstCombine] Add unary FNeg tests to fma.ll llvm-svn: 362227	2019-05-31 14:49:31 +00:00
George Rimar	60d88e0e90	[llvm-readobj] - Remove excessive `dynamic.test` dynamic.test is a test that checks dumping of dynamic tags. It uses precompiled objects as inputs and it is completely excessive nowadays: Now we have elf-dynamic-tags-machine-specific.test and elf-dynamic-tags.test. (https://github.com/llvm-mirror/llvm/blob/master/test/tools/llvm-readobj/elf-dynamic-tags-machine-specific.test) (https://github.com/llvm-mirror/llvm/blob/master/test/tools/llvm-readobj/elf-dynamic-tags.test) First is used to check target specific tags and second tests the common flags. These tests use YAML, which is much better than using precompiled binaries. Note that new reviews tend to update the YAML based tests to add new tags, e.g. see D62596. With this patch it became possible to remove dynamic-table-so.aarch64 binary from the inputs folder. (other binaries are still used in other tests). Differential revision: https://reviews.llvm.org/D62728 llvm-svn: 362224	2019-05-31 13:16:21 +00:00
Roman Lebedev	39390d8317	[InstCombine] 'C-(C2-X) --> X+(C-C2)' constant-fold It looks this fold was already partially happening, indirectly via some other folds, but with one-use limitation. No other fold here has that restriction. https://rise4fun.com/Alive/ftR llvm-svn: 362217	2019-05-31 09:47:16 +00:00
Roman Lebedev	886c4ef35a	[InstCombine] 'add (sub C1, X), C2 --> sub (add C1, C2), X' constant-fold https://rise4fun.com/Alive/qJQ llvm-svn: 362216	2019-05-31 09:47:04 +00:00
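
The two InstCombine constant folds listed above, `C-(C2-X) --> X+(C-C2)` and `add (sub C1, X), C2 --> sub (add C1, C2), X`, are identities in wrapping integer arithmetic. The Python sketch below checks them exhaustively at a small bit width. It is an illustration only: the 4-bit width and the helper names are assumptions, and it does not replace the Alive proofs linked in the commits.

```python
import itertools


def make_ops(bits):
    """Return wrapping add/sub for a bits-wide integer type."""
    mask = (1 << bits) - 1

    def add(a, b):
        return (a + b) & mask

    def sub(a, b):
        return (a - b) & mask

    return add, sub


def folds_hold(bits=4):
    """Check both folds for every (c1, c2, x) at the given width."""
    add, sub = make_ops(bits)
    values = range(1 << bits)
    for c1, c2, x in itertools.product(values, repeat=3):
        # C-(C2-X) --> X+(C-C2)
        if sub(c1, sub(c2, x)) != add(x, sub(c1, c2)):
            return False
        # add (sub C1, X), C2 --> sub (add C1, C2), X
        if add(sub(c1, x), c2) != sub(add(c1, c2), x):
            return False
    return True
```

Both sides of each fold reduce to the same value modulo 2^bits, so the check passes for any width. The small default only keeps the exhaustive loop fast.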
Cullen Rhodes	0fc3a07398	[AArch64][SVE2] Asm: support WHILE instructions Summary: Patch adds support for the following instructions: * WHILEGE, WHILEGT, WHILEHS, WHILEHI, WHILEWR, WHILERW The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62601 llvm-svn: 362215	2019-05-31 09:13:55 +00:00
Cullen Rhodes	087d1337f8	[AArch64][SVE2] Asm: support TBL/TBX instructions Summary: A three sources variant of the TBL instruction is added to the existing SVE instruction in SVE2. This is implemented with minor changes to the existing TableGen class. TBX is a new instruction with its own definition. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62600 llvm-svn: 362214	2019-05-31 09:06:53 +00:00
Cullen Rhodes	2e870011b6	[AArch64][SVE2] Asm: support SVE2 store instructions Summary: Patch adds support for the following instructions: * STNT1B, STNT1H, STNT1S, STNT1D The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62599 llvm-svn: 362213	2019-05-31 08:59:40 +00:00
Petar Avramovic	f317debdb8	[MIPS GlobalISel] Add detailed tests for lower call Test different operand types of callee and their behavior whether relocation model is pic or not. Possible operand types are: Register (function pointer), External symbol (used for libcalls e.g. __udivdi3 or memcpy), Global address. Global address has different handling depending on relocation model and linkage type. Register and external symbol do not. Differential Revision: https://reviews.llvm.org/D62590 llvm-svn: 362212	2019-05-31 08:40:08 +00:00
Petar Avramovic	efcd3c0009	[MIPS GlobalISel] Handle position independent code Handle position independent code for MIPS32. When callee is global address, lower call will emit callee as G_GLOBAL_VALUE and add target flag if needed. Support $gp in getRegBankFromRegClass(). Select G_GLOBAL_VALUE, specially handle case when there are target flags attached by lowerCall. Differential Revision: https://reviews.llvm.org/D62589 llvm-svn: 362210	2019-05-31 08:27:06 +00:00
Roman Lebedev	d1d915b8da	[NFC][InstCombine] Copy add/sub constant-folding tests from codegen Last three patterns are missed. llvm-svn: 362209	2019-05-31 08:24:07 +00:00
Roman Lebedev	7c1ac8269a	[NFC][Codegen] Add/sub constant-folding: add scalar tests too Just for completeness. llvm-svn: 362208	2019-05-31 08:23:48 +00:00
Petar Avramovic	f4a6dd28b6	[MIPS GlobalISel] Lower call for callee that is register Lower call for callee that is register for MIPS32. Register should contain callee function address. Differential Revision: https://reviews.llvm.org/D62585 llvm-svn: 362204	2019-05-31 08:06:17 +00:00
Craig Topper	31d00d80a2	[X86] Remove patterns for X86VSintToFP/X86VUintToFP+loadv4f32 to v2f64. These patterns can incorrectly narrow a volatile load from 128-bits to 64-bits. Similar to PR42079. Switch to using (v4i32 (bitcast (v2i64 (scalar_to_vector (loadi64))))) as the load pattern used in the instructions. This probably still has issues in 32-bit mode where loadi64 isn't legal. Maybe we should use VZMOVL for widened loads even when we don't need the upper bits as zeroes? llvm-svn: 362203	2019-05-31 07:38:26 +00:00
Craig Topper	cded573710	[X86] Add test cases for failure to use 128-bit masked vcvtdq2pd when load starts as v2i32. llvm-svn: 362202	2019-05-31 07:38:22 +00:00
Craig Topper	67d43e0744	[X86] Add test cases for a volatile load shrinking bug involving cvtdq2pd. NFC Similar to PR42079 llvm-svn: 362201	2019-05-31 07:38:18 +00:00
Craig Topper	cb0ad5accb	[X86] Copy a test case from avx512-cvt.ll to avx512-cvt-widen.ll. NFC llvm-svn: 362200	2019-05-31 07:38:14 +00:00
Craig Topper	b79cc5f802	[X86] Remove avx512 isel patterns for fpextend+load. Prefer to only match fp extloads instead. DAG combine will usually fold fpextend+load to an fp extload anyway. So the 256 and 512 patterns were probably unnecessary. The 128 bit pattern was special in that it looked for a v4f32 load, but then used it in an instruction that only loads 64-bits. This is bad if the load happens to be volatile. We could probably make the patterns volatile aware, but that's more work for something that's probably rare. The peephole pass might kick in and save us anyway. We might also be able to fix this with some additional DAG combines. This also adds patterns for vselect+extload to enabled masked vcvtps2pd to be used. Previously we looked for the unlikely vselect+fpextend+load. llvm-svn: 362199	2019-05-31 06:21:53 +00:00
Craig Topper	73b07284df	[X86] Add test to show missed opportunity to use masked vcvtps2pd for vselect+extload. llvm-svn: 362198	2019-05-31 06:21:49 +00:00
Craig Topper	8cb076ec6e	[X86] Add test case for PR42079. NFC llvm-svn: 362197	2019-05-31 06:21:45 +00:00
Puyan Lotfi	0d63cef180	[MIR-Canon] Skip the first N vreg names lazily. This consolidates the vreg skip code into one function (SkipVRegs()). SkipVRegs() now knows if it should skip as if it is the first initialization or subsequent skips. The first skip is also done the first time createVirtualRegister is called by the cursor instead of by the cursor's constructor. This prevents verifier errors on machine functions that have no vregs (where the verifier will complain that there are vregs when the function uses none). Differential Revision: https://reviews.llvm.org/D62717 llvm-svn: 362195	2019-05-31 06:02:38 +00:00
Craig Topper	23066033a1	[X86] Correct the ins operand order for MASKPAIR16STORE to match other store instructions. This makes the 5 address operands come first. And the data operand comes last. This matches the operand order the instruction is created with. It's also the expected order in X86MCInstLower. So everything appeared to work, but the operands didn't match their declared type. Fixes a -verify-machineinstrs failure. Also remove the isel patterns from these instructions since they should only be used for stack spills and reloads. I'm not even sure what types the patterns were looking for to match. llvm-svn: 362193	2019-05-31 05:20:27 +00:00
Puyan Lotfi	2a901401fe	[MIR-Canon] Hardening propagateLocalCopies. This is an almost NFC, it does the following: - If there is no register class for a COPY's src or dst, bail. - Fixes uses iterator invalidation bug. Differential Revision: https://reviews.llvm.org/D62713 llvm-svn: 362191	2019-05-31 04:49:58 +00:00
Pengfei Wang	2e67d0c842	[X86] Add VP2INTERSECT instructions Support Intel AVX512 VP2INTERSECT instructions in llvm Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D62366 llvm-svn: 362188	2019-05-31 02:50:41 +00:00


62300 Commits