This is another enhancement to D77895/D78362
to avoid a round-trip from XMM->GPR->XMM.
This time we handle the case of starting/ending with different FP types
but always with signed i32 as the intermediate value.
I think this covers all of the faux vector optimization possibilities
for pre-AVX512.
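For illustration, a hypothetical C++ source pattern of the kind this transform targets (not taken from the patch's tests):
```
// f32 -> i32 -> f64: scalar codegen round-trips XMM -> GPR -> XMM
// via cvttss2si and cvtsi2sd; the combine can instead stay in XMM
// using the packed cvttps2dq / cvtdq2pd forms.
double f32_to_f64_via_i32(float x) {
  return (double)(int)x;
}
```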
There is at least 1 other transform mentioned in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c19
...where we fold an 'fpext' into a preceding 'sitofp'. I think we will
want to handle that earlier (DAGCombiner or instcombine) because that's
a target-independent optimization.
Differential Revision: https://reviews.llvm.org/D78758
If either the caller or the callee has disabled ZMM registers due to
prefer-vector-width=256, we previously disabled argument promotion,
since the ABI might be incompatible: one side will split 512-bit
vectors in this case.
But if we can see that the types are all scalar, this shouldn't be
a problem.
This patch assumes that pointer element type reflects the type that
the argument will be promoted to.
Differential Revision: https://reviews.llvm.org/D78770
A previous bug fix for varargs introduced a regression where we would
incorrectly widen some stores to memory when passing i8/i16 parameters on the
stack. This seemingly didn't show up because it only happens when there is
no signext/zeroext parameter attribute, which I think clang adds for Darwin.
Swift however seems to be a different story, and a plain anyext on the parameter
triggered the bug.
To fix this, I've added a new ValueHandler::assignValueToAddress type override
which lets us distinguish between varargs and fixed args (we still need this
widening behaviour for varargs to fix the original bug in 2018).
rdar://61353552
Summary:
Add a check to make sure that MachineInstr::mayAlias returns early if at least one of its instruction parameters does not access memory. This prevents calls to TargetInstrInfo::areMemAccessesTriviallyDisjoint with incompatible instructions.
A side effect of this change is to render the mayAlias helper in the AArch64 load/store optimizer obsolete. We can now directly call the MachineInstr::mayAlias member function.
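A rough sketch of the added guard (simplified shape, not the exact in-tree code):
```
#include "llvm/CodeGen/MachineInstr.h"

// If either instruction does not access memory, there is nothing to
// alias, and it is not safe to hand the pair to
// TargetInstrInfo::areMemAccessesTriviallyDisjoint.
static bool mayAliasGuard(const llvm::MachineInstr &MIa,
                          const llvm::MachineInstr &MIb) {
  if (!MIa.mayLoadOrStore() || !MIb.mayLoadOrStore())
    return false; // return early: one side has no memory access
  return true;    // ... continue with the usual alias checks
}
```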
Reviewers: hfinkel, t.p.northover, mcrosier, eli.friedman, efriedma
Reviewed By: efriedma
Subscribers: efriedma, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78823
This is to fix performance regressions introduced by
86c944d790.
The old search would collect all potentially mergeable instructions in
the entire block. In this case, the same address is written in
multiple places in the block on the other side of a fence. When sorted
by offset, the two unmergeable, identical addresses would be next to
each other and the merge would give up.
Break the search space when we encounter an instruction we won't be
able to merge across. This will keep the identical addresses in
different merge attempts.
This may also improve compile time by reducing the merge list size.
When using reversedInstructionsWithoutDebug to construct a range from a
pair of MachineInstrBundleIterators, the range unexpectedly leaves out an
element. This results in mis-optimization as @mstorsjo points out in
https://reviews.llvm.org/D78157.
The problem is that when we convert a MachineInstrBundleIterator to a
reverse iterator, the result gets incremented:
MachineInstrBundleIterator(++I.getReverse())
The comment there explains that the "resulting iterator will dereference
... to the previous node, which is somewhat unexpected; but converting
the two endpoints in a range will give the same range in reverse". This
makes it hard to understand what reversedInstructionsWithoutDebug will
do: I've removed the helper to prevent similar mistakes in the future.
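A standard-library analogue of the convention (not LLVM code, but the same off-by-one behaviour):
```
#include <iterator>
#include <vector>

// std::reverse_iterator(it) dereferences to *(it - 1), the element
// *before* 'it'. Converting only one endpoint of a range therefore
// shifts the range by one element, which is the same surprise
// MachineInstrBundleIterator's getReverse() conversion encodes.
int lastElement() {
  std::vector<int> v{1, 2, 3};
  auto rit = std::make_reverse_iterator(v.end());
  return *rit; // 3: the element before end()
}
```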
Summary:
It is important to emit HINT instructions instead of PAC ones when
PAC is disabled. This allows compatibility with other assemblers
(e.g. GAS). This was implemented in commit da33762de8.
Still, developers of assembly code will want to write code that is
compatible with both pre- and post-PAC CPUs. They could use HINT
mnemonics, but the new mnemonics are a lot more readable (e.g.
paciaz instead of hint #24), and they will result in the same
encodings. So, while LLVM should not *emit* the new mnemonics when
PAC is disabled, this patch will at least make LLVM *accept*
assembly code that uses them.
Reviewers: danielkiss, chill, olista01, LukeCheeseman, simon_tatham
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78372
Follow-up of D78082 (x86-64).
This change avoids dynamic relocations in `xray_instr_map` for ARM/AArch64/powerpc64le.
MIPS64 cannot use 64-bit PC-relative addresses because R_MIPS_PC64 is not defined.
Because MIPS32 shares the same code, for simplicity, we don't use PC-relative addresses for MIPS32 as well.
Tested on AArch64 Linux and ppc64le Linux.
Reviewed By: ianlevesque
Differential Revision: https://reviews.llvm.org/D78590
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch32 and Assembly Parsing
D77872 has already added the MC representations of the instructions so that
they can be used in code gen; this patch fills in the details needed to
make assembly parsing work, and adds tests for asm and disasm
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, simon_tatham
Reviewed By: simon_tatham
Subscribers: simon_tatham, ostannard, kristof.beyls, hiraditya,
danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77874
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 Scalable Vector Instructions (in line
with the Scalable Vector Extension - SVE)
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, rengolin, c-rhodes
Reviewed By: c-rhodes
Subscribers: c-rhodes, ostannard, tschuett, kristof.beyls, hiraditya,
danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77873
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch32
- Intrinsics Support for AArch32 Neon Intrinsics for Matrix
Multiplication
Note: these extensions are optional in the 8.6a architecture and so are
not enabled by default; they have to be enabled explicitly
No additional IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: t.p.northover, miyuki
Reviewed By: miyuki
Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77872
This patch upstreams support for the Armv8.6-a Matrix Multiplication
Extension. A summary of the features can be found here:
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a
This patch includes:
- Assembly support for AArch64 only (no SVE or Neon)
- Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication)
No IR types or C Types are needed for this extension.
This is part of a patch series, starting with BFloat16 support and
the other components in the armv8.6a extension (in previous patches
linked in phabricator)
Based on work by:
- Luke Geeson
- Oliver Stannard
- Luke Cheeseman
Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin
Reviewed By: kmclaughlin
Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss,
cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77871
Summary:
The frontend guarantees that coherent accesses have the
corresponding cache policy bits set (glc, dlc).
Therefore there is no need for extra instructions
that invalidate the cache.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78800
Summary:
This patch maps IR operations for sdiv & udiv to the
@llvm.aarch64.sve.[s|u]div intrinsics.
A ptrue must be created during lowering as the div instructions
have only a predicated form.
Patch contains changes by Andrzej Warzynski.
Reviewers: sdesmalen, c-rhodes, efriedma, cameron.mcinally, rengolin
Reviewed By: efriedma
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, andwar, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78569
The MCELFObjectWriter::setRType## methods are always used together to
build a complete MIPS N64 ABI "chain" of relocations. Using a single
function for this task makes the code less verbose.
Summary:
Changing all mnemonics to match assembly instructions to simplify mnemonic
naming rules. This time update all floating-point arithmetic instructions.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D78768
This was backwards from what was intended, and was missing a test. We should
perhaps just ignore the FP mode here, since it shouldn't be legal to mix code
with different default modes in the absence of strictfp.
If f32 denormals were enabled pre-gfx9, we would still try to
implement this with v_max_f32. Pre-gfx9, these instructions ignored
the denormal mode and did not flush. Switch to the multiply form for
f32 as a workaround which should always work in any case.
This fixes conformance failures when the library implementation of
fmin/fmax was accidentally not inlined, forcing the assumption of no
flushing on targets where denormals are not enabled by default. This
is a workaround, since really we should not be mixing code with
different FP mode expectations, but prefer the lowering that will work
in any mode.
Now this will always use max to implement canonicalize on gfx9+. This
is only really beneficial for f64. For f32/f16 it's a neutral choice
(and worse in terms of code size in 1 case), but possibly worse for
the compiler since it does add an extra register use operand. Leave
this change for later.
1. Use Subtarget.isUsingPCRelativeCalls() in LowerConstantPool to
check if using PCRelative addressing.
2. Change MO_GOT_FLAG = 32 to MO_GOT_FLAG = 8 in PPC.h to use
consecutive bits.
Differential Revision: https://reviews.llvm.org/D78406
This avoids more long lists of register classes that have to be updated
every time we add a new one. NFC.
Differential Revision: https://reviews.llvm.org/D78570
12994a70cf did this for 128-bit classes:
SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds
the additional non-allocatable TTMP registers. There's no point in
allocating SReg_128 vregs. This shrinks the size of the classes
regalloc needs to consider, which is usually good.
This patch extends it to all classes > 64 bits, for consistency.
Differential Revision: https://reviews.llvm.org/D78622
This patch changes the FP conversion intrinsics to take a predicate
that matches the number of lanes for the vector with the widest element
type as opposed to using <vscale x 16 x i1>.
For example:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f16(<vscale x 4 x float>, <vscale x 4 x i1>, <vscale x 8 x half>)```
now uses <vscale x 4 x i1> instead of <vscale x 16 x i1>
And similar for:
```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f64(<vscale x 4 x float>, <vscale x 2 x i1>, <vscale x 2 x double>)```
where the predicate now matches the wider type, so <vscale x 2 x i1>.
Reviewers: efriedma, SjoerdMeijer, paulwalker-arm, rengolin
Reviewed By: efriedma
Tags: #clang
Differential Revision: https://reviews.llvm.org/D78402
Do not count the presence of debug insts against the limit set by
LdStLimit, and allow the optimizer to find matching insts by skipping
over debug insts.
Differential Revision: https://reviews.llvm.org/D78411
Summary:
The findSuitableCompare method can fail if debug instructions are
present in the MBB -- fix this by using helpers to skip over debug
insts.
Reviewers: aemerson, paquette
Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78265
Summary:
This fixes several instances in which condbr optimization was missed
due to a debug instruction appearing as a bogus NZCV clobber.
Reviewers: aemerson, paquette
Subscribers: kristof.beyls, hiraditya, jfb, danielkiss, aprantl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78264
Summary:
Use the variants of these APIs which skip over debug instructions. This
is mostly a cleanup, but it does fix a debug-variance issue which causes
addsub-shifted.ll and addsub_ext.ll to fail when debug info is inserted
by -mir-debugify.
Reviewers: aemerson, paquette
Subscribers: kristof.beyls, hiraditya, jfb, danielkiss, llvm-commits, aprantl
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78262
Summary:
Fix an issue where the presence of debug info could disable the ccmp
optimization due to findConvertibleCompare failing too early (the error
is "Can't create ccmp with multiple uses", where the "use" is a
DBG_VALUE inst).
Depends on D78151.
Reviewers: t.p.northover, paquette, aemerson
Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78156
Summary:
Fix an issue where the presence of debug info could disable a peephole
optimization due to areCFlagsAccessedBetweenInstrs returning the wrong
result.
In test/CodeGen/AArch64/arm64-csel.ll, the issue was found in the
function @foo5, in which the first compare could successfully be
optimized but not the second.
Reviewers: t.p.northover, eastig, paquette
Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, dsanders, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78157
Summary:
Fix an issue where the presence of debug info could disable a peephole
optimization in optimizeCompareInstr due to canInstrSubstituteCmpInstr
returning the wrong result.
Depends on D78137.
Reviewers: t.p.northover, eastig, paquette
Subscribers: kristof.beyls, hiraditya, danielkiss, aprantl, llvm-commits, dsanders
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78151
Summary:
These helpers are exercised by follow-up commits in this patch series,
which is all about removing CodeGen differences with vs. without debug
info in the AArch64 backend.
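For a flavour of what such a helper might look like (an illustrative shape, not the actual in-tree signatures):
```
#include "llvm/CodeGen/MachineBasicBlock.h"
#include <algorithm>

// Find the first non-debug instruction in [I, E), so that inserting
// DBG_VALUEs cannot change what an optimization sees.
llvm::MachineBasicBlock::iterator
firstNonDebug(llvm::MachineBasicBlock::iterator I,
              llvm::MachineBasicBlock::iterator E) {
  return std::find_if(I, E, [](const llvm::MachineInstr &MI) {
    return !MI.isDebugInstr();
  });
}
```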
Reviewers: fhahn, aprantl, jpaquette, paquette
Subscribers: kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78260
Preserving liveness can be useful even late in the pipeline, if we're
doing substantial optimization work afterwards. (See, for example,
D76065.) Teach MachineOutliner how to correctly set live-ins on the
basic block in outlined functions.
Differential Revision: https://reviews.llvm.org/D78605
Currently an indirect call produces the following sequence in PCRelative mode:
extern void function( );
extern void (*ptrfunc) ( );
void g() {
ptrfunc=function;
}
void f() {
(*ptrfunc) ( );
}
Producing
paddi 3, 0, .LC0@PCREL, 1
ld 3, 0(3)
std 2, 24(1)
ld 12, 0(3)
mtctr 12
bctrl
ld 2, 24(1)
Though the caller does not use or preserve r2, it is still saved and restored
across a function call. This patch removes these redundant saves and
restores for indirect calls.
Differential Revision: https://reviews.llvm.org/D77749
llvm/lib/Target/Hexagon/HexagonTargetObjectFile.cpp:296:11: warning: enumeration value 'ScalableVectorTyID' not handled in switch [-Wswitch]
switch (Ty->getTypeID()) {
^
Summary:
This recommits the reversion of https://reviews.llvm.org/D75039.
Consensus appears to be in favour of assembly-time resolution of
these ADR and LDR relocations, in line with GNU. The previous
backout broke many lld tests, now fixed by Peter Smith in
61bccda9d9.
Reviewers: psmith
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78301
Add initial support for PC Relative addressing to get jump table base
address instead of using TOC.
Differential Revision: https://reviews.llvm.org/D75931
If a 16-bit thumb STM with writeback stores the base register but it isn't the
first register in the list, then an unknown value is stored. The load/store
optimizer knows this and generates a 32-bit STM without writeback instead, but
thumb2 size reduction converts it into a 16-bit STM. Fix this by having thumb2
size reduction notice such STMs and leave them as they are.
Differential Revision: https://reviews.llvm.org/D78493
This adds some extra processing into the pre-RA ARM load/store optimizer
to detect and merge MVE loads/stores with adds of the same base. We
don't always turn these into a post-inc during ISel: because the DAG is
a graph, we don't always have an order to use for the nodes, so we
don't know which nodes to make post-inc and which to use the new
post-inc of. After ISel, we have an order that we can use to post-inc
the following instructions.
So this looks for a load/store with a starting offset of 0, and an
add/sub from the same base, plus a number of other loads/stores. We then
do some checks and convert the zero offset load/store into a postinc
variant. Any loads/stores after it have the offset subtracted from their
immediates. For example:
  Before:     After:
  LDR #4      LDR #4
  LDR #0      LDR_POSTINC #16
  LDR #8      LDR #-8
  LDR #12     LDR #-4
  ADD #16
It only handles MVE loads/stores at the moment. Normal loads/store will
be added in a followup patch, they just have some extra details to
ensure that we keep generating LDRD/LDM successfully.
Differential Revision: https://reviews.llvm.org/D77813
Add 96-bit, 160-bit and 256-bit AReg classes to match VReg and SReg.
NFC as far as I know, but it may avoid weird legalization problems.
Differential Revision: https://reviews.llvm.org/D78348
Finding the loop tripcount is the first crucial step in preparing a loop for
tail-predication, and this adds a debug message if a tripcount cannot be found.
And while I was at it, I added some more comments here and there.
Differential Revision: https://reviews.llvm.org/D78485
Summary:
Changing all mnemonics to match assembly instructions to simplify mnemonic
naming rules. This time update all shift operation instructions. This also
corrects the instructions' operation kinds.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D78468
Summary:
VE uses the identical names "%s0-63" for all generic registers. Change to use
the alternative-name mechanism for all generic registers instead of hard-
coding them.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D78174
This patch exploits the rldimi instruction for patterns like
`or %a, 0b000011110000`, which saves instructions when the
operand has only one use, compared with `li-ori-sldi-or`.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D77850
FMLA/FMLS f16 indexed patterns added.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45467
Removed redundant v2f32 vector_extract indexed pattern since
Instruction Selection is able to match v4f32 instead.
xray_instr_map contains absolute addresses of sleds, which are relocated
by `R_*_RELATIVE` when linked in -pie or -shared mode.
By making these addresses relative to PC, we can avoid the dynamic
relocations and remove the SHF_WRITE flag from xray_instr_map. We can
thus save VM pages containing xray_instr_map (because they are not
modified).
This patch changes x86-64 and bumps the sled version to 2. Subsequent
changes will change powerpc64le and AArch64.
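Roughly, a version-2 sled entry stores self-relative deltas instead of absolute addresses; the field names and layout below are illustrative assumptions, not the exact runtime struct:
```
#include <cstdint>

// Assumed shape of a v2 entry: addresses are encoded relative to the
// entry itself, so no R_*_RELATIVE dynamic relocation is needed and
// the section can stay read-only.
struct XRaySledEntryV2 {
  int64_t SledDelta;     // sled address minus the address of this field
  int64_t FunctionDelta; // function address, also self-relative
  uint8_t Kind;
  uint8_t AlwaysInstrument;
  uint8_t Version;       // bumped to 2 by this change
  uint8_t Padding[13];   // pad the entry to 32 bytes (assumed)
};
```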
Reviewed By: dberris, ianlevesque
Differential Revision: https://reviews.llvm.org/D78082
This is an optimization that applies to global addresses and
allows for the following transformation:
Convert this:
paddi r3, 0, symbol@PCREL, 1
ld r4, 8(r3)
To this:
pld r4, symbol@PCREL+8(0), 1
An instruction is saved and the linker can do the addition when
the symbol is resolved.
Differential Revision: https://reviews.llvm.org/D76160
This was yet another function that had to be updated whenever you added
a new register class. Remove it by refactoring its only caller to use
standard helper functions from SIRegisterInfo.
Differential Revision: https://reviews.llvm.org/D78557
Summary:
Move the declarations of getThe<Name>Target() functions into a new header in
TargetInfo and make users of these functions include this new header in
order to follow other architectures.
Differential Revision: https://reviews.llvm.org/D78543
The logic in ARMParallelDSP is set up to merge two 16-bit loads into
a 32-bit load and feed them into the smlads. This requires that four
loads are combined for the four inputs, but there wasn't actually a
check for this.
Differential Revision: https://reviews.llvm.org/D78492
Summary:
Before this patch, `relaxInstruction` takes three arguments, the first
argument refers to the instruction before relaxation and the third
argument is the output instruction after relaxation. There are two quite
strange things:
1) The first argument's type is `const MCInst &`, the third
argument's type is `MCInst &`, but they may be aliased to the same
variable
2) The backends of ARM, AMDGPU, RISC-V, Hexagon assume that the third
argument is a fresh uninitialized `MCInst` even if `relaxInstruction`
may be called like `relaxInstruction(Relaxed, STI, Relaxed)` in a
loop.
In this patch, we drop the third argument and let `relaxInstruction`
directly modify the given instruction. This patch also fixes the bug
https://bugs.llvm.org/show_bug.cgi?id=45580, which was introduced by D77851 and
broke the assumption of ARM, AMDGPU, RISC-V and Hexagon.
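The shape of the interface change, simplified to free-function declarations here:
```
#include "llvm/MC/MCInst.h"
#include "llvm/MC/MCSubtargetInfo.h"

// Before: a separate output parameter that callers could alias with
// the input, e.g. relaxInstruction(Relaxed, STI, Relaxed) in a loop.
void relaxInstructionOld(const llvm::MCInst &Inst,
                         const llvm::MCSubtargetInfo &STI,
                         llvm::MCInst &Res);

// After: relax the given instruction in place, so no backend can
// mistake the output for a fresh, uninitialized MCInst.
void relaxInstructionNew(llvm::MCInst &Inst,
                         const llvm::MCSubtargetInfo &STI);
```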
Reviewers: Razer6, MaskRay, jyknight, asb, luismarques, enderby, rtaylor, colinl, bcain
Reviewed By: Razer6, MaskRay, bcain
Subscribers: bcain, nickdesaulniers, nathanchance, wuzish, annita.zhang, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, tpr, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78364
For a test case in this patch like the one below:
struct t { int a; } __attribute__((preserve_access_index));
int foo(void *);
int test(struct t *arg) {
long param[1];
param[0] = (long)&arg->a;
return foo(param);
}
The IR right before BPF SimplifyPatchable phase:
%1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
%2:gpr = LDD killed %1:gpr, 0
%3:gpr = ADD_rr %0:gpr(tied-def 0), killed %2:gpr
STD killed %3:gpr, %stack.0.param, 0
After SimplifyPatchable phase, the incorrect IR is generated:
%1:gpr = LD_imm64 @"llvm.t:0:0$0:0"
%3:gpr = ADD_rr %0:gpr(tied-def 0), killed %1:gpr
CORE_MEM killed %3:gpr, 306, %0:gpr, @"llvm.t:0:0$0:0"
Note that the CORE_MEM pseudo op is introduced to encode
memory operations related to CORE. In the above, we intend
to check whether we have a store like
*(%3:gpr + 0) = ...
and if this is the case, we could replace it with
*(%0:gpr + @"llvm.t:0:0$0:0") = ...
Unfortunately, in the above, the IR for the store is
*(%stack.0.param + 0) = %3:gpr
so the transformation should not happen.
Note that we won't have a problem if the actual CORE
dereference (arg->a) happens.
This patch fixes the problem by skipping the CORE optimization if
the use of the ADD_rr result is not the base address of the store
operation.
Differential Revision: https://reviews.llvm.org/D78466
- Adds support for comments on outlined functions, recording the conditions under which they were outlined (e.g. thunks, tail calls)
- Adapts emitFunctionHeader to print a comment next to the header if the target specifies one, based on information in MachineFunctionInfo
- Adds a MIR test for function annotation
Differential Revision: https://reviews.llvm.org/D78062
Adds support for passing a ByVal formal argument completely on the stack
(ie after all argument registers are exhausted).
Differential Revision: https://reviews.llvm.org/D78263
Summary:
Similar to the CallLowering class used for lowering LLVM IR calls to MIR calls,
we introduce a separate class for lowering LLVM IR inline asm to MIR INLINEASM.
There is no functional change yet, all existing tests should pass.
Reviewers: arsenm, dsanders, aemerson, volkan, t.p.northover, paquette
Reviewed By: aemerson
Subscribers: gargaroff, wdng, mgorny, rovka, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78316
Previously, the AVR backend would put functions in .progmem.data. This
is probably a regression from when functions still lived in address
space 0. With this change, only global constants are placed in
.progmem.data.
This is not complete: avr-gcc additionally respects -fdata-sections for
progmem global constants, which LLVM doesn't yet do. But fixing that is
a bit more complicated (and I believe other backends such as RISC-V
might also have similar issues).
Differential Revision: https://reviews.llvm.org/D78212
We don't need all of MCStreamer.h, just FormattedStream.h. The rest can be replaced with forward declarations.
X86WinAllocaExpander.cpp had an implicit dependency on MapVector.h which I've added locally.
Summary:
The SVE masked load and store intrinsics introduced in D76688 rely on
common llvm.masked.load/store nodes. This patch creates new ISD nodes
for LD1(S) & ST1 to remove this dependency.
Additionally, this adds support for sign & zero extending
loads and truncating stores.
Reviewers: sdesmalen, efriedma, cameron.mcinally, c-rhodes, rengolin
Reviewed By: efriedma
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, andwar, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78204
Summary:
This commit reverts https://reviews.llvm.org/D75039. Consensus appears to
be in favour of assembly-time resolution of these ADR and LDR relocations,
in line with GNU.
Reviewers: psmith
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78301
The API for shuffles and reductions uses generic Type parameters,
instead of VectorType, and so assertions and casts are used a lot.
This patch makes those types explicit, which means that the clients
can't be lazy, but results in less ambiguity, and that can only be a
good thing.
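A small sketch of the difference (simplified signatures from that era of the API, not the actual declarations):
```
#include "llvm/IR/DerivedTypes.h"
#include "llvm/Support/Casting.h"

// Before: a generic Type* forces an assert-and-cast in every
// implementation.
unsigned lanesBefore(llvm::Type *Ty) {
  auto *VecTy = llvm::cast<llvm::VectorType>(Ty); // asserts at runtime
  return VecTy->getNumElements();
}

// After: the signature itself guarantees a vector type.
unsigned lanesAfter(llvm::VectorType *VecTy) {
  return VecTy->getNumElements();
}
```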
Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45562
Differential Revision: https://reviews.llvm.org/D78357
Summary:
In SjLj exception mode, lowering creates a new landingpad BB for the old landingpad BB and uses an indirect branch to jump to the old landingpad BB.
So we should add two endbr instructions for this exception model.
Reviewers: hjl.tools, craig.topper, annita.zhang, LuoYuanke, pengfei, efriedma
Reviewed By: LuoYuanke
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77124
Summary:
When we encode an instruction, we need to know the number of bytes being
emitted to determine the fixups in `X86MCCodeEmitter::emitImmediate`.
There are only two callers for `emitImmediate`: `emitMemModRMByte` and
`encodeInstruction`.
Before this patch, we kept track of the current byte being emitted
by passing a reference parameter `CurByte` across all the `emit*`
functions, which is ugly and unnecessary. For example, we don't have any
fixups when emitting prefixes, so we don't need to track this value.
In this patch, we use `StartByte` to record the initial status of the
streamer, and use `OS.tell()` to get the current status of the streamer
when we need to know the number of bytes being emitted. On the one hand,
this eliminates the parameter `CurByte` for most `emit*` functions; on
the other hand, it makes things clear: only pass the parameter when we
really need it.
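A sketch of the new scheme, with simplified names:
```
#include "llvm/Support/raw_ostream.h"

// Record the streamer position once on entry; derive "bytes emitted
// so far" on demand instead of threading a CurByte reference through
// every emit* helper.
void encodeInstructionSketch(llvm::raw_ostream &OS) {
  uint64_t StartByte = OS.tell(); // initial status of the streamer
  // ... emit prefixes, opcode, ModRM and immediates into OS ...
  uint64_t BytesEmitted = OS.tell() - StartByte;
  (void)BytesEmitted; // what emitImmediate needs for fixup offsets
}
```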
Reviewers: craig.topper, pengfei, MaskRay
Reviewed By: craig.topper, MaskRay
Subscribers: hiraditya, llvm-commits, annita.zhang
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78419
gcc may warn here because X86ISD::NodeType is specified as "unsigned",
but ISD::NodeType is a naked C enum (although passed as an "unsigned"
throughout SDAG).
getFauxShuffle attempts to combine INSERT_VECTOR_ELT(TRUNCATE/EXTEND(EXTRACT_VECTOR_ELT(x))) patterns into a target shuffle chain.
PR45604 identified an issue where the scalar was truncated to a size smaller than the destination vector element and then zero extended back, which requires the upper bits to be zeroed, which we don't currently do.
To avoid the bug I've added an early out in these truncation cases, a future commit should allow us to handle this by inserting the necessary SM_SentinelZero padding.
This is an enhancement to D77895 to avoid another
round-trip from XMM->GPR->XMM. This time we handle
the case of starting/ending with an f64 and casting
to signed i32 as the intermediate value.
It's a bit more involved than I initially assumed
because we need to use target-specific opcodes to
represent the non-standard cast ops.
Differential Revision: https://reviews.llvm.org/D78362
Reduce X86Subtarget.h/MCCodeEmitter.h/TargetMachine.h includes to forward declarations
Add explicit X86Subtarget.h/TargetMachine.h includes to X86AsmPrinter.cpp/X86MCInstLower.cpp
Remove unused MCSymbol forward declaration
We were only including MachineInstr.h for DebugLoc.h. This exposes an implicit include dependency in BTFDebug.h where I've had to add the MachineInstr.h include.
Summary:
The instruction in non-text section can not be executed, so they will not affect performance.
In addition, their encoding values are treated as data, so we should not touch them.
Reviewers: MaskRay, reames, LuoYuanke, jyknight
Reviewed By: MaskRay
Subscribers: annita.zhang, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77971
[MachineOutliner] fix test for excluding CFI and add test to include CFI in outlining
New test to check that we only outline CFI instructions if all CFI
instructions in the function would be captured by the outlining.
Adds x86 tests analogous to the AArch64 CFI tests.
Revision: https://reviews.llvm.org/D77852
Adds target MachineFunctionInfo to the YAML for MIR files, starting with hasRedZone.
Split out of: D78062
Based on implementation for MachineFunctionInfo for WebAssembly
Differential Revision: https://reviews.llvm.org/D78173
Patch by Andrew Litteken! (AndrewLitteken)
This reverts commit 17b1869b72.
It is an attempt to fix the build failure linked below.
The patch differs from the original one reviewed at
https://reviews.llvm.org/D77435 only in the use of std::make_tuple
in building the return value of `findAddrModeSVELoadStore`:
- return {IsRegReg ? Opc_rr : Opc_ri, NewBase, NewOffset};
+ return std::make_tuple(IsRegReg ? Opc_rr : Opc_ri, NewBase,
+                        NewOffset);
the original patch submitted at
fc4e954ed5
was failing the following build:
http://lab.llvm.org:8011/builders/clang-armv7-linux-build-cache/builds/29420/
with error:
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:1439:10:
error: chosen constructor is explicit in copy-initialization
return {IsRegReg ? Opc_rr : Opc_ri, NewBase, NewOffset};
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/tuple:479:19:
note: explicit constructor declared here
constexpr tuple(_UElements&&... __elements)
^
1 error generated.
Summary:
This change fixes an issue where the dagcombine incorrectly used an addressing mode with scaled offsets (indices), instead of unscaled offsets.
Those addressing modes do not exist for `prfh`, `prfw` and `prfd`, hence we can reuse `prfb` because that has unscaled offsets, and because the pseudo-code in the XML spec suggests that the element size is not used for the amount of data that is prefetched by the instruction.
FWIW, GCC also emits a `prfb` for these cases.
Reviewers: sdesmalen, andwar, rengolin
Reviewed By: sdesmalen
Subscribers: tschuett, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78069
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: craig.topper, sdesmalen, efriedma, RKSimon
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77264
Summary:
Fixed wrong conditions for generating (S[LR]I X, Y, C2) from
(or (and X, BvecC1), (lsl Y, C2)) and added ISel nodes to lower to S[LR]I. The
optimisation is also enabled by default now.
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77387
Add initial support for PC Relative addressing for global values that
require GOT indirect addressing. This patch adds PCRelative support for
global addresses that may not be known at link time and may require
access through the GOT.
Differential Revision: https://reviews.llvm.org/D76064
Summary:
Introduce new helper functions getVGPRClassForBitWidth,
getAGPRClassForBitWidth, getSGPRClassForBitWidth and use them to
refactor various other functions that all contained their own lists of
valid register class widths. NFC.
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78311
Summary:
AIX symbols have a qualified name (qualname) and an unqualified name. The stock
getSymbol could only return the unqualified name, which led us to patch many
call sites (lowerConstant, getMCSymbolForTOCPseudoMO).
So we should try to address this problem on the callee
side (getSymbol) and clean up the caller side instead.
Note: this is a "mostly" NFC patch, with a fix for the original
lowerConstant behavior.
Differential Revision: https://reviews.llvm.org/D78045
Summary:
Use more logic and fewer tables. This reduces the line count and
reduces the effort required to introduce more register classes of
different sizes in future.
Reviewers: arsenm, rampitec, nhaehnle
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78351
The previous patch didn't handle the early return in `emitREXPrefix` correctly,
which caused the REX prefix not to be emitted for instructions without
operands. This patch includes the fix for that.
This does for G_EXTRACT_VECTOR_ELT what 588bd7be36 did for G_TRUNC.
Ideally types without a corresponding register class wouldn't reach
here, but we're currently missing some (in particular a 192-bit class
is missing).
This allows targets to know exactly which operands are contributing to
the dependency, which is required for targets with per-operand
scheduling models.
Differential Revision: https://reviews.llvm.org/D77135
Add patterns that use a normal, non-wrapping, add and sub nodes along
with an arm vshr imm node.
Differential Revision: https://reviews.llvm.org/D77065
Summary:
We determine the REX prefix used by an instruction in `determineREXPrefix`,
and this value is used in `emitMemModRMByte` and used as the return
value of `emitOpcodePrefix`.
Before this patch, REX was passed by reference to `emitPrefixImpl`, which
is strange and unnecessary; e.g., we have to write
```
bool Rex = false;
emitPrefixImpl(CurOp, CurByte, Rex, MI, STI, OS);
```
in `emitPrefix` even if `Rex` will not be used.
So we let HasREX be the return value of `emitPrefixImpl`. The HasREX is passed
from `emitREXPrefix` to `emitOpcodePrefix` and then to
`emitPrefixImpl`. This makes sense since REX is a kind of opcode prefix
and of course is a prefix.
Reviewers: craig.topper, pengfei
Reviewed By: craig.topper
Subscribers: annita.zhang, craig.topper, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78276
If we AND with a constant like 0xFFFFFFC00000, for now we use several
instructions to generate this 48-bit constant and finally an "and". However, we
could do it with two rotate instructions instead.
MB ME MB+63-ME
+----------------------+ +----------------------+
|0000001111111111111000| -> |0000000001111111111111|
+----------------------+ +----------------------+
0 63 0 63
Rotate left by ME + 1 bits first, then mask it with (MB + 63 - ME, 63),
and finally rotate back. Notice that we need to wrap around modulo 64 bits
for the wrapping case.
Reviewed by: ChenZheng, Nemanjai
Differential Revision: https://reviews.llvm.org/D71831
Add DestructiveBinaryImm SQSHLU patterns and tests. These patterns allow the SQSHLU instruction to match with a MOVPRFX.
Differential Revision: https://reviews.llvm.org/D76728
This patch adds PC Relative support for global values that are known at link
time. If a global value requires access through the global offset table (GOT)
it is not covered in this patch.
Differential Revision: https://reviews.llvm.org/D75280
These are needed as a counterpart for VGPR subregs even though
there are no scalar instructions which can operate on 16-bit values.
When we materialize a constant, that is done into an SGPR, and that
SGPR may/will be copied into a 16-bit VGPR subreg. Such a
copy is illegal. There are also similar problems if a source
operand of a 16-bit VALU instruction is an SGPR. In addition,
we need to get a register with a lo16 subregister of an SGPR
RC during selection, and this fails as well.
All of that makes me believe we need these subregisters as
syntactic glue.
Differential Revision: https://reviews.llvm.org/D78250
Fix for the address optimization for gathers and scatters, which would in
some complex cases push instructions not to the vector loop preheader,
but to other locations as well, which led to a scrambled order and the
compilation failing.
This patch ensures that said instructions are always pushed to the end
of the vector loop preheader.
Differential Revision: https://reviews.llvm.org/D78293
Summary:
When doing the conversion MachineInstr -> MCInst, we should ignore the
implicit operands; this will expose more opportunities for InstAlias.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D77118
Summary:
Changing all mnemonics to match assembly instructions to simplify mnemonic
naming rules. This time update all fixed-point arithmetic instructions.
This also corrects the bswp operand type.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D78177
Summary:
The INLINEASM MIR instructions use immediate operands to encode the values of some operands.
The MachineInstr pretty printer function already handles those operands and prints human readable annotations instead of the immediates. This patch adds similar annotations to the output of the MIRPrinter, however uses the new MIROperandComment feature.
Reviewers: SjoerdMeijer, arsenm, efriedma
Reviewed By: arsenm
Subscribers: qcolombet, sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78088
Summary:
The functions in X86MCCodeEmitter have so many parameters that they look
messy, and some parameters are unnecessary. This is the first patch to
reduce their parameters.
The following operations are cheap:
```
unsigned Opcode = MI.getOpcode();
const MCInstrDesc &Desc = MCII.get(Opcode);
uint64_t TSFlags = Desc.TSFlags;
```
So if we pass a `MCInst`, we don't need to pass `MCInstrDesc`;
if we pass a `MCInstrDesc`, we don't need to pass `TSFlags`.
Reviewers: craig.topper, MaskRay, pengfei
Reviewed By: craig.topper
Subscribers: annita.zhang, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78180
Summary:
The renaming is necessary to make the naming scheme uniform with the other
gather/scatter load/store SVE intrinsics.
The naming of variables and functions has been adapted to make it
explicit whether we are dealing with a scalar offset (which is
unscaled) or an index (which is scaled according to the data type of
the lanes of the vector).
Reviewers: andwar, sdesmalen, rengolin
Reviewed By: andwar
Subscribers: tschuett, hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77839
We have added code to correct the .localentry values on assignments. However, we
never clear the set so presumably it will still contain the (now dangling)
MCSymbol pointers across a call to finish() and reset() in the streamer.
This is based on my speculation that it is the reason we are getting
segmentation faults mentioned in https://bugs.llvm.org/show_bug.cgi?id=45366
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45366
Differential revision: https://reviews.llvm.org/D78196
This moves v32i16/v64i8 to a model consistent with how we
treat integer types with avx1.
This does change the ABI for types vXi16/vXi8 vectors larger than
512 bits to pass in multiple zmms instead of multiple ymms. We'd
already hacked some code to make v64i8/v32i16 pass in zmm.
The cost model is still a bit of a mess. In some places I tried to
match existing behavior. But really we need to account for
splitting and concatenating costs. The cost model for shuffles is
especially pessimistic.
Differential Revision: https://reviews.llvm.org/D76212
-Consistently name the functions as split*
-Add a helper for doing the two extractSubvector calls and determining the size of the split
-Use getSplitDestVTs to get the result type for the split node.
-Move the binary and unary helper to one place in the file near the extractSubvector functions. Left the VSETCC one near LowerVSETCC since that's its only caller.
-Remove the 256/512 wrappers that just had asserts. I don't think they provided a lot of value, and now that the routines are called split*, what the call sites do is more obvious.
-Make the unary routine support different source and dest types to support D76212.
-Add some weaker asserts into the helpers to make up for losing the very specific asserts from the 256/512 wrappers.
Differential Revision: https://reviews.llvm.org/D78176
Summary:
No error or warning is emitted when specific reserved registers are
written to in inline assembly. Therefore, writes to the program counter
or to the frame pointer, for instance, were permitted, which could have
led to undesirable behaviour.
Example:
int foo() {
register int a __asm__("r7"); // r7 = frame-pointer in M-class ARM
__asm__ __volatile__("mov %0, r1" : "=r"(a) : : );
return a;
}
In contrast, GCC issues an error in the same scenario.
This patch detects writes to specific reserved registers in inline
assembly for ARM and emits an error in such case. The detection works
for output and input operands. Clobber operands are not handled here:
they are already covered at a later point in
AsmPrinter::emitInlineAsm(const MachineInstr *MI). The registers
covered are: program counter, frame pointer and base pointer.
This is ARM only. Therefore the implementation of other targets'
counterparts remains to be done.
Reviewers: efriedma
Reviewed By: efriedma
Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76848
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.
This is a useful workaround for a limitation in the structurizer,
which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D77198
This restores commit 2ada8e2525.
Originally reverted with commit 44e09b59b8.
Summary:
Changing all mnemonics to match assembly instructions to simplify mnemonic
naming rules. This time update all fixed-point arithmetic instructions.
This also corrects smax/smin code generation.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D77856
This reverts commit 2ada8e2525.
Buildbots produced compilation errors which I was not able to quickly
reproduce locally. Need more time to investigate.
An irreducible SCC is one which has multiple "header" blocks, i.e., blocks
with control-flow edges incident from outside the SCC. This pass converts an
irreducible SCC into a natural loop by introducing a single new header
block and redirecting all the edges on the original headers to this
new block.
This is a useful workaround for a limitation in the structurizer,
which produces incorrect control flow in the presence of
irreducible regions. The AMDGPU backend provides an option to
enable this pass before the structurizer, which may eventually be
enabled by default.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D77198
Summary:
Only first-faulting and non-faulting loads read/update the FFR register
and hence only the corresponding SDNodes should be decorated with
`SDNPOptInGlue` and `SDNPOutGlue`. This patch:
* removes SDNPOptInGlue from regular loads and stores (FFR is not read)
* adds SDNPOutGlue to first-faulting and non-faulting loads (FFR is
both read and updated)
Differential Revision: https://reviews.llvm.org/D77724
The pass was incorrectly reverting back to a "T" when something wrote
to VPR inside an "E" block. This is not the correct behaviour; the
predicate should stay the same.
Differential Revision: https://reviews.llvm.org/D77798
It can be used to avoid passing the begin and end of a range.
This makes the code shorter and is consistent with other
wrappers we already have.
Differential revision: https://reviews.llvm.org/D78016
Summary:
Creates the SVEIntrinsicOpts pass. In this patch, the pass tries
to remove unnecessary reinterpret intrinsics which convert to
and from svbool_t (llvm.aarch64.sve.convert.[to|from].svbool)
For example, the reinterprets below are redundant:
%1 = call <vscale x 16 x i1> @llvm.aarch64.sve.convert.to.svbool.nxv4i1(<vscale x 4 x i1> %a)
%2 = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> %1)
The pass also looks for ptest intrinsics and phi instructions where
the operands are being needlessly converted to and from svbool_t.
Reviewers: sdesmalen, andwar, efriedma, cameron.mcinally, c-rhodes, rengolin
Reviewed By: efriedma
Subscribers: mgorny, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, danielkiss, cfe-commits, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76078
The _GLOBAL_OFFSET_TABLE_ in SysVr4 ELF is conventionally the base of the
.got or .got.prel sections. Expressions such as _GLOBAL_OFFSET_TABLE_
- (.L1 +8) are used in assembler code to calculate offsets into the .got.
At present MC outputs a R_ARM_REL32 with respect to the
_GLOBAL_OFFSET_TABLE_ symbol, whereas gas outputs a R_ARM_BASE_PREL
relocation with respect to the _GLOBAL_OFFSET_TABLE_ symbol. While both are
correct, the R_ARM_REL32 depends on the value of the _GLOBAL_OFFSET_TABLE_
symbol, whereas the R_ARM_BASE_PREL relocation is independent of the symbol.
The R_ARM_BASE_PREL is therefore slightly more robust to linkers that may
not follow the conventional placement of _GLOBAL_OFFSET_TABLE_; for example,
LLD for some time defined _GLOBAL_OFFSET_TABLE_ to 0.
Differential Revision: https://reviews.llvm.org/D46319
Summary:
The iz pattern is a special case of the im pattern. The im pattern
has been supported since https://reviews.llvm.org/D77769, so this
follow-up patch removes the iz pattern.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D77770
Ports the existing DAG combines, minus the simplify demanded bits
which seems to have no equivalent now. Without these, this isn't
particularly helpful in most of the IR sample cases.
Summary:
In CFGStackify, `fixUnwindMismatches` function fixes unwind destination
mismatches created by `try` marker placement. For example,
```
try
...
call @qux ;; This should throw to the caller!
catch
...
end
```
When `call @qux` is supposed to throw to the caller, it is possible that
it is wrapped inside a `catch` so in case it throws it ends up unwinding
there incorrectly. (Also it is possible `call @qux` is supposed to
unwind to another `catch` within the same function.)
To fix this, we wrap this inner `call @qux` with a nested
`try`-`catch`-`end` sequence, and within the nested `catch` body, branch
to the right destination:
```
block $l0
try
...
try ;; new nested try
call @qux
catch ;; new nested catch
local.set n ;; store exnref to a local
br $l0
end
catch
...
end
end
local.get n ;; retrieve exnref back
rethrow ;; rethrow to the caller
```
The previous algorithm placed the nested `try` right before the `call`.
But it is possible that there are stackified instructions before the
call from which the call takes arguments.
```
try
...
i32.const 5
call @qux ;; This should throw to the caller!
catch
...
end
```
In this case we have to place `try` before those stackified
instructions.
```
block $l0
try
...
try ;; this should go *before* 'i32.const 5'
i32.const 5
call @qux
catch
local.set n
br $l0
end
catch
...
end
end
local.get n
rethrow
```
We correctly handle this in the first normal `try` placement phase
(`placeTryMarker` function), but failed to handle this in this
`fixUnwindMismatches`.
Reviewers: dschuff
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77950
Allow grouping multiple basic blocks in the same section.
This allows specifying BasicBlock clusters like the following example:
!foo
!!0 1 2
!!4
This places basic blocks 0, 1, and 2 in one section in this order, and
places basic block #4 in a single section of its own.
Since 1725f28841, this should check
isFMADLegalForFAddFSub rather than the plain isOperationLegal.
This would assert in a subset of cases due to an oddity in how FMAD is
selected. We will allow FMA formation pre-legalize, but not FMAD even
in cases where it would be valid.
The current hook requires passing in the root fadd/fsub. However, in
this distributed case, this would be far more complicated to pass in
the relevant operand. AMDGPU doesn't get any value from the node, and
only needs the type and is the only implementor, so I'm not sure why
we have this complexity. Just rename and expand the assert to avoid
the more complicated checks spread through the distribution logic.
The shuffle decoding is used by X86ISelLowering and
MCTargetDesc/X86InstComments. The latter used to be in a
separate InstPrinter library. The Utils library existed to allow
InstPrinter and CodeGen to share the shuffle decoding. Since
X86InstComments now lives in the MCTargetDesc, which CodeGen
already depends on, we can sink the shuffle decoding there as well.
Differential Revision: https://reviews.llvm.org/D77980
Summary:
Fold (shift (shift X, C2), C1) -> (shift X, (C1 + C2)) for logical as
well as arithmetic shifts. This is needed to prevent regressions from
an upcoming funnel shift expansion change.
While we're here, fold (VSRAI -1, C) -> -1 too.
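As a sanity check of the arithmetic behind the fold (it holds while the combined amount stays below the bit width; out-of-range amounts are handled separately):
```
#include <cassert>
#include <cstdint>

// (x >> c2) >> c1 == x >> (c1 + c2) for logical shifts when
// c1 + c2 < 32; the analogous identity holds for shl and sra.
void checkShiftFold(uint32_t x, unsigned c1, unsigned c2) {
  assert(c1 + c2 < 32 && "combined amount must stay in range");
  assert(((x >> c2) >> c1) == (x >> (c1 + c2)));
  // VSRAI on all-ones stays all-ones: arithmetic shift of -1 is -1
  // (arithmetic right shift on all mainstream targets).
  assert((int32_t(-1) >> c1) == -1);
}
```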
Reviewers: RKSimon, craig.topper
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77300
Improve the chances of folding the writemask into the combined shuffle by scaling a wider shuffle mask to match the root's original type.
This creates a few minor issues with variable shuffles, preventing combines of shuffles because of the more limited support for binary shuffle types. In most cases we're probably better off combining the shuffles and losing the writemask fold, but this isn't always going to be true.
Summary:
Updated CallPromotionUtils and impacted sites. Parameters that are
expected to be non-null, and return values that are guaranteed non-null,
were replaced with CallBase references rather than pointers.
Left FIXME in places where more changes are facilitated by CallBase, but
aren't CallSites: Instruction* parameters or return values, for example,
where the contract is that they are actually CallBase values.
Reviewers: davidxl, dblaikie, wmi
Reviewed By: dblaikie
Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77930
As discussed in PR36617:
https://bugs.llvm.org/show_bug.cgi?id=36617#c13
...we can avoid the likely slow round-trip from XMM to GPR to XMM
by using the vector versions of the convert instructions.
Based on experimental results from recent Intel/AMD chips, we don't
need to worry about triggering denorm stalls while operating on
garbage data in the high lanes with convert instructions, so this is
expected to always be as good or better perf than the scalar
instruction equivalent. FP exceptions are also not a concern because
strict code should not be using the regular SDAG opcodes.
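For illustration, a hypothetical C++ example of the pattern (not taken from the patch's tests):
```
// f32 -> i32 -> f32: scalar codegen uses cvttss2si (XMM->GPR) and
// then cvtsi2ss (GPR->XMM); the vector forms cvttps2dq + cvtdq2ps
// keep the value in XMM the whole time.
float roundTripThroughInt(float x) {
  return (float)(int)x;
}
```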
Differential Revision: https://reviews.llvm.org/D77895
A lot of vectorized code doesn't use masks so we shouldn't penalize them by not doing shuffle combining on avx512 targets.
I've added support for VALIGNQ/VALIGND and 512-bit SHUF128 to prevent some regressions. I also prevented recombining 256-bit SHUF128 to PERM2X128. We may not need to add 256-bit SHUF128 support, but I don't think I found any cases requiring that in my testing.
Differential Revision: https://reviews.llvm.org/D77928
-Drop llvm:: on MVT::i32
-Use getValueType instead of getSimpleValueType for an equality
check just because it's shorter and doesn't matter.
-Don't create a const SDValue & since it's cheap to copy.
-Remove explicit case from MVT enum to EVT.
-Add message to assert.
Selection would fail after the post legalize combiner put an illegal
zextload back together.
The base combiner has a parameter to allow only legal operations, but
it appears to be unused. I also don't see a nice way to remove a
single entry from all_combines, so just hack around this.
As proposed in D77881, we'll have the related widening operation,
so this name becomes too vague.
While here, change the function signature to take an 'int' rather
than 'size_t' for the scaling factor, add an assert for overflow of
32-bits, and improve the documentation comments.
This is the same as what was done to the CallLoweringInfo in
TargetLowering.h in r309159.
This is just a step on the way to replacing this with CallBase.
There was another issue introduced by this commit that the OP
initially missed. Namely, for functions that are free to use
R2 as a callee-saved register, we emit a TOC expression based
on the address of the GEP label without emitting the GEP label.
Since we only emit such expressions for the large code model, this
issue only surfaced there.
I have confirmed that with this fix, the kernel build is successful
with target "all".
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sdesmalen, efriedma, jonpa
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77265
Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration.
This exposed a couple of source files that were implicitly relying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.
These will be widened in the DAG. In the meantime, early
widening prevents otherwise possible vectorization of
such loads.
Differential Revision: https://reviews.llvm.org/D77835
If we're inserting into v2i8/v4i8/v8i8/v2i16/v4i16 style sub-128-bit vectors, ensure we don't use the SK_PermuteTwoSrc cost of the legalized value type. This is a follow-up to rG12c629ec6c59, which added equivalent sub-128-bit shuffle costs.
In LowerFP_TO_INTForReuse, when emitting `stfiwx`, alignment of 4 is
set for the `MachineMemOperand`, but RLI (ReuseLoadInfo)'s alignment is
not updated for the following loads.
It's related to failed alignment check reported in
https://bugs.llvm.org/show_bug.cgi?id=45297
Differential Revision: https://reviews.llvm.org/D77624
The transformation currently does not differentiate between explicit
and implicit kills. However, it is not valid to later simply clear
an implicit kill flag since the kill could be due to a call or return.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45374
Summary:
D74269 added debug info to newly created instructions, including calls
to `malloc` and `free`, by taking debug info from existing instructions
around, whose debug info may or may not be empty.
But there are cases where debug info is required by the IR verifier: when both
the caller and the callee functions have DISubprograms, meaning we
already have declarations to `malloc` or `free` with a DISubprogram
attached, newly created calls to `malloc` and `free` should have
non-empty debug info. This patch creates a non-empty dummy debug info in
this case to those calls to make the IR verifier pass.
Fixes https://bugs.llvm.org/show_bug.cgi?id=45461.
Reviewers: dschuff
Subscribers: aprantl, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77784
This is a fix for the previous patch 6c4b40def7.
In some cases it may be possible to have the compiler produce st_other=1 without
the compiler using mcpu=future, which should not be the case. This patch adds a
guard to make sure that if we are using st_other=1 then we are also compiling
for future CPU.
This is similar to what I recently did for getArithmeticReductionCost.
I'm trying to account for the narrowing from 512->256->128 as we go.
I've also added a new helper method getMinMaxCost that tries to
handle the cases where we have native min/max instructions and
fall back to cmp+select when we don't.
Differential Revision: https://reviews.llvm.org/D76634
When we try to select a SELECT_CC on Power9, we check if it can be matched to a
SETB instruction. In that function, we assert that the output type is i32/i64.
This is unnecessary as it is perfectly reasonable to have an i1 SELECT_CC.
Change that from an assert to an early exit condition.
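The change is roughly of this shape (illustrative sketch, not the exact in-tree code):
```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Bail out instead of asserting when the SELECT_CC result type is
// not i32/i64, e.g. a perfectly legal i1 SELECT_CC.
static SDValue trySETBSketch(EVT VT) {
  if (VT != MVT::i32 && VT != MVT::i64)
    return SDValue(); // not an error: fall back to generic lowering
  // ... attempt to match a SETB instruction here ...
  return SDValue();
}
```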
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45448
When storing to a stack location for outgoing args during call arg
lowering, we shouldn't be modifying SP, so cache the SP copy
vreg for subsequent uses.
Gives a 0.2% geomean code size improvement on CTMark.
Differential Revision: https://reviews.llvm.org/D77838
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: mcrosier, efriedma, sdesmalen
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77269
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: arsenm, efriedma, sdesmalen
Reviewed By: arsenm
Subscribers: wdng, arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77268
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: grosbach, efriedma, sdesmalen
Reviewed By: efriedma
Subscribers: hiraditya, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77271
v2i8/v4i8/v8i8 + v2i16/v4i16 all show up in vectorizer code, and by just using the legalized types (v16i8/v8i16) we're highly exaggerating the actual cost of the shuffle.
This adds the instruction encoding and mnemonics for the proposed
RISC-V Bit Manipulation extension (version 0.92). It is implemented with
each category of instruction as its own target feature, with the 'b'
extension feature enabling all options. Since this extension is not yet
ratified, all target features are prefixed with 'experimental-' to note
their status.
Differential Revision: https://reviews.llvm.org/D65649
Summary:
This patch adds support for handling of variadic functions for AIX.
This includes ensuring that we use and consume the correct type of
va_list (char *va_list) for AIX.
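For illustration, the kind of code this must handle; on AIX `va_list` is a plain `char *` walking the argument area:
```
#include <cstdarg>

// A variadic function whose lowering depends on the AIX 'char *'
// va_list convention (simple example, ignores alignment subtleties).
int sumInts(int n, ...) {
  va_list ap;
  va_start(ap, n);
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += va_arg(ap, int);
  va_end(ap);
  return total;
}
```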
Authored by: ZarkoCA
Reviewers: cebowleratibm, sfertile, jasonliu
Reviewed by: jasonliu
Differential Revision: https://reviews.llvm.org/D76130
Add initial support for PC Relative addressing for constant pool loads.
This includes adding a new relocation for @pcrel and adding a new PowerPC flag
to identify PC relative addressing.
Differential Revision: https://reviews.llvm.org/D74486