These two intrinsics are lowered to calls and so should prevent the formation of
CTR loops. In a subsequent patch, we will handle all currently known intrinsics
and prevent the formation of HW loops if any unknown intrinsics are encountered.
Differential revision: https://reviews.llvm.org/D68841
This is a fix for D69112, where we commoned up the logic for writing each CsectGroup.
However, that patch forgot to skip the Sections that are empty.
Reviewed by: daltenty, xingxue
Differential Revision: https://reviews.llvm.org/D69447
Summary:
(Split off of D67120)
SizeOpts/MachineSizeOpts changes for profile guided size optimization.
(A second try, after this was previously committed as r375254 and reverted in r375375.)
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69409
Emit a remarks section by default for the following formats:
* bitstream
* yaml-strtab
while still providing -remarks-section=<bool> to override the defaults.
I want to add the ability to rerun the outliner in certain cases, and I
thought this NFC change would make a subsequent change that allows
rerunning the outliner a cleaner diff.
Differential Revision: https://reviews.llvm.org/D69482
The static functions `positiveOffsetOpcode`, `negativeOffsetOpcode` and
`immediateOffsetOpcode` (lib/Target/ARM/Thumb2InstrInfo.cpp) can currently
return `0` as a default opcode, which is meaningless in this situation.
This patch replaces that default value with llvm_unreachable.
Reviewers: t.p.northover, tellenbach
Reviewed By: tellenbach
Subscribers: tellenbach, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69432
Patch By: Lorenzo Casalino <lorenzo.casalino93@gmail.com>
Summary:
Getelementptr has vector type if any of its operands are vectors
(the scalar operands being implicitly broadcast to all vector elements).
Extractelement applied to a vector getelementptr can be folded by
applying the extractelement in turn to all of the vector operands.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69379
The legalization of v2i1->i2 or v4i1->i4 bitcasts followed by a setcc can create an AND after the bitcast. If we're lucky enough that the input to the bitcast is a concat_vectors whose first operand is a setcc that can natively zero all the upper bits of a k-register, then we should replace the other operands of the concat_vectors with zero in order to remove the AND.
With the AND removed we might be able to use a kortest on the result.
Differential Revision: https://reviews.llvm.org/D69205
Currently we may interleave by more than the estimated trip count
coming from the profile or the computed maximum trip count. The solution is to
use the "best known" trip count, instead of only the exact one, in the interleaving analysis.
Patch by Evgeniy Brevnov.
Differential Revision: https://reviews.llvm.org/D67948
Summary:
Debug info affects the output from "opt -inline"; InlineFunction could
not handle llvm.dbg.value when it exists between alloca
instructions.
The problem was that the first alloca in a sequence of allocas was
handled differently from the subsequent alloca instructions. Now
all static alloca instructions are treated the same (being removed
if they have no uses), so it does not matter if there are dbg
instructions (or any other instructions) in between.
Fixes the issue: https://bugs.llvm.org/show_bug.cgi?id=43291
Patch by: yechunliang (Chris Ye)
Reviewers: bjope, jmorse, vsk, probinson, jdoerfert, mtrofin, aprantl, fhahn
Reviewed By: bjope
Subscribers: uabelho, ormris, aprantl, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68633
Summary:
An outstanding load with the same destination SGPR as the call could cause the
PC to be updated with a junk value on return.
Reviewers: arsenm, rampitec
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69474
This patch reworks the AIX call lowering to use CCState. Some defensive errors
are added in this patch to protect from emitting bad code for calling convention
logic that has not been implemented by design. The use of CCState follows the
precedent of other targets and enables the reuse of calling convention logic in
LowerFormalArguments, which will be rewritten to also use CCState in a later
patch.
Patch by Chris Bowler.
Differential Revision: https://reviews.llvm.org/D69101
This is an extra fold for a canonical form of uadd_sat, as shown in
D68651. It essentially selects uadd_sat from an add and a select.
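For reference, a minimal C++ sketch of the source-level shape that produces this add + select pattern (the function name is illustrative):

  #include <cstdint>
  #include <limits>

  // Saturating unsigned add written as add + compare + select; this is the
  // shape that can now be selected as a single uadd_sat node.
  uint32_t saturating_add(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;
    // If the add wrapped, sum < a; clamp to the maximum value.
    return sum < a ? std::numeric_limits<uint32_t>::max() : sum;
  }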
Differential Revision: https://reviews.llvm.org/D69244
Summary:
A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on
indirect function calls, using either the check mechanism (X86, ARM, AArch64)
or the dispatch mechanism (X86-64). The check mechanism requires a new calling
convention for the supported targets. The dispatch mechanism adds the target as
an operand bundle, which is processed by SelectionDAG. Another pass
(CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as
required by /guard:cf. This feature is enabled using the `cfguard` CC1 option.
Reviewers: thakis, rnk, theraven, pcc
Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65761
In the Pre-RA machine sinker, previously we were relying on all DBG_VALUEs
being immediately after the instruction that defined their operands. This
isn't a valid assumption, as a variable location change doesn't
necessarily correspond to where the value is computed. In this patch, we
collect DBG_VALUEs that might need sinking as we walk through a block,
and sink all of them if their defining instruction is sunk.
This patch adds some copy propagation too, so that if we sink a copy inst,
the now non-dominated paths can use the copy source for the variable
location.
Differential Revision: https://reviews.llvm.org/D58386
This enhances D69127 (rGe6c145e0548e3b3de6eab27e44e1504387cf6b53)
to handle the looser "any_extend" cast in addition to zext.
This is a prerequisite step for canonicalizing in the other direction
(narrow the popcount) in IR - PR43688:
https://bugs.llvm.org/show_bug.cgi?id=43688
This phi simplification transform was added with:
D45448
However, as shown in PR43802:
https://bugs.llvm.org/show_bug.cgi?id=43802
...we must be careful not to propagate poison when we do the substitution.
There might be some more complicated analysis possible to retain the overflow flag,
but it should always be safe and easy to drop flags (we have similar behavior in
instcombine and other passes).
Differential Revision: https://reviews.llvm.org/D69442
When we sink DBG_VALUEs between blocks, we simply move the DBG_VALUE
instruction to below the sunk instruction. However, we should also mark
the variable as being undef at the original location, to terminate any
earlier variable location. This patch does that -- plus, if the
instruction being sunk is a copy, it attempts to propagate the copy
through the DBG_VALUE, replacing the destination with the source.
Differential Revision: https://reviews.llvm.org/D58238
We would previously have no soft-float softening for cbrt, so we could hit
a crash failing to select. This fills in what appears to be missing.
Differential Revision: https://reviews.llvm.org/D69345
Summary:
Writing support for three ACLE functions:
unsigned int __cls(uint32_t x)
unsigned int __clsl(unsigned long x)
unsigned int __clsll(uint64_t x)
CLS stands for "Count number of leading sign bits".
In AArch64, these intrinsics can be translated into the 'cls'
instruction directly. In AArch32, on the other hand, this functionality
is achieved by implementing it in terms of clz (count number of leading
zeros).
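As an illustration, one common way of expressing cls in terms of clz (a sketch only; not necessarily the exact sequence the backend emits):

  #include <cstdint>

  // Count leading sign bits, excluding the sign bit itself.
  // x ^ (x >> 1) clears every leading bit that matches the sign bit, so the
  // clz of that value is cls + 1. Assumes >> on int32_t is an arithmetic shift.
  unsigned cls32(int32_t x) {
    uint32_t folded = (uint32_t)x ^ (uint32_t)(x >> 1);
    return (folded == 0 ? 32 : __builtin_clz(folded)) - 1;
  }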
Reviewers: compnerd
Reviewed By: compnerd
Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69250
Summary:
Until this commit, these have lowered to a call to abort().
`llvm.trap()` now lowers to `unimp`, which should trap on all systems.
`llvm.debugtrap()` now lowers to `ebreak`, which is exactly what this
instruction is for.
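For example, with the usual clang builtins that produce these intrinsics:

  // __builtin_trap() emits llvm.trap, which now lowers to `unimp`.
  void hard_stop() { __builtin_trap(); }

  // __builtin_debugtrap() emits llvm.debugtrap, which now lowers to `ebreak`.
  void break_here() { __builtin_debugtrap(); }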
Reviewers: asb, luismarques
Reviewed By: asb
Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69390
Summary:
The PATCHABLE_EVENT_CALL uses i32 in the intrinsic. This
results in the register allocator picking a 32-bit register. We
need to use the 64-bit register when forming the MOV64rr
instructions. Otherwise we print illegal assembly in the text
output.
I think prior to this it was impossible for SrcReg to be equal
to DstReg, so the NOP code was not reachable.
While there, use Register instead of unsigned.
Also add a FIXME for what looks like a bug.
Reviewers: dberris
Reviewed By: dberris
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69365
Similar to:
rG4c47617627fb
This makes the DAG behavior consistent with IR's insertelement.
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for AArch64 and WebAssembly
by replacing undef index operands with something else.
If the target's preferred shift amount VT can't hold any shift
amount for the promoted VT, we should use i32. The specific shift
amount shouldn't matter. The type will be adjusted later when the
shift itself is type legalized. This avoids an assert in getNode.
Fixes PR43820.
This combine is only valid if the inner setcc produces a 0/1 result
or the inner type is MVT::i1.
I haven't seen this cause any issues, just happened to notice it
while reviewing combines in this function.
While there, also fix another call to use the value type from the
SDValue for the operand instead of calling SDNode::getValueType(0).
Though it's likely the use is result 0, it's not guaranteed.
Summary:
We don't pattern-match pairwise shuffles in SelectionDAG, so we
should only return the optimized costs if it's not a pairwise
shuffle.
I think the SLP vectorizer gives priority to a non-pairwise shuffle if
the cost is the same, and the lookup for reduction intrinsics
passes false for the pairwise flag. So this probably has no real
effect today.
Reviewers: RKSimon
Reviewed By: RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69083
Summary:
Compare two values, and if they are different, return the position of the
most significant bit that is different in the values.
Needed for D69387.
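A usage sketch (assuming the helper lands in APIntOps next to the existing utilities and returns an Optional that is None when the values are equal):

  #include "llvm/ADT/APInt.h"
  #include "llvm/ADT/Optional.h"
  using namespace llvm;

  void example() {
    APInt A(8, 0x90); // 0b10010000
    APInt B(8, 0x8F); // 0b10001111
    // A and B first disagree at bit 4, so *Bit == 4 here.
    if (Optional<unsigned> Bit = APIntOps::GetMostSignificantDifferentBit(A, B))
      (void)*Bit;
    // For equal values there is no such bit, and None is returned.
  }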
Reviewers: nikic, spatel, sanjoy, RKSimon
Reviewed By: nikic
Subscribers: xbolva00, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69439
PTEST and especially the MOVMSK instructions are slow on Knights Landing
and later. As a bonus, this patch increases instruction parallelism by
emitting:
KORTEST(PCMPNEQ(a, b), PCMPNEQ(c, d)) == 0
Instead of:
KORTEST(AND(PCMPEQ(a, b), PCMPEQ(c, d))) == ~0
https://reviews.llvm.org/D69157
There is a minor flaw in the implementation of the function lowerPhis.
This function replaces values of regclass Vreg_1 (boolean values)
involved in PHIs with an SGPR. Currently it iterates over the MBBs
and performs an in-place lowering of PHIs, but fails to lower any
incoming value that is itself another PHI of the Vreg_1 regclass.
The failure occurs only when the MBB to which the incoming PHI value
belongs has not been visited/lowered yet.
To fix this problem, collect all Vreg_1 PHIs upfront and then
perform the lowering.
Differential Revision: https://reviews.llvm.org/D69182
Using `?` as an optional marker is very useful in Clang's AST-node
emitters because otherwise we need a separate class just to encode
the presence or absence of a base node reference.
This makes the DAG behavior consistent with IR's extractelement after:
rGb32e4664a715
https://bugs.llvm.org/show_bug.cgi?id=42689
I've tried to maintain test intent for WebAssembly.
The AMDGPU test is trying to test for crashing or other bad behavior,
but I'm not sure if that's possible after this change.
That used to fail in the last testcase function because, after
%0:sreg_64.sub0 was folded into the %3:sreg_32_xm0_xexec COPY, it
was further folded into S_STORE_DWORD_IMM. Its legal effective
subreg class is SReg_32, while the instruction expects the more restricted
SReg_32_XM0_XEXEC. However, SIInstrInfo::isLegalRegOperand()
passed the legality check, and the error was only caught in the verifier.
Borrowed code from the verifier to check for RC legality.
Differential Revision: https://reviews.llvm.org/D69445
Ilya Leoshkevich (<iii@linux.ibm.com>) reported an issue where, with
-mattr=+alu32, CO-RE has a segfault in the BPF MISimplifyPatchable
pass.
The pattern to be transformed by the MISimplifyPatchable
pass looks like this:
r5 = ld_imm64 @"b:0:0$0:0"
r2 = ldw r5, 0
... r2 ... // use r2
The pass will remove the intermediate 'ldw' instruction
and replace all uses of r2 with r5, like this:
r5 = ld_imm64 @"b:0:0$0:0"
... r5 ... // use r5
Later, the ld_imm64 insn will be replaced with
r5 = <patched immediate>
for field relocation purpose.
With -mattr=+alu32, the input code may become
r5 = ld_imm64 @"b:0:0$0:0"
w2 = ldw32 r5, 0
... w2 ... // use w2
Replacing "w2" with "r5" is incorrect and will
trigger compiler internal errors.
To fix the problem, if the register class of the ldw* destination
register is sub_32, we just replace the original ldw*
instruction with:
w2 = w5
Directly replacing all uses of w2 with an in-place
constructed w5 for the use operand does not seem to work in all cases.
The latest kernel will have -mattr=+alu32 on by default,
so this flag has been added to all CO-RE tests.
This patch was also tested against the latest kernel bpf-next branch.
Differential Revision: https://reviews.llvm.org/D69438
Both tryFoldOMod() and tryFoldClamp() remove the original instruction,
so the MI.modifiesRegister() check may use a deleted MI.
Differential Revision: https://reviews.llvm.org/D69448
Custom lower this to a target instruction with the merge operands. I
think it might be better to directly select this and emit a
REG_SEQUENCE, but this would be more work since it would require
splitting the tablegen patterns for these cases from the other
atomics.
Summary:
In loadSRsrcFromVGPR, if MBB is the same as Succ, Remainder is not the immediate dominator of Succ.
Reviewers: arsenm
Differential Revision: https://reviews.llvm.org/D69358
Summary:
If there is a GUID collision between two globals, checking the
summary list from the import index to make assumptions can be dangerous.
Do not assume that a GlobalValue that has a GlobalVarSummary
actually is a GlobalVariable, as it can be another GlobalValue with
the same GUID to which the summary is connected.
Patch by Joel Klinghed (the_jk@opera.com)
Reviewers: evgeny777, tejohnson
Reviewed By: tejohnson
Subscribers: tejohnson, dblaikie, MaskRay, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67322
zext (ctpop X) --> ctpop (zext X)
This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688:
https://bugs.llvm.org/show_bug.cgi?id=43688
I'm not sure if any other targets are affected, but I found a missing fold for PPC, so added tests based on that.
The reason we widen all the way to 64-bit in these tests is because the initial DAG looks something like this:
t5: i8 = ctpop t4
t6: i32 = zero_extend t5 <-- created based on IR, but unused node?
t7: i64 = zero_extend t5
Differential Revision: https://reviews.llvm.org/D69127
Summary:
Add instruction marker to MachineInstr ExtraInfo. This does almost the
same thing as Pre/PostInstrSymbols, except that it doesn't create a label until
printing instructions. This allows for labels to be put around instructions that
are deleted/duplicated somewhere.
Also undo the workaround in r375137.
Reviewers: rnk
Subscribers: MatzeB, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69136
Summary:
There are `*_ov()` functions already, so at least for consistency it may be good to also have saturating variants.
These may or may not be needed for `ConstantRange`'s `shlWithNoWrap()`.
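A minimal usage sketch, assuming the new members follow the existing APInt `*_sat` naming (`sshl_sat`/`ushl_sat`):

  #include "llvm/ADT/APInt.h"
  using namespace llvm;

  void shl_sat_example() {
    APInt X(8, 100);
    APInt Three(8, 3);
    APInt S = X.sshl_sat(Three); // 100 << 3 = 800 overflows signed i8: clamps to 127
    APInt U = X.ushl_sat(Three); // 800 > 255, so the unsigned result clamps to 255
  }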
Reviewers: spatel, nikic
Reviewed By: nikic
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69398
Summary:
There are `*_ov()` functions already, so at least for consistency it may be good to also have saturating variants.
These may or may not be needed for `ConstantRange`'s `mulWithNoWrap()`.
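Likewise a sketch, assuming `smul_sat`/`umul_sat` naming:

  #include "llvm/ADT/APInt.h"
  using namespace llvm;

  void mul_sat_example() {
    APInt X(8, 100), Y(8, 5);
    APInt S = X.smul_sat(Y); // 500 overflows signed i8: clamps to 127
    APInt U = X.umul_sat(Y); // 500 overflows unsigned i8: clamps to 255
  }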
Reviewers: spatel, nikic
Reviewed By: nikic
Subscribers: hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69397
Summary:
The ternary expression checks for ISD::ADD instead of ISD::UADDO inside DAGTypeLegalizer::ExpandIntRes_UADDSUBO.
This means the ternary expression will evaluate to ISD::SUBCARRY for both ISD::UADDO and ISD::USUBO nodes.
Targets are likely to implement both, so the impact will be very limited in practice.
Reviewers: bogner, lebedev.ri
Reviewed By: lebedev.ri
Subscribers: lebedev.ri, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68123
Complete fp16 support by ensuring that load extension / truncate store
operations are properly expanded.
Reviewers: asb, lenary
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D69246
selectImpl is able to select G_FSQRT when we set bank for vector
operands to fprb. Add detailed tests.
Note: G_FSQRT is generated from the llvm-ir intrinsic llvm.sqrt.*,
and at the moment MIPS is not able to generate this intrinsic for
vector types (some targets generate vector llvm.sqrt.* from calls
to a builtin function).
__builtin_msa_fsqrt_<format> will be transformed into G_FSQRT
in legalizeIntrinsic and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69376
SHT_NOTE is a section consisting of entries, each of which holds
namesz, descsz, and type words followed by name + padding and desc + padding data.
This patch teaches yaml2obj and obj2yaml to parse and dump them.
This patch implements the section as it is described here:
https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-18048.html
Which says: "For 64-bit objects and 32-bit objects, each entry is an array of 4-byte words in
the format of the target processor"
The official specification is different
http://www.sco.com/developers/gabi/latest/ch5.pheader.html#note_section
And says: "In 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array
of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS]
equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor"
Since LLVM uses the first, 32-bit way, this patch follows it.
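For reference, the entry layout described above, written as a C++ struct (this matches the generic ELF note header; the name and desc bytes follow it, each padded to a 4-byte boundary):

  #include <cstdint>

  // One SHT_NOTE entry, using the 4-byte-word layout LLVM follows for both
  // 32-bit and 64-bit objects.
  struct NoteEntryHeader {
    uint32_t n_namesz; // size of the name field, including the terminating NUL
    uint32_t n_descsz; // size of the descriptor field
    uint32_t n_type;   // producer-specific meaning of the descriptor
    // Followed by: char name[n_namesz] padded to 4 bytes,
    //              char desc[n_descsz] padded to 4 bytes.
  };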
Differential revision: https://reviews.llvm.org/D68983
We were already going to all of the trouble of computing maximum constant exit counts for each loop exit, so we might as well expose them through the API. The change in IndVars is mostly to demonstrate that the wired-up code works, but it also very slightly strengthens the transform. The strengthened case is rather narrow though: it requires one exactly analyzeable exit, one imprecisely analyzeable exit (with the upper bound less than the precise one), and one unanalyzeable exit. I couldn't construct a reasonably stable test case.
This does increase the memory usage of the BackedgeTakenCount by a factor of 2 in the worst case.
I also noticed the loop in IndVars is O(#Exits ^ 2). This doesn't change with this patch. A future patch will cache this result inside of SCEV to avoid re-querying.
This is a first step in figuring out a proper API for maximum (non-constant) exit counts. This may evolve a bit as we get experience with the API needs; suggestions very welcome. This patch just tries to provide a framework to which we can later add maximums in a clean and obvious way.
(This time verified locally.)
It was failing with:
llvm/lib/MC/XCOFFObjectWriter.cpp:168:56: error: array must be initialized with a brace-enclosed initializer
std::array<Section *const, 2> Sections = {&Text, &BSS};
^
This reverts commit 32ce14e55e.
In post-commit review, Pavel pointed out that there's a simpler way to
ignore SIGPIPE in lldb that doesn't rely on llvm's handlers.
Check that the return and parameter types of the old and
new functions are compatible before upgrading a function call to an
intrinsic call.
Sometimes users insert calls to ARC runtime functions that are not
compatible with the corresponding intrinsic functions (for example,
'i8* @objc_storeStrong' instead of 'void @objc_storeStrong'). Don't
upgrade those calls.
rdar://problem/56447127
An SUnit can have neither an instruction nor an SDNode: both are
null if it represents a nop. Fixed a crash on using SU->getInstr().
Differential Revision: https://reviews.llvm.org/D69395
It was failing with:
llvm/lib/MC/XCOFFObjectWriter.cpp:168:53: error: array must be initialized with a brace-enclosed initializer
std::array<Section *const, 2> Sections{&Text, &BSS};
^
Summary:
Right now we handle each CsectGroup (ProgramCodeCsects, BSSCsects)
individually when assigning indices, writing the symbol table, and
writing section raw data. However, there is already a pattern there,
and we could common up those actions for every CsectGroup. This will
make adding new CsectGroups (read/write data, read-only data, TC/TOC,
mergeable strings) easier and less error-prone.
Reviewed by: sfertile, daltenty, DiggerLin
Approved by: daltenty
Differential Revision: https://reviews.llvm.org/D69112
The MVE VADC instruction reads and writes the carry bit at bit 29 of
the FPSCR register. The corresponding ACLE intrinsic is specified to
work with an integer in which the carry bit is stored at bit 0. So if
a user writes a code sequence in C that passes the carry from one VADC
to the next, like this,
s0 = vadcq_u32(a0, b0, &carry);
s1 = vadcq_u32(a1, b1, &carry);
then clang will generate IR for each of those operations that shifts
the carry bit up into bit 29 before the VADC, and after it, shifts it
back down and masks off all but the low bit. But in this situation
what you really wanted was two consecutive VADC instructions, so that
the second one directly reads the value left in FPSCR by the first,
without wasting several instructions on pointlessly clearing the other
flag bits in between.
This commit explains to InstCombine that the other bits of the flags
operand don't matter, and adds a test that demonstrates that all the
code between the two VADC instructions can be optimized away as a
result.
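In scalar terms, the wrapper around each intrinsic call looks roughly like this (a sketch; the actual transformation happens on the IR):

  #include <cstdint>

  // The carry is presented to the user in bit 0 but lives in FPSCR bit 29.
  uint32_t carry_in_to_fpscr(uint32_t carry) {
    return (carry & 1u) << 29; // shift the carry up to bit 29
  }
  uint32_t fpscr_to_carry_out(uint32_t fpscr) {
    return (fpscr >> 29) & 1u; // shift back down, mask off all but the low bit
  }
  // With this patch, InstCombine treats the other flag bits as don't-cares, so
  // between two back-to-back VADCs these round-trips fold away.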
Reviewers: dmgreen, miyuki, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67162
The VST2 and VST4 instructions take two or four vector registers as
input, and store part of each register to memory in an interleaved
pattern. They come in variants indicating which part of each register
they store (VST20 and VST21; VST40 to VST43 inclusive); the intention
is that issuing each of those variants in turn has the combined effect
of loading or storing the whole set of registers to a memory block of
equal size. The corresponding VLD2 and VLD4 instructions load from
memory in the same interleaved format: each one overwrites only part
of its output register set, and again, the idea is that if you use
VLD4{0,1,2,3} or VLD2{0,1} together, you end up having written to the
whole of each register.
I've implemented the stores and loads quite differently. The loads
were easiest to implement as a single intrinsic that expands to all
four VLD4x instructions or both VLD2x, delivering four complete output
registers. (Implementing each individual load as a separate
instruction taking four input registers to partially overwrite is
possible in theory, but pointless, and when I tried it, I found it
would need extra work to get the register allocation not to be
horrible.) Since that intrinsic delivers multiple outputs, it has to
be instruction-selected in custom C++.
But the store instructions are easier to model individually, because
they don't overwrite any register at all and you can write a DAG Isel
pattern in Tablegen for each one.
Hence, my new intrinsic `int_arm_mve_vld4q` expands to four load
instructions, delivers four full output vectors, and is handled by C++
code, whereas `int_arm_mve_vst4q` expands to just one store
instruction, takes four input vectors and a constant indicating which
lanes to store, and is handled entirely in Tablegen. (And similarly
for vld2q/vst2q.) This is asymmetric, but it was the easiest way to do
each one.
Reviewers: dmgreen, miyuki, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68700
This adds some initial example IR intrinsics for MVE instructions that
deliver multiple output values, and hence, have to be instruction-
selected by custom C++ code instead of Tablegen patterns.
I've added the writeback gather load instructions (taking a vector of
base addresses and a single common offset, returning a vector of
loaded values and an updated vector of base addresses); one example
from the long shift family (taking and returning a 64-bit value in two
GPRs); and the VADC instruction (which propagates a carry bit from
each vector-lane addition to the next, taking an input carry flag in
FPSCR and outputting the final one in FPSCR as well).
To support the VPT-predicated forms of these instructions, I've
written some helper functions to add the cluster of MVE predicate
operands to the end of a MachineInstr. `AddMVEPredicateToOps` is used
when the instruction actually is predicated (so it takes a predicate
mask argument), and `AddEmptyMVEPredicateToOps` is for when the
instruction is unpredicated (so it fills in $noreg for the mask). Each
one comes in a form suitable for `vpred_n`, and one for `vpred_r`
which takes the extra 'inactive' parameter.
For VADC, the representation of the carry flag in the IR intrinsic is
a word intended to be moved directly to and from `FPSCR_nzcvqc`, i.e.
with the carry flag in bit 29 of the word. (The user-facing ACLE
intrinsic will want it to be in bit 0, but I'll do that on the clang
side.)
Reviewers: dmgreen, miyuki, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68699
This commit, together with the next few, will add a representative
sample of the kind of IR intrinsics that we'll need in order to
implement the user-facing ACLE intrinsics for MVE. Supporting all of
them will take more work; the intention of this initial series of
commits is to implement an intrinsic or two from lots of different
categories, as examples and proofs of concept.
This initial commit introduces a small number of IR intrinsics for
instructions simple enough that they can use Tablegen ISel patterns:
the predicated versions of the VADD and VSUB instructions (both
integer and FP), VMIN and VMAX, and the float->half VCVT instruction
(predicated and unpredicated).
When using VPT-predicated instructions in automatic code generation,
it will be convenient to specify the predicate value as a vector of
the appropriate number of i1. To make it easy to specify all sizes of
an instruction in one go and give each one the matching predicate
vector type, I've added a system of Tablegen informational records
describing MVE's vector types: each one gives the underlying LLVM IR
ValueType (which may not be the same if the MVE vector is of
explicitly signed or unsigned integers) and an appropriate vNi1 to use
as the predicate vector.
(Also, those info records include the usual encoding for the types, so
that as we add associations between each instruction encoding and one
of the new `MVEVectorVTInfo` records, we can remove some of the
existing template parameters and replace them with references to the
vector type info's fields.)
The user-facing ACLE intrinsics will receive a predicate mask as a
16-bit integer, so I've also provided a pair of intrinsics i2v and
v2i, to convert between an integer and a vector of i1 by just changing
the register class.
Reviewers: dmgreen, miyuki, ostannard
Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67158
selectImpl is able to select G_FABS when we set bank for vector
operands to fprb. Add detailed tests.
Note: G_FABS is generated from the llvm-ir intrinsic llvm.fabs.*,
and at the moment MIPS is not able to generate this intrinsic for
vector types (some targets generate vector llvm.fabs.* from calls
to a builtin function).
We can handle fabs using __builtin_msa_fmax_a_<format> by passing the
same vector as both arguments. __builtin_msa_fmax_a_<format> will
be directly selected into FMAX_A_<format> in legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D69346
Select vector G_FADD, G_FSUB, G_FMUL and G_FDIV for MIPS32 with MSA. We
have to set bank for vector operands to fprb and selectImpl will do the
rest. __builtin_msa_fadd_<format>, __builtin_msa_fsub_<format>,
__builtin_msa_fmul_<format> and __builtin_msa_fdiv_<format> will be
transformed into G_FADD, G_FSUB, G_FMUL and G_FDIV in legalizeIntrinsic
respectively and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69340
Select vector G_SDIV, G_SREM, G_UDIV and G_UREM for MIPS32 with MSA. We
have to set bank for vector operands to fprb and selectImpl will do the
rest. __builtin_msa_div_s_<format>, __builtin_msa_mod_s_<format>,
__builtin_msa_div_u_<format> and __builtin_msa_mod_u_<format> will be
transformed into G_SDIV, G_SREM, G_UDIV and G_UREM in legalizeIntrinsic
respectively and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69333
Potentially an SGPR-to-SGPR copy should also be possible.
That is, however, trickier because we may end up with a
wrong register class at the use because of xm0/xexec permutations.
Differential Revision: https://reviews.llvm.org/D69280
This broke various Windows builds; see comments on the Phabricator
review.
This also reverts the follow-up 20bf0cf.
> Summary:
> This fold, helps recover from the rest of the D62266 ARM regressions.
> https://rise4fun.com/Alive/TvpC
>
> Note that while the fold is quite flexible, i've restricted it
> to the single interesting pattern at the moment.
>
> Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix
>
> Reviewed By: deadalnix
>
> Subscribers: javed.absar, kristof.beyls, llvm-commits
>
> Tags: #llvm
>
> Differential Revision: https://reviews.llvm.org/D62450
This roughly mimics `std::thread(...).detach()`, except it allows
customizing the stack size. Required for https://reviews.llvm.org/D50993.
I've decided against reusing the existing `llvm_execute_on_thread` because
it's not obvious what to do with the ownership of the passed
function/arguments:
1. If we pass possibly-owning function data to `llvm_execute_on_thread`,
we'll lose the ability to pass small non-owning, non-allocating functions
for the joining case (as it's used now). Is it important enough?
2. If we use the non-owning interface in the new use case, we'll force
clients to transfer ownership to the spawned thread manually, but
similar code would still have to exist inside
`llvm_execute_on_thread(_async)` anyway (as we can't just pass the same
non-owning pointer to pthreads and Windows implementations, and would be
forced to wrap it in some structure and deal with its ownership).
Patch by Dmitry Kozhevnikov!
Differential Revision: https://reviews.llvm.org/D51103
MipsMCAsmInfo was using the '$' prefix for Mips32 and '.L' for Mips64
regardless of the -target-abi option. By passing MCTargetOptions to MCAsmInfo
we can find out the Mips ABI and pick the appropriate prefix.
Tags: #llvm, #clang, #lldb
Differential Revision: https://reviews.llvm.org/D66795
Summary:
The default implementation of the describeLoadedValue() hook uses the
MoveImm property to determine if an instruction moves an immediate. If
an instruction has that property the function returns the second
operand, assuming that that is the immediate value the instruction
moves. As far as I can tell, the MoveImm property does not imply that
the second operand is the immediate value, nor that any other operand
necessarily holds the immediate value; it just means that the
instruction moves some immediate value.
One example where the second operand is not the immediate is SystemZ's
LZER instruction, which moves a zero immediate implicitly: $f0S = LZER.
That case triggered an out-of-bounds assertion when getting the operand.
I have added a test case for that instruction.
Another example is ARM's MVN instruction, which holds the logical
bitwise NOT'd value of the immediate that is moved. For the following
reproducer:
extern void foo(int);
int main() { foo(-11); }
an incorrect call site value would be emitted:
$ clang --target=arm foo.c -O1 -g -Xclang -femit-debug-entry-values \
-c -o - | ./build/bin/llvm-dwarfdump - | \
grep -A2 call_site_parameter
0x00000058: DW_TAG_GNU_call_site_parameter
DW_AT_location (DW_OP_reg0 R0)
DW_AT_GNU_call_site_value (DW_OP_lit10)
Another example is the A2_combineii instruction on Hexagon which moves
two immediates to a super-register: $d0 = A2_combineii 20, 10.
Perhaps these are rare exceptions, and most MoveImm instructions hold
the immediate in the second operand, but in my opinion the default
implementation of the hook should only describe values that it can, by
some contract, guarantee are safe to describe, rather than leaving it up
to the targets to override the exceptions, as that can silently result
in incorrect call site values.
This patch adds X86's relevant move immediate instructions to the
target's hook implementation, so this commit should be an NFC for that
target. We need to do the same for ARM and AArch64.
Reviewers: djtodoro, NikolaPrica, aprantl, vsk
Reviewed By: vsk
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D69109
Select vector G_MUL for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
Manual selection of G_MUL is now done for gprb only.
__builtin_msa_mulv_<format> will be transformed into G_MUL
in legalizeIntrinsic and selected in the same way.
Differential Revision: https://reviews.llvm.org/D69310
Select vector G_SUB for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
__builtin_msa_subv_<format> will be transformed into G_SUB
in legalizeIntrinsic and selected in the same way.
__builtin_msa_subvi_<format> will be directly selected into
SUBVI_<format> in legalizeIntrinsic.
Differential Revision: https://reviews.llvm.org/D69306
We should do the fold only if both constants are plain,
non-opaque constants; at least, that is the DAG.FoldConstantArithmetic()
requirement.
And if the constant we are comparing with is zero - we shouldn't be
trying to do this fold in the first place.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43769
This adds support for reserving GPRs such that the compiler will not
choose a register for register allocation. The implementation follows
the same design as for AArch64; each reserved register becomes a target
feature and is used for getting the reserved registers for a given
MachineFunction. The backend checks that it does not need to write to
any reserved register; if it does a relevant error is generated.
Differential Revision: https://reviews.llvm.org/D67185
Summary:
This fold, helps recover from the rest of the D62266 ARM regressions.
https://rise4fun.com/Alive/TvpC
Note that while the fold is quite flexible, I've restricted it
to the single interesting pattern at the moment.
Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix
Reviewed By: deadalnix
Subscribers: javed.absar, kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62450
This adds an instcombine matcher for code that attempts to perform signed
saturating arithmetic by casting to a higher type. Unsigned cases are already
matched, this adds extra matches for the more complex signed cases, which
involves matching the min(max(add a b)) nodes with proper extends to ensure
legality.
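For example, one source-level shape of the signed pattern this now matches (a sketch; the matcher works on the corresponding sext/add/smin/smax IR):

  #include <algorithm>
  #include <cstdint>

  // Signed saturating add done by widening: sext to i32, add, clamp to the
  // i16 range, then truncate. This can now be folded to llvm.sadd.sat.i16.
  int16_t sadd_sat16(int16_t a, int16_t b) {
    int32_t wide = (int32_t)a + (int32_t)b;
    wide = std::min(wide, (int32_t)INT16_MAX);
    wide = std::max(wide, (int32_t)INT16_MIN);
    return (int16_t)wide;
  }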
Differential Revision: https://reviews.llvm.org/D68651
llvm-svn: 375505
MachineRegisterInfo::createGenericVirtualRegister sets
RegClassOrRegBank to static_cast<RegisterBank *>(nullptr).
MIParser, on the other hand, doesn't. When we attempt to constrain
the register class on such a VReg, an additional COPY is generated.
This way we avoid COPY instructions showing up in tests that have MIR
input while they are not present with llvm-ir input that was used
to create the given MIR for a -run-pass test.
Differential Revision: https://reviews.llvm.org/D68946
llvm-svn: 375502
Select vector G_ADD for MIPS32 with MSA. We have to set bank
for vector operands to fprb and selectImpl will do the rest.
__builtin_msa_addv_<format> will be transformed into G_ADD
in legalizeIntrinsic and selected in the same way.
__builtin_msa_addvi_<format> will be directly selected into
ADDVI_<format> in legalizeIntrinsic. MIR tests for it have
unnecessary additional copies. Capture the current state of the tests
with run-pass=legalizer with a test in test/CodeGen/MIR/Mips.
Differential Revision: https://reviews.llvm.org/D68984
llvm-svn: 375501
This re-commits r375152 which was pulled in r375233 because it broke
the EXPENSIVE_CHECKS bot on Windows.
The reason for the failure was a bug in the pass that the commit turned
on by default. This patch fixes that bug and turns the pass back on.
This patch has been verified on the buildbot that originally failed
thanks to Simon Pilgrim.
Differential revision: https://reviews.llvm.org/D52431
llvm-svn: 375497
Currently injected-sources-native.test fails with "Expected<T>
value was in success state.
(Note: Expected<T> values in success mode must still be checked
prior to being destroyed)"
when llvm is compiled with LLVM_ENABLE_ABI_BREAKING_CHECKS in Release.
The problem is that getStringForID returns Expected<StringRef>,
and an Expected value must always be checked, even if it is in a success state.
Checking with an assert only helps in Debug and is wrong.
Differential revision: https://reviews.llvm.org/D69251
llvm-svn: 375492
Teach the CombinerHelper how to turn shuffle_vector operations that
concatenate vectors into concat_vectors, and add this combine
to the AArch64 pre-legalizer combiner.
Differential Revision: https://reviews.llvm.org/D69149
llvm-svn: 375452
Only handle simple inter-block redefs of m0 to the same value. This
avoids interference from redefs of m0 in SILoadStoreOptimizer. I was
initially teaching that pass to ignore redefs of m0, but having them
not exist beforehand is much simpler.
This is in preparation for deleting the current special m0 handling in
SIFixSGPRCopies to allow the register coalescer to handle the
difficult cases.
llvm-svn: 375449
r375293 removed the SGPR spilling with scalar stores path, so this is
no longer necessary. This also always had the defect of adding the def
even when this path wasn't in use.
llvm-svn: 375448
If a PHI defines an AGPR, legalize its operands to AGPR.
At the moment we can get an AGPR PHI with VGPR operands.
I am not aware of any problems as it seems to be handled
gracefully in RA, but this is not right anyway.
It also slightly decreases VGPR pressure in some cases
because we do not have to copy via a VGPR.
Differential Revision: https://reviews.llvm.org/D69206
llvm-svn: 375446
Summary:
Reduce include dependencies by no longer including Pass.h from
DataLayout.h. That include seemed irrelevant to DataLayout, as
well as being irrelevant to several users of DataLayout.
Reviewers: rnk
Reviewed By: rnk
Subscribers: mehdi_amini, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D69261
llvm-svn: 375436
The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 375430
The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.
llvm-svn: 375429