llvm-project

Commit Graph

Author	SHA1	Message	Date
David Spickett	398dc06ad0	[AArch64] Make AArch64 specific assembly directives case insensitive Differential Revision: https://reviews.llvm.org/D72923	2020-01-17 16:16:18 +00:00
Matt Arsenault	886f9071c6	AMDGPU: Don't assert on a16 images on targets without FeatureR128A16 Currently the lowering for i16 image coordinates asserts on gfx10. I'm somewhat confused by this though. The feature is missing from the gfx10 feature lists, but the a16 bit appears to be present in the manual for MIMG instructions.	2020-01-17 11:07:00 -05:00
Sanjay Patel	43f60e614a	[x86] try harder to form 256-bit unpck* This is another part of a problem noted in PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 The AVX2 code may use awkward 256-bit shuffles vs. the AVX code that gets split into the expected 128-bit unpack instructions. We have to be selective in matching the types where we try to do this though. Otherwise, we can end up with more instructions (in the case of v8x32/v4x64). Differential Revision: https://reviews.llvm.org/D72575	2020-01-17 10:42:39 -05:00
Krzysztof Parzyszek	2d5bfc6eb1	[Hexagon] Improve HVX version checks	2020-01-17 09:40:26 -06:00
Krzysztof Parzyszek	60aed6a4e5	[Hexagon] Add prev65 subtarget feature There was a change to trap1 instruction between v62 and v65. This feature will allow the assembler/disassembler to handle different variants depending on the CPU version.	2020-01-17 09:27:27 -06:00
Simon Pilgrim	8eb4d25a09	[X86] Split X87/SSE compare classes into WriteFCom + WriteFComX Most X87 compare instructions write to the X87 status word, while the SSE (U)COMI compares write to rFLAGS. These are often handled very differently on CPUs (e.g. rFLAGS outputs typically involve a fpu2gpr transfer), and we shouldn't be grouping all these instructions behind a single class - so this patch splits off the SSE compares into a new WriteFComX class (and currently keeps the same behaviours). If there's a need to distinguish between X87 instructions more closely we can investigate that in the future, but as we don't handle any of the X87 side effects at the moment its unlikely to have any notable effect.	2020-01-17 13:53:58 +00:00
Simon Pilgrim	1dc2f25790	[SelectionDAG] ComputeKnownBits - assert we're computing the 0'th (difference) result for the SUB/SUBC cases Matches what we already do for the ADD/ADDC/ADDE case.	2020-01-17 13:53:57 +00:00
Sanjay Patel	c1e159ef6e	[IR] fix Constant::isElementWiseEqual() to allow for all undef elements compare We could argue that match() should be more flexible here, but I'm not sure what impact that would have on existing code.	2020-01-17 08:31:16 -05:00
Sam Parker	42350cd893	[ARM][MVE] Tail Predicate IsSafeToRemove Introduce a method to walk through use-def chains to decide whether it's possible to remove a given instruction and its users. These instructions are then stored in a set until the end of the transform when they're erased. This is now used to perform checks on the iteration count (LoopDec chain), element count (VCTP chain) and the possibly redundant iteration count. As well as being able to remove chains of instructions, we know also check that the sub feeding the vctp is producing the expected value. Differential Revision: https://reviews.llvm.org/D71837	2020-01-17 13:19:14 +00:00
Fedor Sergeev	cc7cb05e9d	[BasicBlock] fix looping in getPostdominatingDeoptimizeCall Blindly following unique-successors chain appeared to be a bad idea. In a degenerate case when block jumps to itself that goes into endless loop. Discovered this problem when playing with additional changes, managed to reproduce it on existing LoopPredication code. Fix by checking a "visited" set while iterating through unique successors. Reviewed By: skatkov Tags: #llvm Differential Revision: https://reviews.llvm.org/D72908	2020-01-17 15:40:02 +03:00
Cullen Rhodes	49edf9a509	[AArch64][SVE] Add break intrinsics Summary: Implements the following intrinsics: * @llvm.aarch64.sve.brka * @llvm.aarch64.sve.brka.z * @llvm.aarch64.sve.brkb * @llvm.aarch64.sve.brkb.z * @llvm.aarch64.sve.brkn.z * @llvm.aarch64.sve.brkpa.z * @llvm.aarch64.sve.brkpb.z Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally, rengolin Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72393	2020-01-17 11:47:08 +00:00
Simon Pilgrim	f611158350	[SelectionDAG] Better ISD::ANY_EXTEND/ISD::ANY_EXTEND_VECTOR_INREG ComputeKnownBits support Add DemandedElts handling to ISD::ANY_EXTEND and add missing ISD::ANY_EXTEND_VECTOR_INREG handling. Despite the lack of test changes this code IS being used - its just that the ANY_EXTEND ops are legalized later on (typically to ZERO_EXTEND equivalents) so we typically manage to combine later on.	2020-01-17 11:37:58 +00:00
David Spickett	37fb3b3363	[AsmParser] Make generic directives and aliases case insensitive. GCC will accept any case for assembler directives. For example ".abort" and ".ABORT" (even ".aBoRt") are equivalent. https://sourceware.org/binutils/docs/as/Pseudo-Ops.html#Pseudo-Ops "The names are case insensitive for most targets, and usually written in lower case." Change llvm-mc to accept any case for generic directives or aliases of those directives. This for Bugzilla #39527. Differential Revision: https://reviews.llvm.org/D72686	2020-01-17 11:02:56 +00:00
Kerry McLaughlin	fe3bb8ec96	[AArch64][SVE] Add ImmArg property to intrinsics with immediates Summary: Several SVE intrinsics with immediate arguments (including those added by D70253 & D70437) do not use the ImmArg property. This patch adds ImmArg<Op> where required and changes the appropriate patterns which match the immediates. Reviewers: efriedma, sdesmalen, andwar, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72612	2020-01-17 10:47:55 +00:00
Craig Topper	caee96031d	[Transforms][RISCV] Remove a "using namespace llvm" from an include file. Fix a place that became dependent on it. This include file was created in October and has a "using namespace llvm". This seems to get exposed to other include files and finally onto cpp files. While this somewhat okay for llvm itself, its bad for other projects that use llvm as a library and includes a header file that picks this up. This was found by ISPC which has some class names at gloal scope with the same names as LLVM. It looks like RISCV accidentally became dependent on this. I fixed it by reordering some includes in the RISCV code, but maybe we want to change the TableGenEmitter to put "namespace llvm {" in the generated file instead? But we probably want to do the simplest thing first so we can merge it to 10.0. Differential Revision: https://reviews.llvm.org/D72895	2020-01-16 20:50:41 -08:00
Matt Arsenault	117d4f1900	AMDGPU: Add register classes to MUBUF load patterns	2020-01-16 22:00:44 -05:00
Zakk Chen	cef838e65f	Revert "[RISCV] Support ABI checking with per function target-features" This reverts commit `7bc58a779a`. It breaks EXPENSIVE_CHECKS on Windows	2020-01-16 18:01:07 -08:00
Davide Italiano	30a8865142	[FastISel] Lower `llvm.dbg.value(undef, ...` correctly. Summary: Instead of just dropping them. <rdar://problem/58657146> Reviewers: aprantl, vsk, ab, paquette, echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72877	2020-01-16 16:22:20 -08:00
David Blaikie	65eb74e94b	PointerLikeTypeTraits: Standardize NumLowBitsAvailable on static constexpr rather than anonymous enum This is (more?) usable by GDB pretty printers and seems nicer to write. There's one tricky caveat that in C++14 (LLVM's codebase today) the static constexpr member declaration is not a definition - so odr use of this constant requires an out of line definition, which won't be provided (that'd make all these trait classes more annoyidng/expensive to maintain). But the use of this constant in the library implementation is/should always be in a non-odr context - only two unit tests needed to be touched to cope with this/avoid odr using these constants. Based on/expanded from D72590 by Christian Sigg.	2020-01-16 15:30:50 -08:00
Eric Christopher	de022a8824	[NFC] Fold isHugeExpression into hasHugeExpression and update callers accordingly.	2020-01-16 15:28:54 -08:00
Jessica Paquette	b82d18e1e8	[AArch64][GlobalISel] Change G_FCONSTANTs feeding into stores into G_CONSTANTS Given the following situation: x = G_FCONSTANT (something that can't be materialized) G_STORE x, some_addr We know that x must be materialized as at least a single mov. However, at the time of selection, the G_STORE will have been regbankselected to a FPR store. So, as a result, you'll get an unnecessary fmov into the G_STORE. Storing a constant value in a GPR and a constant value in a FPR are the same. So, whenever you see a G_FCONSTANT that feeds into only G_STORES, so might as well make it a G_CONSTANT. This adds a target-specific combine which changes G_FCONSTANTs feeding into G_STOREs into G_CONSTANTs. Differential Revision: https://reviews.llvm.org/D72814	2020-01-16 15:18:44 -08:00
Derek Schuff	80906d9d16	Revert "[WebAssembly] Track frame registers through VReg and local allocation" This reverts commit `3a05c3969c`. It breaks under expensive-checks and on Windows	2020-01-16 14:38:00 -08:00
Matt Arsenault	3ef8cdf666	AMDGPU: Do permlane16 vdst_in discard optimization in InstCombine There's more potential value to discarding the source value earlier, since we always know the value of the fi/bc bits.	2020-01-16 17:27:53 -05:00
Matt Arsenault	91e758b732	AMDGPU: Move permlane discard vdst_in optimization This case can be handled as a regular selection pattern, so move it out of the weird post-isel folding code which doesn't have an exactly equivalent place in GlobalISel. I think it doesn't make much sense to do this optimization here though, and it would be more useful in instcombine. There's not really any new information that will be gained during lowering since these inputs were known from the beginning.	2020-01-16 17:27:53 -05:00
Derek Schuff	3a05c3969c	[WebAssembly] Track frame registers through VReg and local allocation This change has 2 components: Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It describes how the Dwarf frame base will be encoded. That can be a register (the default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or a DW_OP_WASM_location descriptr. WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs has run. Make WebAssemblyExplicitLocals store the local it allocates for the frame register. Use this local information to implement getDwarfFrameBase The result is that the DW_AT_frame_base attribute is correctly encoded for each subprogram, and each param and local variable has a correct DW_AT_location that uses DW_OP_fbreg to refer to the frame base. Differential Revision: https://reviews.llvm.org/D71681	2020-01-16 13:51:17 -08:00
Sanjay Patel	52b44902d0	[IR] fix crash in Constant::isElementWiseEqual() with FP types We lifted this code from InstCombine for general usage in: rL369842 ...but it's not safe as-is. There are no existing users that can trigger this bug, but I discovered it via crashing several regression tests when trying to use it for select folding in InstSimplify. ICmp requires (vector) integer types, so give up on anything that's not integer or FP (pointers and ?) then bitcast the constants before trying the match. That matches the definition of "equal or undef" that I was looking for. If someone wants an FP-aware version of equality (deal with NaN, -0.0), that could be a different mode or different function. Differential Revision: https://reviews.llvm.org/D72784	2020-01-16 16:49:16 -05:00
Krzysztof Parzyszek	ecf0766cf1	[Hexagon] Add ELF flags for Hexagon v66 to ELFYAML.cpp	2020-01-16 15:01:00 -06:00
Fedor Sergeev	1f2dad1fd5	[GVN] add GVN parameters parsing to new pass manager Introduce parsing, add a few instances of parameter use into GVN-PRE tests. Reviewers: skatkov, asbirlea Reviewed By: skatkov Tags: #llvm Differential Revision: https://reviews.llvm.org/D72752	2020-01-16 23:53:46 +03:00
Kazu Hirata	53b68e676f	Resubmit: [JumpThreading] Thread jumps through two basic blocks This reverts commit `2d258ed931`. This revision fixes the Windows build and adds a testcase for it, namely thread-two-bbs3.ll. My original patch improperly copied EH pads on Windows. This patch disregards jump threading opportunities having to do with EH pads. [JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-01-16 12:33:37 -08:00
Matt Arsenault	f5d98543b8	AMDGPU: Remove outdated comment	2020-01-16 14:54:27 -05:00
Matt Arsenault	e12b840abf	AMDGPU/GlobalISel: Improve lowering of G_SEXT_INREG Clamping the scalar is much better than lowering with superwide shifts for types > s64.	2020-01-16 14:29:37 -05:00
Matt Arsenault	a66d2817ca	GlobalISel: Don't ignore requested ext narrowing type This was assuming the narrow target was the source type. Respect the requested type when these don't match by using intermediate merges. This avoids producing very wide, illegal shift expansions.	2020-01-16 14:29:37 -05:00
Matt Arsenault	be31a7b7ee	GlobalISel: Move extension scalar narrowing to separate function Also rename a few things. Handling a different requested type will require this to become much more complex.	2020-01-16 14:29:37 -05:00
Krzysztof Parzyszek	5f65065437	[Hexagon] Update autogeneated intrinsic information in LLVM	2020-01-16 13:11:18 -06:00
Craig Topper	61a89e17df	[LegalizeDAG][Mips] Add an assert to protect a uint_to_fp implementation from double rounding. Add a i32->f32 uint_to_fp implementation that avoids this code. The algorithm here only works if the sint_to_fp doesn't do any rounding. Otherwise it can round before the offset fixup is applied. Add an assert to protect this. To avoid breaking the one test in tree that tested this code with a set of types that fail the assert, I've enabled i32->f32 to use the i64->f32 algorithm. This only occurs when f64 isn't a legal type. If f64 is legal then we do i32->f64->f32 instead. Differential Revision: https://reviews.llvm.org/D72794	2020-01-16 11:08:16 -08:00
Matt Arsenault	d0943537e1	GlobalISel: Apply target MMO flags to atomics Unify MMO flag handling with SelectionDAG like with loads and stores.	2020-01-16 13:49:43 -05:00
Matt Arsenault	0d0fce42b0	GlobalISel: Preserve load/store metadata in IRTranslator This was dropping the invariant metadata on dead argument loads, so they weren't deleted. Atomics still need to be fixed the same way. Also, apparently store was never preserving dereferencable which should also be fixed.	2020-01-16 13:49:43 -05:00
Krzysztof Parzyszek	8ee2d16896	[Hexagon] Add a target feature to disable compound instructions This affects the following instructions: Tag: M4_mpyrr_addr Syntax: Ry32 = add(Ru32,mpyi(Ry32,Rs32)) Tag: M4_mpyri_addr_u2 Syntax: Rd32 = add(Ru32,mpyi(#u6:2,Rs32)) Tag: M4_mpyri_addr Syntax: Rd32 = add(Ru32,mpyi(Rs32,#u6)) Tag: M4_mpyri_addi Syntax: Rd32 = add(#u6,mpyi(Rs32,#U6)) Tag: M4_mpyrr_addi Syntax: Rd32 = add(#u6,mpyi(Rs32,Rt32)) Tag: S4_addaddi Syntax: Rd32 = add(Rs32,add(Ru32,#s6)) Tag: S4_subaddi Syntax: Rd32 = add(Rs32,sub(#s6,Ru32)) Tag: S4_or_andix Syntax: Rx32 = or(Ru32,and(Rx32,#s10)) Tag: S4_andi_asl_ri Syntax: Rx32 = and(#u8,asl(Rx32,#U5)) Tag: S4_ori_asl_ri Syntax: Rx32 = or(#u8,asl(Rx32,#U5)) Tag: S4_addi_asl_ri Syntax: Rx32 = add(#u8,asl(Rx32,#U5)) Tag: S4_subi_asl_ri Syntax: Rx32 = sub(#u8,asl(Rx32,#U5)) Tag: S4_andi_lsr_ri Syntax: Rx32 = and(#u8,lsr(Rx32,#U5)) Tag: S4_ori_lsr_ri Syntax: Rx32 = or(#u8,lsr(Rx32,#U5)) Tag: S4_addi_lsr_ri Syntax: Rx32 = add(#u8,lsr(Rx32,#U5)) Tag: S4_subi_lsr_ri Syntax: Rx32 = sub(#u8,lsr(Rx32,#U5))	2020-01-16 12:37:30 -06:00
Arkady Shlykov	c87982b467	Revert "[Loop Peeling] Add possibility to enable peeling on loop nests." This reverts commit `3f3017e` because there's a failure on peel-loop-nests.ll with LLVM_ENABLE_EXPENSIVE_CHECKS on. Differential Revision: https://reviews.llvm.org/D70304	2020-01-16 10:33:38 -08:00
stevewan	bed7626f04	[PowerPC][AIX] Make PIC the default relocation model for AIX Summary: The `llc` tool currently defaults to Static relocation model and generates non-relocatable code for 32-bit Power. This is not desirable on AIX where we always generate Position Independent Code (PIC). This patch makes PIC the default relocation model for AIX. Reviewers: daltenty, hubert.reinterpretcast, DiggerLin, Xiangling_L, sfertile Reviewed By: hubert.reinterpretcast Subscribers: mgorny, wuzish, nemanjai, hiraditya, kbarton, jsji, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72479	2020-01-16 13:07:36 -05:00
Nico Weber	81c67da0f2	remove an include that's unused after r347592	2020-01-16 12:49:54 -05:00
Fedor Sergeev	3478551bf3	[GVN] introduce GVNOptions to control GVN pass behavior There are a few global (cl::opt) controls that enable optional behavior in GVN. Introduce GVNOptions that provide corresponding per-pass instance controls. That will allow to use GVN multiple times in pipeline each time with different settings. Reviewers: asbirlea, rnk, reames, skatkov, fhahn Reviewed By: fhahn Tags: #llvm Differential Revision: https://reviews.llvm.org/D72732	2020-01-16 20:21:08 +03:00
Mircea Trofin	7acfda633f	[llvm] Make new pass manager's OptimizationLevel a class Summary: The old pass manager separated speed optimization and size optimization levels into two unsigned values. Coallescing both in an enum in the new pass manager may lead to unintentional casts and comparisons. In particular, taking a look at how the loop unroll passes were constructed previously, the Os/Oz are now (==new pass manager) treated just like O3, likely unintentionally. This change disallows raw comparisons between optimization levels, to avoid such unintended effects. As an effect, the O{s\|z} behavior changes for loop unrolling and loop unroll and jam, matching O2 rather than O3. The change also parameterizes the threshold values used for loop unrolling, primarily to aid testing. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72547	2020-01-16 09:00:56 -08:00
Matt Arsenault	4ca1ad85b7	AMDGPU/GlobalISel: Don't handle legacy buffer intrinsic	2020-01-16 11:31:12 -05:00
Matt Arsenault	9b2f3532c7	AMDGPU/GlobalISel: Select DS GWS intrinsics	2020-01-16 11:25:10 -05:00
Jay Foad	885260d5d8	[GlobalISel] Don't arbitrarily limit a mask to 64 bits Reviewers: arsenm Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72853	2020-01-16 16:13:20 +00:00
Jay Foad	63f73545dd	[GlobalISel] Pass MachineOperands into MachineIRBuilder helper methods Reviewers: arsenm, aditya_nandakumar, aemerson Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72849	2020-01-16 16:04:21 +00:00
Sam Parker	760b175109	[ARM][LowOverheadLoops] Update liveness info Recommitting `e93e0d413f` after reverting due to test failures, which will hopefully now be fixed. Original commit message: After expanding the pseudo instructions, update the liveness info. We do this in a post-order traversal of the loop, including its exit blocks and preheader(s). Differential Revision: https://reviews.llvm.org/D72131	2020-01-16 15:44:25 +00:00
Jay Foad	28bb43bdf8	[GlobalISel] Use more MachineIRBuilder helper methods Reviewers: arsenm, nhaehnle Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72833	2020-01-16 15:34:51 +00:00
Anna Welker	c24cf97960	[ARM][MVE] Enable extending gathers Enables the masked gather pass to create extending masked gathers. Differential Revision: https://reviews.llvm.org/D72451	2020-01-16 15:24:54 +00:00
Francesco Petrogalli	66c120f025	[VectorUtils] Rework the Vector Function Database (VFDatabase). Summary: This commits is a rework of the patch in https://reviews.llvm.org/D67572. The rework was requested to prevent out-of-tree performance regression when vectorizing out-of-tree IR intrinsics. The vectorization of such intrinsics is enquired via the static function `isTLIScalarize`. For detail see the discussion in https://reviews.llvm.org/D67572. Reviewers: uabelho, fhahn, sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72734	2020-01-16 15:08:26 +00:00
Jeremy Morse	c969335abd	Revert "[PHIEliminate] Move dbg values after phi and label" Testing compiler-rt, a new assertion failure occurs when building the GwpAsanTestObjects object. I'm uploading a reproducer to D70597. This reverts commit `75188b01e9`.	2020-01-16 14:01:27 +00:00
Simon Pilgrim	23a887b0dd	Fix unused variable warning. NFCI.	2020-01-16 13:02:40 +00:00
Chris Ye	75188b01e9	[PHIEliminate] Move dbg values after phi and label If there are DBG_VALUEs between phi and label (after phi and before label), DBG_VALUE will block PHI lowering after the LABEL. Moving all DBG_VALUEs after Labels in the function ScheduleDAGSDNodes::EmitSchedule to avoid impacting PHI lowering. before: PHI DBG_VALUE LABEL after: (move DBG_VALUE after label) PHI LABEL DBG_VALUE then: (phi lowering after label) LABEL COPY DBG_VALUE Fixes the issue: https://bugs.llvm.org/show_bug.cgi?id=43859 Differential Revision: https://reviews.llvm.org/D70597	2020-01-16 11:58:09 +00:00
Florian Hahn	23c113802e	[LV] Allow assume calls in predicated blocks. The assume intrinsic is intentionally marked as may reading/writing memory, to avoid passes moving them around. When flattening the CFG for predicated blocks, we have to drop the assume calls, as they are control-flow dependent. There are some cases where we can do better (when control flow is preserved), but that is follow-up work. Fixes PR43620. Reviewers: hsaito, rengolin, dcaballe, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D68814	2020-01-16 10:11:35 +00:00
Sameer Sahasrabuddhe	ed181efa17	[HIP][AMDGPU] expand printf when compiling HIP to AMDGPU Summary: This change implements the expansion in two parts: - Add a utility function emitAMDGPUPrintfCall() in LLVM. - Invoke the above function from Clang CodeGen, when processing a HIP program for the AMDGPU target. The printf expansion has undefined behaviour if the format string is not a compile-time constant. As a sufficient condition, the HIP ToolChain now emits -Werror=format-nonliteral. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D71365	2020-01-16 15:15:38 +05:30
Kazushi (Jam) Marukawa	773ae62ff8	[VE] i64 arguments, return values and constants Summary: Support for i64 arguments (in register), return values and constants along with tests. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D72776	2020-01-16 10:09:50 +01:00
Craig Topper	5cf1b01a01	[LegalizeDAG][TargetLowering] Move vXi64/i64->vXf32/f32 uint_to_fp legalizing code from TargetLowering::expandUINT_TO_FP back to LegalizeDAG. This was moved in October 2018, but we don't appear to be using this for vectors on any in tree target. Moving it back simplifies D72794 so we can share the code for i32->f32.	2020-01-15 22:04:50 -08:00
Liu, Chen3	8fdafb7dce	Insert wait instruction after X87 instructions which could raise float-point exception. This patch also modify some mayRaiseFPException flag which set in D68854. Differential Revision: https://reviews.llvm.org/D72750	2020-01-16 12:12:51 +08:00
Matt Arsenault	c378e52cb9	Set some fast math attributes in setFunctionAttributes This will provide a more consistent view to codegen for these attributes. The current system is somewhat awkward, and the fields in TargetOptions are reset based on the command line flag if the attribute isn't set. By forcing these attributes with the flag, there can never be an inconsistency in the behavior if code directly inspects the attribute on the function without considering the command line flags.	2020-01-15 22:23:18 -05:00
Craig Topper	e445447921	[X86] When handling i64->f32 sint_to_fp on 32-bit targets only bitcast to f64 if sse2 is enabled. The code is trying to copy the i64 value to an xmm register to use a 64-bit store so that the 64-bit fild can benefit from store forwarding. But this trick only works if f64 is going to be stored in an XMM register. If we only have SSE1 then only float is in xmm register. So this trick just causes 2 stores i32 stores, an f64 load into the x87, an f64 from x87, and a 64-bit fild. So we end up with an extra stack temporary and still didn't get store forwarding. We might be able to use v2f32 here instead, but I didn't check. I just wanted the code to make sense. Found by inspection as I continue to stare too hard at our int_to_fp conversions.	2020-01-15 18:26:28 -08:00
Yuanfang Chen	6e24c6037f	Revert "[Support] make report_fatal_error `abort` instead of `exit`" This reverts commit `647c3f4e47`. Got bots failure from sanitizer-windows and maybe others.	2020-01-15 17:52:25 -08:00
Yuanfang Chen	647c3f4e47	[Support] make report_fatal_error `abort` instead of `exit` Summary: This patch could be treated as a rebase of D33960. It also fixes PR35547. A fix for `llvm/test/Other/close-stderr.ll` is proposed in D68164. Seems the consensus is that the test is passing by chance and I'm not sure how important it is for us. So it is removed like in D33960 for now. The rest of the test fixes are just adding `--crash` flag to `not` tool. ** The reason it fixes PR35547 is `exit` does cleanup including calling class destructor whereas `abort` does not do any cleanup. In multithreading environment such as ThinLTO or JIT, threads may share states which mostly are ManagedStatic<>. If faulting thread tearing down a class when another thread is using it, there are chances of memory corruption. This is bad 1. It will stop error reporting like pretty stack printer; 2. The memory corruption is distracting and nondeterministic in terms of error message, and corruption type (depending one the timing, it could be double free, heap free after use, etc.). Reviewers: rnk, chandlerc, zturner, sepavloff, MaskRay, espindola Reviewed By: rnk, MaskRay Subscribers: wuzish, jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, arichardson, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, lenary, s.egerton, pzheng, cfe-commits, MaskRay, filcab, davide, MatzeB, mehdi_amini, hiraditya, steven_wu, dexonsmith, rupprecht, seiya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D67847	2020-01-15 17:05:13 -08:00
Stanislav Mekhanoshin	8b417dd3d6	Process BUNDLE in tail duplication When tail duplication estimates a size of tail it uses instruction count. Account for a number of instrictions in a bundle too. Differential Revision: https://reviews.llvm.org/D72783	2020-01-15 15:46:57 -08:00
Vedant Kumar	360abb7ee5	[CodeExtractor] Transfer debug info to extracted function After extracting, fix up debug info in both the old and new functions by 1) Pointing line locations and debug intrinsics to the new subprogram scope, and 2) Deleting intrinsics which point to values outside of the new function. Depends on https://reviews.llvm.org/D72795. Testing: check-llvm, check-clang, a build of LNT in the `-Os -g` config with "-mllvm -hot-cold-split=1" set, and end-to-end debugging of a toy program which undergoes splitting to verify that lldb can find variables, single step, etc. in extracted code. rdar://45507940 Differential Revision: https://reviews.llvm.org/D72801	2020-01-15 15:38:36 -08:00
Matt Arsenault	711a17afaf	AMDGPU/GlobalISel: Select exp with patterns This does produce slightly different code. Now a unique IMPLICIT_DEF is emitted for each of the implicit_def operands, rather than reusing the same one.	2020-01-15 18:33:15 -05:00
Matt Arsenault	eef92f25cc	AMDGPU: Remove custom node for exports I'm mildly worried about potentially reordering exp/exp_done with IntrWriteMem on the intrinsic. Requires hacking out the illegal type on SI, so manually select that case during lowering.	2020-01-15 18:33:15 -05:00
Matt Arsenault	25e9938a45	GlobalISel: Handle more cases of G_SEXT narrowing This now develops the same problem G_ZEXT/G_ANYEXT have where the requested type is assumed to be the source type. This will be fixed separately by creating intermediate merges.	2020-01-15 18:33:15 -05:00
Brian Gesiak	daab9227ff	[IR] Module's NamedMD table needn't be 'void ' Summary: In July 21 2010 `llvm::NamedMDNode` was refactored such that it would no longer subclass `llvm::Value`: https://github.com/llvm/llvm-project/commit/2637cc1a38d7336ea30caf As part of this change, a map type from metadata names to their named metadata, `llvm::MDSymbolTable`, was deleted. In its place, the type of member `llvm::Module::NamedMDSymTab` was changed, from `llvm::MDSymbolTable` to `void `. The underlying memory allocations for this pointer were changed to `new StringMap<NamedMDNode >()`. However, as far as I can tell, there's no need for obscuring the underlying type being pointed to by the `void `, and no need for static casts from `void ` to `StringMap`. In fact, I don't think there's a need for explicit calls to `new` and `delete` at all. This commit changes `NamedMDSymTab` from a pointer to a reference, which automatically couples its lifetime with the lifetime of its owning `llvm::Module` instance, thus removing the explicit calls to `new` and `delete` in the `llvm::Module` constructor and destructor. It also changes the type from `void ` to a newly defined `NamedMDSymTabType`, and removes the static casts. Test Plan: An ASAN-enabled build and run of `check-all` succeeds with this change (aside from some tests that always fail for me in ASAN for some reason, such as `check-clang` `SemaTemplate/stack-exhaustion.cpp`). Reviewers: aprantl, dblaikie, chandlerc, pcc, echristo Reviewed By: dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72812	2020-01-15 18:27:25 -05:00
Vedant Kumar	43464509fc	DWARF: Simplify the way the return PC is attached to call site tags, NFC This cleanup was suggested by Djordje in D72489.	2020-01-15 14:16:21 -08:00
Fedor Sergeev	8a4d12ae5b	[BasicBlock] add helper getPostdominatingDeoptimizeCall It appears to be rather useful when analyzing Loops with multiple deoptimizing exits, perhaps merged ones. For now it is used in LoopPredication, will be adding more uses in other loop passes. Reviewers: asbirlea, fhahn, skatkov, spatel, reames Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D72754	2020-01-16 01:15:57 +03:00
Jinsong Ji	c65ac2ba78	[MachineScheduler][NFC] Don't swap when we can't cluster https://reviews.llvm.org/D72706 tried to reduce reordering due to mem op clustering. This patch avoid doing the swap when we can't cluster. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D72800	2020-01-15 21:55:31 +00:00
Mircea Trofin	5466597fee	[NFC] Refactor InlineResult for readability Summary: InlineResult is used both in APIs assessing whether a call site is inlinable (e.g. llvm::isInlineViable) as well as in the function inlining utility (llvm::InlineFunction). It means slightly different things (can/should inlining happen, vs did it happen), and the implicit casting may introduce ambiguity (casting from 'false' in InlineFunction will default a message about hight costs, which is incorrect here). The change renames the type to a more generic name, and disables implicit constructors. Reviewers: eraman, davidxl Reviewed By: davidxl Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72744	2020-01-15 13:34:20 -08:00
Zhongduo Lin	34ba96a3d4	[NFC][IndVarSimplify] remove duplicate code in widenWithVariantLoadUseCodegen. Summary: Duplicate code in widenWithVariantLoadUseCodegen is removed and also use assert to check unknown extension type as it should be filtered out by the pre condition check before calling this function. Reviewers: az, sanjoy, sebpop, efriedma, javed.absar, sanjoy.google Reviewed By: efriedma Subscribers: hiraditya, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D72652	2020-01-15 16:27:58 -05:00
Vedant Kumar	a2cc80bc95	DebugInfo: Factor out logic to update locations in MD_loop metadata, NFC Factor out the logic needed to update debug locations contained within MD_loop metadata. This refactor is preparation for a future change that also needs to rewrite MD_loop metadata. rdar://45507940	2020-01-15 13:02:36 -08:00
Vedant Kumar	f0120556c7	[DWARF] Emit DW_AT_call_return_pc as an address This reverts D53469, which changed llvm's DWARF emission to emit DW_AT_call_return_pc as a function-local offset. Such an encoding is not compatible with post-link block re-ordering tools and isn't standards- compliant. In addition to reverting back to the original DW_AT_call_return_pc encoding, teach lldb how to fix up DW_AT_call_return_pc when the address comes from an object file pointed-to by a debug map. While doing this I noticed that lldb's support for tail calls that cross a DSO/object file boundary wasn't covered, so I added tests for that. This latter case exercises the newly added return PC fixup. The dsymutil changes in this patch were originally included in D49887: the associated test should be sufficient to test DW_AT_call_return_pc encoding purely on the llvm side. Differential Revision: https://reviews.llvm.org/D72489	2020-01-15 13:02:23 -08:00
Lang Hames	c75180258e	[ORC] Set setCloneToNewContextOnEmit on LLJIT's transform layer when needed. Based on Don Hinton's patch in https://reviews.llvm.org/D72406. This feature was accidentally left out of `e9e26c01cd`, and would have pessimized concurrent compilation in the default case. Thanks for spotting this Don!	2020-01-15 10:22:57 -08:00
Amara Emerson	2e39ea726e	Revert "Revert rG6078f2fedcac5797ac39ee5ef3fd7a35ef1202d5 - "[AArch64][GlobalISel]: Support @llvm.{return,frame}address selection."" The original change wasn't constraining the operand regclasses which broke EXPENSIVE_CHECKS.	2020-01-15 10:13:11 -08:00
Mark Murray	da9d57d2c2	[ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics. Summary: Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics and unit tests. Reviewers: simon_tatham, miyuki, dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72761	2020-01-15 17:20:15 +00:00
evgeny	10cadee5ce	[ThinLTO] Always import constants This patch imports constant variables even when they can't be internalized (which results in promotion). This offers some extra constant folding opportunities. Differential revision: https://reviews.llvm.org/D70404	2020-01-15 19:29:01 +03:00
Arkady Shlykov	3f3017e162	[Loop Peeling] Add possibility to enable peeling on loop nests. Summary: Current peeling implementation bails out in case of loop nests. The patch introduces a field in TargetTransformInfo structure that certain targets can use to relax the constraints if it's profitable (disabled by default). Also additional option is added to enable peeling manually for experimenting and testing purposes. Reviewers: fhahn, lebedev.ri, xbolva00 Reviewed By: xbolva00 Subscribers: xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D70304	2020-01-15 08:25:21 -08:00
Sanjay Patel	3180af4362	[InstCombine] reassociate fsub+fsub into fsub+fadd As discussed in the motivating PR44509: https://bugs.llvm.org/show_bug.cgi?id=44509 ...we can end up with worse code using fast-math than without. This is because the reassociate pass greedily transforms fsub into fneg/fadd and apparently (based on the regression tests seen here) expects instcombine to clean that up if it wasn't profitable. But we were missing this fold: (X - Y) - Z --> X - (Y + Z) There's another, more specific case that I think we should handle as shown in the "fake" fneg test (but missed with a real fneg), but that's another patch. That may be tricky to get right without conflicting with existing transforms for fneg. Differential Revision: https://reviews.llvm.org/D72521	2020-01-15 11:14:13 -05:00
Lang Hames	e9e26c01cd	[ORC] Simplify use of lazyReexports with LLJIT. This patch makes the target triple available via the LLJIT interface, and moves the IRTransformLayer from LLLazyJIT down into LLJIT. Together these changes make it easier to use the lazyReexports utility with LLJIT, and to apply IR transforms to code as it is compiled in LLJIT (rather than requiring transforms to be applied manually before code is added). An code example is added in llvm/examples/LLJITExamples/LLJITWithLazyReexports	2020-01-15 08:02:53 -08:00
Lang Hames	d2fabd7006	[ORC] Update lazyReexports to support aliases with different symbol names. A bug in the existing implementation meant that lazyReexports would not work if the aliased name differed from the alias's name, i.e. all lazy reexports had to be of the form (lib1, name) -> (lib2, name). This patch fixes the issue by capturing the alias's name in the NotifyResolved callback. To simplify this capture, and the LazyCallThroughManager code in general, the NotifyResolved callback is updated to use llvm::unique_function rather than a custom class. No test case yet: This can only be tested at runtime, and the only in-tree client (lli) always uses aliases with matching names. I will add a new LLJIT example shortly that will directly test the lazyReexports API and the non-trivial alias use case.	2020-01-15 08:02:53 -08:00
Hubert Tong	63b428e386	DWARFDebugLine.cpp: Format unknown line number standard opcodes Summary: This patch implements `formatv()` formatting for `dwarf::LineNumberOps` and makes use of it for the `llvm-dwarfdump --debug-line` dump. Previously, unknown line number standard opcodes would lead to undefined behaviour. The code would attempt to format the data pointer of an empty `StringRef` (a null pointer) using `%s`. According to the description for `format()`, use of that interface carries the "risk of `printf`". Passing a null pointer in place of an array to a C library function results in undefined behaviour. Reviewers: jhenderson, daltenty, stevewan Reviewed By: jhenderson Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72369	2020-01-15 10:45:50 -05:00
Matt Arsenault	936483fb7d	GlobalISel: Implement lower for G_BITCAST Bitcast only really applies between scalars and vectors. Implement as an unmerge and remerge. The test needs to tolerate failure since one of the unmerges currently fails to legalize.	2020-01-15 08:58:58 -05:00
Matt Arsenault	bd7658a212	AMDGPU: Partially directly select llvm.amdgcn.interp.p1.f16 The 16 bank LDS case is complicated due to using multiple instructions. If I attempt to write a pattern for it, the generated selector incorrectly places the copy to m0 after the first instruction, so that needs to be separately addressed. Also fix not gluing the copy to m0 to the second operation in the second half of the 16 bank lowering.	2020-01-15 08:58:58 -05:00
Matt Arsenault	91715617ad	GlobalISel: Fix narrowScalar for G_ANYEXT results This is nearly the same as G_ZEXT.	2020-01-15 08:58:57 -05:00
Nemanja Ivanovic	9c64f04df8	[PowerPC] Legalize saturating vector add/sub These intrinsics and the corresponding ISD nodes were recently added. PPC has instructions that do this for vectors. Legalize them and add patterns to emit the satuarting instructions. Differential revision: https://reviews.llvm.org/D71940	2020-01-15 07:00:38 -06:00
Simon Pilgrim	e26a78e708	Revert rG6078f2fedcac5797ac39ee5ef3fd7a35ef1202d5 - "[AArch64][GlobalISel]: Support @llvm.{return,frame}address selection." These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656 ---- Breaks EXPENSIVE_CHECKS builds.	2020-01-15 12:37:37 +00:00
Zakk Chen	7bc58a779a	[RISCV] Support ABI checking with per function target-features if users don't specific -mattr, the default target-feature come from IR attribute. Reviewers: lenary, asb Reviewed By: lenary, asb Tags: #llvm Differential Revision: https://reviews.llvm.org/D70837	2020-01-15 04:35:01 -08:00
Zakk Chen	3bc2860e92	Revert "[RISCV] Support ABI checking with per function target-features" This reverts commit `109e4d12ed`.	2020-01-15 04:32:57 -08:00
Simon Pilgrim	0b64400e0b	RegisterClassInfo::computePSetLimit - assert that we actually find a register. Fixes "pointer is null" clang static analyzer warning.	2020-01-15 12:18:12 +00:00
Simon Pilgrim	7b15865225	Fix "pointer is null" static analyzer warning. NFCI. Use cast<> instead of dyn_cast<> since the pointer is always dereferenced and cast<> will perform the null assertion for us.	2020-01-15 12:18:11 +00:00
Georgii Rymar	7570d387c2	[yaml2obj/obj2yaml] - Add support for SHT_RELR sections. Note: this is a reland with a trivial 2 lines fix in ELFState<ELFT>::writeSectionContent. It adds a check similar to ones we already have for other sections to fix the case revealed by bots, like http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/60744. The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ] i.e. start with an address, followed by any number of bitmaps. The address entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31) relocations each, at subsequent offsets following the last address entry. More information is here: https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272 This patch adds a support for these sections. Differential revision: https://reviews.llvm.org/D71872	2020-01-15 15:15:24 +03:00
Benjamin Kramer	06cfcdcca7	[AArch64][SVE] Fold variable into assert to silence unused variable warnings in Release builds	2020-01-15 12:50:27 +01:00
Georgii Rymar	ca6f616532	Revert "[yaml2obj/obj2yaml] - Add support for SHT_RELR sections." This reverts commit `46d11e30ee`. It broke bots. E.g. http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/60744	2020-01-15 14:19:00 +03:00
Cullen Rhodes	93a4dede3a	[AArch64][SVE] Add ptest intrinsics Summary: Implements the following intrinsics: * @llvm.aarch64.sve.ptest.any * @llvm.aarch64.sve.ptest.first * @llvm.aarch64.sve.ptest.last Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72398	2020-01-15 11:15:01 +00:00
Georgii Rymar	46d11e30ee	[yaml2obj/obj2yaml] - Add support for SHT_RELR sections. The encoded sequence of Elf*_Relr entries in a SHT_RELR section looks like [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ] i.e. start with an address, followed by any number of bitmaps. The address entry encodes 1 relocation. The subsequent bitmap entries encode up to 63(31) relocations each, at subsequent offsets following the last address entry. More information is here: https://github.com/llvm-mirror/llvm/blob/master/lib/Object/ELF.cpp#L272 This patch adds a support for these sections. Differential revision: https://reviews.llvm.org/D71872	2020-01-15 13:54:08 +03:00
Zakk Chen	109e4d12ed	[RISCV] Support ABI checking with per function target-features if users don't specific -mattr, the default target-feature come from IR attribute.	2020-01-15 02:30:43 -08:00
Igor Kudrin	2142e20f50	[DWARF] Fix DWARFDebugAranges to support 64-bit CU offsets. DWARFContext, the only user of this class, can already handle such offsets. Differential Revision: https://reviews.llvm.org/D71834	2020-01-15 17:19:08 +07:00
cdevadas	0dc6c249bf	[AMDGPU] Invert the handling of skip insertion. The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a sideeffect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip Pass. Differential revision: https://reviews.llvm.org/D68092	2020-01-15 15:18:16 +05:30
Kazushi (Jam) Marukawa	064859bde7	[VE] Minimal codegen for empty functions Summary: This patch implements minimal VE code generation for empty function bodies (no args, no value return). Contents * empty function code generation test. * Minimal function prologue & epilogue emission * Instruction formats and instruction definitions as far as required for the empty function prologue & epilogue. * I64 register class definitions. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D72598	2020-01-15 09:55:16 +01:00
Craig Topper	be8f217b18	[X86] Don't call LowerUINT_TO_FP_i32 for i32->f80 on 32-bit targets with sse2. We were performing an emulated i32->f64 in the SSE registers, then storing that value to memory and doing a extload into the X87 domain. After this patch we'll now just store the i32 to memory along with an i32 0. Then do a 64-bit FILD to f80 completely in the X87 unit. This matches what we do without SSE.	2020-01-15 00:43:07 -08:00
Hideto Ueno	188f9a348d	[Attributor] AAValueConstantRange: Value range analysis using constant range Summary: This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point. One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument). The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty). Currently, AAValueConstantRange is created in `getAssumedConstant` method when `AAValueSimplify` returns `nullptr`(worst state). Supported - BinaryOperator(add, sub, ...) - CmpInst(icmp eq, ...) - !range metadata `AAValueConstantRange` is not intended to extend to polyhedral range value analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: phosek, davezarzycki, baziotis, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71620	2020-01-15 16:34:23 +09:00
David Green	b891490ceb	[Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use ScheduleDAGMI. NFC All the callers of this function will be ScheduleDAGMI from the MachineScheduler. This allows us to use the extra info available in ScheduleDAGMI without resorting to awkward casts.	2020-01-15 07:21:44 +00:00
Justin Hibbits	36eedfcb3c	[PowerPC] Fix powerpcspe subtarget enablement in llvm backend Summary: As currently written, -target powerpcspe will enable SPE regardless of disabling the feature later on in the command line. Instead, change this to just set a default CPU to 'e500' instead of a generic CPU. As part of this, add FeatureSPE to the e500 definition. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D72673	2020-01-14 22:07:03 -06:00
Tom Stellard	0dbcb36394	CMake: Make most target symbols hidden by default Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 36221 nm after/libLLVM-9svn.so \| grep ' [A-Zuvw] ' \| wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439	2020-01-14 19:46:52 -08:00
Hubert Tong	aca3e70d2b	DWARFDebugLine.cpp: Restore LF line endings rG7e02406f6cf180a8c89ce64665660e7cc9dbc23e switched the file to CRLF line endings.	2020-01-14 21:23:39 -05:00
Philip Reames	1a7398eca2	[BranchAlign] Add master --x86-branches-within-32B-boundaries flag This flag was originally part of D70157, but was removed as we carved away pieces of the review. Since we have the nop support checked in, and it appears mature(), I think it's time to add the master flag. For now, it will default to nop padding, but once the prefix padding support lands, we'll update the defaults. () I can now confirm that downstream testing of the changes which have landed to date - nop padding and compiler support for suppressions - is passing all of the functional testing we've thrown at it. There might still be something lurking, but we've gotten enough coverage to be confident of the basic approach. Note that the new flag can be used either when assembling an .s file, or when using the integrated assembler directly from the compiler. The later will use all of the suppression mechanism and should always generate correct code. We don't yet have assembly syntax for the suppressions, so passing this directly to the assembler w/a raw .s file may result in broken code. Use at your own risk. Also note that this isn't the wiring for the clang option. I think the most recent review for that is D72227, but I've lost track, so that might be off. Differential Revision: https://reviews.llvm.org/D72738	2020-01-14 18:17:53 -08:00
Reid Kleckner	40cd26c700	[Win64] Handle FP arguments more gracefully under -mno-sse Pass small FP values in GPRs or stack memory according the the normal convention. This is what gcc -mno-sse does on Win64. I adjusted the conditions under which we emit an error to check if the argument or return value would be passed in an XMM register when SSE is disabled. This has a side effect of no longer emitting an error for FP arguments marked 'inreg' when targetting x86 with SSE disabled. Our calling convention logic was already assigning it to FP0/FP1, and then we emitted this error. That seems unnecessary, we can ignore 'inreg' and compile it without SSE. Reviewers: jyknight, aemerson Differential Revision: https://reviews.llvm.org/D70465	2020-01-14 17:19:35 -08:00
Craig Topper	76291e1158	[X86] Drop an unneeded FIXME. NFC The extload on X87 is free.	2020-01-14 17:05:46 -08:00
Craig Topper	57eb56b839	[X86] Swap the 0 and the fudge factor in the constant pool for the 32-bit mode i64->f32/f64/f80 uint_to_fp algorithm. This allows us to generate better code for selecting the fixup to load. Previously when the sign was set we had to load offset 0. And when it was clear we had to load offset 4. This required a testl, setns, zero extend, and finally a mul by 4. By switching the offsets we can just shift the sign bit into the lsb and multiply it by 4.	2020-01-14 17:05:23 -08:00
Michael Liao	01a4b83154	[codegen,amdgpu] Enhance MIR DIE and re-arrange it for AMDGPU. Summary: - `dead-mi-elimination` assumes MIR in the SSA form and cannot be arranged after phi elimination or DeSSA. It's enhanced to handle the dead register definition by skipping use check on it. Once a register def is `dead`, all its uses, if any, should be `undef`. - Re-arrange the DIE in RA phase for AMDGPU by placing it directly after `detect-dead-lanes`. - Many relevant tests are refined due to different register assignment. Reviewers: rampitec, qcolombet, sunfish Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72709	2020-01-14 19:26:15 -05:00
Michael Liao	8d07f8d98c	[DAGCombine] Replace `getIntPtrConstant()` with `getVectorIdxTy()`. - Prefer `getVectorIdxTy()` as the index operand type for `EXTRACT_SUBVECTOR` as targets expect different types by overloading `getVectorIdxTy()`.	2020-01-14 17:03:05 -05:00
Amara Emerson	6078f2fedc	[AArch64][GlobalISel]: Support @llvm.{return,frame}address selection. These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656	2020-01-14 13:41:21 -08:00
Craig Topper	9ee90ea55c	[LegalizeTypes] Remove untested code from ExpandIntOp_UINT_TO_FP This code is untested in tree because the "APFloat::semanticsPrecision(sem) >= SrcVT.getSizeInBits() - 1" check is false for most combinations for int and fp types except maybe i32 and f64. For that you would need i32 to be an illegal type, but f64 to be legal and have custom handling for legalizing the split sint_to_fp. The precision check itself was added in 2010 to fix a double rounding issue in the algorithm that would occur if the sint_to_fp was not able to do the conversion without rounding. Differential Revision: https://reviews.llvm.org/D72728	2020-01-14 13:15:29 -08:00
Nikita Popov	04e586151e	[InstCombine] Fix worklist management when removing guard intrinsic When multiple guard intrinsics are merged into one, currently the result of eraseInstFromFunction() is returned -- however, this should only be done if the current instruction is being removed. In this case we're removing a different instruction and should instead report that the current one has been modified by returning it. For this test case, this reduces the number of instcombine iterations from 5 to 2 (the minimum possible). Differential Revision: https://reviews.llvm.org/D72558	2020-01-14 21:47:48 +01:00
Danilo Carvalho Grael	26d96126a0	[SVE] Add patterns for MUL immediate instruction. Summary: Add the missing MUL pattern for integer immediate instructions. Reviewers: sdesmalen, huntergr, efriedma, c-rhodes, kmclaughlin Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D72654	2020-01-14 15:26:19 -05:00
Nikita Popov	410331869d	[NewPM] Port MergeFunctions pass This ports the MergeFunctions pass to the NewPM. This was rather straightforward, as no analyses are used. Additionally MergeFunctions needs to be conditionally enabled in the PassBuilder, but I left that part out of this patch. Differential Revision: https://reviews.llvm.org/D72537	2020-01-14 20:55:41 +01:00
Nikita Popov	65c0805be5	[InstCombine] Fix infinite loop due to bitcast <-> phi transforms Fix for https://bugs.llvm.org/show_bug.cgi?id=44245. The optimizeBitCastFromPhi() and FoldPHIArgOpIntoPHI() end up fighting against each other, because optimizeBitCastFromPhi() assumes that bitcasts of loads will get folded. This doesn't happen here, because a dangling phi node prevents the one-use fold in https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L620-L628 from triggering. This patch fixes the issue by explicitly performing the load combine as part of the bitcast of phi transform. Other attempts to force the load to be combined first were ultimately too unreliable. Differential Revision: https://reviews.llvm.org/D71164	2020-01-14 20:45:13 +01:00
Nikita Popov	b4dd928ffb	[InstCombine] Make combineLoadToNewType a method; NFC So it can be reused as part of other combines. In particular for D71164.	2020-01-14 20:40:03 +01:00
Nikita Popov	652cd7c100	[InstCombine] Fix user iterator invalidation in bitcast of phi transform This fixes the issue encountered in D71164. Instead of using a range-based for, manually iterate over the users and advance the iterator beforehand, so we do not skip any users due to iterator invalidation. Differential Revision: https://reviews.llvm.org/D72657	2020-01-14 20:38:10 +01:00
Jay Foad	b777e551f0	[MachineScheduler] Reduce reordering due to mem op clustering Summary: Mem op clustering adds a weak edge in the DAG between two loads or stores that should be clustered, but the direction of this edge is pretty arbitrary (it depends on the sort order of MemOpInfo, which represents the operands of a load or store). This often means that two loads or stores will get reordered even if they would naturally have been scheduled together anyway, which leads to test case churn and goes against the scheduler's "do no harm" philosophy. The fix makes sure that the direction of the edge always matches the original code order of the instructions. Reviewers: atrick, MatzeB, arsenm, rampitec, t.p.northover Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72706	2020-01-14 19:19:02 +00:00
lewis-revill	cd800f3b22	[RISCV] Allow shrink wrapping for RISC-V Enabling shrink wrapping requires ensuring the insertion point of the epilogue is correct for MBBs without a terminator, in which case the instruction to adjust the stack pointer is the last instruction in the block. Differential Revision: https://reviews.llvm.org/D62190	2020-01-14 18:59:11 +00:00
Teresa Johnson	2cefb93951	[ThinLTO/WPD] Remove an overly-aggressive assert Summary: An assert added to the index-based WPD was trying to verify that we only have multiple vtables for a given guid when they are all non-external linkage. This is too conservative because we may have multiple external vtable with the same guid when they are in comdat. Remove the assert, as we don't have comdat information in the index, the linker should issue an error in this case. See discussion on D71040 for more information. Reviewers: evgeny777, aganea Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72648	2020-01-14 10:57:14 -08:00
Craig Topper	98c54fb1fe	[X86] Directly emit a BROADCAST_LOAD from constant pool in lowerUINT_TO_FP_vXi32 to avoid double loads seen in D71971 By directly emitting the constants as a constant pool load we seem to avoid the build_vector/extract_subvector combines that resulted in the duplicate loads we had before. Differential Revision: https://reviews.llvm.org/D72307	2020-01-14 10:50:39 -08:00
diggerlin	eb23cc136b	[AIX][XCOFF] Supporting the ReadOnlyWithRel SectionKnd SUMMARY: In this patch we put the global variable in a Csect which's SectionKind is "ReadOnlyWithRel" into Data Section. Reviewers: hubert.reinterpretcast,jasonliu,Xiangling_L Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D72461	2020-01-14 13:21:49 -05:00
Juneyoung Lee	3e32b7e127	[InstCombine] Let combineLoadToNewType preserve ABI alignment of the load (PR44543) Summary: If aligment on `LoadInst` isn't specified, load is assumed to be ABI-aligned. And said aligment may be different for different types. So if we change load type, but don't pay extra attention to the aligment (i.e. keep it unspecified), we may either overpromise (if the default aligment of the new type is higher), or underpromise (if the default aligment of the new type is smaller). Thus, if no alignment is specified, we need to manually preserve the implied ABI alignment. This addresses https://bugs.llvm.org/show_bug.cgi?id=44543 by making combineLoadToNewType preserve ABI alignment of the load. Reviewers: spatel, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72710	2020-01-15 03:20:53 +09:00
Dmitri Gribenko	2948ec5ca9	Removed PointerUnion3 and PointerUnion4 aliases in favor of the variadic template	2020-01-14 18:56:29 +01:00
Sanjay Patel	c8a14c2d47	[IR] fix potential crash in Constant::isElementWiseEqual() There's only one user of this API currently, and it seems impossible that it would compare values with different types. But that's not true in general, so we need to make sure the types are the same. As denoted by the FIXME comments, we will also crash on FP values. That's what brought me here, but we can make that a follow-up patch.	2020-01-14 11:52:38 -05:00
Sjoerd Meijer	a08c0adee0	[ARM][MVE] VTP Block Pass fix Fix a missing and broken test: 2 VPT blocks predicated on the same VCMP instruction that can be folded. The problem was that for each VPT block, we record the predicate statements with a list, but the same instruction was added twice. Thus, we were running in an assert trying to remove the same instruction twice. To avoid this the instructions are now recorded with a set. Differential Revision: https://reviews.llvm.org/D72699	2020-01-14 16:10:55 +00:00
Sanne Wouda	1cc8fff420	[AArch64] Fix save register pairing for Windows AAPCS Summary: On Windows, when a function does not have an unwind table (for example, EH filtering funclets), we don't correctly pair FP and LR to form the frame record in all circumstances. Fix this by invalidating a pair when the second register is FP when compiling for Windows, even when CFI is not needed. Fixes PR44271 introduced by D65653. Reviewers: efriedma, sdesmalen, rovka, rengolin, t.p.northover, thegameg, greened Reviewed By: rengolin Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71754	2020-01-14 15:08:27 +00:00
Florian Hahn	192cce10f6	Revert "Recommit "[GlobalOpt] Pass DTU to removeUnreachableBlocks instead of recomputing."" This reverts commit `a03d7b0f24`. As discussed in D68298, this causes a compile-time regression, in case the DTs requested are not used elsewhere in GlobalOpt. We should only get the DTs if they are available here, but this seems not possible with the legacy pass manager from a module pass.	2020-01-14 14:50:07 +00:00
Xiangling Liao	25a8aec7f3	[AIX] ExternalSymbolSDNode lowering For memcpy/memset/memmove etc., replace ExternalSymbolSDNode with a MCSymbolSDNode, which have a prefix dot before function name as entry point symbol. Differential Revision: https://reviews.llvm.org/D70718	2020-01-14 09:39:02 -05:00
Tim Northover	77cc690bae	AArch64: fix bitcode upgrade of @llvm.neon.addp. We were upgrading it to faddp, but a version taking two type parameters instead of one. This then got upgraded a second time to the version with just one parameter, but occasionally (for reasons I don't understand) this unusual two-stage process corrupted a use-list, leading to a crash when the two faddp declarations didn't match.	2020-01-14 13:41:32 +00:00
Ulrich Weigand	81ee484484	[FPEnv] Fix chain handling regression after `04a8696` Code in getRoot made the assumption that every node in PendingLoads must always itself have a dependency on the current DAG root node. After the changes in `04a8696`, it turns out that this assumption no longer holds true, causing wrong codegen in some cases (e.g. stores after constrained FP intrinsics might get deleted). To fix this, we now need to make sure that the TokenFactor created by getRoot always includes the previous root, if there is no implicit dependency already present. The original getControlRoot code already has exactly this check, so this patch simply reuses that code now for getRoot as well. This fixes the regression. NFC if no constrained FP intrinsic is present.	2020-01-14 14:10:57 +01:00
Benjamin Kramer	df186507e1	Make helper functions static or move them into anonymous namespaces. NFC.	2020-01-14 14:06:37 +01:00
Simon Tatham	71d5454b37	[ARM,MVE] Use the new Tablegen `defvar` and `if` statements. Summary: This cleans up a lot of ugly `foreach` bodges that I've been using to work around the lack of those two language features. Now they both exist, I can make then all into something more legible! In particular, in the common pattern in `ARMInstrMVE.td` where a multiclass defines an `Instruction` instance plus one or more `Pat` that select it, I've used a `defvar` to wrap `!cast<Instruction>(NAME)` so that the patterns themselves become a little more legible. Replacing a `foreach` with a `defvar` removes a level of block structure, so several pieces of code have their indentation changed by this patch. Best viewed with whitespace ignored. NFC: the output of `llvm-tblgen -print-records` on the two affected Tablegen sources is exactly identical before and after this change, so there should be no effect at all on any of the other generated files. Reviewers: MarkMurrayARM, miyuki Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, dmgreen, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72690	2020-01-14 12:08:03 +00:00
Sam Parker	e27632c302	[ARM][LowOverheadLoops] Allow all MVE instrs. We have a whitelist of instructions that we allow when tail predicating, since these are trivial ones that we've deemed need no special handling. Now change ARMLowOverheadLoops to allow the non-trivial instructions if they're contained within a valid VPT block. Since a valid block is one that is predicated upon the VCTP so we know that these non-trivial instructions will still behave as expected once the implicit predication is used instead. This also fixes a previous test failure. Differential Revision: https://reviews.llvm.org/D72509	2020-01-14 12:03:58 +00:00
Simon Pilgrim	31aed2e0da	Fix "MIParser::getIRValue(unsigned int)’ defined but not used" warning. NFCI.	2020-01-14 11:58:54 +00:00
Simon Pilgrim	c05a11108b	[SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SHL handling. As mentioned by @nikic on rGef5debac4302, we can merge the guaranteed bottom zero bits from the shifted value, and then, if a min shift amount is known, zero out the bottom bits as well.	2020-01-14 11:51:41 +00:00
Sam Parker	bad6032bc1	[ARM][LowOverheadLoops] Change predicate inspection Use the already provided helper function to get the operand type so that we can detect whether the vpr is being used as a predicate or not. Also use existing helpers to get the predicate indices when we converting the vpt blocks. This enables us to support both types of vpr predicate operand. Differential Revision: https://reviews.llvm.org/D72504	2020-01-14 11:47:34 +00:00
Diogo Sampaio	d94d079a6a	[ARM][Thumb2] Fix ADD/SUB invalid writes to SP Summary: This patch fixes pr23772 [ARM] r226200 can emit illegal thumb2 instruction: "sub sp, r12, #80". The violation was that SUB and ADD (reg, immediate) instructions can only write to SP if the source register is also SP. So the above instructions was unpredictable. To enforce that the instruction t2(ADD\|SUB)ri does not write to SP we now enforce the destination register to be rGPR (That exclude PC and SP). Different than the ARM specification, that defines one instruction that can read from SP, and one that can't, here we inserted one that can't write to SP, and other that can only write to SP as to reuse most of the hard-coded size optimizations. When performing this change, it uncovered that emitting Thumb2 Reg plus Immediate could not emit all variants of ADD SP, SP #imm instructions before so it was refactored to be able to. (see test/CodeGen/Thumb2/mve-stacksplot.mir where we use a subw sp, sp, Imm12 variant ) It also uncovered a disassembly issue of adr.w instructions, that were only written as SUBW instructions (see llvm/test/MC/Disassembler/ARM/thumb2.txt). Reviewers: eli.friedman, dmgreen, carwil, olista01, efriedma, andreadb Reviewed By: efriedma Subscribers: gbedwell, john.brawn, efriedma, ostannard, kristof.beyls, hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70680	2020-01-14 11:47:19 +00:00
Simon Pilgrim	a43b0065c5	[SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SRL handling. As mentioned by @nikic on rGef5debac4302 (although that was just about SHL), we can merge the guaranteed top zero bits from the shifted value, and then, if a min shift amount is known, zero out the top bits as well. SHL tests / handling will be added in a follow up patch.	2020-01-14 11:41:47 +00:00
Sam Parker	e73b20c57d	[ARM][MVE] Disallow VPSEL for tail predication Due to the current way that we collect predicated instructions, we can't easily handle vpsel in tail predicated loops. There are a couple of issues: 1) It will use the VPR as a predicate operand, but doesn't have to be instead a VPT block, which means we can assert while building up the VPT block because we don't find another VPST to being a new one. 2) VPSEL still requires a VPR operand even after tail predicating, which means we can't remove it unless there is another instruction, such as vcmp, that can provide the VPR def. The first issue should be a relatively simple fix in the logic of the LowOverheadLoops pass, whereas the second will require us to represent the 'implicit' tail predication with an explicit value. Differential Revision: https://reviews.llvm.org/D72629	2020-01-14 11:41:17 +00:00
Anna Welker	72ca86fd34	[ARM][MVE] Masked gathers from base + vector of offsets Enables the masked gather pass to create a masked gather loading from a base and vector of offsets. This also enables v8i16 and v16i8 gather loads. Differential Revision: https://reviews.llvm.org/D72330	2020-01-14 10:33:52 +00:00
Simon Tatham	ddbc0b1e51	[TableGen] Introduce an if/then/else statement. Summary: This allows you to make some of the defs in a multiclass or `foreach` conditional on an expression computed from the parameters or iteration variables. It was already possible to simulate an if statement using a `foreach` with a dummy iteration variable and a list constructed using `!if` so that it had length 0 or 1 depending on the condition, e.g. foreach unusedIterationVar = !if(condition, [1], []<int>) in { ... } But this syntax is nicer to read, and also more convenient because it allows an else clause. To avoid upheaval in the implementation, I've implemented `if` as pure syntactic sugar on the `foreach` implementation: internally, `ParseIf` actually does construct exactly the kind of foreach shown above (and another reversed one for the else clause if present). Reviewers: nhaehnle, hfinkel Reviewed By: hfinkel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71474	2020-01-14 10:19:53 +00:00
Simon Tatham	3388b0f59d	[TableGen] Introduce a `defvar` statement. Summary: This allows you to define a global or local variable to an arbitrary value, and refer to it in subsequent definitions. The main use I anticipate for this is if you have to compute some difficult function of the parameters of a multiclass, and then use it many times. For example: multiclass Foo<int i, string s> { defvar op = !cast<BaseClass>("whatnot_" # s # "_" # i); def myRecord { dag a = (op this, (op that, the other), (op x, y, z)); int b = op.subfield; } def myOtherRecord<"template params including", op>; } There are a couple of ways to do this already, but they're not really satisfactory. You can replace `defvar x = y` with a loop over a singleton list, `foreach x = [y] in { ... }` - but that's unintuitive to someone who hasn't seen that workaround idiom before, and requires an extra pair of braces that you often didn't really want. Or you can define a nested pair of multiclasses, with the inner one taking `x` as a template parameter, and the outer one instantiating it just once with the desired value of `x` computed from its other parameters - but that makes it awkward to sequentially compute each value based on the previous ones. I think `defvar` makes things considerably easier. You can also use `defvar` at the top level, where it inserts globals into the same map used by `defset`. That allows you to define global constants without having to make a dummy record for them to live in: defvar MAX_BUFSIZE = 512; // previously: // def Dummy { int MAX_BUFSIZE = 512; } // and then refer to Dummy.MAX_BUFSIZE everywhere Reviewers: nhaehnle, hfinkel Reviewed By: hfinkel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71407	2020-01-14 10:19:53 +00:00
Stanislav Mekhanoshin	ad741853c3	[AMDGPU] Model distance to instruction in bundle This change allows to model the height of the instruction within a bundle for latency adjustment purposes. Differential Revision: https://reviews.llvm.org/D72669	2020-01-14 01:18:59 -08:00
Stanislav Mekhanoshin	eca4474587	[AMDGPU] Fix getInstrLatency() always returning 1 We do not have InstrItinerary so generic getInstLatency() was always defaulting to return 1 cycle. We need to use TargetSchedModel instead to compute an instruction's latency. Differential Revision: https://reviews.llvm.org/D72655	2020-01-14 01:08:30 -08:00
Fangrui Song	0136f226c4	[MC] Don't resolve relocations referencing STB_LOCAL STT_GNU_IFUNC	2020-01-13 23:36:06 -08:00
Craig Topper	b1dcd84c7e	[X86] Copy the nofpexcept flag when folding a load into an instruction using the load folding tables./	2020-01-13 22:02:45 -08:00
Eli Friedman	e68e4cbcc5	[GlobalISel] Change representation of shuffle masks in MachineOperand. We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant. The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts. Differential Revision: https://reviews.llvm.org/D72663	2020-01-13 16:55:41 -08:00
Hiroshi Yamauchi	7b9f8e17d1	[PGO][CHR] Guard against 0-to-0 branch weight and avoid division by zero crash. Summary: This fixes a crash in internal builds under SamplePGO. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72653	2020-01-13 14:38:58 -08:00
Amy Huang	328e0f3dca	Revert "[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions." This reverts commit `c958639098`, which causes a crash. See https://reviews.llvm.org/D70524 for details.	2020-01-13 13:58:14 -08:00
Teresa Johnson	31441a3e00	[ThinLTO/WPD] Fix index-based WPD for alias vtables Summary: A recent fix in D69452 fixed index based WPD in the presence of available_externally vtables. It added a cast of the vtable def summary to a GlobalVarSummary. However, in some cases one def may be an alias, in which case we need to get the base object before casting, otherwise we will crash. Reviewers: evgeny777, steven_wu, aganea Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71040	2020-01-13 13:38:26 -08:00
Craig Topper	26c7a4ed10	[LegalizeIntegerTypes][X86] Add support for expanding input of STRICT_SINT_TO_FP/STRICT_UINT_TO_FP into a libcall. Needed to support i128->fp128 on 32-bit X86. Add full set of strict sint_to_fp/uint_to_fp conversion tests for fp128.	2020-01-13 13:11:12 -08:00
Alexey Lapshin	f163755eb0	[Dsymutil][Debuginfo][NFC] #3 Refactor dsymutil to separate DWARF optimizing part. Summary: This is the next portion of patches for dsymutil. Create DwarfEmitter interface to generate all debug info tables. Put DwarfEmitter into DwarfLinker library and make tools/dsymutil/DwarfStreamer to be child of DwarfEmitter. It passes check-all testing. MD5 checksum for clang .dSYM bundle matches for the dsymutil with/without that patch. Reviewers: JDevlieghere, friss, dblaikie, aprantl Reviewed By: JDevlieghere Subscribers: merge_guards_bot, hiraditya, thegameg, probinson, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D72476	2020-01-13 23:33:25 +03:00
Teresa Johnson	d0aad9f56e	[LTO] Constify lto::Config reference passed to backends (NFC) The lto::Config object saved on the global LTO object should not be updated by any of the LTO backends. Otherwise we could run into interference between threads utilizing it. Motivated by some proposed changes that would have caused it to get modified in the ThinLTO backends.	2020-01-13 12:26:17 -08:00
Daniel Sanders	a0f4600f4f	Rework `be15dfa88f` such that it works with GlobalISel which doesn't use EVT Summary: `be15dfa88f` broke GlobalISel's usage of getSetCCInverse() which currently appears to be limited to our out-of-tree backend. GlobalISel doesn't use EVT's and isn't able to derive them from the information it has as it doesn't distinguish between integer and floating point types (that distinction is made by operations rather than values). Bring back the bool version of getSetCCInverse() in a way that doesn't break the intent of `be15dfa88f` but also allows GlobalISel to continue using it. Reviewers: spatel, bogner, arichardson Reviewed By: arichardson Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72309	2020-01-13 12:19:37 -08:00
Fangrui Song	64a93afc3c	[X86][Disassembler] Fix a bug when disassembling an empty string readPrefixes() assumes insn->bytes is non-empty. The code path is not exercised in llvm-mc because llvm-mc does not feed empty input to MCDisassembler::getInstruction(). This bug is uncovered by `a5994c789a`. An empty string did not crash before because the deleted regionReader() allowed UINT64_C(-1) as insn->readerCursor. Bytes.size() <= Address -> R->Base 0 <= UINT64_C(-1) - UINT32_C(-1)	2020-01-13 10:42:21 -08:00
Puyan Lotfi	484a7472f1	[llvm][MIRVRegNamerUtils] Adding hashing on FrameIndex MachineOperands. This patch makes it so that cases where multiple instructions that differ only in their FrameIndex MachineOperand values no longer collide. For instance: %1:_(p0) = G_FRAME_INDEX %stack.0 %2:_(p0) = G_FRAME_INDEX %stack.1 Prior to this patch these instructions would collide together. Differential Revision: https://reviews.llvm.org/D71583	2020-01-13 13:39:54 -05:00
Matt Arsenault	203801425d	AMDGPU/GlobalISel: Select llvm.amdgcn.ds.ordered.{add\|swap}	2020-01-13 13:09:38 -05:00
Simon Pilgrim	c6fcd5d115	[SelectionDAG] ComputeNumSignBits add getValidMaximumShiftAmountConstant() for ISD::SHL support Allows us to handle non-uniform SHL shifts to determine the minimum number of sign bits remaining (based off the maximum shift amount value)	2020-01-13 18:02:37 +00:00
Matt Arsenault	3d8f1b2d22	AMDGPU/GlobalISel: Set insert point after waterfall loop The current users of the waterfall loop utility functions do not make use of the restored original insert point. The insertion is either done, or they set the insert point somewhere else. A future change will want to insert instructions after the waterfall loop, but figuring out the point after the loop is more difficult than ensuring the insert point is there after the loop.	2020-01-13 12:51:05 -05:00
Matt Arsenault	ca19d7a399	AMDGPU/GlobalISel: Fix branch targets when emitting SI_IF The branch target needs to be changed depending on whether there is an unconditional branch or not. Loops also need to be similarly fixed, but compiling a simple testcase end to end requires another set of patches that aren't upstream yet.	2020-01-13 12:51:05 -05:00
Matt Arsenault	7d9b0a61c3	AMDGPU/GlobalISel: Simplify assert	2020-01-13 12:51:05 -05:00
Andrew Wei	05366870ee	[LegalizeTypes] Add SoftenFloatResult support for STRICT_SINT_TO_FP/STRICT_UINT_TO_FP Some target like arm/riscv with soft-float will have compiling crash when using -fno-unsafe-math-optimization option. This patch will add the missing strict FP support to SoftenFloatRes_XINT_TO_FP. Differential Revision: https://reviews.llvm.org/D72277	2020-01-14 01:01:56 +08:00
Simon Pilgrim	38e2c01221	[SelectionDAG] ComputeNumSignBits add getValidMinimumShiftAmountConstant() ISD::SRA support Allows us to handle more non-uniform SRA sign bits cases	2020-01-13 16:55:02 +00:00
David Green	90555d9253	[Scheduler] Remove superfluous casts. NFC	2020-01-13 16:34:13 +00:00
Danilo Carvalho Grael	2d7e757a83	[AArch64][SVE] Add patterns for some arith SVE instructions. Summary: Add patterns for the following instructions: - smax, smin, umax, umin Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: amehsan Differential Revision: https://reviews.llvm.org/D71779	2020-01-13 11:39:42 -05:00
James Henderson	07804f75a6	[DebugInfo] Make debug line address size mismatch non-fatal to parsing Reasonable assumptions can be made when a parsed address length does not match the expected length, so there's no need for this to be fatal. Reviewed by: ikudrin Differential Revision: https://reviews.llvm.org/D72154	2020-01-13 16:27:05 +00:00
Kazu Hirata	6b686703e6	[Inlining] Add PreInlineThreshold for the new pass manager Summary: This patch makes it easy to try out different preinlining thresholds with a command-line switch just like -preinline-threshold for the legacy pass manager. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72618	2020-01-13 07:59:42 -08:00
Luís Marques	043c5eafa8	[RISCV] Handle globals and block addresses in asm operands Summary: These seem to be the machine operand types currently needed by the RISC-V target. Reviewers: asb, lenary Reviewed By: lenary Tags: #llvm Differential Revision: https://reviews.llvm.org/D72275	2020-01-13 15:34:56 +00:00
Pablo Barrio	da33762de8	[AArch64] Emit HINT instead of PAC insns in Armv8.2-A or below Summary: The Pointer Authentication Extension (PAC) was added in Armv8.3-A. Some instructions are implemented in the HINT space to allow compiling code common to CPUs regardless of whether they feature PAC or not, and still benefit from PAC protection in the PAC-enabled CPUs. The 8.3-specific mnemonics were currently enabled in any architecture, and LLVM was emitting them in assembly files when PAC code generation was enabled. This was ok for compilations where both LLVM codegen and the integrated assembler were used. However, the LLVM codegen was not compatible with other assemblers (e.g. GAS). Given the fact that the approach from these assemblers (i.e. to disallow Armv8.3-A mnemonics if compiling for Armv8.2-A or lower) is entirely reasonable, this patch makes LLVM to emit HINT when building for Armv8.2-A and below, instead of PACIASP, AUTIASP and friends. Then, LLVM assembly should be compatible with other assemblers. Reviewers: samparker, chill, LukeCheeseman Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71658	2020-01-13 14:14:48 +00:00
Alex Richardson	8e8ccf4712	[MIPS] Don't emit R_(MICRO)MIPS_JALR relocations against data symbols The R_(MICRO)MIPS_JALR optimization only works when used against functions. Using the relocation against a data symbol (e.g. function pointer) will cause some linkers that don't ignore the hint in this case (e.g. LLD prior to commit `5bab291b7b`) to generate a relative branch to the data symbol which crashes at run time. Before this patch, LLVM was erroneously emitting these relocations against local-dynamic TLS function pointers and global function pointers with internal visibility. Reviewers: atanasyan, jrtc27, vstefanovic Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D72571	2020-01-13 14:14:03 +00:00
Alex Richardson	894f742acb	[MIPS][ELF] Use PC-relative relocations in .eh_frame when possible When compiling position-independent executables, we now use DW_EH_PE_pcrel \| DW_EH_PE_sdata4. However, the MIPS ABI does not define a 64-bit PC-relative ELF relocation so we cannot use sdata8 for the large code model case. When using the large code model, we fall back to the previous behaviour of generating absolute relocations. With this change clang-generated .o files can be linked by LLD without having to pass -Wl,-z,notext (which creates text relocations). This is simpler than the approach used by ld.bfd, which rewrites the .eh_frame section to convert absolute relocations into relative references. I saw in D13104 that apparently ld.bfd did not accept pc-relative relocations for MIPS ouput at some point. However, I also checked that recent ld.bfd can process the clang-generated .o files so this no longer seems true. Reviewed By: atanasyan Differential Revision: https://reviews.llvm.org/D72228	2020-01-13 14:14:03 +00:00
Simon Pilgrim	376bc39c82	[SelectionDAG] ComputeNumSignBits - Use getValidShiftAmountConstant for shift opcodes getValidShiftAmountConstant handles out of bounds shift amounts for us, allowing us to remove the local handling.	2020-01-13 14:12:12 +00:00
Simon Pilgrim	6d1a8fd447	[SelectionDAG] ComputeKnownBits - Add DemandedElts support to getValidShiftAmountConstant/getValidMinimumShiftAmountConstant()	2020-01-13 14:12:12 +00:00
Ulrich Weigand	04a86966fb	[FPEnv] Fix chain handling for fpexcept.strict nodes We need to ensure that fpexcept.strict nodes are not optimized away even if the result is unused. To do that, we need to chain them into the block's terminator nodes, like already done for PendingExcepts. This patch adds two new lists of pending chains, PendingConstrainedFP and PendingConstrainedFPStrict to hold constrained FP intrinsic nodes without and with fpexcept.strict markers. This allows not only to solve the above problem, but also to relax chains a bit further by no longer flushing all FP nodes before a store or other memory access. (They are still flushed before nodes with other side effects.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D72341	2020-01-13 14:38:49 +01:00
Simon Pilgrim	ef5debac43	[SelectionDAG] ComputeKnownBits add getValidMinimumShiftAmountConstant() ISD::SHL support As mentioned on D72573	2020-01-13 12:02:13 +00:00
Simon Pilgrim	8f49204f26	[SelectionDAG] ComputeKnownBits - minimum leading/trailing zero bits in LSHR/SHL (PR44526) As detailed in https://blog.regehr.org/archives/1709 we don't make use of the known leading/trailing zeros for shifted values in cases where we don't know the shift amount value. This patch adds support to SelectionDAG::ComputeKnownBits to use KnownBits::countMinTrailingZeros and countMinLeadingZeros to set the minimum guaranteed leading/trailing known zero bits. Differential Revision: https://reviews.llvm.org/D72573	2020-01-13 11:08:12 +00:00
Simon Pilgrim	7f1cf7d5f6	[X86] Fix MSVC "truncation from 'int' to 'bool'" warning. NFCI.	2020-01-13 11:08:12 +00:00
Sjoerd Meijer	add04b9653	ARMLowOverheadLoops: return earlier to avoid printing irrelevant dbg msg. NFC	2020-01-13 10:24:10 +00:00
KAWASHIMA Takahiro	10c11e4e2d	This option allows selecting the TLS size in the local exec TLS model, which is the default TLS model for non-PIC objects. This allows large/ many thread local variables or a compact/fast code in an executable. Specification is same as that of GCC. For example, the code model option precedes the TLS size option. TLS access models other than local-exec are not changed. It means supoort of the large code model is only in the local exec TLS model. Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>) Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard Reviewd By: peter.smith Committed by: peter.smith Differential Revision: https://reviews.llvm.org/D71688	2020-01-13 10:16:53 +00:00
Sam Elliott	c9babcbda7	[RISCV] Collect Statistics on Compressed Instructions Summary: It is useful to keep statistics on how many instructions we have compressed, so we can see if future changes are increasing or decreasing this number. Reviewers: asb, luismarques Reviewed By: asb, luismarques Subscribers: xbolva00, sameer.abuasal, hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67495	2020-01-13 10:04:05 +00:00
Sjoerd Meijer	07028b5a87	[SCEV] Follow up of D71563: addressing post commit comment. NFC.	2020-01-13 08:54:38 +00:00
Awanish Pandey	c958639098	[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions. Summary: This patch will provide support for auto return type for the C++ member functions. Before this return type of the member function is deduced and stored in the DIE. This patch includes llvm side implementation of this feature. Patch by: Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D70524	2020-01-13 12:26:13 +05:30
Craig Topper	52aaf4a275	[X86] Use SDNPOptInGlue instead of SDNPInGlue on a couple SDNodes. At least one of these is used without a Glue. This doesn't seem to change the X86GenDAGISel.inc output so maybe it doesn't matter?	2020-01-12 21:11:18 -08:00
Matt Arsenault	555e7ee04c	AMDGPU/GlobalISel: Don't use XEXEC class for SGPRs We don't use the xexec register classes for arbitrary values anymore. Avoids a test variance beween GlobalISel and SelectionDAG>	2020-01-12 22:44:51 -05:00
Matt Arsenault	a10527cd37	AMDGPU/GlobalISel: Copy type when inserting readfirstlane getDefIgnoringCopies will fail to find any def if no type is set if we try to use it on the use's operand, so propagate the type.	2020-01-12 22:44:51 -05:00
Zheng Chen	a6342c247a	[SCEV] accurate range for addrecexpr with nuw flag If addrecexpr has nuw flag, the value should never be less than its start value and start value does not required to be SCEVConstant. Reviewed By: nikic, sanjoy Differential Revision: https://reviews.llvm.org/D71690	2020-01-12 20:22:37 -05:00
James Clarke	0113cf193f	[RISCV] Check register class for AMO memory operands Summary: AMO memory operands use a custom parser in order to accept both (reg) and 0(reg). However, the validation predicate used for these operands was only checking that they were registers, and not the register class, so non-GPRs (such as FPRs) were also accepted. Thus, fix this by making the predicate check that they are GPRs. Reviewers: asb, lenary Reviewed By: asb, lenary Subscribers: hiraditya, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72471	2020-01-13 00:50:37 +00:00
Fangrui Song	2bfee35cb8	[MC][ELF] Emit a relocation if target is defined in the same section and is non-local For a target symbol defined in the same section, currently we don't emit a relocation if VariantKind is VK_None (with few exceptions like RISC-V relaxation), while GNU as emits one. This causes program behavior differences with and without -ffunction-sections, and can break intended symbol interposition in a -shared link. ``` .globl foo foo: call foo # no relocation. On other targets, may be written as b foo, etc call bar # a relocation if bar is in another section (e.g. -ffunction-sections) call foo@plt # a relocation ``` Unify these cases by always emitting a relocation. If we ever want to optimize `call foo` in -shared links, we should emit a STB_LOCAL alias and call via the alias. ARM/thumb2-beq-fixup.s: we now emit a relocation to global_thumb_fn as GNU as does. X86/Inputs/align-branch-64-2.s: we now emit R_X86_64_PLT32 to foo as GNU does. ELF/relax.s: rewrite the test as target-in-same-section.s . We omitted relocations to `global` and now emit R_X86_64_PLT32. Note, GNU as does not emit a relocation for `jmp global` (maybe its own bug). Our new behavior is compatible except `jmp global`. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D72197	2020-01-12 13:46:24 -08:00
Fangrui Song	7fa5290d5b	__patchable_function_entries: don't use linkage field 'unique' with -no-integrated-as .section name, "flags"G, @type, GroupName[, linkage] As of binutils 2.33, linkage cannot be 'unique'. For integrated assembler, we use both 'o' flag and 'unique' linkage to support --gc-sections and COMDAT with lld. https://sourceware.org/ml/binutils/2019-11/msg00266.html	2020-01-12 12:53:44 -08:00
Markus Böck	de797ccdd7	[NFC] Fix compilation of CrashRecoveryContext.cpp on mingw Patch by Markus Böck. Differential Revision: https://reviews.llvm.org/D72564	2020-01-12 14:43:16 -05:00
Fangrui Song	ebd26cc8c4	[PowerPC] Delete PPCDarwinAsmPrinter and PPCMCAsmInfoDarwin Darwin support has been removed. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D72063	2020-01-12 11:02:02 -08:00
Simon Pilgrim	66e39067ed	[X86][AVX] Use lowerShuffleAsLanePermuteAndSHUFP to lower binary v4f64 shuffles. Only perform this if we are shuffling lower and upper lane elements across the lanes (otherwise splitting to lower xmm shuffles would be better). This is a regression if we shuffle build_vectors due to getVectorShuffle canonicalizing 'blend of splat' build vectors, for now I've set this not to shuffle build_vector nodes at all to avoid this.	2020-01-12 12:29:41 +00:00
Simon Pilgrim	b375f28b0e	[X86][AVX] lowerShuffleAsLanePermuteAndSHUFP - only set the demanded elements of the lane mask. Fixes an cyclic dependency issue with an upcoming patch where getVectorShuffle canonicalizes masks with splat build vector sources.	2020-01-12 09:41:40 +00:00
Fangrui Song	60cc095ecc	[X86][Disassembler] Merge X86DisassemblerDecoder.cpp into X86Disassembler.cpp and refactor	2020-01-12 00:53:36 -08:00
Fangrui Song	51c1d7c4be	[X86][Disassembler] Simplify	2020-01-12 00:53:35 -08:00
Qiu Chaofan	f33fd43a7c	[NFC] Refactor memory ops cluster method Current implementation of BaseMemOpsClusterMutation is a little bit obscure. This patch directly uses a map from store chain ID to set of memory instrs to make it simpler, so that future improvements are easier to read, update and review. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D72070	2020-01-12 13:10:04 +08:00
Craig Topper	d692f0f6c8	[X86] Don't call LowerSETCC from LowerSELECT for STRICT_FSETCC/STRICT_FSETCCS nodes. This causes the STRICT_FSETCC/STRICT_FSETCCS nodes to lowered early while lowering SELECT, but the output chain doesn't get connected. Then we visit the node again when it is its turn because we haven't replaced the use of the chain result. In the case of the fp128 libcall lowering, after D72341 this will cause the libcall to be emitted twice.	2020-01-11 20:43:00 -08:00
Zheng Chen	569ccfc384	[SCEV] more accurate range for addrecexpr with nsw flag. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D72436	2020-01-11 23:26:35 -05:00
Craig Topper	efb674ac2f	[LegalizeVectorOps] Parallelize the lo/hi part of STRICT_UINT_TO_FLOAT legalization. The lo and hi computation are independent. Give them the same input chain and TokenFactor the results together.	2020-01-11 17:50:30 -08:00
Craig Topper	ed679804d5	[TargetLowering][X86] Connect the chain from STRICT_FSETCC in TargetLowering::expandFP_TO_UINT and X86TargetLowering::FP_TO_INTHelper.	2020-01-11 17:50:20 -08:00
Craig Topper	ddfcd82bdc	[LegalizeVectorOps] Expand vector MERGE_VALUES immediately. Custom legalization can produce MERGE_VALUES to return multiple results. We can expand them immediately instead of leaving them around for DAG combine to clean up.	2020-01-11 17:50:20 -08:00
Fangrui Song	1e8ce7492e	[X86][Disassembler] Optimize argument passing and immediate reading	2020-01-11 15:43:26 -08:00
Fangrui Song	6fdd6a7b3f	[Disassembler] Delete the VStream parameter of MCDisassembler::getInstruction() The argument is llvm::null() everywhere except llvm::errs() in llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds. If we ever have the needs to add verbose log to disassemblers, we can record log with a member function, instead of passing it around as an argument.	2020-01-11 13:34:52 -08:00
Lang Hames	2cdb18afda	[ORC] Fix argv handling in runAsMain / lli. This fixes an off-by-one error in the argc value computed by runAsMain, and switches lli back to using the input bitcode (rather than the string "lli") as the effective program name. Thanks to Stefan Graenitz for spotting the bug.	2020-01-11 13:03:38 -08:00
Alexandre Ganea	a1f16998f3	[Support] Optionally call signal handlers when a function wrapped by the the CrashRecoveryContext fails This patch allows for handling a failure inside a CrashRecoveryContext in the same way as the global exception/signal handler. A failure will have the same side-effect, such as cleanup of temporarty file, printing callstack, calling relevant signal handlers, and finally returning an exception code. This is an optional feature, disabled by default. This is a support patch for D69825. Differential Revision: https://reviews.llvm.org/D70568	2020-01-11 15:27:07 -05:00
Fangrui Song	179abb091d	[X86][Disassembler] Replace custom logger with LLVM_DEBUG llvm-objdump -d on clang is decreased from 7.8s to 7.4s. The improvement is likely due to the elimination of logger setup and dbgprintf(), which has a large overhead.	2020-01-11 12:17:05 -08:00
Craig Topper	5a9954c02a	[LegalizeVectorOps] Remove some of the simpler Expand methods. Pass Results vector to a couple. NFCI Some of the simplest handlers just call TLI and if that fails, they fall back to unrolling. For those just inline the TLI call and share the unrolling call with the default case of Expand. For ExpandFSUB and ExpandBITREVERSE so that its obvious they don't return results sometimes and want to defer to LegalizeDAG.	2020-01-11 12:14:19 -08:00
Craig Topper	9fe6f36c1a	[LegalizeVectorOps] Only pass SDNode* instead SDValue to all of the Expand* and Promote* methods. All the Expand* and Promote* function assume they are being called with result 0 anyway. Just hardcode result 0 into them.	2020-01-11 11:41:23 -08:00
Fangrui Song	a5994c789a	[X86][Disassembler] Simplify and optimize reader functions llvm-objdump -d on clang is decreased from 8.2s to 7.8s.	2020-01-11 11:24:38 -08:00
Craig Topper	9cc9120969	[X86] Turn FP_ROUND/STRICT_FP_ROUND into X86ISD::VFPROUND/STRICT_VFPROUND during PreprocessISelDAG to remove some duplicate isel patterns.	2020-01-11 11:06:52 -08:00
Lang Hames	d2751f8fdf	[ExecutionEngine] Re-enable FastISel for non-iOS arm targets. Patch by Nicolas Capens. Thanks Nicolas! https://reviews.llvm.org/D65015	2020-01-11 10:49:59 -08:00
Philip Reames	1d641daf26	[X86] Adjust nop emission by compiler to consider target decode limitations The primary motivation of this change is to bring the code more closely in sync behavior wise with the assembler's version of nop emission. I'd like to eventually factor them into one, but that's hard to do when one has features the other doesn't. The longest encodeable nop on x86 is 15 bytes, but many processors - for instance all intel chips - can't decode the 15 byte form efficiently. On those processors, it's better to use either a 10 byte or 11 byte sequence depending.	2020-01-11 08:45:17 -08:00
Philip Reames	563d3e3444	[X86AsmBackend] Move static function before sole use [NFC]	2020-01-11 08:45:17 -08:00
Philip Reames	6cb3957730	[X86AsmBackend] Be consistent about placing definitions out of line [NFC]	2020-01-11 08:45:17 -08:00
Simon Pilgrim	2740b2d5d5	Fix uninitialized value clang static analyzer warning. NFC.	2020-01-11 16:02:22 +00:00
Simon Pilgrim	a8ed86b5c7	moveOperands - assert Src/Dst MachineOperands are non-null. Fixes static-analyzer warnings.	2020-01-11 14:37:19 +00:00
Simon Pilgrim	24763734e7	[X86] Fix outdated comment The generic saturated math opcodes are no longer widened inside X86TargetLowering	2020-01-11 14:37:18 +00:00
Simon Pilgrim	ce35010d78	[X86][AVX] Add lowerShuffleAsLanePermuteAndSHUFP lowering Add initial support for lowering v4f64 shuffles to SHUFPD(VPERM2F128(V1, V2), VPERM2F128(V1, V2)), eventually this could be used for v8f32 (and maybe v8f64/v16f32) but I'm being conservative for the initial implementation as only v4f64 can always succeed. This currently is only called from lowerShuffleAsLanePermuteAndShuffle so only gets used for unary shuffles, and we limit this to cases where we use upper elements as otherwise concating 2 xmm shuffles is probably the better case. Helps with poor shuffles mentioned in D66004.	2020-01-11 12:42:00 +00:00
Nuno Lopes	87407fc03c	DSE: fix bug where we would only check libcalls for name rather than whole decl	2020-01-11 11:57:29 +00:00
Nikita Popov	0e322c8a1f	[InstCombine] Preserve nuw on sub of geps (PR44419) Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the nuw on sub of geps. We only do this if the offset has a multiplication as the final operation, as we can't be sure the operations is nuw in the other cases without more thorough analysis. Differential Revision: https://reviews.llvm.org/D72048	2020-01-11 11:01:12 +01:00
Craig Topper	81a3d987ce	[X86] Remove dead code from X86DAGToDAGISel::Select that is no longer needed now that we don't mutate strict fp nodes. NFC	2020-01-11 00:27:14 -08:00
Craig Topper	c2ddfa876f	[X86] Simplify code by removing an unreachable condition. NFCI For X87<->SSE conversions, the SSE type is always smaller than the X87 type. So we can always use the smallest type for the memory type.	2020-01-10 23:41:06 -08:00
Craig Topper	5fe5c0a60f	[X86] Preserve fpexcept property when turning strict_fp_extend and strict_fp_round into stack operations. We use the stack for X87 fp_round and for moving from SSE f32/f64 to X87 f64/f80. Or from X87 f64/f80 to SSE f32/f64. Note for the SSE<->X87 conversions the conversion always happens in the X87 domain. The load/store ops in the X87 instructions are able to signal exceptions.	2020-01-10 23:41:06 -08:00
Fangrui Song	fcad5b298c	[X86][Disassembler] Simplify readPrefixes	2020-01-10 23:37:22 -08:00
Craig Topper	69806808b9	[X86] Use ReplaceAllUsesWith instead of ReplaceAllUsesOfValueWith to simplify some code. NFCI	2020-01-10 20:31:21 -08:00
Michael Bedy	4a32cd11ac	[AMDGPU] Remove unnecessary v_mov from a register to itself in WQM lowering. Summary: - SI Whole Quad Mode phase is replacing WQM pseudo instructions with v_mov instructions. While this is necessary for the special handling of moving results out of WWM live ranges, it is not necessary for WQM live ranges. The result is a v_mov from a register to itself after every WQM operation. This change uses a COPY psuedo in these cases, which allows the register allocator to coalesce the moves away. Reviewers: tpr, dstuttard, foad, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71386	2020-01-10 23:01:19 -05:00
Craig Topper	bb2553175a	[TargetLowering][ARM][Mips][WebAssembly] Remove the ordered FP compare from RunttimeLibcalls.def and all associated usages Summary: This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that they were hardcoded into the type legalizer. At that time we never inverted predicates and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with a ogt/olt/oge/ole checks. So only ordered needed an inverted predicate. Later ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare. This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code. Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn Reviewed By: efriedma Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72536	2020-01-10 19:30:08 -08:00
Jessica Paquette	ceb801612a	[AArch64] Don't generate libcalls for wide shifts on Darwin Similar to `cff90f07cb`. Darwin doesn't always use compiler-rt, and so we can't assume that these functions are available (at least on arm64).	2020-01-10 15:58:51 -08:00
Mircea Trofin	064087581a	[NFC][InlineCost] Factor cost modeling out of CallAnalyzer traversal. Summary: The goal is to simplify experimentation on the cost model. Today, CallAnalyzer decides 2 things: legality, and benefit. The refactoring keeps legality assessment in CallAnalyzer, and factors benefit evaluation out, as an extension. Reviewers: davidxl, eraman Reviewed By: davidxl Subscribers: kamleshbhalui, fedor.sergeev, hiraditya, baloghadamsoftware, haicheng, a.sidorin, Szelethus, donat.nagy, dkrupp, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71733	2020-01-10 15:30:24 -08:00
Vedant Kumar	e05e219926	[LockFileManager] Make default waitForUnlock timeout a parameter, NFC Patch by Xi Ge!	2020-01-10 15:24:32 -08:00
Stanislav Mekhanoshin	987bf8b6c1	Let targets adjust operand latency of bundles This reverts the AMDGPU DAG mutation implemented in D72487 and gives a more general way of adjusting BUNDLE operand latency. It also replaces FixBundleLatencyMutation with adjustSchedDependency callback in the AMDGPU, fixing not only successor latencies but predecessors' as well. Differential Revision: https://reviews.llvm.org/D72535	2020-01-10 14:56:53 -08:00
Vedant Kumar	a9052b4dfc	[AArch64] Add isAuthenticated predicate to MCInstDesc Add a predicate to MCInstDesc that allows tools to determine whether an instruction authenticates a pointer. This can be used by diagnostic tools to hint at pointer authentication failures. Differential Revision: https://reviews.llvm.org/D70329 rdar://55089604	2020-01-10 14:30:52 -08:00
Craig Topper	71cee21861	[TargetLowering] Use SelectionDAG::getSetCC and remove a repeated call to getSetCCResultType in softenSetCCOperands. NFCI	2020-01-10 13:24:00 -08:00
Jonas Devlieghere	815a3f5433	[CMake] Fix modules build after DWARFLinker reorganization Create a dedicate module for the DWARFLinker and make it depend on intrinsics gen.	2020-01-10 11:06:38 -08:00
Craig Topper	b590e0fd81	[TargetLowering][ARM][X86] Change softenSetCCOperands handling of ONE to avoid spurious exceptions for QNANs with strict FP quiet compares ONE is currently softened to OGT \| OLT. But the libcalls for OGT and OLT libcalls will trigger an exception for QNAN. At least for X86 with libgcc. UEQ on the other hand uses UO \| OEQ. The UO and OEQ libcalls will not trigger an exception for QNAN. This patch changes ONE to use the inverse of the UEQ lowering. So we now produce O & UNE. Technically the existing behavior was correct for a signalling ONE, but since I don't know how to generate one of those from clang that seemed like something we can deal with later as we would need to fix other predicates as well. Also removing spurious exceptions seemed better than missing an exception. There are also problems with quiet OGT/OLT/OLE/OGE, but those are harder to fix. Differential Revision: https://reviews.llvm.org/D72477	2020-01-10 11:00:17 -08:00
Craig Topper	f678fc7660	[LegalizeVectorOps] Improve handling of multi-result operations. This system wasn't very well designed for multi-result nodes. As a consequence they weren't consistently registered in the LegalizedNodes map leading to nodes being revisited for different results. I've removed the "Result" variable from the main LegalizeOp method and used a SDNode* instead. The result number from the incoming Op SDValue is only used for deciding which result to return to the caller. When LegalizeOp is called it should always register a legalized result for all of its results. Future calls for any other result should be pulled for the LegalizedNodes map. Legal nodes will now register all of their results in the map instead of just the one we were called for. The Expand and Promote handling to use a vector of results similar to LegalizeDAG. Each of the new results is then re-legalized and logged in the LegalizedNodes map for all of the Results for the node being legalized. None of the handles register their own results now. And none call ReplaceAllUsesOfValueWith now. Custom handling now always passes result number 0 to LowerOperation. This matches what LegalizeDAG does. Since the introduction of STRICT nodes, I've encountered several issues with X86's custom handling being called with an SDValue pointing at the chain and our custom handlers using that to get a VT instead of result 0. This should prevent us from having any more of those issues. On return we will update the LegalizedNodes map for all results so we shouldn't call the custom handler again for each result number. I want to push SDNode* further into the Expand and Promote handlers, but I've left that for a follow to keep this patch size down. I've created a dummy SDValue(Node, 0) to keep the handlers working. Differential Revision: https://reviews.llvm.org/D72224	2020-01-10 10:14:58 -08:00
Fangrui Song	a8fbdc5769	[X86] Support function attribute "patchable-function-entry" For x86-64, we diverge from GCC -fpatchable-function-entry in that we emit multi-byte NOPs. Differential Revision: https://reviews.llvm.org/D72220	2020-01-10 09:57:28 -08:00
Fangrui Song	4d1e23e3b3	[AArch64] Add function attribute "patchable-function-entry" to add NOPs at function entry The Linux kernel uses -fpatchable-function-entry to implement DYNAMIC_FTRACE_WITH_REGS for arm64 and parisc. GCC 8 implemented -fpatchable-function-entry, which can be seen as a generalized form of -mnop-mcount. The N,M form (function entry points before the Mth NOP) is currently only used by parisc. This patch adds N,0 support to AArch64 codegen. N is represented as the function attribute "patchable-function-entry". We will use a different function attribute for M, if we decide to implement it. The patch reuses the existing patchable-function pass, and TargetOpcode::PATCHABLE_FUNCTION_ENTER which is currently used by XRay. When the integrated assembler is used, __patchable_function_entries will be created for each text section with the SHF_LINK_ORDER flag to prevent --gc-sections (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93197) and COMDAT (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93195) issues. Retrospectively, __patchable_function_entries should use a PC-relative relocation type to avoid the SHF_WRITE flag and dynamic relocations. "patchable-function-entry"'s interaction with Branch Target Identification is still unclear (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 for GCC discussions). Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D72215	2020-01-10 09:55:51 -08:00
jasonliu	dfed052fb3	[AIX] Allow vararg calls when all arguments reside in registers Summary: This patch pushes the AIX vararg unimplemented error diagnostic later and allows vararg calls so long as all the arguments can be passed in register. This patch extends the AIX calling convention implementation to initialize GPR(s) for vararg float arguments. On AIX, both GPR(s) and FPR are allocated for floating point arguments. The GPR(s) are only initialized for vararg calls, otherwise the callee is expected to retrieve the float argument in the FPR. f64 in AIX PPC32 requires special handling in order to allocated and initialize 2 GPRs. This is performed with bitcast, SRL, truncation to initialize one GPR for the MSW and bitcast, truncations to initialize the other GPR for the LSW. A future patch will follow to add support for arguments passed on the stack. Patch provided by: cebowleratibm Reviewers: sfertile, ZarkoCA, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D71013	2020-01-10 17:33:35 +00:00
Simon Pilgrim	a5bdada09d	[X86][AVX] lowerShuffleAsLanePermuteAndShuffle - consistently normalize multi-input shuffle elements We only use lowerShuffleAsLanePermuteAndShuffle for unary shuffles at the moment, but we should consistently handle lane index calculations for multiple inputs in both the AVX1 and AVX2 paths. Minor (almost NFC) tidyup as I'm hoping to use lowerShuffleAsLanePermuteAndShuffle for binary shuffles soon.	2020-01-10 17:21:20 +00:00
Yonghong Song	fbb64aa698	[BPF] extend BTF_KIND_FUNC to cover global, static and extern funcs Previously extern function is added as BTF_KIND_VAR. This does not work well with existing BTF infrastructure as function expected to use BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO. This patch added extern function to BTF_KIND_FUNC. The two bits 0:1 of btf_type.info are used to indicate what kind of function it is: 0: static 1: global 2: extern Differential Revision: https://reviews.llvm.org/D71638	2020-01-10 09:06:31 -08:00
Andrew Paverd	bdd88b7ed3	Add support for __declspec(guard(nocf)) Summary: Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a new `"guard_nocf"` function attribute to indicate that checks should not be added on indirect calls within that function. Add support for `__declspec(guard(nocf))` following the same syntax as MSVC. Reviewers: rnk, dmajor, pcc, hans, aaron.ballman Reviewed By: aaron.ballman Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72167	2020-01-10 16:04:12 +00:00
Nemanja Ivanovic	d864d93496	[PowerPC] Handle constant zero bits in BitPermutationSelector We currently crash when analyzing an AssertZExt node that has some bits that are constant zeros (i.e. as a result of an and with a constant). This issue was reported in https://bugs.llvm.org/show_bug.cgi?id=41088 and this patch fixes that. Differential revision: https://reviews.llvm.org/D72038	2020-01-10 09:55:34 -06:00

... 3 4 5 6 7 ...

130312 Commits