llvm-project

Commit Graph

Author	SHA1	Message	Date
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Hsiangkai Wang	8d06a678a5	[SelectionDAG] Avoid aliasing analysis if the object size is unknown. If the size of memory access is unknown, do not use it to analysis. One example of unknown size memory access is to load/store scalable vector objects on the stack. Differential Revision: https://reviews.llvm.org/D91833	2020-11-25 06:13:37 +08:00
Janek van Oirschot	42eaf4fe0a	[HardwareLoops] Change order of SCEV expression construction for InitLoopCount. Putting the +1 before the zero-extend will allow scalar evolution to fold the expression in some cases such as the one shown in PowerPC's `shrink-wrap.ll` test. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D91724	2020-11-24 18:01:42 +00:00
Yichao Yu	a248eca665	Clear NewGEPBases after finish using them in CodeGenPrep pass AFAICT all other set/map are correctly cleared in `runOnFunction`. With assertion enabled this causes a crash when the module is freed and potentially if a later pass delete the instruction (not observed in real world though). Without assertion this can potentially cause confusing result when running on a new Function/Module. Reviewed By: loladiro Differential Revision: https://reviews.llvm.org/D84031	2020-11-24 12:12:00 -05:00
Thomas Preud'homme	9c8af93c93	Add support for STRICT_FSETCC promotion Add missing handling of STRICT_FSETCC promotion. This prevents assert failure in llvm::TargetLoweringBase::getTypeToPromoteTo(). Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D91962	2020-11-24 16:53:49 +00:00
Kai Luo	5931be60b5	[DAGCombine][PowerPC] Convert negated abs to trivial arithmetic ops This patch converts `0 - abs(x)` to `Y = sra (X, size(X)-1); sub (Y, xor (X, Y))` for better codegen. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D91120	2020-11-24 09:43:35 +00:00
Pavel Labath	bce2ac9f6d	Revert "[DebugInfo] Refactor code for emitting DWARF expressions for FP constants" The commit introduced a crash when emitting (debug info for) complex floats (pr48277).	2020-11-24 09:11:33 +01:00
Martin Storsjö	6f792041a5	Reapply "[CodeGen] [WinException] Only produce handler data at the end of the function if needed" This reapplies `36c64af9d7` in updated form. Emit the xdata for each function at .seh_endproc. This keeps the exact same output header order for most code generated by the LLVM CodeGen layer. (Sections still change order for code built from assembly where functions lack an explicit .seh_handlerdata directive, and functions with chained unwind info.) The practical effect should be that assembly output lacks superfluous ".seh_handlerdata; .text" pairs at the end of functions that don't handle exceptions, which allows such functions to use the AArch64 packed unwind format again. Differential Revision: https://reviews.llvm.org/D87448	2020-11-23 23:17:03 +02:00
Pavel Labath	6ef7835afc	[DebugInfo] Refactor code for emitting DWARF expressions for FP constants This patch moves the selection of the style used to emit the numbers (DW_OP_implicit_value vs. DW_OP_const+DW_OP_stack_value) into DwarfExpression::addUnsignedConstant. This logic is not FP-specific, and it will be needed for large integers too. The refactor also makes DW_OP_implicit_value (DW_OP_stack_value worked already) be used for floating point constants other than float and double, so I've added a _Float16 test for it. Split off from D90916. Differential Revision: https://reviews.llvm.org/D91058	2020-11-23 09:59:07 +01:00
Kazu Hirata	85d6af393c	[CodeGen] Use pred_empty (NFC)	2020-11-22 22:16:13 -08:00
Simon Pilgrim	791040cd8b	[DAG] LowerMINMAX - move default expansion to generic TargetLowering::expandIntMINMAX This is part of the discussion on D91876 about trying to reduce custom lowering of MIN/MAX ops on older SSE targets - if we can improve generic vector expansion we should be able to relax the limitations in SelectionDAGBuilder when it will let MIN/MAX ops be generated, and avoid having to flag so many ops as 'custom'.	2020-11-22 13:02:27 +00:00
Kazu Hirata	68403af007	[MBP] Remove unused declaration shouldPredBlockBeOutlined (NFC) The function was introduced on Jun 12, 2016 in commit `071d0f1807`. Its definition was removed on Mar 2, 2017 in commit `1393761e0c`.	2020-11-21 23:35:02 -08:00
Kazu Hirata	9d985082ad	[MachineLICM] Remove unused declaration HoistRegion The function definition was removed on Dec 22, 2011 in commit in `1eed5b51e8`.	2020-11-21 22:55:37 -08:00
Kazu Hirata	c2309ff3d5	[SelectionDAG] Remove unused declaration ExpandStrictFPOp (NFC) ExpandStrictFPOp started taking two parameters instead of one on Jan 10, 2020 in commit `f678fc7660`, but the declaration for the single-perameter version has remained since.	2020-11-21 22:29:44 -08:00
Ella Ma	1756d67934	[llvm][clang][mlir] Add checks for the return values from Target::createXXX to prevent protential null deref All these potential null pointer dereferences are reported by my static analyzer for null smart pointer dereferences, which has a different implementation from `alpha.cplusplus.SmartPtr`. The checked pointers in this patch are initialized by Target::createXXX functions. When the creator function pointer is not correctly set, a null pointer will be returned, or the creator function may originally return a null pointer. Some of them may not make sense as they may be checked before entering the function, but I fixed them all in this patch. I submit this fix because 1) similar checks are found in some other places in the LLVM codebase for the same return value of the function; and, 2) some of the pointers are dereferenced before they are checked, which may definitely trigger a null pointer dereference if the return value is nullptr. Reviewed By: tejohnson, MaskRay, jpienaar Differential Revision: https://reviews.llvm.org/D91410	2020-11-21 21:04:12 -08:00
Hongtao Yu	d0e42037bf	[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR, ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`. ``` bb.0.bb0: frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags PSEUDO_PROBE 837061429793323041, 1, 0 $edi = MOV32ri 1, debug-location !13; test.c:0 JCC_1 %bb.1, 4, implicit $eflags bb.2.bb2: PSEUDO_PROBE 837061429793323041, 3, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ bb.1.bb1: PSEUDO_PROBE 837061429793323041, 2, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ ``` The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86495	2020-11-20 10:52:43 -08:00
Hongtao Yu	f3c445697d	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Craig Topper	a7eae62a42	[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version. The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs. And allow a single result that is not result 0 of the node. Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results. Differential Revision: https://reviews.llvm.org/D91846	2020-11-20 10:06:53 -08:00
Andrew Wei	1cd19fc568	[DeadMachineInstrctionElim] Post order visit all blocks and Iteratively run DeadMachineInstructionElim pass until nothing dead Patched by: guopeilin Reviewed By: hliao,rampitec Differential Revision: https://reviews.llvm.org/D91513	2020-11-21 00:43:23 +08:00
Pavel Iliin	4d7df43ffd	[AArch64] Out-of-line atomics (-moutline-atomics) implementation. This patch implements out of line atomics for LSE deployment mechanism. Details how it works can be found in llvm/docs/Atomics.rst Options -moutline-atomics and -mno-outline-atomics to enable and disable it were added to clang driver. This is clang and llvm part of out-of-line atomics interface, library part is already supported by libgcc. Compiler-rt support is provided in separate patch. Differential Revision: https://reviews.llvm.org/D91157	2020-11-20 13:30:12 +00:00
Kazu Hirata	2583d8eb08	[CodeGen] Use llvm::is_contained (NFC)	2020-11-19 22:07:56 -08:00
Nikita Popov	393b9e9db3	[MemLoc] Require LocationSize argument (NFC) When constructing a MemoryLocation by hand, require that a LocationSize is explicitly specified. D91649 will split up LocationSize::unknown() into two different states, and callers should make an explicit choice regarding the kind of MemoryLocation they want to have.	2020-11-19 21:45:52 +01:00
Leonard Chan	a97f62837f	[llvm][IR] Add dso_local_equivalent Constant The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target. When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit. In this patch: - Addition of `dso_local_equivalent` and handling it - Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done. See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details. Differential Revision: https://reviews.llvm.org/D77248	2020-11-19 10:26:17 -08:00
Adhemerval Zanella	807320119f	[AArch64] Lower fptrunc/fpext from/to FP128t to/from FP16 The compiler-rt part which adds the emitted symbols is handled in a subsequent patch. Differential Revision: https://reviews.llvm.org/D91731	2020-11-19 15:14:50 -03:00
Florian Hahn	1983acce7c	[SelDAGBuilder] Do not require simple VTs for constraints. In some cases, the values passed to `asm sideeffect` calls cannot be mapped directly to simple MVTs. Currently, we crash in the backend if that happens. An example can be found in the @test_vector_too_large_r_m test case, where we pass <9 x float> vectors. In practice, this can happen in cases like the simple C example below. using vec = float __attribute__((ext_vector_type(9))); void f1 (vec m) { asm volatile("" : "+r,m"(m) : : "memory"); } One case that use "+r,m" constraints for arbitrary data types in practice is google-benchmark's DoNotOptimize. This patch updates visitInlineAsm so that it use MVT::Other for constraints with complex VTs. It looks like the rest of the backend correctly deals with that and properly legalizes the type. And we still report an error if there are no registers to satisfy the constraint. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D91710	2020-11-19 09:31:54 +00:00
Andrew Paverd	0139c8af8d	[CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-11-17 18:24:45 -08:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Jon Roelofs	a461e76b6f	[MachineScheduler] Inform pass infra of post-ra scheduler's dependencies Differential Revision: https://reviews.llvm.org/D91561	2020-11-17 10:56:12 -08:00
Florian Hahn	a9adb62a64	[AsmPrinter] Use getMnemonic for instruction-mix remark. This patch uses the new `getMnemonic` helper from D90039 to display mnemonics instead of the internal opcodes. The main motivation behind using the mnemonics is that they are more user-friendly and more directly related to the assembly the users will be presented. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D90040	2020-11-17 12:12:47 +00:00
Jameson Nash	bf6ed355c8	Reland "[AsmPrinter] fix -disable-debug-info option" This reverts commit `105ed27ed8`, and removes the offending line from the tests.	2020-11-16 13:34:47 -05:00
Mirko Brkusanin	4cf6dd518e	[AMDGPU][GlobalISel] Fix lowerShlSat RegBankSelect would crash on G_SELECT when type is not s1. Differential Revision: https://reviews.llvm.org/D91437	2020-11-16 17:43:31 +01:00
Victor Huang	6bb2ceac90	Fix the compilation assertion due to unreachable BB pruning not deleting the associated BB from the jump tables This patch is added to remove the unreachable MBBs reference in the jump table. Differential Revisien: https://reviews.llvm.org/D90498 Reviewed by: amyk, bsaleil	2020-11-16 10:35:31 -06:00
Yuanfang Chen	a223354161	[CGProfile] allows bitcast in metadata node storing function pointers For example, during RAUW in IRMover, the `Function` ValueAsMetadata in "CG Profile" could become bitcast. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D88433	2020-11-13 09:28:21 -08:00
Jessica Paquette	b184a2eccf	[GlobalISel] Add matchers for specific constants and a matcher for negations It's fairly common to need matchers for a specific constant value, or for common idioms like finding a negated register. Add - `m_SpecificICst`, which returns true when matching a specific value.. - `m_ZeroInt`, which returns true when an integer 0 is matched. - `m_Neg`, which returns when a register is negated. Also update a few places which use idioms related to the new matchers. Differential Revision: https://reviews.llvm.org/D91397	2020-11-13 09:24:54 -08:00
Matt Arsenault	c67e1a985f	GlobalISel: Directly expose getDefSrcRegIgnoringCopies utility It's useful to get both the instruction and register at the same time.	2020-11-13 11:07:04 -05:00
Djordje Todorovic	22fd38d508	[NFC][IntrRefLDV] Remove dead code from transferSpillOrRestoreInst() Differential Revision: https://reviews.llvm.org/D90852	2020-11-13 07:53:54 -08:00
Hans Wennborg	105ed27ed8	Revert "[AsmPrinter] fix -disable-debug-info option" The test fails on Mac, see comment on the code review. > This option was in a rather convoluted place, causing global parameters > to be set in awkward and undesirable ways to try to account for it > indirectly. Add tests for the -disable-debug-info option and ensure we > don't print unintended markers from unintended places. > > Reviewed By: dstenb > > Differential Revision: https://reviews.llvm.org/D91083 This reverts commit `9606ef03f0`.	2020-11-13 13:46:13 +01:00
Kerry McLaughlin	306c8ab208	[SVE][CodeGen] Improve codegen of scalable masked scatters If the scatter store is able to perform the sign/zero extend of its index, this is folded into the instruction with refineIndexType(). Additionally, refineUniformBase() will return the base pointer and index from an add + splat_vector. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90942	2020-11-13 11:19:36 +00:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These function store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Jameson Nash	9606ef03f0	[AsmPrinter] fix -disable-debug-info option This option was in a rather convoluted place, causing global parameters to be set in awkward and undesirable ways to try to account for it indirectly. Add tests for the -disable-debug-info option and ensure we don't print unintended markers from unintended places. Reviewed By: dstenb Differential Revision: https://reviews.llvm.org/D91083	2020-11-13 00:58:09 -05:00
Simon Pilgrim	8996742741	[KnownBits] Add KnownBits::makeConstant helper. NFCI. Helper for cases where we need to create a KnownBits from a (fully known) constant value.	2020-11-12 16:16:04 +00:00
David Sherwood	3225fcf11e	[SVE] Deal with SVE tuple call arguments correctly when running out of registers When passing SVE types as arguments to function calls we can run out of hardware SVE registers. This is normally fine, since we switch to an indirect mode where we pass a pointer to a SVE stack object in a GPR. However, if we switch over part-way through processing a SVE tuple then part of it will be in registers and the other part will be on the stack. I've fixed this by ensuring that: 1. When we don't have enough registers to allocate the whole block we mark any remaining SVE registers temporarily as allocated. 2. We temporarily remove the InConsecutiveRegs flags from the last tuple part argument and reinvoke the autogenerated calling convention handler. Doing this prevents the code from entering an infinite recursion and, in combination with 1), ensures we switch over to the Indirect mode. 3. After allocating a GPR register for the pointer to the tuple we then deallocate any SVE registers we marked as allocated in 1). We also set the InConsecutiveRegs flags back how they were before. 4. I've changed the AArch64ISelLowering LowerCALL and LowerFormalArguments functions to detect the start of a tuple, which involves allocating a single stack object and doing the correct numbers of legal loads and stores. Differential Revision: https://reviews.llvm.org/D90219	2020-11-12 08:41:50 +00:00
Pavel Iliin	fdb979cfbb	[NFC] [Legalize] Fix spaces and style.	2020-11-11 19:24:15 +00:00
Hans Wennborg	418f18c6cd	Revert "Reland [CFGuard] Add address-taken IAT tables and delay-load support" This broke both Firefox and Chromium (PR47905) due to what seems like dllimport function not being handled correctly. > This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. > Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. > > Reviewed By: rnk > > Differential Revision: https://reviews.llvm.org/D87544 This reverts commit `cfd8481da1`.	2020-11-11 16:03:33 +01:00
Simon Pilgrim	1a62ca65c1	[KnownBits] Add KnownBits::commonBits helper. NFCI. We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them........	2020-11-11 12:15:54 +00:00
Kerry McLaughlin	170947a5de	[SVE][CodeGen] Lower scalable masked scatters Lowers the llvm.masked.scatter intrinsics (scalar plus vector addressing mode only) Changes included in this patch: - Custom lowering for MSCATTER, which chooses the appropriate scatter store opcode to use. Floating-point scatters are cast to integer, with patterns added to match FP reinterpret_casts. - Added the getCanonicalIndexType function to convert redundant addressing modes (e.g. scaling is redundant when accessing bytes) - Tests with 32 & 64-bit scaled & unscaled offsets Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90941	2020-11-11 11:50:22 +00:00
Kerry McLaughlin	ffbbfc76ca	[SVE][CodeGen] Add the isTruncatingStore flag to MSCATTER This patch adds the IsTruncatingStore flag to MaskedScatterSDNode, set by getMaskedScatter(). Updated SelectionDAGDumper::print_details for MaskedScatterSDNode to print the details of masked scatters (is truncating, signed or scaled). This is the first in a series of patches which adds support for scalable masked scatters Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90939	2020-11-11 10:58:24 +00:00
Chen Zheng	09e34048bf	[SelectionDAG] fminnum should be a binary operator Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91163	2020-11-11 03:41:40 -05:00
Xun Li	b8a8ef3276	[SafeStack] Make sure SafeStack does not break musttail call contract SafeStack instrumentation should not insert anything inbetween musttail call and return instruction. For every ReturnInst that needs to be instrumented, we adjust the insertion point to the musttail call if exists. Differential Revision: https://reviews.llvm.org/D90702	2020-11-10 20:46:05 -08:00
Gaurav Jain	a0b7129767	[NFC] Use [MC]Register in TwoAddressInstructionPass Differential Revision: https://reviews.llvm.org/D90902	2020-11-10 19:01:56 -08:00
David Green	b2ac9681a7	[ARM] Alter t2DoLoopStart to define lr This changes the definition of t2DoLoopStart from t2DoLoopStart rGPR to GPRlr = t2DoLoopStart rGPR This will hopefully mean that low overhead loops are more tied together, and we can more reliably generate loops without reverting or being at the whims of the register allocator. This is a fairly simple change in itself, but leads to a number of other required alterations. - The hardware loop pass, if UsePhi is set, now generates loops of the form: %start = llvm.start.loop.iterations(%N) loop: %p = phi [%start], [%dec] %dec = llvm.loop.decrement.reg(%p, 1) %c = icmp ne %dec, 0 br %c, loop, exit - For this a new llvm.start.loop.iterations intrinsic was added, identical to llvm.set.loop.iterations but produces a value as seen above, gluing the loop together more through def-use chains. - This new instrinsic conceptually produces the same output as input, which is taught to SCEV so that the checks in MVETailPredication are not affected. - Some minor changes are needed to the ARMLowOverheadLoop pass, but it has been left mostly as before. We should now more reliably be able to tell that the t2DoLoopStart is correct without having to prove it, but t2WhileLoopStart and tail-predicated loops will remain the same. - And all the tests have been updated. There are a lot of them! This patch on it's own might cause more trouble that it helps, with more tail-predicated loops being reverted, but some additional patches can hopefully improve upon that to get to something that is better overall. Differential Revision: https://reviews.llvm.org/D89881	2020-11-10 15:57:58 +00:00
Mirko Brkusanin	a75d6178b8	[GlobalISel] Add combine for (x \| mask) -> x when (x \| mask) == x If we have a mask, and a value x, where (x \| mask) == x, we can drop the OR and just use x. Differential Revision: https://reviews.llvm.org/D90952	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	fb36ab0a42	[GlobalISel] Expand combine for (x & mask) -> x when (x & mask) == x We can use KnownBitsAnalysis to cover cases when mask is not trivial. It can also help with cases when mask is not constant but can still be folded into one. Since 'and' is comutative we should treat both operands as possible replacements. Differential Revision: https://reviews.llvm.org/D90674	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	53ae95c946	[AMDGPU][GlobalISel] Combine shift + logic + shift with constant operands This sequence of instructions can be simplified if they are single use and some operands are constants. Additional combines may be applied afterwards. Differential Revision: https://reviews.llvm.org/D90223	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	de719586a8	[AMDGPU][GlobalISel] Fold a chain of two shift instructions with constant operands Sequence of same shift instructions with constant operands can be combined into a single shift instruction. Differential Revision: https://reviews.llvm.org/D90217	2020-11-10 11:32:12 +01:00
David Zarzycki	a41ea782c8	[SelectionDAG] Enable CTPOP optimization fine tuning Add a TLI hook to allow SelectionDAG to fine tune the conversion of CTPOP to a chain of "x & (x - 1)" when CTPOP isn't legal. A subsequent patch will attempt to fine tune the X86 code gen. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89952	2020-11-09 13:49:01 -05:00
Paul Robinson	920befb337	[FastISel] Reduce spills around mem-intrinsic calls FastISel generates instructions to materialize "local values" at the top of a block, in the hope that these values could be reused within the block. To reduce spills and restores, FastISel treats calls as sub-block boundaries, flushing the "local value map" at each call. This patch treats the mem* intrinsics as if they were calls, because at O0 generally they are calls. Eliminating these spills/restores is actually better for debugging (especially a "continue at this line" command), code size, stack frame size, and maybe even performance. Differential Revision: https://reviews.llvm.org/D90877	2020-11-09 09:45:14 -08:00
jasonliu	42d2109380	[XCOFF] Enable explicit sections on AIX Implement mechanism to allow explicit sections to be generated on AIX. Reviewed By: DiggerLin Differential Revision: https://reviews.llvm.org/D88615	2020-11-09 16:27:38 +00:00
David Zarzycki	57e46e7123	[SelectionDAG] NFC: Hoist is legal check This was requested during the code review of D89952.	2020-11-09 10:55:15 -05:00
Francesco Petrogalli	fc2fe6817e	[llvm][AArch64] Simplify (and (sign_extend..) #bitmask). Fold VT = (and (sign_extend NarrowVT to VT) #bitmask) into VT = (zero_extend NarrowVT) With this combine, the test replaces a sign extended load + an unsigned extention with a zero extended load to render one of the operands of the last multiplication. BEFORE \| AFTER f_i16_i32: \| f_i16_i32: .fnstart \| .fnstart ldrsh r0, [r0] \| ldrh r1, [r1] ldrsh r1, [r1] \| ldrsh r0, [r0] smulbb r0, r1, r0 \| smulbb r0, r0, r1 uxth r1, r1 \| mul r0, r0, r1 mul r0, r0, r1 \| bx lr bx lr \| Reviewed By: resistor Differential Revision: https://reviews.llvm.org/D90605	2020-11-09 12:53:36 +00:00
Fangrui Song	ee4769687d	AsmPrinter/Dwarf*: Use llvm::Register instead of unsigned	2020-11-06 21:00:28 -08:00
Fangrui Song	7684496035	[AsmPrinter] Rename ByteStreamer::EmitInt8 to emitInt8 to be consistent with other emit*	2020-11-06 20:02:56 -08:00
Quentin Colombet	a585228027	Prevent LICM and machineLICM from hoisting convergent operations Results of convergent operations are implicitly affected by the enclosing control flows and should not be hoisted out of arbitrary loops. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D90361	2020-11-06 10:26:39 -08:00
Than McIntosh	b3d0f02861	[NFC] Fix typo in comment. Differential Revision: https://reviews.llvm.org/D90846	2020-11-06 09:03:07 -05:00
Momchil Velikov	5b30d9adc0	[MachineOutliner] Do not outline debug instructions The debug location is removed from any outlined instruction. This causes the MachineVerifier to crash on outlined DBG_VALUE instructions. Then, debug instructions are "invisible" to the outliner, that is, two ranges of instructions from different functions are considered identical if the only difference is debug instructions. Since a debug instruction from one function is unlikely to provide sensible debug information about all functions, sharing an outlined sequence, this patch just removes debug instructions from the outlined functions. Differential Revision: https://reviews.llvm.org/D89485	2020-11-05 19:26:51 +00:00
Craig Topper	98d7e583db	[LegalizeTypes] Remove unnecessary if around switch in ScalarizeVectorOperand and SplitVectorOperand. NFC The if was checking !Res.getNode() but that's always true since Res was initialized to SDValue() and not touched before the if. This appears to be a leftover from a previous implementation of Custom legalization where Res was updated instead of returning immediately.	2020-11-05 11:00:51 -08:00
Simon Pilgrim	bf04e34383	[DAG] computeKnownBits - Replace ISD::SREM handling with KnownBits::srem to reduce code duplication	2020-11-05 17:12:58 +00:00
Simon Pilgrim	7fe7c6d3be	[GlobalISel] Don't use Register type for getNumOperands(). NFCI. Copy+Paste typo - we were storing getNumOperands() opcounts in a Register type instead of just an unsigned.	2020-11-05 17:12:58 +00:00
Simon Pilgrim	e237d56b43	[KnownBits] Move ValueTracking/SelectionDAG UREM KnownBits handling to KnownBits::urem. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 14:30:59 +00:00
Simon Pilgrim	32bee18b84	[KnownBits] Move ValueTracking/SelectionDAG UDIV KnownBits handling to KnownBits::udiv. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 13:42:42 +00:00
Simon Pilgrim	546d002d7a	[GlobalISel] ComputeKnownBits - use common KnownBits shift handling (PR44526) Convert GISelKnownBits.computeKnownBitsImpl shift handling to use the common KnownBits implementations, which makes use of the known leading/trailing bits for shifted values in cases where we don't know the shift amount value, as detailed in https://blog.regehr.org/archives/1709 Differential Revision: https://reviews.llvm.org/D90527	2020-11-05 11:52:26 +00:00
Sander de Smalen	d57bba7cf8	[SVE] Return StackOffset for TargetFrameLowering::getFrameIndexReference. To accommodate frame layouts that have both fixed and scalable objects on the stack, describing a stack location or offset using a pointer + uint64_t is not sufficient. For this reason, we've introduced the StackOffset class, which models both the fixed- and scalable sized offsets. The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset, so that this can be used in other interfaces, such as to eliminate frame indices in PEI or to emit Debug locations for variables on the stack. This patch is purely mechanical and doesn't change the behaviour of how the result of this function is used for fixed-sized offsets. The patch adds various checks to assert that the offset has no scalable component, as frame offsets with a scalable component are not yet supported in various places. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90018	2020-11-05 11:02:18 +00:00
Simon Pilgrim	b25765792b	Revert rGbbeb08497ce58 "Revert "[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation"" Updated the GISel KnownBits tests as KnownBits::computeForMul allows more accurate computation.	2020-11-05 10:39:53 +00:00
Chen Zheng	f645cea8f6	[MachineSink] add more profitable pattern. Add more profitable sinking patterns if the target bb register pressure is not too high. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D88126	2020-11-04 23:11:22 -05:00
Cameron McInally	c126eb7529	[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247. Differential Revision: https://reviews.llvm.org/D90644	2020-11-04 14:20:31 -06:00
Fraser Cormack	f99580c1e5	[DAGCombine] Fix bug in load scalarization Summary: For vector element types which are not byte-sized, we would generate incorrect scalar offsets and produce incorrect codegen. This optimization could potentially be supported in the future, e.g. by loading in bytes, then shifting and masking out the remaining bits of the vector element. However, without an upstream target to test against it's best to avoid the bad codegen in the simplest possible way. Related to this bug: https://bugs.llvm.org/show_bug.cgi?id=27600 Reviewed by: foad Differential Revision: https://reviews.llvm.org/D78568	2020-11-04 19:02:40 +00:00
Fangrui Song	bbeb08497c	Revert "[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation" This reverts commit `0b8711e1af` which broke GlobalISelTests AArch64GISelMITest.TestKnownBits	2020-11-04 09:54:04 -08:00
Simon Pilgrim	0b8711e1af	[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation Avoid code duplication	2020-11-04 17:25:24 +00:00
Simon Pilgrim	93c2a9ae07	Use isa<> instead of dyn_cast<> to avoid unused variable warning. NFCI.	2020-11-04 15:26:32 +00:00
Kerry McLaughlin	f2412d372d	[SVE][CodeGen] Lower scalable integer vector reductions This patch uses the existing LowerFixedLengthReductionToSVE function to also lower scalable vector reductions. A separate function has been added to lower VECREDUCE_AND & VECREDUCE_OR operations with predicate types using ptest. Lowering scalable floating-point reductions will be addressed in a follow up patch, for now these will hit the assertion added to expandVecReduce() in TargetLowering. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D89382	2020-11-04 11:38:49 +00:00
Simon Pilgrim	825e517e34	[DAG] computeKnownBits - Replace ISD::MUL handling with the common KnownBits::computeForMul implementation	2020-11-04 11:32:08 +00:00
Fangrui Song	0314dff051	[DebugInfo] Delete unused DwarfUnit::addConstantFPValue & addConstantValue overloads. NFC This functions appear to be unused for many years.	2020-11-04 00:05:57 -08:00
Xiang1 Zhang	7ba3293691	[StackColoring] Conservatively merge catch point of V for catch(V) Reviewed By: thanm, clin1 Differential Revision: https://reviews.llvm.org/D86673	2020-11-04 10:49:56 +08:00
Michael Liao	4b11201592	[MachineInstr] Add support for instructions with multiple memory operands. - Basically iterate each pair of memory operands from both instructions and return true if any of them may alias. - The exception are memory instructions without any memory operand. They may touch everything and could alias to any memory instruction. Differential Revision: https://reviews.llvm.org/D89447	2020-11-03 20:44:40 -05:00
Gaurav Jain	492b1d78d5	[NFC] Use [MC]Register in register allocation Differential Revision: https://reviews.llvm.org/D90725	2020-11-03 17:34:26 -08:00
Simon Pilgrim	e9b88c754a	[DAG] computeKnownBits - Move ISD::SRA handling into KnownBits::ashr As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.	2020-11-03 18:09:33 +00:00
Simon Pilgrim	cb798f040a	[DAG] computeKnownBits - Move (most) ISD::SRL handling into KnownBits::lshr As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 17:30:36 +00:00
Jameson Nash	a0ad066ce4	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed, this just cleans up the code so that it can be exposed publically. Replaces https://reviews.llvm.org/D74158 Differential Revision: https://reviews.llvm.org/D89613	2020-11-03 10:02:09 -05:00
Simon Pilgrim	cab21d4fa8	[DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 14:22:28 +00:00
Hans Wennborg	cbf25fbed5	Revert "[CodeGen] [WinException] Only produce handler data at the end of the function if needed" This caused an explosion in ICF times during linking on Windows when libfuzzer instrumentation is enabled. For a small binary we see ICF time go from ~0 to ~10 s. For a large binary it goes from ~1 s to forevert (I gave up after 30 minutes). See comment on the code review. > If we are going to write handler data (that is written as variable > length data following after the unwind info in .xdata), we need to > emit the handler data immediately, but for cases where no such > info is going to be written, skip emitting it right away. (Unwind > info for all remaining functions that hasn't gotten it emitted > directly is emitted at the end.) > > This does slightly change the ordering of sections (triggering a > bunch of updates to DebugInfo/COFF tests), but the change should be > benign. > > This also matches GCC's assembly output, which doesn't output > .seh_handlerdata unless it actually is needed. > > For ARM64, the unwind info can be packed into the runtime function > entry itself (leaving no data in the .xdata section at all), but > that can only be done if there's no follow-on data in the .xdata > section. If emission of the unwind info is triggered via > EmitWinEHHandlerData (or the .seh_handlerdata directive), which > implicitly switches to the .xdata section, there's a chance of the > caller wanting to pass further data there, so the packed format > can't be used in that case. > > Differential Revision: https://reviews.llvm.org/D87448 This reverts commit `36c64af9d7`.	2020-11-03 13:12:10 +01:00
Jessica Clarke	f77c8a48ae	[CodeGen] Fix regression from D83655 Arm EHABI has a null LSDASection as it does its own thing, so we should continue to return null in that case rather than try and cast it.	2020-11-03 03:57:46 +00:00
Gaurav Jain	b68994bd2d	[NFC] Use [MC]Register in Live-ness tracking Differential Revision: https://reviews.llvm.org/D90611	2020-11-02 15:46:13 -08:00
Fangrui Song	ee5d1a0449	[AsmPrinter] Split up .gcc_except_table MC currently produces monolithic .gcc_except_table section. GCC can split up .gcc_except_table: * if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat` * otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits` This ensures that (a) non-prevailing copies are discarded and (b) .gcc_except_table associated to discarded text sections can be discarded by a .gcc_except_table-aware linker (GNU ld, but not gold or LLD) This patches matches the GCC behavior. If -fno-unique-section-names is specified, we don't append the suffix. If -ffunction-sections is additionally specified, use `.section ...,unique`. Note, if clang driver communicates that the linker is LLD and we know it is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table costs, at least in the -fno-unique-section-names case. We cannot use it on GNU ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER components in an output section https://sourceware.org/bugzilla/show_bug.cgi?id=26256 For RISC-V -mrelax, this patch additionally fixes an assembler-linker interaction problem: because a section is shrinkable, the length of a call-site code range is not a constant. Relocations referencing the associated text section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a discarded section group member from outside the group is disallowed by the ELF specification (PR46675): ``` // a.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int main() { return comdat(); } // b.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int foo() { return comdat(); } clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section: ``` -fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations. Reviewed By: jrtc27, rahmanl Differential Revision: https://reviews.llvm.org/D83655	2020-11-02 14:36:25 -08:00
Mircea Trofin	61e8a44655	[NFC][regalloc] Use MCRegister appropriately Differential Revision: https://reviews.llvm.org/D90506	2020-11-02 11:48:49 -08:00
Alex Richardson	5bc438efcf	[AtomicExpand] Avoid creating an unnamed libcall I recently modified this pass to better support CHERI-RISC-V and while doing so I noticed that this pass was calling M->getOrInsertFunction() with the result of TLI->getLibcallName(RTLibType). However, AMDGPU fills the libcalls array with nullptr, so this creates an anonymous function instead. This patch changes expandAtomicOpToLibcall to return false in case the libcall does not exist and changes the assert() in the callees to a report_fatal_error() instead. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88800	2020-11-02 17:52:37 +00:00
Matt Arsenault	b3639e9ae9	RegisterCoalescer: Use Register	2020-11-02 10:14:50 -05:00
Chen Zheng	24a31922ce	[MachineSink] sink more profitable loads Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D86864	2020-11-01 21:13:27 -05:00
QingShan Zhang	1d178d600a	[Scheduling] Fall back to the fast cluster algorithm if the DAG is too complex We have added a new load/store cluster algorithm in D85517. However, AArch64 see some compiling deg with the new algorithm as the IsReachable() is not cheap if the DAG is complex. O(M+N) See https://bugs.llvm.org/show_bug.cgi?id=47966 So, this patch added a heuristic to switch to old cluster algorithm if the DAG is too complex. Reviewed By: Owen Anderson Differential Revision: https://reviews.llvm.org/D90144	2020-11-02 02:11:52 +00:00
Qiu Chaofan	1f852ba853	[PowerPC] Avoid unnecessary fadd for unsigned to ppcf128 Unsigned 32-bit or shorter integer to ppcf128 conversion are currently expanded as signed-to-double with an extra fadd to 'complement'. But on PowerPC we have native instruction to directly convert unsigned to double since ISA v2.06. This patch exploits it. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D89786	2020-11-01 23:22:47 +08:00
Cameron McInally	dda1e74b58	[Legalize] Add legalizations for VECREDUCE_SEQ_FADD Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass. Differential Revision: https://reviews.llvm.org/D90247	2020-10-30 16:02:55 -05:00
Amy Huang	7156910d85	[CodeView] Encode signed int values correctly when emitting S_CONSTANTs Differential Revision: https://reviews.llvm.org/D90199	2020-10-30 09:28:41 -07:00
Nikita Popov	6c2ad4cf87	[SDAG] Extract helper to determine neutral element (NFC) Make the existing VECREDUCE based code more generic, but expressing it in terms of the neutral value of the base opcode instead.	2020-10-29 22:05:06 +01:00
Nikita Popov	a5f172927d	[SDAG] Fix neutral value for vecreduce_fadd The neutral value for FADD is -0.0, not 0.0, so this is what we need to pad vectors with.	2020-10-29 21:27:59 +01:00
Nikita Popov	91bf172088	[SDAG] Extract helper to get vecreduce base opcode (NFC)	2020-10-29 20:22:22 +01:00
dfukalov	b3cdaef518	[MIR] Fix out of bounds access in MIRPrinter. Fixes: SWDEV-256460 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90239	2020-10-29 14:35:06 +03:00
Alok Kumar Sharma	930a8c60b6	[DebugInfo] [NFCI] Adding a missed out line in support for DW_TAG_generic_subrange. This commit adds a missed out line in earlier commit for DW_TAG_generic_subrange. Previous commit ID: `a6dd01afa3` Differential Revision: https://reviews.llvm.org/D89218 Thanks markus for pointing this out.	2020-10-29 16:18:20 +05:30
David Green	a4b6b1e1c8	[InterleaveAccess] Recognise Interleave loads through binary operations Instcombine will currently sink identical shuffles though vector binary operations. This is probably generally useful, but can break up the code pattern we use to represent an interleaving load group. This patch reverses that in the InterleaveAccessPass to re-recognise the pattern of shuffles sunk past binary operations and folds them back if an interleave group can be created. Differential Revision: https://reviews.llvm.org/D89489	2020-10-29 09:13:23 +00:00
Vedant Kumar	ffba94a9ac	Revert "[DebugInfo] Fix legacy ZExt emission when FromBits >= 64 (PR47927)" This reverts commit `9905346221`. It breaks the compiler-rt build, see https://reviews.llvm.org/D89838	2020-10-28 18:57:17 -07:00
Vedant Kumar	4fe81b6b6a	Revert "[DebugInfo] Shorten legacy [s\|z]ext dwarf expressions" This reverts commit `2ce36ebca5`. It depends on https://reviews.llvm.org/D89838, which needs to be reverted.	2020-10-28 18:57:17 -07:00
Derek Schuff	77973f8dee	[WebAssembly] Add support for DWARF type units Since Wasm comdat sections work similarly to ELF, we can use that mechanism to eliminate duplicate dwarf type information in the same way. Differential Revision: https://reviews.llvm.org/D88603	2020-10-28 17:41:22 -07:00
Amy Huang	7669f3c0f6	Recommit "[CodeView] Emit static data members as S_CONSTANTs." We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. This changes CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072 This reverts commit `504615353f`.	2020-10-28 16:35:59 -07:00
Gaurav Jain	f719fd7ade	[NFC] Use [MC]Register in CSE & LICM Differential Revision: https://reviews.llvm.org/D90327	2020-10-28 15:53:26 -07:00
Alok Kumar Sharma	a6dd01afa3	[DebugInfo] Support for DW_TAG_generic_subrange This is needed to support fortran assumed rank arrays which have runtime rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF TAG DW_TAG_generic_subrange is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89218	2020-10-29 01:34:15 +05:30
Aditya Nandakumar	bed8394047	[GISel]: Few InsertVecElt combines https://reviews.llvm.org/D88060 This adds the following combines 1) build_vector formation from insert_vec_elts 2) insert_vec_elts (build_vector) -> build_vector	2020-10-28 12:27:07 -07:00
Mircea Trofin	f0a98ad820	[NFC] Use Register in RegisterPressure APIs Some related changes as well. Differential Revision: https://reviews.llvm.org/D90268	2020-10-28 12:14:08 -07:00
Vedant Kumar	2ce36ebca5	[DebugInfo] Shorten legacy [s\|z]ext dwarf expressions Take advantage of the emitConstu helper to emit slightly shorter dwarf expressions to implement legacy [s\|z]ext operations.	2020-10-28 12:06:02 -07:00
Vedant Kumar	9905346221	[DebugInfo] Fix legacy ZExt emission when FromBits >= 64 (PR47927) Fix an out-of-bounds shift in emitLegacyZExt by using a slightly more complicated dwarf expression to create the zext mask. This addresses a UBSan diagnostic seen when compiling compiler-rt (llvm.org/PR47927). rdar://70307714 Differential Revision: https://reviews.llvm.org/D89838	2020-10-28 12:06:02 -07:00
Matt Arsenault	b9c21d43bb	RegAlloc: Clear isSSA The MIR parser may infer SSA, so -run-pass=regallocgreedy would hit a verifier error after multiple vreg defs are added.	2020-10-28 12:02:16 -04:00
Djordje Todorovic	6384378582	[NFC][IntrRefLDV] Improve the Value printing Basically, this just improves the dump of the Value stored within a location. If the defining instruction number is zero, it means it is "live-in". Before the patch: ESI --> bb 0 inst 0 loc ESI After: ESI --> Value{bb: 0, inst: live-in, loc: ESI} This is an NFC. Differential Revision: https://reviews.llvm.org/D90309	2020-10-28 07:39:08 -07:00
Simon Pilgrim	f53d7f55f1	[DAG] Move canFoldInAddressingMode before foldBinOpIntoSelect. NFC. Reduces the diff in D90113.	2020-10-28 12:16:05 +00:00
Luqman Aden	4c0a016927	Rename EHPersonality::MSVC_Win64SEH to EHPersonality::MSVC_TableSEH. NFC. The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining vs table-based exception handling. x86-32 is the only arch for which Windows uses the former. 32-bit ARM would use what is called Win64SEH today, which is a bit confusing so instead let's just rename it to be a bit more clear. Reviewed By: compnerd, rnk Differential Revision: https://reviews.llvm.org/D90117	2020-10-27 23:22:13 -07:00
Derek Schuff	44eea0b1a7	Revert "[WebAssembly] Add support for DWARF type units" This reverts commit `bcb8a119df`.	2020-10-27 17:57:32 -07:00
Derek Schuff	bcb8a119df	[WebAssembly] Add support for DWARF type units Since Wasm comdat sections work similarly to ELF, we can use that mechanism to eliminate duplicate dwarf type information in the same way. Differential Revision: https://reviews.llvm.org/D88603	2020-10-27 17:13:41 -07:00
Nicolai Hähnle	e025d09b21	Revert multiple patches based on "Introduce CfgTraits abstraction" These logically belong together since it's a base commit plus followup fixes to less common build configurations. The patches are: Revert "CfgInterface: rename interface() to getInterface()" This reverts commit `a74fc48158`. Revert "Wrap CfgTraitsFor in namespace llvm to please GCC 5" This reverts commit `f2a06875b6`. Revert "Try to make GCC5 happy about the CfgTraits thing" This reverts commit `03a5f7ce12`. Revert "Introduce CfgTraits abstraction" This reverts commit `c0cdd22c72`.	2020-10-27 20:33:30 +01:00
Amy Huang	504615353f	Revert "[CodeView] Emit static data members as S_CONSTANTs." Seems like there's an assert in here that we shouldn't be running into. This reverts commit `515973222e`.	2020-10-27 11:29:58 -07:00
Vedant Kumar	5a3ef55a52	[Utils] Skip RemoveRedundantDbgInstrs in MergeBlockIntoPredecessor (PR47746) This patch changes MergeBlockIntoPredecessor to skip the call to RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to some compile-time issues spotted in LoopUnroll and SimplifyCFG. The call to RemoveRedundantDbgInstrs appears to have changed the worst-case behavior of the merging utility. Loosely speaking, it seems to have gone from O(#phis) to O(#insts). It might not be possible to mitigate this by scanning a block to determine whether there are any debug intrinsics to remove, since such a scan costs O(#insts). So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly little fallout from this, and most of it can be addressed by doing RemoveRedundantDbgInstrs later. The exception is (the block-local version of) SimplifyCFG, where it might just be too expensive to call RemoveRedundantDbgInstrs. Differential Revision: https://reviews.llvm.org/D88928	2020-10-27 10:12:59 -07:00
Djordje Todorovic	cca049ad2b	[NFC][IntrRefLDV] Some code clean up As reading the source code, I've found some minor nits: -Use using instead of typedef -Fix a comment -Refactor Differential Revision: https://reviews.llvm.org/D90155	2020-10-27 05:31:24 -07:00
Sven van Haastregt	5d03080092	[TargetLowering] Add i1 condition for bit comparison fold For i1 types, boolean false is represented identically regardless of the boolean content, so we can allow optimizations that otherwise would not be correct for booleans with false represented as a negative one. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D90145	2020-10-27 12:22:20 +00:00
Med Ismail Bennani	a3aea0193d	[llvm/DebugInfo] Simplify DW_OP_implicit_value condition (NFC) Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2020-10-27 11:25:19 +01:00
Gaurav Jain	17cdba61d4	[NFC] Use [MC]Register in RegAllocPBQP & RegisterCoalescer Differential Revision: https://reviews.llvm.org/D90008	2020-10-26 17:13:32 -07:00
Rahman Lavaee	0b2f4cdf2b	Explicitly check for entry basic block, rather than relying on MachineBasicBlock::pred_empty. Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D89423	2020-10-26 16:15:56 -07:00
Amy Huang	515973222e	[CodeView] Emit static data members as S_CONSTANTs. We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. I changed CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072	2020-10-26 15:30:35 -07:00
Peter Waller	5b742a0c10	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in sve-redundant-store.ll that the warning is not triggered. Differential Revision: https://reviews.llvm.org/D89701	2020-10-26 16:37:48 +00:00
Peter Waller	6536d6040f	Revert "[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination" This reverts commit `4604441386`. Reverting because it was not the intended version of the patch, which follows this patch.	2020-10-26 16:37:00 +00:00
Peter Waller	4604441386	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in Redundantstores.ll that the warning is not triggered.	2020-10-26 16:23:42 +00:00
Djordje Todorovic	a64b2c9366	[NFC][InstrRefLDV] Fix a typo	2020-10-26 04:04:16 -07:00
Florian Hahn	b2bec7cece	[AsmPrinter] Add per BB instruction mix remark. This patch adds a remarks that provides counts for each opcode per basic block. An snippet of the generated information can be seen below. The current implementation uses the target specific opcode for the counts. For example, on AArch64 this means we currently get 2 entries for `add` instructions if the block contains 32 and 64 bit adds. Similarly, immediate version are treated differently. Unfortunately there seems to be no convenient way to get only the mnemonic part of the instruction as a string AFAIK. This could be improved in the future. ``` --- !Analysis Pass: asm-printer Name: InstructionMix DebugLoc: { File: arm64-instruction-mix-remarks.ll, Line: 30, Column: 30 } Function: foo Args: - String: 'BasicBlock: ' - BasicBlock: else - String: "\n" - String: INST_MADDWrrr - String: ': ' - INST_MADDWrrr: '2' - String: "\n" - String: INST_MOVZWi - String: ': ' - INST_MOVZWi: '1' ``` Reviewed By: anemet, thegameg, paquette Differential Revision: https://reviews.llvm.org/D89892	2020-10-26 09:25:45 +00:00
David Green	61bc18de0b	[Schedule] Add a MultiHazardRecognizer This adds a MultiHazardRecognizer and starts to make use of it in the ARM backend. The idea of the class is to allow multiple independent hazard recognizers to be added to a single base MultiHazardRecognizer, allowing them to all work in parallel without requiring them to be chained into subclasses. They can then be added or not based on cpu or subtarget features, which will become useful in the ARM backend once more hazard recognizers are being used for various things. This also renames ARMHazardRecognizer to ARMHazardRecognizerFPMLx in the process, to more clearly explain what that recognizer is designed for. Differential Revision: https://reviews.llvm.org/D72939	2020-10-26 08:06:17 +00:00
Simon Pilgrim	ce356e1546	[DAG] Add BuildVectorSDNode::getRepeatedSequence helper to recognise multi-element splat patterns Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method. I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns. Differential Revision: https://reviews.llvm.org/D87930	2020-10-24 12:23:09 +01:00
Simon Pilgrim	62b17a7697	[LegalizeTypes] Legalize vector rotate operations Lower vector rotate operations as long as the legalization occurs outside of LegalizeVectorOps. This fixes https://bugs.llvm.org/show_bug.cgi?id=47320 Patch By: @rsanthir.quic (Ryan Santhirarajan) Differential Revision: https://reviews.llvm.org/D89497	2020-10-24 11:30:32 +01:00
Med Ismail Bennani	64c4dac60e	[llvm/DebugInfo] Emit DW_OP_implicit_value when tuning for LLDB This patch enables emitting DWARF `DW_OP_implicit_value` opcode when tuning debug information for LLDB (`-debugger-tune=lldb`). This will also propagate to Darwin platforms, since they use LLDB tuning as a default. rdar://67406059 Differential Revision: https://reviews.llvm.org/D90001 Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2020-10-24 06:45:33 +02:00
Mehdi Amini	8f492f6467	Remove unused verifyRegStateMapping() function in RegAllocFast (NFC) This fixes compiler warning when building with assertions.	2020-10-24 00:36:51 +00:00
Cameron McInally	a1cc274cb3	[SVE] Lower fixed length VECREDUCE_SEQ_FADD operation Differential Revision: https://reviews.llvm.org/D89162	2020-10-23 16:24:02 -05:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Mircea Trofin	819044ad2d	[NFC] Use [MC]Register in RegAllocGreedy This was initiated from the uses of MCRegUnitIterator, so while likely not exhaustive, it's a step forward. Differential Revision: https://reviews.llvm.org/D89975	2020-10-23 11:30:53 -07:00
Paulo Matos	69e2797eae	[WebAssembly] Implementation of (most) table instructions Implementation of instructions table.get, table.set, table.grow, table.size, table.fill, table.copy. Missing instructions are table.init and elem.drop as they deal with element sections which are not yet implemented. Added more tests to tables.s Differential Revision: https://reviews.llvm.org/D89797	2020-10-23 08:42:54 -07:00
Jeremy Morse	b1b2c6ab66	[DebugInstrRef] Handle DBG_INSTR_REFs use-before-defs in LiveDebugValues Deciding where to place debugging instructions when normal instructions sink between blocks is difficult -- see PR44117. Dealing with this with instruction-referencing variable locations is simple: we just tolerate DBG_INSTR_REFs referring to values that haven't been computed yet. This patch adds support into InstrRefBasedLDV to record when a variable value appears in the middle of a block, and should have a DBG_VALUE added when it appears (a debug use before def). While described simply, this relies heavily on the value-propagation algorithm in InstrRefBasedLDV. The implementation doesn't attempt to verify the location of a value unless something non-trivial occurs to merge variable values in vlocJoin. This means that a variable with a value that has no location can retain it across all control flow (including loops). It's only when another debug instruction specifies a different variable value that we have to check, and find there's no location. This property means that if a machine value is defined in a block dominated by a DBG_INSTR_REF that refers to it, all the successor blocks can automatically find a location for that value (if it's not clobbered). Thus in a sense, InstrRefBasedLDV is already supporting and implementing use-before-defs. This patch allows us to specify a variable location in the block where it's defined. When loading live-in variable locations, TransferTracker currently discards those where it can't find a location for the variable value. However, we can tell from the machine value number whether the value is defined in this block. If it is, add it to a set of use-before-def records. Then, once the relevant instruction has been processed, emit a DBG_VALUE immediately after it. Differential Revision: https://reviews.llvm.org/D85775	2020-10-23 16:33:23 +01:00
Denis Antrushin	4f7ee55971	Revert "[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty." Downstream testing revealed some problems with this patch. Reverting while investigating. This reverts commit `2b96dcebfa`.	2020-10-23 21:55:06 +07:00
Jeremy Morse	68f4715716	[DebugInstrRef] Convert DBG_INSTR_REFs into variable locations Handle DBG_INSTR_REF instructions in LiveDebugValues, to determine and propagate variable locations. The logic is fairly straight forwards: Collect a map of debug-instruction-number to the machine value numbers generated in the first walk through the function. When building the variable value transfer function and we see a DBG_INSTR_REF, look up the instruction it refers to, and pick the machine value number it generates, That's it; the rest of LiveDebugValues continues as normal. Awkwardly, there are two kinds of instruction numbering happening here: the offset into the block (which is how machine value numbers are determined), and the numbers that we label instructions with when generating DBG_INSTR_REFs. I've also restructured the TransferTracker redefVar code a little, to separate some DBG_VALUE specific operations into its own method. The changes around redefVar should be largely NFC, while allowing DBG_INSTR_REFs to specify a value number rather than just a location. Differential Revision: https://reviews.llvm.org/D85771	2020-10-23 14:50:02 +01:00
Jeremy Morse	ab93e71065	[DebugInstrRef] NFC: Separate collection of machine/variable values This patch adjusts _when_ something happens in LiveDebugValues / InstrRefBasedLDV, to make it more amenable to dealing with DBG_INSTR_REF instructions. There's no functional change. In the current InstrRefBasedLDV implementation, we collect the machine value-number transfer function for blocks at the same time as the variable-value transfer function. After solving machine value numbers, the variable-value transfer function is updated so that DBG_VALUEs of live-in registers have the correct value. The same would need to be done for DBG_INSTR_REFs, to connect instruction-references with machine value numbers. Rather than writing more code for that, this patch separates the two: we collect the (machine-value-number) transfer function and solve for machine value numbers, then step through the MachineInstrs again collecting the variable value transfer function. This simplifies things for the new few patches. Differential Revision: https://reviews.llvm.org/D85760	2020-10-23 11:13:20 +01:00
David Blaikie	4437df8eed	DebugInfo: Hash DIE referevences (DW_OP_convert) when computing Split DWARF signatures	2020-10-22 20:09:33 -07:00
Han Shen	e42f6c0ac0	Revert "[MBP] Add whole chain to BlockFilterSet instead of individual BB" This reverts commit `adfb541501`. This is reverted because it caused an chrome error: https://crbug.com/1140168	2020-10-22 17:31:01 -07:00
David Blaikie	a66311277a	DWARFv5: Disable DW_OP_convert for configurations that don't yet support it Testing reveals that lldb and gdb have some problems with supporting DW_OP_convert - gdb with Split DWARF tries to resolve the CU-relative DIE offset relative to the skeleton DIE. lldb tries to treat the offset as absolute, which judging by the llvm-dsymutil support for DW_OP_convert, I guess works OK in MachO? (though probably llvm-dsymutil is producing invalid DWARF by resolving the relative reference to an absolute one?). Specifically this disables DW_OP_convert usage in DWARFv5 if: * Tuning for GDB and using Split DWARF * Tuning for LLDB and not targeting MachO	2020-10-22 12:02:33 -07:00
Mircea Trofin	e24537d48f	[NFC][MC] Use MCRegister for ReachingDefAnalysis APIs Also updated the users of the APIs; and a drive-by small change to RDFRegister.cpp Differential Revision: https://reviews.llvm.org/D89912	2020-10-22 08:47:35 -07:00
Jeremy Morse	68ac02c0dd	[DebugInstrRef] Pass DBG_INSTR_REFs through register allocation Both FastRegAlloc and LiveDebugVariables/greedy need to cope with DBG_INSTR_REFs. None of them actually need to take any action, other than passing DBG_INSTR_REFs through: variable location information doesn't refer to any registers at this stage. LiveDebugVariables stashes the instruction information in a tuple, then re-creates it later. This is only necessary as the register allocator doesn't expect to see any debug instructions while it's working. No equivalence classes or interval splitting is required at all! No changes are needed for the fast register allocator, as it just ignores debug instructions. The test added checks that both of them preserve DBG_INSTR_REFs. This also expands ScheduleDAGInstrs.cpp to treat DBG_INSTR_REFs the same as DBG_VALUEs when rescheduling instructions around. The current movement of DBG_VALUEs around is less than ideal, but it's not a regression to make DBG_INSTR_REFs subject to the same movement. Differential Revision: https://reviews.llvm.org/D85757	2020-10-22 15:51:22 +01:00
Matt Arsenault	188df17420	ScheduleDAGInstrs: Skip debug instructions at end of scheduling region If the end instruction of the scheduling region was a DBG_VALUE, the uses of the debug instruction were tracked as if they were real uses. This would then hit the deadDefHasNoUse assertion in addVRegDefDeps if the only use was the debug instruction.	2020-10-22 10:16:45 -04:00
Jeremy Morse	d73275993b	[DebugInstrRef] Substitute debug value numbers to handle optimizations This patch touches two optimizations, TwoAddressInstruction and X86's FixupLEAs pass, both of which optimize by re-creating instructions. For LEAs, various bits of arithmetic are better represented as LEAs on X86, while TwoAddressInstruction sometimes converts instrs into three address instructions if it's profitable. For debug instruction referencing, both of these require substitutions to be created -- the old instruction number must be pointed to the new instruction number, as illustrated in the added test. If this isn't done, any variable locations based on the optimized instruction are conservatively dropped. Differential Revision: https://reviews.llvm.org/D85756	2020-10-22 13:01:03 +01:00
Fangrui Song	b0c12474ed	[ShrinkWrap] Delete unneeded nullptr checks for the save point. NFC findNearestCommonDominator never returns nullptr.	2020-10-22 00:27:01 -07:00
Xiang1 Zhang	7c3fea7721	[X86] Support customizing stack protector guard Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D88631	2020-10-22 10:08:14 +08:00
Craig Topper	9e884169a2	[FPEnv][X86][SystemZ] Use different algorithms for i64->double uint_to_fp under strictfp to avoid producing -0.0 when rounding toward negative infinity Some of our conversion algorithms produce -0.0 when converting unsigned i64 to double when the rounding mode is round toward negative. This switches them to other algorithms that don't have this problem. Since it is undefined behavior to change rounding mode with the non-strict nodes, this patch only changes the behavior for strict nodes. There are still problems with unsigned i32 conversions too which I'll try to fix in another patch. Fixes part of PR47393 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87115	2020-10-21 18:12:54 -07:00
Gaurav Jain	4634ad6c0b	[NFC] Set return type of getStackPointerRegisterToSaveRestore to Register Differential Revision: https://reviews.llvm.org/D89858	2020-10-21 16:19:38 -07:00
Jeremy Morse	537f0fbe82	[DebugInfo] Follow up `c521e44def` with an API improvement As mentioned post-commit in D85749, the 'substituteDebugValuesForInst' method added in `c521e44def` would be better off with a limit on the number of operands to substitute. This handles the common case of "substitute the first operand between these two differing instructions", or possibly up to N first operands.	2020-10-21 14:45:55 +01:00
Simon Pilgrim	88523f6f4b	[DAG] getNode(ISD::EXTRACT_SUBVECTOR) Drop unnecessary N2C null check - we assert that this isn't null and have already used the pointer. NFCI. Fixes cppcheck + null dereference warning.	2020-10-21 11:53:44 +01:00
Nicholas Guy	9a2d2bedb7	Add "SkipDead" parameter to TargetInstrInfo::DefinesPredicate Some instructions may be removable through processes such as IfConversion, however DefinesPredicate can not be made aware of when this should be considered. This parameter allows DefinesPredicate to distinguish these removable instructions on a per-call basis, allowing for more fine-grained control from processes like ifConversion. Renames DefinesPredicate to ClobbersPredicate, to better reflect it's purpose Differential Revision: https://reviews.llvm.org/D88494	2020-10-21 11:52:47 +01:00
Sven van Haastregt	bfc961aeb2	[TargetLowering] Check boolean content when folding bit compare Updates an optimization that relies on boolean contents being either 0 or 1 to properly check for this before triggering. The following: (X & 8) != 0 --> (X & 8) >> 3 Produces unexpected results when a boolean 'true' value is represented by negative one. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D89390	2020-10-21 11:46:55 +01:00
David Sherwood	5b17b323a6	[SVE][CodeGen] Replace use of TypeSize comparison operator in CreateStackTemporary We were previously relying upon the TypeSize comparison operators to obtain the maximum size of two types, however use of such operators is being deprecated in favour of making the caller aware that it could be dealing with scalable vector types. I have changed the code to assert that the two types have the same scalable property and thus we can simply take the maximum of the known minimum sizes instead. Differential Revision: https://reviews.llvm.org/D88563	2020-10-21 08:31:36 +01:00
Mircea Trofin	5e731625f3	[NFC][MC] Use [MC]Register in MachineVerifier Differential Revision: https://reviews.llvm.org/D89815	2020-10-20 20:42:35 -07:00
Hubert Tong	134ffa8138	NFC: Fix -Wsign-compare warnings on 32-bit builds Comparing 32-bit `ptrdiff_t` against 32-bit `unsigned` results in `-Wsign-compare` warnings for both GCC and Clang. The warning for the cases in question appear to identify an issue where the `ptrdiff_t` value would be mutated via conversion to an unsigned type. The warning is resolved by using the usual arithmetic conversions to safely preserve the value of the `unsigned` operand while trying to convert to a signed type. Host platforms where `unsigned` has the same width as `unsigned long long` will need to make a different change, but using an explicit cast has disadvantages that can be avoided for now. Reviewed By: dantrushin Differential Revision: https://reviews.llvm.org/D89612	2020-10-20 20:52:10 -04:00
Austin Kerbow	37d907899f	[HazardRec] Allow inserting multiple wait-states simultaneously If a target can encode multiple wait-states into a noop allow emitting such instructions directly. Reviewed By: rampitec, dmgreen Differential Revision: https://reviews.llvm.org/D89753	2020-10-20 17:03:47 -07:00
Nicolai Hähnle	c0cdd22c72	Introduce CfgTraits abstraction The CfgTraits abstraction simplfies writing algorithms that are generic over the type of CFG, and enables writing such algorithms as regular non-template code that operates on opaque references to CFG blocks and values. Implementations of CfgTraits provide operations on the concrete CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock `. CfgInterface is an abstract base class which provides operations on opaque types CfgBlockRef and CfgValueRef. Those opaque types encapsulate a `void `, but the meaning depends on the concrete CFG type. For example, MachineCfgTraits -- for use with MachineIR in SSA form -- encodes a Register inside CfgValueRef. Converting between concrete references and opaque/generic ones is done by CfgTraits::{fromGeneric,toGeneric}. Convenience methods CfgTraits::{un}wrap{Iterator,Range} are available as well. Writing algorithms in terms of CfgInterface adds some overhead (virtual method calls, plus in same cases it removes the opportunity to inline iterators), but can be much more convenient since generic algorithms can be written as non-templates. This patch adds implementations of CfgTraits for all CFGs on which dominator trees are calculated, so that the dominator tree can be ported to this machinery. Only IrCfgTraits (LLVM IR) and MachineCfgTraits (Machine IR in SSA form) are complete, the other implementations are limited to the absolute minimum required to make the upcoming dominator tree changes work. v5: - fix MachineCfgTraits::blockdef_iterator and allow it to iterate over the instructions in a bundle - use MachineBasicBlock::printName v6: - implement predecessors/successors for all CfgTraits implementations - fix error in unwrapRange - rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming that is consistent with {wrap,unwrap}{Iterator,Range} - use getVRegDef instead of getUniqueVRegDef v7: - std::forward fix in wrapping_iterator - fix typos v8: - cleanup operators on CfgOpaqueType - address other review comments Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d Differential Revision: https://reviews.llvm.org/D83088	2020-10-20 13:50:52 +02:00
Luqman Aden	51892a42da	[COFF][ARM] Fix CodeView for Windows on 32bit ARM targets. Create the LLVM / CodeView register mappings for the 32-bit ARM Window targets. Reviewed By: compnerd Differential Revision: https://reviews.llvm.org/D89622	2020-10-19 22:16:16 -07:00
Qiu Chaofan	1b2fe71ecf	[DAGCombiner] Tighten reasscociation of visitFMA From LangRef, FMF contract should not enable reassociating to form arbitrary contractions. So it should not help rearrange nodes like (fma (fmul x, c1), c2, y) into (fma x, c1*c2, y). Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89527	2020-10-20 10:13:01 +08:00
Craig Topper	edd0cb11bd	[SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats This enables these transforms for vectors: (ctpop x) u< 2 -> (x & x-1) == 0 (ctpop x) u> 1 -> (x & x-1) != 0 (ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0) (ctpop x) != 1 --> (x == 0) \|\| ((x & x-1) != 0) All enabled if CTPOP isn't Legal. This differs from the scalar behavior where the first two are done unconditionally and the last two are done if CTPOP isn't Legal or Custom. The Legal check produced better results for vectors based on X86's custom handling. Might be worth re-visiting scalars here. I disabled the looking through truncate for vectors. The code that creates new setcc can use the same result VT as the original setcc even if we truncated the input. That may work work for most scalars, but definitely wouldn't work for vectors unless it was a vector of i1. Fixes or at least improves PR47825 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89346	2020-10-19 12:56:59 -07:00
Amy Kwan	6a946fd06f	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
Mircea Trofin	225065b9a8	[NFC][MC] Type [MC]Register uses in MachineTraceMetrics Differential Revision: https://reviews.llvm.org/D89710	2020-10-19 09:49:52 -07:00
David Sherwood	3945b69e81	[SVE][CodeGen] Replace more TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. This patch changes a few functions that were always expecting to work on scalar or fixed width types: 1. DAGCombiner::mergeTruncStores - deals with scalar integers only. 2. DAGCombiner::ReduceLoadWidth - not valid for vectors. 3. DAGCombiner::createBuildVecShuffle - should only be used for fixed width vectors. 4. SelectionDAGLegalize::ExpandFCOPYSIGN and SelectionDAGLegalize::getSignAsIntValue - only work on scalars. Differential Revision: https://reviews.llvm.org/D88562	2020-10-19 08:38:50 +01:00
David Sherwood	35a531fb45	[SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. I've changed some of these places to use the equivalent scalar operator. Differential Revision: https://reviews.llvm.org/D88482	2020-10-19 08:30:31 +01:00
David Sherwood	f693f915a0	[SVE][CodeGen] Replace uses of TypeSize comparison operators In certain places in the code we can never end up in a situation where we're mixing fixed width and scalable vector types. For example, we can't have truncations and extends that change the lane count. Also, in other places such as GenWidenVectorStores and GenWidenVectorLoads we know from the behaviour of FindMemType that we can never choose a vector type with a different scalable property. In various places I have used EVT::bitsXY functions instead of TypeSize::isKnownXY, where it probably makes sense to keep an assert that scalable properties match. Differential Revision: https://reviews.llvm.org/D88654	2020-10-19 08:08:41 +01:00
Alok Kumar Sharma	0538353b3b	[DebugInfo] Support for DWARF operator DW_OP_over LLVM rejects DWARF operator DW_OP_over. This DWARF operator is needed for Flang to support assumed rank array. Summary: Currently LLVM rejects DWARF operator DW_OP_over. Below error is produced when llvm finds this operator. [..] invalid expression !DIExpression(151, 20, 16, 48, 30, 35, 80, 34, 6) warning: ignoring invalid debug info in over.ll [..] There were some parts missing in support of this operator, which are now completed. Testing -added a unit testcase -check-debuginfo -check-llvm Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89208	2020-10-17 08:42:28 +05:30
Craig Topper	278bd06891	[TargetLowering] Extract simplifySetCCs ctpop into a separate function. NFCI As requested in D89346. This allows us to add some early outs. I reordered some checks a little bit to make the more common bail outs happen earlier. Like checking opcode before checking hasOneUse. And I moved the bit width check to make sure it was safe to look through a truncate to the spot where we look through truncates instead of after. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89494	2020-10-16 19:47:56 -07:00
Jameson Nash	4242df1470	Revert "make the AsmPrinterHandler array public" I messed up one of the tests.	2020-10-16 17:22:07 -04:00
Jameson Nash	ac2def2d8d	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed, this just cleans up the code so that it can be exposed publically. Differential Revision: https://reviews.llvm.org/D74158	2020-10-16 16:27:31 -04:00
Amara Emerson	6042c25b0a	[GlobalISel] Add translation support for vector reduction intrinsics. In order to prevent the ExpandReductions pass from expanding some intrinsics before they get to codegen, I had to add a -disable-expand-reductions flag for testing purposes. Differential Revision: https://reviews.llvm.org/D89028	2020-10-16 10:17:53 -07:00
Jay Foad	0c1381d795	[llc] Use -filetype=null to disable MIR printing If you use -stop-after or similar options, llc will normally print MIR. This patch checks for -filetype=null as a special case to disable MIR printing. As the comment says, "The Null output is intended for use for performance analysis ...", and I found this useful for timing a subset of the passes that llc runs without the significant overhead of printing MIR just to send it to /dev/null. Differential Revision: https://reviews.llvm.org/D89476	2020-10-16 16:51:56 +01:00
Amara Emerson	c2551c1f40	[GlobalISel] Remove scalar src from non-sequential fadd/fmul reductions. It's probably better to split these into separate G_FADD/G_FMUL + G_VECREDUCE operations in the translator rather than carrying the scalar around. The majority of the time it'll get simplified away as the scalars are probably identity values. Differential Revision: https://reviews.llvm.org/D89150	2020-10-15 15:51:44 -07:00
Denis Antrushin	8f0ddd4a1a	[Statepoints] Remove MI limit on number of tied operands. After D87915 statepoint can have more than 15 tied operands. Remove this restriction from statepoint lowering code.	2020-10-15 19:02:38 +07:00
Adrian Kuegel	ead2aa7098	Fix unused variable warning when compiling with asserts disabled. Differential Revision: https://reviews.llvm.org/D89454	2020-10-15 12:50:19 +02:00
Jeremy Morse	c521e44def	[DebugInstrRef] Support recording of instruction reference substitutions Add a table recording "substitutions" between pairs of <instruction, operand> numbers, from old pairs to new pairs. Post-isel optimizations are able to record the outcome of an optimization in this way. For example, if there were a divide instruction that generated the quotient and remainder, and it were replaced by one that only generated the quotient: $rax, $rcx = DIV-AND-REMAINDER $rdx, $rsi, debug-instr-num 1 DBG_INSTR_REF 1, 0 DBG_INSTR_REF 1, 1 Became: $rax = DIV $rdx, $rsi, debug-instr-num 2 DBG_INSTR_REF 1, 0 DBG_INSTR_REF 1, 1 We could enter a substitution from <1, 0> to <2, 0>, and no substitution for <1, 1> as it's no longer generated. This approach means that if an instruction or value is deleted once we've left SSA form, all variables that used the value implicitly become "optimized out", something that isn't true of the current DBG_VALUE approach. Differential Revision: https://reviews.llvm.org/D85749	2020-10-15 11:30:14 +01:00
Denis Antrushin	8c2b69d53a	[Statepoints] Unlimited tied operands. Current limit on amount of tied operands (15) sometimes is too low for statepoint. We may get couple dozens of gc pointer operands on statepoint. Review D87154 changed format of statepoint to list every gc pointer only once, which makes it trivial to find tiedness relation between statepoint operands: defs are mapped 1-1 to gc pointer operands passed on registers. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D87915	2020-10-15 16:16:11 +07:00
Craig Topper	50c9f1e11d	[TargetLowering] Replace Log2_32_Ceil with Log2_32 in SimplifySetCC ctpop combine. This combine can look through (trunc (ctpop X)). When doing this it tries to make sure the trunc doesn't lose any information from the ctpop. It does this by checking that the truncated type has more bits that Log2_32_Ceil of the ctpop type. The Ceil is unnecessary and pessimizes non-power of 2 types. For example, ctpop of i256 requires 9 bits to represent the max value of 256. But ctpop of i255 only requires 8 bits to represent the max result of 255. Log2_32_Ceil of 256 and 255 both return 8 while Log2_32 returns 8 for 256 and 7 for 255 The code with popcnt enabled is a regression for this test case, but it does match what already happens with i256 truncated to i9. Since power of 2 is more likely, I don't think it should block this change. Differential Revision: https://reviews.llvm.org/D89412	2020-10-15 01:05:07 -07:00
Snehasish Kumar	24bf6ff4e0	[llvm] Update default cutoff threshold for machine function splitter. Based on internal testing at Google we found that setting the profile summary cutoff threshold to 999950 yields the best results in terms of itlb and icache metrics (as observed on Intel CPUs). default = Split out code if no profile count available for block size-% = The fraction of bytes split out of .text and .text.hot itlb = Misses per kilo instructions (MPKI) for itlb icache = Misses per kilo instructions (MPKI) for L1 icache Search1 \| cutoff \| size-% \| itlb \| icache \| \|---------\|---------\|-----------\|---------\| \| default \| 42.5861 \| 0.0822151 \| 2.46363 \| \| 999999 \| 44.9350 \| 0.0767194 \| 2.44416 \| \| 999950 \| 50.0660 \| 0.075744 \| 2.4091 \| \| 999500 \| 56.9158 \| 0.082564 \| 2.4188 \| \| 995000 \| 63.8625 \| 0.0814927 \| 2.42832 \| \| 990000 \| 71.7314 \| 0.106906 \| 2.57785 \| Search2 \| cutoff \| size-% \| itlb \| icache \| \|---------\|--------\|----------\|---------\| \| default \| 2.8845 \| 0.626712 \| 4.73245 \| \| 999999 \| 3.3291 \| 0.602309 \| 4.70045 \| \| 999950 \| 3.8577 \| 0.587842 \| 4.71632 \| \| 999500 \| 4.4170 \| 0.63577 \| 4.68351 \| \| 995000 \| 5.1020 \| 0.657969 \| 4.82272 \| \| 990000 \| 5.7153 \| 0.719122 \| 5.39496 \| Differential Revision: https://reviews.llvm.org/D89085	2020-10-14 12:48:10 -07:00
Snehasish Kumar	77638a5343	[llvm] Set the default for -bbsections-cold-text-prefix to .text.split. After using this for a while, we find that it is generally useful to have it set to .text.split. by default, removing the need for an additional -mllvm option. Differential Revision: https://reviews.llvm.org/D88997	2020-10-14 12:16:36 -07:00
Guozhi Wei	adfb541501	[MBP] Add whole chain to BlockFilterSet instead of individual BB Currently we add individual BB to BlockFilterSet if its frequency satisfies LoopFreq / Freq <= LoopToColdBlockRatio LoopFreq is edge frequency from outside to loop header. LoopToColdBlockRatio is a command line parameter. It doesn't make sense since we always layout whole chain, not individual BBs. It may also cause a tricky problem. Sometimes it is possible that the LoopFreq of an inner loop is smaller than LoopFreq of outer loop. So a BB can be in BlockFilterSet of inner loop, but not in BlockFilterSet of outer loop, like .cold in the test case. So it is added to the chain of inner loop. When work on the outer loop, .cold is not added to BlockFilterSet, so the edge to successor .problem is not counted in UnscheduledPredecessors of .problem chain. But other blocks in the inner loop are added BlockFilterSet, so the whole inner loop chain can be layout, and markChainSuccessors is called to decrease UnscheduledPredecessors of following chains. markChainSuccessors calls markBlockSuccessors for every BB, even it is not in BlockFilterSet, like .cold, so .problem chain's UnscheduledPredecessors is decreased, but this edge was not counted on in fillWorkLists, so .problem chain's UnscheduledPredecessors becomes 0 when it still has an unscheduled predecessor .pred! And it causes problems in following various successor BB selection algorithms. Differential Revision: https://reviews.llvm.org/D89088	2020-10-14 11:55:10 -07:00
jasonliu	f85bcc21dd	[AIX] Turn -fdata-sections on by default in Clang Summary: This patch does the following: 1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a parameter, because some options' default value is triple dependant. 2. DataSections is turned on by default on AIX for llc. 3. Test cases change accordingly because of the default behaviour change. 4. Clang Driver passes in -fdata-sections by default on AIX. Reviewed By: MaskRay, DiggerLin Differential Revision: https://reviews.llvm.org/D88737	2020-10-14 15:58:31 +00:00
Mircea Trofin	c8fcffe775	[NFC][MC] Use MCRegister in Machine{Sink\|Pipeliner}.cpp Differential Revision: https://reviews.llvm.org/D89328	2020-10-14 08:42:17 -07:00
Michael Liao	ae40d2858e	Fix an apparent typo. `assert()` must not contain side-effects. NFC.	2020-10-14 11:33:34 -04:00
Jeremy Morse	c4e7857d4e	[DebugInstrRef] Create DBG_INSTR_REFs in SelectionDAG When given the -experimental-debug-variable-locations option (via -Xclang or to llc), have SelectionDAG generate DBG_INSTR_REF instructions instead of DBG_VALUE. For now, this only happens in a limited circumstance: when the value referred to is not a PHI and is defined in the current block. Other situations introduce interesting problems, addresed in later patches. Practically, this patch hooks into InstrEmitter and if it can find a defining instruction for a value, gives it an instruction number, and points the DBG_INSTR_REF at that <instr, operand> pair. Differential Revision: https://reviews.llvm.org/D85747	2020-10-14 14:24:08 +01:00
Jeremy Morse	2c5f3d54c5	[DebugInstrRef] Parse debug instruction-references from/to MIR This patch defines the MIR format for debug instruction references: it's an integer trailing an instruction, marked out by "debug-instr-number", much like how "debug-location" identifies the DebugLoc metadata of an instruction. The instruction number is stored directly in a MachineInstr. Actually referring to an instruction comes in a later patch, but is done using one of these instruction numbers. I've added a round-trip test and two verifier checks: that we don't label meta-instructions as generating values, and that there are no duplicates. Differential Revision: https://reviews.llvm.org/D85746	2020-10-14 10:57:09 +01:00
Aditya Nandakumar	ef3d17482f	[GISel] Add combine for constant G_PTR_ADD offsets. https://reviews.llvm.org/D88865 This adds a single combine for GlobalISel to fold: ptradd (inttoptr C1) C2 Into: C1 + C2 Additionally, a small test for AArch64 is added. Patch by pnappa.	2020-10-13 17:26:12 -07:00
Andrew Paverd	cfd8481da1	Reland [CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-10-13 13:20:52 -07:00

... 2 3 4 5 6 ...

29920 Commits