llvm-project

Author	SHA1	Message	Date
Amy Huang	7156910d85	[CodeView] Encode signed int values correctly when emitting S_CONSTANTs Differential Revision: https://reviews.llvm.org/D90199	2020-10-30 09:28:41 -07:00
Nikita Popov	6c2ad4cf87	[SDAG] Extract helper to determine neutral element (NFC) Make the existing VECREDUCE based code more generic, by expressing it in terms of the neutral value of the base opcode instead.	2020-10-29 22:05:06 +01:00
Nikita Popov	a5f172927d	[SDAG] Fix neutral value for vecreduce_fadd The neutral value for FADD is -0.0, not 0.0, so this is what we need to pad vectors with.	2020-10-29 21:27:59 +01:00
Nikita Popov	91bf172088	[SDAG] Extract helper to get vecreduce base opcode (NFC)	2020-10-29 20:22:22 +01:00
dfukalov	b3cdaef518	[MIR] Fix out of bounds access in MIRPrinter. Fixes: SWDEV-256460 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90239	2020-10-29 14:35:06 +03:00
Alok Kumar Sharma	930a8c60b6	[DebugInfo] [NFCI] Adding a missed out line in support for DW_TAG_generic_subrange. This commit adds a missed out line in earlier commit for DW_TAG_generic_subrange. Previous commit ID: `a6dd01afa3` Differential Revision: https://reviews.llvm.org/D89218 Thanks markus for pointing this out.	2020-10-29 16:18:20 +05:30
David Green	a4b6b1e1c8	[InterleaveAccess] Recognise Interleave loads through binary operations Instcombine will currently sink identical shuffles through vector binary operations. This is probably generally useful, but can break up the code pattern we use to represent an interleaving load group. This patch reverses that in the InterleaveAccessPass to re-recognise the pattern of shuffles sunk past binary operations and folds them back if an interleave group can be created. Differential Revision: https://reviews.llvm.org/D89489	2020-10-29 09:13:23 +00:00
Vedant Kumar	ffba94a9ac	Revert "[DebugInfo] Fix legacy ZExt emission when FromBits >= 64 (PR47927)" This reverts commit `9905346221`. It breaks the compiler-rt build, see https://reviews.llvm.org/D89838	2020-10-28 18:57:17 -07:00
Vedant Kumar	4fe81b6b6a	Revert "[DebugInfo] Shorten legacy [s|z]ext dwarf expressions" This reverts commit `2ce36ebca5`. It depends on https://reviews.llvm.org/D89838, which needs to be reverted.	2020-10-28 18:57:17 -07:00
Derek Schuff	77973f8dee	[WebAssembly] Add support for DWARF type units Since Wasm comdat sections work similarly to ELF, we can use that mechanism to eliminate duplicate dwarf type information in the same way. Differential Revision: https://reviews.llvm.org/D88603	2020-10-28 17:41:22 -07:00
Amy Huang	7669f3c0f6	Recommit "[CodeView] Emit static data members as S_CONSTANTs." We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. This changes CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072 This reverts commit `504615353f`.	2020-10-28 16:35:59 -07:00
Gaurav Jain	f719fd7ade	[NFC] Use [MC]Register in CSE & LICM Differential Revision: https://reviews.llvm.org/D90327	2020-10-28 15:53:26 -07:00
Alok Kumar Sharma	a6dd01afa3	[DebugInfo] Support for DW_TAG_generic_subrange This is needed to support fortran assumed rank arrays which have runtime rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF TAG DW_TAG_generic_subrange is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89218	2020-10-29 01:34:15 +05:30
Aditya Nandakumar	bed8394047	[GISel]: Few InsertVecElt combines https://reviews.llvm.org/D88060 This adds the following combines 1) build_vector formation from insert_vec_elts 2) insert_vec_elts (build_vector) -> build_vector	2020-10-28 12:27:07 -07:00
Mircea Trofin	f0a98ad820	[NFC] Use Register in RegisterPressure APIs Some related changes as well. Differential Revision: https://reviews.llvm.org/D90268	2020-10-28 12:14:08 -07:00
Vedant Kumar	2ce36ebca5	[DebugInfo] Shorten legacy [s|z]ext dwarf expressions Take advantage of the emitConstu helper to emit slightly shorter dwarf expressions to implement legacy [s|z]ext operations.	2020-10-28 12:06:02 -07:00
Vedant Kumar	9905346221	[DebugInfo] Fix legacy ZExt emission when FromBits >= 64 (PR47927) Fix an out-of-bounds shift in emitLegacyZExt by using a slightly more complicated dwarf expression to create the zext mask. This addresses a UBSan diagnostic seen when compiling compiler-rt (llvm.org/PR47927). rdar://70307714 Differential Revision: https://reviews.llvm.org/D89838	2020-10-28 12:06:02 -07:00
Matt Arsenault	b9c21d43bb	RegAlloc: Clear isSSA The MIR parser may infer SSA, so -run-pass=regallocgreedy would hit a verifier error after multiple vreg defs are added.	2020-10-28 12:02:16 -04:00
Djordje Todorovic	6384378582	[NFC][IntrRefLDV] Improve the Value printing Basically, this just improves the dump of the Value stored within a location. If the defining instruction number is zero, it means it is "live-in". Before the patch: ESI --> bb 0 inst 0 loc ESI After: ESI --> Value{bb: 0, inst: live-in, loc: ESI} This is an NFC. Differential Revision: https://reviews.llvm.org/D90309	2020-10-28 07:39:08 -07:00
Simon Pilgrim	f53d7f55f1	[DAG] Move canFoldInAddressingMode before foldBinOpIntoSelect. NFC. Reduces the diff in D90113.	2020-10-28 12:16:05 +00:00
Luqman Aden	4c0a016927	Rename EHPersonality::MSVC_Win64SEH to EHPersonality::MSVC_TableSEH. NFC. The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining vs table-based exception handling. x86-32 is the only arch for which Windows uses the former. 32-bit ARM would use what is called Win64SEH today, which is a bit confusing so instead let's just rename it to be a bit more clear. Reviewed By: compnerd, rnk Differential Revision: https://reviews.llvm.org/D90117	2020-10-27 23:22:13 -07:00
Derek Schuff	44eea0b1a7	Revert "[WebAssembly] Add support for DWARF type units" This reverts commit `bcb8a119df`.	2020-10-27 17:57:32 -07:00
Derek Schuff	bcb8a119df	[WebAssembly] Add support for DWARF type units Since Wasm comdat sections work similarly to ELF, we can use that mechanism to eliminate duplicate dwarf type information in the same way. Differential Revision: https://reviews.llvm.org/D88603	2020-10-27 17:13:41 -07:00
Nicolai Hähnle	e025d09b21	Revert multiple patches based on "Introduce CfgTraits abstraction" These logically belong together since it's a base commit plus followup fixes to less common build configurations. The patches are: Revert "CfgInterface: rename interface() to getInterface()" This reverts commit `a74fc48158`. Revert "Wrap CfgTraitsFor in namespace llvm to please GCC 5" This reverts commit `f2a06875b6`. Revert "Try to make GCC5 happy about the CfgTraits thing" This reverts commit `03a5f7ce12`. Revert "Introduce CfgTraits abstraction" This reverts commit `c0cdd22c72`.	2020-10-27 20:33:30 +01:00
Amy Huang	504615353f	Revert "[CodeView] Emit static data members as S_CONSTANTs." Seems like there's an assert in here that we shouldn't be running into. This reverts commit `515973222e`.	2020-10-27 11:29:58 -07:00
Vedant Kumar	5a3ef55a52	[Utils] Skip RemoveRedundantDbgInstrs in MergeBlockIntoPredecessor (PR47746) This patch changes MergeBlockIntoPredecessor to skip the call to RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to some compile-time issues spotted in LoopUnroll and SimplifyCFG. The call to RemoveRedundantDbgInstrs appears to have changed the worst-case behavior of the merging utility. Loosely speaking, it seems to have gone from O(#phis) to O(#insts). It might not be possible to mitigate this by scanning a block to determine whether there are any debug intrinsics to remove, since such a scan costs O(#insts). So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly little fallout from this, and most of it can be addressed by doing RemoveRedundantDbgInstrs later. The exception is (the block-local version of) SimplifyCFG, where it might just be too expensive to call RemoveRedundantDbgInstrs. Differential Revision: https://reviews.llvm.org/D88928	2020-10-27 10:12:59 -07:00
Djordje Todorovic	cca049ad2b	[NFC][IntrRefLDV] Some code clean up While reading the source code, I found some minor nits: - Use using instead of typedef - Fix a comment - Refactor Differential Revision: https://reviews.llvm.org/D90155	2020-10-27 05:31:24 -07:00
Sven van Haastregt	5d03080092	[TargetLowering] Add i1 condition for bit comparison fold For i1 types, boolean false is represented identically regardless of the boolean content, so we can allow optimizations that otherwise would not be correct for booleans with false represented as a negative one. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D90145	2020-10-27 12:22:20 +00:00
Med Ismail Bennani	a3aea0193d	[llvm/DebugInfo] Simplify DW_OP_implicit_value condition (NFC) Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2020-10-27 11:25:19 +01:00
Gaurav Jain	17cdba61d4	[NFC] Use [MC]Register in RegAllocPBQP & RegisterCoalescer Differential Revision: https://reviews.llvm.org/D90008	2020-10-26 17:13:32 -07:00
Rahman Lavaee	0b2f4cdf2b	Explicitly check for entry basic block, rather than relying on MachineBasicBlock::pred_empty. Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D89423	2020-10-26 16:15:56 -07:00
Amy Huang	515973222e	[CodeView] Emit static data members as S_CONSTANTs. We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. I changed CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072	2020-10-26 15:30:35 -07:00
Peter Waller	5b742a0c10	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in sve-redundant-store.ll that the warning is not triggered. Differential Revision: https://reviews.llvm.org/D89701	2020-10-26 16:37:48 +00:00
Peter Waller	6536d6040f	Revert "[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination" This reverts commit `4604441386`. Reverting because it was not the intended version of the patch, which follows this patch.	2020-10-26 16:37:00 +00:00
Peter Waller	4604441386	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in Redundantstores.ll that the warning is not triggered.	2020-10-26 16:23:42 +00:00
Djordje Todorovic	a64b2c9366	[NFC][InstrRefLDV] Fix a typo	2020-10-26 04:04:16 -07:00
Florian Hahn	b2bec7cece	[AsmPrinter] Add per BB instruction mix remark. This patch adds a remark that provides counts for each opcode per basic block. A snippet of the generated information can be seen below. The current implementation uses the target specific opcode for the counts. For example, on AArch64 this means we currently get 2 entries for `add` instructions if the block contains 32 and 64 bit adds. Similarly, immediate versions are treated differently. Unfortunately there seems to be no convenient way to get only the mnemonic part of the instruction as a string AFAIK. This could be improved in the future. ``` --- !Analysis Pass: asm-printer Name: InstructionMix DebugLoc: { File: arm64-instruction-mix-remarks.ll, Line: 30, Column: 30 } Function: foo Args: - String: 'BasicBlock: ' - BasicBlock: else - String: "\n" - String: INST_MADDWrrr - String: ': ' - INST_MADDWrrr: '2' - String: "\n" - String: INST_MOVZWi - String: ': ' - INST_MOVZWi: '1' ``` Reviewed By: anemet, thegameg, paquette Differential Revision: https://reviews.llvm.org/D89892	2020-10-26 09:25:45 +00:00
David Green	61bc18de0b	[Schedule] Add a MultiHazardRecognizer This adds a MultiHazardRecognizer and starts to make use of it in the ARM backend. The idea of the class is to allow multiple independent hazard recognizers to be added to a single base MultiHazardRecognizer, allowing them to all work in parallel without requiring them to be chained into subclasses. They can then be added or not based on cpu or subtarget features, which will become useful in the ARM backend once more hazard recognizers are being used for various things. This also renames ARMHazardRecognizer to ARMHazardRecognizerFPMLx in the process, to more clearly explain what that recognizer is designed for. Differential Revision: https://reviews.llvm.org/D72939	2020-10-26 08:06:17 +00:00
Simon Pilgrim	ce356e1546	[DAG] Add BuildVectorSDNode::getRepeatedSequence helper to recognise multi-element splat patterns Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method. I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns. Differential Revision: https://reviews.llvm.org/D87930	2020-10-24 12:23:09 +01:00
Simon Pilgrim	62b17a7697	[LegalizeTypes] Legalize vector rotate operations Lower vector rotate operations as long as the legalization occurs outside of LegalizeVectorOps. This fixes https://bugs.llvm.org/show_bug.cgi?id=47320 Patch By: @rsanthir.quic (Ryan Santhirarajan) Differential Revision: https://reviews.llvm.org/D89497	2020-10-24 11:30:32 +01:00
Med Ismail Bennani	64c4dac60e	[llvm/DebugInfo] Emit DW_OP_implicit_value when tuning for LLDB This patch enables emitting DWARF `DW_OP_implicit_value` opcode when tuning debug information for LLDB (`-debugger-tune=lldb`). This will also propagate to Darwin platforms, since they use LLDB tuning as a default. rdar://67406059 Differential Revision: https://reviews.llvm.org/D90001 Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2020-10-24 06:45:33 +02:00
Mehdi Amini	8f492f6467	Remove unused verifyRegStateMapping() function in RegAllocFast (NFC) This fixes compiler warning when building with assertions.	2020-10-24 00:36:51 +00:00
Cameron McInally	a1cc274cb3	[SVE] Lower fixed length VECREDUCE_SEQ_FADD operation Differential Revision: https://reviews.llvm.org/D89162	2020-10-23 16:24:02 -05:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) to want to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Mircea Trofin	819044ad2d	[NFC] Use [MC]Register in RegAllocGreedy This was initiated from the uses of MCRegUnitIterator, so while likely not exhaustive, it's a step forward. Differential Revision: https://reviews.llvm.org/D89975	2020-10-23 11:30:53 -07:00
Paulo Matos	69e2797eae	[WebAssembly] Implementation of (most) table instructions Implementation of instructions table.get, table.set, table.grow, table.size, table.fill, table.copy. Missing instructions are table.init and elem.drop as they deal with element sections which are not yet implemented. Added more tests to tables.s Differential Revision: https://reviews.llvm.org/D89797	2020-10-23 08:42:54 -07:00
Jeremy Morse	b1b2c6ab66	[DebugInstrRef] Handle DBG_INSTR_REFs use-before-defs in LiveDebugValues Deciding where to place debugging instructions when normal instructions sink between blocks is difficult -- see PR44117. Dealing with this with instruction-referencing variable locations is simple: we just tolerate DBG_INSTR_REFs referring to values that haven't been computed yet. This patch adds support into InstrRefBasedLDV to record when a variable value appears in the middle of a block, and should have a DBG_VALUE added when it appears (a debug use before def). While described simply, this relies heavily on the value-propagation algorithm in InstrRefBasedLDV. The implementation doesn't attempt to verify the location of a value unless something non-trivial occurs to merge variable values in vlocJoin. This means that a variable with a value that has no location can retain it across all control flow (including loops). It's only when another debug instruction specifies a different variable value that we have to check, and find there's no location. This property means that if a machine value is defined in a block dominated by a DBG_INSTR_REF that refers to it, all the successor blocks can automatically find a location for that value (if it's not clobbered). Thus in a sense, InstrRefBasedLDV is already supporting and implementing use-before-defs. This patch allows us to specify a variable location in the block where it's defined. When loading live-in variable locations, TransferTracker currently discards those where it can't find a location for the variable value. However, we can tell from the machine value number whether the value is defined in this block. If it is, add it to a set of use-before-def records. Then, once the relevant instruction has been processed, emit a DBG_VALUE immediately after it. Differential Revision: https://reviews.llvm.org/D85775	2020-10-23 16:33:23 +01:00
Denis Antrushin	4f7ee55971	Revert "[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty." Downstream testing revealed some problems with this patch. Reverting while investigating. This reverts commit `2b96dcebfa`.	2020-10-23 21:55:06 +07:00
Jeremy Morse	68f4715716	[DebugInstrRef] Convert DBG_INSTR_REFs into variable locations Handle DBG_INSTR_REF instructions in LiveDebugValues, to determine and propagate variable locations. The logic is fairly straightforward: Collect a map of debug-instruction-number to the machine value numbers generated in the first walk through the function. When building the variable value transfer function and we see a DBG_INSTR_REF, look up the instruction it refers to, and pick the machine value number it generates. That's it; the rest of LiveDebugValues continues as normal. Awkwardly, there are two kinds of instruction numbering happening here: the offset into the block (which is how machine value numbers are determined), and the numbers that we label instructions with when generating DBG_INSTR_REFs. I've also restructured the TransferTracker redefVar code a little, to separate some DBG_VALUE specific operations into its own method. The changes around redefVar should be largely NFC, while allowing DBG_INSTR_REFs to specify a value number rather than just a location. Differential Revision: https://reviews.llvm.org/D85771	2020-10-23 14:50:02 +01:00
Jeremy Morse	ab93e71065	[DebugInstrRef] NFC: Separate collection of machine/variable values This patch adjusts _when_ something happens in LiveDebugValues / InstrRefBasedLDV, to make it more amenable to dealing with DBG_INSTR_REF instructions. There's no functional change. In the current InstrRefBasedLDV implementation, we collect the machine value-number transfer function for blocks at the same time as the variable-value transfer function. After solving machine value numbers, the variable-value transfer function is updated so that DBG_VALUEs of live-in registers have the correct value. The same would need to be done for DBG_INSTR_REFs, to connect instruction-references with machine value numbers. Rather than writing more code for that, this patch separates the two: we collect the (machine-value-number) transfer function and solve for machine value numbers, then step through the MachineInstrs again collecting the variable value transfer function. This simplifies things for the next few patches. Differential Revision: https://reviews.llvm.org/D85760	2020-10-23 11:13:20 +01:00
David Blaikie	4437df8eed	DebugInfo: Hash DIE references (DW_OP_convert) when computing Split DWARF signatures	2020-10-22 20:09:33 -07:00
Han Shen	e42f6c0ac0	Revert "[MBP] Add whole chain to BlockFilterSet instead of individual BB" This reverts commit `adfb541501`. This is reverted because it caused a Chrome error: https://crbug.com/1140168	2020-10-22 17:31:01 -07:00
David Blaikie	a66311277a	DWARFv5: Disable DW_OP_convert for configurations that don't yet support it Testing reveals that lldb and gdb have some problems with supporting DW_OP_convert - gdb with Split DWARF tries to resolve the CU-relative DIE offset relative to the skeleton DIE. lldb tries to treat the offset as absolute, which judging by the llvm-dsymutil support for DW_OP_convert, I guess works OK in MachO? (though probably llvm-dsymutil is producing invalid DWARF by resolving the relative reference to an absolute one?). Specifically this disables DW_OP_convert usage in DWARFv5 if: * Tuning for GDB and using Split DWARF * Tuning for LLDB and not targeting MachO	2020-10-22 12:02:33 -07:00
Mircea Trofin	e24537d48f	[NFC][MC] Use MCRegister for ReachingDefAnalysis APIs Also updated the users of the APIs; and a drive-by small change to RDFRegister.cpp Differential Revision: https://reviews.llvm.org/D89912	2020-10-22 08:47:35 -07:00
Jeremy Morse	68ac02c0dd	[DebugInstrRef] Pass DBG_INSTR_REFs through register allocation Both FastRegAlloc and LiveDebugVariables/greedy need to cope with DBG_INSTR_REFs. None of them actually need to take any action, other than passing DBG_INSTR_REFs through: variable location information doesn't refer to any registers at this stage. LiveDebugVariables stashes the instruction information in a tuple, then re-creates it later. This is only necessary as the register allocator doesn't expect to see any debug instructions while it's working. No equivalence classes or interval splitting is required at all! No changes are needed for the fast register allocator, as it just ignores debug instructions. The test added checks that both of them preserve DBG_INSTR_REFs. This also expands ScheduleDAGInstrs.cpp to treat DBG_INSTR_REFs the same as DBG_VALUEs when rescheduling instructions around. The current movement of DBG_VALUEs around is less than ideal, but it's not a regression to make DBG_INSTR_REFs subject to the same movement. Differential Revision: https://reviews.llvm.org/D85757	2020-10-22 15:51:22 +01:00
Matt Arsenault	188df17420	ScheduleDAGInstrs: Skip debug instructions at end of scheduling region If the end instruction of the scheduling region was a DBG_VALUE, the uses of the debug instruction were tracked as if they were real uses. This would then hit the deadDefHasNoUse assertion in addVRegDefDeps if the only use was the debug instruction.	2020-10-22 10:16:45 -04:00
Jeremy Morse	d73275993b	[DebugInstrRef] Substitute debug value numbers to handle optimizations This patch touches two optimizations, TwoAddressInstruction and X86's FixupLEAs pass, both of which optimize by re-creating instructions. For LEAs, various bits of arithmetic are better represented as LEAs on X86, while TwoAddressInstruction sometimes converts instrs into three address instructions if it's profitable. For debug instruction referencing, both of these require substitutions to be created -- the old instruction number must be pointed to the new instruction number, as illustrated in the added test. If this isn't done, any variable locations based on the optimized instruction are conservatively dropped. Differential Revision: https://reviews.llvm.org/D85756	2020-10-22 13:01:03 +01:00
Fangrui Song	b0c12474ed	[ShrinkWrap] Delete unneeded nullptr checks for the save point. NFC findNearestCommonDominator never returns nullptr.	2020-10-22 00:27:01 -07:00
Xiang1 Zhang	7c3fea7721	[X86] Support customizing stack protector guard Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D88631	2020-10-22 10:08:14 +08:00
Craig Topper	9e884169a2	[FPEnv][X86][SystemZ] Use different algorithms for i64->double uint_to_fp under strictfp to avoid producing -0.0 when rounding toward negative infinity Some of our conversion algorithms produce -0.0 when converting unsigned i64 to double when the rounding mode is round toward negative. This switches them to other algorithms that don't have this problem. Since it is undefined behavior to change rounding mode with the non-strict nodes, this patch only changes the behavior for strict nodes. There are still problems with unsigned i32 conversions too which I'll try to fix in another patch. Fixes part of PR47393 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87115	2020-10-21 18:12:54 -07:00
Gaurav Jain	4634ad6c0b	[NFC] Set return type of getStackPointerRegisterToSaveRestore to Register Differential Revision: https://reviews.llvm.org/D89858	2020-10-21 16:19:38 -07:00
Jeremy Morse	537f0fbe82	[DebugInfo] Follow up `c521e44def` with an API improvement As mentioned post-commit in D85749, the 'substituteDebugValuesForInst' method added in `c521e44def` would be better off with a limit on the number of operands to substitute. This handles the common case of "substitute the first operand between these two differing instructions", or possibly up to N first operands.	2020-10-21 14:45:55 +01:00
Simon Pilgrim	88523f6f4b	[DAG] getNode(ISD::EXTRACT_SUBVECTOR) Drop unnecessary N2C null check - we assert that this isn't null and have already used the pointer. NFCI. Fixes cppcheck + null dereference warning.	2020-10-21 11:53:44 +01:00
Nicholas Guy	9a2d2bedb7	Add "SkipDead" parameter to TargetInstrInfo::DefinesPredicate Some instructions may be removable through processes such as IfConversion, however DefinesPredicate can not be made aware of when this should be considered. This parameter allows DefinesPredicate to distinguish these removable instructions on a per-call basis, allowing for more fine-grained control from processes like ifConversion. Renames DefinesPredicate to ClobbersPredicate, to better reflect its purpose Differential Revision: https://reviews.llvm.org/D88494	2020-10-21 11:52:47 +01:00
Sven van Haastregt	bfc961aeb2	[TargetLowering] Check boolean content when folding bit compare Updates an optimization that relies on boolean contents being either 0 or 1 to properly check for this before triggering. The following: (X & 8) != 0 --> (X & 8) >> 3 Produces unexpected results when a boolean 'true' value is represented by negative one. Patch by Erik Hogeman. Differential Revision: https://reviews.llvm.org/D89390	2020-10-21 11:46:55 +01:00
David Sherwood	5b17b323a6	[SVE][CodeGen] Replace use of TypeSize comparison operator in CreateStackTemporary We were previously relying upon the TypeSize comparison operators to obtain the maximum size of two types, however use of such operators is being deprecated in favour of making the caller aware that it could be dealing with scalable vector types. I have changed the code to assert that the two types have the same scalable property and thus we can simply take the maximum of the known minimum sizes instead. Differential Revision: https://reviews.llvm.org/D88563	2020-10-21 08:31:36 +01:00
Mircea Trofin	5e731625f3	[NFC][MC] Use [MC]Register in MachineVerifier Differential Revision: https://reviews.llvm.org/D89815	2020-10-20 20:42:35 -07:00
Hubert Tong	134ffa8138	NFC: Fix -Wsign-compare warnings on 32-bit builds Comparing 32-bit `ptrdiff_t` against 32-bit `unsigned` results in `-Wsign-compare` warnings for both GCC and Clang. The warning for the cases in question appears to identify an issue where the `ptrdiff_t` value would be mutated via conversion to an unsigned type. The warning is resolved by using the usual arithmetic conversions to safely preserve the value of the `unsigned` operand while trying to convert to a signed type. Host platforms where `unsigned` has the same width as `unsigned long long` will need to make a different change, but using an explicit cast has disadvantages that can be avoided for now. Reviewed By: dantrushin Differential Revision: https://reviews.llvm.org/D89612	2020-10-20 20:52:10 -04:00
Austin Kerbow	37d907899f	[HazardRec] Allow inserting multiple wait-states simultaneously If a target can encode multiple wait-states into a noop allow emitting such instructions directly. Reviewed By: rampitec, dmgreen Differential Revision: https://reviews.llvm.org/D89753	2020-10-20 17:03:47 -07:00
Nicolai Hähnle	c0cdd22c72	Introduce CfgTraits abstraction The CfgTraits abstraction simplifies writing algorithms that are generic over the type of CFG, and enables writing such algorithms as regular non-template code that operates on opaque references to CFG blocks and values. Implementations of CfgTraits provide operations on the concrete CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock *`. CfgInterface is an abstract base class which provides operations on opaque types CfgBlockRef and CfgValueRef. Those opaque types encapsulate a `void *`, but the meaning depends on the concrete CFG type. For example, MachineCfgTraits -- for use with MachineIR in SSA form -- encodes a Register inside CfgValueRef. Converting between concrete references and opaque/generic ones is done by CfgTraits::{fromGeneric,toGeneric}. Convenience methods CfgTraits::{un}wrap{Iterator,Range} are available as well. Writing algorithms in terms of CfgInterface adds some overhead (virtual method calls, plus in some cases it removes the opportunity to inline iterators), but can be much more convenient since generic algorithms can be written as non-templates. This patch adds implementations of CfgTraits for all CFGs on which dominator trees are calculated, so that the dominator tree can be ported to this machinery. Only IrCfgTraits (LLVM IR) and MachineCfgTraits (Machine IR in SSA form) are complete, the other implementations are limited to the absolute minimum required to make the upcoming dominator tree changes work. v5: - fix MachineCfgTraits::blockdef_iterator and allow it to iterate over the instructions in a bundle - use MachineBasicBlock::printName v6: - implement predecessors/successors for all CfgTraits implementations - fix error in unwrapRange - rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming that is consistent with {wrap,unwrap}{Iterator,Range} - use getVRegDef instead of getUniqueVRegDef v7: - std::forward fix in wrapping_iterator - fix typos v8: - cleanup operators on CfgOpaqueType - address other review comments Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d Differential Revision: https://reviews.llvm.org/D83088	2020-10-20 13:50:52 +02:00
Luqman Aden	51892a42da	[COFF][ARM] Fix CodeView for Windows on 32bit ARM targets. Create the LLVM / CodeView register mappings for the 32-bit ARM Window targets. Reviewed By: compnerd Differential Revision: https://reviews.llvm.org/D89622	2020-10-19 22:16:16 -07:00
Qiu Chaofan	1b2fe71ecf	[DAGCombiner] Tighten reassociation of visitFMA From LangRef, FMF contract should not enable reassociating to form arbitrary contractions. So it should not help rearrange nodes like (fma (fmul x, c1), c2, y) into (fma x, c1*c2, y). Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89527	2020-10-20 10:13:01 +08:00
Craig Topper	edd0cb11bd	[SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats This enables these transforms for vectors: (ctpop x) u< 2 -> (x & x-1) == 0 (ctpop x) u> 1 -> (x & x-1) != 0 (ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0) (ctpop x) != 1 --> (x == 0) \|\| ((x & x-1) != 0) All enabled if CTPOP isn't Legal. This differs from the scalar behavior where the first two are done unconditionally and the last two are done if CTPOP isn't Legal or Custom. The Legal check produced better results for vectors based on X86's custom handling. Might be worth re-visiting scalars here. I disabled the looking through truncate for vectors. The code that creates new setcc can use the same result VT as the original setcc even if we truncated the input. That may work for most scalars, but definitely wouldn't work for vectors unless it was a vector of i1. Fixes or at least improves PR47825 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89346	2020-10-19 12:56:59 -07:00
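The four CTPOP rewrites listed in the row above can be checked exhaustively at a small bit width. The following is a minimal sketch in Python, not LLVM code: `ctpop` and `check_ctpop_rewrites` are hypothetical helper names, and the wrap-around of `x - 1` is modelled by masking to the chosen width.

```python
def ctpop(x: int) -> int:
    """Population count of a non-negative integer."""
    return bin(x).count("1")


def check_ctpop_rewrites(bits: int = 8) -> bool:
    """Exhaustively compare each ctpop predicate with its bit-trick form."""
    mask = (1 << bits) - 1
    for x in range(1 << bits):
        # x - 1 wraps around at the chosen width, as fixed-width arithmetic would.
        x_and_x_minus_1 = x & ((x - 1) & mask)
        assert (ctpop(x) < 2) == (x_and_x_minus_1 == 0)
        assert (ctpop(x) > 1) == (x_and_x_minus_1 != 0)
        assert (ctpop(x) == 1) == (x != 0 and x_and_x_minus_1 == 0)
        assert (ctpop(x) != 1) == (x == 0 or x_and_x_minus_1 != 0)
    return True
```

Calling `check_ctpop_rewrites(8)` walks all 256 values of an 8-bit lane; a vector splat applies the same comparison to every lane.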
Amy Kwan	6a946fd06f	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
Mircea Trofin	225065b9a8	[NFC][MC] Type [MC]Register uses in MachineTraceMetrics Differential Revision: https://reviews.llvm.org/D89710	2020-10-19 09:49:52 -07:00
David Sherwood	3945b69e81	[SVE][CodeGen] Replace more TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. This patch changes a few functions that were always expecting to work on scalar or fixed width types: 1. DAGCombiner::mergeTruncStores - deals with scalar integers only. 2. DAGCombiner::ReduceLoadWidth - not valid for vectors. 3. DAGCombiner::createBuildVecShuffle - should only be used for fixed width vectors. 4. SelectionDAGLegalize::ExpandFCOPYSIGN and SelectionDAGLegalize::getSignAsIntValue - only work on scalars. Differential Revision: https://reviews.llvm.org/D88562	2020-10-19 08:38:50 +01:00
David Sherwood	35a531fb45	[SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. I've changed some of these places to use the equivalent scalar operator. Differential Revision: https://reviews.llvm.org/D88482	2020-10-19 08:30:31 +01:00
David Sherwood	f693f915a0	[SVE][CodeGen] Replace uses of TypeSize comparison operators In certain places in the code we can never end up in a situation where we're mixing fixed width and scalable vector types. For example, we can't have truncations and extends that change the lane count. Also, in other places such as GenWidenVectorStores and GenWidenVectorLoads we know from the behaviour of FindMemType that we can never choose a vector type with a different scalable property. In various places I have used EVT::bitsXY functions instead of TypeSize::isKnownXY, where it probably makes sense to keep an assert that scalable properties match. Differential Revision: https://reviews.llvm.org/D88654	2020-10-19 08:08:41 +01:00
Alok Kumar Sharma	0538353b3b	[DebugInfo] Support for DWARF operator DW_OP_over LLVM rejects DWARF operator DW_OP_over. This DWARF operator is needed for Flang to support assumed rank array. Summary: Currently LLVM rejects DWARF operator DW_OP_over. Below error is produced when llvm finds this operator. [..] invalid expression !DIExpression(151, 20, 16, 48, 30, 35, 80, 34, 6) warning: ignoring invalid debug info in over.ll [..] There were some parts missing in support of this operator, which are now completed. Testing -added a unit testcase -check-debuginfo -check-llvm Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89208	2020-10-17 08:42:28 +05:30
Craig Topper	278bd06891	[TargetLowering] Extract simplifySetCCs ctpop into a separate function. NFCI As requested in D89346. This allows us to add some early outs. I reordered some checks a little bit to make the more common bail outs happen earlier. Like checking opcode before checking hasOneUse. And I moved the bit width check to make sure it was safe to look through a truncate to the spot where we look through truncates instead of after. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89494	2020-10-16 19:47:56 -07:00
Jameson Nash	4242df1470	Revert "make the AsmPrinterHandler array public" I messed up one of the tests.	2020-10-16 17:22:07 -04:00
Jameson Nash	ac2def2d8d	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed, this just cleans up the code so that it can be exposed publicly. Differential Revision: https://reviews.llvm.org/D74158	2020-10-16 16:27:31 -04:00
Amara Emerson	6042c25b0a	[GlobalISel] Add translation support for vector reduction intrinsics. In order to prevent the ExpandReductions pass from expanding some intrinsics before they get to codegen, I had to add a -disable-expand-reductions flag for testing purposes. Differential Revision: https://reviews.llvm.org/D89028	2020-10-16 10:17:53 -07:00
Jay Foad	0c1381d795	[llc] Use -filetype=null to disable MIR printing If you use -stop-after or similar options, llc will normally print MIR. This patch checks for -filetype=null as a special case to disable MIR printing. As the comment says, "The Null output is intended for use for performance analysis ...", and I found this useful for timing a subset of the passes that llc runs without the significant overhead of printing MIR just to send it to /dev/null. Differential Revision: https://reviews.llvm.org/D89476	2020-10-16 16:51:56 +01:00
Amara Emerson	c2551c1f40	[GlobalISel] Remove scalar src from non-sequential fadd/fmul reductions. It's probably better to split these into separate G_FADD/G_FMUL + G_VECREDUCE operations in the translator rather than carrying the scalar around. The majority of the time it'll get simplified away as the scalars are probably identity values. Differential Revision: https://reviews.llvm.org/D89150	2020-10-15 15:51:44 -07:00
Denis Antrushin	8f0ddd4a1a	[Statepoints] Remove MI limit on number of tied operands. After D87915 statepoint can have more than 15 tied operands. Remove this restriction from statepoint lowering code.	2020-10-15 19:02:38 +07:00
Adrian Kuegel	ead2aa7098	Fix unused variable warning when compiling with asserts disabled. Differential Revision: https://reviews.llvm.org/D89454	2020-10-15 12:50:19 +02:00
Jeremy Morse	c521e44def	[DebugInstrRef] Support recording of instruction reference substitutions Add a table recording "substitutions" between pairs of <instruction, operand> numbers, from old pairs to new pairs. Post-isel optimizations are able to record the outcome of an optimization in this way. For example, if there were a divide instruction that generated the quotient and remainder, and it were replaced by one that only generated the quotient: $rax, $rcx = DIV-AND-REMAINDER $rdx, $rsi, debug-instr-num 1 DBG_INSTR_REF 1, 0 DBG_INSTR_REF 1, 1 Became: $rax = DIV $rdx, $rsi, debug-instr-num 2 DBG_INSTR_REF 1, 0 DBG_INSTR_REF 1, 1 We could enter a substitution from <1, 0> to <2, 0>, and no substitution for <1, 1> as it's no longer generated. This approach means that if an instruction or value is deleted once we've left SSA form, all variables that used the value implicitly become "optimized out", something that isn't true of the current DBG_VALUE approach. Differential Revision: https://reviews.llvm.org/D85749	2020-10-15 11:30:14 +01:00
Denis Antrushin	8c2b69d53a	[Statepoints] Unlimited tied operands. Current limit on amount of tied operands (15) sometimes is too low for statepoint. We may get couple dozens of gc pointer operands on statepoint. Review D87154 changed format of statepoint to list every gc pointer only once, which makes it trivial to find tiedness relation between statepoint operands: defs are mapped 1-1 to gc pointer operands passed on registers. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D87915	2020-10-15 16:16:11 +07:00
Craig Topper	50c9f1e11d	[TargetLowering] Replace Log2_32_Ceil with Log2_32 in SimplifySetCC ctpop combine. This combine can look through (trunc (ctpop X)). When doing this it tries to make sure the trunc doesn't lose any information from the ctpop. It does this by checking that the truncated type has more bits than Log2_32_Ceil of the ctpop type. The Ceil is unnecessary and pessimizes non-power of 2 types. For example, ctpop of i256 requires 9 bits to represent the max value of 256. But ctpop of i255 only requires 8 bits to represent the max result of 255. Log2_32_Ceil of 256 and 255 both return 8 while Log2_32 returns 8 for 256 and 7 for 255. The code with popcnt enabled is a regression for this test case, but it does match what already happens with i256 truncated to i9. Since power of 2 is more likely, I don't think it should block this change. Differential Revision: https://reviews.llvm.org/D89412	2020-10-15 01:05:07 -07:00
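The arithmetic in the row above can be reproduced with a small sketch. The Python helpers below are hypothetical stand-ins for the floor and ceiling log2 behaviour the message describes (they are not the LLVM functions themselves), and `trunc_keeps_ctpop` is an assumed name for the "truncated type has more bits than log2 of the ctpop type" check.

```python
def log2_floor(n: int) -> int:
    """Floor of log2(n) for n >= 1 (the behaviour described for Log2_32)."""
    return n.bit_length() - 1


def log2_ceil(n: int) -> int:
    """Ceiling of log2(n) for n >= 1 (the behaviour described for Log2_32_Ceil)."""
    return (n - 1).bit_length()


def trunc_keeps_ctpop(trunc_bits: int, ctpop_bits: int, use_ceil: bool = False) -> bool:
    """True if a trunc to trunc_bits passes the 'more bits than log2' check."""
    log2 = log2_ceil if use_ceil else log2_floor
    return trunc_bits > log2(ctpop_bits)
```

With the floor variant, `trunc_keeps_ctpop(8, 255)` passes because a result of at most 255 fits in 8 bits, while the ceiling variant rejects the same truncation. For i256 both variants require 9 bits.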
Snehasish Kumar	24bf6ff4e0	[llvm] Update default cutoff threshold for machine function splitter. Based on internal testing at Google we found that setting the profile summary cutoff threshold to 999950 yields the best results in terms of itlb and icache metrics (as observed on Intel CPUs). default = Split out code if no profile count available for block size-% = The fraction of bytes split out of .text and .text.hot itlb = Misses per kilo instructions (MPKI) for itlb icache = Misses per kilo instructions (MPKI) for L1 icache Search1 \| cutoff \| size-% \| itlb \| icache \| \|---------\|---------\|-----------\|---------\| \| default \| 42.5861 \| 0.0822151 \| 2.46363 \| \| 999999 \| 44.9350 \| 0.0767194 \| 2.44416 \| \| 999950 \| 50.0660 \| 0.075744 \| 2.4091 \| \| 999500 \| 56.9158 \| 0.082564 \| 2.4188 \| \| 995000 \| 63.8625 \| 0.0814927 \| 2.42832 \| \| 990000 \| 71.7314 \| 0.106906 \| 2.57785 \| Search2 \| cutoff \| size-% \| itlb \| icache \| \|---------\|--------\|----------\|---------\| \| default \| 2.8845 \| 0.626712 \| 4.73245 \| \| 999999 \| 3.3291 \| 0.602309 \| 4.70045 \| \| 999950 \| 3.8577 \| 0.587842 \| 4.71632 \| \| 999500 \| 4.4170 \| 0.63577 \| 4.68351 \| \| 995000 \| 5.1020 \| 0.657969 \| 4.82272 \| \| 990000 \| 5.7153 \| 0.719122 \| 5.39496 \| Differential Revision: https://reviews.llvm.org/D89085	2020-10-14 12:48:10 -07:00
Snehasish Kumar	77638a5343	[llvm] Set the default for -bbsections-cold-text-prefix to .text.split. After using this for a while, we find that it is generally useful to have it set to .text.split. by default, removing the need for an additional -mllvm option. Differential Revision: https://reviews.llvm.org/D88997	2020-10-14 12:16:36 -07:00
Guozhi Wei	adfb541501	[MBP] Add whole chain to BlockFilterSet instead of individual BB Currently we add individual BB to BlockFilterSet if its frequency satisfies LoopFreq / Freq <= LoopToColdBlockRatio LoopFreq is edge frequency from outside to loop header. LoopToColdBlockRatio is a command line parameter. It doesn't make sense since we always layout whole chain, not individual BBs. It may also cause a tricky problem. Sometimes it is possible that the LoopFreq of an inner loop is smaller than LoopFreq of outer loop. So a BB can be in BlockFilterSet of inner loop, but not in BlockFilterSet of outer loop, like .cold in the test case. So it is added to the chain of inner loop. When work on the outer loop, .cold is not added to BlockFilterSet, so the edge to successor .problem is not counted in UnscheduledPredecessors of .problem chain. But other blocks in the inner loop are added BlockFilterSet, so the whole inner loop chain can be layout, and markChainSuccessors is called to decrease UnscheduledPredecessors of following chains. markChainSuccessors calls markBlockSuccessors for every BB, even it is not in BlockFilterSet, like .cold, so .problem chain's UnscheduledPredecessors is decreased, but this edge was not counted on in fillWorkLists, so .problem chain's UnscheduledPredecessors becomes 0 when it still has an unscheduled predecessor .pred! And it causes problems in following various successor BB selection algorithms. Differential Revision: https://reviews.llvm.org/D89088	2020-10-14 11:55:10 -07:00
jasonliu	f85bcc21dd	[AIX] Turn -fdata-sections on by default in Clang Summary: This patch does the following: 1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a parameter, because some options' default value is triple dependant. 2. DataSections is turned on by default on AIX for llc. 3. Test cases change accordingly because of the default behaviour change. 4. Clang Driver passes in -fdata-sections by default on AIX. Reviewed By: MaskRay, DiggerLin Differential Revision: https://reviews.llvm.org/D88737	2020-10-14 15:58:31 +00:00
Mircea Trofin	c8fcffe775	[NFC][MC] Use MCRegister in Machine{Sink\|Pipeliner}.cpp Differential Revision: https://reviews.llvm.org/D89328	2020-10-14 08:42:17 -07:00
Michael Liao	ae40d2858e	Fix an apparent typo. `assert()` must not contain side-effects. NFC.	2020-10-14 11:33:34 -04:00
Jeremy Morse	c4e7857d4e	[DebugInstrRef] Create DBG_INSTR_REFs in SelectionDAG When given the -experimental-debug-variable-locations option (via -Xclang or to llc), have SelectionDAG generate DBG_INSTR_REF instructions instead of DBG_VALUE. For now, this only happens in a limited circumstance: when the value referred to is not a PHI and is defined in the current block. Other situations introduce interesting problems, addressed in later patches. Practically, this patch hooks into InstrEmitter and if it can find a defining instruction for a value, gives it an instruction number, and points the DBG_INSTR_REF at that <instr, operand> pair. Differential Revision: https://reviews.llvm.org/D85747	2020-10-14 14:24:08 +01:00
Jeremy Morse	2c5f3d54c5	[DebugInstrRef] Parse debug instruction-references from/to MIR This patch defines the MIR format for debug instruction references: it's an integer trailing an instruction, marked out by "debug-instr-number", much like how "debug-location" identifies the DebugLoc metadata of an instruction. The instruction number is stored directly in a MachineInstr. Actually referring to an instruction comes in a later patch, but is done using one of these instruction numbers. I've added a round-trip test and two verifier checks: that we don't label meta-instructions as generating values, and that there are no duplicates. Differential Revision: https://reviews.llvm.org/D85746	2020-10-14 10:57:09 +01:00
Aditya Nandakumar	ef3d17482f	[GISel] Add combine for constant G_PTR_ADD offsets. https://reviews.llvm.org/D88865 This adds a single combine for GlobalISel to fold: ptradd (inttoptr C1) C2 Into: C1 + C2 Additionally, a small test for AArch64 is added. Patch by pnappa.	2020-10-13 17:26:12 -07:00
Andrew Paverd	cfd8481da1	Reland [CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-10-13 13:20:52 -07:00
Mircea Trofin	08097fc6a9	[NFC][Regalloc] Use MCRegister in MachineCopyPropagation Differential Revision: https://reviews.llvm.org/D89250	2020-10-13 09:05:08 -07:00
Mirko Brkusanin	52ba4fa6aa	[GlobalISel] Avoid making G_PTR_ADD with nullptr When the first operand is a null pointer we can avoid making a G_PTR_ADD and make a G_INTTOPTR with the offset operand. This helps us avoid making add with 0 later on for targets such as AMDGPU. Differential Revision: https://reviews.llvm.org/D87140	2020-10-13 13:02:55 +02:00
Craig Topper	1687a8d83b	[X86][SelectionDAG] Add SADDO_CARRY and SSUBO_CARRY to support multipart signed add/sub overflow legalization. This passes existing X86 test but I'm not sure if it handles all type legalization cases it needs to. Alternative to D89200 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D89222	2020-10-12 23:18:29 -07:00
Konstantin Schwarz	7341123439	[GlobalISel][KnownBits] Early return on out of bound shift amounts If the known shift amount is bigger than or equal to the bitwidth of the type of the value to be shifted, the result is target dependent, so don't try to infer any bits. This fixes a crash we've seen in one of our internal test suites. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D89232	2020-10-12 18:39:19 +02:00
Mircea Trofin	43d347995c	[NFC][MC] Use MCRegister in LiveRangeMatrix The change starts from LiveRangeMatrix and also checks the users of the APIs are typed accordingly. Differential Revision: https://reviews.llvm.org/D89145	2020-10-12 08:54:36 -07:00
Mircea Trofin	596a9f6b89	[NFC][Regalloc] Pass VirtRegMap by reference. It's never null - the reason it's modeled as a pointer is because the pass can't init it in its ctor. Passing by ref simplifies the code, too, as the null checks were unnecessary complexity. Differential Revision: https://reviews.llvm.org/D89171	2020-10-12 08:32:30 -07:00
Simon Pilgrim	c252200e4d	[DAG][ARM][MIPS][RISCV] Improve funnel shift promotion to use 'double shift' patterns Based on a discussion on D88783, if we're promoting a funnel shift to a width at least twice the size as the original type, then we can use the 'double shift' patterns (shifting the concatenated sources). Differential Revision: https://reviews.llvm.org/D89139	2020-10-12 14:11:02 +01:00
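The 'double shift' pattern mentioned in the row above (shifting the concatenated sources in a type at least twice as wide) can be illustrated as follows. This is a minimal Python sketch under the usual funnel-shift semantics (shift amount taken modulo the bit width); all function names are hypothetical and nothing here is taken from the patch itself.

```python
def fshl_reference(x: int, y: int, z: int, w: int) -> int:
    """Funnel shift left of the w-bit pair (x, y) by z, defined directly."""
    s = z % w
    if s == 0:
        return x
    return ((x << s) | (y >> (w - s))) & ((1 << w) - 1)


def fshr_reference(x: int, y: int, z: int, w: int) -> int:
    """Funnel shift right of the w-bit pair (x, y) by z, defined directly."""
    s = z % w
    if s == 0:
        return y
    return ((x << (w - s)) | (y >> s)) & ((1 << w) - 1)


def fshl_double_shift(x: int, y: int, z: int, w: int) -> int:
    """Same result via one shift of the 2w-bit concatenation x:y."""
    wide_mask = (1 << (2 * w)) - 1
    concat = (x << w) | y
    # Shift left in the wide type, then take the high half.
    return ((concat << (z % w)) & wide_mask) >> w


def fshr_double_shift(x: int, y: int, z: int, w: int) -> int:
    """Same result via one shift of the 2w-bit concatenation x:y."""
    concat = (x << w) | y
    # Shift right in the wide type, then take the low half.
    return (concat >> (z % w)) & ((1 << w) - 1)
```

The wide form needs no special case for a zero shift amount, which is part of what makes it attractive once the operands have been promoted anyway.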
David Sherwood	c5ba0d33cc	[SVE] Make ElementCount and TypeSize use a new PolySize class I have introduced a new template PolySize class, where the template parameter determines the type of quantity, i.e. for an element count this is just an unsigned value. The ElementCount class is now just a simple derivation of PolySize<unsigned>, whereas TypeSize is more complicated because it still needs to contain the uint64_t cast operator, since there are still many places in the code that rely upon this implicit cast. As such the class also still needs some of its own operators. I've tried to minimise the amount of code in the base PolySize class, which led to a couple of changes: 1. In some places we were relying on '==' operator comparisons between ElementCounts and the scalar value 1. I didn't put this operator in the new PolySize class, and thought it was actually clearer to use the isScalar() function instead. 2. I removed the isByteSized function and replaced it with calls to isKnownMultipleOf(8). I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so that it's more consistent with coefficientDivideBy. Differential Revision: https://reviews.llvm.org/D88409	2020-10-12 08:23:38 +01:00
Fangrui Song	cddb49bcc0	[SchedDAGInstrs] Delete redundant contains(). NFC	2020-10-11 20:58:30 -07:00
Krzysztof Parzyszek	61eaa2e14a	[SDAG] Remember to set UndefElts in isSplatValue for SPLAT_VECTOR	2020-10-10 19:42:24 -05:00
David Green	cb27006a94	[ARM] Attempt to make Tail predication / RDA more resilient to empty blocks There are a number of places in RDA where we assume the block will not be empty. This isn't necessarily true for tail predicated loops where we have removed instructions. This attempts to make the pass more resilient to empty blocks, not casting pointers to machine instructions where they would be invalid. The test contains a case that was previously failing, but has recently been hidden on trunk. It contains an empty block to begin with to show a similar error. Differential Revision: https://reviews.llvm.org/D88926	2020-10-10 14:50:25 +01:00
Alok Kumar Sharma	96bd4d34a2	[DebugInfo] Support for DWARF attribute DW_AT_rank This patch adds support for DWARF attribute DW_AT_rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF attribute DW_AT_rank is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89141	2020-10-10 17:51:12 +05:30
Denis Antrushin	2b96dcebfa	[Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty. Currently we allow passing pointers from deopt bundle on VReg only if they were seen in list of gc-live pointers passed on VRegs. This means that for the case of empty gc-live bundle we spill deopt bundle's pointers. This change allows lowering deopt pointers to VRegs in case of empty gc-live bundle. In case of non-empty gc-live bundle, behavior does not change. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D88999	2020-10-10 14:58:08 +07:00
Mircea Trofin	c11c20fb00	[NFC][Regalloc] VirtRegAuxInfo::Hint does not need to be a field It is only used in weightCalcHelper, and cleared upon its finishing its job there. The patch further cleans up style guide discrepancies, and simplifies CopyHint by removing duplicate 'IsPhys' information (it's what the Reg field would report).	2020-10-09 13:42:23 -07:00
Mircea Trofin	62e2ac6461	[NFC][Regalloc] Fix coding style in CalcSpillWeights	2020-10-09 12:22:12 -07:00
Scott Linder	4a98cf7867	[NFC] Reformat MILexer.cpp:getIdentifierKind Reformat to avoid unrelated changes in diff of future patch. Committed as obvious.	2020-10-09 15:21:24 +00:00
Esme-Yi	e9fd8823ba	[DAGCombiner] Add decomposition patterns for Mul-by-Imm. Summary: This patch is derived from D87384. In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns: ``` mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M)) mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M)) ``` The conversion will be triggered if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`. Moreover, the conversion benefits from an ILP improvement since the instructions are independent. A sequence like the following also benefits, since a shift instruction is saved. ``` res1 = a * 0x8800; res2 = a * 0x8080; ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88201	2020-10-09 08:51:40 +00:00
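The two decomposition patterns in the row above can be sketched outside the compiler. The Python below is an illustrative model only: `decompose_mul_const` and `mul_via_shifts` are hypothetical names, and the sketch ignores the target hook and profitability checks the message mentions. Single powers of two are deliberately left alone, on the assumption that a plain shift already covers them.

```python
from typing import Optional, Tuple


def decompose_mul_const(c: int) -> Optional[Tuple[str, int, int]]:
    """Return (op, n, m) with c == 2**n + 2**m or c == 2**n - 2**m, else None."""
    if c <= 0 or bin(c).count("1") == 1:
        return None  # non-positive, or a single power of two
    low = c & -c  # lowest set bit, i.e. 2**m
    m = low.bit_length() - 1
    if bin(c).count("1") == 2:
        return ("add", c.bit_length() - 1, m)
    total = c + low  # equals 2**n when c is a contiguous run of ones
    if total & (total - 1) == 0:
        return ("sub", total.bit_length() - 1, m)
    return None


def mul_via_shifts(x: int, c: int) -> int:
    """Multiply x by c using the shift/add or shift/sub form when available."""
    parts = decompose_mul_const(c)
    if parts is None:
        return x * c
    op, n, m = parts
    return (x << n) + (x << m) if op == "add" else (x << n) - (x << m)
```

For the two constants in the message, `0x8800` decomposes as `2^15 + 2^11` and `0x8080` as `2^15 + 2^7`, so both products can reuse the same `x << 15`, which is where the saved shift comes from.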
Fangrui Song	2c4c2dc2d9	[MCRegister] Simplify isStackSlot & isPhysicalRegister and delete isPhysical. NFC	2020-10-08 22:08:33 -07:00
Fangrui Song	c3de9a9e69	Fix incorrect Register -> MCRegister conversion getReg returns a Register which may represent a virtual register.	2020-10-08 21:40:48 -07:00
Kazushi (Jam) Marukawa	1d1c1f8ff2	[VE] Add new MVT types for NEC SX Aurora VE vector This patch adds entries for: v64i64 v128i64 v256i64 v64f64 v128f64 v256f64 Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D88776	2020-10-09 12:07:41 +09:00
Kai Luo	8a5858c8fd	[TwoAddressInstruction][PowerPC] Call `regOverlapsSet` to find out real clobbers and uses In `rescheduleKillAboveMI`, current implementation uses `SmallSet` to track reg's defs and uses. When comparing, use `SmallSet.count` to find out if it's clobbered or used. It's not correct if involving subregisters. This patch uses `regOverlapsSet` already used by `rescheduleMIBelowKill` to fix the issue. Fixed https://bugs.llvm.org/show_bug.cgi?id=47707. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D88716	2020-10-09 02:34:54 +00:00
Mircea Trofin	4cfc4025cc	[NFC][MC] MCRegister API typing. Mostly LiveIntervals, with their effects (users). Differential Revision: https://reviews.llvm.org/D89018	2020-10-08 15:08:34 -07:00
Quentin Colombet	fd8275e04a	[GlobalISel] Add missing pass dependencies for IRTranslator The IRTranslator depends on the branch probability info pass when the optimization level is different than None and it depends all the time on the StackProtector pass. We have to explicitly call out pass dependencies otherwise the pass manager may not be able to schedule the IRTranslator. Before this patch, we were lucky because previous passes depend on the branch probability info pass (like the Global Variable Optimization) and the stack protector pass is initialized in initializeCodeGen. However, if the target has a custom pipeline without any passes like Global Variable Optimization, the pipeline creation will fail, at least because of the branch probability info pass dependency (it is unlikely that initializeCodeGen is not called). This patch adds the missing dependencies to the IRTranslator. Differential Revision: https://reviews.llvm.org/D89063	2020-10-08 13:57:21 -07:00
Rahman Lavaee	2b0c5d76a6	Introduce and use a new section type for the bb_addr_map section. This patch lets the bb_addr_map (renamed to __llvm_bb_addr_map) section use a special section type (SHT_LLVM_BB_ADDR_MAP) instead of SHT_PROGBITS. This would help parsers, dumpers and other tools to use the sh_type ELF field to identify this section rather than relying on string comparison on the section name. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D88199	2020-10-08 11:13:19 -07:00
Amara Emerson	283b4d6ba3	[GlobalISel] Add G_VECREDUCE_* opcodes for vector reductions. These mirror the IR and SelectionDAG intrinsics & nodes. Opcodes added: G_VECREDUCE_SEQ_FADD G_VECREDUCE_SEQ_FMUL G_VECREDUCE_FADD G_VECREDUCE_FMUL G_VECREDUCE_FMAX G_VECREDUCE_FMIN G_VECREDUCE_ADD G_VECREDUCE_MUL G_VECREDUCE_AND G_VECREDUCE_OR G_VECREDUCE_XOR G_VECREDUCE_SMAX G_VECREDUCE_SMIN G_VECREDUCE_UMAX G_VECREDUCE_UMIN Differential Revision: https://reviews.llvm.org/D88750	2020-10-08 10:33:19 -07:00
diggerlin	92bca12843	[AIX] add new option -mignore-xcoff-visibility SUMMARY: In the IBM compiler xlclang, there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html. We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option LLVM does not emit any visibility attribute to the ASM or XCOFF object file. The option only works on the AIX OS; on any non-AIX OS, using the option reports an unsupported option error. In AIX OS: 1.1 the option -mignore-xcoff-visibility is enabled by default if neither -fvisibility=* nor -mignore-xcoff-visibility is given explicitly in the clang command. 1.2 if -fvisibility=* is given explicitly but -mignore-xcoff-visibility is not, visibility attributes are generated. 1.3 if both -fvisibility=* and -mignore-xcoff-visibility are given explicitly in the clang command, the option "-mignore-xcoff-visibility" wins and no visibility attribute is emitted. The option -mignore-xcoff-visibility has no effect on visibility attributes when compiling with the -emit-llvm option to generate LLVM IR. Reviewer: daltenty,Jason Liu Differential Revision: https://reviews.llvm.org/D87451	2020-10-08 09:34:58 -04:00
Geoffrey Martin-Noble	93db4a8ce6	Remove unused variables These are unused since https://reviews.llvm.org/rG35cb45c533fb76dcfc9f44b4e8bbd5d8a855ed2a causing `-Wunused` warnings. Differential Revision: https://reviews.llvm.org/D89022	2020-10-07 18:30:12 -07:00
Anna Thomas	35cb45c533	[ImplicitNullChecks] Support complex addressing mode The pass is updated to handle loads through complex addressing mode, specifically, when we have a scaled register and a scale. It requires two API updates in TII which have been implemented for X86. See added IR and MIR testcases. Tests-Run: make check Reviewed-By: reames, danstrushin Differential Revision: https://reviews.llvm.org/D87148	2020-10-07 20:55:38 -04:00
Mircea Trofin	297655c123	[NFC][regalloc] Use MCRegister instead of unsigned in InterferenceCache Also changed users of APIs. Differential Revision: https://reviews.llvm.org/D88930	2020-10-07 14:48:43 -07:00
Rahman Lavaee	34cd06a9b3	[BasicBlockSections] Make sure that the labels for address-taken blocks are emitted after switching the section. Currently, AsmPrinter code is organized in a way in which the labels of address-taken blocks are emitted in the previous section, which makes the relocation incorrect. This patch reorganizes the code to switch to the basic block section before handling address-taken blocks. Reviewed By: snehasish, MaskRay Differential Revision: https://reviews.llvm.org/D88517	2020-10-07 13:22:38 -07:00
Amara Emerson	e72cfd938f	Rename the VECREDUCE_STRICT_{FADD,FMUL} SDNodes to VECREDUCE_SEQ_{FADD,FMUL}. The STRICT was causing unnecessary confusion. I think SEQ is a more accurate name for what they actually do, and the other obvious option of "ORDERED" has the issue of already having a meaning in FP contexts. Differential Revision: https://reviews.llvm.org/D88791	2020-10-07 10:45:09 -07:00
Amara Emerson	322d0afd87	[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics. This change renames the intrinsics to not have "experimental" in the name. The autoupgrader will handle legacy intrinsics. Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html Differential Revision: https://reviews.llvm.org/D88787	2020-10-07 10:36:44 -07:00
Jay Foad	1aa8e6a51a	[SDag] SimplifyDemandedBits: simplify to FP constant if all bits known We were already doing this for integer constants. This patch implements the same thing for floating point constants. Differential Revision: https://reviews.llvm.org/D88570	2020-10-07 09:24:38 +01:00
Chen Zheng	ed46e84c7a	[MachineInstr] exclude call instruction in mayAlias We currently get a noAlias result when querying mayAlias between a call instruction and other load/store/call instructions. This is not right: a call instruction is not marked mayLoadOrStore, but it may still alter memory. This patch fixes this wrong alias query. Differential Revision: https://reviews.llvm.org/D87490	2020-10-07 00:12:21 -04:00
Bill Wendling	d2c61d2bf9	[CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node after the INLINEASM_BR, which is not okay. See: https://github.com/ClangBuiltLinux/linux/issues/1125 Differential Revision: https://reviews.llvm.org/D88823	2020-10-06 18:44:59 -07:00
Mircea Trofin	d85b845cb2	[NFC][MC] Type uses of MCRegUnitIterator as MCRegister This is one of many subsequent similar changes. Note that we're ok with the parameter being typed as MCPhysReg, as MCPhysReg -> MCRegister is a correct conversion; Register -> MCRegister assumes the former is indeed physical, so we stop relying on the implicit conversion and use the explicit, value-asserting asMCReg(). Differential Revision: https://reviews.llvm.org/D88862	2020-10-06 12:09:56 -07:00
Dmitri Gribenko	b3876ef490	Silence -Wunused-variable in NDEBUG mode	2020-10-06 16:02:17 +02:00
Denis Antrushin	c08d48fc2d	[Statepoints] Change statepoint machine instr format to better suit VReg lowering. Current Statepoint MI format is this: STATEPOINT <id>, <num patch bytes >, <num call arguments>, <call target>, [call arguments...], <StackMaps::ConstantOp>, <calling convention>, <StackMaps::ConstantOp>, <statepoint flags>, <StackMaps::ConstantOp>, <num deopt args>, [deopt args...], <gc base/derived pairs...> <gc allocas...> Note that GC pointers are listed in pairs <base,derived>. This causes base pointers to appear many times (at least twice) in instruction, which is bad for us when VReg lowering is ON. The problem is that machine operand tiedness is 1-1 relation, so it might look like this: %vr2 = STATEPOINT ... %vr1, %vr1(tied-def0) Since only one instance of %vr1 is tied, that may lead to incorrect codegen (see PR46917 for more details), so we have to always spill base pointers. This mostly defeats new VReg lowering scheme. This patch changes statepoint instruction format so that every gc pointer appears only once in operand list. That way they all can be tied. Additional set of operands is added to preserve base-derived relation required to build stackmap. New statepoint has following format: STATEPOINT <id>, <num patch bytes>, <num call arguments>, <call target>, [call arguments...], <StackMaps::ConstantOp>, <calling convention>, <StackMaps::ConstantOp>, <statepoint flags>, <StackMaps::ConstantOp>, <num deopt args>, [deopt args...], <StackMaps::ConstantOp>, <num gc pointers>, [gc pointers...], <StackMaps::ConstantOp>, <num gc allocas>, [gc allocas...] <StackMaps::ConstantOp>, <num entries in gc map>, [base/derived indices...] Changes are: - every gc pointer is listed only once in a flat length-prefixed list; - alloca list is prefixed with its length too; - following alloca list is length-prefixed list of base-derived indices of pointers from gc pointer list. Note that indices are logical (number of pointer), not absolute (index of machine operand). 
Differential Revision: https://reviews.llvm.org/D87154	2020-10-06 17:40:29 +07:00
David Sherwood	4ed47d50ea	[SVE][CodeGen] Fix DAGCombiner::ForwardStoreValueToDirectLoad for scalable vectors In DAGCombiner::ForwardStoreValueToDirectLoad I have fixed up some implicit casts from TypeSize -> uint64_t and replaced calls to getVectorNumElements() with getVectorElementCount(). There are some simple cases of forwarding that we can definitely support for scalable vectors, i.e. when the store and load are both scalable vectors and have the same size. I have added tests for the new code paths here: CodeGen/AArch64/sve-forward-st-to-ld.ll Differential Revision: https://reviews.llvm.org/D87098	2020-10-06 08:04:03 +01:00
Carl Ritson	ea9d6392f4	Fix reordering of instructions during VirtRegRewriter unbundling When unbundling COPY bundles in VirtRegRewriter the start of the bundle is not correctly referenced in the unbundling loop. The effect of this is that unbundled instructions are sometimes inserted out-of-order, particularly in cases where multiple reorderings have been applied to avoid clobbering dependencies. The resulting instruction sequence clobbers dependencies. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D88821	2020-10-06 09:43:02 +09:00
Mircea Trofin	b268e24d43	[NFC][regalloc] Separate iteration from AllocationOrder This separates the two concerns - encapsulation of traversal order; and iteration. Differential Revision: https://reviews.llvm.org/D88256	2020-10-05 16:13:18 -07:00
Craig Topper	1127662c6d	[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS. getNode handling for ISD::SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter we can ensure any calls to getNode/getSetcc during canonicalization will also get the flags. Differential Revision: https://reviews.llvm.org/D88063	2020-10-05 14:55:23 -07:00
Fangrui Song	27e1cc6f39	Cleanup CodeGen/CallingConvLower.cpp Patch by pi1024e (email unavailable) Differential Revision: https://reviews.llvm.org/D82593	2020-10-05 14:47:46 -07:00
Jon Roelofs	db80cc397e	[CodeGen][MachineSched] Fixup function name typo. NFC	2020-10-05 12:43:50 -07:00
Mircea Trofin	82ebbcfb05	[NFC][regalloc] Model weight normalization as a virtual Continuing from D88499, we can now model the normalization function as a virtual member of VirtRegAuxInfo. Note that the default (normalizeSpillWeight) is also used stand-alone in RAGreedy. Differential Revision: https://reviews.llvm.org/D88713	2020-10-05 11:33:07 -07:00
David Sherwood	fa0293081d	[SVE][CodeGen] Fix TypeSize/ElementCount related warnings in sve-split-store.ll I have fixed up a number of warnings resulting from TypeSize -> uint64_t casts and calling getVectorNumElements() on scalable vector types. I think most of the changes are fairly trivial except for those in DAGTypeLegalizer::SplitVecRes_MSTORE I've tried to ensure we create the MachineMemoryOperands in a sensible way for scalable vectors. I have added a CHECK line to the following test: CodeGen/AArch64/sve-split-store.ll that ensures no new warnings are added. Differential Revision: https://reviews.llvm.org/D86928	2020-10-05 19:27:00 +01:00
Amara Emerson	c2bce848ec	[GlobalISel] Fix CSEMIRBuilder silently allowing use-before-def. If a CSEMIRBuilder query hits the instruction at the current insert point, move insert point ahead one so that subsequent uses of the builder don't end up with uses before defs. This fix also shows that AMDGPU was also affected by this bug often, but got away with it because it was using a G_IMPLICIT_DEF before the use. Differential Revision: https://reviews.llvm.org/D88605	2020-10-05 11:00:00 -07:00
Qiu Chaofan	b326d4ff94	[SelectionDAG] Don't remove unused negated constant immediately This partially reverts `a2fb5446` (actually, `2508ef01`), which removed a negated FP constant immediately if it had no uses. However, as discussed in bug 47517, there are cases where NegX is folded into a constant from other places while NegY is removed by that line of code, and NegX is equal to NegY. In these cases, NegX is deleted before it is used and a crash happens. So revert the code and add the necessary test case.	2020-10-06 01:16:45 +08:00
Sanjay Patel	2ccbf3dbd5	[SDAG] fold x * 0.0 at node creation time In the motivating case from https://llvm.org/PR47517 we create a node that does not get constant folded before getNegatedExpression is attempted from some other node, and we crash. By moving the fold into SelectionDAG::simplifyFPBinop(), we get the constant fold sooner and avoid the problem.	2020-10-04 11:31:57 -04:00
Denis Antrushin	7b19cd06d7	[Statepoints][ISEL] visitGCRelocate: chain to current DAG root. This is similar to D87251, but for CopyFromRegs nodes. Even for local statepoint uses we generate CopyToRegs/CopyFromRegs nodes. When generating CopyFromRegs in visitGCRelocate, we must chain to current DAG root, not EntryNode, to ensure proper ordering of copy w.r.t. statepoint node producing result for it. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D88639	2020-10-02 21:41:22 +07:00
David Sherwood	b0ce9f0f4c	[SVE][CodeGen] Fix implicit TypeSize->uint64_t casts in TypePromotion The TypePromotion pass only operates on scalar types so I've fixed up all places where we were relying upon the implicit cast from TypeSize->uint64_t. Differential Revision: https://reviews.llvm.org/D88575	2020-10-02 08:12:11 +01:00
David Sherwood	b8ce6a6756	[SVE][CodeGen] Add new EVT/MVT getFixedSizeInBits() functions When we know that a particular type is always going to be fixed width we have so far been writing code like this: getSizeInBits().getFixedSize() Since we are doing this in quite a few places now it seems to make sense to add a new helper function that allows us to replace these calls with a single getFixedSizeInBits() call. Differential Revision: https://reviews.llvm.org/D88649	2020-10-02 07:47:31 +01:00
Carl Ritson	5136f4748a	CodeGen: Fix livein calculation in MachineBasicBlock splitAt Fix and simplify computation of liveins for new block. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88535	2020-10-02 10:45:04 +09:00
jasonliu	78a9e62aa6	[XCOFF] Enable -fdata-sections on AIX Summary: A design decision worth noting: I've noticed a recent mailing list discussion about why string literals are not affected by -fdata-sections for ELF targets: http://lists.llvm.org/pipermail/llvm-dev/2020-September/145121.html But on AIX, our linker cannot split mergeable strings the way linkers for other targets can. So I think it makes more sense for us to emit a separate csect for every mergeable string in -fdata-sections mode, as there might not be other ways for the linker to garbage-collect unused mergeable strings. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D88339	2020-10-02 00:16:24 +00:00
Arthur Eubanks	499260c03b	Revert "[CFGuard] Add address-taken IAT tables and delay-load support" This reverts commit `ef4e971e5e`.	2020-10-01 11:29:54 -07:00
David Sherwood	15474d7691	[SVE][CodeGen] Replace use of TypeSize operator< in GlobalMerge::doMerge We don't support global variables with scalable vector types so I've changed the code to compare the fixed sizes instead. Differential Revision: https://reviews.llvm.org/D88564	2020-10-01 14:06:59 +01:00
Andrew Paverd	ef4e971e5e	[CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-10-01 12:45:07 +01:00
Kerry McLaughlin	fcf70e1e3b	[SVE][CodeGen] Lower scalable fp_extend & fp_round operations This patch adds FP_EXTEND_MERGE_PASSTHRU & FP_ROUND_MERGE_PASSTHRU ISD nodes, used to lower scalable vector fp_extend/fp_round operations. fp_round has an additional argument, the 'trunc' flag, which is an integer of zero or one. This also fixes a warning introduced by the new tests added to sve-split-fcvt.ll, resulting from an implicit TypeSize -> uint64_t cast in SplitVecOp_FP_ROUND. Reviewed By: sdesmalen, paulwalker-arm Differential Revision: https://reviews.llvm.org/D88321	2020-10-01 12:17:37 +01:00
Rahman Lavaee	8955950c12	Exception support for basic block sections This is part of the Propeller framework to do post link code layout optimizations. Please see the RFC here: https://groups.google.com/forum/#!msg/llvm-dev/ef3mKzAdJ7U/1shV64BYBAAJ and the detailed RFC doc here: https://github.com/google/llvm-propeller/blob/plo-dev/Propeller_RFC.pdf This patch provides exception support for basic block sections by splitting the call-site table into call-site ranges corresponding to different basic block sections. Still all landing pads must reside in the same basic block section (which is guaranteed by the core basic block section patch D73674 (ExceptionSection) ). Each call-site table will refer to the landing pad fragment by explicitly specifying @LPstart (which is omitted in the normal non-basic-block section case). All these call-site tables will share their action and type tables. The C++ ABI somehow assumes that no landing pads point directly to LPStart (which works in the normal case since the function begin is never a landing pad), and uses LP.offset = 0 to specify no landing pad. In the case of basic block section where one section contains all the landing pads, the landing pad offset relative to LPStart could actually be zero. Thus, we avoid zero-offset landing pads by inserting a nop operation as the first non-CFI instruction in the exception section. Background on Exception Handling in C++ ABI https://github.com/itanium-cxx-abi/cxx-abi/blob/master/exceptions.pdf Compiler emits an exception table for every function. When an exception is thrown, the stack unwinding library queries the unwind table (which includes the start and end of each function) to locate the exception table for that function. The exception table includes a call site table for the function, which is used to guide the exception handling runtime to take the appropriate action upon an exception. 
Each call site record in this table is structured as follows: \| CallSite \| --> Position of the call site (relative to the function entry) \| CallSite length \| --> Length of the call site. \| Landing Pad \| --> Position of the landing pad (relative to the landing pad fragment’s begin label) \| Action record offset \| --> Position of the first action record The call site records partition a function into different pieces and describe what action must be taken for each callsite. The callsite fields are relative to the start of the function (as captured in the unwind table). The landing pad entry is a reference into the function and corresponds roughly to the catch block of a try/catch statement. When execution resumes at a landing pad, it receives an exception structure and a selector value corresponding to the type of the exception thrown, and executes similar to a switch-case statement. The landing pad field is relative to the beginning of the procedure fragment which includes all the landing pads (@LPStart). The C++ ABI requires all landing pads to be in the same fragment. Nonetheless, without basic block sections, @LPStart is the same as the function @Start (found in the unwind table) and can be omitted. The action record offset is an index into the action table which includes information about which exception types are caught. C++ Exceptions with Basic Block Sections Basic block sections break the contiguity of a function fragment. Therefore, call sites must be specified relative to the beginning of the basic block section. Furthermore, the unwinding library should be able to find the corresponding callsites for each section. To do so, the .cfi_lsda directive for a section must point to the range of call-sites for that section. This patch introduces a new CallSiteRange structure which specifies the range of call-sites which correspond to every section: `struct CallSiteRange { // Symbol marking the beginning of the procedure fragment. 
MCSymbol *FragmentBeginLabel = nullptr; // Symbol marking the end of the procedure fragment. MCSymbol *FragmentEndLabel = nullptr; // LSDA symbol for this call-site range. MCSymbol *ExceptionLabel = nullptr; // Index of the first call-site entry in the call-site table which // belongs to this range. size_t CallSiteBeginIdx = 0; // Index just after the last call-site entry in the call-site table which // belongs to this range. size_t CallSiteEndIdx = 0; // Whether this is the call-site range containing all the landing pads. bool IsLPRange = false; };` With N basic-block-sections, the call-site table is partitioned into N call-site ranges. Conceptually, we emit the call-site ranges for sections sequentially in the exception table as if each section has its own exception table. In the example below, two sections result in the two call site ranges (denoted by LSDA1 and LSDA2) placed next to each other. However, their call-sites will refer to records in the shared Action Table. We also emit the header fields (@LPStart and CallSite Table Length) for each call site range in order to place the call site ranges in separate LSDAs. We note that with -basic-block-sections, the CallSiteTableLength will not actually represent the length of the call site table, but rather the reference to the action table. Since the only purpose of this field is to locate the action table, correctness is guaranteed. Finally, every call site range has one @LPStart pointer so the landing pads of each section must all reside in one section (not necessarily the same section). To make this easier, we decide to place all landing pads of the function in one section (hence the `IsLPRange` field in CallSiteRange). \| @LPStart \| ---> Landing pad fragment ( LSDA1 points here) \| CallSite Table Length \| ---> Used to find the action table. 
\| CallSites \| \| … \| \| … \| \| @LPStart \| ---> Landing pad fragment ( LSDA2 points here) \| CallSite Table Length \| \| CallSites \| \| … \| \| … \| … … \| Action Table \| \| Types Table \| Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D73739	2020-09-30 11:05:55 -07:00
Mircea Trofin	d6de40f886	[NFC][regalloc] Make VirtRegAuxInfo part of allocator state All the state of VRAI is allocator-wide, so we can avoid creating it every time we need it. In addition, the normalization function is allocator-specific. In a next change, we can simplify that design in favor of just having it as a virtual member. Differential Revision: https://reviews.llvm.org/D88499	2020-09-30 08:13:05 -07:00
Matt Arsenault	5aa1119537	GlobalISel: Assert if MoreElements uses a non-vector type	2020-09-30 10:36:00 -04:00
Matt Arsenault	d93459992e	LiveDebugValues: Fix typos and indentation	2020-09-30 10:35:25 -04:00
Matt Arsenault	a66fca44ac	RegAllocFast: Add extra DBG_VALUE for live out spills This allows LiveDebugValues to insert the proper DBG_VALUEs in live out blocks if a spill is inserted before the use of a register. Previously, this would see the register use as the last DBG_VALUE, even though the stack slot should be treated as the live out value. This avoids an lldb test regression when D52010 is re-applied.	2020-09-30 10:35:25 -04:00
Matt Arsenault	89baeaef2f	Reapply "RegAllocFast: Rewrite and improve" This reverts commit `73a6a164b8`.	2020-09-30 10:35:25 -04:00
Gabriel Hjort Åkerlund	43d239d0fa	[GlobalISel] Fix incorrect setting of ValNo when splitting Before, for each original argument i, ValNo was set to i + PartIdx, but ValNo is intended to reflect the index of the value before splitting. Hence, ValNo should always be set to i and not consider the PartIdx. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86511	2020-09-30 16:08:51 +02:00
Sam Parker	3f88c10a6b	[RDA] isSafeToDefRegAt: Look at global uses We weren't looking at global uses of a value, so we could happily overwrite the register incorrectly. Differential Revision: https://reviews.llvm.org/D88554	2020-09-30 14:06:45 +01:00
Jay Foad	cdac4492b4	[SplitKit] Cope with no live subranges in defFromParent Following on from D87757 "[SplitKit] Only copy live lanes", it is possible to split a live range at a point when none of its subranges are live. This patch handles that case by inserting an implicit def of the superreg. Patch by Quentin Colombet! Differential Revision: https://reviews.llvm.org/D88397	2020-09-30 10:16:25 +01:00
Sam Parker	779a8a028f	[ARM][LowOverheadLoops] TryRemove helper. Make a helper function that wraps around RDA::isSafeToRemove and utilises the existing DCE IT block checks.	2020-09-30 09:37:24 +01:00
Sam Parker	700f93e92b	[RDA] Switch isSafeToMove iterators So forwards is forwards and backwards is reverse. Also add a check so that we know the instructions are in the expected order. Differential Revision: https://reviews.llvm.org/D88419	2020-09-30 08:10:48 +01:00
Amara Emerson	1d54e75cf2	[GlobalISel] Fix multiply with overflow intrinsics legalization generating invalid MIR. During lowering of G_UMULO and friends, the previous code moved the builder's insertion point to be after the legalizing instruction. When that happened, if there happened to be a "G_CONSTANT i32 0" immediately after, the CSEMIRBuilder would try to find that constant during the buildConstant(zero) call, and since it dominates itself would return the iterator unchanged, even though the def of the constant was after the current insertion point. This resulted in the compare being generated before the constant which it was using. There's no need to modify the insertion point before building the mul-hi or constant. Delaying moving the insert point ensures those are built/CSEd before the G_ICMP is built. Fixes PR47679 Differential Revision: https://reviews.llvm.org/D88514	2020-09-29 18:40:58 -07:00
Zequan Wu	6c91e623e5	[CodeGen] emit CG profile for COFF object file Differential Revision: https://reviews.llvm.org/D87811	2020-09-29 12:03:30 -07:00
Mircea Trofin	6d193ba333	[NFC][regalloc] Unit test for AllocationOrder iteration. Added unittests. In the process, separated core construction - which just needs the hints, order, and 'HardHints' values - from construction from current register allocation state, to simplify testing. Differential Revision: https://reviews.llvm.org/D88455	2020-09-29 10:48:07 -07:00
Krzysztof Parzyszek	db04bec5f1	[SDAG] Do not convert undef to 0 when folding CONCAT/BUILD_VECTOR Differential Revision: https://reviews.llvm.org/D88273	2020-09-29 09:12:26 -05:00
Dominik Montada	113114a5da	[GlobalISel] fix widenScalarUnmerge if widen type is not a multiple of destination type Fix creation of illegal unmerge when widen was requested to a type which is not a multiple of the destination type. E.g. when trying to widen an s48 unmerge to s64 the existing code would create an illegal unmerge from s64 to s48. Instead, create further unmerges to a GCD type, then use this to remerge these intermediate results to the actual destinations. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88422	2020-09-29 15:52:20 +02:00
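The GCD-based remerge described in the entry above can be modelled on plain integers. The sketch below is only an illustration under assumptions: the helper names are hypothetical, values are treated as bit strings split lowest piece first, and it is not the legalizer code from D88422. For a destination type of s48 widened to s64, the GCD type is s16, and three s16 pieces are remerged into each s48 result.

```python
from math import gcd


def split_bits(value, total_bits, piece_bits):
    """Split `value` into total_bits // piece_bits pieces, lowest piece first."""
    mask = (1 << piece_bits) - 1
    return [(value >> (i * piece_bits)) & mask
            for i in range(total_bits // piece_bits)]


def merge_bits(pieces, piece_bits):
    """Inverse of split_bits: concatenate pieces, lowest piece first."""
    result = 0
    for i, piece in enumerate(pieces):
        result |= piece << (i * piece_bits)
    return result


def unmerge_via_gcd(value, src_bits, dst_bits, wide_bits):
    """Hypothetical model: unmerge to GCD-sized pieces, then remerge them
    into dst_bits-sized results (assumes src_bits is a multiple of dst_bits)."""
    g = gcd(wide_bits, dst_bits)           # e.g. gcd(64, 48) == 16
    small = split_bits(value, src_bits, g)
    per_dst = dst_bits // g                # GCD pieces per destination
    return [merge_bits(small[i:i + per_dst], g)
            for i in range(0, len(small), per_dst)]
```

In this model, `unmerge_via_gcd(v, 96, 48, 64)` yields the same two 48-bit halves as a direct split of a 96-bit value, without ever forming an s64-to-s48 unmerge.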
Jay Foad	781edd501c	[SDag] Verify DAG divergence after dumping. NFC. When debugging, it's useful to be able to see the DAG that has just failed divergence verification.	2020-09-29 14:05:07 +01:00
Jay Foad	d6b04f3937	[SDag] Refactor and simplify divergence calculation and checking. NFC.	2020-09-29 14:05:07 +01:00
Ruiling Song	73805329ba	[RegisterCoalescer] Pass Undefs to extendToIndices() When extending the subranges, the reaching def may be an undef. When extending such a subrange, it will try to search for the reaching def first. If the reaching def is an undef and we did not provide 'Undefs', findReachingDefs() will fail with the message: "Use of $noreg does not have a corresponding definition on every path: LLVM ERROR: Use not jointly dominated by defs." So we call computeSubRangeUndefs() and pass the result to extendToIndices(). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D87744	2020-09-29 08:14:24 +08:00
Fangrui Song	bd08a87cfe	[EHStreamer] Simplify sharedTypeIDs with std::mismatch (Note that EHStreamer.cpp is largely under-tested. The only test checking the prefix sharing is CodeGen/WebAssembly/eh-lsda.ll)	2020-09-28 15:05:59 -07:00
Amara Emerson	082321909e	[GlobalISel] Add support for lowering of vector G_SELECT and use for AArch64. The lowering is a port of the SDAG expansion. Differential Revision: https://reviews.llvm.org/D88364	2020-09-28 14:00:46 -07:00
Jessica Paquette	a52e78012a	[GlobalISel] Combine (xor (and x, y), y) -> (and (not x), y) When we see this: ``` %and = G_AND %x, %y %xor = G_XOR %and, %y ``` Produce this: ``` %not = G_XOR %x, -1 %new_and = G_AND %not, %y ``` as long as we are guaranteed to eliminate the original G_AND. Also matches all commuted forms. E.g. ``` %and = G_AND %y, %x %xor = G_XOR %y, %and ``` will be matched as well. Differential Revision: https://reviews.llvm.org/D88104	2020-09-28 10:08:14 -07:00
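The identity behind the combine in the entry above, (xor (and x, y), y) == (and (not x), y), can be checked exhaustively on small integers. This is a minimal sketch that models 8-bit values in Python; the function names are illustrative and it says nothing about the one-use condition on the original G_AND.

```python
MASK = 0xFF  # model 8-bit values; G_XOR %x, -1 becomes x ^ MASK


def original(x, y):
    """%and = G_AND %x, %y ; %xor = G_XOR %and, %y"""
    return ((x & y) ^ y) & MASK


def combined(x, y):
    """%not = G_XOR %x, -1 ; %new_and = G_AND %not, %y"""
    not_x = x ^ MASK
    return (not_x & y) & MASK


def identity_holds():
    """Compare both forms for every pair of 8-bit inputs."""
    return all(original(x, y) == combined(x, y)
               for x in range(256) for y in range(256))
```

Because AND and XOR are commutative, the commuted forms mentioned in the message reduce to the same check.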
David Sherwood	bafdd11326	[SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy After some recent upstream discussion we decided that it was best to avoid having the / operator for both ElementCount and TypeSize, since this could give the impression that these classes can be used in the same way as basic integer types. However, division for scalable types is a bit odd because we are only dividing the minimum quantity by a value, as opposed to something like: (MinSize * Vscale) / SomeValue This is why when performing division it's important the caller first establishes whether the operation makes sense, perhaps by calling isKnownMultipleOf() prior to division. The caller must now explicitly call divideCoefficientBy() on the class to perform the operation. Differential Revision: https://reviews.llvm.org/D87700	2020-09-28 08:03:00 +01:00
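To illustrate why the entry above describes division of scalable quantities as "a bit odd", here is a small Python model. The class and method names are hypothetical stand-ins that only mirror the idea of isKnownMultipleOf() and divideCoefficientBy(); this is not the LLVM API. Only the known minimum is divided, and the unknown vscale multiplier is carried through untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementCountModel:
    """Hypothetical model: real count is min_value * vscale when scalable."""
    min_value: int
    scalable: bool

    def is_known_multiple_of(self, rhs):
        # The minimum is a multiple of rhs, so min_value * vscale is too.
        return self.min_value % rhs == 0

    def divide_coefficient_by(self, rhs):
        # Divides only the coefficient; the scalable flag is preserved.
        return ElementCountModel(self.min_value // rhs, self.scalable)


def halve_if_possible(count):
    """Caller establishes that the division makes sense before dividing."""
    if count.is_known_multiple_of(2):
        return count.divide_coefficient_by(2)
    return None
```

An explicit method name makes it harder to mistake the result for ordinary integer division of the full size.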
Arthur Eubanks	a2578e92e2	Revert "Reland [CodeGen] emit CG profile for COFF object file" This reverts commit `506b6170cb`. This still causes link errors, see https://crbug.com/1130780.	2020-09-27 22:43:14 -07:00
Nikita Popov	f229bf2e12	[Legalize][X86] Improve nnan fmin/fmax vector reduction Use +/-Inf or +/-Largest as neutral element for nnan fmin/fmax reductions. This avoids dropping any FMF flags. Preserving the nnan flag in particular is important to get a good lowering on X86. Differential Revision: https://reviews.llvm.org/D87586	2020-09-27 10:47:35 +02:00
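The role of a neutral element in the entry above can be sketched as follows: when a vector is padded to a wider width before a reduction, the padding lanes must not change the result. This Python model assumes no NaN inputs (the nnan case) and uses only the infinities; the "+/-Largest" variant mentioned in the message is not modelled, and the helper names are illustrative.

```python
import math


def pad_with_neutral(values, width, neutral):
    """Pad `values` up to `width` lanes using the reduction's neutral element."""
    return list(values) + [neutral] * (width - len(values))


def reduce_fmin(values):
    """min reduction; +inf is neutral because min(+inf, v) == v."""
    result = math.inf
    for v in values:
        result = min(result, v)
    return result


def reduce_fmax(values):
    """max reduction; -inf is neutral because max(-inf, v) == v."""
    result = -math.inf
    for v in values:
        result = max(result, v)
    return result
```

Padding a 3-lane input to 4 lanes with `math.inf` for fmin, or `-math.inf` for fmax, leaves the reduced value unchanged.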
Chen Zheng	c8f6c0f961	[Machinesink] add one more profitable loop related pattern Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D86925	2020-09-26 21:02:21 -04:00
Simon Pilgrim	a61272a900	[DAG] Fold vector mul(x,0)/mul(x,1) to a clearing mask If we're multiplying all elements of a vector by '0' or '1' then we can more efficiently perform this as a clearing mask (that is likely to further simplify to a shuffle blend). This was noticed when reviewing D87502 but seems to help idiv/irem by constant cases even more as '0'/'1' values are often used for 'passthrough' cases. Differential Revision: https://reviews.llvm.org/D88225	2020-09-26 14:31:57 +01:00
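The equivalence used by the fold in the entry above can be shown lane by lane: multiplying by 1 keeps a lane, multiplying by 0 clears it, which is what an AND with an all-ones or all-zeros mask does. A minimal sketch on 32-bit lanes, with illustrative helper names:

```python
MASK32 = 0xFFFFFFFF  # model 32-bit lanes


def mul_by_constants(xs, cs):
    """Lane-wise multiply, wrapped to 32 bits."""
    return [(x * c) & MASK32 for x, c in zip(xs, cs)]


def clearing_mask(cs):
    """All-ones where the multiplier is 1, all-zeros where it is 0."""
    return [MASK32 if c == 1 else 0 for c in cs]


def and_with_clearing_mask(xs, cs):
    """Lane-wise AND with the clearing mask derived from the multipliers."""
    return [x & m for x, m in zip(xs, clearing_mask(cs))]
```

The two functions agree only when every multiplier is 0 or 1, which is the precondition stated in the commit message.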
Simon Pilgrim	decc1944f3	MachineCSE.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI.	2020-09-26 14:31:57 +01:00
Simon Atanasyan	c6c5629f2f	[CodeGen] Do not call `emitGlobalConstantLargeInt` for constants requiring 8 bytes to store This is a fix for PR47630. The regression is caused by D78011. After that change the code starts to call `emitGlobalConstantLargeInt` even for constants which require eight bytes to store. Differential revision: https://reviews.llvm.org/D88261	2020-09-26 08:58:46 +03:00
Qiu Chaofan	c0f8e4c06c	[SelectionDAG] Add guard to automatically insert flags This is like FastMathFlagGuard in IR. Since we use SDAG instance to get values, it's with SelectionDAG. By creating a FlagInserter in current scope, all values created by getNode will get the flags if no Flags argument provided. In this patch, I applied it to floating point operations folding part in DAG combiner, and removed Flags passing to getNode to show its effect. Other places in DAG combiner and other helper methods similar to getNode also need this. They can be done in follow-up patches. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D87361	2020-09-26 13:57:52 +08:00
Dávid Bolvanský	179e15d53a	[SystemZ] Optimize bcmp calls (PR47420) Solves https://bugs.llvm.org/show_bug.cgi?id=47420 Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87988	2020-09-25 17:55:39 +02:00
Jay Foad	b34ddfcc76	[SplitKit] In addDeadDef tolerate parent range that defines more lanes Following on from D87757 "[SplitKit] Only copy live lanes", in SplitEditor::addDeadDef, when we're checking whether the parent live interval has a subrange defining the same lanes, tolerate the case where the parent subrange defines a superset of the lanes. This can happen when the child subrange comes from SplitEditor::buildCopy decomposing a partial copy into a sequence of subreg copies that cover the required lanes. Differential Revision: https://reviews.llvm.org/D88020	2020-09-25 11:31:56 +01:00
Sam Parker	a399d1880b	[ARM] Find VPT implicitly predicated by VCTP On failing to find a VCTP in the list of instructions that explicitly predicate the entry of a VPT block, inspect whether the block is controlled via VPT which is implicitly predicated due to its predicated operand(s). Differential Revision: https://reviews.llvm.org/D87819	2020-09-25 08:50:53 +01:00
Snehasish Kumar	d2696dec45	[llvm] Add -bbsections-cold-text-prefix to emit cold clusters to a different section. This change adds an option to basic block sections to allow cold clusters to be assigned a custom text prefix. With a custom prefix such as ".text.split." (D87840), lld can place them in a separate output section. The benefits are - * Empirically shown to improve icache and itlb metrics by 3-5% (absolute) compared to placing split parts in .text.unlikely. * Mitigates against poor profiles, eg samplePGO profiles used with the machine function splitter. Optimizations such as hugepage remapping can make different decisions at the section granularity. * Enables section granularity hotness monitoring (checking on the decisions made during compilation vs sample data from production). Differential Revision: https://reviews.llvm.org/D87813	2020-09-24 15:26:15 -07:00
Sriraman Tallam	e39286510d	Temporary fix for D85085 debug_loc bug with basic block sections. Until then, this one line fix removes the assert fail with basic block sections with debug info. Bug tracking this: #47549 This fix does not generate loc list or DW_AT_const_value if the argument is mentioned in a different section than the start of the function. Temporarily fixes bugzilla : https://bugs.llvm.org/show_bug.cgi?id=47549 Differential Revision: https://reviews.llvm.org/D87787	2020-09-24 14:41:49 -07:00
Zequan Wu	506b6170cb	Reland [CodeGen] emit CG profile for COFF object file This reverts commit `90242caca2`. Error fixed at `f5435399e8` Differential Revision: https://reviews.llvm.org/D87811	2020-09-24 14:38:53 -07:00
Bill Wendling	0c0c57f7b2	Revert "[CodeGen] Postprocess PHI nodes for callbr" Accidental commit. This reverts commit `7f4c940bd0`.	2020-09-24 14:35:23 -07:00
Bill Wendling	7f4c940bd0	[CodeGen] Postprocess PHI nodes for callbr When processing PHI nodes after a callbr, we need to make sure that the PHI nodes on the default branch are resolved after the callbr (inserted after INLINEASM_BR). The PHI node values on the indirect branches are processed before the INLINEASM_BR. Differential Revision: https://reviews.llvm.org/D86260	2020-09-24 14:34:28 -07:00
Mircea Trofin	89aad892a5	[NFC][regalloc] Remove unused API in AllocationOrder Differential Revision: https://reviews.llvm.org/D88197	2020-09-24 12:25:56 -07:00
Matt Arsenault	e75afc9acf	GlobalISel: Use unmerge when copying wide vectors to result registers Avoid using G_EXTRACT and move towards a more consistent vector legalization strategy.	2020-09-24 15:19:51 -04:00
vpykhtin	d9beff04a3	[RegisterCoalescer] Fix IMPLICIT_DEF init removal for a register on joining This patch removes redundant IMPLICIT_DEF for subregs which was leading to incorrect register initialization on joining in some cases. Reviewed by: qcolombet Differential revision: https://reviews.llvm.org/D82258	2020-09-24 17:37:03 +03:00
Alexandre Ganea	4b64ce7428	Improve `723fea2307` - Silence 'warning: unused variable' when compiling with Clang 10.0	2020-09-24 09:07:22 -04:00
Pushpinder Singh	41d6669f1f	[GlobalISel][AMDGPU] Lower G_SMULH/G_UMULH Reviewed By: arsenm, foad Differential Revision: https://reviews.llvm.org/D85653	2020-09-23 22:25:29 -04:00
Eli Friedman	3f739f736b	[SelectionDAG][GISel] Make LegalizeDAG lower FNEG using integer ops. Previously, if a floating-point type was legal, but FNEG wasn't legal, we would use FSUB. Instead, we should use integer ops, to preserve the semantics. (Alternatively, there's a compiler-rt call we could use, but there isn't much reason to use that.) It turns out we actually are still using this obscure codepath in a few cases: on some targets, we have "legal" floating-point types that don't actually support any floating-point operations. In particular, ARM and AArch64 are using this path. The implementation for SelectionDAG is pretty simple because we can reuse the infrastructure from FCOPYSIGN. See also `9a3dc3e`, the corresponding change to type legalization. Also includes a "bonus" change to STRICT_FSUB legalization, so we can lower a STRICT_FSUB to a float libcall. Includes the changes to both LegalizeDAG and GlobalISel so we don't have inconsistent results in the future. Fixes https://bugs.llvm.org/show_bug.cgi?id=46792 . Differential Revision: https://reviews.llvm.org/D84287	2020-09-23 14:10:33 -07:00
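Lowering FNEG with integer ops, as the entry above describes, amounts to flipping the sign bit of the value's bit pattern. The sketch below models this for IEEE-754 single precision using Python's `struct` module; the helper names are illustrative and this is not the LegalizeDAG or GlobalISel code.

```python
import struct

SIGN_BIT_F32 = 0x80000000  # sign bit of an IEEE-754 binary32 value


def f32_to_bits(x):
    """Reinterpret a float as its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits):
    """Reinterpret a 32-bit pattern as a float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fneg_f32_bits(bits):
    """Negate by XOR-ing the sign bit; no floating-point operation is used."""
    return bits ^ SIGN_BIT_F32
```

The XOR changes only the sign bit, so zero becomes negative zero and a NaN keeps its payload bits, with no floating-point arithmetic performed on the value.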
Guozhi Wei	fd75ad8662	[MBFIWrapper] Add a new function getBlockProfileCount MBFIWrapper keeps track of block frequencies of newly created and modified blocks; modified block frequencies should also affect the block profile count. The class provides no getBlockProfileCount interface, so users can only query the underlying MBFI for profile counts. The underlying MBFI doesn't know about the modifications made in MBFIWrapper, so it either returns a stale profile count for a modified block or simply crashes on new blocks. This patch adds the function getBlockProfileCount to class MBFIWrapper to handle new or modified blocks. Differential Revision: https://reviews.llvm.org/D87802	2020-09-23 09:31:45 -07:00
Matt Arsenault	c463fd136e	GlobalISel: Fix truncating shift amount in trunc (shl) combine The shift amount type does not necessarily match the result type. This was inserting a trunc from s32 to s32, which asserted. Just preserve the original shift amount type which can be legalized later.	2020-09-23 09:07:50 -04:00
David Sherwood	e077367a28	[SVE] Make EVT::getScalarSizeInBits and others consistent with Type::getScalarSizeInBits An existing function Type::getScalarSizeInBits returns a uint64_t instead of a TypeSize class because the caller is requesting a scalar size, which cannot be scalable. This patch makes other similar functions requesting a scalar size consistent with that, thereby eliminating more than 1000 implicit TypeSize -> uint64_t casts. Differential revision: https://reviews.llvm.org/D87889	2020-09-23 09:20:08 +01:00
Fangrui Song	bee68b2956	[EHStreamer] Ensure CallSiteEntry::{BeginLabel,EndLabel} are non-null. NFC ... to simplify the code a bit. Reviewed By: rahmanl Differential Revision: https://reviews.llvm.org/D87999	2020-09-22 17:34:43 -07:00
Reid Kleckner	90242caca2	Revert "[CodeGen] emit CG profile for COFF object file" This reverts commit `91aed9bf97`, it is causing link errors.	2020-09-22 13:47:39 -07:00
Stefanos Baziotis	89c1e35f3c	[LoopInfo] empty() -> isInnermost(), add isOutermost() Differential Revision: https://reviews.llvm.org/D82895	2020-09-22 23:28:51 +03:00
Mircea Trofin	d1e0f9f3cf	[NFC][regalloc] Simplify/conform to style guide indvars in Greedy Differential Revision: https://reviews.llvm.org/D88055	2020-09-22 10:55:52 -07:00
Simon Pilgrim	4dada8d617	[DAG] Remove DAGTypeLegalizer::GenWidenVectorTruncStores (PR42046) Just scalarize trunc stores - GenWidenVectorTruncStores does the same thing but is flawed (PR42046) and unused. Differential Revision: https://reviews.llvm.org/D87708	2020-09-22 17:24:45 +01:00
Alexandre Ganea	723fea2307	Silence 'warning: unused variable' when compiling with Clang 10.0	2020-09-22 12:17:40 -04:00
Michael Liao	534f6e1718	[PeepholeOptimizer] Enhance the redundant COPY elimination. - Eliminate redundant COPYs from the same register & subregister pair. Differential Revision: https://reviews.llvm.org/D87939	2020-09-22 10:11:37 -04:00
Muhammad Omair Javaid	73a6a164b8	Revert "Reapply Revert "RegAllocFast: Rewrite and improve"" This reverts commit `55f9f87da2`. Breaks following buildbots: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4306 http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/9154	2020-09-22 14:40:06 +05:00
Mircea Trofin	6a6b06f526	[NFC][regalloc] Use reverse iterator ranges for improved readability Differential Revision: https://reviews.llvm.org/D88047	2020-09-21 14:58:37 -07:00
Martin Storsjö	36c64af9d7	[CodeGen] [WinException] Only produce handler data at the end of the function if needed If we are going to write handler data (that is written as variable length data following after the unwind info in .xdata), we need to emit the handler data immediately, but for cases where no such info is going to be written, skip emitting it right away. (Unwind info for all remaining functions that hasn't gotten it emitted directly is emitted at the end.) This does slightly change the ordering of sections (triggering a bunch of updates to DebugInfo/COFF tests), but the change should be benign. This also matches GCC's assembly output, which doesn't output .seh_handlerdata unless it actually is needed. For ARM64, the unwind info can be packed into the runtime function entry itself (leaving no data in the .xdata section at all), but that can only be done if there's no follow-on data in the .xdata section. If emission of the unwind info is triggered via EmitWinEHHandlerData (or the .seh_handlerdata directive), which implicitly switches to the .xdata section, there's a chance of the caller wanting to pass further data there, so the packed format can't be used in that case. Differential Revision: https://reviews.llvm.org/D87448	2020-09-21 23:42:59 +03:00
Matt Arsenault	55f9f87da2	Reapply Revert "RegAllocFast: Rewrite and improve" This reverts commit `dbd53a1f0c`. Needed lldb test updates	2020-09-21 15:45:27 -04:00
Simon Pilgrim	6a0ed57a22	ImplicitNullChecks.cpp - use auto const& iterators in for-range loops to avoid copies. NFCI.	2020-09-21 17:42:57 +01:00
Simon Pilgrim	3ae07b2a33	TargetPassConfig.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 17:17:11 +01:00
Simon Pilgrim	ce294ff8cd	MachineCSE.cpp - use auto const& iterator in for-range loop to avoid copies. NFCI.	2020-09-21 16:54:26 +01:00
Denis Antrushin	ee86688b81	[Statepoints][ISEL] gc.relocate uniquification should be based on SDValue, not IR Value. When exporting statepoint results to virtual registers we try to avoid generating exports for duplicated inputs. But we erroneously use IR Value* to check if inputs are duplicated. Instead, we should use SDValue, because even different IR values can get lowered to the same SDValue. I'm adding a (degenerate) test case which emphasizes the importance of this feature for invoke statepoints. If we fail to export only unique values we will end up with something like this: %0 = STATEPOINT %1 = COPY %0 landing_pad: <use of %1> And when the exceptional path is taken, %1 is left uninitialized (COPY is never executed). Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87695	2020-09-21 19:44:46 +07:00
Alexander Belyaev	17dc729bd4	Revert "[NFC][ScheduleDAG] Remove unused EntrySU SUnit" This reverts commit `0345d88de6`. Google internal backend uses EntrySU, we are looking into removing dependency on it. Differential Revision: https://reviews.llvm.org/D88018	2020-09-21 13:33:05 +02:00
Lucas Prates	53d238a961	[CodeGen] Fixing inconsistent ABI mangling of values in SelectionDAGBuilder SelectionDAGBuilder was inconsistently mangling values based on ABI Calling Conventions when getting them through copyFromRegs in SelectionDAGBuilder, causing duplicate value type conversions for function arguments. The checking for the mangling requirement was based on the value's originating instruction and was performed outside of, and in spite of, the regular Calling Convention Lowering. The issue could be observed in a scenario such as: ``` %arg1 = load half, half* %const, align 2 %arg2 = call fastcc half @someFunc() call fastcc void @otherFunc(half %arg1, half %arg2) ; Here, %arg2 was incorrectly mangled twice, as the CallConv data from ; the call to @someFunc() was taken into consideration for the check ; when getting the value for processing the call to @otherFunc(...), ; after the proper conversion had taken place when lowering the return ; value of the first call. ``` This patch fixes the issue by disregarding the Calling Convention information for such copyFromRegs, making sure the ABI mangling is properly contained in the Calling Convention Lowering. This fixes Bugzilla #47454. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87844	2020-09-21 10:05:34 +01:00
Fangrui Song	dbc616e982	[EHStreamer] Fix a "Continue to action" -fverbose-asm comment when multi-byte LEB128 encoding is needed This only happens with more than 64 action records and it is difficult to construct a test.	2020-09-20 21:41:48 -07:00
Qiu Chaofan	1d782c2987	[PowerPC] Pass nofpexcept flag to custom lowered constrained ops This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept flag. But during custom lowering, this flag isn't passed down. This is also seen on X86 target. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87390	2020-09-21 10:44:25 +08:00
Fangrui Song	d06485685d	[XRay] Change mips to use version 2 sled (PC-relative address) Follow-up to D78590. All targets use PC-relative addresses now. Reviewed By: atanasyan, dberris Differential Revision: https://reviews.llvm.org/D87977	2020-09-20 17:59:57 -07:00
Fangrui Song	6913812abc	Fix some clang-tidy bugprone-argument-comment issues	2020-09-19 20:41:25 -07:00
Eric Christopher	dbd53a1f0c	Temporarily Revert "RegAllocFast: Rewrite and improve" as it's breaking a few tests in the lldb test suite. Bot: http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4226/steps/test/logs/stdio This reverts commit `c8757ff3aa`.	2020-09-18 18:11:21 -07:00
Fangrui Song	2ac06241d2	[LiveDebugValues] Add `#if !defined(NDEBUG) \|\| defined(LLVM_ENABLE_DUMP)` to suppress -Wunused-function	2020-09-18 17:25:37 -07:00
Amara Emerson	5d34d7f1a0	[GlobalISel] Add lowering support for G_ABS and use for AArch64. Differential Revision: https://reviews.llvm.org/D87952	2020-09-18 16:17:18 -07:00
Reid Kleckner	9932561b48	[COFF] Move per-global .drective emission from AsmPrinter to TLOFCOFF This changes the order of output sections and the output assembly, but is otherwise NFC. It simplifies the TLOF interface by removing two COFF-only methods.	2020-09-18 14:31:01 -07:00
James Y Knight	f7a53d82c0	PR47468: Fix findPHICopyInsertPoint, so that copies aren't incorrectly inserted after an INLINEASM_BR. findPHICopyInsertPoint special cases placement in a block with a callbr or invoke in it. In that case, we must ensure that the copy is placed before the INLINEASM_BR or call instruction, if the register is defined prior to that instruction, because it may jump out of the block. Previously, the code placed it immediately after the last def _or use_. This is wrong, if the use is the instruction which may jump. We could correctly place it immediately after the last def (ignoring uses), but that is non-optimal for register pressure. Instead, place the copy after the last def, or before the call/inlineasm_br, whichever is later. Differential Revision: https://reviews.llvm.org/D87865	2020-09-18 14:14:04 -04:00
Matt Arsenault	3105d0f84b	CodeGen: Move split block utility to MachineBasicBlock AMDGPU needs this in several places, so consolidate them here.	2020-09-18 14:05:18 -04:00
Matt Arsenault	c8757ff3aa	RegAllocFast: Rewrite and improve This rewrites big parts of the fast register allocator. The basic strategy of doing block-local allocation hasn't changed but I tweaked several details: Track register state on register units instead of physical registers. This simplifies and speeds up handling of register aliases. Process basic blocks in reverse order: Definitions are known to end register livetimes when walking backwards (contrary when walking forward then uses may or may not be a kill so we need heuristics). Check register mask operands (calls) instead of conservatively assuming everything is clobbered. Enhance heuristics to detect killing uses: In case of a small number of defs/uses check if they are all in the same basic block and if so the last one is a killing use. Enhance heuristic for copy-coalescing through hinting: We check the first k defs of a register for COPYs rather than relying on there just being a single definition. When testing this on the full llvm test-suite including SPEC externals I measured: average 5.1% reduction in code size for X86, 4.9% reduction in code on aarch64. (ranging between 0% and 20% depending on the test) 0.5% faster compiletime (some analysis suggests the pass is slightly slower than before, but we more than make up for it because later passes are faster with the reduced instruction count) Also adds a few testcases that were broken without this patch, in particular bug 47278. Patch mostly by Matthias Braun	2020-09-18 14:05:18 -04:00
Matt Arsenault	870fd53e4f	Reapply "RegAllocFast: Record internal state based on register units" The regressions this caused should be fixed when https://reviews.llvm.org/D52010 is applied. This reverts commit `a21387c654`.	2020-09-18 14:05:18 -04:00
Zequan Wu	91aed9bf97	[CodeGen] emit CG profile for COFF object file I forgot to add emission of CG profile for COFF object file, when adding the support (https://reviews.llvm.org/D81775) Differential Revision: https://reviews.llvm.org/D87811	2020-09-18 10:57:54 -07:00
Francis Visoiu Mistrih	0345d88de6	[NFC][ScheduleDAG] Remove unused EntrySU SUnit EntrySU doesn't seem to be used at all when building the ScheduleDAG. Differential Revision: https://reviews.llvm.org/D87867	2020-09-18 09:50:47 -07:00
Simon Pilgrim	d967aaa8fa	[DAG] BuildVectorSDNode::getSplatValue - pull out repeated getNumOperands() calls. NFCI.	2020-09-18 16:10:23 +01:00
Matt Arsenault	751a6c5760	IR: Move denormal mode parsing from MachineFunction to Function This was just inspecting the IR to begin with, and is useful to check in some places in the IR.	2020-09-18 09:55:47 -04:00
Philip Reames	b04c181ed7	[AArch64] Enable implicit null check transformation This change enables the generic implicit null transformation for the AArch64 target. As background for those unfamiliar with our implicit null check support: An implicit null check is the use of a signal handler to catch and redirect to a handler a null pointer. Specifically, it's replacing an explicit conditional branch with such a redirect. This is only done for very cold branches under frontend control w/appropriate metadata. FAULTING_OP is used to wrap the faulting instruction. It is modelled as being a conditional branch to reflect the fact it can transfer control in the CFG. FAULTING_OP does not need to be an analyzable branch to achieve its purpose. (Or at least, that's the x86 model. I find this slightly questionable.) When lowering to MC, we convert the FAULTING_OP back into the actual instruction, record the labels, and lower the original instruction. As can be seen in the test changes, currently the AArch64 backend does not eliminate the unconditional branch to the fallthrough block. I've tried two approaches, neither of which worked. I plan to return to this in a separate change set once I've wrapped my head around the interactions a bit better. (X86 handles this via AllowModify on analyzeBranch, but adding the obvious code causes BranchFolding to crash. I haven't yet figured out if it's a latent bug in BranchFolding, or something I'm doing wrong.) Differential Revision: https://reviews.llvm.org/D87851	2020-09-17 16:00:19 -07:00
Quentin Colombet	99e865b618	[TargetRegisterInfo] Add a couple of target hooks for the greedy register allocator Before this patch, the last chance recoloring and deferred spilling techniques were solely controlled by command line options. This patch adds target hooks for these two techniques so that it is easier for backend writers to override the default behavior. The default behavior of the hooks preserves the default values of the related command line options. NFC	2020-09-17 15:23:15 -07:00
Derek Schuff	0ff28fa6a7	Support dwarf fission for wasm object files Initial support for dwarf fission sections (-gsplit-dwarf) on wasm. The most interesting change is support for writing 2 files (.o and .dwo) in the wasm object writer. My approach moves object-writing logic into its own function and calls it twice, swapping out the endian::Writer (W) in between calls. It also splits the import-preparation step into its own function (and skips it when writing a dwo). Differential Revision: https://reviews.llvm.org/D85685	2020-09-17 14:42:41 -07:00
Victor Huang	a4bb71b1c0	Disable hoisting MI to hotter basic blocks when using pgo This is a follow up patch for https://reviews.llvm.org/D63676 to enable the feature when using pgo. Differential Revision: https://reviews.llvm.org/D85240	2020-09-17 14:17:00 -05:00
Amara Emerson	79b21fc187	[AArch64][GlobalISel] Fix bug in fewVectorElts action while legalizing oversize G_FPTRUNC vectors. For <8 x s32> = fptrunc <8 x s64> the fewerElementsVector action tries to break down the source vector into the final source vectors of <2 x s64> using unmerge. This fixes a crash due to using the wrong number of elements for the breakdown type. Also add some legalizer tests for explicitly G_FPTRUNC which we didn't have. Differential Revision: https://reviews.llvm.org/D87814	2020-09-17 08:56:26 -07:00
Simon Pilgrim	2a56a0ba08	ModuloSchedule.cpp - remove unnecessary includes. NFCI. Already included in ModuloSchedule.h	2020-09-17 16:47:48 +01:00
Simon Pilgrim	85ba2f1663	LiveDebugVariables.cpp - remove unnecessary Compiler.h include. NFCI. Already included in LiveDebugVariables.h	2020-09-17 15:06:02 +01:00
Simon Pilgrim	46e59062a0	DwarfExpression.cpp - remove unnecessary includes. NFCI. Already included in DwarfExpression.h	2020-09-17 15:06:02 +01:00
Simon Pilgrim	67ae46c820	SafeStackLayout.cpp - remove unnecessary StackLifetime.h include. NFCI. Already included in SafeStackLayout.h	2020-09-17 14:56:46 +01:00
Simon Pilgrim	572e542c5e	DwarfStringPool.cpp - remove unnecessary StringRef include. NFCI. Already included in DwarfStringPool.h	2020-09-17 12:18:27 +01:00
Simon Pilgrim	71f237506b	DwarfFile.h - remove unnecessary includes. NFCI. Use forward declarations where possible, move includes down to DwarfFile.cpp and avoid duplicate includes.	2020-09-17 12:12:18 +01:00
Simon Pilgrim	550b1a6fd4	[AsmPrinter] DwarfDebug - use DebugLoc const references where possible. NFC. Avoid unnecessary copies.	2020-09-17 10:45:54 +01:00
Simon Pilgrim	4ae1bb193a	[AsmPrinter] Remove orphan DwarfUnit::shareAcrossDWOCUs declaration. NFCI. Method implementation no longer exists.	2020-09-17 10:45:52 +01:00
Jay Foad	6f6d389da5	[SplitKit] Only copy live lanes When splitting a live interval with subranges, only insert copies for the lanes that are live at the point of the split. This avoids some unnecessary copies and fixes a problem where copying dead lanes was generating MIR that failed verification. The test case for this is test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir. Without this fix, some earlier live range splitting would create %430: %430 [256r,848r:0)[848r,2584r:1) 0@256r 1@848r L0000000000000003 [848r,2584r:0) 0@848r L0000000000000030 [256r,2584r:0) 0@256r weight:1.480938e-03 ... 256B undef %430.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec ... 848B %430.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec ... 2584B %431:vreg_128 = COPY %430:vreg_128 Then RAGreedy::tryLocalSplit would split %430 into %432 and %433 just before 848B giving: %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03 %433 [844r,848r:0)[848r,2584r:1) 0@844r 1@848r L0000000000000030 [844r,2584r:0) 0@844r L0000000000000003 [844r,844d:0)[848r,2584r:1) 0@844r 1@848r weight:2.831776e-03 ... 256B undef %432.sub2:vreg_128 = V_LSHRREV_B32_e32 16, %20.sub1:vreg_128, implicit $exec ... 844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128 { internal %433.sub2:vreg_128 = COPY %432.sub2:vreg_128 848B } %433.sub0:vreg_128 = V_AND_B32_e32 %92:sreg_32, %20.sub1:vreg_128, implicit $exec ... 2584B %431:vreg_128 = COPY %433:vreg_128 Note that the copy from %432 to %433 at 844B is a curious bundle-without-a-BUNDLE-instruction that SplitKit creates deliberately, and it includes a copy of .sub0 which is not live at this point, and that causes it to fail verification: * Bad machine code: No live subrange at use * - function: zextload_global_v64i16_to_v64i64 - basic block: %bb.0 (0x7faed48) [0B;2848B) - instruction: 844B undef %433.sub0:vreg_128 = COPY %432.sub0:vreg_128 - operand 1: %432.sub0:vreg_128 - interval: %432 [256r,844r:0) 0@256r L0000000000000030 [256r,844r:0) 0@256r weight:3.066802e-03 - at: 844B Using real bundles with a BUNDLE instruction might also fix this problem, but the current fix is less invasive and also avoids some unnecessary copies. https://bugs.llvm.org/show_bug.cgi?id=47492 Differential Revision: https://reviews.llvm.org/D87757	2020-09-17 09:26:11 +01:00
Qiu Chaofan	a2fb5446be	[SelectionDAG] Check any use of negation result before removal `2508ef01` fixed a bug about constant removal in negation. But a sanitizer check showed there was still an issue with it, so it was reverted. Temporary nodes will be removed if useless in negation. Before the removal, they are checked for uses by any other nodes, so the removal was moved after getNode. However, in rare cases the node to be removed is the same as the result of getNode. We missed that case, which this patch fixes. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D87614	2020-09-17 16:00:54 +08:00
Igor Kudrin	027d47d1c7	[DebugInfo] Simplify DIEInteger::SizeOf(). An AsmPrinter should always be provided to the method because some forms depend on its parameters. The only place in the codebase which passed a nullptr value was found in the unit tests, so the patch updates it to use some dummy AsmPrinter instead. Differential Revision: https://reviews.llvm.org/D85293	2020-09-17 12:47:38 +07:00
Craig Topper	e30371d99d	[DAGCombiner] Teach visitMSTORE to replace an all ones mask with an unmasked store. Similar to what was done in D87788 for MLOAD. Again I've skipped indexed, truncating, and compressing stores.	2020-09-16 16:42:22 -07:00
Craig Topper	89ee4c0314	[DAGCombiner] Teach visitMLOAD to replace an all ones mask with an unmasked load If we have an all ones mask, we can just use a regular unmasked load. InstCombine already gets this in IR. But the all ones mask can appear after type legalization. Only avx512 test cases are affected because the X86 backend already looks for element 0 and the last element being 1. It replaces this with an unmasked load and blend. The all ones mask is a special case of that where the blend will be removed. That transform is only enabled on avx2 targets. I believe that's because a non-zero passthru on avx2 already requires a separate blend so it's more profitable to handle mixed constant masks. This patch adds dedicated all ones handling to the target independent DAG combiner. I've skipped extending, expanding, and index loads for now. X86 doesn't use index so I don't know much about it. Extending made me nervous because I wasn't sure I could trust the memory VT had the right element count due to some weirdness in vector splitting. For expanding I wasn't sure if we needed different undef handling. Differential Revision: https://reviews.llvm.org/D87788	2020-09-16 13:21:16 -07:00
Matt Arsenault	88bdcbbf1a	GlobalISel: Lift store value widening restriction This doesn't change the memory size and doesn't need to worry about non-power-of-2 sizes.	2020-09-16 14:25:07 -04:00
Michael Kitzan	c4e589b795	[GISel] Add new combines for unary FP instrs with constant operand https://reviews.llvm.org/D86393 Patch adds five new `GICombinerRules`, one for each of the following unary FP instrs: `G_FNEG`, `G_FABS`, `G_FPTRUNC`, `G_FSQRT`, and `G_FLOG2`. The combine rules perform the FP operation on the constant operand and replace the original instr with the result. Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules.	2020-09-16 10:34:15 -07:00
Simon Pilgrim	8f7d6b2375	DwarfUnit.h - remove unnecessary includes. NFCI.	2020-09-16 18:32:29 +01:00
Simon Pilgrim	69682f993c	InterferenceCache.cpp - remove duplicate includes. NFCI. Remove headers already included in InterferenceCache.h	2020-09-16 18:32:28 +01:00
Matt Arsenault	738c73a454	RegAllocFast: Make self loop live-out heuristic more aggressive This currently has no impact on code, but prevents sizeable code size regressions after D52010. This prevents spilling and reloading all values inside blocks that loop back. Add a baseline test which would regress without this patch.	2020-09-16 13:12:38 -04:00
Matt Arsenault	8d8a496356	LocalStackSlotAllocation: Swap order of check	2020-09-16 12:56:40 -04:00
Francesco Petrogalli	15e9a6c211	[llvm][CodeGen] Do not scalarize `llvm.masked.[gather\|scatter]` operating on scalable vectors. This patch prevents the `llvm.masked.gather` and `llvm.masked.scatter` intrinsics from being scalarized when invoked on scalable vectors. The change in `Function.cpp` is needed to prevent the warning that is raised when `getNumElements` is used in place of `getElementCount` on `VectorType` instances. The tests guard against regressions on this change. The tests make sure that calls to `llvm.masked.[gather\|scatter]` are still scalarized when: # the intrinsics are operating on fixed size vectors, and # the compiler is not targeting fixed length SVE code generation. Reviewed By: efriedma, sdesmalen Differential Revision: https://reviews.llvm.org/D86249	2020-09-16 16:00:28 +00:00
Mircea Trofin	6e85c3d5c7	[NFC][Regalloc] accessors for 'reg' and 'weight' Also renamed the fields to follow style guidelines. Accessors help with readability - weight mutation, in particular, is easier to follow this way. Differential Revision: https://reviews.llvm.org/D87725	2020-09-16 08:28:57 -07:00
Sebastian Neubauer	833b3b0d3a	[AMDGPU] Add v3f16/v3i16 support to SDag Fix lowering and instruction selection for v3x16 types and enable InstCombine to emit them. This patch only implements it for the selection dag. GlobalISel tests in GlobalISel/llvm.amdgcn.image.load.1d.d16.ll and GlobalISel/llvm.amdgcn.image.store.2d.d16.ll still don't work. Differential Revision: https://reviews.llvm.org/D84420	2020-09-16 17:20:27 +02:00
Sam Parker	1c421046d7	[RDA] Fix getUniqueReachingDef for self loops We've fixed the case where this could return an instruction after the given instruction, but this also means that we can falsely return a 'unique' def when there could be one coming from the backedge of a loop. Differential Revision: https://reviews.llvm.org/D87751	2020-09-16 12:44:23 +01:00
Simon Pilgrim	3f682611ab	[DAG] Remove getOperand() call. NFCI.	2020-09-16 11:18:58 +01:00
Volkan Keles	79378b1b75	GlobalISel: Fix a failing combiner test test/CodeGen/AArch64/GlobalISel/combine-trunc.mir was failing due to the different order for evaluating function arguments. This patch updates the related code to fix the issue.	2020-09-15 16:40:38 -07:00
Aditya Nandakumar	97203cfd6b	[GISel] Add new GISel combiners for G_MUL https://reviews.llvm.org/D87668 Patch adds two new GICombinerRules, one for G_MUL(X, 1) and another for G_MUL(X, -1). G_MUL(X, 1) is an identity combine, and G_MUL(X, -1) gets replaced with G_SUB(0, X). Patch additionally adds new combiner tests for the AArch64 target to test these new combiner rules, as well as updates AMDGPU GISel tests. Patch by mkitzan	2020-09-15 16:08:47 -07:00
Volkan Keles	a4e35cc2ec	GlobalISel: Add combines for G_TRUNC https://reviews.llvm.org/D87050	2020-09-15 15:50:34 -07:00
Guozhi Wei	243ffd0cad	[MachineBasicBlock] Fix a typo in function copySuccessor The condition used to decide if need to copy probability should be reversed. Differential Revision: https://reviews.llvm.org/D87417	2020-09-15 09:18:18 -07:00
Qiu Chaofan	e1669843f2	Revert "[SelectionDAG] Remove unused FP constant in getNegatedExpression" `2508ef01` doesn't totally fix the issue since we did not handle the case when unused temporary negated result is the same with the result, which is found by address sanitizer.	2020-09-15 22:03:50 +08:00
Hans Wennborg	a21387c654	Revert "RegAllocFast: Record internal state based on register units" This seems to have caused incorrect register allocation in some cases, breaking tests in the Zig standard library (PR47278). As discussed on the bug, revert back to green for now. > Record internal state based on register units. This is often more > efficient as there are typically fewer register units to update > compared to iterating over all the aliases of a register. > > Original patch by Matthias Braun, but I've been rebasing and fixing it > for almost 2 years and fixed a few bugs causing intermediate failures > to make this patch independent of the changes in > https://reviews.llvm.org/D52010. This reverts commit `66251f7e1d`, and follow-ups `931a68f26b` and `0671a4c508`. It also adjusts some test expectations.	2020-09-15 13:25:41 +02:00
Simon Pilgrim	6c1f2a34fb	SpillPlacement.cpp - remove unnecessary includes. NFCI. These are all directly included in SpillPlacement.h	2020-09-15 12:18:24 +01:00
Simon Pilgrim	1abb4461ea	StatepointLowering.cpp - remove unnecessary includes. NFCI. These are all directly included in StatepointLowering.h	2020-09-15 12:18:23 +01:00
Simon Pilgrim	bee79cdcc6	SelectionDAGBuilder.h - remove unnecessary includes. NFCI. Reduce to forward declarations and move implicit dependencies down to the cpp files.	2020-09-15 12:18:22 +01:00
Qiu Chaofan	2508ef014e	[SelectionDAG] Remove unused FP constant in getNegatedExpression `960cbc53` immediately removes nodes that won't be used to avoid compilation time explosion. This patch adds the removal to constants to fix PR47517. Reviewed By: RKSimon, steven.zhang Differential Revision: https://reviews.llvm.org/D87614	2020-09-15 17:59:10 +08:00
Petar Avramovic	9b4fa85434	GlobalISel/IRTranslator resetTargetOptions based on function attributes Update TargetMachine.Options with function attributes before we start to generate MIR instructions. This allows access to correct function attributes via TargetMachine.Options (it used to access attributes of the function that was translated first). This affects some existing tests with "no-nans-fp-math" attribute. Follow-up on D87456. Differential Revision: https://reviews.llvm.org/D87511	2020-09-15 10:26:09 +02:00
Igor Kudrin	a845ebd633	[DebugInfo] Make offsets of dwarf units 64-bit (19/19). In the case of LTO, several DWARF units can be emitted in one section. For an extremely large application, they may exceed the limit of 4GiB for 32-bit offsets. As it is now possible to emit 64-bit debugging info, the patch enables storing the larger offsets. Differential Revision: https://reviews.llvm.org/D87026	2020-09-15 12:23:32 +07:00
Igor Kudrin	8c19ac23bd	[DebugInfo] Make the offset of string pool entries 64-bit (18/19). The string pool is shared among several units in the case of LTO, and it potentially can exceed the limit of 4GiB for an extremely large application. As it is now possible to emit 64-bit debugging info, the limitation can be removed. Differential Revision: https://reviews.llvm.org/D87025	2020-09-15 12:23:32 +07:00
Igor Kudrin	7e1e4e81cb	[DebugInfo] Fix emitting DWARF64 .debug_macro[.dwo] sections (17/19). The patch fixes emitting flags and the debug_line_offset field in the header, as well as the reference to the macro string for a pre-standard GNU .debug_macro extension. Differential Revision: https://reviews.llvm.org/D87024	2020-09-15 12:23:31 +07:00
Igor Kudrin	a93dd26d8c	[DebugInfo] Fix emitting DWARF64 .debug_names sections (16/19). The patch fixes emitting the unit length field in the header of the table and offsets to the entry pool. Note that while the patch changes the common method to emit offsets, in fact, nothing is changed for Apple accelerator tables, because we do not yet support DWARF64 for those targets. Differential Revision: https://reviews.llvm.org/D87023	2020-09-15 12:23:31 +07:00
Igor Kudrin	00ce54689d	[DebugInfo] Fix emitting DWARF64 .debug_addr sections (15/19). The patch fixes emitting the header of the table. The content is independent of the DWARF format. Differential Revision: https://reviews.llvm.org/D87022	2020-09-15 12:23:31 +07:00
Igor Kudrin	3158d3dd4b	[DebugInfo] Fix emitting DWARF64 .debug_loclists sections (14/19). The size of the offsets in the table depends on the DWARF format. Differential Revision: https://reviews.llvm.org/D87020	2020-09-15 12:23:31 +07:00
Igor Kudrin	f9b242fe24	[DebugInfo] Fix emitting DWARF64 .debug_rnglists sections (13/19). The size of the offsets in the table depends on the DWARF format. Differential Revision: https://reviews.llvm.org/D87019	2020-09-15 12:23:31 +07:00
Igor Kudrin	03b09c6b68	[DebugInfo] Fix emitting pre-v5 name lookup tables in the DWARF64 format (12/19). The transition is done by using methods of AsmPrinter which automatically emit values in compliance with the selected DWARF format. Differential Revision: https://reviews.llvm.org/D87013	2020-09-15 12:23:30 +07:00
Igor Kudrin	b118030f3f	[DebugInfo] Fix emitting DWARF64 .debug_aranges sections (11/19). The patch fixes calculating the size of the table and emitting the fields which depend on the DWARF format by using methods that choose appropriate sizes automatically. Differential Revision: https://reviews.llvm.org/D87012	2020-09-15 12:23:30 +07:00
Igor Kudrin	18f23b3ecc	[DebugInfo] Fix emitting DWARF64 type units (10/19). The patch fixes emitting the offset to the type DIE. All other fields are already fixed in previous patches. Differential Revision: https://reviews.llvm.org/D87021	2020-09-15 11:31:07 +07:00
Igor Kudrin	924dc58076	[DebugInfo] Fix emitting DWARF64 DWO compilation units and string offset tables (9/19). These two fixes are better to go together because llvm-dwarfdump is unable to dump a table when another one is malformed. Differential Revision: https://reviews.llvm.org/D87018	2020-09-15 11:31:00 +07:00
Igor Kudrin	383d34c077	[DebugInfo] Fix emitting DWARF64 .debug_str_offsets sections (8/19). The patch fixes calculating the size of the table and emitting the unit length field. Differential Revision: https://reviews.llvm.org/D87017	2020-09-15 11:30:53 +07:00
Igor Kudrin	26f1f18831	[DebugInfo] Fix emitting the DW_AT_location attribute for 64-bit DWARFv3 (7/19). The patch uses a common method to determine the appropriate form for the value of the attribute. Differential Revision: https://reviews.llvm.org/D87016	2020-09-15 11:30:46 +07:00
Igor Kudrin	cae7c1eb78	[DebugInfo] Use a common method to determine a suitable form for section offsets (6/19). This is mostly an NFC patch because the involved methods are used when emitting DWO files, which is incompatible with DWARFv3, or for platforms where DWARF64 is not supported yet. Differential Revision: https://reviews.llvm.org/D87015	2020-09-15 11:30:38 +07:00
Igor Kudrin	5dd1c59188	[DebugInfo] Fix emitting DWARF64 compilation units (5/19). The patch also adds a method to choose an appropriate DWARF form to represent section offsets according to the version and the format of producing debug info. Differential Revision: https://reviews.llvm.org/D87014	2020-09-15 11:30:30 +07:00
Igor Kudrin	982b31fad2	[DebugInfo] Add the -dwarf64 switch to llc and other internal tools (4/19). The patch adds a switch to enable emitting debug info in the 64-bit DWARF format. Most section emitters will be updated in the subsequent patches, whereas for .debug_line and .debug_frame the emitters are in the MC library, which is already updated. For now, the switch is enabled only for 64-bit ELF targets. Differential Revision: https://reviews.llvm.org/D87011	2020-09-15 11:30:18 +07:00
Igor Kudrin	c3c501f5d7	[DebugInfo] Add new emitting methods for values which depend on the DWARF format (3/19). These methods are going to be used in subsequent patches. Differential Revision: https://reviews.llvm.org/D87010	2020-09-15 11:30:10 +07:00
Igor Kudrin	a8058c6f8d	[DebugInfo] Fix DIE value emitters to be compatible with DWARF64 (2/19). DW_FORM_sec_offset and DW_FORM_strp imply values of different sizes with DWARF32 and DWARF64. The patch fixes DIE value classes to use correct sizes when emitting their values. For DIELocList it ensures that the requested DWARF form matches the current DWARF format because that class uses a method that selects the size automatically. Differential Revision: https://reviews.llvm.org/D87009	2020-09-15 11:30:02 +07:00
Igor Kudrin	380e746bcc	[DebugInfo] Fix methods of AsmPrinter to emit values corresponding to the DWARF format (1/19). These methods are used to emit values which are 32-bit in DWARF32 and 64-bit in DWARF64. The patch fixes them so that they choose the length automatically, depending on the DWARF format set in the Context. Differential Revision: https://reviews.llvm.org/D87008	2020-09-15 11:29:48 +07:00
Quentin Colombet	b3afad0463	[GlobalISel] Add a `X, Y = G_UNMERGE(G_ZEXT Z)` -> X = G_ZEXT Z; Y = 0 combine Add a combiner helper to transform unmerge of zext into one zext and a constant 0 Differential Revision: https://reviews.llvm.org/D87427	2020-09-14 17:27:23 -07:00
Quentin Colombet	d2321129bd	[GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z Add a combiner helper that replaces G_UNMERGE where all the destination lanes are dead except the first one with a G_TRUNC. Differential Revision: https://reviews.llvm.org/D87174	2020-09-14 17:27:23 -07:00
Quentin Colombet	a36278c2f8	[GlobalISel] Add G_UNMERGE(Cst) -> Cst1, Cst2, ... combine Add a combiner helper that replaces G_UNMERGE of big constants into direct use of smaller constants. Differential Revision: https://reviews.llvm.org/D87166	2020-09-14 16:30:18 -07:00


29920 Commits