llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	a553ac9791	[CodeGen] llvm::erase_if (NFC)	2020-12-05 15:44:40 -08:00
Fangrui Song	2518433f86	Make __stack_chk_guard dso_local if Reloc::Static This is currently implied by TargetMachine::shouldAssumeDSOLocal but will be changed in the future.	2020-12-04 16:57:45 -08:00
Simon Pilgrim	9cf4f493a7	[DAG] Move SelectionDAG implementation to KnownBits::setInReg(). NFCI.	2020-12-04 18:09:08 +00:00
Simon Pilgrim	6f4ee6f870	[DAGCombiner] Use const APInt& for getConstantOperandAPInt results. NFCI. Avoid unnecessary instantiation. Noticed while removing unnecessary autos	2020-12-04 09:44:58 +00:00
Arthur Eubanks	2f0de58294	[NewPM] Support --print-before/after in NPM This changes --print-before/after to be a list of strings rather than legacy passes. (this also has the effect of not showing the entire list of passes in --help-hidden after --print-before/after, which IMO is great for making it less verbose). Currently PrintIRInstrumentation passes the class name rather than pass name to llvm::shouldPrintBeforePass(), meaning llvm::shouldPrintBeforePass() never functions as intended in the NPM. There is no easy way of converting class names to pass names outside of within an instance of PassBuilder. This adds a map of pass class names to their short names in PassRegistry.def within PassInstrumentationCallbacks. It is populated inside the constructor of PassBuilder, which takes a PassInstrumentationCallbacks. Add a pointer to PassInstrumentationCallbacks inside PrintIRInstrumentation and use the newly created map. This is a bit hacky, but I can't think of a better way since the short id to class name only exists within PassRegistry.def. This also doesn't handle passes not in PassRegistry.def but rather added via PassBuilder::registerPipelineParsingCallback(). llvm/test/CodeGen/Generic/print-after.ll doesn't seem very useful now with this change. Reviewed By: ychen, jamieschmeiser Differential Revision: https://reviews.llvm.org/D87216	2020-12-03 16:52:14 -08:00
Anna Thomas	fb2e109d45	[ScalarizeMaskedMemIntrin] NFC: Pass args by reference	2020-12-03 14:04:21 -05:00
Anna Thomas	f86ec1e1fc	[ScalarizeMaskedMemIntrin] NFC: Convert member functions to static This will make it easier to add new PM support once the pass is moved into transforms (D92407).	2020-12-03 11:46:38 -05:00
dfukalov	2ce38b3f03	[NFC] Reduce include files dependency. 1. Removed #include "...AliasAnalysis.h" in other headers and modules. 2. Cleaned up includes in AliasAnalysis.h. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92489	2020-12-03 18:25:05 +03:00
Joe Ellis	78c0ea54a2	[DAGCombine] Fix TypeSize warning in DAGCombine::visitLIFETIME_END Bail out early if we encounter a scalable store. Reviewed By: peterwaller-arm Differential Revision: https://reviews.llvm.org/D92392	2020-12-03 12:12:41 +00:00
Kazu Hirata	7a4af2a8e7	[SelectionDAG] Use is_contained (NFC)	2020-12-02 19:09:45 -08:00
Hsiangkai Wang	f7bc7c2981	[RISCV] Support Zfh half-precision floating-point extension. Support "Zfh" extension according to https://github.com/riscv/riscv-isa-manual/blob/zfh/src/zfh.tex Differential Revision: https://reviews.llvm.org/D90738	2020-12-03 09:16:33 +08:00
Mircea Trofin	bab72dd5d5	[NFC][MC] TargetRegisterInfo::getSubReg is a MCRegister. Typing the API appropriately. Differential Revision: https://reviews.llvm.org/D92341	2020-12-02 15:46:38 -08:00
Hongtao Yu	24d4291ca7	[CSSPGO] Pseudo probes for function calls. An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work. One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place , leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution. With the requirement of not inflating `!dbg` metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The `!dbg` metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a calliste probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst. To avoid collision with the baseline AutoFDO in various places that handles dwarf discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reversed for pseudo probe use. Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`. Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91756	2020-12-02 13:45:20 -08:00
jasonliu	2c63e7604c	[XCOFF][AIX] Alternative path in EHStreamer for platforms do not have uleb128 support Summary: Not all system assembler supports `.uleb128 label2 - label1` form. When the target do not support this form, we have to take alternative manual calculation to get the offsets from them. Reviewed By: hubert.reinterpretcast Diffierential Revision: https://reviews.llvm.org/D92058	2020-12-02 20:03:15 +00:00
Nick Desaulniers	bc044a88ee	[Inline] prevent inlining on stack protector mismatch It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an attribute((no_stack_protector)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u SSP attributes can be ordered by strength. Weakest to strongest, they are: ssp, sspstrong, sspreq. Callees with differing SSP attributes may be inlined into each other, and the strongest attribute will be applied to the caller. (No change) After this change: * A callee with no SSP attributes will no longer be inlined into a caller with SSP attributes. * The reverse is also true: a callee with an SSP attribute will not be inlined into a caller with no SSP attributes. * The alwaysinline attribute overrides these rules. Functions that get synthesized by the compiler may not get inlined as a result if they are not created with the same stack protector function attribute as their callers. Alternative approach to https://reviews.llvm.org/D87956. Fixes pr/47479. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed By: rnk, MaskRay Differential Revision: https://reviews.llvm.org/D91816	2020-12-02 11:00:16 -08:00
jasonliu	a65d8c5d72	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences would be 1. AIX do not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality rountine, because the interoperability is not there yet on AIX. 3. AIX do not use eh_frame sections. Instead, it would use a eh_info section (compat unwind section) to store the information about personality routine and LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455	2020-12-02 18:42:44 +00:00
James Park	78b0ec3d1c	Avoid redundant inline with LLVM_ATTRIBUTE_ALWAYS_INLINE Fix MSVC warning when __forceinline is paired with inline. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D85264	2020-12-01 14:43:16 -08:00
David Blaikie	615f63e149	Revert "[FastISel] Flush local value map on ever instruction" and dependent patches This reverts commit `cf1c774d6a`. This change caused several regressions in the gdb test suite - at least a sample of which was due to line zero instructions making breakpoints un-lined. I think they're worth investigating/understanding more (& possibly addressing) before moving forward with this change. Revert "[FastISel] NFC: Clean up unnecessary bookkeeping" This reverts commit `3fd39d3694`. Revert "[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option" This reverts commit `a474657e30`. Revert "Remove static function unused after cf1c774." This reverts commit `dc35368ccf`. Revert "[lldb] Fix TestThreadStepOut.py after "Flush local value map on every instruction"" This reverts commit `53a14a47ee`.	2020-12-01 14:26:23 -08:00
Layton Kifer	d7fec38f05	[DAGCombiner][NFC] Replace duplicate implementation flipBoolean with DAG.getLogicalNOT Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D92246	2020-12-01 22:23:04 +03:00
Fangrui Song	a5309438fe	static const char *const foo => const char foo[] By default, a non-template variable of non-volatile const-qualified type having namespace-scope has internal linkage, so no need for `static`.	2020-12-01 10:33:18 -08:00
Benjamin Kramer	107e92dff8	[DAG] Remove unused variable. NFC.	2020-12-01 16:29:02 +01:00
Simon Pilgrim	1b209ff9e3	[DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111) Move the X86 VSELECT->USUBSAT fold to DAGCombiner - there's nothing target specific about these folds.	2020-12-01 14:25:29 +00:00
Simon Pilgrim	6dbd0d36a1	[DAG] Move vselect(icmp_ult, -1, add(x,y)) -> uaddsat(x,y) to DAGCombine (PR40111) Move the X86 VSELECT->UADDSAT fold to DAGCombiner - there's nothing target specific about these folds. The SSE42 test diffs are relatively benign - its avoiding an extra constant load in exchange for an extra xor operation - there are extra register moves, which is annoying as all those operations should commute them away. Differential Revision: https://reviews.llvm.org/D91876	2020-12-01 11:56:26 +00:00
Kazu Hirata	e785379aff	[CodeView] Remove unused declaration collectInlineSiteChildren (NFC) The function definition was removed on Sep 7, 2016 in commit `a9f4cc9510`. The declaration seems to be unused since then.	2020-11-30 22:28:26 -08:00
Hendrik Greving	d4ba5e15f4	Add MachineModuleInfo constructor with external MCContext Adds a constructor to MachineModuleInfo and MachineModuleInfoWapperPass that takes an external MCContext. If provided, the external context will be used throughout codegen instead of MMI's default one. This enables external drivers to take ownership of data put on the MMI's context during codegen. The internal context is used otherwise and destroyed upon finish. Differential Revision: https://reviews.llvm.org/D91313	2020-11-30 20:28:13 -08:00
Fangrui Song	d928dfc6f9	[GlobalISel] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds	2020-11-30 18:31:42 -08:00
Fangrui Song	36fe1a9dea	[GlobalISel] Fix -Wunused-variable	2020-11-30 18:25:54 -08:00
Amara Emerson	87ff156414	[AArch64][GlobalISel] Fix crash during legalization of a vector G_SELECT with scalar mask. The lowering of vector selects needs to first splat the scalar mask into a vector first. This was causing a crash when building oggenc in the test suite. Differential Revision: https://reviews.llvm.org/D91655	2020-11-30 16:37:49 -08:00
Paul Robinson	3fd39d3694	[FastISel] NFC: Clean up unnecessary bookkeeping Now that we flush the local value map for every instruction, we don't need any extra flushes for specific cases. Also, LastFlushPoint is not used for anything. Follow-ups to #dc35368 (D91734). Differential Revision: https://reviews.llvm.org/D92338	2020-11-30 12:27:50 -08:00
Matt Arsenault	29bd6519d2	SplitKit: Use Register	2020-11-30 15:09:33 -05:00
Paul Robinson	a474657e30	[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option This option is not used for anything after #dc35368 (D91734).	2020-11-30 10:55:49 -08:00
Francesco Petrogalli	f6150aa41a	[SelectionDAGBuilder] Update signature of `getRegsAndSizes()`. The mapping between registers and relative size has been updated to use TypeSize to account for the size of scalable EVTs. The patch is a NFCI, if not for the fact that with this change the function `getUnderlyingArgRegs` does not raise a warning for implicit conversion of `TypeSize` to `unsigned` when generating machine code from the test added to the patch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D92096	2020-11-30 17:38:51 +00:00
Craig Topper	fa0f01a3c0	[RISCV][LegalizeTypes] Teach type legalizer that it can promote UMIN/UMAX using SExtPromotedInteger if that's better for the target. If Sext is cheaper than Zext for a target, we can use that to promote the operands of UMIN/UMAX. Using sext just makes numbers with the sign bit set even larger when treated as an unsigned number and it has no effect on number without the sign bit set. So the relative order doesn't change. This is similar to what we already do for promoting SETCC. This is helpful on RISCV where i32 arguments are sign extended on RV64 and many instructions are able to produce results with 33 sign bits. Differential Revision: https://reviews.llvm.org/D92128	2020-11-27 11:37:25 -08:00
Simon Pilgrim	969918e177	[DAG] Legalize umin(x,y) -> sub(x,usubsat(x,y)) and umax(x,y) -> add(x,usubsat(y,x)) iff usubsat is legal If usubsat() is legal, this is likely to result in smaller codegen expansion than the default cmp+select codegen expansion. Allows us to move the x86-specific lowering to the generic expansion code. Differential Revision: https://reviews.llvm.org/D92183	2020-11-27 11:18:58 +00:00
QingShan Zhang	4d83aba422	[DAGCombine] Adding a hook to improve the precision of fsqrt if the input is denormal For now, we will hardcode the result as 0.0 if the input is denormal or 0. That will have the impact the precision. As the fsqrt added belong to the cold path of the cmp+branch, it won't impact the performance for normal inputs for PowerPC, but improve the precision if the input is denormal. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D80974	2020-11-27 02:10:55 +00:00
Nikita Popov	4df8efce80	[AA] Split up LocationSize::unknown() Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object). This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses. The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way. Differential Revision: https://reviews.llvm.org/D91649	2020-11-26 18:39:55 +01:00
Simon Pilgrim	8057ebf4a0	Revert rG12d59b696b330 "[DAG] Legalize umin(x,y) -> sub(x,usubsat(x,y)) and umax(x,y) -> add(x,usubsat(y,x)) iff usubsat is legal" This reverts commit `12d59b696b`. Prematurely pushed this to trunk	2020-11-26 15:07:45 +00:00
Simon Pilgrim	12d59b696b	[DAG] Legalize umin(x,y) -> sub(x,usubsat(x,y)) and umax(x,y) -> add(x,usubsat(y,x)) iff usubsat is legal If usubsat() is legal, this is likely to result in smaller codegen expansion than the default cmp+select codegen expansion. Allows us to move the x86-specific lowering to the generic expansion code.	2020-11-26 14:47:28 +00:00
Robert Lougher	6464c4a170	[LiveDebugVariables] Strip all debug instructions from nodebug functions A crash/assertion failure in the greedy register allocator was tracked down to a debug instr being passed to LiveIntervals::getInstructionIndex. Normally this should not occur as debug instructions are collected and removed by LiveDebugVariables before RA, and reinserted afterwards. However, when a function has no debug info, LiveDebugVariables simply strips any debug values that are present as they're not needed (this situation will occur when a function with debug info is inlined into a nodebug function). The problem is, it only removes DBG_VALUE instructions, leaving DBG_LABELs (the cause of the crash). This patch updates the LiveDebugVariables nodebug path to remove all debug instructions. The test case verifies that DBG_VALUE/DBG_LABEL instructions are present, and that they are stripped. When -experimental-debug-variable-locations is enabled, certain variable locations are represented by DBG_INSTR_REF instead of DBG_VALUE. The test case verifies that a DBG_INSTR_REF is emitted by the option, and that it is also stripped. Differential Revision: https://reviews.llvm.org/D92127	2020-11-26 14:30:18 +00:00
Kerry McLaughlin	4bee3197f6	[SVE][CodeGen] Extend isConstantSplatValue to support ISD::SPLAT_VECTOR Updated the affected scalable_of_scalable tests in sve-gep.ll, as isConstantSplatValue now returns true in DAGCombiner::visitMUL and folds `(mul x, 1) -> x` Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91363	2020-11-26 11:19:40 +00:00
Craig Topper	aea130f736	[LegalizerTypes] Add support for scalarizing the operand of an FP_EXTEND when the result type is legal.	2020-11-25 20:30:21 -08:00
Amy Huang	1363dfaf31	[CodeView] Avoid emitting empty debug globals subsection. In https://reviews.llvm.org/D89072 I added static const data members to the debug subsection for globals. It skipped emitting an S_CONSTANT if it didn't have a value, which meant the subsection could be empty. This patch fixes the empty subsection issue. Differential Revision: https://reviews.llvm.org/D92049	2020-11-25 16:13:32 -08:00
Craig Topper	2d6042937b	[SelectionDAGBuilder] Add SPF_NABS support to visitSelect We currently don't match this which limits the effectiveness of D91120 until InstCombine starts canonicalizing to llvm.abs. This should be easy to remove if/when we remove the SPF_ABS handling. Differential Revision: https://reviews.llvm.org/D92118	2020-11-25 14:54:26 -08:00
Paul Robinson	dc35368ccf	Remove static function unused after `cf1c774`. Caused some -Werror bot failures.	2020-11-25 13:43:06 -05:00
Simon Pilgrim	9c86c5e8ad	[DAG] Legalize abs(x) -> umin(x,sub(0,x)) iff umin/sub are legal If umin() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method. Followup to D92095 Alive2: https://alive2.llvm.org/ce/z/8nuX6s https://alive2.llvm.org/ce/z/q2hB9w	2020-11-25 18:06:02 +00:00
Paul Robinson	cf1c774d6a	[FastISel] Flush local value map on ever instruction Local values are constants or addresses that can't be folded into the instruction that uses them. FastISel materializes these in a "local value" area that always dominates the current insertion point, to try to avoid materializing these values more than once (per block). https://reviews.llvm.org/D43093 added code to sink these local value instructions to their first use, which has two beneficial effects. One, it is likely to avoid some unnecessary spills and reloads; two, it allows us to attach the debug location of the user to the local value instruction. The latter effect can improve the debugging experience for debuggers with a "set next statement" feature, such as the Visual Studio debugger and PS4 debugger, because instructions to set up constants for a given statement will be associated with the appropriate source line. There are also some constants (primarily addresses) that could be produced by no-op casts or GEP instructions; the main difference from "local value" instructions is that these are values from separate IR instructions, and therefore could have multiple users across multiple basic blocks. D43093 avoided sinking these, even though they were emitted to the same "local value" area as the other instructions. The patch comment for D43093 states: Local values may also be used by no-op casts, which adds the register to the RegFixups table. Without reversing the RegFixups map direction, we don't have enough information to sink these instructions. This patch undoes most of D43093, and instead flushes the local value map after() every IR instruction, using that instruction's debug location. This avoids sometimes incorrect locations used previously, and emits instructions in a more natural order. This does mean materialized values are not re-used across IR instruction boundaries; however, only about 5% of those values were reused in an experimental self-build of clang. () Actually, just prior to the next instruction. It seems like it would be cleaner the other way, but I was having trouble getting that to work. Differential Revision: https://reviews.llvm.org/D91734	2020-11-25 13:05:00 -05:00
Simon Pilgrim	0637dfe88b	[DAG] Legalize abs(x) -> smax(x,sub(0,x)) iff smax/sub are legal If smax() is legal, this is likely to result in smaller codegen expansion for abs(x) than the xor(add,ashr) method. This is also what PowerPC has been doing for its abs implementation, so it lets us get rid of a load of custom lowering code there (and which was never updated when they added smax lowering). Alive2: https://alive2.llvm.org/ce/z/xRk3cD Differential Revision: https://reviews.llvm.org/D92095	2020-11-25 15:03:03 +00:00
Simon Pilgrim	7e7106d104	DetectDeadLanes.cpp - remove unused headers. NFCI.	2020-11-25 11:38:28 +00:00
QingShan Zhang	9c588f53fc	[DAGCombine] Add hook to allow target specific test for sqrt input PowerPC has instruction ftsqrt/xstsqrtdp etc to do the input test for software square root. LLVM now tests it with smallest normalized value using abs + setcc. We should add hook to target that has test instructions. Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D80706	2020-11-25 05:37:15 +00:00
Kai Luo	8e6d92026c	[DAG][PowerPC] Fix dropped `nsw` flag in `SimplifySetCC` by adding `doesNodeExist` helper `SimplifySetCC` invokes `getNodeIfExists` without passing `Flags` argument and `getNodeIfExists` uses a default `SDNodeFlags` to intersect the original flags, as a consequence, flags like `nsw` is dropped. Added a new helper function `doesNodeExist` to check if a node exists without modifying its flags. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D89938	2020-11-25 04:39:03 +00:00
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Hsiangkai Wang	8d06a678a5	[SelectionDAG] Avoid aliasing analysis if the object size is unknown. If the size of memory access is unknown, do not use it to analysis. One example of unknown size memory access is to load/store scalable vector objects on the stack. Differential Revision: https://reviews.llvm.org/D91833	2020-11-25 06:13:37 +08:00
Janek van Oirschot	42eaf4fe0a	[HardwareLoops] Change order of SCEV expression construction for InitLoopCount. Putting the +1 before the zero-extend will allow scalar evolution to fold the expression in some cases such as the one shown in PowerPC's `shrink-wrap.ll` test. Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D91724	2020-11-24 18:01:42 +00:00
Yichao Yu	a248eca665	Clear NewGEPBases after finish using them in CodeGenPrep pass AFAICT all other set/map are correctly cleared in `runOnFunction`. With assertion enabled this causes a crash when the module is freed and potentially if a later pass delete the instruction (not observed in real world though). Without assertion this can potentially cause confusing result when running on a new Function/Module. Reviewed By: loladiro Differential Revision: https://reviews.llvm.org/D84031	2020-11-24 12:12:00 -05:00
Thomas Preud'homme	9c8af93c93	Add support for STRICT_FSETCC promotion Add missing handling of STRICT_FSETCC promotion. This prevents assert failure in llvm::TargetLoweringBase::getTypeToPromoteTo(). Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D91962	2020-11-24 16:53:49 +00:00
Kai Luo	5931be60b5	[DAGCombine][PowerPC] Convert negated abs to trivial arithmetic ops This patch converts `0 - abs(x)` to `Y = sra (X, size(X)-1); sub (Y, xor (X, Y))` for better codegen. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D91120	2020-11-24 09:43:35 +00:00
Pavel Labath	bce2ac9f6d	Revert "[DebugInfo] Refactor code for emitting DWARF expressions for FP constants" The commit introduced a crash when emitting (debug info for) complex floats (pr48277).	2020-11-24 09:11:33 +01:00
Martin Storsjö	6f792041a5	Reapply "[CodeGen] [WinException] Only produce handler data at the end of the function if needed" This reapplies `36c64af9d7` in updated form. Emit the xdata for each function at .seh_endproc. This keeps the exact same output header order for most code generated by the LLVM CodeGen layer. (Sections still change order for code built from assembly where functions lack an explicit .seh_handlerdata directive, and functions with chained unwind info.) The practical effect should be that assembly output lacks superfluous ".seh_handlerdata; .text" pairs at the end of functions that don't handle exceptions, which allows such functions to use the AArch64 packed unwind format again. Differential Revision: https://reviews.llvm.org/D87448	2020-11-23 23:17:03 +02:00
Pavel Labath	6ef7835afc	[DebugInfo] Refactor code for emitting DWARF expressions for FP constants This patch moves the selection of the style used to emit the numbers (DW_OP_implicit_value vs. DW_OP_const+DW_OP_stack_value) into DwarfExpression::addUnsignedConstant. This logic is not FP-specific, and it will be needed for large integers too. The refactor also makes DW_OP_implicit_value (DW_OP_stack_value worked already) be used for floating point constants other than float and double, so I've added a _Float16 test for it. Split off from D90916. Differential Revision: https://reviews.llvm.org/D91058	2020-11-23 09:59:07 +01:00
Kazu Hirata	85d6af393c	[CodeGen] Use pred_empty (NFC)	2020-11-22 22:16:13 -08:00
Simon Pilgrim	791040cd8b	[DAG] LowerMINMAX - move default expansion to generic TargetLowering::expandIntMINMAX This is part of the discussion on D91876 about trying to reduce custom lowering of MIN/MAX ops on older SSE targets - if we can improve generic vector expansion we should be able to relax the limitations in SelectionDAGBuilder when it will let MIN/MAX ops be generated, and avoid having to flag so many ops as 'custom'.	2020-11-22 13:02:27 +00:00
Kazu Hirata	68403af007	[MBP] Remove unused declaration shouldPredBlockBeOutlined (NFC) The function was introduced on Jun 12, 2016 in commit `071d0f1807`. Its definition was removed on Mar 2, 2017 in commit `1393761e0c`.	2020-11-21 23:35:02 -08:00
Kazu Hirata	9d985082ad	[MachineLICM] Remove unused declaration HoistRegion The function definition was removed on Dec 22, 2011 in commit in `1eed5b51e8`.	2020-11-21 22:55:37 -08:00
Kazu Hirata	c2309ff3d5	[SelectionDAG] Remove unused declaration ExpandStrictFPOp (NFC) ExpandStrictFPOp started taking two parameters instead of one on Jan 10, 2020 in commit `f678fc7660`, but the declaration for the single-perameter version has remained since.	2020-11-21 22:29:44 -08:00
Ella Ma	1756d67934	[llvm][clang][mlir] Add checks for the return values from Target::createXXX to prevent protential null deref All these potential null pointer dereferences are reported by my static analyzer for null smart pointer dereferences, which has a different implementation from `alpha.cplusplus.SmartPtr`. The checked pointers in this patch are initialized by Target::createXXX functions. When the creator function pointer is not correctly set, a null pointer will be returned, or the creator function may originally return a null pointer. Some of them may not make sense as they may be checked before entering the function, but I fixed them all in this patch. I submit this fix because 1) similar checks are found in some other places in the LLVM codebase for the same return value of the function; and, 2) some of the pointers are dereferenced before they are checked, which may definitely trigger a null pointer dereference if the return value is nullptr. Reviewed By: tejohnson, MaskRay, jpienaar Differential Revision: https://reviews.llvm.org/D91410	2020-11-21 21:04:12 -08:00
Hongtao Yu	d0e42037bf	[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR, ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`. ``` bb.0.bb0: frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags PSEUDO_PROBE 837061429793323041, 1, 0 $edi = MOV32ri 1, debug-location !13; test.c:0 JCC_1 %bb.1, 4, implicit $eflags bb.2.bb2: PSEUDO_PROBE 837061429793323041, 3, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ bb.1.bb1: PSEUDO_PROBE 837061429793323041, 2, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ ``` The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86495	2020-11-20 10:52:43 -08:00
Hongtao Yu	f3c445697d	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Craig Topper	a7eae62a42	[SelectionDAG][X86][PowerPC][Mips] Replace the default implementation of LowerOperationWrapper with the X86 and PowerPC version. The default version only works if the returned node has a single result. The X86 and PowerPC versions support multiple results and allow a single result to be returned from a node with multiple outputs. And allow a single result that is not result 0 of the node. Also replace the Mips version since the new version should work for it. The original version handled multiple results, but only if the new node and original node had the same number of results. Differential Revision: https://reviews.llvm.org/D91846	2020-11-20 10:06:53 -08:00
Andrew Wei	1cd19fc568	[DeadMachineInstrctionElim] Post order visit all blocks and Iteratively run DeadMachineInstructionElim pass until nothing dead Patched by: guopeilin Reviewed By: hliao,rampitec Differential Revision: https://reviews.llvm.org/D91513	2020-11-21 00:43:23 +08:00
Pavel Iliin	4d7df43ffd	[AArch64] Out-of-line atomics (-moutline-atomics) implementation. This patch implements out of line atomics for LSE deployment mechanism. Details how it works can be found in llvm/docs/Atomics.rst Options -moutline-atomics and -mno-outline-atomics to enable and disable it were added to clang driver. This is clang and llvm part of out-of-line atomics interface, library part is already supported by libgcc. Compiler-rt support is provided in separate patch. Differential Revision: https://reviews.llvm.org/D91157	2020-11-20 13:30:12 +00:00
Kazu Hirata	2583d8eb08	[CodeGen] Use llvm::is_contained (NFC)	2020-11-19 22:07:56 -08:00
Nikita Popov	393b9e9db3	[MemLoc] Require LocationSize argument (NFC) When constructing a MemoryLocation by hand, require that a LocationSize is explicitly specified. D91649 will split up LocationSize::unknown() into two different states, and callers should make an explicit choice regarding the kind of MemoryLocation they want to have.	2020-11-19 21:45:52 +01:00
Leonard Chan	a97f62837f	[llvm][IR] Add dso_local_equivalent Constant The `dso_local_equivalent` constant is a wrapper for functions that represents a value which is functionally equivalent to the global passed to this. That is, if this accepts a function, calling this constant should have the same effects as calling the function directly. This could be a direct reference to the function, the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the resolved function as a call target. When lowered, the returned address must have a constant offset at link time from some other symbol defined within the same binary. The address of this value is also insignificant. The name is leveraged from `dso_local` where use of a function or variable is resolved to a symbol in the same linkage unit. In this patch: - Addition of `dso_local_equivalent` and handling it - Update Constant::needsRelocation() to strip constant inbound GEPs and take advantage of `dso_local_equivalent` for relative references This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959) which makes vtables readonly. This works by replacing the dynamic relocations for function pointers in them with static relocations that represent the offset between the vtable and virtual functions. If a function is externally defined, `dso_local_equivalent` can be used as a generic wrapper for the function to still allow for this static offset calculation to be done. See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details. Differential Revision: https://reviews.llvm.org/D77248	2020-11-19 10:26:17 -08:00
Adhemerval Zanella	807320119f	[AArch64] Lower fptrunc/fpext from/to FP128t to/from FP16 The compiler-rt part which adds the emitted symbols is handled in a subsequent patch. Differential Revision: https://reviews.llvm.org/D91731	2020-11-19 15:14:50 -03:00
Florian Hahn	1983acce7c	[SelDAGBuilder] Do not require simple VTs for constraints. In some cases, the values passed to `asm sideeffect` calls cannot be mapped directly to simple MVTs. Currently, we crash in the backend if that happens. An example can be found in the @test_vector_too_large_r_m test case, where we pass <9 x float> vectors. In practice, this can happen in cases like the simple C example below. using vec = float __attribute__((ext_vector_type(9))); void f1 (vec m) { asm volatile("" : "+r,m"(m) : : "memory"); } One case that use "+r,m" constraints for arbitrary data types in practice is google-benchmark's DoNotOptimize. This patch updates visitInlineAsm so that it use MVT::Other for constraints with complex VTs. It looks like the rest of the backend correctly deals with that and properly legalizes the type. And we still report an error if there are no registers to satisfy the constraint. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D91710	2020-11-19 09:31:54 +00:00
Andrew Paverd	0139c8af8d	[CFGuard] Add address-taken IAT tables and delay-load support This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D87544	2020-11-17 18:24:45 -08:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Jon Roelofs	a461e76b6f	[MachineScheduler] Inform pass infra of post-ra scheduler's dependencies Differential Revision: https://reviews.llvm.org/D91561	2020-11-17 10:56:12 -08:00
Florian Hahn	a9adb62a64	[AsmPrinter] Use getMnemonic for instruction-mix remark. This patch uses the new `getMnemonic` helper from D90039 to display mnemonics instead of the internal opcodes. The main motivation behind using the mnemonics is that they are more user-friendly and more directly related to the assembly the users will be presented. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D90040	2020-11-17 12:12:47 +00:00
Jameson Nash	bf6ed355c8	Reland "[AsmPrinter] fix -disable-debug-info option" This reverts commit `105ed27ed8`, and removes the offending line from the tests.	2020-11-16 13:34:47 -05:00
Mirko Brkusanin	4cf6dd518e	[AMDGPU][GlobalISel] Fix lowerShlSat RegBankSelect would crash on G_SELECT when type is not s1. Differential Revision: https://reviews.llvm.org/D91437	2020-11-16 17:43:31 +01:00
Victor Huang	6bb2ceac90	Fix the compilation assertion due to unreachable BB pruning not deleting the associated BB from the jump tables This patch is added to remove the unreachable MBBs reference in the jump table. Differential Revisien: https://reviews.llvm.org/D90498 Reviewed by: amyk, bsaleil	2020-11-16 10:35:31 -06:00
Yuanfang Chen	a223354161	[CGProfile] allows bitcast in metadata node storing function pointers For example, during RAUW in IRMover, the `Function` ValueAsMetadata in "CG Profile" could become bitcast. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D88433	2020-11-13 09:28:21 -08:00
Jessica Paquette	b184a2eccf	[GlobalISel] Add matchers for specific constants and a matcher for negations It's fairly common to need matchers for a specific constant value, or for common idioms like finding a negated register. Add - `m_SpecificICst`, which returns true when matching a specific value.. - `m_ZeroInt`, which returns true when an integer 0 is matched. - `m_Neg`, which returns when a register is negated. Also update a few places which use idioms related to the new matchers. Differential Revision: https://reviews.llvm.org/D91397	2020-11-13 09:24:54 -08:00
Matt Arsenault	c67e1a985f	GlobalISel: Directly expose getDefSrcRegIgnoringCopies utility It's useful to get both the instruction and register at the same time.	2020-11-13 11:07:04 -05:00
Djordje Todorovic	22fd38d508	[NFC][IntrRefLDV] Remove dead code from transferSpillOrRestoreInst() Differential Revision: https://reviews.llvm.org/D90852	2020-11-13 07:53:54 -08:00
Hans Wennborg	105ed27ed8	Revert "[AsmPrinter] fix -disable-debug-info option" The test fails on Mac, see comment on the code review. > This option was in a rather convoluted place, causing global parameters > to be set in awkward and undesirable ways to try to account for it > indirectly. Add tests for the -disable-debug-info option and ensure we > don't print unintended markers from unintended places. > > Reviewed By: dstenb > > Differential Revision: https://reviews.llvm.org/D91083 This reverts commit `9606ef03f0`.	2020-11-13 13:46:13 +01:00
Kerry McLaughlin	306c8ab208	[SVE][CodeGen] Improve codegen of scalable masked scatters If the scatter store is able to perform the sign/zero extend of its index, this is folded into the instruction with refineIndexType(). Additionally, refineUniformBase() will return the base pointer and index from an add + splat_vector. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90942	2020-11-13 11:19:36 +00:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These function store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Jameson Nash	9606ef03f0	[AsmPrinter] fix -disable-debug-info option This option was in a rather convoluted place, causing global parameters to be set in awkward and undesirable ways to try to account for it indirectly. Add tests for the -disable-debug-info option and ensure we don't print unintended markers from unintended places. Reviewed By: dstenb Differential Revision: https://reviews.llvm.org/D91083	2020-11-13 00:58:09 -05:00
Simon Pilgrim	8996742741	[KnownBits] Add KnownBits::makeConstant helper. NFCI. Helper for cases where we need to create a KnownBits from a (fully known) constant value.	2020-11-12 16:16:04 +00:00
David Sherwood	3225fcf11e	[SVE] Deal with SVE tuple call arguments correctly when running out of registers When passing SVE types as arguments to function calls we can run out of hardware SVE registers. This is normally fine, since we switch to an indirect mode where we pass a pointer to a SVE stack object in a GPR. However, if we switch over part-way through processing a SVE tuple then part of it will be in registers and the other part will be on the stack. I've fixed this by ensuring that: 1. When we don't have enough registers to allocate the whole block we mark any remaining SVE registers temporarily as allocated. 2. We temporarily remove the InConsecutiveRegs flags from the last tuple part argument and reinvoke the autogenerated calling convention handler. Doing this prevents the code from entering an infinite recursion and, in combination with 1), ensures we switch over to the Indirect mode. 3. After allocating a GPR register for the pointer to the tuple we then deallocate any SVE registers we marked as allocated in 1). We also set the InConsecutiveRegs flags back how they were before. 4. I've changed the AArch64ISelLowering LowerCALL and LowerFormalArguments functions to detect the start of a tuple, which involves allocating a single stack object and doing the correct numbers of legal loads and stores. Differential Revision: https://reviews.llvm.org/D90219	2020-11-12 08:41:50 +00:00
Pavel Iliin	fdb979cfbb	[NFC] [Legalize] Fix spaces and style.	2020-11-11 19:24:15 +00:00
Hans Wennborg	418f18c6cd	Revert "Reland [CFGuard] Add address-taken IAT tables and delay-load support" This broke both Firefox and Chromium (PR47905) due to what seems like dllimport function not being handled correctly. > This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table. > Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets. > > Reviewed By: rnk > > Differential Revision: https://reviews.llvm.org/D87544 This reverts commit `cfd8481da1`.	2020-11-11 16:03:33 +01:00
Simon Pilgrim	1a62ca65c1	[KnownBits] Add KnownBits::commonBits helper. NFCI. We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them........	2020-11-11 12:15:54 +00:00
Kerry McLaughlin	170947a5de	[SVE][CodeGen] Lower scalable masked scatters Lowers the llvm.masked.scatter intrinsics (scalar plus vector addressing mode only) Changes included in this patch: - Custom lowering for MSCATTER, which chooses the appropriate scatter store opcode to use. Floating-point scatters are cast to integer, with patterns added to match FP reinterpret_casts. - Added the getCanonicalIndexType function to convert redundant addressing modes (e.g. scaling is redundant when accessing bytes) - Tests with 32 & 64-bit scaled & unscaled offsets Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90941	2020-11-11 11:50:22 +00:00
Kerry McLaughlin	ffbbfc76ca	[SVE][CodeGen] Add the isTruncatingStore flag to MSCATTER This patch adds the IsTruncatingStore flag to MaskedScatterSDNode, set by getMaskedScatter(). Updated SelectionDAGDumper::print_details for MaskedScatterSDNode to print the details of masked scatters (is truncating, signed or scaled). This is the first in a series of patches which adds support for scalable masked scatters Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90939	2020-11-11 10:58:24 +00:00
Chen Zheng	09e34048bf	[SelectionDAG] fminnum should be a binary operator Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91163	2020-11-11 03:41:40 -05:00
Xun Li	b8a8ef3276	[SafeStack] Make sure SafeStack does not break musttail call contract SafeStack instrumentation should not insert anything inbetween musttail call and return instruction. For every ReturnInst that needs to be instrumented, we adjust the insertion point to the musttail call if exists. Differential Revision: https://reviews.llvm.org/D90702	2020-11-10 20:46:05 -08:00
Gaurav Jain	a0b7129767	[NFC] Use [MC]Register in TwoAddressInstructionPass Differential Revision: https://reviews.llvm.org/D90902	2020-11-10 19:01:56 -08:00
David Green	b2ac9681a7	[ARM] Alter t2DoLoopStart to define lr This changes the definition of t2DoLoopStart from t2DoLoopStart rGPR to GPRlr = t2DoLoopStart rGPR This will hopefully mean that low overhead loops are more tied together, and we can more reliably generate loops without reverting or being at the whims of the register allocator. This is a fairly simple change in itself, but leads to a number of other required alterations. - The hardware loop pass, if UsePhi is set, now generates loops of the form: %start = llvm.start.loop.iterations(%N) loop: %p = phi [%start], [%dec] %dec = llvm.loop.decrement.reg(%p, 1) %c = icmp ne %dec, 0 br %c, loop, exit - For this a new llvm.start.loop.iterations intrinsic was added, identical to llvm.set.loop.iterations but produces a value as seen above, gluing the loop together more through def-use chains. - This new instrinsic conceptually produces the same output as input, which is taught to SCEV so that the checks in MVETailPredication are not affected. - Some minor changes are needed to the ARMLowOverheadLoop pass, but it has been left mostly as before. We should now more reliably be able to tell that the t2DoLoopStart is correct without having to prove it, but t2WhileLoopStart and tail-predicated loops will remain the same. - And all the tests have been updated. There are a lot of them! This patch on it's own might cause more trouble that it helps, with more tail-predicated loops being reverted, but some additional patches can hopefully improve upon that to get to something that is better overall. Differential Revision: https://reviews.llvm.org/D89881	2020-11-10 15:57:58 +00:00
Mirko Brkusanin	a75d6178b8	[GlobalISel] Add combine for (x \| mask) -> x when (x \| mask) == x If we have a mask, and a value x, where (x \| mask) == x, we can drop the OR and just use x. Differential Revision: https://reviews.llvm.org/D90952	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	fb36ab0a42	[GlobalISel] Expand combine for (x & mask) -> x when (x & mask) == x We can use KnownBitsAnalysis to cover cases when mask is not trivial. It can also help with cases when mask is not constant but can still be folded into one. Since 'and' is comutative we should treat both operands as possible replacements. Differential Revision: https://reviews.llvm.org/D90674	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	53ae95c946	[AMDGPU][GlobalISel] Combine shift + logic + shift with constant operands This sequence of instructions can be simplified if they are single use and some operands are constants. Additional combines may be applied afterwards. Differential Revision: https://reviews.llvm.org/D90223	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	de719586a8	[AMDGPU][GlobalISel] Fold a chain of two shift instructions with constant operands Sequence of same shift instructions with constant operands can be combined into a single shift instruction. Differential Revision: https://reviews.llvm.org/D90217	2020-11-10 11:32:12 +01:00
David Zarzycki	a41ea782c8	[SelectionDAG] Enable CTPOP optimization fine tuning Add a TLI hook to allow SelectionDAG to fine tune the conversion of CTPOP to a chain of "x & (x - 1)" when CTPOP isn't legal. A subsequent patch will attempt to fine tune the X86 code gen. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89952	2020-11-09 13:49:01 -05:00
Paul Robinson	920befb337	[FastISel] Reduce spills around mem-intrinsic calls FastISel generates instructions to materialize "local values" at the top of a block, in the hope that these values could be reused within the block. To reduce spills and restores, FastISel treats calls as sub-block boundaries, flushing the "local value map" at each call. This patch treats the mem* intrinsics as if they were calls, because at O0 generally they are calls. Eliminating these spills/restores is actually better for debugging (especially a "continue at this line" command), code size, stack frame size, and maybe even performance. Differential Revision: https://reviews.llvm.org/D90877	2020-11-09 09:45:14 -08:00
jasonliu	42d2109380	[XCOFF] Enable explicit sections on AIX Implement mechanism to allow explicit sections to be generated on AIX. Reviewed By: DiggerLin Differential Revision: https://reviews.llvm.org/D88615	2020-11-09 16:27:38 +00:00
David Zarzycki	57e46e7123	[SelectionDAG] NFC: Hoist is legal check This was requested during the code review of D89952.	2020-11-09 10:55:15 -05:00
Francesco Petrogalli	fc2fe6817e	[llvm][AArch64] Simplify (and (sign_extend..) #bitmask). Fold VT = (and (sign_extend NarrowVT to VT) #bitmask) into VT = (zero_extend NarrowVT) With this combine, the test replaces a sign extended load + an unsigned extention with a zero extended load to render one of the operands of the last multiplication. BEFORE \| AFTER f_i16_i32: \| f_i16_i32: .fnstart \| .fnstart ldrsh r0, [r0] \| ldrh r1, [r1] ldrsh r1, [r1] \| ldrsh r0, [r0] smulbb r0, r1, r0 \| smulbb r0, r0, r1 uxth r1, r1 \| mul r0, r0, r1 mul r0, r0, r1 \| bx lr bx lr \| Reviewed By: resistor Differential Revision: https://reviews.llvm.org/D90605	2020-11-09 12:53:36 +00:00
Fangrui Song	ee4769687d	AsmPrinter/Dwarf*: Use llvm::Register instead of unsigned	2020-11-06 21:00:28 -08:00
Fangrui Song	7684496035	[AsmPrinter] Rename ByteStreamer::EmitInt8 to emitInt8 to be consistent with other emit*	2020-11-06 20:02:56 -08:00
Quentin Colombet	a585228027	Prevent LICM and machineLICM from hoisting convergent operations Results of convergent operations are implicitly affected by the enclosing control flows and should not be hoisted out of arbitrary loops. Patch by Xiaoqing Wu <xiaoqing_wu@apple.com> Differential Revision: https://reviews.llvm.org/D90361	2020-11-06 10:26:39 -08:00
Than McIntosh	b3d0f02861	[NFC] Fix typo in comment. Differential Revision: https://reviews.llvm.org/D90846	2020-11-06 09:03:07 -05:00
Momchil Velikov	5b30d9adc0	[MachineOutliner] Do not outline debug instructions The debug location is removed from any outlined instruction. This causes the MachineVerifier to crash on outlined DBG_VALUE instructions. Then, debug instructions are "invisible" to the outliner, that is, two ranges of instructions from different functions are considered identical if the only difference is debug instructions. Since a debug instruction from one function is unlikely to provide sensible debug information about all functions, sharing an outlined sequence, this patch just removes debug instructions from the outlined functions. Differential Revision: https://reviews.llvm.org/D89485	2020-11-05 19:26:51 +00:00
Craig Topper	98d7e583db	[LegalizeTypes] Remove unnecessary if around switch in ScalarizeVectorOperand and SplitVectorOperand. NFC The if was checking !Res.getNode() but that's always true since Res was initialized to SDValue() and not touched before the if. This appears to be a leftover from a previous implementation of Custom legalization where Res was updated instead of returning immediately.	2020-11-05 11:00:51 -08:00
Simon Pilgrim	bf04e34383	[DAG] computeKnownBits - Replace ISD::SREM handling with KnownBits::srem to reduce code duplication	2020-11-05 17:12:58 +00:00
Simon Pilgrim	7fe7c6d3be	[GlobalISel] Don't use Register type for getNumOperands(). NFCI. Copy+Paste typo - we were storing getNumOperands() opcounts in a Register type instead of just an unsigned.	2020-11-05 17:12:58 +00:00
Simon Pilgrim	e237d56b43	[KnownBits] Move ValueTracking/SelectionDAG UREM KnownBits handling to KnownBits::urem. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 14:30:59 +00:00
Simon Pilgrim	32bee18b84	[KnownBits] Move ValueTracking/SelectionDAG UDIV KnownBits handling to KnownBits::udiv. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 13:42:42 +00:00
Simon Pilgrim	546d002d7a	[GlobalISel] ComputeKnownBits - use common KnownBits shift handling (PR44526) Convert GISelKnownBits.computeKnownBitsImpl shift handling to use the common KnownBits implementations, which makes use of the known leading/trailing bits for shifted values in cases where we don't know the shift amount value, as detailed in https://blog.regehr.org/archives/1709 Differential Revision: https://reviews.llvm.org/D90527	2020-11-05 11:52:26 +00:00
Sander de Smalen	d57bba7cf8	[SVE] Return StackOffset for TargetFrameLowering::getFrameIndexReference. To accommodate frame layouts that have both fixed and scalable objects on the stack, describing a stack location or offset using a pointer + uint64_t is not sufficient. For this reason, we've introduced the StackOffset class, which models both the fixed- and scalable sized offsets. The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset, so that this can be used in other interfaces, such as to eliminate frame indices in PEI or to emit Debug locations for variables on the stack. This patch is purely mechanical and doesn't change the behaviour of how the result of this function is used for fixed-sized offsets. The patch adds various checks to assert that the offset has no scalable component, as frame offsets with a scalable component are not yet supported in various places. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90018	2020-11-05 11:02:18 +00:00
Simon Pilgrim	b25765792b	Revert rGbbeb08497ce58 "Revert "[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation"" Updated the GISel KnownBits tests as KnownBits::computeForMul allows more accurate computation.	2020-11-05 10:39:53 +00:00
Chen Zheng	f645cea8f6	[MachineSink] add more profitable pattern. Add more profitable sinking patterns if the target bb register pressure is not too high. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D88126	2020-11-04 23:11:22 -05:00
Cameron McInally	c126eb7529	[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247. Differential Revision: https://reviews.llvm.org/D90644	2020-11-04 14:20:31 -06:00
Fraser Cormack	f99580c1e5	[DAGCombine] Fix bug in load scalarization Summary: For vector element types which are not byte-sized, we would generate incorrect scalar offsets and produce incorrect codegen. This optimization could potentially be supported in the future, e.g. by loading in bytes, then shifting and masking out the remaining bits of the vector element. However, without an upstream target to test against it's best to avoid the bad codegen in the simplest possible way. Related to this bug: https://bugs.llvm.org/show_bug.cgi?id=27600 Reviewed by: foad Differential Revision: https://reviews.llvm.org/D78568	2020-11-04 19:02:40 +00:00
Fangrui Song	bbeb08497c	Revert "[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation" This reverts commit `0b8711e1af` which broke GlobalISelTests AArch64GISelMITest.TestKnownBits	2020-11-04 09:54:04 -08:00
Simon Pilgrim	0b8711e1af	[GlobalISel] GISelKnownBits::computeKnownBitsImpl - Replace TargetOpcode::G_MUL handling with the common KnownBits::computeForMul implementation Avoid code duplication	2020-11-04 17:25:24 +00:00
Simon Pilgrim	93c2a9ae07	Use isa<> instead of dyn_cast<> to avoid unused variable warning. NFCI.	2020-11-04 15:26:32 +00:00
Kerry McLaughlin	f2412d372d	[SVE][CodeGen] Lower scalable integer vector reductions This patch uses the existing LowerFixedLengthReductionToSVE function to also lower scalable vector reductions. A separate function has been added to lower VECREDUCE_AND & VECREDUCE_OR operations with predicate types using ptest. Lowering scalable floating-point reductions will be addressed in a follow up patch, for now these will hit the assertion added to expandVecReduce() in TargetLowering. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D89382	2020-11-04 11:38:49 +00:00
Simon Pilgrim	825e517e34	[DAG] computeKnownBits - Replace ISD::MUL handling with the common KnownBits::computeForMul implementation	2020-11-04 11:32:08 +00:00
Fangrui Song	0314dff051	[DebugInfo] Delete unused DwarfUnit::addConstantFPValue & addConstantValue overloads. NFC This functions appear to be unused for many years.	2020-11-04 00:05:57 -08:00
Xiang1 Zhang	7ba3293691	[StackColoring] Conservatively merge catch point of V for catch(V) Reviewed By: thanm, clin1 Differential Revision: https://reviews.llvm.org/D86673	2020-11-04 10:49:56 +08:00
Michael Liao	4b11201592	[MachineInstr] Add support for instructions with multiple memory operands. - Basically iterate each pair of memory operands from both instructions and return true if any of them may alias. - The exception are memory instructions without any memory operand. They may touch everything and could alias to any memory instruction. Differential Revision: https://reviews.llvm.org/D89447	2020-11-03 20:44:40 -05:00
Gaurav Jain	492b1d78d5	[NFC] Use [MC]Register in register allocation Differential Revision: https://reviews.llvm.org/D90725	2020-11-03 17:34:26 -08:00
Simon Pilgrim	e9b88c754a	[DAG] computeKnownBits - Move ISD::SRA handling into KnownBits::ashr As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.	2020-11-03 18:09:33 +00:00
Simon Pilgrim	cb798f040a	[DAG] computeKnownBits - Move (most) ISD::SRL handling into KnownBits::lshr As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 17:30:36 +00:00
Jameson Nash	a0ad066ce4	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed, this just cleans up the code so that it can be exposed publically. Replaces https://reviews.llvm.org/D74158 Differential Revision: https://reviews.llvm.org/D89613	2020-11-03 10:02:09 -05:00
Simon Pilgrim	cab21d4fa8	[DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl As discussed on D90527, we should be be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 14:22:28 +00:00
Hans Wennborg	cbf25fbed5	Revert "[CodeGen] [WinException] Only produce handler data at the end of the function if needed" This caused an explosion in ICF times during linking on Windows when libfuzzer instrumentation is enabled. For a small binary we see ICF time go from ~0 to ~10 s. For a large binary it goes from ~1 s to forevert (I gave up after 30 minutes). See comment on the code review. > If we are going to write handler data (that is written as variable > length data following after the unwind info in .xdata), we need to > emit the handler data immediately, but for cases where no such > info is going to be written, skip emitting it right away. (Unwind > info for all remaining functions that hasn't gotten it emitted > directly is emitted at the end.) > > This does slightly change the ordering of sections (triggering a > bunch of updates to DebugInfo/COFF tests), but the change should be > benign. > > This also matches GCC's assembly output, which doesn't output > .seh_handlerdata unless it actually is needed. > > For ARM64, the unwind info can be packed into the runtime function > entry itself (leaving no data in the .xdata section at all), but > that can only be done if there's no follow-on data in the .xdata > section. If emission of the unwind info is triggered via > EmitWinEHHandlerData (or the .seh_handlerdata directive), which > implicitly switches to the .xdata section, there's a chance of the > caller wanting to pass further data there, so the packed format > can't be used in that case. > > Differential Revision: https://reviews.llvm.org/D87448 This reverts commit `36c64af9d7`.	2020-11-03 13:12:10 +01:00
Jessica Clarke	f77c8a48ae	[CodeGen] Fix regression from D83655 Arm EHABI has a null LSDASection as it does its own thing, so we should continue to return null in that case rather than try and cast it.	2020-11-03 03:57:46 +00:00
Gaurav Jain	b68994bd2d	[NFC] Use [MC]Register in Live-ness tracking Differential Revision: https://reviews.llvm.org/D90611	2020-11-02 15:46:13 -08:00
Fangrui Song	ee5d1a0449	[AsmPrinter] Split up .gcc_except_table MC currently produces monolithic .gcc_except_table section. GCC can split up .gcc_except_table: * if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat` * otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits` This ensures that (a) non-prevailing copies are discarded and (b) .gcc_except_table associated to discarded text sections can be discarded by a .gcc_except_table-aware linker (GNU ld, but not gold or LLD) This patches matches the GCC behavior. If -fno-unique-section-names is specified, we don't append the suffix. If -ffunction-sections is additionally specified, use `.section ...,unique`. Note, if clang driver communicates that the linker is LLD and we know it is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table costs, at least in the -fno-unique-section-names case. We cannot use it on GNU ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER components in an output section https://sourceware.org/bugzilla/show_bug.cgi?id=26256 For RISC-V -mrelax, this patch additionally fixes an assembler-linker interaction problem: because a section is shrinkable, the length of a call-site code range is not a constant. Relocations referencing the associated text section (STT_SECTION) are needed. However, a STB_LOCAL relocation referencing a discarded section group member from outside the group is disallowed by the ELF specification (PR46675): ``` // a.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int main() { return comdat(); } // b.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int foo() { return comdat(); } clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section: ``` -fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations. Reviewed By: jrtc27, rahmanl Differential Revision: https://reviews.llvm.org/D83655	2020-11-02 14:36:25 -08:00
Mircea Trofin	61e8a44655	[NFC][regalloc] Use MCRegister appropriately Differential Revision: https://reviews.llvm.org/D90506	2020-11-02 11:48:49 -08:00
Alex Richardson	5bc438efcf	[AtomicExpand] Avoid creating an unnamed libcall I recently modified this pass to better support CHERI-RISC-V and while doing so I noticed that this pass was calling M->getOrInsertFunction() with the result of TLI->getLibcallName(RTLibType). However, AMDGPU fills the libcalls array with nullptr, so this creates an anonymous function instead. This patch changes expandAtomicOpToLibcall to return false in case the libcall does not exist and changes the assert() in the callees to a report_fatal_error() instead. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D88800	2020-11-02 17:52:37 +00:00
Matt Arsenault	b3639e9ae9	RegisterCoalescer: Use Register	2020-11-02 10:14:50 -05:00
Chen Zheng	24a31922ce	[MachineSink] sink more profitable loads Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D86864	2020-11-01 21:13:27 -05:00
QingShan Zhang	1d178d600a	[Scheduling] Fall back to the fast cluster algorithm if the DAG is too complex We have added a new load/store cluster algorithm in D85517. However, AArch64 see some compiling deg with the new algorithm as the IsReachable() is not cheap if the DAG is complex. O(M+N) See https://bugs.llvm.org/show_bug.cgi?id=47966 So, this patch added a heuristic to switch to old cluster algorithm if the DAG is too complex. Reviewed By: Owen Anderson Differential Revision: https://reviews.llvm.org/D90144	2020-11-02 02:11:52 +00:00
Qiu Chaofan	1f852ba853	[PowerPC] Avoid unnecessary fadd for unsigned to ppcf128 Unsigned 32-bit or shorter integer to ppcf128 conversion are currently expanded as signed-to-double with an extra fadd to 'complement'. But on PowerPC we have native instruction to directly convert unsigned to double since ISA v2.06. This patch exploits it. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D89786	2020-11-01 23:22:47 +08:00
Cameron McInally	dda1e74b58	[Legalize] Add legalizations for VECREDUCE_SEQ_FADD Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass. Differential Revision: https://reviews.llvm.org/D90247	2020-10-30 16:02:55 -05:00

1 2 3 4 5 ...

29920 Commits