llvm-project

Commit Graph

Author	SHA1	Message	Date
Whitney Tsang	1006ac3963	[LoopNest] Consider loop nest with inner loop guard using outer loop induction variable to be perfect This patch allow more conditional branches to be considered as loop guard, and so more loop nests can be considered perfect. Reviewed By: bmahjour, sidbav Differential Revision: https://reviews.llvm.org/D94717	2021-05-07 16:04:18 +00:00
Benjamin Kramer	6248d11190	Retire TargetRegisterInfo::getSpillAlignment getSpillAlign does the same thing.	2021-05-07 15:16:22 +02:00
Simon Pilgrim	280aa3415e	[DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts. Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication. I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to). NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly. Differential Revision: https://reviews.llvm.org/D101987	2021-05-07 13:12:30 +01:00
Stephen Tozer	0791f968fe	[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST This patch modifies updateDbgUsersToReg to properly handle DBG_VALUE_LIST instructions, by replacing the hard-coded operand indices (i.e. getOperand(0)) with the more general getDebugOperandsForReg(), and updating the register for all matching operands. Differential Revision: https://reviews.llvm.org/D101523	2021-05-07 11:47:50 +01:00
Guillaume Chatelet	e805b7c2d6	[llvm][NFC] Remove remaining deprecated alignment functions from CodeGen Differential Revision: https://reviews.llvm.org/D102058	2021-05-07 10:22:41 +00:00
Guillaume Chatelet	eb1b26ec1d	[llvm][NFC] Remove deprecated TargetFrameLowering and InstrTypes alignment functions Differential Revision: https://reviews.llvm.org/D102056	2021-05-07 10:21:35 +00:00
Sebastian Neubauer	98e5ede604	[AMDGPU] Serialize MFInfo::ScavengeFI Serialize ScavengeFI from SIMachineFunctionInfo into yaml. ScavengeFI is not used outside of the PrologEpilogInserter, so this shouldn't change anything. Differential Revision: https://reviews.llvm.org/D101367	2021-05-07 11:15:25 +02:00
Amara Emerson	1ccebb18ef	[GlobalISel] Micro-optimize the conditional branch optimization. Convert a check into an assert and pass an MI instead of recomputing in the apply function.	2021-05-07 00:03:09 -07:00
Chen Zheng	a95473c563	[XCOFF] handle string constants generation for AIX This follows https://www.ibm.com/docs/en/aix/7.2?topic=constants-string Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D101280	2021-05-07 06:43:36 +00:00
qixingxue	e388b9399b	[IR] Fix typo in comment of Intrinsics.td (NFC)	2021-05-07 13:21:58 +08:00
Cyndy Ishida	c4ed142e69	[llvm][TextAPI] add mapping from OS string to Platform * add utility for matching target triple OS value strings to PlatformKind This was reviewed offline by ributzka, steven_wu	2021-05-06 16:25:56 -07:00
Stanislav Mekhanoshin	c714d03785	[AMDGPU] Expose __builtin_amdgcn_perm for v_perm_b32 Differential Revision: https://reviews.llvm.org/D102022	2021-05-06 16:17:33 -07:00
Sanjay Patel	fefcb1f878	[PassManager] add helper function to hold set of vector passes This is no-functional-change-intended (NFC) and split off from D102002 (which proposes to eliminate the LTO-based differences).	2021-05-06 15:36:15 -04:00
Mircea Trofin	97ab068034	[NPM] Do not run function simplification pipeline unnecessarily The CGSCC pass manager interplay with the FunctionAnalysisManagerCGSCCProxy is 'special' in the sense that the former will rerun the latter if there are changes to a SCC structure; that being said, some of the functions in the SCC may be unchanged. In that case, the function simplification pipeline will be re-run, which impacts compile time[1]. This patch allows the function simplification pipeline be skipped if it was already run and the function was not modified since. The behavior is currently disabled by default. This is because, currently, the rerunning of the function simplification pipeline on an unchanged function may still result in changes. The patch simplifies investigating and fixing those cases where repeated function pass runs do actually positively impact code quality, while offering an easy workaround for those impacted negatively by compile time regressions, and not impacting mainline scenarios. [1] A [[ http://llvm-compile-time-tracker.com/compare.php?from=eb37d3546cd0c6e67798496634c45e501f7806f1&to=ac722d1190dc7bbdd17e977ef7ec95e69eefc91e&stat=instructions \| compile time tracker ]] run with the option enabled. Differential Revision: https://reviews.llvm.org/D98103	2021-05-06 12:24:33 -07:00
Kerry McLaughlin	8c9742bd23	[SVE][LoopVectorize] Add support for scalable vectorization of first-order recurrences Adds support for scalable vectorization of loops containing first-order recurrences, e.g: ``` for(int i = 0; i < n; i++) b[i] = a[i] + a[i - 1] ``` This patch changes fixFirstOrderRecurrence for scalable vectors to take vscale into account when inserting into and extracting from the last lane of a vector. CreateVectorSplice has been added to construct a vector for the recurrence, which returns a splice intrinsic for scalable types. For fixed-width the behaviour remains unchanged as CreateVectorSplice will return a shufflevector instead. The tests included here are the same as test/Transform/LoopVectorize/first-order-recurrence.ll Reviewed By: david-arm, fhahn Differential Revision: https://reviews.llvm.org/D101076	2021-05-06 11:35:39 +01:00
Guillaume Chatelet	089ec047be	[llvm][NFC] Remove CallingConvLower deprecated alignment functions Differential Revision: https://reviews.llvm.org/D101910	2021-05-06 07:46:19 +00:00
Guillaume Chatelet	1fa21bf9e9	[llvm][NFC] Remove SelectionDag alignment deprecated functions Differential Revision: https://reviews.llvm.org/D101909	2021-05-06 07:44:14 +00:00
Guillaume Chatelet	040f4a97cd	[llvm][NFC] Remove deprecated InterleaveGroup::getAlignment() function. Differential Revision: https://reviews.llvm.org/D101907	2021-05-06 07:40:18 +00:00
Guillaume Chatelet	a065efa302	[llvm][NFC] Remove deprecated DataLayout::getPreferredAlignment functions Differential Revision: https://reviews.llvm.org/D101906	2021-05-06 07:28:00 +00:00
Guillaume Chatelet	b4795544d4	[llvm][NFC] Remove deprecated Alignment::None() Differential Revision: https://reviews.llvm.org/D101905	2021-05-06 07:21:23 +00:00
Johannes Doerfert	df729e2b82	[OpenMP] Overhaul `declare target` handling This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well. This started with PR49649 in mind, trying to provide a way for users to avoid the "ref" global use introduced for globals with internal linkage. From there it went down the rabbit hole, e.g., all variables, even `nohost` ones, were emitted into the device code so it was impossible to determine if "ref" was needed late in the game (based on the name only). To make it really useful, `begin declare target` was needed as it can carry the `device_type`. Not emitting variables eagerly had a ripple effect. Finally, the precedence of the (explicit) declare target list items needed to be taken into account, that meant we cannot just look for any declare target attribute to make a decision. This caused the handling of functions to require fixup as well. I tried to clean up things while I was at it, e.g., we should not "parse declarations and defintions" as part of OpenMP parsing, this will always break at some point. Instead, we keep track what region we are in and act on definitions and declarations instead, this is what we do for declare variant and other begin/end directives already. Highlights: - new diagnosis for restrictions specificed in the standard, - delayed emission of globals not mentioned in an explicit list of a declare target, - omission of `nohost` globals on the host and `host` globals on the device, - no explicit parsing of declarations in-between `omp [begin] declare variant` and the corresponding end anymore, regular parsing instead, - precedence for explicit mentions in `declare target` lists over implicit mentions in the declaration-definition-seq, and - `omp allocate` declarations will now replace an earlier emitted global, if necessary. --- Notes: The patch is larger than I hoped but it turns out that most changes do on their own lead to "inconsistent states", which seem less desirable overall. After working through this I feel the standard should remove the explicit declare target forms as the delayed emission is horrible. That said, while we delay things anyway, it seems to me we check too often for the current status even though that is often not sufficient to act upon. There seems to be a lot of duplication that can probably be trimmed down. Eagerly emitting some things seems pretty weak as an argument to keep so much logic around. --- Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D101030	2021-05-06 02:10:41 -05:00
Lang Hames	7b73cd684a	[ORC] Introduce C API for adding object buffers directly to an object layer. This can be useful for clients constructing custom JIT stacks: If the C API for your custom stack exposes API to obtain a reference to an object layer (e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT functions can be used to add objects directly to that layer.	2021-05-05 19:02:13 -07:00
RamNalamothu	41f8b8e807	[MCAsmInfo] Support UsesCFIForDebug for targets with no exception handling This change enables emitting CFI unwind information for debugging purpose for targets with MCAsmInfo::ExceptionsType == ExceptionHandling::None. Currently generating CFI unwind information is entangled with supporting the exceptions, even when AsmPrinter explicitly recognizes that the unwind tables are being generated as debug information. In fact, the unwind information is not generated even if we specify --force-dwarf-frame-section, unless exceptions are enabled. The LIT test llvm/test/CodeGen/AMDGPU/debug_frame.ll demonstrates this behavior. Enable this option for AMDGPU to prepare for future patches which add complete CFI support. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D78778	2021-05-06 04:53:45 +05:30
Matt Arsenault	fa0b93b5a0	GlobalISel: Use DAG call lowering infrastructure in a more compatible way Unfortunately the current call lowering code is built on top of the legacy MVT/DAG based code. However, GlobalISel was not using it the same way. In short, the DAG passes legalized types to the assignment function, and GlobalISel was passing the original raw type if it was simple. I do believe the DAG lowering is conceptually broken since it requires picking a type up front before knowing how/where the value will be passed. This ends up being a problem for AArch64, which wants to pass i1/i8/i16 values as a different size if passed on the stack or in registers. The argument type decision is split across 3 different places which is hard to follow. SelectionDAG builder uses getRegisterTypeForCallingConv to pick a legal type, tablegen gives the illusion of controlling the type, and the target may have additional hacks in the C++ part of the call lowering. AArch64 hacks around this by not using the standard AnalyzeFormalArguments and special casing i1/i8/i16 by looking at the underlying type of the original IR argument. I believe people have generally assumed the calling convention code is processing the original types, and I've discovered a number of dead paths in several targets. x86 actually relies on the opposite behavior from AArch64, and relies on x86_32 and x86_64 sharing calling convention code where the 64-bit cases implicitly do not work on x86_32 due to using the pre-legalized types. AMDGPU targets without legal i16/f16 have always used a broken ABI that promotes to i32/f32. GlobalISel accidentally fixed this to be the ABI we should have, but this fixes it so we're using the worse ABI that is compatible with the DAG. Ideally we would fix the DAG to match the old GlobalISel behavior, but I don't wish to fight that battle. A new native GlobalISel call lowering framework should let the target process the incoming types directly. CCValAssigns select a "ValVT" and "LocVT" but the meanings of these aren't entirely clear. Different targets don't use them consistently, even within their own call lowering code. My current belief is the intent was "ValVT" is supposed to be the legalized value type to use in the end, and and LocVT was supposed to be the ABI passed type (which is also legalized). With the default CCState::Analyze functions always passing the same type for these arguments, these only differ when the TableGen part of the lowering decide to promote the type from one legal type to another. AArch64's i1/i8/i16 hack ends up inverting the meanings of these values, so I had to add an additional hack to let the target interpret how large the argument memory is. Since targets don't consistently interpret ValVT and LocVT, this doesn't produce quite equivalent code to the initial DAG lowerings. I've opted to consistently interpret LocVT as the in-memory size for stack passed values, and ValVT as the register type to assign from that memory. We therefore produce extending loads directly out of the IRTranslator, whereas the DAG would emit regular loads of smaller values. This will also produce loads/stores that are wider than the argument value if the allocated stack slot is larger (and there will be undef padding bytes). If we had the optimizations to reduce load/stores based on truncated values, this wouldn't produce a different end result. Since ValVT/LocVT are more consistently interpreted, we now will emit more G_BITCASTS as requested by the CCAssignFn. For example AArch64 was directly assigning types to some physical vector registers which according to the tablegen spec should have been casted to a vector with a different element type. This also moves the responsibility for inserting G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the generic code, which is closer to how SelectionDAGBuilder works. I had to xfail an x86 test since I don't see a quick way to fix it right now (I filed bug 50035 for this). It's broken independently of this change, and only triggers since now we end up with more ands which hit the improperly handled selection pattern. I also observed that FP arguments that need promotion (e.g. f16 passed as f32) are broken, and use regular G_TRUNC and G_ANYEXT. TLDR; the current call lowering infrastructure is bad and nobody has ever understood how it chooses types.	2021-05-05 17:35:02 -04:00
Philipp Krones	632ebc4ab4	[MC] Untangle MCContext and MCObjectFileInfo This untangles the MCContext and the MCObjectFileInfo. There is a circular dependency between MCContext and MCObjectFileInfo. Currently this dependency also exists during construction: You can't contruct a MOFI without a MCContext without constructing the MCContext with a dummy version of that MOFI first. This removes this dependency during construction. In a perfect world, MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the MCContext, like other MC information. This is future work. This also shifts/adds more information to the MCContext making it more available to the different targets. Namely: - TargetTriple - ObjectFileType - SubtargetInfo Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101462	2021-05-05 10:03:02 -07:00
Anirudh Prasad	ae2aef1361	[AsmParser][SystemZ][z/OS] Reject character and string literals for HLASM - As per the HLASM support we are providing, i.e. support only for the first parameter of the inline asm block, only pertaining to Z machine instructions defined in LLVM, character literals and string literals are not supported (see Figure 4 - https://www-01.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3sc264940/$file/asmr1023.pdf for more information) - This patch explicitly rejects the usage of char literals and string literals (for example "abc 'a'") when the relevant field is set - This is achieved by introducing a field called `LexHLASMStrings` in MCAsmLexer similar to `LexMasmStrings` Reviewed By: abhina.sreeskantharajan, Kai Differential Revision: https://reviews.llvm.org/D101660	2021-05-05 10:21:55 -04:00
Martin Storsjö	6f5670a4c3	Revert "[Passes] Enable the relative lookup table converter pass on aarch64" This reverts commit `57b259a852`. The relative lookup table converter pass seems to cause problems for chromium on Windows/ARM64, see https://crbug.com/1204788.	2021-05-05 15:23:14 +03:00
Fangrui Song	7cac6a9d7a	[MC] Add MCAsmParser::parseComma to improve diagnostics llvm-mc will error "expected comma" instead of "unexpected token".	2021-05-04 14:13:19 -07:00
Fraser Cormack	6523ff6d47	[ValueTypes] Add MVTs for v256i16 and v256f16 This patch adds the two MVTs to fix a legalizer crash when using vector shuffles of <256 x i16> and <128 x i16> on RISC-V. The legalizer can't promote the operand of `v256i32 = any_extend_vector_inreg v128i16`. Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D101769	2021-05-04 18:06:13 +01:00
Wei Mi	82956de05f	[SampleFDO] Fix a bug when appending function symbol into the Callees set of Root node in ProfiledCallGraph. In ProfiledCallGraph::addProfiledFunction, to add a function symbol into the ProfiledCallGraph, currently an uninitialized ProfiledCallGraphNode node is created by ProfiledFunctions[Name] and inserted into Callees set of Root node before the node is initialized. The Callees set use ProfiledCallGraphNodeComparer as its comparator so the uninitialized ProfiledCallGraphNode may fail to be inserted into Callees set if it happens to contain a name in memory which has been inserted into the Callees set before. The problem will prevent some function symbols from being annotated with profiles and cause performance regression. The patch fixes the problem. Differential Revision: https://reviews.llvm.org/D101815	2021-05-04 10:05:59 -07:00
Sander de Smalen	9931ae645e	Reland "[LV] Calculate max feasible scalable VF." Relands https://reviews.llvm.org/D98509 This reverts commit `51d648c119`.	2021-05-04 15:44:41 +01:00
Jan Svoboda	00895831ab	[clang][cli][docs] Clarify marshalling infrastructure documentation	2021-05-04 15:16:32 +02:00
Jan Svoboda	d0e3a15e36	[clang][cli] NFC: Remove confusing `EmptyKPM` variable	2021-05-04 14:27:57 +02:00
Simon Moll	1db4dbba24	Recommit "[VP,Integer,#2] ExpandVectorPredication pass" This reverts the revert `02c5ba8679` Fix: Pass was registered as DUMMY_FUNCTION_PASS causing the newpm-pass functions to be doubly defined. Triggered in -DLLVM_ENABLE_MODULE=1 builds. Original commit: This patch implements expansion of llvm.vp.* intrinsics (https://llvm.org/docs/LangRef.html#vector-predication-intrinsics). VP expansion is required for targets that do not implement VP code generation. Since expansion is controllable with TTI, targets can switch on the VP intrinsics they do support in their backend offering a smooth transition strategy for VP code generation (VE, RISC-V V, ARM SVE, AVX512, ..). Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D78203	2021-05-04 11:47:52 +02:00
David Green	18883a3fec	[TTI] Replace ceil lambdas with divideCeil. NFCI As pointed out in D101726, this function already exists in MathExtras. It uses different types, but with the values used here I believe that should not make a functional difference.	2021-05-04 09:04:44 +01:00
Tomasz Miąsko	7310403e3c	[demangler] Initial support for the new Rust mangling scheme Add a demangling support for a small subset of a new Rust mangling scheme, with complete support planned as a follow up work. Intergate Rust demangling into llvm-cxxfilt and use llvm-cxxfilt for end-to-end testing. The new Rust mangling scheme uses "_R" as a prefix, which makes it easy to disambiguate it from other mangling schemes. The public API is modeled after __cxa_demangle / llvm::itaniumDemangle, since potential candidates for further integration use those. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D101444	2021-05-03 16:44:30 -07:00
Arthur Eubanks	2df3426fd1	[NewPM] Invalidate AAManager after populating GlobalsAA GlobalsAA is only created at the beginning of the inliner pipeline. If an AAManager is cached from previous passes, it won't get rebuilt to include the newly created GlobalsAA. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D101379	2021-05-03 16:37:32 -07:00
Valentin Clement	63f8226f25	[OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions Add function to create the offload_maptypes and the offload_mapnames globals. These two functions are used in clang. They will be used in the Flang/MLIR lowering as well. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101503	2021-05-03 15:42:32 -04:00
Anirudh Prasad	ca02fab7e7	[AsmParser][SystemZ][z/OS] Implement HLASM location counter syntax ("") for Z PC-relative instructions. - This patch attempts to implement the location counter syntax () for the HLASM variant for PC-relative instructions. - In the HLASM variant, for purely constant relocatable values, we expect a * token preceding it, with special support for " " which is parsed as "<pc-rel-insn 0>" - For combinations of absolute values and relocatable values, we don't expect the "" preceding the token. When you have a " * " what’s accepted is: ``` <space>.{.} -> <pc-rel-insn> 0 [+\|-][constant-value] -> <pc-rel-insn> [+\|-]constant-value ``` When you don’t have a " * " what’s accepted is: ``` brasl 1,func is allowed (MCSymbolRef type) brasl 1,func+4 is allowed (MCBinary type) brasl 1,4+func is allowed (MCBinary type) brasl 1,-4+func is allowed (MCBinary type) brasl 1,func-4 is allowed (MCBinary type) brasl 1,func is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,+func is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,+func+4 is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,+4+func is not allowed ( cannot be used for non-MCConstantExprs) brasl 1,-4+8+func is not allowed ( cannot be used for non-MCConstantExprs) ``` Reviewed By: Kai Differential Revision: https://reviews.llvm.org/D100987	2021-05-03 14:58:24 -04:00
Paul Robinson	1d299252dd	[DebuggerTuning] Move a comment to a more useful place. The comment about how to make use of debugger tuning within DwarfDebug really belongs inside the DwarfDebug declaration, where it will be easier to find.	2021-05-03 11:08:04 -07:00
Chris Lattner	5fa9d41634	[Support/Parallel] Add a special case for 0/1 items to llvm::parallel_for_each. This avoids the non-trivial overhead of creating a TaskGroup in these degenerate cases, but also exposes parallelism. It turns out that the default executor underlying TaskGroup prevents recursive parallelism - so an instance of a task group being alive will make nested ones become serial. This is a big issue in MLIR in some dialects, if they have a single instance of an outer op (e.g. a firrtl.circuit) that has many parallel ops within it (e.g. a firrtl.module). This patch side-steps the problem by avoiding creating the TaskGroup in the unneeded case. See this issue for more details: https://github.com/llvm/circt/issues/993 Note that this isn't a really great solution for the general case of nested parallelism. A redesign of the TaskGroup stuff would be better, but would be a much more invasive change. Differential Revision: https://reviews.llvm.org/D101699	2021-05-03 10:08:00 -07:00
Abhina Sreeskantharajan	1527a5e4b4	[SystemZ][z/OS] Add the functions needed for handling EBCDIC I/O This patch adds the basic functions needed for controlling auto conversion on z/OS. Auto conversion is enabled on untagged input file to ASCII by making the assumption that all untagged files are EBCDIC encoded. Output files are auto converted to EBCDIC IBM-1047. This change also enables conversion for stdin/stdout/stderr. For more information on how fcntl controls codepage https://www.ibm.com/docs/en/zos/2.4.0?topic=descriptions-fcntl-bpx1fct-bpx4fct-control-open-file-descriptors Reviewed By: anirudhp Differential Revision: https://reviews.llvm.org/D100483	2021-05-03 08:52:38 -04:00
David Green	d1bbe61d1c	[ARM] Memory operands for MVE gathers/scatters Similarly to D101096, this makes sure that MMO operands get propagated through from MVE gathers/scatters to the Machine Instructions. This allows extra scheduling freedom, not forcing the instructions to act as scheduling barriers. We create MMO's with an unknown size, specifying that they can load from anywhere in memory, similar to the masked_gather or X86 intrinsics. Differential Revision: https://reviews.llvm.org/D101219	2021-05-03 11:24:59 +01:00
Sergio Perez Gonzalez	761d5614a1	[Object] Fix e_machine description for EM_CR16 and add EM_MICROBLAZE Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101133	2021-05-02 19:25:39 -07:00
David Green	15b5d1a5bf	[ARM] Transfer memory operands for VLDn We create MMO's for the VLDn/VSTn intrinsics in ARMTargetLowering:: getTgtMemIntrinsic, but they do not currently make it ll the way through ISel. This changes that in the various places it needs changing, making sure that the MMO is propagate through to the final instruction. This can help in scheduling, not treating the VLD2/VST2 as a scheduling barrier. Differential Revision: https://reviews.llvm.org/D101096	2021-05-03 00:04:21 +01:00
Craig Topper	6430430958	[TableGen] Use sign rotated VBR for OPC_EmitInteger. This allows for a much more efficient encoding for small negative numbers by storing the sign bit first and negating the rest of the bits. This was already being used for OPC_CheckInteger. For every in tree target this affects, the table got smaller. R600GenDAGISel.inc saw the largest reduction of 7K. I did have to add a new opcode for StringIntegers used for register class ids and subregister indices since we don't have the integer value to encode. The enum name is emitted directly into the table. Previously assumed the enum would expand to a positive 7-bit number. We might be able to just shift that right by 1 and assume it is a positive 6 bit number, but that will need more investigation.	2021-05-02 12:40:44 -07:00
Juneyoung Lee	1977c53b2a	[InstCombine] Fold overflow bit of [u\|s]mul.with.overflow in a poison-safe way As discussed in D101191, this patch adds a poison-safe folding of overflow bit check: ``` %Op0 = icmp ne i4 %X, 0 %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = select i1 %Op0, i1 %Op1, i1 false => %Y.fr = freeze %Y %Agg = call { i4, i1 } @llvm.[us]mul.with.overflow.i4(i4 %X, i4 %Y.fr) %Op1 = extractvalue { i4, i1 } %Agg, 1 %ret = %Op1 ``` https://alive2.llvm.org/ce/z/zgPUGT https://alive2.llvm.org/ce/z/h2gZ_6 Note that there are cases where inserting freeze is not necessary: e.g. %Y is `noundef`. In this case, LLVM is already good because `%ret` is already successfully folded into `and`, triggering the pre-existing optimization in InstSimplify: https://godbolt.org/z/v6qena15K Differential Revision: https://reviews.llvm.org/D101423	2021-05-02 11:54:12 +09:00
Nikita Popov	cc58e8918b	[SCEV] Simplify backedge count clearing (NFC) This seems to be a leftover from when the BackedgeTakenInfo stored multiple exit counts with manual memory management. At some point this was switchted to a simple vector, and there should be no need to micro-manage the clearing anymore. We can simply drop the loop from the map and the the destructor do its job.	2021-05-01 17:50:01 +02:00
Nathan Chancellor	4397b7095d	Revert "Re-reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"" This reverts commit `791930d740`, as per https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy. I observed breakage with the Linux kernel, as reported at https://reviews.llvm.org/D91722#2724321 Fixes exist at https://reviews.llvm.org/D101523 https://reviews.llvm.org/D101540 but they have not landed so to unbreak the tree for the weekend, revert this commit. Commit `b11e4c9907` ("Revert "[DebugInfo] Drop DBG_VALUE_LISTs with an excessive number of debug operands"") only reverted one follow-up fix, not the original patch that broke the kernel. e	2021-04-30 20:23:21 -07:00
Adrian Prantl	02c5ba8679	Revert "[VP,Integer,#2] ExpandVectorPredication pass" This reverts commit `43bc584dc0`. The commit broke the -DLLVM_ENABLE_MODULES=1 builds. http://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/31603/consoleFull#2136199809a1ca8a51-895e-46c6-af87-ce24fa4cd561	2021-04-30 17:02:28 -07:00

1 2 3 4 5 ...

44823 Commits