llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	97a4e7b7ff	[InstCombine] remove a buggy set of zext-icmp transforms The motivating case is an infinite loop shown with a reduced test from: https://llvm.org/PR51762 To solve this, I'm proposing we delete the most obviously broken part of this code. The bug example shows a fundamental problem: we ask computeKnownBits if a transform will be profitable, alter the code by creating new instructions, then rely on computeKnownBits to return the same answer to actually eliminate instructions. But there's no guarantee that the results will be the same between the 1st and 2nd calls. In the infinite loop example, we get different answers, so we add instructions that conflict with some other transform, and we're stuck. There's at least one other problem visible in the test diff for `@zext_or_masked_bit_test_uses`: the code doesn't check uses properly, so we can end up with extra instructions created. Last, it's not clear if this set of transforms actually improves analysis or codegen. I spot-checked a few targets and don't see a clear win: https://godbolt.org/z/x87EWovso If we do see a regression from this change, codegen seems like the right place to add a cmp -> bit-hack fold. If this is too big of a step, we could limit the computeKnownBits calls by not passing a context instruction and/or limiting the recursion. I checked that those would stop the infinite loop for PR51762, but that won't guarantee that some other example does not fall into the same loop. Differential Revision: https://reviews.llvm.org/D109440	2021-09-09 08:49:39 -04:00
Nico Weber	7484206cfd	[gn build] Make lldb build on Windows Differential Revision: https://reviews.llvm.org/D109478	2021-09-09 08:13:50 -04:00
Florian Mayer	6e12c73316	[NFC] [stack-safety] add placeholder addRange. This is in preparation for D108457.	2021-09-09 13:13:18 +01:00
Cullen Rhodes	6c8ff4032e	[OptParser] NFC: Remove unused template arg 'name' from bool opt Identified in D109359. Reviewed By: jansvoboda11 Differential Revision: https://reviews.llvm.org/D109489	2021-09-09 12:04:40 +00:00
Florian Mayer	d261d4cf55	[stack-safety] [NFC] do not terminate print with blank line.	2021-09-09 12:31:09 +01:00
LLVM GN Syncbot	9bb803c7a6	[gn build] Port `c58c7a6ea0`	2021-09-09 11:25:54 +00:00
Florian Mayer	08b4dd8b24	[NFC] [stack-safety] remove unused return value.	2021-09-09 12:19:47 +01:00
Simon Pilgrim	c31a202233	[X86][AVX] Add missing X86ISD::VBROADCAST(v2f64 -> v4f64) isel pattern for AVX1 targets As discussed on the ticket, I'm intending to add additional 128->256 patterns when we have test coverage, but this addresses a known crash. Differential Revision: https://reviews.llvm.org/D109434	2021-09-09 12:16:23 +01:00
Bradley Smith	8089f9ed5a	[AArch64][SVE] Add missing patterns for unpredicated subr intrinsics Differential Revision: https://reviews.llvm.org/D109369	2021-09-09 10:28:37 +00:00
Alfonso Sánchez-Beato	b33fd31772	[yaml2obj][COFF] Allow variable number of directories Allow variable number of directories, as allowed by the specification. NumberOfRvaAndSize will default to 16 if not specified, as in the past. Reviewed by: jhenderson Differential Revision: https://reviews.llvm.org/D108825	2021-09-09 11:16:56 +01:00
Sjoerd Meijer	ecff9e3da5	[FuncSpec] Fixed minor formatting issues. NFC.	2021-09-09 10:36:54 +01:00
Roman Lebedev	909cba9699	[SimplifyCFG] performBranchToCommonDestFolding(): require block-closed SSA form for bonus instructions (PR51125) I can't seem to wrap my head around the proper fix here, we should be fine without this requirement, iff we can form this form, but the naive attempt (https://reviews.llvm.org/D106317) has failed. So just to unblock the release, put up a restriction. Fixes https://bugs.llvm.org/show_bug.cgi?id=51125	2021-09-09 12:28:09 +03:00
Jun Ma	8ba2adcf9e	Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values."" Differential Revision: https://reviews.llvm.org/D106056	2021-09-09 16:53:33 +08:00
Cullen Rhodes	9d4896f50e	[SelectionDAG] NFC: Remove unused template args Identified in D109359.	2021-09-09 07:29:29 +00:00
Cullen Rhodes	d42f76fd36	[AArch64][SVE] NFC: Remove unused template args For sve_fp_3op_p_zds_zx we have zero patterns downstream but the intrinsic args can be added again if/when the patterns are implemented. Identified in D109359. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D109429	2021-09-09 07:10:57 +00:00
Cullen Rhodes	5b848a35d2	[AArch64][SVE] NFC: Use stepvector directly in index multiclasses Also fixes a couple of warnings identified in D109359: SVEInstrFormats.td:5099:59: warning: unused template argument: sve_int_index_ri::step_vector SVEInstrFormats.td:5133:59: warning: unused template argument: sve_int_index_rr::step_vector Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D109422	2021-09-09 07:10:57 +00:00
Alexander Pivovarov	4bc8dbe0ca	[RISCV] Add SiFive cores E and S series Add SiFive cores E20, E21, E24, E34, S21, S54 and S76 Differential Revision: https://reviews.llvm.org/D109260	2021-09-08 23:59:04 -07:00
Yvan Roux	261cbe98c3	[RISCV] Fix Machine Outliner jump table handling. Don't outline machine instructions which are using jump table indexes since they are materialized as local labels (like the already handled case of constant pools). Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D109436	2021-09-09 07:32:30 +02:00
Peter Collingbourne	883e93cb28	gn build: Add support for building lldb-server on Android. The cross-compiled lldb-server targets are added to the lldb deps if Android cross compilation is enabled. Differential Revision: https://reviews.llvm.org/D109464	2021-09-08 19:33:51 -07:00
Peter Collingbourne	9449f441fc	gn build: Add support for building LLDB on Linux. On Linux, LLDB depends on lldb-server at runtime (on Mac, the dependency on a debug server presumably comes via the system debugserver), so I added it to deps. Differential Revision: https://reviews.llvm.org/D109463	2021-09-08 19:33:51 -07:00
Chris Lattner	9e46dd965a	[APInt.h] Reduce the APInt header file interface a bit. NFC This moves one mid-size function out of line, inlines the trivial tcAnd/tcOr/tcXor/tcComplement methods into their only caller, and moves the magic/umagic functions into SelectionDAG since they are implementation details of its algorithm. This also removes the unit tests for magic, but these are already tested in the divide lowering logic for various targets. This also upgrades some C style comments to C++. Differential Revision: https://reviews.llvm.org/D109476	2021-09-08 18:17:07 -07:00
Jessica Paquette	22a64d4a14	[MachineOutliner][AArch64] Ensure LR is live-in when inserting reg-save calls Similar to other code which handles creating the function frame. If LR isn't live-in to the block that we're inserting the call into, we'll get a MachineVerifier error.	2021-09-08 17:44:27 -07:00
Amara Emerson	eae44c8a86	[GlobalISel] Implement merging of stores of truncates. This is a port of a combine which matches a pattern where a wide type scalar value is stored by several narrow stores. It folds it into a single store or a BSWAP and a store if the target supports it. Assuming little endian target: i8 p = ... i32 val = ... p[0] = (val >> 0) & 0xFF; p[1] = (val >> 8) & 0xFF; p[2] = (val >> 16) & 0xFF; p[3] = (val >> 24) & 0xFF; => ((i32)p) = val; On CTMark AArch64 -Os this results in a good amount of savings: Program before after diff SPASS 412792 412788 -0.0% kc 432528 432512 -0.0% lencod 430112 430096 -0.0% consumer-typeset 419156 419128 -0.0% bullet 475840 475752 -0.0% tramp3d-v4 367760 367628 -0.0% clamscan 383388 383204 -0.0% pairlocalalign 249764 249476 -0.1% 7zip-benchmark 570100 568860 -0.2% sqlite3 287628 286920 -0.2% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D109419	2021-09-08 17:06:33 -07:00
Philip Reames	e741fabc22	[SCEV] Move getIndexExpressionsFromGEP to delinearize [NFC]	2021-09-08 16:56:49 -07:00
Chris Lattner	717ed1c310	[APInt.h] don't privatize "needsCleanup"; it is used by Clang APValue	2021-09-08 16:33:06 -07:00
David Blaikie	d18083c6dc	Error: Improve unit test by using gtest equality rather than explicit string compare calls This ensures error messages from gtest includes the raw text of both sides of the comparison - otherwise all gtest can report is the text of the expression source, without any information about the values or how they differ.	2021-09-08 16:21:11 -07:00
David Blaikie	f03689ace5	FileError: Provide a way to retrieve the underlying error string without the file name For use with APIs that want to report the file name in a different syntactic form, have other knowledge of the filename, etc.	2021-09-08 16:16:54 -07:00
David Blaikie	0c502507f4	FileError: Support zero-length file names It's a common error in an API - to try to open an empty file, so it seems like a reasonable FileError to produce "hey, you tried to open an empty file" and to handle it the same way as any other file error.	2021-09-08 16:16:54 -07:00
Chris Lattner	a024d35b38	[APInt.h] Clean up the APInt interface. NFC. This moves all the private implementation details to the bottom of the header, and pushes all the "make an APInt" stuff up to the top. This is in prep for making other changes to spiff up APInt a bit.	2021-09-08 16:08:57 -07:00
Philip Reames	4b5e260b1d	[SCEV] Simplify findExistingSCEVInCache interface [NFC] We were returning a tuple when all but one caller only cared about one piece of the return value. That one caller can inline the complexity, and we can simplify all other uses.	2021-09-08 15:26:07 -07:00
Andrew Litteken	144cd22bae	[CodeExtractor] Creating exit stubs based off original order branch instructions. Previously the CodeExtractor created exit stubs, and the subsequent return value of the outlined function based on the order of out-of-region blocks after splitting any phi nodes, and collecting the blocks to be outlined. This could cause differences in order if there was a difference of exit block phi nodes between the two regions. This patch moves the collection of the output target blocks to be before this occurs, so that the assignment of target block to output value will be the same, regardless of the contents of the output block. Reviewers: paquette, roelofs Differential Revision: https://reviews.llvm.org/D108657	2021-09-08 15:15:15 -07:00
David Green	7ff67d5bf8	[AArch64] Rewrite floatdp_1source.ll test. NFC Rewrite this test to not rely on volatile stores in a large function, just use separate functions like any other test would.	2021-09-08 23:00:34 +01:00
Arthur Eubanks	fe15347a1e	Port the cost model printer to New PM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D109284	2021-09-08 14:47:05 -07:00
Craig Topper	a574f0e0c3	[RISCV] Disable use of i128 shift libcalls on RV32. Since i128 isn't a legal C type on RV32, I don't believe libgcc implements these functions for RV32. compiler-rt does implement them because i128 support is enabled in order to handle long double. This is consistent with 32-bit X86 and ARM. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D109383	2021-09-08 14:26:07 -07:00
Eli Friedman	0375734439	[NFC] Add extra test for D106331	2021-09-08 14:18:47 -07:00
Michael Kruse	088577a38e	[Delinerization] Require byte offset to be zero. Users of delinearization assume that the offset into the array element is zero. In most cases it will indeed be zero, but if it is not, the delinearization has to fail since it violates that assumption without the API even allowing to signal to the caller that the byte offset is non-zero. This bug caused Polly to miscompile blender (526.blender_r from SPEC CPU 2017) in -polly-process-unprofitable mode. The SCEV expression incorrectly delinearized has been reduced in the test case byte_offset.ll. The dropped offset into the array element of size 4 (a float) is ((sext i32 %mul7.i4534 to i64) + {(sext i32 %i1 to i64),+,((sext i32 (1 + ((1 + %shl.i.i) * (1 + %shl.i.i)) + %shl.i.i) to i64) * (sext i32 %i1 to i64))}<%for.body703>). This significant component was just dropped, and the wrong pointer was computed when regenerating code from the remaining delinearized subscripts. This occurred during blender's subsurface scattering implementation. As a result, blender's rendering diverged from the reference image. Patch D108885 would also fix the API. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D109133	2021-09-08 16:02:37 -05:00
Greg Clayton	14850a0628	Log to the right stream in DwarfTransformer::handleDie(). Since we might end up using multiple threads when logging information in the DWARFTransformer, the handleDie() method must use the supplied stream named "OS" when logging warnings and errors. When we use multiple threads, we log to a thread specific stream buffer and then use a mutex to ensure our output doesn't overlap when we emit warnings and errors after a thread is done. Differential Revision: https://reviews.llvm.org/D109401	2021-09-08 14:00:19 -07:00
Florian Hahn	f4726e7238	[LAA] Remove unused OrigPtr from replaceSymbolicStrideSCEV (NFC). The OrigPtr argument is not used in tree.	2021-09-08 22:35:36 +02:00
Nikita Popov	6dfdc6bfd2	[SROA] Support opaque pointers Make the following changes in order to support opaque pointers in SROA: * Generate i8 GEPs for opaque pointers. * Explicitly enforce that promotable allocas only have stores of the alloca type -- previously this was implicitly enforced. * Replace a check for pointer element type with load/store type. Differential Revision: https://reviews.llvm.org/D109259	2021-09-08 22:25:44 +02:00
Arthur Eubanks	b493124ae2	[MemorySSA] Support invariant.group metadata The implementation is mostly copied from MemDepAnalysis. We want to look at all loads and stores to the same pointer operand. Bitcasts and zero GEPs of a pointer are considered the same pointer value. We choose the most dominating instruction. Since updating MemorySSA with invariant.group is non-trivial, for now handling of invariant.group is not cached in any way, so it's part of the walker. The number of loads/stores with invariant.group is small for now anyway. We can revisit if this actually noticeably affects compile times. To avoid invariant.group affecting optimized uses, we need to have optimizeUsesInBlock() not use invariant.group in any way. Co-authored-by: Piotr Padlewski <prazek@google.com> Reviewed By: asbirlea, nikic, Prazek Differential Revision: https://reviews.llvm.org/D109134	2021-09-08 13:06:12 -07:00
Philip Reames	585c594d74	Move delinearization logic out of SCEV [NFC] None of this logic has anything to do with SCEV's internals, it just uses the existing public APIs. As a result, we can move the code from ScalarEvolution.cpp/hpp to Delinearization.cpp/hpp with only minor changes. This was discussed in advance on today's loop opt call. It turned out to be easy as hoped.	2021-09-08 12:28:35 -07:00
Nikita Popov	3e54de4df2	[ConstantHoisting] Support opaque pointers Directly use i8 for GEP, rather than fetching element type of i8*.	2021-09-08 21:23:10 +02:00
Akira Hatanaka	dea6f71af0	[ObjC][ARC] Use the addresses of the ARC runtime functions instead of integer 0/1 for the operand of bundle "clang.arc.attachedcall" https://reviews.llvm.org/D102996 changes the operand of bundle "clang.arc.attachedcall". This patch makes changes to llvm that are needed to handle the new IR. This should make it easier to understand what the IR is doing and also simplify some of the passes as they no longer have to translate the integer values to the runtime functions. Differential Revision: https://reviews.llvm.org/D103000	2021-09-08 11:58:03 -07:00
Andrew Litteken	0087bb4a9a	[IROutliner] Using canonical values to find corresponding values. (NFC) D104143 introduced canonical value numbering between regions, which allows for the easy identification of items across a region, eliminating the need in the outliner to create parallel lists of instructions for each region, and replace output values in a less convoluted way. Additionally, in a future commit, the output values will not necessarily be recorded values from the region itself, it could be a combination value where the actual value being output is a PHINode instead. This new method allows us to handle the replacement of the output value to the stored value with the corresponding item in the same place for both normal output values, and PHINode outputs instead of handling the different types of outputs in different locations. Reviewers: paquette, roelofs Differential Revision: https://reviews.llvm.org/D108656	2021-09-08 11:36:05 -07:00
Joseph Huber	6b9a3ec3a2	[OpenMP] Do not SPMDize generic regions with no parallel This patch changes SPMDization to not trigger for regions with no parallelism. Otherwise, this will introduce unnecessary barriers that will slow the single-threaded region down. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D109438	2021-09-08 14:33:15 -04:00
Amara Emerson	c38ab8275e	[GlobalISel] Use a typedef for builder function matchinfos for brevity. NFC.	2021-09-08 11:25:35 -07:00
Nick Desaulniers	4331f19d8b	[ISEL][BitTestBlock] omit additional bit test when default destination is unreachable Otherwise we end up with an extra conditional jump, followed by an unconditional jump off the end of a function. ie. bb.0: BT32rr .. JCC_1 %bb.4 ... bb.1: BT32rr .. JCC_1 %bb.2 ... JMP_1 %bb.3 bb.2: ... bb.3.unreachable: bb.4: ... Should be equivalent to: bb.0: BT32rr .. JCC_1 %bb.4 ... JMP_1 %bb.2 bb.1: bb.2: ... bb.3.unreachable: bb.4: ... This can occur since at the higher level IR (Instruction) SwitchInsts are required to have BBs for default destinations, even when it can be deduced that such BBs are unreachable. For most programs, this isn't an issue, just wasted instructions since the unreachable has been statically proven. The x86_64 Linux kernel when built with CONFIG_LTO_CLANG_THIN=y fails to boot though once D106056 is re-applied. D106056 makes it more likely that correlation-propagation (CVP) can deduce that the default case of SwitchInsts are unreachable. The x86_64 kernel uses a binary post processor called objtool, which emits this warning: vmlinux.o: warning: objtool: cfg80211_edmg_chandef_valid()+0x169: can't find jump dest instruction at .text.cfg80211_edmg_chandef_valid+0x17b I haven't debugged precisely why this causes a failure at boot time, but fixing this very obvious jump off the end of the function fixes the warning and boot problem. Link: https://bugs.llvm.org/show_bug.cgi?id=50080 Fixes: https://github.com/ClangBuiltLinux/linux/issues/679 Fixes: https://github.com/ClangBuiltLinux/linux/issues/1440 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D109103	2021-09-08 11:03:47 -07:00
Kirill Stoimenov	3f875134a7	[asan] Fixed the jump to use the 4 byte offset version. This should have been the 4 byte version in the first place. Unfortunately there is no easy way to add a test as both the 1 byte and 4 byte version are printed as 'jmp' in the assembly code. Reviewed By: kda Differential Revision: https://reviews.llvm.org/D109453	2021-09-08 17:58:12 +00:00
Wouter van Oortmerssen	a99fb86c65	[WebAssembly] Change WebAssemblyMCLowerPrePass to ModulePass It was a FunctionPass before, which subverted its purpose to collect ALL symbols before MCLowering, depending on how LLVM schedules function passes. Fixes https://bugs.llvm.org/show_bug.cgi?id=51555 Differential Revision: https://reviews.llvm.org/D109202	2021-09-08 10:47:43 -07:00
Craig Topper	c00cb52854	[RISCV] Pre-commit tests for D109394. NFC	2021-09-08 10:25:38 -07:00


221127 Commits