Commit Graph

141045 Commits

Jay Foad 6881a82e8c [AMDGPU] Fix scheduling of exp pos4
Also fix a similar issue in SIInsertWaitcnts, but I don't think that fix
has any effect in practice.

Differential Revision: https://reviews.llvm.org/D91290
2020-11-12 19:57:14 +00:00
Jay Foad d7d6ac5624 [AMDGPU] Define and use names for export targets. NFC.
Differential Revision: https://reviews.llvm.org/D91289
2020-11-12 19:57:14 +00:00
Jianzhou Zhao 2d96859ea6 [msan] Break the getShadow loop after matching an argument
Reviewed-by: eugenis

Differential Revision: https://reviews.llvm.org/D91320
2020-11-12 19:48:59 +00:00
Nikita Popov c00545dc32 [BasicAA] Remove checks for GEP decomposition limit reached
The GEP aliasing code currently checks for the GEP decomposition
limit being reached (i.e., we did not reach the "final" underlying
object). As far as I can see, these checks are not necessary. It is
perfectly fine to work with a GEP whose base can still be further
decomposed.

Looking back through the commit history, these checks were originally
introduced in 1a444489e9. However, I
believe that the problem this was intended to address was later
properly fixed with 1726fc698c, and
the checks are no longer necessary since then (and were not the
right fix in the first place).

Differential Revision: https://reviews.llvm.org/D91010
2020-11-12 20:43:38 +01:00
Craig Topper 4cdf1d2110 [MSP430] Remove unused MVT::Glue output from MSP430ISD::SELECT_CC nodes.
Follow-up to a similar patch on RISCV: 637f19c36b

As far as I can see, nothing reads this Glue value. The SDNode def in
the td file does not have the SDNPOutGlue flag, so I don't think this
glue would be properly propagated to MachineSDNodes if it were used.
2020-11-12 10:34:01 -08:00
Jamie Schmeiser 5f672fefeb Reland: Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source
Summary:
Expand the print-memoryssa and print<memoryssa> passes with a new hidden
option -dot-cfg-mssa that names a file. When set, a dot-cfg style file
will be generated into the named file with the memoryssa comments retained
and those blocks containing them shown in light pink. The option does
nothing in isolation.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: asbirlea (Alina Sbirlea), dblaikie (David Blaikie)

Differential Revision: https://reviews.llvm.org/D90638
2020-11-12 17:39:14 +00:00
Alexander Kornienko 76b6cb515b Fix unused variable warning in release builds 2020-11-12 18:14:06 +01:00
Simon Pilgrim f72d350bfb [ValueTracking] Update computeKnownBitsFromShiftOperator callbacks to take KnownBits shift amount. NFCI.
We were creating this internally, but will need to support general KnownBits amounts as part of D90479.
2020-11-12 16:56:55 +00:00
Simon Pilgrim 8996742741 [KnownBits] Add KnownBits::makeConstant helper. NFCI.
Helper for cases where we need to create a KnownBits from a (fully known) constant value.
2020-11-12 16:16:04 +00:00
Anh Tuyen Tran a20b3620bb Revert "Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source"
This reverts commit 45d459e752 due to a
build issue in Polly.
2020-11-12 15:48:14 +00:00
Craig Topper 0add5f9122 [RISCV] Don't include CodeGen layer files in MC layer
- Use MCRegister instead of Register in MC layer.
- Move some enums from RISCVInstrInfo.h to RISCVBaseInfo.h to be with other TSFlags bits.

Differential Revision: https://reviews.llvm.org/D91114
2020-11-12 07:45:38 -08:00
Jamie Schmeiser 45d459e752 Introduce -dot-cfg-mssa option which creates dot-cfg style file with mssa comments included in source
Summary:
Expand the print-memoryssa and print<memoryssa> passes with a new hidden
option -dot-cfg-mssa that names a file. When set, a dot-cfg style file
will be generated into the named file with the memoryssa comments retained
and those blocks containing them shown in light pink. The option does
nothing in isolation.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: asbirlea (Alina Sbirlea), dblaikie (David Blaikie)

Differential Revision: https://reviews.llvm.org/D90638
2020-11-12 15:41:16 +00:00
Craig Topper 9ca02d6fe1 [RISCV] Add an ANDI to shift amount of FSL/FSR instructions
The fshl and fshr intrinsics are defined to take their shift amount modulo the bitwidth of their inputs. The FSR/FSL instructions read one extra bit from the shift amount; if that bit is set, the inputs are swapped. In order to preserve the semantics of the llvm intrinsics we need to make sure that the extra bit isn't set. DAG combine or instcombine may have removed any mask that was originally present.

We could be smarter here and try to use computeKnownBits to check if the bit is known zero, but I wanted to start with correctness.
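For illustration only, a minimal hand-written example (function and value names invented, not from the patch) of the kind of call this affects:

```
declare i32 @llvm.fshl.i32(i32, i32, i32)

define i32 @funnel(i32 %a, i32 %b, i32 %amt) {
  ; fshl only uses %amt modulo 32, but FSL/FSR also read one extra bit and
  ; swap their inputs when it is set; any mask the frontend emitted on %amt
  ; may already have been removed, so the backend now inserts an ANDI to
  ; clear that extra bit before selection.
  %r = call i32 @llvm.fshl.i32(i32 %a, i32 %b, i32 %amt)
  ret i32 %r
}
```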

Differential Revision: https://reviews.llvm.org/D90905
2020-11-12 07:33:40 -08:00
Simon Pilgrim 11c106544b [ValueTracking] Update computeKnownBitsFromShiftOperator callbacks to use KnownBits shift handling. NFCI. 2020-11-12 15:31:26 +00:00
Jamie Schmeiser 782d6a6963 Introduce -print-before-changed, making -print-changed also print before passes that modify IR
Summary:
Add an option -print-before-changed that modifies the print-changed
behaviour so that it prints the IR before a pass that changed it in
addition to printing the IR after the pass. Note that the option
does nothing in isolation. The filtering options work as expected.
Lit tests are included.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: aeubanks (Arthur Eubanks)

Differential Revision: https://reviews.llvm.org/D88757
2020-11-12 15:20:50 +00:00
Jamie Schmeiser f79b483385 [NFC intended] Refactor SinkAndHoistLICMFlags to allow others to construct without exposing internals
Summary:
Refactor SinkAndHoistLICMFlags from a struct to a class with accessors and constructors to allow other
classes to construct flags with meaningful defaults while not exposing LICM internal details.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>

Reviewed By: asbirlea (Alina Sbirlea)

Differential Revision: https://reviews.llvm.org/D90482
2020-11-12 15:06:59 +00:00
David Green 11dee2eae2 [ARM] Ensure CountReg definition dominates InsertPt when creating t2DoLoopStartTP
Of course there was something missing, in this case a check that the def
of the count register we are adding to a t2DoLoopStartTP would dominate
the insertion point.

In the future, when we remove some of these COPY's in between, the
t2DoLoopStartTP will always become the last instruction in the block,
preventing this from happening. In the meantime we need to check they
are created in a sensible order.

Differential Revision: https://reviews.llvm.org/D91287
2020-11-12 13:47:46 +00:00
Alexandre Ganea ec63dfe368 [LLD] Fix include following 45b8a741fb 2020-11-12 08:32:16 -05:00
Alexandre Ganea 45b8a741fb [LLD][COFF] When using LLD-as-a-library, always prevent re-entrance on failures
This is a follow-up for D70378 (Cover usage of LLD as a library).

While debugging an intermittent failure on a bot, I recalled this scenario which
causes the issue:

1. When executing lld/test/ELF/invalid/symtab-sh-info.s L45, we reach
   lld::elf::ObjFile::ObjFile(), which goes straight into its base ELFFileBase(),
   then ELFFileBase::init().
2. At that point fatal() is thrown in lld/ELF/InputFiles.cpp L381, leaving a
   half-initialized ObjFile instance.
3. We then end up in lld::exitLld() and, since we are running with LLD_IN_TEST, we
   happily restore the control flow to CrashRecoveryContext::RunSafely(), then back
   in lld::safeLldMain().
4. Before this patch, we called errorHandler().reset() just after, and this
   attempted to reset the associated SpecificAlloc<ObjFile<ELF64LE>>. That tried
   to free the half-initialized ObjFile instance, and more precisely its
   ObjFile::dwarf member.

Sometimes that worked, sometimes it failed and was caught by the
CrashRecoveryContext. This scenario was the reason we called
errorHandler().reset() through a CrashRecoveryContext.

But in some rare cases, the above repro somehow corrupted the heap, creating a
stack overflow. When the CrashRecoveryContext's filter (that is,
__except (ExceptionFilter(GetExceptionInformation()))) tried to handle the
exception, it crashed again since the stack was exhausted -- and that took the
whole application down. That is the issue seen on the bot. Locally it happens
about 1 time out of 15.

Now this situation can happen anywhere in LLD. Since catching stack overflows is
not a reliable scenario ATM when using CrashRecoveryContext, we're now
preventing further re-entrance when such failures occur, by signaling
lld::SafeReturn::canRunAgain=false. When running with LLD_IN_TEST=2 (or above),
only one iteration will be executed, instead of two.

Differential Revision: https://reviews.llvm.org/D88348
2020-11-12 08:14:43 -05:00
Kazushi (Jam) Marukawa a72d384249 [VE] Change the default type of v64 register class
Change the default type of v64 register class from v512i32 to v256f64.
Add a regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91301
2020-11-12 19:07:07 +09:00
David Sherwood 3225fcf11e [SVE] Deal with SVE tuple call arguments correctly when running out of registers
When passing SVE types as arguments to function calls we can run
out of hardware SVE registers. This is normally fine, since we
switch to an indirect mode where we pass a pointer to a SVE stack
object in a GPR. However, if we switch over part-way through
processing a SVE tuple then part of it will be in registers and
the other part will be on the stack.

I've fixed this by ensuring that:

1. When we don't have enough registers to allocate the whole block
   we mark any remaining SVE registers temporarily as allocated.
2. We temporarily remove the InConsecutiveRegs flags from the last
   tuple part argument and reinvoke the autogenerated calling
   convention handler. Doing this prevents the code from entering
   an infinite recursion and, in combination with 1), ensures we
   switch over to the Indirect mode.
3. After allocating a GPR register for the pointer to the tuple we
   then deallocate any SVE registers we marked as allocated in 1).
   We also set the InConsecutiveRegs flags back how they were before.
4. I've changed the AArch64ISelLowering LowerCALL and
   LowerFormalArguments functions to detect the start of a tuple,
   which involves allocating a single stack object and doing the
   correct number of legal loads and stores.

Differential Revision: https://reviews.llvm.org/D90219
2020-11-12 08:41:50 +00:00
Amara Emerson ad376657c1 [AArch64][GlobalISel] Optimize G_PTR_ADD with a negated offset to be a G_SUB. 2020-11-11 22:46:53 -08:00
Max Kazantsev 2734a9ebf4 [NFC][SCEV] Generalize monotonicity check for full and limited iteration space
A piece of logic of `isLoopInvariantExitCondDuringFirstIterations` is actually
a generalized predicate monotonicity check. This patch moves it into the
corresponding method and generalizes it a bit.

Differential Revision: https://reviews.llvm.org/D90395
Reviewed By: apilipenko
2020-11-12 12:37:07 +07:00
Xun Li 94a45a8098 Revert "[Coroutine] Allocas used by StoreInst does not always escape"
This reverts commit 8bc7b9278e, which landed by accident.
2020-11-11 21:09:39 -08:00
Max Kazantsev d6dd938589 [IndVars] IV user should not prevent use widening
Sometimes an instruction we are trying to widen is used by the IV
(which means the instruction is the IV increment). Currently this may
prevent its widening. We should ignore such a user because it will be
dead once the transform is done anyway.
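For illustration, a made-up loop showing the situation: the increment %iv.next is the instruction being widened, and its use by the IV phi %iv should not block widening, since the narrow IV is dead after the transform.

```
define void @walk(i64 %len, i32* %p) {
entry:
  br label %loop
loop:
  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
  %iv.next = add nuw nsw i32 %iv, 1       ; candidate for widening; used by %iv
  %idx = zext i32 %iv.next to i64         ; the zext widening would eliminate
  %addr = getelementptr i32, i32* %p, i64 %idx
  store i32 0, i32* %addr
  %cmp = icmp ult i64 %idx, %len
  br i1 %cmp, label %loop, label %exit
exit:
  ret void
}
```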

Differential Revision: https://reviews.llvm.org/D90920
Reviewed By: fhahn
2020-11-12 12:02:01 +07:00
Xun Li 8bc7b9278e [Coroutine] Allocas used by StoreInst does not always escape
In the existing logic, for a given alloca, as long as its pointer value is stored into another location, it's considered escaped.
This is a bit too conservative. Specifically, in non-optimized builds, it's common to have patterns of code that first store an alloca somewhere and then load it right back.
These uses should be handled without conservatively marking the alloca as escaped.

This patch tracks how the memory location that an alloca pointer is stored into is used. As long as we only ever load from that location and nothing else, we can still
consider the original alloca as not escaping and keep it on the stack instead of putting it on the frame.
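For illustration, a rough sketch (all names invented, not from the patch) of the unoptimized pattern described above:

```
define void @coro_body() {
entry:
  %x = alloca i32                 ; the alloca being analyzed
  %slot = alloca i32*             ; temporary that holds its address
  store i32* %x, i32** %slot      ; previously this store alone marked %x as escaped
  %p = load i32*, i32** %slot     ; the location is only ever loaded from...
  store i32 0, i32* %p            ; ...so %x can stay on the stack
  ret void
}
```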

Differential Revision: https://reviews.llvm.org/D91305
2020-11-11 20:53:51 -08:00
Max Kazantsev 2e01ceafaa [IndVars] Recognize 'sub nuw' expressed as 'add' for widening
InstCombine canonicalizes 'sub nuw' instructions to 'add' without the
`nuw` flag. The typical case where we see it is decrementing induction
variables. For them, IndVars fails to prove that it's legal to widen them,
and inserts unprofitable `zext`'s.

This patch adds recognition of this pattern using SCEV.
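For illustration, a hand-written sketch (not taken from the patch) of the canonicalized loop shape this recognizes:

```
define i32 @count_down(i32 %n) {
entry:
  br label %loop
loop:
  %iv = phi i32 [ %n, %entry ], [ %iv.next, %loop ]
  ; Was "sub nuw i32 %iv, 1" before InstCombine dropped the nuw and turned
  ; it into an add of -1, which previously blocked proving widening legal.
  %iv.next = add i32 %iv, -1
  %cond = icmp ne i32 %iv.next, 0
  br i1 %cond, label %loop, label %exit
exit:
  ret i32 %iv.next
}
```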

Differential Revision: https://reviews.llvm.org/D89550
Reviewed By: fhahn, skatkov
2020-11-12 10:51:29 +07:00
Arnold Schwaighofer 431337662e [coro] Async coroutines: Allow more than 3 arguments in the dispatch function
We need to be able to call function pointers. Inline the dispatch
function.

Also inline the context projection function.

Transfer debug locations from the suspend point to the inlined functions.

Use the function argument index instead of the function argument in
coro.id.async. This solves any spurious use issues.

Coerce the arguments of the tail call function at a suspend point. The LLVM
optimizer seems to drop casts leading to a vararg intrinsic.

rdar://70097093

Differential Revision: https://reviews.llvm.org/D91098
2020-11-11 15:25:28 -08:00
Arthur Eubanks b6ccff3d5f [NewPM] Provide method to run all pipeline callbacks, used for -O0
Some targets may add required passes via
TargetMachine::registerPassBuilderCallbacks(). We need to run those even
under -O0. As an example, BPFTargetMachine adds
BPFAbstractMemberAccessPass, a required pass.

This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust
usage of the NPM) by allowing us to share added passes like coroutines
and sanitizers between -O0 and other optimization levels.

Since callbacks may end up not adding passes, we need to check if the
pass managers are empty before adding them, so PassManager now has an
isEmpty() function. For example, polly adds callbacks but doesn't always
add passes in those callbacks, so this is necessary to keep
-debug-pass-manager tests' output from changing depending on if polly is
enabled or not.

Tests are a continuation of those added in
https://reviews.llvm.org/D89083.

Reviewed By: asbirlea, Meinersbur

Differential Revision: https://reviews.llvm.org/D89158
2020-11-11 15:10:27 -08:00
Baptiste Saleil 37c4ac8545 [PowerPC] Accumulator/Unprimed Accumulator register copy, spill and restore
This patch adds support for accumulator/unprimed accumulator
register copy, spill and restore for MMA.

Authored By: Baptiste Saleil

Reviewed By: #powerpc, bsaleil, amyk

Differential Revision: https://reviews.llvm.org/D90616
2020-11-11 16:23:45 -06:00
Arthur Eubanks d9cbceb041 [CGSCC][Inliner] Handle new non-trivial edges in updateCGAndAnalysisManagerForPass
Previously the inliner did a bit of a hack by adding ref edges for all
new edges introduced by performing an inline before calling
updateCGAndAnalysisManagerForPass(). This was because
updateCGAndAnalysisManagerForPass() didn't handle new non-trivial call
edges.

This adds handling of non-trivial call edges to
updateCGAndAnalysisManagerForPass().  The inliner called
updateCGAndAnalysisManagerForFunctionPass() since it was handling adding
newly introduced edges (so updateCGAndAnalysisManagerForPass() would
only have to handle promotion), but now it needs to call
updateCGAndAnalysisManagerForCGSCCPass() since
updateCGAndAnalysisManagerForPass() is now handling the new call edges
and function passes cannot add new edges.

We follow the previous path of adding trivial ref edges then letting promotion
handle changing the ref edges to call edges and the CGSCC updates. So
this still does not allow adding call edges that result in an addition
of a non-trivial ref edge.

This is in preparation for better detecting devirtualization. Previously
since the inliner itself would add ref edges,
updateCGAndAnalysisManagerForPass() would think that promotion and thus
devirtualization had happened after any sort of inlining.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D91046
2020-11-11 13:43:49 -08:00
Jessica Paquette 7a70a2f04d [AArch64][GlobalISel] Mark G_FCONSTANT as legal when there is full fp16 support
When there is full fp16 support, there is no reason to widen 16-bit
G_FCONSTANTs to 32 bits. Mark them as legal in this case.

Also, we currently import a pattern for materializing a 16-bit 0.0.
Add a testcase showing we select it.

(All other 16-bit G_FCONSTANTS are not yet selected.)

Differential Revision: https://reviews.llvm.org/D89164
2020-11-11 13:25:11 -08:00
Jianzhou Zhao 0dd87825db Add a flag to control whether to propagate labels from condition values to results
Before this change, DFSan always does the propagation. Without
origin tracking, such flows are harder to understand. After
this change, the flag is off by default.
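For illustration, a tiny invented example of the flow in question: in a select, the label of the condition %cond can be propagated into the label of the result %res, and the new flag controls whether that happens.

```
define i32 @pick(i1 %cond, i32 %a, i32 %b) {
  ; The labels of %a and %b always reach %res; whether the label of %cond
  ; also does is now governed by the (off-by-default) flag.
  %res = select i1 %cond, i32 %a, i32 %b
  ret i32 %res
}
```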

Reviewed-by: morehouse
Differential Revision: https://reviews.llvm.org/D91234
2020-11-11 20:41:42 +00:00
Pavel Iliin fdb979cfbb [NFC] [Legalize] Fix spaces and style. 2020-11-11 19:24:15 +00:00
Akira Hatanaka 5e85d00ed6 Move variable declarations to functions in which they are used. NFC 2020-11-11 10:58:43 -08:00
Vedant Kumar d76e01a6a7 [MachO] Allow the LC_IDENT load command
xnu coredumps include an LC_IDENT load command. It's helpful to be able
to just ignore these. IIUC an interested client can grab the identifier
using the MachOObjectFile::load_commands() API.

The status quo is that llvm bails out when it finds an LC_IDENT because
the command is obsolete (see isLoadCommandObsolete).

Differential Revision: https://reviews.llvm.org/D91221
2020-11-11 10:15:54 -08:00
Craig Topper 637f19c36b [RISCV] Remove traces of Glue from RISCVISD::SELECT_CC
We were creating RISCVISD::SELECT_CC nodes with Glue output that was never being used, and the tablegen SDNode had the SDNPInGlue flag instead of the SDNPOutGlue flag.

Since we don't seem to need the Glue just get rid of it from both places.

Differential Revision: https://reviews.llvm.org/D91199
2020-11-11 09:30:48 -08:00
Jessica Paquette c42053f79b [AArch64][GlobalISel] Select arith extended add/sub in manual selection code
The manual selection code for add/sub was not checking if it was possible to
fold in shifts + extends (the *rx opcode variants).

As a result, we could never select things like

```
cmp x1, w0, uxtw #2
```

Because we don't import any patterns for compares.

This adds support for the arithmetic shifted register forms and updates tests
for instructions selected using `emitADD`, `emitADDS`, and `emitSUBS`.

This is a 0.1% geomean code size improvement on SPECINT2000 at -Os.

Differential Revision: https://reviews.llvm.org/D91207
2020-11-11 09:26:03 -08:00
Jessica Paquette f0580c73bb [AArch64][GlobalISel] Select negative arithmetic immediates in manual selector
Previously, we only handled negative arithmetic immediates in the imported
selector code.

Since we don't import code for, say, compares, we were missing opportunities
for things like

```
%cst:gpr(s64) = G_CONSTANT i64 -10
%cmp:gpr(s32) = G_ICMP intpred(eq), %reg0(s64), %cst
->
%adds = ADDSXri %reg0, 10, 0, implicit-def $nzcv
%cmp = CSINCWr $wzr, $wzr, 1, implicit $nzcv
```

Instead, we would have to materialize the constant and emit a SUBS.

This adds support for selection like above for SUB, SUBS, ADD, and ADDS.

This is a 0.1% geomean code size improvement on SPECINT2000 at -Os.

Differential Revision: https://reviews.llvm.org/D91108
2020-11-11 09:20:05 -08:00
Jay Foad f23c4c6f8a [AMDGPU] Separate out real exp instructions by subtarget. NFC.
Differential Revision: https://reviews.llvm.org/D91247
2020-11-11 17:13:40 +00:00
Jay Foad 2b33ea6935 [AMDGPU] Split exp instructions out into their own tablegen file. NFC.
Differential Revision: https://reviews.llvm.org/D91246
2020-11-11 17:13:40 +00:00
Jay Foad f94fd1c8ca [AMDGPU] Make use of SIInstrInfo::isEXP. NFC. 2020-11-11 17:01:20 +00:00
Alex Richardson fb9942f876 [AsmParser] Add source location to all errors related to .cfi directives
I was trying to add .cfi_ annotations to assembly code in the FreeBSD
kernel and changed a macro that then resulted in incorrectly nested
directives. However, clang's diagnostics said the error was happening at
<unknown>:0. This addresses one of the TODOs added in D51695.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D89787
2020-11-11 17:00:07 +00:00
Hans Wennborg 418f18c6cd Revert "Reland [CFGuard] Add address-taken IAT tables and delay-load support"
This broke both Firefox and Chromium (PR47905) due to what seems like dllimport
functions not being handled correctly.

> This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table.
> Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets.
>
> Reviewed By: rnk
>
> Differential Revision: https://reviews.llvm.org/D87544

This reverts commit cfd8481da1.
2020-11-11 16:03:33 +01:00
Sander de Smalen 30fded75b4 Revert "[LoopVectorizer] NFCI: Calculate register usage based on TLI.getTypeLegalizationCost."
This reverts commits:
* [LoopVectorizer] NFCI: Calculate register usage based on TLI.getTypeLegalizationCost.
  b873aba394.
* [LoopVectorizer] Silence warning in GetRegUsage.
  9ff701100a.
2020-11-11 14:41:55 +00:00
Jay Foad 830ed64ccd Revert "Revert "[AMDGPU] Reorganize GCN subtarget features for unaligned access""
This reverts commit 8b08fa0103.

The underlying problems were fixed by D90607.
2020-11-11 14:40:14 +00:00
Simon Pilgrim f6a326adef [ValueTracking] computeKnownBitsFromShiftOperator - merge zero/one callbacks to single KnownBits callback. NFCI.
Another cleanup for D90479 - handle the Known Ones/Zeros in a single callback, which will make it much easier to jump over to the KnownBits shift handling.
2020-11-11 14:22:42 +00:00
Caroline Concatto 37f4ccb275 [AArch64]Add memory op cost model for SVE
This patch adds/fixes the memory op cost model for SVE with fixed-width
vectors.

Differential Revision: https://reviews.llvm.org/D90950
2020-11-11 12:49:19 +00:00
Simon Pilgrim 1a62ca65c1 [KnownBits] Add KnownBits::commonBits helper. NFCI.
We have a frequent pattern where we're merging two KnownBits to get the common/shared bits, and I just fell for the gotcha where I tried to use the & operator to merge them...
2020-11-11 12:15:54 +00:00
Kerry McLaughlin 170947a5de [SVE][CodeGen] Lower scalable masked scatters
Lowers the llvm.masked.scatter intrinsics (scalar plus vector addressing mode only)

Changes included in this patch:
 - Custom lowering for MSCATTER, which chooses the appropriate scatter store opcode to use.
    Floating-point scatters are cast to integer, with patterns added to match FP reinterpret_casts.
 - Added the getCanonicalIndexType function to convert redundant addressing
   modes (e.g. scaling is redundant when accessing bytes)
 - Tests with 32 & 64-bit scaled & unscaled offsets
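For illustration only, a minimal hand-written IR example (names invented, not from the patch or its tests) of the scalar-plus-vector form this lowers: a scalar base pointer, a vector of offsets, and a predicate.

```
declare void @llvm.masked.scatter.nxv4i32.nxv4p0i32(<vscale x 4 x i32>, <vscale x 4 x i32*>, i32 immarg, <vscale x 4 x i1>)

define void @scatter_i32(<vscale x 4 x i32> %data, i32* %base, <vscale x 4 x i64> %offsets, <vscale x 4 x i1> %pg) {
  ; Scalar base plus a vector of (scaled) offsets yields a vector of addresses.
  %ptrs = getelementptr i32, i32* %base, <vscale x 4 x i64> %offsets
  call void @llvm.masked.scatter.nxv4i32.nxv4p0i32(<vscale x 4 x i32> %data, <vscale x 4 x i32*> %ptrs, i32 4, <vscale x 4 x i1> %pg)
  ret void
}
```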

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D90941
2020-11-11 11:50:22 +00:00