llvm-project

Commit Graph

Author	SHA1	Message	Date
Lang Hames	e7a63df88c	[ORC] Add debugging output for ResourceTracker to be used in JITDylib::define.	2020-11-11 10:58:22 +11:00
Sjoerd Meijer	706ead0e87	[LoopFlatten] Make it a FunctionPass This converts LoopFlatten from a LoopPass to a FunctionPass so that we don't run into problems of a loop pass deleting a (inner)loop. Differential Revision: https://reviews.llvm.org/D90940	2020-11-10 20:03:31 +00:00
David Tenty	ae032e2714	[CMake][ExecutionEngine] add HAVE_(DE)REGISTER_FRAME as config.h macros The macro HAVE_EHTABLE_SUPPORT is used by parts of ExecutionEngine to tell whether __register_frame/__deregister_frame is available to register the FDE for generated (JIT) code. It's currently set by a slowly growing set of macro tests in the respective headers, which is updated now and then when it fails to link on some platform or another due to the symbols being missing (see for example https://bugs.llvm.org/show_bug.cgi?id=5715). This change converts the macro into two HAVE_(DE)REGISTER_FRAME config.h macros (like most of the other HAVE_* macros) and sets them based on whether CMake can actually find a definition for these symbols to link to at configuration time. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D87114	2020-11-10 13:09:44 -05:00
David Green	c7e275388e	[ARM] Don't aggressively unroll vector remainder loops We already do not unroll loops with vector instructions under MVE, but that does not include the remainder loops that the vectorizer produces. These remainder loops will be rarely executed and are not worth unrolling, as the trip count is likely to be low if they get executed at all. Luckily they get llvm.loop.isvectorized to make recognizing them simpler. We have wanted to do this for a while but hit issues with low overhead loops being reverted due to difficult register allocation. With recent changes that seems to be less of an issue now. Differential Revision: https://reviews.llvm.org/D90055	2020-11-10 17:01:31 +00:00
David Green	b2ac9681a7	[ARM] Alter t2DoLoopStart to define lr This changes the definition of t2DoLoopStart from t2DoLoopStart rGPR to GPRlr = t2DoLoopStart rGPR This will hopefully mean that low overhead loops are more tied together, and we can more reliably generate loops without reverting or being at the whims of the register allocator. This is a fairly simple change in itself, but leads to a number of other required alterations. - The hardware loop pass, if UsePhi is set, now generates loops of the form: %start = llvm.start.loop.iterations(%N) loop: %p = phi [%start], [%dec] %dec = llvm.loop.decrement.reg(%p, 1) %c = icmp ne %dec, 0 br %c, loop, exit - For this a new llvm.start.loop.iterations intrinsic was added, identical to llvm.set.loop.iterations but produces a value as seen above, gluing the loop together more through def-use chains. - This new intrinsic conceptually produces the same output as input, which is taught to SCEV so that the checks in MVETailPredication are not affected. - Some minor changes are needed to the ARMLowOverheadLoop pass, but it has been left mostly as before. We should now more reliably be able to tell that the t2DoLoopStart is correct without having to prove it, but t2WhileLoopStart and tail-predicated loops will remain the same. - And all the tests have been updated. There are a lot of them! This patch on its own might cause more trouble than it helps, with more tail-predicated loops being reverted, but some additional patches can hopefully improve upon that to get to something that is better overall. Differential Revision: https://reviews.llvm.org/D89881	2020-11-10 15:57:58 +00:00
Paul C. Anagnostopoulos	467208a492	[IR] [TableGen] Cleanup pass over the IR TableGen files, part 2 This pass cleans up NVVM. Differential Revision: https://reviews.llvm.org/D91097	2020-11-10 09:29:07 -05:00
Sanjay Patel	f7eac51b9b	[CostModel] remove cost-kind predicate for intrinsics in basic TTI implementation This is the last step in removing cost-kind as a consideration in the basic class model for intrinsics. See D89461 for the start of that. Subsequent commits dealt with each of the special-case intrinsics that had customization here in the basic class. This should remove a barrier to retrying D87188 (canonicalization to the abs intrinsic). The ARM and x86 cost diffs seen here may be wrong because the target-specific overrides have their own bugs, but we hope this is less wrong - if something has a significant throughput cost, then it should have a significant size / blended cost too by default. The only behavioral diff in current regression tests is shown in the x86 scatter-gather test (which is misplaced or broken because it runs the entire -O3 pipeline) - we unrolled less, and we assume that is an improvement. Differential Revision: https://reviews.llvm.org/D90554	2020-11-10 08:19:31 -05:00
Mirko Brkusanin	a75d6178b8	[GlobalISel] Add combine for (x | mask) -> x when (x | mask) == x If we have a mask, and a value x, where (x | mask) == x, we can drop the OR and just use x. Differential Revision: https://reviews.llvm.org/D90952	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	fb36ab0a42	[GlobalISel] Expand combine for (x & mask) -> x when (x & mask) == x We can use KnownBitsAnalysis to cover cases when mask is not trivial. It can also help with cases when mask is not constant but can still be folded into one. Since 'and' is commutative we should treat both operands as possible replacements. Differential Revision: https://reviews.llvm.org/D90674	2020-11-10 11:32:13 +01:00
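The two mask combines above can be illustrated with a minimal sketch. It assumes a simplified known-bits record with two 32-bit masks; `KnownBitsSketch`, `andIsRedundant` and `orIsRedundant` are hypothetical names for illustration, not the GlobalISel API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified known-bits record (not LLVM's KnownBits class):
// a bit set in Zero is known to be 0, a bit set in One is known to be 1.
struct KnownBitsSketch {
  uint32_t Zero;
  uint32_t One;
};

// (x & mask) == x holds when every bit is either known 0 in x or
// known 1 in mask, so the AND cannot clear anything.
bool andIsRedundant(KnownBitsSketch X, KnownBitsSketch Mask) {
  assert((X.Zero & X.One) == 0 && (Mask.Zero & Mask.One) == 0);
  uint32_t MaskMaybeZero = ~Mask.One;
  uint32_t XMaybeOne = ~X.Zero;
  return (MaskMaybeZero & XMaybeOne) == 0;
}

// (x | mask) == x holds when every bit is either known 1 in x or
// known 0 in mask, so the OR cannot set anything.
bool orIsRedundant(KnownBitsSketch X, KnownBitsSketch Mask) {
  assert((X.Zero & X.One) == 0 && (Mask.Zero & Mask.One) == 0);
  uint32_t MaskMaybeOne = ~Mask.Zero;
  uint32_t XMaybeZero = ~X.One;
  return (MaskMaybeOne & XMaybeZero) == 0;
}
```

For example, if the upper 24 bits of x are known to be zero, an AND with the constant 0xFF is redundant, while an AND with 0x0F is not.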
Mirko Brkusanin	53ae95c946	[AMDGPU][GlobalISel] Combine shift + logic + shift with constant operands This sequence of instructions can be simplified if they are single use and some operands are constants. Additional combines may be applied afterwards. Differential Revision: https://reviews.llvm.org/D90223	2020-11-10 11:32:13 +01:00
Mirko Brkusanin	de719586a8	[AMDGPU][GlobalISel] Fold a chain of two shift instructions with constant operands Sequence of same shift instructions with constant operands can be combined into a single shift instruction. Differential Revision: https://reviews.llvm.org/D90217	2020-11-10 11:32:12 +01:00
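As a rough illustration of folding two shifts with constant amounts, the sketch below handles only left shifts on a 32-bit value. `foldShl` is a hypothetical helper; the actual combine operates on machine instructions and may treat other shift kinds differently.

```cpp
#include <cassert>
#include <cstdint>

// Computes (X << A) << B as a single shift by A + B.
// Assumption: both original shift amounts are in range for 32 bits.
uint32_t foldShl(uint32_t X, unsigned A, unsigned B) {
  assert(A < 32 && B < 32 && "each original shift must be in range");
  unsigned Sum = A + B;
  // Once the combined amount reaches the bit width, every bit has been
  // shifted out. A single C++ shift by >= 32 would be undefined, so the
  // zero result is returned explicitly.
  return Sum >= 32 ? 0u : X << Sum;
}
```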
Max Kazantsev	25755a0159	[NFC] Add flag to disable IV widening in indvar instance This allows us to have control over IV widening in the pipeline.	2020-11-10 15:10:44 +07:00
Max Kazantsev	6022a8b7e8	[SCEV] Drop cached ranges of AddRecs after flag update Our range computation methods benefit from no-wrap flags. But if the ranges were first computed before the flags were set, the cached range will be too pessimistic. We need to drop cached ranges whenever we sharpen AddRec's no wrap flags. Differential Revision: https://reviews.llvm.org/D89847 Reviewed By: fhahn	2020-11-10 12:37:12 +07:00
Arthur Eubanks	1cbf8e89b5	[NewPM] Port -separate-const-offset-from-gep Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D91095	2020-11-09 17:42:36 -08:00
Michael Kruse	e5dba2d7e5	[OMPIRBuilder] Start 'Create' methods with lower case. NFC. For consistency with the IRBuilder, OpenMPIRBuilder has method names starting with 'Create'. However, the LLVM coding style has method names starting with lower case letters, as all other OpenMPIRBuilder methods already do. The clang-tidy configuration used by Phabricator also warns about the naming violation, adding noise to the reviews. This patch renames all `OpenMPIRBuilder::CreateXYZ` methods to `OpenMPIRBuilder::createXYZ`, and updates all in-tree callers. I tested check-llvm, check-clang, check-mlir and check-flang to ensure that I did not miss a caller. Reviewed By: mehdi_amini, fghanim Differential Revision: https://reviews.llvm.org/D91109	2020-11-09 19:35:11 -06:00
Kazu Hirata	2f1038c7b6	[BranchProbabilityInfo] Use SmallVector (NFC) This patch simplifies BranchProbabilityInfo by changing the type of Probs. Without this patch: DenseMap<Edge, BranchProbability> Probs maps an ordered pair of a BasicBlock* and a successor index to an edge probability. With this patch: DenseMap<const BasicBlock *, SmallVector<BranchProbability, 2>> Probs maps a BasicBlock to a vector of edge probabilities. BranchProbabilityInfo has a property that for a given basic block, we either have edge probabilities for all successors or do not have any edge probability at all. This property combined with the current map type leads to a somewhat complicated algorithm in eraseBlock to erase map entries one by one while increasing the successor index. The new map type allows us to remove all the edge probabilities for a given basic block in a more intuitive manner, namely: Probs.erase(BB); Differential Revision: https://reviews.llvm.org/D91017	2020-11-09 17:29:40 -08:00
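The effect of the new map shape can be sketched with standard containers. This is only an analogy: `BlockSketch`, `ProbsSketch` and the free functions are hypothetical stand-ins, with `std::unordered_map` and `std::vector<double>` in place of `DenseMap`, `SmallVector` and `BranchProbability`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block with a fixed successor count.
struct BlockSketch {
  std::size_t NumSuccessors;
};

// One entry per block, holding the probabilities of all outgoing edges,
// indexed by successor number.
using ProbsSketch =
    std::unordered_map<const BlockSketch *, std::vector<double>>;

// Probabilities are set for all successors of a block at once.
void setEdgeProbabilities(ProbsSketch &Probs, const BlockSketch *BB,
                          std::vector<double> P) {
  assert(P.size() == BB->NumSuccessors && "one probability per successor");
  Probs[BB] = std::move(P);
}

// Dropping a block is a single erase instead of a walk over
// (block, successor index) keys.
void eraseBlock(ProbsSketch &Probs, const BlockSketch *BB) {
  Probs.erase(BB);
}
```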
Josh Stone	4463b73e79	Enable opt-bisect for the new pass manager This instruments a should-run-optional-pass callback using the existing OptBisect class to decide if new passes should be skipped. Passes that force isRequired never reach this at all, so they are not included in "BISECT:" output nor its pass count. The test case is resurrected from r267022, an early version of D19172 that had new pass manager support (later reverted and redone without). Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D87951	2020-11-09 15:57:48 -08:00
Jan Svoboda	dbfa69c502	Port some floating point options to new option marshalling infrastructure This ports a number of OpenCL and fast-math flags for floating point over to the new marshalling infrastructure. As part of this, `Opt{In,Out}FFlag` were enhanced to allow other flags to imply them, via `DefaultAnyOf<>`. For example: ``` defm signed_zeros : OptOutFFlag<"signed-zeros", ..., "LangOpts->NoSignedZero", DefaultAnyOf<[cl_no_signed_zeros, menable_unsafe_fp_math]>>; ``` defines `-fsigned-zeros` (`false`) and `-fno-signed-zeros` (`true`) linked to the keypath `LangOpts->NoSignedZero`, defaulting to `false`, but set to `true` implicitly if one of `-cl-no-signed-zeros` or `-menable-unsafe-fp-math` is on. Note that the initial patch was written by Daniel Grumberg. Differential Revision: https://reviews.llvm.org/D82756	2020-11-09 18:00:10 -05:00
Michael Kruse	f44ee0f5e7	[OpenMPIRBuilder] Implement CreateCanonicalLoop. CreateCanonicalLoop generates a standardized control flow structure for OpenMP canonical for loops. The structure can be consumed by loop-associated directives such as worksharing-loop, distribute, simd etc. as well as loop transformations such as tile and unroll. This is a first design without considering all complexities yet. The control-flow emits more basic blocks than strictly necessary, but these will be optimized by CFGSimplify anyway, provide a nice separation of concerns and might later be useful with more complex scenarios. I successfully implemented a basic tile construct using this API, which is not part of this patch. The fundamental building block is the CreateCanonicalLoop that only takes the loop trip count and operates on the logical iteration spaces only. An overloaded CreateCanonicalLoop for using LB, UB, Increment is provided as well, but at least for C++, Clang will need to implement a loop counter to logical induction variable mapping anyway, since iterator overload resolution cannot be done in LLVMFrontend. As there currently is no user for CreateCanonicalLoop, it is only called from unittests. Similarly, CanonicalLoopInfo::eraseFromParent() is used in my tile implementation and might be generally useful for implementing loop-associated constructs, but is not used in this patch itself. The following non-exhaustive list describes not yet covered items: * collapse clause (including non-rectangular and non-perfectly nested); idea is to provide an OpenMPIRBuilder::collapseLoopNest method consuming multiple nested loops and returning a new CanonicalLoopInfo that can be used for loop-associated directives. * similarly: ordered clause for DOACROSS loops * branch weights * Cancellation point (?) * AllocaIP * break statement (if needed at all) * Exceptions (if not completely handled in the front-end) * Using it in Clang; this requires implementing at least one loop-associated construct. * ... Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90830	2020-11-09 15:03:32 -06:00
David Zarzycki	a41ea782c8	[SelectionDAG] Enable CTPOP optimization fine tuning Add a TLI hook to allow SelectionDAG to fine tune the conversion of CTPOP to a chain of "x & (x - 1)" when CTPOP isn't legal. A subsequent patch will attempt to fine tune the X86 code gen. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89952	2020-11-09 13:49:01 -05:00
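The "x & (x - 1)" chain mentioned above clears the lowest set bit on each step. A minimal sketch, assuming the comparison being lowered has the form "population count of x is at most N" (the commit message does not spell out which comparisons are covered, and `hasAtMostNBitsSet` is a hypothetical name):

```cpp
#include <cassert>
#include <cstdint>

// Returns true when X has at most N set bits, without a popcount
// instruction: each X & (X - 1) step clears the lowest set bit, and
// zero stays zero once reached.
bool hasAtMostNBitsSet(uint32_t X, unsigned N) {
  assert(N <= 32 && "a 32-bit value has at most 32 set bits");
  for (unsigned I = 0; I < N; ++I)
    X &= X - 1;
  return X == 0;
}
```

The chain length grows with N, which is presumably why a target hook for tuning the conversion is useful.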
jasonliu	42d2109380	[XCOFF] Enable explicit sections on AIX Implement mechanism to allow explicit sections to be generated on AIX. Reviewed By: DiggerLin Differential Revision: https://reviews.llvm.org/D88615	2020-11-09 16:27:38 +00:00
Paul C. Anagnostopoulos	91d2e5c81a	[TableGen] Add the !filter bang operator. Add a test. Update the Programmer's Reference. Use it in some TableGen files. Differential Revision: https://reviews.llvm.org/D91008	2020-11-09 10:56:55 -05:00
Sebastian Neubauer	a022b1ccd8	[AMDGPU] Add amdgpu_gfx calling convention Add a calling convention called amdgpu_gfx for real function calls within graphics shaders. For the moment, this uses the same calling convention as other calls in amdgpu, with registers excluded for return address, stack pointer and stack buffer descriptor. Differential Revision: https://reviews.llvm.org/D88540	2020-11-09 16:51:44 +01:00
Lucas Prates	c2c2cc1360	[ARM][AArch64] Adding Neoverse V1 CPU support Add support for the Neoverse V1 CPU to the ARM and AArch64 backends. This is based on patches from Mark Murray and Victor Campos. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D90765	2020-11-09 13:15:40 +00:00
Nathan James	5918ef8b1a	[clangd] Handle duplicate enum constants in PopulateSwitch tweak If an enum has different names for the same constant, make sure only the first one declared gets added into the switch. Failing to do so results in a compiler error as 2 case labels can't represent the same value. ``` lang=c enum Numbers{ One, Un = One, Two, Deux = Two, Three, Trois = Three }; // Old behaviour switch (<Number>) { case One: case Un: case Two: case Deux: case Three: case Trois: break; } // New behaviour switch (<Number>) { case One: case Two: case Three: break; } ``` Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D90555	2020-11-09 12:14:53 +00:00
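The "first declared name wins" rule described above can be sketched as a small deduplication pass. `caseLabelsFor` and its pair-based input are hypothetical; the real tweak works on Clang AST nodes rather than strings.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Given enumerators in declaration order as (name, value) pairs, keep
// only the first name declared for each distinct value.
std::vector<std::string> caseLabelsFor(
    const std::vector<std::pair<std::string, int>> &Enumerators) {
  std::set<int> SeenValues;
  std::vector<std::string> Labels;
  for (const auto &E : Enumerators) {
    assert(!E.first.empty() && "enumerators must be named");
    // insert() reports whether the value was new; aliases are skipped.
    if (SeenValues.insert(E.second).second)
      Labels.push_back(E.first);
  }
  return Labels;
}
```

With the `Numbers` enum from the commit message as input, this keeps `One`, `Two` and `Three` and drops the aliases.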
Georgii Rymar	a7a447be0f	[yaml2obj] - ProgramHeaders: introduce FirstSec/LastSec instead of Sections list. Imagine we have a YAML declaration of a few sections: `foo1`, `<unnamed 2>`, `foo3`, `foo4`. To put them into a segment we can do (1): ``` Sections: - Section: foo1 - Section: foo4 ``` or we can use (2): ``` Sections: - Section: foo1 - Section: foo3 - Section: foo4 ``` or (3) : ``` Sections: - Section: foo1 ## "(index 2)" here is a name that we automatically created for an unnamed section. - Section: (index 2) - Section: foo3 - Section: foo4 ``` It looks really confusing that we don't have to list all of the sections. At first I tried to make this rule stricter and report an error when there is a gap (i.e. when a section is included into a segment, but not listed explicitly). This did not work perfectly, because such an approach conflicts with unnamed sections/fills (see (3)). This patch drops the "Sections" key and introduces 2 keys instead: `FirstSec` and `LastSec`. Both are optional. Differential revision: https://reviews.llvm.org/D90458	2020-11-09 13:00:50 +03:00
Georgii Rymar	99a6401acc	Recommit: [llvm-readelf/obj] - Allow dumping of ELF header even if some elements are corrupt. This is a recommit of D90903 with fixes for BB: 1) Used std::move<> when returning Expected<> (http://lab.llvm.org:8011/#/builders/112/builds/913) 2) Fixed the name of the temporary file in file-headers.test (http://lab.llvm.org:8011/#/builders/36/builds/1269) (a local old temporary file was used before) For creating `ELFObjectFile` instances we have the factory method `ELFObjectFile<ELFT>::create(MemoryBufferRef Object)`. The problem with this method is that it scans the section header to locate some sections. When a file is truncated or has broken fields in the ELF header, this approach does not allow us to create the `ELFObjectFile` and dump the ELF header. This is https://bugs.llvm.org/show_bug.cgi?id=40804 This patch suggests a solution - it allows delaying the scanning of sections in `ELFObjectFile<ELFT>::create`. It now allows user code to call an object initialization (`initContent()`) later. With that it is possible, for example, for dumpers just to dump the file header and exit. By default initialization is still performed as before, which helps to keep the logic of existing callers untouched. I've experimented with different approaches while working on this patch. I think this approach is better than doing initialization of sections (i.e. scan of them) on demand, because normally users of the `ELFObjectFile` API expect to work with a valid object. In most cases when a section header table can't be read (because of an error), we don't have to continue to work with the object. So we probably don't need to implement a more complex API. Differential revision: https://reviews.llvm.org/D90903	2020-11-09 12:53:53 +03:00
Georgii Rymar	f59216b58f	Revert "[llvm-readelf/obj] - Allow dumping of ELF header even if some elements are corrupt." This reverts commit `ea8a0b8b29`. It broke BBots. http://lab.llvm.org:8011/#/builders/14/builds/1439 http://lab.llvm.org:8011/#/builders/112/builds/913	2020-11-09 11:50:50 +03:00
Georgii Rymar	ea8a0b8b29	[llvm-readelf/obj] - Allow dumping of ELF header even if some elements are corrupt. For creating `ELFObjectFile` instances we have the factory method `ELFObjectFile<ELFT>::create(MemoryBufferRef Object)`. The problem with this method is that it scans the section header to locate some sections. When a file is truncated or has broken fields in the ELF header, this approach does not allow us to create the `ELFObjectFile` and dump the ELF header. This is https://bugs.llvm.org/show_bug.cgi?id=40804 This patch suggests a solution - it allows delaying the scanning of sections in `ELFObjectFile<ELFT>::create`. It now allows user code to call an object initialization (`initContent()`) later. With that it is possible, for example, for dumpers just to dump the file header and exit. By default initialization is still performed as before, which helps to keep the logic of existing callers untouched. I've experimented with different approaches while working on this patch. I think this approach is better than doing initialization of sections (i.e. scan of them) on demand, because normally users of the `ELFObjectFile` API expect to work with a valid object. In most cases when a section header table can't be read (because of an error), we don't have to continue to work with the object. So we probably don't need to implement a more complex API. Differential revision: https://reviews.llvm.org/D90903	2020-11-09 11:27:07 +03:00
Georgii Rymar	c9d036ad4a	[yaml2obj] - Implement BBAddrMapSection::getEntries(). NFC. This allows us to use the generic field validation mechanism that we have. The behavior (i.e. an error reported) remains the same.	2020-11-09 11:11:57 +03:00
Paul C. Anagnostopoulos	2af0edefd6	[IR] [TableGen] Cleanup pass over the IR TableGen files. This patch includes intrinsics for AMDGPU. Differential Revision: https://reviews.llvm.org/D90946	2020-11-08 14:46:53 -05:00
Simon Pilgrim	9fd7710497	[KnownBits] isNonZero() - avoid expensive countPopulation call. NFC. We can just check for a null value.	2020-11-08 12:58:30 +00:00
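The cheaper check can be sketched on a simplified known-bits record: a value is provably non-zero as soon as any bit is known to be one, so comparing the known-one mask against zero is enough and no population count is needed. `KnownBitsSketch` is a hypothetical 32-bit stand-in for the real class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified known-bits record: a bit set in Zero is
// known to be 0, a bit set in One is known to be 1.
struct KnownBitsSketch {
  uint32_t Zero;
  uint32_t One;
};

// Any known-one bit proves the value is non-zero; counting how many
// such bits exist is unnecessary.
bool isNonZero(KnownBitsSketch K) {
  assert((K.Zero & K.One) == 0 && "conflicting known bits");
  return K.One != 0;
}
```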
Pedro Tammela	5e8ecff0d8	[Reg2Mem] add support for the new pass manager This patch refactors the pass to accommodate the new pass manager boilerplate. Differential Revision: https://reviews.llvm.org/D91005	2020-11-08 11:14:05 +00:00
Arthur Eubanks	226e179f74	Revert "[NewPM] Provide method to run all pipeline callbacks, used for -O0" This reverts commit `ae38540042`. As well as some follow-up test fixes. The original change causes new-pass-manager.ll to fail when polly is enabled.	2020-11-08 00:32:35 -08:00
Kazu Hirata	c95fff5be7	[JumpThreading] Fix function names (NFC)	2020-11-07 19:35:03 -08:00
Nikita Popov	4b860240a6	[BasicAA] Unify struct/other offset (NFC) The distinction between StructOffset and OtherOffset has been originally introduced by `82069c44ca`, which applied different reasoning to both offset kinds. However, this distinction was not actually correct, and has been fixed by `c84e77aeae`. Since then, we only ever consider the sum StructOffset + OtherOffset, so we may as well store it in that form directly.	2020-11-07 18:56:05 +01:00
Jonas Devlieghere	5d3332bc3c	[DWARFLinker] Use union to reduce sizeof(WorklistItem) (NFC) Reduce the size of the WorklistItem struct by using a union.	2020-11-06 23:24:41 -08:00
Atmn Patel	04a0896487	Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops" This reverts commit `0b17c6e447`. This patch causes a compile-time error in SCEV.	2020-11-07 00:32:12 -05:00
Jonas Devlieghere	3897137598	[DWARFLinker] Add CompileUnit::getInfo helper that takes a DWARFDie (NFC) Eliminate the need to go through the DIE index by passing the DIE to CompileUnit::getInfo directly. Before: unsigned Idx = Unit->getOrigUnit().getDIEIndex(Die); CompileUnit::DIEInfo &Info = Unit->getInfo(Idx); After: CompileUnit::DIEInfo &Info = Unit->getInfo(Die);	2020-11-06 19:37:44 -08:00
Atmn Patel	0b17c6e447	[LoopDeletion] Allows deletion of possibly infinite side-effect free loops From C11 and C++11 onwards, a forward-progress requirement has been introduced for both languages. In the case of C, loops with non-constant conditionals that do not have any observable side-effects (as defined by 6.8.5p6) can be assumed by the implementation to terminate, and in the case of C++, this assumption extends to all functions. The clang frontend will emit the `mustprogress` function attribute for C++ functions (D86233, D85393, D86841) and emit the loop metadata `llvm.loop.mustprogress` for every loop in C11 or later that has a non-constant conditional. This patch modifies LoopDeletion so that only loops with the `llvm.loop.mustprogress` metadata or loops contained in functions that are required to make progress (`mustprogress` or `willreturn`) are checked for observable side-effects. If these loops do not have an observable side-effect, then we delete them. Loops without observable side-effects that do not satisfy the above conditions will not be deleted. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86844	2020-11-06 22:06:58 -05:00
Atmn Patel	46a29e9c6e	[Inliner] Handle `mustprogress` functions When inlining `mustprogress` functions, if the caller or the callee has the attribute, we drop the function attribute. The loops that have the `llvm.loop.mustprogress` metadata keep their metadata. We do not need to add new loop metadata to inlined functions because the patch in D86841 already adds the relevant loop metadata in all of the necessary places. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D87262	2020-11-06 20:03:46 -05:00
Atmn Patel	babc224c5d	[LoopDeletion] Remove dead loops with no exit blocks Currently, LoopDeletion refuses to remove dead loops with no exit blocks because it cannot statically determine the control flow after it removes the block. This leads to miscompiles if the loop is an infinite loop and should've been removed. Differential Revision: https://reviews.llvm.org/D90115	2020-11-06 17:08:34 -05:00
Rahman Lavaee	82e7c4ce45	[obj2yaml] [yaml2obj] Add yaml support for SHT_LLVM_BB_ADDR_MAP section. YAML support allows us to better test the feature in the subsequent patches. The implementation is quite similar to the .stack_sizes section. Reviewed By: jhenderson, grimar Differential Revision: https://reviews.llvm.org/D88717	2020-11-06 12:44:42 -08:00
Simon Pilgrim	20f87d82ed	[InstCombine] computeKnownBitsMul - use KnownBits::isNonZero() helper. Avoid an expensive isKnownNonZero() call - this is a small cleanup before moving the extra NSW functionality from computeKnownBitsMul into KnownBits::computeForMul.	2020-11-06 17:27:13 +00:00
Kevin P. Neal	2069403cdf	[FPEnv] Use strictfp metadata in casting nodes The strictfp metadata was added to the casting AST nodes in D85960, but we aren't using that metadata yet. This patch adds that support. In order to avoid lots of ad-hoc passing around of the strictfp bits I updated the IRBuilder when moving from a function that has the Expr* to a function that lacks it. I believe we should switch to this pattern to keep the strictfp support from being overly invasive. For the purpose of testing that we're picking up the right metadata, I also made my tests use a pragma to make the AST's strictfp metadata not match the global strictfp metadata. This exposes issues that we need to deal with in subsequent patches, and I believe this is the right method for most all of our clang strictfp tests. Differential Revision: https://reviews.llvm.org/D88913	2020-11-06 11:56:12 -05:00
Arnold Schwaighofer	c6543cc6b8	llvm.coro.id.async lowering: Parameterize how to restore the current continuation's context and restart the pipeline after splitting The `llvm.coro.suspend.async` intrinsic takes a function pointer as its argument that describes how to restore the current continuation's context from the context argument of the continuation function. Before, we assumed that the current context can be restored by loading from the context argument's first pointer field (`first_arg->caller_context`). This allows for defining suspension points that reuse the current context, for example. Also: llvm.coro.id.async lowering: Add llvm.coro.prepare.async intrinsic Blocks inlining until after the async coroutine was split. Also, change the async function pointer's context size position struct async_function_pointer { uint32_t relative_function_pointer_to_async_impl; uint32_t context_size; } And make the position of the `async context` argument configurable. The position is specified by the `llvm.coro.id.async` intrinsic. rdar://70097093 Differential Revision: https://reviews.llvm.org/D90783	2020-11-06 06:22:46 -08:00
Sander de Smalen	5ee9ef8519	[TypeSize] Extend UnivariateLinearPolyBase with getWithIncrement/Decrement methods This patch adds getWithIncrement/getWithDecrement methods to ElementCount and TypeSize to allow: TypeSize::getFixed(8).getWithIncrement(8) <=> TypeSize::getFixed(16) TypeSize::getFixed(16).getWithDecrement(8) <=> TypeSize::getFixed(8) TypeSize::getScalable(8).getWithIncrement(8) <=> TypeSize::getScalable(16) TypeSize::getScalable(16).getWithDecrement(8) <=> TypeSize::getScalable(8) This patch implements parts of the POC in D90342. Reviewed By: ctetreau, dmgreen Differential Revision: https://reviews.llvm.org/D90713	2020-11-06 09:01:19 +00:00
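The equivalences listed above can be reproduced with a small stand-in type. `SizeSketch` is hypothetical and far simpler than `TypeSize` or `UnivariateLinearPolyBase`: it only tracks a minimum value and a scalable flag, and the increment or decrement applies to the minimum value while the flag is preserved.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a fixed-or-scalable size.
struct SizeSketch {
  uint64_t MinValue;
  bool Scalable;

  static SizeSketch getFixed(uint64_t V) { return {V, false}; }
  static SizeSketch getScalable(uint64_t V) { return {V, true}; }

  // Adjust the minimum value, keeping the fixed/scalable kind unchanged.
  SizeSketch getWithIncrement(uint64_t Inc) const {
    return {MinValue + Inc, Scalable};
  }
  SizeSketch getWithDecrement(uint64_t Dec) const {
    assert(Dec <= MinValue && "decrement would underflow");
    return {MinValue - Dec, Scalable};
  }

  bool operator==(const SizeSketch &Other) const {
    return MinValue == Other.MinValue && Scalable == Other.Scalable;
  }
};
```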
Roman Lebedev	8d0fdd36a3	[IR] CmpInst: Add getFlippedSignednessPredicate() And refactor a few places to use it	2020-11-06 11:31:09 +03:00
Roman Lebedev	d4f70d6454	[IR] CmpInst: add isRelational() Since there's CmpInst::isEquality(), it only makes sense to have its inverse for consistency.	2020-11-06 11:31:09 +03:00
Roman Lebedev	c7c702a272	[IR] CmpInst: add isEquality(Pred) Currently there is only a member version of isEquality(), which requires an actual [IF]CmpInst to be available, which isn't always possible, and is inconsistent with the general pattern here. I wanted to use it in a new patch, but it wasn't there..	2020-11-06 11:31:09 +03:00
Roman Lebedev	a5ae3edaa3	[IR] CmpInst: add getUnsignedPredicate() There's already getSignedPredicate(), it is not symmetrical to not have its opposite. I wanted to use it in new code, but it wasn't there..	2020-11-06 11:31:08 +03:00
Yevgeny Rouban	681d6c711f	[BranchProbabilityInfo] Introduce method copyEdgeProbabilities(). NFC A new method is introduced to allow bulk copy of outgoing edge probabilities from one block to another. This can be useful when a block is cloned from another one and we do not know if there are edge probabilities set for the original block or not. Copying outside of the BranchProbabilityInfo class makes the user unconditionally set the cloned block's edge probabilities even if they are unset for the original block. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D90839	2020-11-06 14:52:35 +07:00
Yevgeny Rouban	e38c8e7590	[BranchProbabilityInfo] Remove block handles in eraseBlock() BranchProbabilityInfo::eraseBlock() is a public method and can be called without deleting the block itself. This method is made to remove the corresponding tracking handle from BranchProbabilityInfo::Handles along with the probabilities of the block. Handles.erase() call is moved to eraseBlock(). In setEdgeProbability() we need to add the block handle only once. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D90838	2020-11-06 13:13:58 +07:00
Yevgeny Rouban	4931158d27	[BranchProbabilityInfo] Get rid of MaxSuccIdx. NFC This refactoring allows us to eliminate the MaxSuccIdx map proposed in the commit `a7b662d0`. The idea is to remove probabilities for a block BB for all its successors one by one from first, second, ... till N-th until they are defined in Probs. This works because probabilities for the block are set at once for all its successors from number 0 to N-1 and the rest are removed if there were stale probs. The protected method setEdgeProbability(), which sets probabilities for an individual successor, is removed. This makes it clear that the probabilities are set in bulk by the public method with the same name. Reviewed By: kazu, MaskRay Differential Revision: https://reviews.llvm.org/D90837	2020-11-06 12:21:24 +07:00
Valentin Clement	9914a8737f	[flang][openacc] Add parsing tests and semantic check for set directive This patch adds some parsing and clause validity tests for the set directive. It makes use of the possibility introduced in patch D90770 to check the restriction where one of the default_async, device_num and device_type clauses is required, but each may appear no more than once on the set directive. Reviewed By: sameeranjoshi Differential Revision: https://reviews.llvm.org/D90771	2020-11-05 22:57:58 -05:00
Sean Silva	e9e2e3107d	[STLExtras] Add append_range helper. This is convenient in a lot of cases, such as when the thing you want to append is `someReallyLongFunctionName()` that you'd rather not write twice or assign to a variable for the paired begin/end calls. Differential Revision: https://reviews.llvm.org/D90894	2020-11-05 16:20:02 -08:00
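A minimal sketch of what such a helper might look like, assuming it simply forwards to the container's range `insert`. `appendRangeSketch` is a hypothetical name and is not claimed to match the STLExtras signature; `someReallyLongFunctionName` is a stand-in for the call mentioned above.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Appends every element of R to the end of C, so the caller names the
// source range only once.
template <typename Container, typename Range>
void appendRangeSketch(Container &C, Range &&R) {
  // Inserting a container into itself would invalidate the iterators
  // being read from, so that case is rejected.
  assert(static_cast<const void *>(&C) != static_cast<const void *>(&R) &&
         "cannot append a container to itself");
  C.insert(C.end(), std::begin(R), std::end(R));
}

// Stand-in for an expression that is awkward to write twice.
std::vector<int> someReallyLongFunctionName() { return {3, 4, 5}; }
```

Without the helper, the caller would either write the call twice for the begin/end pair or bind the result to a named temporary first.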
Michael Liao	23c6d1501d	[amdgpu] Add `llvm.amdgcn.endpgm` support. - `llvm.amdgcn.endpgm` is added to enable "abort" support. Differential Revision: https://reviews.llvm.org/D90809	2020-11-05 19:06:50 -05:00
Valentin Clement	a8a10acba2	[openacc][openmp] Allow duplicate between required and allowed once/exclusive The validity checks introduced in D90241 are a bit too restrictive and this patch proposes to loosen them a bit. Duplicate clauses are now checked only between the three allowed lists and between the requiredClauses and allowedClauses lists. This makes it possible to enable checks where a clause can be required but also appear only once on the directive. We found this kind of restriction useful on the set directive in OpenACC, for example. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D90770	2020-11-05 16:21:26 -05:00
Momchil Velikov	35d6251254	Add default value for MachineInstr::modifiesRegister. NFC. Looks accidentally omitted, it's present on `readsRegister`, `definesRegister` and few others. Differential Revision: https://reviews.llvm.org/D89625	2020-11-05 18:50:19 +00:00
Sjoerd Meijer	7eb70158e4	[IndVarSimplify][SimplifyIndVar] Move WidenIV to Utils/SimplifyIndVar. NFCI. This moves WidenIV from IndVarSimplify to Utils/SimplifyIndVar so that we have createWideIV available as a generic helper utility. I.e., this is not only useful in IndVarSimplify, but could be useful for loop transformations. For example, motivation for this refactoring is the loop flatten transformation: if induction variables in a loop nest can be widened, we can avoid having to perform certain overflow checks, enabling this transformation. Differential Revision: https://reviews.llvm.org/D90421	2020-11-05 16:52:47 +00:00
Simon Pilgrim	6729b6de1f	[KnownBits] Move ValueTracking SREM KnownBits handling to KnownBits::srem. NFCI. Move the ValueTracking implementation to KnownBits, the SelectionDAG version is more limited so I'm intending to replace that as a separate commit.	2020-11-05 14:58:33 +00:00
Simon Pilgrim	e237d56b43	[KnownBits] Move ValueTracking/SelectionDAG UREM KnownBits handling to KnownBits::urem. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 14:30:59 +00:00
Simon Pilgrim	32bee18b84	[KnownBits] Move ValueTracking/SelectionDAG UDIV KnownBits handling to KnownBits::udiv. NFCI. Both these have the same implementation - so move them to a single KnownBits copy. GlobalISel will be able to use this as well with minimal effort.	2020-11-05 13:42:42 +00:00
Sander de Smalen	d57bba7cf8	[SVE] Return StackOffset for TargetFrameLowering::getFrameIndexReference. To accommodate frame layouts that have both fixed and scalable objects on the stack, describing a stack location or offset using a pointer + uint64_t is not sufficient. For this reason, we've introduced the StackOffset class, which models both the fixed- and scalable sized offsets. The TargetFrameLowering::getFrameIndexReference is made to return a StackOffset, so that this can be used in other interfaces, such as to eliminate frame indices in PEI or to emit Debug locations for variables on the stack. This patch is purely mechanical and doesn't change the behaviour of how the result of this function is used for fixed-sized offsets. The patch adds various checks to assert that the offset has no scalable component, as frame offsets with a scalable component are not yet supported in various places. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D90018	2020-11-05 11:02:18 +00:00
Arthur Eubanks	ae38540042	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Tests are a continuation of those added in https://reviews.llvm.org/D89083. In order to prevent TargetMachines from adding unnecessary optimization passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be changed to take an OptimizationLevel, but that will be done separately. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89158	2020-11-04 22:27:16 -08:00
Atmn Patel	cea0599aa7	[LangRef] Adds llvm.loop.mustprogress loop metadata This patch adds the llvm.loop.mustprogress loop metadata. This is to be added to loops where the frontend language requires that the loop makes observable interactions with the environment. This is the loop-level equivalent to the function attribute `mustprogress` defined in D86233. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D88464	2020-11-04 22:32:50 -05:00
Arthur Eubanks	ab0ddbc38a	Reland [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 13:11:40 -08:00
Arthur Eubanks	9173b5a99d	Revert "[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback" This reverts commit `7a83aa0520`. Causing buildbot failures.	2020-11-04 12:57:32 -08:00
Arthur Eubanks	7a83aa0520	[NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 12:53:30 -08:00
Eric Astor	07c4f1d10b	[ms] [llvm-ml] Lex MASM strings, including escaping Allow single-quoted strings and double-quoted character values, as well as doubled-quote escaping. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D89731	2020-11-04 15:28:43 -05:00
Arnold Schwaighofer	ea5989b43a	Start of an llvm.coro.async implementation This patch adds the `async` lowering of coroutines. This will be used by the Swift frontend to lower async functions. In contrast to the `retcon` lowering the frontend needs to be in control over control-flow at suspend points as execution might be suspended at these points. This is very much work in progress and the implementation will change as it evolves with the frontend. As such the documentation is lacking detail as some of it might change. rdar://70097093 Reapply with fix for memory sanitizer failure and sphinx failure. Differential Revision: https://reviews.llvm.org/D90612	2020-11-04 10:29:21 -08:00
Arthur Eubanks	d8f531c42c	[NewPM] Don't run before pass instrumentation on required passes This allows those instrumentations to log when they decide to skip a pass. This provides extra helpful info for optnone functions and also will help with opt-bisect. Have OptNoneInstrumentation print when it skips due to seeing optnone. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90545	2020-11-04 09:45:10 -08:00
Arnold Schwaighofer	42f1916640	Revert "Start of an llvm.coro.async implementation" This reverts commit `ea606cced0`. This patch causes memory sanitizer failures sanitizer-x86_64-linux-fast.	2020-11-04 08:26:20 -08:00
Arnold Schwaighofer	ea606cced0	Start of an llvm.coro.async implementation This patch adds the `async` lowering of coroutines. This will be used by the Swift frontend to lower async functions. In contrast to the `retcon` lowering the frontend needs to be in control over control-flow at suspend points as execution might be suspended at these points. This is very much work in progress and the implementation will change as it evolves with the frontend. As such the documentation is lacking detail as some of it might change. rdar://70097093 Differential Revision: https://reviews.llvm.org/D90612	2020-11-04 07:32:29 -08:00
Eric Astor	bf027da04c	[ms] [llvm-ml] Enable support for MASM-style macro procedures Allows the MACRO directive to define macro procedures with parameters and macro-local symbols. Supports required and optional parameters (including default values), and matches ml64.exe for its macro-local symbol handling (up to 65536 macro-local symbols in any translation unit). Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D89729	2020-11-04 10:29:57 -05:00
Paul C. Anagnostopoulos	d56cd4291e	[TableGen] Add !interleave operator to concatenate a list of values with delimiters Add a test. Use it in some TableGen files. Differential Revision: https://reviews.llvm.org/D90469	2020-11-04 09:23:54 -05:00
Paul C. Anagnostopoulos	5e92acfc82	[TableGen] [IR] Eliminate unnecessary recursive help class. Differential Revision: https://reviews.llvm.org/D90532	2020-11-04 09:18:09 -05:00
Sander de Smalen	73b6cb67dc	[NFCI] Replace AArch64StackOffset by StackOffset. This patch replaces the AArch64StackOffset class by the generic one defined in TypeSize.h. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D88983	2020-11-04 08:49:00 +00:00
Arthur Eubanks	06926e0f01	Port print-must-be-executed-contexts and print-mustexecute to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90207	2020-11-03 21:06:46 -08:00
Michael Liao	4b11201592	[MachineInstr] Add support for instructions with multiple memory operands. - Basically iterate each pair of memory operands from both instructions and return true if any of them may alias. - The exception is memory instructions without any memory operand. They may touch everything and could alias with any memory instruction. Differential Revision: https://reviews.llvm.org/D89447	2020-11-03 20:44:40 -05:00
Gaurav Jain	492b1d78d5	[NFC] Use [MC]Register in register allocation Differential Revision: https://reviews.llvm.org/D90725	2020-11-03 17:34:26 -08:00
Simon Pilgrim	e9b88c754a	[DAG] computeKnownBits - Move ISD::SRA handling into KnownBits::ashr As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking.	2020-11-03 18:09:33 +00:00
Simon Pilgrim	cb798f040a	[DAG] computeKnownBits - Move (most) ISD::SRL handling into KnownBits::lshr As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 17:30:36 +00:00
Tim Renouf	89d41f3a2b	[AMDGPU] Add gfx1033 target Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761	2020-11-03 16:27:48 +00:00
Tim Renouf	ee3e642627	[AMDGPU] Add gfx90c target This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-11-03 16:27:43 +00:00
Valentin Clement	6c337945c8	[openmp][openacc][NFC] Simplify access and validation of DirectiveBase information This patch adds some helpers in the DirectiveLanguage wrapper to initialize it from the RecordKeeper and validate the records. This simplifies the arguments of many functions since only the DirectiveLanguage is passed. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D90358	2020-11-03 11:13:06 -05:00
Sanjay Patel	3c050a597c	[CostModel] fix cost calc bug for sadd/ssub with overflow As noted in D90554, there's an opcode typo in using an easily misused cost model API: getCmpSelInstrCost(). Beyond that, the assumed sequence of ops is questionable, but that would be another patch. My guess is that the x86 test diffs show that we are probably wrong both before and after this change, so there will be no practical difference. As an example, I tried this test which shows a cost of '7' either way: define <4 x i32> @sadd(<4 x i32> %va, <4 x i32> %vb) { %V4I32 = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %va, <4 x i32> %vb) %ov = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 1 %r = extractvalue {<4 x i32>, <4 x i1>} %V4I32, 0 %z = select <4 x i1> %ov, <4 x i32> <i32 42, i32 42, i32 42, i32 42>, <4 x i32> %r ret <4 x i32> %z } $ llc -o - sadd.ll -mattr=avx vpaddd %xmm1, %xmm0, %xmm2 vpcmpgtd %xmm2, %xmm0, %xmm0 vpxor %xmm0, %xmm1, %xmm0 vblendvps %xmm0, LCPI0_0(%rip), %xmm2, %xmm0a Differential Revision: https://reviews.llvm.org/D90681	2020-11-03 11:03:47 -05:00
Jameson Nash	a0ad066ce4	make the AsmPrinterHandler array public This lets external consumers customize the output, similar to how AssemblyAnnotationWriter lets the caller define callbacks when printing IR. The array of handlers already existed; this just cleans up the code so that it can be exposed publicly. Replaces https://reviews.llvm.org/D74158 Differential Revision: https://reviews.llvm.org/D89613	2020-11-03 10:02:09 -05:00
Nathan James	97e8da45f9	[ADT] Add SmallVector::pop_back_n Adds a method called pop_back_n to SmallVector. This is more readable and less error prone than the alternatives of using ```lang=c++ Vector.resize(Vector.size() - N); Vector.erase(Vector.end() - N, Vector.end()); for (unsigned I = 0;I<N;++I) Vector.pop_back(); ``` Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D90576	2020-11-03 14:57:10 +00:00
Simon Pilgrim	cab21d4fa8	[DAG] computeKnownBits - Move (most) ISD::SHL handling into KnownBits::shl As discussed on D90527, we should be trying to move shift handling functionality into KnownBits to avoid code duplication in SelectionDAG/GlobalISel/ValueTracking. The refactor to use the KnownBits fixed/min/max constant helpers allows us to hit a couple of cases that we were missing before. We still need the getValidMinimumShiftAmountConstant case as KnownBits doesn't handle per-element vector cases.	2020-11-03 14:22:28 +00:00
David Green	90131e3ecb	[CostModel] Make target intrinsics cheap by default This patch changes the intrinsics cost model to assume that by default target intrinsics are cheap. This didn't seem to be the case for all intrinsics, and is potentially an MVE problem due to our scalarization overheads. Cheap seems to be a good default in general though. Differential Revision: https://reviews.llvm.org/D90597	2020-11-03 09:58:28 +00:00
Sander de Smalen	1667d23e58	[NFCI] Add StackOffset class and base classes for ElementCount, TypeSize. This patch adds a linear polynomial base class, called LinearPolyBase, which serves as a base class for StackOffset. It tries to represent a linear polynomial like: c0 * scale0 + c1 * scale1 + ... + cK * scaleK where the scale is implicit, meaning that only the coefficients are encoded. This patch also adds a univariate linear polynomial, which serves as a base class for ElementCount and TypeSize. This tries to represent a linear polynomial where only one dimension can be set at any one time, i.e. a TypeSize is either fixed-sized, or scalable-sized, but cannot be a combination of the two. class LinearPolyBase ^ \| +---- class StackOffset (dimensions = 2 (fixed/scalable), type = int64_t) class UnivariateLinearPolyBase \| \| +---- class LinearPolySize (dimensions = 2 (fixed/scalable)) ^ \| +-------- class ElementCount (type = unsigned) \| \| +-------- class TypeSize (type = uint64_t) Reviewed By: ctetreau, david-arm Differential Revision: https://reviews.llvm.org/D88982	2020-11-03 09:41:39 +00:00
Georgii Rymar	1af3cb5424	[llvm-readobj/libObject] - Allow dumping objects that have a broken SHT_SYMTAB_SHNDX section. Currently it is impossible to create an instance of ELFObjectFile when the SHT_SYMTAB_SHNDX can't be read. We error out when we fail to parse the SHT_SYMTAB_SHNDX section in the factory method. This change delays reading of the SHT_SYMTAB_SHNDX section entries; with it, llvm-readobj is now able to work with such inputs. Differential revision: https://reviews.llvm.org/D89379	2020-11-03 11:30:28 +03:00
Reid Kleckner	c0a922b3db	Add parallelTransformReduce and parallelForEachError parallelTransformReduce is modelled on the C++17 pstl API of std::transform_reduce, except our wrappers do not use execution policy parameters. parallelForEachError allows loops that contain potentially failing operations to propagate errors out of the loop. This was one of the major challenges I encountered while parallelizing PDB type merging in LLD. Parallelizing a loop with parallelForEachError is not behavior preserving: the loop will no longer stop on the first error, it will continue working and report all errors it encounters in a list. I plan to use this to propagate errors out of LLD's coff::TpiSource::remapTpiWithGHashes, which currently stores an error in the TpiSource object. Differential Revision: https://reviews.llvm.org/D90639	2020-11-02 16:50:14 -08:00
Gaurav Jain	b68994bd2d	[NFC] Use [MC]Register in Live-ness tracking Differential Revision: https://reviews.llvm.org/D90611	2020-11-02 15:46:13 -08:00
Fangrui Song	ee5d1a0449	[AsmPrinter] Split up .gcc_except_table MC currently produces a monolithic .gcc_except_table section. GCC can split up .gcc_except_table: * if comdat: `.section .gcc_except_table._Z6comdatv,"aG",@progbits,_Z6comdatv,comdat` * otherwise, if -ffunction-sections: `.section .gcc_except_table._Z3fooi,"a",@progbits` This ensures that (a) non-prevailing copies are discarded and (b) .gcc_except_table associated to discarded text sections can be discarded by a .gcc_except_table-aware linker (GNU ld, but not gold or LLD). This patch matches the GCC behavior. If -fno-unique-section-names is specified, we don't append the suffix. If -ffunction-sections is additionally specified, use `.section ...,unique`. Note, if clang driver communicates that the linker is LLD and we know it is new (11.0.0 or later) we can use SHF_LINK_ORDER to avoid string table costs, at least in the -fno-unique-section-names case. We cannot use it on GNU ld because as of binutils 2.35 it does not support mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER components in an output section https://sourceware.org/bugzilla/show_bug.cgi?id=26256 For RISC-V -mrelax, this patch additionally fixes an assembler-linker interaction problem: because a section is shrinkable, the length of a call-site code range is not a constant. Relocations referencing the associated text section (STT_SECTION) are needed.
However, a STB_LOCAL relocation referencing a discarded section group member from outside the group is disallowed by the ELF specification (PR46675): ``` // a.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int main() { return comdat(); } // b.cc inline int comdat() { try { throw 1; } catch (int) { return 1; } return 0; } int foo() { return comdat(); } clang++ -target riscv64-linux -c a.cc b.cc -fPIC -mno-relax ld.lld -shared a.o b.o => ld.lld: error: relocation refers to a symbol in a discarded section: ``` -fbasic-block-sections= is similar to RISC-V -mrelax: there are outstanding relocations. Reviewed By: jrtc27, rahmanl Differential Revision: https://reviews.llvm.org/D83655	2020-11-02 14:36:25 -08:00
Fangrui Song	395c8bed64	[MC] Make MCStreamer aware of AsmParser's StartTokLoc A SMLoc allows MCStreamer to report location-aware diagnostics, which were previously done by adding SMLoc to various methods (e.g. emit*) in an ad-hoc way. Since the file:line is most important, the column is less important and the start token location suffices in many cases, this patch reverts `b7e7131af2` ``` // old symbol-binding-changed.s:6:8: error: local changed binding to STB_GLOBAL .globl local ^ // new symbol-binding-changed.s:6:1: error: local changed binding to STB_GLOBAL .globl local ^ ``` Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D90511	2020-11-02 12:32:07 -08:00
Mircea Trofin	61e8a44655	[NFC][regalloc] Use MCRegister appropriately Differential Revision: https://reviews.llvm.org/D90506	2020-11-02 11:48:49 -08:00
Duncan P. N. Exon Smith	c17da8676a	Support: Avoid std::tie in Support/FileSystem/UniqueID.h, NFC Running `-fsyntax-only` on UniqueID.h is 2x faster with this patch (which avoids calling `std::tie` for `operator<`). Since the transitive includers of this file will go up as `FileEntryRef` gets used in more places, avoid that compile-time hit. This is a follow-up to `23ed570af1` (suggested by Reid Kleckner). Also drop the `<tuple>` include from FileSystem.h (which was vestigial from before UniqueID.h was split out). Differential Revision: https://reviews.llvm.org/D90471	2020-11-02 13:26:15 -05:00
Fangrui Song	98b9338588	[Debugify] Port -debugify-each to NewPM Preemptively switch 2 tests to the new PM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D90365	2020-11-02 08:16:43 -08:00
Florian Hahn	b3b993a7ad	Reland "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts the revert commit `408c4408fa`. This version of the patch includes a fix for a crash caused by treating ICmp/FCmp constant expressions as instructions. Original message: On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV.	2020-11-02 15:39:29 +00:00
Ben Dunbobbin	ff2e24a741	[PS4] Support dllimport/export attributes For PS4 development we support dllimport/export annotations in source code. This patch enables the dllimport/export attributes on PS4 by adding a new function to query the triple for whether dllimport/export are used and using that function to decide whether these attributes are supported. This replaces the current method of checking if the target is Windows. This means we can drop the use of "TargetArch" in the .td file (which is an improvement as dllimport/export support isn't really a function of the architecture). I have included a simple codegen test to show that the attributes are accepted and have an effect on codegen for PS4. I have also enabled the DLLExportStaticLocal and DLLImportStaticLocal attributes, which we support downstream. However, I am unable to write a test for these attributes until other patches for PS4 dllimport/export handling land upstream. Whilst writing this patch I noticed that, as these attributes are internal, they do not need to be target specific (when these attributes are added internally in Clang the target specific checks have already been run); however, I think leaving them target specific is fine because it isn't harmful and they "really are" target specific even if that has no functional impact. Differential Revision: https://reviews.llvm.org/D90442	2020-11-02 14:25:34 +00:00
Florian Hahn	799033d8c5	Reland "[SLP] Consider alternatives for cost of select instructions." This reverts the revert commit `a1b53db324`. This patch includes a fix for a reported issue, caused by matchSelectPattern returning UMIN for selects of pointers in some cases by looking through some connected casts. For now, ensure integer intrinsics are only returned for selects of ints or int vectors.	2020-10-31 16:52:36 +00:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Liu, Chen3	756f597841	[X86] Support Intel avxvnni This patch mainly made the following changes: 1. Support AVX-VNNI instructions; 2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpwssd/vpdpwssds instructions only use vex-encoding when the user explicitly adds the {vex} prefix. Differential Revision: https://reviews.llvm.org/D89105	2020-10-31 12:39:51 +08:00
Thomas Lively	0a512a555a	[WebAssembly] Prototype i64x2.eq As proposed in https://github.com/WebAssembly/simd/pull/381. Since it is still in the prototyping phase, it is only accessible via a target builtin function and a target intrinsic. Depends on D90504. Differential Revision: https://reviews.llvm.org/D90508	2020-10-30 16:38:15 -07:00
Thomas Lively	1cb0b56607	[WebAssembly] Prototype i64x2.widen_{low,high}_i32x4_{s,u} As proposed in https://github.com/WebAssembly/simd/pull/290. As usual, these instructions are available only via builtin functions and intrinsics while they are in the prototyping stage. Differential Revision: https://reviews.llvm.org/D90504	2020-10-30 15:44:04 -07:00
Florian Hahn	a1b53db324	Revert "[SLP] Consider alternatives for cost of select instructions." This reverts commit `1922570489`. This appears to cause a crash in the following example a, b, c; l() { int e = a, f = l, g, h, i, j; float d = c, k = b; for (;;) for (; g < f; g++) { k[h] = d[i]; k[h - 1] = d[j]; h += e << 1; i += e; } } clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.	2020-10-30 21:26:14 +00:00
Florian Hahn	408c4408fa	Revert "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts commit `73f01e3df5`. This appears to break http://lab.llvm.org:8011/#/builders/85/builds/383.	2020-10-30 21:26:14 +00:00
Peter Collingbourne	3d049bce98	hwasan: Support for outlined checks in the Linux kernel. Add support for match-all tags and GOT-free runtime calls, which are both required for the kernel to be able to support outlined checks. This requires extending the access info to let the backend know when to enable these features. To make the code easier to maintain introduce an enum with the bit field positions for the access info. Allow outlined checks to be enabled with -mllvm -hwasan-inline-all-checks=0. Kernels that contain runtime support for outlined checks may pass this flag. Kernels lacking runtime support will continue to link because they do not pass the flag. Old versions of LLVM will ignore the flag and continue to use inline checks. With a separate kernel patch [1] I measured the code size of defconfig + tag-based KASAN, as well as boot time (i.e. time to init launch) on a DragonBoard 845c with an Android arm64 GKI kernel. The results are below: before: code size 92824064, boot time 6.18s; after: code size 38822400, boot time 6.65s. [1] https://linux-review.googlesource.com/id/I1a30036c70ab3c3ee78d75ed9b87ef7cdc3fdb76 Depends on D90425 Differential Revision: https://reviews.llvm.org/D90426	2020-10-30 14:25:40 -07:00
Cameron McInally	dda1e74b58	[Legalize] Add legalizations for VECREDUCE_SEQ_FADD Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass. Differential Revision: https://reviews.llvm.org/D90247	2020-10-30 16:02:55 -05:00
Mircea Trofin	871d658c9c	[FileCheck] Report missing prefixes when more than one is provided. If more than one prefix is provided - e.g. --check-prefixes=CHECK,FOO - we don't report if (say) FOO is never used. This may lead to a gap in our test coverage. This patch introduces a new option, --allow-unused-prefixes. It currently is set to true, keeping today's behavior. After we explicitly set it in tests where this behavior was actually intentional, we will switch it to false by default. Differential Revision: https://reviews.llvm.org/D90281	2020-10-30 12:39:29 -07:00
Arthur Eubanks	2e31727a88	[NFC] Clean up PassBuilder Make DebugLogging a member variable so that users of PassBuilder don't need to pass it around so much. Move call to TargetMachine::registerPassBuilderCallbacks() within PassBuilder so users don't need to remember to call it. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90437	2020-10-30 10:03:59 -07:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
Florian Hahn	73f01e3df5	[TTI] Add VecPred argument to getCmpSelInstrCost. On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV. Reviewed By: dmgreen, RKSimon Differential Revision: https://reviews.llvm.org/D90070	2020-10-30 13:49:08 +00:00
Georgii Rymar	2bfaf19516	[yaml2obj] - Make `Section::Link` field to be `Optional<>`. `Link` is not an optional field currently. Because of this it is not convenient to write macros. This makes it optional and fixes corresponding test cases. Differential revision: https://reviews.llvm.org/D90390	2020-10-30 16:18:53 +03:00
Nathan James	3c3071d5e7	[ADT][NFC] Silence some misc-unconventional-assign-operator warnings	2020-10-30 10:57:25 +00:00
Nathan James	cf8d19f4fb	[ADT] Add methods to SmallString for efficient concatenation A common pattern when using SmallString is to repeatedly call append to build a larger string. The issue here is the optimizer can't see through this and often has to check there is enough space in the storage for each string you try to append. This results in lots of conditional branches and potentially multiple calls to grow needing to be emitted if the buffer wasn't large enough. By taking an initializer_list of StringRefs, SmallString can preallocate the storage it needs for all of the StringRefs which only need to grow one time at most, then use a fast path of copying all the strings into its storage knowing there is guaranteed to be enough capacity. By using StringRefs, this also means you can append different string like types in one go as they will all be implicitly converted to a StringRef. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D90386	2020-10-30 10:07:40 +00:00
Roman Lebedev	81fc53a36a	[SCEV] Introduce SCEVPtrToIntExpr (PR46786) And use it to model LLVM IR's `ptrtoint` cast. This is essentially an alternative to D88806, but with no chance for all the problems it caused due to having the cast as implicit there. (see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3) As we've established by now, there are at least two reasons why we want this: * It will allow SCEV to actually model the `ptrtoint` casts and their operands, instead of treating them as `SCEVUnknown` * It should help with the initial problem of PR46786 - this should eventually allow us to not lose pointer-ness of an expression in more cases As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 \| PR46786 ]], in principle, we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()` should sink the cast as far down into the expression as possible, so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`. But I think that it isn't the best solution, because it doesn't really matter from the memory consumption side - there probably won't be that many `SCEVPtrToIntExpr`s for it to matter, and it allows for much better discoverability. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89456	2020-10-30 11:13:35 +03:00
Fangrui Song	b7e7131af2	[MC] Add SMLoc to MCStreamer::emitSymbolAttribute and report changed binding warnings/errors for ELF	2020-10-29 19:43:11 -07:00
Nikita Popov	6c2ad4cf87	[SDAG] Extract helper to determine neutral element (NFC) Make the existing VECREDUCE based code more generic, by expressing it in terms of the neutral value of the base opcode instead.	2020-10-29 22:05:06 +01:00
Florian Hahn	1922570489	[SLP] Consider alternatives for cost of select instructions. Some architectures do not have general vector select instructions (e.g. AArch64). But some cmp/select patterns can be vectorized using other instructions/intrinsics. One example is using min/max instructions for certain patterns. This patch updates the cost calculations for selects in the SLP vectorizer to consider using min/max intrinsics. This patch does not change SLP vectorizer's codegen itself to actually generate those intrinsics, but relies on the backends to lower the vector cmps & selects. This keeps things simple on the SLP side and works well in practice for AArch64. This exposes additional SLP vectorization opportunities in some benchmarks on AArch64 (-O3 -flto). Metric: SLP.NumVectorInstructions Program base slp diff test-suite...ications/JM/ldecod/ldecod.test 502.00 697.00 38.8% test-suite...ications/JM/lencod/lencod.test 1023.00 1414.00 38.2% test-suite...-typeset/consumer-typeset.test 56.00 65.00 16.1% test-suite...6/464.h264ref/464.h264ref.test 804.00 822.00 2.2% test-suite...006/453.povray/453.povray.test 3335.00 3357.00 0.7% test-suite...CFP2000/177.mesa/177.mesa.test 2110.00 2121.00 0.5% test-suite...:: External/Povray/povray.test 2378.00 2382.00 0.2% Reviewed By: RKSimon, samparker Differential Revision: https://reviews.llvm.org/D89969	2020-10-29 20:39:50 +00:00
Nikita Popov	91bf172088	[SDAG] Extract helper to get vecreduce base opcode (NFC)	2020-10-29 20:22:22 +01:00
Thomas Lively	be6f50798e	[WebAssembly] Implement SIMD signselect instructions As proposed in https://github.com/WebAssembly/simd/pull/124, using the opcodes adopted by V8 in https://chromium-review.googlesource.com/c/v8/v8/+/2486235/2/src/wasm/wasm-opcodes.h. Uses new builtin functions and a new target intrinsic exclusively to ensure that the new instructions are only emitted when a user explicitly opts in to using them since they are still in the prototyping and evaluation phase. Differential Revision: https://reviews.llvm.org/D90357	2020-10-29 11:06:20 -07:00
Roland McGrath	ddfe4784cc	[Support] Make Support/SwapByteOrder.h compile on Fuchsia Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D90279	2020-10-29 10:49:06 -07:00
David Sherwood	8c058dd2d7	[SVE] Remove TypeSize comparison operators All known instances in the code where we relied upon the TypeSize comparison operators have now been changed to either use scalar integer comparisons or one of the TypeSize::isKnownXY functions. It is now safe to remove the comparison operators. Differential Revision: https://reviews.llvm.org/D90160	2020-10-29 14:32:26 +00:00
Adam Balogh	184eb4fa4f	[ADT] Fix for ImmutableMapRef The `Root` member of `ImmutableMapRef` was changed recently from a plain pointer to `IntrusiveRefCntPtr`. However, the `Profile` member function was not adjusted. This results in a compilation error whenever the `Profile` method is used on an `ImmutableMapRef`. This patch fixes this issue and also adds unit tests for `ImmutableMapRef`. Differential Revision: https://reviews.llvm.org/D89486	2020-10-29 13:19:51 +01:00
Max Kazantsev	79c5b4c546	[NFC] Add some new util functions to ICmpInst	2020-10-29 17:38:11 +07:00
Max Kazantsev	a5b2e795c3	[NFC][SCEV] Refactor monotonic predicate checks to return enums instead of bools This patch gets rid of output parameter which is not needed for most users and prepares this API for further refactoring.	2020-10-29 16:01:25 +07:00
Fangrui Song	39856d5d0b	[Debugify] Move global namespace functions into llvm:: Also move exportDebugifyStats from tools/opt to Debugify.cpp	2020-10-28 19:11:41 -07:00
Mircea Trofin	735ab4be35	[ThinLTO] Fix .llvmcmd emission llvm::EmbedBitcodeInModule needs (what used to be called) EmbedMarker set, in order to emit .llvmcmd. EmbedMarker is really about embedding the command line, so renamed the parameter accordingly, too. This was not caught at test because the check-prefix was incorrect, but FileCheck does not report that when multiple prefixes are provided. A separate patch will address that. Differential Revision: https://reviews.llvm.org/D90278	2020-10-28 17:45:30 -07:00
River Riddle	f6a6f27edb	[llvm][StringExtras] Use a lookup table for `hexDigitValue` This method is at the core of the conversion from hex to binary, and using a lookup table greatly improves the compile time of hex conversions. Context: In MLIR we use hex strings to represent very large constants in the textual format of the IR. These changes lead to a large decrease in compile time when parsing these constants (>1 second -> 350 milliseconds). Differential Revision: https://reviews.llvm.org/D90320	2020-10-28 16:58:06 -07:00
River Riddle	1095419b10	[llvm][StringExtras] Add a fail-able version of `fromHex` This revision adds a fail-able/checked version of `fromHex` that fails when the input string contains a non-hex character. This removes the need for users to have a separate check for if the string contains all hex digits. This becomes very costly for large hex strings given that checking if a string contains only hex digits is effectively the same as just converting it in the first place. Context: In MLIR we use hex strings to represent very large constants in the textual format of the IR. These changes lead to a large decrease in compile time when parsing these constants (2 seconds -> 1 second). Differential Revision: https://reviews.llvm.org/D90265	2020-10-28 16:58:06 -07:00
Amy Huang	7669f3c0f6	Recommit "[CodeView] Emit static data members as S_CONSTANTs." We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. This changes CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072 This reverts commit `504615353f`.	2020-10-28 16:35:59 -07:00
Duncan P. N. Exon Smith	44d65efd95	Fix includes in llvm/Support/FileSystem/UniqueID.h after `23ed570af1` Not sure why this worked for me, but some of the bots pointed out I copied the wrong includes from FileSystem.h in `23ed570af1`. Fixed.	2020-10-28 18:39:39 -04:00
Craig Disselkoen	c3783847ae	C API: support scalable vectors This adds support for scalable vector types in the C API and in llvm-c-test, and also adds a test to ensure that llvm-c-test can properly roundtrip operations involving scalable vectors. While creating this diff, I discovered that the C API cannot properly roundtrip _constant expressions_ involving shufflevector / scalable vectors, but that seems to be a separate enough issue that I plan to address it in a future diff (unless reviewers feel it should be addressed here). Differential Revision: https://reviews.llvm.org/D89816	2020-10-28 18:19:34 -04:00
Duncan P. N. Exon Smith	23ed570af1	Split out llvm/Support/FileSystem/UniqueID.h and clang/Basic/FileEntry.h, NFC Split `FileEntry` and `FileEntryRef` out into a new file `clang/Basic/FileEntry.h`. This allows current users of a forward-declared `FileEntry` to transition to `FileEntryRef` without adding more includers of `FileManager.h`. Also split `UniqueID` out to llvm/Support/FileSystem/UniqueID.h, so `FileEntry.h` doesn't need to include all of `FileSystem.h` for just that type. Differential Revision: https://reviews.llvm.org/D89761	2020-10-28 16:38:32 -04:00
Adrian Prantl	0b2b50a5d2	[DebugInfo] Expose Fortran array debug info attributes through DIBuilder. Support for a few debug info attributes specific to Fortran arrays has been added to LLVM recently, but there's no way to take advantage of them through DIBuilder. This patch extends DIBuilder::createArrayType to enable setting those attributes. Patch by Chih-Ping Chen! Differential Review: https://reviews.llvm.org/D90323	2020-10-28 13:13:35 -07:00
Alok Kumar Sharma	a6dd01afa3	[DebugInfo] Support for DW_TAG_generic_subrange This is needed to support fortran assumed rank arrays which have runtime rank. Summary: Fortran assumed rank arrays have dynamic rank. DWARF TAG DW_TAG_generic_subrange is needed to support that. Testing: unit test cases added (hand-written) check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D89218	2020-10-29 01:34:15 +05:30
Mircea Trofin	6fa35541a0	[NFC][ThinLTO] Change command line passing to EmbedBitcodeInModule Changing to pass by ref - less null checks to worry about. Differential Revision: https://reviews.llvm.org/D90330	2020-10-28 12:33:39 -07:00
Aditya Nandakumar	bed8394047	[GISel]: Few InsertVecElt combines https://reviews.llvm.org/D88060 This adds the following combines 1) build_vector formation from insert_vec_elts 2) insert_vec_elts (build_vector) -> build_vector	2020-10-28 12:27:07 -07:00
Mircea Trofin	f0a98ad820	[NFC] Use Register in RegisterPressure APIs Some related changes as well. Differential Revision: https://reviews.llvm.org/D90268	2020-10-28 12:14:08 -07:00
Valentin Clement	c89b645755	[openmp][openacc] Check for duplicate clauses for directive Check for duplicate clauses associated with directive. Clauses can appear only once in the 4 lists associated with each directive (allowedClauses, allowedOnceClauses, allowedExclusiveClauses, requiredClauses). Duplicates were already present (removed with this patch) or were introduced in new patches by mistake (D89861). Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D90241	2020-10-28 15:12:38 -04:00
Sanjay Patel	9df32c9044	[CostModel] remove cost-kind predicate for funnel shift costs Completing the series of FIXME removals for special-case intrinsics: `50dfa19cc7` `f2c25c7079` `c963bde015` `01ea93d85d` This one looks quite different than the others. The size/blended cost is still potentially very far off from the throughput cost, but this is hopefully not worse on the whole. It looks like the underlying costs for the expanded shift/logic have their own cost-kind limitations. Also, we are not asking the target if it has a legal funnel shift op, so we just assume that the intrinsic gets expanded.	2020-10-28 14:02:34 -04:00
Thomas Lively	31e944556f	[WebAssembly] Prototype extending multiplication SIMD instructions As proposed in https://github.com/WebAssembly/simd/pull/376. This commit implements new builtin functions and intrinsics for these instructions, but does not yet add them to wasm_simd128.h because they have not yet been merged to the proposal. These are the first instructions with opcodes greater than 0xff, so this commit updates the MC layer and disassembler to handle that correctly. Differential Revision: https://reviews.llvm.org/D90253	2020-10-28 09:38:59 -07:00
Paul C. Anagnostopoulos	9d72065cf6	[TableGen] [AMDGPU] Add !sub operator for subtraction Use it in the AMDGPU target to eliminate !add(value1, !mul(value2, -1)) Differential Revision: https://reviews.llvm.org/D90107	2020-10-28 12:27:53 -04:00
Benjamin Kramer	207cf71fa9	Revert "[OpenMP] Add Passing in Original Declaration Names To Mapper API" This reverts commit `d981c7b758` and `a87d7b3d44`. Test fails under msan.	2020-10-28 13:58:14 +01:00
Georgii Rymar	47369e194a	[yaml2obj][obj2yaml] - Teach tools to work with regular archives. This teaches obj2yaml to dump valid regular (not thin) archives. This also teaches yaml2obj to recognize archive YAML descriptions, which allows crafting all different kinds of archives (valid and broken ones). Differential revision: https://reviews.llvm.org/D89949	2020-10-28 15:27:11 +03:00
Max Kazantsev	160a453138	Return "[IndVars] Remove monotonic checks with unknown exit count" This reverts commit `e038b60d91`. This reverts commit `a0d84d8031`. This revert was a mistake. The reason of the failures was "Use uint64_t for branch weights instead of uint32_t" Differential Revision: https://reviews.llvm.org/D87832	2020-10-28 18:51:40 +07:00
Luqman Aden	4c0a016927	Rename EHPersonality::MSVC_Win64SEH to EHPersonality::MSVC_TableSEH. NFC. The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining vs table-based exception handling. x86-32 is the only arch for which Windows uses the former. 32-bit ARM would use what is called Win64SEH today, which is a bit confusing so instead let's just rename it to be a bit more clear. Reviewed By: compnerd, rnk Differential Revision: https://reviews.llvm.org/D90117	2020-10-27 23:22:13 -07:00

43072 Commits