llvm-project

Commit Graph

Author	SHA1	Message	Date
Andrzej Warzynski	404da13e1e	[AArch64][SVE] Gather loads: pass 32 bit unpacked offsets as nxv2i32 Summary: Currently 32 bit unpacked offsets are passed as nxv2i64. However, as pointed out in https://reviews.llvm.org/D71074, using nxv2i32 instead would improve consistency with: * how other arguments are treated * how scatter stores are implemented This patch makes sure that 32 bit unpacked offsets are passes as nxv2i32 instead of nxv2i64. Reviewers: sdesmalen, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71724	2020-01-02 13:01:28 +00:00
Brian Gesiak	2fcf7691df	[Coroutines] Rename "legacy" passes (NFC) A series of patches beginning with https://reviews.llvm.org/D71898 propose to add an implementation of the coroutine passes to the new pass manager. As part of these changes, the coroutine passes that implement the legacy pass manager interface are renamed, to `<PassName>Legacy`. This mirrors similar changes that have been made to many other passes in LLVM as they've been transitioned to support both old and new pass managers. This commit splits out the renaming portion of that patch and commits it in advance as an NFC (no functional change intended) commit. It renames: * `CoroEarly` => `CoroEarlyLegacy` * `CoroSplit` => `CoroSplitLegacy` * `CoroElide` => `CoroElideLegacy` * `CoroCleanup` => `CoroCleanupLegacy`	2020-01-01 21:41:16 -05:00
Saleem Abdulrasool	68a235d07f	build: reduce CMake handling for zlib Rather than handling zlib handling manually, use `find_package` from CMake to find zlib properly. Use this to normalize the `LLVM_ENABLE_ZLIB`, `HAVE_ZLIB`, `HAVE_ZLIB_H`. Furthermore, require zlib if `LLVM_ENABLE_ZLIB` is set to `YES`, which requires the distributor to explicitly select whether zlib is enabled or not. This simplifies the CMake handling and usage in the rest of the tooling.	2020-01-01 16:36:59 -08:00
Lorenzo Casalino	f9f78cf6ac	[MachineScheduler] improve reuse of 'releaseNode'method The 'SchedBoundary::releaseNode' is merely invoked for releasing the Top/Bottom root nodes. However, 'SchedBoundary::releasePending' uses its same logic to check if the Pending queue has any releasable SUnit. It is possible to slightly modify the body of the two, allowing re-use of the former ('releaseNode') in the latter. Patch by Lorenzo Casalino <lorenzo.casalino93@gmail.com> Reviewers: MatzeB, fhahn, atrick Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65506	2020-01-01 20:22:32 +00:00
Mark de Wever	8dc7b982b4	[NFC] Fixes -Wrange-loop-analysis warnings This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857	2020-01-01 20:01:37 +01:00
Fangrui Song	d2bb8c16e7	[MC][TargetMachine] Delete MCTargetOptions::MCPIECopyRelocations clang/lib/CodeGen/CodeGenModule performs the -mpie-copy-relocations check and sets dso_local on applicable global variables. We don't need to duplicate the work in TargetMachine shouldAssumeDSOLocal. Verified that -mpie-copy-relocations can still emit PC relative relocations for external variable accesses. clang -target x86_64 -fpie -mpie-copy-relocations -c => R_X86_64_PC32 clang -target aarch64 -fpie -mpie-copy-relocations -c => R_AARCH64_ADR_PREL_PG_HI21+R_AARCH64_LDST64_ABS_LO12_NC	2020-01-01 00:50:18 -08:00
Hideto Ueno	e996303431	[Attributor] AAValueConstantRange: Value range analysis using constant range This patch introduces `AAValueConstantRange`, which answers a possible range for integer value in a specific program point. One of the motivations is propagating existing `range` metadata. (I think we need to change the situation that `range` metadata cannot be put to Argument). The state is a tuple of `ConstantRange` and it is initialized to (known, assumed) = ([-∞, +∞], empty). Currently, AAValueConstantRange is created when AAValueSimplify cannot simplify the value. Supported - BinaryOperator(add, sub, ...) - CmpInst(icmp eq, ...) - !range metadata `AAValueConstantRange` is not intended to extend to polyhedral range value analysis. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D71620	2020-01-01 15:35:56 +09:00
Johannes Doerfert	df3b56c905	[Attributor][Fix] Avoid leaking memory after D68765	2019-12-31 10:55:07 -06:00
Johannes Doerfert	751336340d	[Attributor] Function signature rewrite infrastructure As part of the Attributor manifest we want to change the signature of functions. This patch introduces a fairly generic interface to do so. As a first, very simple, use case, we remove unused arguments. A second use case, pointer privatization, will be committed with this patch as well. A lot of the code and ideas are taken from argument promotion and we run all argument promotion tests through this framework as well. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D68765	2019-12-31 02:31:33 -06:00
Johannes Doerfert	b1b441d22d	[Attributor] Use abstract call sites to determine associated arguments This is the second step after D67871 to make use of abstract call sites. In this patch the argument we associate with a abstract call site argument can be the one in the callback callee instead of the one in the callback broker. Caveat: We cannot allow no-alias arguments for problematic callbacks: As described in [1], adding no-alias (or restrict) to arguments could break synchronization as the synchronization effect, e.g., a barrier, does not "alias" with the pointer anymore. This disables no-alias annotation for potentially problematic arguments until we implement the fix described in [1]. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D68008 [1] Compiler Optimizations for OpenMP, J. Doerfert and H. Finkel, International Workshop on OpenMP 2018, http://compilers.cs.uni-saarland.de/people/doerfert/par_opt18.pdf	2019-12-31 01:33:22 -06:00
Craig Topper	787e078f3e	[TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues instead of creating a MERGE_VALUES node. NFCI This allows us to clean up some places that were peeking through the MERGE_VALUES node after the call. By returning the SDValues directly, we can clean that up. Unfortunately, there are several call sites in AMDGPU that wanted the MERGE_VALUES and now need to create their own.	2019-12-30 19:36:04 -08:00
Craig Topper	831898ff8a	[SelectionDAG] Fix copy/paste mistake in comment. NFC I think this was copied from scalarizeVectorLoad where that is what happens.	2019-12-30 19:36:04 -08:00
Johannes Doerfert	10fedd94b4	[OpenMP] Use the OpenMPIRBuilder for `omp parallel` This allows to use the OpenMPIRBuilder for parallel regions. Code was extracted from D61953 and adapted to work with the new version (D70109). All but one feature should be supported. An update of this patch will provide test coverage and privatization other than shared. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D70290	2019-12-30 13:57:13 -06:00
Johannes Doerfert	000c6a5038	[OpenMP] Use the OpenMPIRBuilder for `omp cancel` An `omp cancel parallel` needs to be emitted by the OpenMPIRBuilder if the `parallel` was emitted by the OpenMPIRBuilder. This patch makes this possible. The cancel logic is shared with the cancel barriers. Testing is done via unit tests and the clang cancel_codegen.cpp file once D70290 lands. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D71948	2019-12-30 13:57:13 -06:00
Eric Astor	4a7aa252a3	[X86][AsmParser] re-introduce 'offset' operator Summary: Amend MS offset operator implementation, to more closely fit with its MS counterpart: 1. InlineAsm: evaluate non-local source entities to their (address) location 2. Provide a mean with which one may acquire the address of an assembly label via MS syntax, rather than yielding a memory reference (i.e. "offset asm_label" and "$asm_label" should be synonymous 3. address PR32530 Based on http://llvm.org/D37461 Fix broken test where the break appears unrelated. - Set up appropriate memory-input rewrites for variable references. - Intel-dialect assembly printing now correctly handles addresses by adding "offset". - Pass offsets as immediate operands (using "r" constraint for offsets of locals). Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D71436	2019-12-30 14:35:26 -05:00
Fangrui Song	03b9f0a5e1	Ignore "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" in favor of "frame-pointer" D56351 (included in LLVM 8.0.0) introduced "frame-pointer". All tests which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf" have been migrated to use "frame-pointer". Implement UpgradeFramePointerAttributes to upgrade the two obsoleted function attributes for bitcode. Their semantics are ignored. Differential Revision: https://reviews.llvm.org/D71863	2019-12-30 09:46:19 -08:00
Petar Avramovic	98f72a5107	[MIPS GlobalISel] Select bitreverse. Recommit G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang genrates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Recommit notes: Introduce temporary variables in order to make sure instructions get inserted into MachineFunction in same order regardless of compiler used to build llvm. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 18:06:29 +01:00
Dmitri Gribenko	32cc14100e	Revert "[MIPS GlobalISel] Select bitreverse" This reverts commit `dbc136e0fe`. It broke buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066	2019-12-30 14:29:47 +01:00
Petar Avramovic	dbc136e0fe	[MIPS GlobalISel] Select bitreverse G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang genrates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 11:26:45 +01:00
Petar Avramovic	94a24e7a40	[MIPS GlobalISel] Select bswap G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang genrates these intrinsics from __builtin_bswap32 and __builtin_bswap64. Add lower and narrowscalar for G_BSWAP. Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later. Differential Revision: https://reviews.llvm.org/D71362	2019-12-30 11:13:22 +01:00
Hideto Ueno	34fe8d0451	[Attributor] Use `changeUseAfterManifest` in AAValueSimplify manifest Summary: This patch makes `AAValueSimplify` use `changeUsesAfterManifest` in `manifest`. This will invoke simple folding after the manifest. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71972	2019-12-30 17:08:48 +09:00
Fangrui Song	5edb40c022	[SelectionDAG] Disallow indirect "i" constraint This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.	2019-12-29 16:50:42 -08:00
Hideto Ueno	ef4febd85b	[Attributor] AAUndefinedBehavior: Check for branches on undef value. A branch is considered UB if it depends on an undefined / uninitialized value. At this point this handles simple UB branches in the form: `br i1 undef, ...` We query `AAValueSimplify` to get a value for the branch condition, so the branch can be more complicated than just: `br i1 undef, ...`. Patch By: Stefanos Baziotis (@baziotis) Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D71799	2019-12-29 17:43:00 +09:00
Brian Gesiak	0bc7665d98	[ADT] Fix FoldingSet documentation typos * "If found then M with be non-NULL" should be "will be non-NULL". * The documentation examples (1) and (2) declare and use a variable `MyNode M`, but examples (3) and (4) switch midway to using a variable named `N`. Unify the examples to all use `M`. The examples demonstrate the use of member functions of `FoldingSet`, but (3) and (4) invoke these as if they were free functions. Modify them to call member functions on the `MyFoldingSet` object constructed in the code above example (1).	2019-12-27 21:27:59 -05:00
Fangrui Song	f7910496c8	[Intrinsic] Delete tablegen rules of llvm.{sig,}{setjmp,longjmp}	2019-12-27 18:04:39 -08:00
Matt Arsenault	e29ae3799b	TII: Fix using Register for a subregister index argument	2019-12-27 16:53:29 -05:00
Danilo Carvalho Grael	2abda66848	[NFC][DA] Remove duplicate code in checkSrcSubscript and checkDstSubscript Summary: [DA] Move common code in checkSrcSubscript and checkDstSubscript to a new function checkSubscript. This avoids duplicate code and possible out of sync in the future. Reviewers: sebpop, jmolloy, reames Reviewed By: sebpop Subscribers: bmahjour, hiraditya, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71087 Patch by zhongduo.	2019-12-27 10:06:19 -05:00
Fangrui Song	7a7334663c	Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821 Intrinsic has incorrect argument type! i32 (i32) @llvm.setjmp wipes tear	2019-12-27 00:00:14 -08:00
Hideto Ueno	cb5eb13eaf	[Attributor] Add helper to change an instruction to `unreachable` inst Summary: Calling `changeToUnreachable` in `manifest` from different places might cause really unpredictable problems. As other deleting functions are doing, we need to change these instructions after all `manifest`. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71910	2019-12-27 02:39:37 +09:00
Johannes Doerfert	6c5d1f40ff	[OpenMP][NFCI] Use the libFrontend ProcBindKind in Clang This removes the OpenMPProcBindClauseKind enum in favor of llvm::omp::ProcBindKind which lives in OpenMPConstants.h and was introduced in D70109. No change in behavior is expected. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D70289	2019-12-26 11:04:07 -06:00
Johannes Doerfert	e4add9727b	[OpenMP][IR-Builder] Introduce "pragma omp parallel" code generation This patch combines the `emitParallel` logic prototyped in D61953 with the OpenMPIRBuilder (D69785) and introduces `CreateParallel`. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D70109	2019-12-25 18:02:23 -06:00
Johannes Doerfert	f9c3c5da19	[OpenMP][IR-Builder] Introduce the finalization stack As a permanent and generic solution to the problem of variable finalization (destructors, lastprivate, ...), this patch introduces the finalization stack. The objects on the stack describe (1) the (structured) regions the OpenMP-IR-Builder is currently constructing, (2) if these are cancellable, and (3) the callback that will perform the finalization (=cleanup) when necessary. As the finalization can be necessary multiple times, at different source locations, the callback takes the position at which code is currently generated. This position will also encode the destination of the "region exit" block iff the finalization call was issues for a region generated by the OpenMPIRBuilder. For regions generated through the old Clang OpenMP code geneneration, the "region exit" is determined by Clang inside the finalization call instead (see getOMPCancelDestination). As a first user, the parallel + cancel barrier interaction is changed. In contrast to the temporary solution before, the barrier generation in Clang does not need to be aware of the "CancelDestination" block. Instead, the finalization callback is and, as described above, later even that one does not need to be. D70109 will be updated to use this scheme. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D70258	2019-12-25 16:57:08 -06:00
Johannes Doerfert	58f324a468	[Attributor] Function level undefined behavior attribute _Eventually_, this attribute will be assigned to a function if it contains undefined behavior. As a first small step, I tried to make it loop through the load instructions in a function (eventually, the plan is to check if a load instructions causes undefined behavior, because e.g. dereferences a null pointer - Also eventually, this won't happen in initialize() but in updateImpl()). Patch By: Stefanos Baziotis (@baziotis) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D71435	2019-12-24 19:23:08 -06:00
Matt Arsenault	1aa763a4a0	GlobalISel: Define equivalent node for G_INTRINSIC_ROUND	2019-12-24 10:36:54 -05:00
Matt Arsenault	e351256c0d	GlobalISel: Define equivalent node for G_INTRINSIC_TRUNC	2019-12-24 09:53:01 -05:00
Russell Gallop	2e9bfa12ff	Revert "[Support] Extend TimeProfiler to support multiple threads" and "[Support] Try to fix bot failure after 8ddcd1dc26" This reverts commits `f70f180148` and `8ddcd1dc26` as this was breaking the MacOS build, which doesn't support thread_local.	2019-12-24 11:31:48 +00:00
Fangrui Song	e0d855b399	[SelectionDAG] Change SelectionDAGISel::{funcInfo,SDB} to use unique_ptr CurDAG is referenced more than 2000 times and used in many gerated .cpp files. Don't touch it for now.	2019-12-23 22:41:05 -08:00
Shengchen Kan	70fa4c4f88	[NFC] Style cleanups 1. Remove duplicate function for class name at the beginning of the comment. 2. Use auto where the type is already obvious from the context.	2019-12-23 17:02:36 +08:00
Simon Pilgrim	3654ed21ee	Fix case style warnings in DIBuilder. NFC.	2019-12-23 07:27:18 +00:00
Yonghong Song	e3d8ee35e4	reland "[DebugInfo] Support to emit debugInfo for extern variables" Commit `d77ae1552f` ("[DebugInfo] Support to emit debugInfo for extern variables") added deebugInfo for extern variables for BPF target. The commit is reverted by `891e25b02d` as the committed tests using %clang instead of %clang_cc1 causing test failed in certain scenarios as reported by Reid Kleckner. This patch fixed the tests by using %clang_cc1. Differential Revision: https://reviews.llvm.org/D71818	2019-12-22 18:28:50 -08:00
Shengchen Kan	fb53396c49	[NFC] Remove unnecessary blank and rename align-branch-64-5b.s to align-branch-64-6a.s	2019-12-23 10:22:02 +08:00
Reid Kleckner	891e25b02d	Revert "[DebugInfo] Support to emit debugInfo for extern variables" This reverts commit `d77ae1552f`. The tests committed along with this change do not pass, and should be changed to use %clang_cc1.	2019-12-22 12:54:06 -08:00
Eric Astor	dc5b614fa9	[ms] [X86] Use "P" modifier on operands to call instructions in inline X86 assembly. Summary: This is documented as the appropriate template modifier for call operands. Fixes PR44272, and adds a regression test. Also adds support for operand modifiers in Intel-style inline assembly. Reviewers: rnk Reviewed By: rnk Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71677	2019-12-22 09:16:34 -05:00
Matt Arsenault	4af6866708	AMDGPU: Fix repeated word in comment	2019-12-21 04:57:35 -05:00
Lang Hames	9f4f237e29	[ORC] De-register eh-frames in the RTDyldObjectLinkingLayer destructor. This matches the behavior of the legacy layer, which automatically deregistered frames.	2019-12-20 21:10:49 -08:00
Yury Delendik	adf7a0a558	[WebAssembly] Use TargetIndex operands in DbgValue to track WebAssembly operands locations Extends DWARF expression language to express locals/globals locations. (via target-index operands atm) (possible variants are: non-virtual registers or address spaces) The WebAssemblyExplicitLocals can replace virtual registers to targertindex operand type at the time when WebAssembly backend introduces {get,set,tee}_local instead of corresponding virtual registers. Reviewed By: aprantl, dschuff Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D52634	2019-12-20 14:39:05 -08:00
Sam Clegg	538b485c59	Fix name of InitLibcalls() function in comment Differential Revision: https://reviews.llvm.org/D71781	2019-12-20 14:35:05 -08:00
Adrian Prantl	44b4b833ad	Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot This is a purely cosmetic change that is NFC in terms of the binary output. I bugs me that I called the attribute DW_AT_LLVM_isysroot since the "i" is an artifact of GCC command line option syntax (-isysroot is in the category of -i options) and doesn't carry any useful information otherwise. This attribute only appears in Clang module debug info. Differential Revision: https://reviews.llvm.org/D71722	2019-12-20 13:11:17 -08:00
Philip Reames	8b725f0459	Comment and adjust style in the newly introduced MCBoundaryAlignFragment infrastructure. More to follow.	2019-12-20 12:04:07 -08:00
Philip Reames	14fc20ca62	Align branches within 32-Byte boundary (NOP padding) WARNING: If you're looking at this patch because you're looking for a full performace mitigation of the Intel JCC Erratum, this is not it! This is a preliminary patch on the patch towards mitigating the performance regressions caused by Intel's microcode update for Jump Conditional Code Erratum. For context, see: https://www.intel.com/content/www/us/en/support/articles/000055650.html The patch adds the required assembler infrastructure and command line options needed to exercise the logic for INTERNAL TESTING. These are NOT public flags, and should not be used for anything other than LLVM's own testing/debugging purposes. They are likely to change both in spelling and meaning. WARNING: This patch is knowingly incorrect in some cornercases. We need, and do not yet provide, a mechanism to selective enable/disable the padding. Conversation on this will continue in parellel with work on extending this infrastructure to support prefix padding. The goal here is to have the assembler align specific instructions such that they neither cross or end at a 32 byte boundary. The impacted instructions are: a. Conditional jump. b. Fused conditional jump. c. Unconditional jump. d. Indirect jump. e. Ret. f. Call. The new options for llvm-mc are: -x86-align-branch-boundary=NUM aligns branches within NUM byte boundary. -x86-align-branch=TYPE[+TYPE...] specifies types of branches to align. A new MCFragment type, MCBoundaryAlignFragment, is added, which may emit NOP to align the fused/unfused branch. alignBranchesBegin inserts MCBoundaryAlignFragment before instructions, alignBranchesEnd marks the end of the branch to be aligned, relaxBoundaryAlign grows or shrinks sizes of NOP to align the target branch. Nop padding is disabled when the instruction may be rewritten by the linker, such as TLS Call. Process Note: I am landing a patch by skan as it has been LGTMed, and continuing to iterate on the review is simply slowing us down at this point. We can and will continue to iterate in tree. Patch By: skan Differential Revision: https://reviews.llvm.org/D70157	2019-12-20 11:35:50 -08:00
Danilo Carvalho Grael	15bfd2cd54	[AArch64][SVE] Replace integer immediate intrinsics with splat vector variant Summary: Replace the integer immediate intrisics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71614	2019-12-20 13:52:19 -05:00
Paul Walker	6cba90dc4d	[AArch64][SVE] Correct intrinsics and patterns for logical predicate instructions In general SVE intrinsics are considered predicated and merging with everything else having suitable decoration. For predicated zeroing operations (like the predicate logical instructions) we use the "_z" suffix. After this change all intrinsics use their expected names (i.e. orr instead of or and eor instead of xor). I've removed intrinsics and patterns for condition code setting instructions as that data is not returned as part of the intrinsic. The expectation is to ask for a cc flag explicitly. For example: a = and_z(pg, p1, p2) cc = ptest_<flag>(pg, a) With the code generator expected to use "s" variants of instructions when available. Differential Revision: https://reviews.llvm.org/D71715	2019-12-20 14:22:27 +00:00
Andrzej Warzynski	be2b7ea89a	[AArch64][SVE] Add intrnisics for saturating scalar arithmetic Summary: The following intrnisics are added: * @llvm.aarch64.sve.sqdec{b\|h\|w\|d\|p} * @llvm.aarch64.sve.sqinc{b\|h\|w\|d\|p} * @llvm.aarch64.sve.uqdec{b\|h\|w\|d\|p} * @llvm.aarch64.sve.uqinc{b\|h\|w\|d\|p} For every intrnisic there a scalar variants (with n32 or n64 suffix) and vector variants (no suffix). Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen, efriedma Subscribers: eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71252	2019-12-20 11:09:40 +00:00
Cullen Rhodes	3f9005eb89	Recommit "[AArch64][SVE] Add permutation and selection intrinsics" Recommit `23c28c4043` (reverted in `dcb48f50bd`) with a fix for an assert "Request for a fixed size on a scalable object" being triggered in `LowerSVEIntrinsicEXT`. The fix is to call `getKnownMinSize` on the TypeSize object.	2019-12-20 10:45:17 +00:00
Andrzej Warzynski	88a973cf68	[AArch64][SVE] Add intrinsics for binary narrowing operations Summary: The following intrinsics for binary narrowing shift righ operations are added: * @llvm.aarch64.sve.shrnb * @llvm.aarch64.sve.uqshrnb * @llvm.aarch64.sve.sqshrnb * @llvm.aarch64.sve.sqshrunb * @llvm.aarch64.sve.uqrshrnb * @llvm.aarch64.sve.sqrshrnb * @llvm.aarch64.sve.sqrshrunb * @llvm.aarch64.sve.shrnt * @llvm.aarch64.sve.uqshrnt * @llvm.aarch64.sve.sqshrnt * @llvm.aarch64.sve.sqshrunt * @llvm.aarch64.sve.uqrshrnt * @llvm.aarch64.sve.sqrshrnt * @llvm.aarch64.sve.sqrshrunt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71552	2019-12-20 10:20:30 +00:00
Sam Parker	acbc9aed72	[ARM][MVE] Fixes for tail predication. 1) Fix an issue with the incorrect value being used for the number of elements being passed to [d\|w]lstp. We were trying to check that the value was available at LoopStart, but this doesn't consider that the last instruction in the block could also define the register. Two helpers have been added to RDA for this. 2) Insert some code to now try to move the element count def or the insertion point so that we can perform more tail predication. 3) Related to (1), the same off-by-one could prevent us from generating a low-overhead loop when a mov lr could have been the last instruction in the block. 4) Fix up some instruction attributes so that not all the low-overhead loop instructions are labelled as branches and terminators - as this is not true for dls/dlstp. Differential Revision: https://reviews.llvm.org/D71609	2019-12-20 09:34:18 +00:00
Lang Hames	07ac3145cc	[Orc][LLJIT] Re-apply `298e183e81` (use JITLink for LLJIT where supported). Patch `d9220b580b` fixed the underlying issue that casued `298e183e81` to fail.	2019-12-19 20:42:26 -08:00
Lang Hames	d9220b580b	[JITLink][MachO] Fix common symbol size plumbing. This fixes the underlying bug that was exposed by `298e183e81`.	2019-12-19 20:41:59 -08:00
River Riddle	a77a290a4d	[CommandLine] Add template instantiations of cl::parser for long and long long. This allows cl::opt<int64_t>. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D71729	2019-12-19 17:01:22 -08:00
Philip Reames	8277c91cf3	[StackMaps] Be explicit about label formation [NFC] (try 2) Recommit after making the same API change in non-x86 targets. This has been build for all targets, and tested for effected ones. Why the difference? Because my disk filled up when I tried make check for all. For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.	2019-12-19 14:05:30 -08:00
Tim Northover	85cb560b8a	ConstrainedFP: use API compatible with opaque pointers. This just updates an IRBuilder interface to take Functions instead of Values so the type can be derived, and fixes some callsites in Clang to call the updated API.	2019-12-19 21:50:47 +00:00
Eric Christopher	3075cd5c9f	Temporarily Revert "[Dsymutil][Debuginfo][NFC] Refactor dsymutil to separate DWARF optimizing part 2." as it causes a layering violation/dependency cycle: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp -> llvm/DebugInfo/DWARF/DWARFExpression.h llvm/include/llvm/DebugInfo/DWARF/DWARFOptimizer.h -> llvm/CodeGen/NonRelocatableStringpool.h This reverts commit `abc7f6800d`.	2019-12-19 13:29:02 -08:00
Eric Christopher	add710eb23	Temporarily Revert "[StackMaps] Be explicit about label formation [NFC]" as it broke the aarch64 build. This reverts commit `bc7595d934`.	2019-12-19 12:52:40 -08:00
Philip Reames	bc7595d934	[StackMaps] Be explicit about label formation [NFC] For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.	2019-12-19 12:38:44 -08:00
Philip Reames	cf6aafa47c	[FaultMaps] Make label formation a bit more explicit [NFC] This is in advance of assembler padding directives support where we'll need to bundle the label w/the corresponding faulting instruction to avoid padding being inserted between.	2019-12-19 12:38:44 -08:00
Guillaume Chatelet	b4982d6ecd	[Alignment][NFC] Align compatible methods for CreateElementUnorderedAtomicMemSet Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71703	2019-12-19 20:03:35 +01:00
Bardia Mahjour	86acaa9457	[DDG] Data Dependence Graph - Ordinals Summary: This patch associates ordinal numbers to the DDG Nodes allowing the builder to order nodes within a pi-block in program order. The algorithm works by simply assuming the order in which the BBList is fed into the builder. The builder already relies on the blocks being in program order so that it can compute the dependencies correctly. Similarly the order of instructions in their parent basic blocks determine their program order. Authored By: bmahjour Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert Reviewed By: Meinersbur Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto, ppc-slack Tags: #llvm Differential Revision: https://reviews.llvm.org/D70986	2019-12-19 10:57:33 -05:00
Cullen Rhodes	dcb48f50bd	Revert "[AArch64][SVE] Add permutation and selection intrinsics" This reverts commit `23c28c4043`. It caused build failures in the following expensive checks builders: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1295 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/700 Reverting for now whilst I figure what the issue is.	2019-12-19 14:26:14 +00:00
Cullen Rhodes	23c28c4043	[AArch64][SVE] Add permutation and selection intrinsics Summary: Adds the following intrinsics: * @llvm.aarch64.sve.clasta * @llvm.aarch64.sve.clasta_n * @llvm.aarch64.sve.clastb * @llvm.aarch64.sve.clastb_n * @llvm.aarch64.sve.compact * @llvm.aarch64.sve.ext * @llvm.aarch64.sve.lasta * @llvm.aarch64.sve.lastb * @llvm.aarch64.sve.rev * @llvm.aarch64.sve.splice * @llvm.aarch64.sve.tbl * @llvm.aarch64.sve.trn1 * @llvm.aarch64.sve.trn2 * @llvm.aarch64.sve.uzp1 * @llvm.aarch64.sve.uzp2 * @llvm.aarch64.sve.zip1 * @llvm.aarch64.sve.zip2 Reviewers: sdesmalen, efriedma, dancgr, mgudim, huntergr, rengolin Reviewed By: sdesmalen, efriedma Subscribers: kmclaughlin, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71401	2019-12-19 13:18:40 +00:00
Alexey Lapshin	abc7f6800d	[Dsymutil][Debuginfo][NFC] Refactor dsymutil to separate DWARF optimizing part 2. That patch is extracted from the D70709. It moves CompileUnit, DeclContext into llvm/DebugInfo/DWARF. It also adds new file DWARFOptimizer with AddressesMap class. AddressesMap generalizes functionality from RelocationManager. Differential Revision: https://reviews.llvm.org/D71271	2019-12-19 15:41:48 +03:00
Jay Foad	c5c935ab66	Make more use of MachineInstr::mayLoadOrStore.	2019-12-19 11:51:52 +00:00
Cullen Rhodes	eca0c97a6b	[AArch64][SVE] Implement pfirst and pnext intrinsics Reviewers: sdesmalen, efriedma, dancgr, mgudim, cameron.mcinally Reviewed By: cameron.mcinally Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71472	2019-12-19 11:03:32 +00:00
Cullen Rhodes	49199465a3	[AArch64][SVE] Implement ptrue intrinsic Reviewers: sdesmalen, eli.friedman, dancgr, mgudim, cameron.mcinally, huntergr, efriedma Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71457	2019-12-19 11:02:05 +00:00
David Blaikie	eed0242330	DebugInfo: Don't use implicit zero addr_base (found when LLVM fails to emit addr_base for gmlt+DWARFv5)	2019-12-18 16:28:19 -08:00
Thomas Lively	71eb8023d8	[WebAssembly] Add avgr_u intrinsics and require nuw in patterns Summary: The vector pattern `(a + b + 1) / 2` was previously selected to an avgr_u instruction regardless of nuw flags, but this is incorrect in the case where either addition may have an unsigned wrap. This CL changes the existing pattern to require both adds to have nuw flags and adds builtin functions and intrinsics for the avgr_u instructions because the corrected pattern is not representable in C. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71648	2019-12-18 15:31:38 -08:00
Lang Hames	5ea91bea15	Revert "[Orc][LLJIT] Use JITLink even if a custom JITTargetMachineBuilder is supplied." This reverts commit `298e183e81`. This commit caused some build failures -- reverting while I investigate.	2019-12-18 15:13:35 -08:00
Lang Hames	298e183e81	[Orc][LLJIT] Use JITLink even if a custom JITTargetMachineBuilder is supplied. LLJITBuilder will now use JITLink on supported platforms even if a custom JITTargetMachineBuilder is supplied, provided that neither the code model, nor the relocation model, nor the ObjectLinkingLayerCreator is set.	2019-12-18 14:17:25 -08:00
Ulrich Weigand	1946461344	[FPEnv] Strict versions of llvm.minimum/llvm.maximum Add new intrinsics llvm.experimental.constrained.minimum llvm.experimental.constrained.maximum as strict versions of llvm.minimum and llvm.maximum. Includes SystemZ back-end support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71624	2019-12-18 21:35:28 +01:00
Jakub Kuderski	3d29c41ad5	[InstCombine] Insert instructions before adding them to worklist Summary: This patch adds instructions to the InstCombine worklist after they are properly inserted. This way we don't get `<badref>`s printed when logging added instructions. It also adds a check in `Worklist::Add` that ensures that all added instructions have parents. Simple test case that illustrates the difference when run with `--debug-only=instcombine`: ``` define i32 @test35(i32 %a, i32 %b) { %1 = or i32 %a, 1135 %2 = or i32 %1, %b ret i32 %2 } ``` Before this patch: ``` INSTCOMBINE ITERATION #1 on test35 IC: ADDING: 3 instrs to worklist IC: Visiting: %1 = or i32 %a, 1135 IC: Visiting: %2 = or i32 %1, %b IC: ADD: %2 = or i32 %a, %b IC: Old = %3 = or i32 %1, %b New = <badref> = or i32 %2, 1135 IC: ADD: <badref> = or i32 %2, 1135 ... ``` With this patch: ``` INSTCOMBINE ITERATION #1 on test35 IC: ADDING: 3 instrs to worklist IC: Visiting: %1 = or i32 %a, 1135 IC: Visiting: %2 = or i32 %1, %b IC: ADD: %2 = or i32 %a, %b IC: Old = %3 = or i32 %1, %b New = <badref> = or i32 %2, 1135 IC: ADD: %3 = or i32 %2, 1135 ... ``` Reviewers: fhahn, davide, spatel, foad, grosser, nikic Reviewed By: nikic Subscribers: nikic, lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71093	2019-12-18 14:55:41 -05:00
Adrian McCarthy	738b5c9639	Fix more VFS tests on Windows Since VFS paths can be in either Posix or Windows style, we have to use a more flexible definition of "absolute" path. The key here is that FileSystem::makeAbsolute is now virtual, and the RedirectingFileSystem override checks for either concept of absolute before trying to make the path absolute by combining it with the current directory. Differential Revision: https://reviews.llvm.org/D70701	2019-12-18 11:38:04 -08:00
Danilo Carvalho Grael	c7abf88411	Revert "[AArch64][SVE] Replace integer immediate intrinsics with splat vector variant" This reverts commit `830e08b98b` and `eb1857ce0d`. This commit leads to an unexpected failure on test/CodeGen/AArch64/sve-gather-scatter-dag-combine.ll. The review will need more changes before its re-commited.	2019-12-18 14:14:10 -05:00
Jakub Kuderski	406b6019cd	[InstCombine] Allow to limit the max number of iterations Summary: This patch teaches InstCombine to accept a new parameter: maximum number of iterations over functions. InstCombine tries to simplify instructions by iterating over the whole function until the function stops changing. As a consequence, the last iteration before reaching a fixpoint visits all instructions in the worklist and never performs any rewrites. Bounding the number of iterations can have 2 benefits: * In case the users of the pass can make a good guess about the number of required iterations, we can save the time normally spent on the last iteration that doesn't change anything. * When the wants to use InstCombine as a cleanup pass, it may be enough to run just a few iterations and stop even before reaching a fixpoint. This can be also useful for implementing a lightweight pass pipeline (think `-O1`). This patch does not change the behavior of opt or Clang -- limiting the number of iterations is entirely opt-in. Reviewers: fhahn, davide, spatel, foad, nlopes, grosser, lebedev.ri, nikic, xbolva00 Reviewed By: spatel Subscribers: craig.topper, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71145	2019-12-18 13:48:54 -05:00
Danilo Carvalho Grael	830e08b98b	[AArch64][SVE] Replace integer immediate intrinsics with splat vector variant Summary: Replace the integer immediate intrisics with splat vector variants so they can be applied as optimizations for the C/C++ intrinsics. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71614	2019-12-18 13:11:21 -05:00
Michael Trent	6f95d33e2b	[ MC ] Match labels to existing fragments even when switching sections. (This commit restores the original branch (`4272372c57`) and applies an additional change dropped from the original in a bad merge. This change should address the previous bot failures. Both changes reviewed by pete.) Summary: This commit builds upon Derek Schuff's 2014 commit for attaching labels to existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 ) When temporary labels appear ahead of a fragment, MCObjectStreamer will track the temporary label symbol in a "Pending Labels" list. Labels are associated with fragments when a real fragment arrives; otherwise, an empty data fragment will be created if the streamer's section changes or if the stream finishes. This commit moves the "Pending Labels" list into each MCStream, so that this label-fragment matching process is resilient to section changes. If the streamer emits a label in a new section, switches to another section to do other work, then switches back to the first section and emits a fragment, that initial label will be associated with this new fragment. Labels will only receive empty data fragments in the case where no other fragment exists for that section. The downstream effects of this can be seen in Mach-O relocations. The previous approach could produce local section relocations and external symbol relocations for the same data in an object file, and this mix of relocation types resulted in problems in the ld64 Mach-O linker. This commit ensures relocations triggered by temporary labels are consistent. Reviewers: pete, ab, dschuff Reviewed By: pete, dschuff Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71368	2019-12-18 09:55:54 -08:00
Raphael Isemann	9a8c803771	Fix modules build by adding missing includes to LTO/Config.h	2019-12-18 17:43:19 +01:00
stozer	89d19d60ad	Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel This reverts commit `1f3dd83cc1`, reapplying commit `bb1b0bc4e5`. The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.	2019-12-18 16:26:42 +00:00
evgeny	ad364956ed	[ThinLTO] Show preserved symbols in DOT files Differential revision: https://reviews.llvm.org/D71608	2019-12-18 18:33:15 +03:00
Daniel Sanders	c3cb089a87	[gicombiner] Import tryCombineIndexedLoadStore() Summary: Now that arbitrary data is supported, import tryCombineIndexedLoadStore() Depends on D69147 Reviewers: bogner, volkan Reviewed By: volkan Subscribers: hiraditya, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69151	2019-12-18 14:41:38 +00:00
Daniel Sanders	55c57408b0	[gicombiner] Add support for arbitrary match data being passed from match to apply Summary: This is used by the extending_loads combine to tell the apply step which use is the preferred one to fold and the other uses should be re-written to consume. Depends on D69117 Reviewers: volkan, bogner Reviewed By: volkan Subscribers: hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69147	2019-12-18 12:27:29 +00:00
stozer	1f3dd83cc1	Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel" Reverted due to build failure on windows bots. This reverts commit `bb1b0bc4e5`.	2019-12-18 11:46:10 +00:00
Daniel Sanders	7ea2e5195a	Revert "Temporarily Revert "[gicombiner] Add the MatchDag structure and parse instruction DAG's from the input"" This reverts commit `e62e760f29`. The issue @uweigand raised should have been fixed by iterating over the vector that owns the operand list data instead of the FoldingSet. The MSVC issue raised by @thakis should have been fixed by relaxing the regexes a little. I don't have a Windows machine available to test that so I tested it by using `perl -p -e 's/0x([0-9a-f]+)/\U\1\E/g' to convert the output of %p to the windows style. I've guessed at the issue @phosek raised as there wasn't enough information to investigate it. What I think is happening on that bot is the -debug option isn't available because the second stage build is a release build. I'm not sure why other release-mode bots didn't report it though.	2019-12-18 11:37:12 +00:00
stozer	bb1b0bc4e5	[DebugInfo] Correctly handle salvaged casts and split fragments at ISel Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions. There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.	2019-12-18 11:09:18 +00:00
Anna Welker	7cd1cfdd6b	[NFC][TTI] Add Alignment for isLegalMasked[Gather/Scatter] Add an extra parameter so alignment can be taken under consideration in gather/scatter legalization. Differential Revision: https://reviews.llvm.org/D71610	2019-12-18 09:14:39 +00:00
Eric Christopher	e62e760f29	Temporarily Revert "[gicombiner] Add the MatchDag structure and parse instruction DAG's from the input" and follow-on patches. This is breaking a few build bots and local builds with follow-up already on the patch thread. This reverts commits `390c8baa54` and `520e3d66e7`.	2019-12-17 16:23:29 -08:00
Mitch Phillips	f827aff859	Revert "[ MC ] Match labels to existing fragments even when switching sections." This reverts commit `4272372c57`. Caused an MSan buildbot failure. More information available in the patch that introduced the bug: https://reviews.llvm.org/D71368	2019-12-17 15:04:26 -08:00
Whitney Tsang	36bdc3dc35	[LoopFusion] Move instructions from FC0.Latch to FC1.Latch. Summary:This PR move instructions from FC0.Latch bottom up to the beginning of FC1.Latch as long as they are proven safe. To illustrate why this is beneficial, let's consider the following example: Before Fusion: header1: br header2 header2: br header2, latch1 latch1: br header1, preheader3 preheader3: br header3 header3: br header4 header4: br header4, latch3 latch3: br header3, exit3 After Fusion (before this PR): header1: br header2 header2: br header2, latch1 latch1: br header3 header3: br header4 header4: br header4, latch3 latch3: br header1, exit3 Note that preheader3 is removed during fusion before this PR. Notice that we cannot fuse loop2 with loop4 as there exists block latch1 in between. This PR move instructions from latch1 to beginning of latch3, and remove block latch1. LoopFusion is now able to fuse loop nest recursively. After Fusion (after this PR): header1: br header2 header2: br header3 header3: br header4 header4: br header2, latch3 latch3: br header1, exit3 Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: kbarton, Meinersbur Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71165	2019-12-17 22:10:23 +00:00
Ulrich Weigand	1e89188d35	[FPEnv] Remove unnecessary rounding mode argument for constrained intrinsics The following intrinsics currently carry a rounding mode metadata argument: llvm.experimental.constrained.minnum llvm.experimental.constrained.maxnum llvm.experimental.constrained.ceil llvm.experimental.constrained.floor llvm.experimental.constrained.round llvm.experimental.constrained.trunc This is not useful since the semantics of those intrinsics do not in any way depend on the rounding mode. In similar cases, other constrained intrinsics do not have the rounding mode argument. Remove it here as well. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71218	2019-12-17 21:10:36 +01:00
Alex Lorenz	25ce33a6e4	[driver][darwin] Pass -platform_version flag to the linker instead of the -<platform>_version_min flag In Xcode 11, ld added a new flag called -platform_version that can be used instead of the old -<platform>_version_min flags. The new flag allows Clang to pass the SDK version from the driver to the linker. This patch adopts the new -platform_version flag in Clang, and starts using it by default, unless a linker version < 520 is passed to the driver. Differential Revision: https://reviews.llvm.org/D71579	2019-12-17 10:26:32 -08:00
Kevin P. Neal	2f40f5681d	[FPEnv] IRBuilder support for constrained sitofp/uitofp.	2019-12-17 12:32:28 -05:00
Ulrich Weigand	d1c0f14be8	[SystemZ][FPEnv] Back-end support for STRICT_[SU]INT_TO_FP As of `b1d8576` there is middle-end support for STRICT_[SU]INT_TO_FP, so this patch adds SystemZ back-end support as well. The patch is SystemZ target specific except for adding SD patterns strict_[su]int_to_fp and any_[su]int_to_fp to TargetSelectionDAG.td as usual.	2019-12-17 18:24:05 +01:00
Daniel Sanders	520e3d66e7	[gicombiner] Process the MatchDag such that every node is reachable from the roots Summary: When we build the walk across these DAG's we need to be able to reach every node from the roots. Flip and traversal edges (so that use->def becomes def->uses) that make nodes unreachable. Note that early on we'll just error out on these flipped edges as def->uses edges are more complicated to match due to their one->many nature. Depends on D69077 Reviewers: volkan, bogner Subscribers: llvm-commits	2019-12-17 17:03:24 +00:00
Michael Trent	4272372c57	[ MC ] Match labels to existing fragments even when switching sections. Summary: This commit builds upon Derek Schuff's 2014 commit for attaching labels to existing fragments ( Diff Revision: http://reviews.llvm.org/D5915 ) When temporary labels appear ahead of a fragment, MCObjectStreamer will track the temporary label symbol in a "Pending Labels" list. Labels are associated with fragments when a real fragment arrives; otherwise, an empty data fragment will be created if the streamer's section changes or if the stream finishes. This commit moves the "Pending Labels" list into each MCStream, so that this label-fragment matching process is resilient to section changes. If the streamer emits a label in a new section, switches to another section to do other work, then switches back to the first section and emits a fragment, that initial label will be associated with this new fragment. Labels will only receive empty data fragments in the case where no other fragment exists for that section. The downstream effects of this can be seen in Mach-O relocations. The previous approach could produce local section relocations and external symbol relocations for the same data in an object file, and this mix of relocation types resulted in problems in the ld64 Mach-O linker. This commit ensures relocations triggered by temporary labels are consistent. Reviewers: pete, ab, dschuff Reviewed By: pete, dschuff Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71368	2019-12-17 08:49:25 -08:00
Kevin P. Neal	b1d8576b0a	This adds constrained intrinsics for the signed and unsigned conversions of integers to floating point. This includes some of Craig Topper's changes for promotion support from D71130. Differential Revision: https://reviews.llvm.org/D69275	2019-12-17 10:06:51 -05:00
Kristof Beyls	870f39d310	Fix assertion failure in getMemOperandWithOffsetWidth This fixes an assertion failure that triggers inside getMemOperandWithOffset when Machine Sinking calls it on a MachineInstr that is not a memory operation. Different backends implement getMemOperandWithOffset differently: some return false on non-memory MachineInstrs, others assert. The Machine Sinking pass in at least SinkingPreventsImplicitNullCheck relies on getMemOperandWithOffset to return false on non-memory MachineInstrs, instead of asserting. This patch updates the documentation on getMemOperandWithOffset that it should return false on any MachineInstr it cannot handle, instead of asserting. It also adapts the in-tree backends accordingly where necessary. Differential Revision: https://reviews.llvm.org/D71359	2019-12-17 10:56:09 +00:00
Guillaume Chatelet	531c1161b9	Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" Summary: This is a resubmit of D71473. This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients. Functions will be deprecated one by one and as in tree code is cleaned up. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: aaron.ballman, courbet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71547	2019-12-17 10:07:46 +01:00
Russell Gallop	531c71118f	Revert "[Support] Fix time trace multi threaded support with LLVM_ENABLE_THREADS=OFF" This reverts commit `2bbcf156ac`. This was failing on systems which use __thread such as http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/30851	2019-12-17 08:59:05 +00:00
Russell Gallop	2bbcf156ac	[Support] Fix time trace multi threaded support with LLVM_ENABLE_THREADS=OFF Following on from `8ddcd1dc26`, which added the support. As pointed out on D71059 this does not build on some systems with LLVM_ENABLE_THREADS=OFF. Differential Revision: https://reviews.llvm.org/D71548	2019-12-17 08:42:56 +00:00
Raphael Isemann	ccfab8e459	[ObjC][DWARF] Emit DW_AT_APPLE_objc_direct for methods marked as __attribute__((objc_direct)) Summary: With DWARF5 it is no longer possible to distinguish normal methods and methods with `__attribute__((objc_direct))` by just looking at the debug information as they are both now children of the of the DW_TAG_structure_type that defines them (before only the `__attribute__((objc_direct))` methods were children). This means that in LLDB we are no longer able to create a correct Clang AST of a module by just looking at the debug information. Instead we would need to call the Objective-C runtime to see which of the methods have a `__attribute__((objc_direct))` and then add the attribute to our own Clang AST depending on what the runtime returns. This would mean that we either let the module AST be dependent on the Objective-C runtime (which doesn't seem right) or we retroactively add the missing attribute to the imported AST in our expressions. A third option is to annotate methods with `__attribute__((objc_direct))` as `DW_AT_APPLE_objc_direct` which is what this patch implements. This way LLDB doesn't have to call the runtime for any `__attribute__((objc_direct))` method and the AST in our module will already be correct when we create it. Reviewers: aprantl, SouraVX Reviewed By: aprantl Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D71201	2019-12-17 09:40:36 +01:00
Danilo Carvalho Grael	f933878991	[AArch64][SVE] Add patterns for logical immediate operations. Summary: Add pattern matching for the following SVE logical vector and immediate instructions: - and/bic, orr/orn, eor/eon. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71483	2019-12-16 16:18:34 -05:00
Thomas Lively	3a93756dfb	[WebAssembly] Replace SIMD int min/max builtins with patterns Summary: The instructions were originally implemented via builtins and intrinsics so users would have to explicitly opt-in to using them. This was useful while were validating whether these instructions should have been merged into the spec proposal. Now that they have been, we can use normal codegen patterns, so the intrinsics and builtins are no longer useful. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71500	2019-12-16 11:48:49 -08:00
John McCall	2e2d142efe	Add Optional::map.	2019-12-16 13:34:00 -05:00
Teresa Johnson	878ab6df03	[TLI] Support for per-Function TLI that overrides available libfuncs Summary: Follow-on to D66428 and D71193, to build the TLI per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. With D71193, the -fno-builtin* flags are converted to function attributes, so we can now set this information per-function on the TLI. In this patch, the TLI constructor is changed to take a Function, which can be used to override the available builtins. The TLI is augmented with an array that can be used to specify which builtins are not available for the corresponding function. The available function checks are changed to consult this override before checking the underlying module level baseline TLII. New code is added to set this override array based on the attributes. I also removed the code that sets availability in the TLII in clang from the options, which is no longer needed. I removed a per-Triple caching of TLII objects in the analysis object, as it is based on the Module's Triple which is the same for all functions in any case. Is there a case where we would be compiling multiple Modules with different Triples in one compilation? Finally, I have changed the legacy analysis wrapper to create and use the new PM analysis class (TargetLibraryAnalysis) in getTLI. This is consistent with the behavior of getTTI for the legacy TargetTransformInfo analysis. This change means that getTLI now creates a new TLI on each call (although that should be very cheap as we cache the module level TLII, and computing the per-function attribute based availability should also be reasonably efficient). I measured the compile time for a large C++ file with tens of thousands of functions and as expected there was no increase. Reviewers: chandlerc, hfinkel, gchatelet Subscribers: mehdi_amini, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67923	2019-12-16 09:19:30 -08:00
Guillaume Chatelet	4658da10e4	Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" This reverts commit `181ab91efc`.	2019-12-16 15:19:49 +01:00
Guillaume Chatelet	181ab91efc	[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove Summary: This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients. Functions will be deprecated one by one and as in tree code is cleaned up. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71473	2019-12-16 13:35:55 +01:00
Andrzej Warzynski	c41d2b5ab2	[AArch64][SVE2] Add intrinsics for binary narrowing operations Summary: The following intrinsics for binary narrowing add and sub operations are added: * @llvm.aarch64.sve.addhnb * @llvm.aarch64.sve.addhnt * @llvm.aarch64.sve.raddhnb * @llvm.aarch64.sve.raddhnt * @llvm.aarch64.sve.subhnb * @llvm.aarch64.sve.subhnt * @llvm.aarch64.sve.rsubhnb * @llvm.aarch64.sve.rsubhnt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: sdesmalen, efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71424	2019-12-16 12:22:56 +00:00
Andrzej Warzynski	7e20c3a71d	[Aarch64][SVE] Add intrinsics for scatter stores Summary: This patch adds the following SVE intrinsics for scatter stores: * 64-bit offsets: * @llvm.aarch64.sve.st1.scatter (unscaled) * @llvm.aarch64.sve.st1.scatter.index (scaled) * 32-bit unscaled offsets: * @llvm.aarch64.sve.st1.scatter.uxtw (zero-extended offset) * @llvm.aarch64.sve.st1.scatter.sxtw (sign-extended-offset) * 32-bit scaled offsets: * @llvm.aarch64.sve.st1.scatter.uxtw.index (zero-extended offset) * @llvm.aarch64.sve.st1.scatter.sxtw.index (sign-extended offset) * vector base + immediate: * @llvm.aarch64.sve.st1.scatter.imm Reviewers: rengolin, efriedma, sdesmalen Reviewed By: efriedma, sdesmalen Subscribers: kmclaughlin, eli.friedman, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71074	2019-12-16 11:52:53 +00:00
Bjorn Pettersson	1c49553c19	[BasicBlockUtils] Add utility to remove redundant dbg.value instrs Summary: Add a RemoveRedundantDbgInstrs to BasicBlockUtils with the goal to remove redundant dbg intrinsics from a basic block. This can be useful after various transforms, as it might be simpler to do a filtering of dbg intrinsics after the transform than during the transform. One primary use case would be to replace a too aggressive removal done by MergeBlockIntoPredecessor, seen at loop rotate (not done in this patch). The elimination algorithm currently focuses on dbg.value intrinsics and is doing two iterations over the BB. First we iterate backward starting at the last instruction in the BB. Whenever a consecutive sequence of dbg.value instructions are found we keep the last dbg.value for each variable found (variable fragments are identified using the {DILocalVariable, FragmentInfo, inlinedAt} triple as given by the DebugVariable helper class). Next we iterate forward starting at the first instruction in the BB. Whenever we find a dbg.value describing a DebugVariable (identified by {DILocalVariable, inlinedAt}) we save the {DIValue, DIExpression} that describes that variables value. But if the variable already was mapped to the same {DIValue, DIExpression} pair we instead drop the second dbg.value. To ease the process of making lit tests for this utility a new pass is introduced called RedundantDbgInstElimination. It can be executed by opt using -redundant-dbg-inst-elim. Reviewers: aprantl, jmorse, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71478	2019-12-16 11:41:21 +01:00
Jay Foad	4f17b1784e	Fix for AMDGPU MUL_I24 known bits calculation Summary: At present, the code calculating known bits of AMDGPU MUL_I24 confuses the concepts of "non-negative number" and "positive number". In some situations, it results in incorrect code. I have a case where the optimizer replaces the result of calculating MUL_I24(-5, 0) with -8. Reviewers: foad, arsenm Reviewed By: arsenm Subscribers: foad, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Patch by Eugene Kuznetsov. Differential Revision: https://reviews.llvm.org/D70367	2019-12-16 10:25:57 +00:00
Lang Hames	c0143f37da	[ORC] Make ObjectLinkingLayer own its jitlink::MemoryManager. This relieves ObjectLinkingLayer clients of the responsibility of holding the memory manager. This makes it easier to select between RTDyldObjectLinkingLayer (which already owned its memory manager factory) and ObjectLinkingLayer at runtime as clients aren't required to hold a jitlink::MemoryManager field just in case ObjectLinkingLayer is selected.	2019-12-15 17:35:52 -08:00
Fangrui Song	fdb408f348	[MC] Delete unused MCAsmInfoELF::UsesNonexecutableStackSection after EM_WEBASSEMBLY was removed in D48744 This removes remnant of D15969 which hasn't been removed by D48744.	2019-12-15 15:43:30 -08:00
Roman Tereshin	8731799fc6	[Legalizer] Making artifact combining order-independent Legalization algorithm is complicated by two facts: 1) While regular instructions should be possible to legalize in an isolated, per-instruction, context-free manner, legalization artifacts can only be eliminated in pairs, which could be deeply, and ultimately arbitrary nested: { [ () ] }, where which paranthesis kind depicts an artifact kind, like extend, unmerge, etc. Such structure can only be fully eliminated by simple local combines if they are attempted in a particular order (inside out), or alternatively by repeated scans each eliminating only one innermost pair, resulting in O(n^2) complexity. 2) Some artifacts might in fact be regular instructions that could (and sometimes should) be legalized by the target-specific rules. Which means failure to eliminate all artifacts on the first iteration is not a failure, they need to be tried as instructions, which may produce more artifacts, including the ones that are in fact regular instructions, resulting in a non-constant number of iterations required to finish the process. I trust the recently introduced termination condition (no new artifacts were created during as-a-regular-instruction-retrial of artifacts not eliminated on the previous iteration) to be efficient in providing termination, but only performing the legalization in full if and only if at each step such chains of artifacts are successfully eliminated in full as well. Which is currently not guaranteed, as the artifact combines are applied only once and in an arbitrary order that has to do with the order of creation or insertion of artifacts into their worklist, which is a no particular order. In this patch I make a small change to the artifact combiner, making it to re-insert into the worklist immediate (modulo a look-through copies) artifact users of each vreg that changes its definition due to an artifact combine. Here the first scan through the artifacts worklist, while not being done in any guaranteed order, only needs to find the innermost pair(s) of artifacts that could be immediately combined out. After that the process follows def-use chains, making them shorter at each step, thus combining everything that can be combined in O(n) time. Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders Reviewed By: aditya_nandakumar, paquette Tags: #llvm Differential Revision: https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Roman Tereshin	18bf9670aa	[Legalizer] Refactoring out legalizeMachineFunction and introducing new unittests/CodeGen/GlobalISel/LegalizerTest.cpp relying on it to unit test the entire legalizer algorithm (including the top-level main loop). See also https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Alex Richardson	fc83f53a86	[NFC] Implement SelectionDAG::getObjectPtrOffset() using getMemBasePlusOffset() Summary: This change is preparatory work to use this helper functions in more places. In order to make this change, getMemBasePlusOffset() has been extended to also take a SDNodeFlags parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71206	2019-12-13 21:40:03 +00:00
Alex Richardson	ea8888d1af	[NFC] Add a SDValue overload for SelectionDAG::getMemBasePlusOffset() Summary: This change is preparatory work to use this helper functions in more places. Currently the function only allows integer constants offsets, but there are cases where we can use an existing SDValue parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel, craig.topper Reviewed By: spatel, craig.topper Subscribers: craig.topper, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71205	2019-12-13 21:40:03 +00:00
Alex Richardson	d9bb70acd7	[NFC] Change SelectionDAG::getMemBasePlusOffset() to use int64_t Summary: This change is preparatory work to use this helper functions in more places. Currently the function only allows positive offsets, but there are cases where we want to subtract an offset from an existing pointer. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71204	2019-12-13 21:40:03 +00:00
Francesco Petrogalli	19f73f0d1b	Revert "[VectorUtils] Introduce the Vector Function Database (VFDatabase)." This reverts commit `0be81968a2`. The VFDatabase needs some rework to be able to handle vectorization and subsequent scalarization of intrinsics in out-of-tree versions of the compiler. For more details, see the discussion in https://reviews.llvm.org/D67572.	2019-12-13 19:42:04 +00:00
Nicola Zaghen	97572775d2	Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. This fixes the buildbot failures. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-13 14:30:21 +00:00
Mikhail Maltsev	99581fd4c8	[ARM][MVE] Add vector reduction intrinsics with two vector operands Summary: This patch adds intrinsics for the following MVE instructions: * VABAV * VMLADAV, VMLSDAV * VMLALDAV, VMLSLDAV * VRMLALDAVH, VRMLSLDAVH Each of the above 4 groups has a corresponding new LLVM IR intrinsic, since the instructions cannot be easily represented using general-purpose IR operations. Reviewers: simon_tatham, ostannard, dmgreen, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71062	2019-12-13 13:17:29 +00:00
Simon Tatham	25305a9311	[ARM][MVE] Add intrinsics for more immediate shifts. Summary: This fills in the remaining shift operations that take a single vector input and an immediate shift count: the `vqshl`, `vqshlu`, `vrshr` and `vshll[bt]` families. `vshll[bt]` (which shifts each input lane left into a double-width output lane) is the most interesting one. There are separate MC instruction ids for shifting by exactly the input lane width and shifting by less than that, because the instruction encoding is so completely different for the lane-width special case. So I had to write two sets of patterns to match based on the immediate shift count, which involved adding a ComplexPattern matcher to avoid the general-case pattern accidentally matching the special case too. For that family I've made sure to add an llc codegen test for both versions of each instruction. I'm experimenting with a new strategy for parametrising the isel patterns for all these instructions: adding extra fields to the relevant `Instruction` subclass itself, which are ignored by the Tablegen backends that generate the MC data, but can be retrieved from each instance of that instruction subclass when it's passed as a template parameter to the multiclass that generates its isel patterns. A nice effect of that is that I can fill in those informational fields using `let` blocks, rather than having to type them out once per instruction at `defm` time. (As a result, quite a lot of existing instruction `def`s are reindented by this patch, so it's clearer to read with whitespace changes ignored.) Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71458	2019-12-13 13:07:39 +00:00
Alex Richardson	be15dfa88f	[NFC] Use EVT instead of bool for getSetCCInverse() Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void )0x12033091e < (void )0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917	2019-12-13 12:22:03 +00:00
Mark Murray	228c74076d	[ARM][MVE][Intrinsics] Add _x() variants of my _m() intrinsics. Summary: Better use of multiclass is used, and this helped find some existing bugs in the predicated VMULL* intrinsics, which are now fixed. The refactored VMULL[TB]Q_(INT\|POLY)_M() intrinsics were discovered to have an argument ("inactive") with incorrect type, and this required a fix that is included in this whole patch. The argument "inactive" should have been the same width (per vector element) as the return type of the intrinsic, but was not in the case where the return type was double the element width of the input types. To assist in testing the multiclassing , and to thwart further gremlins, the unit tests are improved in scope. The .ll tests are all generated by a small bit of throw-away scripting from the corresponding .c tests, and as such the diffs are large and nasty. Look at the file rather than the diff. Reviewers: dmgreen, miyuki, ostannard, simon_tatham Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71421	2019-12-13 11:51:23 +00:00
Kerry McLaughlin	4194ca8e5a	Recommit "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores" Updated pred_load patterns added to AArch64SVEInstrInfo.td by this patch to use reg + imm non-temporal loads to fix previous test failures. Original commit message: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above.	2019-12-13 10:08:20 +00:00
Georgii Rymar	86e652f828	[yaml2obj] - Add a way to override sh_flags section field. Currently we have the `Flags` property that allows to set flags for a section. The problem is that it does not allow us to set an arbitrary value, because of bit fields validation under the hood. An arbitrary values can be used to test specific broken cases. We probably do not want to relax the validation, so this patch adds a `ShSize` property that allows to override the `sh_size`. It is inline with others `Sh*` properties we have already. Differential revision: https://reviews.llvm.org/D71411	2019-12-13 11:54:37 +03:00
Rui Ueyama	69da7e29de	Revert an accidental commit `af5ca40b47`	2019-12-13 15:17:40 +09:00
Rui Ueyama	af5ca40b47	temporary	2019-12-13 14:35:03 +09:00
Danilo Carvalho Grael	6bed43f3c4	[AArch64][SVE] Add integer arithmetic with immediate instructions. Summary: Add pattern matching for the following instructions: - add, sub, subr, sqadd, sqsub, uqadd, uqsub This patch required complex patterns to match the immediate with optinal left shift. I re-used the Select function from the other SVE repo to implement the complext pattern. I plan on doing another patch to also match constant vector of the same immediate. Reviewers: sdesmalen, huntergr, rengolin, efriedma, c-rhodes, mgudim, kmclaughlin Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71370	2019-12-12 17:28:46 -05:00
Johannes Doerfert	6abd01e462	[Attributor][FIX] Do treat byval arguments special When we reason about the pointer argument that is byval we actually reason about a local copy of the value passed at the call site. This was not the case before and we wrongly introduced attributes based on the surrounding function. AAMemoryBehaviorArgument, AAMemoryBehaviorCallSiteArgument and AANoCaptureCallSiteArgument are made aware of byval now. The code to skip "subsuming positions" for reasoning follows a common pattern and we should refactor it. A TODO was added. Discovered by @efriedma as part of D69748.	2019-12-12 16:04:21 -06:00
Teresa Johnson	c8e0bb3b2c	[LTO] Support for embedding bitcode section during LTO Summary: This adds support for embedding bitcode in a binary during LTO. The libLTO gains supports the `-lto-embed-bitcode` flag. The option allows users of the LTO library to embed a bitcode section. For example, LLD can pass the option via `ld.lld -mllvm=-lto-embed-bitcode`. This feature allows doing something comparable to `clang -c -fembed-bitcode`, but on the (LTO) linker level. Having bitcode alongside native code has many use-cases. To give an example, the MacOS linker can create a `-bitcode_bundle` section containing bitcode. Also, having this feature built into LLVM is an alternative to 3rd party tools such as [[ https://github.com/travitch/whole-program-llvm \| wllvm ]] or [[ https://github.com/SRI-CSL/gllvm \| gllvm ]]. As with these tools, this feature simplifies creating "whole-program" llvm bitcode files, but in contrast to wllvm/gllvm it does not rely on a specific llvm frontend/driver. Patch by Josef Eisl <josef.eisl@oracle.com> Reviewers: #llvm, #clang, rsmith, pcc, alexshap, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mehdi_amini, inglorion, hiraditya, aheejin, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits, #llvm, #clang Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68213	2019-12-12 12:34:19 -08:00
Kit Barton	61368c8e98	Rename LoopInfo::isRotated() to LoopInfo::isRotatedForm(). This patch renames the LoopInfo::isRotated() method to LoopInfo::isRotatedForm() to make it clear that the method checks whether the loop is in rotated form, not whether the loop has been rotated by the LoopRotation pass.	2019-12-12 14:22:36 -05:00
Florian Hahn	526244b187	[Matrix] Add first set of matrix intrinsics and initial lowering pass. This is the first patch adding an initial set of matrix intrinsics and a corresponding lowering pass. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html The first patch introduces four new intrinsics (transpose, multiply, columnwise load and store) and a LowerMatrixIntrinsics pass, that lowers those intrinsics to vector operations. Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix embedded in a <16 x float> vector) and the intrinsics take the dimension information as parameters. Those parameters need to be ConstantInt. For the memory layout, we initially assume column-major, but in the RFC we also described how to extend the intrinsics to support row-major as well. For the initial lowering, we split the input of the intrinsics into a set of column vectors, transform those column vectors and concatenate the result columns to a flat result vector. This allows us to lower the intrinsics without any shape propagation, as mentioned in the RFC. In follow-up patches, we plan to submit the following improvements: * Shape propagation to eliminate the embedding/splitting for each intrinsic. * Fused & tiled lowering of multiply and other operations. * Optimization remarks highlighting matrix expressions and costs. * Generate loops for operations on large matrixes. * More general block processing for operation on large vectors, exploiting shape information. We would like to add dedicated transpose, columnwise load and store intrinsics, even though they are not strictly necessary. For example, we could instead emit a large shufflevector instruction instead of the transpose. But we expect that to (1) become unwieldy for larger matrixes (even for 16x16 matrixes, the resulting shufflevector masks would be huge), (2) risk instcombine making small changes, causing us to fail to detect the transpose, preventing better lowerings For the load/store, we are additionally planning on exploiting the intrinsics for better alias analysis. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70456	2019-12-12 15:42:18 +00:00
Guillaume Chatelet	dbc5acf8ce	[Alignment][NFC] Adding Align compatible methods to IntrinsicInst/IRBuilder Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71420	2019-12-12 16:22:15 +01:00
Hideto Ueno	4ecf25545c	[Attributor][NFC] Fix comments and unnecessary comma	2019-12-12 13:42:40 +00:00
Russell Gallop	8ddcd1dc26	[Support] Extend TimeProfiler to support multiple threads This makes TimeTraceProfilerInstance thread local. Added timeTraceProfilerFinishThread() which moves the thread local instance to a global vector of instances. timeTraceProfilerWrite() then writes recorded data from all instances. Threads are identified based on their thread ids. Totals are reported with artificial thread ids higher than the real ones. Replaced raw pointer for TimeTraceProfilerInstance with unique_ptr. Differential Revision: https://reviews.llvm.org/D71059	2019-12-12 12:01:44 +00:00
Nicola Zaghen	f798eb21ec	Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same." This reverts commit `5f6208778f`. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.	2019-12-12 10:29:54 +00:00
Nicola Zaghen	5f6208778f	[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-12 10:07:01 +00:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
Reid Kleckner	85ba5f637a	Rename TTI::getIntImmCost for instructions and intrinsics Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381	2019-12-11 18:00:20 -08:00
Sanjay Patel	cdf5cfea8e	Revert "[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression()" This reverts commit `d1f0bdf2d2`. The patch can cause infinite loops in DAGCombiner.	2019-12-11 16:56:58 -05:00
Nikita Popov	fe593fe15f	[ADT] Fix SmallDenseMap assertion with large InlineBuckets Fixes issue encountered in D56362, where I tried to use a SmallSetVector<Instruction*, 128> with an excessively large number of inline elements. This triggers an "Must allocate more buckets than are inline" assertion inside allocateBuckets() under certain usage patterns. The issue is as follows: The grow() method is used either to grow the map, or to rehash it and remove tombstones. The latter is done if the fraction of empty (non-used, non-tombstone) elements is below 1/8. In this case grow() is invoked with the current number of buckets. This is currently incorrectly handled for dense maps using the small rep. The current implementation will switch them over to the large rep, which violates the invariant that the large rep is only used if there are more than InlineBuckets buckets. This patch fixes the issue by staying in the small rep and only moving the buckets. An alternative, if we do want to switch to the large rep in this case, would be to relax the assertion in allocateBuckets(). Differential Revision: https://reviews.llvm.org/D56455	2019-12-11 21:41:14 +01:00
Johannes Doerfert	d23c61490c	[OpenMP] Introduce the OpenMP-IR-Builder This is the initial patch for the OpenMP-IR-Builder, as discussed on the mailing list ([1] and later) and at the US Dev Meeting'19. The design is similar to D61953 but: - in a non-WIP status, with proper documentation and working. - using a OpenMPKinds.def file to manage lists of directives, runtime functions, types, ..., similar to the current Clang implementation. - restricted to handle only (simple) barriers, to implement most `#pragma omp barrier` directives and most implicit barriers. - properly hooked into Clang to be used if possible (D69922). - compatible with the remaining code generation. Parts have been extracted into D69853. The plan is to have multiple people working on moving logic from Clang here once the initial scaffolding (=this patch) landed. [1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim Subscribers: mgorny, hiraditya, bollu, guansong, jfb, cfe-commits, llvm-commits, penzn, ppenzin Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69785	2019-12-11 14:38:49 -06:00
Sam Clegg	881d877846	[WebAssembly] Add new `export_name` clang attribute for controlling wasm export names This is equivalent to the existing `import_name` and `import_module` attributes which control the import names in the final wasm binary produced by lld. This maps the existing This attribute currently requires a string rather than using the symbol name for a couple of reasons: 1. Avoid confusion with static and dynamic linking which is based on symbol name. Exporting a function from a wasm module using this directive is orthogonal to both static and dynamic linking. 2. Avoids name mangling. Differential Revision: https://reviews.llvm.org/D70520	2019-12-11 11:54:57 -08:00
Andrzej Warzynski	a75463c471	Add intrinsics for unary narrowing operations Summary: The following intrinsics for unary narrowing operations are added: * @llvm.aarch64.sve.sqxtnb * @llvm.aarch64.sve.uqxtnb * @llvm.aarch64.sve.sqxtunb * @llvm.aarch64.sve.sqxtnt * @llvm.aarch64.sve.uqxtnt * @llvm.aarch64.sve.sqxtunt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71270	2019-12-11 18:55:51 +00:00
Sanjay Patel	d1f0bdf2d2	[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch skips the use check during the rewrite phase. So we determine that some expression isNegatibleForFree (identically to without this patch), but during the rewrite, don't rely on use counts to decide how to create the optimal expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-11 13:30:39 -05:00
Kit Barton	942c9946cc	[Loop] Add isRotated method to Loop class. Summary: This patch adds a method to determine if a loop is in rotated form (the latch is an exiting block). It also modifies the getLoopGuardBranch method to use this new method. This method can also be used in Loopfusion. Once this patch lands I will make the corresponding changes there. Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney, fhahn, hfinkel Reviewed By: Meinersbur Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65958	2019-12-11 09:43:10 -05:00
Russell Gallop	df494f7512	[Support] Add TimeTraceScope constructor without detail arg This simplifies code where no extra details are required Also don't write out detail when it is empty. Differential Revision: https://reviews.llvm.org/D71347	2019-12-11 14:32:21 +00:00
Kerry McLaughlin	c0a3ab3655	Revert "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores" This reverts commit `3f5bf35f86` as it was causing build failures in llvm-clang-x86_64-expensive-checks: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/392 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1045	2019-12-11 13:58:39 +00:00
Guillaume Chatelet	0a0d54b357	[Alignment][NFC] Introduce Align in IRBuilder Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71343	2019-12-11 14:41:23 +01:00
Simon Tatham	1fed9a0c0c	[TableGen] Add bang-operators !getop and !setop. Summary: These allow you to get and set the operator of a dag node, without affecting its list of arguments. `!getop` is slightly fiddly because in many contexts you need its return value to have a static type more specific than 'any record'. It works to say `!cast<BaseClass>(!getop(...))`, but it's cumbersome, so I made `!getop` take an optional type suffix itself, so that can be written as the shorter `!getop<BaseClass>(...)`. Reviewers: hfinkel, nhaehnle Reviewed By: nhaehnle Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71191	2019-12-11 12:05:22 +00:00
Kerry McLaughlin	3f5bf35f86	[AArch64][SVE] Implement intrinsics for non-temporal loads & stores Summary: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above. Reviewers: sdesmalen, paulwalker-arm, dancgr, mgudim, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71000	2019-12-11 11:13:51 +00:00
Sjoerd Meijer	d97cf1f889	[ARM][LowOverheadLoops] Remove dead loop update instructions. After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007	2019-12-11 10:20:19 +00:00
Simon Tatham	bd0f271c9e	[ARM][MVE] Add intrinsics for immediate shifts. (reland) This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: dmgreen, MarkMurrayARM Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065	2019-12-11 10:10:09 +00:00
QingShan Zhang	46822083ef	[NFC] Correct the example in the comments of JSON.h to avoid mislead user	2019-12-11 10:03:34 +00:00
Florian Hahn	b48b4ed1a0	[MCRegInfo] Add sub_and_superregs_inclusive iterator range. Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70566	2019-12-11 09:53:19 +00:00
Andrzej Warzynski	1eecbda087	[AArch64][SVE] Move TableGen class definitions for gather loads (NFC) Move 2 intrinsic class definitions so that they're all clustered in one place. Patch submitted to test commit access.	2019-12-11 09:48:48 +00:00
Florian Hahn	11f311875f	[LiveRegUnits] Add phys_regs_and_masks iterator range (NFC). This iterator range just includes physical registers and register masks, which are interesting when dealing with register liveness. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70562	2019-12-11 09:34:42 +00:00
Wang, Pengfei	21bc8631fe	[FPEnv][X86] Constrained FCmp intrinsics enabling on X86 Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582	2019-12-11 08:23:09 +08:00
Vlad Tsyrklevich	636c93ed11	Revert "Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty..." This reverts commit `f2ba93971c`, it was causing build timeouts on sanitizer-x86_64-linux-autoconf such as http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/44917	2019-12-10 16:03:17 -08:00
Sanjay Patel	16e9315685	[IR] allow undefined elements when checking for splat constants This mimics the related call in SDAG. The caller is responsible for ensuring that undef values are propagated safely.	2019-12-10 17:16:59 -05:00
Vedant Kumar	8ddec9ad46	Fix -Wincomplete-umbrella warning in the modules build [281/3666] Building CXX object lib/IR/CMakeFiles/LLVMCore.dir/IntrinsicInst.cpp.o /Users/vsk/src/llvm-project-master/llvm/lib/IR/IntrinsicInst.cpp:155:2: warning: missing submodule 'LLVM_IR.ConstrainedOps' [-Wincomplete-umbrella]	2019-12-10 11:19:17 -08:00
Kevin P. Neal	6515c524b0	[FPEnv] clang support for constrained FP builtins Change the IRBuilder and clang so that constrained FP intrinsics will be emitted for builtins when appropriate. Only non-target-specific builtins are affected in this patch. Differential Revision: https://reviews.llvm.org/D70256	2019-12-10 13:09:12 -05:00
Fangrui Song	83b79f8a18	[VectorUtils] Fix -Wunused-private-field after D67572	2019-12-10 09:36:23 -08:00
Francesco Petrogalli	0be81968a2	[VectorUtils] Introduce the Vector Function Database (VFDatabase). This patch introduced the VFDatabase, the framework proposed in http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html. [] In this patch the VFDatabase is used to bridge the TargetLibraryInfo (TLI) calls that were previously used to query for the availability of vector counterparts of scalar functions. The VFISAKind field `ISA` of VFShape have been moved into into VFInfo, under the assumption that different vector ISAs may provide the same vector signature. At the moment, the vectorizer accepts any of the available ISAs as long as the signature provided by the VFDatabase matches the one expected in the vectorization process. For example, when targeting AVX or AVX2, which both have 256-bit registers, the IR signature of the two vector functions associated to the two ISAs is the same. The `getVectorizedFunction` method at the moment returns the first available match. We will need to add more heuristics to the search system to decide which of the available version (TLI, AVX, AVX2, ...) the system should prefer, when multiple versions with the same VFShape are present. Some of the code in this patch is based on the work done by Sumedh Arani in https://reviews.llvm.org/D66025. [] Notice that in the proposal the VFDatabase was called SVFS. The name VFDatabase is more in line with LLVM recommendations for naming classes and variables. Differential Revision: https://reviews.llvm.org/D67572	2019-12-10 16:36:44 +00:00
Mikhail Maltsev	e6d3261c67	[ARM][MVE] Refactor complex vector intrinsics [NFCI] Summary: This patch refactors instruction selection of the complex vector addition, multiplication and multiply-add intrinsics, so that it is now based on TableGen patterns rather than C++ code. It also changes the first parameter (halving vs non-halving) of the arm_mve_vcaddq IR intrinsic to match the corresponding instruction encoding, hence it requires some changes in the tests. The patch addresses David's comment in https://reviews.llvm.org/D71190 Reviewers: dmgreen, ostannard, simon_tatham, MarkMurrayARM Reviewed By: dmgreen Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71245	2019-12-10 16:21:52 +00:00
Yonghong Song	d77ae1552f	[DebugInfo] Support to emit debugInfo for extern variables Extern variable usage in BPF is different from traditional pure user space application. Recent discussion in linux bpf mailing list has two use cases where debug info types are required to use extern variables: - extern types are required to have a suitable interface in libbpf (bpf loader) to provide kernel config parameters to bpf programs. https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23PtwaZs-yAyafmXC9g@mail.gmail.com/T/#t - extern types are required so kernel bpf verifier can verify program which uses external functions more precisely. This will make later link with actual external function no need to reverify. https://lore.kernel.org/bpf/87eez4odqp.fsf@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed This patch added clang support to emit debuginfo for extern variables with a TargetInfo hook to enable it. The debuginfo for the extern variable is emitted only if that extern variable is referenced in the current compilation unit. Currently, only BPF target enables to generate debug info for extern variables. The emission of such debuginfo is disabled for C++ at this moment since BPF only supports a subset of C language. Emission with C++ can be enabled later if an appropriate use case is identified. -fstandalone-debug permits us to see more debuginfo with the cost of bloated binary size. This patch did not add emission of extern variable debug info with -fstandalone-debug. This can be re-evaluated if there is a real need. Differential Revision: https://reviews.llvm.org/D70696	2019-12-10 08:09:51 -08:00
Guillaume Chatelet	1b2842bf90	[Alignment][NFC] CreateMemSet use MaybeAlign Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71213	2019-12-10 15:17:44 +01:00
stozer	f2ba93971c	Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty... basic blocks Originally applied in `72ce759928`. Fixed a build failure caused by incorrect use of cast instead of dyn_cast. This reverts commit `8b0780f795`.	2019-12-10 13:33:32 +00:00
Kiran Chandramohan	965ed1e974	[AArch64] Fix issues with large arrays on stack Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496	2019-12-10 11:44:41 +00:00
Johannes Doerfert	eb3e81f43f	[OpenMP][NFCI] Introduce llvm/IR/OpenMPConstants.h Summary: The new OpenMPConstants.h is a location for all OpenMP related constants (and helpers) to live. This patch moves the directives there (the enum OpenMPDirectiveKind) and rewires Clang to use the new location. Initially part of D69785. Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim Subscribers: jholewinski, ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69853	2019-12-10 00:10:09 -06:00
Fangrui Song	9574757dba	[MC] Delete MCCodePadder D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106	2019-12-09 19:21:31 -08:00
Eric Christopher	9c6b7f68b8	Revert "[ARM][MVE] Add intrinsics for immediate shifts." and two follow-on commits: one warning fix and one functionality. As it's breaking at least the lto bot: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15132/steps/test-stage1-compiler/logs/stdio This reverts commits: `8d70f3c933` `ff4dceef92` `d97b3e3e65`	2019-12-09 16:47:38 -08:00
Christian Sigg	d5acc83a3a	Implement LWG#1203 for raw_ostream. Implement LWG#1203 (https://cplusplus.github.io/LWG/issue1203) for raw_ostream like libc++ does for std::basic_ostream<...>. Add a operator<< overload that takes an rvalue reference of a typed derived from raw_ostream, streams the value to it and returns the stream of the same type as the argument. This allows free operator<< to work with rvalue reference raw_ostreams: raw_ostream& operator<<(raw_ostream&, const SomeType& Value); raw_os_ostream(std::cout) << SomeType(); It also allows using the derived type like: auto Foo = (raw_string_ostream(buffer) << "foo").str(); Author: Christian Sigg <csigg@google.com> Differential Revision: https://reviews.llvm.org/D70686	2019-12-09 14:00:53 -08:00
Hiroshi Yamauchi	d9ae493937	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149	2019-12-09 12:42:59 -08:00
Mark Murray	fc3417cb5a	[ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics. Summary: Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen, miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71198	2019-12-09 17:41:47 +00:00
Mark Murray	2eb61fa5d6	[ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT\|POLY) intrinsics. Summary: Add VMULL[BT]Q_(INT\|POLY) intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71066	2019-12-09 17:41:47 +00:00
Simon Tatham	d97b3e3e65	[ARM][MVE] Add intrinsics for immediate shifts. Summary: This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065	2019-12-09 15:44:09 +00:00
Jeremy Morse	00e238896c	[DebugInfo] Nerf placeDbgValues, with prejudice CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source programs order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453	2019-12-09 12:52:10 +00:00
Mikhail Maltsev	0d1490bf6a	[ARM][MVE] Add complex vector intrinsics Summary: This patch adds intrinsics for the following MVE instructions: * VCADD, VHCADD * VCMUL * VCMLA Each of the above 3 groups has a corresponding new LLVM IR intrinsic. Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen Reviewed By: MarkMurrayARM Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71190	2019-12-09 12:05:59 +00:00
David Green	4a6e13ad88	[CommandLine] Add missing Callbacks It appears that the cl::bits options are not used anywhere in-tree. In the recent addition to add Callback's to the options, the Callback was missing from this one. This fixes it by adding the same code from the other classes. It also adds a simple test, of sorts, just to make sure these continue compiling.	2019-12-09 11:37:34 +00:00
David Green	be7a107070	[ARM] Teach the Arm cost model that a Shift can be folded into other instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966	2019-12-09 10:24:33 +00:00
David Stenberg	6965f835b4	[DebugInfo] Make describeLoadedValue() reg aware Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431	2019-12-09 10:47:49 +01:00
David Stenberg	f3696533f2	Revert "[DebugInfo] Make describeLoadedValue() reg aware" This reverts commit `3cd93a4efc`. I'll recommit with a well-formatted arcanist commit message.	2019-12-09 10:45:13 +01:00
David Stenberg	3cd93a4efc	[DebugInfo] Make describeLoadedValue() reg aware Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.	2019-12-09 10:44:17 +01:00
Ulrich Weigand	9db13b5a7d	[FPEnv] Constrained FCmp intrinsics This adds support for constrained floating-point comparison intrinsics. Specifically, we add: declare <ty2> @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) declare <ty2> @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) The first variant implements an IEEE "quiet" comparison (i.e. we only get an invalid FP exception if either argument is a SNaN), while the second variant implements an IEEE "signaling" comparison (i.e. we get an invalid FP exception if either argument is any NaN). The condition code is implemented as a metadata string. The same set of predicates as for the fcmp instruction is supported (except for the "true" and "false" predicates). These new intrinsics are mapped by SelectionDAG codegen onto two new ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again representing quiet vs. signaling comparison operations. Otherwise those nodes look like SETCC nodes, with an additional chain argument and result as usual for strict FP nodes. The patch includes support for the common legalization operations for those nodes. The patch also includes full SystemZ back-end support for the new ISD nodes, mapping them to all available SystemZ instruction to fully implement strict semantics (scalar and vector). Differential Revision: https://reviews.llvm.org/D69281	2019-12-07 11:28:39 +01:00
Don Hinton	6555995a6d	[CommandLine] Add callbacks to Options Summary: Add a new cl::callback attribute to Option. This attribute specifies a callback function that is called when an option is seen, and can be used to set other options, as in option A implies option B. If the option is a `cl::list`, and `cl::CommaSeparated` is also specified, the callback will fire once for each value. This could be used to validate combinations or selectively set other options. Reviewers: beanz, thomasfinch, MaskRay, thopre, serge-sans-paille Reviewed By: beanz Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70620	2019-12-06 15:16:45 -08:00
Sam Clegg	b4f4e370b5	[WebAssebmly][MC] Support .import_name/.import_field asm directives Convert the MC test to use asm rather than bitcode. This is a precursor to https://reviews.llvm.org/D70520. Differential Revision: https://reviews.llvm.org/D70877	2019-12-06 15:09:56 -08:00
Reid Kleckner	1d9291cc78	[MC] Rewrite tablegen for printInstrAlias to comiple faster, NFC Before this change, the *InstPrinter.cpp files of each target where some of the slowest objects to compile in all of LLVM. See this snippet produced by ClangBuildAnalyzer: https://reviews.llvm.org/P8171$96 Search for "InstPrinter", and see that it shows up in a few places. Tablegen was emitting a large switch containing a sequence of operand checks, each of which created many conditions and many BBs. Register allocation and jump threading both did not scale well with such a large repetitive sequence of basic blocks. So, this change essentially turns those control flow structures into data. The previous structure looked like: switch (Opc) { case TGT::ADD: // check alias 1 if (MI->getOperandCount() == N && // check num opnds MI->getOperand(0).isReg() && // check opnd 0 ... MI->getOperand(1).isImm() && // check opnd 1 AsmString = "foo"; break; } // check alias 2 if (...) ... return false; The new structure looks like: OpToPatterns: Sorted table of opcodes mapping to pattern indices. \-> Patterns: List of patterns. Previous table points to subrange of patterns to match. \-> Conds: The if conditions above encoded as a kind and 32-bit value. See MCInstPrinter.cpp for the details of how the new data structures are interpreted. Here are some before and after metrics. Time to compile AArch64InstPrinter.cpp: 0m29.062s vs. 0m2.203s size of the obj: 3.9M vs. 676K size of clang.exe: 97M vs. 96M I have not benchmarked disassembly performance, but typically disassemblers are bottlenecked on IO and string processing, not alias matching, so I'm not sure it's interesting enough to be worth doing. Reviewers: RKSimon, andreadb, xbolva00, craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D70650	2019-12-06 15:00:18 -08:00
Hiroshi Yamauchi	2eb30fafa5	Revert "[PGO][PGSO] Instrument the code gen / target passes." This reverts commit `9a0b5e1407`. This seems to break buildbots.	2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi	9a0b5e1407	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072	2019-12-06 10:43:39 -08:00
Cullen Rhodes	bb8c679f4b	[AArch64][SVE] Implement integer compare intrinsics Summary: Adds intrinsics for the following: * cmphs, cmphi * cmpge, cmpgt * cmpeq, cmpne * cmplt, cmple * cmplo, cmpls Includes a minor change to `TLI.getMemValueType` that fixes a crash due to the scalable flag being dropped. Reviewers: sdesmalen, efriedma, rengolin, rovka, dancgr, huntergr Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70889	2019-12-06 10:39:06 +00:00
Alexey Lapshin	9e8c799e2b	[Dsymutil][NFC] Move NonRelocatableStringpool into common CodeGen folder. That refactoring moves NonRelocatableStringpool into common CodeGen folder. So that NonRelocatableStringpool could be used not only inside dsymutil. Differential Revision: https://reviews.llvm.org/D71068	2019-12-06 10:02:27 +03:00

... 2 3 4 5 6 ...

39164 Commits