llvm-project

Commit Graph

Author	SHA1	Message	Date
Malhar Jajoo	b856f4a232	[ARM] Transforming memcpy to Tail predicated Loop This patch converts the llvm.memcpy intrinsic into Tail Predicated Hardware loops for a target that supports the Arm M-profile Vector Extension (MVE). From an implementation point of view, the patch - adds an ARM specific SDAG Node (to which the llvm.memcpy intrinsic is lowered during the first phase of ISel) - adds a corresponding TableGen entry to generate a pseudo instruction, with a custom inserter, on matching the above node. - adds a custom inserter function that expands the pseudo instruction into MIR suitable to be converted (by later passes) into a WLSTP loop. Note: A cli option is used to control the conversion of memcpy to TP loop and this option is currently disabled by default. It may be enabled in the future after further downstream testing. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D99723	2021-05-06 09:34:09 +01:00
James Henderson	abe2c906ad	[lit] Report tool path from use_llvm_tool if found via env variable Previously, if the search_env argument was specified, and the tool was found at that location, the path was not reported, unlike other situations when this function was called. Adding the reporting makes the function consistent. Reviewed by: thopre Differential Revision: https://reviews.llvm.org/D101896	2021-05-06 09:21:54 +01:00
Tim Renouf	ab5932ffbd	[llvm-objdump] Use std::make_unique Fix up my recent commit rG1128311a19179ceca799ff0fbc4dd206ab56e560 to use std::make_unique instead of std::unique_ptr(new), as requested by David Blaikie. Differential Revision: https://reviews.llvm.org/D101822	2021-05-06 09:10:30 +01:00
Guillaume Chatelet	089ec047be	[llvm][NFC] Remove CallingConvLower deprecated alignment functions Differential Revision: https://reviews.llvm.org/D101910	2021-05-06 07:46:19 +00:00
Guillaume Chatelet	1fa21bf9e9	[llvm][NFC] Remove SelectionDag alignment deprecated functions Differential Revision: https://reviews.llvm.org/D101909	2021-05-06 07:44:14 +00:00
Guillaume Chatelet	040f4a97cd	[llvm][NFC] Remove deprecated InterleaveGroup::getAlignment() function. Differential Revision: https://reviews.llvm.org/D101907	2021-05-06 07:40:18 +00:00
Guillaume Chatelet	a065efa302	[llvm][NFC] Remove deprecated DataLayout::getPreferredAlignment functions Differential Revision: https://reviews.llvm.org/D101906	2021-05-06 07:28:00 +00:00
Guillaume Chatelet	b4795544d4	[llvm][NFC] Remove deprecated Alignment::None() Differential Revision: https://reviews.llvm.org/D101905	2021-05-06 07:21:23 +00:00
Johannes Doerfert	df729e2b82	[OpenMP] Overhaul `declare target` handling This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well. This started with PR49649 in mind, trying to provide a way for users to avoid the "ref" global use introduced for globals with internal linkage. From there it went down the rabbit hole, e.g., all variables, even `nohost` ones, were emitted into the device code so it was impossible to determine if "ref" was needed late in the game (based on the name only). To make it really useful, `begin declare target` was needed as it can carry the `device_type`. Not emitting variables eagerly had a ripple effect. Finally, the precedence of the (explicit) declare target list items needed to be taken into account, which meant we cannot just look for any declare target attribute to make a decision. This caused the handling of functions to require fixup as well. I tried to clean up things while I was at it, e.g., we should not "parse declarations and definitions" as part of OpenMP parsing, this will always break at some point. Instead, we keep track of what region we are in and act on definitions and declarations instead, this is what we do for declare variant and other begin/end directives already. Highlights: - new diagnosis for restrictions specified in the standard, - delayed emission of globals not mentioned in an explicit list of a declare target, - omission of `nohost` globals on the host and `host` globals on the device, - no explicit parsing of declarations in-between `omp [begin] declare variant` and the corresponding end anymore, regular parsing instead, - precedence for explicit mentions in `declare target` lists over implicit mentions in the declaration-definition-seq, and - `omp allocate` declarations will now replace an earlier emitted global, if necessary. 
--- Notes: The patch is larger than I hoped but it turns out that most changes do on their own lead to "inconsistent states", which seem less desirable overall. After working through this I feel the standard should remove the explicit declare target forms as the delayed emission is horrible. That said, while we delay things anyway, it seems to me we check too often for the current status even though that is often not sufficient to act upon. There seems to be a lot of duplication that can probably be trimmed down. Eagerly emitting some things seems pretty weak as an argument to keep so much logic around. --- Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D101030	2021-05-06 02:10:41 -05:00
Johannes Doerfert	3f14596700	[OpenMP] Ensure the DefaultMapperId has a location A user reported an assertion (below) but without a reproducer. I failed to create a test myself but from the assertion one can derive the problem. I set the DefaultMapperId location now to make sure this doesn't cause trouble. ``` clang-13: .../DeclTemplate.h:1940: void clang::ClassTemplateSpecializationDecl::setPointOfInstantiation(clang::SourceLocation): Assertion `Loc.isValid() && "point of instantiation must be valid!"' failed. ``` Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D100621	2021-05-06 02:10:36 -05:00
Johannes Doerfert	5d8d994dfb	[OpenMP] Make sure classes work on the device as they do on the host We do provide `operator delete(void*)` in `<new>` but it should be available by default. This is mostly boilerplate to test it and the unconditional include of `<new>` in the header we always include on the device. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D100620	2021-05-06 02:10:30 -05:00
Navdeep Kumar	875eb523c1	[MLIR][GPU][NVVM] Add warp synchronous matrix-multiply accumulate ops Add warp synchronous matrix-multiply accumulate ops in GPU and NVVM dialect. Add following three ops to GPU dialect :- 1.) subgroup_mma_load_matrix 2.) subgroup_mma_store_matrix 3.) subgroup_mma_compute Add following three ops to NVVM dialect :- 1.) wmma.m16n16k16.load.[a,b,c].[f16,f32].row.stride 2.) wmma.m16n16k16.store.d.[f16,f32].row.stride 3.) wmma.m16n16k16.mma.row.row.[f16,f32].[f16,f32] Reviewed By: bondhugula, ftynse, ThomasRaoux Differential Revision: https://reviews.llvm.org/D95330	2021-05-06 12:06:25 +05:30
Queen Dela Cruz	16c7829784	[clangd] Check if macro is already in the IdentifierTable before loading it Having nested macros in the C code could cause clangd to fail an assert in clang::Preprocessor::setLoadedMacroDirective() and crash. #1 0x00000000007ace30 PrintStackTraceSignalHandler(void) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1 #2 0x00000000007aaded llvm::sys::RunSignalHandlers() /qdelacru/llvm-project/llvm/lib/Support/Signals.cpp:76:20 #3 0x00000000007ac7c1 SignalHandler(int) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1 #4 0x00007f096604db20 __restore_rt (/lib64/libpthread.so.0+0x12b20) #5 0x00007f0964b307ff raise (/lib64/libc.so.6+0x377ff) #6 0x00007f0964b1ac35 abort (/lib64/libc.so.6+0x21c35) #7 0x00007f0964b1ab09 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21b09) #8 0x00007f0964b28de6 (/lib64/libc.so.6+0x2fde6) #9 0x0000000001004d1a clang::Preprocessor::setLoadedMacroDirective(clang::IdentifierInfo, clang::MacroDirective, clang::MacroDirective) /qdelacru/llvm-project/clang/lib/Lex/PPMacroExpansion.cpp:116:5 An example of the code that causes the assert failure: ``` ... ``` During code completion in clangd, the macros will be loaded in loadMainFilePreambleMacros() by iterating over the macro names and calling PreambleIdentifiers->get(). Since these macro names are stored in a StringSet (which has a StringMap as its underlying container), the order of the iterator is not guaranteed to be the same as the order seen in the source code. When clangd is trying to resolve nested macros it sometimes attempts to load them out of order, which causes a macro to be stored twice. In the example above, the ECHO2 macro gets resolved first, but since it uses another macro that has not been resolved it will try to resolve/store that as well. Now there are two MacroDirectives stored in the Preprocessor, ECHO and ECHO2. 
When clangd tries to load the next macro, ECHO, the preprocessor fails an assert in clang::Preprocessor::setLoadedMacroDirective() because there is already a MacroDirective stored for that macro name. In this diff, I check if the macro is already inside the IdentifierTable and, if it is, skip it so that it is not resolved twice. Reviewed By: kadircet Differential Revision: https://reviews.llvm.org/D101870	2021-05-06 08:24:06 +02:00
Giorgis Georgakoudis	207b08a913	[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101849	2021-05-05 20:08:38 -07:00
Jessica Clarke	6c80361b84	[SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics Unlike normal loads these don't have an extension field, but we know from TargetLowering whether these are sign-extending or zero-extending, and so can optimise away unnecessary extensions. This was noticed on RISC-V, where sign extensions in the calling convention would result in unnecessary explicit extension instructions, but this also fixes some Mips inefficiencies. PowerPC sees churn in the tests as all the zero extensions are only for promoting 32-bit to 64-bit, but these zero extensions are still not optimised away as they should be, likely due to i32 being a legal type. This also simplifies the WebAssembly code somewhat, which currently works around the lack of target-independent combines with some ugly patterns that break once they're optimised away. Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits, where zero-extending atomics were incorrectly returning 0 rather than the (slightly confusing) required return value of 1. Reviewed By: RKSimon, atanasyan Differential Revision: https://reviews.llvm.org/D101342	2021-05-06 04:01:20 +01:00
Jinsong Ji	6bdfcb165e	[BPF][Test] Disable codegen test on AIX https://reviews.llvm.org/D101194 changed the default getMultiarchTriple in toolchain. So -march=bpf on AIX will get triple of bpf-ibm-aix now, this is unexpected and causing test failures. BPF on AIX is not supported (yet), disable the codegen test on AIX in lit cfg. Reviewed By: yonghong-song Differential Revision: https://reviews.llvm.org/D101866	2021-05-06 02:38:46 +00:00
Lang Hames	abdd14a2d7	[ORC] Add missing library dependency on IRReader.	2021-05-05 19:38:10 -07:00
Giorgis Georgakoudis	f97b843d88	[OpenMP] Fix non-determinism in clang copyin codegen Codegen for OpenMP copyin has non-deterministic IR output due to the unspecified evaluation order in a codegen conditional branch, which makes automatic test generation unreliable. This patch refactors codegen code to avoid this non-determinism. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101952	2021-05-05 19:24:03 -07:00
Lang Hames	7b73cd684a	[ORC] Introduce C API for adding object buffers directly to an object layer. This can be useful for clients constructing custom JIT stacks: If the C API for your custom stack exposes API to obtain a reference to an object layer (e.g. LLVMOrcLLJITGetObjLinkingLayer) then the newly added LLVMOrcObjectLayerAddObjectFile and LLVMOrcObjectLayerAddObjectFileWithRT functions can be used to add objects directly to that layer.	2021-05-05 19:02:13 -07:00
Christopher Ferris	6fac34251d	[scudo] Add initialization for TSDRegistrySharedT Fixes compilation on Android which has a TSDSharedRegistry object in the config. Reviewed By: cryptoad, vitalybuka Differential Revision: https://reviews.llvm.org/D101951	2021-05-05 19:00:54 -07:00
Stanislav Mekhanoshin	ab90ae6f47	[AMDGPU] Switch AnnotateUniformValues to MemorySSA This should speed up compilation and also remove threshold limitations used by memory dependency analysis. It also seems to fix the bug in coalescer_remat.ll where an SMRD load was used in the presence of a potentially clobbering store. Fixes: SWDEV-272132 Differential Revision: https://reviews.llvm.org/D101962	2021-05-05 18:34:41 -07:00
Austin Kerbow	6617a5a5ea	[AMDGPU] Move insertion of function entry waitcnt later This allows tracking these as preexisting waitcnt. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D101380	2021-05-05 17:58:38 -07:00
Min-Yih Hsu	f6d7fc801b	[M68k][test][NFC] Scrubbing some tests Remove unnecessary labels and assembly directives. NFC.	2021-05-05 17:48:28 -07:00
Fangrui Song	1b11b5b01f	[AArch64] Replace fixup_aarch64_tlsdesc_call with FirstLiteralRelocationKind + R_AARCH64_{,P32_}TLSDESC_CALL	2021-05-05 17:41:56 -07:00
Fangrui Song	5f39522320	[test] Delete redundant arm64-tls-relocs.s It just replicates tls-relocs.s	2021-05-05 17:41:04 -07:00
Juneyoung Lee	8a156d1c27	[InstCombine] Fully disable select to and/or i1 folding This is a patch that disables the poison-unsafe select -> and/or i1 folding. It has been blocking D72396 and also has been the source of a few miscompilations described in llvm.org/pr49688 . D99674 conditionally blocked this folding and successfully fixed the latter one. The former one was still blocked, and this patch addresses it. Note that a few test functions that have a `_logical` suffix are now deoptimized. These were created by @nikic to check the impact of disabling this optimization by copying existing original functions and replacing and/or with select. I can see that most of these are poison-unsafe; they can be revived by introducing a freeze instruction. I left comments at the fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll) because I think they are good targets for a freeze fix. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101191	2021-05-06 09:29:52 +09:00
Austin Kerbow	f5199d7ae0	[AMDGPU] Revise handling of preexisting waitcnt Preexisting waitcnt may not update the scoreboard if the instruction being examined needed to wait on fewer counters than what was encoded in the old waitcnt instruction. Fixing this results in the elimination of some redundant waitcnt. These changes also enable combining consecutive waitcnt into a single S_WAITCNT or S_WAITCNT_VSCNT instruction. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D100281	2021-05-05 17:21:33 -07:00
Malhar Jajoo	9ba5238c28	[ARM] Simplification to ARMBlockPlacement Pass. It simplifies the logic by moving the predecessor (preHeader or its predecessor) above the target (or loopExit), instead of moving the target to after the predecessor. Since the loopExit is no longer being moved, directions of any branches within/to it are unaffected. While the predecessor is being moved, the backwards movement simplifies some considerations, and the only consideration now required is that a forward WLS to the predecessor should not become backwards. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D100094	2021-05-06 01:20:18 +01:00
Jianzhou Zhao	f3e3a1d79e	[dfsan] extend a test case to measure origin memory usage This is to support D101204. Reviewed By: gbalats Differential Revision: https://reviews.llvm.org/D101877	2021-05-06 00:19:44 +00:00
Min-Yih Hsu	5b3dd2a490	[M68k][AsmParser] Fix invalid register name parsing logics Adjust sanity check in register parsing function to allow register name with more than 2 characters (e.g. ccr). Differential Revision: https://reviews.llvm.org/D101733	2021-05-05 17:13:02 -07:00
Min-Yih Hsu	abac6023bb	[M68k][AsmParser] Support negative integer constants Parsing negative integer constants as expressions. Differential Revision: https://reviews.llvm.org/D101732	2021-05-05 17:11:59 -07:00
Min-Yih Hsu	34da083a8c	[M68k][test] Initial migration of MC tests As described in bug 49865 [1], we are migrating tests under `test/CodeGen/M68k/Encoding`, which were originally used to test instruction encoding using MIR files as input, into `test/MC/M68k`. We are also adding test directives for AsmParser using the same set of inputs. Currently we are converting the original MIR test files into assembly code as well as translating the original LIT "RUN" statements into ones that only use built-in LLVM tools (i.e. getting rid of `extract-section`). However, since the AsmParser is not yet complete, many of these original test cases fail. Thus, this patch only migrates test files that pass with the current implementation of AsmParser (and MCCodeEmitter). The remaining tests (under test/CodeGen/M68k/Encoding) will be ported along with the patches that fix the related issues. [1]: https://bugs.llvm.org/show_bug.cgi?id=49865 Differential Revision: https://reviews.llvm.org/D101410	2021-05-05 17:11:35 -07:00
Heejin Ahn	7f06cae1c1	[WebAssembly] Fix JS code mentions in LowerEmscriptenEHSjLj - Removes the mention of fastcomp, which is deprecated. - Some functions in Emscripten have moved from JS glue code to compiler-rt/emscripten_setjmp.c and compiler-rt/emscripten_exception_builtins.c. This fixes comments about that. Reviewed By: sbc100 Differential Revision: https://reviews.llvm.org/D101812	2021-05-05 17:04:14 -07:00
peter klausler	535cbe02a4	[flang] Provide access to constant character array data Allow direct access to constant character array data (for creating a hash ID of a constant). Differential Revision: https://reviews.llvm.org/D101208	2021-05-05 16:56:00 -07:00
Emilio Cota	3c952ab25f	[mlir] Check generated IR of math_polynomial_approx.mlir Instead of just checking that we emit something. Differential Revision: https://reviews.llvm.org/D101940	2021-05-05 16:42:48 -07:00
Nicolai Hähnle	6adcdd2613	[tests] Update Transforms/FunctionAttrs/nosync.ll Commit generated by running update_test_checks.py, to reflect the fact that we now add the `mustprogress` attribute.	2021-05-06 01:39:18 +02:00
Fangrui Song	d738ac6e12	[AArch64] Deleted unused AsmBackend functions	2021-05-05 16:28:39 -07:00
RamNalamothu	41f8b8e807	[MCAsmInfo] Support UsesCFIForDebug for targets with no exception handling This change enables emitting CFI unwind information for debugging purpose for targets with MCAsmInfo::ExceptionsType == ExceptionHandling::None. Currently generating CFI unwind information is entangled with supporting the exceptions, even when AsmPrinter explicitly recognizes that the unwind tables are being generated as debug information. In fact, the unwind information is not generated even if we specify --force-dwarf-frame-section, unless exceptions are enabled. The LIT test llvm/test/CodeGen/AMDGPU/debug_frame.ll demonstrates this behavior. Enable this option for AMDGPU to prepare for future patches which add complete CFI support. Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D78778	2021-05-06 04:53:45 +05:30
MaheshRavishankar	4b2d7ef3ea	[mlir][Linalg] Fix test to use new reshape op form. Differential Revision: https://reviews.llvm.org/D101956	2021-05-05 16:06:58 -07:00
Coplin, Jared	6251b2f7f6	Attach metadata to simplified masked loads and stores	2021-05-05 18:01:49 -05:00
Alex Reinking	7ac3fcc526	Allow /STACK in #pragma comment(linker, ...) The Halide project uses `#pragma comment(linker, "/STACK:...")` to set the stack size high enough for our embedded compiler to run in end-user programs on Windows. Unfortunately, lld-link.exe breaks on this when embedded in a COFF object, despite supporting the flag on the command line. MSVC's link.exe supports this fine. This patch extends support for this to lld-link.exe for better compatibility with MSVC projects. Differential Revision: https://reviews.llvm.org/D99680	2021-05-05 16:00:33 -07:00
Matt Arsenault	b6d244e5b8	AMDGPU: Fix lit test	2021-05-05 18:41:18 -04:00
MaheshRavishankar	b6060b7673	[mlir][Linalg] Fix element type of results when folding reshapes. Fixing a minor bug which led to the element type of the output being modified when folding reshapes with a generic op. Differential Revision: https://reviews.llvm.org/D101942	2021-05-05 15:40:41 -07:00
Fangrui Song	7b0756a51a	[AArch64] Fix some coding standard issues related to namespace llvm https://llvm.org/docs/CodingStandards.html#use-namespace-qualifiers-to-implement-previously-declared-functions	2021-05-05 15:27:16 -07:00
Petr Hosek	9d3dbcd24c	[Driver] Move -print-runtime-dir and -print-resource-dir tests Put these into separate files to match other -print-* options tests. Differential Revision: https://reviews.llvm.org/D101813	2021-05-05 15:23:49 -07:00
Vang Thao	7a41639c60	[AMDGPU][GlobalISel] Widen 1 and 2 byte scalar loads Widen 1 and 2 byte scalar loads to 4 bytes when sufficiently aligned to avoid using a global load. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D100430	2021-05-05 15:18:19 -07:00
Nico Weber	ea3777fe22	[gn build] (semi-manually) port `0b10bb7ddd` more	2021-05-05 18:15:13 -04:00
Dave Lee	c5cf4b8f11	[lldb] Handle missing SBStructuredData copy assignment cases Fix cases that can crash `SBStructuredData::operator=`. This happened in a case where `rhs` had a null `SBStructuredDataImpl`. Differential Revision: https://reviews.llvm.org/D101585	2021-05-05 15:12:03 -07:00
Vy Nguyen	23233ad139	[lld-macho] Check simulator platforms to avoid issuing false positive errors. Currently the linker causes unnecessary errors when either the target or the config's platform is a simulator. Differential Revision: https://reviews.llvm.org/D101855	2021-05-05 18:07:58 -04:00
Nico Weber	ceccfaae14	[gn build] (semi-manually) port `0b10bb7ddd`	2021-05-05 18:06:52 -04:00
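
The double-load problem described in the clangd commit above (16c7829784) can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `MacroStore`, `makeExampleStore`, and `loadAll` are hypothetical names invented here and are not part of clang or clangd. The macro names ECHO and ECHO2 and the fact that ECHO2 uses ECHO come from the commit message; the actual macro bodies are elided there and are not reconstructed. The visiting order is passed in explicitly to stand in for the unspecified iteration order of the string set.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the preprocessor's macro storage (not clang's API).
class MacroStore {
public:
  using DepMap = std::map<std::string, std::vector<std::string>>;

  explicit MacroStore(DepMap Defs) : Defs(std::move(Defs)) {}

  bool isLoaded(const std::string &Name) const {
    return Loaded.count(Name) != 0;
  }

  // Loads Name and any not-yet-loaded macros its body uses.
  // Returns false if Name was already loaded; this models the condition
  // that trips the assertion described in the commit message.
  bool load(const std::string &Name) {
    if (!Loaded.insert(Name).second)
      return false;
    auto It = Defs.find(Name);
    if (It != Defs.end())
      for (const std::string &Dep : It->second)
        if (!isLoaded(Dep))
          load(Dep);
    return true;
  }

private:
  DepMap Defs;                   // macro name -> macros used in its body
  std::set<std::string> Loaded;  // macros that have been stored so far
};

// Builds the two-macro situation from the commit message: ECHO2 uses ECHO.
MacroStore makeExampleStore() {
  MacroStore::DepMap Defs;
  Defs.emplace("ECHO", std::vector<std::string>{});
  Defs["ECHO2"].push_back("ECHO");
  return MacroStore(std::move(Defs));
}

// Visits the macro names in the given order and returns how many load()
// calls hit a macro that was already stored. With SkipLoaded set, names
// that are already present are skipped first, mirroring the described fix.
int loadAll(MacroStore &Store, const std::vector<std::string> &Order,
            bool SkipLoaded) {
  int Duplicates = 0;
  for (const std::string &Name : Order) {
    if (SkipLoaded && Store.isLoaded(Name))
      continue;
    if (!Store.load(Name))
      ++Duplicates;
  }
  return Duplicates;
}
```

Visiting ECHO2 before ECHO without the check produces one duplicate load, because loading ECHO2 already pulled in ECHO. With the check enabled, the second visit is skipped and both macros end up stored exactly once.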


387768 Commits
