llvm-project

Commit Graph

Author	SHA1	Message	Date
serge-sans-paille	6e040a19db	[NFC] Wisely nest dyn_cast in FunctionLoweringInfo Take advantage of the inheritance tree to avoid a few comparison.	2021-03-16 10:22:44 +01:00
Caroline Concatto	3c03635d53	[SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse This patch adds support for reverse loop vectorization. It is possible to vectorize the following loop: ``` for (int i = n-1; i >= 0; --i) a[i] = b[i] + 1.0; ``` with fixed or scalable vector. The loop-vectorizer will use 'reverse' on the loads/stores to make sure the lanes themselves are also handled in the right order. This patch adds support for scalable vector on IRBuilder interface to create a reverse vector. The IR function CreateVectorReverse lowers to experimental.vector.reverse for scalable vector and keeps the original behavior for fixed vector using shuffle reverse. Differential Revision: https://reviews.llvm.org/D95363	2021-03-16 07:51:59 +00:00
Lorenzo Chelini	fd7eee64c5	scf::ForOp: Fold away iterator arguments with no use and for which the corresponding input is yielded Enhance 'ForOpIterArgsFolder' to remove unused iteration arguments in a scf::ForOp. If the block argument corresponding to the given iterator has no use and the yielded value equals the input, we fold it away. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D98503	2021-03-16 07:01:25 +00:00
Jim Lin	678241795c	[RISCV] Don't emit #undef BUILTIN from RISCVVEmitter.cpp In BuiltinsRISCV.def, other extensions' intrinsics need to be defined using the BUILTIN macro, so the macro should not be undefined at the end of the declarations for V intrinsics. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D98682	2021-03-16 14:57:45 +08:00
Amara Emerson	9575c48b89	[AArch64][GlobalISel] Fix crash on lowering <1 x half> types.	2021-03-15 23:27:43 -07:00
Yvan Roux	c0f224e630	[AArch64][ASAN] Disable fgets_fputs.cpp test. This test has been failing for a long time on AArch64 bots; disable it for now to keep the bots green while investigating it.	2021-03-16 07:00:19 +01:00
Pushpinder Singh	fc12a64ecc	[OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn Remove emit-llvm-bc from addClangTargetOptions as it conflicts with -E for save-temps. AMDGCN does not yet support linking object files so backend and assemble actions are skipped, leaving LLVM IR as the output format. Reviewed By: JonChesterfield, ronlieb Differential Revision: https://reviews.llvm.org/D96769	2021-03-16 04:58:14 +00:00
wlei	dddd590fd0	[CSSPGO][llvm-profgen] Fix getCanonicalFnName usage in llvm-profgen Previously we didn't support keeping the unique linkage name (-funique-internal-linkage-name) in llvm-profgen. As discussed in https://reviews.llvm.org/D96932, we choose to do canonicalization for it. Now since "selected" is set as the default parameter of getCanonicalFnName in `D96932`, we don't need to add any attribute here for the previous usage and only fix the missing usage in the pseudo probe decoding. Differential Revision: https://reviews.llvm.org/D98226	2021-03-15 21:00:42 -07:00
Johannes Doerfert	0a954a528b	[OpenMP][FIX] Repair accidental replacement of _shfl_sync with _shfl This was broken accidentally in D95752. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D98677	2021-03-15 22:46:00 -05:00
Johannes Doerfert	f40a2c3bef	[NVPTX] CUDA does provide malloc/free since compute capability 2.X https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#dynamic-global-memory-allocation-and-operations Reviewed By: tra Differential Revision: https://reviews.llvm.org/D98606	2021-03-15 22:45:56 -05:00
Josh Berdine	5bb2757e21	[OCaml][test] Fix Bindings/OCaml/executionengine.ml test It seems that at some point it became necessary to pass `-thread` to the ocaml compiler for this test. Differential Revision: https://reviews.llvm.org/D98593	2021-03-16 02:48:36 +00:00
LLVM GN Syncbot	6547dcb4f3	[gn build] Port `4f198b0c27`	2021-03-16 02:41:16 +00:00
Bing1 Yu	4f198b0c27	[X86] Pass to transform amx intrinsics to scalar operation. This pass always runs, but we skip it when it is not O0 and the function doesn't have the optnone attribute. With -O0, the def of shape to amx intrinsics is near the amx intrinsics code. We are not able to find a point which post-dominates all the shapes and dominates all amx intrinsics. To decouple the dependency of the shape, we transform amx intrinsics to scalar operation, so that compiling doesn't fail. In the long term, we should improve fast register allocation to allocate amx registers. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D93594	2021-03-16 10:40:22 +08:00
David Blaikie	9341bcbdc9	Skip path separators to make the test portable across Win/Linux	2021-03-15 18:24:40 -07:00
Aart Bik	6ad7b97e20	[mlir][amx] Add Intel AMX dialect (architectural-specific vector dialect) The Intel Advanced Matrix Extensions (AMX) provides a tile matrix multiply unit (TMUL), a tile control register (TILECFG), and eight tile registers TMM0 through TMM7 (TILEDATA). This new MLIR dialect provides a bridge between MLIR concepts like vectors and memrefs and the lower level LLVM IR details of AMX. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D98470	2021-03-15 17:59:05 -07:00
Walter Erquinigo	b5657d1fbf	Fix `34885bffdf` It failed https://lab.llvm.org/buildbot/#/builders/17/builds/5262 and the fix is simply to relax a regex expression in a test.	2021-03-15 16:36:32 -07:00
Petr Hosek	9466f9b434	[CMake] Clean up unnecessary dependency The LINK_COMPONENTS dependency between DebugInfoCodeView and DebugInfoMSF is unnecessary. Breaking them would allow a more fine-controlled distribution. Patch By: dangyi Differential Revision: https://reviews.llvm.org/D98465	2021-03-15 16:29:16 -07:00
Jon Chesterfield	e23f3502d9	[libomptarget] Build amdgcn devicertl by default The cmake for this looks for an llvm install and does the right thing when building as part of enable_runtimes. It will probably do the right thing in other settings - at least, it won't try to build this with gcc. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D98658	2021-03-15 23:17:50 +00:00
LLVM GN Syncbot	2ef6ee1978	[gn build] Port `ecf6466f01`	2021-03-15 23:01:19 +00:00
Lang Hames	ecf6466f01	[JITLink][MachO][x86-64] Introduce generic x86-64 support. This patch introduces generic x86-64 edge kinds, and refactors the MachO/x86-64 backend to use these edge kinds. This simplifies the implementation of the MachO/x86-64 backend and makes it possible to write generic x86-64 passes and utilities. The new edge kinds are different from the original set used in the MachO/x86-64 backend. Several edge kinds that were not meaningfully distinguished in that backend (e.g. the PCRelMinusN edges) have been merged into single edge kinds in the new scheme (these edge kinds can be reintroduced later if we find a use for them). At the same time, new edge kinds have been introduced to convey extra information about the state of the graph. E.g. The RequestAndTransformTo* edges represent GOT/TLVP relocations prior to synthesis of the GOT/TLVP entries, and the 'Relaxable' suffix distinguishes edges that are candidates for optimization from edges which should be left as-is (e.g. to enable runtime redirection). ELF/x86-64 will be refactored to use these generic edges at some point in the future, and I anticipate a similar refactor to create a generic arm64 support header too. Differential Revision: https://reviews.llvm.org/D98305	2021-03-15 15:43:07 -07:00
Tim Keith	bcf95cbb2c	[flang] Create intrinsics modules directory (contd.) Use -module-dir rather than WORKING_DIRECTORY because we are potentially creating the working directory in this custom command.	2021-03-15 15:38:05 -07:00
Amy Huang	f5352dd9da	Emit inline implementation of __builtin__wmemchr on MSVCRT platforms. The MSVC runtime library doesn't have a definition for wmemchr, so provide an inline implementation. Differential Revision: https://reviews.llvm.org/D98472	2021-03-15 15:30:55 -07:00
Nico Weber	264ff539f3	[gn build] merge `af2796c76d` a bit more The default is fine on non-Win, but on Win this needs an explicit setting now that lit no longer has the right default.	2021-03-15 18:20:54 -04:00
Tim Keith	566a2c18bf	[flang] Create intrinsics modules directory A clean build fails using make because the intrinsics modules directory doesn't exist. For some reason it works fine with ninja.	2021-03-15 15:19:30 -07:00
Walter Erquinigo	34885bffdf	[lldb-vscode] Handle request_evaluate's context attribute Summary: The request "evaluate" supports a "context" attribute, which is sent by VSCode. The attribute is defined here https://microsoft.github.io/debug-adapter-protocol/specification#Requests_Evaluate The "clipboard" context is not yet supported by lldb-vscode, so we can forget about it for now. The 'repl' (i.e. Debug Console) and 'watch' (i.e. Watch Expression) contexts must use the expression parser in case the frame's variable path is not enough, as the user expects these expressions to never fail. On the other hand, the 'hover' expression is invoked whenever the user hovers on any keyword on the UI and the user is fine with the expression not being fully resolved, as they know that the 'repl' case is the fallback they can rely on. Given that the 'hover' expression is invoked many times without the user noticing it, because it is triggered by the mouse, I'm making it use only the frame's variable path functionality and not the expression parser. This should tremendously speed up the responsiveness of a debug session when the user only sets source breakpoints and inspects local variables, as the entire debug info does not need to be parsed. Regarding tests, I've tried to be as comprehensive as possible considering a multi-file project. Fortunately, the results from the "hover" case are enough most of the time. Differential Revision: https://reviews.llvm.org/D98656	2021-03-15 15:09:23 -07:00
Peyton, Jonathan L	7085f04573	[OpenMP] Remove unused cpu_stackoffset member	2021-03-15 16:52:04 -05:00
Alexander Yermolovich	51504bc1d9	[DWARF] Check for AddrOffsetSectionBase to work with DWO Units. Context: https://lists.llvm.org/pipermail/llvm-dev/2021-February/148521.html A fix for llvm-symbolizer, and other tools like BOLT, that allows retrieving address when built with -gsplit-dwarf=single mode. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D96827	2021-03-15 14:46:09 -07:00
diggerlin	d1f1bff81b	[AIX][XCOFF] Fixed the test cases that failed on AIX because -mignore-xcoff-visibility is enabled by default. Summary: Because -mignore-xcoff-visibility is enabled by default on AIX when clang is given no -fvisibility option, some test cases fail on AIX. To disable -mignore-xcoff-visibility, we need to add -fvisibility=default to those test cases. Reviewers: hubert.reinterpretcast daltenty Differential Revision: https://reviews.llvm.org/D98660	2021-03-15 17:33:02 -04:00
Artem Belevich	50c7504a93	[NVPTX] Avoid temp copy of byval kernel parameters. Avoid making a temporary copy of byval argument if all accesses are loads and therefore the pointer to the parameter can not escape. This avoids excessive global memory accesses when each kernel makes its own copy. Differential revision: https://reviews.llvm.org/D98469	2021-03-15 14:27:22 -07:00
Nick Lewycky	483a253ae9	NFC: Formatting changes. Run clang-format over these files. Capitalize some variable names per clang-tidy's request. Pulled out to simplify review of D98302.	2021-03-15 14:26:39 -07:00
peter klausler	6811b96100	[flang] Runtime: implement INDEX intrinsic function Implement INDEX in the runtime, reusing some infrastructure (with generalization and renaming as needed) put into place for its cousins SCAN and VERIFY. I did not implement full Boyer-Moore substring searching for the forward case, but did accelerate some advancement on mismatches. I (re)implemented unit testing for INDEX in the new gtest framework, combining it with the tests that have recently been ported to gtest for SCAN and VERIFY. Differential Revision: https://reviews.llvm.org/D98553	2021-03-15 14:19:13 -07:00
Stanislav Mekhanoshin	bc27a31801	[AMDGPU] Fix copyPhysReg to not produce unaligned vgpr access RA can insert something like a sub1_sub2 COPY of a wide VGPR tuple, which results in an unaligned access with v_pk_mov_b32 after the copy is expanded. This is a regression after D97316. Differential Revision: https://reviews.llvm.org/D98549	2021-03-15 14:14:30 -07:00
Florian Hahn	bb244ea2a8	[AnnotationRemarks] Remove unneeded Function.h include (NFC).	2021-03-15 21:09:35 +00:00
Nico Weber	01d648a69b	[gn build] merge `9bcf0eff99`	2021-03-15 17:05:05 -04:00
Jonas Paulsson	9cfd301ec8	[SystemZ] Test for isinf and isfinite in testFPKind(). Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes for finite) in testFPKind() and handle with TDC. Review: Ulrich Weigand. Differential Revision: https://reviews.llvm.org/D97901	2021-03-15 15:02:39 -06:00
Nico Weber	efbaf4030b	[gn build] kind of merge `af2796c76d` Good enough for now. If we need more, we'll do the usual platform-dependent hardcoding that in practice works for everything else too.	2021-03-15 17:01:00 -04:00
Stanislav Mekhanoshin	c297709ee1	[AMDGPU] Fixed msan failure with uninitialized value	2021-03-15 13:58:19 -07:00
Jon Chesterfield	bb38d7ff05	[libomptarget][nfc][amdgcn] Use precise triple for devicertl build	2021-03-15 20:24:13 +00:00
Stefan Pintilie	86f2a3d178	[PowerPC] Add __PCREL__ when PC Relative is enabled. This patch adds the `__PCREL__` define when PC Relative addressing is enabled. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D98546	2021-03-15 15:13:02 -05:00
Jon Chesterfield	d0bc85f04a	[libomptarget][nfc] Drop unused DEVICE macro Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D98655	2021-03-15 20:12:50 +00:00
Jon Chesterfield	7da76aaaf4	[libomptarget] Build amdgpu plugin by default This will build the amdgpu plugin if cmake is able to find the hsa runtime library, which will be the case if rocm is installed or if the hsa library has been installed somewhere cmake looks. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D98654	2021-03-15 20:12:01 +00:00
Kirill Bobyrev	9bcf0eff99	[clangd] Optionally add reflection for clangd-index-server This was originally landed without the optional part and reverted later: `8080ea4c4b` Reviewed By: kadircet Differential Revision: https://reviews.llvm.org/D98404	2021-03-15 21:07:25 +01:00
Markus Böck	68e4084bf6	Revert line accidentally included in `af2796c76d`	2021-03-15 21:03:46 +01:00
Sanjay Patel	b1b07dd071	[SLP] update stale test comments; NFC These bugs were fixed with `0a8e7ca402`	2021-03-15 16:02:46 -04:00
Stanislav Mekhanoshin	3bffb1cd0e	[AMDGPU] Use single cache policy operand Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and, I hope, the amount of code. These operands are mostly 0 anyway. An additional advantage is that the parser will accept these flags in any order, unlike now. Differential Revision: https://reviews.llvm.org/D96469	2021-03-15 13:00:59 -07:00
Markus Böck	af2796c76d	[test] Add ability to get error messages from CMake for errc substitution Visual Studio's implementation of the C++ Standard Library does not use strerror to produce a message for std::error_code, unlike other standard libraries such as libstdc++ or libc++ that might be used. This patch adds a cmake script that, by running a C++ program, gets the error messages for the POSIX error codes and passes them on to lit through an optional config parameter. If the config parameter is not set, or getting the messages failed, due to, say, a cross-compiling configuration without an emulator, it will fall back to using Python's strerror functions. Differential Revision: https://reviews.llvm.org/D98278	2021-03-15 20:56:08 +01:00
Jon Chesterfield	bcb3f0f867	[libomptarget] Fix devicertl build The target specific functions in target_interface are extern C, but the implementations for nvptx were mostly C++ mangling. That worked out as a quirk of DEVICE macro expanding to nothing, except for shuffle.h which only forward declared the functions with C++ linkage. Also implements GetWarpSize, as used by shuffle, and includes target_interface in nvptx target_impl.cu to help catch future divergence between interface and implementation. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D98651	2021-03-15 19:50:22 +00:00
Michael Kruse	9c486eb348	[Polly] Fix deprecation warning. NFC. IRBuilder::CreateLoad without type parameter was deprecated in `6312c538` to prepare for opaque pointers.	2021-03-15 14:31:16 -05:00
Wenlei He	a5d30421a6	[CSSPGO] Load context profile for external functions in PreLink and populate ThinLTO import list For ThinLTO's prelink compilation, we need to put external inline candidates into an import list attached to function's entry count metadata. This enables ThinLink to treat such cross module callee as hot in summary index, and later helps postlink to import them for profile guided cross module inlining. For AutoFDO, the import list is retrieved by traversing the nested inlinee functions. For CSSPGO, since profile is flattened, a few things need to happen for it to work: - When loading input profile in extended binary format, we need to load all child context profile whose parent is in current module, so context trie for current module includes potential cross module inlinee. - In order to make the above happen, we need to know whether input profile is CSSPGO profile before start reading function profile, hence a flag for profile summary section is added. - When searching for cross module inline candidate, we need to walk through the context trie instead of nested inlinee profile (callsite sample of AutoFDO profile). - Now that we have more accurate counts with CSSPGO, we switched to use entry count instead of total count to decide if an external callee is potentially beneficial to inline. This makes it consistent with how we determine whether call target is potential inline candidate. Differential Revision: https://reviews.llvm.org/D98590	2021-03-15 12:22:15 -07:00
Jianzhou Zhao	9cf5220c5c	[dfsan] Updated check_custom_wrappers.sh to dedup function names The origin wrappers added by https://reviews.llvm.org/D98359 reuse those __dfsw_ functions.	2021-03-15 19:12:08 +00:00


382953 Commits

All Branches