llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	f9dbca68d4	[CMake] Enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR by default on Linux This makes the default build closer to a -DLLVM_ENABLE_RUNTIMES=all build. The layout is arguably superior because different libraries of target triples are in different directories, similar to GCC/Debian multiarch. When LLVM_DEFAULT_TARGET_TRIPLE is x86_64-unknown-linux-gnu, `lib/clang/14.0.0/lib/libclang_rt.asan-x86_64.a` is moved to `lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu/libclang_rt.asan.a`. In addition, if the host compiler supports -m32 (multilib), `lib/clang/14.0.0/lib/libclang_rt.asan-i386.a` is moved to `lib/clang/14.0.0/lib/i386-unknown-linux-gnu/libclang_rt.asan.a`. Clang has been detecting both paths for lib/Driver/ToolChains/Gnu.cpp since 2018 (D50547). --- Note: Darwin needs to be disabled. The hierarchy needs to be sorted out. The current -DLLVM_DEFAULT_TARGET_TRIPLE=off state is like: ``` lib/clang/14.0.0/lib/darwin/libclang_rt.profile_ios.a lib/clang/14.0.0/lib/darwin/libclang_rt.profile_iossim.a lib/clang/14.0.0/lib/darwin/libclang_rt.profile_osx.a ``` Windows needs to be disabled: https://reviews.llvm.org/D107799?id=368557#2963311 Differential Revision: https://reviews.llvm.org/D107799	2021-09-15 09:32:59 -07:00
Michał Górny	210d72e9d6	[compiler-rt] Move -fno-omit-frame-pointer check to common config-ix `9ee64c3746` has started using COMPILER_RT_HAS_OMIT_FRAME_POINTER_FLAG inside scudo. However, the relevant CMake check was performed in builtin-config-ix.cmake, so the definition was missing when builtins were not built. Move the check to config-ix.cmake, so that it runs unconditionally of the components being built. Fixes PR#51847 Differential Revision: https://reviews.llvm.org/D109812	2021-09-15 18:32:33 +02:00
Anna Thomas	36ef65adc3	[InstCombine] Update test checks through autogeneration, add more tests. NFC Updated check lines. Tests precommitted from D109700.	2021-09-15 16:20:30 +00:00
Fangrui Song	9111635cb7	[test] Fix asan/scudo -shared-libsan tests with -DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=on On x86_64-unknown-linux-gnu, `-m32` tests set LD_LIBRARY_PATH to `config.compiler_rt_libdir` (`$build/lib/clang/14.0.0/lib/x86_64-unknown-linux-gnu`) instead of i386-unknown-linux-gnu, so `-shared-libsan` executables cannot find their runtime (e.g. `TestCases/replaceable_new_delete.cpp`). Detect -m32 and -m64 in config.target_cflags, and adjust `config.compiler_rt_libdir`. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D108859	2021-09-15 09:07:47 -07:00
Matt Morehouse	0a07789fe9	[HWASan] Add missing newlines.	2021-09-15 09:06:01 -07:00
Max Kazantsev	c78ed20784	[Test] Add a test showing missing opportunities in branch deletion by indvars	2021-09-15 22:17:10 +07:00
Nicolas Vasilache	6fe77b1051	[mlir][Linalg] Fail comprehensive bufferization if a memref is returned. Differential revision: https://reviews.llvm.org/D109824	2021-09-15 15:11:17 +00:00
Alexey Bataev	446e11fa29	[SLP][NFC]Add a test for tiny tree with stores and with not same/alternate instructions.	2021-09-15 08:07:01 -07:00
Matt Morehouse	1a3b3301d7	[HWASan] Catch cases where libc populated jmp_buf. Some setjmp calls within libc cannot be intercepted while their matching longjmp calls can be. This causes problems if our setjmp/longjmp interceptors don't use the exact same format as libc for populating and reading the jmp_buf. We add a magic field to our jmp_buf and populate it in setjmp. This allows our longjmp interceptor to notice when a libc jmp_buf is passed to it. See discussion on https://reviews.llvm.org/D109699 and https://reviews.llvm.org/D69045. Fixes https://github.com/google/sanitizers/issues/1244. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D109787	2021-09-15 07:53:54 -07:00
David Tenty	1f3925e25a	[clang][driver][AIX] Add system libc++ header paths to driver This change adds the system libc++ header location to the driver. As well we define the `__LIBC_NO_CPP_MATH_OVERLOADS__` macro when using those headers, in order to suppress conflicting C++ overloads in the system libc headers that were used by XL C++. Reviewed By: ZarkoCA Differential Revision: https://reviews.llvm.org/D109078	2021-09-15 10:41:18 -04:00
Jessica Clarke	b8d83e83be	[RISCV][compiler-rt] Fix an incorrect comment for RV64 __riscv_restore_12 This was presumably copied from the RV32 implementation and not updated like the rest.	2021-09-15 15:25:59 +01:00
Corentin Jabot	274adcb866	Implement delimited escape sequences. \x{XXXX} \u{XXXX} and \o{OOOO} are accepted in all languages mode in characters and string literals. This is a feature proposed for both C++ (P2290R1) and C (N2785). The papers have been seen by both committees but are not yet adopted into either standard. However, they do have support from both committees.	2021-09-15 09:54:49 -04:00
Jessica Clarke	bbca392a7f	[RISCV][compiler-rt] Move RV64 __riscv_restore_1/0 directives next to labels This looks like it was copied from the RV32 version and not properly updated. This has no functional effect but is not good style.	2021-09-15 14:42:22 +01:00
Jessica Clarke	3c885190af	[RISCV][compiler-rt] Add missing __riscv_save_1/0 labels for RV64 These got missed in D91717.	2021-09-15 14:42:16 +01:00
Filipp Zhinkin	f5d8952356	[InstCombine] Transform X == 0 ? 0 : X * Y --> X * freeze(Y) Enabled mul folding optimization that was previously disabled by being incorrect. To preserve correctness, mul's operand that is not compared with zero in select's condition is now frozen. Related bug: https://bugs.llvm.org/show_bug.cgi?id=51286 Correctness: https://alive2.llvm.org/ce/z/bHef7J https://alive2.llvm.org/ce/z/QcR7sf https://alive2.llvm.org/ce/z/vvBLzt https://alive2.llvm.org/ce/z/jGDXgq https://alive2.llvm.org/ce/z/3Pe8Z4 https://alive2.llvm.org/ce/z/LGga8M https://alive2.llvm.org/ce/z/CTG5fs Differential Revision: https://reviews.llvm.org/D108408	2021-09-15 09:04:06 -04:00
Sanjay Patel	be1028053e	[PhaseOrdering] add tests for PR47023; NFC	2021-09-15 08:44:04 -04:00
Simon Pilgrim	0767e43d87	[CostModel][X86] Adjust bitreverse/ctpop/ctlz/cttz AVX2+ costs based on llvm-mca reports Based on the worst-case numbers generated by D103695, the AVX2/512 bit reversing/counting costs were higher than necessary (based on instruction counts instead of actual throughput).	2021-09-15 13:04:40 +01:00
Martin Storsjö	b4133a21ce	[lldb] [Windows] Fix an incorrect assert in NativeRegisterContextWindows_arm This codepath hadn't been exercised in a build with asserts before. Differential Revision: https://reviews.llvm.org/D109778	2021-09-15 15:03:20 +03:00
Martin Storsjö	b33a43e57c	[ARM] Move fetching of ARMSubtarget into the scopes that need it. NFC. This was requested in D38253, but missed back then. Differential Revision: https://reviews.llvm.org/D109046	2021-09-15 15:03:20 +03:00
Nico Weber	afc45ff06f	[gn build] (manually) port `2c42a73d6c`	2021-09-15 08:01:02 -04:00
Nicolas Vasilache	660f281b5e	[mlir][Linalg] Make codegen strategy late transformations opt-in Summary: Making the late transformations opt-in results in less surprising behavior when composing multiple calls to the codegen strategy. Differential revision: https://reviews.llvm.org/D109820	2021-09-15 11:02:14 +00:00
Nicolas Vasilache	e3889b3059	[mlir][Linalg] Replace DenseSet by UnionFind in ComprehensiveBufferize - NFC AliasInfo can now use union-find for a much more efficient implementation. This brings no functional changes but large performance gains on more complex examples. Differential Revision: https://reviews.llvm.org/D109819	2021-09-15 10:35:54 +00:00
David Green	a2332d5332	[ARM] Prevent continuous folding of SUBC Under some situations under Thumb1, we could be stuck in an infinite loop recombining the same instruction. This puts a limit on that, not combining SUBC with SUBE repeatedly.	2021-09-15 11:23:32 +01:00
Florian Hahn	05c120823b	[DSE] Add capture-before test cases with loads. Add a set of test cases where redundant stores may be removable, depending on whether a local allocation gets captured before performing a load.	2021-09-15 11:13:35 +01:00
David Green	61cc873a8e	[LV] Recognize intrinsic min/max reductions This extends the reduction logic in the vectorizer to handle intrinsic versions of min and max, both the floating point variants already created by instcombine under fastmath and the integer variants from D98152. As a bonus this allows us to match a chain of min or max operations into a single reduction, similar to how add/mul/etc work. Differential Revision: https://reviews.llvm.org/D109645	2021-09-15 10:45:50 +01:00
Simon Pilgrim	dcba994184	[X86] combineX86ShuffleChain - ensure we only peek through bitcasts to vectors (PR51858) When searching for hidden identity shuffles (added at rG41146bfe82aecc79961c3de898cda02998172e4b), only peek through bitcasts to the source operand if it is a vector type as well.	2021-09-15 10:21:05 +01:00
Simon Atanasyan	533471ff2f	[MIPS] Remove unused tblgen template args. NFC Identified in D109359.	2021-09-15 12:16:07 +03:00
Justas Janickas	3b9470a6c4	[OpenCL] Supports optional image types in C++ for OpenCL 2021 Adds support for a feature macro `__opencl_c_images` in C++ for OpenCL 2021 enabling a respective optional core feature from OpenCL 3.0. This change aims to achieve compatibility between C++ for OpenCL 2021 and OpenCL 3.0. Differential Revision: https://reviews.llvm.org/D109002	2021-09-15 10:03:47 +01:00
Cullen Rhodes	18655140d6	[NVPTX] NFC: Remove unused imm type intrinsic arg Identified in D109359. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D109755	2021-09-15 08:56:51 +00:00
David Green	bddfbf91ed	[LV] Min/max intrinsic reduction test cases.	2021-09-15 09:56:19 +01:00
Matthias Springer	934e2f695e	[mlir][linalg] ComprehensiveBufferize: Do not copy InitTensorOp results E.g.: ``` %2 = memref.alloc() {alignment = 128 : i64} : memref<256x256xf32> %3 = memref.alloc() {alignment = 128 : i64} : memref<256x256xf32> // ... (%3 is not written to) linalg.copy(%3, %2) : memref<256x256xf32>, memref<256x256xf32> vector.transfer_write %11, %2[%c0, %c0] {in_bounds = [true, true]} : vector<256x256xf32>, memref<256x256xf32> ``` Avoid copies of %3 if %3 came directly from an InitTensorOp. Differential Revision: https://reviews.llvm.org/D109742	2021-09-15 17:28:04 +09:00
Florian Hahn	e90d55e1c9	[VPlan] Support sinking recipes with uniform users outside sink target. This is a first step towards addressing the last remaining limitation of the VPlan version of sinkScalarOperands: the legacy version can partially sink operands. For example, if a GEP has uniform users outside the sink target block, then the legacy version will sink all scalar GEPs, other than the one for lane 0. This patch works towards addressing this case in the VPlan version by detecting such cases and duplicating the sink candidate. All users outside of the sink target will be updated to use the uniform clone. Note that this highlights an issue with VPValue naming. If we duplicate a replicate recipe, they will share the same underlying IR value and both VPValues will have the same name ir<%gep>. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D104254	2021-09-15 09:21:39 +01:00
Xiang1 Zhang	1f1c71aeac	[X86][InlineAsm] Use mem size information (*word ptr) for "global variable + registers" memory expression in inline asm. Differential Revision: https://reviews.llvm.org/D109739	2021-09-15 16:11:14 +08:00
Alex Zinenko	b10940edfc	[mlir] Update docs on conversion and translation to LLVM Create a new document that explains both stages of the process in a single place, merge and deduplicate the content from the two previous documents. Also extend the documentation to account for the recent changes in pass structure due to standard dialect splitting and translation being more flexible. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D109605	2021-09-15 09:50:21 +02:00
Tobias Gysi	a543abc5ea	[mlir][linalg] Update OpDSL doc (NFC). Update the doc due to recent path changes and point to a helper script.	2021-09-15 07:38:15 +00:00
Amara Emerson	5ec1845cad	[AArch64][GlobalISel] Add a new reassociation for G_PTR_ADDs. G_PTR_ADD (G_PTR_ADD X, C), Y) -> (G_PTR_ADD (G_PTR_ADD(X, Y), C) Improves CTMark -Os on AArch64: Program before after diff sqlite3 286932 287024 0.0% kc 432512 432508 -0.0% SPASS 412788 412764 -0.0% pairlocalalign 249460 249416 -0.0% bullet 475740 475512 -0.0% 7zip-benchmark 568864 568356 -0.1% consumer-typeset 419088 418648 -0.1% tramp3d-v4 367628 367224 -0.1% clamscan 383184 382732 -0.1% lencod 430028 429284 -0.2% Geomean difference -0.1% Differential Revision: https://reviews.llvm.org/D109528	2021-09-14 23:57:41 -07:00
Markus Lavin	1ac209ed76	[NPM] Added -print-pipeline-passes print params for a few passes. Added '-print-pipeline-passes' printing of parameters for those passes declared with _WITH_PARAMS macro in PassRegistry.def. Note that it only prints the parameters declared inside _WITH_PARAMS as in a few cases there appear to be additional parameters not parsable. The following passes are now covered (i.e. all of those with *_WITH_PARAMS in PassRegistry.def). LoopExtractorPass - loop-extract HWAddressSanitizerPass - hwsan EarlyCSEPass - early-cse EntryExitInstrumenterPass - ee-instrument LowerMatrixIntrinsicsPass - lower-matrix-intrinsics LoopUnrollPass - loop-unroll AddressSanitizerPass - asan MemorySanitizerPass - msan SimplifyCFGPass - simplifycfg LoopVectorizePass - loop-vectorize MergedLoadStoreMotionPass - mldst-motion GVN - gvn StackLifetimePrinterPass - print<stack-lifetime> SimpleLoopUnswitchPass - simple-loop-unswitch Differential Revision: https://reviews.llvm.org/D109310	2021-09-15 08:34:04 +02:00
serge-sans-paille	2c42a73d6c	Add extra check for llvm::Any::TypeId visibility This check should ensure we don't reproduce the problem fixed by `02df443d28`. More accurately, it checks every llvm::Any::TypeId symbol in libLLVM-x.so and makes sure they have weak linkage and are not local to the library, which would lead to duplicate definitions if another weak version of the symbol is defined in another linked library. Differential Revision: https://reviews.llvm.org/D109252	2021-09-15 08:32:55 +02:00
Esme-Yi	945df8bc4c	[obj2yaml][XCOFF] Dump sections Summary: This patch implements parsing sections for obj2yaml on AIX. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D98003	2021-09-15 05:16:33 +00:00
Hongtao Yu	0057c7185d	[CSSPGO][llvm-profgen] Truncate stack samples with invalid return address. Invalid frame addresses exist in call stack samples due to bad unwinding. This could happen to frame-pointer-based unwinding and the callee functions that do not have the frame pointer chain set up. It isn't common when the program is built with the frame pointer omission disabled, but can still happen with third-party static libs built with frame pointer omitted. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D109638	2021-09-14 21:56:22 -07:00
Mehdi Amini	0dc461441e	Revert "[flang] Make 'this_image()' an intrinsic function" This reverts commit `81f8ad1769`. This seems to break the shared libs build (linaro-flang-aarch64-sharedlibs bot) with: undefined reference to `Fortran::semantics::IsCoarray(Fortran::semantics::Symbol const&) (from tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/tools.cpp.o) When linking lib/libFortranEvaluate.so.14git	2021-09-15 03:28:34 +00:00
Mehdi Amini	a32300a68f	Make the --mlir-disable-threading command line option override the C++ API usage This seems in-line with the intent and how we build tools around it. Update the description for the flag accordingly. Also use an injected thread pool in MLIROptMain, now we will create threads up-front and reuse them across split buffers. Differential Revision: https://reviews.llvm.org/D109802	2021-09-15 03:20:48 +00:00
cwz920716	500d4c45ba	[MLIR] Use memref.copy ops in BufferResultsToOutParams pass. Both copy/alloc ops are using memref dialect after this change. Reviewed By: silvas, mehdi_amini Differential Revision: https://reviews.llvm.org/D109480	2021-09-15 02:59:30 +00:00
LLVM GN Syncbot	10b069d1a0	[gn build] Port `626586fc25`	2021-09-15 02:29:04 +00:00
Nico Weber	626586fc25	Re-Revert "clang-tidy: introduce readability-containter-data-pointer check" This reverts commit `49992c0414`. The test is still failing on Windows, see comments on https://reviews.llvm.org/D108893	2021-09-14 22:27:59 -04:00
Philip Reames	d4e03bccd4	regen an autogened test which is stale	2021-09-14 18:42:23 -07:00
Matt Arsenault	54d755a034	DAG: Fix incorrect folding of fmul -1 to fneg The fmul is a canonicalizing operation, and fneg is not so this would break denormals that need flushing and also would not quiet signaling nans. Fold to fsub instead, which is also canonicalizing.	2021-09-14 21:25:02 -04:00
Hongtao Yu	299b5d420d	[CSSPGO] Enable pseudo probe instrumentation in O0 mode. Pseudo probe instrumentation was missing from O0 build. It is needed in cases where some source files are built in O0 while the others are built in optimize mode. Reviewed By: wenlei, wlei, wmi Differential Revision: https://reviews.llvm.org/D109531	2021-09-14 18:13:29 -07:00
Thomas Lively	962acf0a27	[lld][WebAssembly] Use llvm-objdump to test __wasm_init_memory Rather than depending on the hex dump from obj2yaml. Now the test shows the expected function body in a human readable format. Differential Revision: https://reviews.llvm.org/D109730	2021-09-14 18:07:59 -07:00
Matt Arsenault	4a36e96c3f	RegAllocGreedy: Account for reserved registers in num regs heuristic This simple heuristic uses the estimated live range length combined with the number of registers in the class to switch which heuristic to use. This was taking the raw number of registers in the class, even though not all of them may be available. AMDGPU heavily relies on dynamically reserved numbers of registers based on user attributes to satisfy occupancy constraints, so the raw number is highly misleading. There are still a few problems here. In the original testcase that made me notice this, the live range size is incorrect after the scheduler rearranges instructions, since the instructions don't have the original InstrDist offsets. Additionally, I think it would be more appropriate to use the number of disjointly allocatable registers in the class. For the AMDGPU register tuples, there are a large number of registers in each tuple class, but only a small fraction can actually be allocated at the same time since they all overlap with each other. It seems we do not have a query that corresponds to the number of independently allocatable registers. Relatedly, I'm still debugging some allocation failures where overlapping tuples seem to not be handled correctly. The test changes are mostly noise. There are a handful of x86 tests that look like regressions with an additional spill, and a handful that now avoid a spill. The worst looking regression is likely test/Thumb2/mve-vld4.ll which introduces a few additional spills. test/CodeGen/AMDGPU/soft-clause-exceeds-register-budget.ll shows a massive improvement by completely eliminating a large number of spills inside a loop.	2021-09-14 21:00:29 -04:00
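
The per-target runtime directory layout described in `f9dbca68d4` can be sketched as a path rewrite. This is only an illustration of the two examples given in that commit message, not the CMake logic itself: the `ARCH_TO_TRIPLE` table and the `per_target_path` helper are hypothetical names, and a real build derives the triple from its configuration rather than from a fixed table.

```python
import re
from pathlib import PurePosixPath

# Hypothetical arch -> triple table, filled in only from the two examples in
# the f9dbca68d4 commit message. A real build derives the triple from CMake.
ARCH_TO_TRIPLE = {
    "x86_64": "x86_64-unknown-linux-gnu",
    "i386": "i386-unknown-linux-gnu",
}

# Old layout: libclang_rt.<name>-<arch>.<ext>, all in one directory.
_OLD_NAME = re.compile(r"^(?P<stem>libclang_rt\.[^-]+)-(?P<arch>[^.]+)(?P<ext>\..+)$")


def per_target_path(old_path: str) -> str:
    """Map an old-layout runtime path to the per-target layout.

    Paths that do not match the old naming scheme, or whose arch suffix is
    not in the table, are returned unchanged.
    """
    path = PurePosixPath(old_path)
    match = _OLD_NAME.match(path.name)
    if match is None or match.group("arch") not in ARCH_TO_TRIPLE:
        return old_path
    triple = ARCH_TO_TRIPLE[match.group("arch")]
    # New layout: <dir>/<triple>/libclang_rt.<name>.<ext>, no arch suffix.
    return str(path.parent / triple / (match.group("stem") + match.group("ext")))


if __name__ == "__main__":
    print(per_target_path("lib/clang/14.0.0/lib/libclang_rt.asan-x86_64.a"))
    print(per_target_path("lib/clang/14.0.0/lib/libclang_rt.asan-i386.a"))
```

The Darwin paths quoted in the same commit message (for example `lib/clang/14.0.0/lib/darwin/libclang_rt.profile_osx.a`) carry no `-<arch>` suffix, so this sketch leaves them untouched, which matches the commit's note that Darwin keeps its existing hierarchy.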

399073 Commits

All Branches