llvm-project

Commit Graph

Author	SHA1	Message	Date
Aditya Nandakumar	db464a3dbf	[GISel] Add new GISel combiners for G_SELECT https://reviews.llvm.org/D83833 Patch adds two new GICombinerRules for G_SELECT. The rules include: combining selects with undef comparisons into their first selectee value, and to combine away selects with constant comparisons. Patch additionally adds a new combiner test for the AArch64 target to test these new G_SELECT combiner rules and the existing select_same_val combiner rule. Patch by mkitzan	2020-08-27 09:40:15 -07:00
Jonas Devlieghere	a7e4a17735	[lldb] Make lldb-argdumper a dependency of liblldb Always make lldb-argdumper a dependency of liblldb. Currently it is only a dependency of the python swig target because of the relative symlink in the python resource directory. That means that the dependency won't be there when LLDB_ENABLE_PYTHON is disabled. Differential revision: https://reviews.llvm.org/D86722	2020-08-27 09:31:02 -07:00
Jonas Devlieghere	b981924bdd	[lldb] Move triple construction out of getArchCFlags in DarwinBuilder (NFC) Move the construction of the triple out of getArchCFlags in the DarwinBuilder.	2020-08-27 09:31:01 -07:00
Arthur Eubanks	dd04fa17d7	[OCaml] Remove add_constant_propagation After https://reviews.llvm.org/D85159.	2020-08-27 09:30:21 -07:00
Simon Moll	c48b06c44f	[sda][nfc] clang-formatting	2020-08-27 18:27:44 +02:00
Shinji Okumura	7a68f0f1e0	[Attributor] Add a phase flag to Attributor Add a new flag that indicates which stage in the process we are in. This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86678	2020-08-28 01:16:38 +09:00
Aditya Nandakumar	5c2db1655b	[GISel]: Fix one more CSE Non determinism https://reviews.llvm.org/D86676 Sometimes we can have the following code x:gpr(s32) = G_OP Say we build G_OP2 to the same x and then delete the previous instruction. Using something like Register X = ...; auto NewMIB = CSEBuilder.buildOp2(X, ... args); Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (ie it doesn't consider register bank/register class along with type).Unify the profiling by refactoring and calling the common method. This was found by turning on the CSEInfo::verify in at the end of each of our GISel passes which turns inconsistent state/non determinism in CSEing into crashes which likely usually indicates missing calls to Observer on mutations (the most common case). Here non determinism usually means not cseing sometimes, but almost never about producing incorrect code. Also this patch adds this verification at the end of the combiners as well.	2020-08-27 09:06:21 -07:00
Lucas Prates	3d943bcd22	[CodeGen] Properly propagating Calling Convention information when lowering vector arguments When joining the legal parts of vector arguments into its original value during the lower of Formal Arguments in SelectionDAGBuilder, the Calling Convention information was not being propagated for the handling of each individual parts. The same did not happen when lowering calls, causing a mismatch. This patch fixes the issue by properly propagating the Calling Convention details. This fixes Bugzilla #47001. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D86715	2020-08-27 17:01:10 +01:00
Benjamin Kramer	fddf543e6e	[MLIR][GPUToSPIRV] Fix use-after-free. Found by asan.	2020-08-27 17:57:11 +02:00
Teresa Johnson	7ed8124d46	[HeapProf] Clang and LLVM support for heap profiling instrumentation See RFC for background: http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html Note that the runtime changes will be sent separately (hopefully this week, need to add some tests). This patch includes the LLVM pass to instrument memory accesses with either inline sequences to increment the access count in the shadow location, or alternatively to call into the runtime. It also changes calls to memset/memcpy/memmove to the equivalent runtime version. The pass is modeled on the address sanitizer pass. The clang changes add the driver option to invoke the new pass, and to link with the upcoming heap profiling runtime libraries. Currently there is no attempt to optimize the instrumentation, e.g. to aggregate updates to the same memory allocation. That will be implemented as follow on work. Differential Revision: https://reviews.llvm.org/D85948	2020-08-27 08:50:35 -07:00
Mikhail Maltsev	a19fd1aab5	Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY" This reverts commit `3b71f91558`. The commit is breaking some build bots.	2020-08-27 16:48:10 +01:00
Roman Lebedev	6102310d81	[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify, nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places.. While i could teach EarlyCSE how to hash PHI nodes, we can't really do much (anything?) even if we find two identical PHI nodes in different basic blocks, same-BB case is the interesting one, and if we teach InstSimplify about it (which is what i wanted originally, https://reviews.llvm.org/D86530), we get EarlyCSE support for free. So i would think this is pretty uncontroversial. On vanilla llvm test-suite + RawSpeed, this has the following effects: ``` \| statistic name \| baseline \| proposed \| Δ \| % \| \\|%\\| \| \|----------------------------------------------------\|-----------\|-----------\|-------:\|---------:\|---------:\| \| instsimplify.NumPHICSE \| 0 \| 23779 \| 23779 \| 0.00% \| 0.00% \| \| asm-printer.EmittedInsts \| 7942328 \| 7942392 \| 64 \| 0.00% \| 0.00% \| \| assembler.ObjectBytes \| 273069192 \| 273084704 \| 15512 \| 0.01% \| 0.01% \| \| correlated-value-propagation.NumPhis \| 18412 \| 18539 \| 127 \| 0.69% \| 0.69% \| \| early-cse.NumCSE \| 2183283 \| 2183227 \| -56 \| 0.00% \| 0.00% \| \| early-cse.NumSimplify \| 550105 \| 542090 \| -8015 \| -1.46% \| 1.46% \| \| instcombine.NumAggregateReconstructionsSimplified \| 73 \| 4506 \| 4433 \| 6072.60% \| 6072.60% \| \| instcombine.NumCombined \| 3640264 \| 3664769 \| 24505 \| 0.67% \| 0.67% \| \| instcombine.NumDeadInst \| 1778193 \| 1783183 \| 4990 \| 0.28% \| 0.28% \| \| instcount.NumCallInst \| 1758401 \| 1758799 \| 398 \| 0.02% \| 0.02% \| \| instcount.NumInvokeInst \| 59478 \| 59502 \| 24 \| 0.04% \| 0.04% \| \| instcount.NumPHIInst \| 330557 \| 330533 \| -24 \| -0.01% \| 0.01% \| \| instcount.TotalInsts \| 8831952 \| 8832286 \| 334 \| 0.00% \| 0.00% \| \| simplifycfg.NumInvokes \| 4300 \| 4410 \| 110 \| 2.56% \| 2.56% \| \| simplifycfg.NumSimpl \| 1019808 \| 999607 \| -20201 \| -1.98% \| 1.98% \| ``` I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call` transforms, and counter-intuitively results in more instructions total. That being said, the PHI count doesn't decrease that much, and looking at some examples, it seems at least some of them were previously getting PHI CSE'd in SimplifyCFG of all places.. I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time. As a comment in `InstCombinerImpl::visitPHINode()` already stated, there are no guarantees on the ordering of the operands of a PHI node, so if we just naively compare them, we may false-negatively say that the nodes are not equal when the only difference is operand order, which is especially important since the fold is in InstSimplify, so we can't rely on InstCombine sorting them beforehand. Fixing this for the general case is costly (geomean +0.02%), and does not appear to catch anything in test-suite, but for the same-BB case, it's trivial, so let's fix at least that. As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions this appears to cause geomean +0.03% compile time increase (regression), but geomean -0.01%..-0.04% code size decrease (improvement).	2020-08-27 18:47:04 +03:00
Roman Lebedev	94d3dd8b08	[NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes PHI nodes depend on the block they're in, so we can only deal with the most basic case of same-BB PHI's.	2020-08-27 18:47:03 +03:00
Russell Gallop	c9455d3c57	[Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL This hasn't been allowed as a build option since r309990 Remove leftover REQUIRES: global-isel Differential Revision: https://reviews.llvm.org/D86714	2020-08-27 16:36:27 +01:00
Louis Dionne	49644cd941	[libc++] Install a more recent CMake on libc++ builders	2020-08-27 11:26:27 -04:00
David Nicuesa	3b71f91558	[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY Fix compilation with -DLIBCXX_BUILD_EXTERNAL_THREAD_LIBRARY when using clang. Now linking target 'cxx_external_threads' with 'cxx-headers'. Fix mismatching visibility for `libcpp_timed_backoff_policy` function in file <__threading_support>. Reviewed By: #libc, ldionne Differential Revision: https://reviews.llvm.org/D86598	2020-08-27 16:24:19 +01:00
Cullen Rhodes	42587345a3	[CodeGen][AArch64] Support arm_sve_vector_bits attribute This patch implements codegen for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1]. The purpose of this attribute is to define vector-length-specific (VLS) versions of existing vector-length-agnostic (VLA) types. VLSTs are represented as VectorType in the AST and fixed-length vectors in the IR everywhere except in function args/return. Implemented in this patch is codegen support for the following: * Implicit casting between VLA <-> VLS types. * Coercion of VLS types in function args/return. * Mangling of VLS types. Casting is handled by the CK_BitCast operation, which has been extended to support the two new vector kinds for fixed-length SVE predicate and data vectors, where the cast is implemented through memory rather than a bitcast which is unsupported. Implementing this as a normal bitcast would require relaxing checks in LLVM to allow bitcasting between scalable and fixed types. Another option was adding target-specific intrinsics, although codegen support would need to be added for these intrinsics. Given this, casting through memory seemed like the best approach as it's supported today and existing optimisations may remove unnecessary loads/stores, although there is room for improvement here. Coercion of VLSTs in function args/return from fixed to scalable is implemented through the AArch64 ABI in TargetInfo. The VLA and VLS types are defined by the ACLE to map to the same machine-level SVE vectors. VLS types are mangled in the same way as: __SVE_VLS<typename, unsigned> where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. For example: #if __ARM_FEATURE_SVE_BITS==512 // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE typedef svint32_t vec __attribute__((arm_sve_vector_bits(512))); // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE typedef svbool_t pred __attribute__((arm_sve_vector_bits(512))); #endif The latest ACLE specification (00bet5) does not contain details of this mangling scheme, it will be specified in the next revision. The mangling scheme is otherwise defined in the appendices to the Procedure Call Standard for the Arm Architecture, see [2] for more information. [1] https://developer.arm.com/documentation/100987/latest [2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85743	2020-08-27 15:11:58 +00:00
Alexandre Ganea	a6a37a2fcd	[Support] On Windows, add optional support for {rpmalloc\|snmalloc\|mimalloc} This patch optionally replaces the CRT allocator (i.e., malloc and free) with rpmalloc (mixed public domain licence/MIT licence) or snmalloc (MIT licence) or mimalloc (MIT licence). Please note that the source code for these allocators must be available outside of LLVM's tree. To enable, use `cmake ... -DLLVM_INTEGRATED_CRT_ALLOC=D:/git/rpmalloc -DLLVM_USE_CRT_RELEASE=MT` where `D:/git/rpmalloc` has already been git clone'd from `https://github.com/mjansson/rpmalloc`. The same applies to snmalloc and mimalloc. When enabled, the allocator will be embeded (statically linked) into the LLVM tools & libraries. This currently only works with the static CRT (/MT), although using the dynamic CRT (/MD) could potentially work as well in the future. When enabled, this changes the memory stack from: new/delete -> MS VC++ CRT malloc/free -> HeapAlloc -> VirtualAlloc to: new/delete -> {rpmalloc\|snmalloc\|mimalloc} -> VirtualAlloc The goal of this patch is to bypass the application's global heap - which is thread-safe thus inducing locking - and instead take advantage of a modern lock-free, thread cache, allocator. On a 6-core Xeon Skylake we observe a 2.5x decrease in execution time when linking a large scale application with LLD and ThinLTO (12 min 20 sec -> 5 min 34 sec), when all hardware threads are being used (using LLD's flag /opt:lldltojobs=all). On a dual 36-core Xeon Skylake with all hardware threads used, we observe a 24x decrease in execution time (1 h 2 min -> 2 min 38 sec) when linking a large application with LLD and ThinLTO. Clang build times also see a decrease in the range 5-10% depending on the configuration. Differential Revision: https://reviews.llvm.org/D71786	2020-08-27 11:09:46 -04:00
diggerlin	6923b0a76e	Revert "[AIX][XCOFF] emit symbol visibility for xcoff object file." This reverts commit `a081868921`. Based on the Hubert Tong'comment https://reviews.llvm.org/D84265#inline-799085	2020-08-27 11:07:58 -04:00
Alexandre E. Eichenberger	a14a2805b0	[MLIR] MemRef Normalization for Dialects When dealing with dialects that will results in function calls to external libraries, it is important to be able to handle maps as some dialects may require mapped data. Before this patch, the detection of whether normalization can apply or not, operations are compared to an explicit list of operations (`alloc`, `dealloc`, `return`) or to the presence of specific operation interfaces (`AffineReadOpInterface`, `AffineWriteOpInterface`, `AffineDMAStartOp`, or `AffineDMAWaitOp`). This patch add a trait, `MemRefsNormalizable` to determine if an operation can have its `memrefs` normalized. This trait can be used in turn by dialects to assert that such operations are compatible with normalization of `memrefs` with nontrivial memory layout specification. An example is given in the literal tests. Differential Revision: https://reviews.llvm.org/D86236	2020-08-27 20:26:59 +05:30
Benjamin Kramer	b5924a8e27	[Hexagon] Fold another layer of single-use variable into assert. NFCI.	2020-08-27 16:52:34 +02:00
Benjamin Kramer	2b7df2707f	[Hexagon] Fold single-use variable into assert. NFCI.	2020-08-27 16:44:22 +02:00
Pavel Labath	dd635062d8	[lldb/cmake] Fix linking of lldbSymbolHelpers for `9cb222e7` I didn't find this locally because I have a /usr/include/gtest which is similar enough to the bundled one to make things appear to work.	2020-08-27 16:40:17 +02:00
Matt Arsenault	6c770a09be	AMDGPU: Hoist subtarget lookup	2020-08-27 10:27:56 -04:00
Krzysztof Parzyszek	154daf1f94	[Hexagon] Widen short vector stores to HVX vectors using masked stores Also invent a flag -hexagon-hvx-widen=N to set the minimum threshold for widening short vectors to HVX vectors.	2020-08-27 09:25:08 -05:00
Florian Hahn	419c6948df	[SimplifyLibCalls] Remove over-eager early return in strlen optzns. Currently we bail out early for strlen calls with a GEP operand, if none of the GEP specific optimizations fire. But there could be later optimizations that still apply, which we currently miss out on. An example is that we do not apply the following optimization strlen(x) == 0 --> *x == 0 Unless I am missing something, there seems to be no reason for bailing out early there. Fixes PR47149. Reviewed By: lebedev.ri, xbolva00 Differential Revision: https://reviews.llvm.org/D85886	2020-08-27 15:19:45 +01:00
Pavel Labath	5b2b754565	[lldb/cmake] Fix linking of lldbUtilityHelpers for `9cb222e74`	2020-08-27 16:06:59 +02:00
Pavel Labath	0de1463373	[lldb] Fix Type::GetByteSize for pointer types The function was returning an incorrect (empty) value on the first invocation. Given that this only affected the first invocation, this bug/typo went mostly unaffected. DW_AT_const_value were particularly badly affected by this as the GetByteSize call is SymbolFileDWARF::ParseVariableDIE is likely to be the first call of this function, and its effects cannot be undone by retrying. Depends on D86348. Differential Revision: https://reviews.llvm.org/D86436	2020-08-27 15:37:49 +02:00
Pavel Labath	9cb222e749	[cmake] Make gtest include directories a part of the library interface This applies the same fix that D84748 did for macro definitions. Appropriate include path is now automatically set for all libraries which link against gtest targets, which avoids the need to set include_directories in various parts of the project. Differential Revision: https://reviews.llvm.org/D86616	2020-08-27 15:35:57 +02:00
Sam McCall	266825620c	[Tooling][Format] Treat compound extensions (foo.bar.cc) as matching foo.h Motivating use case is ".cu.cc" extensions used in some bazel projects. Alternative is to work around this with IncludeIsMainRegex in styles. I proposed this approach because it seems like a better default. Differential Revision: https://reviews.llvm.org/D86597	2020-08-27 15:24:17 +02:00
Pavel Labath	9f5927e42b	[lldb/DWARF] Fix handling of variables with both location and const_value attributes Class-level static constexpr variables can have both DW_AT_const_value (in the "declaration") and a DW_AT_location (in the "definition") attributes. Our code was trying to handle this, but it was brittle and hard to follow (and broken) because it was processing the attributes in the order in which they were found. Refactor the code to make the intent clearer -- DW_AT_location trumps DW_AT_const_value, and fix the bug which meant that we were not displaying these variables properly (the culprit was the delayed parsing of the const_value attribute due to a need to fetch the variable type. Differential Revision: https://reviews.llvm.org/D86615	2020-08-27 15:05:47 +02:00
Pavel Labath	219ccdfdde	[lldb/Utility] Use APSInt in the Scalar class This enables us to further simplify some code because it no longer needs to switch on the signedness of the type (APSInt handles that).	2020-08-27 15:05:47 +02:00
serge-sans-paille	4e29d25669	Fix OpenMP deduplicateRuntimeCalls return status Differential Revision: https://reviews.llvm.org/D86705	2020-08-27 15:01:04 +02:00
serge-sans-paille	5621571fc7	Fix Attributor return status Differential Revision: https://reviews.llvm.org/D86703	2020-08-27 15:01:04 +02:00
Eduardo Caldas	ac87a0b587	[SyntaxTree][NFC][Style] Functions start with lowercase Differential Revision: https://reviews.llvm.org/D86682	2020-08-27 12:55:24 +00:00
Eduardo Caldas	fda3fa822c	[SyntaxTree][NFC] Append "get" to syntax Nodes accessor names Differential Revision: https://reviews.llvm.org/D86679	2020-08-27 12:55:23 +00:00
Raul Tambre	45344cf7ac	[CMake][compiler-rt][libunwind] Compile assembly files as ASM not C, unify workarounds It isn't very wise to pass an assembly file to the compiler and tell it to compile as a C file and hope that the compiler recognizes it as assembly instead. Simply don't mark the file as C and CMake will recognize the rest. This was attempted earlier in https://reviews.llvm.org/D85706, but reverted due to architecture issues on Apple. Subsequent digging revealed a similar change was done earlier for libunwind in https://reviews.llvm.org/rGb780df052dd2b246a760d00e00f7de9ebdab9d09. Afterwards workarounds were added for MinGW and Apple: * https://reviews.llvm.org/rGb780df052dd2b246a760d00e00f7de9ebdab9d09 * https://reviews.llvm.org/rGd4ded05ba851304b26a437896bc3962ef56f62cb The workarounds in libunwind and compiler-rt are unified and comments added pointing to each other. The workaround is updated to only be used for MinGW for CMake versions before 3.17, which fixed the issue (https://gitlab.kitware.com/cmake/cmake/-/merge_requests/4287). Additionally fixed Clang not being passed as the assembly compiler for compiler-rt runtime build. Example error: [525/634] Building C object lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o FAILED: lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o /opt/tooling/drive/host/bin/clang --target=aarch64-linux-gnu -I/opt/tooling/drive/llvm/compiler-rt/lib/tsan/.. -isystem /opt/tooling/drive/toolchain/opt/drive/toolchain/include -x c -Wall -Wno-unused-parameter -fno-lto -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -O3 -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -Wno-non-virtual-dtor -fPIE -fno-rtti -Wframe-larger-than=530 -Wglobal-constructors --sysroot=. -MD -MT lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o -MF lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o.d -o lib/tsan/CMakeFiles/clang_rt.tsan-aarch64.dir/rtl/tsan_rtl_aarch64.S.o -c /opt/tooling/drive/llvm/compiler-rt/lib/tsan/rtl/tsan_rtl_aarch64.S /opt/tooling/drive/llvm/compiler-rt/lib/tsan/rtl/tsan_rtl_aarch64.S:29:1: error: expected identifier or '(' .section .text ^ 1 error generated. Differential Revision: https://reviews.llvm.org/D86308	2020-08-27 15:40:15 +03:00
Jay Foad	45eeb8c2a9	[AMDGPU] Remove unused variable introduced in r251860	2020-08-27 13:28:32 +01:00
Drew Wock	0ec098e22b	[FPEnv] Allow fneg + strict_fadd -> strict_fsub in DAGCombiner This is the first of a set of DAGCombiner changes enabling strictfp optimizations. I want to test to waters with this to make sure changes like these are acceptable for the strictfp case- this particular change should preserve exception ordering and result precision perfectly, and many other possible changes appear to be able to as well. Copied from regular fadd combines but modified to preserve ordering via the chain, this change allows strict_fadd x, (fneg y) to become struct_fsub x, y and strict_fadd (fneg x), y to become strict_fsub y, x. Differential Revision: https://reviews.llvm.org/D85548	2020-08-27 08:17:01 -04:00
Martin Storsjö	df8f3bf626	[LLD] [COFF] Check the aux section definition size for IMAGE_COMDAT_SELECT_SAME_SIZE Binutils generated sections seem to be padded to a multiple of 16 bytes, but the aux section definition contains the original, unpadded section length. The size check used for IMAGE_COMDAT_SELECT_SAME_SIZE previously only checked the size of the section itself. When checking the currently processed object file against the previously chosen comdat section, we easily have access to the aux section definition of the currently processed section, but we have to iterate over the symbols of the previously selected object file to find the section definition of the previously picked section. (We don't want to inflate SectionChunk to carry more data, for something that is only needed in corner cases.) Only do this when the mingw flag is set. This fixes statically linking clang-built C++ object files against libstdc++ built with GCC, if the object files contain e.g. typeinfo. Differential Revision: https://reviews.llvm.org/D86659	2020-08-27 15:08:57 +03:00
Martin Storsjö	e72403f96d	[LLD] [MinGW] Enable dynamicbase by default This matches lld-link's own default. Add a new command line option --no-dynamicbase for disabling it. (Unfortunately, GNU ld doesn't yet have a matching --no-dynamicbase option, as that's the default there.) Differential Revision: https://reviews.llvm.org/D86654	2020-08-27 15:08:53 +03:00
Russell Gallop	c17718e0ff	Fix for PS4 bots after `0b7f6cc71a`	2020-08-27 12:47:26 +01:00
Florian Hahn	bb024c3c4e	[DSE,MemorySSA] Remove short-cut to check if all paths are covered. The post-order number early continue does not work in some cases, e.g. if a path from EarlierAccess to an exit includes a node that dominates EarlierAccess in a cycle. The short-cut only has very minor impact on compile-time, so it seems straight-forward to remove it for now: http://llvm-compile-time-tracker.com/compare.php?from=062412e79fcfedf2cf004433e42036b0333e3f83&to=d7386016a77ce1387bdbbf360f1de157faea9d31&stat=instructions Fixes PR47285.	2020-08-27 12:42:40 +01:00
Anatoly Trosinenko	fce035eae9	[NFC][compiler-rt] Factor out __mulo[sdt]i4 implementations to .inc file The existing implementations are almost identical except for width of the integer type. Factor them out to int_mulo_impl.inc for better maintainability. This patch is almost identical to D86277. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D86289	2020-08-27 14:33:48 +03:00
Anatoly Trosinenko	182d14db07	[NFC][compiler-rt] Factor out __mulv[sdt]i3 implementations to .inc file The existing implementations are almost identical except for width of the integer type. Factor them out to int_mulv_impl.inc for better maintainability. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D86277	2020-08-27 14:33:48 +03:00
Andrew Ng	d4e2e2852a	[ELF][test] Add test coverage of TLS to gc-sections.s Differential Revision: https://reviews.llvm.org/D86639	2020-08-27 12:28:51 +01:00
OCHyams	b6cca0ec05	Revert "[DWARF] Add cuttoff guarding quadratic validThroughout behaviour" This reverts commit `b9d977b0ca`. This cutoff is no longer required. The commit 34ffa7fc501 (D86153) introduces a performance improvement which was tested against the motivating case for this patch. Discussed in differential revision: https://reviews.llvm.org/D86153	2020-08-27 11:52:30 +01:00
OCHyams	57d8acac64	[DwarfDebug] Improve validThroughout performance (4/4) Almost NFC (see end). The backwards scan in validThroughout significantly contributed to compile time for a pathological case, causing the 'X86 Assembly Printer' pass to account for roughly 70% of the run time. This patch guards the loop against running unnecessarily, bringing the pass contribution down to 4%. Almost NFC: There is a hack in validThroughout which promotes single constant value DBG_VALUEs in the prologue to be live throughout the function. We're more likely to hit this code path with this patch applied. Similarly to the parent patches there is a small coverage change reported in the order of 10s of bytes. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D86153	2020-08-27 11:52:30 +01:00
OCHyams	3c491881d2	[DwarfDebug] Improve multi-BB single location detection in validThroughout (3/4) With the changes introduced in D86151 we can now check for single locations which span multiple blocks for inlined scopes and blocks. D86151 introduced the InstructionOrdering parameter, replacing a scan through MBB instructions. The functionality to compare instruction positions across blocks was add there, and this patch just removes the exit checks that were previously (but no longer) required. CTMark shows a geomean binary size reduction of 2.2% for RelWithDebInfo builds. llvm-locstats (using D85636) shows a very small variable location coverage change in 5 of 10 binaries, but just like in D86151 it is only in the order of 10s of bytes. Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D86152	2020-08-27 11:52:29 +01:00
OCHyams	0b5a8050ea	[DwarfDebug] Improve single location detection in validThroughout (2/4) With this patch we're now accounting for two more cases which should be considered 'valid throughout': First, where RangeEnd is ScopeEnd. Second, where RangeEnd comes before ScopeEnd when including meta instructions, but are both preceded by the same non-meta instruction. CTMark shows a geomean binary size reduction of 1.5% for RelWithDebInfo builds. `llvm-locstats` (using D85636) shows a very small variable location coverage change in 2 of 10 binaries, but it is in the order of 10s of bytes which lines up with my expectations. I've added a test which checks both of these new cases. The first check in the test isn't strictly necessary for this patch. But I'm not sure that it is explicitly tested anywhere else, and is useful for the final patch in the series. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D86151	2020-08-27 11:52:29 +01:00

1 2 3 4 5 ...

364697 Commits All Branches Search

364697 Commits

All Branches