Summary:
Extend -unroll-partial-threshold to 200 for runtime-loop3.ll test
as epilogue unroll initially add 1 more IV to the loop.
From: Evgeny Stupachenko <evstupac@gmail.com>
llvm-svn: 296803
Summary:
Currently both libc++ and libc++abi provide definitions for operator new/delete. However I believe this is incorrect and that one or the other should offer them.
This patch adds the CMake option `-DLIBCXX_ENABLE_NEW_DELETE_DEFINITIONS` which defaults no `ON` unless `-DLIBCXXABI_ENABLE_NEW_DELETE_DEFINITIONS=ON` is specified.
Reviewers: mclow.lists, mehdi_amini, dexonsmith, danalbert, smeenai, mgorny, rmaprath
Reviewed By: mehdi_amini
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30516
llvm-svn: 296802
Summary:
Currently both libc++ and libc++abi provide definitions for operator new/delete. However I believe this is incorrect and that one or the other should offer them.
This patch adds the CMake option `-DLIBCXXABI_ENABLE_NEW_DELETE_DEFINITIONS` which defaults to `OFF` unless otherwise specified. This means that by default
only libc++ provides the new/delete definitions.
Reviewers: mclow.lists, mehdi_amini, dexonsmith, beanz, jroelofs, danalbert, smeenai, rmaprath, mgorny
Reviewed By: mehdi_amini
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30517
llvm-svn: 296801
Summary:
Currently both libc++ and libc++abi provide definitions for new/delete. However libc++abi's definitions haven't been updated to include aligned new/delete or sized deallocation.
I don't see any reason why libc++abi shouldn't provide these newer overloads.
This patch copies libc++'s implementation of `new/delete` into libc++abi so that it's now up to date.
After applying this patch I plan to fix a longstanding bug where both libc++ and libc++abi provide definitions for new/delete.
Reviewers: mclow.lists, mehdi_amini, dexonsmith, danalbert, smeenai, rmaprath, jroelofs
Reviewed By: mehdi_amini
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30514
llvm-svn: 296787
Make opcode selection code for the load instruction a bit easier
to read and maintain.
This patch also catches number of f16 load/store variants that were
not handled before.
Differential Revision: https://reviews.llvm.org/D30513
llvm-svn: 296785
MMX extraction often ends up as extract_i32(bitcast_v2i32(extract_i64(bitcast_v1i64(x86mmx v), 0)), 0) which fails to simplify on 32-bit targets as i64 isn't legal
llvm-svn: 296782
Original commit message:
"Allow externally dlopen-ed libraries to be registered as permanent libraries.
This is also useful in cases when llvm is in a shared library. First we dlopen
the llvm shared library and then we register it as a permanent library in order
to keep the JIT and other services working.
Patch reviewed by Vedant Kumar (D29955)!"
llvm-svn: 296774
This patch reduces the stack frame size by not allocating the parameter area if
it is not required. In the current implementation LowerFormalArguments_64SVR4
already handles the parameter area, but LowerCall_64SVR4 does not
(when calculating the stack frame size). What this patch does is make
LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4.
Committing on behalf of Hiroshi Inoue.
Differential Revision: https://reviews.llvm.org/D29881
llvm-svn: 296771
When we are deciding whether we are creating a PCH or a module, we would
check if the ModuleMgr had any elements to switch into PCH mode.
However, when creating a module, the size may be 1. This would result
in us going down the wrong path.
This was found by cross-compiling the swift standard library. Use the
PCH chain length instead to identify the PCH mode.
Unfortunately, I have not yet been able to create a simple test case for
this, but have verified that this fixes the swift standard library
construction.
Thanks to Adrian Prantl for help and discussions with this change!
llvm-svn: 296769
This bug was introduced with:
https://reviews.llvm.org/rL296699
There may be a way to loosen the restriction, but for now just bail out
on any opaque constant.
The tests show that opacity is target-specific. This goes back to cost
calculations in ConstantHoisting based on TTI->getIntImmCost().
llvm-svn: 296768
This tool allows generating the different between two optimization record
files. The result is a YAML file too that can be visualized with opt-viewer.
This is very useful to see what optimization were added and removed by a
change.
llvm-svn: 296767
We used to exclude arguments but for a diffed YAML file, it's interesting to
show these as changes.
Turns out this also affects gvn/LoadClobbered because we used to squash
multiple entries of this on the same line even if they reported clobbers
by *different* instructions. This increases the number of unique entries now
and the share of gvn/LoadClobbered.
Total number of remarks 902287
Top 10 remarks by pass:
inline 43%
gvn 37%
licm 11%
loop-vectorize 4%
asm-printer 3%
regalloc 1%
loop-unroll 1%
inline-cost 0%
slp-vectorizer 0%
loop-delete 0%
Top 10 remarks:
gvn/LoadClobbered 33%
inline/Inlined 16%
inline/CanBeInlined 14%
inline/NoDefinition 7%
licm/Hoisted 6%
licm/LoadWithLoopInvariantAddressInvalidated 5%
gvn/LoadElim 3%
asm-printer/InstructionCount 3%
inline/TooCostly 2%
loop-vectorize/MissedDetails 2%
llvm-svn: 296766
__getattr__ does not work well with debugging. If the attribute function has
a run-time error, a missing attribute is reported instead.
llvm-svn: 296765
Bug-Point functionality needs extending due to the patch D29185 by bd1976llvm (Allow llvm's build and test systems to support paths with spaces ). It requires Bugpoint to accept the use of spaces within ‘--compile-command’ tokens.
Details
Bugpoint uses the argument ‘--compile-command’ to pass in a command line argument as a string, the string is tokenized by the ‘lexCommand’ function using spaces as a delimiter. Patch D29185 will cause the unit test compile-custom.ll to fail as spaces are now required within tokens and as a delimiter. This patch allows the use of escape characters as below:
Two consecutive '\' evaluate to a single '\'.
A space after a '\' evaluates to a space that is not interpreted as a delimiter.
Any other instances of the '\' character are removed.
Committed on behalf of Owen Reynolds
Differential revision: https://reviews.llvm.org/D29940
llvm-svn: 296763
This re-applies r289696, which caused TSan perf regression, which has
since been addressed in separate changes (see PR for details).
See PR31382.
llvm-svn: 296759
The CallingConv.td rules allocate 8 bytes for these kinds of arguments
on AAPCS targets, but we were only recording the smaller amount. The
difference is theoretical on AArch64 because we don't actually store
more than the smaller amount, but it's still much better to have these
two components in agreement.
Based on Diana Picus's ARM equivalent patch (where it matters a lot
more).
llvm-svn: 296754
* suggest static_cast instead of reinterpret_cast for casts from void*
* top-level const doesn't need a const_cast
* don't emit a separate "possibly redundant cast" warning, instead suggest
static_cast (in C++ only) and add a little hint to consider removing the cast
llvm-svn: 296753
Summary:
When InstCombine is optimizing certain select-cmp-br patterns
it replaces the result of the select in uses outside of the
basic block containing the select. This is only legal if the
path from the select to the outside use is disjoint from all
other paths out from the originating basic block.
The problem found was that InstCombiner::replacedSelectWithOperand
did not consider the case when both edges out from the br pointed
to the same label. In that case the paths aren't disjoint and the
transformation is illegal. This patch avoids the faulty rewrites
by verifying that there is a single flow to the successor where
we want to replace uses.
Reviewers: llvm-commits, spatel, majnemer
Differential Revision: https://reviews.llvm.org/D30455
llvm-svn: 296752
After r296750, we're able to match interleaved accesses having types wider than
128 bits. This patch updates the associated TTI costs.
Differential Revision: https://reviews.llvm.org/D29675
llvm-svn: 296751
This patch teaches (ARM|AArch64)ISelLowering.cpp to match illegal vector types
to interleaved access intrinsics as long as the types are multiples of the
vector register width. A "wide" access will now be mapped to multiple
interleave intrinsics similar to the way in which non-interleaved accesses with
illegal types are legalized into multiple accesses. I'll update the associated
TTI costs (in getInterleavedMemoryOpCost) as a follow-on.
Differential Revision: https://reviews.llvm.org/D29466
llvm-svn: 296750
When computing the smallest and largest types for selecting the maximum
vectorization factor, we currently ignore loads and stores of pointer types if
the memory access is non-consecutive. We do this because such accesses must be
scalarized regardless of vectorization factor, and thus shouldn't be considered
when determining the factor. This patch makes this check less aggressive by
also considering non-consecutive accesses that may be vectorized, such as
interleaved accesses. Because we don't know at the time of the check if an
accesses will certainly be vectorized (this is a cost model decision given a
particular VF), we consider all accesses that can potentially be vectorized.
Differential Revision: https://reviews.llvm.org/D30305
llvm-svn: 296747
These loads cannot be savely hoisted as the condition guarding the
non-affine region cannot be duplicated to also protect the hoisted load
later on. Today they are dropped in ScopInfo. By checking for this early, we
do not even try to model them and possibly can still optimize smaller regions
not containing this specific required-invariant load.
llvm-svn: 296744
If dominator tree is not calculated or is invalidated, set corresponding
pointer in the pass state to nullptr. Such pointer value will indicate
that operations with dominator tree are not allowed. In particular, it
allows to skip verification for such pass state. The dominator tree is
not calculated if the machine dominator pass was skipped, it occures in
the case of entities with linkage available_externally.
The change fixes some test fails observed when expensive checks
are enabled.
Differential Revision: https://reviews.llvm.org/D29280
llvm-svn: 296742
this test was using the VPATH hack to avoid having a copy of the
inferior source code. This makes the test fail if in happens to run
concurrently with a test in the parent folder. Fix that by moving it up
to the parent.
llvm-svn: 296741
MSVC (at least the version I am using) does not want to implicitly
capture a const bool variable. Move it into the lambda, as it is not
used outside anyway.
llvm-svn: 296738
Summary:
This patch makes the namespace comment fixer use the number of unwrapped lines
that a namespace spans to detect it that namespace is short, thus not needing
end comments to be added.
This is needed to ensure clang-format is idempotent. Previously, a short namespace
was detected by the original source code lines. This has the effect of requiring two
runs for this example:
```
namespace { class A; }
```
after first run:
```
namespace {
class A;
}
```
after second run:
```
namespace {
class A;
} // namespace
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D30528
llvm-svn: 296736
Summary:
Hello everybody,
this is an incremental patch for the NoMalloc-Checker I wrote. It allows to configure the memory-management functions, that are checked,
This might be helpful for a code base with custom functions in use, or non-standard functionality, like posix_memalign.
Reviewers: aaron.ballman, hokein, alexfh
Reviewed By: aaron.ballman, alexfh
Subscribers: sbenza, nemanjai, JDevlieghere
Tags: #clang-tools-extra
Patch by Jonas Toth!
Differential Revision: https://reviews.llvm.org/D28239
llvm-svn: 296734