As suggested in D45842
(although it's still not certain whether we're going to advance that),
we must invalidate references to instructions that have
been recycled (their operands were changed, so the result is different).
llvm-svn: 331083
Some of the bots were failing in a different way from the others: they were
unable to compare tuples. Fix this by changing to a struct, thereby avoiding
the quirks of tuples.
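For illustration, a minimal sketch of the kind of change described here (the
names are hypothetical, not the actual patch): a small struct with an explicit
operator< replaces a std::tuple whose comparison some toolchains mishandled.

  // Hypothetical replacement key: explicit comparison instead of relying on
  // std::tuple's operator<, which the failing bots could not handle.
  struct SampleKey {
    unsigned Line;
    unsigned Column;
    bool operator<(const SampleKey &RHS) const {
      if (Line != RHS.Line)
        return Line < RHS.Line;
      return Column < RHS.Column;
    }
  };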
llvm-svn: 331081
We currently have a hard-to-solve analysis problem around the order of instructions within a potentially throwing block. We can't cheaply determine whether a given instruction is before the first potential throw in the block. While we're working on that in the background, special case the first instruction within the header.
Why this particular special case? Well, headers are guaranteed to execute if the loop does, and it turns out we tend to produce this form in practice.
In a follow-on patch, I intend to extend LICM with an alternate approach which works for any instruction in the header before the first throw, but this is the best I can come up with for other users of the analysis (such as store promotion).
Note: I can't show the difference in the analysis result since we're ORing in the expensive instruction walk used by SCEV. Using the full walk is not suitable for a general solution.
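As a rough sketch of the special case (a hypothetical helper, not the actual
patch): the first instruction of the header is trivially known to run whenever
the loop is entered, since nothing in the block can throw before it.

  #include "llvm/Analysis/LoopInfo.h"
  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instruction.h"
  using namespace llvm;

  // The header executes whenever the loop does, and no potential throw in
  // the header can precede its very first instruction.
  static bool isFirstInstructionInHeader(const Instruction &I, const Loop &L) {
    const BasicBlock *Header = L.getHeader();
    return Header && &I == &Header->front();
  }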
llvm-svn: 331079
Summary:
Only allow a single unique .symver alias per symbol. This matches the
behavior of gas. While looking at https://reviews.llvm.org/D45798, I noticed
that we ignored multiple mismatched .symver directives.
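A minimal sketch of the rule (illustrative standalone code, not the actual MC
implementation): remember the first alias recorded for a symbol and reject a
later, different one, while still allowing exact repeats.

  #include <map>
  #include <string>

  // Returns false when a second, mismatched .symver alias is seen for Sym.
  static bool recordSymverAlias(std::map<std::string, std::string> &Aliases,
                                const std::string &Sym,
                                const std::string &Alias) {
    auto Inserted = Aliases.insert({Sym, Alias});
    return Inserted.second || Inserted.first->second == Alias;
  }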
Reviewers: pcc, tejohnson, espindola
Reviewed By: pcc
Subscribers: emaste, arichardson, llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D45845
llvm-svn: 331078
These maps are small, but we are creating and destroying one for each
input .eh_frame.
This patch reduces the total memory allocation from 765.54MB to
749.19MB. The peak is still the same: 563.7MB.
llvm-svn: 331075
Extend the live-in check to cover all aliased registers so that we can
allow sinking Copy instructions when only the implicit def is in the
successor's live-in list.
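A generic sketch of what "check the live-ins for all aliases" means (a
hypothetical helper; the actual change extends an existing check rather than
adding a new function):

  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/TargetRegisterInfo.h"
  using namespace llvm;

  // Visit Reg itself and every register that overlaps it, and report whether
  // any of them is live into the successor block.
  static bool anyAliasLiveIn(unsigned Reg, const MachineBasicBlock &Succ,
                             const TargetRegisterInfo *TRI) {
    for (MCRegAliasIterator AI(Reg, TRI, /*IncludeSelf=*/true); AI.isValid();
         ++AI)
      if (Succ.isLiveIn(*AI))
        return true;
    return false;
  }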
llvm-svn: 331072
Summary:
Currently only the memory size is supported, but other properties can be
added as needed.
narrowScalar for G_LOAD and G_STORE now correctly updates the
MachineMemOperand and refuses to legalize atomics, since those need more
careful expansions to maintain atomicity.
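A rough sketch of the two behaviors from the summary (a hypothetical helper,
not the actual LegalizerHelper code): bail out on atomic accesses and build a
narrowed MachineMemOperand for each piece of a split load or store.

  #include "llvm/CodeGen/MachineFunction.h"
  #include "llvm/CodeGen/MachineMemOperand.h"
  using namespace llvm;

  // Returns the MMO for one narrowed piece, or nullptr for atomics, which
  // need a more careful expansion to stay atomic.
  static MachineMemOperand *narrowPieceMMO(MachineFunction &MF,
                                           MachineMemOperand &MMO,
                                           int64_t ByteOffset,
                                           uint64_t NarrowBytes) {
    if (MMO.getOrdering() != AtomicOrdering::NotAtomic)
      return nullptr;
    return MF.getMachineMemOperand(&MMO, ByteOffset, NarrowBytes);
  }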
Reviewers: ab, aditya_nandakumar, bogner, rtereshin, aemerson, javed.absar
Reviewed By: aemerson
Subscribers: aemerson, rovka, kristof.beyls, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D45466
llvm-svn: 331071
LLVM_ON_WIN32 is set exactly for MSVC and MinGW (but not Cygwin) in
HandleLLVMOptions.cmake, which are also the configurations where _WIN32 is
defined. Just use the default macro instead of a reinvented one.
See thread "Replacing LLVM_ON_WIN32 with just _WIN32" on llvm-dev and cfe-dev.
No intended behavior change.
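The mechanical change in source files is just swapping the guard macro
(illustrative snippet):

  // Old, LLVM-specific spelling:
  #ifdef LLVM_ON_WIN32
  //   ... Windows-only code ...
  #endif

  // New, compiler-provided macro:
  #ifdef _WIN32
  //   ... Windows-only code ...
  #endif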
llvm-svn: 331069
Summary:
Previously, we checked tokens for `tok::identifier` to see if they
were identifiers inside an Objective-C selector.
However, this missed C++ keywords like `new` and `delete`.
To fix this, this diff uses `getIdentifierInfo()` to find
identifiers or keywords inside Objective-C selectors.
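A simplified version of the check, using clang::Token directly (the actual
clang-format code works on its own token wrapper):

  #include "clang/Lex/Token.h"

  // getIdentifierInfo() is non-null for plain identifiers and for keywords
  // such as `new` and `delete`, so both count as selector parts.
  static bool canBeObjCSelectorPart(const clang::Token &Tok) {
    return Tok.getIdentifierInfo() != nullptr;
  }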
Test Plan: New tests added. Ran tests with:
% make -j16 FormatTests && ./tools/clang/unittests/Format/FormatTests
Reviewers: djasper, jolesiak
Reviewed By: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D46143
llvm-svn: 331067
During deserialization, clang currently misses merging protocols into the
canonical interface for the class extension. This merging currently only
happens during parsing and should also be done during deserialization.
rdar://problem/38724303
llvm-svn: 331063
The idea is to have a pass which performs the same transformation as GuardWidening, but can be run within a loop pass manager without disrupting the pass manager structure. As demonstrated by the test case, this doesn't quite get there because of issues with post-dominance, but it is a good step in the right direction. The motivation is purely to reduce compile time, since we can now preserve locality during the loop walk.
This patch only includes a legacy pass. A follow-up will add a new-style pass as well.
llvm-svn: 331060
Now that getSectionPiece is fast (it uses a hash), it is probably OK to
split merge sections early.
The reason I want to do this is to split eh_frame sections in the same
place.
This does mean that we have to decompress early. Given that the only
compressed sections are debug info, I don't think we are missing much.
It is a small improvement: 0.5% on the geometric mean.
llvm-svn: 331058
These branches were previously unanalyzable and unselectable. Add them and
recognize how to generate their inverses.
Reviewers: smaksimovic, atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D46113
llvm-svn: 331050
Always normalizing lldb_private::FileSpec paths will help us get consistent results from comparisons when setting breakpoints and when looking for source files. This also removes a lot of complexity from the comparison routines. Modified the DWARF line table parser to use the normalized compile unit directory when needed.
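For example, normalization lets paths that differ only in redundant
components compare equal; a sketch using llvm::sys::path rather than FileSpec
itself:

  #include "llvm/ADT/SmallString.h"
  #include "llvm/Support/Path.h"
  using namespace llvm;

  // "./foo/../bar/baz.c" and "bar/baz.c" name the same file; removing dot
  // and dot-dot components lets a plain comparison see that.
  static bool sameNormalizedPath(StringRef A, StringRef B) {
    SmallString<128> NormA(A), NormB(B);
    sys::path::remove_dots(NormA, /*remove_dot_dot=*/true);
    sys::path::remove_dots(NormB, /*remove_dot_dot=*/true);
    return NormA.str() == NormB.str();
  }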
Differential Revision: https://reviews.llvm.org/D45977
llvm-svn: 331049
PPC64 V2 ABI describes two entry points to a function. The global entry point
sets up the TOC base pointer. When calling a local function, the call should
branch to the local entry point rather than the global entry point.
Section 3.4.1 describes using the 3 most significant bits of the st_other
field to find out how many instructions there are between the local and global
entry point. This patch adds the correct offset required to branch to the local
entry point of a function.
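A standalone sketch of the decode this implies, following the ABI text (not
necessarily lld's exact code): values 0 and 1 mean the two entry points
coincide, 2 through 6 encode a power-of-two byte offset, and 7 is reserved.

  #include <cstdint>

  // Offset in bytes from the global entry point to the local entry point,
  // taken from the three most significant bits of st_other.
  static uint64_t localEntryOffset(uint8_t StOther) {
    uint8_t GepToLep = (StOther >> 5) & 7;
    if (GepToLep < 2)
      return 0;               // 0 and 1: a single entry point.
    if (GepToLep < 7)
      return 1u << GepToLep;  // 2..6: offsets of 4, 8, 16, 32 or 64 bytes.
    return 0;                 // 7 is reserved by the ABI.
  }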
Differential Revision: https://reviews.llvm.org/D45729
llvm-svn: 331046
Put the first ldp at the end, so that the load-store optimizer can run
and merge the ldp and the add into a post-index ldp.
This didn't work when no frame was needed and resulted in code size
regressions.
llvm-svn: 331044
Summary:
`HandleLLVMOptions` adds `-w` to the cflags if `LLVM_ENABLE_WARNINGS` is not on.
With `-w`, `check_cxx_compiler_flag` doesn't error out for unsupported flags
(for example `-mcrc` on x86_64), and those flags end up being detected as
working when they really aren't.
I am not entirely sure what the best way to solve this is, but setting
`LLVM_ENABLE_WARNINGS` prior to including `HandleLLVMOptions` does the job.
Reviewers: phosek, beanz
Reviewed By: phosek
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D46079
llvm-svn: 331042
As discussed in the post-commit thread for:
rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html )
We need a way to opt out of a float-to-int-to-float cast optimization because too much
existing code relies on the platform-specific undefined result of those casts when the
float-to-int overflows.
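For reference, a tiny example of the cast in question; the out-of-range
result is what differs across platforms (the INT_MIN value in the comment is
typical x86 behavior, not a guarantee):

  #include <cstdio>

  int main() {
    float Big = 3.0e9f;        // Does not fit in a 32-bit signed int.
    int Truncated = (int)Big;  // Undefined behavior when out of range;
                               // x86 typically produces INT_MIN here.
    std::printf("%d\n", Truncated);
    return 0;
  }

The sanitizer flag mentioned below catches exactly this conversion at run
time.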
The LLVM changes associated with adding this function attribute are here:
rL330947
rL330950
rL330951
Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that
catches this problem:
rL330958
Differential Revision: https://reviews.llvm.org/D46135
llvm-svn: 331041
If the MachineInstr uses a custom inserter and is then erased after
instruction selection, there is no use in mapping it to a sched class.
Review: Ulrich Weigand
llvm-svn: 331040
The ACLE spec which describes these intrinsics hasn't been published yet, but
this is based on the final draft which will be published soon, and these have
already been implemented by GCC.
Differential revision: https://reviews.llvm.org/D46109
llvm-svn: 331039
This adds a pre-defined macro to test if the compiler has support for the
v8.2-A dot product intrinsics in AArch32 mode.
The AArch64 equivalent has already been added by rL330229.
The ACLE spec which describes this macro hasn't been published yet, but this is
based on the final internal draft, and GCC has already implemented this.
Differential revision: https://reviews.llvm.org/D46108
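Typical usage of such a feature-test macro, assuming the ACLE spelling
__ARM_FEATURE_DOTPROD (the same name the AArch64 change used; this commit
message doesn't spell it out):

  // Guard dot-product code so it still compiles for targets without the
  // v8.2-A dot product extension.
  #if defined(__ARM_FEATURE_DOTPROD)
  #include <arm_neon.h>
  //   ... use the dot product intrinsics here ...
  #else
  //   ... scalar fallback ...
  #endif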
llvm-svn: 331038
We currently support LCSSA PHI nodes in the outer loop exit, if their
incoming values do not come from the outer loop latch or if the
outer loop latch has a single predecessor. In that case, the outer loop latch
will be executed only if the inner loop gets executed. If we have multiple
predecessors for the outer loop latch, it may be executed even if the inner
loop does not get executed.
This is a first step to support the case described in
https://bugs.llvm.org/show_bug.cgi?id=30472
Reviewers: efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D43237
llvm-svn: 331037
This adds IR intrinsics for the AArch64 dot-product instructions introduced in
v8.2-A.
Differential revision: https://reviews.llvm.org/D46107
llvm-svn: 331036
Since PTX has grown a <2 x half> datatype, vectorization has become more
important. The late LoadStoreVectorizer intentionally only does loads
and stores, but now arithmetic has to be vectorized for optimal
throughput too.
This is still very limited: SLP vectorization happily creates <2 x half>
if it's a legal type, but there's still a lot of register moving
happening to get that fed into a vectorized store. Overall it's a small
performance win from reducing the number of arithmetic instructions.
I haven't really checked what the loop vectorizer does to PTX code; the
cost model there might need some more tweaks. I didn't see it causing
harm, though.
Differential Revision: https://reviews.llvm.org/D46130
llvm-svn: 331035