The motivation for this patch starts with PR20134:
https://llvm.org/bugs/show_bug.cgi?id=20134
void foo(int *a, int i) {
a[i] = a[i+1] + a[i+2];
}
It seems better to produce this (14 bytes):
movslq %esi, %rsi
movl 0x4(%rdi,%rsi,4), %eax
addl 0x8(%rdi,%rsi,4), %eax
movl %eax, (%rdi,%rsi,4)
Rather than this (22 bytes):
leal 0x1(%rsi), %eax
cltq
leal 0x2(%rsi), %ecx
movslq %ecx, %rcx
movl (%rdi,%rcx,4), %ecx
addl (%rdi,%rax,4), %ecx
movslq %esi, %rax
movl %ecx, (%rdi,%rax,4)
The most basic problem (the first test case in the patch combines constants) should also be fixed in InstCombine,
but it gets more complicated after that because we need to consider architecture and micro-architecture. For
example, AArch64 may not see any benefit from the more general transform because the ISA solves the sexting in
hardware. Some x86 chips may not want to replace 2 ADD insts with 1 LEA, and there's an attribute for that:
FeatureSlowLEA. But I suspect that doesn't go far enough or maybe it's not getting used when it should; I'm
also not sure if FeatureSlowLEA should also mean "slow complex addressing mode".
I see no perf differences on test-suite with this change running on AMD Jaguar, but I see small code size
improvements when building clang and the LLVM tools with the patched compiler.
A more general solution to the sext(add nsw(x, C)) problem that works for multiple targets is available
in CodeGenPrepare, but it may take quite a bit more work to get that to fire on all of the test cases that
this patch takes care of.
Differential Revision: http://reviews.llvm.org/D13757
llvm-svn: 250560
When we have a R_PPC64_ADDR64 for a weak undef symbol, which thus resolves to
0, and we're creating a shared library, we need to make sure that it stays 0
(because code that conditionally calls the weak function tests for this).
Unfortunately, we were creating a R_PPC64_RELATIVE for these relocation
targets, making the address of the undefined weak symbol equal to the base
address of the shared library (which is non-zero). In general, we should not be
creating RelativeReloc relocs for undef weak symbols.
llvm-svn: 250558
Crashing is bad, m'kay? Fixing a 4 year old bug of my own creation.
Adding the testcase now which I should have added then which would have
long since caught this.
The problem is that printMessage() will display the diagnostic but not
set HadError to true, resulting in the assembler continuing on its way
and trying to create relocations for things that may not allow them or
otherwise get itself into trouble. Using the Error() helper function
here rather than calling printMessage() directly resolves this.
rdar://23133240
llvm-svn: 250557
R_PPC64_TOC does not have an associated symbol, but does have a non-zero VA
that target-specific code must compute using some non-trivial rule. We
handled this as a special case in PPC64TargetInfo::relocateOne, where
we knew to write this special address, but that did not work when creating shared
libraries. The special TOC address needs to be the subject of a
R_PPC64_RELATIVE relocation, and so we also need to know how to encode this
special address in the addend of that relocation.
Thus, some target-specific logic is necessary when creating R_PPC64_RELATIVE as
well. To solve this problem, we teach getLocalRelTarget to handle R_PPC64_TOC
as a special case. This allows us to remove the special case in
PPC64TargetInfo::relocateOne (simplifying code there), and naturally allows the
existing logic to do the right thing when creating associated R_PPC64_RELATIVE
relocations for shared libraries.
llvm-svn: 250555
Summary:
We now use the block for the catchpad itself, rather than its normal
successor, as the funclet entry.
Putting the normal successor in the map leads downstream funclet
membership computations to erroneous results.
Reviewers: majnemer, rnk
Subscribers: rnk, llvm-commits
Differential Revision: http://reviews.llvm.org/D13798
llvm-svn: 250552
If you increase the number of diags of a particular type by one more than the
number available you get the nice assert message. If you do it by two more
than available you get the old non-helpful message. Combining the two makes
sense I think.
llvm-svn: 250546
When I initially implemented PPC64TargetInfo::isRelRelative, I included a fixed
set of relative relocations, and made the default false. In retrospect, this
seems unwise in two respects: First, most PPC64 relocations are relative
(either to the base address, the TOC, etc.). Second, most relocation targets
are not appropriate for R_PPC64_RELATIVE (which writes a 64-bit absolute
address). Thus, back off, and include only those relocations for which we test
(or soon will), and are obviously appropriate for R_PPC64_RELATIVE.
llvm-svn: 250540
The number of samples collected at the head of a function only make
sense for top-level functions (i.e., those actually called as opposed to
being inlined inside another).
Head samples essentially count the time spent inside the function's
prologue. This clearly doesn't make sense for inlined functions, so we
were always emitting 0 in those.
llvm-svn: 250539
Summary:
This check flags all access to members of unions. Passing unions as a
whole is not flagged.
Reading from a union member assumes that member was the last one
written, and writing to a union member assumes another member with a
nontrivial destructor had its destructor called. This is fragile because
it cannot generally be enforced to be safe in the language and so relies
on programmer discipline to get it right.
This rule is part of the "Type safety" profile of the C++ Core
Guidelines, see
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#-type7-avoid-accessing-members-of-raw-unions-prefer-variant-instead
Reviewers: alexfh, sbenza, bkramer, aaron.ballman
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D13784
llvm-svn: 250537
Summary: The syntax has changed a bit recently.
Reviewers: binji
Subscribers: llvm-commits, jfb, sunfish, dschuff
Differential Revision: http://reviews.llvm.org/D13821
llvm-svn: 250535
Summary:
When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop
trying to propagate the cleanup's parent's color to the catchendpad, since
what's needed is the cleanup's grandparent's color and the catchendpad
will get that color from the catchpad linkage already. We already had
this exclusion for invokes, but were missing it for
cleanupendpad/cleanupret.
Also add a missing line that tags cleanupendpads' states in the
EHPadStateMap, without with lowering invokes that target cleanupendpads
which unwind to other handlers (and so don't have the -1 state) will fail.
This fixes the reduced IR repro in PR25163.
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D13797
llvm-svn: 250534
Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations.
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).
Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its
select-formation behavior that changed with r248439.
Differential Revision: http://reviews.llvm.org/D13297
llvm-svn: 250527
This fix implements the following OMPT events for the API locking routines:
* ompt_event_acquired_lock
* ompt_event_acquired_nest_lock_first
* ompt_event_acquired_nest_lock_next
* ompt_event_init_lock
* ompt_event_init_nest_lock
* ompt_event_destroy_lock
* ompt_event_destroy_nest_lock
For the acquired events the depths of the locks ist required, so a return value
was added similiar to the return values we already have for the release lock
routines.
Patch by Tim Cramer
Differential Revision: http://reviews.llvm.org/D13689
llvm-svn: 250526
Using the Python native C API is non-portable across Python versions,
so this patch changes them to use the `PythonFile` class which hides
the version specific differences behind a single interface.
llvm-svn: 250525
Summary: Make the relooper an analysis pass, to convert CFG to AST.
Reviewers: sunfish
Subscribers: jfb, dschuff
Differential Revision: http://reviews.llvm.org/D12744
llvm-svn: 250524
Summary: Prevent clang-tidy from discarding fixes that are in different files but happen to have the same file offset.
Reviewers: klimek, bkramer
Subscribers: bkramer, alexfh, cfe-commits
Differential Revision: http://reviews.llvm.org/D13810
llvm-svn: 250523
Summary: Prevent clang-tidy from applying fixes to errors that overlap with other errors' fixes, with one exception: if one fix is completely contained inside another one, then we can apply the big one.
Reviewers: bkramer, klimek
Subscribers: djasper, cfe-commits, alexfh
Differential Revision: http://reviews.llvm.org/D13516
llvm-svn: 250509