https://reviews.llvm.org/D51147
Asserting on any extend of vectors should be up to the target's
legalizer or other target-specific code, not CallLowering.
Reviewed by: dsanders.
llvm-svn: 343325
- Add fix so that all code paths that create DWARFContext
with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method
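A minimal sketch of the intended flow, assuming the DWARFContext::create and dwarf::CallFrameString interfaces touched here (the function and variable names are illustrative):

  #include "llvm/BinaryFormat/Dwarf.h"
  #include "llvm/DebugInfo/DWARF/DWARFContext.h"
  #include "llvm/Object/ObjectFile.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  // Sketch: creating the context from an ObjectFile lets it record the target
  // architecture, which CallFrameString then relies on (and asserts on) to
  // pick arch-specific CFA opcode names.
  void dumpCallFrameOpcodeName(const object::ObjectFile &Obj, unsigned Opcode) {
    std::unique_ptr<DWARFContext> Ctx = DWARFContext::create(Obj);
    Triple::ArchType Arch = Obj.makeTriple().getArch();
    StringRef Name = dwarf::CallFrameString(Opcode, Arch);
    outs() << Name << '\n';
  }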
llvm-svn: 343317
When printing successor probabilities for an MBB, the human-readable value is sometimes shown as 200.0%.
The human-readable output is based on getProbabilityIterator, which returns 0xFFFFFFFF for getNumerator() and 0x80000000 for getDenominator() for an unknown BranchProbability.
By using getSuccProbability, as we already do for the non-human-readable part, we can avoid this problem.
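A minimal sketch of the printing loop after the fix, assuming MachineBasicBlock::getSuccProbability (variable names are illustrative):

  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/Support/BranchProbability.h"
  #include "llvm/Support/raw_ostream.h"

  using namespace llvm;

  // Sketch: querying through getSuccProbability normalizes unknown
  // probabilities, so the human-readable percentage can no longer exceed 100%.
  void printSuccProbabilities(const MachineBasicBlock &MBB) {
    for (auto I = MBB.succ_begin(), E = MBB.succ_end(); I != E; ++I) {
      errs() << "successor %bb." << (*I)->getNumber() << ": ";
      MBB.getSuccProbability(I).print(errs()); // prints the percentage too
      errs() << '\n';
    }
  }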
Differential Revision: https://reviews.llvm.org/D52605
llvm-svn: 343297
This shouldn't really happen in practice, I hope, but we tried to handle other constant cases. We missed this one because we checked for ConstantVector without realizing that a zero vector becomes ConstantAggregateZero instead.
So instead, just check for Constant and use getAggregateElement, which will do the dirty work for us.
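A minimal sketch of the pattern, with `V` standing in for the vector operand being inspected (hypothetical surrounding code):

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/Value.h"

  using namespace llvm;

  // Sketch: checking for Constant and calling getAggregateElement covers
  // ConstantVector, ConstantDataVector and ConstantAggregateZero alike, so a
  // zero vector no longer slips through a ConstantVector-only check.
  Constant *getConstantVectorElement(Value *V, unsigned Idx) {
    if (auto *C = dyn_cast<Constant>(V))
      return C->getAggregateElement(Idx); // null if Idx is out of range
    return nullptr;
  }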
llvm-svn: 343270
Had we emitted this IR earlier, InstCombine would have removed the icmp, so I'm going to assume that using the i1 directly would be considered canonical.
llvm-svn: 343244
It seems to have broken several targets; see the comments on the llvm-commits thread.
> Change the copy tracker to keep a single map of register units instead
> of 3 maps of registers. This gives a very significant compile time
> performance improvement to the pass. I measured a 30-40% decrease in
> time spent in MCP on x86 and AArch64 and much more significant
> improvements on out of tree targets with more registers.
>
> Differential Revision: https://reviews.llvm.org/D52374
llvm-svn: 343189
It was the case when calling MO::dump(), but MI::dump() still
depended on hasComplexRegisterTies().
The MIR output is not affected.
llvm-svn: 343107
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.
> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
> state that is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
> i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
> (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
> https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136
llvm-svn: 343103
Adding NonNull attributes to returned pointers has the unfortunate side
effect of disabling tail calls. This patch ignores the NonNull attribute when
we decide whether to tail merge, in the same way that we ignore the NoAlias
attribute, as it has no effect on the call sequence.
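A minimal sketch of the idea, using the current AttrBuilder/AttributeList API rather than the exact code this patch touches (the helper name is illustrative):

  #include "llvm/IR/Attributes.h"
  #include "llvm/IR/LLVMContext.h"

  using namespace llvm;

  // Sketch: strip return attributes that have no effect on the call sequence
  // before comparing caller and callee attributes for tail-call purposes.
  AttrBuilder retAttrsForTailCallCheck(LLVMContext &Ctx, AttributeList AL) {
    AttrBuilder AB(Ctx, AL.getRetAttrs());
    AB.removeAttribute(Attribute::NoAlias);
    AB.removeAttribute(Attribute::NonNull); // newly ignored by this patch
    return AB;
  }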
Differential Revision: https://reviews.llvm.org/D52238
llvm-svn: 343091
Functions that have signed return addresses need additional dwarf support:
- After signing the LR, and before authenticating it, the LR register is in a
state that is unusable by a debugger or unwinder
- To account for this a new directive, .cfi_negate_ra_state, is added
- This directive says the signed state of the LR register has now changed,
i.e. unsigned -> signed or signed -> unsigned
- This directive has the same CFA code as the SPARC directive GNU_window_save
(0x2d), adding a macro to account for multiply-defined codes (see the sketch below)
- This patch matches the gcc implementation of this support:
https://patchwork.ozlabs.org/patch/800271/
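A small sketch of the multiply-defined opcode situation referred to above (the helper is illustrative; the real disambiguation lives in the DWARF opcode-name printing):

  #include "llvm/ADT/StringRef.h"

  // Sketch: both directives share CFA opcode 0x2d, so the opcode alone is not
  // enough to pick a name; the target architecture has to break the tie.
  constexpr unsigned SharedCFAOpcode = 0x2d;

  llvm::StringRef cfa0x2dName(bool IsAArch64) {
    return IsAArch64 ? "DW_CFA_AARCH64_negate_ra_state" // AArch64
                     : "DW_CFA_GNU_window_save";        // SPARC
  }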
Differential Revision: https://reviews.llvm.org/D50136
llvm-svn: 343089
VerifyDAGDiverence costs compilation time, avoid running it in non-debug
builds.
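A minimal sketch of the guard, following the usual NDEBUG convention (the call site shown is a stand-in, not the actual SelectionDAGISel code):

  // Sketch: only pay for the verification in asserts-enabled builds.
  static void verifyDAGDivergenceSketch() { /* expensive divergence checks */ }

  void maybeVerifyDivergence() {
  #ifndef NDEBUG
    verifyDAGDivergenceSketch();
  #endif
  }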
Differential Revision: https://reviews.llvm.org/D52454
llvm-svn: 343086
Since the body of the "else if" contains
// TODO
I suppose someone will need the variable again at some point, but with
-Werror the warning made it not compile at all.
llvm-svn: 343071
In some scenarios, LLVM will remove llvm.dbg.label intrinsics from the IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In that case, these debug labels would have
address zero as their address, which is not a legal address for a debugger to
set breakpoints at or query sources for. So, the patch omits the address info
(DW_AT_low_pc) of removed labels.
Fix the build failure seen on the clang-stage1-cmake-RA-incremental BuildBot on macOS.
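A minimal sketch of the check on the emission side, assuming a removed label ends up without a usable address symbol (the helper name is illustrative):

  #include "llvm/MC/MCSymbol.h"

  // Sketch: a removed llvm.dbg.label has no real address, so skip DW_AT_low_pc
  // instead of emitting address zero, which a debugger cannot use for
  // breakpoints or source queries.
  bool shouldEmitLabelLowPC(const llvm::MCSymbol *LabelSym) {
    return LabelSym != nullptr;
  }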
Differential Revision: https://reviews.llvm.org/D51908
llvm-svn: 343062
Summary:
We have `llvm::addLandingPadInfo` and `MachineFunction::addLandingPad`,
both of which add landing pad information to populate `LandingPadInfo`
but are called from different locations, which was confusing. This patch
unifies them into a single `MachineFunction::addLandingPad` function, which
now has the functionality of both.
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D52428
llvm-svn: 343018
This is the final (I hope!) problem pattern mentioned in PR37749:
https://bugs.llvm.org/show_bug.cgi?id=37749
We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops.
We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like
extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches
that can over-reach. I.e., splitting to 128-bit does not look like a win in most cases with >1 256-bit op.
The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test,
we have this vector-type-legalized sequence:
t29: v8i32 = concat_vectors t27, t28
t30: v4i64 = bitcast t29
t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ...
t31: v4i64 = bitcast t18
t32: v4i64 = xor t30, t31
t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ...
t34: v4i64 = bitcast t9
t35: v4i64 = and t32, t34
t36: v8i32 = bitcast t35
t37: v4i32 = extract_subvector t36, Constant:i64<0>
t38: v4i32 = extract_subvector t36, Constant:i64<4>
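A condensed sketch of the match, assuming the peekThroughBitcasts and ISD::isBuildVectorAllOnes helpers; the 256-bit/AVX1 legality checks of the real combine are elided:

  #include "llvm/CodeGen/SelectionDAG.h"
  #include "llvm/CodeGen/SelectionDAGNodes.h"

  using namespace llvm;

  // Sketch: recognize and(xor(A, all-ones), B) through bitcasts; the caller
  // would then build the target's ANDNP-style node from NotOp and Other.
  static bool matchAndNot(SDValue AndNode, SDValue &NotOp, SDValue &Other) {
    if (AndNode.getOpcode() != ISD::AND)
      return false;
    for (unsigned i = 0; i != 2; ++i) {
      SDValue Op = peekThroughBitcasts(AndNode.getOperand(i));
      if (Op.getOpcode() == ISD::XOR &&
          ISD::isBuildVectorAllOnes(
              peekThroughBitcasts(Op.getOperand(1)).getNode())) {
        NotOp = Op.getOperand(0);
        Other = AndNode.getOperand(1 - i);
        return true;
      }
    }
    return false;
  }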
Differential Revision: https://reviews.llvm.org/D52318
llvm-svn: 343008
For the AMDGPU target, if an MBB contains an exec mask restore preamble, SplitEditor may get into a state where it cannot insert a spill instruction.
E.g., for the MIR
bb.100:
%1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec
if the register allocator tries to allocate a virtreg to the physreg already assigned to virtreg %1, it would have to insert the spill instruction before the S_OR_SAVEEXEC_B64 instruction.
But that is not possible, since doing so could generate incorrect code in terms of the exec mask.
The change makes the register allocator ignore such physreg candidates.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D52052
llvm-svn: 343004
DAGCombine will try to fold two loads that feed a SELECT or SELECT_CC
to after the select, resulting in a select of the addresses and a single
load afterwards.
If either of the loads depends on the other, this is not legal as it
could introduce cycles. However, this was only checked when the opcode
was a SELECT, and not for a SELECT_CC.
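A minimal sketch of the corrected guard (the helper is illustrative; the real check sits inside the DAGCombiner fold):

  #include "llvm/CodeGen/SelectionDAGNodes.h"

  using namespace llvm;

  // Sketch: apply the dependence check for SELECT and SELECT_CC alike, so the
  // fold cannot introduce a cycle between the two loads.
  static bool loadsAreIndependent(SDNode *TheSelect, LoadSDNode *LLD,
                                  LoadSDNode *RLD) {
    unsigned Opc = TheSelect->getOpcode();
    if (Opc != ISD::SELECT && Opc != ISD::SELECT_CC)
      return true;
    return !LLD->isPredecessorOf(RLD) && !RLD->isPredecessorOf(LLD);
  }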
Unfortunately, the only reproducer I have for this is for our
downstream target. I've tried getting it to trigger on an upstream one
but haven't been successful.
Patch thanks to Bevin Hansson.
llvm-svn: 342980
Summary:
For the loop that used MCRegAliasIterator, this should be NFC.
For the loop that previously used MCSubRegIterator, we should
now detect more cases where the register is actually live out
that we previously missed.
Reviewers: MatzeB, arsenm
Reviewed By: MatzeB
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D52410
llvm-svn: 342944
In some scenarios, LLVM will remove llvm.dbg.label intrinsics from the IR. For example,
when the labels are in unreachable blocks, these labels will not
be generated in LLVM IR. In that case, these debug labels would have
address zero as their address, which is not a legal address for a debugger to
set breakpoints at or query sources for. So, the patch omits the address info
(DW_AT_low_pc) of removed labels.
Differential Revision: https://reviews.llvm.org/D51908
llvm-svn: 342943
Change the copy tracker to keep a single map of register units instead
of 3 maps of registers. This gives a very significant compile time
performance improvement to the pass. I measured a 30-40% decrease in
time spent in MCP on x86 and AArch64 and much more significant
improvements on out of tree targets with more registers.
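A minimal sketch of the register-unit keying, assuming the MCRegUnitIterator interface of this era (the tracker shown is illustrative, not the pass's actual CopyTracker):

  #include "llvm/ADT/DenseMap.h"
  #include "llvm/MC/MCRegisterInfo.h"

  using namespace llvm;

  // Sketch: a single map keyed by register units replaces several maps keyed
  // by registers; overlapping registers automatically share entries because
  // they share units.
  struct CopyTrackerSketch {
    DenseMap<unsigned, unsigned> CopyForUnit; // reg unit -> defining copy id

    void trackCopy(unsigned DefReg, unsigned CopyId, const MCRegisterInfo &MRI) {
      for (MCRegUnitIterator RUI(DefReg, &MRI); RUI.isValid(); ++RUI)
        CopyForUnit[*RUI] = CopyId;
    }

    void clobberRegister(unsigned Reg, const MCRegisterInfo &MRI) {
      for (MCRegUnitIterator RUI(Reg, &MRI); RUI.isValid(); ++RUI)
        CopyForUnit.erase(*RUI);
    }
  };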
Differential Revision: https://reviews.llvm.org/D52374
llvm-svn: 342942
Instead of updating the CopyTracker's maps each time we come across a
RegMask, defer checking for this kind of interference until we're
actually trying to propagate a copy. This avoids the need to
repeatedly iterate over maps in the cases where we don't end up doing
any work.
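A minimal sketch of the deferral, assuming MachineOperand::clobbersPhysReg (the structure is illustrative):

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/CodeGen/MachineOperand.h"

  using namespace llvm;

  // Sketch: remember the regmasks we have walked past and only consult them
  // when a copy is actually about to be propagated, instead of pruning the
  // tracker's maps at every regmask operand.
  struct DeferredRegMaskChecks {
    SmallVector<const uint32_t *, 4> RegMasks;

    void noteRegMask(const MachineOperand &MO) {
      if (MO.isRegMask())
        RegMasks.push_back(MO.getRegMask());
    }

    bool copySurvivesRegMasks(unsigned PhysReg) const {
      for (const uint32_t *Mask : RegMasks)
        if (MachineOperand::clobbersPhysReg(Mask, PhysReg))
          return false;
      return true;
    }
  };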
This is a slight compile time improvement for MachineCopyPropagation
as is, but it also enables a much bigger improvement that I'll follow
up with soon.
Differential Revision: https://reviews.llvm.org/D52370
llvm-svn: 342940
Implementing -print-before-all/-print-after-all/-filter-print-func support
through PassInstrumentation callbacks.
- PrintIR routines implement printing callbacks.
- StandardInstrumentations class provides a central place to manage all
the "standard" in-tree pass instrumentations. Currently it registers
PrintIR callbacks.
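A minimal sketch of the wiring, assuming the StandardInstrumentations::registerCallbacks entry point (constructor signatures have shifted across LLVM versions, so they are left out here):

  #include "llvm/Passes/PassBuilder.h"
  #include "llvm/Passes/StandardInstrumentations.h"

  using namespace llvm;

  // Sketch: StandardInstrumentations registers the in-tree callbacks
  // (currently the PrintIR ones) on the PassInstrumentationCallbacks registry;
  // a PassBuilder constructed with a pointer to that registry will then honor
  // -print-before-all / -print-after-all.
  void wireStandardInstrumentations(PassInstrumentationCallbacks &PIC,
                                    StandardInstrumentations &SI) {
    SI.registerCallbacks(PIC);
  }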
Reviewers: chandlerc, paquette, philip.pfaffe
Differential Revision: https://reviews.llvm.org/D50923
llvm-svn: 342896
This is a preliminary step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613
If we have an 'add' instruction that sets flags, we can use that to eliminate an
explicit compare instruction or some other instruction (cmn) that sets flags for
use in the later select.
As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively
reversing an IR icmp canonicalization that replaces a variable operand with a
constant:
https://rise4fun.com/Alive/V1Q
But we're not using 'uaddo' in those cases via DAG transforms. This happens in
CGP after D8889 without checking target lowering to see if the op is supported.
So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with
"using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned
saturated add and converts to uaddo without checking target capabilities.
This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we only see
AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title
(unlike x86 which sees improvements for all sizes because all sizes are 'custom').
But the AArch code (like x86) looks better when translated to 'uaddo' in all cases.
So someone who is involved with AArch may want to set i8/i16 to 'custom' for UADDO,
so this patch will fire on those tests.
Another possibility given the existing behavior: we could remove the legal-or-custom
check altogether because we're assuming that a UADDO sequence is canonical/optimal
before we ever reach here. But that seems like a bug to me. If the target doesn't
have an add-with-flags op, then it's not likely that we'll get optimal DAG combining
using a UADDO node. This is similar to the justification for why we don't canonicalize IR to
the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first
place.
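A condensed sketch of the gating and node construction, assuming isOperationLegalOrCustom and the two-result UADDO node (the matching of the original compare pattern is elided):

  #include "llvm/CodeGen/SelectionDAG.h"
  #include "llvm/CodeGen/TargetLowering.h"

  using namespace llvm;

  // Sketch: only form UADDO when the target marks it legal or custom for the
  // value type; the second result is the overflow bit that replaces the
  // explicit compare.
  static SDValue tryFormUADDO(SelectionDAG &DAG, const TargetLowering &TLI,
                              const SDLoc &DL, SDValue A, SDValue B,
                              SDValue &Overflow) {
    EVT VT = A.getValueType();
    if (!TLI.isOperationLegalOrCustom(ISD::UADDO, VT))
      return SDValue();
    EVT BoolVT =
        TLI.getSetCCResultType(DAG.getDataLayout(), *DAG.getContext(), VT);
    SDValue UAddO = DAG.getNode(ISD::UADDO, DL, DAG.getVTList(VT, BoolVT), A, B);
    Overflow = UAddO.getValue(1);
    return UAddO.getValue(0);
  }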
Differential Revision: https://reviews.llvm.org/D51929
llvm-svn: 342886
This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015.
llvm-svn: 342856
This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed.
llvm-svn: 342851