A catchswitch is both a pad and a terminator, meaning it must be the
only non-phi instruction in its basic block. When we're inserting a
bitcast in the incoming basic block for a phi, if that incoming block is
a catchswitch, we should go up the dominator tree to find a valid
insertion point rather than attempting to insert before the catchswitch
(which would result in invalid IR).
Differential Revision: https://reviews.llvm.org/D46412
llvm-svn: 331548
Summary: Adding support for Fast flags in the SDNode to leverage fast math sub flag usage.
Reviewers: spatel, arsenm, jbhateja, hfinkel, escha, qcolombet, echristo, wristow, javed.absar
Reviewed By: spatel
Subscribers: llvm-commits, rampitec, nhaehnle, tstellar, FarhanaAleen, nemanjai, javed.absar, jbhateja, hfinkel, wdng
Differential Revision: https://reviews.llvm.org/D45710
llvm-svn: 331547
Summary:
This change makes the TimelineView source simpler to read and easier to modify in the future.
This patch introduces a class of static chars used as the display values in the TimelineView report, this change just eliminates a few magic characters.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb
Subscribers: tschuett, gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D46409
llvm-svn: 331540
Previously we were emitting the "cooked" alignment, which made it hard
to distinguish between that and the default alignment.
Differential Revision: https://reviews.llvm.org/D46418
llvm-svn: 331537
Summary: They are not consistent with other microarchitectures.
Reviewers: gchatelet
Subscribers: tschuett, llvm-commits
Differential Revision: https://reviews.llvm.org/D46434
llvm-svn: 331532
And eliminatw the duplication of those instructions for microMIPS32r6.
Reviewers: smaksimovic, abeserminji, atanasyan
Differential Revision: https://reviews.llvm.org/D46117
llvm-svn: 331526
This patch adds a custom lowering for ISD::MULH{S,U} used on divide by
constant optimization (DAGCombiner::BuildSDIV and DAGCombiner::BuildUDIV).
New patterns for smull and umull are added, so AArch64ISD::{S,U}MULL
can be correctly lowered to smull2 and umull2.
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D46009
llvm-svn: 331522
Split off from SchedWriteFAdd for fp rounding/bit-manipulation instructions.
Fixes an issue on btver2 which only had the ymm version using the JSTC pipe instead of JFPA.
llvm-svn: 331515
Following the advice in review D45022, this currently tests for the broken llc
output where an instruction is mis-scheduled. This test is committed in advance
to improve the eventual fixing patch in D45022, making the bad behaviour that
that patch fixes clearer.
llvm-svn: 331514
Summary:
Added a helper method in RegsForValue to get a list with
all the <RegNumber, RegSize> pairs that we want to iterate
over in SelectionDAGBuilder::EmitFuncArgumentDbgValue and
in SelectionDAGBuilder::visitIntrinsicCall.
Reviewers: vsk
Reviewed By: vsk
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46360
llvm-svn: 331510
Don't assume the alias of a defined reg is always already in the set.
As the test case in https://bugs.llvm.org/show_bug.cgi?id=36587 discovered,
it is wrong to assume that all the aliases of the defined register in the
*current function* is already present in the UsedPhysRegsMask.
This patch changes this so that any definition in the current function of a
phys-reg always results in all its aliases inserted into the set of defined
registers.
Review: Quentin Colombet
https://reviews.llvm.org/D45157
llvm-svn: 331509
Summary:
This addresses http://llvm.org/PR36790.
The change Deprecates a number of functions and types in
`include/xray/xray_log_interface.h` to recommend using string-based
configuration of XRay through the __xray_log_init_mode(...) function. In
particular, this deprecates the following:
- `__xray_set_log_impl(...)` -- users should instead use the
`__xray_log_register_mode(...)` and `__xray_log_select_mode(...)` APIs.
- `__xray_log_init(...)` -- users should instead use the
`__xray_log_init_mode(...)` function, which also requires using the
`__xray_log_register_mode(...)` and `__xray_log_select_mode(...)`
functionality.
- `__xray::FDRLoggingOptions` -- in following patches, we'll be
migrating the FDR logging implementations (and tests) to use the
string-based configuration. In later stages we'll remove the
`__xray::FDRLoggingOptions` type, and ask users to migrate to using the
string-based configuration mechanism instead.
- `__xray::BasicLoggingOptions` -- same as `__xray::FDRLoggingOptions`,
we'll be removing this type later and instead rely exclusively on the
string-based configuration API.
We also update the documentation to reflect the new advice and remove
some of the deprecated notes.
Reviewers: eizan, kpw, echristo, pelikan
Reviewed By: kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46173
llvm-svn: 331503
Summary:
Using a set is unnecessary here an in some cases (see e.g. PR37277)
takes significant amount of time to just insert values into it. In this
particular case all we need is just to check if we find the block we are
looking for or not.
Reviewers: davide
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D46411
llvm-svn: 331502
Two of these are immediately dereferenced on the next line. The other two are passed immediately to the IRBuilder constructor which can't handle a nullptr.
llvm-svn: 331500
These are casts on users of a PHINode to Instruction. I think since PHINode is an Instruction any users would also be Instructions. At least a cast will give us an assertion if its wrong.
llvm-svn: 331498
We currently recognize this idiom where x is signed and thus the shift in an ashr.
int cnt = 0;
while (x) {
x >>= 1; // arithmetic shift right
++cnt;
}
and turn it into (bitwidth - ctlz(x)). And if there is anything else in the loop we will create a new loop that runs that many times.
If x is initially negative, the shift result will never be 0 and thus the loop is infinite. If you put something with side effects in the loop, that side effect will now only happen bitwidth times instead of an infinite number of times.
So this transform is only safe for logical shift right (which we don't currently recognize) or if we can prove that x cannot be negative before the loop.
llvm-svn: 331493
Summary:
This makes is possible to have R600RegisterInfo and SIRegisterInfo
not inherit from AMDGPURegisterInfo.
Reviewers: arsenm, nhaehnle
Reviewed By: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D46280
llvm-svn: 331490
The RWMutex implementation depends on config.h macros (specifically
HAVE_PTHREAD_H and HAVE_PTHREAD_RWLOCK_INIT), so we need to be
including it and not just llvm-config.h here or we fall back to a much
slower implementation.
llvm-svn: 331487
Add logic for the special case when a cmp+select can clearly be
reduced to just a bitwise logic instruction, and remove an
over-reaching chunk of general purpose bit magic. The primary goal
is to remove cases where we are not improving the IR instruction
count when doing these select transforms, and in all cases here that
is true.
In the motivating 3-way compare tests, there are further improvements
because we can combine/propagate select values (not sure if that
belongs in instcombine, but it's there for now).
DAGCombiner has folds to turn some of these selects into bit magic,
so there should be no difference in the end result in those cases.
Not all constant combinations are handled there yet, however, so it
is possible that some targets will see more cmov/csel codegen with
this change in IR canonicalization.
Ideally, we'll go further to *not* turn selects into multiple
logic/math ops in instcombine, and we'll canonicalize to selects.
But we should make sure that this step does not result in regressions
first (and if it does, we should fix those in the backend).
The general direction for this change was discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105373.htmlhttp://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html
Alive proofs for the new bit magic:
https://rise4fun.com/Alive/XG7
Differential Revision: https://reviews.llvm.org/D46086
llvm-svn: 331486
Summary:
constrainOperandRegClass() currently fails if it tries to constrain the
register class of an operand that is defeined with an unallocatable register
class. This patch resolves this by adding a target callback to compute
register constriants in this case.
This is required by the AMDGPU because many of its instructions have source opreands
defined with the unallocatable register classe VS_32 which is a union of two allocatable
register classes VGPR_32 and SReg_32.
Reviewers: dsanders, aditya_nandakumar
Reviewed By: aditya_nandakumar
Subscribers: rovka, kristof.beyls, tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D45991
llvm-svn: 331485
Summary:
Support was added to the regular LTO backend, but not thinBackend.
This patch adds that support.
Reviewers: pcc, davide
Subscribers: mehdi_amini, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D46376
llvm-svn: 331481
We can't see all of the problems currently unless
we look at debug output when the global 'unsafe' is
on. It's a mess. This is another attempt to make
sure that D45710 is not making changes unintentionally.
llvm-svn: 331476
This took a bit of extra work as on Intel targets the old (V)PSLLDrr/(V)PSLLDrm style instructions act differently - I ended up creating WriteVecShiftImm classes for XMM/YMM/ZMM vector shift by immediate and retaining WriteVecShift as the default (used only by MMX) plus WriteVecShiftX/WriteVecShiftY. X86SchedWriteWidths hides most of this thank goodness.
llvm-svn: 331472
I'm choosing PPC out of convenience because it does
all of the transforms of interest in these tests by
default. There are multiple FMF problems shown in the
current checks. D45710 is proposing to fix part of
that.
llvm-svn: 331471
Summary:
When we create a fragment expression, and there already is an
old fragment expression, we assert that the new fragment is
within the range for the old fragment.
If for example the old fragment expression says that we
describe bit 10-16 of a variable (Offset=10, Size=6),
and we now want to create a new fragment expression only
describing bit 3-6 of the original value, then the resulting
fragment expression should have Offset=13, Size=3.
The assert is supposed to catch if the resulting fragment
expression is outside the range for the old fragment. However,
it used to verify that the Offset+Size of the new fragment was
smaller or equal than Offset+Size for the old fragment. What
we really want to check is that Offset+Size of the new fragment
is smaller than the Size of the old fragment.
Reviewers: aprantl, vsk
Reviewed By: aprantl
Subscribers: davide, llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D46391
llvm-svn: 331465
Summary:
This reverts SVN r331441 (reapplies r331337), together with a fix
in to handle an already existing fragment expression in the
dbg.value that must be fragmented due to a split PHI node.
This should solve the problem seen in PR37321, which was the
reason for the revert of r331337.
The situation in PR37321 is that we have a PHI node like this
%u.sroa = phi i80 [ %u.sroa.x, %if.x ],
[ %u.sroa.y, %if.y ],
[ %u.sroa.z, %if.z ]
and a dbg.value like this
call void @llvm.dbg.value(metadata i80 %u.sroa,
metadata !13,
metadata !DIExpression(DW_OP_LLVM_fragment, 0, 80))
The phi node is split into three 32-bit PHI nodes
%30:gr32 = PHI %11:gr32, %bb.4, %14:gr32, %bb.5, %27:gr32, %bb.8
%31:gr32 = PHI %12:gr32, %bb.4, %15:gr32, %bb.5, %28:gr32, %bb.8
%32:gr32 = PHI %13:gr32, %bb.4, %16:gr32, %bb.5, %29:gr32, %bb.8
but since the original value only is 80 bits we need to adjust the size
of the last fragment expression, and with this patch we get
DBG_VALUE debug-use %30:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 0, 32)
DBG_VALUE debug-use %31:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 32, 32)
DBG_VALUE debug-use %32:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 64, 16)
Reviewers: vsk, aprantl, mstorsjo
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D46384
llvm-svn: 331464
These tests are for DAGCombiner::foldSelectCCToShiftAnd().
Right now, they were only tested for AArch64,
but given the upcoming X86 changes to the hasAndNot(),
the test coverage needs to be added.
These tests originated from D27489 / rL289738
llvm-svn: 331454
By default LLVM thinks very large vectors get aligned to their size when
passed across functions. Unfortunately no-one told the ARM backend so it
doesn't trigger stack realignment and so accesses can cause the usual
misalignment issues (e.g. a data abort).
This changes the ABI alignment to the stack alignment, which in practice
(and as a bonus) also coincides with the alignment "natural" vectors get.
llvm-svn: 331451