Part of the fix for https://bugs.llvm.org/show_bug.cgi?id=35812.
This patch ensures that the compare operand for the atomic compare and swap
is properly zero-extended to 32 bits if applicable.
A follow-up commit will fix the extension for the SETCC node generated when
expanding an ATOMIC_CMP_SWAP_WITH_SUCCESS. That will complete the bug fix.
Differential Revision: https://reviews.llvm.org/D41856
llvm-svn: 322372
This patch makes the following changes to the schedule of instructions in the
prologue and epilogue.
The stack pointer update is moved down in the prologue so that the callee saves
do not have to wait for the update to happen.
Saving the lr is moved down in the prologue to hide the latency of the mflr.
The stack pointer is moved up in the epilogue so that restoring of the lr can
happen sooner.
The mtlr is moved up in the epilogue so that it is away form the blr at the end
of the epilogue. The latency of the mtlr can now be hidden by the loads of the
callee saved registers.
This commit is almost identical to this one: r322036 except that two warnings
that broke build bots have been fixed.
The revision number is D41737 as before.
llvm-svn: 322124
The CTRLoop pass performs checks on the argument of certain libcalls/intrinsics,
and assumes the arguments must be of a simple type. This isn't always the case
though. For example if we unroll and vectorize a loop we may end up with vectors
larger then the largest legal type, along with intrinsics that operate on those
wider types. This happened in the ffmpeg build, where we unrolled a loop and
ended up with a sqrt intrinsic that operated on V16f64, triggering an assertion.
Differential Revision: https://reviews.llvm.org/D41758
llvm-svn: 322055
This patch makes the following changes to the schedule of instructions in the
prologue and epilogue.
The stack pointer update is moved down in the prologue so that the callee saves
do not have to wait for the update to happen.
Saving the lr is moved down in the prologue to hide the latency of the mflr.
The stack pointer is moved up in the epilogue so that restoring of the lr can
happen sooner.
The mtlr is moved up in the epilogue so that it is away form the blr at the end
of the epilogue. The latency of the mtlr can now be hidden by the loads of the
callee saved registers.
Differential Revision: https://reviews.llvm.org/D41737
llvm-svn: 322036
Summary:
I believe legalization is really expecting that ReplaceNodeResults will return something with the same type as the thing that's being legalized. Ultimately, it uses the output to replace the uses in the DAG so the type should match to make that work.
There are two relevant cases here. When crbits are enabled, then i1 is a legal type and getSetCCResultType should return i1. In this case, the truncate will be between i1 and i1 and should be removed (SelectionDAG::getNode does this). Otherwise, getSetCCResultType will be i32 and the legalizer will promote the truncate to be i32 -> i32 which will be similarly removed.
With this fixed we can remove some code from PromoteIntRes_SETCC that seemed to only exist to deal with the intrinsic being replaced with a larger type without changing the other operand. With the truncate being used for connectivity this doesn't happen anymore.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: nemanjai, llvm-commits, kbarton
Differential Revision: https://reviews.llvm.org/D41654
llvm-svn: 321959
Currently it's not possible to access MCSubtargetInfo from a TgtMCAsmBackend.
D20830 threaded an MCSubtargetInfo reference through
MCAsmBackend::relaxInstruction, but this isn't the only function that would
benefit from access. This patch removes the Triple and CPUString arguments
from createMCAsmBackend and replaces them with MCSubtargetInfo.
This patch just changes the interface without making any intentional
functional changes. Once in, several cleanups are possible:
* Get rid of the awkward MCSubtargetInfo handling in ARMAsmBackend
* Support 16-bit instructions when valid in MipsAsmBackend::writeNopData
* Get rid of the CPU string parsing in X86AsmBackend and just use a SubtargetFeature for HasNopl
* Emit 16-bit nops in RISCVAsmBackend::writeNopData if the compressed instruction set extension is enabled (see D41221)
This change initially exposed PR35686, which has since been resolved in r321026.
Differential Revision: https://reviews.llvm.org/D41349
llvm-svn: 321692
If the callee and caller use different calling convensions, we cannot apply TCO if the callee requires arguments on stack; e.g. C calling convention and Fast CC use the same registers for parameter passing, but the stack offset is not necessarily same.
This patch also recommit r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile since the problem reported in r320106 should be fixed.
Differential Revision: https://reviews.llvm.org/D40893
llvm-svn: 321579
Revision 320791 introduced a pass that transforms reg+reg instructions to
reg+imm if they're fed by "load immediate". However, it didn't
handle out-of-range shifts correctly as reported in PR35688.
This patch fixes that and therefore the PR.
Furthermore, there was undefined behaviour in the patch where the RHS of an
initialization expression was 32 bits and constant `1` was shifted left 32
bits. This was fixed by ensuring the RHS is 64 bits just like the LHS.
Differential Revision: https://reviews.llvm.org/D41369
llvm-svn: 321551
Re-land r321234. It had to be reverted because it broke the shared
library build. The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).
Original commit message:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
llvm-svn: 321375
The build failure was caused by an assertion in pre-legalization DAGCombine:
Combining: t6: ppcf128 = uint_to_fp t5
... into: t20: f32 = PPCISD::FCFIDUS t19
which is clearly wrong since ppcf128 are definitely different type with f32 and
we cannot change the node value type when do DAGCombine. The fix is don't
handle ppc_fp128 or i1 conversions in PPCTargetLowering::combineFPToIntToFP and
leave it to downstream to legalize it and expand it to small legal types.
Differential Revision: https://reviews.llvm.org/D41411
llvm-svn: 321276
Summary:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
llvm-svn: 321234
The function createTailCallBranchInstr assumes that the iterator MBBI is valid.
However, only one use of MBBI is guarded in the function.
Fix this by adding an assert.
Differential Revision: https://reviews.llvm.org/D41358
llvm-svn: 321205
This patch fixes a bug in the redundant compare elimination reported in https://reviews.llvm.org/rL320786 and re-enables the optimization.
The redundant compare elimination assumes that we can replace signed comparison with unsigned comparison for the equality check. But due to the difference in the sign extension behavior we cannot change the opcode if the comparison is against an immediate and the most significant bit of the immediate is one.
Differential Revision: https://reviews.llvm.org/D41385
llvm-svn: 321147
r307148 added an assembly mnemonic spelling correction support and enabled it
on ARM. This enables that support on PowerPC as well.
Patch by Dmitry Venikov, thanks!
Differential Revision: https://reviews.llvm.org/D40552
llvm-svn: 320911
This patch adds the necessary infrastructure to convert instructions that
take two register operands to those that take a register and immediate if
the necessary operand is produced by a load-immediate. Furthermore, it uses
this infrastructure to perform such conversions twice - first at MachineSSA
and then pre-emit.
There are a number of reasons we may end up with opportunities for this
transformation, including but not limited to:
- X-Form instructions chosen since the exact offset isn't available at ISEL time
- Atomic instructions with constant operands (we will add patterns for this
in the future)
- Tail duplication may duplicate code where one block contains this redundancy
- When emitting compare-free code in PPCDAGToDAGISel, we don't handle constant
comparands specially
Furthermore, this patch moves the initialization of PPCMIPeepholePass so that
it can be used for MIR tests.
llvm-svn: 320791
The compare elimination peephole introduced in https://reviews.llvm.org/rL312514
causes a miscompile in AMDGPUInstrInfo.cpp which in turn causes some AMDGPU
test case failures in stage2 bootstrap testing. This miscompile didn't cause any
test case failures until https://reviews.llvm.org/rL320614, so it appeared as if
that patch caused these failures.
Disabling this transformation for now to bring the build bots back to green and
the author of the patch will investigate the miscompile.
llvm-svn: 320786
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.
On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.
llvm-svn: 320746
Work towards the unification of MIR and debug output by printing
`@foo` instead of `<ga:@foo>`.
Also print target flags in the MIR format since most of them are used on
global address operands.
Only debug syntax is affected.
llvm-svn: 320682
The initial implementation of an MI SSA pass to reduce cr-logical operations.
Currently, the only operations handled by the pass are binary operations where
both CR-inputs come from the same block and the single use is a conditional
branch (also in the same block).
Committing this off by default to allow for a period of field testing. Will
enable it by default in a follow-up patch soon.
Differential Revision: https://reviews.llvm.org/D30431
llvm-svn: 320584
Most of the targets don't need the scheduler class enum.
I have an X86 scheduler model change that causes some names in the enum to become about 18000 characters long. This is because using instregex in scheduler models causes the scheduler class to get named with every instruction that matches the regex concatenated together. MSVC has a limit of 4096 characters for an identifier name. Rather than trying to come up with way to reduce the name length, I'm just going to sidestep the problem by not including the enum in X86.
llvm-svn: 320552
Headers/Implementation files should be named after the class they
declare/define.
Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in
favor of `class LiveIntarvals;`
llvm-svn: 320546
This flag was missing but it wasn't an issue as nothing depended on it
for these asm parser-only instructions. Now that LLDB support is slowly
landing, it is important to get this right.
Committing on behalf of Leonardo Bianconi.
Differential revision: https://reviews.llvm.org/D40846
llvm-svn: 320475
The last of the three patches that https://reviews.llvm.org/D40348 was
broken up into.
Canonicalize the materialization of constants so that they are more likely
to be CSE'd regardless of the bit-width of the use. If a constant can be
materialized using PPC::LI, materialize it the same way always.
For example:
li 4, -1
li 4, 255
li 4, 65535
are equivalent if the uses only use the low byte. Canonicalize it to the
first form.
Differential Revision: https://reviews.llvm.org/D40348
llvm-svn: 320473
The pass to expand ISEL instructions into if-then-else sequences in patch D23630
is currently disabled. This patch partially enable it by always removing the
unnecessary ISELs (all registers used by the ISELs are the same one) and folding
the ISELs which have the same input registers into unconditional copies.
Differential Revision: https://reviews.llvm.org/D40497
llvm-svn: 320414
Second part of https://reviews.llvm.org/D40348.
Revision r318436 has extended all constants feeding a store to 64 bits
to allow for CSE on the SDAG. However, negative constants were zero extended
which made the constant being loaded appear to be a positive value larger than
16 bits. This resulted in long sequences to materialize such constants
rather than simply a "load immediate". This patch just sign-extends those
updated constants so that they remain 16-bit signed immediates if they started
out that way.
llvm-svn: 320368
This adds assembly & disassembly support for the e500mc "external pid"
instructions.
See https://reviews.llvm.org/D39249.
Patch by vit9696 <vit9696@avp.su>
llvm-svn: 320287
It is causing sanitizer failures on llvm tests in a bootstrapped compiler. No bot link since it's currently down, but following up to get the bot up.
This reverts commit r319218.
llvm-svn: 320106
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.
The MIR printer prints the IR name of a MBB only for block definitions.
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
Differential Revision: https://reviews.llvm.org/D40422
llvm-svn: 319665
This re-commits everything that was pulled in r314244. The transformation
is off by default (patch to enable it to follow). The code is refactored
to have a single entry-point and provide fine-grained control over patterns
that it selects. This patch also fixes the bugs in the original code.
Everything that failed with the original patch has been re-tested with this
patch (with the transformation turned on). So the patch to turn this on is
soon to follow.
Differential Revision: https://reviews.llvm.org/D38575
llvm-svn: 319434
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).
Basically:
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed
Differential Revision: https://reviews.llvm.org/D40420
llvm-svn: 319427
- add -ppc-reg-with-percent-prefix option to use %r3 etc as register
names
- split off logic for Darwinish verbose conditional codes into a helper
function
- be explicit about Darwin vs AIX vs GNUish assembler flavors
Based on the patch from Alexandre Yukio Yamashita
Differential Revision: https://reviews.llvm.org/D39016
llvm-svn: 319381
Separate the handling of AND/AND8 out from PHI/OR/ISEL checking. The reasoning
is the others need all their operands to be sign/zero extended for their output
to also be sign/zero extended. This is true for AND and sign-extension, but for
zero-extension we only need at least one of the input operands to be zero
extended for the result to also be zero extended.
Differential Revision: https://reviews.llvm.org/D39078
llvm-svn: 319289
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.
* Only debug printing is affected. It now follows MIR.
Differential Revision: https://reviews.llvm.org/D40417
llvm-svn: 319187
This patch adds a peep hole optimization to remove any redundant toc save
instructions added as part of the call sequence for indirect calls. It removes
any toc saves within a function that are dominated by another toc save.
Differential Revision: https://reviews.llvm.org/D39736
llvm-svn: 319087
This patch extends on to rL307174 to not use the power9 vector extract with
variable index instructions when extracting word element 1. For such cases,
the existing selection of MFVSRWZ provides a better sequence.
Differential Revision: https://reviews.llvm.org/D38287
llvm-svn: 319049