The machine scheduler (before register allocation) is enabled by default for
SystemZ.
The SelectionDAG scheduling preference now becomes source order scheduling
(was regpressure).
Review: Ulrich Weigand
https://reviews.llvm.org/D37977
llvm-svn: 315063
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.
If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).
This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.
Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899https://bugs.llvm.org/show_bug.cgi?id=34610
llvm-svn: 314516
The expensive-checks build bot found a problem with the r314428 commit:
if CC is live after a ATOMIC_CMP_SWAPW instruction, it needs to be
marked as live-in to the block after the loop the pseudo gets expanded
to. This actually fixes a code-gen bug as well, since if the CC isn't
live, the CR and JLH are merged to a CRJLH which doesn't actually set
the condition code any more.
llvm-svn: 314465
The SystemZ compare-and-swap instructions already provide the "success"
indication via a condition-code value, so the default expansion of those
operations generates an unnecessary extra comparsion.
llvm-svn: 314428
More conversions to load-and-test can be made with this patch by adding a
forward search in optimizeCompareZero().
Review: Ulrich Weigand
https://reviews.llvm.org/D38076
llvm-svn: 313877
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.
This transformation is correct for regular stores, but not for
truncating stores. The routine neglected to check for that case.
Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.
llvm-svn: 313669
This bit is needed in order for the CalleeSavedRegs list to automatically
include the super registers if all of their subregs are present.
Thanks to Wei Mi for initially indicating this deficiency in the SystemZ
backend.
Review: Ulrich Weigand.
https://bugs.llvm.org/show_bug.cgi?id=34550
llvm-svn: 313023
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.
The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.
Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.
Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053
llvm-svn: 311072
The liveness-tracking code assumes that the registers that were saved
in the function's prolog are live outside of the function. Specifically,
that registers that were saved are also live-on-exit from the function.
This isn't always the case as illustrated by the LR register on ARM.
Differential Revision: https://reviews.llvm.org/D36160
llvm-svn: 310619
isLegalAddressingMode() has recently gained the extra optional Instruction*
parameter, and therefore it can now do the job that previously only
isFoldableMemAccess() could do.
The SystemZ implementation of isLegalAddressingMode() has gained the
functionality of checking for offsets, which used to be done with
isFoldableMemAccess().
The isFoldableMemAccess() hook has been removed everywhere.
Review: Quentin Colombet, Ulrich Weigand
https://reviews.llvm.org/D35933
llvm-svn: 310463
This adds support for the main 128-bit atomic operations,
using the SystemZ instructions LPQ, STPQ, and CDSG.
Generating these instructions is a bit more complex than usual
since the i128 type is not legal for the back-end. Therefore,
we have to hook the LowerOperationWrapper and ReplaceNodeResults
TargetLowering callbacks.
llvm-svn: 310094
We currently emit a serialization operation (bcr 14, 0) before every
atomic load and after every atomic store. This is overly conservative.
The SystemZ architecture actually does not require any serialization
for atomic loads, and a serialization after an atomic store only if
we need to enforce sequential consistency. This is what other compilers
for the platform implement as well.
llvm-svn: 310093
IMHO it is an antipattern to have a enum value that is Default.
At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.
This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.
llvm-svn: 309911
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.
In order to achieve this, the following common code changes were made:
* New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
LSR should do instruction-based addressing evaluations by calling
isLegalAddressingMode() with the Instruction pointers.
* In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
not just loads or stores.
SystemZ changes:
* isLSRCostLess() implemented with Insns first, and without ImmCost.
* New function supportedAddressingMode() that is a helper for TTI methods
looking at Instructions passed via pointers.
Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262https://reviews.llvm.org/D35049
llvm-svn: 308729
This adds support for the new 128-bit vector float instructions of z14.
Note that these instructions actually only operate on the f128 type,
since only each 128-bit vector register can hold only one 128-bit
float value. However, this is still preferable to the legacy 128-bit
float instructions, since those operate on pairs of floating-point
registers (so we can hold at most 8 values in registers), while the
new instructions use single vector registers (so we hold up to 32
value in registers).
Adding support includes:
- Enabling the instructions for the assembler/disassembler.
- CodeGen for the instructions. This includes allocating the f128
type now to the VR128BitRegClass instead of FP128BitRegClass.
- Scheduler description support for the instructions.
Note that for a small number of operations, we have no new vector
instructions (like integer <-> 128-bit float conversions), and so
we use the legacy instruction and then reformat the operand
(i.e. copy between a pair of floating-point registers and a
vector register).
llvm-svn: 308196
This adds support for the new 32-bit vector float instructions of z14.
This includes:
- Enabling the instructions for the assembler/disassembler.
- CodeGen for the instructions, including new LLVM intrinsics.
- Scheduler description support for the instructions.
- Update to the vector cost function calculations.
In general, CodeGen support for the new v4f32 instructions closely
matches support for the existing v2f64 instructions.
llvm-svn: 308195
This patch series adds support for the IBM z14 processor. This part includes:
- Basic support for the new processor and its features.
- Support for new instructions (except vector 32-bit float and 128-bit float).
- CodeGen for new instructions, including new LLVM intrinsics.
- Scheduler description for the new processor.
- Detection of z14 as host processor.
Support for the new 32-bit vector float and 128-bit vector float
instructions is provided by separate patches.
llvm-svn: 308194
The issue is not if the value is pcrel. It is whether we have a
relocation or not.
If we have a relocation, the static linker will select the upper
bits. If we don't have a relocation, we have to do it.
llvm-svn: 307730
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
global and local memory. These scopes restrict how synchronization is
achieved, which can result in improved performance.
This change extends existing notion of synchronization scopes in LLVM to
support arbitrary scopes expressed as target-specific strings, in addition to
the already defined scopes (single thread, system).
The LLVM IR and MIR syntax for expressing synchronization scopes has changed
to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
replaces *singlethread* keyword), or a target-specific name. As before, if
the scope is not specified, it defaults to CrossThread/System scope.
Implementation details:
- Mapping from synchronization scope name/string to synchronization scope id
is stored in LLVM context;
- CrossThread/System and SingleThread scopes are pre-defined to efficiently
check for known scopes without comparing strings;
- Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
the bitcode.
Differential Revision: https://reviews.llvm.org/D21723
llvm-svn: 307722
Several integer multiply/divide instructions require use of a
register pair as input and output. This patch moves setting
up the input register pair from C++ code to TableGen, simplifying
the whole process and making it more easily extensible.
No functional change.
llvm-svn: 307155
Fixes a couple of whitespace errors, re-sorts the vector floating-point
instructions to make them more easily extensible, and adds a missing
pseudo instruction.
No functional change.
llvm-svn: 307154
This adds all remaining instructions that were still missing, mostly
privileged and semi-privileged system-level instructions. These are
provided for use with the assembler and disassembler only.
This brings the LLVM assembler / disassembler to parity with the
GNU binutils tools.
llvm-svn: 306876
There are a few instructions provided by the high-word facility (z196)
that we cannot easily exploit for code generation. This patch at least
adds those missing instructions for the assembler and disassembler.
This means that now all nonprivileged instructions up to z13 are
supported by the LLVM assembler / disassembler.
llvm-svn: 306821
We sometimes need emergency spill slots for the register scavenger.
This may be the case when code needs to access a stack slot that
has an offset of 4096 or more relative to the stack pointer.
To make that determination, processFunctionBeforeFrameFinalized
currently simply checks the total stack frame size of the current
function. But this is not enough, since code may need to access
stack slots in the caller's stack frame as well, in particular
incoming arguments stored on the stack.
This commit fixes the problem by taking argument slots into account.
llvm-svn: 306305
Csmith discovered that this function can be called with a zero argument,
in which case an assert for this triggered.
This patch also adds a guard before the other call to this function since
it was missing, although the test only covers the case where it was
discovered.
Reduced test case attached as CodeGen/SystemZ/int-cmp-54.ll.
Review: Ulrich Weigand
llvm-svn: 306287
processFixupValue is called on every relaxation iteration. applyFixup
is only called once at the very end. applyFixup is then the correct
place to do last minute changes and value checks.
While here, do proper range checks again for fixup_arm_thumb_bl. We
used to do it, but dropped because of thumb2. We now do it again, but
use the thumb2 range.
llvm-svn: 306177
This reverts the use of TargetLowering::prepareVolatileOrAtomicLoad
introduced by r196905. Nothing in the semantics of the "volatile"
keyword or the definition of the z/Architecture actually requires
that volatile loads are preceded by a serialization operation, and
no other compiler on the platform actually implements this.
Since we've now seen a use case where this additional serialization
causes noticable performance degradation, this patch removes it.
The patch still leaves in the serialization before atomic loads,
which is now implemented directly in lowerATOMIC_LOAD. (This also
seems overkill, but that can be addressed separately.)
llvm-svn: 306117
The isBarrier/isTerminator flags have been removed from the SystemZ trap
instructions, so that tests do not fail with EXPENSIVE_CHECKS. This was just
an issue at -O0 and did not affect code output on benchmarks.
(Like Eli pointed out: "targets are split over whether they consider their
"trap" a terminator; x86, AArch64, and NVPTX don't, but ARM, MIPS, PPC, and
SystemZ do. We should probably try to be consistent here.". This is still the
case, although SystemZ has switched sides).
SystemZ now returns true in isMachineVerifierClean() :-)
These Generic tests have been modified so that they can be run with or without
EXPENSIVE_CHECKS: CodeGen/Generic/llc-start-stop.ll and
CodeGen/Generic/print-machineinstrs.ll
Review: Ulrich Weigand, Simon Pilgrim, Eli Friedman
https://bugs.llvm.org/show_bug.cgi?id=33047https://reviews.llvm.org/D34143
llvm-svn: 306106
Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation.
Reviewers: chandlerc, rnk, reames
Reviewed By: reames
Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits
Differential Revision: https://reviews.llvm.org/D33903
llvm-svn: 305189
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
llvm-svn: 304864
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
llvm-svn: 304787
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.
This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!
Differential Revision: https://reviews.llvm.org/D33696
llvm-svn: 304320
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.
While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.
llvm-svn: 304247
This adds assembler / disassembler support for the decimal
floating-point instructions. Since LLVM does not yet have
support for decimal float types, these cannot be used for
codegen at this point.
llvm-svn: 304203
This adds assembler / disassembler support for the hexadecimal
floating-point instructions. Since the Linux ABI does not use
any hex float data types, these are not useful for codegen.
llvm-svn: 304202
Use VLREP when inserting one or more loads into a vector. This is more
efficient than to first load and then use a VLVGP.
Review: Ulrich Weigand
llvm-svn: 304152
The loop vectorizer usually vectorizes any instruction it can and then
extracts the elements for a scalarized use. On SystemZ, all elements
containing addresses must be extracted into address registers (GRs). Since
this extraction is not free, it is better to have the address in a suitable
register to begin with. By forcing address arithmetic instructions and loads
of addresses to be scalar after vectorization, two benefits result:
* No need to extract the register
* LSR optimizations trigger (LSR isn't handling vector addresses currently)
Benchmarking show improvements on SystemZ with this new behaviour.
Any other target could try this by returning false in the new hook
prefersVectorizedAddressing().
Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand
https://reviews.llvm.org/D32422
llvm-svn: 303744
EXPENSIVE_CHECKS found this bug (https://bugs.llvm.org/show_bug.cgi?id=33047), which
this patch fixes. The EAR instruction defines a GR32, not a GR64.
Review: Ulrich Weigand
llvm-svn: 303743
This adds a few missing instructions for the assembler and
disassembler. Those should be the last missing general-
purpose (Chapter 7) instructions for the z10 ISA.
llvm-svn: 302667
This adds the remaining general arithmetic instructions
for assembler / disassembler use. Most of these are not
useful for codegen; a few might be, and those are listed
in the README.txt for future improvements.
llvm-svn: 302665
This method must return a valid register class, or the list-ilp isel
scheduler will crash. For MVT::Untyped nullptr was previously returned, but
now ADDR128BitRegClass is returned instead. This is needed just as long as
list-ilp (and probably also list-hybrid) is still there.
Review: Ulrich Weigand, A Trick
https://reviews.llvm.org/D32802
llvm-svn: 302649
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.
This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.
The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
affects all targets that use frame pseudo instructions and touched many
files although the changes are uniform.
- Access to frame properties are implemented using special instructions
rather than calls getOperand(N).getImm(). For X86 and ARM such
replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
instruction. These involve proper instruction initialization and
methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
frame parts initialized inside frame instruction pair and outside it.
The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.
Differential Revision: https://reviews.llvm.org/D32394
llvm-svn: 302527
When a 128 bit COPY is lowered into two instructions, an impl-use operand of
the super-reg should be added to each new instruction in case one of the
sub-regs is undefined.
Review: Ulrich Weigand
llvm-svn: 302146
It is needed to check that the number of operands are 2 when
finding the case of a logic combination, e.g. 'and' of two compares.
Review: Ulrich Weigand
llvm-svn: 302022
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.
This is largely a mechanical transformation from KnownZero to Known.Zero.
Differential Revision: https://reviews.llvm.org/D32569
llvm-svn: 301620
In getCmpSelInstrCost(), CondTy may actually be scalar while ValTy is a
vector when LoopVectorizer is the caller. Therefore the assert that CondTy
must be a vector type if ValTy is was wrong and is now removed.
Review: Ulrich Weigand
llvm-svn: 301533
1. RegisterClass::getSize() is split into two functions:
- TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
- TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
- TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;
This will allow making those values depend on subtarget features in the
future.
Differential Revision: https://reviews.llvm.org/D31783
llvm-svn: 301221
Since SystemZ supports vector element load/store instructions, there is no
need for extracts/inserts if a vector load/store gets scalarized.
This patch lets Target specify that it supports such instructions by means of
a new TTI hook that defaults to false.
The use for this is in the LoopVectorizer getScalarizationOverhead() method,
which will with this patch produce a smaller sum for a vector load/store on
SystemZ.
New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll
Review: Adam Nemet
https://reviews.llvm.org/D30680
llvm-svn: 300056
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.
Interleaved access vectorization enabled.
BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.
Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631
llvm-svn: 300052
A test case was found with llvm-stress that caused DAGCombiner to crash
when compiling for an older subtarget without vector support.
SystemZTargetLowering::combineTruncateExtract() should do nothing for older
subtargets.
This check was placed in canTreatAsByteVector(), which also helps in a few
other places.
Review: Ulrich Weigand
llvm-svn: 299763
A number of backends (AArch64, MIPS, ARM) have been using
MCContext::reportError to report issues such as out-of-range fixup values in
their TgtAsmBackend. This is great, but because MCContext couldn't easily be
threaded through to the adjustFixupValue helper function from its usual
callsite (applyFixup), these backends ended up adding an MCContext* argument
and adding another call to applyFixup to processFixupValue. Adding an
MCContext parameter to applyFixup makes this unnecessary, and even better -
applyFixup can take a reference to MCContext rather than a potentially null
pointer.
Differential Revision: https://reviews.llvm.org/D30264
llvm-svn: 299529
Since LOCR only accepts GR32 virtual registers, its operands must be copied
into this regclass in insertSelect(), when an LOCR is built. Otherwise, the
case where the source operand was GRX32 will produce invalid IR.
Review: Ulrich Weigand
llvm-svn: 299220
Even on older subtargets that lack vector support, there may be vector values
with just one element in the input program. These are converted during DAG
legalization to scalar values.
The pre-legalize SystemZ DAGCombiner methods should in this circumstance not
touch these nodes. This patch adds a check for this in
SystemZTargetLowering::combineEXTRACT_VECTOR_ELT().
Review: Ulrich Weigand
llvm-svn: 299213
The def operand of the new LG/LD should have the old def operands
flags and subreg index.
New test: test/CodeGen/SystemZ/fold-memory-op-impl.ll
Review: Ulrich Weigand
llvm-svn: 298341
If one of the subregs of the 128 bit reg is undefined when splitMove() splits
a store into two instructions, a use of an undefined physical register
results.
To remedy this, an implicit use of the super register is added onto both new
instructions, along with propagated kill and undef flags.
This was discovered with llvm-stress, and that test case is attached as
test/CodeGen/SystemZ/splitMove_undefReg_mverifier.ll
Thanks to Matthias Braun for helping with a nice explanation.
Review: Ulrich Weigand
llvm-svn: 298047
The GeneralShuffle::add() method used to have an assert that made sure that
source elements were at least as big as the destination elements. This was
wrong, since it is actually expected that an EXTRACT_VECTOR_ELT node with a
smaller source element type than the return type gets extended.
Therefore, instead of asserting this, it is just checked and if this is the
case 'false' is returned from the GeneralShuffle::add() method. This case
should be very rare and is not handled further by the backend.
Review: Ulrich Weigand.
llvm-svn: 292888
Vector immediate load instructions should have the isAsCheapAsAMove, isMoveImm
and isReMaterializable flags set. With them, these instruction will get
hoisted out of loops.
Review: Ulrich Weigand
llvm-svn: 292790
During post-RA pseudo expansion, an 'undef' flag of the source operand should
be propagated by emitGRX32Move().
Review: Ulrich Weigand
llvm-svn: 292353
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.
See https://reviews.llvm.org/D28057 for the whole discussion.
Differential Revision: https://reviews.llvm.org/D28556
llvm-svn: 291891
A store of an extracted element or a load which gets inserted into a vector,
will be combined into a vector load/store element instruction.
Therefore, isFoldableMemAccessOffset(), which is called by LSR, should
return false in these cases.
Reviewer: Ulrich Weigand
llvm-svn: 291673
Add assembler support for all atomic instructions that weren't already
supported. Some of those could be used to implement codegen for 128-bit
atomic operations, but this isn't done here yet.
llvm-svn: 288526
Add assembler support for instructions manipulating the FPC.
Also add codegen support via the GCC compatibility builtins:
__builtin_s390_sfpc
__builtin_s390_efpc
llvm-svn: 288525
Move setting of hasSideEffects out of SystemZInstrFormats.td,
to allow use of the format classes for instructions where this
flag shouldn't be set. NFC.
llvm-svn: 288524
Recommitting r288293 with some extra fixes for GlobalISel code.
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.
This is a necessary step to have machine module passes work properly.
Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
because the available MachineFunction pointers are const, but the code
wants to call tidyLandingPads() in between
(markFunctionEnd()/endFunction()).
Differential Revision: https://reviews.llvm.org/D27227
llvm-svn: 288405
Now that we have fixups that only fill parts of a byte, it turns
out we have to mask off the bits outside the fixup area when
applying them. Failing to do so caused invalid object code to
be emitted for bprp with a negative 12-bit displacement.
llvm-svn: 288374
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.
This is a necessary step to have machine module passes work properly.
Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
because the available MachineFunction pointers are const, but the code
wants to call tidyLandingPads() in between
(markFunctionEnd()/endFunction()).
Differential Revision: https://reviews.llvm.org/D27227
llvm-svn: 288293
This is per function data so it is better kept at the function instead
of the module.
This is a necessary step to have machine module passes work properly.
Differential Revision: https://reviews.llvm.org/D27185
llvm-svn: 288291
This adds assembler support for the instructions provided by the
execution-hint facility (NIAI and BP(R)P). This required adding
support for the new relocation types for 12-bit and 24-bit PC-
relative offsets used by the BP(R)P instructions.
llvm-svn: 288031
This patch adds assembler support for the remaining branch instructions:
the non-relative branch on count variants, and all variants of branch
on index.
The only one of those that can be readily exploited for code generation
is BRCTH (branch on count using a high 32-bit register as count). Do
use it, however, it is necessary to also introduce a hew CHIMux pseudo
to allow comparisons of a 32-bit value agains a short immediate to go
into a high register as well (implemented via CHI/CIH).
This causes a bit of codegen changes overall, but those have proven to
be neutral (or even beneficial) in performance measurements.
llvm-svn: 288029
This patch moves formation of LOC-type instructions from (late)
IfConversion to the early if-conversion pass, and in some cases
additionally creates them directly from select instructions
during DAG instruction selection.
To make early if-conversion work, the patch implements the
canInsertSelect / insertSelect callbacks. It also implements
the commuteInstructionImpl and FoldImmediate callbacks to
enable generation of the full range of LOC instructions.
Finally, the patch adds support for all instructions of the
load-store-on-condition-2 facility, which allows using LOC
instructions also for high registers.
Due to the use of the GRX32 register class to enable high registers,
we now also have to handle the cases where there are still no single
hardware instructions (conditional move from a low register to a high
register or vice versa). These are converted back to a branch sequence
after register allocation. Since the expandRAPseudos callback is not
allowed to create new basic blocks, this requires a simple new pass,
modelled after the ARM/AArch64 ExpandPseudos pass.
Overall, this patch causes significantly more LOC-type instructions
to be used, and results in a measurable performance improvement.
llvm-svn: 288028
Summary:
* ARM is omitted from this patch because this check appears to expose bugs in this target.
* Mips is omitted from this patch because this check either detects bugs or deliberate
emission of instructions that don't satisfy their predicates. One deliberate
use is the SYNC instruction where the version with an operand is correctly
defined as requiring MIPS32 while the version without an operand is defined
as an alias of 'SYNC 0' and requires MIPS2.
* X86 is omitted from this patch because it doesn't use the tablegen-erated
MCCodeEmitter infrastructure.
Patches for ARM and Mips will follow.
Depends on D25617
Reviewers: tstellarAMD, jmolloy
Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D25618
llvm-svn: 287439
This adds support for the compare logical and trap (memory)
instructions that were added as part of the miscellaneous
instruction extensions feature with zEC12.
llvm-svn: 286587
This adds support for the LZRF/LZRG/LLZRGF instructions that were
added on z13, and uses them for code generation were appropriate.
SystemZDAGToDAGISel::tryRISBGZero is updated again to prefer LLZRGF
over RISBG where both would be possible.
llvm-svn: 286586
This adds support for the 31-to-64-bit zero extension instructions
LLGT and LLGTR and uses them for code generation where appropriate.
Since this operation can also be performed via RISBG, we have to
update SystemZDAGToDAGISel::tryRISBGZero so that we prefer LLGT
over RISBG in case both are possible. The patch includes some
simplification to the tryRISBGZero code; this is not intended
to cause any (further) functional change in codegen.
llvm-svn: 286585
The name/comment of the third argument to the ScheduleDAGMI constructor
is RemoveKillFlags and not IsPostRA. Only the comments are changed.
Review: A Trick
llvm-svn: 286350
This completes assembler / disassembler support for all BFP
instructions provided by the floating-point extensions facility.
The instructions added here are not currently used for codegen.
llvm-svn: 286285
Add several instructions that operate on the program mask
or the addressing mode. These are not really needed for
code generation under Linux, but are provided for completeness
for the assembler/disassembler.
llvm-svn: 286284
Add the 16 access registers as LLVM registers. This allows removing
a lot of special cases in the assembler and disassembler where we
were handling access registers; this can all just use the generic
register code now.
Also add a bunch of instructions to operate on access registers,
for assembler/disassembler use only. No change in code generation
intended.
llvm-svn: 286283
Define a couple of additional semantic classes and use them
throughout the .td files to make them more consistent and
more easily readable.
No functional change.
llvm-svn: 286268
This changes the InstRR (and related) patterns to no longer
automatically add an "r" at the end of the mnemonic. This
makes the .td files more obviously understandable, and also
allows using the patterns for those few instructions that
do not follow the *r scheme.
Also add some more sub-formats of the RRF format class, to
match operand names and sequence from the PoP better.
No functional change.
llvm-svn: 286267
Now that we've added instruction format subclasses like
InstRIb, it makes sense to rename the old InstRI to InstRIa.
Similar for InstRX, InstRXY, InstRS, InstRSY, and InstSS.
No functional change.
llvm-svn: 286266
Rework patterns for branches, call & return instructions,
compare-and-branch, compare-and-trap, and conditional move
instructions.
In particular, simplify creation of patterns for the extended
opcodes of instructions that take a CC mask.
Also, use semantical instruction classes for all the instructions
instead of open-coding them in SystemZInstrInfo.td.
Adds a couple of the basic branch instructions (that are unused
for codegen) for the assembler/disassembler.
llvm-svn: 286263
* Use a generic vector unit to model the issue unit more accurately.
* Update some vector instructions that actually use the vector unit for more
than one cycle.
Review: Ulrich Weigand
llvm-svn: 286112
IssueWidth updated to reflect the capacity of the issue unit correctly.
Correct number of FX and LS units modelled (2, was 1).
Review: Ulrich Weigand
llvm-svn: 286109
As it stands, the OperandMatchResultTy is only included in the generated
header if there is custom operand parsing. However, almost all backends
make use of MatchOperand_Success and friends from OperandMatchResultTy for
e.g. parseRegister. This is a pain when starting an AsmParser for a new
backend that doesn't yet have custom operand parsing. Move the enum to
MCTargetAsmParser.h.
This patch is a prerequisite for D23563
Differential Revision: https://reviews.llvm.org/D23496
llvm-svn: 285705
This patch implements two changes:
- Move processor feature definition into a new file SystemZFeatures.td,
and provide explicit lists of supported and unsupported features for
each level of the z/Architecture. This allows specifying unsupported
features in the scheduler definition files for each processor.
- Add optional aliases for the -mcpu processor names according to the
level of the z/Architecture, for compatibility with other compilers
on the platform. The supported aliases are:
-mcpu=arch8 equals -mcpu=z10
-mcpu=arch9 equals -mcpu=z196
-mcpu=arch10 equals -mcpu=zEC12
-mcpu=arch11 equals -mcpu=z13
llvm-svn: 285577
The LEFR/LFER pseudos are aliases for vector instructions and should
therefore be guared by FeatureVector. If they aren't, the TableGen
scheduler definition checking might complain that there is no data
for those pseudos for pre-z13 machines.
No functional change intended.
llvm-svn: 285576
Currently, when using an instruction that is not supported on the
currently selected architecture, the LLVM assembler is likely to
diagnose an "invalid operand" instead of a "missing feature".
This is because many operands require a custom parser in order to
be processed correctly, and if an instruction is not available
according to the current feature set, the generated parser code
will also not detect the associated custom operand parsers.
Fixed by temporarily enabling all features while parsing operands.
The missing features will then be correctly detected when actually
parsing the instruction itself.
llvm-svn: 285575
LLVM currently treats the first operand of MVCK as if it were a
regular base+index+displacement address. However, it is in fact
a base+displacement combined with a length register field.
While the two might look syntactically similar, there are two
semantic differences:
- %r0 is a valid length register, even though it cannot be used
as an index register.
- In an expression with just a single register like 0(%rX), the
register is treated as base with normal addresses, while it is
treated as the length register (with an empty base) for MVCK.
Fixed by adding a new operand parser class BDRAddr and reworking
the assembler parser to distinguish between address + length
register operands and regular addresses.
llvm-svn: 285574
It is not safe to use LOAD ON CONDITION to implement access to a memory
location marked "volatile", since the architecture leaves it unspecified
whether or not an access happens if the condition is false.
The current code already appears to care about that:
def LOC : CondUnaryRSY<"loc", 0xEBF2, nonvolatile_load, GR32, 4>;
Unfortunately, that "nonvolatile_load" operator is simply ignored
by the CondUnaryRSY class, and there was no test to catch it.
llvm-svn: 285077
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.
No functionality change intended.
llvm-svn: 284721
Post-RA sched strategy and scheduling instruction annotations for z196, zEC12
and z13.
This scheduler optimizes decoder grouping and balances processor resources
(including side steering the FPd unit instructions).
The SystemZHazardRecognizer keeps track of the scheduling state, which can
be dumped with -debug-only=misched.
Reviers: Ulrich Weigand, Andrew Trick.
https://reviews.llvm.org/D17260
llvm-svn: 284704
Most z13 vector instructions have a base form where the data type of
the operation (whether to consider the vector to be 16 bytes, 8
halfwords, 4 words, or 2 doublewords) is encoded into a mask field,
and then a set of extended mnemonics where the mask field is not
present but the data type is encoded into the mnemonic name.
Currently, LLVM only supports the type-specific forms (since those
are really the ones needed for code generation), but not the base
type-generic forms.
To complete the assembler support and make it fully compatible with
the GNU assembler, this commit adds assembler aliases for all the
base forms of the various vector instructions.
It also adds two more alias forms that are documented in the PoP:
VFPSO/VFPSODB/WFPSODB -- generic form of VFLCDB etc.
VNOT -- special variant of VNO
llvm-svn: 284586
The vfee[bhf], vfene[bhf], and vistr[bhf] assembler mnemonics are
documented in the Principles of Operation to have an optional last
operand to encode arbitrary values in a mask field.
This commit adds support for those optional operands, and cleans up
the patterns to generate vector string instruction as bit. No change
to code generation intended.
llvm-svn: 284585
This commit enables more unrolling for SystemZ by implementing the
SystemZTargetTransformInfo::getUnrollingPreferences() method.
It has been found that it is better to only unroll moderately, so the
DefaultUnrollRuntimeCount has been moved into UnrollingPreferences in order
to set this to a lower value for SystemZ (4).
Reviewers: Evgeny Stupachenko, Ulrich Weigand.
https://reviews.llvm.org/D24451
llvm-svn: 282570
Recommitting after fixing AsmParser initialization and X86 inline asm
error cleanup.
Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.
As part of this many minor cleanups to the Parser:
* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
now fixed.
These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.
Reviewers: rnk, majnemer
Subscribers: aemerson, jyknight, llvm-commits
Differential Revision: https://reviews.llvm.org/D24047
llvm-svn: 281762
Recommitting after fixing AsmParser Initialization.
Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.
As part of this many minor cleanups to the Parser:
* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
now fixed.
These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.
Reviewers: rnk, majnemer
Subscribers: aemerson, jyknight, llvm-commits
Differential Revision: https://reviews.llvm.org/D24047
llvm-svn: 281336
Allow errors to be deferred and emitted as part of clean up to simplify
and shorten Assembly parser code. This will allow error messages to be
emitted in helper functions and be modified by the caller which has
better context.
As part of this many minor cleanups to the Parser:
* Unify parser cleanup on error
* Add Workaround for incorrect return values in ParseDirective instances
* Tighten checks on error-signifying return values for parser functions
and fix in-tree TargetParsers to be more consistent with the changes.
* Fix AArch64 test cases checking for spurious error messages that are
now fixed.
These changes should be backwards compatible with current Target Parsers
so long as the error status are correctly returned in appropriate
functions.
Reviewers: rnk, majnemer
Subscribers: aemerson, jyknight, llvm-commits
Differential Revision: https://reviews.llvm.org/D24047
llvm-svn: 281249
Summary:
An IR load can be invariant, dereferenceable, neither, or both. But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.
This patch splits up the notions of invariance and dereferenceability at
the MI level. It's NFC, so adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.
Reviewers: chandlerc, tstellarAMD
Subscribers: jholewinski, arsenm, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D23371
llvm-svn: 281151
Rename AllVRegsAllocated to NoVRegs. This avoids the connotation of
running after register and simply describes that no vregs are used in
a machine function. With that we can simply compute the property and do
not need to dump/parse it in .mir files.
Differential Revision: http://reviews.llvm.org/D23850
llvm-svn: 279698
The change in r279105 causes an infinite loop in some cases, as it sets the upper bits of an AND mask constant, which DAGCombiner::SimplifyDemandedBits then unsets.
This patch reverts that part of the behaviour, instead relying on .td peepholes to perform the transformation to NILL. I reapplied my original fix for the problem addressed by r279105 (unsetting the upper bits, which prevents a compiler abort for a different reason).
Differential Revision: https://reviews.llvm.org/D23781
llvm-svn: 279515
Summary:
Inline asm memory constraints can have the base or index register be assigned
to %r0 right now. Make sure that we assign only ADDR64 registers to the base
and index.
Reviewers: uweigand
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23367
llvm-svn: 279157
The names of the tablegen defs now match the names of the ISD nodes.
This makes the world a slightly saner place, as previously "fround" matched
ISD::FP_ROUND and not ISD::FROUND.
Differential Revision: https://reviews.llvm.org/D23597
llvm-svn: 279129
Normally, when an AND with a constant is lowered to NILL, the constant value is truncated to 16 bits. However, since r274066, ANDs whose results are used in a shift are caught by a different pattern that does not truncate. The instruction printer expects a 16-bit unsigned immediate operand for NILL, so this results in an abort.
This patch adds code to manually truncate the constant in this situation. The rest of the bits are then set, so we will detect a case for NILL "naturally" rather than using peephole optimizations.
Differential Revision: http://reviews.llvm.org/D21854
llvm-svn: 279105
Refactored so that a LSRUse owns its fixups, as oppsed to letting the
LSRInstance own them. This makes it easier to rate formulas for
LSRUses, since the fixups are available directly. The Offsets vector
has been removed since it was no longer necessary.
New target hook isFoldableMemAccessOffset(), which is used during formula
rating.
For SystemZ, this is useful to express that loads and stores with
float or vector types with a big/negative offset should be avoided in
loops. Without this, LSR will generate a lot of negative offsets that
would require extra instructions for loading the address.
Updated tests:
test/CodeGen/SystemZ/loop-01.ll
Reviewed by: Quentin Colombet and Ulrich Weigand.
https://reviews.llvm.org/D19152
llvm-svn: 278927
In debug mode extra macros are enabled for several C++ algorithms. Some of them
may cause unfortunate build failures.
This commit adds a redundant operator() to work around one of those troublesome
macros which was hit accidentally by change r278012.
llvm-svn: 278241
Summary:
Add support for the .insn directive.
.insn is an s390 specific directive that allows encoding of an instruction
instead of using a mnemonic. The motivating case is some code in node.js that
requires support for the .insn directive.
Reviewers: koriakin, uweigand
Subscribers: koriakin, llvm-commits
Differential Revision: https://reviews.llvm.org/D21809
llvm-svn: 278012
Summary:
Add instruction formats E, RSI, SSd, SSE, and SSF.
Added BRXH, BRXLE, PR, MVCK, STRAG, and ECTG instructions to test out
those formats.
Reviewers: uweigand
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23179
llvm-svn: 277822
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.
Differential Revision: https://reviews.llvm.org/D22885
llvm-svn: 277126
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to chose which relocations to use. Tested using
check-llvm.
The corresponding change to clang is in: http://reviews.llvm.org/D16538
Patch by: Joel Jones
Differential Revision: https://reviews.llvm.org/D16213
llvm-svn: 276654
Summary:
Instead, we take a single flags arg (a bitset).
Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.
This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted. It also greatly simplifies the process of adding another flag
to getLoad.
Reviewers: chandlerc, tstellarAMD
Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits
Differential Revision: http://reviews.llvm.org/D22249
llvm-svn: 275592
Summary:
Previously we took an unsigned.
Hooray for type-safety.
Reviewers: chandlerc
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D22282
llvm-svn: 275591
Avoid implicit conversions from MachineInstrBundleIterator to
MachineInstr* in the SystemZ backend, mainly by preferring MachineInstr&
over MachineInstr* and using range-based for loops.
llvm-svn: 275137
Summary: Add support for the z13 instructions LOCHI and LOCGHI which
conditionally load immediate values. Add target instruction info hooks so
that if conversion will allow predication of LHI/LGHI.
Author: RolandF
Reviewers: uweigand
Subscribers: zhanjunl
Commiting on behalf of Roland.
Differential Revision: http://reviews.llvm.org/D22117
llvm-svn: 275086
This adds a new SystemZ-specific intrinsic, llvm.s390.tdc.f(32|64|128),
which maps straight to the test data class instructions. A new IR pass
is added to recognize instructions that can be converted to TDC and
perform the necessary replacements.
Differential Revision: http://reviews.llvm.org/D21949
llvm-svn: 275016
Summary: Branch off the work to add support for the .word directive,
using addAliasForDirective.
Reviewers: koriakin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22142
llvm-svn: 274878
Summary:
A regression showed up in node.js when handling conditional calls.
Fix the regression by recognizing external symbols as a possible
operand type in CallJG.
Reviewers: koriakin
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D22054
llvm-svn: 274761
On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount.
Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we
can remove the AND operation entirely.
Differential Revision: http://reviews.llvm.org/D21854
llvm-svn: 274650
Change all the methods in LiveVariables that expect non-null
MachineInstr* to take MachineInstr& and update the call sites. This
clarifies the API, and designs away a class of iterator to pointer
implicit conversions.
llvm-svn: 274319
This is a mechanical change to make TargetLowering API take MachineInstr&
(instead of MachineInstr*), since the argument is expected to be a valid
MachineInstr. In one case, changed a parameter from MachineInstr* to
MachineBasicBlock::iterator, since it was used as an insertion point.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
llvm-svn: 274287
This processor feature had been left out by mistake from the z13
ProcessorModel.
This time with updated test case. Thanks, Hans.
Reviewed by Ulrich Weigand.
llvm-svn: 274216
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr. This is a
general API improvement.
Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other. Instead I've done everything as a block and just
updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&`
operators. The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy. I couldn't run tests
for AVR since llc doesn't link with it turned on.
llvm-svn: 274189
Summary: SystemZ shift instructions only use the last 6 bits of the shift
amount. When the result of an AND operation is used as a shift amount, this
means that we can use the NILL instruction (which operates on the last 16 bits)
rather than NILF (which operates on the last 32 bits) for a 16-bit savings in
instruction size.
Reviewers: uweigand
Subscribers: llvm-commits
Author: colpell
Committing on behalf of Elliot.
Differential Revision: http://reviews.llvm.org/D21686
llvm-svn: 274066
Summary:
Created a pattern to match 64-bit mode (and (xor x, -1), y)
to a shorter sequence of instructions.
Before the change, the canonical form is translated to:
xihf %r3, 4294967295
xilf %r3, 4294967295
ngr %r2, %r3
After the change, the canonical form is translated to:
ngr %r3, %r2
xgr %r2, %r3
Reviewers: zhanjunl, uweigand
Subscribers: llvm-commits
Author: assem
Committing on behalf of Assem.
Differential Revision: http://reviews.llvm.org/D21693
llvm-svn: 273887
Summary:
Recognize RISBG opportunities where the end result is narrower than the
original input - where a truncate separates the shift/and operations.
The motivating case is some code in postgres which looks like:
srlg %r2, %r0, 11
nilh %r2, 255
Reviewers: uweigand
Author: RolandF
Differential Revision: http://reviews.llvm.org/D21452
llvm-svn: 273433
This enables use of the 'R' and 'T' memory constraints for inline ASM
operands on SystemZ, which allow an index register as well as an
immediate displacement. This patch includes corresponding documentation
and test case updates.
As with the last patch of this kind, I moved the 'm' constraint to the
most general case, which is now 'T' (base + 20-bit signed displacement +
index register).
Author: colpell
Differential Revision: http://reviews.llvm.org/D21239
llvm-svn: 272547
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
Support and generate Compare and Traps like CRT, CIT, etc.
Support Trap as legal DAG opcodes and generate "j .+2" for them by default.
Add support for Conditional Traps and use the If Converter to convert them into
the corresponding compare and trap opcodes.
Differential Revision: http://reviews.llvm.org/D21155
llvm-svn: 272419
This enables use of the 'S' constraint for inline ASM operands on
SystemZ, which allows for a memory reference with a signed 20-bit
immediate displacement. This patch includes corresponding documentation
and test case updates.
I've changed the 'T' constraint to match the new behavior for 'S', as
'T' also uses a long displacement (though index constraints are still
not implemented). I also changed 'm' to match the behavior for 'S' as
this will allow for a wider range of displacements for 'm', though
correct me if that's not the right decision.
Author: colpell
Differential Revision: http://reviews.llvm.org/D21097
llvm-svn: 272266
Having an enum member named Default is quite confusing: Is it distinct
from the others?
This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.
llvm-svn: 269988
Summary:
The ordering of registers in BinaryRRF instructions are wrong, and
affects the copysign instruction (CPSDR). This results in the wrong
magnitude and sign being set.
Author: zhanjunl
Reviewers: kbarton, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D20308
llvm-svn: 269922
Summary: On Linux, /usr/include/bits/byteswap-16.h defines __byteswap_16(x) as an inlined LRVH (Load Reversed Half-word) instruction. The SystemZ back-end did not support this opcode and the inlined assembly would cause a fatal error.
Reviewers: bryanpkc, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18732
llvm-svn: 269688
- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
the method to try* and return a bool for success.
Part of llvm.org/pr26808.
llvm-svn: 269505
This is a bit of a spot fix for now. I'll try to fix this up more
comprehensively soon.
This is part of the work to have Select return void instead of an
SDNode *, which is in turn part of llvm.org/pr26808.
llvm-svn: 269120
Added support for extended mnemonics for the following branch instructions and
load/store-on-condition opcodes:
BR, LOCR, LOCGR, LOC, LOCG, STOC, STOCG
Phabricator: http://reviews.llvm.org/D19729
Committing on behalf of Zhan Liau
llvm-svn: 269106
Currently, SelectionDAG assumes 8/16-bit cmpxchg returns either a sign
extended result, or a zero extended result. SystemZ takes a third
option by returning junk in the high bits (rotated contents of the other
bytes in the memory word). In that case, don't use Assert*ext, and
zero-extend the result ourselves if a comparison is needed.
Differential Revision: http://reviews.llvm.org/D19800
llvm-svn: 269075
SystemZ (and probably other targets as well) can fold a memory operand
by changing the opcode into a new instruction that as a side-effect
also clobbers the CC-reg.
In order to do this, liveness of that reg must first be checked. When
LIS is passed, getRegUnit() can be called on it and the right
LiveRange is computed on demand.
Reviewed by Matthias Braun.
http://reviews.llvm.org/D19861
llvm-svn: 269026
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.
llvm-svn: 269011
The call to Select on Upper here happens in an unusual order in order
to defeat the constant folding that getNode() does. Add a comment
explaining why we can't just move the Select to later to avoid a
Handle, and wrap the call to SelectCode in a handle so we don't need
its return value.
This is part of the work to have Select return void instead of an
SDNode *, which is in turn part of llvm.org/pr26808.
llvm-svn: 268990
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.
We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.
Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.
llvm-svn: 268693
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI. This will
be used to implement -mbackchain option in clang.
Differential Revision: http://reviews.llvm.org/D19889
Fixed in this version: added RegState::Define and RegState::Kill on R1D
in prologue.
llvm-svn: 268581
This introduces a SystemZ-specific "backchain" attribute on function, which
enables writing the frame backchain link as specified by the ABI. This will
be used to implement -mbackchain option in clang.
Differential Revision: http://reviews.llvm.org/D19889
llvm-svn: 268571
Marking implicit CC defs as dead everywhere except when CC is actually
defined and used explicitly, is important since the post-ra scheduler
will otherwise insert edges between instructions unnecessarily.
Also temporarily disable LA(Y)-> AGSI optimization in
foldMemoryOperandImpl(), since this inroduces a def of the CC reg,
which is illegal unless it is known to be dead.
Reviewed by Ulrich Weigand.
llvm-svn: 268215
Summary:
Port rL265480, rL264754, rL265997 and rL266252 to SystemZ, in order to enable the Swift port on the architecture. SwiftSelf and SwiftError are assigned to R10 and R9, respectively, which are normally callee-saved registers. For more information, see:
RFC: Implementing the Swift calling convention in LLVM and Clang
https://groups.google.com/forum/#!topic/llvm-dev/epDd2w93kZ0
Reviewers: kbarton, manmanren, rjmccall, uweigand
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19414
llvm-svn: 267823
This fixes PR22248 on s390x. The previous attempt at this was D19101,
which was before LOAD_STACK_GUARD existed. Compared to the previous
version, this always emits a rather ugly block of 4 instructions, involving
a thread pointer load that can't be shared with other potential users.
However, this is necessary for SSP - spilling the guard value (or thread
pointer used to load it) is counter to the goal, since it could be
overwritten along with the frame it protects.
Differential Revision: http://reviews.llvm.org/D19363
llvm-svn: 267340
Removed some unused headers, replaced some headers with forward class declarations.
Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'
Patch by Eugene Kosov <claprix@yandex.ru>
Differential Revision: http://reviews.llvm.org/D19219
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
Use the tryAddingSymbolicOperand callback to attempt to present immediate
values in symbolic form when disassembling. This is currently only used
for PC-relative immediates (which are most likely to be symbolic in the
SystemZ ISA). Add new DecodeMethod types to allow distinguishing between
branch and non-branch instructions.
llvm-svn: 266469
On z13, if eliminateFrameIndex() chooses LE (and not LEY), immediately
transform that LE to LDE32 to avoid partial register dependencies.
LEY should be generally preferred for big offsets over an expansion
into LAY + LDE32.
Reviewed by Ulrich Weigand.
llvm-svn: 266060
The note about conditional returns can now be removed, as they are
implemented. Let's also add 2 new ones in exchange.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18962
llvm-svn: 265944
This adds a conditional variant of CallBR instruction, CallBCR. Also,
it can be fused with integer comparisons, resulting in one of the new
C*BCall instructions.
In addition to CallBRCL limitations, this has another one: it won't
trigger if the function to call isn't already in %r1 - see f22 in the
test for an example (it's also why the loads in tests are volatile).
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18928
llvm-svn: 265933
These are fused compare-and-branches, so they obviously don't use CC.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18927
llvm-svn: 265932
This adds a conditional variant of CallJG instruction, CallBRCL.
It can be used for conditional sibling calls. Unfortunately, due
to IfCvt limitations, it only really works well for functions without
arguments.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18864
llvm-svn: 265814
Return is now considered a predicable instruction, and is converted
to a newly-added CondReturn (which maps to BCR to %r14) instruction by
the if conversion pass.
Also, fused compare-and-branch transform knows about conditional
returns, emitting the proper fused instructions for them.
This transform triggers on a *lot* of tests, hence the huge diffstat.
The changes are mostly jX to br %r14 -> bXr %r14.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D17339
llvm-svn: 265689
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.
The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).
This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.
As a follow-up I'll add:
bool operator<(AtomicOrdering, AtomicOrdering) = delete;
bool operator>(AtomicOrdering, AtomicOrdering) = delete;
bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.
Reviewers: jyknight, reames
Subscribers: jyknight, llvm-commits
Differential Revision: http://reviews.llvm.org/D18775
llvm-svn: 265602
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.
Reviewers: qcolombet
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18525
llvm-svn: 265313
This adds MC support for fused compare + indirect branch instructions,
ie. CRB, CGRB, CLRB, CLGRB, CIB, CGIB, CLIB, CLGIB. They aren't actually
generated yet -- this is preparation for their use for conditional
returns in the next iteration of D17339.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18742
llvm-svn: 265296
A cross-thread sequentially consistent fence should be lowered into
z/Architecture's BCR serialization instruction, instead of causing a
fatal error in the back-end.
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18644
llvm-svn: 265292
Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which
previously would cause the back-end to crash. Currently, only a
frame count of zero is supported.
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18514
llvm-svn: 265291
This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.
It also greatly simplifies the calls to this function from Prolog-
EpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.
Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.
Differential Revision: http://reviews.llvm.org/D18627
llvm-svn: 265036