Commit Graph

534 Commits

Author SHA1 Message Date
Krzysztof Parzyszek bb1aede865 [SystemZ] Require asserts in subregliveness-06.mir
The option -misched=shuffle is only available with !NDEBUG builds.

llvm-svn: 339931
2018-08-16 20:12:15 +00:00
Krzysztof Parzyszek 9af86a5e01 [MachineVerifier] Check if predecessor is jointly dominated by undefs
Each use of a value should be jointly dominated by the union of defs and
undefs. It can happen that it will only be jointly dominated by undefs,
and that is still legal. Make sure that the verifier is aware of that.

llvm-svn: 339924
2018-08-16 19:13:28 +00:00
Krzysztof Parzyszek 17143f6111 [RegisterCoalescer] Shrink to uses if needed after removeCopyByCommutingDef
llvm-svn: 339912
2018-08-16 18:02:59 +00:00
Krzysztof Parzyszek 3b097b4d3e [RegisterCoalescer] Ensure that both registers have subranges if one does
llvm-svn: 339792
2018-08-15 17:04:58 +00:00
Krzysztof Parzyszek 88d267d094 [RegisterCoalescer] Reset VNInfo def when copying segments over
llvm-svn: 339788
2018-08-15 16:21:53 +00:00
Krzysztof Parzyszek 46ce441df6 [RegAlloc] Check that subreg liveness tracking applies to given virtual reg
Subregister liveness applies selectively to register classes with certain
properties. Make sure that when it's enabled, it applies to a given virtual
register (in virtual register rewriter).

llvm-svn: 339784
2018-08-15 16:07:47 +00:00
Krzysztof Parzyszek 4e06beb820 [SystemZ] Add testcase for r339778
llvm-svn: 339780
2018-08-15 15:43:13 +00:00
Jonas Paulsson f107b7275c [SystemZ] Improve handling of instructions which expand to several groups
Some instructions expand to more than one decoder group.

This has been hitherto ignored, but is handled with this patch.

Review: Ulrich Weigand
https://reviews.llvm.org/D50187

llvm-svn: 338849
2018-08-03 10:43:05 +00:00
Ulrich Weigand 58a9786e81 [SystemZ, TableGen] Fix shift count handling
The DAG combiner logic to simplify AND masks in shift counts is invalid.
While it is true that the SystemZ shift instructions ignore all but the
low 6 bits of the shift count, it is still invalid to simplify the AND
masks while the DAG still uses the standard shift operators (which are
*not* defined to match the SystemZ instruction behavior).

Instead, this patch performs equivalent operations during instruction
selection. For completely removing the AND, this now happens via
additional DAG match patterns implemented by a multi-alternative
PatFrags. For simplifying a 32-bit AND to a 16-bit AND, the existing DAG
patterns were already mostly OK, they just needed an output XForm to
actually truncate the immediate value.

Unfortunately, the latter change also exposed a bug in TableGen: it
seems XForms are currently only handled correctly for direct operands of
the outermost operation node. This patch also fixes that bug by simply
recurring through the whole pattern. This should be NFC for all other
targets.

Differential Revision: https://reviews.llvm.org/D50096

llvm-svn: 338521
2018-08-01 11:57:58 +00:00
Jonas Paulsson 2f12e45d5a [SystemZ] Improve decoding in case of instructions with four register operands.
Since z13, the max group size will be 2 if any μop has more than 3 register
sources.

This has been ignored sofar in the SystemZHazardRecognizer, but is now
handled by recognizing those instructions and adjusting the tracking of
decoding and the cost heuristic for grouping.

Review: Ulrich Weigand
https://reviews.llvm.org/D49847

llvm-svn: 338368
2018-07-31 13:00:42 +00:00
Ulrich Weigand 9dd23b8433 [SystemZ] Test case formatting fixes
Fix systematically wrong whitespace from a prior automated change.

NFC.

llvm-svn: 337542
2018-07-20 12:12:10 +00:00
Jonas Paulsson c88d3f6a99 [SystemZ] Reimplent SchedModel IssueWidth and WriteRes/ReadAdvance mappings.
As a consequence of recent discussions
(http://lists.llvm.org/pipermail/llvm-dev/2018-May/123164.html), this patch
changes the SystemZ SchedModels so that the IssueWidth is 6, which is the
decoder capacity, and NumMicroOps become the number of decoder slots needed
per instruction.

In addition, the SchedWrite latencies now match the MachineInstructions
def-operand indexes, and ReadAdvances have been added on instructions with
one register operand and one memory operand.

Review: Ulrich Weigand
https://reviews.llvm.org/D47008

llvm-svn: 337538
2018-07-20 09:40:43 +00:00
Joel E. Denny 9fa9c9368d [FileCheck] Add -allow-deprecated-dag-overlap to failing llvm tests
See https://reviews.llvm.org/D47106 for details.

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D47171

This commit drops that patch's changes to:

  llvm/test/CodeGen/NVPTX/f16x2-instructions.ll
  llvm/test/CodeGen/NVPTX/param-load-store.ll

For some reason, the dos line endings there prevent me from commiting
via the monorepo.  A follow-up commit (not via the monorepo) will
finish the patch.

llvm-svn: 336843
2018-07-11 20:25:49 +00:00
George Rimar dcf59c5480 Recommit r335333 "[MC] - Add .stack_size sections into groups and link them with .text"
With compilation fix.

Original commit message:

D39788 added a '.stack-size' section containing metadata on function stack sizes
to output ELF files behind the new -stack-size-section flag.

This change does following two things on top:

1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. 
    The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to
    eliminate them fast during resolving the COMDATs.
2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text.
   With that linker will be able to do -gc-sections on dead stack sizes sections.

Differential revision: https://reviews.llvm.org/D46874

llvm-svn: 335336
2018-06-22 10:53:47 +00:00
George Rimar 6d448da1be Revert r335332 "[MC] - Add .stack_size sections into groups and link them with .text"
It broke bots.

http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443
http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551

llvm-svn: 335333
2018-06-22 10:27:33 +00:00
George Rimar e14485a0c6 [MC] - Add .stack_size sections into groups and link them with .text
D39788 added a '.stack-size' section containing metadata on function stack sizes
to output ELF files behind the new -stack-size-section flag.

This change does following two things on top:

1) Imagine the case when there are -ffunction-sections flag given and there are text sections in COMDATs. 
    The patch adds a '.stack-size' section into corresponding COMDAT group, so that linker will be able to
    eliminate them fast during resolving the COMDATs.
2) Patch sets a SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text.
   With that linker will be able to do -gc-sections on dead stack sizes sections.

Differential revision: https://reviews.llvm.org/D46874

llvm-svn: 335332
2018-06-22 10:10:53 +00:00
Karl-Johan Karlsson abb11f805f [BranchFolding] Fix live-in's when hoisting code
Summary:
When the branch folder hoist code into a predecessor it adjust live-in's
in the blocks it hoist code from. However it fail to handle hoisted code
that contain a defed register that originally is live-in in the block
through a super register.

This is fixed by replacing the live-in handling code with calls to
utility functions in LivePhysRegs.

Reviewers: kparzysz, gberry, MatzeB, uweigand, aprantl

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47529

llvm-svn: 334163
2018-06-07 07:20:33 +00:00
Jonas Paulsson 307e782cbc [SystemZ] Bugfix in combineSTORE().
Remember to check if store is truncating before calling
combineTruncateExtract().

Review: Ulrich Weigand
llvm-svn: 333262
2018-05-25 09:01:23 +00:00
Jonas Paulsson 7d484fae2b [RegUsageInfoCollector] Bugfix for callee saved registers.
Previously, this pass would look at the (static) set returned by
getCallPreservedMask() and add those back as preserved in the case when
isSafeForNoCSROpt() returns false.

A problem is that a target may have to save some registers even when NoCSROpt
takes place. For instance, on SystemZ, the return register is needed upon
return from a function.

Furthermore, getCallPreservedMask() only includes the registers that the
target actually wishes to emit save/restore instructions for. This means that
subregs and (fully saved) superregs are missing.

This patch instead takes the (dynamic) set returned by target for the
function from determineCalleeSaves() and then adds sub/super regs to build
the set to be used when building the RegMask for the function.

Review: Quentin Colombet, Ulrich Weigand
https://reviews.llvm.org/D46315

llvm-svn: 333261
2018-05-25 08:42:02 +00:00
Jonas Paulsson de54c058a6 [SystemZ] Fold AHIMux in foldMemoryOperandImpl.
AHIMux can be folded the same way as AHI.

Review: Ulrich Weigand
llvm-svn: 332703
2018-05-18 11:54:04 +00:00
Jonas Paulsson ebb1605bf3 [SystemZ] Bugfix for MVCLoop CC clobbering.
MVCLoop clobbers CC (since it emits a compare/branch), but this was not
modelled.

Review: Ulrich Weigand
llvm-svn: 331627
2018-05-07 10:48:43 +00:00
Jonas Paulsson 72fe760592 [RegUsageInfoCollector] Bugfix for handling of register aliases.
Don't assume the alias of a defined reg is always already in the set.

As the test case in https://bugs.llvm.org/show_bug.cgi?id=36587 discovered,
it is wrong to assume that all the aliases of the defined register in the
*current function* is already present in the UsedPhysRegsMask.

This patch changes this so that any definition in the current function of a
phys-reg always results in all its aliases inserted into the set of defined
registers.

Review: Quentin Colombet
https://reviews.llvm.org/D45157

llvm-svn: 331509
2018-05-04 07:50:05 +00:00
Ulrich Weigand c3ec80fea1 [SystemZ] Handle SADDO et.al. and ADD/SUBCARRY
This provides an optimized implementation of SADDO/SSUBO/UADDO/USUBO
as well as ADDCARRY/SUBCARRY on top of the new CC implementation.

In particular, multi-word arithmetic now uses UADDO/ADDCARRY instead
of the old ADDC/ADDE logic, which means we no longer need to use
"glue" links for those instructions.  This also allows making full
use of the memory-based instructions like ALSI, which couldn't be
recognized due to limitations in the DAG matcher previously.

Also, the llvm.sadd.with.overflow et.al. intrinsincs now expand to
directly using the ADD instructions and checking for a CC 3 result.

llvm-svn: 331203
2018-04-30 17:54:28 +00:00
Ulrich Weigand b32f3656d2 [SystemZ] Do not use glue to represent condition code dependencies
Currently, an instruction setting the condition code is linked to
the instruction using the condition code via a "glue" link in the
SelectionDAG.  This has a number of drawbacks; in particular, it
means the same CC cannot be used by multiple users.  It also makes
it more difficult to efficiently implement SADDO et. al.

This patch changes the back-end to represent CC dependencies as
normal values during SelectionDAG matching, along the lines of
how this is handled in the X86 back-end already.

In addition to the core mechanics of updating all relevant patterns,
this requires a number of additional changes:

- We now need to be able to spill/restore a CC value into a GPR
  if necessary.  This means providing a copyPhysReg implementation
  for moves involving CC, and defining getCrossCopyRegClass.

- Since we still prefer to avoid such spills, we provide an override
  for IsProfitableToFold to avoid creating a merged LOAD / ICMP if
  this would result in multiple users of the CC.

- combineCCMask no longer requires a single CC user, and no longer
  need to be careful about preventing invalid glue/chain cycles.

- emitSelect needs to be more careful in marking CC live-in to
  the basic block it generates.  Also, we can now optimize the
  case of multiple subsequent selects with the same condition
  just like X86 does.

llvm-svn: 331202
2018-04-30 17:52:32 +00:00
Ulrich Weigand fb56686cd3 [SystemZ] Improve handling of Select pseudo-instructions
If we have LOCR instructions, select them directly from SelectionDAG
instead of first going through a pseudo instruction and then using
the custom inserter to emit the LOCR.

Provide Select pseudo-instructions for VR32/VR64 if we have vector
instructions, to avoid having to go through the first 16 FPRs
unnecessarily.

If we do not have LOCFHR, prefer using LOCR followed by a move
over a conditional branch.

llvm-svn: 331191
2018-04-30 15:49:27 +00:00
Jun Bum Lim 06073bfff7 [PostRASink]Add register dependency check for implicit operands
Summary:
This change extend the register dependency check for implicit operands in Copy instructions.
Fixes PR36902.

Reviewers: thegameg, sebpop, uweigand, jnspaulsson, gberry, mcrosier, qcolombet, MatzeB

Reviewed By: thegameg

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D44958

llvm-svn: 330018
2018-04-13 14:23:09 +00:00
Zaara Syeda 935474fef5 [MachineLICM] Re-enable hoisting of constant stores
This patch fixes an issue exposed on the SystemZ build bots when committing
https://reviews.llvm.org/rL327856. The hoisting was temporarily disabled with
an option. This patch now re-enables hoisting and checks that we only hoist a
store instruction when all its operands are either constant caller preserved
registers or immediates.

Differential Revision: https://reviews.llvm.org/D45286

llvm-svn: 329577
2018-04-09 14:50:02 +00:00
Rafael Espindola 78fdca3cd5 Use local symbols for creating .stack-size.
llvm-svn: 328581
2018-03-26 20:40:22 +00:00
Jonas Paulsson 8ad035d8e5 [SystemZ] Add "REQUIRES: asserts" to test case to fix build bots.
llvm-svn: 327958
2018-03-20 08:29:19 +00:00
Jonas Paulsson a6216ec4cc [SystemZ] Bugfix of CC liveness in emitMemMemWrapper (CLC).
If DoneMBB becomes empty it must have CC added to its live-in list, since it
will fall-through into EndMBB. This happens when the CLC loop does the
complete range.

Review: Ulrich Weigand
llvm-svn: 327834
2018-03-19 13:05:22 +00:00
Jonas Paulsson dbcf1bf503 [SystemZ] Add 'REQUIRES: asserts' to test case using debug output.
llvm-svn: 327766
2018-03-17 09:15:13 +00:00
Jonas Paulsson 138960770c [SystemZ] computeKnownBitsForTargetNode() / ComputeNumSignBitsForTargetNode()
Improve/implement these methods to improve DAG combining. This mainly
concerns intrinsics.

Some constant operands to SystemZISD nodes have been marked Opaque to avoid
transforming back and forth between generic and target nodes infinitely.

Review: Ulrich Weigand
llvm-svn: 327765
2018-03-17 08:32:12 +00:00
Jonas Paulsson e9f7fa83d5 [SelectionDAG] Handle big endian target BITCAST in computeKnownBits()
The BITCAST handling in computeKnownBits() previously only worked for little
endian.

This patch reverses the iteration over elements for a big endian target which
allows this to work in this case also.

SystemZ test case.

Review: Eli Friedman
https://reviews.llvm.org/D44249

llvm-svn: 327764
2018-03-17 08:04:00 +00:00
Jonas Paulsson 5612bb292c [CodeGenPrepare] Respect endianness in splitMergedValStore.
splitMergedValStore will split a store into two if target prefers this, or if
-force-split-store is passed.

This patch adds the missing handling for endianness in this function along
with a test case.

Review: Eli Friedman
https://reviews.llvm.org/D44396

llvm-svn: 327375
2018-03-13 08:36:20 +00:00
Ulrich Weigand 1785e244eb [SystemZ] Fix test cases after r326613
I forgot to check in the updated test cases after the r326613 commit.

llvm-svn: 326616
2018-03-02 21:22:42 +00:00
Ulrich Weigand 8b19be46c7 [SystemZ] Add support for anyregcc calling convention
This adds back-end support for the anyregcc calling convention
for use with patchpoints.

Since all registers are considered call-saved with anyregcc
(except for 0 and 1 which may still be clobbered by PLT stubs
and the like), this required adding support for saving and
restoring vector registers in prologue/epilogue code for the
first time.  This is not used by any other calling convention.

llvm-svn: 326612
2018-03-02 20:40:11 +00:00
Ulrich Weigand 5eb64110d2 [SystemZ] Support stackmaps and patchpoints
This adds back-end support for the @llvm.experimental.stackmap and
@llvm.experimental.patchpoint intrinsics.

llvm-svn: 326611
2018-03-02 20:39:30 +00:00
Ulrich Weigand 3206388870 [SystemZ] Fix common-code users of stack size
On SystemZ we need to provide a register save area of 160 bytes to
any called function.  This size needs to be added when allocating
stack in the function prologue.  However, it was not accounted for
as part of MachineFrameInfo::getStackSize(); instead the back-end
used a private routine getAllocatedStackSize().

This is OK for code-gen purposes, but it breaks other users of
the getStackSize() routine, in particular it breaks the recently-
added -stack-size-section feature.

Fix this by updating the main stack size tracked by common code
(in emitPrologue) instead of using the private routine.

No change in code generation intended.

llvm-svn: 326610
2018-03-02 20:38:41 +00:00
Ulrich Weigand 18f6930fef [SystemZ] Support vector registers in inline asm
This adds support for specifying vector registers for use with inline
asm statements, either via the 'v' constraint or by explicit register
names (v0 ... v31).

llvm-svn: 326609
2018-03-02 20:36:34 +00:00
Craig Topper e7ca6f5456 [DAGCombiner] When combining zero_extend of a truncate, only mask before extending for vectors.
Masking first, prevents the extend from being combine with loads. Its also interfering with some vXi1 extraction code.

Differential Revision: https://reviews.llvm.org/D42679

llvm-svn: 326500
2018-03-01 22:32:25 +00:00
Geoff Berry a2b9011290 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Re-enable commit r323991 now that r325931 has been committed to make
MachineOperand::isRenamable() check more conservative w.r.t. code
changes and opt-in on a per-target basis.

llvm-svn: 326208
2018-02-27 16:59:10 +00:00
Jonas Paulsson 5b5e3d8f80 [SystemZ] Also update the CHECK line for VPDI
llvm-svn: 325898
2018-02-23 13:22:46 +00:00
Jonas Paulsson abc29dfa79 [SystemZ] Fix VPDI argument in test.
To select element 1 from each half with VPDI, a constant of 5 should be used.

llvm-svn: 325897
2018-02-23 13:20:57 +00:00
Quentin Colombet 48abac82b8 Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r323991.

This commit breaks target that don't model all the register constraints
in TableGen. So far the workaround was to set the
hasExtraXXXRegAllocReq, but it proves that it doesn't cover all the
cases.
For instance, when mutating an instruction (like in the lowering of
COPYs) the isRenamable flag is not properly updated. The same problem
will happen when attaching machine operand from one instruction to
another.

Geoff Berry is working on a fix in https://reviews.llvm.org/D43042.

llvm-svn: 325421
2018-02-17 03:05:33 +00:00
Jonas Paulsson 422dfbf7cc [SelectionDAG] Consider endianness in scalarizeVectorStore().
When handling vectors with non byte-sized elements, reverse the order of the
elements in the built integer if the target is Big-Endian.

SystemZ tests updated.

Review: Eli Friedman, Ulrich Weigand.
https://reviews.llvm.org/D42786

llvm-svn: 324063
2018-02-02 08:48:02 +00:00
Jonas Paulsson 0e50b6ed80 [SystemZ] Update test case (NFC)
test/CodeGen/SystemZ/vec-trunc-to-i1.ll was marked as a temporary
FAIL when it was previously updated when it needed one more COPY.
This was however wrong, since the loop body had been reduced
significantly, and it was actually an improvement.

Review: Ulrich Weigand.
llvm-svn: 324060
2018-02-02 07:52:02 +00:00
Geoff Berry 94503c7bc3 [MachineCopyPropagation] Extend pass to do COPY source forwarding
Summary:
This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.

This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers is such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).

This change is a continuation of the work started in D30751.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar

Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits

Differential Revision: https://reviews.llvm.org/D41835

llvm-svn: 323991
2018-02-01 18:54:01 +00:00
Nirav Dave 18f7f60e17 [SelectionDAG] Fix UpdateChains handling of TokenFactors
Summary:
In Instruction Selection UpdateChains replaces all matched Nodes'
chain references including interior token factors and deletes them.
This may allow nodes which depend on these interior nodes but are not
part of the set of matched nodes to be left with a dangling dependence.
Avoid this by doing the replacement for matched non-TokenFactor nodes.

Fixes PR36164.

Reviewers: jonpa, RKSimon, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42754

llvm-svn: 323977
2018-02-01 16:11:59 +00:00
Puyan Lotfi 43e94b15ea Followup on Proposal to move MIR physical register namespace to '$' sigil.
Discussed here:

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html

In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.

llvm-svn: 323922
2018-01-31 22:04:26 +00:00
Jonas Paulsson cc5fe73669 [SystemZ] Check the bitwidth before calling isInt/isUInt.
Since these methods will assert if the integer does not fit into 64 bits,
it is necessary to do this check before calling them in
supportedAddressingMode().

Review: Ulrich Weigand.
llvm-svn: 323866
2018-01-31 12:41:25 +00:00
Jonas Paulsson 9cee52732f Move new test from Generic to SystemZ.
A few build bots failed with r323042 because they are not configured to
build the SystemZ target.

llvm-svn: 323044
2018-01-20 16:57:06 +00:00
Jonas Paulsson 7ad28863fb [SelectionDAG] Fix codegen of vector stores with non byte-sized elements.
This was completely broken, but hopefully fixed by this patch.

In cases where it is needed, a vector with non byte-sized elements is stored
by extracting, zero-extending, shift:ing and or:ing the elements into an
integer of the same width as the vector, which is then stored.

Review: Eli Friedman, Ulrich Weigand
https://reviews.llvm.org/D42100#inline-369520
https://bugs.llvm.org/show_bug.cgi?id=35520

llvm-svn: 323042
2018-01-20 16:05:10 +00:00
Ulrich Weigand 426f6bef44 [SystemZ] Prefer LOCHI over generating IPM sequences
On current machines we have load-on-condition instructions that can be
used to directly implement the SETCC semantics.  If we have those, it is
always preferable to use them instead of generating the IPM sequence.

llvm-svn: 322989
2018-01-19 20:56:04 +00:00
Ulrich Weigand 31112895d9 [SystemZ] Directly use CC result of compare-and-swap
In order to implement a test whether a compare-and-swap succeeded, the
SystemZ back-end currently emits a rather inefficient sequence of first
converting the CC result into an integer, and then testing that integer
against zero.  This commit changes the back-end to simply directly test
the CC value set by the compare-and-swap instruction.

llvm-svn: 322988
2018-01-19 20:54:18 +00:00
Ulrich Weigand 849a59fd4b [SystemZ] Rework IPM sequence generation
The SystemZ back-end uses a sequence of IPM followed by arithmetic
operations to implement the SETCC primitive.  This is currently done
early during SelectionDAG.  This patch moves generating those sequences
to much later in SelectionDAG (during PreprocessISelDAG).

This doesn't change much in generated code by itself, but it allows
further enhancements that will be checked-in as follow-on commits.

llvm-svn: 322987
2018-01-19 20:52:04 +00:00
Ulrich Weigand ac04d9b8e5 [SystemZ] Run branch-12.ll test only if long tests enabled
This avoids excessive test run times e.g. with expensive checks enabled.

llvm-svn: 322983
2018-01-19 19:51:38 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Jonas Paulsson ef785694f2 [SystemZ] Handle BRCTH branches correctly in SystemZLongBranch.cpp.
BRCTH is capable of a long branch which needs to be recognized during branch
relaxation. This is done by checking for ExtraRelaxSize == 0.

Review: Ulrich Weigand
llvm-svn: 322688
2018-01-17 17:16:07 +00:00
Jonas Paulsson 776a81a483 [SystemZ] Check for legality before doing LOAD AND TEST transformations.
Since a load and test instruction treat its operands as signed, it can only
replace a logical compare for EQ/NE uses.

Review: Ulrich Weigand
https://bugs.llvm.org/show_bug.cgi?id=35662

llvm-svn: 322488
2018-01-15 15:41:26 +00:00
Jonas Paulsson 1a76f3a2c2 Temporarily revert
"[SystemZ]  Check for legality before doing LOAD AND TEST transformations."

, due to test failures.

llvm-svn: 322165
2018-01-10 10:05:55 +00:00
Jonas Paulsson 9222b91e24 [SelectionDAGBuilder] Chain prefetches less aggressively.
Prefetches used to always be chained between any previous and following
memory accesses. The problem with this was that later optimizations, such as
folding of a load into the user instruction, got disrupted.

This patch relaxes the chaining of prefetches in order to remedy this.

Reveiw: Hal Finkel
https://reviews.llvm.org/D38886

llvm-svn: 322163
2018-01-10 09:33:00 +00:00
Jonas Paulsson d9dde1ac56 [SystemZ] Check for legality before doing LOAD AND TEST transformations.
Since a load and test instruction treat its operands as signed, it can only
replace a logical compare for EQ/NE uses.

Review: Ulrich Weigand
https://bugs.llvm.org/show_bug.cgi?id=35662

llvm-svn: 322161
2018-01-10 09:18:17 +00:00
Puyan Lotfi fe6c9cbb24 [MIR] Repurposing '$' sigil used by external symbols. Replacing with '&'.
Planning to add support for named vregs. This puts is in a conundrum since
physregs are named as well. To rectify this we need to use a sigil other than
'%' for physregs in MIR. We've settled on using '$' for physregs but first we
must repurpose it from external symbols using it, which is what this commit is
all about. We think '&' will have familiar semantics for C/C++ users.

llvm-svn: 322146
2018-01-10 00:56:48 +00:00
Geoff Berry 60c431022e [MachineOperand][MIR] Add isRenamable to MachineOperand.
Summary:
Add isRenamable() predicate to MachineOperand.  This predicate can be
used by machine passes after register allocation to determine whether it
is safe to rename a given register operand.  Register operands that
aren't marked as renamable may be required to be assigned their current
register to satisfy constraints that are not captured by the machine
IR (e.g. ABI or ISA constraints).

Reviewers: qcolombet, MatzeB, hfinkel

Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39400

llvm-svn: 320503
2017-12-12 17:53:59 +00:00
Francis Visoiu Mistrih a8a83d150f [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

llvm-svn: 320022
2017-12-07 10:40:31 +00:00
Jonas Paulsson 19380bae05 [SystemZ] Bugfix in expandRxSBG()
Csmith discovered a program that caused wrong code generation with -O0:

When handling a SIGN_EXTEND in expandRxSBG(), RxSBG.BitSize may be less than
the Input width (if a truncate was previously traversed), so maskMatters()
should be called with a masked based on the width of the sign extend result
instead.

Review: Ulrich Weigand
llvm-svn: 319892
2017-12-06 13:53:24 +00:00
Ulrich Weigand 5bfed6cb7c [SystemZ] Validate shifted compare value in adjustForTestUnderMask
When folding a shift into a test-under-mask comparison, make sure that
there is no loss of precision when creating the shifted comparison
value.  This usually never happens, except for certain always-true
comparisons in unoptimized code.

Fixes PR35529.

llvm-svn: 319818
2017-12-05 19:42:07 +00:00
Jonas Paulsson b5b91cd402 [SystemZ] set 'guessInstructionProperties = 0' and set flags as needed.
This has proven a healthy exercise, as many cases of incorrect instruction
flags were corrected in the process. As part of this, IntrWriteMem was added
to several SystemZ instrinsics.

Furthermore, a bug was exposed in TwoAddress with this change (as incorrect
hasSideEffects flags were removed and instructions could now be sunk), and
the test case for that bugfix (r319646) is included here as
test/CodeGen/SystemZ/twoaddr-sink.ll.

One temporary test regression (one extra copy) which will hopefully go away
in upcoming patches for similar cases:
test/CodeGen/SystemZ/vec-trunc-to-i1.ll

Review: Ulrich Weigand.
https://reviews.llvm.org/D40437

llvm-svn: 319756
2017-12-05 11:24:39 +00:00
Jonas Paulsson 86c40db49d [Regalloc] Generate and store multiple regalloc hints.
MachineRegisterInfo used to allow just one regalloc hint per virtual
register. This patch extends this to a vector of regalloc hints, which is
filled in by common code with sorted copy hints. Such hints will make for
more ID copies that can be removed.

NB! This improvement is currently (and hopefully temporarily) *disabled* by
default, except for SystemZ. The only reason for this is the big impact this
has on tests, which has unfortunately proven unmanageable. It was a long
while since all the tests were updated and just waiting for review (which
didn't happen), but now targets have to enable this themselves
instead. Several targets could get a head-start by downloading the tests
updates from the Phabricator review. Thanks to those who helped, and sorry
you now have to do this step yourselves.

This should be an improvement generally for any target!

The target may still create its own hint, in which case this has highest
priority and is stored first in the vector. If it has target-type, it will
not be recomputed, as per the previous behaviour.

The temporary hook enableMultipleCopyHints() will be removed as soon as all
targets return true.

Review: Quentin Colombet, Ulrich Weigand.
https://reviews.llvm.org/D38128

llvm-svn: 319754
2017-12-05 10:52:24 +00:00
Francis Visoiu Mistrih 25528d6de7 [CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665
2017-12-04 17:18:51 +00:00
Francis Visoiu Mistrih c71cced0aa [CodeGen] Always use `printReg` to print registers in both MIR and debug
output

As part of the unification of the debug format and the MIR format,
always use `printReg` to print all kinds of registers.

Updated the tests using '_' instead of '%noreg' until we decide which
one we want to be the default one.

Differential Revision: https://reviews.llvm.org/D40421

llvm-svn: 319445
2017-11-30 16:12:24 +00:00
Jonas Paulsson b9a2467501 [SystemZ] Bugfix in adjustSubwordCmp.
Csmith generated a program where a store after load to the same address did
not get chained after the new load created during DAG legalizing, and so
performed an illegal overwrite of the expected value.

When the new zero-extending load is created, the chain users of the original
load must be updated, which was not done previously.

A similar case was also found and handled in lowerBITCAST.

Review: Ulrich Weigand
https://reviews.llvm.org/D40542

llvm-svn: 319409
2017-11-30 08:18:50 +00:00
Francis Visoiu Mistrih 9d7bb0cb40 [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

llvm-svn: 319187
2017-11-28 17:15:09 +00:00
Nirav Dave db77e57ea8 [DAG] Do MergeConsecutiveStores again before Instruction Selection
Summary:

Now that store-merge is only generates type-safe stores, do a second
pass just before instruction selection to allow lowered intrinsics to
be merged as well.

Reviewers: jyknight, hfinkel, RKSimon, efriedma, rnk, jmolloy

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33675

llvm-svn: 319036
2017-11-27 15:28:15 +00:00
Jonas Paulsson 181e260e32 [DAGCombiner] Bugfix in isAlias().
Since i1 is a legal type, this:

  NumBytes = Op1->getMemoryVT().getSizeInBits() >> 3;

is wrong and should be instead

  NumBytes = Op0->getMemoryVT().getStoreSize();

There seems to be more places where this should be fixed outside DAGCombiner.

Review: Hal Finkel
https://bugs.llvm.org/show_bug.cgi?id=35366

llvm-svn: 318824
2017-11-22 08:58:30 +00:00
Jonas Paulsson 12e3a58842 [SystemZ] Bugfix for handling of subregisters in getRegAllocationHints().
The 32 bit subreg indices of GR128 registers must also be checked for in
getRC32().

Review: Ulrich Weigand.
llvm-svn: 318652
2017-11-20 14:54:03 +00:00
Rong Xu 3573d8da36 [CodeGen] Peel off the dominant case in switch statement in lowering
This patch peels off the top case in switch statement into a branch if the
probability exceeds a threshold. This will help the branch prediction and
avoids the extra compares when lowering into chain of branches.

Differential Revision: http://reviews.llvm.org/D39262

llvm-svn: 318202
2017-11-14 21:44:09 +00:00
Ulrich Weigand 5f4373a2fc [SystemZ] Do not crash when selecting an OR of two constants
In rare cases, common code will attempt to select an OR of two
constants.  This confuses the logic in splitLargeImmediate,
causing an internal error during isel.  Fixed by simply leaving
this case to common code to handle.

This fixes PR34859.

llvm-svn: 318187
2017-11-14 20:00:34 +00:00
Ulrich Weigand 55b8590e03 [SystemZ] Fix invalid codegen using RISBMux on out-of-range bits
Before using the 32-bit RISBMux set of instructions we need to
verify that the input bits are actually within range of the 32-bit
instruction.  This fixer PR35289.

llvm-svn: 318177
2017-11-14 19:20:46 +00:00
Jonas Paulsson 4b017e682d [RegAlloc, SystemZ] Increase number of LOCRs by passing "hard" regalloc hints.
* The method getRegAllocationHints() is now of bool type instead of void. If
true is returned, regalloc (AllocationOrder) will *only* try to allocate the
hints, as opposed to merely trying them before non-hinted registers.

* TargetRegisterInfo::getRegAllocationHints() is implemented for SystemZ with
an increase in number of LOCRs.

In this case, it is desired to force the hints even though there is a slight
increase in spilling, because if a non-hinted register would be allocated,
the LOCRMux pseudo would have to be expanded with a jump sequence. The LOCR
(Load On Condition) SystemZ instruction must have both operands in either the
low or high part of the 64 bit register.

Reviewers: Quentin Colombet and Ulrich Weigand
https://reviews.llvm.org/D36795

llvm-svn: 317879
2017-11-10 08:46:26 +00:00
Ulrich Weigand d39e9dca1b [SystemZ] Add support for the "o" inline asm constraint
We don't really need any special handling of "offsettable"
memory addresses, but since some existing code uses inline
asm statements with the "o" constraint, add support for this
constraint for compatibility purposes.

llvm-svn: 317807
2017-11-09 16:31:57 +00:00
Jonas Paulsson c63ed222b8 [SystemZ] Enable machine scheduler.
The machine scheduler (before register allocation) is enabled by default for
SystemZ.

The SelectionDAG scheduling preference now becomes source order scheduling
(was regpressure).

Review: Ulrich Weigand
https://reviews.llvm.org/D37977

llvm-svn: 315063
2017-10-06 13:59:28 +00:00
Jonas Paulsson c9e363ac69 [SystemZ] implement shouldCoalesce()
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.

If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).

This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899
https://bugs.llvm.org/show_bug.cgi?id=34610

llvm-svn: 314516
2017-09-29 14:31:39 +00:00
Ulrich Weigand df86855f61 [SystemZ] Fix fall-out from r314428
The expensive-checks build bot found a problem with the r314428 commit:
if CC is live after a ATOMIC_CMP_SWAPW instruction, it needs to be
marked as live-in to the block after the loop the pseudo gets expanded
to.  This actually fixes a code-gen bug as well, since if the CC isn't
live, the CR and JLH are merged to a CRJLH which doesn't actually set
the condition code any more.

llvm-svn: 314465
2017-09-28 22:08:25 +00:00
Ulrich Weigand 0f1de04979 [SystemZ] Custom-expand ATOMIC_CMP_AND_SWAP_WITH_SUCCESS
The SystemZ compare-and-swap instructions already provide the "success"
indication via a condition-code value, so the default expansion of those
operations generates an unnecessary extra comparsion.

llvm-svn: 314428
2017-09-28 16:22:54 +00:00
Jonas Paulsson b0e8a2e623 [SystemZ] Improve optimizeCompareZero()
More conversions to load-and-test can be made with this patch by adding a
forward search in optimizeCompareZero().

Review: Ulrich Weigand
https://reviews.llvm.org/D38076

llvm-svn: 313877
2017-09-21 13:52:24 +00:00
Ulrich Weigand 59a01a958a [SystemZ] Fix truncstore + bswap codegen bug
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.

This transformation is correct for regular stores, but not for
truncating stores.  The routine neglected to check for that case.

Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.

llvm-svn: 313669
2017-09-19 20:50:05 +00:00
NAKAMURA Takumi 38fac5905e Move llvm/test/CodeGen/X86/clear-liverange-spillreg.mir to SystemZ. It was in wrong place.
llvm-svn: 313218
2017-09-14 00:03:23 +00:00
Jonas Paulsson fc4f323ac1 [SystemZ] Add the CoveredBySubRegs bit to GPR64, GPR128 and FPR128 registers.
This bit is needed in order for the CalleeSavedRegs list to automatically
include the super registers if all of their subregs are present.

Thanks to Wei Mi for initially indicating this deficiency in the SystemZ
backend.

Review: Ulrich Weigand.
https://bugs.llvm.org/show_bug.cgi?id=34550

llvm-svn: 313023
2017-09-12 12:11:29 +00:00
Jonas Paulsson 57a705d9d0 [SystemZ, MachineScheduler] Improve post-RA scheduling.
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.

The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.

Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between to regions inside MBB.

Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053

llvm-svn: 311072
2017-08-17 08:33:44 +00:00
Mikael Holmen 8b10680922 [IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert*
Summary:
This fixes PR32721 in IfConvertTriangle and possible similar problems in
IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond.

In PR32721 we had a triangle

   EBB
   | \
   |  |
   | TBB
   |  /
   FBB

where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).

The edge updating code and branch probability updating code is now pushed
into MergeBlocks() which allows us to share the same update logic between
more callsites. This lets us remove several dependencies on analyzeBranch
and completely eliminate RemoveExtraEdges.

One thing that showed up with this patch was that IfConversion sometimes
left a successor with 0% probability even if there was no branch or
fallthrough to the successor.

One such example from the test case ifcvt_bad_zero_prob_succ.mir. The
indirect branch tBRIND can only jump to bb.1, but without the patch we
got:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000), %bb.2(0x00000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

  bb.2:

There is no way to jump from bb.1 to bb2, but still there is a 0% edge
from bb.1 to bb.2.

With the patch applied we instead get the expected:

  bb.0:
    successors: %bb.1(0x80000000)

  bb.1:
    successors: %bb.1(0x80000000)
    tBRIND %r1, 1, %cpsr
    B %bb.1

Since bb.2 had no predecessor at all, it was removed.

Several testcases had to be updated due to this since the removed
successor made the "Branch Probability Basic Block Placement" pass
sometimes place blocks in a different order.

Finally added a couple of new test cases:

* PR32721_ifcvt_triangle_unanalyzable.mir:
  Regression test for the original problem dexcribed in PR 32721.

* ifcvt_triangleWoCvtToNextEdge.mir:
  Regression test for problem that caused a revert of my first attempt
  to solve PR 32721.

* ifcvt_simple_bad_zero_prob_succ.mir:
  Test case showing the problem where a wrong successor with 0% probability
  was previously left.

* ifcvt_[diamond|forked_diamond|simple]_unanalyzable.mir
  Very simple test cases for the simple and (forked) diamond cases
  involving unanalyzable branches that can be nice to have as a base if
  wanting to write more complicated tests.

Reviewers: iteratee, MatzeB, grosser, kparzysz

Reviewed By: kparzysz

Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34099

llvm-svn: 310697
2017-08-11 06:57:08 +00:00
Evgeny Stupachenko c675290680 Reapply fix PR23384 (part 3 of 3) r304824 (was reverted in r305720).
The root cause of reverting was fixed - PR33514.

Summary:
The patch makes instruction count the highest priority for
 LSR solution for X86 (previously registers had highest priority).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D30562

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>
llvm-svn: 310289
2017-08-07 19:56:34 +00:00
Ulrich Weigand a11f63a952 [SystemZ] Add support for 128-bit atomic load/store/cmpxchg
This adds support for the main 128-bit atomic operations,
using the SystemZ instructions LPQ, STPQ, and CDSG.

Generating these instructions is a bit more complex than usual
since the i128 type is not legal for the back-end.  Therefore,
we have to hook the LowerOperationWrapper and ReplaceNodeResults
TargetLowering callbacks.

llvm-svn: 310094
2017-08-04 18:57:58 +00:00
Ulrich Weigand 02f1c02c27 [SystemZ] Eliminate unnecessary serialization operations
We currently emit a serialization operation (bcr 14, 0) before every
atomic load and after every atomic store.  This is overly conservative.
The SystemZ architecture actually does not require any serialization
for atomic loads, and a serialization after an atomic store only if
we need to enforce sequential consistency.  This is what other compilers
for the platform implement as well.

llvm-svn: 310093
2017-08-04 18:53:35 +00:00
Jonas Paulsson be7a7e4979 [SystemZ] test update
test/CodeGen/SystemZ/loop-01.ll was incorrectly updated by r308729.

llvm-svn: 308736
2017-07-21 13:14:17 +00:00
Jonas Paulsson 024e319489 [SystemZ, LoopStrengthReduce]
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.

In order to achieve this, the following common code changes were made:

 * New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
 LSR should do instruction-based addressing evaluations by calling
 isLegalAddressingMode() with the Instruction pointers.
 * In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
 as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
 not just loads or stores.

SystemZ changes:

 * isLSRCostLess() implemented with Insns first, and without ImmCost.
 * New function supportedAddressingMode() that is a helper for TTI methods
 looking at Instructions passed via pointers.

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262
https://reviews.llvm.org/D35049

llvm-svn: 308729
2017-07-21 11:59:37 +00:00
Ulrich Weigand f2968d58cb [SystemZ] Add support for IBM z14 processor (3/3)
This adds support for the new 128-bit vector float instructions of z14.
Note that these instructions actually only operate on the f128 type,
since only each 128-bit vector register can hold only one 128-bit
float value.  However, this is still preferable to the legacy 128-bit
float instructions, since those operate on pairs of floating-point
registers (so we can hold at most 8 values in registers), while the
new instructions use single vector registers (so we hold up to 32
value in registers).

Adding support includes:
- Enabling the instructions for the assembler/disassembler.
- CodeGen for the instructions.  This includes allocating the f128
  type now to the VR128BitRegClass instead of FP128BitRegClass.
- Scheduler description support for the instructions.

Note that for a small number of operations, we have no new vector
instructions (like integer <-> 128-bit float conversions), and so
we use the legacy instruction and then reformat the operand
(i.e. copy between a pair of floating-point registers and a
vector register).

llvm-svn: 308196
2017-07-17 17:44:20 +00:00
Ulrich Weigand 33435c4c9c [SystemZ] Add support for IBM z14 processor (2/3)
This adds support for the new 32-bit vector float instructions of z14.
This includes:
- Enabling the instructions for the assembler/disassembler.
- CodeGen for the instructions, including new LLVM intrinsics.
- Scheduler description support for the instructions.
- Update to the vector cost function calculations.

In general, CodeGen support for the new v4f32 instructions closely
matches support for the existing v2f64 instructions.

llvm-svn: 308195
2017-07-17 17:42:48 +00:00
Ulrich Weigand 2b3482fe85 [SystemZ] Add support for IBM z14 processor (1/3)
This patch series adds support for the IBM z14 processor.  This part includes:
- Basic support for the new processor and its features.
- Support for new instructions (except vector 32-bit float and 128-bit float).
- CodeGen for new instructions, including new LLVM intrinsics.
- Scheduler description for the new processor.
- Detection of z14 as host processor.

Support for the new 32-bit vector float and 128-bit vector float
instructions is provided by separate patches.

llvm-svn: 308194
2017-07-17 17:41:11 +00:00
Quentin Colombet 868ef847a6 [RegAllocFast] Don't insert kill flags of super-register for partial kill
When reusing a register for a new definition, the fast register allocator
used to insert a kill flag at the previous last use of that register to
inform later passes that this register is free between the redef and the
last use. However, this may be wrong when subregisters are involved.
Indeed, a partially redef would have trigger a kill of the full super
register, potentially wrongly marking all the other subregisters as
free. Given we don't track which lanes are still live, we cannot set the
kill flag in such case.

Note: This bug has been latent for about 7 years (r104056).

llvmg.org/PR33677

llvm-svn: 307428
2017-07-07 19:25:45 +00:00