Commit Graph

26 Commits

Author SHA1 Message Date
Jonas Paulsson fdc4ea34e3 [SystemZ, RegAlloc] Favor 3-address instructions during instruction selection.
This patch aims to reduce spilling and register moves by using the 3-address
versions of instructions per default instead of the 2-address equivalent
ones. It seems that both spilling and register moves are improved noticeably
generally.

Regalloc hints are passed to increase conversions to 2-address instructions
which are done in SystemZShortenInst.cpp (after regalloc).

Since the SystemZ reg/mem instructions are 2-address (dst and lhs regs are
the same), foldMemoryOperandImpl() can no longer trivially fold a spilled
source register since the reg/reg instruction is now 3-address. In order to
remedy this, new 3-address pseudo memory instructions are used to perform the
folding only when the dst and lhs virtual registers are known to be allocated
to the same physreg. In order to not let MachineCopyPropagation run and
change registers on these transformed instructions (making it 3-address), a
new target pass called SystemZPostRewrite.cpp is run just after
VirtRegRewriter, that immediately lowers the pseudo to a target instruction.

If it would have been possibe to insert a COPY instruction and change a
register operand (convert to 2-address) in foldMemoryOperandImpl() while
trusting that the caller (e.g. InlineSpiller) would update/repair the
involved LiveIntervals, the solution involving pseudo instructions would not
have been needed. This is perhaps a potential improvement (see Phabricator
post).

Common code changes:

* A new hook TargetPassConfig::addPostRewrite() is utilized to be able to run a
target pass immediately before MachineCopyPropagation.

* VirtRegMap is passed as an argument to foldMemoryOperand().

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D60888

llvm-svn: 362868
2019-06-08 06:19:15 +00:00
Ulrich Weigand 9dd23b8433 [SystemZ] Test case formatting fixes
Fix systematically wrong whitespace from a prior automated change.

NFC.

llvm-svn: 337542
2018-07-20 12:12:10 +00:00
Ulrich Weigand c3ec80fea1 [SystemZ] Handle SADDO et.al. and ADD/SUBCARRY
This provides an optimized implementation of SADDO/SSUBO/UADDO/USUBO
as well as ADDCARRY/SUBCARRY on top of the new CC implementation.

In particular, multi-word arithmetic now uses UADDO/ADDCARRY instead
of the old ADDC/ADDE logic, which means we no longer need to use
"glue" links for those instructions.  This also allows making full
use of the memory-based instructions like ALSI, which couldn't be
recognized due to limitations in the DAG matcher previously.

Also, the llvm.sadd.with.overflow et.al. intrinsincs now expand to
directly using the ADD instructions and checking for a CC 3 result.

llvm-svn: 331203
2018-04-30 17:54:28 +00:00
Ulrich Weigand fb56686cd3 [SystemZ] Improve handling of Select pseudo-instructions
If we have LOCR instructions, select them directly from SelectionDAG
instead of first going through a pseudo instruction and then using
the custom inserter to emit the LOCR.

Provide Select pseudo-instructions for VR32/VR64 if we have vector
instructions, to avoid having to go through the first 16 FPRs
unnecessarily.

If we do not have LOCFHR, prefer using LOCR followed by a move
over a conditional branch.

llvm-svn: 331191
2018-04-30 15:49:27 +00:00
Kyle Butt efe56fed12 Revert "CodeGen: Allow small copyable blocks to "break" the CFG."
This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded.

This needs a simple probability check because there are some cases where it is
not profitable.

llvm-svn: 291695
2017-01-11 19:55:19 +00:00
Kyle Butt df27aa8c89 CodeGen: Allow small copyable blocks to "break" the CFG.
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.

Differential revision: https://reviews.llvm.org/D27742

llvm-svn: 291609
2017-01-10 23:04:30 +00:00
Jonas Paulsson 28fa48de32 [SystemZ] CodeGen/SystemZ/asm-18.ll run with -verify-machineinstrs
Relates to the fixes of r249811.

llvm-svn: 249946
2015-10-10 07:20:23 +00:00
David Blaikie a79ac14fa6 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

llvm-svn: 230794
2015-02-27 21:17:42 +00:00
David Blaikie 79e6c74981 [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

llvm-svn: 230786
2015-02-27 19:29:02 +00:00
Ulrich Weigand bd039299c0 Use the integrated assembler as default on SystemZ
This was already done in clang, this commit now uses the integrated
assembler as default when using LLVM tools directly.

A number of test cases deliberately using an invalid instruction in
inline asm now have to use -no-integrated-as.

llvm-svn: 225820
2015-01-13 19:45:16 +00:00
Richard Sandiford b63e300b67 [SystemZ] Add comparisons of high words and memory
llvm-svn: 191777
2013-10-01 15:00:44 +00:00
Richard Sandiford a9ac0e0f75 [SystemZ] Add comparisons of large immediates using high words
There are no corresponding patterns for small immediates because they would
prevent the use of fused compare-and-branch instructions.

llvm-svn: 191775
2013-10-01 14:56:23 +00:00
Richard Sandiford 42a694f44e [SystemZ] Add immediate addition involving high words
llvm-svn: 191774
2013-10-01 14:53:46 +00:00
Richard Sandiford 2cac763544 [SystemZ] Extend test-under-mask support to high GR32s
llvm-svn: 191773
2013-10-01 14:41:52 +00:00
Richard Sandiford 3ad5a15b72 [SystemZ] Extend 32-bit RISBG optimizations to high words
This involves using RISB[LH]G, whereas the equivalent z10 optimization
uses RISBG.

llvm-svn: 191770
2013-10-01 14:36:20 +00:00
Richard Sandiford 7028428c2c [SystemZ] Allow integer AND involving high words
llvm-svn: 191762
2013-10-01 14:20:41 +00:00
Richard Sandiford 5718dacbdd [SystemZ] Allow integer XOR involving high words
llvm-svn: 191759
2013-10-01 14:08:44 +00:00
Richard Sandiford 6e96ac600f [SystemZ] Allow integer OR involving high words
llvm-svn: 191755
2013-10-01 13:22:41 +00:00
Richard Sandiford 1a56931b22 [SystemZ] Allow integer insertions with a high-word destination
llvm-svn: 191753
2013-10-01 13:18:56 +00:00
Richard Sandiford 7c5c0eabc9 [SystemZ] Allow selects with a high-word destination
llvm-svn: 191751
2013-10-01 13:10:16 +00:00
Richard Sandiford 012402346f [SystemZ] Add patterns to load a constant into a high word (IIHF)
Similar to low words, we can use the shorter LLIHL and LLIHH if it turns
out that the other half of the GR64 isn't live.

llvm-svn: 191750
2013-10-01 13:02:28 +00:00
Richard Sandiford 21235a256f [SystemZ] Add register zero extensions involving at least one high word
llvm-svn: 191746
2013-10-01 12:49:07 +00:00
Richard Sandiford 5469c39a26 [SystemZ] Add truncating high-word stores (STCH and STHH)
llvm-svn: 191743
2013-10-01 12:22:49 +00:00
Richard Sandiford 0d46b1a30f [SystemZ] Add zero-extending high-word loads (LLCH and LLHH)
llvm-svn: 191742
2013-10-01 12:19:08 +00:00
Richard Sandiford 89e160d975 [SystemZ] Add sign-extending high-word loads (LBH and LHH)
llvm-svn: 191740
2013-10-01 12:11:47 +00:00
Richard Sandiford 0755c93b0c [SystemZ] Use upper words of GR64s for codegen
This just adds the basics necessary for allocating the upper words to
virtual registers (move, load and store).  The move support is parameterised
in a way that makes it easy to handle zero extensions, but the associated
zero-extend patterns are added by a later patch.

The easiest way of testing this seemed to be add a new "h" register
constraint for high words.  I don't expect the constraint to be useful
in real inline asms, but it should work, so I didn't try to hide it
behind an option.

llvm-svn: 191739
2013-10-01 11:26:28 +00:00