Standard load/store instructions with GLC bit set.
Reviewers: tstellardAMD, arsenm
Differential Revision: http://reviews.llvm.org/D18760
llvm-svn: 265709
The pass walk through the machine function and assign the register banks
using the default mapping. In other words, there is no attempt to reduce
cross register copies.
llvm-svn: 265707
the mapping of an instruction on register bank.
For most instructions, it is possible to guess the mapping of the
instruciton by using the encoding constraints.
It remains instructions without encoding constraints.
For copy-like instructions, we try to propagate the information we get
from the other operands. Otherwise, the target has to give this
information.
llvm-svn: 265703
helper class.
The default constructor creates invalid (isValid() == false) instances
and may be used to communicate that a mapping was not found.
llvm-svn: 265699
A virtual register may have either a register bank or a register class.
This is represented by a PointerUnion between the related classes.
Typically, a virtual register went through the following states
regarding register class and register bank:
1. Creation: None is set. Virtual registers are fully generic.
2. Register bank assignment: Register bank is set. Virtual registers
live into a register bank, but we do not know the constraints they need
to fulfil.
3. Instruction selection: Register class is set. Virtual registers are
bound by encoding constraints.
To map these states to GlobalISel, the IRTranslator implements #1,
RegBankSelect #2, and Select #3.
llvm-svn: 265696
A draft line added to release notes for PPC, to keep a record of changes.
This is just a draft and will be rewritten towards the end of release.
llvm-svn: 265694
Return is now considered a predicable instruction, and is converted
to a newly-added CondReturn (which maps to BCR to %r14) instruction by
the if conversion pass.
Also, fused compare-and-branch transform knows about conditional
returns, emitting the proper fused instructions for them.
This transform triggers on a *lot* of tests, hence the huge diffstat.
The changes are mostly jX to br %r14 -> bXr %r14.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D17339
llvm-svn: 265689
As suggested by Chandler in his review comments for D18662, this
follow-on patch renames some variables in GetLoadValueForLoad and
CoerceAvailableValueToLoadType to hopefully make it more obvious
which variables hold value sizes and which hold load/store sizes.
No functional change intended.
llvm-svn: 265687
Follow-up to D18775 and related clang change. AtomicOrdering is a lattice, 'stronger' is the right thing to do, direct comparison is fraught with peril.
llvm-svn: 265685
When GVN wants to re-interpret an already available value in a smaller
type, it needs to right-shift the value on big-endian systems to ensure
the correct bytes are accessed. The shift value is the difference of
the sizes of the two types.
This is correct as long as both types occupy multiples of full bytes.
However, when one of them is a sub-byte type like i1, this no longer
holds true: we still need to shift, but only to access the correct
*byte*. Accessing bits within the byte requires no shift in either
endianness; e.g. an i1 resides in the least-significant bit of its
containing byte on both big- and little-endian systems.
Therefore, the appropriate shift value to be used is the difference of
the *storage* sizes of the two types. This is already handled correctly
in one place where such a shift takes place (GetStoreValueForLoad), but
is incorrect in two other places: GetLoadValueForLoad and
CoerceAvailableValueToLoadType.
This patch changes both places to use the storage size as well.
Differential Revision: http://reviews.llvm.org/D18662
llvm-svn: 265684
http://reviews.llvm.org/D18562
A large number of testcases has been modified so they pass after this test.
One testcase is deleted, because I realized even after undoing the original
change that was committed with this testcase, the testcase still passes. So
I removed it. The change to one other testcase (test/CodeGen/PowerPC/pr25802.ll)
is an arbitrary change to keep it passing. Given the original intention of the
testcase, and the fact that fixing it will require some time to change the testcase,
we concluded that this quick change will be enough.
llvm-svn: 265683
Summary: This makes it possible to insert nops at the end of blocks.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18549
llvm-svn: 265678
For VGPR_32 operand disassembler expects a VGPR register encoded as 0..255 (enum8 src operand).
readfirstlane/readline actually has enum9 operand and this change fixes VGPR_32 to VS_32 (enum9 encoding).
Differential Revision: http://reviews.llvm.org/D18696
llvm-svn: 265670
This patch add support for GCC attribute((ifunc("resolver"))) for
targets that use ELF as object file format. In general ifunc is a
special kind of function alias with type @gnu_indirect_function. Patch
for Clang http://reviews.llvm.org/D15524
Differential Revision: http://reviews.llvm.org/D15525
llvm-svn: 265667
`sys/types.h` has a related define in `config.h.cmake`, but was never
checked for in CMake. Sync this.
Differential Revision: http://reviews.llvm.org/D18825
llvm-svn: 265648
Reenable reverted r265550 with endianness issue fixed. Variables of
endian-aware types such as ulittle32_t should be explicitly casted
to their natural equivalent types before passing it as vararg to
printf like functions (format in my case). Added lit config file
depending on AMDGPU target as the testcase uses assembler.
Differential revision: http://reviews.llvm.org/D16998
llvm-svn: 265645
The iterators of SmallPtrSet SpillsInSubTreeMap[Child].first may be
invalidated when SpillsInSubTreeMap grows. Rearrange the code to
ensure the grow of SpillsInSubTreeMap only happens before getting
the iterators of the SmallPtrSet.
llvm-svn: 265639
Re-apply r265450 which caused PR27245 and was reverted in r265559
because of a wrong generalization: the fetch_and_add->add_and_fetch
combine only works in specific, but pretty common, cases:
(icmp slt x, 0) -> (icmp sle (add x, 1), 0)
(icmp sge x, 0) -> (icmp sgt (add x, 1), 0)
(icmp sle x, 0) -> (icmp slt (sub x, 1), 0)
(icmp sgt x, 0) -> (icmp sge (sub x, 1), 0)
Original Message:
We only generate LOCKed versions of add/sub when the result is unused.
It often happens that the result is used, but only by a comparison. We
can optimize those out by reusing EFLAGS, which lets us use the proper
instructions, instead of having to fallback to LXADD.
Instead of doing this as an MI peephole (as we do for the other
non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite
tricky later.
This also makes it eventually possible to stop expanding and/or/xor
if the only user is an icmp (also see D18141).
This uses the LOCK ISD opcodes added by r262244.
Differential Revision: http://reviews.llvm.org/D17633
llvm-svn: 265636
Remove the assertion that disallowed the combination, since
RF_IgnoreMissingLocals should have no effect on globals. As it happens,
RF_NullMapMissingGlobalValues asserted in MapValue(Constant*,...), so I
also changed a cast to a cast_or_null to get my test passing.
llvm-svn: 265633
Start treating LocalAsMetadata similarly to function-local members of
the Value hierarchy in MapValue and MapMetadata.
- Don't memoize them.
- Return nullptr if they are missing.
This also cleans up ConstantAsMetadata to stop listening to the
RF_IgnoreMissingLocals flag.
llvm-svn: 265631
Clarify what this RemapFlag actually means.
- Change the flag name to match its intended behaviour.
- Clearly document that it's not supposed to affect globals.
- Add a host of FIXMEs to indicate how to fix the behaviour to match
the intent of the flag.
RF_IgnoreMissingLocals should only affect the behaviour of
RemapInstruction for function-local operands; namely, for operands of
type Argument, Instruction, and BasicBlock. Currently, it is *only*
passed into RemapInstruction calls (and the transitive MapValue calls
that it makes).
When I split Metadata from Value I didn't understand the flag, and I
used it in a bunch of places for "global" metadata.
This commit doesn't have any functionality change, but prepares to
cleanup MapMetadata and MapValue.
llvm-svn: 265628
getInstrMapping.
This implementation requires that the target implemented
getRegBankFromRegClass.
Indeed, the implementation uses the register classes for the encoding
constraints for the instructions to deduce the mapping of a value.
llvm-svn: 265624
Third time's the charm? The previous attempt (r265345) caused ASan test
failures on X86, as broken CFI caused stack traces to not work.
This version of the patch makes sure not to merge with stack adjustments
that have CFI, and to not add merged instructions' offests to the CFI
about to be generated.
This is already covered by the lit tests; I just got the expectations
wrong previously.
llvm-svn: 265623