Summary: They are currently being parsed as %f14, %f16, and %f18.
Reviewers: venkatra, jyknight
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27342
llvm-svn: 288503
getTargetConstantBitsFromNode currently only extracts constant pool vector data, but it will need to be generalized to support broadcast and scalar constant pool data as well.
Converted Constant bit extraction and Bitset splitting to helper lambda functions.
llvm-svn: 288496
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106640.html
This is for a couple of reasons:
- Values of type PointerType are unlike the other SequentialTypes (arrays
and vectors) in that they do not hold values of the element type. By moving
PointerType we can unify certain aspects of how the other SequentialTypes
are handled.
- PointerType will have no place in the SequentialType hierarchy once
pointee types are removed, so this is a necessary step towards removing
pointee types.
Differential Revision: https://reviews.llvm.org/D26595
llvm-svn: 288462
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.
Differential Revision: https://reviews.llvm.org/D26594
llvm-svn: 288458
Since the spill is for the whole wave, these
don't have the swizzling problems that vector stores do
and a single 4-byte allocation is enough to spill a 64 element
register. This should reduce the number of spill instructions and
put all the spills for a register in the same cacheline.
This should save allocated private size, but for now it doesn't.
The extra slots are allocated for each component, but never used
because the frame layout is essentially finalized before frame
indices are replaced. For always using the scalar store path,
this should probably be moved into processFunctionBeforeFrameFinalized.
llvm-svn: 288445
Summary:
Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all
full COPYs between register classes of the same size that are either
spilled or refilled.
Reviewers: MatzeB, qcolombet
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D27271
llvm-svn: 288439
Move the cast<MCSymbolELF> inside emitELFSize, so that:
- it's done in one place instead of at each call
- it's more consistent with similar functions like EmitCOFFSafeSEH
- ambiguity between cast<> and dyn_cast<> is avoided (which also
eliminates an unnecessary dyn_cast call)
This also makes it easier to experiment with using ".size" directives on
non-ELF targets.
llvm-svn: 288437
Summary:
This patch fixes comparison of 64-bit atomic with its expected value in CMP_SWAP_64 expansion.
Currently, the low words are compared with CMP, while the high words are compared with SBC. SBC expects the carry flag to be set if CMP detects a difference. CMP might leave the carry unset for unequal arguments though if the first one is >= than the second. This might cause the comparison logic to detect false equality.
Example of the broken C++ code:
```
std::atomic<long long> at(2);
long long ll = 1;
std::atomic_compare_exchange_strong(&at, &ll, 3);
```
Even though the atomic `at` and the expected value `ll` are not equal and `atomic_compare_exchange_strong` returns `false`, `at` is changed to 3.
The patch replaces SBC with CMPEQ.
Reviewers: t.p.northover
Subscribers: aemerson, rengolin, llvm-commits, asl
Differential Revision: https://reviews.llvm.org/D27315
llvm-svn: 288433
This time the issue is fortunately just a simple mistake rather than a horrible
design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for
comparing two 64-bit values, but they don't.
The fix is slightly clunkier in AArch64 because we can't use conditional
execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to
an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a
CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway.
Thanks to Anton Korobeynikov for pointing out the issue.
llvm-svn: 288418
Recommitting r288293 with some extra fixes for GlobalISel code.
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.
This is a necessary step to have machine module passes work properly.
Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
because the available MachineFunction pointers are const, but the code
wants to call tidyLandingPads() in between
(markFunctionEnd()/endFunction()).
Differential Revision: https://reviews.llvm.org/D27227
llvm-svn: 288405
Now that we have fixups that only fill parts of a byte, it turns
out we have to mask off the bits outside the fixup area when
applying them. Failing to do so caused invalid object code to
be emitted for bprp with a negative 12-bit displacement.
llvm-svn: 288374
not all lakemont MCU support long nop.
we can't assume we can generate long nop by default for MCU.
Differential Revision: https://reviews.llvm.org/D26895
llvm-svn: 288363
Support a new assembler directive, .import_global, to declare imported
global variables (i.e. those with external linkage and no
initializer). The linker turns these into wasm imports.
Patch by Jacob Gravelle
Differential Revision: https://reviews.llvm.org/D26875
llvm-svn: 288296
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.
This is a necessary step to have machine module passes work properly.
Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
because the available MachineFunction pointers are const, but the code
wants to call tidyLandingPads() in between
(markFunctionEnd()/endFunction()).
Differential Revision: https://reviews.llvm.org/D27227
llvm-svn: 288293
This is per function data so it is better kept at the function instead
of the module.
This is a necessary step to have machine module passes work properly.
Differential Revision: https://reviews.llvm.org/D27185
llvm-svn: 288291
Summary:
This is preparation for ThunderX processors that have Large
System Extension (LSE) atomic instructions, but not the
other instructions introduced by V8.1a.
This will mimic changes to GCC as described here:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html
LSE instructions are: LD/ST<op>, CAS*, SWP
Reviewers: t.p.northover, echristo, jmolloy, rengolin
Subscribers: aemerson, mehdi_amini
Differential Revision: https://reviews.llvm.org/D26621
llvm-svn: 288279
No test case necessary as the problematic condition is checked with the
newly introduced assertAllSuperRegsMarked() function.
Differential Revision: https://reviews.llvm.org/D26648
llvm-svn: 288277
Summary:
When computing useful bits for a BFM instruction, we need
to take into consideration the case where both operands
of the BFM are equal and provide data that we need to track.
Not doing this can cause us to miss useful bits.
Fixes PR31138 (https://llvm.org/bugs/show_bug.cgi?id=31138)
Reviewers: t.p.northover, jmolloy
Subscribers: evandro, gberry, srhines, pirama, mcrosier, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D27130
llvm-svn: 288253
Initial support for target shuffle constant folding in cases where all shuffle inputs are constant. We may be able to relax this and merge shuffles with only some constant inputs in the future.
I've added the helper function getTargetConstantBitsFromNode (based off a similar function in X86ShuffleDecodeConstantPool.cpp) that could be reused for other cases requiring constant vector extraction.
Differential Revision: https://reviews.llvm.org/D27220
llvm-svn: 288250
This patch corresponds to review:
https://reviews.llvm.org/D26023
This patch adds support for converting a vector of loads into a single load if
the loads are consecutive (in either direction).
llvm-svn: 288219
This patch corresponds to review:
https://reviews.llvm.org/D25980
This is the 2nd patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector
of fp-to-int conversions into an fp-to-int conversion of a build vector of fp
values. For example:
Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...)
Into (fp_to_[su]i (build_vector $A, $B, ...))).
Which is a natural match for much cleaner code.
llvm-svn: 288218
Summary: Previously 0 and -1 was matched via tablegen rules. But this could cause problems where a physical register was being used where a virtual register was expected (seen in optimizeSelect and TwoAddressInstructionPass). Instead follow AArch64 and match in DAGToDAGISel.
Reviewers: eliben, majnemer
Subscribers: llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D27171
llvm-svn: 288215
This commit caused some miscompiles that did not show up on any of the bots.
Reverting until we can investigate the cause of those failures.
llvm-svn: 288214
This is not in the list of valid inputs for the encoding.
When spilling, copies from exec can be folded directly
into the spill instruction which results in broken
stores.
This only fixes the operand constraints, more codegen
work is required to avoid emitting the invalid
spills.
This sort of breaks the dbg.value test. Because the
register class of the s_load_dwordx2 changes, there
is a copy to SReg_64, and the copy is the operand
of dbg_value. The copy is later dead, and removed
from the dbg_value.
llvm-svn: 288191
Use vaddr/vdst for the same purposes.
This also fixes a beg in SIInsertWaits for the
operand check. The stored value operand is currently called
data0 in the single offset case, not data.
llvm-svn: 288188
It isn't generally safe to fold the frame index
directly into the operand since it will possibly
not be an inline immediate after it is expanded.
This surprisingly seems to produce better code, since
the FI doesn't prevent folding other immediate operands.
llvm-svn: 288185
Change the logic for when to fold immediates to
consider the destination operand rather than the
source of the materializing mov instruction.
No change yet, but this will allow for correctly handling
i16/f16 operands. Since 32-bit moves are used to materialize
constants for these, the same bitvalue will not be in the
register.
llvm-svn: 288184
Summary:
In AArch64InstrInfo::foldMemoryOperandImpl, catch more cases where the
COPY being spilled is copying from WZR/XZR, but the source register is
not in the COPY destination register's regclass.
For example, when spilling:
%vreg0 = COPY %XZR ; %vreg0:GPR64common
without this change, the code in TargetInstrInfo::foldMemoryOperand()
and canFoldCopy() that normally handles cases like this would fail to
optimize since %XZR is not in GPR64common. So the spill code generated
would be:
%vreg0 = COPY %XZR
STR %vreg
instead of the new code generated:
STR %XZR
Reviewers: qcolombet, MatzeB
Subscribers: mcrosier, aemerson, t.p.northover, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D26976
llvm-svn: 288176