Summary:
A *lot* of instructions have this special `NoRegister` register.
It seems this never really worked, but I finally noticed it only
because it happened to break for the `CMOV16rm` instruction.
We serialized that register as "" (an empty string), which is naturally
ignored during deserialization, so we re-created an `MCInst` with
too few operands.
When we then happened to resolve the variant sched class for this
mis-serialized instruction, the variant predicate tried to read an
operand that was out of bounds (since we had too few operands), and
we crashed.
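As a minimal standalone sketch of the failure mode (illustrative helper
names, not the llvm-exegesis code): joining operand names and then
splitting while skipping empty tokens silently drops the empty-named
register, so the round trip comes back with one operand too few.
```
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Join operand names with ','.
static std::string serialize(const std::vector<std::string> &Ops) {
  std::string Out;
  for (size_t I = 0; I < Ops.size(); ++I)
    Out += (I ? "," : "") + Ops[I];
  return Out;
}

// Split on ',', skipping empty tokens -- this is where the operand is lost.
static std::vector<std::string> deserialize(const std::string &S) {
  std::vector<std::string> Ops;
  std::stringstream SS(S);
  for (std::string Tok; std::getline(SS, Tok, ',');)
    if (!Tok.empty())
      Ops.push_back(Tok);
  return Ops;
}

int main() {
  std::vector<std::string> Ops = {"AX", "", "EFLAGS"}; // "" = NoRegister
  assert(deserialize(serialize(Ops)).size() == 2);     // one operand lost!
}
```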
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=41448 | PR41448 ]].
Reviewers: craig.topper, courbet
Reviewed By: courbet
Subscribers: tschuett, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60517
llvm-svn: 358153
Because gp = sdata_start_address + 0x800, gp with a signed twelve-bit offset
can cover most of the small data section. Linker relaxation can then rewrite
multi-instruction data accesses into single instructions that use gp as the
base with a signed twelve-bit offset.
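A quick standalone check of the arithmetic (the section address is invented
for illustration): with gp placed 0x800 bytes past the start of .sdata, the
signed twelve-bit range [-2048, 2047] reaches the full first 4 KiB of the
section.
```
#include <cstdint>
#include <cstdio>

int main() {
  const uint32_t SDataStart = 0x12000; // hypothetical .sdata address
  const uint32_t GP = SDataStart + 0x800;
  // A signed twelve-bit immediate spans [-2048, 2047].
  printf("gp-reachable: [0x%x, 0x%x]\n", GP - 2048, GP + 2047);
  // => [0x12000, 0x12fff]: the first 4 KiB of the small data section.
}
```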
Differential Revision: https://reviews.llvm.org/D57493
llvm-svn: 358150
Summary:
Make DW_LNS_copy set the discriminator register to 0, to conform to
DWARF 4 & 5: "Then it sets the discriminator register to 0, and sets the
basic_block, prologue_end and epilogue_begin registers to false."
Because all of DW_LNE_end_sequence, DW_LNS_copy, and special opcodes reset
discriminator to 0, we can move discriminator=0 to appendRowToMatrix.
Also, make DW_LNS_copy print before appending the row, as it is similar
to an address+=0,line+=0 special opcode, which prints before appending
the row.
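A minimal sketch of the resulting structure (illustrative, not the actual
DWARFDebugLine code): the shared discriminator reset lives in
appendRowToMatrix, and DW_LNS_copy clears its boolean registers after
appending.
```
#include <cstdint>
#include <vector>

struct Row {
  uint64_t Address = 0;
  uint32_t Line = 1;
  uint32_t Discriminator = 0;
  bool BasicBlock = false, PrologueEnd = false, EpilogueBegin = false;
};

struct LineTableState {
  Row R;
  std::vector<Row> Matrix;

  // DW_LNE_end_sequence, DW_LNS_copy, and special opcodes all append a
  // row, so the discriminator reset they share lives here.
  void appendRowToMatrix() {
    Matrix.push_back(R);
    R.Discriminator = 0;
  }

  void onCopy() { // DW_LNS_copy
    appendRowToMatrix(); // print/append first, like a special opcode
    R.BasicBlock = R.PrologueEnd = R.EpilogueBegin = false;
  }
};
```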
Reviewers: dblaikie, probinson, aprantl
Reviewed By: dblaikie
Subscribers: danielcdh, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60364
llvm-svn: 358148
If the ObjectSizeOffsetEvaluator fails to fold the object size call, then it may
litter some unused instructions in the function. When done repeatedly in
InstCombine, this results in an infinite loop. Fix this by tracking the set of
instructions that were inserted, then removing them on failure.
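A hedged sketch of the tracking approach (an illustration of the idea, not
the exact patch), using LLVM's IRBuilderCallbackInserter to record everything
the builder inserts:
```
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/TargetFolder.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

Value *evaluateObjectSize(Function &F, const DataLayout &DL) {
  SmallVector<Instruction *, 8> InsertedInstructions;
  IRBuilder<TargetFolder, IRBuilderCallbackInserter> Builder(
      F.getContext(), TargetFolder(DL),
      IRBuilderCallbackInserter(
          [&](Instruction *I) { InsertedInstructions.push_back(I); }));

  Value *Size = nullptr; // ... try to fold the call using Builder ...

  if (!Size) {
    // The fold failed: erase everything we inserted, in reverse order
    // so users are removed before their operands.
    for (Instruction *I : llvm::reverse(InsertedInstructions))
      I->eraseFromParent();
  }
  return Size;
}
```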
rdar://49172227
Differential revision: https://reviews.llvm.org/D60298
llvm-svn: 358146
foldMaskedShiftToScaledMask tries to reorder an `and` and a `shl` to enable the `shl` to fold into an LEA. But if there is an any_extend between them, it doesn't work.
This patch modifies the code to look through an any_extend from i32 to i64 when the `and` mask only uses bits that weren't from the extended part.
This will prevent a regression from D60358 caused by a 64-bit SHL being narrowed to 32 bits when its upper bits aren't demanded.
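A standalone numeric check of the underlying identity (constants invented for
illustration; any_extend is modeled as zext, which is fine because the mask
kills the extended bits):
```
#include <cassert>
#include <cstdint>

int main() {
  const unsigned C1 = 3;     // shift amount
  const uint64_t C2 = 0xFF8; // mask with no bits in the extended part
  for (uint32_t Y = 0; Y < 1000000; ++Y) {
    // and(any_extend(shl(Y, C1)), C2)
    uint64_t Before = ((uint64_t)(uint32_t)(Y << C1)) & C2;
    // shl(any_extend(and(Y, C2 >> C1)), C1)
    uint64_t After = ((uint64_t)(Y & (uint32_t)(C2 >> C1))) << C1;
    assert(Before == After);
  }
}
```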
Differential Revision: https://reviews.llvm.org/D60532
llvm-svn: 358139
Many of our instructions have both a _Int form used by intrinsics and a form
used by other IR constructs. In the EVEX space the _Int versions usually cover
all the capabilities, including broadcasting and rounding, while the other
version only covers simple register/register or register/load forms. For this
reason, in EVEX the non-intrinsic form is usually marked isCodeGenOnly=1.
In the VEX encoding space we were less consistent, but usually the _Int version
was the isCodeGenOnly version.
This commit makes the VEX instructions match the EVEX instructions. This was
done by manually studying the AsmMatcher table, so it's possible I missed some
cases, but we should be closer now.
I'm thinking about using the isCodeGenOnly bit to simplify the EVEX2VEX
tablegen code that disambiguates the _Int and non-_Int versions. Currently it
checks register class sizes and the Record that the memory operands come from.
I have some other changes I was looking into for D59266 that may break the
memory check.
I had to make a few scheduler hacks to keep the _Int versions from being
treated differently than the non-_Int versions.
Differential Revision: https://reviews.llvm.org/D60441
llvm-svn: 358138
If we have an (add X, (and (aext (shl Y, C1)), C2)), we can pull the shift
through the and and aext to fold it into an LEA, assuming C1 is small enough
and C2 masks off all of the extended bits.
This pattern showed up in D60358, and we need to handle it to prevent a
regression.
llvm-svn: 358124
Summary:
```
``!listsplat(a, size)``
A list value that contains the value ``a`` ``size`` times.
Example: ``!listsplat(0, 2)`` results in ``[0, 0]``.
```
I plan to use this in X86ScheduleBdVer2.td for LoadRes handling.
This is a little bit controversial because, unlike for every other binary
operator, the operand types aren't identical.
Reviewers: stoklund, javed.absar, nhaehnle, craig.topper
Reviewed By: javed.absar
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60367
llvm-svn: 358117
This adds a simple extra test for constant hoisting to show its
usefulness with constant addresses like those of memory-mapped
registers in embedded systems.
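For context, a hedged illustration of the kind of code such a test targets
(the addresses and register layout are invented): several stores to nearby
memory-mapped registers share one expensive constant base that hoisting can
materialize once.
```
#include <cstdint>

static inline volatile uint32_t *reg(uintptr_t Addr) {
  return reinterpret_cast<volatile uint32_t *>(Addr);
}

void configurePeripheral() {
  *reg(0x40020000) = 0x1; // control register (hypothetical base)
  *reg(0x40020004) = 0x2; // mode register    (base + 4)
  *reg(0x40020008) = 0x3; // speed register   (base + 8)
  // Hoisting 0x40020000 turns these into base+offset accesses.
}
```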
llvm-svn: 358114
Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not
seeing through to the constant in other blocks. Revert this patch while we come
up with a better way to handle that.
I will try to follow this up with some better tests.
llvm-svn: 358113
This fixes a regression from https://reviews.llvm.org/D60354. We used to
  SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN);
  if (Symbol) {
    Symbol->Name = QN;
  }
but changed that to
  SymbolNode *Symbol = demangleEncodedSymbol(MangledName, QN);
  if (Error)
    return nullptr;
  Symbol->Name = QN;
and one branch somewhere returned a nullptr without setting Error.
Looking at the code changed in r340083 and r340710 that branch looks
like a remnant from an earlier attempt to demangle RTTI descriptors
that has since been rewritten -- so just remove this branch. It
shouldn't change behavior for correctly mangled symbols.
llvm-svn: 358112
This patch teaches getTestBitOperand to look through ANY_EXTENDs when the extended bits aren't used. The test case changed here is based on what D60358 did to test16 in tbz-tbnz.ll, so this patch will avoid that regression.
Differential Revision: https://reviews.llvm.org/D60482
llvm-svn: 358108
Following D60483 and D60497, this adds support for AlwaysOverflows
handling for ssubo. This is the last case we can handle right now.
Differential Revision: https://reviews.llvm.org/D60518
llvm-svn: 358100
ssubo X, C is equivalent to saddo X, -C. Make the transformation in
InstCombine and allow the logic implemented for saddo to fold prior
usages of add nsw or sub nsw with constants.
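A standalone numeric sanity check of the identity (note that -C must itself
be representable, so C == INT_MIN is excluded):
```
#include <cassert>
#include <climits>

int main() {
  const int C = 42; // any constant other than INT_MIN
  for (long long X = INT_MIN; X <= INT_MAX; X += 9973) {
    int R1, R2;
    bool Ov1 = __builtin_sub_overflow((int)X, C, &R1);  // ssubo X, C
    bool Ov2 = __builtin_add_overflow((int)X, -C, &R2); // saddo X, -C
    assert(Ov1 == Ov2 && R1 == R2);
  }
}
```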
Patch by Dan Robertson.
Differential Revision: https://reviews.llvm.org/D60061
llvm-svn: 358099
Summary: The fp16 scalar versions of facge and facgt require custom pattern matching, as the result type is not the same width as the operands.
Reviewers: olista01, javed.absar, pbarrio
Reviewed By: javed.absar
Subscribers: kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60212
llvm-svn: 358083
Summary:
When calculating the debug value history, DbgEntityHistoryCalculator
would only keep track of register clobbering for the latest debug value
per inlined entity. This meant that preceding register-described debug
value fragments would live on until the next overlapping debug value,
ignoring any potential clobbering. This patch amends
DbgEntityHistoryCalculator so that it keeps track of all registers that
an inlined entity's currently live debug values are described by.
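A hedged sketch of the bookkeeping change (names and containers are
illustrative, not the actual LLVM data structures): each entity maps to the
set of registers describing its live fragments, and a clobber only ends the
values described by the clobbered register.
```
#include <map>
#include <set>

// (DIVariable, inlined-at location) in the real code; opaque here.
using InlinedEntity = const void *;

std::map<InlinedEntity, std::set<unsigned>> DescribingRegs;

void onDebugValue(InlinedEntity Entity, unsigned Reg) {
  // Track *every* register a live fragment is described by, not just
  // the register of the most recent debug value.
  DescribingRegs[Entity].insert(Reg);
}

void onClobber(unsigned Reg) {
  // End only the fragments described by Reg; other live fragments of
  // the same entity keep their locations.
  for (auto &Entry : DescribingRegs)
    Entry.second.erase(Reg);
}
```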
The DebugInfo/COFF/pieces.ll test case has had to be changed since
previously a register-described fragment would incorrectly outlive its
basic block.
The parent patch D59941 is expected to increase the coverage slightly,
as it makes sure that location list entries are inserted after clobbered
fragments, and this patch is expected to decrease it, as it stops
preceding register-described fragments from living longer than they
should. All in all, this patch and the preceding patch have a negligible
effect on the output from `llvm-dwarfdump -statistics` for a clang-3.4
binary built using the RelWithDebInfo build profile. "Scope bytes
covered" increases by 0.5%, and "variables with location" increases from
2212083 to 2212088, but it should improve the accuracy quite a bit.
This fixes PR40283.
Reviewers: aprantl, probinson, dblaikie, rnk, bjope
Reviewed By: aprantl
Subscribers: llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D59942
llvm-svn: 358073
Summary:
Currently the DbgValueHistorymap only keeps track of clobbered registers
for the last debug value that it has encountered. This could lead to
preceding register-described debug values living on longer in the
location lists than they should. See PR40283 for an example. This
patch does not introduce tracking of multiple registers, but changes
the DbgValueHistoryMap structure to allow for that in a follow-up
patch. This patch is not NFC, as it at least fixes two bugs in
DwarfDebug (both are covered in the new clobbered-fragments.mir test):
* If a debug value was clobbered (its End pointer set), the value would
still be added to OpenRanges, meaning that the succeeding location list
entries could potentially contain stale values.
* If a debug value was clobbered, and there were non-overlapping
fragments that were still live after the clobbering, DwarfDebug would
not create a location list entry starting directly after the
clobbering instruction. This meant that the location list could have
a gap until the next debug value for the variable was encountered.
Before this patch, the history map was represented by <Begin, End>
pairs, where a new pair was created for each new debug value. When
dealing with partially overlapping register-described debug values, such
as in the following example:
DBG_VALUE $reg2, $noreg, !1, !DIExpression(DW_OP_LLVM_fragment, 32, 32)
[...]
DBG_VALUE $reg3, $noreg, !1, !DIExpression(DW_OP_LLVM_fragment, 64, 32)
[...]
$reg2 = insn1
[...]
$reg3 = insn2
the history map would then contain the entries `[<DV1, insn1>, <DV2, insn2>]`.
This would leave it up to the users of the map to be aware of
the relative order of the instructions, which e.g. could make
DwarfDebug::buildLocationList() needlessly complex. Instead, this patch
makes the history map structure monotonically increasing by dropping the
End pointer, and replacing that with explicit clobbering entries in the
vector. Each debug value has an "end index", which if set, points to the
entry in the vector that ends the debug value. The ending entry can
either be an overlapping debug value, or an instruction which clobbers
the register that the debug value is described by. The ending entry's
instruction can thus either be excluded or included in the debug value's
range. If the end index is not set, the debug value that the entry
introduces is valid until the end of the function.
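A hedged sketch of the resulting shape (types and names are illustrative, not
the upstream ones): the history is an append-only vector, and a debug value
entry optionally records the index of the entry that ends it.
```
#include <vector>

constexpr unsigned NoEntry = ~0u;

struct Entry {
  const void *Instr; // DBG_VALUE or the instruction that ends one
  enum Kind { DbgValue, Clobber } EntryKind;
  unsigned EndIndex; // NoEntry = live until the end of the function
};

struct EntityHistory {
  std::vector<Entry> Entries; // monotonically increasing, never reordered

  unsigned startDbgValue(const void *DV) {
    Entries.push_back({DV, Entry::DbgValue, NoEntry});
    return Entries.size() - 1;
  }

  // The ending entry is itself in the vector: either an overlapping
  // debug value or an instruction clobbering the describing register.
  void endDbgValue(unsigned Idx, unsigned EnderIdx) {
    Entries[Idx].EndIndex = EnderIdx;
  }
};
```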
Changes to test cases:
* DebugInfo/X86/pieces-3.ll: The range of the first DBG_VALUE, which
describes that the fragment (0, 64) is located in RDI, was
incorrectly ended by the clobbering of RAX, which the second
(non-overlapping) DBG_VALUE was described by. With this patch we
get a second entry that only describes RDI after that clobbering.
* DebugInfo/ARM/partial-subreg.ll: This test seems to indicate a bug
in LiveDebugValues that is caused by it not being aware of fragments.
I have added some comments in the test case about that. Also, before
this patch DwarfDebug would incorrectly include a register-described
debug value from a preceding block in a location list entry.
Reviewers: aprantl, probinson, dblaikie, rnk, bjope
Reviewed By: aprantl
Subscribers: javed.absar, kristof.beyls, jdoerfert, llvm-commits
Tags: #debug-info, #llvm
Differential Revision: https://reviews.llvm.org/D59941
llvm-svn: 358072
Make it possible to TableGen code for FCONSTS and FCONSTD.
We need to make two changes to the TableGen descriptions of vfp_f32imm
and vfp_f64imm respectively:
* add GISelPredicateCode to check that the immediate fits in 8 bits;
* extract the SDNodeXForms into separate definitions and create a
GISDNodeXFormEquiv and a custom renderer function for each of them.
There's a lot of boilerplate to get the actual value of the immediate,
but it basically just boils down to calling ARM_AM::getFP32Imm or
ARM_AM::getFP64Imm.
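A hedged sketch of what one such renderer can look like (the method name is
illustrative; ARM_AM::getFP32Imm is the real helper):
```
// Sketch of a custom renderer inside the ARM instruction selector.
void renderVFPF32Imm(MachineInstrBuilder &MIB, const MachineInstr &MI) {
  assert(MI.getOpcode() == TargetOpcode::G_FCONSTANT &&
         "Expected a G_FCONSTANT");
  // The boilerplate digs the immediate out of the G_FCONSTANT...
  const APFloat &FPImm = MI.getOperand(1).getFPImm()->getValueAPF();
  // ...and then it just boils down to calling ARM_AM::getFP32Imm.
  int Enc = ARM_AM::getFP32Imm(FPImm); // -1 if not encodable in 8 bits
  assert(Enc != -1 && "Predicate should have rejected this immediate");
  MIB.addImm(Enc);
}
```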
llvm-svn: 358063
1. Use computed VF for stress testing.
2. If the computed VF does not produce vector code (VF smaller than 2), force VF to be 4.
3. Test vectorization of i64 data on AArch64 to make sure we generate VF != 4 (on X86 that was already tested on AVX).
Patch by Francesco Petrogalli <francesco.petrogalli@arm.com>
Differential Revision: https://reviews.llvm.org/D59952
llvm-svn: 358056
Check AlwaysOverflow condition for usubo. The implementation is the
same as the existing handling for uaddo and umulo. Handling for saddo
and ssubo will follow (smulo doesn't have the necessary ValueTracking
support).
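A standalone numeric illustration of the AlwaysOverflow condition for usubo
(the value ranges are invented): if the maximum possible X is still below the
minimum possible Y, X - Y always wraps, so the overflow bit folds to true.
```
#include <cassert>
#include <cstdint>

int main() {
  uint8_t R;
  // X known to be in [0, 15], Y known to be in [16, 255]:
  // usub.with.overflow(X, Y) always overflows.
  for (unsigned X = 0; X < 16; ++X)
    for (unsigned Y = 16; Y < 256; ++Y)
      assert(__builtin_sub_overflow((uint8_t)X, (uint8_t)Y, &R));
}
```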
Differential Revision: https://reviews.llvm.org/D60483
llvm-svn: 358052
Convert the retainRV marker, which was passed as named metadata, into a
module flag in the auto-upgrader, and make the ARC contract pass read
the marker as a module flag.
This is needed to fix a bug where ARC contract wasn't inserting the
retainRV marker when LTO was enabled, which caused objects returned
from a function to be auto-released.
rdar://problem/49464214
Differential Revision: https://reviews.llvm.org/D60303
llvm-svn: 358047
Years ago I moved this to an InstAlias using VR128H/VR128L. But now that we support the {vex3} pseudo prefix, we need to block the optimization when that prefix is set, to match gas behavior.
llvm-svn: 358046
Summary:
Obviously, the newly built MIs (sethi+add or sethi+xor+add) for constructing a
large offset should be inserted before the newly created MI that stores the
even register into memory.
So the insertion position should be *StMI instead of II.
Before the fix:
std %f0, [%g1+80]
sethi 4, %g1 <<<
add %g1, %sp, %g1 <<< these two instructions should be placed before "std %f0, [%g1+80]".
sethi 4, %g1
add %g1, %sp, %g1
std %f2, [%g1+88]
After the fix:
sethi 4, %g1
add %g1, %sp, %g1
std %f0, [%g1+80]
sethi 4, %g1
add %g1, %sp, %g1
std %f2, [%g1+88]
Reviewers: venkatra, jyknight
Reviewed By: jyknight
Subscribers: jyknight, fedor.sergeev, jrtc27, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60397
llvm-svn: 358042
The EVEX versions are ambiguous with the VEX versions based on operands alone,
so we had explicitly dropped them from the AsmMatcher table. Unfortunately,
when we add them back they incorrectly show up in the table before their VEX
counterparts. This is different from how the prioritization normally works.
To fix this we have to explicitly reject the instructions unless the {evex}
prefix has been seen.
llvm-svn: 358041
The selection for G_ICMP is unfortunately not currently importable from SDAG
due to the use of custom SDNodes. To support this, this selection method has an
opcode table which has been generated by a script, indexed by various
instruction properties. Ideally, in the future we will have GISel-native
selection patterns that we can write in TableGen to improve on this.
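A hedged sketch of the table-driven idea (the indexing scheme and table
contents here are invented; the real table is script-generated and much
larger): instruction properties index straight into a table of target
opcodes.
```
// Sketch only: pick a compare opcode from properties of the G_ICMP.
static const unsigned CmpOpcTable[2][2] = {
    // 32-bit op       64-bit op
    {AArch64::SUBSWrr, AArch64::SUBSXrr}, // register-register form
    {AArch64::SUBSWri, AArch64::SUBSXri}, // register-immediate form
};

static unsigned selectICmpOpc(bool ImmForm, bool Is64Bit) {
  return CmpOpcTable[ImmForm][Is64Bit];
}
```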
For selection of some types we also need support for G_ASHR and G_SHL which are
generated as a result of legalization. This patch also adds support for them,
generating the same code as SelectionDAG currently does.
Differential Revision: https://reviews.llvm.org/D60436
llvm-svn: 358035
Some types of values are required to be passed as different register types.
E.g. <2 x i16> may need to be passed as a larger <2 x i32> type, so formal
argument lowering needs to be able to truncate it back. Likewise, when dealing
with returns of these types, they need to be widened back in the appropriate
way.
Differential Revision: https://reviews.llvm.org/D60425
llvm-svn: 358032