Commit Graph

9845 Commits

Author SHA1 Message Date
Tim Northover 4bf394be3a ARM: use correct offset from base pointer (r6) in call frame regions.
When we had dynamic call frames (i.e. sp adjustment around each call) we
were including that adjustment into offsets calculated based on r6, even
though it's only sp that changes. This led to incorrect stack slot
accesses.

llvm-svn: 348591
2018-12-07 13:43:55 +00:00
David Green ca29c271d2 [Targets] Add errors for tiny and kernel codemodel on targets that don't support them
Adds fatal errors for any target that does not support the Tiny or Kernel
codemodels by rejigging the getEffectiveCodeModel calls.

Differential Revision: https://reviews.llvm.org/D50141

llvm-svn: 348585
2018-12-07 12:10:23 +00:00
Diana Picus 1027249ec9 [ARM GlobalISel] Nothing is legal for Thumb
...yet!

A lot of the current code should be shared for arm and thumb mode, but
until we add tests and work out some of the details (e.g. checking the
correct subtarget feature for G_SDIV) it's safer to bail out as early as
possible for thumb targets.

This should have arguably been part of r348347, which allowed Thumb
functions to be handled by the IR Translator.

llvm-svn: 348472
2018-12-06 09:26:14 +00:00
Aditya Nandakumar f75d4f329c [GISel]: Provide standard interface to observe changes in GISel passes
https://reviews.llvm.org/D54980

This provides a standard API across GISel passes to observe and notify
passes about changes (insertions/deletions/mutations) to MachineInstrs.
This patch also removes the recordInsertion method in MachineIRBuilder
and instead provides method to setObserver.

Reviewed by: vkeles.

llvm-svn: 348406
2018-12-05 20:14:52 +00:00
Diana Picus 8a1b4f57c9 [ARM GlobalISel] Implement call lowering for Thumb2
The only things that are different from arm are:
* different opcodes for calls and returns
* Thumb calls take predicate operands

llvm-svn: 348347
2018-12-05 10:35:28 +00:00
Tim Northover 5745b6ac3b ARM: use target-specific SUBS node when combining cmp with cmov.
This has two positive effects. First, using a custom node prevents
recombination leading to an infinite loop since the output DAG is notionally a
little more complex than the input one. Using a flag-setting instruction also
allows the subtraction to be folded with the related comparison more easily.

https://reviews.llvm.org/D53190

llvm-svn: 348122
2018-12-03 11:16:21 +00:00
Oliver Stannard 4cf35b4ab0 [ARM][MC] Move information about variadic register defs into tablegen
Currently, variadic operands on an MCInst are assumed to be uses,
because they come after the defs. However, this is not always the case,
for example the Arm/Thumb LDM instructions write to a variable number of
registers.

This adds a property of instruction definitions which can be used to
mark variadic operands as defs. This only affects MCInst, because
MachineInstruction already tracks use/def per operand in each instance
of the instruction, so can already represent this.

This property can then be checked in MCInstrDesc, allowing us to remove
some special cases in ARMAsmParser::isITBlockTerminator.

Differential revision: https://reviews.llvm.org/D54853

llvm-svn: 348114
2018-12-03 10:32:42 +00:00
Oliver Stannard c588110f13 [ARM][Asm] Debug trace for the processInstruction loop
In the Arm assembly parser, we first match an instruction, then call
processInstruction to possibly change it to a different encoding, to
match rules in the architecture manual which can't be expressed by the
table-generated matcher.

This adds debug printing so that this process is visible when using the
-debug option.

To support this, I've added a new overload of MCInst::dump_pretty which
takes the opcode name as a StringRef, since we don't have an InstPrinter
instance in the assembly parser. Instead, we can get the same
information directly from the MCInstrInfo.

Differential revision: https://reviews.llvm.org/D54852

llvm-svn: 348113
2018-12-03 10:21:28 +00:00
Sjoerd Meijer 5afc957eba [ARM] FP16: support vld1.16 for vector loads with post-increment
Differential Revision: https://reviews.llvm.org/D55112

llvm-svn: 348110
2018-12-03 08:26:34 +00:00
Sjoerd Meijer ecc7dcb879 [ARM] Don't expand sdiv when optimising for minsize
Don't expand SDIV with an immediate that is a power of 2 if we optimise for
minimum code size. For example:

sdiv %1, i32 4

gets expanded to a sequence of 3 instructions, but this is suboptimal for
minimum code size so instead we just generate a MOV and a SDIV if integer
division is supported.

Differential Revision: https://reviews.llvm.org/D54546

llvm-svn: 347965
2018-11-30 08:14:28 +00:00
Jonas Devlieghere ccf7d4b4aa Produce an error on non-encodable offsets for darwin ARM scattered relocations.
Scattered ARM relocations for Mach-O's only have 24 bits available to
encode the offset. This is not checked but just truncated and can result
in corrupt binaries after linking because the relocations are applied to
the wrong offset. This patch will check and error out in those
situations instead of emitting a wrong relocation.

Patch by: Sander Bogaert (dzn)

Differential revision: https://reviews.llvm.org/D54776

llvm-svn: 347922
2018-11-29 21:58:23 +00:00
Sterling Augustine 9cc1ffadc5 Notify the linker when a TU compiled with split-stack has a function without a prologue.
More context here: https://go-review.googlesource.com/c/go/+/148819/

llvm-svn: 347614
2018-11-26 23:26:31 +00:00
Diana Picus 0528e2cfb3 [ARM GlobalISel] Support G_CTLZ and G_CTLZ_ZERO_UNDEF
We can now select CLZ via the TableGen'erated code, so support G_CTLZ
and G_CTLZ_ZERO_UNDEF throughout the pipeline for types <= s32.

Legalizer:
If the CLZ instruction is available, use it for both G_CTLZ and
G_CTLZ_ZERO_UNDEF. Otherwise, use a libcall for G_CTLZ_ZERO_UNDEF and
lower G_CTLZ in terms of it.

In order to achieve this we need to add support to the LegalizerHelper
for the legalization of G_CTLZ_ZERO_UNDEF for s32 as a libcall (__clzsi2).

We also need to allow lowering of G_CTLZ in terms of G_CTLZ_ZERO_UNDEF
if that is supported as a libcall, as opposed to just if it is Legal or
Custom. Due to a minor refactoring of the helper function in charge of
this, we will also allow the same behaviour for G_CTTZ and G_CTPOP.
This is not going to be a problem in practice since we don't yet have
support for treating G_CTTZ and G_CTPOP as libcalls (not even in
DAGISel).

Reg bank select:
Map G_CTLZ to GPR. G_CTLZ_ZERO_UNDEF should not make it to this point.

Instruction select:
Nothing to do.

llvm-svn: 347545
2018-11-26 11:07:02 +00:00
Sam Parker 5338f7aae4 [ARM] Prevent parallel macs for unsigned values
Both zext and sext are currently allowed during the search for narrow
sequences and sexts operands are later added to the mac candidates.
But operands of muls are also added, without checking whether they're
sext or zext, which means we can generate a signed smlad when we
shouldn't.

Differential Revision: https://reviews.llvm.org/D54790

llvm-svn: 347542
2018-11-26 10:22:55 +00:00
Fangrui Song 220f2a9cac [ARM] Add dependency from ARMAsmParser to ARMAsmPrinter after r347494
This fixes -DBUILD_SHARED_LIBS=on

llvm-svn: 347506
2018-11-23 23:43:46 +00:00
Oliver Stannard 173bc2bb7f [ARM][AsmParser] Improve debug printing of parsed asm operands
In ARMOperand::print:
- Print human-readable register names, instead of numbers.
- Print the correct names for IT condition masks (these were in the wrong order
  before).
- Print all parts of memory operands, not just the base register.

This makes the output of llvm-mc -show-inst-operands more readable.

Differential revision: https://reviews.llvm.org/D54850

llvm-svn: 347494
2018-11-23 14:27:21 +00:00
Sam Parker e7c42dd7e2 [ARM] Remove trunc sinks in ARM CGP
Truncs are treated as sources if their produce a value of the same
type as the one we currently trying to promote. Truncs used to be
considered as a sink if their operand was the same value type.
    
We now allow smaller types in the search, so we should search through
truncs that produce a smaller value. These truncs can then be
converted to an AND mask.
    
This leaves sinks as being:
  - points where the value in the register is being observed, such as
    an icmp, switch or store.
  - points where value types have to match, such as calls and returns.
  - zext are included to ease the transformation and are generally
    removed later on.
    
During this change, it also became apart from truncating sinks was
broken: if a sink used a source, its type information had already
been lost by the time the truncation happens. So I've changed the
method of caching the type information.

Differential Revision: https://reviews.llvm.org/D54515

llvm-svn: 347191
2018-11-19 11:34:40 +00:00
Eli Friedman 0bbb0d0720 [ARM] Add MemOperand to LDRcp to enable DCE.
LDRcp should be deleted when the dest register is dead in register
coalescing. Without MemOp, dead LDRcp will cause dead constant pool
value which references to non-existing label.

Patch by Yin Ma.

Differential Revision: https://reviews.llvm.org/D54173

llvm-svn: 346563
2018-11-09 23:09:17 +00:00
Sam Parker 2804f32ec4 [ARM] Don't promote i1 types in ARM CGP
Now that we have mixed type sizes, i1 values need to be explicitly
handled as we want to avoid promoting these values.

Differential Revision: https://reviews.llvm.org/D54308

llvm-svn: 346499
2018-11-09 15:06:33 +00:00
Sam Parker 08979cd125 [ARM] Enable mixed types in ARM CGP
Previously, during the search, all values had to have the same
'TypeSize', which is equal to number of bits of the integer type of
the icmp operand. All values in the tree had to match this size;
meaning that, if we searched from i16, we wouldn't accept i8s. A
change in type size requires zext and truncs to perform the casts so,
to allow mixed narrow types, the handling of these instructions is
now slightly different:

- we allow casts if their result or operand is <= TypeSize.
- zexts are sinks if their result > TypeSize.
- truncs are still sinks if their operand == TypeSize.
- truncs are still sources if their result == TypeSize.

The transformation bails on finding an icmp that operates on data
smaller than the current TypeSize.

Differential Revision: https://reviews.llvm.org/D54108

llvm-svn: 346480
2018-11-09 09:28:27 +00:00
Sam Parker 453ba916a0 [ARM] Small reorganisation in ARMParallelDSP
A few code movement things:

- AreSymmetrical is now a method of BinOpChain.
- Created a lambda in CreateParallelMACPairs to reduce loop nesting.
- A Reduction object now gets pasted in a couple of places instead,
  including CreateParallelMACPairs so it doesn't need to return a
  value.
I've also added RecordSequentialLoads, which is run before the
transformation begins, and caches the interesting loads. This can then
be queried later instead of cross checking many load values.

Differential Revision: https://reviews.llvm.org/D54254

llvm-svn: 346479
2018-11-09 09:18:00 +00:00
Petr Pavlu 7c84b2e3ab [ARM] Enable spilling of the hGPR register class in Thumb2
Generalize code in Thumb2InstrInfo::storeRegToStackSlot() and
loadRegToStackSlot() to allow the GPR class or any of its sub-classes
(including hGPR) to be stored/loaded by ARM::t2STRi12/ARM::t2LDRi12.

Differential Revision: https://reviews.llvm.org/D51927

llvm-svn: 346401
2018-11-08 13:02:10 +00:00
Eli Friedman 7d7d41debc [ARM] Fix CPSR liveness in tMOVCCr_pseudo lowering.
The lowering was missing live-ins in certain cases, like a sequence of
multiple tMOVCCr_pseudo instructions.  This would lead to a verifier
failure, and on pre-v6 Thumb CPSR would be incorrectly clobbered.

For reasons I don't completely understand, it's hard to get a sequence
of multiple tMOVCCr_pseudo instructions; the issue only seems to show up
with 64-bit comparisons where the result is zero-extended. I added some
extra testcases in case that changes in the future. Probably some
optimization opportunities here if anyone is interested. (@test_slt_not
is the case that was getting miscompiled.)

The code to check the liveness of CPSR was stolen from
X86ISelLowering.cpp; maybe it could be refactored into common helper,
but I have no idea where to put it.

Differential Revision: https://reviews.llvm.org/D54192

llvm-svn: 346355
2018-11-07 21:08:13 +00:00
Sam Parker fec793c98f [ARM] Turn assert into condition in ARMCGP
Turn the assert in PrepareConstants into a conditon so that we can
handle mul instructions with negative immediates.

Differential Revision: https://reviews.llvm.org/D54094

llvm-svn: 346126
2018-11-05 11:26:04 +00:00
Sam Parker fcd8adab30 [ARM][ARMCGP] Remove unecessary zexts and truncs
r345840 slightly changed the way promotion happens which could
result in zext and truncs having the same source and destination
types. This fixes that issue.

We can now also remove the zext and trunc in the following case:
(zext (trunc (promoted op)), i32)

This means that we can no longer treat a value, that is only used by
a sink, to be safe to promote.

I've also added in some extra asserts and replaced a cast for a
dyn_cast.

Differential Revision: https://reviews.llvm.org/D54032

llvm-svn: 346125
2018-11-05 10:58:37 +00:00
Matthias Braun 5f7cb79e94 ARMExpandPseudoInsts: Fix CMP_SWAP expansion adding a kill flag to a def
llvm-svn: 346026
2018-11-02 18:22:15 +00:00
Sam Parker 48fbf752b0 [ARM] Attempt to fix ppc64be buildbot
llvm-svn: 345850
2018-11-01 16:44:45 +00:00
Sam Parker 84a2f8b364 [ARM][CGP] Negative constant operand handling
While mutating instructions, we sign extended negative constant
operands for binary operators that can safely overflow. This was to
allow instructions, such as add nuw i8 %a, -2, to still be able to
perform a subtraction. However, the code to handle constants doesn't
take into consideration that instructions, such as sub nuw i8 -2, %a,
require the i8 -2 to be converted into i32 254.

This is a relatively simple fix, but I've taken the time to
reorganise the code a bit - mainly that instructions that can be
promoted are cached and splitting up the Mutate function.

Differential Revision: https://reviews.llvm.org/D53972

llvm-svn: 345840
2018-11-01 15:23:42 +00:00
Eli Friedman 063fd98bcc [ARM] Add missing pseudo-instruction for Thumb1 RSBS.
Shows up rarely for 64-bit arithmetic, more frequently for the compare
patterns added in r325323.

Differential Revision: https://reviews.llvm.org/D53848

llvm-svn: 345782
2018-10-31 21:45:48 +00:00
Dorit Nuzman 34da6dd696 [LV] Support vectorization of interleave-groups that require an epilog under
optsize using masked wide loads 

Under Opt for Size, the vectorizer does not vectorize interleave-groups that
have gaps at the end of the group (such as a loop that reads only the even
elements: a[2*i]) because that implies that we'll require a scalar epilogue
(which is not allowed under Opt for Size). This patch extends the support for
masked-interleave-groups (introduced by D53011 for conditional accesses) to
also cover the case of gaps in a group of loads; Targets that enable the
masked-interleave-group feature don't have to invalidate interleave-groups of
loads with gaps; they could now use masked wide-loads and shuffles (if that's
what the cost model selects).

Reviewers: Ayal, hsaito, dcaballe, fhahn

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53668

llvm-svn: 345705
2018-10-31 09:57:56 +00:00
Eli Friedman 2ac1162917 [ARM] Make InstrEmitter mark CPSR defs dead for Thumb1.
The "dead" markings allow existing target-independent optimizations,
like MachineSink, to trigger more frequently. The CPSR defs would have
eventually been marked dead by LiveVariables, so this only affects
optimizations before regalloc.

The ARMBaseInstrInfo.cpp change is fixing a bug which is only visible
with this change: the transform adds a use to an otherwise dead def
of CPSR. This is covered by existing regression tests.

thumb2-tbh.ll breaks for Thumb1 due to MachineLICM changing the
generated code; I'll fix it in D53452.

Differential Revision: https://reviews.llvm.org/D53453

llvm-svn: 345420
2018-10-26 19:32:24 +00:00
Sam Parker a16667e79b [ARM] Use Cortex-A57 sched model for Cortex-A72
This mirrors what we already do for AArch64 as the cores are similar.
As discussed in the review, enabling the machine scheduler causes
more variations in performance changes so it is not enabled for now.
This patch improves LNT scores by a geomean of 1.57% at -O3.

Differential Revision: https://reviews.llvm.org/D53562

llvm-svn: 345272
2018-10-25 15:08:29 +00:00
Simon Pilgrim 071e82218f [TTI] Add generic SK_Broadcast shuffle costs
I noticed while fixing PR39368 that we don't have generic shuffle costs for broadcast style shuffles.

This patch adds SK_BROADCAST handling, but exposes ARM/AARCH64 lack of handling of this type, which I've added a fix for at the same time.

Differential Revision: https://reviews.llvm.org/D53570

llvm-svn: 345253
2018-10-25 10:52:36 +00:00
Thomas Lively 30f1d69115 [NFC] Rename minnan and maxnan to minimum and maximum
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.

Reviewers: arsenm, aheejin, dschuff, javed.absar

Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53112

llvm-svn: 345218
2018-10-24 22:49:55 +00:00
Peter Collingbourne 4bb928c110 ARM: Use BKPT instead of TRAP to implement llvm.debugtrap.
The BKPT instruction is specified to cause a software breakpoint,
and at least on Linux results in a SIGTRAP. This makes it more
suitable for implementing debugtrap than TRAP (aka UDF #254), which
is specified to cause an undefined instruction exception and results
in a SIGILL on Linux.

Moreover, BKPT is not marked as a terminator, which is not only
consistent with the IR instruction but allows the analyzeBlock
function to correctly analyze a basic block containing the instruction,
which fixes an assertion failure in the machine block placement pass
previously triggered by the included test case.

Because BKPT is only supported starting with ARMv5T, we continue to
use UDF #254 when targeting v4T.

Differential Revision: https://reviews.llvm.org/D53614

llvm-svn: 345171
2018-10-24 18:10:38 +00:00
Saleem Abdulrasool 4005f9a860 ARM: handle checking aliases with out-of-bounds GEPs
A global alias may use indices which are not considered in bounds.  In
such a case, accessing the base object will fail as it only peers
through inbounds accesses.  This pattern is used by the swift compiler
to create references to preceeding members in the type metadata.  This
would cause the code generation to fail when targeting a platform that
used ELF as the object file format.  Be conservative and fail the
read-only check if we run into an alias that we cannot peer through.

llvm-svn: 345107
2018-10-24 00:00:52 +00:00
Simon Pilgrim 816e57be35 [TTI] Add generic cost handling of SK_Reverse shuffles
These can be treated as a general permute.

This required a fix for missing reverse patterns on ARM

llvm-svn: 345015
2018-10-23 09:42:10 +00:00
Eli Friedman b09c778715 Revert r344693 ("[ARM] bottom-top mul support in ARMParallelDSP")
Still causing failures on the polly-aosp buildbot; I'll follow up
with a reduced testcase.

llvm-svn: 344752
2018-10-18 19:34:30 +00:00
Sam Parker 2ef3c0dad6 [ARM] bottom-top mul support in ARMParallelDSP
Previously reverted in rL343082.

Original commit message:

On failing to find sequences that can be converted into dual macs,
try to find sequential 16-bit loads that are used by muls which we
can then use smultb, smulbt, smultt with a wide load.

Differential Revision: https://reviews.llvm.org/D51983

llvm-svn: 344693
2018-10-17 13:02:48 +00:00
Sjoerd Meijer 64cfb74a61 [ARM] Do not fuse VADD and VMUL, continued (2/2)
This is patch 2/2, following up on D53314, and is the functional change
to prevent fusing mul + add sequences into VFMAs.

Differential revision: https://reviews.llvm.org/D53315

llvm-svn: 344683
2018-10-17 10:05:44 +00:00
Sjoerd Meijer 1a213a42d6 [ARM] Follow up of rL344671, attempt to pacify a buildbot
It was rightfully complaining about an unpretty logical expression.

llvm-svn: 344677
2018-10-17 07:51:24 +00:00
Sjoerd Meijer ff3ab33ec8 [ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2)
This is a follow up of rL342874, which stopped fusing muls and adds into VMLAs
for performance reasons on the Cortex-M4 and Cortex-M33.  This is a serie of 2
patches, that is trying to achieve the same for VFMA.  The second column in the
table below shows what we were generating before rL342874, the third column
what changed with rL342874, and the last column what we want to achieve with
these 2 patches:

 --------------------------------------------------------
 | Opt   |  < rL342874   |  >= rL342874   |             |
 |------------------------------------------------------|
 |-O3    |     vmla      |      vmul      |     vmul    |
 |       |               |      vadd      |     vadd    |
 |------------------------------------------------------|
 |-Ofast |     vfma      |      vfma      |     vmul    |
 |       |               |                |     vadd    |
 |------------------------------------------------------|
 |-Oz    |     vmla      |      vmla      |     vmla    |
 --------------------------------------------------------

This patch 1/2, is a cleanup of the spaghetti predicate logic on the different
VMLA and VFMA codegen rules, so that we can make the final functional change in
patch 2/2.  This also fixes a typo in the regression test added in rL342874.

Differential revision: https://reviews.llvm.org/D53314

llvm-svn: 344671
2018-10-17 07:26:35 +00:00
Evandro Menezes 46eadcff9c [NFC][ARM] Refactor macro fusion
Simplify code for wildcards.

llvm-svn: 344625
2018-10-16 17:19:51 +00:00
Simon Pilgrim 5abb607ebe [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281)
As I suggested on PR39281, this patch uses PADDL pairwise addition to widen from the vXi8 CTPOP result to the target vector type.

This is a blocker for moving more x86 code to generic vector CTPOP expansion (P32655 + D53258) - ARM's vXi64 CTPOP currently expands, which would generate a vXi64 MUL but ARM's custom lowering expands the general MUL case and vectors aren't well handled in LegalizeDAG - improving the CTPOP lowering was a lot easier than fixing the MUL lowering for this one case......

Differential Revision: https://reviews.llvm.org/D53257

llvm-svn: 344512
2018-10-15 13:20:41 +00:00
Dorit Nuzman 38bbf81ade recommit 344472 after fixing build failure on ARM and PPC.
llvm-svn: 344475
2018-10-14 08:50:06 +00:00
Dorit Nuzman 5118c68cde revert 344472 due to failures.
llvm-svn: 344473
2018-10-14 07:21:20 +00:00
Dorit Nuzman 8174368955 [IAI,LV] Add support for vectorizing predicated strided accesses using masked
interleave-group

The vectorizer currently does not attempt to create interleave-groups that
contain predicated loads/stores; predicated strided accesses can currently be
vectorized only using masked gather/scatter or scalarization. This patch makes
predicated loads/stores candidates for forming interleave-groups during the
Loop-Vectorizer's analysis, and adds the proper support for masked-interleave-
groups to the Loop-Vectorizer's planning and transformation stages. The patch
also extends the TTI API to allow querying the cost of masked interleave groups
(which each target can control); Targets that support masked vector loads/
stores may choose to enable this feature and allow vectorizing predicated
strided loads/stores using masked wide loads/stores and shuffles.

Reviewers: Ayal, hsaito, dcaballe, fhahn, javed.absar

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D53011

llvm-svn: 344472
2018-10-14 07:06:16 +00:00
George Burgess IV 6ef8002c2c Replace most users of UnknownSize with LocationSize::unknown(); NFC
Moving away from UnknownSize is part of the effort to migrate us to
LocationSizes (e.g. the cleanup promised in D44748).

This doesn't entirely remove all of the uses of UnknownSize; some uses
require tweaks to assume that UnknownSize isn't just some kind of int.
This patch is intended to just be a trivial replacement for all places
where LocationSize::unknown() will Just Work.

llvm-svn: 344186
2018-10-10 21:28:44 +00:00
Peter Smith 6f36cd4d76 [ARM] Account for implicit IT when calculating inline asm size
When deciding if it is safe to optimize a conditional branch to a CBZ or
CBNZ the offsets of the BasicBlocks from the start of the function are
estimated. For inline assembly the generic getInlineAsmLength() function is
used to get a worst case estimate of the inline assembly by multiplying the
number of instructions by the max instruction size of 4 bytes. This
unfortunately doesn't take into account the generation of Thumb implicit IT
instructions. In edge cases such as when all the instructions in the block
are 4-bytes in size and there is an implicit IT then the size is
underestimated. This can cause an out of range CBZ or CBNZ to be generated.

The patch takes a conservative approach and assumes that every instruction
in the inline assembly block may have an implicit IT.

Fixes pr31805

Differential Revision: https://reviews.llvm.org/D52834

llvm-svn: 343960
2018-10-08 09:38:28 +00:00
Matthias Braun 81578e9f77 X86, AArch64, ARM: Do not attach debug location to spill/reload instructions
This rebases and recommits r343520. hwasan should be fixed now and this
shouldn't break the tests anymore.

Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).

Differential Revision: https://reviews.llvm.org/D52125

llvm-svn: 343895
2018-10-05 22:00:13 +00:00