Summary: Fix the bug when promoted call return type mismatches with the promoted function, we should not try to inline it. Otherwise it may lead to compiler crash.
Reviewers: davidxl, tejohnson, eraman
Reviewed By: tejohnson
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38018
llvm-svn: 313658
This change adds support for nested and even overlapping segments. This means
that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly.
Differential Revision: https://reviews.llvm.org/D36558
llvm-svn: 313656
Add support for the R_AARCH64_ABS{16,32} relocations in the execution
engine. This is primarily used for DWARF debug information relocations
and needed by the LLVM JIT to support JITing for lldb.
Patch by Alex Langford!
llvm-svn: 313654
I forgot to zero out the BitVector when reusing it between UserValues.
Later uses of the same location number for a different UserValue would
falsely indicate that they were spilled. Usually this would lead to
incorrect debug info, but in some cases they would indicate something
nonsensical like a memory location based on a vector register (Q8 on
ARM).
llvm-svn: 313640
Two blocks prior to the join each perform an li and the the join block has an
add using the initialized register. Optimize each predecessor block to instead
use addi and delete the li's and add.
Differential Revision: https://reviews.llvm.org/D36734
llvm-svn: 313639
This is a preparatory step for D34515.
This change:
- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
borrow, we need to invert the value of the second operand and result before
and after using ARMISD::SUBE. We need to invert the carry result of
ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
as well otherwise i64 operations now would require branches. This implies
updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
- fixes PR34045
- fixes PR34564
Differential Revision: https://reviews.llvm.org/D35192
llvm-svn: 313618
This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction.
The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879.
Please expect some performance fluctuations due to code alignment effects.
Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena
Differential Revision: https://reviews.llvm.org/D37294
llvm-svn: 313613
After clang started emitting deferred regions (r312818), llvm-cov has
had a hard time picking reasonable line execuction counts. There have
been one or two generic improvements in this area (e.g r310012), but
line counts can still report coverage for whitespace instead of code
(llvm.org/PR34612).
To fix the problem:
* Introduce a new region kind so that frontends can explicitly label
gap areas.
This is done by changing the encoding of the columnEnd field of
MappingRegion. This doesn't substantially increase binary size, and
makes it easy to maintain backwards-compatibility.
* Don't set the line count to a count from a gap area, unless the count
comes from a wrapped segment.
* Don't highlight gap areas as uncovered.
Fixes llvm.org/PR34612.
llvm-svn: 313597
This caused asserts in Chromium. See http://crbug.com/766261
> Summary:
> This comes up in optimized debug info for C++ programs that pass and
> return objects indirectly by address. In these programs,
> llvm.dbg.declare survives optimization, which causes us to emit indirect
> DBG_VALUE instructions. The fast register allocator knows to insert
> DW_OP_deref when spilling indirect DBG_VALUE instructions, but the
> LiveDebugVariables did not until this change.
>
> This fixes part of PR34513. I need to look into why this doesn't work at
> -O0 and I'll send follow up patches to handle that.
>
> Reviewers: aprantl, dblaikie, probinson
>
> Subscribers: qcolombet, hiraditya, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D37911
llvm-svn: 313589
If we have an AssertZext of a truncated value that has already been AssertZext'ed,
we can assert on the wider source op to improve the zext-y knowledge:
assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN
This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.
Differential Revision: https://reviews.llvm.org/D37017
llvm-svn: 313577
rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217
We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.
I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.
Differential Revision: https://reviews.llvm.org/D37987
llvm-svn: 313564
The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext.
This fixes PR28540.
Differential Revision: https://reviews.llvm.org/D37729
llvm-svn: 313563
For cases where we are BITCASTing to vectors of smaller elements, then if the entire source was a splatted sign (src's NumSignBits == SrcBitWidth) we can say that the dst's NumSignBit == DstBitWidth, as we're just splitting those sign bits across multiple elements.
We could generalize this but at the moment the only use case I have is to peek through bitcasts to vector comparison results.
Differential Revision: https://reviews.llvm.org/D37849
llvm-svn: 313543
The shuffle combining and lowerVectorShuffleAsLanePermuteAndBlend were both still trying to use VPERM2XF128 for unary shuffles when AVX2 is enabled. VPERM2X128 takes two inputs meaning when we use it for a unary shuffle one of those inputs is left undefined creating a false dependency on whatever register gets allocated there.
If we have VPERMQ/PD we should prefer those since they only have a single input.
Differential Revision: https://reviews.llvm.org/D37947
llvm-svn: 313542
Add the missing hardware features the ProcA55 and ProcA75 feature.
These are already enabled via the target parser, but I had missed
them in the backend.
Differential Revision: https://reviews.llvm.org/D37974
llvm-svn: 313535
Implement the isTruncateFree hooks, lifted from AArch64, that are
used by TargetTransformInfo. This allows simplifycfg to reduce the
test case into a single basic block.
Differential Revision: https://reviews.llvm.org/D37516
llvm-svn: 313533
The indexed dot product instructions only accept the lower 16 D-registers as
the indexed register, but we were e.g. incorrectly accepting:
vudot.u8 d16,d16,d18[0]
Differential Revision: https://reviews.llvm.org/D37968
llvm-svn: 313531
This patch makes the `.eh_frame` extension an alias for `.debug_frame`.
Up till now it was only possible to dump the section using objdump, but
not with dwarfdump. Since the two are essentially interchangeable, we
dump whichever of the two is present.
As a workaround, this patch also adds parsing for 3 currently
unimplemented CFA instructions: `DW_CFA_def_cfa_expression`,
`DW_CFA_expression`, and `DW_CFA_val_expression`. Because I lack the
required knowledge, I just parse the fields without actually creating
the instructions.
Finally, this also fixes the typo in the `.debug_frame` section name
which incorrectly contained a trailing `s`.
Differential revision: https://reviews.llvm.org/D37852
llvm-svn: 313530
Summary:
Subregister liveness tracking is not implemented for X86 backend, so
sometimes the whole super register is said to be live, when only a
subregister is really live. That might happen if the def and the use
are located in different MBBs, see added fixup-bw-isnt.mir test.
However, using knowledge of the specific instructions handled by the
bw-fixup-pass we can get more precise liveness information which this
change does.
Reviewers: MatzeB, DavidKreitzer, ab, andrew.w.kaylor, craig.topper
Reviewed By: craig.topper
Subscribers: n.bozhenov, myatsina, llvm-commits, hiraditya
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Differential Revision: https://reviews.llvm.org/D37559
llvm-svn: 313524
Summary:
This change adds support for explicit tail-exit records to be written by
the XRay runtime. This lets us differentiate the tail exit
records/events in the log, and allows us to treat those exit events
especially in the future. For now we allow printing those out in YAML
(and reading them in).
Reviewers: kpw, pelikan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37964
llvm-svn: 313514