Previously these instructions were marked codegen only and had
an under-specified instruction description that did not record the
fcc register.
Reviewers: atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D38847
llvm-svn: 315905
Minor addition and follow up of r314773 and r311533: this adds more
debug messages to the type legalizer. For each node, it dumps
legalization info for results and operands nodes, rather than just the
final legalized node.
Differential Revision: https://reviews.llvm.org/D38726
llvm-svn: 315904
Currently llvm-dwarfdump runs into llvm_unreachable when
faces DW_CFA_GNU_args_size. Patch implements the support.
Differential revision: https://reviews.llvm.org/D38879
llvm-svn: 315897
Summary:
The following transformation for cmp instruction:
icmp smin(x, PositiveValue), 0 -> icmp x, 0
should only be done after checking for min/max to prevent infinite
looping caused by a reverse canonicalization. That is why this
transformation was moved to place after the mentioned check.
Reviewers: spatel, efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38934
Patch by: Artur Gainullin <artur.gainullin@intel.com>
llvm-svn: 315895
We came across an llvm bug when compiling some testcases that 64-bit
immediates are silently truncated into 32-bit and then packed into
BPF_JMP | BPF_K encoding. This caused comparison with wrong value.
This bug looks to be introduced by r308080. The Select_Ri pattern is
supposed to be lowered into J*_Ri while the latter only support 32-bit
immediate encoding, therefore Select_Ri should have similar immediate
predicate check as what J*_Ri are doing.
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>
llvm-svn: 315889
This patch enables redundant sign- and zero-extension elimination in PowerPC MI Peephole pass.
If the input value of a sign- or zero-extension is known to be already sign- or zero-extended, the operation is redundant and can be eliminated.
One common case is sign-extensions for a method parameter or for a method return value; they must be sign- or zero-extended as defined in PPC ELF ABI.
For example of the following simple code, two extsw instructions are generated before the invocation of int_func and before the return. With this patch, both extsw are eliminated.
void int_func(int);
void ii_test(int a) {
if (a & 1) return int_func(a);
}
Such redundant sign- or zero-extensions are quite common in many programs; e.g. I observed about 60,000 occurrences of the elimination while compiling the LLVM+CLANG.
Differential Revision: https://reviews.llvm.org/D31319
llvm-svn: 315888
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.
At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.
The previous commit failed on MSVC due to a failure to convert an
initializer_list to a std::vector. Hopefully, MSVC will accept this version.
Depends on D37457
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37458
llvm-svn: 315887
Summary:
iPTR is a pointer of subtarget-specific size to any address space. Therefore
type checks on this size derive the SizeInBits from a subtarget hook.
At this point, we can import the simplests G_LOAD rules and select load
instructions using them. Further patches will support for the predicates to
enable additional loads as well as the stores.
Depends on D37457
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37458
llvm-svn: 315885
Summary:
This includes some context-sensitivity in the MVT to LLT conversion so that
pointer types are tested correctly.
FIXME: I'm not happy with the way this is done since everything is a
special-case. I've yet to find a reasonable way to implement it.
select-load.mir fails because <1 x s64> loads in tablegen get priority over s64
loads. This is fixed in the next patch and as such they should be committed
together, I've posted them separately to help with the review.
Depends on D37456
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: kristof.beyls, javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37457
llvm-svn: 315884
This allows lld-link to process /manifestinput: flags on macOS too.
Also makes the `REQUIRES: manifesttool` lld tests run on macOS.
Setting LLVM_ENABLE_LIBXML2 to off can suppress this behavior, like on Linux.
llvm-svn: 315873
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.
This patch adds support for this in order to enable the import of load/store
patterns.
Depends on D37445
Hopefully fixed the ambiguous constructor that a large number of bots reported.
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D37456
llvm-svn: 315869
Summary:
It's possible for a ComplexPattern to be used as an operator in a match
pattern. This is used by the load/store patterns in AArch64 to name the
suboperands returned by ComplexPattern predicate so that they can be broken
apart and referenced independently in the result pattern.
This patch adds support for this in order to enable the import of load/store
patterns.
Depends on D37445
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D37456
llvm-svn: 315863
Turns out we have no patterns on the instructions that were using this feature flag for other reasons. These instructions are slow on all modern CPUs so it seems unlikely that we will spend any effort supporting these instructions going forward. So we might as well just kill of the feature flag and just fix up the comments.
llvm-svn: 315862
Summary:
This was impeding our ability to combine the extending shuffles with other shuffles as you can see from the test changes.
There's one special case that needed to be added to use VZEXT directly for v8i8->v8i64 since the custom lowering requires v64i8.
Reviewers: RKSimon, zvi, delena
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38714
llvm-svn: 315860
Summary: I see nothing in Agner Fog's tables to indicate that this improved between Ivy Bridge and Haswell. It's also set for all Atom CPUs so I assume KNL should have it too.
Reviewers: RKSimon, zvi, gadi.haber
Reviewed By: gadi.haber
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38890
llvm-svn: 315859
In type inference, an empty type set for a specific hw mode is not an
error. In earlier stages of the design it was, but having to use non-
parameterized types with target intrinsics necessarily led to type
contradictions: since the intrinsics used specific types, they were
only valid for a specific hw mode, and the resulting type set for other
modes ended up empty. To accommodate the existence of such intrinsics
individual type sets were allowed to be empty as long as not all sets
were empty.
llvm-svn: 315858
This can result in significant code size savings in some cases,
e.g. an interrupt table all filled with the same assembly stub
in a certain Cortex-M BSP results in code blowup by a factor of 2.5.
Differential Revision: https://reviews.llvm.org/D34806
llvm-svn: 315853
This reduces code size for constructs like vtables or interrupt
tables that refer to functions in global initializers.
Differential Revision: https://reviews.llvm.org/D34805
llvm-svn: 315852
This avoid code duplication and allow us to add the disable unroll metadata elsewhere.
Differential Revision: https://reviews.llvm.org/D38928
llvm-svn: 315850
Summary:
It's better to use our shuffle lowering code to handle these than loading an immediate into a k-register.
It really feels like this should be a DAG combine optimization rather than a lowering operation, but that's a problem for another day.
Reviewers: RKSimon, delena, zvi
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38932
llvm-svn: 315849
We should be able to fold constant conditions by converting to shuffles, but fixing that would break these tests in their current form. Since they are really trying to test masking ops, add a non-constant mask to the selects.
llvm-svn: 315848
Summary:
There is an important mismatch between ISD::LOAD and G_LOAD (and likewise for
ISD::STORE and G_STORE). In SelectionDAG, ISD::LOAD is a non-atomic load
and atomic loads are handled by a separate node. However, this is not true of
GlobalISel's G_LOAD. For G_LOAD, the MachineMemOperand indicates the atomicity
of the operation. As a result, this mapping must also add a predicate that
checks for non-atomic MachineMemOperands.
This is NFC since these nodes always have predicates in practice and are
therefore always rejected at the moment.
Depends on D37443
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: kristof.beyls, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37445
llvm-svn: 315843
Summary:
GlobalISel and SelectionDAG require different code for the common
load/store predicates due to differences in the representation.
For example:
SelectionDAG: (load<signext,i8>:i32 GPR32:$addr) // The <> denote properties of the SDNode that are not printed in the DAG
GlobalISel: (G_SEXT:s32 (G_LOAD:s8 GPR32:$addr))
Even without that, differences in the IR (SDNode vs MachineInstr) require
differences in the C++ predicate.
This patch moves the implementation of the common load/store predicates
into tablegen so that it can handle these differences.
It's NFC for SelectionDAG since it emits equivalent code and it's NFC for
GlobalISel since the rules involving the relevant predicates are still
rejected by the importer.
Depends on D36618
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Subscribers: llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D37443
Includes a partial revert of r315826 since this patch makes it necessary for
getPredCode() to return a std::string and getImmCode() should have the same
interface as getPredCode().
llvm-svn: 315841
Fixes cfe/trunk/test/Misc/backend-resource-limit-diagnostics.cl
test after r315808
We may hit few other similar issues, but I want to discuss good
solution offline.
llvm-svn: 315830