More conversions to load-and-test can be made with this patch by adding a
forward search in optimizeCompareZero().
Review: Ulrich Weigand
https://reviews.llvm.org/D38076
llvm-svn: 313877
The N32 ABI uses RELA relocation format, do not use 3-in-1 relocation's
encoding, and uses ELFCLASS32. This change passes the `IsN32` flag
to the `MCAsmBackend` to distinguish usage of N32 ABI.
We still do not handle some cases like providing the `-target-abi=o32`
command line option with the `mips64` target triple. That's why
elf_header.s contains some "FIXME" strings. This case will be fixed in
a separate patch.
Differential revision: https://reviews.llvm.org/D37960
llvm-svn: 313873
We can have a v_mac with an immediate src0.
We can still fold if it's an inline immediate,
otherwise it already uses the constant bus.
llvm-svn: 313852
The Swift CC is identical to Win64 CC with the exception of swift error
being passed in r12 which is a CSR. However, since this calling
convention is only used in swift -> swift code, it does not impact
interoperability and can be treated entirely as Win64 CC. We would
previously incorrectly lower the frame setup as we did not treat the
frame as conforming to Win64 specifications.
llvm-svn: 313813
These write to the low and high half of the destination
register and leave the other 16-bits unchanged. This is true
for most 16-bit instructions on gfx9, but we don't use that
now.
llvm-svn: 313812
Also add some tests that should be able to use v_mad_mixhi_f16,
but do not yet. This is trickier because we don't really model
the partial update of the register done by 16-bit instructions.
llvm-svn: 313806
Add support for passing SwiftError through a register on the Windows x64
calling convention. This allows the use of swifterror attributes on
parameters which is used by the swift front end for the `Error`
parameter. This partially enables building the swift standard library
for Windows x86_64.
llvm-svn: 313791
If these checks fail we end up not selecting an instruction at all. So we are already relying on the immediate being checked upstream of isel. So doing the check in isel is just bloat to the isel table. Interestingly, we didn't check on the AVX512 version of the instructions anyway.
llvm-svn: 313724
Also starts selecting global loads for constant address
in some cases. Some end up selecting to mubuf still, which
requires investigation.
We still get sub-optimal regalloc and extra waitcnts inserted
due to not really tracking the liveness of the separate register
halves.
llvm-svn: 313716
The ARM docs suggest in examples that the flags can have either case, and there
are applications in the wild that (libopencm3, for example) that expect to be
able to use the uppercase spelling.
https://reviews.llvm.org/D37953
llvm-svn: 313680
Summary:
There is no benefit in having the 4-byte alignment, and removing this
restriction can save a lot of space for some applications.
Reviewers: asl, awygle
Reviewed By: awygle
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36165
llvm-svn: 313676
The pre-RA scheduler does load/store clustering, but post-RA
scheduler undoes it. Add mutation to prevent it.
Differential Revision: https://reviews.llvm.org/D38014
llvm-svn: 313670
SystemZTargetLowering::combineSTORE contains code to transform a
combination of STORE + BSWAP into a STRV type instruction.
This transformation is correct for regular stores, but not for
truncating stores. The routine neglected to check for that case.
Fixes a miscompilation of llvm-objcopy with clang, which caused
test suite failures in the SystemZ multistage build bot.
llvm-svn: 313669
Similar to what we do for X86ISD::SHRUNKBLEND just turn X86ISD::SELECT into ISD::VSELECT. This allows us to remove the duplicated TRUNC patterns.
Differential Revision: https://reviews.llvm.org/D38022
llvm-svn: 313644
Two blocks prior to the join each perform an li and the the join block has an
add using the initialized register. Optimize each predecessor block to instead
use addi and delete the li's and add.
Differential Revision: https://reviews.llvm.org/D36734
llvm-svn: 313639
This is the second minimal patch keeping Nios2 target buildable.
I'm adding subtarget here and other stuff for frame lowering, instruction,
register information methods. I do not add any test cases, as still there
are missing parts like DAG selector and assembly printing. I plan to include
them into the next patch.
Patch by Andrei Grischenko <andrei.l.grischenko@intel.com>
Differential Revision: https://reviews.llvm.org/D37256
llvm-svn: 313626
This is a preparatory step for D34515.
This change:
- makes nodes ISD::ADDCARRY and ISD::SUBCARRY legal for i32
- lowering is done by first converting the boolean value into the carry flag
using (_, C) ← (ARMISD::ADDC R, -1) and converted back to an integer value
using (R, _) ← (ARMISD::ADDE 0, 0, C). An ARMISD::ADDE between the two
operations does the actual addition.
- for subtraction, given that ISD::SUBCARRY second result is actually a
borrow, we need to invert the value of the second operand and result before
and after using ARMISD::SUBE. We need to invert the carry result of
ARMISD::SUBE to preserve the semantics.
- given that the generic combiner may lower ISD::ADDCARRY and
ISD::SUBCARRYinto ISD::UADDO and ISD::USUBO we need to update their lowering
as well otherwise i64 operations now would require branches. This implies
updating the corresponding test for unsigned.
- add new combiner to remove the redundant conversions from/to carry flags
to/from boolean values (ARMISD::ADDC (ARMISD::ADDE 0, 0, C), -1) → C
- fixes PR34045
- fixes PR34564
Differential Revision: https://reviews.llvm.org/D35192
llvm-svn: 313618
This patch adds the instruction scheduling information for the SkylakeClient (SKL) architecture target by adding the file X86SchedSkylakeClient.td located under the X86 Target.
We used the scheduling information retrieved from the Skylake architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each SKL instruction.
The patch continues the scheduling replacement and insertion effort started with the SNB target in r307529 and r310792 and for HSW in r311879.
Please expect some performance fluctuations due to code alignment effects.
Reviewers: craig.topper, zvi, chandlerc, igorb, aymanmus, RKSimon, delena
Differential Revision: https://reviews.llvm.org/D37294
llvm-svn: 313613
If we have an AssertZext of a truncated value that has already been AssertZext'ed,
we can assert on the wider source op to improve the zext-y knowledge:
assert (trunc (assert X, i8) to iN), i1 --> trunc (assert X, i1) to iN
This moves a fold from being Mips-specific to general combining, and x86 shows
improvements.
Differential Revision: https://reviews.llvm.org/D37017
llvm-svn: 313577
The AssertZext we might see in this case is only giving information about the lower 32 bits. It isn't providing information about the upper 32 bits. So we should emit a zext.
This fixes PR28540.
Differential Revision: https://reviews.llvm.org/D37729
llvm-svn: 313563
This is similar to D37843, but for sub_8bit. This fixes all of the patterns except for the 2 that emit only an EXTRACT_SUBREG. That causes a verifier error with global isel because global isel doesn't know to issue the ABCD when doing this extract on 32-bits targets.
Differential Revision: https://reviews.llvm.org/D37890
llvm-svn: 313558
I'm pretty sure that InstrEmitter::EmitSubregNode will take care of this itself by calling ConstrainForSubReg which in turn calls TRI->getSubClassWithSubReg.
I think Jakob Stoklund Olesen alluded to this in his commit message for r141207 which added the code to EmitSubregNode.
Differential Revision: https://reviews.llvm.org/D37843
llvm-svn: 313557