After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing,
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests:
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.
This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.
This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.
Differential Revision: http://reviews.llvm.org/D12095
llvm-svn: 247815
Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, llvm-commits, rafael, yaron.keren
Differential Revision: http://reviews.llvm.org/D11037
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241776
Summary:
The documentation writes vectors highest-index first whereas LLVM-IR writes
them lowest-index first. As a result, instructions defined in terms of
left_half() and right_half() had the halves reversed.
In addition to correcting them, they have been improved to allow shuffles
that use the same operand twice or in reverse order. For example, ilvev
used to accept masks of the form:
<0, n, 2, n+2, 4, n+4, ...>
but now accepts:
<0, 0, 2, 2, 4, 4, ...>
<n, n, n+2, n+2, n+4, n+4, ...>
<0, n, 2, n+2, 4, n+4, ...>
<n, 0, n+2, 2, n+4, 4, ...>
One further improvement is that splati.[bhwd] is now the preferred instruction
for splat-like operations. The other special shuffles are no longer used
for splats. This lead to the discovery that <0, 0, ...> would not cause
splati.[hwd] to be selected and this has also been fixed.
This fixes the enc-3des test from the test-suite on Mips64r6 with MSA.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9660
llvm-svn: 237689
Summary:
When using the N64 ABI, element-indices use the i64 type instead of i32.
In many cases, we can use iPTR to account for this but additional patterns
and pseudo's are also required.
This fixes most (but not quite all) failures in the test-suite when using
N64 and MSA together.
Reviewers: vkalintiris
Reviewed By: vkalintiris
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D9342
llvm-svn: 236494
[DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
llvm-svn: 235989
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
llvm-svn: 235977
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.
llvm-svn: 230583
Summary:
-mno-odd-spreg prohibits the use of odd-numbered single-precision floating
point registers. However, vector insert/extract was still using them when
manipulating the subregisters of an MSA register. Fixed this by ensuring
that insertion/extraction is only performed on even-numbered vector
registers when -mno-odd-spreg is given.
Reviewers: vmedic, sstankovic
Reviewed By: sstankovic
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7672
llvm-svn: 230235
calls that don't take a Function argument from Mips. Notable
exceptions: the AsmPrinter and MipsTargetObjectFile. The
latter needs to be fixed, and the former will be fixed when the
general AsmPrinter changes happen.
llvm-svn: 227512
Summary:
This patch adds support for some operations that were missing from
128-bit integer types (add/sub/mul/sdiv/udiv... etc.). With these
changes we can support the __int128_t and __uint128_t data types
from C/C++.
Depends on D7125
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7143
llvm-svn: 227089
type (in addition to the memory type).
The *LoadExt* legalization handling used to only have one type, the
memory type. This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.
However, this isn't always the case. For instance, on X86, with AVX,
this is legal:
v4i32 load, zext from v4i8
but this isn't:
v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2.
Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.
Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.
Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior. The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)
No functional change intended.
Differential Revision: http://reviews.llvm.org/D6532
llvm-svn: 225421
r221056 "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
r221058 "[mips] Fix unused variable warning introduced in r221056"
r221059 "[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC."
r221061 "Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC."
It cuased an undefined behavior in LLVM :: CodeGen/Mips/return-vector.ll.
llvm-svn: 221081
Summary:
CCState already contains a byval implementation that is very similar to the
Mips custom code. This patch merges the custom code into the existing
common code and tablegen-erated code.
Reviewers: vmedic
Reviewed By: vmedic
Subscribers: rnk, llvm-commits
Differential Revision: http://reviews.llvm.org/D5977
llvm-svn: 221059
doesn't generate lazy binding stub for a function whose address is taken in
the program.
Differential Revision: http://reviews.llvm.org/D5067
llvm-svn: 218744
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reverts commits r215111, 215115, 215116, 215117, 215136.
llvm-svn: 215154
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.
Thanks to Lang Hames for making MCJIT a good replacement!
llvm-svn: 215111
fromulation of the node, which isn't really the desired behavior from
within the combiner or legalizer, but is necessary within ISel. I've
added a hopefully helpful comment and fixed the only two places where
this took place.
Yet another step toward the combiner and legalizer not needing to use
update listeners with virtual calls to manage the worklists behind
legalization and combining.
llvm-svn: 214574
Rename to allowsMisalignedMemoryAccess.
On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment,
and don't need to be split into multiple accesses. Vector loads with
an alignment of the element type are not uncommon in OpenCL code.
llvm-svn: 214055
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).
This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.
No functionality change intended.
llvm-svn: 213859
Options struct and move the comment to inMips16HardFloat. Use the
fact that we now know whether or not we cared about soft float to
set the libcalls.
Accordingly rename mipsSEUsesSoftFloat to abiUsesSoftFloat and
propagate since it's no longer CPU specific.
llvm-svn: 213335
Summary:
Also tightened up the acceptable condition operand for these instructions
on MIPS-I to MIPS-III. Support for $fcc[1-7] was added in MIPS-IV. Prior
to that only $fcc0 is acceptable.
We currently don't optimize (BEQZ (NOT $a), $target) and similar. It's
probably best to do this in InstCombine.
Depends on D4111
Reviewers: jkolek, zoran.jovanovic, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4112
llvm-svn: 210787
Summary:
c.cond.fmt has been replaced by cmp.cond.fmt. Where c.cond.fmt wrote to
dedicated condition registers, cmp.cond.fmt writes 1 or 0 to normal FGR's
(like the GPR comparisons).
mov[fntz] have been replaced by seleqz and selnez. These instructions
conditionally zero a register based on a bool in a GPR. The results can
then be or'd together to act as a select without, for example, requiring a third
register read port.
mov[fntz].[ds] have been replaced with sel.[ds]
MIPS64r6 currently generates unnecessary sign-extensions for most selects.
This is because the result of a SETCC is currently an i32. Bits 32-63 are
undefined in i32 and the behaviour of seleqz/selnez would otherwise depend
on undefined bits. Later, we will fix this by making the result of SETCC an
i64 on MIPS64 targets.
Depends on D3958
Reviewers: jkolek, vmedic, zoran.jovanovic
Reviewed By: vmedic, zoran.jovanovic
Differential Revision: http://reviews.llvm.org/D4003
llvm-svn: 210777
Summary:
This patch disables madd/maddu/msub/msubu in both the assembler and code
generator.
Depends on D3896
Reviewers: jkolek, zoran.jovanovic, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3955
llvm-svn: 210762
Summary:
The accumulator-based (HI/LO) multiplies and divides from earlier ISA's have
been removed and replaced with GPR-based equivalents. For example:
div $1, $2
mflo $3
is now:
div $3, $1, $2
This patch disables the accumulator-based multiplies and divides for
MIPS32r6/MIPS64r6 and uses the GPR-based equivalents instead.
Renamed expandPseudoDiv to insertDivByZeroTrap to better describe the
behaviour of the function.
MipsDelaySlotFiller now invalidates the liveness information when moving
instructions to the delay slot. Without this, divrem.ll will abort since
%GP ends up used before it is defined.
Reviewers: vmedic, zoran.jovanovic, jkolek
Reviewed By: jkolek
Differential Revision: http://reviews.llvm.org/D3896
llvm-svn: 210760
Summary:
Instead the system is required to provide some means of handling unaligned
load/store without special instructions. Options include full hardware
support, full trap-and-emulate, and hybrids such as hardware support within
a cache line and trap-and-emulate for multi-line accesses.
MipsSETargetLowering::allowsUnalignedMemoryAccesses() has been configured to
assume that unaligned accesses are 'fast' on the basis that I expect few
hardware implementations will opt for pure-software handling of unaligned
accesses. The ones that do handle it purely in software can override this.
mips64-load-store-left-right.ll has been merged into load-store-left-right.ll
The stricter testing revealed a Bits!=Bytes bug in passByValArg(). This has
been fixed and the variables renamed to clarify the units they hold.
Reviewers: zoran.jovanovic, jkolek, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3872
llvm-svn: 209512
Summary:
This isn't supported directly so we rotate the vector by the desired number of
elements, insert to element zero, then rotate back.
The i64 case generates rather poor code on MIPS32. There is an obvious
optimisation to be made in future (do both insert.w's inside a shared
rotate/unrotate sequence) but for now it's sufficient to select valid code
instead of aborting.
Depends on D3536
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3537
llvm-svn: 207640
Summary:
- Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64
- ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
I've added additional testcases to cover as much of the codegen changes
affecting MIPS-IV as I can. Where I've been unable to find an existing
MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering
ISD::GlobalAddress and similar), I at least agree that MIPS-IV should
behave like MIPS64. Further testcases that are fixed by this patch will follow
in my next commit. The testcases from that commit that fail for MIPS-IV without
this patch are:
LLVM :: CodeGen/Mips/2010-07-20-Switch.ll
LLVM :: CodeGen/Mips/cmov.ll
LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll
LLVM :: CodeGen/Mips/largeimmprinting.ll
LLVM :: CodeGen/Mips/longbranch.ll
LLVM :: CodeGen/Mips/mips64-f128.ll
LLVM :: CodeGen/Mips/mips64directive.ll
LLVM :: CodeGen/Mips/mips64ext.ll
LLVM :: CodeGen/Mips/mips64fpldst.ll
LLVM :: CodeGen/Mips/mips64intldst.ll
LLVM :: CodeGen/Mips/mips64load-store-left-right.ll
LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll
Reviewers: dsanders
Reviewed By: dsanders
CC: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3343
llvm-svn: 206183
Summary:
Highlights:
- Registers are resolved much later (by the render method).
Prior to that point, GPR32's/GPR64's are GPR's regardless of register
size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register
size or FR mode. Numeric registers can be anything.
- All registers are parsed the same way everywhere (even when handling
symbol aliasing)
- One consequence is that all registers can be specified numerically
almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing
but that can be easily resolved.
- Removes the need for the hasConsumedDollar hack
- Parenthesis and Bracket suffixes are handled generically
- Micromips instructions are parsed directly instead of going through the
standard encodings first.
- rdhwr accepts all 32 registers, and the following instructions that previously
xfailed now work:
ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d,
c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1
- Diagnostics involving registers point at the correct character (the $)
- There's only one kind of immediate in MipsOperand. LSA immediates are handled
by the predicate and renderer.
Lowlights:
- Hardcoded '$zero' in the div patterns is handled with a hack.
MipsOperand::isReg() will return true for a k_RegisterIndex token
with Index == 0 and getReg() will return ZERO for this case. Note that it
doesn't return ZERO_64 on isGP64() targets.
- I haven't cleaned up all of the now-unused functions.
Some more of the generic parser could be removed too (integers and relocs
for example).
- insve.df needed a custom decoder to handle the implicit fourth operand that
was needed to make it parse correctly. The difficulty was that the matcher
expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this.
Reviewers: matheusalmeida, vmedic
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3222
llvm-svn: 205292