We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits.
Fixes PR36199
Differential Revision: https://reviews.llvm.org/D42809
llvm-svn: 324002
Until we support extending loads properly we're going to fall back for these.
We already handle stores in the same way, so this is just being consistent.
llvm-svn: 324001
Increment the field list member count for base classes and virtual base
classes.
Differential Revision: https://reviews.llvm.org/D41874
llvm-svn: 324000
Summary:
This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.
This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers is such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).
This change is a continuation of the work started in D30751.
Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar
Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits
Differential Revision: https://reviews.llvm.org/D41835
llvm-svn: 323991
This allows us to use PSHUFB for v8i16/v4i32 and VPERMD/PERMPS for v4i64/v4f64 variable shuffles.
Differential Revision: https://reviews.llvm.org/D42487
llvm-svn: 323987
Summary:
EmitTest sometimes creates X86ISD::AND specifically to hide the AND from DAG combine. But this prevents isel patterns that look for (cmp (and X, Y), 0) from being able to see it. So we end up with an AND and a TEST. The TEST gets removed by compare instruction optimization during the peephole pass.
This patch attempts to fix this by converting X86ISD::AND with no flag users back into ISD::AND during the DAG preprocessing just before isel.
In order to do this correctly I had to make the X86ISD::AND node created by EmitTest in this case really have a flag output. Which arguably it should have had anyway so that the number of operands would be consistent for the opcode in all cases. Then I had to modify the ReplaceAllUsesWith to understand that we might be looking at an instruction with 2 outputs. Though in this case there are no uses to replace since we just created the node, but that's what the code did before so I just made it keep working.
Reviewers: spatel, RKSimon, niravd, deadalnix
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42764
llvm-svn: 323982
As shown in the example in PR34994:
https://bugs.llvm.org/show_bug.cgi?id=34994
...we can return a very wrong answer (inf instead of 0.0) for square root when
using a reciprocal square root estimate instruction.
Here, I've conditionalized the filtering out of denorms based on the function
having "denormal-fp-math"="ieee" in its attributes. The other options for this
attribute are 'preserve-sign' and 'positive-zero'.
So we don't generate this extra code by default with just '-ffast-math' (because
then there's no denormal attribute string at all), but it works if you specify
'-ffast-math -fdenormal-fp-math=ieee' from clang.
As noted in the review, there may be other problems in clang that affect the
results depending on platform (Linux x86 at least), but this should allow
creating the desired codegen.
Differential Revision: https://reviews.llvm.org/D42323
llvm-svn: 323981
Summary:
In Instruction Selection UpdateChains replaces all matched Nodes'
chain references including interior token factors and deletes them.
This may allow nodes which depend on these interior nodes but are not
part of the set of matched nodes to be left with a dangling dependence.
Avoid this by doing the replacement for matched non-TokenFactor nodes.
Fixes PR36164.
Reviewers: jonpa, RKSimon, bogner
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D42754
llvm-svn: 323977
Commit r323512 introduced an optimisation in LowerReturn for half-precision
return values. A missing check caused a crash when the return value is "undef"
(i.e. a node that has no operands).
Differential Revision: https://reviews.llvm.org/D42743
llvm-svn: 323968
This patch includes EVA instructions in the Std2MicroMips mapping
tables, which is required for direct object emission.
Differential Revision: https://reviews.llvm.org/D41771
llvm-svn: 323958
This fixes bugzilla 33011
https://bugs.llvm.org/show_bug.cgi?id=33011
Defines bits {19-16} as zero or unpredictable as specified by the ARM ARM in
sections A8.8.116 and A8.8.117.
It fixes also the usage of PC register as destination register for MVN
register-shifted register version as specified in A8.8.117.
Differential Revision: https://reviews.llvm.org/D41905
llvm-svn: 323954
This, in instcombine, allows conversions to i8/i16/i32 (very
common cases) even if the resulting type is not legal according
to the data layout. This can often open up extra combine
opportunities.
Differential Revision: https://reviews.llvm.org/D42424
llvm-svn: 323951
Summary:
Before emitting code for scaled registers, we prevent
SCEVExpander from hoisting any scaled addressing mode
by emitting all the bases first. However, these bases
are being forced to the final type, resulting in some
odd code.
For example, if the type of the base is an integer and
the final type is a pointer, we will emit an inttoptr
for the base, a ptrtoint for the scale, and then a
'reverse' GEP where the GEP pointer is actually the base
integer and the index is the pointer. It's more intuitive
to use the pointer as a pointer and the integer as index.
Patch by: Bevin Hansson
Reviewers: atrick, qcolombet, sanjoy
Reviewed By: qcolombet
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42103
llvm-svn: 323946
Fix the infinite loop reported in PR35809. It can occur with GCC-style
EH table assembly, where the compiler relies on the assembler to
calculate the offsets in the EH table.
Also see https://sourceware.org/bugzilla/show_bug.cgi?id=4029 for the
equivalent issue in the GNU assembler.
Patch by Ryan Prichard!
llvm-svn: 323934
This covers the case where TruncInst leaf node is a constant expression.
See PR36121 for more details.
Differential Revision: https://reviews.llvm.org/D42622
llvm-svn: 323926
Discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html
In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.
llvm-svn: 323922
Summary:
This removes the need for a machine module pass using some deeply
questionable hacks. This should address PR36123 which is a case where in
full LTO the memory usage of a machine module pass actually ended up
being significant.
We should revert this on trunk as soon as we understand and fix the
memory usage issue, but we should include this in any backports of
retpolines themselves.
Reviewers: echristo, MatzeB
Subscribers: sanjoy, mcrosier, mehdi_amini, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D42726
llvm-svn: 323915
Summary:
Call MRI.freezeReservedRegs() on functions created during outlining so
that calls to isReserved() by the verifier called after this pass won't
assert.
Reviewers: MatzeB, qcolombet, paquette
Subscribers: mcrosier, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D42749
llvm-svn: 323905
This change is useful for the upcoming addition of the symbol
table (D41954) since in that world aliases for given function
all share the same function index.
This change does not effect lld because it essentially ignores
the wasm "table". The table exists only to the wasm objects
will validate and disassembly meaningfully.
Patch by Nicholas Wilson!
Differential Revision: https://reviews.llvm.org/D42095
llvm-svn: 323900
Summary:
This was introduced in D42646 but ended up being reverted because the original implementation was buggy.
Depends on D42646
Reviewers: craig.topper, niravd, spatel, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42741
llvm-svn: 323899
Since r322087, glibc's finite lib calls are generated when possible.
However, they are not supported on Android. This change also
disables other functions not available on Android.
Differential Revision: http://reviews.llvm.org/D42668
llvm-svn: 323898
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
CodeGenPrepare pass to be more aggressive in improving the source and destination alignments
of memcpy/memmove/memset by exploiting our new ability to record independent alignments
for each argument.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 323891
Summary:
It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.
Reviewers: craig.topper, niravd, spatel, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D42646
llvm-svn: 323888
Selecting of constant HVX vectors involves some "manual processing",
which mishandled an unrelated BITCAST operation causing a selection
error.
llvm-svn: 323887
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the Lint
analysis to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting
source & dest specific alignments through the new API.
Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.
Reference
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.htmlhttp://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
llvm-svn: 323886
This commit came as a result for revert of patch r317579 (originally
committed as r317100). The patch made CFI instructions duplicable, because
their existence in the epilogue block was affecting the Tail duplication
pass. However, duplicating blocks with CFI instructions was an issue for
compact unwind info on Darwin, which is why the patch was reverted.
This patch allows duplicating tails with CFI instructions, though they are
not duplicable, by copying them 'manually'.
Patch by Djordje Kovacevic.
Differential Revision: https://reviews.llvm.org/D40979
llvm-svn: 323883
Summary:
Instruction Selection preserves relative orders of all nodes save
TokenFactors which we treat specially. As a result Node Ids for
TokenFactors may violate the topological ordering and should not be
considered as valid pruning candidates in predecessor search.
Fixes PR35316.
Reviewers: RKSimon, hfinkel
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D42701
llvm-svn: 323880
In D41587, @mssimpso discovered that the order of some patterns for
AArch64 was sub-optimal. I thought a bit about how we could avoid that
case in the future. I do not think there is a need for evaluating all
patterns for now. But this patch adds an extra (expensive) check, that
evaluates the latencies of all patterns, and ensures that the latency
saved decreases for subsequent patterns.
This catches the sub-optimal order fixed in D41587, but I am not
entirely happy with the check, as it only applies to sub-optimal
patterns seen while building with EXPENSIVE_CHECKS on. It did not
discover any other sub-optimal pattern ordering.
Reviewers: Gerolf, spatel, mssimpso
Reviewed By: Gerolf, mssimpso
Differential Revision: https://reviews.llvm.org/D41766
llvm-svn: 323873
When selecting a split candidate for region splitting, the register allocator tries to predict which candidate will have the cheapest spill cost.
Global splitting may cause the creation of local intervals, and they might spill.
This patch makes RA take into account the spill cost of local split intervals in use blocks (we already take into account the spill cost in through blocks).
A flag ("-condsider-local-interval-cost") controls weather we do this advanced cost calculation (it's on by default for X86 target, off for the rest).
Differential Revision: https://reviews.llvm.org/D41585
Change-Id: Icccb8ad2dbf13124f5d97a18c67d95aa6be0d14d
llvm-svn: 323870
Summary:
Expressions of the form x < 0 ? 0 : x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves
In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.
Patch by Marten Svanfeldt.
Reviewers: fhahn, pbarrio
Reviewed By: pbarrio
Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D42574
llvm-svn: 323869