Commit Graph

50580 Commits

Author SHA1 Message Date
Craig Topper a5944aade1 [DAGCombiner] When folding (insert_subvector undef, (bitcast (extract_subvector N1, Idx)), Idx) -> (bitcast N1) make sure that N1 has the same total size as the original output
We were only checking the element count, but not the total width. This could cause illegal bitcasts to be created if for example the output was 512-bits, but N1 is 256 bits, and the extraction size was 128-bits.

Fixes PR36199

Differential Revision: https://reviews.llvm.org/D42809

llvm-svn: 324002
2018-02-01 20:48:50 +00:00
Amara Emerson cbc02c71a4 [GlobalISel] Fix assert failure when legalizing non-power-2 loads.
Until we support extending loads properly we're going to fall back for these.
We already handle stores in the same way, so this is just being consistent.

llvm-svn: 324001
2018-02-01 20:47:03 +00:00
Brock Wyma 4536c1f569 [CodeView] Class record member counts should include base classes and ...
Increment the field list member count for base classes and virtual base
classes.

Differential Revision: https://reviews.llvm.org/D41874

llvm-svn: 324000
2018-02-01 20:37:38 +00:00
Geoff Berry 94503c7bc3 [MachineCopyPropagation] Extend pass to do COPY source forwarding
Summary:
This change extends MachineCopyPropagation to do COPY source forwarding
and adds an additional run of the pass to the default pass pipeline just
after register allocation.

This version of this patch uses the newly added
MachineOperand::isRenamable bit to avoid forwarding registers is such a
way as to violate constraints that aren't captured in the
Machine IR (e.g. ABI or ISA constraints).

This change is a continuation of the work started in D30751.

Reviewers: qcolombet, javed.absar, MatzeB, jonpa, tstellar

Subscribers: tpr, mgorny, mcrosier, nhaehnle, nemanjai, jyknight, hfinkel, arsenm, inouehrs, eraman, sdardis, guyblank, fedor.sergeev, aheejin, dschuff, jfb, myatsina, llvm-commits

Differential Revision: https://reviews.llvm.org/D41835

llvm-svn: 323991
2018-02-01 18:54:01 +00:00
Changpeng Fang 29fcf883fb AMDGPU/SI: Adjust the encoding family for D16 buffer instructions when the target has UnpackedD16VMem feature.
Reviewers:
  Matt and Brian

Differential Revision:
  https://reviews.llvm.org/D42548

llvm-svn: 323988
2018-02-01 18:41:33 +00:00
Simon Pilgrim 1a8cefc328 [X86][SSE] LowerBUILD_VECTORAsVariablePermute - add support for scaling index vectors
This allows us to use PSHUFB for v8i16/v4i32 and VPERMD/PERMPS for v4i64/v4f64 variable shuffles.

Differential Revision: https://reviews.llvm.org/D42487

llvm-svn: 323987
2018-02-01 18:10:30 +00:00
Sanjay Patel 702c19cc3e [AArch64] add tests with sqrt estimate and ieee denorms; NFC
As noted in D42323, we're not checking for denorms as we should.

llvm-svn: 323985
2018-02-01 17:57:45 +00:00
Sanjay Patel f42381fd7e [AArch64] auto-generate complete checks; NFC
llvm-svn: 323984
2018-02-01 17:44:50 +00:00
Craig Topper 7e910a9e85 [X86] Turn X86ISD::AND nodes that have no flag users back into ISD::AND just before isel to enable test instruction matching
Summary:
EmitTest sometimes creates X86ISD::AND specifically to hide the AND from DAG combine. But this prevents isel patterns that look for (cmp (and X, Y), 0) from being able to see it. So we end up with an AND and a TEST. The TEST gets removed by compare instruction optimization during the peephole pass.

This patch attempts to fix this by converting X86ISD::AND with no flag users back into ISD::AND during the DAG preprocessing just before isel.

In order to do this correctly I had to make the X86ISD::AND node created by EmitTest in this case really have a flag output. Which arguably it should have had anyway so that the number of operands would be consistent for the opcode in all cases. Then I had to modify the ReplaceAllUsesWith to understand that we might be looking at an instruction with 2 outputs. Though in this case there are no uses to replace since we just created the node, but that's what the code did before so I just made it keep working.

Reviewers: spatel, RKSimon, niravd, deadalnix

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42764

llvm-svn: 323982
2018-02-01 17:08:39 +00:00
Sanjay Patel 657e5d8d41 [DAGCombiner] filter out denorm inputs when calculating sqrt estimate (PR34994)
As shown in the example in PR34994:
https://bugs.llvm.org/show_bug.cgi?id=34994
...we can return a very wrong answer (inf instead of 0.0) for square root when 
using a reciprocal square root estimate instruction.

Here, I've conditionalized the filtering out of denorms based on the function 
having "denormal-fp-math"="ieee" in its attributes. The other options for this 
attribute are 'preserve-sign' and 'positive-zero'.

So we don't generate this extra code by default with just '-ffast-math' (because 
then there's no denormal attribute string at all), but it works if you specify 
'-ffast-math -fdenormal-fp-math=ieee' from clang. 

As noted in the review, there may be other problems in clang that affect the 
results depending on platform (Linux x86 at least), but this should allow 
creating the desired codegen.

Differential Revision: https://reviews.llvm.org/D42323

llvm-svn: 323981
2018-02-01 16:57:18 +00:00
Nirav Dave 18f7f60e17 [SelectionDAG] Fix UpdateChains handling of TokenFactors
Summary:
In Instruction Selection UpdateChains replaces all matched Nodes'
chain references including interior token factors and deletes them.
This may allow nodes which depend on these interior nodes but are not
part of the set of matched nodes to be left with a dangling dependence.
Avoid this by doing the replacement for matched non-TokenFactor nodes.

Fixes PR36164.

Reviewers: jonpa, RKSimon, bogner

Subscribers: llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D42754

llvm-svn: 323977
2018-02-01 16:11:59 +00:00
Simon Pilgrim eb50b6d060 [X86][SSE] Add PR26491 horizontal add test
llvm-svn: 323973
2018-02-01 15:30:02 +00:00
Simon Pilgrim afc7c63bc2 [X86][AVX512DQ] Add DQ var permute 256 tests as requested on D42487
llvm-svn: 323970
2018-02-01 14:44:50 +00:00
Sjoerd Meijer 9d9a86535e [ARM] FullFP16 LowerReturn Fix
Commit r323512 introduced an optimisation in LowerReturn for half-precision
return values. A missing check caused a crash when the return value is "undef"
(i.e. a node that has no operands).

Differential Revision: https://reviews.llvm.org/D42743

llvm-svn: 323968
2018-02-01 13:48:40 +00:00
David Green 184df0c35d Revert commit rL323951
Looks like it's causing timeouts out on at least ppc64le
buildbots.

llvm-svn: 323959
2018-02-01 13:05:25 +00:00
Aleksandar Beserminji a330c208f2 [mips] Include EVA instructions in Std2MicroMips mapping tables
This patch includes EVA instructions in the Std2MicroMips mapping
tables, which is required for direct object emission.

Differential Revision: https://reviews.llvm.org/D41771

llvm-svn: 323958
2018-02-01 12:53:26 +00:00
Yvan Roux 490e9e6761 [ARM] Add support for unpredictable MVN instructions.
This fixes bugzilla 33011
https://bugs.llvm.org/show_bug.cgi?id=33011

Defines bits {19-16} as zero or unpredictable as specified by the ARM ARM in
sections A8.8.116 and A8.8.117.

It fixes also the usage of PC register as destination register for MVN
register-shifted register version as specified in A8.8.117.

Differential Revision: https://reviews.llvm.org/D41905

llvm-svn: 323954
2018-02-01 12:06:57 +00:00
David Green e11f0545db [InstCombine] Allow common type conversions to i8/i16/i32
This, in instcombine, allows conversions to i8/i16/i32 (very
common cases) even if the resulting type is not legal according
to the data layout. This can often open up extra combine
opportunities.

Differential Revision: https://reviews.llvm.org/D42424

llvm-svn: 323951
2018-02-01 11:06:18 +00:00
Mikael Holmen 6d06976e74 [LSR] Don't force bases of foldable formulae to the final type.
Summary:
Before emitting code for scaled registers, we prevent
SCEVExpander from hoisting any scaled addressing mode
by emitting all the bases first. However, these bases
are being forced to the final type, resulting in some
odd code.

For example, if the type of the base is an integer and
the final type is a pointer, we will emit an inttoptr
for the base, a ptrtoint for the scale, and then a
'reverse' GEP where the GEP pointer is actually the base
integer and the index is the pointer. It's more intuitive
to use the pointer as a pointer and the integer as index.

Patch by: Bevin Hansson

Reviewers: atrick, qcolombet, sanjoy

Reviewed By: qcolombet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42103

llvm-svn: 323946
2018-02-01 06:38:34 +00:00
Rafael Espindola 45b12f1835 [MC] Fix assembler infinite loop on EH table using LEB padding.
Fix the infinite loop reported in PR35809. It can occur with GCC-style
EH table assembly, where the compiler relies on the assembler to
calculate the offsets in the EH table.

Also see https://sourceware.org/bugzilla/show_bug.cgi?id=4029 for the
equivalent issue in the GNU assembler.

Patch by Ryan Prichard!

llvm-svn: 323934
2018-02-01 00:25:19 +00:00
Matt Arsenault df0f25070c DAG: Fix not truncating when promoting bswap/bitreverse
These need to convert back to the original type, like any
other promotion.

llvm-svn: 323932
2018-01-31 23:54:16 +00:00
Evgeniy Stepanov 7746899f48 Revert "[ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations"
Miscompiles code. Testcase pending.

This reverts commit r323869.

llvm-svn: 323929
2018-01-31 22:55:19 +00:00
Amjad Aboud b86b771c02 [AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly.
This covers the case where TruncInst leaf node is a constant expression.
See PR36121 for more details.

Differential Revision: https://reviews.llvm.org/D42622

llvm-svn: 323926
2018-01-31 22:39:05 +00:00
Puyan Lotfi 43e94b15ea Followup on Proposal to move MIR physical register namespace to '$' sigil.
Discussed here:

http://lists.llvm.org/pipermail/llvm-dev/2018-January/120320.html

In preparation for adding support for named vregs we are changing the sigil for
physical registers in MIR to '$' from '%'. This will prevent name clashes of
named physical register with named vregs.

llvm-svn: 323922
2018-01-31 22:04:26 +00:00
Chandler Carruth 0dcee4fe7a [x86] Make the retpoline thunk insertion a machine function pass.
Summary:
This removes the need for a machine module pass using some deeply
questionable hacks. This should address PR36123 which is a case where in
full LTO the memory usage of a machine module pass actually ended up
being significant.

We should revert this on trunk as soon as we understand and fix the
memory usage issue, but we should include this in any backports of
retpolines themselves.

Reviewers: echristo, MatzeB

Subscribers: sanjoy, mcrosier, mehdi_amini, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D42726

llvm-svn: 323915
2018-01-31 20:56:37 +00:00
Krzysztof Parzyszek 1108ee2496 [Hexagon] Implement HVX codegen for vector shifts
llvm-svn: 323914
2018-01-31 20:49:24 +00:00
Marek Olsak 8f2df9d26c [SeparateConstOffsetFromGEP] Fix up addrspace in the AMDGPU test
llvm-svn: 323913
2018-01-31 20:49:19 +00:00
Krzysztof Parzyszek 9eb085e6cf [Hexagon] Handle ANY_EXTEND_VECTOR_INREG in lowering
llvm-svn: 323912
2018-01-31 20:48:11 +00:00
Krzysztof Parzyszek b843f75179 [Hexagon] Handle SETCC on vector pairs in lowering
llvm-svn: 323911
2018-01-31 20:46:55 +00:00
Marek Olsak d4bb329d0e AMDGPU: Fold inline offset for loads properly in moveToVALU on GFX9
Summary:
This enables load merging into x2, x4, which is driven by inline offsets.

6500 shaders are affected:
Code Size in affected shaders: -15.14 %

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D42078

llvm-svn: 323909
2018-01-31 20:18:11 +00:00
Marek Olsak 13e4741275 AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16}
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D41663

llvm-svn: 323908
2018-01-31 20:18:04 +00:00
Marek Olsak 8e7d149a31 [SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs
Summary:
!amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things
happen.

Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie

Subscribers: wdng, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D42744

llvm-svn: 323907
2018-01-31 20:17:52 +00:00
Geoff Berry 82203c4149 [MachineOutliner] Freeze registers in new functions
Summary:
Call MRI.freezeReservedRegs() on functions created during outlining so
that calls to isReserved() by the verifier called after this pass won't
assert.

Reviewers: MatzeB, qcolombet, paquette

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D42749

llvm-svn: 323905
2018-01-31 20:15:16 +00:00
Sam Clegg f9edbe95db [WebAssembly] MC: Resolve aliases when creating provisional table entries
This change is useful for the upcoming addition of the symbol
table (D41954) since in that world aliases for given function
all share the same function index.

This change does not effect lld because it essentially ignores
the wasm "table".  The table exists only to the wasm objects
will validate and disassembly meaningfully.

Patch by Nicholas Wilson!

Differential Revision: https://reviews.llvm.org/D42095

llvm-svn: 323900
2018-01-31 19:28:47 +00:00
Amaury Sechet f9a9e9a251 [X86] Generate testl instruction through truncates.
Summary:
This was introduced in D42646 but ended up being reverted because the original implementation was buggy.

Depends on D42646

Reviewers: craig.topper, niravd, spatel, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42741

llvm-svn: 323899
2018-01-31 19:20:06 +00:00
Chih-Hung Hsieh 60d1e79ffb [Analysis] Disable calls to *_finite and other glibc-only functions on Android.
Since r322087, glibc's finite lib calls are generated when possible.
However, they are not supported on Android. This change also
disables other functions not available on Android.

Differential Revision: http://reviews.llvm.org/D42668

llvm-svn: 323898
2018-01-31 19:12:50 +00:00
Max Moroz 790baeed37 [llvm-cov] Improvements for summary report generated in HTML format.
Summary:
This commit adds the following changes:

1) coverage numbers are aligned to the left and padded with spaces in order to
provide better readability for percentage values, e.g.:

```
file1     |  89.13% (123 / 2323)    | 100.00% (55 / 55)    |   9.33% (14545 / 234234)
file_asda |   1.78% ( 23 / 4323)    |  32.31% (555 / 6555) |  67.89% (1545 / 2234)
fileXXX   | 100.00% (12323 / 12323) | 100.00% (555 / 555)  | 100.00% (12345 / 12345)
```

2) added "hover" attribute to CSS for highlighting table row under mouse cursor
see screenshot attached to the phabricator review page

{F5764813}

3) table title row and "totals" row now use bold text

Reviewers: vsk, morehouse

Reviewed By: vsk

Subscribers: kcc, llvm-commits

Differential Revision: https://reviews.llvm.org/D42093

llvm-svn: 323892
2018-01-31 17:37:21 +00:00
Daniel Neilson be58a220e9 [CodeGenPrepare] Improve source and dest alignments of memory intrinsics independently
Summary:
  This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
CodeGenPrepare pass to be more aggressive in improving the source and destination alignments
of memcpy/memmove/memset by exploiting our new ability to record independent alignments
for each argument.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 323891
2018-01-31 17:24:53 +00:00
Krzysztof Parzyszek 82a83391d3 [Hexagon] Handle BUILD_VECTOR from undef values in buildHvxVectorReg
llvm-svn: 323889
2018-01-31 16:52:15 +00:00
Amaury Sechet f89f188ddb [X86] Avoid using high register trick for test instruction
Summary:
It seems it's main effect is to create addition copies when values are inr register that do not support this trick, which increase register pressure and makes the code bigger.

Reviewers: craig.topper, niravd, spatel, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D42646

llvm-svn: 323888
2018-01-31 16:48:54 +00:00
Krzysztof Parzyszek 8cc636c592 [Hexagon] Only process bitcasts of vsplats when selecting const vectors
Selecting of constant HVX vectors involves some "manual processing",
which mishandled an unrelated BITCAST operation causing a selection
error.

llvm-svn: 323887
2018-01-31 16:48:20 +00:00
Daniel Neilson 147810d28a [Lint] Upgrade uses of MemoryIntrinic::getAlignment() to new API. (NFCI)
Summary:
  This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the Lint
analysis to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting
source & dest specific alignments through the new API.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 323886
2018-01-31 16:42:15 +00:00
Petar Jovanovic 540f4cd10a [DWARF] Allow duplication of tails with CFI instructions
This commit came as a result for revert of patch r317579 (originally
committed as r317100). The patch made CFI instructions duplicable, because
their existence in the epilogue block was affecting the Tail duplication
pass. However, duplicating blocks with CFI instructions was an issue for
compact unwind info on Darwin, which is why the patch was reverted.

This patch allows duplicating tails with CFI instructions, though they are
not duplicable, by copying them 'manually'.


Patch by Djordje Kovacevic.

Differential Revision: https://reviews.llvm.org/D40979

llvm-svn: 323883
2018-01-31 15:57:57 +00:00
Sanjay Patel fd58ade81c [InstCombine] move related tests into the same file; NFC
llvm-svn: 323882
2018-01-31 15:47:59 +00:00
Sanjay Patel 8c74a9a155 [InstCombine] add tests to show limit of canEvaluate* ; NFC
llvm-svn: 323881
2018-01-31 15:28:39 +00:00
Nirav Dave c3a1e16db1 [DAG] Prevent NodeId pruning of TokenFactors in Instruction Selection.
Summary:
Instruction Selection preserves relative orders of all nodes save
TokenFactors which we treat specially. As a result Node Ids for
TokenFactors may violate the topological ordering and should not be
considered as valid pruning candidates in predecessor search.

Fixes PR35316.

Reviewers: RKSimon, hfinkel

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D42701

llvm-svn: 323880
2018-01-31 15:23:17 +00:00
Marina Yatsina 3f34f33148 Fix build error in r323870
Change-Id: I15a8b27764a4d817cfbe48836bf09dc6520934b7
llvm-svn: 323874
2018-01-31 14:18:37 +00:00
Florian Hahn c68428b5dc [MachineCombiner] Add check for optimal pattern order.
In D41587, @mssimpso discovered that the order of some patterns for
AArch64 was sub-optimal. I thought a bit about how we could avoid that
case in the future. I do not think there is a need for evaluating all
patterns for now. But this patch adds an extra (expensive) check, that
evaluates the latencies of all patterns, and ensures that the latency
saved decreases for subsequent patterns.

This catches the sub-optimal order fixed in D41587, but I am not
entirely happy with the check, as it only applies to sub-optimal
patterns seen while building with EXPENSIVE_CHECKS on. It did not
discover any other sub-optimal pattern ordering.

Reviewers: Gerolf, spatel, mssimpso

Reviewed By: Gerolf, mssimpso

Differential Revision: https://reviews.llvm.org/D41766

llvm-svn: 323873
2018-01-31 13:54:30 +00:00
Marina Yatsina cd5bc4a2cd Take into account the cost of local intervals when selecting split candidate.
When selecting a split candidate for region splitting, the register allocator tries to predict which candidate will have the cheapest spill cost.
Global splitting may cause the creation of local intervals, and they might spill.

This patch makes RA take into account the spill cost of local split intervals in use blocks (we already take into account the spill cost in through blocks).
A flag ("-condsider-local-interval-cost") controls weather we do this advanced cost calculation (it's on by default for X86 target, off for the rest).

Differential Revision: https://reviews.llvm.org/D41585

Change-Id: Icccb8ad2dbf13124f5d97a18c67d95aa6be0d14d
llvm-svn: 323870
2018-01-31 13:31:08 +00:00
Pablo Barrio 2e442a7831 [ARM] Lower lower saturate to 0 and lower saturate to -1 using bit-operations
Summary:
Expressions of the form x < 0 ? 0 :  x; and x < -1 ? -1 : x can be lowered using bit-operations instead of branching or conditional moves

In thumb-mode this results in a two-instruction sequence, a shift followed by a bic or or while in ARM/thumb2 mode that has flexible second operand the shift can be folded into a single bic/or instructions. In most cases this results in smaller code and possibly less branches, and in no case larger than before.

Patch by Marten Svanfeldt.

Reviewers: fhahn, pbarrio

Reviewed By: pbarrio

Subscribers: efriedma, rogfer01, aemerson, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D42574

llvm-svn: 323869
2018-01-31 13:20:10 +00:00