Commit Graph

44489 Commits

Author SHA1 Message Date
Craig Topper 6b1b630a98 [SelectionDAG] Use known ones to provide a better bound for the known zeros for CTTZ/CTLZ operations.
This is the SelectionDAG version of D32521. If we know where at least one 1 is located in the input to these intrinsics, we can place an upper bound on the number of bits needed to represent the count and thus increase the number of known zeros in the output.
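
The bound can be illustrated with a small model for the CTTZ case. This is only a sketch, not the SelectionDAG code: the function names are hypothetical, plain Python integers stand in for the known-bits masks, and CTTZ of zero is assumed to return the bit width.

```python
def count_trailing_zeros(x, bit_width):
    """cttz with the defined-at-zero convention: cttz(0) == bit_width."""
    if x == 0:
        return bit_width
    return (x & -x).bit_length() - 1


def cttz_known_zero_mask(known_one, bit_width):
    """Mask of result bits that must be zero, given known-one input bits."""
    # The lowest known-one bit caps the trailing-zero count. With no known
    # ones the count can be as large as bit_width.
    max_count = count_trailing_zeros(known_one, bit_width)
    # Number of bits needed to represent every value in 0..max_count.
    low_bits = max_count.bit_length()
    all_bits = (1 << bit_width) - 1
    # Everything above those low bits is known to be zero in the result.
    return all_bits & ~((1 << low_bits) - 1)
```

For example, if bit 3 of a 32-bit input is known to be one, the count is at most 3, so only the low two result bits can be set and `cttz_known_zero_mask(0b1000, 32)` yields `0xFFFFFFFC`.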

I think we can also refine this further for CTTZ_UNDEF/CTLZ_UNDEF by assuming that the answer will never be BitWidth. I've left this out for now because it caused other test failures across multiple targets, usually because ADD was turned into OR based on this new information.

I'll fix CTPOP in a future patch.

Differential Revision: https://reviews.llvm.org/D32692

llvm-svn: 301806
2017-05-01 16:08:06 +00:00
Xin Tong 21f8ac235e [JumpThread] Do RAUW in case Cond folds to a constant in the CFG
Summary: [JumpThread] Do RAUW in case Cond folds to a constant in the CFG

Reviewers: sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32407

llvm-svn: 301804
2017-05-01 15:34:17 +00:00
Sanjay Patel d2f13b62d9 [InstCombine] add multi-use variants for DeMorgan folds; NFC
llvm-svn: 301802
2017-05-01 14:52:17 +00:00
Sanjay Patel c526fbcfa9 [InstCombine] use FileCheck and auto-generate checks; NFC
llvm-svn: 301801
2017-05-01 14:20:30 +00:00
Sanjay Patel 4e312203af [InstCombine] consolidate more DeMorgan tests; NFC
llvm-svn: 301800
2017-05-01 14:10:59 +00:00
Michael Zuckerman da4b52e4bf Fix test for altmacro
llvm-svn: 301799
2017-05-01 14:00:54 +00:00
Michael Zuckerman 56704618aa [LLVM][inline-asm] Altmacro absolute expression '%' feature
In this patch, I introduce a new alt macro feature.
This feature adds meaning for the % when using it as a prefix to the calling macro arguments.

In altmacro mode, a percent sign '%' before an absolute expression first converts the expression to a string,
as described in https://sourceware.org/binutils/docs-2.27/as/Altmacro.html
"Expression results as strings
You can write `%expr' to evaluate the expression expr and use the result as a string."

expression assumptions:

1. '%' can only evaluate an absolute expression.
2. Altmacro '%' must be the first character of the evaluated expression.
3. If no '%' is located before the expression, a regular modulo operation is expected.
4. The result of an absolute expression can only be an integer.
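
The rules above can be sketched with a toy argument expander. This is an illustration under stated assumptions, not the LLVM assembler's implementation: `expand_macro_argument` and `_eval_absolute` are hypothetical names, only a small subset of operators is supported, and Python's `ast` module stands in for the assembler's expression parser.

```python
import ast
import operator

# Operators allowed inside the absolute expression (an illustrative subset).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Mod: operator.mod,  # '%' inside the expression is ordinary modulo
}


def _eval_absolute(node):
    """Evaluate an integer-only expression tree; reject anything else."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_absolute(node.operand)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_absolute(node.left)
        right = _eval_absolute(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    raise ValueError("not an absolute integer expression")


def expand_macro_argument(arg):
    """Expand one macro argument under the '%' rule."""
    if not arg.startswith("%"):
        return arg  # no leading '%': the text is passed through untouched
    tree = ast.parse(arg[1:].strip(), mode="eval")
    return str(_eval_absolute(tree.body))
```

With this model, `expand_macro_argument("%1+2")` produces the string `"3"`, while `expand_macro_argument("1+2")` is returned unchanged. A symbolic argument such as `%foo` is rejected because it is not an absolute expression.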

Differential Revision: https://reviews.llvm.org/D32526

llvm-svn: 301797
2017-05-01 13:20:12 +00:00
Dylan McKay 59e7fe3da8 [AVR] Implement non-constant bit rotations
This lets us do bit rotations of variable amount.

llvm-svn: 301794
2017-05-01 09:48:55 +00:00
Igor Breger 4064dc76c5 [GlobalISel][X86] rename test file. NFC.
llvm-svn: 301793
2017-05-01 08:11:02 +00:00
Craig Topper c8b5693948 [X86] Add tests for opportunities to improve known bits for CTTZ and CTLZ.
llvm-svn: 301791
2017-05-01 06:33:17 +00:00
Igor Breger c08a783521 [GlobalISel][X86] G_SEXT/G_ZEXT support.
Reviewers: zvi, guyblank

Reviewed By: zvi

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D32591

llvm-svn: 301790
2017-05-01 06:30:16 +00:00
Igor Breger a9edb88d46 [GlobalISel][X86] G_LOAD/G_STORE pointer selection support.
Summary: [GlobalISel][X86] G_LOAD/G_STORE pointer selection support.

Reviewers: zvi, guyblank

Reviewed By: zvi, guyblank

Subscribers: dberris, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D32217

llvm-svn: 301788
2017-05-01 06:08:32 +00:00
Dylan McKay 2e8718bcbb [AVR] Fix a bug so that we now emit R_AVR_16 fixups with the correct offset
Before this, the LDS/STS instructions would have their opcodes
overwritten while linking.

llvm-svn: 301782
2017-04-30 23:33:52 +00:00
Sanjay Patel ad13826aea [DAGCombiner] shrink/widen a vselect to match its condition operand size (PR14657)
We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that
patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check
legality of the select that we want to produce.

A few things to note:

1. We can't wait until after legalization and do this generically because (at least in the x86
   tests from PR14657), we'll have PACKSS and bitcasts in the pattern.
2. This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but
   that requires a closer look to make sure we don't end up worse.
3. There's a 'vblendv' opportunity that we're missing that results in andn/and/or in some cases. 
   That should be fixed next.
4. I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple 
   legal vector sizes, but if there are other targets like that, we should add more tests.
5. There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests):
   despite IR that is terrible for the target, this patch allows us to generate the optimal loop
   code because something post-ISEL is hoisting the splat extends above the vector loops.

Differential Revision: https://reviews.llvm.org/D32620

llvm-svn: 301781
2017-04-30 22:44:51 +00:00
Sanjoy Das 08989c7ecd Rename isKnownNotFullPoison to programUndefinedIfPoison; NFC
Summary:
programUndefinedIfPoison makes more sense, given what the function
does; and I'm about to add a function with a name similar to
isKnownNotFullPoison (so do the rename to avoid confusion).

Reviewers: broune, majnemer, bjarke.roune

Reviewed By: broune

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D30444

llvm-svn: 301776
2017-04-30 19:41:19 +00:00
Amaury Sechet 8ac81f3924 Do not legalize large add with addc/adde, introduce addcarry and do it with uaddo/addcarry
Summary: Following the discussion on how to get better codegen for large integer legalization, it became clear that using glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allows for more flexibility.
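
The idea of threading the carry through as an ordinary value can be sketched as follows. This is a model of the arithmetic only, not of the DAG nodes themselves: the function names mirror uaddo/addcarry for readability, and the 32-bit limb width and little-endian limb order are assumptions made for the example.

```python
LIMB_BITS = 32
LIMB_MASK = (1 << LIMB_BITS) - 1


def uaddo(a, b):
    """Add two limbs; return (sum, carry_out)."""
    total = a + b
    return total & LIMB_MASK, total >> LIMB_BITS


def addcarry(a, b, carry_in):
    """Add two limbs plus an incoming carry; return (sum, carry_out)."""
    total = a + b + carry_in
    return total & LIMB_MASK, total >> LIMB_BITS


def add_wide(a_limbs, b_limbs):
    """Add two equal-length little-endian limb lists, dropping the last carry."""
    low, carry = uaddo(a_limbs[0], b_limbs[0])
    result = [low]
    # The carry is a plain value handed from one limb addition to the next.
    for a, b in zip(a_limbs[1:], b_limbs[1:]):
        limb, carry = addcarry(a, b, carry)
        result.append(limb)
    return result
```

For instance, `add_wide([0xFFFFFFFF, 0], [1, 0])` gives `[0, 1]`: the low limb wraps and the carry value flows into the high limb.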

Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29872

llvm-svn: 301775
2017-04-30 19:24:09 +00:00
Sanjay Patel 0c6086f493 [InstCombine] consolidate tests for DeMorgan folds; NFC
I'm proposing to add tests and change behavior in D32665.

llvm-svn: 301774
2017-04-30 18:57:12 +00:00
Zvi Rackover 4086e13e0d InstructionSimplify: Simplify a shuffle with an undef mask to undef
Summary:
Following the discussion in pr32486, adding the simplification:
 shuffle %x, %y, undef -> undef

Reviewers: spatel, RKSimon, andreadb, davide

Reviewed By: spatel

Subscribers: jroelofs, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D32293

llvm-svn: 301764
2017-04-30 06:06:26 +00:00
Simon Atanasyan 3979f43813 [mips] Emit R_MICROMIPS_TLS_GOTTPREL relocation for %gottprel in case of microMIPS
In microMIPS mode, the %gottprel operator should emit the microMIPS
relocation R_MICROMIPS_TLS_GOTTPREL, not R_MIPS_TLS_GOTTPREL.

Differential Revision: http://reviews.llvm.org/D32617

llvm-svn: 301763
2017-04-30 04:27:23 +00:00
Daniel Sanders 887a141d4d [globalisel][tablegen] Fix the test after silencing the unused variable warning in r301755.
llvm-svn: 301756
2017-04-29 19:46:27 +00:00
Daniel Sanders e9fdba39e0 [globalisel][tablegen] Compute available feature bits correctly.
Summary:
Predicate<> now has a field to indicate how often it must be recomputed.
Currently, there are two frequencies, per-module (RecomputePerFunction==0)
and per-function (RecomputePerFunction==1). Per-function predicates are
currently recomputed more frequently than necessary since the only predicate
in this category is cheap to test. Per-module predicates are now computed in
getSubtargetImpl() while per-function predicates are computed in selectImpl().

Tablegen now manages the PredicateBitset internally. It should only be
necessary to add the required includes.

Also fixed a problem revealed by the test case where
constrainSelectedInstRegOperands() would attempt to tie operands that
BuildMI had already tied.

Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D32491

llvm-svn: 301750
2017-04-29 17:30:09 +00:00
Simon Pilgrim 694cb2c838 [X86][AVX] Added codegen tests for _mm256_zext* helper intrinsics (PR32839)
Not great codegen, especially as VEX moves support implicit zeroing of upper bits....

llvm-svn: 301748
2017-04-29 17:15:12 +00:00
Simon Pilgrim ac7f3e24d3 [X86][SSE] Add initial <2 x half> tests for PR31088
As discussed on D32391, test X86/X64 SSE2 and X64 F16C.

llvm-svn: 301744
2017-04-29 14:29:06 +00:00
Matt Arsenault 2a80369ae4 AMDGPU: Fix copies from physical registers in SIFixSGPRCopies
This would assert when there were multiple defs of
a physical register.

We just need to move all of the users of it.

llvm-svn: 301730
2017-04-29 01:26:34 +00:00
Zachary Turner 5b6e4e0aed [llvm-pdbdump] Abstract some of the YAML/Raw printing code.
There is a lot of duplicate code for printing line info between
YAML and the raw output printer.  This introduces a base class
that can be shared between the two, and makes some minor
cleanups in the process.

llvm-svn: 301728
2017-04-29 01:13:21 +00:00
Akira Hatanaka 6fdcb3c2ce [ObjCARC] Do not move a release between a call and a
retainAutoreleasedReturnValue that retains the returned value.

This commit fixes a bug in ARC optimizer where it moves a release
between a call and a retainAutoreleasedReturnValue, causing the returned
object to be released before the retainAutoreleasedReturnValue can
retain it.

This commit accomplishes that by doing a lookahead and checking whether
the call prevents the release from moving upwards. In the long term, we
should treat the region between the retainAutoreleasedReturnValue and
the call as a critical section and disallow moving anything there
(possibly using operand bundles).

rdar://problem/20449878

llvm-svn: 301724
2017-04-29 00:23:11 +00:00
Davide Italiano 534e314356 [LoopUnswitch] Don't remove instructions with side effects.
This fixes PR32818.

Differential Revision: https://reviews.llvm.org/D32664

llvm-svn: 301722
2017-04-29 00:12:18 +00:00
Sanjay Patel c8ab6bb27d [InstCombine] add tests to show potentially bogus application of DeMorgan (NFC)
llvm-svn: 301714
2017-04-28 23:14:33 +00:00
Matt Arsenault e0f9e984fd InferAddressSpaces: Search constant expressions for addrspacecasts
These are pretty common when using local memory, and the 64-bit generic
addressing is much more expensive to compute.

llvm-svn: 301711
2017-04-28 22:52:41 +00:00
Adrian Prantl fed4f399d3 Remove line and file from DINamespace.
Fixes the issue highlighted in
http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html.

The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces can
prevent LLVM from uniquing types that are in the same namespace. They
also don't carry any meaningful information.

rdar://problem/17484998
Differential Revision: https://reviews.llvm.org/D32648

llvm-svn: 301706
2017-04-28 22:25:46 +00:00
Matt Arsenault a1e734050c InferAddressSpaces: Infer from just addrspacecasts
Eliminates some more cases where some subset of the addressing
computation remains flat. Some cases with addrspacecasts
in nested constant expressions are still left behind however.

llvm-svn: 301704
2017-04-28 22:18:08 +00:00
Krzysztof Parzyszek 072ddb383c [RDF] Correctly calculate lane masks for defs
llvm-svn: 301700
2017-04-28 21:57:53 +00:00
Krzysztof Parzyszek 2065a2f4e6 Properly handle PHIs with subregisters in UnreachableBlockElim
When a PHI operand has a subregister, create a COPY instead of simply
replacing the PHI output with the input.

Differential Revision: https://reviews.llvm.org/D32650

llvm-svn: 301699
2017-04-28 21:56:33 +00:00
Krzysztof Parzyszek 0b3acbb1dd [Hexagon] Do not move a block if it is on a fall-through path
llvm-svn: 301698
2017-04-28 21:54:11 +00:00
Sam Clegg a06de02889 [WebAssembly] Add size of section header to data relocation offsets.
Also, add test for data relocations and fix addend to
be signed.

Subscribers: jfb, dschuff

Differential Revision: https://reviews.llvm.org/D32513

llvm-svn: 301690
2017-04-28 21:22:38 +00:00
Matt Arsenault cf5e7fe358 [ValueTracking] Teach isSafeToSpeculativelyExecute() about the speculatable attribute
Patch by Tom Stellard

llvm-svn: 301688
2017-04-28 21:13:09 +00:00
Sam Clegg ff0730b3fc [WebAssembly] Write initial memory in pages not bytes
Subscribers: jfb, dschuff

Differential Revision: https://reviews.llvm.org/D32660

llvm-svn: 301687
2017-04-28 21:12:09 +00:00
Matt Arsenault b19b57ea60 Add speculatable function attribute
This attribute tells the optimizer that the function may be speculated.

Patch by Tom Stellard

llvm-svn: 301680
2017-04-28 20:25:27 +00:00
Marek Olsak 2d82590f64 AMDGPU: Add new amdgcn.init.exec intrinsics
v2: More tests, bug fixes, cosmetic changes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D31762

llvm-svn: 301677
2017-04-28 20:21:58 +00:00
Alexei Starovoitov f7bd5ebd3b [bpf] add bigendian support to disassembler
. swap 4-bit register encoding, 16-bit offset and 32-bit imm to support big endian archs
. add a test

Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 301653
2017-04-28 16:51:01 +00:00
Jun Bum Lim 919f9e8d65 [InlineCost] Improve the cost heuristic for Switch
Summary:
The motivating example below has 13 cases but only 2 distinct targets:

```
lor.lhs.false2:                                   ; preds = %if.then
  switch i32 %Status, label %if.then27 [
    i32 -7012, label %if.end35
    i32 -10008, label %if.end35
    i32 -10016, label %if.end35
    i32 15000, label %if.end35
    i32 14013, label %if.end35
    i32 10114, label %if.end35
    i32 10107, label %if.end35
    i32 10105, label %if.end35
    i32 10013, label %if.end35
    i32 10011, label %if.end35
    i32 7008, label %if.end35
    i32 7007, label %if.end35
    i32 5002, label %if.end35
  ]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9:                              // %lor.lhs.false2
        mov     w8, #10012
        cmp             w19, w8
        b.gt    .LBB853_14
// BB#10:                               // %lor.lhs.false2
        mov     w8, #5001
        cmp             w19, w8
        b.gt    .LBB853_18
// BB#11:                               // %lor.lhs.false2
        mov     w8, #-10016
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#12:                               // %lor.lhs.false2
        mov     w8, #-10008
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#13:                               // %lor.lhs.false2
        mov     w8, #-7012
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_14:                             // %lor.lhs.false2
        mov     w8, #14012
        cmp             w19, w8
        b.gt    .LBB853_21
// BB#15:                               // %lor.lhs.false2
        mov     w8, #-10105
        add             w8, w19, w8
        cmp             w8, #9          // =9
        b.hi    .LBB853_17
// BB#16:                               // %lor.lhs.false2
        orr     w9, wzr, #0x1
        lsl     w8, w9, w8
        mov     w9, #517
        and             w8, w8, w9
        cbnz    w8, .LBB853_23
.LBB853_17:                             // %lor.lhs.false2
        mov     w8, #10013
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_18:                             // %lor.lhs.false2
        mov     w8, #-7007
        add             w8, w19, w8
        cmp             w8, #2          // =2
        b.lo    .LBB853_23
// BB#19:                               // %lor.lhs.false2
        mov     w8, #5002
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#20:                               // %lor.lhs.false2
        mov     w8, #10011
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_21:                             // %lor.lhs.false2
        mov     w8, #14013
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#22:                               // %lor.lhs.false2
        mov     w8, #15000
        cmp             w19, w8
        b.ne    .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change uses the generic switch lowering approach for the inline heuristic. It
estimates the number of case clusters, using the suitability checks for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear in the size of the balanced binary
tree. The model is off by default for now:
  -inline-generic-switch-cost=false
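
The gap between the two models can be sketched as follows. This is a rough illustration, not the patch's code: both function names are hypothetical, and the recursion in `comparison_nodes` is an assumed way to count the nodes of a balanced binary search over n case clusters, with groups of up to three clusters compared one by one.

```python
def distinct_target_cost(branch_targets):
    """Old-style estimate: one unit per distinct branch target."""
    return len(set(branch_targets))


def comparison_nodes(num_clusters):
    """Assumed node count of a balanced binary search over the clusters."""
    if num_clusters <= 3:
        # Small groups are handled with one comparison per cluster.
        return num_clusters
    left = num_clusters // 2
    # One pivot comparison, then recurse into both halves.
    return 1 + comparison_nodes(left) + comparison_nodes(num_clusters - left)
```

For the switch above, the 13 cases plus the default branch reach only 2 distinct targets, so the first estimate is 2. If none of the cases could be merged into a jump table or bit test, `comparison_nodes(13)` would count 17 comparison nodes, which is much closer to the size of the generated code.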

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

llvm-svn: 301649
2017-04-28 16:04:03 +00:00
Teresa Johnson 51177295c4 Memory intrinsic value profile optimization: Avoid divide by 0
Summary:
Skip memops if the total value profiled count is 0: we can't correctly
scale up the counts, and there is no point anyway.
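
The guard can be pictured with a minimal sketch. The scaling formula here is an assumption chosen for illustration, not the pass's actual arithmetic, and `scale_profiled_counts` is a hypothetical name.

```python
def scale_profiled_counts(value_counts, block_count):
    """Scale per-size counts so they sum to roughly block_count.

    Returns None when nothing was profiled, so the caller can skip the memop
    instead of dividing by zero.
    """
    total = sum(value_counts.values())
    if total == 0:
        return None  # no profiled values: scaling is impossible and pointless
    return {
        size: count * block_count // total
        for size, count in value_counts.items()
    }
```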

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32624

llvm-svn: 301645
2017-04-28 14:30:54 +00:00
Simon Pilgrim 7ae9419dc0 [DAGCombiner] Add ComputeNumSignBits vector demanded elements support to ASHR and INSERT_VECTOR_ELT (reapplied)
Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element.

llvm-svn: 301644
2017-04-28 13:21:18 +00:00
Simon Pilgrim ec93334317 [X86][SSE] Added new tests from D32416 to show codegen delta
llvm-svn: 301641
2017-04-28 11:53:08 +00:00
Simon Pilgrim 04928fd021 [X86][SSE] Renames all ones test to better match type.
Added 8f32/4f64 optsize tests discussed on D32416

llvm-svn: 301639
2017-04-28 11:12:30 +00:00
Simon Pilgrim 67b1a79985 [X86][SSE] Add codegen test for _mm_set_pd1 (PR32827)
llvm-svn: 301638
2017-04-28 10:31:42 +00:00
Andrew Ng 03e35b6bc0 [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values.
This is a follow up to the fix in r298360 to improve the handling of debug
values when redundant LEAs are removed. The fix in r298360 effectively
discarded the debug values. This patch now attempts to preserve the debug
values by using the DWARF DW_OP_stack_value operation via prependDIExpr.

Moved functions appendOffset and prependDIExpr from Local.cpp to
DebugInfoMetadata.cpp and made them available as static member functions of
DIExpression.

Differential Revision: https://reviews.llvm.org/D31604

llvm-svn: 301630
2017-04-28 08:44:30 +00:00
Diana Picus 0674a3ce97 [ARM] GlobalISel: Tighten test. NFC
Explicitly check types and load sizes in the IRTranslator test.

llvm-svn: 301627
2017-04-28 07:50:47 +00:00
Max Kazantsev 531db9a504 [EarlyCSE] Mark the condition of assume intrinsic as true
EarlyCSE should not just ignore assumes. It should use the fact that its condition is true for all dominated instructions.

Reviewers: sanjoy, reames, apilipenko, anna, skatkov

Reviewed By: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32482

llvm-svn: 301625
2017-04-28 06:25:39 +00:00
Max Kazantsev 0589d9fa0f [EarlyCSE] Remove guards with conditions known to be true
If a condition is calculated only once, and there are multiple guards on this condition, we should be able
to remove all guards dominated by the first of them. This patch allows EarlyCSE to try to find the condition
of a guard among the known values, and if it is true, remove the guard. Otherwise we keep the guard and
mark its condition as 'true' for future consideration.
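
The approach can be sketched on a straight-line instruction list, where every earlier instruction dominates every later one. This is a simplified model and not the EarlyCSE code: the `(opcode, operand)` tuple encoding and the function name are hypothetical, and conditions are compared by name only.

```python
def eliminate_redundant_guards(instructions):
    """Drop guards whose condition an earlier guard already established."""
    known_true = set()
    kept = []
    for opcode, operand in instructions:
        if opcode == "guard":
            if operand in known_true:
                continue  # condition already known true: guard is redundant
            # Execution continuing past this guard implies the condition held.
            known_true.add(operand)
        kept.append((opcode, operand))
    return kept
```

Given two guards on `%c` separated by a load, only the first guard survives, while a later guard on a different condition is kept and recorded in turn.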

Reviewers: sanjoy, reames, apilipenko, skatkov, anna, dberlin

Reviewed By: reames, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32476

llvm-svn: 301623
2017-04-28 06:05:48 +00:00