Commit Graph

239045 Commits

Author SHA1 Message Date
Todd Fiala c8b3717344 xfailed TestObjCNewSyntax.py on macOS for gmodules
Tracked by:
rdar://27792848

llvm-svn: 278289
2016-08-10 21:07:48 +00:00
Kyle Butt 81d32846b0 Codegen: Don't tail-duplicate blocks with un-analyzable fallthrough.
If AnalyzeBranch can't analyze a block and it is possible to
fallthrough, then duplicating the block doesn't make sense, as only one
block can be the layout predecessor for the un-analyzable fallthrough.

Submitted with a test case, but NOTE: the test case doesn't currently
fail. However, the test case fails with D20505 and would have saved me
some time debugging.

llvm-svn: 278288
2016-08-10 21:03:27 +00:00
Kyle Butt e1c931b171 CodeGen: If Convert blocks that would form a diamond when tail-merged.
The following function currently relies on tail-merging for if
conversion to succeed. The common tail of cond_true and cond_false is
extracted, and this then forms a diamond pattern that can be
successfully if converted.

If this block does not get extracted, either because tail-merging is
disabled or the threshold is higher, we should still recognize this
pattern and if-convert it.

Fixed a regression in the original commit. Need to un-reverse branches after
reversing them, or other conversions go awry.

define i32 @t2(i32 %a, i32 %b) nounwind {
entry:
        %tmp1434 = icmp eq i32 %a, %b           ; <i1> [#uses=1]
        br i1 %tmp1434, label %bb17, label %bb.outer

bb.outer:               ; preds = %cond_false, %entry
        %b_addr.021.0.ph = phi i32 [ %b, %entry ], [ %tmp10, %cond_false ]
        %a_addr.026.0.ph = phi i32 [ %a, %entry ], [ %a_addr.026.0, %cond_false ]
        br label %bb

bb:             ; preds = %cond_true, %bb.outer
        %indvar = phi i32 [ 0, %bb.outer ], [ %indvar.next, %cond_true ]
        %tmp. = sub i32 0, %b_addr.021.0.ph
        %tmp.40 = mul i32 %indvar, %tmp.
        %a_addr.026.0 = add i32 %tmp.40, %a_addr.026.0.ph
        %tmp3 = icmp sgt i32 %a_addr.026.0, %b_addr.021.0.ph
        br i1 %tmp3, label %cond_true, label %cond_false

cond_true:              ; preds = %bb
        %tmp7 = sub i32 %a_addr.026.0, %b_addr.021.0.ph
        %tmp1437 = icmp eq i32 %tmp7, %b_addr.021.0.ph
        %indvar.next = add i32 %indvar, 1
        br i1 %tmp1437, label %bb17, label %bb

cond_false:             ; preds = %bb
        %tmp10 = sub i32 %b_addr.021.0.ph, %a_addr.026.0
        %tmp14 = icmp eq i32 %a_addr.026.0, %tmp10
        br i1 %tmp14, label %bb17, label %bb.outer

bb17:           ; preds = %cond_false, %cond_true, %entry
        %a_addr.026.1 = phi i32 [ %a, %entry ], [ %tmp7, %cond_true ], [ %a_addr.026.0, %cond_false ]
        ret i32 %a_addr.026.1
}

Without tail-merging or diamond-tail if conversion:
LBB1_1:                                 @ %bb
                                        @ =>This Inner Loop Header: Depth=1
        cmp     r0, r1
        ble     LBB1_3
@ BB#2:                                 @ %cond_true
                                        @   in Loop: Header=BB1_1 Depth=1
        subs    r0, r0, r1
        cmp     r1, r0
        it      ne
        cmpne   r0, r1
        bgt     LBB1_4
LBB1_3:                                 @ %cond_false
                                        @   in Loop: Header=BB1_1 Depth=1
        subs    r1, r1, r0
        cmp     r1, r0
        bne     LBB1_1
LBB1_4:                                 @ %bb17
        bx      lr

With diamond-tail if conversion, but without tail-merging:
@ BB#0:                                 @ %entry
        cmp     r0, r1
        it      eq
        bxeq    lr
LBB1_1:                                 @ %bb
                                        @ =>This Inner Loop Header: Depth=1
        cmp     r0, r1
        ite     le
        suble   r1, r1, r0
        subgt   r0, r0, r1
        cmp     r1, r0
        bne     LBB1_1
@ BB#2:                                 @ %bb17
        bx      lr

llvm-svn: 278287
2016-08-10 20:45:56 +00:00
Greg Clayton 649da6d623 Fix the lookup of dictionary values by name to not do a linear search.
llvm-svn: 278286
2016-08-10 20:37:45 +00:00
Reid Kleckner 7cbd6b74b4 Disable sancov tests failing due to apparent endianness issues
Undoes some of the effect of r278271

llvm-svn: 278285
2016-08-10 20:11:35 +00:00
Reid Kleckner 0881472ac4 [sancov] Port sancov -print-coverage-pcs to COFF
The export table is not considered part of the object file symbol table,
so we have to look through it separately.

Reviewers: kcc

Differential Revision: https://reviews.llvm.org/D23321

llvm-svn: 278284
2016-08-10 20:08:19 +00:00
Marshall Clow 7725546a32 std:: qualify the calls for cend/crend/cbegin/cend. Fixes bug 28927.
llvm-svn: 278282
2016-08-10 20:04:46 +00:00
Jonathan Roelofs 851b79dc4d Fix UB in APInt::ashr
i64 -1, whose sign bit is the 0th one, can't be left shifted without invoking UB.

https://reviews.llvm.org/D23362

llvm-svn: 278280
2016-08-10 19:50:14 +00:00
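The hazard named in the APInt::ashr message above (left-shifting a negative signed value) can be avoided by doing every shift on an unsigned type. A minimal sketch, assuming a value kept in the low bits of a 64-bit word; `ashrBits` is a hypothetical helper for illustration and is not the code from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's APInt: arithmetic shift right of a value
// stored in the low `width` bits of a 64-bit word (1 <= width <= 64).
// Every shift is performed on an unsigned type, so no signed value is ever
// left-shifted.
inline std::uint64_t ashrBits(std::uint64_t value, unsigned width,
                              unsigned amount) {
  assert(width >= 1 && width <= 64);
  const std::uint64_t mask = width == 64 ? ~0ULL : ((1ULL << width) - 1);
  value &= mask;
  const bool negative = ((value >> (width - 1)) & 1) != 0;
  if (amount >= width)
    return negative ? mask : 0; // only copies of the sign bit remain
  std::uint64_t result = value >> amount;
  if (negative && amount != 0) {
    // Fill the vacated high bits with copies of the sign bit.
    result |= mask & ~(mask >> amount);
  }
  return result;
}
```

For an 8-bit value, `ashrBits(0x80, 8, 3)` treats the input as -128 and yields the bit pattern of -16.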
Eugene Zelenko 9ef6b6b4f4 [Documentation] Fix style and grammar mistake in Clang-tidy readability-else-after-return description spotted by Alexander Kornienko.
llvm-svn: 278279
2016-08-10 19:49:38 +00:00
Matt Arsenault 61f8ba8b79 AMDGPU: s_setpc_b64 should be an indirect branch
llvm-svn: 278278
2016-08-10 19:20:02 +00:00
Matt Arsenault c6b1350039 AMDGPU: Set sizes on control flow pseudos
llvm-svn: 278276
2016-08-10 19:11:51 +00:00
Matt Arsenault f4af802381 AMDGPU: Remove empty file comment
llvm-svn: 278275
2016-08-10 19:11:48 +00:00
Matt Arsenault 11587d97be AMDGPU: Remove unnecessary cast
llvm-svn: 278274
2016-08-10 19:11:45 +00:00
Matt Arsenault 57431c9680 AMDGPU: Change insertion point of si_mask_branch
Insert before the skip branch if one is created.
This is a somewhat more natural placement relative
to the skip branches, and makes it possible to implement
analyzeBranch for skip blocks.

The test changes are mostly due to a quirk where
the block label is not emitted if there is a terminator
that is not also a branch.

llvm-svn: 278273
2016-08-10 19:11:42 +00:00
Matt Arsenault b920e9987d AMDGPU: Use CreateStackObject instead of CreateSpillStackObject
I'm not sure what the difference is, but no other target
uses this for emergency spill slots.

llvm-svn: 278272
2016-08-10 19:11:36 +00:00
Reid Kleckner 260ac88cd4 [sancov] Run more sancov tests on non-x86-Linux machines
Add the $arch-registered-target features that clang uses to disable
tests that require a registered backend, so that we can run the sancov
tests on Windows. LLVM's lit suite did not appear to have a per-test way
to do this, and I would rather not split up the sancov tests into
architecture directories.

Split out of https://reviews.llvm.org/D23321

llvm-svn: 278271
2016-08-10 19:03:18 +00:00
Sanjay Patel 5ccc85fe83 [x86, AVX] allow FP vector select folding to bitwise logic ops (PR28895)
This handles the case in:
https://llvm.org/bugs/show_bug.cgi?id=28895

...but we are not getting all of the possibilities yet.
E.g., we use 'X86::FANDN' for scalar FP select combines.

That enhancement is filed as:
https://llvm.org/bugs/show_bug.cgi?id=28925

Differential Revision: https://reviews.llvm.org/D23337

llvm-svn: 278270
2016-08-10 19:00:11 +00:00
Andrew Kaylor 498d3113c3 [IndVarSimplify] Eliminate zext of a signed IV when the IV is known to be non-negative
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18867

llvm-svn: 278269
2016-08-10 18:56:35 +00:00
Nicolai Haehnle 02d784172c LiveIntervalAnalysis: fix a crash in repairOldRegInRange
Summary:
See the new test case for one that was (non-deterministically) crashing
on trunk and deterministically hit the assertion that I added in D23302.
Basically, the machine function contains a sequence

     DS_WRITE_B32 %vreg4, %vreg14:sub0, ...
     DS_WRITE_B32 %vreg4, %vreg14:sub0, ...
     %vreg14:sub1<def> = COPY %vreg14:sub0

and SILoadStoreOptimizer::mergeWrite2Pair merges the two DS_WRITE_B32
instructions into one before calling repairIntervalsInRange.

Now repairIntervalsInRange wants to repair %vreg14, in particular, and
ends up trying to repair %vreg14:sub1 as well, but that only becomes
active _after_ the range that is to be repaired, hence the crash due
to LR.find(...) == LR.begin() at the start of repairOldRegInRange.

I believe that just skipping those subranges is fine, but again, I am not too
familiar with that code.

Reviewers: MatzeB, kparzysz, tstellarAMD

Subscribers: llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D23303

llvm-svn: 278268
2016-08-10 18:51:14 +00:00
Andrew Kaylor b10f6876cd [ValueTracking] An improvement to IR ValueTracking on Non-negative Integers
Patch by Li Huang

Differential Revision: https://reviews.llvm.org/D18777

llvm-svn: 278267
2016-08-10 18:47:19 +00:00
Krzysztof Parzyszek c9c2bba621 [Hexagon] Remove unused variants of LO/HI instructions
llvm-svn: 278266
2016-08-10 18:40:36 +00:00
Kyle Butt 71b1ca1be4 Codegen: Tail Merge: Be less aggressive with special cases.
This change makes it possible for tail-duplication and tail-merging to
be disjoint. By being less aggressive when merging during layout, there are no
overlapping cases between tail-duplication and tail-merging, provided the
thresholds are disjoint.

There is a remaining TODO to benchmark the succ_size() test for non-layout tail
merging.

llvm-svn: 278265
2016-08-10 18:36:18 +00:00
Bruno Cardoso Lopes 7ea9fd233b Reapply [Sema] Add sizeof diagnostics for bzero
Reapply r277787. For memset (and others) we can get diagnostics like:

  struct stat { int x; };
  void foo(struct stat *stamps) {
    bzero(stamps, sizeof(stamps));
    memset(stamps, 0, sizeof(stamps));
  }

  t.c:7:28: warning: 'memset' call operates on objects of type 'struct stat' while the size is based on a different type 'struct stat *' [-Wsizeof-pointer-memaccess]
    memset(stamps, 0, sizeof(stamps));
           ~~~~~~            ^~~~~~
  t.c:7:28: note: did you mean to dereference the argument to 'sizeof' (and multiply it by the number of elements)?
    memset(stamps, 0, sizeof(stamps));
                             ^~~~~~

This patch implements the same class of warnings for bzero.

Differential Revision: https://reviews.llvm.org/D22525

rdar://problem/18963514

llvm-svn: 278264
2016-08-10 18:34:47 +00:00
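For the sizeof diagnostic in the message above, the usual repair is the one the note suggests: dereference the pointer inside `sizeof` and multiply by the element count. A minimal sketch, assuming a hypothetical `Stamp` type standing in for the message's `struct stat` and using `memset` for portability; a `bzero` call would take the same size expression:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the `struct stat` in the commit message.
struct Stamp { int x; };

// Clears `count` elements. sizeof(*stamps) is the element size; writing
// sizeof(stamps) instead would give the size of the pointer, which is the
// mistake -Wsizeof-pointer-memaccess reports.
inline void clearStamps(Stamp *stamps, std::size_t count) {
  std::memset(stamps, 0, sizeof(*stamps) * count);
}
```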
Eugene Zelenko 12b0acc727 [Documentation] Highlighting consistency and spelling mistake fix in Clang-tidy readability-else-after-return description.
llvm-svn: 278263
2016-08-10 18:30:14 +00:00
Eugene Zelenko cdfdb4f110 [Release Notes] Consistency in Clang-tidy entries' style.
llvm-svn: 278262
2016-08-10 18:15:51 +00:00
Eugene Leviant 9d278b6040 [ELF] Support LLVM-style casting for OutputSectionBase<ELFT> derived classes
llvm-svn: 278261
2016-08-10 18:10:41 +00:00
Simon Pilgrim 675c257a32 [X86][SSE] Dropped blend(insertps(x,y),zero) combine - this is now handled by target shuffle chain combining
llvm-svn: 278260
2016-08-10 18:10:29 +00:00
Tim Shen ca37f0f990 [ADT] Removed synthesized constructor introduced in r278251, since MSVC doesn't support them
llvm-svn: 278259
2016-08-10 18:08:38 +00:00
Matthias Braun c881d61314 TargetOpcodes: Rewrite the documentation for SUBREG_TO_REG
Differential Revision: https://reviews.llvm.org/D22708

llvm-svn: 278258
2016-08-10 18:05:50 +00:00
Kirill Bobyrev 34789edbbf [clang-tidy] enhance readability-else-after-return
`readability-else-after-return` only warns about `return` calls, but LLVM Coding
Standards state that `throw`, `continue`, `goto`, etc after `return` calls are
bad, too.

Reviewers: alexfh, aaron.ballman

Differential Revision: https://reviews.llvm.org/D23265

llvm-svn: 278257
2016-08-10 18:05:47 +00:00
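The style that readability-else-after-return is described as targeting can be illustrated as follows. This is a sketch with hypothetical function names, showing the flagged shape, the flattened shape, and the analogous early exit via `throw`; it makes no claim about exactly which constructs the check reports:

```cpp
#include <cassert>
#include <stdexcept>

// Flagged shape: the `else` is redundant because the `if` branch always
// leaves the function.
inline int signWithElse(int v) {
  if (v < 0) {
    return -1;
  } else {
    return v > 0 ? 1 : 0;
  }
}

// Preferred shape: early return, then the remaining logic at the same
// indentation level.
inline int signFlat(int v) {
  if (v < 0)
    return -1;
  return v > 0 ? 1 : 0;
}

// The same reasoning applies when control leaves through `throw`.
inline int checkedDivide(int a, int b) {
  if (b == 0)
    throw std::invalid_argument("division by zero");
  return a / b; // no `else` needed after the throw
}
```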
Krzysztof Parzyszek 0bbad0fc86 [Hexagon] Simplify the SplitConst32/64 pass
llvm-svn: 278256
2016-08-10 18:05:47 +00:00
Eugene Zelenko 7fa868b31d [Documentation] Fix grammar mistakes in docs/clang-tidy/index.rst spotted by Alexander Kornienko.
llvm-svn: 278255
2016-08-10 18:02:15 +00:00
Kirill Bobyrev 8694cb97c2 [clang-tidy] minor improvements in modernise-deprecated-headers check
This patch introduces a minor list of changes as proposed by Richard Smith in
the mailing list.

See original comments with an impact on the future check state below:

[comments.begin

> +                          {"complex.h", "ccomplex"},

It'd be better to convert this one to <complex>, or leave it alone.
<ccomplex> is an unnecessary wart.

(The contents of C++11's <complex.h> / <ccomplex> / <complex> (all of
which are identical) aren't comparable to C99's <complex.h>, so if
this was C++98 code using the C99 header, the code will be broken with
or without this transformation.)

> +                          {"iso646.h", "ciso646"},

Just delete #includes of this one. <ciso646> does nothing.

> +              {"stdalign.h", "cstdalign"},
> +              {"stdbool.h", "cstdbool"},

We should just delete these two includes. These headers do nothing in C++.

comments.end]

Reviewers: alexfh, aaron.ballman

Differential Revision: https://reviews.llvm.org/D17990

llvm-svn: 278254
2016-08-10 18:01:45 +00:00
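The review comments quoted above amount to three possible outcomes per header: replace it, delete the include, or leave it alone. A minimal sketch of that decision table, assuming hypothetical names (`Action`, `Decision`, `decideHeader`); it is not the check's actual data structure and lists only the headers the comments mention:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical types for illustration only.
enum class Action { Keep, Replace, Remove };

struct Decision {
  Action action;
  std::string replacement; // empty unless action == Action::Replace
};

inline Decision decideHeader(const std::string &header) {
  // Per the quoted comments: <complex.h> is better converted to <complex>.
  static const std::map<std::string, std::string> replacements = {
      {"complex.h", "complex"},
  };
  // Per the quoted comments: these includes can simply be deleted in C++.
  if (header == "iso646.h" || header == "stdalign.h" || header == "stdbool.h")
    return {Action::Remove, ""};
  const auto it = replacements.find(header);
  if (it != replacements.end())
    return {Action::Replace, it->second};
  return {Action::Keep, ""};
}
```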
Zachary Turner d00efc6795 Remove a circular include dependency.
lldb-private-interfaces.h included lldb-private.h, and
lldb-private.h included lldb-private-interfaces.h.

llvm-svn: 278253
2016-08-10 17:59:03 +00:00
Krzysztof Parzyszek 3b946c90ef [Hexagon] Add extra patterns for single-precision min/max instructions
llvm-svn: 278252
2016-08-10 17:56:24 +00:00
Tim Shen 64afe23528 [ADT] Add make_scope_exit().
Summary: make_scope_exit() is described in C++ proposal p0052r2, which uses RAII to do cleanup work at scope exit.

Reviewers: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22796

llvm-svn: 278251
2016-08-10 17:52:09 +00:00
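The RAII idea behind the make_scope_exit() commit above can be sketched as a guard object whose destructor runs a stored callable. The names `ScopeExit` and `makeScopeExit` are hypothetical; this is a simplified illustration and not the llvm/ADT implementation:

```cpp
#include <cassert>
#include <utility>

// Hypothetical sketch: runs the stored callable when the guard is destroyed.
template <typename Callable>
class ScopeExit {
  Callable fn_;
  bool engaged_ = true; // false once ownership has moved to another guard

public:
  explicit ScopeExit(Callable fn) : fn_(std::move(fn)) {}

  // Written out by hand; a moved-from guard must not run the callable.
  ScopeExit(ScopeExit &&other)
      : fn_(std::move(other.fn_)), engaged_(other.engaged_) {
    other.engaged_ = false;
  }

  ScopeExit(const ScopeExit &) = delete;
  ScopeExit &operator=(const ScopeExit &) = delete;
  ScopeExit &operator=(ScopeExit &&) = delete;

  ~ScopeExit() {
    if (engaged_)
      fn_();
  }
};

template <typename Callable>
ScopeExit<Callable> makeScopeExit(Callable fn) {
  return ScopeExit<Callable>(std::move(fn));
}
```

A typical use is `auto guard = makeScopeExit([&] { cleanup(); });` at the top of a block, where `cleanup` is whatever the caller needs to run on every exit path.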
Rong Xu 63f970ee24 Fix LCSSA increased compile time
We are seeing r276077 drastically increase compile time for our larger
benchmarks in PGO profile-generation builds (both clang-based and IR-based
modes): it can be 20x slower than without the patch (e.g. from 30 secs to
780 secs).

The increased time is all in the LCSSA pass. The problematic code is
PostProcessPHIs after use-rewrite. Note that the InsertedPhis list from
ssa_updater keeps accumulating (it is never cleared). Since the inserted PHIs
are added to the candidates for each rewrite, the earlier ones are added
repeatedly. Later, when adding the new PHIs to the work-list, we don't check
for duplicates either. This can result in an extremely long work-list
containing tons of duplicated PHIs.

This patch fixes the issue by hoisting the code out of the loop.

Differential Revision: http://reviews.llvm.org/D23344

llvm-svn: 278250
2016-08-10 17:49:11 +00:00
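The growth pattern described in the LCSSA message above can be modelled in a few lines. This is a toy sketch under stated assumptions: integers stand in for PHI nodes, `inserted` stands in for the accumulating InsertedPhis list, and both function names are hypothetical. It shows why queuing the accumulated list inside the loop is quadratic while queuing it once after the loop is linear:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: the whole accumulated list is queued again after every rewrite,
// so earlier entries are added repeatedly.
inline std::size_t queuedInsideLoop(std::size_t rewrites) {
  std::vector<int> inserted;
  std::vector<int> worklist;
  for (std::size_t i = 0; i < rewrites; ++i) {
    inserted.push_back(static_cast<int>(i)); // one new PHI per rewrite
    worklist.insert(worklist.end(), inserted.begin(), inserted.end());
  }
  return worklist.size();
}

// After: the queuing step is hoisted out of the loop and runs once.
inline std::size_t queuedAfterLoop(std::size_t rewrites) {
  std::vector<int> inserted;
  std::vector<int> worklist;
  for (std::size_t i = 0; i < rewrites; ++i)
    inserted.push_back(static_cast<int>(i));
  worklist.insert(worklist.end(), inserted.begin(), inserted.end());
  return worklist.size();
}
```

With n rewrites the first version queues n(n+1)/2 entries and the second queues n.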
Rui Ueyama 2dc5645b94 Check for availability of `cpio` command.
cpio may not be available on Windows, so it is better to check
for availability before running the command in a test.

llvm-svn: 278249
2016-08-10 17:42:26 +00:00
Krzysztof Parzyszek c1f6cd2980 [Hexagon] Fix table-gen decode conflict warnings for CONST32/64
llvm-svn: 278247
2016-08-10 17:22:24 +00:00
Tim Northover 1dc10fec21 GlobalISel: fixup copy/paste comment error
llvm-svn: 278246
2016-08-10 16:51:18 +00:00
Tim Northover 7552ef5a00 GlobalISel: avoid inserting redundant COPYs for bitcasts.
If the value produced by the bitcast hasn't been referenced yet, we can simply
reuse the input register avoiding an unnecessary COPY instruction.

llvm-svn: 278245
2016-08-10 16:51:14 +00:00
Krzysztof Parzyszek a3386501af [Hexagon] Use integer instructions for floating point immediates
Floating point instructions use general purpose registers, so the few
instructions that can put floating point immediates into registers are,
in fact, integer instructions. Use them explicitly instead of having
pseudo-instructions specifically for dealing with floating point values.

Simplify the constant loading instructions (from sdata) to have only two:
one for 32-bit values and one for 64-bit values: CONST32 and CONST64.

llvm-svn: 278244
2016-08-10 16:46:36 +00:00
Gor Nishanov b2a9c02521 [Coroutines] Part 6: Elide dynamic allocation of a coroutine frame when possible
Summary:
A particular coroutine usage pattern, where a coroutine is created, manipulated and
destroyed by the same calling function, is common for coroutines implementing
the RAII idiom and is suitable for an allocation elision optimization which avoids
dynamic allocation by storing the coroutine frame as a static `alloca` in its
caller.

coro.free and coro.alloc intrinsics are used to indicate which code needs to be suppressed
when dynamic allocation elision happens:
```
entry:
  %elide = call i8* @llvm.coro.alloc()
  %need.dyn.alloc = icmp ne i8* %elide, null
  br i1 %need.dyn.alloc, label %coro.begin, label %dyn.alloc
dyn.alloc:
  %alloc = call i8* @CustomAlloc(i32 4)
  br label %coro.begin
coro.begin:
  %phi = phi i8* [ %elide, %entry ], [ %alloc, %dyn.alloc ]
  %hdl = call i8* @llvm.coro.begin(i8* %phi, i32 0, i8* null,
                          i8* bitcast ([2 x void (%f.frame*)*]* @f.resumers to i8*))
```
and
```
  %mem = call i8* @llvm.coro.free(i8* %hdl)
  %need.dyn.free = icmp ne i8* %mem, null
  br i1 %need.dyn.free, label %dyn.free, label %if.end
dyn.free:
  call void @CustomFree(i8* %mem)
  br label %if.end
if.end:
  ...
```

If heap allocation elision is performed, we replace coro.alloc with a static alloca on the caller frame and coro.free with null constant.

Also, if there are any tail calls referencing the coroutine frame, we need to remove the tail call attribute, since the coroutine frame now lives on the stack.

Documentation and overview is here: http://llvm.org/docs/Coroutines.html.

Upstreaming sequence (rough plan)
1.Add documentation. (https://reviews.llvm.org/D22603)
2.Add coroutine intrinsics. (https://reviews.llvm.org/D22659)
3.Add empty coroutine passes. (https://reviews.llvm.org/D22847)
4.Add coroutine devirtualization + tests.
ab) Lower coro.resume and coro.destroy (https://reviews.llvm.org/D22998)
c) Do devirtualization (https://reviews.llvm.org/D23229)
5.Add CGSCC restart trigger + tests. (https://reviews.llvm.org/D23234)
6.Add coroutine heap elision + tests.  <= we are here
7.Add the rest of the logic (split into more patches)

Reviewers: mehdi_amini, majnemer

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D23245

llvm-svn: 278242
2016-08-10 16:40:39 +00:00
Roger Ferrer Ibanez 17586582e7 Fix build break of VS 2013 debug builds
In debug mode extra macros are enabled for several C++ algorithms. Some of them
may cause unfortunate build failures.

This commit adds a redundant operator() to work around one of those troublesome
macros which was hit accidentally by change r278012.

llvm-svn: 278241
2016-08-10 16:39:58 +00:00
Artem Dergachev cad151491e [analyzer] Fix a crash in CloneDetector when calling functions by pointers.
CallExpr may have a null direct callee when the callee function is not
known at compile time. Do not try to take the callee name in this case.

Patch by Raphael Isemann!

Differential Revision: https://reviews.llvm.org/D23320

llvm-svn: 278238
2016-08-10 16:25:16 +00:00
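The guard described in the CloneDetector message above follows a common pattern: check the direct callee for null before reading its name. A minimal sketch using hypothetical stand-in types (`FunctionInfo`, `CallInfo`) instead of Clang's AST classes:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins, not Clang's AST types.
struct FunctionInfo {
  std::string name;
};

struct CallInfo {
  // Null when the call goes through a function pointer, so the callee is
  // not known statically.
  const FunctionInfo *directCallee;
};

// Returns the callee name, or an empty string for an indirect call.
inline std::string calleeNameOrEmpty(const CallInfo &call) {
  if (call.directCallee == nullptr)
    return std::string();
  return call.directCallee->name;
}
```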
Krzysztof Parzyszek 12e03aa5fe [Hexagon] Delete HexagonSelectCCInfo.td
This file is not used. The location assignment of call arguments and
return values is implemented directly in HexagonISelLowering.

llvm-svn: 278237
2016-08-10 16:23:53 +00:00
Krzysztof Parzyszek 2a48ce4ec2 [Hexagon] Remove unneeded/unused ISD opcodes ARGEXTEND and FCONST32
llvm-svn: 278236
2016-08-10 16:20:33 +00:00
Joey Gouly b95e36027f [OpenCL] Fix typo in test that I accidentally introduced in my previous commit.
llvm-svn: 278235
2016-08-10 16:04:14 +00:00
Joey Gouly ddbda40245 [OpenCL] Change block descriptor address space to constant.
The block descriptor is a GlobalVariable in the LLVM IR, so it shouldn't be
in the private address space.

llvm-svn: 278234
2016-08-10 15:57:02 +00:00
Simon Pilgrim b204f03004 [X86][XOP] Tweak vpermil2pd test to stop it being combined away
The target shuffle combined to a BLENDPD pattern which we will shortly add support for.

llvm-svn: 278233
2016-08-10 15:15:56 +00:00