Commit Graph

36617 Commits

Author SHA1 Message Date
Rafael Espindola 8571aa3d5d Simplify handling of hidden stubs on PowerPC.
We now handle them just like non hidden ones. This was already the case
on x86 (r207518) and arm (r207517).

llvm-svn: 270205
2016-05-20 12:00:52 +00:00
Igor Kudrin ac40e81987 [Coverage] Fix an issue where improper coverage mapping data could be loaded for an inline function.
If an inline function is observed but unused in a translation unit, dummy
coverage mapping data with zero hash is stored for this function.
If such a coverage mapping section came earlier than real one, the latter
was ignored. As a result, llvm-cov was unable to show coverage information
for those functions.

Differential Revision: http://reviews.llvm.org/D20286

llvm-svn: 270194
2016-05-20 09:14:24 +00:00
Chris Dewhurst 0dfa6bc004 [Sparc] Enable more inline assembly constraints.
Note: This is specifically to allow GCC's test pr44707 to pass.

Trivial change, not put for differential revision. Test included.

llvm-svn: 270192
2016-05-20 09:03:01 +00:00
Craig Topper 25363178bb [X86] Run the AVX/AVX2 intrinsic tests in AVX512VL mode too just to make sure we don't break any older intrinsics.
llvm-svn: 270183
2016-05-20 05:10:32 +00:00
Craig Topper 565463fbba Revert accidental commit of a test command line addition.
llvm-svn: 270175
2016-05-20 02:01:51 +00:00
Craig Topper 0a7a8dee2b [X86] Fix some AVX patterns to only be disabled if VLX and BWI are supported. Without this we get isel failures on the avx-intrinsics-x86.ll test in AVX512VL.
llvm-svn: 270174
2016-05-20 02:00:08 +00:00
Chris Bieneman db373bed66 [obj2yaml] [yaml2obj] Adding a test for r270124
This test covers strings after load command structs and zero fill bytes.

llvm-svn: 270159
2016-05-19 23:26:39 +00:00
Chris Bieneman 1abf005fe6 [yaml2obj] Removing debug code that scribbled 0xDEADBEEF
Now that MachO load command fields are fully covered we can fill unaccounted for bytes with 0. That allows us to sparsely specify YAML to simplify tests.

Simplifying load_commands test accordingly.

llvm-svn: 270158
2016-05-19 23:26:31 +00:00
Lang Hames 45bd7ca7fc [RuntimeDyld][MachO] Add support for SUBTRACTOR relocations between anonymous
symbols on x86-64.

llvm-svn: 270157
2016-05-19 23:26:05 +00:00
Easwaran Raman bb578ef0dd Allow -inline-threshold to override default threshold.
Before r257832, the threshold used by SimpleInliner was explicitly specified or generated from opt levels and passed to the base class Inliner's constructor. There, it was first overridden by explicitly specified -inline-threshold. The refactoring in r257832 did not preserve this behavior for all opt levels. This change brings back the original behavior.

Differential Revision: http://reviews.llvm.org/D20452

llvm-svn: 270153
2016-05-19 23:02:09 +00:00
Sanjoy Das f5f0331a3b [GuardWidening] Introduce range check merging
Sequences of range checks expressed using guards, like

  guard((I - 2) u< L)
  guard((I - 1) u< L)
  guard((I + 0) u< L)
  guard((I + 1) u< L)
  guard((I + 2) u< L)

can sometimes be combined into a smaller sequence:

  guard((I - 2) u< L AND (I + 2) u< L)

if we can prove that (I - 2) u< L AND (I + 2) u< L implies all of checks
expressed in the previous sequence.

This change teaches GuardWidening to do this kind of merging when
feasible.

llvm-svn: 270151
2016-05-19 22:55:46 +00:00
Matthew Simpson 476c0afc01 [ARM, AArch64] Match additional patterns to ldN instructions
When matching an interleaved load to an ldN pattern, the interleaved access
pass checks that all users of the load are shuffles. If the load is used by an
instruction other than a shuffle, the pass gives up and an ldN is not
generated. This patch considers users of the load that are extractelement
instructions. It attempts to modify the extracts to use one of the available
shuffles rather than the load. After the transformation, the load is only used
by shuffles and will then be matched with an ldN pattern.

Differential Revision: http://reviews.llvm.org/D20250

llvm-svn: 270142
2016-05-19 21:39:00 +00:00
Guozhi Wei b1d37199cc [InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=27703.

If there is a sequence of one or more load instructions, each loaded value is used as address of later load instruction, bitcast is necessary to change the value type, don't optimize it.

llvm-svn: 270135
2016-05-19 21:07:01 +00:00
Sanjay Patel cfe75fa72e comment out line that is causing UBSAN bot failures
Patch is awaiting review here:
http://reviews.llvm.org/D20434

llvm-svn: 270128
2016-05-19 21:00:02 +00:00
Wei Mi 0456d9dd18 Recommit r255691 since PR26509 has been fixed.
llvm-svn: 270113
2016-05-19 20:38:03 +00:00
Hans Wennborg 172eee9cfc X86: Don't reset the stack after calls that don't return (PR27117)
Since the calls don't return, the instruction afterwards will never run,
and is just taking up unnecessary space in the binary.

Differential Revision: http://reviews.llvm.org/D20406

llvm-svn: 270109
2016-05-19 20:15:33 +00:00
Sanjay Patel c48a879ef8 [x86] add tests for urem lowering
llvm-svn: 270096
2016-05-19 18:57:54 +00:00
Rui Ueyama 0376b1a2d7 pdbdump: Rename NumberOfSymbols -> SymbolRecordStreamIndex.
Differential Revision: http://reviews.llvm.org/D20441

llvm-svn: 270088
2016-05-19 18:05:58 +00:00
Simon Pilgrim 7a8dcf2556 [X86][SSE] Added fast-isel tests to sync with clang/test/CodeGen/sse-builtins.c
llvm-svn: 270081
2016-05-19 16:55:52 +00:00
Simon Pilgrim b1ff2dd145 [X86][SSE2] Fixed shuffle of results in _mm_cmpnge_sd/_mm_cmpngt_sd tests
llvm-svn: 270080
2016-05-19 16:49:53 +00:00
George Rimar cf2bf9d015 Temporarily revert r270070
It broke buildbot:
http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/4817/steps/ninja%20check%201/logs/stdio

Actually it is just because D20273 not yet commited, but these 2 were crossing with each other,
and I`ll better find the way to land them separatelly soon.

Initial commit message:

[llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style.

Before this patch llvm-mc generated zlib-gnu styled sections. 
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch reimplements the compression style to zlib one as zlib-gnu looks
to be depricated everywhere.

Differential revision: http://reviews.llvm.org/D20331

llvm-svn: 270075
2016-05-19 15:58:05 +00:00
Matthew Simpson 6feebe9847 [LAA] Check independence of strided accesses before forward case
This patch changes the order in which we attempt to prove the independence of
strided accesses. We previously did this after we knew the dependence distance
was positive. With this change, we check for independence before handling the
negative distance case. The patch prevents LAA from reporting forward
dependences for independent strided accesses.

This change was requested in the review of D19984.

llvm-svn: 270072
2016-05-19 15:37:19 +00:00
George Rimar 99c901fc47 [llvm-mc] - Teach llvm-mc to generate compressed debug sections in zlib style.
Before this patch llvm-mc generated zlib-gnu styled sections. 
That means no SHF_COMPRESSED flag was set, magic 'zlib' signature
was used in combination with full size field. Sections were renamed to "*.z*".
This patch reimplements the compression style to zlib one as zlib-gnu looks
to be depricated everywhere.

Differential revision: http://reviews.llvm.org/D20331

llvm-svn: 270070
2016-05-19 15:08:31 +00:00
Chad Rosier 02f25a9565 [AArch64 ] Generate a BFXIL from 'or (and X, Mask0Imm),(and Y, Mask1Imm)'.
Mask0Imm and ~Mask1Imm must be equivalent and one of the MaskImms is a shifted
mask (e.g., 0x000ffff0).  Both 'and's must have a single use.

This changes code like:

  and w8, w0, #0xffff000f
  and w9, w1, #0x0000fff0
  orr w0, w9, w8

into

  lsr w8, w1, #4
  bfi w0, w8, #4, #12

llvm-svn: 270063
2016-05-19 14:19:47 +00:00
Ranjeet Singh dbbbef5401 [ARM] Add cdp intrinsic tests.
- Renamed intrinsics.ll to intrinsics-coprocessor.ll
  as all the tests were testing coprocessor instructions,
  also made the test checks match the full instruction.

Differential Revision: http://reviews.llvm.org/D20393

llvm-svn: 270057
2016-05-19 12:59:17 +00:00
Artem Tamazov 8ce1f7177b [AMDGPU][llvm-mc] Fixes to support buffer atomics.
Fixes for MUBUF_Atomic instructions to make operand list valid:
 - For RTN insns, make a copy of $vdata_in operand as $vdata.
 - Do not add operand for GLC, it is hardcoded and comes as a token.
Workaround to avoid adding multiple default optional operands.
Tests added.

Differential Revision: http://reviews.llvm.org/D20257

llvm-svn: 270049
2016-05-19 12:22:39 +00:00
Zoran Jovanovic 5f94cedeb5 ps][microMIPS] Add R_MICROMIPS_PC21_S1 relocation
Differential Revision: http://reviews.llvm.org/D15526

llvm-svn: 270048
2016-05-19 12:20:40 +00:00
Simon Pilgrim 47825fad71 [X86][SSE2] Added _mm_move_* tests
llvm-svn: 270046
2016-05-19 11:59:57 +00:00
Simon Pilgrim 01809e0506 [X86][SSE2] Added _mm_cast* and _mm_set* tests
llvm-svn: 270041
2016-05-19 10:58:54 +00:00
Daniel Sanders 2f2ab5102c [mips][mips16] Fix ZERO is not a CPU16Regs register error from the machine verifier.
Summary: Partially fixes PR27458

Reviewers: sdardis

Subscribers: dsanders, llvm-commits, sdardis

Differential Revision: http://reviews.llvm.org/D20330

llvm-svn: 270037
2016-05-19 10:42:14 +00:00
Andrey Turetskiy 45b22a4aff [X86] Enable RRL part of the LEA optimization pass for -O2.
Enable "Remove Redundant LEAs" part of the LEA optimization pass for -O2.
This gives 6.4% performance improve on Broadwell on nnet benchmark from Coremark-pro.
There is no significant effect on other benchmarks (Geekbench, Spec2000, Spec2006).

Differential Revision: http://reviews.llvm.org/D19659

llvm-svn: 270036
2016-05-19 10:18:29 +00:00
Zlatko Buljan e663e34e79 [mips][microMIPS] Implement BC1EQZC, BC1NEZC, BC2EQZC and BC2NEZC instructions
Differential Revision: http://reviews.llvm.org/D18352

llvm-svn: 270030
2016-05-19 07:31:28 +00:00
Peter Collingbourne fe12d0e3e5 CodeGen: Make the global-merge pass independently testable, and add a test.
llvm-svn: 270023
2016-05-19 04:38:56 +00:00
Sanjoy Das b784ed36c0 [GuardWidening] Use getEquivalentICmp to fold constant compares
`ConstantRange::getEquivalentICmp` is more general, and better
factored.

llvm-svn: 270019
2016-05-19 03:53:17 +00:00
Dan Gohman 537bc9b9f5 [WebAssembly] Make several CHECK lines less fragile using regexes and CHECK-DAG.
llvm-svn: 270011
2016-05-19 01:52:56 +00:00
Matt Arsenault c438ef574d AMDGPU: Fix promote alloca for pointer loads
If the load has a pointer type, we don't want to change
its type.

llvm-svn: 270000
2016-05-18 23:20:24 +00:00
Sanjoy Das 083f38939b New pass: guard widening
Summary:
Implement guard widening in LLVM. Description from GuardWidening.cpp:

The semantics of the `@llvm.experimental.guard` intrinsic lets LLVM
transform it so that it fails more often that it did before the
transform.  This optimization is called "widening" and can be used hoist
and common runtime checks in situations like these:

```
%cmp0 = 7 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
%cmp1 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp1) [ "deopt"(...) ]
...
```

to

```
%cmp0 = 9 u< Length
call @llvm.experimental.guard(i1 %cmp0) [ "deopt"(...) ]
call @unknown_side_effects()
...
```

If `%cmp0` is false, `@llvm.experimental.guard` will "deoptimize" back
to a generic implementation of the same function, which will have the
correct semantics from that point onward.  It is always _legal_ to
deoptimize (so replacing `%cmp0` with false is "correct"), though it may
not always be profitable to do so.

NB! This pass is a work in progress.  It hasn't been tuned to be
"production ready" yet.  It is known to have quadriatic running time and
will not scale to large numbers of guards

Reviewers: reames, atrick, bogner, apilipenko, nlewycky

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20143

llvm-svn: 269997
2016-05-18 22:55:34 +00:00
Dehao Chen f16376b505 Follow-up patch of http://reviews.llvm.org/D19948 to handle missing profiles when simplifying CFG.
Summary: Set default branch weight to 1:1 if one of the branch has profile missing when simplifying CFG.

Reviewers: spatel, davidxl

Subscribers: danielcdh, llvm-commits

Differential Revision: http://reviews.llvm.org/D20307

llvm-svn: 269995
2016-05-18 22:41:03 +00:00
Rafael Espindola 8c34dd8257 Delete Reloc::Default.
Having an enum member named Default is quite confusing: Is it distinct
from the others?

This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.

llvm-svn: 269988
2016-05-18 22:04:49 +00:00
Michael Zolotukhin d2268a73bc [LoopUnrollAnalyzer] Take into account cost of instructions controlling branches, along with their operands.
Previously, we didn't add their and their operands cost, which could've
resulted in unrolling loops for no actual benefit.

llvm-svn: 269985
2016-05-18 21:20:12 +00:00
Sanjay Patel fbb9a5e91f [x86] add test for immediate comment formatting
llvm-svn: 269977
2016-05-18 20:26:32 +00:00
Chris Bieneman 6f6f9f0bb9 Fixing test failure on Windows bot
http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc19-DA/builds/553/steps/test-llvm/logs/LLVM%20%3A%3A%20ObjectYAML__MachO__load_commands.yaml

llvm-svn: 269975
2016-05-18 20:01:48 +00:00
Krzysztof Parzyszek 14a1c18448 When looking for a spill slot in reg scavenger, find one that matches RC
When looking for an available spill slot, the register scavenger would stop
after finding the first one with no register assigned to it. That slot may
have size and alignment that do not meet the requirements of the register
that is to be spilled. Instead, find an available slot that is the closest
in size and alignment to one that is needed to spill a register from RC.

Differential Revision: http://reviews.llvm.org/D20295

llvm-svn: 269969
2016-05-18 18:16:00 +00:00
Simon Pilgrim 5a0d728181 [X86][SSE2] Added fast-isel tests to sync with clang/test/CodeGen/sse2-builtins.c
llvm-svn: 269966
2016-05-18 18:00:43 +00:00
Rui Ueyama 350b29862f pdbdump: Print out section offsets in the publics stream.
llvm-svn: 269955
2016-05-18 16:24:16 +00:00
Chris Bieneman 2de17d49dd Re-apply: [obj2yaml] [yaml2obj] Support MachO section and section_64
This re-applies r269845, r269846, and r269850 with an included fix for a crash reported by zturner.

llvm-svn: 269953
2016-05-18 16:17:23 +00:00
Matt Arsenault 1735da460b AMDGPU: Other sizes of popcnt are fast
We can chain bcnt instructions together, so
any width popcnt is pretty fast.

llvm-svn: 269950
2016-05-18 16:10:19 +00:00
Hans Wennborg 8eb336c14e Re-commit r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions"
with an additional fix to make RegAllocFast ignore undef physreg uses. It would
previously get confused about the "push %eax" instruction's use of eax. That
method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate
as well, but since that runs after register-allocation, we didn't run into the
RegAllocFast issue before.

llvm-svn: 269949
2016-05-18 16:10:17 +00:00
Matt Arsenault 9430b9113a AMDGPU: Fix assert when erroring on a call
For some reason an assert is now hit when a valid chain
is not returned, so return the entry chain.

llvm-svn: 269948
2016-05-18 16:10:11 +00:00
Matt Arsenault 891fccc0c1 AMDGPU: Handle alloca promoting with null operands
If the second pointer in a multi-pointer instruction is
a constant, we can replace the type.

llvm-svn: 269945
2016-05-18 15:57:21 +00:00