Commit Graph

37938 Commits

Author SHA1 Message Date
Chad Rosier 3972953efd Revert "[AArch64] Change the preferred alignment for char and short to word alignment"
This reverts commit r273279 as the change was not properly approved.

llvm-svn: 274768
2016-07-07 16:37:29 +00:00
Valery Pykhtin af8b1bddbd [AMDGPU] fix ds_write_src2 encoding (bz26027)
Differential revision: http://reviews.llvm.org/D22041

llvm-svn: 274756
2016-07-07 14:23:38 +00:00
Rafael Espindola b34cba97b7 Don't crash trying to relax 32 loads on COFF.
Fixes pr28452.

llvm-svn: 274754
2016-07-07 14:00:07 +00:00
Sjoerd Meijer 17c08dc701 Code size optimisation: don't rewrite fputs to fwrite when optimising for size
because fwrite requires more arguments and thus extra MOVs are required.

llvm-svn: 274753
2016-07-07 13:56:23 +00:00
David Majnemer 7afb46d3c8 [LoopAccessAnalysis] Fix an integer overflow
We were inappropriately using 32-bit types to account for quantities
that can be far larger.

Fixed in PR28443.

llvm-svn: 274737
2016-07-07 06:24:36 +00:00
Craig Topper d5d2a35013 [AVX512] Zero extend the result of vpcmpeq/vpcmpgt and similar intrinsics in the autoupgrade code. This currently results in worse codegen but is needed for correctness.
llvm-svn: 274736
2016-07-07 06:11:07 +00:00
Elena Demikhovsky fc1e969dfc Fixed a bug in vectorizing GEP before gather/scatter intrinsic.
Vectorizing GEP was incorrect and broke SSA in some cases.
 
The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997.

Differential revision: http://reviews.llvm.org/D22035

llvm-svn: 274735
2016-07-07 06:06:46 +00:00
David Majnemer a54fe1acdc [CodeView] Implement support for thread-local variables
llvm-svn: 274734
2016-07-07 05:14:21 +00:00
Qin Zhao c35b2cba6f [esan:cfrag] Add option -esan-aux-field-info
Summary:
Adds option -esan-aux-field-info to control generating binary with
auxiliary struct field information.

Extracts code for creating auxiliary information from
createCacheFragInfoGV into createCacheFragAuxGV.

Adds test struct_field_small.ll for -esan-aux-field-info test.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D22019

llvm-svn: 274726
2016-07-07 03:20:16 +00:00
Peter Collingbourne 730c82e6b8 ThinLTO: Remove check for multiple modules before applying weak resolutions.
This check is not only unnecessary, it can produce the wrong result. If we
are linking a single module and it has an exported linkonce symbol, we need
to promote to weak in order to avoid PR19901-style problems.

Differential Revision: http://reviews.llvm.org/D21917

llvm-svn: 274722
2016-07-07 01:51:11 +00:00
Sean Silva 284b0324e2 [PM] Avoid getResult on a higher level in LoopAccessAnalysis
Note that require<domtree> and require<loops> aren't needed because they
come in implicitly via the loop pass manager.

llvm-svn: 274712
2016-07-07 01:01:53 +00:00
Sean Silva 59fe82f4ce [PM] Port TailCallElim
llvm-svn: 274708
2016-07-06 23:48:41 +00:00
Sean Silva b025d375a1 [PM] Port CorrelatedValuePropagation
llvm-svn: 274705
2016-07-06 23:26:29 +00:00
Peter Collingbourne d1d2614ee1 ThinLTO: Add test cases for promote+internalize.
This tests the effect of both promotion and internalization on a module,
and helps show that D21883 is NFC wrt promotion+internalization.

Differential Revision: http://reviews.llvm.org/D21915

llvm-svn: 274699
2016-07-06 22:53:02 +00:00
Sanjay Patel 65a51c25c1 [InstCombine] enhance (select X, C1, C2 --> ext X) to handle vectors
By replacing dyn_cast of ConstantInt with m_Zero/m_One/m_AllOnes, we
allow these transforms for splat vectors.

Differential Revision: http://reviews.llvm.org/D21899

llvm-svn: 274696
2016-07-06 22:23:01 +00:00
Manman Ren 524ca27b90 Add testing coverage for r274582.
llvm-svn: 274693
2016-07-06 22:01:28 +00:00
Michael Kuperstein 1ef6c59b1d [X86] Transform setcc + movzbl into xorl + setcc
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.

This fixes PR28146.

Differential Revision: http://reviews.llvm.org/D21774

llvm-svn: 274692
2016-07-06 21:56:18 +00:00
Vedant Kumar 4c01092a25 [llvm-cov] Add support for creating html reports
Based on a patch by Harlan Haskins!

Differential Revision: http://reviews.llvm.org/D18278

llvm-svn: 274688
2016-07-06 21:44:05 +00:00
Matthias Braun ad0032a649 AArch64: Change modeling of zero cycle zeroing.
On CPUs with the zero cycle zeroing feature enabled "movi v.2d" should
be used to zero a vector register. This was previously done at
instruction selection time, however the register coalescer sometimes
widened multiple vregs to the Q width because of that leading to extra
spills. This patch leaves the decision on how to zero a register to the
AsmPrinter phase where it doesn't affect register allocation anymore.

This patch also sets isAsCheapAsAMove=1 on FMOVS0, FMOVD0.

This fixes http://llvm.org/PR27454, rdar://25866262

Differential Revision: http://reviews.llvm.org/D21826

llvm-svn: 274686
2016-07-06 21:39:33 +00:00
Chad Rosier 232e29ebea [MemorySSA] Reinstate the legacy printer and verifier.
Differential Revision: http://reviews.llvm.org/D22058

llvm-svn: 274679
2016-07-06 21:20:47 +00:00
Rafael Espindola a29971faeb Add initial support for R_386_GOT32X.
This adds it only for movl mov@GOT(%reg), %reg.

llvm-svn: 274678
2016-07-06 21:19:11 +00:00
David Majnemer 7abd269aa9 [CodeView] Emit an appropriate symbol kind for globals
We emitted debug info for globals/functions as if they all had external
linkage.  Instead, emit local symbol records when appropriate.

llvm-svn: 274676
2016-07-06 21:07:47 +00:00
David Majnemer e1e7372e93 [CodeView] Unions are always sealed
It is impossible to inherit from a union.  We are missing a way to
represent this in IR for classes/structs...

llvm-svn: 274675
2016-07-06 21:07:42 +00:00
Justin Lebar 6f9d01bbd5 [NVPTX] Add sm_60, sm_61, sm_62 targets to LLVM.
Reviewers: tra

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D22068

llvm-svn: 274674
2016-07-06 21:06:10 +00:00
Haicheng Wu a95cd1267f [LIR] Fix mis-compilation with unwinding.
To fix PR27859, bail out if there is an instruction may throw.

Differential Revision: http://reviews.llvm.org/D20638

llvm-svn: 274673
2016-07-06 21:05:40 +00:00
Piotr Padlewski 6deaa6afae Add 'thinlto_src_module' metadata to imported function
Added metadata to be able to make statistics on how many functions
that have been imported have been removed. Also module name might
be helpfull when debugging.

Reviewers: tejohnson, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: http://reviews.llvm.org/D21943

llvm-svn: 274668
2016-07-06 20:26:25 +00:00
Derek Bruening d712a3c10e [esan|wset] Fix incorrect memory size assert
Summary:
Fixes an incorrect assert that fails on 128-bit-sized loads or stores.
Augments the wset tests to include this case.

Reviewers: aizatsky

Subscribers: vitalybuka, zhaoqin, kcc, eugenis, llvm-commits

Differential Revision: http://reviews.llvm.org/D22062

llvm-svn: 274666
2016-07-06 20:13:53 +00:00
Justin Bogner a463537a36 NVPTX: Replace uses of cuda.syncthreads with nvvm.barrier0
Everywhere where cuda.syncthreads or __syncthreads is used, use the
properly namespaced nvvm.barrier0 instead.

llvm-svn: 274664
2016-07-06 20:02:45 +00:00
Adrian McCarthy 820ca5404c Retry: "Emit CodeView type records for nested classes."
Now with a corrected test to account for a recently supported properties bit in the debug info of a struct.

Original review: http://reviews.llvm.org/D21939

This reverts commit 970c3fd497a28d25dd69526eb52594a696c37968.

llvm-svn: 274661
2016-07-06 19:49:51 +00:00
Chad Rosier dcfce2d0ec [DSE] Avoid iterator invalidation bugs.
The dse_with_dbg_value.ll test committed with r273141 is removed because this
we no longer performs any type of back tracking, which is what was causing the
codegen differences with and without debug information.

Differential Revision: http://reviews.llvm.org/D21613

llvm-svn: 274660
2016-07-06 19:48:52 +00:00
Sanjay Patel 04b3496d9b [x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434)
This is "cvtdq2ps" which does not appear to be particularly slow on any CPU
according to Agner's tables. Choosing "5" as a cost here as suggested in:
https://llvm.org/bugs/show_bug.cgi?id=21356
...but it seems very conservative given that the instruction is fully pipelined,
and I think these costs are supposed to model throughput.

Note that related costs are also most likely too high, but this fixes PR21356
and partly fixes PR28434.

llvm-svn: 274658
2016-07-06 19:15:54 +00:00
Sean Silva f50d4b6cdc Work around PR28400 a bit harder.
We were still crashing in the "no change" case because LVI was not
getting invalidated.

See the thread "Should analyses be able to hold AssertingVH to IR?
(related to PR28400)" for more discussion.

llvm-svn: 274656
2016-07-06 19:05:41 +00:00
Elliot Colp bc2cfc2291 [SystemZ] Remove AND mask of bottom 6 bits when result is used for shift/rotate
On SystemZ, shift and rotate instructions only use the bottom 6 bits of the shift/rotate amount.
Therefore, if the amount is ANDed with an immediate mask that has all of the bottom 6 bits set, we
can remove the AND operation entirely.

Differential Revision: http://reviews.llvm.org/D21854

llvm-svn: 274650
2016-07-06 18:13:11 +00:00
Zachary Turner 8848a7a6b2 [pdb] Round trip the PDB stream between YAML and binary PDB.
This gets writing of the PDB stream working.

llvm-svn: 274647
2016-07-06 18:05:57 +00:00
Kit Barton f9d0a40573 Ensure all uses of permute instructions feed vector stores
There is a problem in VSXSwapRemoval where it is incorrectly removing permute instructions.
In this case, the permute is feeding both a vector store and also a non-store instruction. In this case, the permute cannot be removed.

The fix is to simply look at all the uses of the vector register defined by the permute and ensure that all the uses are vector store instructions.

This problem was reported in PR 27735 (https://llvm.org/bugs/show_bug.cgi?id=27735).

Test case based on the original problem reported.

Phabricator Review: http://reviews.llvm.org/D21802

llvm-svn: 274645
2016-07-06 18:03:52 +00:00
Tim Shen 1c3c0afc53 [DAGCombiner] Fix visitSTORE to continue processing current SDNode, if findBetterNeighborChains doesn't actually CombineTo it.
Summary:
findBetterNeighborChains may or may not find a better chain for each node it finds, which include the node ("St") that visitSTORE is currently processing. If no better chain is found for St, visitSTORE should continue instead of return SDValue(St, 0), as if it's CombinedTo'ed.

This fixes bug 28130. There might be other ways to make the test pass (see D21409). I think both of the patches are fixing actual bugs revealed by the same testcase.

Reviewers: echristo, wschmidt, hfinkel, kbarton, amehsan, arsenm, nemanjai, bogner

Subscribers: mehdi_amini, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D21692

llvm-svn: 274644
2016-07-06 17:44:03 +00:00
Michael Kuperstein aa71bdd3af [TTI] The cost model should not assume vector casts get completely scalarized
The cost model should not assume vector casts get completely scalarized, since
on targets that have vector support, the common case is a partial split up to
the legal vector size. So, when a vector cast  gets split, the resulting casts
end up legal and cheap.

Instead of pessimistically assuming scalarization, base TTI can use the costs
the concrete TTI provides for the split vector, plus a fudge factor to account
for the cost of the split itself. This fudge factor is currently 1 by default,
except on AMDGPU where inserts and extracts are considered free.

Differential Revision: http://reviews.llvm.org/D21251

llvm-svn: 274642
2016-07-06 17:30:56 +00:00
Adrian McCarthy 7649d8388a Revert "Emit CodeView type records for nested classes."
This reverts commit 256b29322c827a2d94da56468c936596f5509032.

llvm-svn: 274632
2016-07-06 15:14:10 +00:00
Simon Pilgrim 118da63a9d [X86][SSE] Added test cases for missed opportunities to combine pshufb to pslldq/psrldq
llvm-svn: 274631
2016-07-06 15:09:48 +00:00
Adrian McCarthy 024a7b6358 Emit CodeView type records for nested classes.
Differential Revision: http://reviews.llvm.org/D21939

llvm-svn: 274629
2016-07-06 14:47:32 +00:00
Matthew Simpson 433cb1dfe3 [LV] Don't widen trivial induction variables
We currently always vectorize induction variables. However, if an induction
variable is only used for counting loop iterations or computing addresses with
getelementptr instructions, we don't need to do this. Vectorizing these trivial
induction variables can create vector code that is difficult to simplify later
on. This is especially true when the unroll factor is greater than one, and we
create vector arithmetic when computing step vectors. With this patch, we check
if an induction variable is only used for counting iterations or computing
addresses, and if so, scalarize the arithmetic when computing step vectors
instead. This allows for greater simplification.

This patch addresses the suboptimal pointer arithmetic sequence seen in
PR27881.

Reference: https://llvm.org/bugs/show_bug.cgi?id=27881
Differential Revision: http://reviews.llvm.org/D21620

llvm-svn: 274627
2016-07-06 14:26:59 +00:00
Elena Demikhovsky ad0a56f3da Re-commit of 274613.
The prev commit failed on compilation.
A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure.

llvm-svn: 274626
2016-07-06 14:15:43 +00:00
Sam Kolton 3c21a69077 [AMDGPU] Assembler: regression tests for bug 28413. NFC
llvm-svn: 274623
2016-07-06 12:52:20 +00:00
Diana Picus b772e409ba [ARM] Do not test for CPUs, use SubtargetFeatures. Also remove 2 flags.
This is a follow-up for r273544.

The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.

This commit also removes two command-line flags that weren't used in any of the
tests: widen-vmovs and swift-partial-update-clearance. The former may be easily
replaced with the mattr mechanism, but the latter may not (as it is a subtarget
property, and not a proper feature).

Differential Revision: http://reviews.llvm.org/D21797

llvm-svn: 274620
2016-07-06 11:22:11 +00:00
Elena Demikhovsky 02ced295aa Reverted 274613 due to compilation failue.
llvm-svn: 274615
2016-07-06 09:11:49 +00:00
Elena Demikhovsky 5a4f2476fd AVX-512: Optimization for patterns with i1 scalar type
The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc".
I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction.
I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions.

This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173.

Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails).

Differential revision: http://reviews.llvm.org/D21956

llvm-svn: 274613
2016-07-06 09:01:20 +00:00
Nicolai Haehnle e40530ea7b AMDGPU: Fix return of non-void-returning shaders
Summary:
Since "AMDGPU: Fix verifier errors in SILowerControlFlow", the logic that
ensures that a non-void-returning shader falls off the end of the last
basic block was effectively disabled, since SI_RETURN is now used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96731

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, kzhuravl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21975

llvm-svn: 274612
2016-07-06 08:35:17 +00:00
Elena Demikhovsky 971fbfda1e Vector GEP test: renamed + some comments
Differential revision: http://reviews.llvm.org/D21957

llvm-svn: 274611
2016-07-06 08:11:23 +00:00
Daniel Berlin fc7e651bfd Fix handling of forward unreachable but reverse-reachable blocks in MemorySSA construction
llvm-svn: 274606
2016-07-06 05:32:05 +00:00
George Burgess IV bfa401e5ad [CFLAA] Split into Anders+Steens analysis.
StratifiedSets (as implemented) is very fast, but its accuracy is also
limited. If we take a more aggressive andersens-like approach, we can be
way more accurate, but we'll also end up being slower.

So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA.

Long-term, we want to end up in a place where CFLSteens is queried
first; if it can provide an answer, great (since queries are basically
map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc.

This patch splits everything out so we can try to do something like
that when we get a reasonable CFLAnders implementation.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21910

llvm-svn: 274589
2016-07-06 00:26:41 +00:00
Ryan Govostes e51401bdab [asan] Add a hidden option for Mach-O global metadata liveness tracking
llvm-svn: 274578
2016-07-05 21:53:08 +00:00
Tim Northover e6ae6767d9 AArch64: TableGenerate system instruction operands.
The way the named arguments for various system instructions are handled at the
moment has a few problems:

  - Large-scale duplication between AArch64BaseInfo.h and AArch64BaseInfo.cpp
  - That weird Mapping class that I have no idea what I was on when I thought
    it was a good idea.
  - Searches are performed linearly through the entire list.
  - We print absolutely all registers in upper-case, even though some are
    canonically mixed case (SPSel for example).
  - The ARM ARM specifies sysregs in terms of 5 fields, but those are relegated
    to comments in our implementation, with a slightly opaque hex value
    indicating the canonical encoding LLVM will use.

This adds a new TableGen backend to produce efficiently searchable tables, and
switches AArch64 over to using that infrastructure.

llvm-svn: 274576
2016-07-05 21:23:04 +00:00
Balaram Makam d4acd7ed10 Revert r259387: "AArch64: Implement missed conditional compare sequences."
This reverts commit r259387 because it inserts illegal code after legalization
    in some backends where i64 OR type is illegal for example.

llvm-svn: 274573
2016-07-05 20:24:05 +00:00
Simon Pilgrim bec6543d17 [X86][AVX2] Add support for target shuffle combining to BROADCAST
Only support broadcast from vector register so far - memory folding support will have to wait.

llvm-svn: 274572
2016-07-05 20:11:29 +00:00
Simon Pilgrim 48adedffb7 [X86][AVX512] Fixed decoding of permd/permpd variable mask shuffles + enabled them for target shuffle combining
Corrected element mask masking to extract the bottom index bits (now matches the perm2 implementation but for unary inputs).

llvm-svn: 274571
2016-07-05 18:31:17 +00:00
Saleem Abdulrasool 4d950ef892 ARM: fix `-mlong-calls` for WoA
Not all code-paths set the relocation model to static for Windows.  This
currently breaks on Windows ARM with `-mlong-calls` when built with clang.
Loosen the assertion to what it was previously.  We would ideally ensure that
all the configuration sets Windows to static relocation model.

llvm-svn: 274570
2016-07-05 18:30:52 +00:00
Matt Arsenault 2d79389508 DAGCombiner: Fold away vector extract of insert with the same index
This only really matters when the index is non-constant since the
constant case already gets taken care of by other combines.

llvm-svn: 274569
2016-07-05 18:25:02 +00:00
Tim Northover 01dff9d18a AArch64: use correct SDValue # when looking for bitfield placement.
The other use really does only care about the SDNode (it checks the
opcode against a whitelist), but bitFieldPlacement can be misled if
the node produces multiple results.

Patch by Ismail Badawi.

llvm-svn: 274567
2016-07-05 18:02:57 +00:00
Matt Arsenault ffc8275f2b AMDGPU: Fix folding SGPRs into madak/madmk src0
Because of the special immediate operand, the constant
bus is already used so SGPRs are never useful.

r263212 changed the name of the immediate operand, which
broke the verifier check for the restriction.

llvm-svn: 274564
2016-07-05 17:09:01 +00:00
Sam Kolton a9cd6aa895 [AMDGPU] Assembler: Fix parsing error with floating-point literals passed to integer instructions
Differential Revision: http://reviews.llvm.org/D21972

llvm-svn: 274551
2016-07-05 14:01:11 +00:00
Simon Pilgrim 4e96fbf3c1 [X86][AVX512] Autoupgrade the BROADCAST intrinsics
llvm-svn: 274550
2016-07-05 13:58:47 +00:00
Simon Pilgrim 1e91654b38 [X86][AVX512BW] Added BROADCAST intrinsics fast-isel generic IR tests
llvm-svn: 274545
2016-07-05 13:16:05 +00:00
James Molloy ae5ff990ae [Thumb] Reapply r272251 with a fix for PR28348 (mk 2)
The important thing I was missing was ensuring newly added constants were kept in topological order. Repositioning the node is correct if the constant is newly added (so it has no topological ordering) but wrong if it already existed - positioning it next in the worklist would break the topological ordering.

Original commit message:
  [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated

  If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead;

    int i(int a) {
      return a & 0xfffffeec;
    }

  Used to produce:
      ldr r1, [CONSTPOOL]
      ands r0, r1
    CONSTPOOL: 0xfffffeec

  And now produces:
      movs    r1, #255
      adds    r1, #20  ; Less costly immediate generation
      bics    r0, r1

llvm-svn: 274543
2016-07-05 12:37:13 +00:00
Simon Pilgrim 20ede63a33 [X86][AVX512] Added BROADCAST intrinsics fast-isel generic IR tests
llvm-svn: 274537
2016-07-05 10:15:14 +00:00
Nemanja Ivanovic 44513e545f [PowerPC] - Legalize vector types by widening instead of integer promotion
This patch corresponds to review:
http://reviews.llvm.org/D20443

It changes the legalization strategy for illegal vector types from integer
promotion to widening. This only applies for vectors with elements of width
that is a multiple of a byte since we have hardware support for vectors with
1, 2, 3, 8 and 16 byte elements.
Integer promotion for vectors is quite expensive on PPC due to the sequence
of breaking apart the vector, extending the elements and reconstituting the
vector. Two of these operations are expensive.
This patch causes between minor and major improvements in performance on most
benchmarks. There are very few benchmarks whose performance regresses. These
regressions can be handled in a subsequent patch with a DAG combine (similar
to how this patch handles int -> fp conversions of illegal vector types).

llvm-svn: 274535
2016-07-05 09:22:29 +00:00
Simon Pilgrim dea33cc2f3 [X86][AVX512] Added VSHUFPD intrinsics fast-isel generic IR tests
llvm-svn: 274534
2016-07-05 09:10:07 +00:00
Simon Pilgrim 8a01915bd2 [X86][AVX512VL] Added VSHUFPD/VSHUFPS intrinsics fast-isel generic IR tests
llvm-svn: 274533
2016-07-05 09:09:41 +00:00
Saleem Abdulrasool 4d3626ed31 test: relax the match on the timestamp
llvm-svn: 274529
2016-07-05 01:14:53 +00:00
Saleem Abdulrasool aecbdf70bf Object: support empty UID/GID fields
Normal archives do not have empty UID/GID fields.  However, the Microsoft
Import library format is a customized archive (it just uses an alternate symbol
index format).  When the import library is constructed by lib.exe, the UID and
GID fields are left empty.  Do not abort on such an input.

llvm-svn: 274528
2016-07-05 00:23:05 +00:00
Simon Pilgrim 3ad040909a [X86][AVX512] Add support for lowering shuffles to VSHUFPD
llvm-svn: 274520
2016-07-04 20:41:24 +00:00
James Molloy c3b4ed4a70 Revert "[Thumb] Reapply r272251 with a fix for PR28348"
This reverts commit r274510 - it made green dragon unhappy.

llvm-svn: 274512
2016-07-04 17:14:24 +00:00
James Molloy 9f019835ef [Thumb] Reapply r272251 with a fix for PR28348
We were using DAG->getConstant instead of DAG->getTargetConstant. This meant that we could inadvertently increase the use count of a constant if stars aligned, which it did in this testcase. Increasing the use count of the constant could cause ISel to fall over (because DAGToDAG lowering assumed the constant had only one use!)

Original commit message:
  [Thumb] Select a BIC instead of AND if the immediate can be encoded more optimally negated

  If an immediate is only used in an AND node, it is possible that the immediate can be more optimally materialized when negated. If this is the case, we can negate the immediate and use a BIC instead;

    int i(int a) {
      return a & 0xfffffeec;
    }

  Used to produce:
      ldr r1, [CONSTPOOL]
      ands r0, r1
    CONSTPOOL: 0xfffffeec

  And now produces:
      movs    r1, #255
      adds    r1, #20  ; Less costly immediate generation
      bics    r0, r1

llvm-svn: 274510
2016-07-04 16:35:41 +00:00
Simon Pilgrim 02d435d2f4 [X86][AVX512] Autoupgrade the VPERMPD/VPERMQ intrinsics
llvm-svn: 274506
2016-07-04 14:19:05 +00:00
Simon Pilgrim 8b82fce537 [X86][AVX512] Added VPERMPD/VPERMQ intrinsics fast-isel generic IR tests
llvm-svn: 274503
2016-07-04 13:43:10 +00:00
Simon Pilgrim 9fca300cbe [X86][AVX512] Autoupgrade the VPERMILPD/VPERMILPS intrinsics
llvm-svn: 274498
2016-07-04 12:40:54 +00:00
Simon Pilgrim c8cf2ddb6d [X86][AVX512] Added VPERMILPD/VPERMILPS intrinsics fast-isel generic IR tests
Added PSHUFD tests as well

llvm-svn: 274493
2016-07-04 11:07:50 +00:00
Nicolai Haehnle 84c9f9919a Add writeonly IR attribute
Summary:
This complements the earlier addition of IntrWriteMem and IntrWriteArgMem
LLVM intrinsic properties, see D18291.

Also start using the attribute for memset, memcpy, and memmove intrinsics,
and remove their special-casing in BasicAliasAnalysis.

Reviewers: reames, joker.eph

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D18714

llvm-svn: 274485
2016-07-04 08:01:29 +00:00
Craig Topper d83f818a3e [CodeGen] Make the code that detects a if a shuffle is really a concatenation of the inputs more general purpose.
We can now handle concatenation of each source multiple times. The previous code just checked for each source to appear once in either order.

This also now handles an entire source vector sized piece having undef indices correctly. We now concat with UNDEF instead of using one of the sources. This is responsible for the test case change.

llvm-svn: 274483
2016-07-04 06:19:35 +00:00
Simon Pilgrim 7f096de0b8 [X86][AVX512] Add support for 512-bit shuffle lowering to VPERMPD/VPERMQ
llvm-svn: 274473
2016-07-03 19:50:06 +00:00
Craig Topper d1eca0f32c [CodeGen] Teach OR combine of shuffles involving zero vectors to better handle undef indices.
Undef indices can now be treated as zeros. Or if its undef ORed with zero, we will keep the undef.

llvm-svn: 274472
2016-07-03 19:37:12 +00:00
Craig Topper 8e826d5abe [X86] Add tests to show that the DAG combine for OR of shuffles with zero vectors doesn't handle undefs as well as it could. Fix coming in another commit.
llvm-svn: 274471
2016-07-03 19:37:10 +00:00
Haicheng Wu b71b2f622a [MBB] add a missing corner case in UpdateTerminator()
After the block placement, if a block ends with a conditional branch, but the
next block is not its successor. The conditional branch should be changed to
unconditional branch.  This patch fixes PR28307, PR28297, PR28402.

Differential Revision: http://reviews.llvm.org/D21811

llvm-svn: 274470
2016-07-03 19:14:17 +00:00
Simon Pilgrim 68ea80649b [X86][AVX512] Add support for VPERMPD/VPERMQ masked shuffle comments
llvm-svn: 274469
2016-07-03 18:40:24 +00:00
Simon Pilgrim a0d73835b2 [X86][AVX512] Add support for 512-bit shuffle decoding of VPERMPD/VPERMQ
llvm-svn: 274468
2016-07-03 18:27:37 +00:00
Simon Pilgrim dbd6db0dc7 [X86][AVX512] Add support for VPALIGNR/PSHUFD/PSHUFHW/PSHUFLW masked shuffle comments
llvm-svn: 274466
2016-07-03 15:00:51 +00:00
Sanjay Patel cbaac41856 [InstCombine] enable vector select of bools -> logic folds
llvm-svn: 274465
2016-07-03 14:34:39 +00:00
Simon Pilgrim 598bdb6bfe [X86][AVX512] Add support for UNPCK masked shuffle comments
llvm-svn: 274464
2016-07-03 14:26:21 +00:00
Simon Pilgrim 1f59076196 [X86][AVX512] Add support for VPERM/VSHUF masked shuffle comments
llvm-svn: 274462
2016-07-03 13:55:41 +00:00
Simon Pilgrim 68f438a036 [X86][AVX512] Add support for PMOVZX masked shuffle comments
llvm-svn: 274461
2016-07-03 13:33:28 +00:00
Sanjay Patel 42396ae0ea add vector bool select tests and regenerate checks for scalar bool select tests
llvm-svn: 274460
2016-07-03 13:26:02 +00:00
Simon Pilgrim 7c2fbdc101 [X86][AVX512] Add support for masked shuffle comments
This patch adds support for including the avx512 mask register information in the mask/maskz versions of shuffle instruction comments.

This initial version just adds support for MOVDDUP/MOVSHDUP/MOVSLDUP to reduce the mass of test regenerations, other shuffle instructions can be added in due course.

Differential Revision: http://reviews.llvm.org/D21953

llvm-svn: 274459
2016-07-03 13:08:29 +00:00
Simon Pilgrim 129b720c18 [X86][AVX512] Add support for lowering shuffles to VPERMILPS
llvm-svn: 274458
2016-07-03 12:47:21 +00:00
Sean Silva 45835e731d Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

llvm-svn: 274455
2016-07-02 23:47:27 +00:00
Xinliang David Li 8a021317a2 [PM] Port LoopAccessInfo analysis to new PM
It is implemented as a LoopAnalysis pass as 
discussed and agreed upon.

llvm-svn: 274452
2016-07-02 21:18:40 +00:00
Simon Pilgrim 99e8a1aa0b [X86][AVX512] Add support for lowering shuffles to VPERMILPD
llvm-svn: 274450
2016-07-02 20:20:12 +00:00
Simon Pilgrim 72052f6de9 [X86][AVX512VL] Add fast-isel MOVDDUP/MOVSLDUP/MOVSHDUP shuffle tests
llvm-svn: 274448
2016-07-02 19:22:46 +00:00
Simon Pilgrim cde7c54baa [X86][AVX512] Add support for 512-bit PSHUFB lowering
llvm-svn: 274444
2016-07-02 18:14:31 +00:00
Simon Pilgrim 77dda7c2e0 [X86][AVX512] Converted the MOVDDUP/MOVSLDUP/MOVSHDUP masked intrinsics to generic IR
llvm-svn: 274443
2016-07-02 17:16:41 +00:00
Simon Pilgrim 19adee9d84 [X86][AVX512] Autoupgrade the MOVDDUP/MOVSLDUP/MOVSHDUP intrinsics
llvm-svn: 274439
2016-07-02 14:42:35 +00:00
Simon Pilgrim f040d8c061 [X86][AVX512] Add support for lowering shuffles to MOVDDUP/MOVSLDUP/MOVSHDUP
llvm-svn: 274436
2016-07-02 12:45:03 +00:00
Simon Pilgrim 5e95390957 [X86][AVX512] Add test cases that should lower to MOVSLDUP/MOVSHDUP
llvm-svn: 274435
2016-07-02 12:20:35 +00:00
Simon Pilgrim a6f262a1f9 [X86][AVX512] Add fast-isel shuffle tests
Its not worth trying to write out tests for all the avx512f builtins yet, just adding tests for lowering of generic IR as we transition to it (shuffles mainly right now).

llvm-svn: 274434
2016-07-02 12:13:29 +00:00
Qin Zhao b463c23c10 [esan|cfrag] Add counters for struct array accesses
Summary:
Adds one counter to the struct counter array for counting struct
array accesses.

Adds instrumentation to insert counter update for struct array
accesses.

Reviewers: aizatsky

Subscribers: llvm-commits, bruening, eugenis, kcc, zhaoqin, vitalybuka

Differential Revision: http://reviews.llvm.org/D21594

llvm-svn: 274420
2016-07-02 03:25:37 +00:00
Michael Kuperstein 071d8306b0 [PM] Port ConstantHoisting to the new Pass Manager
Differential Revision: http://reviews.llvm.org/D21945

llvm-svn: 274411
2016-07-02 00:16:47 +00:00
Reid Kleckner e092dad72c [codeview] Set the Nested and Scoped ClassOptions based on the scope chain
These are set on both the declaration record and the definition record.

llvm-svn: 274410
2016-07-02 00:11:07 +00:00
Matt Arsenault accddacb70 TII: Fix inlineasm size counting comments as insts
The main problem was counting comments on their own
line as instructions.

llvm-svn: 274405
2016-07-01 23:26:50 +00:00
David Majnemer 08bd744c2c [CodeView] Include the offset of nested members
Given something like:
  struct S {
    int a;
    struct { int b; };
  };

We would fail to give 'b' offset 4.  Instead, we would give it the
offset it has inside of it's struct.

llvm-svn: 274400
2016-07-01 23:12:48 +00:00
David Majnemer 6bdc24e7b6 [CodeView] Pretty print anonymous scopes
A namespace without a name should be written out as `anonymous
namespace' while a tag type without a name should be written out as
<unnamed-tag>.

llvm-svn: 274399
2016-07-01 23:12:45 +00:00
Matt Arsenault 7f681ac7a9 AMDGPU: Add feature for unaligned access
llvm-svn: 274398
2016-07-01 23:03:44 +00:00
Matt Arsenault 8af47a09e5 AMDGPU: Expand unaligned accesses early
Due to visit order problems, in the case of an unaligned copy
the legalized DAG fails to eliminate extra instructions introduced
by the expansion of both unaligned parts.

llvm-svn: 274397
2016-07-01 22:55:55 +00:00
Evgeniy Stepanov b736335dc3 [msan] Fix __msan_maybe_ for non-standard type sizes.
Fix incorrect calculation of the type size for __msan_maybe_warning_N
call that resulted in an invalid (narrowing) zext instruction and
"Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed."

Only happens in very large functions (with more than 3500 MSan
checks) operating on integer types that are not power-of-two.

llvm-svn: 274395
2016-07-01 22:49:59 +00:00
Matt Arsenault 327bb5ad82 AMDGPU: Improve load/store of illegal types.
There was a combine before to handle the simple copy case.
Split this into handling loads and stores separately.

We might want to change how this handles some of the vector
extloads, since this can result in large code size increases.

llvm-svn: 274394
2016-07-01 22:47:50 +00:00
Reid Kleckner ad56ea3129 [codeview] Don't record UDTs for anonymous structs
MSVC makes up names for these anonymous structs, but we don't (yet).
Eventually Clang should use getTypedefNameForAnonDecl() to put some name
in the debug info, and we can update the test case when that happens.

llvm-svn: 274391
2016-07-01 22:24:51 +00:00
Alina Sbirlea 8d8aa5dd6c Address two correctness issues in LoadStoreVectorizer
Summary:
GetBoundryInstruction returns the last instruction as the instruction which follows or end(). Otherwise the last instruction in the boundry set is not being tested by isVectorizable().
Partially solve reordering of instructions. More extensive solution to follow.

Reviewers: tstellarAMD, llvm-commits, jlebar

Subscribers: escha, arsenm, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21934

llvm-svn: 274389
2016-07-01 21:44:12 +00:00
Dehao Chen 7b2c997736 Specify mtriple for the frame-order.ll test.
Summary: original test may have different bahavior on different bot, specifically it broke llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21931

llvm-svn: 274368
2016-07-01 17:35:13 +00:00
Dehao Chen ad2b4e1334 Do not count debug instructions when counting number of uses to reorder frame objects.
Summary: The code generation should be independent of the debug info.

Reviewers: zansari, davidxl, mkuper, majnemer

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D21911

llvm-svn: 274357
2016-07-01 15:40:25 +00:00
Nikolay Haustov beb24f5b20 Resubmit r268719 - AMDGPU/SI: Add amdgpu_kernel calling convention. Part 2.
This was reverted in r268740 because of problems with corresponding Clang change.
Clang change was updated and resubmitted in r274220.

Check calling convention in AMDGPUMachineFunction::isKernel

This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.

Also, in the future unused non-kernels may be optimized.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D19917

llvm-svn: 274341
2016-07-01 10:00:58 +00:00
Sam Kolton 5196b88f07 [AMDGPU] Assembler: support SDWA for VOPC instructions
Summary: dst_sel and dst_unused disabled for VOPC as they have no effect on result

Reviewers: artem.tamazov, tstellarAMD, vpykhtin

Subscribers: arsenm, kzhuravl

Differential Revision: http://reviews.llvm.org/D21376

llvm-svn: 274340
2016-07-01 09:59:21 +00:00
Duncan P. N. Exon Smith e60719b3fa Revert "add tests for bugs fixed by the GVN hoist pass"
This reverts commit r274327 since the tests fail.  E.g.:
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17240

It looks like this commit is building on r274305, but that commit caused
a miscompile and was reverted in r274320.

llvm-svn: 274332
2016-07-01 04:55:13 +00:00
Sebastian Pop 196ba4f844 add tests for bugs fixed by the GVN hoist pass
https://llvm.org/bugs/show_bug.cgi?id=20242
https://llvm.org/bugs/show_bug.cgi?id=22005

llvm-svn: 274327
2016-07-01 03:03:19 +00:00
Reid Kleckner b5af11dfa3 [codeview] Add DISubprogram::ThisAdjustment
Summary:
This represents the adjustment applied to the implicit 'this' parameter
in the prologue of a virtual method in the MS C++ ABI. The adjustment is
always zero unless multiple inheritance is involved.

This increases the size of DISubprogram by 8 bytes, unfortunately. The
adjustment really is a signed 32-bit integer. If this size increase is
too much, we could probably win it back by splitting out a subclass with
info specific to virtual methods (virtuality, vindex, thisadjustment,
containingType).

Reviewers: aprantl, dexonsmith

Subscribers: aaboud, amccarth, llvm-commits

Differential Revision: http://reviews.llvm.org/D21614

llvm-svn: 274325
2016-07-01 02:41:21 +00:00
Matt Arsenault 0101ecade0 LoadStoreVectorizer: Don't increase alignment with no align set
If no alignment was set on the load/stores, it would vectorize
to the new type even though this increases the default alignment.

llvm-svn: 274323
2016-07-01 02:09:38 +00:00
Matt Arsenault 370e8226c7 LoadStoreVectorizer: Check TTI for vec reg bit width
llvm-svn: 274322
2016-07-01 02:07:22 +00:00
Matt Arsenault 42ad17059a LoadStoreVectorizer: Fix assert when merging pointer ops
This needs to use inttoptr/ptrtoint if combining an int and pointer
load. If a pointer is used always do an integer load.

llvm-svn: 274321
2016-07-01 01:55:52 +00:00
Duncan P. N. Exon Smith 9d1f156418 Revert "code hoisting pass based on GVN"
This reverts commit r274305, since it breaks self-hosting:
  http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232

Note that the blamelist on lab.llvm.org:8011 is incorrect.  The previous
build was r274299, but somehow r274305 wasn't included in the blamelist:
  http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules

llvm-svn: 274320
2016-07-01 01:51:40 +00:00
Matt Arsenault 241f34cde8 LoadStoreVectorizer: Use AA metadata
This was not passing the full instruction with metadata
to the alias query.

llvm-svn: 274318
2016-07-01 01:47:46 +00:00
Matt Arsenault d7e8898bdd LoadStoreVectorizer: if one element of a vector is integer, default to
integer.

Fixes issues on some architectures where we use arithmetic ops to build
vectors, which can cause bad things to happen for loads/stores of mixed
types.

Patch by Fiona Glaser

llvm-svn: 274307
2016-07-01 00:37:01 +00:00
Matt Arsenault 8a4ab5e19f LoadStoreVectorizer: Fix crashes on sub-byte types
llvm-svn: 274306
2016-07-01 00:36:54 +00:00
Sebastian Pop 5c5798c57c code hoisting pass based on GVN
This pass hoists duplicated computations in the program. The primary goal of
gvn-hoist is to reduce the size of functions before inline heuristics to reduce
the total cost of function inlining.

Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki.
Important algorithmic contributions by Daniel Berlin under the form of reviews.

Differential Revision: http://reviews.llvm.org/D19338

llvm-svn: 274305
2016-07-01 00:24:31 +00:00
Matt Arsenault 079d0f19a2 LoadStoreVectorizer: Check skipFunction first.
Also add test I forgot to add to r274296.

llvm-svn: 274299
2016-06-30 23:50:18 +00:00
Matt Arsenault 2cbe52b990 LoadStoreVectorizer: Skip optnone functions
llvm-svn: 274296
2016-06-30 23:30:29 +00:00
Matt Arsenault 08debb0244 Add LoadStoreVectorizer pass
This was contributed by Apple, and I've been working on
minimal cleanups and generalizing it.

llvm-svn: 274293
2016-06-30 23:11:38 +00:00
Matt Arsenault 727e279ac4 SLPVectorizer: Move propagateMetadata to VectorUtils
This will be re-used by the LoadStoreVectorizer.

Fix handling of range metadata and testcase by Justin Lebar.

llvm-svn: 274281
2016-06-30 21:17:59 +00:00
Yunzhong Gao b386955adc Add an artificial line-0 debug location when the compiler emits a call to
__stack_chk_fail(). This avoids a compiler crash.

Differential Revision: http://reviews.llvm.org/D21818

llvm-svn: 274263
2016-06-30 18:49:04 +00:00
Wei Mi 95685faeee Refine the set of UniformAfterVectorization instructions.
Except the seed uniform instructions (conditional branch and consecutive ptr
instructions), dependencies to be added into uniform set should only be used
by existing uniform instructions or intructions outside of current loop.

Differential Revision: http://reviews.llvm.org/D21755

llvm-svn: 274262
2016-06-30 18:42:56 +00:00
Etienne Bergeron 078d8f69b6 revert http://reviews.llvm.org/D21101
llvm-svn: 274251
2016-06-30 17:52:24 +00:00
Zachary Turner ab58ae8730 [pdb] Re-add code to write PDB files.
Somehow all the functionality to write PDB files got removed,
probably accidentally when uploading the patch perhaps the wrong
one got uploaded.  This re-adds all the code, as well as the
corresponding test.

llvm-svn: 274248
2016-06-30 17:43:00 +00:00
Zachary Turner a30bd1a1bc Update llvm-pdbdump to use subcommands.
llvm-svn: 274247
2016-06-30 17:42:48 +00:00
Etienne Bergeron 47cf4eabe6 [exceptions] Upgrade exception handlers when stack protector is used
Summary:
MSVC provide exception handlers with enhanced information to deal with security buffer feature (/GS).

To be more secure, the security cookies (GS and SEH) are validated when unwinding the stack.

The following code:
```
void f() {}

void foo() {
  __try {
    f();
  } __except(1) {
    f();
  }
}
```

Reviewers: majnemer, rnk

Subscribers: thakis, llvm-commits, chrisha

Differential Revision: http://reviews.llvm.org/D21101

llvm-svn: 274239
2016-06-30 15:36:59 +00:00
Jun Bum Lim 596a3bd9ec [DSE] Fix bug in partial overwrite tracking
Summary:
Found cases where DSE incorrectly add partially-overwritten intervals.
Please see the test case for details.

Reviewers: mcrosier, eeckstein, hfinkel

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D21859

llvm-svn: 274237
2016-06-30 15:32:20 +00:00
Sanjay Patel 7c6eab5777 [InstCombine] shrink switch conditions better (PR24766)
https://llvm.org/bugs/show_bug.cgi?id=24766#c2

This removes a hack that was added for the benefit of x86 codegen. 
It prevented shrinking the switch condition even to smaller legal (DataLayout) types.
We have a safety mechanism in CGP after:
http://reviews.llvm.org/rL251857
...so we're free to use the optimal (smallest) IR type now.

Differential Revision: http://reviews.llvm.org/D12965

llvm-svn: 274233
2016-06-30 14:51:21 +00:00
Sanjay Patel 7ad98babfa [InstCombine] extend matchSelectFromAndOr() to work with i1 scalar types
If the incoming types are i1, then we don't have to pattern match any sext ops.

Differential Revision: http://reviews.llvm.org/D21740

llvm-svn: 274228
2016-06-30 14:18:18 +00:00
Jonas Paulsson 25e193da4c [SystemZ] Let z13 also support FeatureMiscellaneousExtensions.
This processor feature had been left out by mistake from the z13
ProcessorModel.

This time with updated test case. Thanks, Hans.

Reviewed by Ulrich Weigand.

llvm-svn: 274216
2016-06-30 07:13:56 +00:00
David Majnemer 9319cbc045 [CodeView] Implement support for bitfields in LLVM
CodeView need to know the offset of the storage allocation for a
bitfield.  Encode this via the "extraData" field in DIDerivedType and
introduced a new flag, DIFlagBitField, to indicate whether or not a
member is a bitfield.

This fixes PR28162.

Differential Revision: http://reviews.llvm.org/D21782

llvm-svn: 274200
2016-06-30 03:00:20 +00:00
Sanjoy Das 0da2d14766 [SCEV] Compute max be count from shift operator only if all else fails
In particular, check to see if we can compute a precise trip count by
exhaustively simulating the loop first.

llvm-svn: 274199
2016-06-30 02:47:28 +00:00
George Burgess IV d86e38e1db [CFLAA] Add support for ModRef queries.
This patch makes CFLAA answer some ModRef queries. Because we don't
distinguish between reading/writing when making StratifiedSets, we're
unable to offer any of the readonly-related answers.

Patch by Jia Chen.

Differential Revision: http://reviews.llvm.org/D21858

llvm-svn: 274197
2016-06-30 02:11:26 +00:00
Sanjay Patel 348111f4b9 add vector tests to show missing transform
llvm-svn: 274192
2016-06-30 00:09:13 +00:00
Sanjay Patel c3701e8b92 regenerate checks
llvm-svn: 274188
2016-06-29 23:58:39 +00:00
Vedant Kumar d6d192cd12 [llvm-cov] Use relative paths to file reports in -output-dir mode
This makes it possible to e.g copy a report to another filesystem.

llvm-svn: 274173
2016-06-29 21:55:46 +00:00
Artem Belevich 4d5d7be8cc Revert r273313 "[NVPTX] Improve lowering of byval args of device functions."
The change causes llvm crash in some unoptimized builds.

llvm-svn: 274163
2016-06-29 20:51:15 +00:00
Evgeniy Stepanov a5da256f92 StackColoring for SafeStack.
This is a fix for PR27842.

An IR-level implementation of stack coloring tailored to work with
SafeStack. It is a bit weaker than the MI implementation in that it
does not the "lifetime start at first access" logic. This can be
improved in the future.

This patch also replaces the naive implementation of stack frame
layout with a greedy algorithm that can split existing stack slots
and even fit small objects inside the alignment padding of other
objects.

llvm-svn: 274162
2016-06-29 20:37:43 +00:00
Tim Shen aec68b263d [InstCombine] Simplify and correct folding fcmps with the same children
Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp *, x, y) | (fcmp *, x, y) and (fcmp *, x, y) & (fcmp *, x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants.

Currently InstCombine wrongly folds (fcmp ogt, x, y) | (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that.

Reviewers: spatel

Subscribers: llvm-commits, iteratee, echristo

Differential Revision: http://reviews.llvm.org/D21775

llvm-svn: 274156
2016-06-29 20:10:17 +00:00
Tim Shen 860a67eb4c [InstCombine, NFC] Change the generated variable names by creating new instructions
This removes some noise for D21775's test changes.

llvm-svn: 274155
2016-06-29 20:10:13 +00:00
Nirav Dave 8e10380b73 Permit memory operands in ins/outs instructions
[x86] (PR15455) While (ins|outs)[bwld] instructions do not take %dx as a
memory operand, various unofficial references do and objdump
disassembles to this format. Extend special treatment of
similar (in|out)[bwld] operations.

Reviewers: craig.topper, rnk, ab

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18837

llvm-svn: 274152
2016-06-29 19:54:27 +00:00
Rafael Espindola 2211f015cc Don't verify inputs to the Linker if ODR merging.
This fixes pr28072.

The point, as Duncan pointed out, is that the file is already
partially linked by just reading it.

Long term I think the solution is to make metadata owned by the module
and then the linker will lazily read it and be in charge of all the
linking. Running a verifier in each input will defeat the lazy
loading, but will be legal.

Right now we are at the unfortunate position that to support odr
merging we cannot verify the inputs, which mildly annoying (see test
update).

llvm-svn: 274148
2016-06-29 18:31:48 +00:00
Tim Shen 4561e784f4 [InstCombine] Add full tests for FoldAndOfFCmps and FoldOrOfFCmps
Summary: This adds tests for covering all cases that FoldAndOfFCmps and FoldOrOfFCmps handle.

Reviewers: spatel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D21844

llvm-svn: 274144
2016-06-29 17:55:11 +00:00
Vedant Kumar c1561cb2fa [llvm-cov] Change some FileCheck prefixes to make tests reusable (NFC)
I'm planning on extending these two tests with checks that validate
html coverage reports. Make it easier to extend them by not using a
prefix called "CHECK".

llvm-svn: 274143
2016-06-29 17:47:08 +00:00
Nico Weber 0a480b2c05 Add a regression test for PR28348.
llvm-svn: 274142
2016-06-29 17:34:31 +00:00
Nico Weber 12fdf60b75 Revert r272251, it caused PR28348.
llvm-svn: 274141
2016-06-29 17:33:41 +00:00
Ahmed Bougacha 15a2f6d58c [X86] Lower blended PACKUSes using appropriate types.
When lowering two blended PACKUS, we used to disregard the types
of the PACKUS inputs, indiscriminately generating a v16i8 PACKUS.

This leads to non-selectable things like:
    (v16i8 (PACKUS (v4i32 v0), (v4i32 v1)))

Instead, check that the PACKUSes have the same type, and use that
as the final result type.

llvm-svn: 274138
2016-06-29 16:56:09 +00:00
Vedant Kumar 84cfb884e2 [llvm-cov] Disable PGO name compression in a test file
Some bots do not configure llvm with zlib enabled. Should fix:

http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/15571

llvm-svn: 274137
2016-06-29 16:34:57 +00:00
Vedant Kumar 2ca5eaa85a Fix a typo; NFC
llvm-svn: 274136
2016-06-29 16:23:34 +00:00
Vedant Kumar 4a54abeacd [llvm-cov] Do not allow ".." to escape the coverage sub-directory
In -output-dir mode, file reports are placed into a "coverage"
directory. If filenames in the coverage mapping contain "..", they might
escape out of this directory.

Fix the problem by removing ".." from source filenames (expand the path
component).

llvm-svn: 274135
2016-06-29 16:22:12 +00:00
Rafael Espindola c4cabb8054 Update tests to use at least darwin9.
llvm-svn: 274129
2016-06-29 14:51:10 +00:00
Simon Pilgrim f9c5908ffd [X86][SSE2] Added _mm_loadu_si64 test to match llvm\tools\clang\test\CodeGen\sse2-builtins.c
llvm-svn: 274127
2016-06-29 14:05:33 +00:00
Simon Pilgrim 851019175b [X86] Regenerated popcnt combine tests
llvm-svn: 274124
2016-06-29 13:54:03 +00:00
Elena Demikhovsky 5e21c94f25 Reverted patch 273864
llvm-svn: 274115
2016-06-29 10:01:06 +00:00
Marcin Koscielnicki 518cbc7cc3 [SystemZ] Add floating-point test data class instructions.
These are not used by CodeGen yet - ISD combiners creating the new node
will come in subsequent patches.

llvm-svn: 274108
2016-06-29 07:29:07 +00:00
Craig Topper df7454f94b Revert "[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition."
This is breaking an optimizaton remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything.

llvm-svn: 274102
2016-06-29 04:57:00 +00:00
Craig Topper 2cc199baff [ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition.
If a operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case.

I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at.

Differential revision: http://reviews.llvm.org/D21493

llvm-svn: 274098
2016-06-29 03:46:47 +00:00
Craig Topper 3a011de10c [DAGCombine] Teach DAG combine to handle ORs of shuffles involving zero vectors where the zero vector is the first operand to the shuffle instead of the second.
llvm-svn: 274097
2016-06-29 03:29:12 +00:00
Craig Topper 1e7e36e7e6 [DAGCombine] Add test cases to show that DAG combining an OR of two shuffles with zero vectors doesn't work if the zero vector is the first operand of the shuffle. Fix coming in a follow up patch.
llvm-svn: 274096
2016-06-29 03:29:09 +00:00
Eric Christopher 0c58837b1f Revert "[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions"
Revert "[InstCombine] Combine A->B->A BitCast"

as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847

This reverts commits r270135 and r263734.

llvm-svn: 274094
2016-06-29 03:05:58 +00:00
Vedant Kumar 8d74cb27e8 [llvm-cov] Minor cleanups to prepare for the html format patch
- Add renderView{Header,Footer}, renderLineSuffix, and hasSubViews to
  support creating tables with nested views.

- Move the 'Format' cl::opt to make it easier to extend.

- Just create one function view file, instead of overwriting the same
  file for every new function. Add a regression test for this.

llvm-svn: 274086
2016-06-29 00:38:21 +00:00
Kevin Enderby 42398051d8 Finish cleaning up most of the error handling in libObject’s MachOUniversalBinary
and its clients to use the new llvm::Error model for error handling.

Changed getAsArchive() from ErrorOr<...> to Expected<...> so now all
interfaces there use the new llvm::Error model for return values.

In the two places it had if (!Parent) this is actually a program error so changed
from returning errorCodeToError(object_error::parse_failed) to calling
report_fatal_error() with a message.

In getObjectForArch() added error messages to its two llvm::Error return values
instead of returning errorCodeToError(object_error::arch_not_found) with no
error message.

For the llvm-obdump, llvm-nm and llvm-size clients since the only binary files in
Mach-O Universal Binaries that are supported are Mach-O files or archives with
Mach-O objects, updated their logic to generate an error when a slice contains
something like an ELF binary instead of ignoring it. And added a test case for
that.

The last error stuff to be cleaned up for libObject’s MachOUniversalBinary is
the use of errorOrToExpected(Archive::create(ObjBuffer)) which needs
Archive::create() to be changed from ErrorOr<...> to Expected<...> first,
which I’ll work on next. 

llvm-svn: 274079
2016-06-28 23:16:13 +00:00
Weiming Zhao 5410edddb1 [ARM] Fix 28282: cost computation for constant hoisting
Summary:
This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282

Currently the cost model of constant hoisting checks the bit width of the data type of the constants.
However, the actual immediate value is small enough and not need to be hoisted.
This patch checks for the actual bit width needed for the constant.

Reviewers: t.p.northover, rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21668

llvm-svn: 274073
2016-06-28 22:30:45 +00:00
Dehao Chen 8cd84aaa6f Relax the clearance calculating for breaking partial register dependency.
Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that is necessary to break partial register dependency.

Reviewers: wmi, davidxl, stoklund, zansari, myatsina, RKSimon, DavidKreitzer, mkuper, joerg, spatel

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D21560

llvm-svn: 274068
2016-06-28 21:19:34 +00:00
Chris Bieneman 92b2e8a295 [YAML] Fix YAML tags appearing before the start of sequence elements
Our existing yaml::Output code writes tags immediately when mapTag is called, without any state handling. This results in tags on sequence elements being written before the element itself. For example, we see this:

SomeArray:     !elem_type
  - key1:         1
    key2:         2 !elem_type2
  - key3:         3
    key4:         4

We should instead see:

SomeArray:
  - !elem_type
    key1:         1
    key2:         2
  - !elem_type2
    key3:         3
    key4:         4

Our reader handles reading properly, so this bug only impacts writing yaml sequences with tagged elements.

As a test for this I've modified the Mach-O yaml encoding to allways apply the !mach-o tag when encoding MachOYAML::Object entries. This results in the !mach-o tag appearing as expected in dumped fat files.

llvm-svn: 274067
2016-06-28 21:10:26 +00:00
Zhan Jun Liau 347db3e18e [SystemZ] Use NILL instruction instead of NILF where possible
Summary: SystemZ shift instructions only use the last 6 bits of the shift
amount. When the result of an AND operation is used as a shift amount, this
means that we can use the NILL instruction (which operates on the last 16 bits)
rather than NILF (which operates on the last 32 bits) for a 16-bit savings in
instruction size.

Reviewers: uweigand

Subscribers: llvm-commits

Author: colpell
Committing on behalf of Elliot.

Differential Revision: http://reviews.llvm.org/D21686

llvm-svn: 274066
2016-06-28 21:03:19 +00:00
Matthias Braun 0b9a07883d X86FrameLowering: Check subregs when deciding prolog kill flags
llvm-svn: 274057
2016-06-28 20:31:56 +00:00
Sanjay Patel 3a0f2606ec minimize regression tests and update checks
llvm-svn: 274047
2016-06-28 18:40:08 +00:00
Sanjay Patel 8ce43c098b minimize regression tests and update checks
llvm-svn: 274046
2016-06-28 18:33:10 +00:00
Artur Pilipenko 7ad95ec22d Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270

llvm-svn: 274043
2016-06-28 18:27:25 +00:00
Jacques Pienaar f43266b868 [lanai] Update ELF number to correspond to the assigned number.
Change EM_LANAI to correspond to machine number assigned by Xinuos.

llvm-svn: 274042
2016-06-28 18:22:22 +00:00
Michael Kuperstein a118acb82f [X86] Update a test with more explicit checks. NFC.
llvm-svn: 274040
2016-06-28 17:42:13 +00:00
Vedant Kumar 9cbad2c2b8 [llvm-cov] Create an index of reports in -output-dir mode
This index lists the reports available in the 'coverage' sub-directory.
This will help navigate coverage output from large projects.

This commit factors the file creation code out of SourceCoverageView and
into CoveragePrinter.

llvm-svn: 274029
2016-06-28 16:12:24 +00:00
Vedant Kumar 64d8a029e9 [llvm-cov] Minor cleanups (NFC)
- Test the '-o' alias for -output-dir.
- Use a helper method in a conditional.
- Add a period.

llvm-svn: 274028
2016-06-28 16:12:20 +00:00
David Majnemer 1c7d532cde [X86] Make WRPKRU/RDPKRU pass -verify-machineinstrs
The original implementation attempted to zero registers using
XOR %foo, %foo.  This is problematic because it constitutes a
read-modify-write of a register which might not be defined.

Instead, use MOV32r0 to avoid these problems; expandPostRAPseudo does
the right thing here.

llvm-svn: 274024
2016-06-28 16:04:46 +00:00
Marcin Koscielnicki 234e5a809b [SystemZ] Save/restore r6 and r7 if function contains landing pad.
This fixes PR27102.

Differential Revision: http://reviews.llvm.org/D18541

llvm-svn: 274017
2016-06-28 14:13:11 +00:00
Simon Pilgrim 5f71c909f0 [X86][AVX] Peek through bitcasts to find the source of broadcasts (reapplied)
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.

This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.

Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.

As we're being more aggressive with bitcasts, we also need to ensure that the broadcast type is correctly bitcasted

Differential Revision: http://reviews.llvm.org/D21660

llvm-svn: 274013
2016-06-28 13:24:05 +00:00
Arnaud A. de Grandmaison eee4711fbe [gold] Really fix test to run on non x86 platforms.
Address post-commit comment from H.J. Lu.

llvm-svn: 274000
2016-06-28 08:12:09 +00:00
Simon Pilgrim c15d217831 [X86][SSE] Added support for combining target shuffles to (V)PSHUFD/VPERMILPD/VPERMILPS immediate permutes
This patch allows target shuffles to be combined to single input immediate permute instructions - (V)PSHUFD/VPERMILPD/VPERMILPS - allowing more general pattern matching than what we current do and improves the likelihood of memory folding compared to existing patterns which tend to reuse the input in multiple arguments.

Further permute instructions (V)PSHUFLW/(V)PSHUFHW/(V)PERMQ/(V)PERMPD may be added in the future but its proven tricky to create tests cases for them so far. (V)PSHUFLW/(V)PSHUFHW is already handled quite well in combineTargetShuffle so it may be that removing some of that code may allow us to perform more of the combining in one place without duplication.

Differential Revision: http://reviews.llvm.org/D21148

llvm-svn: 273999
2016-06-28 08:08:15 +00:00
Elena Demikhovsky a727f3cfde [X86 Target Lowering] Merged ICMP test.
llvm-svn: 273995
2016-06-28 06:25:38 +00:00
Adam Nemet bd861acf29 [LLE] Don't hoist conditionally executed loads
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional.  Thus we would
access a memory location that the original loop did not access.

llvm-svn: 273991
2016-06-28 04:02:47 +00:00
Vedant Kumar 7937ef3796 Reapply "[llvm-cov] Add an -output-dir option for the show sub-command""
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.

In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.

Changes since the initial commit:

- Avoid accidentally closing stdout twice.

llvm-svn: 273985
2016-06-28 02:09:39 +00:00
Nick Lewycky 9980075133 NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'.
Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test.

llvm-svn: 273984
2016-06-28 01:45:05 +00:00
Vedant Kumar a48d9fe86a Revert "[llvm-cov] Add an -output-dir option for the show sub-command"
This reverts commit r273971. test/profile/instrprof-visibility.cpp is
failing because of an uncaught error in SafelyCloseFileDescriptor.

llvm-svn: 273978
2016-06-28 01:14:04 +00:00
Matt Arsenault b4d9503171 AMDGPU: Fix out of bounds indirect indexing errors
This was producing acceses to registers beyond the super
register's limits, resulting in verifier failures.

llvm-svn: 273977
2016-06-28 01:09:00 +00:00
Vedant Kumar 02507c435c [llvm-cov] Add an -output-dir option for the show sub-command
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.

In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.

llvm-svn: 273971
2016-06-28 00:18:57 +00:00
Vedant Kumar dcbf4d68b2 [llvm-cov] Use -check-prefixes in a test (NFC)
llvm-svn: 273970
2016-06-28 00:18:53 +00:00
Vedant Kumar 635c83c1b4 [llvm-cov] Add a format option for the 'show' sub-command (mostly NFC)
llvm-svn: 273968
2016-06-28 00:15:54 +00:00
Chandler Carruth dca834089a [PM] Improve the debugging and logging facilities of the CGSCC bits of
the new pass manager.

This adds operator<< overloads for the various bits of the
LazyCallGraph, dump methods for use from the debugger, and debug logging
using them to the CGSCC pass manager.

Having this was essential for debugging the call graph update patch, and
I've extracted what I could from that patch here to minimize the delta.

llvm-svn: 273961
2016-06-27 23:26:08 +00:00
Easwaran Raman 22eb80a114 Fix size computation of array allocation in inline cost analysis
Differential revision: http://reviews.llvm.org/D21690

llvm-svn: 273952
2016-06-27 22:31:53 +00:00
Sanjay Patel 59ed2ffca3 [InstCombine] shrink type of sdiv if dividend is sexted and constant divisor is small enough (PR28153)
This should fix PR28153:
https://llvm.org/bugs/show_bug.cgi?id=28153

Differential Revision: http://reviews.llvm.org/D21769

llvm-svn: 273951
2016-06-27 22:27:11 +00:00
Kevin Enderby 1051909df1 Change all but the last ErrorOr<...> use for MachOUniversalBinary to Expected<...> to
allow a good error message to be produced.

I added the one test case that the object file tools could produce an error
message.  The other two errors can’t be triggered if the input file is passed
through sys::fs::identify_magic().  But the malformedError("bad magic number")
does get triggered by the logic in llvm-dsymutil when dealing with a normal
Mach-O file.  The other "File too small ..." error would take a logic error
currently to produce and is not tested for.

llvm-svn: 273946
2016-06-27 21:39:39 +00:00
Matt Arsenault 59c0ffa22a AMDGPU: Implement per-function subtargets
llvm-svn: 273940
2016-06-27 20:48:03 +00:00
Matt Arsenault 03d8584590 AMDGPU: Move subtarget feature checks into passes
llvm-svn: 273937
2016-06-27 20:32:13 +00:00
Sanjay Patel 5cdf699daa add tests for PR28153
llvm-svn: 273936
2016-06-27 20:28:59 +00:00
Justin Holewinski cb29fb4a98 Only emit extension for zeroext/signext arguments if type is < 32 bits
Reviewers: jingyue, jlebar

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D21756

llvm-svn: 273922
2016-06-27 20:22:22 +00:00
Rafael Espindola 8121becac3 Teach shouldAssumeDSOLocal about tls.
Fixes a fixme about handling other visibilities.

llvm-svn: 273921
2016-06-27 20:19:14 +00:00
Elena Demikhovsky 6f2ec8104a Fixed crash of SLP Vectorizer on KNL
The bug is connected to vector GEPs.
https://llvm.org/bugs/show_bug.cgi?id=28313

llvm-svn: 273919
2016-06-27 20:07:00 +00:00
Chris Bieneman e5cc1fd498 [yaml2obj] Missed updating a few test cases in r273915
This should fix the broken bots.

llvm-svn: 273918
2016-06-27 20:02:49 +00:00
Matt Arsenault 21a4625a16 AMDGPU: Fix verifier errors with undef vector indices
Also fix pointlessly adding exec to liveins.

llvm-svn: 273916
2016-06-27 19:57:44 +00:00
Chris Bieneman 8ff0c11357 [yaml2obj] Remove --format option in favor of YAML tags
Summary:
Our YAML library's handling of tags isn't perfect, but it is good enough to get rid of the need for the --format argument to yaml2obj. This patch does exactly that.

Instead of requiring --format, it infers the format based on the tags found in the object file. The supported tags are:

!ELF
!COFF
!mach-o
!fat-mach-o

I have a corresponding patch that is quite large that fixes up all the in-tree test cases.

Reviewers: rafael, Bigcheese, compnerd, silvas

Subscribers: compnerd, llvm-commits

Differential Revision: http://reviews.llvm.org/D21711

llvm-svn: 273915
2016-06-27 19:53:53 +00:00
Matt Arsenault 82f41518ed Verifier: Reject non-float !fpmath
Code already assumes this is float. getFPAccuracy()
crashes on any other type.

llvm-svn: 273912
2016-06-27 19:43:15 +00:00
Matt Arsenault f0f721a682 DAGCombiner: Don't narrow volatile vector loads + extract
llvm-svn: 273909
2016-06-27 19:31:04 +00:00
Elena Demikhovsky ad3929cc64 X86 Lowering - Fixed a crash in ICMP scalar instruction
Fixed a bug in EmitTest() function in combining shl + icmp.

https://llvm.org/bugs/show_bug.cgi?id=28119

llvm-svn: 273899
2016-06-27 18:07:16 +00:00
Sanjay Patel c6ada53be5 [InstCombine] use m_APInt for div --> ashr fold
The APInt matcher works with splat vectors, so we get this fold for vectors too.

llvm-svn: 273897
2016-06-27 17:25:57 +00:00
Artur Pilipenko 72f76b8805 Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures.
llvm-svn: 273895
2016-06-27 16:54:33 +00:00
Easwaran Raman 1832bf6aee [PM] Port PartialInlining to the new PM
Differential revision: http://reviews.llvm.org/D21699

llvm-svn: 273894
2016-06-27 16:50:18 +00:00
Artur Pilipenko a36aa41519 Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270

llvm-svn: 273892
2016-06-27 16:29:26 +00:00
Simon Pilgrim 476e8ceed3 [X86][SSE] Added extra broadcast tests to cover PR28327
llvm-svn: 273891
2016-06-27 16:15:37 +00:00
Zhan Jun Liau 4f130b4410 [SystemZ] Avoid generating 2 XOR instructions for (and (xor x, -1), y)
Summary:
Created a pattern to match 64-bit mode (and (xor x, -1), y)
to a shorter sequence of instructions.

Before the change, the canonical form is translated to:
        xihf    %r3, 4294967295
        xilf    %r3, 4294967295
        ngr     %r2, %r3

After the change, the canonical form is translated to:
        ngr     %r3, %r2
        xgr     %r2, %r3

Reviewers: zhanjunl, uweigand

Subscribers: llvm-commits

Author: assem

Committing on behalf of Assem.

Differential Revision: http://reviews.llvm.org/D21693

llvm-svn: 273887
2016-06-27 15:55:30 +00:00
Krzysztof Parzyszek 5da24e5495 [Hexagon] Equally-sized vectors are equivalent in ISel (except vNi1)
llvm-svn: 273885
2016-06-27 15:08:22 +00:00
Nico Weber 1e058160dd Revert 273848, it caused PR28329
llvm-svn: 273879
2016-06-27 14:36:46 +00:00
Simon Pilgrim 9c2f378587 Removed duplicate assertions note
llvm-svn: 273874
2016-06-27 13:06:18 +00:00
Elena Demikhovsky f65e865e33 Removed extra test from the prev commit.
llvm-svn: 273865
2016-06-27 11:40:49 +00:00
Elena Demikhovsky 4c58b2761a Fixed consecutive memory access detection in Loop Vectorizer.
It did not handle correctly cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)

  *to++ = *from++;

I use getPtrStride() to find Stride for memory access and return 0 is the Stride is not 1 or -1.

Re-commit rL273257 - revision: http://reviews.llvm.org/D20789

llvm-svn: 273864
2016-06-27 11:19:23 +00:00
Arnaud A. de Grandmaison efb0b899d3 [gold] Fix test to not assume it runs on x86 hardware.
llvm-svn: 273854
2016-06-27 09:13:03 +00:00
Hrvoje Varga 24b975dc66 [mips][micromips] Implement LD, LLD, LWU, SD, DSRL, DSRL32 and DSRLV instructions
Differential Revision: http://reviews.llvm.org/D16625

llvm-svn: 273850
2016-06-27 08:23:28 +00:00
Simon Pilgrim a45da385f8 [X86][AVX] Peek through bitcasts to find the source of broadcasts
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.

This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.

Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.

Differential Revision: http://reviews.llvm.org/D21660

llvm-svn: 273848
2016-06-27 07:44:32 +00:00
Igor Breger 7357849dca [ConstantFolding] Fix bitcast vector of i1.
Differential Revision: http://reviews.llvm.org/D21735

llvm-svn: 273845
2016-06-27 06:42:54 +00:00
Rafael Espindola 1ac1fa818e Mips: Fix access to private functions.
llvm-svn: 273843
2016-06-27 03:19:40 +00:00
Sanjay Patel 1d745384da add tests for potential select transforms
llvm-svn: 273833
2016-06-26 23:44:21 +00:00
Nico Weber d8db1e172c Revert r273807 (and r273809, r273810), it caused PR28311
llvm-svn: 273815
2016-06-26 15:10:34 +00:00
Amjad Aboud ff976c99c7 [codeview] Improved array type support.
Added support for:
1. Multi dimension array.
2. Array of structure type, which previously was declared incompletely.
3. Dynamic size array.

Differential Revision: http://reviews.llvm.org/D21526

llvm-svn: 273807
2016-06-26 11:44:45 +00:00
Sanjoy Das a37bb4a65d [LoopUnswitch] Unswitch on conditions feeding into guards
Summary:
This is a straightforward extension of what LoopUnswitch does to
branches to guards.  That is, we unswitch

```
for (;;) {
  ...
  guard(loop_invariant_cond);
  ...
}
```

into

```
if (loop_invariant_cond) {
  for (;;) {
    ...
    // There is no need to emit guard(true)
    ...
  }
} else {
  for (;;) {
    ...
    guard(false);
    // SimplifyCFG will clean this up by adding an
    // unreachable after the guard(false)
    ...
  }
}
```

Reviewers: majnemer

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21725

llvm-svn: 273801
2016-06-26 05:10:45 +00:00
Jan Vesely 3bc1af2be4 AMDGPU/R600: Fix GlobalValue regressions.
Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary
expressions by including offset even if the offset it 0
(we haven't hit this sooner since tested workloads don't include static offsets)
We don't really care about the type of expression, so set it directly.
Fixes: r272705

Consider section relative relocations. Since all const as data is in one boffer section relative is equivalent to abs32.
Fixes: r273166

Differential Revision: http://reviews.llvm.org/D21633

llvm-svn: 273785
2016-06-25 18:24:16 +00:00
Sanjay Patel 51ff79fd82 update tests to use FileCheck
llvm-svn: 273784
2016-06-25 17:39:10 +00:00
David Majnemer e14e7bc4b8 Revert "[SimplifyCFG] Stop inserting calls to llvm.trap for UB"
This reverts commit r273778, it seems to break UBSan :/

llvm-svn: 273779
2016-06-25 08:19:55 +00:00
David Majnemer d346a37737 [SimplifyCFG] Stop inserting calls to llvm.trap for UB
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.

While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.

There are much better tools than llvm.trap: UBSan and ASan.

N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...

llvm-svn: 273778
2016-06-25 08:04:19 +00:00
David Majnemer bb53d23ef8 [InstSimplify] Replace calls to null with undef
Calling null is undefined behavior, we can simplify the resulting value
to undef.

llvm-svn: 273777
2016-06-25 07:37:30 +00:00
David Majnemer 1fea77c6fc [SimplifyCFG] Replace calls to null/undef with unreachable
Calling null is undefined behavior, a call to undef can be trivially
treated as a call to null.

llvm-svn: 273776
2016-06-25 07:37:27 +00:00
Konstantin Zhuravlyov f2f3d14774 [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.

Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
  - offset 0: work group ID x
  - offset 4: work group ID y
  - offset 8: work group ID z
  - offset 16: work item ID x
  - offset 20: work item ID y
  - offset 24: work item ID z

Set
  - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
  - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
  - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled

Differential Revision: http://reviews.llvm.org/D20335

llvm-svn: 273769
2016-06-25 03:11:28 +00:00
Saleem Abdulrasool 92d33bd2af llvm-ar: add some tests for llvm-ar default selection
This adds some tests for the smarter llvm-ar selection mode as well as some
additional tests as per Rafael's post commit review comments.

llvm-svn: 273768
2016-06-25 03:05:56 +00:00
Tom Stellard b164a9843b AMDGPU/SI: Make sure not to fold offsets into local address space globals
Summary:
Offset folding only works if you are emitting relocations, and we don't
emit relocations for local address space globals.

Reviewers: arsenm, nhaustov

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21647

llvm-svn: 273765
2016-06-25 01:59:16 +00:00
Sanjoy Das f63768cbfc [PlaceSafepoints] Don't call undef in test case; NFC
llvm-svn: 273764
2016-06-25 01:40:54 +00:00
Sanjoy Das d850068282 [LoopUnswitch] Avoid exponential behavior
Summary: (No semantic change intended).

Reviewers: majnemer, bogner, mzolotukhin

Subscribers: mcrosier, llvm-commits, mzolotukhin

Differential Revision: http://reviews.llvm.org/D21707

llvm-svn: 273763
2016-06-25 01:14:19 +00:00
David Majnemer 0f45572761 The absence of noreturn doesn't ensure mayReturn
There are two separate issues:
- LLVM doesn't consider infinite loops to be side effects: we happily
  hoist/sink above/below loops whose bounds are unknown.
- The absence of the noreturn attribute is insufficient for us to know
  if a function will definitely return.  Relying on noreturn in the
  middle-end for any property is an accident waiting to happen.

llvm-svn: 273762
2016-06-25 00:55:12 +00:00
Peter Collingbourne 0312f614b1 IR: Introduce llvm.type.checked.load intrinsic.
This intrinsic safely loads a function pointer from a virtual table pointer
using type metadata. This intrinsic is used to implement control flow integrity
in conjunction with virtual call optimization. The virtual call optimization
pass will optimize away llvm.type.checked.load intrinsics associated with
devirtualized calls, thereby removing the type check in cases where it is
not needed to enforce the control flow integrity constraint.

This patch also introduces the capability to copy type metadata between
global variables, and teaches the virtual call optimization pass to do so.

Differential Revision: http://reviews.llvm.org/D21121

llvm-svn: 273756
2016-06-25 00:23:04 +00:00