Commit Graph

127118 Commits

Author SHA1 Message Date
Xinliang David Li 876ed52c8a Add a compatibility test
llvm-svn: 259632
2016-02-03 06:27:38 +00:00
Xinliang David Li 3c88288927 Fix a typo in comment
llvm-svn: 259631
2016-02-03 06:24:11 +00:00
Xinliang David Li a398d2d94a Fix uninitiazed variable use problem
llvm-svn: 259630
2016-02-03 06:23:16 +00:00
Xinliang David Li 6c93ee8d36 [PGO] Profile summary reader/writer support
With this patch, the profile summary data will be available in indexed
profile data file so that profiler reader/compiler optimizer can start
to make use of.

Differential Revision: http://reviews.llvm.org/D16258

llvm-svn: 259626
2016-02-03 04:08:18 +00:00
Peter Collingbourne 0c0d7e2d0f LowerBitSets: Don't bother to do any work if the llvm.bitset.test intrinsic is unused.
llvm-svn: 259625
2016-02-03 03:48:46 +00:00
Peter Collingbourne 83cc981c49 Add #include "llvm/Support/raw_ostream.h" to fix Windows build.
llvm-svn: 259623
2016-02-03 03:16:37 +00:00
Peter Collingbourne 9f7ec14009 Transforms: Move GlobalOpt's Evaluator to Utils where it can be reused.
llvm-svn: 259621
2016-02-03 02:51:00 +00:00
Nick Lewycky a093ab4ad6 Fix typo in comment. NFC
llvm-svn: 259620
2016-02-03 02:15:49 +00:00
Peter Collingbourne 4e3605a2af docs: Document how bitsets may be used to encode type information.
llvm-svn: 259619
2016-02-03 02:01:08 +00:00
Kyle Butt d62d8b771d Codegen: [PPC] Fix PPCVSXFMAMutate to handle duplicates.
The purpose of PPCVSXFMAMutate is to elide copies by changing FMA forms
on PPC.

    %vreg6<def> = COPY %vreg96
    %vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg7
    ;v6 = v6 + v5 * v7

is replaced by

    %vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg7, %vreg96
    ;v5 = v5 * v7 + v96

This was broken in the case where the target register was also used as a
multiplicand. Fix this case by checking for it and replacing both uses
with the copied register.

    %vreg6<def> = COPY %vreg96
    %vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg6
    ;v6 = v6 + v5 * v6

is replaced by

    %vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg96, %vreg96
    ;v5 = v5 * v96 + v96

llvm-svn: 259617
2016-02-03 01:41:09 +00:00
Yunzhong Gao eb959722a7 Revert r259576: Disable the vzeroupper insertion pass on PS4.
Will re-implement based on review feedback.

llvm-svn: 259615
2016-02-03 01:25:12 +00:00
Marcello Maggioni bfe87568aa RegCoalescer: Making sure re-materialization defines all subranges
The register coalescer can rematerialize constants that define
more of a register than the copy it is going to replace was going
to do.
This is valid in the case the register was undef before the
copy happened.
This patch makes sure that all the subranges defined by the new
rematerialization instructions have at least a dead def.

Review: http://reviews.llvm.org/D16693
llvm-svn: 259614
2016-02-03 00:22:32 +00:00
NAKAMURA Takumi a8d480d9d5 DiagnosticInfoWithDebugLocBase: Appease Twine for now.
FIXME: We should get rid of Twine in the record.
llvm-svn: 259612
2016-02-03 00:09:22 +00:00
Adam Nemet d52ed84160 [LoopVersioning] Expose loop versioning as a pass too
Summary:
LoopVersioning is a transform utility that transform passes can use to
run-time disambiguate may-aliasing accesses. I'd like to also expose as
pass to allow it to be unit-tested.

I am planning to add support for non-aliasing annotation in
LoopVersioning and I'd like to be able to write tests directly using
this pass.

(After that feature is done, the pass could also be used to look for
optimization opportunities that are hidden behind incomplete alias
information at compile time.)

The pass drives LoopVersioning in its default way which is to fully
disambiguate may-aliasing accesses no matter how many checks are
required.

Reviewers: hfinkel, ashutosh.nema, sbaranga

Subscribers: zzheng, mssimpso, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D16612

llvm-svn: 259610
2016-02-03 00:06:10 +00:00
George Burgess IV 60adac46f2 Attempt #2 to unbreak r259595.
llvm-svn: 259602
2016-02-02 23:26:01 +00:00
David Majnemer 30579ec851 [codeview] Improve readability of codeview assembly output
Strictly speaking, this is not an improvement in functionality per se
but a usability improvement to those debugging codeview.

llvm-svn: 259601
2016-02-02 23:18:23 +00:00
Kostya Serebryany d88d1305c4 [libFuzzer] don't create too many trace-based mutations as it may be too slow
llvm-svn: 259600
2016-02-02 23:17:45 +00:00
George Burgess IV b5a229f779 Attempt to fix builds broken by r259595.
llvm-svn: 259599
2016-02-02 23:15:26 +00:00
George Burgess IV e1100f533f This patch adds MemorySSA to LLVM.
Please see include/llvm/Transforms/Utils/MemorySSA.h for a description
of MemorySSA, and what it does.

Differential Revision: http://reviews.llvm.org/D7864

llvm-svn: 259595
2016-02-02 22:46:49 +00:00
Philip Reames b7571043f2 [LVI] Fix debug output
Due to staleness in a patch I committed yesterday, the debug output was reporting overdefined cases as being undefined.  Confusing to say the least.  The mistake appears to have only effected the debug output thankfully.

llvm-svn: 259594
2016-02-02 22:43:08 +00:00
Anna Zaks 3b50e70bbe [asan] Add iOS support to AddressSanitzier
Differential Revision: http://reviews.llvm.org/D15625

llvm-svn: 259586
2016-02-02 22:05:07 +00:00
Philip Reames ed8cd0d36e [LVI] Code motion only [NFC]
I introduced a declaration in 259583 to keep the diff readable.  This change just moves the definition up to remove the declaration again.

llvm-svn: 259585
2016-02-02 22:03:19 +00:00
Philip Reames d1f829d374 [LVI] Refactor to use newly introduced intersect utility
This patch uses the newly introduced 'intersect' utility (from 259461: [LVI] Introduce an intersect operation on lattice values) to simplify existing code in LVI.

While not introducing any new concepts, this change is probably not NFC.  The common 'intersect' function is more powerful that the ad-hoc implementations we'd had in a couple of places.  Given that, we may see optimizations triggering a bit more often.

llvm-svn: 259583
2016-02-02 21:57:37 +00:00
Justin Bogner 246345a834 Remove utils/buildit
The autoconf build system was removed - this doesn't even work and
doesn't need to be here.

llvm-svn: 259582
2016-02-02 21:56:16 +00:00
Hemant Kulkarni 782edae7d6 Correct size calculations for ELF files
llvm-svn: 259578
2016-02-02 21:41:49 +00:00
Yunzhong Gao b76ccacfb1 Disable the vzeroupper insertion pass on PS4.
See comments in test/CodeGen/X86/avx-vzeroupper.ll for more explanation.

Original patch by: Sean Silva

llvm-svn: 259576
2016-02-02 21:39:23 +00:00
Lang Hames 3923698b3f [Orc] Stub addresses should be based on stub size, not pointer size.
This didn't affect X86_64, which is the only client of this code at the moment,
as stubs and pointers are both 8-bytes there. It will affect other platforms
though.

llvm-svn: 259575
2016-02-02 21:38:30 +00:00
Matt Arsenault de4208122b AMDGPU: Do not promote allocas with non-inbounds GEPs
If we can't assume the pointer value isn't within the bounds
of the object, it seems risky to try to replace the pointer
calculations.

llvm-svn: 259573
2016-02-02 21:16:12 +00:00
Matt Arsenault 7e747f1a38 AMDGPU: Handle promoting memmove
Also add missing tests for the others.

llvm-svn: 259558
2016-02-02 20:28:10 +00:00
Quentin Colombet b8fb2ba1bb [X86] Fix the merging of SP updates in prologue/epilogue insertions.
When the merging was involving LEAs, we were taking the wrong immediate
from the list of operands.

rdar://problem/24446069

llvm-svn: 259553
2016-02-02 20:11:17 +00:00
Matthias Braun 1377fd6781 MachineVerifier: Check that defs/uses are live in subregisters as well.
llvm-svn: 259552
2016-02-02 20:04:51 +00:00
Matt Arsenault 8b175672cb AMDGPU: Skip promote alloca with no optimizations
llvm-svn: 259551
2016-02-02 19:32:42 +00:00
Matt Arsenault fb8cdbae0c AMDGPU: Minor cleanups for AMDGPUPromoteAlloca
Mostly convert to use range loops.

llvm-svn: 259550
2016-02-02 19:32:35 +00:00
Lang Hames e28b118be0 [Orc] Turn OrcX86_64::IndirectStubsInfo into a template helper class:
GenericIndirectStubsInfo.

This will allow architecture support classes for other architectures to re-use
this code.

llvm-svn: 259549
2016-02-02 19:31:15 +00:00
David Majnemer c9911f28e5 [codeview] Correctly handle inlining functions post-dominated by unreachable
CodeView requires us to accurately describe the extent of the inlined
code.  We did this by grabbing the next debug location in source order
and using *that* to denote where we stopped inlining.  However, this is
not sufficient or correct in instances where there is no next debug
location or the next debug location belongs to the start of another
function.

To get this correct, use the end symbol of the function to denote the
last possible place the inlining could have stopped at.

llvm-svn: 259548
2016-02-02 19:22:34 +00:00
Matt Arsenault e5737f7cac AMDGPU: Report AMDGPUPromoteAlloca changed the function
llvm-svn: 259547
2016-02-02 19:18:57 +00:00
Matt Arsenault ad1348459f AMDGPU: Whitelist handled intrinsics
We shouldn't crash on unhandled intrinsics.
Also simplify failure handling in loop.

llvm-svn: 259546
2016-02-02 19:18:53 +00:00
Matt Arsenault 853a1fc6d9 AMDGPU: Use inbounds when calculating workitem offset
When promoting allocas to LDS, we know we are indexing
into a specific area just created, and the calculation
will also never overflow.

Also emit some of the muls as nsw nuw, because instcombine
infers this already from the range metadata. I think
putting this on the other adds and muls might be OK too,
but I'm not 100% sure.

llvm-svn: 259545
2016-02-02 19:18:48 +00:00
Eugene Zelenko ecefe5a81f Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D16793

llvm-svn: 259539
2016-02-02 18:20:45 +00:00
Reid Kleckner 1fcd610c94 [codeview] Wire up the .cv_inline_linetable directive
This directive emits the binary annotations that describe line and code
deltas in inlined call sites. Single-stepping through inlined frames in
windbg now works.

llvm-svn: 259535
2016-02-02 17:41:18 +00:00
Derek Schuff c6d8fd3f54 [MC] Enable eip-relative addressing on x86-64 for X32 ABI
Summary:
Enables eip-based addressing, e.g.,

lea    constant(%eip), %rax
lea    constant(%eip), %eax

in MC, (used for the x32 ABI). EIP-base addressing is also valid in x86_64,
it is left enabled for that architecture as well.

Patch by João Porto

Differential Revision: http://reviews.llvm.org/D16581

llvm-svn: 259528
2016-02-02 17:20:04 +00:00
Chad Rosier 1142f3cf90 [AArch64] Add a FIXME comment.
llvm-svn: 259515
2016-02-02 15:22:55 +00:00
Chad Rosier bba881ef3d [AArch64] Allocate the modified and used regs only once per function.
llvm-svn: 259510
2016-02-02 15:02:30 +00:00
JF Bastien 926b189a81 WebAssembly: update expected GCC torture test failures
The 3 programs used __attribute__((mode(?))) on enum, which clang r259497 fixed.

llvm-svn: 259508
2016-02-02 14:27:34 +00:00
Oliver Stannard 7e7d983a87 Refactor backend diagnostics for unsupported features
Re-commit of r258951 after fixing layering violation.

The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.

llvm-svn: 259498
2016-02-02 13:52:43 +00:00
Simon Pilgrim 96fe4ef5f7 [X86][AVX512] Add support for AVX512 VMOVQ (load) shuffle decoding
llvm-svn: 259496
2016-02-02 13:32:56 +00:00
JF Bastien dc1255f02f WebAssembly: add option to disable register coloring
Having this hidden option makes it easier to debug other issues.

llvm-svn: 259482
2016-02-02 09:30:01 +00:00
Sjoerd Meijer ffe19f5245 Removed FeatureVFPOnlySP from the Cortex-R7 processor model
description and changed the regression test accordingly.
The default configuration of a Cortex-R7 is to implement the
VFPv3-D16 architecture and the feature line as it was is too
restrictive.

llvm-svn: 259480
2016-02-02 09:28:20 +00:00
David Majnemer ccc809e2e6 [RegisterCoalescer] Better DebugLoc for reMaterializeTrivialDef
When rematerializing a computation by replacing the copy, use the copy's
location.  The location of the copy is more representative of the
original program.

This partially fixes PR10003.

llvm-svn: 259469
2016-02-02 06:41:55 +00:00
Chandler Carruth a4499e9f73 [LCG] Build an edge abstraction for the LazyCallGraph and use it to
differentiate between indirect references to functions an direct calls.

This doesn't do a whole lot yet other than change the print out produced
by the analysis, but it lays the groundwork for a very major change I'm
working on next: teaching the call graph to actually be a call graph,
modeling *both* the indirect reference graph and the call graph
simultaneously. More details on that in the next patch though.

The rest of this is essentially a bunch of over-engineering that won't
be interesting until the next patch. But this also isolates essentially
all of the churn necessary to introduce the edge abstraction from the
very important behavior change necessary in order to separately model
the two graphs. So it should make review of the subsequent patch a bit
easier at the cost of making this patch seem poorly motivated. ;]

Differential Revision: http://reviews.llvm.org/D16038

llvm-svn: 259463
2016-02-02 03:57:13 +00:00
Philip Reames 44456b8963 [LVI] Introduce an intersect operation on lattice values
LVI has several separate sources of facts - edge local conditions, recursive queries, assumes, and control independent value facts - which all apply to the same value at the same location. The existing implementation was very conservative about exploiting all of these facts at once.

This change introduces an "intersect" function specifically to abstract the action of picking a good set of facts from all of the separate facts given. At the moment, this function is relatively simple (i.e. mostly just reuses the bits which were already there), but even the minor additions reveal the inherent power. For example, JumpThreading is now capable of doing an inductive proof that a particular value is always positive and removing a half range check.

I'm currently only using the new intersect function in one place. If folks are happy with the direction of the work, I plan on making a series of small changes without review to replace mergeIn with intersect at all the appropriate places.

Differential Revision: http://reviews.llvm.org/D14476

llvm-svn: 259461
2016-02-02 03:15:40 +00:00
Kostya Serebryany bfbe7fc404 [libFuzzer] allow passing 1 or more files as individual inputs
llvm-svn: 259459
2016-02-02 03:03:47 +00:00
Matthias Braun 579c9cda13 MachineVerifier: Use report_context() instead of ad-hoc messages.
llvm-svn: 259457
2016-02-02 02:44:25 +00:00
Sanjoy Das 881de4d12a [X86] Fix a bug in getMemOpBaseRegImmOfs
Fix a crash in `getMemOpBaseRegImmOfs` that happens if the base of
`MemOp` is a frame index memory operand.  The fix is to have
`getMemOpBaseRegImmOfs` bail out in such cases.  We can possibly be more
clever here, if needed.

llvm-svn: 259456
2016-02-02 02:32:43 +00:00
Kostya Serebryany 078e984d8d [libFuzzer] fail if the corpus dir does not exist
llvm-svn: 259454
2016-02-02 02:07:26 +00:00
Ahmed Bougacha 68a8efa374 [X86][FastISel] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR.
FastISel counterpart to r259448.

llvm-svn: 259449
2016-02-02 01:44:03 +00:00
Ahmed Bougacha 55c6682ae2 [X86] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR.
Officially, we don't acknowledge non-default configurations of MXCSR,
as getting there would require usage of the FENV_ACCESS pragma (at
least insofar as rounding mode is concerned).

We don't support the pragma, so we can assume that the default
rounding mode - round to nearest, ties to even - is always used.

However, it's inconsistent with the rest of the instruction set,
where MXCSR is always effective (unless otherwise specified).
Also, it's an unnecessary obstacle to the few brave souls that use
fenv.h with LLVM.

Avoid the hard-coded rounding mode for fp_to_f16; use MXCSR instead.

llvm-svn: 259448
2016-02-02 01:32:50 +00:00
Anna Zaks cad7994c3b [safestack] Make sure the unsafe stack pointer is popped in all cases
The unsafe stack pointer is only popped in moveStaticAllocasToUnsafeStack so it won't happen if there are no static allocas.

Fixes https://llvm.org/bugs/show_bug.cgi?id=26122

Differential Revision: http://reviews.llvm.org/D16339

llvm-svn: 259447
2016-02-02 01:03:11 +00:00
Philip Reames 2c275cc686 [LVI] Fix a latent bug in getValueAt
This routine was returning Undefined for most queries.  This was utterly wrong.  Amusingly, we do not appear to have any callers of this which are actually trying to exploit unreachable code or this would have broken the world.

A better approach would be to explicit describe the intersection of facts.  That's blocked behind http://reviews.llvm.org/D14476 and I wanted to fix the current bug.

llvm-svn: 259446
2016-02-02 00:45:30 +00:00
Sanjay Patel c54600dbb1 fix typos; NFC
llvm-svn: 259438
2016-02-01 23:53:35 +00:00
Philip Reames f3b94694c0 [LVI] Missing test case from 259432
llvm-svn: 259437
2016-02-01 23:44:38 +00:00
Teresa Johnson a9a630759f Add test for PR26419 (stable function summary ordering)
Enhance an existing test to also check that the ordering of the function
summary entries is stable.

llvm-svn: 259434
2016-02-01 23:26:30 +00:00
Philip Reames 13f7324b86 [LVI] Remove overly tight assert from 259429
I'll submit a test case shortly which covers this, but it's causing clang self host problems in the builders so I wanted to get it removed.

llvm-svn: 259432
2016-02-01 23:21:11 +00:00
Simon Pilgrim 5be17b6e3e [X86][AVX512] Add support for AVX512 VMOVD (load) shuffle decoding
llvm-svn: 259430
2016-02-01 23:04:05 +00:00
Philip Reames c0bdb0c1e5 [LVI] Add select handling
Teach LVI to handle select instructions in the exact same way it handles PHI nodes.  This is useful since various parts of the optimizer convert PHI nodes into selects and we don't want these transformations to cause inferior optimization.  

Note that this patch does nothing to exploit the implied constraint on the inputs represented by the select condition itself.  That will be a later patch and is blocked on http://reviews.llvm.org/D14476

llvm-svn: 259429
2016-02-01 22:57:53 +00:00
Simon Pilgrim f5c23ad3d7 [X86][AVX512] Add support for AVX512 VMOVSD/VMOVSS shuffle decoding
llvm-svn: 259427
2016-02-01 22:26:28 +00:00
Sanjay Patel 4b198802b3 function names start with a lowercase letter; NFC
llvm-svn: 259425
2016-02-01 22:23:39 +00:00
Sanjay Patel 103ab7d571 [InstCombine] simplify masked scatter/gather intrinsics with zero masks
A masked scatter with a zero mask means there's no store.
A masked gather with a zero mask means the passthru arg is returned.

This is a continuation of:
http://reviews.llvm.org/rL259369
http://reviews.llvm.org/rL259392

llvm-svn: 259421
2016-02-01 22:10:26 +00:00
Simon Pilgrim 025a3d857a [X86][AVX512] Add support for AVX512 VINSERTPS shuffle decoding
llvm-svn: 259420
2016-02-01 22:05:50 +00:00
Matthias Braun 3f88eabe93 SmallSet/SmallPtrSet: Refuse huge Small numbers
These sets do linear searching in small mode; It is not a good idea to
use huge numbers as the small value here, save people from themselves by
adding a static_assert.

Differential Revision: http://reviews.llvm.org/D16706

llvm-svn: 259419
2016-02-01 22:05:16 +00:00
Simon Pilgrim e9848d4a88 [X86][SSE] Regenerated load vector + element extraction tests.
llvm-svn: 259416
2016-02-01 21:46:12 +00:00
Chad Rosier dbdb1d6eaf Move comments a bit closer to associated code. NFC.
llvm-svn: 259411
2016-02-01 21:38:31 +00:00
Simon Pilgrim 068e38f7f4 [X86][SSE] Add AVX512 merge consecutive load tests
Add AVX512F/AVX512BW 512-bit tests.

Add AVX512F tests to existing 128/256-bit tests.

llvm-svn: 259410
2016-02-01 21:30:50 +00:00
Simon Pilgrim f3c37cc87e Regenerate vector blend tests.
llvm-svn: 259406
2016-02-01 21:06:32 +00:00
Simon Pilgrim 32b25549fa Regenerate vector sext/zext constant folding tests.
llvm-svn: 259405
2016-02-01 21:01:29 +00:00
Jun Bum Lim 53907161cc Avoid inlining call sites in unreachable-terminated block
Summary:
If the normal destination of the invoke or the parent block of the call site is unreachable-terminated, there is little point in inlining the call site unless there is literally zero cost. Unlike my previous change (D15289), this change specifically handle the call sites followed by unreachable in the same basic block for call or in the normal destination for the invoke. This change could be a reasonable first step to conservatively inline call sites leading to an unreachable-terminated block while BFI / BPI is not yet available in inliner.

Reviewers: manmanren, majnemer, hfinkel, davidxl, mcrosier, dblaikie, eraman

Subscribers: dblaikie, davidxl, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16616

llvm-svn: 259403
2016-02-01 20:55:11 +00:00
Chad Rosier 064261da16 Remove extra semicolon. NFC.
llvm-svn: 259402
2016-02-01 20:54:36 +00:00
Sanjoy Das 4c7b6d79c0 [SCEV] Clean up isKnownPredicateViaConstantRanges; NFCI
- ScalarEvolution::isKnownPredicateViaConstantRanges duplicates some
   logic already present in ConstantRange, use ConstantRange for those
   bits.

 - In some cases ScalarEvolution::isKnownPredicateViaConstantRanges
   returns `false` to mean "definitely false" (e.g. see the
   `LHSRange.getSignedMin().sge(RHSRange.getSignedMax())` case for
   `ICmpInst::ICMP_SLT`), but for `isKnownPredicateViaConstantRanges`,
   `false` actually means "don't know".  Get rid of this extra bit of
   code to avoid confusion.

llvm-svn: 259401
2016-02-01 20:48:14 +00:00
Sanjoy Das 401e631c4b [SCEV] Rename isKnownPredicateWithRanges; NFC
Make it obvious that it uses constant ranges, and use `Via` instead of
`With`, like other similar functions in SCEV.

llvm-svn: 259400
2016-02-01 20:48:10 +00:00
Rafael Espindola 52570ea2a2 Fix infinite recursion in MCAsmStreamer::EmitValueImpl.
If a target can only emit 8-bits data, we would loop in EmitValueImpl
since it will try to split a 32-bits data in 1 chunk of 32-bits.

No test since all current targets can emit 32bits at a time.

Patch by Alexandru Guduleasa!

llvm-svn: 259399
2016-02-01 20:36:49 +00:00
Teresa Johnson 2d9da4dc50 [ThinLTO] Ensure function summary output order is stable
Iterate over the function list instead of a DenseMap of Function pointers
when emitting the function summary into the module.

This fixes PR26419.

llvm-svn: 259398
2016-02-01 20:16:35 +00:00
Rafael Espindola ac60e5f028 Add a test for r258362.
Thanks to Mehdi for finding it.

llvm-svn: 259394
2016-02-01 19:56:12 +00:00
Sanjay Patel 04f792bdc9 [InstCombine] simplify masked store intrinsics with all ones or zeros masks
A masked store with a zero mask means there's no store.
A masked store with an allOnes mask means it's a normal vector store.

This is a continuation of:
http://reviews.llvm.org/rL259369

llvm-svn: 259392
2016-02-01 19:39:52 +00:00
Davide Italiano d4a48532e0 [llvm-nm] Simplify the code a bit. NFCI.
Fix a style violation while I'm here.

llvm-svn: 259391
2016-02-01 19:22:16 +00:00
Balaram Makam 92431703d7 AArch64: Implement missed conditional compare sequences.
Summary:
This is an extension to the existing implementation of r242436 which
restricts to only select inputs. This version fixes missed opportunities
in pr26084 by attempting to lower conditional compare sequences of
and/or trees with setcc leafs. This will additionaly handle the case
when a tree with select input is not a conjunction-disjunction tree
but some of the sub trees are conjunction-disjunction trees.

Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB

Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry

Differential Revision: http://reviews.llvm.org/D16291

llvm-svn: 259387
2016-02-01 19:13:07 +00:00
Matthew Simpson 45dee06177 Add test case missing from r259357 (NFC)
llvm-svn: 259385
2016-02-01 19:09:24 +00:00
Geoff Berry 29d4a695f4 [AArch64] Simplify prolog/epilog callee save/restore. NFC.
Summary:
Factor out common code for callee-save register pair calculation.  This
is intended to simplify follow-on changes that reduce the number of
registers saved/restored.

Depends on D16732

Reviewers: mcrosier, jmolloy, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16734

llvm-svn: 259384
2016-02-01 19:07:06 +00:00
Ulrich Weigand 4a4d4ab7a4 [SystemZ] Fix wrong-code generation for certain always-false conditions
We've found another bug in the code generation logic conditions for a
certain class of always-false conditions, those of the form
   if ((a & 1) < 0)

These only reach the back end when compiling without optimization.

The bug was introduced by the choice of using TEST UNDER MASK
to implement a check for
   if ((a & MASK) < VAL)
as
   if ((a & MASK) == 0)

where VAL is less than the the lowest bit of MASK.  This is correct
in all cases except for VAL == 0, in which case the original
condition is always false, but the replacement isn't.

Fixed by excluding that particular case.

llvm-svn: 259381
2016-02-01 18:31:19 +00:00
Colin LeMahieu 6fdfa3dc32 [NFC] Referencing manual for reason why subregbit is checked
llvm-svn: 259380
2016-02-01 18:15:39 +00:00
Sanjay Patel cf57c5a4ee fix broken check lines
Without the colon, it doesn't mean anything!

llvm-svn: 259377
2016-02-01 17:46:18 +00:00
David Majnemer f8853ae7b3 [InstCombine] Don't transform (X+INT_MAX)>=(Y+INT_MAX) -> (X<=Y)
This miscompile came about because we tried to use a transform which was
only appropriate for xor operators when addition was present.

This fixes PR26407.

llvm-svn: 259375
2016-02-01 17:37:56 +00:00
Jun Bum Lim ca832660ae [ValueTracking] Improve isKnownNonZero for PHI of non-zero constants
It is clear that a PHI is a non-zero if all incoming values are non-zero constants.

llvm-svn: 259370
2016-02-01 17:03:07 +00:00
Sanjay Patel b695c5557c [InstCombine] simplify masked load intrinsics with all ones or zeros masks
A masked load with a zero mask means there's no load.
A masked load with an allOnes mask means it's a normal vector load.

Differential Revision: http://reviews.llvm.org/D16691

llvm-svn: 259369
2016-02-01 17:00:10 +00:00
Geoff Berry b13b2eed11 [PrologEpilogInserter] Add some debug output for callee-save frame object allocation
Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16733

llvm-svn: 259367
2016-02-01 16:47:51 +00:00
Geoff Berry 04bf91a8c1 [AArch64] Simplify callee-save register save/restore. NFC.
Summary:
Simplify callee-save register save/restore code generation by
remembering the size of the callee-save area when it is computed so we
don't have to scan the prologue/epilogue instructions again later to
reconstruct it.

This is intended to simplify follow-on changes that reduce the number of
registers saved/restored.

Reviewers: mcrosier, jmolloy, t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16732

llvm-svn: 259365
2016-02-01 16:29:19 +00:00
Matthew Simpson 73dad62174 [LV] Rename RdxPHIsToFix to PHIsToFix (NFC)
In the future, we will vectorize recurrences other than reductions. This patch
renames a few variables and updates their associated comments to enable them to
be reused for non-reduction PHI nodes.

This change was requested in the review for D16197.

llvm-svn: 259364
2016-02-01 16:07:01 +00:00
Asaf Badouh 5a3a0231f4 [X86][AVX512VBMI] add encoding and intrinsics for Multishift
Differential Revision: http://reviews.llvm.org/D16399

llvm-svn: 259363
2016-02-01 15:48:21 +00:00
Vasileios Kalintiris a052037034 [mips] Split large test file into 3 smaller ones.
Remove the old select.ll file and use select-int.ll, select-flt.ll,
select-dbl.ll for testing selects on integers, floats & doubles respectivelly.

llvm-svn: 259361
2016-02-01 15:19:35 +00:00
Daniel Sanders f8bb23e509 [mips] Range check uimm16 and fix several bugs this revealed.
Summary:
The bugs were:
* teq and similar take 4-bit unsigned immediates on microMIPS.
* teqi and similar have side-effects like teq do.
* shll_s.w and shra_r.w take 5-bit unsigned immediates.
* The various DSP ext* instructions take a 5-bit immediate.
* repl.qh takes an 8-bit unsigned immediate.
* repl.ph takes a 10-bit unsigned immediate.
* rddsp/wrdsp take a 10-bit unsigned immediate.
* teqi and similar take signed 16-bit immediates (10-bit for microMIPS).
* Out-of-range immediate macros for or/xor take a simm32/simm64 depending
  on architecture. I'll fix the simm64 case properly when I reach simm32.

lui is a bit more lenient than GAS and accepts signed immediates in addition
to unsigned. This is because MipsMCExpr can produce signed values when
constant folding and it currently lacks a way of knowing it should fold to
an unsigned value.

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D15446

llvm-svn: 259360
2016-02-01 15:13:31 +00:00
Amjad Aboud 8bbce8ad8e Improved macro emission in dwarf.
Changed emitting offset of macinfo entry into compiler unit DIE to use "addSectionLabel" method rather than explicitly calculating size/offset of macro entry.

Differential Revision: http://reviews.llvm.org/D16292

llvm-svn: 259358
2016-02-01 14:09:41 +00:00
Matthew Simpson c578d67407 Reapply commit r258404 with fix.
The previous patch caused PR26364. The fix is to ensure that we don't enter a
cycle when iterating over use-def chains.

llvm-svn: 259357
2016-02-01 13:38:29 +00:00
JF Bastien a5b8ea0d66 WebAssembly NFC: simplify control flow
This should now be easier to read.

llvm-svn: 259349
2016-02-01 10:46:16 +00:00
Ewan Crawford 588980d69e DWARF RenderScript vendor extension
Patch adds a DWARF language vendor extension for RenderScript.
We are already using this identifier in LLDB with a hard coded value, so it's preferable to use a LLVM generated enum instead.
The language is intended to be added to the next version of the standard.
See http://www.dwarfstd.org/ShowIssue.php?issue=150331.1

Reviewers:  dexonsmith, echristo
Subscribers: probinson domipheus, srhines, llvm-commits
Differential Revision: http://reviews.llvm.org/D16409

llvm-svn: 259348
2016-02-01 10:39:24 +00:00
Igor Breger 56b039ea17 AVX512: fix mask handling for gather/scatter/prefetch intrinsics.
Differential Revision: http://reviews.llvm.org/D16755

llvm-svn: 259346
2016-02-01 09:57:15 +00:00
Simon Pilgrim 1358d86659 [X86][SSE] Find source of the inserted element of INSERTPS
Minor patch to trace back through target shuffles to the source of the inserted element in a (V)INSERTPS shuffle.

Differential Revision: http://reviews.llvm.org/D16652

llvm-svn: 259343
2016-02-01 08:59:30 +00:00
Igor Breger 6cc9115cec AVX512 : Fix SETCCE lowering for KNL 32 bit.
Differential Revision: http://reviews.llvm.org/D16752

llvm-svn: 259342
2016-02-01 07:56:09 +00:00
Frederic Riss 1d53658eae [dsymutil] Skip mach-o paired relocations
Noticed while working on scattered relocations.
I do not think these relocs can actually happen in the debug_info section,
but if they happen the code would mishandle them. Explicitely skip them
and warn if we encounter one.

llvm-svn: 259341
2016-02-01 04:43:14 +00:00
David Majnemer efb41741f2 [X86] Cleanup the WinEHState pass
Remove unnecessary includes and class state.

No functional change intended.

llvm-svn: 259340
2016-02-01 04:28:59 +00:00
Frederic Riss 0314e1e1ef [dsymutil] Support scattered relocs.
Although it seems like clang will never emit scattered relocations in
the debug information (at least I couldn't find a way), we have too
support them for the benefit of other compilers.
As clang doesn't generate them, the included testcase was produced
from hacked up assembly.

llvm-svn: 259339
2016-02-01 03:44:22 +00:00
David Majnemer 784d4a455b Revert r258580 and r258581.
Those commits created an artificial edge from a cleanup to a synthesized
catchswitch in order to get the MSVC personality routine to execute
cleanups which don't cleanupret and are not wrapped by a catchswitch.

This worked well enough but is not a complete solution in situations
where there the cleanup infinite loops.

However, the real deal breaker behind this approach comes about from a
degenerate case where the cleanup is post-dominated by unreachable *and*
throws an exception.  This ends poorly because the catchswitch will
inadvertently catch the exception.

Because of this we should go back to our previous behavior of not
executing certain cleanups (identical behavior with the Itanium ABI
implementation in clang, GCC and ICC).

N.B. I think this could be salvaged by making the catchpad rethrow the
exception and properly transforming throwing calls in the cleanup into
invokes.

llvm-svn: 259338
2016-02-01 03:29:38 +00:00
Craig Topper 28851b62cc [TableGen] Store result of getInstructionsByEnumValue in an ArrayRef instead of accidentally copying to a vector.
llvm-svn: 259336
2016-02-01 01:33:42 +00:00
Frederic Riss e5b78d8041 [MCDwarf] Fix encoding of line tables with weird custom parameters
With poorly chosen custom parameters, the line table encoding logic would
sometimes end up generating a special opcode bigger than 255, which is wrong.
The set of default parameters that LLVM uses isn't subject to this bug.

When carefully chosing the line table parameters, it's impossible to fall into the
corner case that this patch fixes. The standard however doesn't require that these
parameters be carefully chosen. And even if it did, we shouldn't generate broken
encoding.

Add a unittest for this specific encoding bug, and while at it, create some unit
tests for the encoding logic using different sets of parameters.

llvm-svn: 259334
2016-01-31 22:06:35 +00:00
Craig Topper b9f42468ad Remove utostr_32 as it has no uses anymore.
llvm-svn: 259331
2016-01-31 20:00:26 +00:00
Craig Topper 3ef74f5956 Replace usages of llvm::utostr_32 with just llvm::utostr. While this is less efficient, its unclear the few places that were using the _32 version were doing so for efficiency.
llvm-svn: 259330
2016-01-31 20:00:24 +00:00
Craig Topper fa8b2317a6 Merge utohex_buffer into utohexstr, it's only caller. Also change utohexstr to use the std::string constructor that takes a start and end pointer. This saves a call to strlen. NFC
llvm-svn: 259329
2016-01-31 20:00:22 +00:00
Sanjay Patel 0069f56e33 add helper function for minnum/maxnum ; NFC
llvm-svn: 259326
2016-01-31 16:35:23 +00:00
Sanjay Patel 8af7fbc34c use range-based for loop; NFC
llvm-svn: 259325
2016-01-31 16:34:48 +00:00
Sanjay Patel 690955fcbc fix formatting; NFC
llvm-svn: 259324
2016-01-31 16:34:11 +00:00
Sanjay Patel 24b77d11bc simplify; NFC
llvm-svn: 259323
2016-01-31 16:33:33 +00:00
Sanjay Patel bbdab7af5a clean up; NFC
function names, comments, formatting, typos

llvm-svn: 259322
2016-01-31 16:32:23 +00:00
JF Bastien 578c8cde53 WebAssembly: more failures are gone
llvm-svn: 259321
2016-01-31 08:19:40 +00:00
JF Bastien ac9e8664a4 WebAssembly: update expected failures
r259305 fixed a few assertions around FrameIndex, and I forgot to update these failures despite having run the torture tests.

llvm-svn: 259320
2016-01-31 08:05:05 +00:00
Frederic Riss 96bfaf5fb2 [dsymutil] Fix FileCheck command.
Damn case-insensitive filesystem...

llvm-svn: 259319
2016-01-31 04:39:16 +00:00
Frederic Riss 6c8521ad32 [dsymutil] Fix handling of common symbols.
llvm-dsymutil was misinterpreting the value of common symbols as their
address when it actually contains their size. This didn't impact
llvm-dsymutil's ability to link the debug information for common symbols
because these are always found by name and not by address. Things could
however go wrong when the size of a common object matched the object
file address of another symbol. Depending on the link order of the symbols
the common object might incorrectly evict this other object from the
address to symbol mapping, and then link the evicted symbol with a wrong
binary address.

Use the new ability to have symbols without an object file address to fix
this.

llvm-svn: 259318
2016-01-31 04:29:34 +00:00
Frederic Riss d8c33dc2f6 [dsymutil] Allow debug map mappings with no object file address. NFC
This change just changes the data structure that ties symbol names,
object file address and linked binary addresses to accept mappings
with no object file address. Such symbol mappings are not fed into
the debug map yet, so this patch is NFC.
A subsequent patch will make use of this functionality for common
symbols.

llvm-svn: 259317
2016-01-31 04:29:22 +00:00
Tim Shen 3b428cb764 [SelectionDAG] Eliminate exponential behavior in WalkChainUsers
llvm-svn: 259315
2016-01-31 03:59:34 +00:00
Craig Topper 429093a9a4 No need to use utostr/utohexstr when writing into a raw_ostream. NFC
llvm-svn: 259314
2016-01-31 01:55:15 +00:00
Craig Topper ca919dc310 Shrink character buffer size in raw_ostream::write_hex to 16 characters intead of 20 as that's the largest string a 64-bit hex value can be.
llvm-svn: 259313
2016-01-31 01:12:38 +00:00
Craig Topper ab3d2ace49 Use std::end instead of repeating buffer sizes.
llvm-svn: 259312
2016-01-31 01:12:35 +00:00
Craig Topper 2ed5369424 Convert int to Twine instead of using utostr since it was already being added to a Twine. NFC
llvm-svn: 259308
2016-01-31 00:15:35 +00:00
Jingyue Wu 313496b7c4 [doc] improve the doc for CUDA
1. Mentioned that CUDA support works best with trunk.
2. Simplified the example by removing its dependency on the CUDA samples.
3. Explain the --cuda-gpu-arch flag.

llvm-svn: 259307
2016-01-30 23:48:47 +00:00
Derek Schuff c97ba939d1 [WebAssembly] Fix uses of FrameIndex as store values
Previously the code assumed all uses of FI on loads and stores were as
addresses. This checks whether the use is the address or a value and
handles the latter case as it does for non-memory instructions.

llvm-svn: 259306
2016-01-30 21:43:08 +00:00
JF Bastien fbc89d21dd WebAssembly: don't optimize frameindex store
The previous code was incorrect (can't getReg a frameindex). We could instead optimize it to reduce tree height, but I'm not sure that's worthwhile yet because we then try to eliminate the frameindex.

This patch also fixes frame index elimination for operations which may load or store: it used to assume the base was operand 2 and immediate offset operand 1. That's not true for stores, where they're 4 and 3.

llvm-svn: 259305
2016-01-30 14:11:26 +00:00
JF Bastien 3ca3ea690f WebAssembly NFC: fix build warning
WebAssemblyFrameLowering.cpp:158:44: warning: enumeral and non-enumeral type in conditional expression [enabled by default]

llvm-svn: 259303
2016-01-30 11:19:26 +00:00
Gerolf Hoflehner d24671f880 [BasicAA] NFC - revised comment for function adjustToPointerSize()
llvm-svn: 259300
2016-01-30 05:58:38 +00:00
Gerolf Hoflehner 87ddb65fa6 [BasicAA] Fix for missing must alias (D16343)
llvm-svn: 259299
2016-01-30 05:52:53 +00:00
Gerolf Hoflehner 73fc84bfe9 [BasicAA] Update on r259290 - added missing cast
llvm-svn: 259298
2016-01-30 05:35:09 +00:00
Matt Arsenault e013246462 AMDGPU: Fix emitting invalid workitem intrinsics for HSA
The AMDGPUPromoteAlloca pass was emitting the read.local.size
calls, which with HSA was incorrectly selected to reading from
the offset mesa uses off of the kernarg pointer.

Error on intrinsics which aren't supported by HSA, and start
emitting the correct IR to read the workgroup size
out of the dispatch pointer.

Also initialize the pass so it can be tested with opt, and
start moving towards not depending on the subtarget as an
argument.

Start emitting errors for the intrinsics not handled with HSA.

llvm-svn: 259297
2016-01-30 05:19:45 +00:00
Matt Arsenault d0799df707 AMDGPU: Stop checking intrinsics not used by HSA for dispatch-ptr
Only the dispatch.ptr intrinsic is supposed to be used now to get
the workgroup size, and the read.local.size intrinsics do not
work correctly.

llvm-svn: 259296
2016-01-30 05:10:59 +00:00
Matt Arsenault 56c079f393 InstCombine: fabs(x) * fabs(x) -> x * x
llvm-svn: 259295
2016-01-30 05:02:00 +00:00
Dan Gohman ed0f113885 [WebAssembly] Refine block placement to insert blocks between trees.
Refine the test for whether an instruction is in an expression tree so that
it detects when one tree ends and another begins, so we can place a block
at that point, rather than continuing to find the first instruction not in
a tree at all.

llvm-svn: 259294
2016-01-30 05:01:06 +00:00
Matt Arsenault 43976df0da AMDGPU: Add new amdgcn workitem intrinsics
These use the correct prefix and follow the HSA naming convention
rather than the config register option names.

llvm-svn: 259293
2016-01-30 04:25:19 +00:00
Justin Bogner 4bc4b5f4b8 Remove references to *.h.in files and some autoconf hackery
Missed this stuff in r259291.

llvm-svn: 259292
2016-01-30 04:15:33 +00:00
Justin Bogner 0138037203 Remove *.h.in - these were only used by the autoconf build system
llvm-svn: 259291
2016-01-30 04:05:45 +00:00
Gerolf Hoflehner 1d1fbb52e3 [BasicAA] NFC - utility function for two's complement wrap-around
llvm-svn: 259290
2016-01-30 02:42:11 +00:00
Xinliang David Li fe28ccc98f Further reduce test time
llvm-svn: 259285
2016-01-30 01:37:32 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Matthias Braun daa812d518 Use Support/DataTypes.h instead of cstdint
llvm-svn: 259282
2016-01-30 01:14:01 +00:00
Alexey Samsonov f18fba6d96 [docs] Remove references to autotools build.
llvm-svn: 259280
2016-01-30 01:10:15 +00:00
Justin Lebar ead59f4765 [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor.
Summary: Previously we'd just silently skip these.

Reviewers: tra, jholewinski

Subscribers: llvm-commits, jhen, echristo,

Differential Revision: http://reviews.llvm.org/D16739

llvm-svn: 259279
2016-01-30 01:07:38 +00:00
David Majnemer 8b68a6cabd [CodeView] Properly handle empty line tables
Don't crash when there are no appropriate line table entries for a given
function.

llvm-svn: 259277
2016-01-30 00:36:09 +00:00
Manman Ren c77e0ff785 [Objective-C] Support a new special module flag.
"Objective-C Class Properties" will be put into the objc_imageinfo struct.

rdar://23891898

llvm-svn: 259270
2016-01-29 23:51:00 +00:00
Davide Italiano 63634cb0bc [llvm-nm] Add a comment to explain why we initialize MC.
llvm-svn: 259266
2016-01-29 23:38:05 +00:00
Kostya Serebryany 54a6363a8f [libFuzzer] add -timeout_exitcode option
llvm-svn: 259265
2016-01-29 23:30:07 +00:00
Sanjay Patel 6038d3e5c6 function names start with a lower case letter ; NFC
llvm-svn: 259264
2016-01-29 23:27:03 +00:00
Kostya Serebryany 085ca4131f [libFuzzer] re-enable test for -abort_on_timeout=1, this time protecting from ASAN_OPTIONS set outside
llvm-svn: 259263
2016-01-29 23:19:00 +00:00
Sanjay Patel f9f5d3cc45 fix formatting; NFC
llvm-svn: 259262
2016-01-29 23:14:58 +00:00
Fiona Glaser 36e8230db0 Fix typo in LoopSimplifyCFG
llvm-svn: 259261
2016-01-29 23:12:52 +00:00
Vedant Kumar 00dab22853 [Profiling] Add a -sparse mode to llvm-profdata merge
Add an option to llvm-profdata merge for writing out sparse indexed
profiles. These profiles omit InstrProfRecords for functions which are
never executed.

Differential Revision: http://reviews.llvm.org/D16727

llvm-svn: 259258
2016-01-29 22:54:45 +00:00
Reid Kleckner b046154ae9 Fix the MSVC build by moving static asserts into constructors
Apparently MSVC won't allow you to ask for the sizeof() a data member at
class scope.

llvm-svn: 259257
2016-01-29 22:40:22 +00:00
Fiona Glaser b417d464e6 Add LoopSimplifyCFG pass
Loop transformations can sometimes fail because the loop, while in
valid rotated LCSSA form, is not in a canonical CFG form. This is
an extremely simple pass that just merges obviously redundant
blocks, which can be used to fix some known failure cases. In the
future, it may be enhanced with more cases (and have code shared with
SimplifyCFG).

This allows us to run LoopSimplifyCFG -> LoopRotate -> LoopUnroll,
so that SimplifyCFG cleans up the loop before Rotate tries to run.

Not currently used in the pass manager, since this pass doesn't do
anything unless you can hook it up in an LPM with other loop passes.
It'll be added once Chandler cleans up things to allow this.

Tested in a custom pipeline out of tree to confirm it works in
practice (in addition to the included trivial test).

llvm-svn: 259256
2016-01-29 22:35:36 +00:00
Matthias Braun 9c98105002 Need #include <cstdint> for uint64_t
llvm-svn: 259255
2016-01-29 22:35:29 +00:00
Matthias Braun d520d4ecd2 Need #include <climit> for CHAR_BIT
llvm-svn: 259254
2016-01-29 22:30:30 +00:00
Xinliang David Li dd4ae7b522 Improve test speed/trial 2
llvm-svn: 259253
2016-01-29 22:29:15 +00:00
Matthias Braun 3328281538 AttributeSetImpl: Summarize existing function attributes in a bitset.
The majority of attribute queries checks for the existence of an enum
attribute in the FunctionIndex slot. We only have 48 of those and can
therefore summarize them in an uint64_t bitset which measurably improves
compile time.

Differential Revision: http://reviews.llvm.org/D16618

llvm-svn: 259252
2016-01-29 22:25:19 +00:00
Matthias Braun 31eeb76f5e AttributeSetNode: Summarize existing attributes in a bitset.
The majority of queries just checks for the existince of an enum
attribute.  We only have 48 of those and can summaryiz them in an
uint64_t bitfield so we can avoid searching the list. This improves
"opt" compile time by 1-4% in my measurements.

Differential Revision: http://reviews.llvm.org/D16617

llvm-svn: 259251
2016-01-29 22:25:13 +00:00
Xinliang David Li a8ba7affa8 Revert 259242, 259243 -- irrelvante changes pulled in
llvm-svn: 259244
2016-01-29 21:26:31 +00:00
Xinliang David Li bf38f39dd6 Use range for loop
llvm-svn: 259243
2016-01-29 21:23:47 +00:00
Xinliang David Li 631c1ecb64 Improve test speed (interchange loop, reducing padding)
llvm-svn: 259242
2016-01-29 21:13:55 +00:00
Yaron Keren eb2a25467e Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.
clang part in r259232, this is the LLVM part of the patch.

llvm-svn: 259240
2016-01-29 20:50:44 +00:00
Sanjay Patel 66fff73c76 [InstCombine] avoid an insertelement transformation that induces the opposite extractelement fold (PR26354)
We would infinite loop because we created a shufflevector that was wider than
needed and then failed to combine that with the insertelement. When subsequently
visiting the extractelement from that shuffle, we see that it's unnecessary,
delete it, and trigger another visit to the insertelement.

llvm-svn: 259236
2016-01-29 20:21:02 +00:00
David Majnemer 4553723825 Unbreak windows buildbots
llvm-svn: 259231
2016-01-29 19:38:03 +00:00
David Majnemer 6fcbd7e909 [CodeView] Implement .cv_inline_linetable
This support is _very_ rudimentary, just enough to get some basic data
into the CodeView debug section.

Left to do is:
- Use the combined opcodes to save space.
- Do something about code offsets.

llvm-svn: 259230
2016-01-29 19:24:12 +00:00
Tim Northover c4093c3ced ARM: don't mangle DAG constant if it has more than one use
The basic optimisation was to convert (mul $LHS, $complex_constant) into
roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected
to be cheaper. The original logic checks that the mul only has one use (since
we're mangling $complex_constant), but when used in even more complex
addressing modes there may be an outer addition that can pick up the wrong
value too.

I *think* the ARM addressing-mode problem is actually unreachable at the
moment, but that depends on complex assessments of the profitability of
pre-increment addressing modes so I've put a real check in there instead of an
assertion.

llvm-svn: 259228
2016-01-29 19:18:46 +00:00
Derek Schuff d91a12ec11 [WebAssembly] Update test expectations
llvm-svn: 259223
2016-01-29 18:54:38 +00:00
Derek Schuff 6ea637af35 [WebAssembly] Support frame pointer
Add support for frame pointer use in prolog/epilog.
Supports dynamic allocas but not yet over-aligned locals.
Target-independend CG generates SP updates, but we still need to write
back the SP value to memory when necessary.

llvm-svn: 259220
2016-01-29 18:37:49 +00:00
Ahmed Bougacha 0df98ff913 [X86] Add missing "CHECK" colon in r259065 test.
llvm-svn: 259219
2016-01-29 18:25:33 +00:00
Reid Kleckner f3b9ba4941 [codeview] Begin to add support for inlined call sites
Summary:
There are three parts to inlined call frames:
1. The inlinee line subsection
2. The inline site symbol record
3. The function ids referenced by both

This change starts by emitting function ids (3) for all subprograms and
emitting the base inline site symbol record (2). The actual line numbers
in (2) use an encoded format that will come next, along with the inlinee
line subsection.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16333

llvm-svn: 259217
2016-01-29 18:16:43 +00:00
David Majnemer 75f492e7f1 Fix the build
llvm-svn: 259215
2016-01-29 17:46:57 +00:00
Jonas Paulsson 8c738635b1 Temporarily revert "[ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten."
Some buildbot failures needs to be debugged.

llvm-svn: 259213
2016-01-29 17:22:43 +00:00
Matthew Simpson 53d00ef874 [SLP] Fix printing of debug statement (NFC)
llvm-svn: 259212
2016-01-29 17:21:38 +00:00
Sanjoy Das c816f03b70 [RS4GC] Address post-commit review on r259208 from David
NFC

llvm-svn: 259211
2016-01-29 17:20:49 +00:00
Sanjoy Das 565f7866ac [RS4GC] Remove unnecessary const_cast; NFC
GCRelocateInst::getDerivedPtr already returns a non-const llvm::Value
pointer.

llvm-svn: 259209
2016-01-29 16:54:49 +00:00
Sanjoy Das 3794eeb8bb [RS4GC] Minor local cleanup to StabilizeOrder; NFC
- Locally declare struct, and call it BaseDerivedPair
 - Use a lambda to compare, instead of a singleton with uninitialized
   fields
 - Add a constructor to BaseDerivedPair and use SmallVector::emplace_back

llvm-svn: 259208
2016-01-29 16:50:34 +00:00
Reid Kleckner 828883b86c [CodeView] Fix dumping the is_stmt bit from the line table
Bug pointed out by George Rimar.

llvm-svn: 259205
2016-01-29 16:39:04 +00:00
Sanjoy Das 69b4a41fed [RS4GC] Remove unnecessary redirections from tests; NFC
llvm-svn: 259204
2016-01-29 16:32:30 +00:00
Sanjoy Das f3a4ee7542 [RS4GC] Add some missing tests and CHECK: lines
I missed porting these in rL259129.

llvm-svn: 259203
2016-01-29 16:32:25 +00:00
Zoran Jovanovic d474ef3a3b [mips] Absolute value macro expansion
Author: obucina
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D16323

llvm-svn: 259202
2016-01-29 16:18:34 +00:00
Jonas Paulsson 23f12e5c02 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten.
The buildSchedGraph() was in need of reworking as the AA features had been
added on top of earlier code. It was very difficult to understand, and buggy.
There had been found cases where scheduling dependencies had actually been
missed (see r228686).

AliasChain, RejectMemNodes, adjustChainDeps() and iterateChainSucc() have
been removed. There are instead now just the four maps from Value to SUs, which
have been renamed to Stores, Loads, NonAliasStores and NonAliasLoads.

An unknown store used to become the AliasChain, but now becomes a store mapped
to 'unknownValue' (in Stores). What used to be PendingLoads is instead the
list of SUs mapped to 'unknownValue' in Loads.

RejectMemNodes and adjustChainDeps() used to be a safety-net for everything.
The SU maps were sometimes cleared and SUs were put in RejectMemNodes, where
adjustChainDeps() would look. Instead of this, a more straight forward approach
is used in maintaining the SU maps without clearing them and simply letting
them grow over time. Instead of the cutt-off in adjustChainDeps() search, a
reduction of maps will be done if needed (see below).

Each SUnit either becomes the BarrierChain, or is put into one of the maps. For
each SUnit encountered, all the information about previous ones are still
available until a new BarrierChain is set, at which point the maps are cleared.

For huge regions, the algorithm becomes slow, therefore the maps will get
reduced at a threshold (current default is 1000 nodes), by a fraction (default 1/2).
These values can be tuned by use of CL options in case some test case shows that
they need to be changed (-dag-maps-huge-region and -dag-maps-reduction-size).

There has not been any considerable change observed in output quality or compile
time. There may now be more DAG edges inserted than before (i.e. if A->B->C,
then A->C is not needed). However, in a comparison run there were fewer total
calls to AA, and a somewhat improved compile time, which means this seems to
be not a problem.

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259201
2016-01-29 16:11:18 +00:00
Benjamin Kramer 34ac580164 [IR] Move definitions of users of Use::set to Value.h
Still ugly, but at least Use.h is self-contained again.

llvm-svn: 259191
2016-01-29 12:47:05 +00:00
Benjamin Kramer ccb7a86c59 [IR] Shuffle the code for getSequentialElementType to type.h to avoid circular header dependencies.
llvm-svn: 259190
2016-01-29 12:47:01 +00:00
Alexandros Lamprineas 8c26e7c647 [ARM] Emit trap instruction using .inst directive
The trap instruction is emitted as a data-in-text rather
than an instruction. This patch uses the .inst directive
for emitting trap.

Differential Revision: http://reviews.llvm.org/D16684

llvm-svn: 259182
2016-01-29 10:23:32 +00:00
Matt Arsenault 295875efda AMDGPU: Remove 24-bit intrinsics
The known bit matching code seems to work reasonably well,
so these shouldn't really be needed.

llvm-svn: 259180
2016-01-29 10:05:16 +00:00
George Burgess IV 5d095c91ee Minor bugfix in AAResults::getModRefInfo.
Also removed a few redundant `else`s.

Bug was found by a test I wrote for MemorySSA (in review at
http://reviews.llvm.org/D7864; shiny update coming soon). So, assuming
that lands at some point, this should be covered by that. If anyone
feels this deserves its own explicit test case, please let me know.
I'll write one.

llvm-svn: 259179
2016-01-29 07:51:15 +00:00
Eric Christopher 7d9b9b2d7d Refactor common code for PPC fast isel load immediate selection.
llvm-svn: 259178
2016-01-29 07:20:30 +00:00
Eric Christopher 5a2429e239 Since LI/LIS sign extend the constant passed into the instruction we should
check that the sign extended constant fits into 16-bits if we want a
zero extended value, otherwise go ahead and put it together piecemeal.

Fixes PR26356.

llvm-svn: 259177
2016-01-29 07:20:01 +00:00
Eric Christopher 80ba58a15c Fix up conditional formatting.
llvm-svn: 259176
2016-01-29 07:19:49 +00:00
Akira Hatanaka 4f472a8867 [llvm-bcanalyzer] Dump bitcode wrapper header
This patch enables llvm-bcanalyzer to print the bitcode wrapper header
if the file has one, which is needed to test the changes made in
r258627 (bitcode-wrapper-header-armv7m.ll is the test case for r258627).

Differential Revision: http://reviews.llvm.org/D16642

llvm-svn: 259162
2016-01-29 05:55:09 +00:00
David Majnemer f2bb710da5 [WinEH] Don't perform state stores in cleanups
Our cleanups do not support true lexical nesting of funclets which
obviates the need to perform state stores.

This fixes PR26361.

llvm-svn: 259161
2016-01-29 05:33:15 +00:00
Matthias Braun e61f8e3882 SmallPtrSetTest: More checks for the swap() testing
llvm-svn: 259152
2016-01-29 03:34:36 +00:00