Commit Graph

32168 Commits

Author SHA1 Message Date
Akira Hatanaka f6afd11538 [InstCombine] Preserve metadata when merging loads that are phi
arguments.

Make sure InstCombiner::FoldPHIArgLoadIntoPHI doesn't drop the following
metadata:

MD_tbaa
MD_alias_scope
MD_noalias
MD_invariant_load
MD_nonnull
MD_range

rdar://problem/17617709

Differential Revision: http://reviews.llvm.org/D12710

llvm-svn: 248419
2015-09-23 18:40:57 +00:00
Sanjay Patel 1a6534661b [x86] replace integer 'xor' ops with packed SSE FP 'xor' ops when operating on FP scalars
Turn this:

movd %xmm0, %eax
movd %xmm1, %ecx
xorl %eax, %ecx
movd %ecx, %xmm0

into this:

xorps %xmm1, %xmm0

This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

This is an extension of:
http://reviews.llvm.org/rL248395

llvm-svn: 248415
2015-09-23 18:33:42 +00:00
Sanjay Patel aba37553c4 [x86] replace integer 'or' ops with packed SSE FP 'or' ops when operating on FP scalars
Turn this:

movd %xmm0, %eax
movd %xmm1, %ecx
orl %eax, %ecx
movd %ecx, %xmm0

into this:

orps %xmm1, %xmm0

This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

This is an extension of:
http://reviews.llvm.org/rL248395

llvm-svn: 248409
2015-09-23 18:19:07 +00:00
Adrian Prantl 4c36e2f47e Fix the order of operations.
llvm-svn: 248406
2015-09-23 18:09:01 +00:00
Evgeniy Stepanov a2002b08f7 Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

This is a re-commit of a change in r248357 that was reverted in
r248358.

llvm-svn: 248405
2015-09-23 18:07:56 +00:00
Adrian Prantl c040893085 Temporarily make testcase more verbose to debug a msvc buildbot failure.
llvm-svn: 248403
2015-09-23 17:59:45 +00:00
Chen Li 5cd6deeae3 [Bug 24848] Use range metadata to constant fold comparisons with constant values
Summary:
This is the first part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848.

When range metadata is provided, it should be used to constant fold comparisons with constant values.

Reviewers: sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12988

llvm-svn: 248402
2015-09-23 17:58:44 +00:00
Adrian Prantl a112ef9e2d dsymutil: Resolve forward decls for types defined in clang modules.
This patch extends llvm-dsymutil's ODR type uniquing machinery to also
resolve forward decls for types defined in clang modules.

http://reviews.llvm.org/D13038

llvm-svn: 248398
2015-09-23 17:35:52 +00:00
Adrian Prantl 209370260d dsymutil: print a warning when there is a module hash mismatch.
This also updates the module binaries in the test directory because
their module hash mismatched.

llvm-svn: 248396
2015-09-23 17:11:10 +00:00
Sanjay Patel df2495f331 [x86] replace integer 'and' ops with packed SSE FP 'and' ops when operating on FP scalars
Turn this:
   movd %xmm0, %eax
   movd %xmm1, %ecx
   andl %eax, %ecx
   movd %ecx, %xmm0

into this:
   andps %xmm1, %xmm0


This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

Differential Revision: http://reviews.llvm.org/D13065

llvm-svn: 248395
2015-09-23 17:00:06 +00:00
Vedant Kumar ff08e926ba [Inline] Use AssumptionCache from the right Function
This changes the behavior of AddAligntmentAssumptions to match its
comment. I.e, prove the asserted alignment in the context of the caller,
not the callee.

Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko
who also saw a fix for the issue.

rdar://22521387

Differential Revision: http://reviews.llvm.org/D12997

llvm-svn: 248390
2015-09-23 15:49:08 +00:00
David Majnemer fa36bde2f6 [DeadArgElim] Split the invoke successor edge
Invoking a function which returns an aggregate can sometimes be
transformed to return a scalar value.  However, this means that we need
to create an insertvalue instruction(s) to recreate the correct
aggregate type.  We achieved this by inserting an insertvalue
instruction at the invoke's normal successor.  However, this is not
feasible if the normal successor uses the invoke's return value inside a
PHI node.

Instead, split the edge between the invoke and the unwind successor and
create the insertvalue instruction in the new basic block.  The new
basic block's successor will be the old invoke successor which leaves
us with IR which is well behaved.

This fixes PR24906.

llvm-svn: 248387
2015-09-23 15:41:09 +00:00
Igor Laevsky 029bd93c5d [DeadStoreElimination] Remove dead zero store to calloc initialized memory
This change allows dead store elimination to remove zero and null stores into memory freshly allocated with calloc-like function.

Differential Revision: http://reviews.llvm.org/D13021

llvm-svn: 248374
2015-09-23 11:38:44 +00:00
Simon Pilgrim 9cb018b6b6 [X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR
This patches removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead.

LLVM counterpart to D12835

Differential Revision: http://reviews.llvm.org/D13002

llvm-svn: 248368
2015-09-23 08:48:33 +00:00
Evgeniy Stepanov 8d0e3011d8 Revert "Android support for SafeStack."
test/Transforms/SafeStack/abi.ll breaks when target is not supported;
needs refactoring.

llvm-svn: 248358
2015-09-23 01:23:22 +00:00
Evgeniy Stepanov ce2e16f00c Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

llvm-svn: 248357
2015-09-23 01:03:51 +00:00
Cong Hou b54a72ef78 Add a test case for the fix of profile update issue when lowering switch statement.
llvm-svn: 248356
2015-09-23 00:34:56 +00:00
Adrian Prantl 77fefeba37 Debug Info: Emit the dwo_name only in skeleton CUs, not in DWOs.
llvm-svn: 248340
2015-09-22 23:21:00 +00:00
Matthias Braun 73e4221e6c LiveIntervalAnalysis: Avoid multiple connected liveness components
We may have subregister defs which are unused but not discovered and
cleaned up prior to liveness analysis. This creates multiple connected
components in the resulting live range which are forbidden in the
MachineVerifier because they would unnecesarily constrain the register
allocator. Rewrite those dead definitions to define a newly created
virtual register.

Differential Revision: http://reviews.llvm.org/D13035

llvm-svn: 248335
2015-09-22 22:37:44 +00:00
Michael Zolotukhin deade19630 [Unroll] Do not crash trying to propagate a value to vector load.
llvm-svn: 248333
2015-09-22 22:27:12 +00:00
Adrian Prantl e5162dba49 dsymutil: Follow references to clang modules and recursively clone the
debug info.

This does not yet resolve external type references.

llvm-svn: 248331
2015-09-22 22:20:50 +00:00
Michael Zolotukhin 8bb31dd08a [Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.
Apart from checking that GlobalVariable is a constant, we should check
that it's not a weak constant, in which case we can't propagate its
value.

llvm-svn: 248327
2015-09-22 21:41:29 +00:00
Davide Italiano 77011ba16a Remove macho-dump. Its functionality is now covered by llvm-readobj.
Approved by: Rafael Espindola, Eric Christopher, Jim Grosbach, 
             Alex Rosenberg

llvm-svn: 248302
2015-09-22 17:46:10 +00:00
Ahmed Bougacha 81616a72ea [ARM] Emit clrex in the expanded cmpxchg fail block.
ARM counterpart to r248291:

In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

llvm-svn: 248294
2015-09-22 17:22:58 +00:00
Ahmed Bougacha 07a844d758 [AArch64] Emit clrex in the expanded cmpxchg fail block.
In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

llvm-svn: 248291
2015-09-22 17:21:44 +00:00
Stephen Canon 8216d88511 Don't raise inexact when lowering ceil, floor, round, trunc.
The C standard has historically not specified whether or not these functions should raise the inexact flag. Traditionally on Darwin, these functions *did* raise inexact, and the llvm lowerings followed that conventions. n1778 (C bindings for IEEE-754 (2008)) clarifies that these functions should not set inexact. This patch brings the lowerings for arm64 and x86 in line with the newly specified behavior.  This also lets us fold some logic into TD patterns, which is nice.

Differential Revision: http://reviews.llvm.org/D12969

llvm-svn: 248266
2015-09-22 11:43:17 +00:00
Daniel Sanders f173dda0e2 [mips][ias] Implement .cpreturn directive.
Summary:
Based on a patch by David Chisnall. I've modified the original patch as follows:
* Moved the expansion to the TargetStreamers so that the directive isn't
  expanded when emitting assembly.
* Fixed an operand order bug.
* Changed the move instructions from DADDu to OR to match recent changes to GAS.

Reviewers: vkalintiris

Subscribers: llvm-commits, emaste, seanbruno, theraven

Differential Revision: http://reviews.llvm.org/D13017

llvm-svn: 248258
2015-09-22 10:50:09 +00:00
Simon Pilgrim 1cad0cd3ce [X86][SSE] Match zero/any extension shuffles that don't start from the first element
This patch generalizes the lowering of shuffles as zero extensions to allow extensions that don't start from the first element. It now recognises extensions starting anywhere in the lower 128-bits or at the start of any higher 128-bit lane.

The motivation was to reduce the number of high cost pshufb calls, but it also improves the SSE2 case as well.

Differential Revision: http://reviews.llvm.org/D12561

llvm-svn: 248250
2015-09-22 08:16:08 +00:00
Philip Reames 5f99423de9 [LICM] Hoist calls to readonly argmemonly functions even with stores in the loop
We know that an argmemonly function can only access memory pointed to by it's pointer arguments. Rather than needing to consider all possible stores as aliasing (as we do for a readonly function), we can only consider the aliasing of the pointer arguments.

Note that this change only addresses hoisting. I'm thinking about how to address speculation safety as well, but that will be a different change.

FYI, argmemonly disallows accessing memory through non-pointer typed arguments.  

Differential Revision: http://reviews.llvm.org/D12771

llvm-svn: 248220
2015-09-21 22:27:59 +00:00
Philip Reames 963febd4f8 Fix for pr24866
Turns out that not every basic block is guaranteed to have a node within the DominatorTree.  This is really hard to trigger, but the test case from the PR managed to do so.  There's active discussion continuing about what documentation and/or invariants needed cleaned up.

llvm-svn: 248216
2015-09-21 22:04:10 +00:00
Simon Pilgrim 4003ed2da3 [DAGCombiner] Improve FMA support for interpolation patterns
This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.

This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))))

Differential Revision: http://reviews.llvm.org/D13003

llvm-svn: 248210
2015-09-21 20:32:48 +00:00
Jeroen Ketema 41681a5329 [ARM] Do not scale vext with a factor
The vext pseudo-instruction takes the number of elements that need to be
extracted, not the number of bytes. Hence, use the number of elements
directly instead of scaling them with a factor.

Reviewers: Silviu Baranga, James Molloy
(not reflected in the differential revision)

Differential Revision: http://reviews.llvm.org/D12974

llvm-svn: 248208
2015-09-21 20:28:04 +00:00
James Molloy 50a4c27f97 [LoopUtils,LV] Propagate fast-math flags on generated FCmp instructions
We're currently losing any fast-math flags when synthesizing fcmps for
min/max reductions. In LV, make sure we copy over the scalar inst's
flags. In LoopUtils, we know we only ever match patterns with
hasUnsafeAlgebra, so apply that to any synthesized ops.

llvm-svn: 248201
2015-09-21 19:41:19 +00:00
Rafael Espindola 8055ed0c12 Avoid SEGFAULT if a requested symbol section is absent.
Patch by Igor Kudrin!

llvm-svn: 248194
2015-09-21 19:17:18 +00:00
Ulrich Weigand 126caeb043 [SystemZ] Fix expansion of ISD::FPOW and ISD::FSINCOS
The ISD::FPOW and ISD::FSINCOS opcodes default to Legal, but there
is no legal instruction for those on SystemZ.  This could cause
LLVM internal errors.  Fixed by setting the operation action to
Expand for those opcodes.

Also added test cases for all other LLVM IR intrinsics that should
generate a library call.  (Those already work correctly since the
default operation action is fine.)

llvm-svn: 248180
2015-09-21 17:35:45 +00:00
Matt Arsenault b774834429 DAGCombiner: Replace store of FP constant after attemping store merges
If storing multiple FP constants, some subset of the stores
would be replaced with integers due to visit order, so
MergeConsecutiveStores would only partially merge
these.

llvm-svn: 248169
2015-09-21 15:59:46 +00:00
Asaf Badouh eaf2da14bf [X86][AVX512] add masked version for RSQRT14 & RCP14 Scalar FP
Differential Revision: http://reviews.llvm.org/D12524

llvm-svn: 248147
2015-09-21 10:23:53 +00:00
Daniel Sanders 5d7962880d [mips] Allow constant expressions in second argument of .cpsetup.
Summary:
Also tightened up the test and made a trivial fix to prevent double-newline
after emitting .cpsetup directives.

Reviewers: vkalintiris

Subscribers: seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D12956

llvm-svn: 248143
2015-09-21 09:26:55 +00:00
Sanjay Patel bab5d6c636 add test file ahead of any functional changes for PR22428
llvm-svn: 248123
2015-09-20 15:58:00 +00:00
Simon Pilgrim c6a553241c [X86][SSE] Intrinsics builtins test refresh. NFCI
llvm-svn: 248122
2015-09-20 15:41:35 +00:00
Igor Breger b7e1f9d680 AVX512: Implemented encoding and intrinsics for vcmpss/sd.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12593

llvm-svn: 248121
2015-09-20 15:15:10 +00:00
Asaf Badouh 2744d21fb8 [X86][AVX512] extend support in Scalar conversion
add scalar FP to Int conversion with truncation intrinsics
add scalar conversion FP32 from/to FP64 intrinsics
add rounding mode and SAE mode encoding for these intrinsics

Differential Revision: http://reviews.llvm.org/D12665

llvm-svn: 248117
2015-09-20 14:31:19 +00:00
Igor Breger 4c4cd789c9 AVX512: vsqrtss/sd encoding and intrinsics implementation.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12102

llvm-svn: 248116
2015-09-20 09:13:41 +00:00
Asaf Badouh 572bbceecc [X86][AVX512DQ] Add fpclass instruction
Differential Revision: http://reviews.llvm.org/D12931

llvm-svn: 248115
2015-09-20 08:46:07 +00:00
Michael Kuperstein 58e86bc893 [X86] Fix sitofp and uitofp instruction matching failures with long double and avx512
The operation action for i32 and i64 cannot be set to legal, as long double 
needs custom lowering.

Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D12372

llvm-svn: 248114
2015-09-20 08:12:17 +00:00
Igor Breger 1d55f20bee AVX512: Implemented intrinsics for vshuff32x4, vshuff64x2, vshufi64x2, vshufi32x4
Added tests for intrinsics.

Differential Revision: http://reviews.llvm.org/D12525

llvm-svn: 248113
2015-09-20 07:18:53 +00:00
Igor Breger 0ede3cbb5c AVX512: Implement instructions encoding, lowering and intrinsics
vinserti64x4, vinserti64x2, vinserti32x8, vinserti32x4, vinsertf64x4, vinsertf64x2, vinsertf32x8, vinsertf32x4
Added tests for encoding, lowering and intrinsics.

Differential Revision: http://reviews.llvm.org/D11893

llvm-svn: 248111
2015-09-20 06:52:42 +00:00
Sanjoy Das 428db150d1 [IndVars] Fix a bug in r248045.
Because -indvars widens induction variables through arithmetic,
`NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV`
manages information for all transitive uses of an IV being widened,
including uses of `-1 * IV`).  Instead it must live on `NarrowIVDefUse`
which manages information for a specific def-use edge in the transitive
use list of an induction variable.

This change also adds a test case that demonstrates the problem with
r248045.

llvm-svn: 248107
2015-09-20 01:52:18 +00:00
Davide Italiano e210ee56f2 Fixup r248096, commit the *correct* test.
llvm-svn: 248097
2015-09-19 20:52:47 +00:00
Davide Italiano a539f63ae1 [obj2yaml] Fix "time of check to time of use" bug. Add a test.
llvm-svn: 248096
2015-09-19 20:49:34 +00:00
Simon Pilgrim 27f81776ad [X86][AVX2] Use general sext IR for vpmovsx stack folding tests
llvm-svn: 248093
2015-09-19 17:04:18 +00:00
Simon Pilgrim d0448ee59f [X86][SSE] Vectorize CTTZ + CTTZ_ZERO_UNDEF
Now that we have fast vector CTPOP implementations we can use this to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1))

Additionally, for AVX512CD that provides lzcnt instructions we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x))

Differential Revision: http://reviews.llvm.org/D12663

llvm-svn: 248091
2015-09-19 13:22:57 +00:00
NAKAMURA Takumi 5881d349f9 [CMake] Update LLVM_TEST_DEPENDS not to use macho-dump. It has been unused since r247235.
llvm-svn: 248088
2015-09-19 07:19:30 +00:00
David Majnemer 47ce0b81b0 [InstCombine] FoldICmpCstShrCst failed for ashr when comparing against -1
(icmp eq (ashr C1, %V) -1) may have multiple answers if C1 is not a
power of two and has the sign bit set.

This fixes PR24873.

llvm-svn: 248074
2015-09-19 00:48:31 +00:00
Matt Arsenault cc5d106263 AMDGPU: Add failing testcase for live interval construction
llvm-svn: 248067
2015-09-19 00:03:56 +00:00
Sanjoy Das f69d0e3384 [IndVars] Widen more comparisons for non-negative induction vars
Summary:
If an induction variable is provably non-negative, its sign extension is
equal to its zero extension.  This means narrow uses like

  icmp slt iNarrow %indvar, %rhs

can be widened into

  icmp slt iWide zext(%indvar), sext(%rhs)

Reviewers: atrick, mcrosier, hfinkel

Subscribers: hfinkel, reames, llvm-commits

Differential Revision: http://reviews.llvm.org/D12745

llvm-svn: 248045
2015-09-18 21:21:02 +00:00
Cong Hou d40105d321 Update edge weights properly when merging blocks in if-conversion.
In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue.

Differential Revision: http://reviews.llvm.org/D12513

llvm-svn: 248030
2015-09-18 20:22:41 +00:00
Eric Christopher a835956bda Limit the range of processors supported by ARM fast isel to v6 or
later as that's all that is tested right now.

Fixes PR24858.

llvm-svn: 248027
2015-09-18 20:08:18 +00:00
Cong Hou f9f9ffb98b Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue.
In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), there is a precision issue after the computation. To avoid this issue, this patch scales those cycles by 1024 (chosen to make the multiplication a litter faster) before they are scaled by the probability. Other variables are also scaled up for the final comparison.

Differential Revision: http://reviews.llvm.org/D12742

llvm-svn: 248018
2015-09-18 18:19:40 +00:00
Matthias Braun f89b7c7188 SelectionDAGDumper: Hide [ID=X], [ORD=X] and source locations by default.
You can show them with the new -dag-dump-verbose switch.

Differential Revision: http://reviews.llvm.org/D12566

llvm-svn: 248011
2015-09-18 17:57:28 +00:00
Matthias Braun 0b7d6c14c9 SelectionDAG: Introduce PersistentID to SDNode for assert builds.
This gives us more human readable numbers to identify nodes in debug
dumps.

Before:
  0x7fcbd9700160: ch = EntryToken

  0x7fcbd985c7c8: i64 = Register %RAX

   ...

      0x7fcbd9700160: <multiple use>
    0x7fcbd985c578: i64,ch = MOV64rm 0x7fcbd985c6a0, 0x7fcbd985cc68, 0x7fcbd985c200, 0x7fcbd985cd90, 0x7fcbd985ceb8, 0x7fcbd9700160<Mem:LD8[@foo]> [ORD=2]

  0x7fcbd985c8f0: ch,glue = CopyToReg 0x7fcbd9700160, 0x7fcbd985c7c8, 0x7fcbd985c578 [ORD=3]

    0x7fcbd985c7c8: <multiple use>
    0x7fcbd985c8f0: <multiple use>
    0x7fcbd985c8f0: <multiple use>
  0x7fcbd985ca18: ch = RETQ 0x7fcbd985c7c8, 0x7fcbd985c8f0, 0x7fcbd985c8f0:1 [ORD=3]

Now:
  t0: ch = EntryToken

  t5: i64 = Register %RAX

    ...

      t0: <multiple use>
    t3: i64,ch = MOV64rm t10, t12, t11, t13, t14, t0<Mem:LD8[@foo]> [ORD=2]

  t6: ch,glue = CopyToReg t0, t5, t3 [ORD=3]

    t5: <multiple use>
    t6: <multiple use>
    t6: <multiple use>
  t7: ch = RETQ t5, t6, t6:1 [ORD=3]

Differential Revision: http://reviews.llvm.org/D12564

llvm-svn: 248010
2015-09-18 17:41:00 +00:00
Geoff Berry 43ec15e57e [AArch64] Improved bitfield instruction selection.
Summary:
For bitfield insert OR matching, check both operands for larger pattern
first before checking for smaller pattern.

Add pattern for unsigned bitfield insert-in-zero done with SHL+AND.

Resolves PR21631.

Reviewers: jmolloy, t.p.northover

Subscribers: aemerson, rengolin, llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D12908

llvm-svn: 248006
2015-09-18 17:11:53 +00:00
Daniel Sanders df19a5e605 [mips][microMIPS] Fix an invalid read for lwm32 and reserved reglist values.
Summary:
Some values of 'reglist' are reserved and cause the disassembler to read past
the end of the Regs array. Treat lwm32's containing reserved values as invalid
instructions.

Reviewers: zoran.jovanovic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12959

llvm-svn: 247990
2015-09-18 14:20:54 +00:00
Igor Laevsky 0fa4819dd8 [LazyValueInfo] Report nonnull range for nonnull pointers
Currently LazyValueInfo will report only alloca's as having nonnull range. 
For loads with !nonnull metadata it will bailout with no additional information. 
Same is true for calls returning nonnull pointers.

This change extends LazyValueInfo to handle additional nonnull instructions.

Differential Revision: http://reviews.llvm.org/D12932

llvm-svn: 247985
2015-09-18 13:01:48 +00:00
Artur Pilipenko 84bc62f7a3 Support align attribute for return values
Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D12844

llvm-svn: 247984
2015-09-18 12:33:31 +00:00
Quentin Colombet b4c6886215 [ShrinkWrap] Refactor the handling of infinite loop in the analysis.
- Strenghten the logic to be sure we hoist the restore point out of the current
  loop. (The fixes a bug with infinite loop, added as part of the patch.)
- Walk over the exit blocks of the current loop to conver to the desired restore
  point in one iteration of the update loop.

llvm-svn: 247958
2015-09-17 23:21:34 +00:00
Davide Italiano 096cda11fc [llvm-readobj] Fix another "time of check to time of use bug".
It seems there's more copy-paste between tools than needed.

llvm-svn: 247954
2015-09-17 22:29:58 +00:00
David Majnemer 163b7f121c [WinEH] Fix tests broken by funclet-layout
llvm-svn: 247944
2015-09-17 21:11:12 +00:00
Joerg Sonnenberger 1bbfa7f9d7 [SPARC] Add mulscc.
llvm-svn: 247940
2015-09-17 20:54:26 +00:00
David Majnemer 978902309a [WinEH] Add a funclet layout pass
Windows EH funclets need to be contiguous.  The FuncletLayout pass will
ensure that the funclets are together and begin with a funclet entry MBB.

Differential Revision: http://reviews.llvm.org/D12943

llvm-svn: 247937
2015-09-17 20:45:18 +00:00
Reid Kleckner 5b8a46e771 [WinEH] Make funclet return instrs pseudo instrs
This makes catchret look more like a branch, and less like a weird use
of BlockAddress. It also lets us get away from
llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label
arithmetic.

llvm-svn: 247936
2015-09-17 20:43:47 +00:00
Simon Pilgrim 61116ddc7b [InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions
The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results.

Differential Revision: http://reviews.llvm.org/D12680

llvm-svn: 247934
2015-09-17 20:32:45 +00:00
Teresa Johnson ff642b9b84 Restore "Function bitcode index in Value Symbol Table and lazy reading support"
This reverts commit r247898 (which reverted r247894).

Patch fixed to address two issues exposed by buildbots:
- unused variable warning in NDEBUG mode
- std::initializer_list lifetime issue causing test failures

Original Summary:
Support for including the function bitcode indices in the Value Symbol
Table. This requires writing the VST after the function blocks, which in
turn requires a new VST forward declaration record encoding the offset of
the full VST (which is backpatched to contain the offset after the VST
is written).

This patch also enables the lazy function reader to use the new function
indices out of the VST. This support will be used by ThinLTO as well, which
will be in a follow on patch. Backwards compatibility with older bitcode
files is maintained.

A new test is also included.

The bitcode format (used for the lazy reader as well as the upcoming
ThinLTO patches) came out of discussions with Duncan and others and is
described here:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view

Reviewers: dexonsmith, davidxl, joker.eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12536

llvm-svn: 247927
2015-09-17 20:12:00 +00:00
Diego Novillo d2e2137b4c Temporarily fix gcov failures in big-endian hosts.
This test uses a gcov file generated in a little-endian host. The gcov
reader does not allow different endianness, so the test fails on big
endian hosts.

XFAILing for now.

llvm-svn: 247920
2015-09-17 19:05:48 +00:00
Reid Kleckner b78585b3c2 Fix the test case I just committed
llvm-svn: 247905
2015-09-17 17:21:45 +00:00
Reid Kleckner ed17079b52 [WinEH] Add and use hasEHPadSuccessor instead of getLandingPadSuccessor
getLandingPadSuccessor assumes that each invoke can have at most one EH
pad successor, but WinEH invokes can have more than one. Two out of
three callers of getLandingPadSuccessor don't use the returned
landingpad, so we can make them use this simple predicate instead.

Eventually we'll have to circle back and fix SplitKit.cpp so that
register allocation works. Baby steps.

llvm-svn: 247904
2015-09-17 17:19:40 +00:00
Teresa Johnson 2e98d57ad4 Revert "Function bitcode index in Value Symbol Table and lazy reading support"
Temporarily revert to fix some buildbot issues. One is a minor issue
with a variable unused in NDEBUG mode. More concerning are some test
failures on win7 that I need to dig into.

This reverts commit 4e66a74543459832cfd571db42b4543580ae1d1d.

llvm-svn: 247898
2015-09-17 16:19:10 +00:00
Daniel Sanders e2982adc0b [mips] Add assembler support for the .cprestore directive.
Summary:
This assembler directive is used in O32 PIC to restore the current function's $gp after executing JAL's. The $gp is first stored on the stack at a user-specified offset.
It has the following format: ".cprestore 8" (where 8 is the offset).

This fixes llvm.org/PR20967.

Patch by Toma Tabacu.

Reviewers: seanbruno, tomatabacu

Subscribers: brooks, seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D6267

llvm-svn: 247897
2015-09-17 16:08:39 +00:00
Teresa Johnson b77b1f8a0c Function bitcode index in Value Symbol Table and lazy reading support
Summary:
Support for including the function bitcode indices in the Value Symbol
Table. This requires writing the VST after the function blocks, which in
turn requires a new VST forward declaration record encoding the offset of
the full VST (which is backpatched to contain the offset after the VST
is written).

This patch also enables the lazy function reader to use the new function
indices out of the VST. This support will be used by ThinLTO as well, which
will be in a follow on patch. Backwards compatibility with older bitcode
files is maintained.

A new test is also included.

The bitcode format (used for the lazy reader as well as the upcoming
ThinLTO patches) came out of discussions with Duncan and others and is
described here:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view

Reviewers: dexonsmith, davidxl, joker.eph

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12536

llvm-svn: 247894
2015-09-17 15:52:30 +00:00
Zoran Jovanovic 7ba636cb4c [mips][microMIPS] Implement TEQ, TGE, TGEU, TLT, TLTU and TNE instructions
Differential Revision: http://reviews.llvm.org/D9658

llvm-svn: 247880
2015-09-17 10:14:09 +00:00
Elena Demikhovsky 702a6adfaa AVX-512: shufflevector for i1 vectors <2 x i1> .. <64 x i1>
AVX-512 does not provide an instruction that shuffles mask register. So I do the following way:

mask-2-simd , shuffle simd , simd-2-mask

Differential Revision: http://reviews.llvm.org/D12727

llvm-svn: 247876
2015-09-17 06:53:12 +00:00
Diego Novillo 3376a78781 GCC AutoFDO profile reader - Initial support.
This adds enough machinery to support reading simple GCC AutoFDO
profiles. It now supports reading flat profiles (no function calls).
Subsequent patches will add support for:

- Inlined calls (in particular, the inline call stack is not traversed
  to accumulate samples).

- Working sets and modules. These are used mostly for GCC's LIPO
  optimizations, so they're not needed in LLVM atm. I'm not sure that
  we will ever need them. For now, I've if0'd around the calls.

The patch also adds support in GCOV.h for gcov version V704 (generated
by GCC's profile conversion tool).

llvm-svn: 247874
2015-09-17 00:17:24 +00:00
Reid Kleckner 813f1b65bc [WinEH] Rip out the landingpad-based C++ EH state numbering code
It never really worked, and the new code is working better every day.

llvm-svn: 247860
2015-09-16 22:14:46 +00:00
David Majnemer 67bff0d88b [WinEHPrepare] Turn terminatepad into a cleanuppad + call + cleanupret
The MSVC doesn't really support exception specifications so let's just
turn these into cleanuppads.  Later, we might use terminatepad to more
efficiently encode the "noexcept"-ness of a function body.

llvm-svn: 247848
2015-09-16 20:42:16 +00:00
Sanjoy Das e5f4889ba9 [InstCombine] Optimize icmp slt signum(x), 1 --> icmp slt x, 1
Summary:
`signum(x)` is sometimes implemented as `(x >> 63) | (-x >>> 63)` (for
an `i64` `x`).  This change adds a matcher for that pattern, and an
instcombine rule to optimize `signum(x) s< 1`.

Later, we can also consider optimizing:

  icmp slt signum(x), 0 --> icmp slt x, 0
  icmp sle signum(x), 1 --> true

etc.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12703

llvm-svn: 247846
2015-09-16 20:41:29 +00:00
Reid Kleckner b005d281c3 [WinEH] Pull Adjectives and CatchObj out of the catchpad arg list
Clang now passes the adjectives as an argument to catchpad.

Getting the CatchObj working is simply a matter of threading another
static alloca through codegen, first as an alloca, then as a frame
index, and finally as a frame offset.

llvm-svn: 247844
2015-09-16 20:16:27 +00:00
David Majnemer 459a64aed7 [WinEHPrepare] Provide a cloning mode which doesn't demote
We are experimenting with a new approach to saving and restoring SSA
values used across funclets: let the register allocator do the dirty
work for us.

However, this means that we need to be able to clone commoned blocks
without relying on demotion.

llvm-svn: 247835
2015-09-16 18:40:37 +00:00
Teresa Johnson 8c8fe5a015 Disable the second verification run when performing LTO through
gold in NDEBUG mode.
Follow on patch for r247729 - LTO: Disable extra verify runs in release
builds.

llvm-svn: 247824
2015-09-16 18:06:45 +00:00
Reid Kleckner 84ebff4a5e [WinEH] Skip state numbering when no EH pads are present
Otherwise we'd try to emit the thunk that passes the LSDA to
__CxxFrameHandler3. We don't emit the LSDA if there were no landingpads,
so we'd end up with an assembler error when trying to write the COFF
object.

llvm-svn: 247820
2015-09-16 17:19:44 +00:00
Mehdi Amini 6f6f137e49 Improve "default_triple" specification: make it at the directory level for test/tools/llvm-mc
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247819
2015-09-16 17:03:12 +00:00
Dan Gohman 950a13cfa3 [WebAssembly] Check in an initial CFG Stackifier pass
This pass implements a simple algorithm for conversion from CFG to
wasm's structured control flow. It doesn't yet handle multiple-entry
loops; that will be added in a future patch.

It also adds initial support for switch statements.

Differential Revision: http://reviews.llvm.org/D12735

llvm-svn: 247818
2015-09-16 16:51:30 +00:00
Sanjay Patel a260701bbb propagate fast-math-flags on DAG nodes
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, 
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: 
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.

This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I 
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.

This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.

Differential Revision: http://reviews.llvm.org/D12095

llvm-svn: 247815
2015-09-16 16:31:21 +00:00
Reid Kleckner 85dfb68e50 Add assembler fatal error for undefined assembler labels in COFF writer
llvm-svn: 247814
2015-09-16 16:26:29 +00:00
Joerg Sonnenberger 22cd644e1b [SPARC] Both GNU and Solaris as support eq as condition code for integer ops.
llvm-svn: 247804
2015-09-16 14:41:36 +00:00
Joerg Sonnenberger 9763490e4d [SPARC] Recognize st/stx operations with %fsr argument too.
llvm-svn: 247794
2015-09-16 13:30:54 +00:00
Michael Kuperstein d926465342 [X86] Do not generate 64-bit pops of 32-bit GPRs.
When trying emit a stack adjustments using pops, frame lowering selects an
arbitrary free GPR. It should always select one from an appropriate class...
This fixes PR24649.

Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D12609

llvm-svn: 247785
2015-09-16 11:27:20 +00:00
Zoran Jovanovic 6e6a2c9cd7 [mips][microMIPS] Implement PREFX, LHUE, LBE, LBUE, LHE, LWE, SBE, SHE and SWE instructions
Differential Revision: http://reviews.llvm.org/D9189

llvm-svn: 247780
2015-09-16 09:14:35 +00:00
NAKAMURA Takumi d42d3df56f Copy back Inputs/gmlt.ll. Also DebugInfo/X86/gmlt.test uses it.
llvm-svn: 247777
2015-09-16 06:22:55 +00:00
Mehdi Amini 01fee92d05 Fix test gmlt.test by moving its Inputs where expected.
I couldn't see the failure as the test is XFAIL'ed on Darwin.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247776
2015-09-16 06:04:31 +00:00
Mehdi Amini d178f4fc89 Make the default triple optional by allowing an empty string
When building LLVM as a (potentially dynamic) library that can be linked against
by multiple compilers, the default triple is not really meaningful.
We allow to explicitely set it to an empty string when configuring LLVM.
In this case, said "target independent" tests in the test suite that are using
the default triple are disabled by matching the newly available feature
"default_triple".

Reviewers: probinson, echristo
Differential Revision: http://reviews.llvm.org/D12660

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247775
2015-09-16 05:34:32 +00:00
Mehdi Amini 8e468d6388 [NaryReassociate] Improve test CHECK
Add `CHECK` directives for the function calls.

Differential Revision: http://reviews.llvm.org/D12885

Patch by: Volkan Keles <vkeles@apple.com>

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247774
2015-09-16 05:27:46 +00:00
Michael Zolotukhin fc314be0ec [Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad.
We only checked that a global is initialized with constants, which is
incorrect. We should be checking that GlobalVariable *is* a constant,
not just initialized with it.

llvm-svn: 247769
2015-09-16 03:25:09 +00:00
Sanjoy Das 8a5526e8be [IndVars] Fix PR24783.
In `IndVarSimplify::ExpandSCEVIfNeeded`,
`SCEVExpander::findExistingExpansion` may return an `llvm::Value` that
differs in type from the SCEV it was asked to find an expansion for (but
computes the same value).  In such cases, we fall back on
`expandCodeFor`; and rely on LLVM to CSE the two equivalent
expressions (different only by a no-op cast) into a single computation.

I tried a few other approaches to fixing PR24783, all of which turned
out to be more complex than this current version:

 1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`.  This got
    problematic because currently we do not pass in the `Loop *` into
    `expandCodeFor`.  Changing the interface to do this is a more
    invasive change, and really does not make much semantic sense unless
    the SCEV being passed in is an add recurrence.

    There is also the problem of `expandCodeFor` being used in places
    other than `indvars` -- there may be performance / correctness
    issues elsewhere if `expandCodeFor` is moved from always generating
    IR from scratch to cache-like model.

 2. Have `findExistingExpansion` only return expression with the correct
    type.  This would make `isHighCostExpansionHelper` and thus
    `isHighCostExpansion` more conservative than necessary.

 3. Insert casts on the value returned by `findExistingExpansion` if
    needed using `InsertNoopCastOfTo`.  This is complicated because
    `InsertNoopCastOfTo` depends on internal state of its
    `SCEVExpander` (specifically `Builder.GetInserPoint()`), and this
    may not be set up when `ExpandSCEVIfNeeded` is called.

 4. Manually insert casts on the value returned by
    `findExistingExpansion` if needed using `InsertNoopCastOfTo` via
    `CastInst::Create`.  This is probably workable, but figuring out the
    location where the cast instruction needs to be inserted has enough
    edge cases (arguments, constants, invokes, LCSSA must be preserved)
    makes me feel what I have right now is simplest solution.

llvm-svn: 247749
2015-09-15 23:45:39 +00:00
Davide Italiano 386e2ab158 [llvm-cxxdump] Remove duplicate code check.
We already fail with 'No such file or directory' when we try to open
the file -- if that doesn't exist. Also add a test to verify this behavior.

llvm-svn: 247744
2015-09-15 23:35:32 +00:00
Duncan P. N. Exon Smith cff5feff6f Reapply "LTO: Disable extra verify runs in release builds"
This reverts commit r247730, effectively reapplying r247729.  This time
I have an lld commit ready to follow.

llvm-svn: 247735
2015-09-15 23:05:59 +00:00
Alexey Samsonov c1603b6493 [ASan] Don't instrument globals in .preinit_array/.init_array/.fini_array
These sections contain pointers to function that should be invoked
during startup/shutdown by __libc_csu_init and __libc_csu_fini.
Instrumenting these globals will append redzone to them, which will be
filled with zeroes. This will cause null pointer dereference at runtime.

Merge ASan regression tests for globals that should be ignored by
instrumentation pass.

llvm-svn: 247734
2015-09-15 23:05:48 +00:00
Duncan P. N. Exon Smith 7de73e56a4 Revert "LTO: Disable extra verify runs in release builds"
This temporarily reverts commit r247729, as it caused lld build
failures.  I'll recommit once I have an lld patch ready-to-go.

llvm-svn: 247730
2015-09-15 22:47:38 +00:00
Duncan P. N. Exon Smith 236787838c LTO: Disable extra verify runs in release builds
The verifier currently runs three times in LTO: (1) after parsing, (2)
at the beginning of the optimization pipeline, and (3) at the end of it.

The first run is important, since we're not sure where the bitcode comes
from and it's nice to validate it, but in release builds the extra runs
aren't appropriate.

This commit:
  - Allows these runs to be disabled in LTOCodeGenerator.
  - Adds command-line options to llvm-lto.
  - Adds command-line options to libLTO.dylib, and disables the verifier
    by default in release builds (based on NDEBUG).

This shaves about 3.5% off the runtime of ld64 when linking
verify-uselistorder with -flto -g.

rdar://22509081

llvm-svn: 247729
2015-09-15 22:26:11 +00:00
Justin Bogner 01fa3f96b3 test: Add "REQUIRES: native" so this test passes with no default triple configured
llvm-svn: 247719
2015-09-15 21:13:33 +00:00
Quentin Colombet 7b13976254 [ShrinkWrapping] Add a test case for r247710.
llvm-svn: 247713
2015-09-15 18:51:43 +00:00
Piotr Padlewski 6c15ec49ed Introducing llvm.invariant.group.barrier intrinsic
For more info for what reason it was invented, goto:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html

invariant.group.barrier:
http://reviews.llvm.org/D12310
docs:
http://reviews.llvm.org/D11399
CodeGenPrepare:
http://reviews.llvm.org/D12875

llvm-svn: 247711
2015-09-15 18:32:14 +00:00
Arch D. Robison 8ed0854f55 Broaden optimization of fcmp ([us]itofp x, constant) by instcombine.
The patch extends the optimization to cases where the constant's
magnitude is so small or large that the rounding of the conversion
is irrelevant.  The "so small" case includes negative zero.

Differential review: http://reviews.llvm.org/D11210

llvm-svn: 247708
2015-09-15 17:51:59 +00:00
Igor Laevsky bdc1eafe20 [CorrelatedValuePropagation] Infer nonnull attributes
LazuValueInfo can prove that value is nonnull based on the context information. 
Make use of this ability to infer nonnull attributes for the call arguments.

Differential Revision: http://reviews.llvm.org/D12836

llvm-svn: 247707
2015-09-15 17:51:50 +00:00
Marcello Maggioni 454faa84e2 [NaryReassociate] Add support for Mul instructions
This patch extends the current pass by handling
Mul instructions as well.

Patch by: Volkan Keles (vkeles@apple.com)

llvm-svn: 247705
2015-09-15 17:22:52 +00:00
Zoran Jovanovic dc4b8c2761 [mips][microMIPS] Fix an issue with disassembling lwm32 instruction
Fixed microMIPS disassembler crash on test case generated by llvm-mc-fuzzer.
Differential Revision: http://reviews.llvm.org/D12881

llvm-svn: 247698
2015-09-15 15:21:27 +00:00
Zoran Jovanovic 8eb8c9861d [mips] Add support for branch-likely pseudo-instructions
Differential Revision: http://reviews.llvm.org/D10537

llvm-svn: 247697
2015-09-15 15:06:26 +00:00
Ulrich Weigand e861e6442c [SystemZ] Fix assertion failure in tryBuildVectorShuffle
Under certain circumstances, tryBuildVectorShuffle would attempt to
create a BUILD_VECTOR node with an invalid combination of types.
This happened when one of the components of the original BUILD_VECTOR
was itself a TRUNCATE node.  That TRUNCATE was stripped off during
intermediate processing to simplify code, but when adding the node
back to the result vector, we still need it to get the type right.

llvm-svn: 247694
2015-09-15 14:27:46 +00:00
Zoran Jovanovic 7beb737b46 [mips][microMIPS] Implement CACHEE and PREFE instructions for microMIPS32r6
Differential Revision: http://reviews.llvm.org/D11632

llvm-svn: 247670
2015-09-15 10:05:10 +00:00
Daniel Sanders e4e83a7bc1 [mips] Added support for various EVA ASE instructions.
Summary:
Added support for the following instructions:

CACHEE, LBE, LBUE, LHE, LHUE, LWE, LLE, LWLE, LWRE, PREFE,
SBE, SHE, SWE, SCE, SWLE, SWRE, TLBINV, TLBINVF

This required adding some infrastructure for the EVA ASE.

Patch by Scott Egerton.

Reviewers: vkalintiris, dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11139

llvm-svn: 247669
2015-09-15 10:02:16 +00:00
Sanjoy Das f75e15e5ac [PlaceSafepoints] Make the width of a counted loop settable.
Summary:
This change lets a `PlaceSafepoints` client change how wide the trip
count of a loop has to be for the loop to be considerd "counted", via
`CountedLoopTripWidth`.  It also removes the boolean `SkipCounted` flag
and the `upperTripBound` constant -- we can get the old behavior of
`SkipCounted` == `false` by setting `CountedLoopTripWidth` to `13` (2 ^
13 == 8192).

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12789

llvm-svn: 247656
2015-09-15 01:42:48 +00:00
Dan Gohman 311b488d76 [WebAssembly] Implement int64-to-int32 conversion.
llvm-svn: 247649
2015-09-15 00:55:19 +00:00
Adrian Prantl deef90d7f5 DwarfDebug: Emit dwo_id+dwo_name for DICompileUnits that provide a dwoId.
For module debugging clang emits prefabricated skeleton compile units
that can be recognized by a nonzero dwoId.

llvm-svn: 247626
2015-09-14 22:10:22 +00:00
Chen Li 0d043b52eb [InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite
Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. It also adds assertions in isKnownNonNull() and isKnownNonNullFromDominatingCondition() to make sure the value checked is pointer type (as defined in LLVM document). These assertions might trip failures in things which are not  covered under llvm/test, but fixes should be pretty obvious. 

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12779

llvm-svn: 247587
2015-09-14 18:10:43 +00:00
Davide Italiano 2c007d050a [llvm-mc] Better error handling in ENOENT case + test.
This is a follow up to r247518.
As a general note, I think we could do a much better job testing for
error conditions in tools. I already anticipated in a previous mail,
but while implementing this I noticed that the code coverage we have 
for error checking is pretty low. I can arbitrarily remove checks from 
several tools and the suite still passes.

Differential Revision:	 http://reviews.llvm.org/D12846

llvm-svn: 247582
2015-09-14 17:10:01 +00:00
Jun Bum Lim 34b9bd0435 Improve ISel using across lane min/max reduction
In vectorized integer min/max reduction code, the final "reduce" step
is sub-optimal. In AArch64, this change wll combine :
  %svn0 = vector_shuffle %0, undef<2,3,u,u>
  %smax0 = smax %0, svn0
  %svn3 = vector_shuffle %smax0, undef<1,u,u,u>
  %sc = setcc %smax0, %svn3, gt
  %n0 = extract_vector_elt %sc, #0
  %n1 = extract_vector_elt %smax0, #0
  %n2 = extract_vector_elt $smax0, #1
  %result = select %n0, %n1, n2
becomes :
  %1 = smaxv %0
  %result = extract_vector_elt %1, 0

This change extends r246790.

llvm-svn: 247575
2015-09-14 16:19:52 +00:00
JF Bastien 26aca14b15 [MergeFuncs] Fix bug in merging GetElementPointers
GetElementPointers must have the first argument's type compared
for structural equivalence. Previously the code erroneously compared the
pointer's type, but this code was dead because all pointer types (of the
same address space) are the same. The pointee must be compared instead
(using the type stored in the GEP, not from the pointer type which will
be erased anyway).

Author: jrkoenig
Reviewers: dschuff, nlewycky, jfb
Subscribers: nlewycky, llvm-commits
Differential revision: http://reviews.llvm.org/D12820

llvm-svn: 247570
2015-09-14 15:37:48 +00:00
John Brawn 056e67865a [ARM] Extract shifts out of multiply-by-constant
Turning (op x (mul y k)) into (op x (lsl (mul y k>>n) n)) is beneficial when
we can do the lsl as a shifted operand and the resulting multiply constant is
simpler to generate.

Do this by doing the transformation when trying to select a shifted operand,
as that ensures that it actually turns out better (the alternative would be to
do it in PreprocessISelDAG, but we don't know for sure there if extracting the
shift would allow a shifted operand to be used).

Differential Revision: http://reviews.llvm.org/D12196

llvm-svn: 247569
2015-09-14 15:19:41 +00:00
NAKAMURA Takumi c397e7881a Revert part of r247553, "[CMake] Reformat CLANG_TEST_DEPS." It was accidental commit.
llvm-svn: 247555
2015-09-14 12:51:01 +00:00
NAKAMURA Takumi 0f1cbee00e [CMake] Reformat CLANG_TEST_DEPS.
llvm-svn: 247553
2015-09-14 12:41:53 +00:00
Simon Pilgrim f8f86ab176 [X86][MMX] Added shuffle decodes for MMX/3DNow! shuffles.
Added shuffle decodes for MMX PUNPCK + PSHUFW shuffles.
Added shuffle decodes for 3DNow! PSWAPD shuffles.

llvm-svn: 247526
2015-09-13 11:28:45 +00:00
Elena Demikhovsky 8671fcbbd6 AVX-512: Fixed a bug in OR/XOR operations for 512-bit FP values on KNL.
KNL does not have VXORPS, VORPS for 512-bit values.
I use integer VPXOR, VPOR that actually do the same.

X86ISD::FXOR/FOR are generated as a result of FSUB combining.

Differential Revision: http://reviews.llvm.org/D12753

llvm-svn: 247523
2015-09-13 08:15:15 +00:00
Sanjay Patel 8b960d22ad [x86] enable machine combiner reassociations for 128-bit vector logical integer insts (2nd try)
The changes in:
test/CodeGen/X86/machine-cp.ll
are just due to scheduling differences after some logic instructions were reassociated.

llvm-svn: 247516
2015-09-12 19:47:50 +00:00
Simon Pilgrim 42c834bad5 [X86] Added i1 vector sextload tests
llvm-svn: 247509
2015-09-12 15:36:41 +00:00
Simon Pilgrim 779bcf3e3d [X86][FMA] Refreshed fma tests
llvm-svn: 247508
2015-09-12 15:33:05 +00:00
Sanjay Patel 99f7370a79 revert r247506; need to verify changes in existing tests
llvm-svn: 247507
2015-09-12 15:27:31 +00:00
Sanjay Patel 08755c7dbc [x86] enable machine combiner reassociations for 128-bit vector logical integer insts
llvm-svn: 247506
2015-09-12 14:58:04 +00:00
Simon Pilgrim 20c607b110 [InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding
Improved InstCombine support for CVTPH2PS (F16C half 2 float conversion):

<4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion.

Added constant folding support.

Differential Revision: http://reviews.llvm.org/D12731

llvm-svn: 247504
2015-09-12 13:39:53 +00:00
Simon Pilgrim 8ee50e9cff [X86][SSE] Use general sext IR for (v)pmovsx stack folding tests
llvm-svn: 247502
2015-09-12 11:45:24 +00:00
Chandler Carruth 29a18a4663 [PM] Port SROA to the new pass manager.
In some ways this is a very boring port to the new pass manager as there
are no interesting analyses or dependencies or other oddities.

However, this does introduce the first good example of a transformation
pass with non-trivial state porting to the new pass manager. I've tried
to carve out patterns here to replicate elsewhere, and would appreciate
comments on whether folks like these patterns:

- A common need in the new pass manager is to effectively lift the pass
  class and some of its state into a public header file. Prior to this,
  LLVM used anonymous namespaces to provide "module private" types and
  utilities, but that doesn't scale to cases where a public header file
  is needed and the new pass manager will exacerbate that. The pattern
  I've adopted here is to use the namespace-cased-name of the core pass
  (what would be a module if we had them) as a module-private namespace.
  Then utility and other code can be declared and defined in this
  namespace. At some point in the future, we could even have
  (conditionally compiled) code that used modules features when
  available to do the same basic thing.

- I've split the actual pass run method in two in order to expose
  a private method usable by the old pass manager to wrap the new class
  with a minimum of duplicated code. I actually looked at a bunch of
  ways to automate or generate these, but they are all quite terrible
  IMO. The fundamental need is to extract the set of analyses which need
  to cross this interface boundary, and that will end up being too
  unpredictable to effectively encapsulate IMO. This is also
  a relatively small amount of boiler plate that will live a relatively
  short time, so I'm not too worried about the fact that it is boiler
  plate.

The rest of the patch is totally boring but results in a massive diff
(sorry). It just moves code around and removes or adds qualifiers to
reflect the new name and nesting structure.

Differential Revision: http://reviews.llvm.org/D12773

llvm-svn: 247501
2015-09-12 09:09:14 +00:00
Davide Italiano 63cee81c3c [MC] Don't crash on division by zero.
Differential Revision:	http://reviews.llvm.org/D12776

llvm-svn: 247471
2015-09-11 20:47:35 +00:00
Yunzhong Gao 46261a74db Add a non-exiting diagnostic handler for LTO.
This is in order to give LTO clients a chance to do some clean-up before
terminating the process.

llvm-svn: 247461
2015-09-11 20:01:53 +00:00
Akira Hatanaka bc497c93f5 Use function attribute "stackrealign" to decide whether stack
realignment should be forced.

With this commit, we can now force stack realignment when doing LTO and
do so on a per-function basis. Also, add a new cl::opt option
"stackrealign" to CommandFlags.h which is used to force stack
realignment via llc's command line.

Out-of-tree projects currently using -force-align-stack to force stack
realignment should make changes to attach the attribute to the functions
in the IR.

Differential Revision: http://reviews.llvm.org/D11814

llvm-svn: 247450
2015-09-11 18:54:38 +00:00
David Majnemer 0e70598a5b [X86] Make sure startproc/endproc are paired
We used different conditions to determine if we should emit startproc vs
endproc.  Use the same condition to ensure that they will always be
paired.

This fixes PR24374.

llvm-svn: 247435
2015-09-11 17:34:34 +00:00
Reid Kleckner 5dbee7baef [IR] Print the label operands of a catchpad like an invoke
The rest of the EH pads are fine, since they have at most one label and
take fewer operands for the personality.

Old catchpad vs. new:
  %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9
-----
  %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)]
          to label %__except.ret.10 unwind label %catchendblock.9

llvm-svn: 247433
2015-09-11 17:27:52 +00:00
Daniel Sanders 715f8f1332 [mips] Add missing disassembler tests for MIPS64-MIPS64R5.
llvm-svn: 247422
2015-09-11 16:24:11 +00:00
Daniel Sanders 9676db005e [mips] Add missing MIPS32 - MIPS32R5 disassembler tests.
llvm-svn: 247420
2015-09-11 15:28:19 +00:00
Daniel Sanders aba4daab10 [mips] Attempt to fix llvm-s390x-linux1
It doesn't seem to like the '|&' in the test command.

llvm-svn: 247418
2015-09-11 14:57:54 +00:00
Daniel Sanders 9bbda77006 [mips] Add missing MIPS-IV disassembler tests.
llvm-svn: 247417
2015-09-11 14:54:58 +00:00
Daniel Sanders ff338fa355 [mips] Add missing MIPS-III disassembler tests.
llvm-svn: 247416
2015-09-11 14:48:46 +00:00
Arnaud A. de Grandmaison ba4e977cff Tweak 2 x86 gold tests so they can run on non-x86 platforms
llvm-svn: 247415
2015-09-11 14:45:34 +00:00