Commit Graph

245950 Commits

Author SHA1 Message Date
Craig Topper 66b2fd1209 [AVX-512] Remove many of the masked 128/256-bit shift builtins and replace them with unmasked builtins and selects.
llvm-svn: 285539
2016-10-31 04:30:51 +00:00
Eric Fiselier ebcc86e469 Add 'inline' but not 'always_inline' to std::strings destructor.
Adding both 'inline' and 'always_inline' to the destructor has been contentious.
However most of the performance benefits can be gained by only adding 'inline',
and there is no reason to hold up that change while discussing the other.

llvm-svn: 285538
2016-10-31 03:42:50 +00:00
Eric Fiselier 0f0a077c89 Remove additional function template definitions from the dylib
llvm-svn: 285537
2016-10-31 03:40:29 +00:00
Sanjoy Das fd080904b7 Make a test case more rigorous; NFC
llvm-svn: 285536
2016-10-31 03:32:45 +00:00
Sanjoy Das 1707869db5 [SCEV] Try to order n-ary expressions in CompareValueComplexity
llvm-svn: 285535
2016-10-31 03:32:43 +00:00
Sanjoy Das 3d6e3df5f9 [SCEV] Reduce boilerplate in unit tests
llvm-svn: 285534
2016-10-31 03:32:39 +00:00
Artem Dergachev e14d881808 [analyzer] NumberObjectConversion: support more types, misc updates.
Support CFNumberRef and OSNumber objects, which may also be accidentally
converted to plain integers or booleans.

Enable explicit boolean casts by default in non-pedantic mode.

Improve handling for warnings inside macros.

Improve error messages.

Differential Revision: https://reviews.llvm.org/D25731

llvm-svn: 285533
2016-10-31 03:08:48 +00:00
Eric Fiselier a55333003d Optimize filesystem::path by providing weaker exception guarantees.
path uses string::append to construct, append, and concatenate paths. Unfortunatly
string::append has a strong exception safety guaranteed and if it can't prove
that the iterator operations don't throw then it will allocate a temporary
string copy to append to. However this extra allocation and copy is very
undesirable for path which doesn't have the same exception guarantees.

To work around this this patch adds string::__append_forward_unsafe which exposes
the std::string::append interface for forward iterators without enforcing
that the iterator is noexcept.

llvm-svn: 285532
2016-10-31 02:46:25 +00:00
Eric Fiselier 7ca76565e7 Fix _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY to always have default visibility.
This prevent the symbols from being both externally available and hidden, which
causes them to be linked incorrectly. This is only a problem when the address
of the function is explicitly taken since it will always be inlined otherwise.

This patch fixes the issues that caused r285456 to be reverted, and can
now be reapplied.

llvm-svn: 285531
2016-10-31 02:07:23 +00:00
Eric Fiselier ef915d3ef4 Improve performance of constructing filesystem::path from strings.
This patch fixes a performance bug when constructing or appending to a path
from a string or c-string. Previously we called 'push_back' to append every
single character. This caused multiple re-allocation and copies when at most
one reallocation is necessary. The new behavior is to simply call
`string::append` so it can correctly handle reallocation.

For large strings this change is a ~4x improvement. This also makes our path
faster to construct than libstdc++'s.

llvm-svn: 285530
2016-10-30 23:53:50 +00:00
Sanjoy Das 299e67291c [SCEV] In CompareValueComplexity, order global values by their name
llvm-svn: 285529
2016-10-30 23:52:56 +00:00
Sanjoy Das b4830a84b9 [SCEV] Use auto for consistency with an upcoming change; NFC
llvm-svn: 285528
2016-10-30 23:52:53 +00:00
Sanjoy Das b53021d71f Clean up test a little bit; NFC
llvm-svn: 285527
2016-10-30 23:52:50 +00:00
Eric Fiselier 1467a197e5 Rewrite std::filesystem::path iterators and parser
This patch entirely rewrites the parsing logic for paths. Unlike the previous
implementation this one stores information about the current state; For example
if we are in a trailing separator or a root separator. This avoids the need for
extra lookahead (and extra work) when incrementing or decrementing an iterator.
Roughly this gives us a 15% speedup over the previous implementation.

Unfortunately this implementation is still a lot slower than libstdc++'s.
Because libstdc++ pre-parses and splits the path upon construction their
iterators are trivial to increment/decrement. This makes libc++ lazy parsing
100x slower than libstdc++. However the pre-parsing libstdc++ causes a ton
of extra and unneeded allocations when constructing the string. For example
`path("/foo/bar/")` would require at least 5 allocations with libstdc++
whereas libc++ uses only one. The non-allocating behavior is much preferable
when you consider filesystem usages like 'exists("/foo/bar/")'.

Even then libc++'s path seems to be twice as slow to simply construct compared
to libstdc++. More investigation is needed about this.

llvm-svn: 285526
2016-10-30 23:30:38 +00:00
Mehdi Amini b2461ce33a Fix clang installed path to handle case where clang is invoked through a symlink
This code path is used when generating the path to libLTO.dylib, which
is passed to the linker as `-lto_library'.
Without this, if clang is invoked through a symlink, libLTO is
searched in a path relative to where the symlink is instead of
where clang is actually installed.

Fix PR30811.

Patch by: Jack Howarth

Differential Revision: https://reviews.llvm.org/D26116

llvm-svn: 285525
2016-10-30 23:26:13 +00:00
Eric Fiselier 3aa5478e21 Add start of filesystem benchmarks
llvm-svn: 285524
2016-10-30 22:53:00 +00:00
Eric Fiselier 2d4fbb7b0c Mark thread exit test as unsupported w/o threads
llvm-svn: 285523
2016-10-30 20:05:52 +00:00
Sanjay Patel 339a51ac13 [DAG] x | x --> x
llvm-svn: 285522
2016-10-30 18:19:35 +00:00
Sanjay Patel 13aee345ca [DAG] x & x --> x
llvm-svn: 285521
2016-10-30 18:13:30 +00:00
Sanjay Patel 8a5f9810a0 [x86] add tests for basic logic op folds
llvm-svn: 285520
2016-10-30 18:04:19 +00:00
Michael Zuckerman d343697f1e Fixing "type" issue for (epi32)
and replaceing hardcoded inf with clang builtin inf "__builtin_inff()" for float ({max|min}_{pd|ps})

llvm-svn: 285519
2016-10-30 14:54:05 +00:00
Dorit Nuzman 06903d16af Revert r285517 due to build failures.
llvm-svn: 285518
2016-10-30 14:34:57 +00:00
Dorit Nuzman 3c1c658f24 [LoopVectorize] Make interleaved-accesses analysis less conservative about
possible pointer-wrap-around concerns, in some cases.

Before this patch, collectConstStridedAccesses (part of interleaved-accesses
analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when
examining all candidate pointers. This is too conservative. Instead, this
patch makes collectConstStridedAccesses use an optimistic approach, calling
getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the
candidate interleave groups have been formed, revisits the pointer-wrapping
analysis but only where it matters: namely, in groups that have gaps, and where
the gaps are not at the very end of the group (in which case the loop is
peeled). This second time getPtrStride is called with [Assume=false,
ShouldCheckWrap=true], but this could further be improved to using Assume=true,
once we also add the logic to track that we are not going to meet the scev
runtime checks threshold.

Differential Revision: https://reviews.llvm.org/D25276

llvm-svn: 285517
2016-10-30 12:23:26 +00:00
Craig Topper 312ff9d19d [AVX-512] Remove masked 128/256-bit builtins for vpmaddwd and vpmaddubsw. Replace with unmasked builtins and select.
llvm-svn: 285516
2016-10-30 07:11:34 +00:00
Craig Topper b7781a95fd [X86] Use intrinsics table for PMADDUBSW and PMADDWD so that we can use the legacy intrinsics to select EVEX encoded instructions when available.
This removes a couple tablegen classes that become unused after this change. Another class gained an additional parameter to allow PMADDUBSW to specify a different result type from its input type.

llvm-svn: 285515
2016-10-30 06:56:16 +00:00
Hongbin Zheng 72f9ed1807 [Polly] Remove the unused POLLY_LINK_LIBS for linking polly into
tools

    Differential Revision: https://reviews.llvm.org/D25861

llvm-svn: 285514
2016-10-30 06:07:59 +00:00
Teresa Johnson bf28c8fa45 [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm
Summary:
Instead of using the workaround of suppressing the entire index for
modules that call inline asm that may reference locals, use the
NoRename flag on the summary for any locals in the llvm.used set, and
add a reference edge from any functions containing inline asm.

This avoids issues from having no summaries despite the module defining
global values, which was preventing more aggressive index-based
optimization. It will be followed by a subsequent patch to make a
similar fix for local references in module level asm (to fix PR30610).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26121

llvm-svn: 285513
2016-10-30 05:40:44 +00:00
Teresa Johnson 3bc8abdffc [ThinLTO] Correctly resolve linkonce when importing aliasee
Summary:
When we have an aliasee that is linkonce, while we can't convert
the non-prevailing copies to available_externally, we still need to
convert the prevailing copy to weak. If a reference to the aliasee
is exported, not converting a copy to weak will result in undefined
references when the linkonce is removed in its original module.

Add a new test and update existing tests.

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26076

llvm-svn: 285512
2016-10-30 05:15:23 +00:00
NAKAMURA Takumi 720033ad19 clang/test/Driver/openmp-offload.c: Relax expressions if "ld.exe" exists, like mingw.
llvm-svn: 285511
2016-10-30 02:58:48 +00:00
Craig Topper bf9e5a16a4 [X86] Don't use loadv2i64 on SSE version of PMULHRSW. Use memopv2i64 instead.
This bug was introduced in r285501.

llvm-svn: 285510
2016-10-30 00:02:55 +00:00
NAKAMURA Takumi ff76cfefc0 NativeFormatting.cpp: Fix build for mingw. Where would writePadding() be?
llvm-svn: 285509
2016-10-29 23:14:18 +00:00
Teresa Johnson 38d4df714c [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)
Rename as suggested in code review for D26063.

llvm-svn: 285508
2016-10-29 21:52:23 +00:00
Teresa Johnson 1b9c2be8f4 [ThinLTO] Use NoPromote flag in summary during promotion
Summary:
Replace the check of whether a GV has a section with the flag check
in the summary. This is in preparation for using the NoPromote flag
to convey other situations when we can't promote (e.g. locals used in
inline asm).

Reviewers: mehdi_amini

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26063

llvm-svn: 285507
2016-10-29 21:31:48 +00:00
Peter Collingbourne 310474f576 IR: Remove a no longer needed assert.
This assert was checking for a miscompile in a version of GCC that
we no longer support.

llvm-svn: 285506
2016-10-29 20:57:12 +00:00
Craig Topper 4caf76bee2 [AVX-512] Remove 128/256-bit masked pmulhrsw/pmulhuw/pmulhw builtins and use unmasked builtins and select instead.
llvm-svn: 285505
2016-10-29 19:02:14 +00:00
Craig Topper 2eadf1b67e [AVX-512] Remove masked 128/256-bit sqrt builtins and replace them with unmasked builtins and a select.
llvm-svn: 285504
2016-10-29 19:02:10 +00:00
Craig Topper 09e94007be [AVX-512] Remove masked 128/256-bit pmuludq/pmuldq builtins and replace them with unmasked builtins and a select.
llvm-svn: 285503
2016-10-29 19:02:07 +00:00
Craig Topper 160ca8420d [AVX-512] Remove masked 128/256-bit floating point max/min builtins. Use unmasked builtins with select instead.
llvm-svn: 285502
2016-10-29 19:02:03 +00:00
Craig Topper defe9ffbb5 [X86] Use intrinsics table for VPMULHRSW intrincis so that the legacy intrinsics can select EVEX encoded instructions when available.
This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated.

llvm-svn: 285501
2016-10-29 18:41:45 +00:00
Richard Smith 2680bc9951 Factor finding of libc++ include path out of building -cc1 arguments.
llvm-svn: 285500
2016-10-29 17:28:48 +00:00
Sanjay Patel 36eeb6d6f6 [ValueTracking] recognize more variants of smin/smax
Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. 
There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding
smax pattern and commuted variants.

The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We
can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR.

Differential Revision: https://reviews.llvm.org/D26091

llvm-svn: 285499
2016-10-29 16:21:19 +00:00
Sanjay Patel e9fa95e572 [x86] add tests for smin/smax matchSelPattern (D26091)
llvm-svn: 285498
2016-10-29 16:02:57 +00:00
Piotr Padlewski 77cc962bce [Devirtualization] Decorate vfunction load with invariant.load
Summary:
This patch was introduced one year ago, but because my google account
was disabled, I didn't get email with failing buildbot and I missed
revert of this commit. There was small but in test regex.
I am back.

Reviewers: rsmith, rengolin

Subscribers: nlewycky, rjmccall, cfe-commits

Differential Revision: https://reviews.llvm.org/D26117

llvm-svn: 285497
2016-10-29 15:28:30 +00:00
Piotr Padlewski 2f8b97f3a6 NFC small format
llvm-svn: 285496
2016-10-29 15:28:25 +00:00
Sanjay Patel 978f827d12 [InstCombine] re-use bitcasted compare operands in selects (PR28001)
These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>.

The bitcasts obfuscate the expected min/max forms as shown in PR28001:
https://llvm.org/bugs/show_bug.cgi?id=28001#c6

Differential Revision: https://reviews.llvm.org/D25943

llvm-svn: 285495
2016-10-29 15:22:04 +00:00
Simon Pilgrim 75a697a17e [DAGCombiner] (REAPPLIED) Add vector demanded elements support to computeKnownBits
Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements.

This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.

The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used.

I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course.

DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit.

This looked like this had caused compile time regressions on some buildbots (and was reverted in rL285381), but appears to have just been a harmless bystander!

Differential Revision: https://reviews.llvm.org/D25691

llvm-svn: 285494
2016-10-29 11:29:39 +00:00
Michael Zuckerman 25eb420233 [X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min) intrinsics to Clang .
After LGTM and Check-all 

Vector-reduction arithmetic accepts vectors as inputs and produces 
scalars as outputs.This class of vector operation forms the basis 
of many scientific computations. In vector-reduction arithmetic, 
the evaluation off is independent of the order of the input elements of V.

Reviewer: 1. craig.topper 
          2. igorb

Differential Revision: https://reviews.llvm.org/D25988

llvm-svn: 285493
2016-10-29 10:29:20 +00:00
Elena Demikhovsky 519b4ccd70 Fixed FMA + FNEG combine.
Masked form of FMA should be omitted in this optimization.

Differential Revision: https://reviews.llvm.org/D25984

llvm-svn: 285492
2016-10-29 08:44:46 +00:00
Tobias Grosser ebb626e4b7 [ScopDetect] Use SCEVRewriteVisitor to simplify SCEVRemoveSMax rewriter
ScalarEvolution got at some pointer a SCEVRewriteVisitor. Use it to simplify
our SCEVRemoveSMax visitor.

llvm-svn: 285491
2016-10-29 06:19:34 +00:00
Matt Arsenault c88ba36eab AMDGPU: Use 1/2pi inline imm on VI
I'm guessing at how it is supposed to be printed

llvm-svn: 285490
2016-10-29 04:05:06 +00:00