llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	66b2fd1209	[AVX-512] Remove many of the masked 128/256-bit shift builtins and replace them with unmasked builtins and selects. llvm-svn: 285539	2016-10-31 04:30:51 +00:00
Eric Fiselier	ebcc86e469	Add 'inline' but not 'always_inline' to std::string's destructor. Adding both 'inline' and 'always_inline' to the destructor has been contentious. However, most of the performance benefits can be gained by only adding 'inline', and there is no reason to hold up that change while discussing the other. llvm-svn: 285538	2016-10-31 03:42:50 +00:00
Eric Fiselier	0f0a077c89	Remove additional function template definitions from the dylib llvm-svn: 285537	2016-10-31 03:40:29 +00:00
Sanjoy Das	fd080904b7	Make a test case more rigorous; NFC llvm-svn: 285536	2016-10-31 03:32:45 +00:00
Sanjoy Das	1707869db5	[SCEV] Try to order n-ary expressions in CompareValueComplexity llvm-svn: 285535	2016-10-31 03:32:43 +00:00
Sanjoy Das	3d6e3df5f9	[SCEV] Reduce boilerplate in unit tests llvm-svn: 285534	2016-10-31 03:32:39 +00:00
Artem Dergachev	e14d881808	[analyzer] NumberObjectConversion: support more types, misc updates. Support CFNumberRef and OSNumber objects, which may also be accidentally converted to plain integers or booleans. Enable explicit boolean casts by default in non-pedantic mode. Improve handling for warnings inside macros. Improve error messages. Differential Revision: https://reviews.llvm.org/D25731 llvm-svn: 285533	2016-10-31 03:08:48 +00:00
Eric Fiselier	a55333003d	Optimize filesystem::path by providing weaker exception guarantees. path uses string::append to construct, append, and concatenate paths. Unfortunately string::append has a strong exception safety guarantee, and if it can't prove that the iterator operations don't throw then it will allocate a temporary string copy to append to. However, this extra allocation and copy is very undesirable for path, which doesn't have the same exception guarantees. To work around this, this patch adds string::__append_forward_unsafe, which exposes the std::string::append interface for forward iterators without enforcing that the iterator is noexcept. llvm-svn: 285532	2016-10-31 02:46:25 +00:00
Eric Fiselier	7ca76565e7	Fix _LIBCPP_EXTERN_TEMPLATE_INLINE_VISIBILITY to always have default visibility. This prevents the symbols from being both externally available and hidden, which causes them to be linked incorrectly. This is only a problem when the address of the function is explicitly taken, since it will always be inlined otherwise. This patch fixes the issues that caused r285456 to be reverted, and can now be reapplied. llvm-svn: 285531	2016-10-31 02:07:23 +00:00
Eric Fiselier	ef915d3ef4	Improve performance of constructing filesystem::path from strings. This patch fixes a performance bug when constructing or appending to a path from a string or c-string. Previously we called 'push_back' to append every single character. This caused multiple reallocations and copies when at most one reallocation is necessary. The new behavior is to simply call `string::append` so it can correctly handle reallocation. For large strings this change is a ~4x improvement. This also makes our path faster to construct than libstdc++'s. llvm-svn: 285530	2016-10-30 23:53:50 +00:00
Sanjoy Das	299e67291c	[SCEV] In CompareValueComplexity, order global values by their name llvm-svn: 285529	2016-10-30 23:52:56 +00:00
Sanjoy Das	b4830a84b9	[SCEV] Use auto for consistency with an upcoming change; NFC llvm-svn: 285528	2016-10-30 23:52:53 +00:00
Sanjoy Das	b53021d71f	Clean up test a little bit; NFC llvm-svn: 285527	2016-10-30 23:52:50 +00:00
Eric Fiselier	1467a197e5	Rewrite std::filesystem::path iterators and parser. This patch entirely rewrites the parsing logic for paths. Unlike the previous implementation, this one stores information about the current state; for example, whether we are in a trailing separator or a root separator. This avoids the need for extra lookahead (and extra work) when incrementing or decrementing an iterator. Roughly this gives us a 15% speedup over the previous implementation. Unfortunately this implementation is still a lot slower than libstdc++'s. Because libstdc++ pre-parses and splits the path upon construction, their iterators are trivial to increment/decrement. This makes libc++'s lazy parsing 100x slower than libstdc++. However, libstdc++'s pre-parsing causes a ton of extra and unneeded allocations when constructing the string. For example `path("/foo/bar/")` would require at least 5 allocations with libstdc++ whereas libc++ uses only one. The non-allocating behavior is much preferable when you consider filesystem usages like 'exists("/foo/bar/")'. Even then libc++'s path seems to be twice as slow to simply construct compared to libstdc++. More investigation is needed about this. llvm-svn: 285526	2016-10-30 23:30:38 +00:00
Mehdi Amini	b2461ce33a	Fix clang installed path to handle case where clang is invoked through a symlink This code path is used when generating the path to libLTO.dylib, which is passed to the linker as `-lto_library'. Without this, if clang is invoked through a symlink, libLTO is searched in a path relative to where the symlink is instead of where clang is actually installed. Fix PR30811. Patch by: Jack Howarth Differential Revision: https://reviews.llvm.org/D26116 llvm-svn: 285525	2016-10-30 23:26:13 +00:00
Eric Fiselier	3aa5478e21	Add start of filesystem benchmarks llvm-svn: 285524	2016-10-30 22:53:00 +00:00
Eric Fiselier	2d4fbb7b0c	Mark thread exit test as unsupported w/o threads llvm-svn: 285523	2016-10-30 20:05:52 +00:00
Sanjay Patel	339a51ac13	[DAG] x | x --> x llvm-svn: 285522	2016-10-30 18:19:35 +00:00
Sanjay Patel	13aee345ca	[DAG] x & x --> x llvm-svn: 285521	2016-10-30 18:13:30 +00:00
Sanjay Patel	8a5f9810a0	[x86] add tests for basic logic op folds llvm-svn: 285520	2016-10-30 18:04:19 +00:00
Michael Zuckerman	d343697f1e	Fixing "type" issue for (epi32) and replacing hardcoded inf with clang builtin inf "__builtin_inff()" for float ({max|min}_{pd|ps}) llvm-svn: 285519	2016-10-30 14:54:05 +00:00
Dorit Nuzman	06903d16af	Revert r285517 due to build failures. llvm-svn: 285518	2016-10-30 14:34:57 +00:00
Dorit Nuzman	3c1c658f24	[LoopVectorize] Make interleaved-accesses analysis less conservative about possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517	2016-10-30 12:23:26 +00:00
Craig Topper	312ff9d19d	[AVX-512] Remove masked 128/256-bit builtins for vpmaddwd and vpmaddubsw. Replace with unmasked builtins and select. llvm-svn: 285516	2016-10-30 07:11:34 +00:00
Craig Topper	b7781a95fd	[X86] Use intrinsics table for PMADDUBSW and PMADDWD so that we can use the legacy intrinsics to select EVEX encoded instructions when available. This removes a couple tablegen classes that become unused after this change. Another class gained an additional parameter to allow PMADDUBSW to specify a different result type from its input type. llvm-svn: 285515	2016-10-30 06:56:16 +00:00
Hongbin Zheng	72f9ed1807	[Polly] Remove the unused POLLY_LINK_LIBS for linking polly into tools Differential Revision: https://reviews.llvm.org/D25861 llvm-svn: 285514	2016-10-30 06:07:59 +00:00
Teresa Johnson	bf28c8fa45	[ThinLTO] Use per-summary flag to prevent exporting locals used in inline asm Summary: Instead of using the workaround of suppressing the entire index for modules that call inline asm that may reference locals, use the NoRename flag on the summary for any locals in the llvm.used set, and add a reference edge from any functions containing inline asm. This avoids issues from having no summaries despite the module defining global values, which was preventing more aggressive index-based optimization. It will be followed by a subsequent patch to make a similar fix for local references in module level asm (to fix PR30610). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26121 llvm-svn: 285513	2016-10-30 05:40:44 +00:00
Teresa Johnson	3bc8abdffc	[ThinLTO] Correctly resolve linkonce when importing aliasee Summary: When we have an aliasee that is linkonce, while we can't convert the non-prevailing copies to available_externally, we still need to convert the prevailing copy to weak. If a reference to the aliasee is exported, not converting a copy to weak will result in undefined references when the linkonce is removed in its original module. Add a new test and update existing tests. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26076 llvm-svn: 285512	2016-10-30 05:15:23 +00:00
NAKAMURA Takumi	720033ad19	clang/test/Driver/openmp-offload.c: Relax expressions if "ld.exe" exists, like mingw. llvm-svn: 285511	2016-10-30 02:58:48 +00:00
Craig Topper	bf9e5a16a4	[X86] Don't use loadv2i64 on SSE version of PMULHRSW. Use memopv2i64 instead. This bug was introduced in r285501. llvm-svn: 285510	2016-10-30 00:02:55 +00:00
NAKAMURA Takumi	ff76cfefc0	NativeFormatting.cpp: Fix build for mingw. Where would writePadding() be? llvm-svn: 285509	2016-10-29 23:14:18 +00:00
Teresa Johnson	38d4df714c	[ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC) Rename as suggested in code review for D26063. llvm-svn: 285508	2016-10-29 21:52:23 +00:00
Teresa Johnson	1b9c2be8f4	[ThinLTO] Use NoPromote flag in summary during promotion Summary: Replace the check of whether a GV has a section with the flag check in the summary. This is in preparation for using the NoPromote flag to convey other situations when we can't promote (e.g. locals used in inline asm). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26063 llvm-svn: 285507	2016-10-29 21:31:48 +00:00
Peter Collingbourne	310474f576	IR: Remove a no longer needed assert. This assert was checking for a miscompile in a version of GCC that we no longer support. llvm-svn: 285506	2016-10-29 20:57:12 +00:00
Craig Topper	4caf76bee2	[AVX-512] Remove 128/256-bit masked pmulhrsw/pmulhuw/pmulhw builtins and use unmasked builtins and select instead. llvm-svn: 285505	2016-10-29 19:02:14 +00:00
Craig Topper	2eadf1b67e	[AVX-512] Remove masked 128/256-bit sqrt builtins and replace them with unmasked builtins and a select. llvm-svn: 285504	2016-10-29 19:02:10 +00:00
Craig Topper	09e94007be	[AVX-512] Remove masked 128/256-bit pmuludq/pmuldq builtins and replace them with unmasked builtins and a select. llvm-svn: 285503	2016-10-29 19:02:07 +00:00
Craig Topper	160ca8420d	[AVX-512] Remove masked 128/256-bit floating point max/min builtins. Use unmasked builtins with select instead. llvm-svn: 285502	2016-10-29 19:02:03 +00:00
Craig Topper	defe9ffbb5	[X86] Use intrinsics table for VPMULHRSW intrinsics so that the legacy intrinsics can select EVEX encoded instructions when available. This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated. llvm-svn: 285501	2016-10-29 18:41:45 +00:00
Richard Smith	2680bc9951	Factor finding of libc++ include path out of building -cc1 arguments. llvm-svn: 285500	2016-10-29 17:28:48 +00:00
Sanjay Patel	36eeb6d6f6	[ValueTracking] recognize more variants of smin/smax Try harder to detect obfuscated min/max patterns: the initial pattern was added with D9352 / rL236202. There was a bug fix for PR27137 at rL264996, but I think we can do better by folding the corresponding smax pattern and commuted variants. The codegen tests demonstrate the effect of ValueTracking on the backend via SelectionDAGBuilder. We can't expose these differences minimally in IR because we don't have smin/smax intrinsics for IR. Differential Revision: https://reviews.llvm.org/D26091 llvm-svn: 285499	2016-10-29 16:21:19 +00:00
Sanjay Patel	e9fa95e572	[x86] add tests for smin/smax matchSelPattern (D26091) llvm-svn: 285498	2016-10-29 16:02:57 +00:00
Piotr Padlewski	77cc962bce	[Devirtualization] Decorate vfunction load with invariant.load Summary: This patch was introduced one year ago, but because my google account was disabled, I didn't get the email about the failing buildbot and I missed the revert of this commit. There was a small bug in the test regex. I am back. Reviewers: rsmith, rengolin Subscribers: nlewycky, rjmccall, cfe-commits Differential Revision: https://reviews.llvm.org/D26117 llvm-svn: 285497	2016-10-29 15:28:30 +00:00
Piotr Padlewski	2f8b97f3a6	NFC small format llvm-svn: 285496	2016-10-29 15:28:25 +00:00
Sanjay Patel	978f827d12	[InstCombine] re-use bitcasted compare operands in selects (PR28001) These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495	2016-10-29 15:22:04 +00:00
Simon Pilgrim	75a697a17e	[DAGCombiner] (REAPPLIED) Add vector demanded elements support to computeKnownBits Currently computeKnownBits returns the common known zero/one bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original computeKnownBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. The approach was found to be easier than trying to add a per-element known bits solution, for a similar usefulness given the combines where computeKnownBits is typically used. I've only added support for a few opcodes so far (the ones that have proven straightforward to test), all others will default to demanding all elements but can be updated in due course. DemandedElts support could similarly be added to computeKnownBitsForTargetNode in a future commit. It looked like this had caused compile time regressions on some buildbots (and was reverted in rL285381), but it appears to have just been a harmless bystander! Differential Revision: https://reviews.llvm.org/D25691 llvm-svn: 285494	2016-10-29 11:29:39 +00:00
Michael Zuckerman	25eb420233	[X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min) intrinsics to Clang. After LGTM and Check-all Vector-reduction arithmetic accepts vectors as inputs and produces scalars as outputs. This class of vector operation forms the basis of many scientific computations. In vector-reduction arithmetic, the evaluation off is independent of the order of the input elements of V. Reviewer: 1. craig.topper 2. igorb Differential Revision: https://reviews.llvm.org/D25988 llvm-svn: 285493	2016-10-29 10:29:20 +00:00
Elena Demikhovsky	519b4ccd70	Fixed FMA + FNEG combine. Masked form of FMA should be omitted in this optimization. Differential Revision: https://reviews.llvm.org/D25984 llvm-svn: 285492	2016-10-29 08:44:46 +00:00
Tobias Grosser	ebb626e4b7	[ScopDetect] Use SCEVRewriteVisitor to simplify SCEVRemoveSMax rewriter ScalarEvolution got at some point a SCEVRewriteVisitor. Use it to simplify our SCEVRemoveSMax visitor. llvm-svn: 285491	2016-10-29 06:19:34 +00:00
Matt Arsenault	c88ba36eab	AMDGPU: Use 1/2pi inline imm on VI I'm guessing at how it is supposed to be printed llvm-svn: 285490	2016-10-29 04:05:06 +00:00


245950 Commits All Branches Search