Commit Graph

224845 Commits

Author SHA1 Message Date
Saleem Abdulrasool 8b30f9854e ARM: correct __builtin_longjmp on WoA
WoA uses r11 as the FP even though it is a pure thumb-2 environment in contrast
to AAPCS which states r7.  This adjusts __builtin_longjmp to not clobber r7 and
to properly restore the frame pointer on execution.

llvm-svn: 263118
2016-03-10 15:11:09 +00:00
Simon Pilgrim 05e07836c8 Updated SSE3 builtin tests to more closely match the llvm fast-isel equivalent tests
llvm-svn: 263117
2016-03-10 14:46:49 +00:00
Simon Pilgrim 99665cb77f Added note to SSE4a builtins about keeping in sync with llvm tests
llvm-svn: 263116
2016-03-10 14:44:32 +00:00
Simon Pilgrim baed60dd0a Updated SSSE3 builtin tests to more closely match the llvm fast-isel equivalent tests
llvm-svn: 263115
2016-03-10 14:42:17 +00:00
Chandler Carruth cf3f4f25ca [CG] Back out my pointless move ctor and add the explicit template
instantiation needed for the mingw dll build bot.

llvm-svn: 263114
2016-03-10 14:33:10 +00:00
Simon Pilgrim 0d1e3fff2b Minor Wdocumentation fix. NFCI.
llvm-svn: 263113
2016-03-10 14:16:36 +00:00
Chandler Carruth d94a5962cc [SROA] Clean up some really weird code, no functionality changed.
We already have the instruction extracted into 'I', just cast that to
a store the way we do for loads. Also, we don't enter the if unless SI
is non-null, so don't test it again for null.

I'm pretty sure the entire test there can be nuked, but this is just the
trivial cleanup.

llvm-svn: 263112
2016-03-10 14:16:18 +00:00
Elena Demikhovsky cd9967d160 AVX-512: Fixed a bug in i1 vector zero extending. (Skylake-avx512)
(failed on instruction selection phase)

Differential Revision: http://reviews.llvm.org/D17924

llvm-svn: 263111
2016-03-10 13:44:22 +00:00
Chandler Carruth 3d1506ed37 [CG] Try adding an explicit move constructor to see if that helps the
one build bot that is crashing on this code.

llvm-svn: 263110
2016-03-10 13:43:06 +00:00
Aaron Ballman 10670b4f1c Correcting an attribute documentation generation error by giving the abi_tag attribute a documentation category.
llvm-svn: 263109
2016-03-10 13:08:22 +00:00
Valery Pykhtin a4db224d54 [AMDGPU] Fix SMEM instructions encoding/operand namings
Differential Revision: http://reviews.llvm.org/D17651

llvm-svn: 263108
2016-03-10 13:06:08 +00:00
Ewan Crawford 7648dd375f Revert "Track expression language from one place in ClangExpressionParser"
r263099 seems to have broken some OSX tests

llvm-svn: 263107
2016-03-10 12:38:55 +00:00
Filipe Cabecinhas 721447c873 [test/asan/closed-fds] Properly quote log_path for shell invocation.
llvm-svn: 263106
2016-03-10 11:51:59 +00:00
Simon Pilgrim 13d4056795 [X86][AVX] Improve target shuffle combining of BLEND+zero
The BLEND+zero combine was failing to combine equivalent BLEND masks.

Follow up to D17483 and D17858

llvm-svn: 263105
2016-03-10 11:50:15 +00:00
Chandler Carruth 4c660f7087 [CG] Add a new pass manager printer pass for the old call graph and
actually finish wiring up the old call graph.

There were bugs in the old call graph that hadn't been caught because it
wasn't being tested. It wasn't being tested because it wasn't in the
pipeline system and we didn't have a printing pass to run in tests. This
fixes all of that.

As for why I'm still keeping the old call graph alive its so that I can
port GlobalsAA to the new pass manager with out forking it to work with
the lazy call graph. That's clearly the right eventual design, but it
seems pragmatic to defer that until its necessary. The old call graph
works just fine for GlobalsAA.

llvm-svn: 263104
2016-03-10 11:24:11 +00:00
Chandler Carruth b95def7491 [LCG] Spell the printing pass pipeline name for the lazy call graph
'lcg' instead of just 'cg'.

This makes it consistent with the analysis name of 'lcg'.

No functionality changed.

llvm-svn: 263103
2016-03-10 11:24:06 +00:00
Simon Pilgrim 16d11785a5 [X86][SSE] Basic combining of unary target shuffles of binary target shuffles.
This patch reorders the combining of target shuffle masks so that when a unary shuffle takes a binary shuffle as its input but only references one of its inputs it can correctly combine into a unary shuffle mask.

This is starting to encroach on the purpose of resolveTargetShuffleInputs, but I don't want to remove it until we definitely know we won't need it for full binary shuffle combining.

There is a lot more work before we can properly support binary target shuffle masks but this was an easy case to add support for.

Differential Revision: http://reviews.llvm.org/D17858

llvm-svn: 263102
2016-03-10 11:23:51 +00:00
Chandler Carruth 1ecd740cf0 [CG] Actually hoist up the generic CallGraphPrinter pass from a weird
location in the opt tool to live along side the analysis in LLVM's
libraries.

No functionality changed here, but this will allow me to port the
printer to the new pass manager as well.

llvm-svn: 263101
2016-03-10 11:08:44 +00:00
Chandler Carruth 5f432292a6 [CG] Rename the DOT printing pass to actually reference "DOT".
There is another pass by the generic name 'CallGraphPrinter' which is
actually just a call graph printer tucked away inside the opt tool. I'd
like to bring it out and make it follow the same patterns as the rest of
the CallGraph code, but doing so would end up conflicting with the name
of the DOT printing pass. So this makes the DOT printing pass name be
more precise.

No functionality changed here.

llvm-svn: 263100
2016-03-10 11:04:40 +00:00
Ewan Crawford 6dc9db5244 Track expression language from one place in ClangExpressionParser
The current expression language is currently tracked in a few places within the ClangExpressionParser constructor. 
This patch adds a private lldb::LanguageType attribute to the ClangExpressionParser class and tracks the expression language from that one place.

Author: Luke Drummond <luke.drummond@codeplay.com>
Differential Revision: http://reviews.llvm.org/D17719

llvm-svn: 263099
2016-03-10 10:31:08 +00:00
Ekaterina Romanova e2961f71d2 Add doxygen comments to xmmintrin.h's intrinsics.
Only half of the intrinsics in this file is documented here. The patch for the other half will be sent out later.

The doxygen comments are automatically generated based on Sony's intrinsics document.

I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.

llvm-svn: 263098
2016-03-10 09:37:04 +00:00
Elena Demikhovsky 38f78a2b92 AVX-512: Fixed a bug in shuffle for v64i8 type
Operation SCALAR_TO_VECTOR for v64i8 and v32i16 should be lowered if BW feature is "on".

Differential Revision: http://reviews.llvm.org/D17994

llvm-svn: 263097
2016-03-10 08:32:09 +00:00
Vedant Kumar ae22c58737 [opt] Fix description of the -disable-verify flag
llvm-svn: 263096
2016-03-10 06:58:53 +00:00
Mark Lacey 125bb29c65 Add an LLVM_BUILTIN_DEBUGTRAP macro.
Summary:
This provides a macro that expands to __builtin_debugtrap() for clang,
and __debugbreak() for MSVC.

It intentionally expands to nothing for compilers that do not support a
similar mechanism that halts the debugger without otherwise crashing the
process.

Differential Revision: http://reviews.llvm.org/D18002

llvm-svn: 263095
2016-03-10 05:15:03 +00:00
Sean Silva 03e41ee6a7 [lto] Initialize asmparsers.
Summary:
They are needed for inline asm during LTO.

In particular we hit the report_fatal_error on
llvm/lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp:138

    LLVM ERROR: Inline asm not supported by this streamer because we don't have an asm parser for this target

Reviewers: ruiu, rafael

Subscribers: Bigcheese, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D18027

llvm-svn: 263094
2016-03-10 04:58:52 +00:00
Tim Northover e5dc94ee31 ARM: fix arm_neon_intrinsics.c and re-enable.
It turns out I'd never actually tested my recent change because it was
gated on long-tests. Failure ensued.

llvm-svn: 263093
2016-03-10 04:39:45 +00:00
Roman Levenstein 2792b3f02f Add support for a preserve_most calling convention to the AArch64 backend.
This change adds a support for a preserve_most calling convention to the AArch64 backend, similar to how it was done for X86-64.

There is also a subsequent patch on top of this one to add a tail-calls support for this calling convention.

Differential Revision: http://reviews.llvm.org/D18016

llvm-svn: 263092
2016-03-10 04:35:09 +00:00
Richard Trieu 9402d58e5b Disable failing test and fix RUN line.
See https://llvm.org/bugs/show_bug.cgi?id=26894 for details.  This change
fixes the incorrect flags to Clang and the piping issue.  It also disables
the FileCheck portion of the test, which is currently failing.

llvm-svn: 263091
2016-03-10 04:04:12 +00:00
Vedant Kumar 37a1d6207f [opt] Only create Verifier passes when requested
opt adds Verifier passes in AddOptimizationPasses even if
-disable-verify is on. Fix it so that the extra verification occurs
either when (1) -disable-verifier is off, or (2) -verify-each is on.

Thanks to David Jones for pointing out this behavior!

llvm-svn: 263090
2016-03-10 03:40:14 +00:00
Michael Zolotukhin b88fbe08fc [SLP] Add -slp-min-reg-size command line option.
MinVecRegSize is currently hardcoded to 128; this patch adds a cl::opt
to allow changing it. I tried not to change any existing behavior for the default
case.

Differential revision: http://reviews.llvm.org/D13278

llvm-svn: 263089
2016-03-10 02:49:47 +00:00
Mehdi Amini 237e606a42 Add an entry in the Release Notes for LLVMContext::discardValueNames()
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263088
2016-03-10 02:18:17 +00:00
Steven Wu 92910f69a0 Fix false positives for for-loop-analysis warning
Summary:
For PseudoObjectExpr, the DeclMatcher need to search only all the semantics
but also need to search pass OpaqueValueExpr for all potential uses for the
Decl.

Reviewers: thakis, rtrieu, rjmccall, doug.gregor

Subscribers: xazax.hun, rjmccall, doug.gregor, cfe-commits

Differential Revision: http://reviews.llvm.org/D17627

llvm-svn: 263087
2016-03-10 02:02:48 +00:00
Mehdi Amini 09b4a8daa3 Add a flag to the LLVMContext to disable name for Value other than GlobalValue
Summary:
This is intended to be a performance flag, on the same level as clang
cc1 option "--disable-free". LLVM will never initialize it by default,
it will be up to the client creating the LLVMContext to request this
behavior. Clang will do it by default in Release build (just like
--disable-free).

"opt" and "llc" can opt-in using -disable-named-value command line
option.

When performing LTO on llvm-tblgen, the initial merging of IR peaks
at 92MB without this patch, and 86MB after this patch,setNameImpl()
drops from 6.5MB to 0.5MB.
The total link time goes from ~29.5s to ~27.8s.

Compared to a compile-time flag (like the IRBuilder one), it performs
very close. I profiled on SROA and obtain these results:

 420ms with IRBuilder that preserve name
 372ms with IRBuilder that strip name
 375ms with IRBuilder that preserve name, and a runtime flag to strip

Reviewers: chandlerc, dexonsmith, bogner

Subscribers: joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D17946

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 263086
2016-03-10 01:28:54 +00:00
Siva Chandra aaae5f87af [DWARFASTParserClang] Start with member offset of 0 for members of union types.
Summary:
GCC does not emit DW_AT_data_member_location for members of a union.
Starting with a 0 value for member locations helps is reading union types
in such cases.

Reviewers: clayborg

Subscribers: ldrumm, lldb-commits

Differential Revision: http://reviews.llvm.org/D18008

llvm-svn: 263085
2016-03-10 01:15:17 +00:00
Chandler Carruth 7776377e62 [gvn] Fix more indenting and formatting in regions of code that will
need to be changed for porting to the new pass manager.

Also sink the comment on the ValueTable class back to that class instead
of it dangling on an anonymous namespace.

No functionality changed.

llvm-svn: 263084
2016-03-10 00:58:20 +00:00
Chandler Carruth 169c84f1cc [gvn] Reformat a chunk of the GVN code that is strangely indented prior
to restructuring it for porting to the new pass manager.

No functionality changed.

llvm-svn: 263083
2016-03-10 00:58:18 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Alexey Samsonov ae81bbb496 EmitCXXStructorCall -> EmitCXXDestructorCall. NFC.
This function is only used in Microsoft ABI and only to emit
destructors. Rename/simplify it accordingly.

llvm-svn: 263081
2016-03-10 00:20:37 +00:00
Alexey Samsonov efa956cea0 Remove unused function arguments. NFC.
llvm-svn: 263080
2016-03-10 00:20:33 +00:00
Enrico Granata 391075c38e Certain hardware architectures have registers of 256 bits in size
This patch extends Scalar such that it can support data living in such registers (e.g. float values living in the XMM registers)

llvm-svn: 263079
2016-03-10 00:14:29 +00:00
Zachary Turner 7e8c7bea79 Fix SymbolFilePDB for discontiguous functions.
Previously line table parsing code assumed that the only gaps would
occur at the end of functions.  In practice this isn't true, so this
patch makes the line table parsing more robust in the face of
functions with non-contiguous byte arrangements.

llvm-svn: 263078
2016-03-10 00:06:26 +00:00
Alexey Samsonov c1424fc7c8 sanitizer: Fix endianness checks for gcc
Summary:
__BIG_ENDIAN__ and __LITTLE_ENDIAN__ are not supported by gcc, which
eg. for ubsan Value::getFloatValue will silently fall through to
the little endian branch, breaking display of float values by ubsan.
Use __BYTE_ORDER__ == __ORDER_BIG/LITTLE_ENDIAN__ as the condition
instead, which is supported by both clang and gcc.

Noticed while porting ubsan to s390x.

Patch by Marcin Kościelnicki!

Differential Revision: http://reviews.llvm.org/D17660

llvm-svn: 263077
2016-03-09 23:39:40 +00:00
Ben Langmuir 3c4b1290a4 [Modules] Add stdatomic to the list of builtin headers
Since it's provided by the compiler. This allows a system module map
file to declare a module for it.

No test change for cstd.m, since stdatomic.h doesn't function without a
relatively complete stdint.h and stddef.h, which tests using this module
don't provide.

rdar://problem/24931246

llvm-svn: 263076
2016-03-09 23:31:34 +00:00
Philip Reames d9f4a3d18c [BasicAA/MDA] Sink aliasing rules for malloc and calloc into BasicAA
MemoryDependenceAnalysis had a hard-coded exception to the general aliasing rules for malloc and calloc. The reasoning that applied there is equally valid in BasicAA and clarifies the remaining logic in MDA.

In principal, this can expose slightly more optimization opportunities, but since essentially all of our aliasing aware memory optimization passes go through MDA, this will likely be NFC in practice.

Differential Revision: http://reviews.llvm.org/D15912

llvm-svn: 263075
2016-03-09 23:19:56 +00:00
Philip Reames ac115ed72f [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode
This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.

In general, duplicating code into cold blocks may result in code growth, but should not effect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirely of the fast path.

This patch only handles addressing computations, but in principal, we could implement a more general cold cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.

Differential Revision: http://reviews.llvm.org/D17652

llvm-svn: 263074
2016-03-09 23:13:12 +00:00
Philip Reames e0a5454df4 Fix the build
I screwed up rebasing 263072.  This change fixes the build and passes all make check.

llvm-svn: 263073
2016-03-09 23:07:53 +00:00
Philip Reames b54c8e6eea [LICM] Store promotion when memory is thread local
This patch teaches LICM's implementation of store promotion to exploit the fact that the memory location being accessed might be provable thread local. The fact it's thread local weakens the requirements for where we can insert stores since no other thread can observe the write. This allows us perform store promotion even in cases where the store is not guaranteed to execute in the loop.

Two key assumption worth drawing out is that this assumes a) no-capture is strong enough to imply no-escape, and b) standard allocation functions like malloc, calloc, and operator new return values which can be assumed not to have previously escaped.

In future work, it would be nice to generalize this so that it works without directly seeing the allocation site. I believe that the nocapture return attribute should be suitable for this purpose, but haven't investigated carefully. It's also likely that we could support unescaped allocas with similar reasoning, but since SROA and Mem2Reg should destroy those, they're less interesting than they first might seem.

Differential Revision: http://reviews.llvm.org/D16783

llvm-svn: 263072
2016-03-09 22:59:30 +00:00
Sean Silva 1c82b7f64d Use %t as the output file name to avoid repetition. NFC.
llvm-svn: 263071
2016-03-09 22:30:09 +00:00
Sean Silva 4aaeac6ad3 [lto] Add saving the LTO .o file to -save-temps.
Summary:
This implements another part of -save-temps.
After this, the only remaining part is dumping the optimized bitcode. But
currently LLD's LTO doesn't have a non-intrusive place to put this.
Eventually we probably will and it will make sense to add it then.

Reviewers: ruiu, rafael

Subscribers: joker.eph, Bigcheese, llvm-commits

Differential Revision: http://reviews.llvm.org/D18009

llvm-svn: 263070
2016-03-09 22:30:05 +00:00
Sanjay Patel 9f6c4d50b4 [x86] fix cost model inaccuracy for vector memory ops
The irony of this patch is that one CPU that is affected is AMD Jaguar, and Jaguar
has a completely double-pumped AVX implementation. But getting the cost model to
reflect that is a much bigger problem. The small goal here is simply to improve on
the lie that !AVX2 == SandyBridge.

Differential Revision: http://reviews.llvm.org/D18000

llvm-svn: 263069
2016-03-09 22:23:33 +00:00