llvm-project

Commit Graph

Author	SHA1	Message	Date
JF Bastien	65f0a71f40	WebAssembly: test global array indexing This case was tested in the linker from code, but not from globals indexing into other globals. The linker currently barfs on this, ncbray volunteered to fix it. llvm-svn: 255601	2015-12-15 02:02:51 +00:00
Mehdi Amini	1c131b37ed	Instcombine: destructure loads of structs that do not contain padding For non padded structs, we can just proceed and deaggregate them. We don't want to do this when there is padding in the struct as to not lose information about this padding (the subsequent passes would then try hard to preserve the padding, which is undesirable). Also update extractvalue.ll and cast.ll so that they use structs with padding. Remove the FIXME in the extractvalue of load case as the non padded case is handled when processing the load, and we don't want to do it on the padded case. Patch by: Amaury SECHET <deadalnix@gmail.com> Differential Revision: http://reviews.llvm.org/D14483 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255600	2015-12-15 01:44:07 +00:00
Reid Kleckner	7c0c0c0501	[llvm-readobj] s/FunctionName/LinkageName/ for codeview dumping The symbol being printed in this field comes from the main symbol table, not 0xF1 subsection. Use LinkageName to make that a lot clearer. llvm-svn: 255596	2015-12-15 01:23:55 +00:00
Cong Hou	750c9e0457	Let operator/ with uint32_t rhs operand be a member of BranchProbability and add a new operator /=. NFC. llvm-svn: 255595	2015-12-15 01:21:14 +00:00
Mehdi Amini	33a7ea4b9a	Add a C++11 ThreadPool implementation in LLVM This is a very simple implementation of a thread pool using C++11 thread. It accepts any std::function<void()> for asynchronous execution. Individual tasks can be synchronized using the returned future, or the client can block on the full queue completion. In case LLVM is configured with Threading disabled, it falls back to sequential execution using std::async with launch:deferred. This is intended to support parallelism for ThinLTO processing in linker plugin, but is generic enough for any other uses. This is a recommit of r255444 ; trying to work around a bug in the MSVC 2013 standard library. I think I was hit by: http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable Recommit of r255589, trying to please g++ as well. Differential Revision: http://reviews.llvm.org/D15464 From: mehdi_amini <mehdi_amini@91177308-0d34-0410-b5e6-96231b3b80d8> llvm-svn: 255593	2015-12-15 00:59:19 +00:00
Mehdi Amini	2bc6a5ad84	Revert "Add a C++11 ThreadPool implementation in LLVM" This reverts commit r255589. Breaks g++ From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255591	2015-12-15 00:42:44 +00:00
Mehdi Amini	ef0ef2860d	Add a C++11 ThreadPool implementation in LLVM This is a very simple implementation of a thread pool using C++11 thread. It accepts any std::function<void()> for asynchronous execution. Individual tasks can be synchronized using the returned future, or the client can block on the full queue completion. In case LLVM is configured with Threading disabled, it falls back to sequential execution using std::async with launch:deferred. This is intended to support parallelism for ThinLTO processing in linker plugin, but is generic enough for any other uses. This is a recommit of r255444 ; trying to work around a bug in the MSVC 2013 standard library. I think I was hit by: http://connect.microsoft.com/VisualStudio/feedbackdetail/view/791185/std-packaged-task-t-where-t-is-void-or-a-reference-class-are-not-movable Differential Revision: http://reviews.llvm.org/D15464 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255589	2015-12-15 00:38:05 +00:00
Xinliang David Li	c7018a25c6	[PGO] make profile prefix even shorter and more readable llvm-svn: 255586	2015-12-15 00:32:56 +00:00
Quentin Colombet	25b43f3624	[X86] Add relaxation logic for SBB instructions. Prior to this patch, we would wrongly stick to the variant with imm8 encoding even when the relocation could not fit that size. rdar://problem/23785506 llvm-svn: 255583	2015-12-15 00:09:23 +00:00
Mike Aizatsky	c13e9633f4	sancov: coverage can be reported by multiple functions. Differential Revision: http://reviews.llvm.org/D15430 llvm-svn: 255582	2015-12-14 23:55:04 +00:00
Rafael Espindola	d4937b4c7d	Yet another missing include. llvm-svn: 255579	2015-12-14 23:39:05 +00:00
Rafael Espindola	19ed1951f6	A better attempt to add a missing include llvm-svn: 255578	2015-12-14 23:34:35 +00:00
Rafael Espindola	42d04b4e29	Trying to fix the build in a bot. llvm-svn: 255577	2015-12-14 23:31:08 +00:00
Xinliang David Li	0812747979	[PGO] Shorten profile symbol prefixes Profile symbols have long prefixes which waste space and create pressure for the linker. This patch shortens the prefixes to minimal length without losing verbosity. Differential Revision: http://reviews.llvm.org/D15503 llvm-svn: 255575	2015-12-14 23:26:27 +00:00
Justin Bogner	6291b587b6	LoopRotate: Convert the methods of LoopRotate to utility functions. NFC This moves the actual work to do loop rotation into standalone functions with the analysis results they need passed in as arguments, leaving the class itself as a relatively simple shim. This will make the functions easy to reuse when we're ready to port this transformation to the new pass manager. llvm-svn: 255574	2015-12-14 23:22:48 +00:00
Justin Bogner	a730045156	LoopRotate: Reorder some method implementations. NFC This just moves some callers after their callees. My next patch will convert some of these methods to stand alone functions, and that diff is more obviously NFC if I move these first. That change, in turn, will make it much easier to port this pass to the new pass manager once the loop pass manager is in place. llvm-svn: 255573	2015-12-14 23:22:44 +00:00
Rafael Espindola	9d2bfc4874	Use diagnostic handler in the LLVMContext This patch converts code that has access to a LLVMContext to not take a diagnostic handler. This has a few advantages * It is easier to use a consistent diagnostic handler in a single program. * Less clutter since we are not passing a handler around. It does make it a bit awkward to implement some C APIs that return a diagnostic string. I will propose new versions of these APIs and deprecate the current ones. llvm-svn: 255571	2015-12-14 23:17:03 +00:00
Quentin Colombet	2cb8a51c1f	[X86] Add relaxation logic for ADC instructions. Prior to this patch, we would wrongly stick to the variant with imm8 encoding even when the relocation could not fit that size. rdar://problem/23785506 llvm-svn: 255570	2015-12-14 23:12:40 +00:00
Pete Cooper	6a5def90d9	Factor out some duplication. NFC. llvm-svn: 255569	2015-12-14 23:10:52 +00:00
Dan Gohman	c7c0445443	[WebAssembly] Add type prefixes to call instructions Add return type information to call and call_indirect instructions. This allows them to be disambiguated without knowledge of the callee. Differential Revision: http://reviews.llvm.org/D15484 llvm-svn: 255565	2015-12-14 22:56:51 +00:00
Dan Gohman	8fe7e86bf5	[WebAssembly] Implement a new algorithm for placing BLOCK markers Implement a new BLOCK scope placement algorithm which better handles early-return blocks and early exits from nested scopes. Differential Revision: http://reviews.llvm.org/D15368 llvm-svn: 255564	2015-12-14 22:51:54 +00:00
Dan Gohman	a712a6c4ce	[WebAssembly] Avoid adding redundant EXPR_STACK uses. llvm-svn: 255563	2015-12-14 22:37:23 +00:00
Reid Kleckner	db9a91e324	Revert "Don't create unnecessary PHIs" This reverts commit r255489. It causes test failures in Chromium and does not appear to respect the AlternativeV parameter. llvm-svn: 255562	2015-12-14 22:36:57 +00:00
Chih-Hung Hsieh	7993e18e80	[X86] Part 2 to fix x86-64 fp128 calling convention. Part 1 was submitted in http://reviews.llvm.org/D15134. Changes in this part: * X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class. * X86CallingConv.td: Pass f128 values in XMM registers or on stack. * X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td: Add instruction selection patterns for f128. * X86ISelLowering.cpp: When target has MMX registers, configure MVT::f128 in FR128RegClass, with TypeSoftenFloat action, and custom actions for some opcodes. Add missed cases of MVT::f128 in places that handle f32, f64, or vector types. Add TODO comment to support f128 type in inline assembly code. * SelectionDAGBuilder.cpp: Fix infinite loop when f128 type can have VT == TLI.getTypeToTransformTo(Ctx, VT). * Add unit tests for x86-64 fp128 type. Differential Revision: http://reviews.llvm.org/D11438 llvm-svn: 255558	2015-12-14 22:08:36 +00:00
Sanjay Patel	fa54acedd1	add fast-math-flags to 'call' instructions (PR21290) This patch adds optional fast-math-flags (the same that apply to fmul/fadd/fsub/fdiv/frem/fcmp) to call instructions in IR. Follow-up patches would use these flags in LibCallSimplifier, add support to clang, and extend FMF to the DAG for calls. Motivating example: %y = fmul fast float %x, %x %z = tail call float @sqrtf(float %y) We'd like to be able to optimize sqrt(x*x) into fabs(x). We do this today using a function-wide attribute for unsafe-math, but we really want to trigger on the instructions themselves: %z = tail call fast float @sqrtf(float %y) because in an LTO build it's possible that calls with fast semantics have been inlined into a function with non-fast semantics. The code changes and tests are based on the recent commits that added "notail": http://reviews.llvm.org/rL252368 and added FMF to fcmp: http://reviews.llvm.org/rL241901 Differential Revision: http://reviews.llvm.org/D14707 llvm-svn: 255555	2015-12-14 21:59:03 +00:00
Ben Craig	46642ffeeb	Reordering fields to reduce padding in LLVM. NFC llvm-svn: 255554	2015-12-14 21:57:05 +00:00
Dan Gohman	87b4aa8914	[WebAssembly] Add an assert to sanity-check dead flags. The WebAssemblyStoreResults pass runs before LiveVariables, so it doesn't expect to have to keep dead flags up to date; check this with an assert. llvm-svn: 255551	2015-12-14 21:53:54 +00:00
Pete Cooper	52abc5f971	Start implementing FDE dumping when printing the eh_frame. This code adds some simple decoding of the FDE's in an eh_frame. There's still more to be done in terms of error handling and verification. Also, we need to be able to decode the CFI's. llvm-svn: 255550	2015-12-14 21:49:49 +00:00
Pete Cooper	23bfa7e925	Print the eh_frame section in MachoDump. This is the start of work to dump the contents of the eh_frame section. It currently emits CIE entries. FDE entries will come later. It also needs improved error checking which will follow soon. http://reviews.llvm.org/D15502 Reviewed by Kevin Enderby and Lang Hames. llvm-svn: 255546	2015-12-14 21:39:27 +00:00
Krzysztof Parzyszek	5e6f2bd0cb	[Hexagon] Add "const" to function parameters in HexagonInstrInfo llvm-svn: 255544	2015-12-14 21:32:25 +00:00
Diego Novillo	d3babdbc8b	Fix formatting. NFC. llvm-svn: 255541	2015-12-14 20:37:15 +00:00
Krzysztof Parzyszek	dac7102874	[Packetizer] Add AliasAnalysis as a parameter to the packetizer This will make the dependence graph more accurate if an alias analysis is provided. If nullptr is specified in its place, the behavior will remain as it is currently. llvm-svn: 255540	2015-12-14 20:35:13 +00:00
Pete Cooper	1d07869c29	Add missing vtable anchors. The following description is from http://reviews.llvm.org/D15481: ICmpInst, GetElementPtrInst and PHINode have no anchor functions. This causes the vtable and the type info (if RTTI is enabled in user code) to be emitted in multiple translation units. Before 3.7, the destructors were the key functions for these nodes, but they have been removed. There have been discussions about this here: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089010.html and here: http://lists.llvm.org/pipermail/llvm-dev/2015-December/092921.html. Patch by Visoiu Mistrih Francis llvm-svn: 255538	2015-12-14 20:29:16 +00:00
Krzysztof Parzyszek	8d73bdb316	[Packetizer] Make endPacket virtual This will allow custom handling of packet finalization. The current definition of endPacket will still perform the default finalization. llvm-svn: 255537	2015-12-14 20:12:24 +00:00
David Majnemer	59be1d653a	[ConstantFold] Fix bitcast to gep constant folding transform. Make sure to check that the destination type is sized. A check was present but was incorrectly checking the source type instead. Patch by Amaury SECHET! Differential Revision: http://reviews.llvm.org/D15264 llvm-svn: 255536	2015-12-14 19:30:32 +00:00
Yaron Keren	45ea8fa1f4	Save several std::string constructions using llvm::Twine. llvm-svn: 255535	2015-12-14 19:28:40 +00:00
Peter Collingbourne	45cd0c3264	docs: Correct wording in LangRef relating to available_externally linkage. Differential Revision: http://reviews.llvm.org/D15343 llvm-svn: 255534	2015-12-14 19:22:37 +00:00
Cong Hou	c5f510bc23	Remove the successor probabilities normalization in tail duplication pass. The normalization may cause assertion failures on SystemZ and some out-of-tree tests. The root cause is that unknown probabilities are materialized into known ones by calling getSuccProbability(), which is then used to add another successor to the same MBB which results in mixed known and unknown probabilities. But currently those mixed probabilities cannot be normalized. I will compose another patch to fix the root issue. llvm-svn: 255530	2015-12-14 19:11:54 +00:00
Sanjoy Das	adfec011e1	[MergeFunctions] Use II instead of CI for InvokeInst; NFC Using `CI` is slightly misleading. llvm-svn: 255529	2015-12-14 19:11:45 +00:00
Sanjoy Das	2a74eb0000	Teach MergeFunctions about operand bundles llvm-svn: 255528	2015-12-14 19:11:40 +00:00
Sanjoy Das	2de4d0aa18	Teach haveSameSpecialState about operand bundles llvm-svn: 255527	2015-12-14 19:11:35 +00:00
Krzysztof Parzyszek	d44a1fd506	Add "const" to function arguments in DFAPacketizer llvm-svn: 255526	2015-12-14 18:54:44 +00:00
Xinliang David Li	e3bf4fd394	[PGO] Value profiling text format reader/writer support This patch adds the missing functionality in parsable text format support for value profiling. Differential Revision: http://reviews.llvm.org/D15212 llvm-svn: 255523	2015-12-14 18:44:01 +00:00
David Majnemer	bbfc7219ef	[IR] Remove terminatepad It turns out that terminatepad gives little benefit over a cleanuppad which calls the termination function. This is not sufficient to implement fully generic filters but MSVC doesn't support them which makes terminatepad a little over-designed. Depends on D15478. Differential Revision: http://reviews.llvm.org/D15479 llvm-svn: 255522	2015-12-14 18:34:23 +00:00
Paul Robinson	accc3e0376	FastISel needs to remove dead code when it bails out. When FastISel fails to translate an instruction it hands off code generation to SelectionDAG. Before it does so, it may have generated local value instructions to feed phi nodes in successor blocks. These instructions will then be generated again by SelectionDAG, causing duplication and less efficient code, including extra spill instructions. Patch by Wolfgang Pieb! Differential Revision: http://reviews.llvm.org/D11768 llvm-svn: 255520	2015-12-14 18:33:18 +00:00
Petar Jovanovic	280f7101e8	[Power PC] llvm soft float support for ppc32 This is the second in a set of patches for soft float support for ppc32, it enables soft float operations. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D13700 llvm-svn: 255516	2015-12-14 17:57:33 +00:00
Matt Arsenault	d079285e05	AMDGPU: Use generic bitreverse intrinsic Also fix bug in vector legalization for bitreverse. llvm-svn: 255512	2015-12-14 17:25:38 +00:00
Sanjay Patel	af674fbfd9	getParent() ^ 3 == getModule() ; NFCI llvm-svn: 255511	2015-12-14 17:24:23 +00:00
Geoff Berry	8f5acb1bd1	Remove dead function AArch64TargetLowering::getFunctionAlignment. NFC. Reviewers: t.p.northover, jmolloy, mcrosier Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D15458 llvm-svn: 255509	2015-12-14 17:01:10 +00:00
Matt Arsenault	52a52a564b	AMDGPU: Fix splitting vector loads with existing offsets If the original MMO had an offset, it was dropped. Also use the correct alignment after adding the new offset. llvm-svn: 255508	2015-12-14 16:59:40 +00:00
Sanjay Patel	f727e387be	[InstCombine] fold trunc ([lshr] (bitcast vector) ) --> extractelement (PR25543) This is a fix for PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The idea is to take the existing fold of: bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X) ( http://reviews.llvm.org/rL112232 ) And break it into less specific transforms so we'll catch more cases such as the example in the bug report: bitcast ( trunc ( lshr ( bitcast X))) --> bitcast ( extractelement (bitcast X)) --> extractelement (bitcast X) Enabling patches for this change: http://reviews.llvm.org/rL255399 (combine bitcasts) http://reviews.llvm.org/rL255433 (canonicalize extractelement(bitcast X)) Differential Revision: http://reviews.llvm.org/D15392 llvm-svn: 255504	2015-12-14 16:16:54 +00:00
Krzysztof Parzyszek	759a7d0ed7	[Hexagon] Subtarget features/default CPU corrections llvm-svn: 255501	2015-12-14 15:03:54 +00:00
Chad Rosier	bc9d4f9947	[PPC] Early exit loop. NFC. llvm-svn: 255497	2015-12-14 14:44:06 +00:00
Adhemerval Zanella	d2b10c5e9a	[sanitizer] [msan] VarArgHelper for AArch64 This patch adds support for variadic arguments on AArch64. All the MSAN unit tests are not passing as well the signal_stress_test (currently set as XFAIL for aarch64). llvm-svn: 255495	2015-12-14 14:14:15 +00:00
James Molloy	2b1e101e99	Don't create unnecessary PHIs In conditional store merging, we were creating PHIs when we didn't need to. If the value to be predicated isn't defined in the block we're predicating, then it doesn't need a PHI at all (because we only deal with triangles and diamonds, any value not in the predicated BB must dominate the predicated BB). This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite. llvm-svn: 255489	2015-12-14 10:57:01 +00:00
NAKAMURA Takumi	7b69ca0281	Reformat to untabify. llvm-svn: 255483	2015-12-14 07:58:25 +00:00
David Blaikie	f5cb6279a6	[llvm-dwp] Deduplicate type units It's O(N^2) because it does a simple walk through the existing types to find duplicates, but that will be fixed in a follow-up commit to use a mapping data structure of some kind. llvm-svn: 255482	2015-12-14 07:42:00 +00:00
David Blaikie	429f8ca66d	[llvm-dwp] Remove some unused test code llvm-svn: 255481	2015-12-14 07:41:56 +00:00
Akira Hatanaka	cedf8e9be8	[Docs] Fix underlines that were too short or too long. llvm-svn: 255480	2015-12-14 05:15:40 +00:00
Michael Zuckerman	c5f47b3571	I Added a triple flag for x86-evenDirective test. Continue of rL255461 Differential Revision: http://reviews.llvm.org/D15413 llvm-svn: 255469	2015-12-13 21:12:33 +00:00
Cong Hou	ccec6e4d84	Revert r255460, which still causes test failures on some platforms. Further investigation on the failures is ongoing. llvm-svn: 255463	2015-12-13 17:15:38 +00:00
Michael Zuckerman	02ecd43c63	[X86][inline asm] support even directive The .even directive aligns content to an even-numbered address. In at&t syntax .even In Microsoft syntax even (without the dot). Differential Revision: http://reviews.llvm.org/D15413 llvm-svn: 255462	2015-12-13 17:07:23 +00:00
Cong Hou	c00e65aa89	Fix a type issue in r255455. Should not use unsigned type as std::abs()'s template type. llvm-svn: 255461	2015-12-13 17:00:25 +00:00
Cong Hou	e6a210f50b	[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions. (This is the second attempt to check in this patch: REQUIRES: asserts is added to reg-usage.ll now.) LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more. Differential revision: http://reviews.llvm.org/D15177 llvm-svn: 255460	2015-12-13 16:55:46 +00:00
Simon Pilgrim	3e0c022aed	Fix line endings llvm-svn: 255459	2015-12-13 12:49:48 +00:00
Cong Hou	663dd018c1	Replace <cstdint> by llvm/Support/DataTypes.h for the typedef of uint64_t. NFC. llvm-svn: 255458	2015-12-13 09:52:14 +00:00
Cong Hou	c0a33e0f62	Add the missing header file <cstdint> needed by uint64_t llvm-svn: 255457	2015-12-13 09:32:21 +00:00
Cong Hou	7c369156eb	Revert r255454 as it leads to several test failures on buildbots. llvm-svn: 255456	2015-12-13 09:28:57 +00:00
Cong Hou	c106989fd5	Normalize MBB's successors' probabilities in several locations. This patch adds some missing calls to MBB::normalizeSuccProbs() in several locations where it should be called. Those places are found by checking if the sum of successors' probabilities is approximate one in MachineBlockPlacement pass with some instrumented code (not in this patch). Differential revision: http://reviews.llvm.org/D15259 llvm-svn: 255455	2015-12-13 09:26:17 +00:00
Cong Hou	7f8b43d424	[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions. LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more. Differential revision: http://reviews.llvm.org/D15177 llvm-svn: 255454	2015-12-13 08:44:08 +00:00
Saleem Abdulrasool	778c268594	ARM: only emit EABI attributes on EABI targets EABI attributes should only be emitted on EABI targets. This prevents the emission of the optimization goals EABI attribute on Windows ARM. llvm-svn: 255448	2015-12-13 05:27:45 +00:00
Nico Weber	c2a687b6a6	Revert r255444. It doesn't build on Windows and broke the Windows LLD and LLDB bots: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/27693/steps/build_Lld/logs/stdio http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc/builds/13468/steps/build/logs/stdio llvm-svn: 255446	2015-12-13 04:14:39 +00:00
Mehdi Amini	396abbb6f0	Add a C++11 ThreadPool implementation in LLVM This is a very simple implementation of a thread pool using C++11 thread. It accepts any std::function<void()> for asynchronous execution. Individual tasks can be synchronized using the returned future, or the client can block on the full queue completion. In case LLVM is configured with Threading disabled, it falls back to sequential execution using std::async with launch:deferred. This is intended to support parallelism for ThinLTO processing in linker plugin, but is generic enough for any other uses. Differential Revision: http://reviews.llvm.org/D15464 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255444	2015-12-12 22:55:25 +00:00
Davide Italiano	b627d9f8fc	[llvm-objdump/MachoDump] Simplify. llvm-svn: 255443	2015-12-12 21:50:11 +00:00
Simon Pilgrim	052191dd82	[X86][AVX512] Added support for VMOVQ shuffle comments llvm-svn: 255442	2015-12-12 21:46:23 +00:00
Manuel Jacob	1578ec8860	Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0. Summary: Previously SelectionDAGBuilder asserted that the pointer operands of memcpy / memset / memmove intrinsics are in address space < 256. This assert implicitly assumed the X86 backend, where all address spaces < 256 are equivalent to address space 0 from the code generator's point of view. On some targets (R600 and NVPTX) several address spaces < 256 have a target-defined meaning, so this assert made little sense for these targets. This patch removes this wrong assertion and adds extra checks before lowering these intrinsics to library calls. If a pointer operand can't be casted to address space 0 without changing semantics, a fatal error is reported to the user. The new behavior should be valid for all targets that give address spaces != 0 a target-specified meaning (NVPTX, R600, X86). NVPTX lowers big or variable-sized memory intrinsics before SelectionDAG construction. All other memory intrinsics are inlined (the threshold is set very high for this target). R600 doesn't support memcpy / memset / memmove library calls (previously the illegal emission of a call to such library function triggered an error somewhere in the code generator). X86 now emits inline loads and stores for address spaces 256 and 257 up to the same threshold that is used for address space 0 and reports a fatal error otherwise. I call this a "partial fix" because there are still cases that can't be lowered. A fatal error is reported in these cases. Reviewers: arsenm, theraven, compnerd, hfinkel Subscribers: hfinkel, llvm-commits, alex Differential Revision: http://reviews.llvm.org/D7241 llvm-svn: 255441	2015-12-12 21:33:31 +00:00
Xinliang David Li	d1bab96045	[PGO] Stop using invalid char in instr variable names. Before the patch, -fprofile-instr-generate compile will fail if no integrated-as is specified when the file contains any static functions (the -S output is also invalid). This is the second try. The fix in this patch is very localized. Only profile symbol names of profile symbols with internal linkage are fixed up while initializer of name syms are not changed. This means there is no format change nor version bump. llvm-svn: 255434	2015-12-12 17:28:03 +00:00
Sanjay Patel	1d49fc9b27	[InstCombine] canonicalize (bitcast (extractelement X)) --> (extractelement(bitcast X)) This change was discussed in D15392. It allows us to remove the fold that was added in: http://reviews.llvm.org/r255261 ...and it will allow us to generalize this fold: http://reviews.llvm.org/rL112232 while preserving the order of bitcast + extract that it produces and testing shows is better handled by the backend. Note that the existing check for "isVectorTy()" wasn't strong enough in general and specifically because: x86_mmx. It's not a vector, but it's not vectorizable either. So here we check VectorType::isValidElementType() directly before proceeding with the transform. llvm-svn: 255433	2015-12-12 16:44:48 +00:00
Simon Pilgrim	a2d1591876	[X86][AVX] Tests tidyup Cleanup/regenerate some tests for some upcoming patches. llvm-svn: 255432	2015-12-12 12:52:52 +00:00
David Majnemer	496842fb39	Try to appease sphinx llvm-svn: 255429	2015-12-12 06:56:02 +00:00
David Majnemer	550654aaf1	Move catchpad-phi-cast.ll to the X86 specific subdirectory It is X86 specific and will not be properly exercised unless LLVM is built with the X86 target. llvm-svn: 255426	2015-12-12 06:21:08 +00:00
David Majnemer	f28c52f8f7	Try to appease a buildbot The builder complains thusly: error C2027: use of undefined type 'llvm::raw_ostream' Try to make it happy by including raw_ostream.h llvm-svn: 255425	2015-12-12 05:53:20 +00:00
David Majnemer	8a1c45d6e8	[IR] Reformulate LLVM's EH funclet IR While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies: - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts. - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticeable effect on the quality of our code generation. - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets. - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront. Let's clean this up a bit by doing the following: - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable. - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH. - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them. - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information. N.B. The state numbering code for the CLR has been updated but the veracity of its output cannot be spoken for. An expert should take a look to make sure the results are reasonable. Reviewers: rnk, JosephTremoulet, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D15139 llvm-svn: 255422	2015-12-12 05:38:55 +00:00
Hal Finkel	98347d3f2c	[PowerPC] OutStreamer cleanup in PPCAsmPrinter We don't need to pass OutStreamer as a parameter to LowerSTACKMAP and LowerPATCHPOINT. It is a member variable of PPCAsmPrinter, and thus, is already available. NFC. llvm-svn: 255418	2015-12-12 01:47:08 +00:00
Chen Li	1b26b9ec9d	[X86ISelLowering] Add additional support for multiplication-to-shift conversion. Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this. Reviewers: craig.topper, RKSimon Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14603 llvm-svn: 255415	2015-12-12 01:04:15 +00:00
Hal Finkel	4d3da9c29b	Fix test/CodeGen/PowerPC/ppc-shrink-wrapping.ll after r255398 llvm-svn: 255414	2015-12-12 00:42:05 +00:00
Sanjay Patel	93f55dd36d	[InstCombine] allow any pair of bitcasts to be combined This change is discussed in D15392 and should allow us to effectively revert: http://llvm.org/viewvc/llvm-project?view=revision&revision=255261 if we canonicalize bitcasts ahead of extracts. It should be safe to convert any pair of bitcasts into a single bitcast, however, it was mentioned here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110829/127089.html that we're not allowed to bitcast from an x86_mmx to some other types, but I'm not seeing any failures from that, and we have regression tests in CodeGen/X86 that appear to cover all of those cases. Some day we'll get to remove that MMX wart from LLVM IR completely? Differential Revision: http://reviews.llvm.org/D15468 llvm-svn: 255399	2015-12-12 00:33:36 +00:00
Hal Finkel	65539e3c94	[PowerPC] Add Branch Hints for Highly-Biased Branches This branch adds hints for highly biased branches on the PPC architecture. Even in absence of profiling information, LLVM will mark code reaching unreachable terminators and other exceptional control flow constructs as highly unlikely to be reached. Patch by Tom Jablin! llvm-svn: 255398	2015-12-12 00:32:00 +00:00
Derek Schuff	8f55497264	[WebAssembly] Update test expectations Many tests are now passing due to eliminateFrameIndex implementation and the list needs to be re-triaged because it unblocks other failures, and some previous failures are different. However I'm about to churn it more by implementing more lowering, so will wait on that. llvm-svn: 255396	2015-12-12 00:18:40 +00:00
Chen Li	02ef2e1385	Revert rL255391: [X86ISelLowering] Add additional support for multiplication-to-shift conversion. because it broke buildbot. llvm-svn: 255395	2015-12-12 00:08:37 +00:00
Sanjay Patel	ffde9e14a2	use FileCheck for better checking llvm-svn: 255394	2015-12-12 00:01:10 +00:00
Derek Schuff	9769debf88	[WebAssembly] Implement prolog/epilog insertion and FrameIndex elimination Summary: Use the SP32 physical register as the base for FrameIndex lowering. Update it and the __stack_pointer global var in the prolog and epilog. Extend the mapping of virtual registers to wasm locals to include the physical registers. Rather than modify the target-independent PrologEpilogInserter (which asserts that there are no virtual registers left) include a slightly-modified copy for Wasm that does not have this assertion and only clears the virtual registers if scavenging was needed (which of course it isn't for wasm). Differential Revision: http://reviews.llvm.org/D15344 llvm-svn: 255392	2015-12-11 23:49:46 +00:00
Chen Li	e8f9387e0c	[X86ISelLowering] Add additional support for multiplication-to-shift conversion. Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this. Reviewers: craig.topper, RKSimon Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14603 llvm-svn: 255391	2015-12-11 23:39:32 +00:00
Diego Novillo	10cf124bb9	SamplePGO - Reduce memory utilization by 10x. DenseMap is the wrong data structure to use for sample records and call sites. The keys are too large, causing massive core memory growth when reading profiles. Before this patch, a 21Mb input profile was causing the compiler to grow to 3Gb in memory. By switching to std::map, the compiler now grows to 300Mb in memory. There still are some opportunities for memory footprint reduction. I'll be looking at those next. llvm-svn: 255389	2015-12-11 23:21:38 +00:00
Matt Arsenault	fabab4b7dd	SelectionDAG: Match min/max if the scalar operation is legal llvm-svn: 255388	2015-12-11 23:16:47 +00:00
Hal Finkel	cd8664c3c2	Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics After much discussion, ending here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work. r248483 - Codegen: Fix llvm.*absdiff semantic. r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation llvm-svn: 255387	2015-12-11 23:11:52 +00:00
Rafael Espindola	515f8df3f1	Avoid buffered reads of /dev/urandom I am seeing disappointing clang performance on a large PowerPC64 Linux box. GetRandomNumberSeed() does a buffered read from /dev/urandom to seed its PRNG. As a result we read an entire page even though we only need 4 bytes. With every clang task reading a page worth of /dev/urandom we end up spending a large amount of time stuck on kernel spinlock. Patch by Anton Blanchard! llvm-svn: 255386	2015-12-11 22:52:32 +00:00
Davide Italiano	62507043c5	[llvm-objdump/MachODump] Reduce code duplication. llvm-svn: 255380	2015-12-11 22:27:59 +00:00
Sanjay Patel	d497ad43da	Add tests for bitcast-bitcast sequences for all scalar/vector permutations As noted in http://reviews.llvm.org/D15392 , we should be able to improve this. llvm-svn: 255370	2015-12-11 20:26:30 +00:00
Xinliang David Li	a86545b0b5	[PGO] Revert r255365: solution incomplete, not handling lambda yet llvm-svn: 255369	2015-12-11 20:23:22 +00:00
Xinliang David Li	c79283ef29	[PGO] Stop using invalid char in instr variable names. Before the patch, -fprofile-instr-generate compile will fail if no integrated-as is specified when the file contains any static functions (the -S output is also invalid). This patch fixed the issue. With the change, the index format version will be bumped up by 1. Backward compatibility is preserved with this change. Differential Revision: http://reviews.llvm.org/D15243 llvm-svn: 255365	2015-12-11 19:53:19 +00:00
Matthias Braun	60d69e2865	CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness() computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analyzePhysRegs() are hard to understand with respect to subregisters. This commit: Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, also renames the fields to avoid silent breakage of third-party code (and improve the grammar). Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling it and removing workarounds for the bug. This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033 Differential Revision: http://reviews.llvm.org/D15320 llvm-svn: 255362	2015-12-11 19:42:09 +00:00
Matt Arsenault	fbd9bbfda3	Start replacing vector_extract/vector_insert with extractelt/insertelt These are redundant pairs of nodes defined for INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT. insertelement/extractelement are slightly closer to the corresponding C++ node name, and has stricter type checking so prefer it. Update targets to only use these nodes where it is trivial to do so. AArch64, ARM, and Mips all have various type errors on simple replacement, so they will need work to fix. Example from AArch64: def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8), (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>; Which is trying to do sext_inreg i8, i8. llvm-svn: 255359	2015-12-11 19:20:16 +00:00
Derek Schuff	5a14306323	[WebAssembly] Fix ADJCALLSTACKDOWN/UP use/defs Summary: ADJCALLSTACK{DOWN,UP} (aka CALLSEQ_{START,END}) MIs are supposed to use and def the stack pointer. Since they do not, all the nodes are being eliminated by DeadMachineInstructionElim, so they aren't in the IR when PrologEpilogInserter/eliminateCallFramePseudo needs them. This change fixes that, but since RegStackify will not stackify across them (and it runs early, before PEI), change LowerCall to only emit them when the call frame size is > 0. That makes the current code work the same way and makes code handled by D15344 also work the same way. We can expand the condition beyond NumBytes > 0 in the future if needed. Reviewers: sunfish, jfb Subscribers: jfb, dschuff, llvm-commits Differential Revision: http://reviews.llvm.org/D15459 llvm-svn: 255356	2015-12-11 18:55:34 +00:00
Chad Rosier	d7634fc91d	Revert r255247, r255265, and r255286 due to serious compile-time regressions. Revert "[DSE] Disable non-local DSE to see if the bots go green." Revert "[DeadStoreElimination] Use range-based loops. NFC." Revert "[DeadStoreElimination] Add support for non-local DSE." llvm-svn: 255354	2015-12-11 18:39:41 +00:00
Manman Ren	abc7c1d1d2	CXX_FAST_TLS calling convention: target independent portion. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allocator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given calling convention and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. rdar://problem/23557469 Differential Revision: http://reviews.llvm.org/D15340 llvm-svn: 255353	2015-12-11 18:24:30 +00:00
Sanjay Patel	4dad27e016	fix typos; NFC llvm-svn: 255352	2015-12-11 18:12:01 +00:00
Frederic Riss	841b1732df	[dsymutil] Ignore absolute symbols in the debug map Quoting from the comment added to the code: // Objective-C on i386 uses artificial absolute symbols to // perform some link time checks. Those symbols have a fixed 0 // address that might conflict with real symbols in the object // file. As I cannot see a way for absolute symbols to find // their way into the debug information, let's just ignore those. llvm-svn: 255350	2015-12-11 17:50:37 +00:00
Hal Finkel	494393b740	AlignmentFromAssumptions and SLPVectorizer preserves AA and GlobalsAA GlobalsAA's assumption that passes do not escape globals not previously escaped is not violated by AlignmentFromAssumptions and SLPVectorizer. Marking them as such allows GlobalsAA to be preserved until GVN in the LTO pipeline. http://lists.llvm.org/pipermail/llvm-dev/2015-December/092972.html Patch by Vaivaswatha Nagaraj! llvm-svn: 255348	2015-12-11 17:46:01 +00:00
Hal Finkel	cd5f984670	[TableGen] Correct Namespace lookup with AltNames in AsmWriterEmitter AsmWriterEmitter will generate a getRegisterName function with an alternate register name index as its second argument if the target makes use of them. The enum of these values is generated in RegisterInfoEmitter. The getRegisterName generator would assume the namespace could always be found by reading index 1 of the list of AltNameIndices, but this will fail if this list is sorted such that the NoRegAltName is at index 1. Because this list is sorted by record name (in CodeGenTarget::ReadRegAltNameIndices), you only run into problems if your MyTargetRegisterInfo.td defines a single RegAltNameIndex that sorts lexically before NoRegAltName. For example, if a target has something like def AnAltNameIndex : RegAltNameIndex and defines RegAltNameIndices for some registers then, prior to this change, AsmWriterEmitter would generate references to ::AnAltNameIndex and ::NoRegAltName Patch by Alex Bradbury! llvm-svn: 255344	2015-12-11 17:31:27 +00:00
Artur Pilipenko	7ae49ac619	PruneEH pass incorrectly reports that a change was made Reviewed By: reames Differential Revision: http://reviews.llvm.org/D14097 llvm-svn: 255343	2015-12-11 16:30:26 +00:00
James Molloy	1bb6ea5e2d	[Mem2Reg] Respect optnone Mem2Reg shouldn't be optimizing a function that is marked optnone. There is a test checking this that fails when mem2reg is explicitly added to the standard pass pipeline. llvm-svn: 255336	2015-12-11 13:36:59 +00:00
James Molloy	37b82e79b2	[InstCombine] Make MatchBSwap also match bit reversals MatchBSwap has most of the functionality to match bit reversals already. If we switch it from looking at bytes to individual bits and remove a few early exits, we can extend the main recursive function to match any sequence of ORs, ANDs and shifts that assemble a value from different parts of another, base value. Once we have this bit->bit mapping, we can very simply detect if it is appropriate for a bswap or bitreverse. llvm-svn: 255334	2015-12-11 10:04:51 +00:00
Maxim Ostapenko	1dbfca60f8	Revert previous test commit. llvm-svn: 255331	2015-12-11 07:40:25 +00:00
Maxim Ostapenko	e518db35a8	This is a test commit to check my commit access works. llvm-svn: 255330	2015-12-11 07:31:29 +00:00
Xinliang David Li	d922c26c02	[PGO] Read VP raw data without depending on the Value field Before this patch, each function's on-disk VP data is 'pointed' to by the Value field of per-function ProfileData structure, and read relies on this field (relocated with ValueDataDelta field) to read the value data. However this means the Value field needs to be updated during runtime before dumping, which creates undesirable data races. With this patch, the reading of VP data no longer depends on Value field. There is no format change. ValueDataDelta header field becomes obsolete but will be kept for compatibility reason (will be removed next time the raw format change is needed). llvm-svn: 255329	2015-12-11 06:53:53 +00:00
Hans Wennborg	a8e6b3ecb7	Fix build after r255319. llvm-svn: 255322	2015-12-11 00:58:32 +00:00
Eric Christopher	5e834a5dc4	Fix a spurious if. llvm-svn: 255321	2015-12-11 00:51:59 +00:00
Akira Hatanaka	2992beec00	[LazyValueInfo] Stop inserting overdefined values into ValueCache to reduce memory usage. Previously, LazyValueInfoCache inserted overdefined lattice values into both ValueCache and OverDefinedCache. This wasn't necessary and was causing LazyValueInfo to use an excessive amount of memory in some cases. This patch changes LazyValueInfoCache to insert overdefined values only into OverDefinedCache. The memory usage decreases by 70 to 75% when one of the files in llvm is compiled. rdar://problem/11388615 Differential revision: http://reviews.llvm.org/D15391 llvm-svn: 255320	2015-12-11 00:49:47 +00:00
Kyle Butt	1452b76f1f	[PPC]: Peephole optimize small accesses to aligned globals. Access to aligned globals gives us a chance to peephole optimize nonzero offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't overflow the available displacement. For example: addis 3, 2, b4v@toc@ha addi 4, 3, b4v@toc@l lbz 5, b4v@toc@l(3) ; This is the result of the current peephole lbz 6, 1(4) ; optimizer lbz 7, 2(4) lbz 8, 3(4) If b4v is 4-byte aligned, we can skip using register 4 because we know that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate: addis 3, 2, b4v@toc@ha lbz 4, b4v@toc@l(3) lbz 5, b4v@toc@l+1(3) lbz 6, b4v@toc@l+2(3) lbz 7, b4v@toc@l+3(3) Saving a register and an addition. Larger alignments allow larger structures/arrays to be optimized. llvm-svn: 255319	2015-12-11 00:47:36 +00:00
Hans Wennborg	e59910cba9	Check in the script for building Win snapshots llvm-svn: 255318	2015-12-11 00:43:42 +00:00
Vedant Kumar	2491dd118f	[ProfileData] clang-format TextInstrProfReader::hasFormat. NFC. llvm-svn: 255317	2015-12-11 00:40:05 +00:00
Cong Hou	59898d8c68	[X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1. Previously in the conversion cost table there are no entries for integer-integer conversions on SSE2. This will result in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers are counted from the result of running llc on the new test case in this patch. Differential revision: http://reviews.llvm.org/D15132 llvm-svn: 255315	2015-12-11 00:31:39 +00:00
Xinliang David Li	2d4803e81b	Format fix (NFC) llvm-svn: 255313	2015-12-10 23:48:05 +00:00
Eric Christopher	86e031a889	s/need/needs llvm-svn: 255306	2015-12-10 22:29:26 +00:00
Eric Christopher	325e8d06dc	Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst, x)) combines for ppc_fp128, since signbit computation is more complicated. Discussion thread: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html Patch by Tim Shen! llvm-svn: 255305	2015-12-10 22:09:06 +00:00
Eric Christopher	2ec6a49fbf	Attempt to fix the ReST compilation to html of the C API docs. llvm-svn: 255304	2015-12-10 22:04:11 +00:00
Eric Christopher	df2e4d2914	More non-ascii quote characters. llvm-svn: 255303	2015-12-10 21:47:38 +00:00
Eric Christopher	dedacf9c73	Clarify some of the wording on adding a new subcomponent to the C API. llvm-svn: 255302	2015-12-10 21:46:24 +00:00
Eric Christopher	b5c2b8dc92	Fix non-ascii quotes. llvm-svn: 255301	2015-12-10 21:38:56 +00:00
Eric Christopher	d9f8ce9977	Add C API guidelines to the developer policy to match discussions on the llvm mailing lists. llvm-svn: 255300	2015-12-10 21:33:53 +00:00
Kyle Butt	28b01a51b3	PPC: Teach FMA mutate to respect register classes. This was causing bad code gen and assembly that won't assemble, as mixed altivec and vsx code would end up with a vsx high register assigned to an altivec instruction, which won't work. Constraining the classes allows the optimization to proceed. llvm-svn: 255299	2015-12-10 21:28:40 +00:00
Chris Bieneman	dbdec57b56	[CMake] Add LLVM_BUILD_INSTRUMENTED option to enable building with -fprofile-instr-generate This is the first step in supporting PGO data generation via CMake. I've marked the option as advanced and experimental until it is fleshed out further. llvm-svn: 255298	2015-12-10 21:19:07 +00:00
Mike Aizatsky	a1a5c69b57	[LibFuzzer] Introducing FUZZER_FLAG_UNSIGNED and using it for seeding. Differential Revision: http://reviews.llvm.org/D15339 done llvm-svn: 255296	2015-12-10 20:41:53 +00:00
JF Bastien	82bf85ffed	EarlyCSE: add tests Summary: As a follow-up to rL255054 I wasn't able to convince myself that the code did what I thought, so I wrote more tests. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15371 llvm-svn: 255295	2015-12-10 20:24:34 +00:00
Xinliang David Li	c61289aa4c	Add a forward declaration (NFC) llvm-svn: 255292	2015-12-10 20:13:41 +00:00
Cong Hou	5146b2d1da	Delete a duplicate branch in IfConversion.cpp. NFC. llvm-svn: 255291	2015-12-10 19:57:22 +00:00
Simon Pilgrim	06ea4be281	[DAGCombiner] Fix PR25763 - vector comparison constant folding + sign-extension PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code. llvm-svn: 255289	2015-12-10 19:47:06 +00:00
Chad Rosier	843c7b4309	[DSE] Disable non-local DSE to see if the bots go green. I see a few bots timing out, so I'm speculatively disabling r255247. llvm-svn: 255286	2015-12-10 19:23:02 +00:00
Rafael Espindola	a8547d35e9	Fix another case where the linkage was not set. llvm-svn: 255272	2015-12-10 18:44:26 +00:00
Rong Xu	2611ff8a27	[PGO] Use %t as the temporary profdata filename in the test cases. Using %t rather than %T/<specific_name> as the temporary profdata filename. llvm-svn: 255271	2015-12-10 18:24:44 +00:00
Duncan P. N. Exon Smith	836f0ddb60	Verifier: Avoid quadratic checking of aggregates for bad bitcasts Avoid O(N^2) behaviour when checking for bad bitcasts in `ConstantExpr`s buried inside of aggregate initializers to `GlobalVariable`s. I've: - centralized the "visited" set for recursing through `ConstantExpr`s so that expressions are only visited once per Verifier run, - removed the duplicate logic for the stack visit, and - avoided recursing into other `GlobalValue`s. This recovers roughly a 100x time difference in clang compiles of a particular input file (filled with large cross-referencing tables) that depends on whether `-disable-llvm-verifier` is on. This slowdown was caused by r187506, which introduced these checks. Now, avoiding `-disable-llvm-verifier` only causes a 2x slowdown for this case. (Interestingly, dumping the textual IR for this file starts at least 50GB of global variable initializers (I don't know the total, since I killed the dump)...) llvm-svn: 255269	2015-12-10 17:56:06 +00:00
Chad Rosier	02fe4248a2	[DeadStoreElimination] Use range-based loops. NFC. llvm-svn: 255265	2015-12-10 17:27:18 +00:00
Nathan Slingerland	51abea7442	[ProfileData] Add unit test infrastructure for sample profile reader/writer Summary: Adds support for in-memory round-trip of sample profile data along with basic round trip unit tests. This will also make it easier to include unit tests for future changes to sample profiling. Reviewers: davidxl, dnovillo, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15211 llvm-svn: 255264	2015-12-10 17:21:42 +00:00
Pirama Arumuga Nainar	1317d5f311	Fix fptosi, fptoui from f16 vectors to i8, i16 vectors Summary: Convert f16 vectors to corresponding f32 vectors before doing the conversion to int. Add tests for v4f16, v8f16. Reviewers: ab, jmolloy Subscribers: llvm-commits, srhines Differential Revision: http://reviews.llvm.org/D14936 llvm-svn: 255263	2015-12-10 17:16:49 +00:00
Sanjay Patel	c83fd9554a	[InstCombine] fold bitcasts around an extractelement (3rd try) This is a redo of r255137 (reverted at r255227) which was a redo of r255124 (reverted at r255126) with a fixed check for a scalar source type and an added test for the failure that caused the revert. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255261	2015-12-10 17:09:28 +00:00
Teresa Johnson	9f2ff9c669	[ThinLTO] Debug message cleanup (NFC) Added some missing spaces between the module identifier and the start of the debug message. Also added a ":" after the module identifier to make this look a little nicer. llvm-svn: 255259	2015-12-10 16:39:07 +00:00
Rafael Espindola	f81c7b03a0	Avoid undefined behavior when vector is empty. Found by ubsan. llvm-svn: 255258	2015-12-10 16:35:06 +00:00
Sanjay Patel	87c6c0797e	remove duplicated comments and don't repeat function names in comments; NFC llvm-svn: 255257	2015-12-10 16:34:21 +00:00
Teresa Johnson	9d5b71b3d2	[ThinLTO] Release files in gold plugin during combined index (take 2) Ensure we release the files even when they don't hold a function index summary section, by restructuring the control flow a little bit. llvm-svn: 255256	2015-12-10 16:11:23 +00:00
Dan Gohman	28818d7840	[WebAssembly] Tighten up several CHECK tests. llvm-svn: 255255	2015-12-10 14:52:34 +00:00
Rafael Espindola	caabe22832	Split lib/Linker in two. A linker normally has two stages: symbol resolution and "moving stuff". In lib/Linker there is the complication of lazy linking some globals, but it was still far more mixed than it needed to be. This splits the linker into a lower level IRMover and the linker proper. The IRMover just takes a list of globals to move and a callback that lets the user control what is lazy linked. The main motivation is that now tools/gold (and soon lld) can use their own symbol resolution to instruct IRMover what to do. llvm-svn: 255254	2015-12-10 14:19:35 +00:00
Dan Gohman	b949b9c01b	[WebAssembly] Make WebAssemblyStoreResults only return true when it has a change. llvm-svn: 255253	2015-12-10 14:17:36 +00:00
Dan Gohman	a87629d6d7	[WebAssembly] Fix WebAssemblyPeephole to set Changed to true when making changes. llvm-svn: 255252	2015-12-10 14:16:34 +00:00
Dan Gohman	acc0941bd1	[WebAssembly] Declare that WebAssemblyPeephole does not modify the CFG. llvm-svn: 255251	2015-12-10 14:12:04 +00:00
Dan Gohman	6d63f96749	[WebAssembly] Remove an unneeded getAnalysisUsage override. llvm-svn: 255250	2015-12-10 14:10:04 +00:00
Chad Rosier	533bc3fcac	[DeadStoreElimination] Add support for non-local DSE. We extend the search for redundant stores to predecessor blocks that unconditionally lead to the block BB with the current store instruction. That also includes single-block loops that unconditionally lead to BB, and if-then-else blocks where then- and else-blocks unconditionally lead to BB. http://reviews.llvm.org/D13363 Patch by Ivan Baev <ibaev@codeaurora.org>! llvm-svn: 255247	2015-12-10 13:51:43 +00:00
Nemanja Ivanovic	ac8d01add0	Bitcasts between FP and INT values using direct moves This patch corresponds to review: http://reviews.llvm.org/D15286 LLVM IR frequently contains bitcast operations between floating point and integer values of the same width. Doing this through memory operations is quite expensive on PPC. This patch allows the use of direct register moves between FPRs and GPRs for lowering bitcasts. llvm-svn: 255246	2015-12-10 13:35:28 +00:00
Amjad Aboud	a9bcf16ebc	Macro debug info support in LLVM IR Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros. Differential Revision: http://reviews.llvm.org/D14687 llvm-svn: 255245	2015-12-10 12:56:35 +00:00
Silviu Baranga	86de80db37	[LLE] Use the PredicatedScalarEvolution interface to query SCEVs for dependences Summary: LAA uses the PredicatedScalarEvolution interface, so it can produce forward/backward dependences having SCEVs that are AddRecExprs only after being transformed by PredicatedScalarEvolution. Use PredicatedScalarEvolution to get the expected expressions. Reviewers: anemet Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D15382 llvm-svn: 255241	2015-12-10 11:07:18 +00:00
Jonas Paulsson	e451eeff5c	[PostRA scheduling] Allow a target to do scheduling when it wants post RA. SystemZ needs to do its scheduling after branch relaxation, which can only happen after block placement, and therefore the standard PostRAScheduler point in the pass sequence is too early. TargetMachine::targetSchedulesPostRAScheduling() is a new method that signals on returning true that target will insert the final scheduling pass on its own. Reviewed by Hal Finkel llvm-svn: 255234	2015-12-10 09:10:07 +00:00
Akira Hatanaka	a3c0e8e1ba	Revert r255137. This commit broke apple's internal bot. llvm-svn: 255227	2015-12-10 08:00:52 +00:00
Sanjoy Das	ccd14566e2	Add arg_begin() and arg_end() to CallInst and InvokeInst; NFCI - This simplifies the CallSite class, arg_begin / arg_end are now simple wrapper getters. - In several places, we were creating CallSite instances solely to call arg_begin and arg_end. With this change, that's no longer required. llvm-svn: 255226	2015-12-10 06:39:02 +00:00
Craig Topper	8e44b9a4d1	[X86] Fix a couple cases where bitwise and logical operations were being mixed. NFC llvm-svn: 255224	2015-12-10 06:09:41 +00:00
Alexey Bataev	860435c8e2	[OPENMP] Make -fopenmp to turn on OpenMP support by default. Patch turns on OpenMP support in clang by default after fixing OpenMP buildbots. Differential Revision: http://reviews.llvm.org/D13802 llvm-svn: 255222	2015-12-10 05:45:58 +00:00
Dan Gohman	f170ba08af	[WebAssembly] Implement mixed-type ISD::FCOPYSIGN. ISD::FCOPYSIGN permits its operands to have differing types, and DAGCombiner uses this. Add some def : Pat rules to expand this out into an explicit conversion and a normal copysign operation. llvm-svn: 255220	2015-12-10 04:55:31 +00:00
Dan Gohman	9341c1d4b3	[WebAssembly] Implement fma. It is lowered to a libcall for now, but this is expected to change in the future. llvm-svn: 255219	2015-12-10 04:52:33 +00:00
Tom Stellard	c2d654322b	AMDGPU/SI: Fix warning introduced by r255204 llvm-svn: 255205	2015-12-10 03:10:46 +00:00
Tom Stellard	c93fc11f36	AMDGPU/SI: Emit constant arrays in the .text section Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204	2015-12-10 02:13:01 +00:00
Tom Stellard	b3c3bda512	AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs. Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15342 llvm-svn: 255203	2015-12-10 02:12:53 +00:00
Dan Gohman	60bddf17c5	[WebAssembly] Fix legalization of f32->f64 EXTLOAD. llvm-svn: 255202	2015-12-10 02:07:53 +00:00
Derek Schuff	6fd28dfe5d	[WebAssembly] Update known test failures We can now select sign_extend_inreg llvm-svn: 255197	2015-12-10 01:09:40 +00:00
Matthias Braun	7d8e41e82c	RegisterPressure: Factor out liveness dead-def detection logic; NFCI Detecting additional dead-defs without a dead flag that are only visible through liveness information should be part of the register operand collection not intertwined with the register pressure update logic. llvm-svn: 255192	2015-12-10 01:04:15 +00:00
Dan Gohman	a5603b835b	[WebAssembly] Also legalize sign_extend_inreg of i32->i64. llvm-svn: 255191	2015-12-10 01:00:19 +00:00
Derek Schuff	71d0eae609	[WebAssembly] Update test failure expectations llvm-svn: 255190	2015-12-10 00:56:18 +00:00
Dan Gohman	dab313e0ed	PeepholeOptimizer: Ignore dead implicit defs Target-specific instructions may have uninteresting physreg clobbers, for target-specific reasons. The peephole pass doesn't need to concern itself with such defs, as long as they're implicit and marked as dead. llvm-svn: 255182	2015-12-10 00:37:51 +00:00
Dan Gohman	a8483755d3	[WebAssembly] Fix legalization of shift operators with illegal types. llvm-svn: 255181	2015-12-10 00:26:26 +00:00
Dan Gohman	7935fa3d1b	[WebAssembly] Fix copy+pastos. llvm-svn: 255180	2015-12-10 00:22:40 +00:00
Dan Gohman	df00a9ebc2	[WebAssembly] Implement anyext. llvm-svn: 255179	2015-12-10 00:17:35 +00:00
Quentin Colombet	5d2f7cfd44	[X86] Enable shrink-wrapping by default, but keep it disabled for stack frames without a frame pointer when unwind may happen. This is a workaround for a bug in the way we emit the CFI directives for frameless unwind information. See PR25614. llvm-svn: 255175	2015-12-09 23:08:18 +00:00
Sanjay Patel	87d2ae23ac	use range-based for loops; NFCI llvm-svn: 255171	2015-12-09 22:45:45 +00:00
Rafael Espindola	ed11bd286f	Synchronize the logic for deciding to link a gv. We were deciding to not link an available_externally gv over a declaration, but then copying over the body anyway. llvm-svn: 255169	2015-12-09 22:44:00 +00:00
Rong Xu	7dd9b1ea75	[PGO] Rename the profdata filename to avoid the conflict b/w tests. Two tests diag_mismatch.ll and diag_no_funcprofdata.ll generate the same profdata filename which can conflict in current test runs. This patch renames them to have different names. llvm-svn: 255158	2015-12-09 21:27:59 +00:00
Justin Bogner	b7389d6714	IR: Make ConstantDataArray::getFP actually return a ConstantDataArray The ConstantDataArray::getFP(LLVMContext &, ArrayRef<uint16_t>) overload has had a typo in it since it was written, where it will create a Vector instead of an Array. This obviously doesn't work at all, but it turns out that until r254991 there weren't actually any callers of this overload. Fix the typo and add some test coverage. llvm-svn: 255157	2015-12-09 21:21:07 +00:00
Teresa Johnson	db51357c11	[ThinLTO] Release files read when creating combined index in gold plugin This wasn't causing an issue since at HEAD we exit the linker completely after creating the combined index. llvm-svn: 255156	2015-12-09 21:11:42 +00:00
Reid Kleckner	54ade23504	[Float2Int] Don't operate on vector instructions This fixes a crash bug. It's also not clear if we'd want to do this transform for vectors. llvm-svn: 255155	2015-12-09 21:08:18 +00:00
David Blaikie	c3826da895	[llvm-dwp] Sink debug_types.dwo emission into the code parsing the type signatures (NFC) This is a preliminary change towards deduplicating type units based on their signatures. Next change will skip emission of types when their signature has already been seen. llvm-svn: 255154	2015-12-09 21:02:33 +00:00
Rafael Espindola	9edc3b8403	Don't assign a temporary string to a StringRef. Should fix the windows debug and asan bots. llvm-svn: 255149	2015-12-09 20:41:10 +00:00
Sanjoy Das	9abfb0b429	Use WeakVH to keep track of calls with operand bundles in CloneCodeInfo `CloneAndPruneIntoFromInst` can DCE instructions after cloning them into the new function, and so an AssertingVH is too strong. This change switches CloneCodeInfo to use a std::vector<WeakVH>. llvm-svn: 255148	2015-12-09 20:33:52 +00:00
Sanjoy Das	1f8fd88873	Delete trailing whitespace; NFC llvm-svn: 255147	2015-12-09 20:33:45 +00:00
Teresa Johnson	af9e93183d	Delay context construction to when/if it is needed in gold plugin (NFC) llvm-svn: 255146	2015-12-09 19:49:40 +00:00
Teresa Johnson	b13dbd633a	clang-format order of gold-plugin includes (NFC) llvm-svn: 255144	2015-12-09 19:45:55 +00:00
Teresa Johnson	7f961e14d3	[ThinLTO] FunctionImport pass can take a const index pointer (NFC) llvm-svn: 255140	2015-12-09 19:39:47 +00:00
Sanjay Patel	b67e6b6044	[InstCombine] fold bitcasts around an extractelement (2nd try) This is a redo of r255124 (reverted at r255126) with an added check for a scalar destination type and an added test for the failure seen in Clang's test/CodeGen/vector.c. The extra test shows a different missing optimization. Original commit message: Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255137	2015-12-09 18:57:16 +00:00
Michael Zolotukhin	78760ee73d	Revert "Revert r253253 and r253126: "Don't recompute LCSSA after loop-unrolling when possible."" The bug in IndVarSimplify was fixed in r254976, r254977, so I'm reapplying the original patch for avoiding redundant LCSSA recomputation. This reverts commit ffe3b434e505e403146aff00be0c177bb6d13466. llvm-svn: 255133	2015-12-09 18:20:28 +00:00
Rong Xu	f430ae40cf	[PGO] Resubmit "MST based PGO instrumentation infrastructure" (r254021) This new patch fixes a few bugs that were exposed in the last submit. It also improves the test cases. --Original Commit Message-- This patch implements a minimum spanning tree (MST) based instrumentation for PGO. The use of MST guarantees that a minimum number of CFG edges get instrumented. An additional optimization is to instrument the less executed edges to further reduce the instrumentation overhead. The patch contains both the instrumentation and the use of the profile to set the branch weights. Differential Revision: http://reviews.llvm.org/D12781 llvm-svn: 255132	2015-12-09 18:08:16 +00:00
Nathan Slingerland	644badbf01	[Support] Change SaturatingAdd()/SaturatingMultiply() to use pointer for returning overflow state Summary: Improve SaturatingAdd()/SaturatingMultiply() to use bool * to optionally return overflow result. This should make it clearer that the value is returned at callsites and reduces the size of the implementation. Reviewers: davidxl, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15219 llvm-svn: 255128	2015-12-09 17:11:28 +00:00
Mehdi Amini	4e2b7c454c	Revert "[InstCombine] fold bitcasts around an extractelement" This reverts commit r255124. Broke http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/4193/steps/test/logs/stdio From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255126	2015-12-09 16:31:39 +00:00
Dan Gohman	1cf96c0c34	[WebAssembly] Reintroduce ARGUMENT moving logic Reintroduce the code for moving ARGUMENTS back to the top of the basic block. While the ARGUMENTS physical register prevents sinking and scheduling from moving them, it does not appear to be sufficient to prevent SelectionDAG from moving them down in the initial schedule. This patch introduces a patch that moves them back to the top immediately after SelectionDAG runs. This is still hopefully a temporary solution. http://reviews.llvm.org/D14750 is one alternative, though the review has not been favorable, and proposed alternatives are longer-term and have other downsides. This fixes the main outstanding -verify-machineinstrs failures, so it adds -verify-machineinstrs to several tests. Differential Revision: http://reviews.llvm.org/D15377 llvm-svn: 255125	2015-12-09 16:23:59 +00:00
Sanjay Patel	07410ed234	[InstCombine] fold bitcasts around an extractelement Example: bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float ---> extractelement <2 x float> %X, i32 1 This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543 The next step will be to generalize this fold: trunc ( lshr ( bitcast X) ) -> extractelement (X) Ie, I'm hoping to replace the existing transform of: bitcast ( trunc ( lshr ( bitcast X))) added by: http://reviews.llvm.org/rL112232 with 2 less specific transforms to catch the case in the bug report. Differential Revision: http://reviews.llvm.org/D14879 llvm-svn: 255124	2015-12-09 16:17:20 +00:00
Mehdi Amini	b000bbdec2	Change hasUniqueInitializer() to call isStrongDefinitionForLinker() instead of !isWeakForLinker() Summary: Available_externally global variables with an initializer were considered "hasInitializer()", while obviously they can't match the description: Whether the global variable has an initializer, and any changes made to the initializer will turn up in the final executable. since modifying the initializer of an externally available variable does not make sense. Reviewers: pcc, rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15351 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255123	2015-12-09 16:17:07 +00:00
Silviu Baranga	9cd9a7e310	Re-commit r255115, with the PredicatedScalarEvolution class moved to ScalarEvolution.h, in order to avoid cyclic dependencies between the Transform and Analysis modules: [LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewriting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, thereby avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewrites. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255122	2015-12-09 16:06:28 +00:00
Tim Northover	d91d635b36	ARM: don't use a deleted node as the BaseReg in complex pattern. We mutated the DAG, which invalidated the node we were trying to use as a base register. Sometimes we got away with it, but other times the node really did get deleted before it was finished with. Should fix PR25733 llvm-svn: 255120	2015-12-09 15:54:50 +00:00
JF Bastien	88f8014e8e	WebAssembly: add missing failure to the list. llvm-svn: 255119	2015-12-09 15:52:57 +00:00
Silviu Baranga	ad1ccb357b	Revert r255115 until we figure out how to fix the bot failures. llvm-svn: 255117	2015-12-09 15:25:28 +00:00
Silviu Baranga	41eb682501	[LV][LAA] Add a layer over SCEV to apply run-time checked knowledge on SCEV expressions Summary: This change creates a layer over ScalarEvolution for LAA and LV, and centralizes the usage of SCEV predicates. The SCEVPredicatedLayer takes the statically deduced knowledge by ScalarEvolution and applies the knowledge from the SCEV predicates. The end goal is that both LAA and LV should use this interface everywhere. This also solves a problem involving the result of SCEV expression rewriting when the predicate changes. Suppose we have the expression (sext {a,+,b}) and two predicates P1: {a,+,b} has nsw P2: b = 1. Applying P1 and then P2 gives us {a,+,1}, while applying P2 and then P1 gives us sext({a,+,1}) (the AddRec expression was changed by P2 so P1 no longer applies). The SCEVPredicatedLayer maintains the order of transformations by feeding back the results of previous transformations into new transformations, thereby avoiding this issue. The SCEVPredicatedLayer maintains a cache to remember the results of previous SCEV rewrites. This also has the benefit of reducing the overall number of expression rewrites. Reviewers: mzolotukhin, anemet Subscribers: jmolloy, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D14296 llvm-svn: 255115	2015-12-09 15:03:52 +00:00
Robert Lougher	f0033b29d4	Fix cycle in selection DAG introduced by extractelement legalization During selection DAG legalization, extractelement is replaced with a load instruction. To do this, a temporary store to the stack is used unless an existing store is found that can be re-used. If re-using a store, the chain going out of the store must be replaced by the one going out of the new load (this ensures that any stores that must take place after the store happens after the load, else the value might be overwritten before it is loaded). The problem is, if the extractelement index is dependent on the store replacing the chain will introduce a cycle in the selection DAG (the load uses the index, and by replacing the chain we will make the index dependent on the load). To fix this, if the index is dependent on the store, the store is skipped. This is conservative as we may end up creating an unnecessary extra store to the stack. However, the situation is not expected to occur very often. Differential Revision: http://reviews.llvm.org/D15330 llvm-svn: 255114	2015-12-09 14:34:10 +00:00
Oliver Stannard	86f729296a	[AArch64] Fix FP16 vector instructions that should only accept low registers llvm-svn: 255113	2015-12-09 14:32:11 +00:00
Daniel Sanders	3c7223133d	[mips][ias] Range check uimm10 operands Summary: Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15229 llvm-svn: 255112	2015-12-09 13:48:05 +00:00
JF Bastien	c2b30484ae	WebAssembly: add known failures The bots are now running the torture tests properly. Bin all failures from the GCC C torture tests so that we can tackle failures and make the tree go red on regressions. llvm-svn: 255111	2015-12-09 13:29:32 +00:00
Vasileios Kalintiris	ddf7e6885a	[mips] Use multiclass patterns for f32/f64 comparisons and i32 selects. Summary: Although the multiclass for i32 selects might seem redundant as it has only one instantiation, we will use it to replace the correspondent patterns in Mips64r6InstrInfo.td in follow-up commits. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14612 llvm-svn: 255110	2015-12-09 13:24:22 +00:00
Zlatko Buljan	48f1f39bfe	Revert r254897 "[mips][microMIPS] Implement LH, LHE, LHU and LHUE instructions" The committed patch was intended to implement the LH, LHE, LHU and LHUE instructions. After the commit, the test-suite failed with an error message of the form: fatal error: error in backend: Cannot select: t124: i32,ch = load<LD2[%d](tbaa=<0x94acc48>), sext from i16> t0, t2, undef:i32 For that reason I decided to revert commit r254897 and make a new patch which, besides the implementation and standard regression tests, will also have dedicated tests (CodeGen) for the above error. llvm-svn: 255109	2015-12-09 13:07:45 +00:00
JF Bastien	9938425b31	EarlyCSE: fix typo from rL255054. llvm-svn: 255102	2015-12-09 09:05:42 +00:00
Mehdi Amini	ceca971576	Revert "Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933 " This reverts commit r255096. Break the bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/16378/ From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255101	2015-12-09 08:17:42 +00:00
Mehdi Amini	7e88d0da38	The current importing scheme is processing one function at a time, loading the source Module, linking the function in the destination module, and destroying the source Module before repeating with the next function to import (potentially from the same Module). Ideally we would keep the source Module alive and import the next Function needed from this Module. Unfortunately this is not possible because the linker does not leave it in a usable state. However we can do better by first computing the list of all candidates per Module, and only then loading the source Module and importing all the functions we need from it. The trick to process callees is to materialize functions in the source module when building the list of functions to import, and inspect them in their source module, collecting the list of callees for each callee. When we move to the actual import, we will import from each source module exactly once. Each source module is loaded exactly once. The only drawback is that it requires having all the lazy-loaded source Modules in memory at the same time. Currently this patch already considerably improves the link time; a multithreaded link of llvm-dis on my laptop was: real 1m12.175s user 6m32.430s sys 0m10.529s and is now: real 0m40.697s user 2m10.237s sys 0m4.375s Note: this is the full link time (linker+Import+Optimizer+CodeGen) Differential Revision: http://reviews.llvm.org/D15178 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255100	2015-12-09 08:17:35 +00:00
Vikram TV	0876d2d5cf	Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933 llvm-svn: 255096	2015-12-09 05:49:14 +00:00
Vikram TV	74b4111483	Test commit access - Fix few missing '.' in comments of LoopInterchange code. llvm-svn: 255095	2015-12-09 05:16:24 +00:00
Steven Wu	b5104b5884	Fix the order of destructors in LibLTOCodeGenerator Summary: The order of destructors in LTOCodeGenerator gets changed in r254696. It is possible for LTOCodeGenerator to have a MergedModule created in the OwnedContext, in which case the module must be destructed before the context. Reviewers: rafael, dexonsmith Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D15346 llvm-svn: 255092	2015-12-09 03:37:51 +00:00
Ahmed Bougacha	97564c3a1b	[AArch64][ARM] Don't base interleaved op legality on type alloc size. Otherwise, we think that most types that look like they'd fit in a legal vector type are legal (so, basically, any vector type with a size between 33 and 128 bits, I think, since we use pow2 alignment; e.g., v2i25, v3f32, ...). DataLayout::getTypeAllocSize rounds up based on alignment. When checking for target intrinsic legality, that's not what we want: if rounding makes a difference, the type isn't legal, and the target intrinsics shouldn't be used, as they are always assumed legal. One could make the argument that alloc size is ultimately the most relevant here, since we're dealing with LD/ST intrinsics. That's only true if we did legalize them though; that's a problem for another day. Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits. Type::getSizeInBits can't be used because that'd gratuitously break pointer vector support. Some of these uses are currently fine, because we only hit them when the type is already known legal (e.g., r114454). Update them for consistency. It's faster to avoid the rounding anyway! llvm-svn: 255089	2015-12-09 01:19:50 +00:00
Sanjoy Das	b8dced5dfa	Don't drop attributes when inlining through "deopt" operand bundles Test case attached (test case also checks that we don't drop the calling convention, but that functionality was correct before this patch). llvm-svn: 255088	2015-12-09 01:01:28 +00:00
Rafael Espindola	7471abd9ed	Simplify testMergedProgram. It now receives and returns std::unique_ptr. llvm-svn: 255087	2015-12-09 00:55:05 +00:00
Rafael Espindola	bc12cbc359	Simplify memory management. NFC. This passes std::unique_ptr to predicates that are expected to delete their argument. llvm-svn: 255086	2015-12-09 00:51:06 +00:00
Rafael Espindola	50102c29f0	Return std::unique_ptr from SplitFunctionsOutOfModule. NFC. llvm-svn: 255084	2015-12-09 00:34:10 +00:00
Rafael Espindola	000bf49cec	Simplify memory management. NFC. llvm-svn: 255082	2015-12-09 00:18:41 +00:00
Vyacheslav Klochkov	a3cd08b05c	X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes. Reviewer: Simon Pilgrim. Differential Revision: http://reviews.llvm.org/D15317 llvm-svn: 255080	2015-12-09 00:12:13 +00:00
Rafael Espindola	88a8f725cb	Simplify memory management a bit. NFC. llvm-svn: 255079	2015-12-09 00:08:22 +00:00
Rafael Espindola	cab951dd46	Return a std::unique_ptr from CloneModule. NFC. llvm-svn: 255078	2015-12-08 23:57:17 +00:00
Sanjoy Das	42e551b92d	[IndVars] Use any_of and foreach instead of explicit for loops; NFC llvm-svn: 255077	2015-12-08 23:52:58 +00:00
Sanjoy Das	48945cdc15	[OperandBundles] Have PruneEH work correctly with operand bundles. For an invoke with operand bundles, the [op_begin(), op_end()-3] range can contain things other than invoke arguments. This change teaches PruneEH to use arg_begin() and arg_end() explicitly. llvm-svn: 255073	2015-12-08 23:16:52 +00:00
Pirama Arumuga Nainar	e6ccd7b66a	Define selection for v4f16, v8f16 scalar_to_vector Summary: This fixes failure when trying to select insertelement <4 x half> undef, half %a, i64 0 which gets transformed to a scalar_to_vector node. The accompanying v4 and v8 tests fail instruction selection without this patch. Reviewers: ab, jmolloy Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D15322 llvm-svn: 255072	2015-12-08 23:07:06 +00:00
Mehdi Amini	5411d0510c	Fix/Improve Debug print in FunctionImport pass From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255071	2015-12-08 23:04:19 +00:00
Reid Kleckner	8de1fe23ed	[CGP] Reimplement r255055 a different way llvm-svn: 255070	2015-12-08 23:00:03 +00:00
Sanjoy Das	d87e4354a7	[SCEV] Use for-each; NFC llvm-svn: 255069	2015-12-08 22:53:36 +00:00
Mehdi Amini	d16c8065ff	Remove caching in FunctionImport: a Module can't be reused after being linked from The Linker destroys the source module (API change coming to make it explicit) From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255064	2015-12-08 22:39:40 +00:00
Reid Kleckner	e18f92bfe9	Revert "[CGP] Check that we have an insert point before moving llvm.dbg.value around" This reverts commit r255055. Breakage has been reported. llvm-svn: 255063	2015-12-08 22:33:23 +00:00
Sanjoy Das	8a954a0553	[OperandBundles] Fix a transform in simplifycfg Reviewers: pcc, majnemer, reames Subscribers: reames, llvm-commits Differential Revision: http://reviews.llvm.org/D15345 llvm-svn: 255062	2015-12-08 22:26:08 +00:00
Simon Pilgrim	323e00d9c7	[X86][AVX] Fold loads + splats into broadcast instructions On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector. This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element. Fix for PR23022 Differential Revision: http://reviews.llvm.org/D15310 llvm-svn: 255061	2015-12-08 22:17:11 +00:00
Chris Bieneman	48eaa54151	[CMake] Ignore externalizing debuginfo for unit tests If you externalize debug info for unit tests the test runner finds the mach-o inside the dsym bundle and tries to execute it as a test. llvm-svn: 255056	2015-12-08 21:51:48 +00:00
Reid Kleckner	7c005324d5	[CGP] Check that we have an insert point before moving llvm.dbg.value around llvm-svn: 255055	2015-12-08 21:50:52 +00:00
Philip Reames	8fc2cbf933	[EarlyCSE] Value forwarding for unordered atomics This patch teaches the fully redundant load part of EarlyCSE how to forward from atomic and volatile loads and stores, and how to eliminate unordered atomics (only). This patch does not include dead store elimination support for unordered atomics, that will follow in the near future. The basic idea is that we allow all loads and stores to be tracked by the AvailableLoad table. We store a bit in the table which tracks whether load/store was atomic, and then only replace atomic loads with ones which were also atomic. No attempt is made to refine our handling of ordered loads or stores. Those are still treated as full fences. We could pretty easily extend the release fence handling to release stores, but that should be a separate patch. Differential Revision: http://reviews.llvm.org/D15337 llvm-svn: 255054	2015-12-08 21:45:41 +00:00
Simon Pilgrim	0aea1b89eb	[X86][SSE4A] Added fast-isel intrinsics tests As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse4a-builtins.c llvm-svn: 255053	2015-12-08 21:43:41 +00:00
Simon Pilgrim	0ca7cb6334	[X86][SSSE3] Added fast-isel intrinsics tests As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/ssse3-builtins.c llvm-svn: 255052	2015-12-08 21:32:08 +00:00
Simon Pilgrim	9d76810949	[X86][SSE3] Added fast-isel intrinsics tests As discussed on PR24580, this patch adds fast-isel codegen tests to match the IR generated in clang/test/CodeGen/sse3-builtins.c llvm-svn: 255051	2015-12-08 21:27:19 +00:00
Artyom Skrobov	0a37b80bcb	Fix ARMv4T (Thumb1) epilogue generation Summary: Before ARMv5T, Thumb1 code could not pop PC, as described at D14357 and D14986; so we need the special fixup in the epilogue. Reviewers: jroelofs, qcolombet Subscribers: aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D15126 llvm-svn: 255047	2015-12-08 19:59:01 +00:00
Mehdi Amini	bddfbeaf59	Revert "Add Available Externally linkage type to isWeakForLinker()" This reverts r255043, as post-review concerns were raised about its correctness. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255045	2015-12-08 19:13:31 +00:00
Mehdi Amini	69e3ae8d4b	Cleanup test: remove useless alignment From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255044	2015-12-08 19:02:55 +00:00
Mehdi Amini	37c25fa1d1	Add Available Externally linkage type to isWeakForLinker() Per LangRef: "Globals with available_externally linkage are allowed to be discarded at will, and are otherwise the same as linkonce_odr", since linkonce_odr is in this list it makes sense to have available_externally there as well. Reviewers: rafael Differential Revision: http://reviews.llvm.org/D15323 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 255043	2015-12-08 19:01:29 +00:00
Tim Northover	614e8ff855	X86: produce more friendly errors during MachO relocation handling llvm-svn: 255036	2015-12-08 18:31:35 +00:00
Renato Golin	412ee3d45d	[ARM] Allowing SP/PC for AND/BIC mod_imm_not AND/BIC instructions do accept SP/PC, so the register class should be more generic (rGPR -> GPR) to cope with that case. Adding more tests. llvm-svn: 255034	2015-12-08 18:10:58 +00:00
Mike Aizatsky	43de555ad9	adding readability-identifier-naming to llvm clang-tidy configuration. Differential Revision: http://reviews.llvm.org/D15196 llvm-svn: 255028	2015-12-08 17:44:51 +00:00

125192 Commits