llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Dardis	53a3492b71	Summary: Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. This patch was previous committed as r266055 as seemed to have caused some spurious test failures. They did not reappear after further local testing. llvm-svn: 266301	2016-04-14 13:43:17 +00:00
Vasileios Kalintiris	d10ce39254	[mips] Remove duplicate tests and add missing prefixes for *-LABEL checks. NFC. Summary: The only difference between the removed tests and the pre-existing ones, is the materialization of the zero constant, which shouldn't matter for these cases. Reviewers: dsanders, sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18693 llvm-svn: 266285	2016-04-14 09:13:13 +00:00
Adam Nemet	7aab648831	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. llvm-svn: 266282	2016-04-14 08:47:17 +00:00
David Majnemer	0f26b0aeb4	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower to @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279	2016-04-14 07:13:24 +00:00
Matt Arsenault	9cd90712f0	AMDGPU: Implement canonicalize Also add generic DAG node for it. llvm-svn: 266272	2016-04-14 01:42:16 +00:00
Sanjay Patel	748b06514a	[ppc] add tests to show potential andc optimization llvm-svn: 266261	2016-04-13 23:23:30 +00:00
Tim Northover	5c02f9ad28	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Matthias Braun	707e02c273	ARM: Use a callee save register for the swiftself parameter. It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D18901 llvm-svn: 266253	2016-04-13 21:43:25 +00:00
Matthias Braun	588d1cdad4	X86: Use a callee save register for the swiftself parameter. It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D18902 llvm-svn: 266252	2016-04-13 21:43:21 +00:00
Matthias Braun	74a0bd319a	AArch64: Use a callee save registers for swiftself parameters It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D19007 llvm-svn: 266251	2016-04-13 21:43:16 +00:00
Kevin Enderby	8702574557	Start to add real error messages for malformed Mach-O files. And update the existing test cases in test/Object/macho-invalid.test to use llvm-objdump with the -macho option to produce these error messages and stop producing the generic "Invalid data was encountered while parsing the file" message. Working from the beginning of the file, if the mach header is too large for the size of the file and then if the load commands that follow extend past the end of the file these two errors now generate correct error messages. Both of these have existing test cases in test/Object/macho-invalid.test . But the first with macho-invalid-header it will never trigger the error message "mach header extends past the end of the file" using any of the llvm tools as they all use identify_magic() which rejects files with the correct magic number that are too small in size. So I tested this by hacking that code and seeing the error message down in parseHeader() really does happen. So in case there is ever code in llvm that directly calls createMachOObjectFile() this error message will be correctly produced. The second error message of "load commands extends past the end of the file" is triggered by a number of existing tests cases in test/Object/macho-invalid.test . Also other tests trigger different error messages now like "ilocalsym plus nlocalsym in LC_DYSYMTAB load command extends past the end of the symbol table". There are two existing test cases that still get the "Invalid data was encountered ..." error messages that I will tackle next. But they will involve a bit of pluming an Expect<...> up through the call stack and I want to do those as separate changes. FYI, for those test cases that were trying to test specific errors that now get different errors I’ll fix those in follow on changes and create new test cases for those so they test the error they were meant to test. llvm-svn: 266248	2016-04-13 21:17:58 +00:00
Sanjay Patel	d6c53f8727	[x86] add tests to show potential BMI optimization llvm-svn: 266243	2016-04-13 20:40:43 +00:00
Tim Northover	c0bef99bb0	AsmParser: record "# line file" context to calculate location for diag Since we can't emit diagnostics for missing "jmp 1f" labels until the end of the file, we need to be able to restore the context used to calculate file/line. This is basically the "# line file" directive that's being used at the time the expression is seen. rdar://25706972 llvm-svn: 266238	2016-04-13 19:46:54 +00:00
Easwaran Raman	cbd3989742	Test case for r265852. llvm-svn: 266237	2016-04-13 19:43:31 +00:00
Peter Collingbourne	5413f6f863	LibDriver: Silently do nothing when provided no inputs. This behavior is strange, but it matches lib.exe. Based on a patch by Nico Weber. Fixes PR27335. llvm-svn: 266236	2016-04-13 19:36:04 +00:00
Betul Buyukkurt	bf8554c279	[PGO] Remove redundant VP instrumentation LLVM optimization passes may reduce a profiled target expression to a constant. Removing runtime calls at such instrumentation points would help speedup the runtime of the instrumented program. llvm-svn: 266229	2016-04-13 18:52:19 +00:00
Nemanja Ivanovic	87bcae366d	[PowerPC] Basic support for P9 byte comparison and count trailing zero insns This patch corresponds to review: http://reviews.llvm.org/D17850 This patch implements the following instructions: cmprb, cmpeqb, cnttzw, cnttzw., cnttzd, cnttzd. llvm-svn: 266228	2016-04-13 18:51:18 +00:00
Evandro Menezes	8d53f88162	[AArch64] Disable LDP/STP for quads Disable LDP/STP for quads on Exynos M1 as they are not as efficient as pairs of regular LDR/STR. Patch by Abderrazek Zaafrani <a.zaafrani@samsung.com>. llvm-svn: 266223	2016-04-13 18:31:45 +00:00
Davide Italiano	266a665f9f	Revert "[IR/Verifier] Each DISubprogram with isDefinition: true must belong to a CU." This reverts commit r266102. The O(N^2) verifier check causes timeouts in LTO test suite. llvm-svn: 266221	2016-04-13 18:08:07 +00:00
Nirav Dave	2477491a92	Cleanup Store Merging in UseAA case This patch fixes a bug (PR26827) when using anti-aliasing in store merging. This sets the chain users of the component stores to point to the new store instead of the component stores chain parent. Reviewers: jyknight Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18909 llvm-svn: 266217	2016-04-13 17:27:26 +00:00
Mehdi Amini	b5b289339b	Revert "Make aliases explicit in the summary" Inadvertently commited... This reverts commit e618ec93786d99df2ddf280ad2d5e02f5516cecf. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266215	2016-04-13 17:20:07 +00:00
Mehdi Amini	ce744a95fd	Make aliases explicit in the summary Summary: To be able to work accurately on the reference graph when taking decision about internalizing, promoting, renaming, etc. We need to have the alias information explicit. Reviewers: tejohnson Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18836 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266214	2016-04-13 17:18:42 +00:00
Tim Northover	b8a1ecfc62	AArch64: don't create instructions that write to xzr/wzr twice. These are unpredictable even on AArch64. Patch by Yichao Yu. llvm-svn: 266206	2016-04-13 16:25:39 +00:00
Artem Tamazov	eb4d5a9b0b	[AMDGPU][llvm-mc] Support of Trap Handler registers (TTMP0..11 and TBA/TMA)git status Tests added along with implemented feature. Note that there is a small leftover of unecessary MI sheduling issue (more info in the review). CodeGen/AMDGPU/salu-to-valu.ll updated to fix the false regression. TODO: Support for TTMP quads, comma-separated syntax in "[]" and more. Differential Revision: http://reviews.llvm.org/D17825 llvm-svn: 266205	2016-04-13 16:18:41 +00:00
Zoran Jovanovic	2f6845ba39	[mips] Fix emitAtomicCmpSwapPartword to handle 64 bit pointers correctly Differential Revision: http://reviews.llvm.org/D18995 llvm-svn: 266204	2016-04-13 16:02:25 +00:00
Vasileios Kalintiris	3751d4114c	[mips] Sign-extend i32 values truncated from previously zero-extended i32 values. Summary: This is a special case for MIPS64 because the architecture requires properly 32-bit sign-extended values in the register containers. Additionaly, we merge consecutive trunc + AssertZExt nodes in order to avoid unnecessary sign-extensions when the extension comes from a type smaller than i32. Reviewers: dsanders Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18893 llvm-svn: 266203	2016-04-13 15:07:45 +00:00
David L Kreitzer	752c1448fe	Simplify strlen to a subtraction for certain cases. Patch by Li Huang (li1.huang@intel.com) Differential Revision: http://reviews.llvm.org/D18230 llvm-svn: 266200	2016-04-13 14:31:06 +00:00
Simon Pilgrim	cc901b57d5	[X86][SSE] Regenerated vector integer absolute tests llvm-svn: 266194	2016-04-13 12:40:22 +00:00
Petar Jovanovic	644b8c1a5d	Calculate __builtin_object_size when pointer depends on a condition This patch fixes calculating of builtin_object_size if it depends on a condition. Before this patch compiler did not know how to calculate the object size when it finds a condition that cannot be eliminated. This patch enables calculating of builtin_object_size even in case when condition cannot be eliminated by choosing minimum or maximum value as a result from condition. Choosing minimum or maximum value from condition is based on the second argument of __builtin_object_size function. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18438 llvm-svn: 266193	2016-04-13 12:25:25 +00:00
Simon Pilgrim	b6ef8b730d	Added missing autogeneration note llvm-svn: 266185	2016-04-13 09:28:44 +00:00
Zlatko Buljan	58d6a959be	[mips][microMIPS] Add CodeGen support for DIV, MOD, DIVU, MODU, DDIV, DMOD, DDIVU and DMODU instructions Differential Revision: http://reviews.llvm.org/D17137 This patch was reverted after the revertion of dependant patch http://reviews.llvm.org/D17068. There was the problem with test-suite failure. The problem is hopefully solved with dependant patch so this patch is commited again. llvm-svn: 266179	2016-04-13 08:02:26 +00:00
David Majnemer	3ee5f34469	[InstCombine] We folded an fcmp to an i1 instead of a vector of i1 Remove an ad-hoc transform in InstCombine and replace it with more general machinery (ValueTracking, InstructionSimplify and VectorUtils). This fixes PR27332. llvm-svn: 266175	2016-04-13 06:55:52 +00:00
Hrvoje Varga	11dd31df9a	[mips][microMIPS] Fix for "Cannot copy registers" assertion Differential Revision: http://reviews.llvm.org/D17068 This changes contains fix for failing test-suite. So, this patch should hopefully work now. llvm-svn: 266171	2016-04-13 06:17:21 +00:00
Mehdi Amini	24d3414f06	Refactor the InternalizePass into a helper class, and expose it through a public free function (NFC) There is really no reason to require to instanciate a pass manager to internalize. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266167	2016-04-13 05:25:08 +00:00
Wei Mi	9a16d655c7	Recommit r265547, and r265610,r265639,r265657 on top of it, plus two fixes with one about error verify-regalloc reported, and another about live range update of phi after rematerialization. r265547: Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes siginificant compile time issue when N is large, it is also important for performance since it removes redundent spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerializaiton and keep it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Patches on top of r265547: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" Differential Revision: http://reviews.llvm.org/D15302 Differential Revision: http://reviews.llvm.org/D18934 Differential Revision: http://reviews.llvm.org/D18935 Differential Revision: http://reviews.llvm.org/D18936 llvm-svn: 266162	2016-04-13 03:08:27 +00:00
Matt Arsenault	887d4767b7	AMDGPU: Add test for m0 initialization in basic loop Initialization of m0 is emitted for each LDS operation, so every block with LDS usage ends up with one. MachineLICM used to fail to hoist this out of the loop, so every loop iteration with LDS usage in it would re-initialize it. This seems to be fixed now, so add a test to make sure that it stays this way. llvm-svn: 266156	2016-04-13 00:39:52 +00:00
Matt Arsenault	b34eea9cb5	AMDGPU: Remove leftover ShaderType attributes in tests llvm-svn: 266155	2016-04-13 00:39:48 +00:00
Justin Bogner	263f314ba7	CodeGen: Clear the MFI's save and restore point after PrologEpilogInserter This state is no longer useful and not guaranteed to be valid in later codegen passes. For example, see the added test, which would print a savepoint of %bb.-1 without this change, and crashes with a use-after-free error under ASan if you apply the recycling allocator patch from llvm.org/PR26808. llvm-svn: 266150	2016-04-12 23:21:53 +00:00
Sanjay Patel	5e5056d939	[x86, InstCombine] fix masked load pass-through operand to be a zero vector This bug was introduced with: http://reviews.llvm.org/rL262269 AVX masked loads are specified to set vector lanes to zero when the high bit of the mask element for that lane is zero: "If the mask is 0, the corresponding data element is set to zero in the load form of these instructions, and unmodified in the store form." --Intel manual Differential Revision: http://reviews.llvm.org/D19017 llvm-svn: 266148	2016-04-12 23:16:23 +00:00
Davide Italiano	245383c584	[DebugInfo] Add error message to test. Suggested by Rafael as post-commit review (r266102). llvm-svn: 266134	2016-04-12 21:44:16 +00:00
Mehdi Amini	d5faa267c4	Add a pass to name anonymous/nameless function Summary: For correct handling of alias to nameless function, we need to be able to refer them through a GUID in the summary. Here we name them using a hash of the non-private global names in the module. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18883 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266132	2016-04-12 21:35:28 +00:00
Mehdi Amini	68da426eea	Move summary creation out of llvm-as into opt Summary: Let keep llvm-as "dumb": it converts textual IR to bitcode. This commit removes the dependency from llvm-as to libLLVMAnalysis. We'll add back summary in llvm-as if we get to a textual representation for it at some point. In the meantime, opt seems like a better place for that. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19032 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266131	2016-04-12 21:35:18 +00:00
Nicolai Haehnle	df77c9ada4	AMDGPU: add llvm.amdgcn.buffer.load/store intrinsics Summary: They correspond to BUFFER_LOAD/STORE_DWORD[_X2,X3,X4] and mostly behave like llvm.amdgcn.buffer.load/store.format. They will be used by Mesa for SSBO and atomic counters at least when robust buffer access behavior is desired. (These instructions perform no format conversion and do buffer range checking per component.) As a side effect of sharing patterns with llvm.amdgcn.buffer.store.format, it has become trivial to add support for the f32 and v2f32 variants of that intrinsic, so the patch does so. Also DAG-ify (and fix) some tests that I noticed intermittent failures in while developing this patch. Some tests were (temporarily) adjusted for the required mayLoad/hasSideEffects changes to the BUFFER_STORE_DWORD* instructions. See also http://reviews.llvm.org/D18291. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18292 llvm-svn: 266126	2016-04-12 21:18:10 +00:00
James Y Knight	19f6cce4e3	Add __atomic_* lowering to AtomicExpandPass. (Recommit of r266002, with r266011, r266016, and not accidentally including an extra unused/uninitialized element in LibcallRoutineNames) AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266115	2016-04-12 20:18:48 +00:00
Derek Schuff	b861ec8734	[WebAssembly] Fix debug info in reg-stackify.ll test It lacked a CU and thus became invalid with r266102 llvm-svn: 266114	2016-04-12 20:12:05 +00:00
Tom Stellard	ab1d3a9d50	AMDGPU/SI: Insert wait states required after v_readfirstlane on SI Summary: We will be able to handle this case much better once the hazard recognizer is finished, but this conservative implementation fixes a hang with the piglit test: spec/arb_arrays_of_arrays/execution/sampler/fs-nested-struct-arrays-nonconst-nested-arra Reviewers: arsenm, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18988 llvm-svn: 266105	2016-04-12 18:40:43 +00:00
Matt Arsenault	3b08238f78	AMDGPU: Eliminate half of i64 or if one operand is zero_extend from i32 This helps clean up some of the mess when expanding unaligned 64-bit loads when changed to be promote to v2i32, and fixes situations where or x, 0 was emitted after splitting 64-bit ors during moveToVALU. I think this could be a generic combine but I'm not sure. llvm-svn: 266104	2016-04-12 18:24:38 +00:00
Davide Italiano	b390d8ee39	[IR/Verifier] Each DISubprogram with isDefinition: true must belong to a CU. Add a check to catch violations. ~60 tests were broken and prevented this change to be committed. Adrian and I (thanks Adrian!) went through them in the last week or so updating. The check can be done more efficiently but I'd still like to get this in ASAP to avoid more broken tests to be checked in (if any). PR: 27101 llvm-svn: 266102	2016-04-12 18:22:33 +00:00
Nicolai Haehnle	279970c0dc	AMDGPU/SI: Fix a mis-compilation of multi-level breaks Summary: Under certain circumstances, multi-level breaks (or what is understood by the control flow passes as such) could be miscompiled in a way that causes infinite loops, by emitting incorrect control flow intrinsics. This fixes a hang in dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.conditional_continue_vertex Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18967 llvm-svn: 266088	2016-04-12 16:10:38 +00:00
Artur Pilipenko	dbe0bc8df4	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 llvm-svn: 266086	2016-04-12 15:58:04 +00:00
Davide Italiano	25570c5423	[Bitcode] Fix + regenerate old test so that it includes a DICompileUnit. llvm-svn: 266085	2016-04-12 15:51:23 +00:00
Geoff Berry	c0739d8305	[ScheduleDAGInstrs] Handle instructions with multiple MMOs Summary: In getUnderlyingObjectsForInstr(): Don't give up on instructions with multiple MMOs, instead look through all the MMOs and if they all meet the conservative criteria previously used for single MMO instructions, then return all of the underlying objects derived from the MMOs. The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid the case where multiple underlying objects are present and are related in such a way that successive iterations of the loop end up adding a dependency from an instruction to itself. Reviewers: atrick, hfinkel Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18093 llvm-svn: 266084	2016-04-12 15:50:19 +00:00
Than McIntosh	b9049084a0	Test commit, NFC. Adds a blank line. llvm-svn: 266082	2016-04-12 15:35:05 +00:00
Petar Jovanovic	48e4db1ca2	[mips] add assembler support for .set arch=octeon This patch enables assembler support for .set arch=octeon. It will fix issues with inline assembler when this directive is used. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D18548 llvm-svn: 266081	2016-04-12 15:28:16 +00:00
Aaron Ballman	01a9854ee8	Moving llvm-test-depends and test-depends into the Tests folder; NFC, this simply cleans up the generated solution so that these targets don't live in the root folder of the IDE. llvm-svn: 266078	2016-04-12 15:09:14 +00:00
Matt Arsenault	64fa2f4513	AMDGPU: Implement i64 global atomics llvm-svn: 266075	2016-04-12 14:05:11 +00:00
Matt Arsenault	a9dbdcae04	AMDGPU: Add atomic_inc + atomic_dec intrinsics These are different than atomicrmw add 1 because they have an additional input value to clamp the result. llvm-svn: 266074	2016-04-12 14:05:04 +00:00
Matt Arsenault	44e5483ada	AMDGPU: Add volatile to test loads and stores When the memory vectorizer is enabled, these tests break. These tests don't really care about the memory instructions, and it's easier to write check lines with the unmerged loads. llvm-svn: 266071	2016-04-12 13:38:18 +00:00
Simon Pilgrim	edf100e4e7	[X86] Regenerated avx512 calling convention test checks llvm-svn: 266070	2016-04-12 13:31:01 +00:00
Rafael Espindola	d41b54be11	This reverts commit r266002, r266011 and r266016. They broke the msan bot. Original message: Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266062	2016-04-12 12:30:25 +00:00
Simon Dardis	ee1590f5f0	Revert "[mips] MIPSR6 Compact branch aliases" This reverts commit r266055. ps4-buildslave2 is highlighting a failure. llvm-svn: 266061	2016-04-12 12:22:45 +00:00
Simon Dardis	703c864fe3	[mips] MIPSR6 Compact branch aliases Summary: Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D18856 llvm-svn: 266055	2016-04-12 10:41:53 +00:00
Mehdi Amini	f59f2bb1b5	Refactor the Internalize stage of libLTO in a separate file (NFC) This is intended to be shared by the ThinLTOCodeGenerator. Note that there is a change in the way the verifier is run, previously it was ran as a Pass on the merged module during internalization. While now the verifier is called explicitely on the merged module outside of the internalize "pass pipeline". What remains strange in the API is the fact that `DisableVerify` in the API does not disable this initial verifier. Differential Revision: http://reviews.llvm.org/D19000 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266047	2016-04-12 06:34:10 +00:00
Chuang-Yu Cheng	94f58e79ae	[PPC64] Mark CR0 Live if PPCInstrInfo::optimizeCompareInstr Creates a Use of CR0 Resolve Bug 27046 (https://llvm.org/bugs/show_bug.cgi?id=27046). The PPCInstrInfo::optimizeCompareInstr function could create a new use of CR0, even if CR0 were previously dead. This patch marks CR0 live if a use of CR0 is created. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D18884 llvm-svn: 266040	2016-04-12 03:10:52 +00:00
Chuang-Yu Cheng	6efde2fb45	[PPC64] Use mfocrf in prologue when we only need to save 1 nonvolatile CR field In the ELFv2 ABI, we are not required to save all CR fields. If only one nonvolatile CR field is clobbered, use mfocrf instead of mfcr to selectively save the field, because mfocrf has short latency compares to mfcr. Thanks Nemanja's invaluable hint! Reviewers: nemanjai tjablin hfinkel kbarton http://reviews.llvm.org/D17749 llvm-svn: 266038	2016-04-12 03:04:44 +00:00
George Burgess IV	278199f615	Add the allocsize attribute to LLVM. `allocsize` is a function attribute that allows users to request that LLVM treat arbitrary functions as allocation functions. This patch makes LLVM accept the `allocsize` attribute, and makes `@llvm.objectsize` recognize said attribute. The review for this was split into two patches for ease of reviewing: D18974 and D14933. As promised on the revisions, I'm landing both patches as a single commit. Differential Revision: http://reviews.llvm.org/D14933 llvm-svn: 266032	2016-04-12 01:05:35 +00:00
Quentin Colombet	5933eb84d3	[AArch64] Add test cases for the repairing of physical registers. llvm-svn: 266030	2016-04-12 00:43:40 +00:00
Quentin Colombet	f43f882f37	[AArch64] Add a test case for the propagation of register banks through phis. llvm-svn: 266028	2016-04-12 00:32:55 +00:00
Quentin Colombet	99d19d89e9	[AArch64] Add a test case for the repairing of definitions. llvm-svn: 266026	2016-04-12 00:25:22 +00:00
JF Bastien	4f43cfd2c2	MergeFunctions: test alloca better r237193 fix handling of alloca size / align in MergeFunctions, but only tested one and didn't follow FunctionComparator::cmpOperations's usual comparison pattern. It also didn't update Instruction.cpp:haveSameSpecialState which I'll do separately. llvm-svn: 266022	2016-04-12 00:03:26 +00:00
Quentin Colombet	d52f4128d6	[AArch64] Test that RegBankSelect inserts the proper copies to fix the register bank assignments. llvm-svn: 266021	2016-04-12 00:00:42 +00:00
Adrian Prantl	74c0c09666	Add a missing DICompileUnit to testcase. llvm-svn: 266019	2016-04-11 23:30:29 +00:00
Mehdi Amini	ae280e54a9	ThinLTO renaming: use module hash instead of position in the summary This is more robust to changes in the link ordering. Differential Revision: http://reviews.llvm.org/D18946 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266018	2016-04-11 23:26:46 +00:00
Adrian Prantl	ea7eb841a1	Legalize the debug info in this testcase in anticipation of future Verifier improvements. llvm-svn: 266017	2016-04-11 23:26:31 +00:00
Evgeniy Stepanov	f17120a85f	[safestack] Add canary to unsafe stack frames Add StackProtector to SafeStack. This adds limited protection against data corruption in the caller frame. Current implementation treats all stack protector levels as -fstack-protector-all. llvm-svn: 266004	2016-04-11 22:27:48 +00:00
Tim Northover	a6dea06fe3	ARM: use r7 as the frame-pointer on all MachO targets. This is better for a few reasons: + It matches the other tooling for iOS. + It matches EABI in more cases (i.e. Thumb-mode, and in practice we don't use ARM mode). + It leads to infinitesimally smaller code (0.2%, yay!). rdar://25369506 llvm-svn: 266003	2016-04-11 22:27:40 +00:00
James Y Knight	b91d38c5fe	Add __atomic_* lowering to AtomicExpandPass. AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and cmpxchg instructions to __atomic_* library calls, when the target doesn't support atomics of a given size. This is the first step towards moving all atomic lowering from clang into llvm. When all is done, the behavior of __sync_* builtins, __atomic_* builtins, and C11 atomics will be unified. Previously LLVM would pass everything through to the ISelLowering code. There, unsupported atomic instructions would turn into __sync_* library calls. Because of that behavior, Clang currently avoids emitting llvm IR atomic instructions when this would happen, and emits __atomic_* library functions itself, in the frontend. This change makes LLVM able to emit __atomic_* libcalls, and thus will eventually allow clang to depend on LLVM to do the right thing. It is advantageous to do the new lowering to atomic libcalls in AtomicExpandPass, before ISel time, because it's important that all atomic operations for a given size either lower to __atomic_* libcalls (which may use locks), or native instructions which won't. No mixing and matching. At the moment, this code is enabled only for SPARC, as a demonstration. The next commit will expand support to all of the other targets. Differential Revision: http://reviews.llvm.org/D18200 llvm-svn: 266002	2016-04-11 22:22:33 +00:00
Manman Ren	2aa7f23272	swifterror: fix up a testing case. llvm-svn: 266000	2016-04-11 21:45:33 +00:00
Davide Italiano	0778bec6f9	[DebugInfo/Test] Add CU as required. llvm-svn: 265999	2016-04-11 21:16:48 +00:00
Simon Pilgrim	82e54871d0	[DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) anytime before LegalizeVectorOprs xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well. The main aim with this to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998	2016-04-11 21:10:33 +00:00
Manman Ren	5751814eda	Swift Calling Convention: swifterror target support. Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997	2016-04-11 21:08:06 +00:00
Adrian Prantl	2fec729c50	Revert accidentally committed change llvm-svn: 265996	2016-04-11 21:00:26 +00:00
Adrian Prantl	115edcaec1	Add missing DICompileUnit to this testcase llvm-svn: 265995	2016-04-11 20:58:57 +00:00
Zachary Turner	4d5535a70a	Fix some display bugs in llvm-pdbdump. We were incorrectly reporting all non-64 bit integers as int64s. This is most evident when trying to print the "short" type, but in theory could happen with chars too (although usually chars use a different builtin type). Additionally, we were using the wrong check when deciding whether to print an enum definition as a global enum. We were checking whether or not the enum was "nested", and if so saving it until we print the class definition that it was nested in. But this is not correct in rare situations where the enum is nested, but the class that it's nested in does not have type information in the PDB. So instead we check if there is a class definition for the parent in the PDB. If so we save it for later, otherwise we print it. llvm-svn: 265993	2016-04-11 20:39:17 +00:00
Tom Stellard	0ffdf65eaa	Revert "AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute" This reverts commit r263720. Just confirmed that s_waitcnt is required after ds_permute/ds_bpermute. llvm-svn: 265992	2016-04-11 20:38:40 +00:00
Tim Northover	6b3169bb97	MCParser: diagnose missing directional labels more clearly. Before, ELF at least managed a diagnostic but it was a completely untraceable "undefined symbol" error. MachO had a variety of even worse behaviours: crash, emit corrupt file, or an equally bad message. llvm-svn: 265984	2016-04-11 19:50:46 +00:00
Matthew Simpson	53207a99f9	[LoopUtils, LV] Fix PR27246 (first-order recurrences) This patch ensures that when we detect first-order recurrences, we reject a phi node if its previous value is also a phi node. During vectorization the initial and previous values of the recurrence are shuffled together to create the value for the current iteration. However, phi nodes are not widened like other instructions. This fixes PR27246. Differential Revision: http://reviews.llvm.org/D18971 llvm-svn: 265983	2016-04-11 19:48:18 +00:00
Davide Italiano	7469644259	[DebugInfo] Fix even more tests to include DICompileunit. llvm-svn: 265980	2016-04-11 18:53:27 +00:00
Adrian Prantl	f95164227e	Fix missing DICompileUnits in testcases llvm-svn: 265974	2016-04-11 18:15:44 +00:00
Sanjay Patel	1dd212f900	[InstCombine] consolidate tests for related bugs llvm-svn: 265973	2016-04-11 17:58:37 +00:00
Hemant Kulkarni	9b1b7f0803	[llvm-readobj] Add ELF hash histogram printing Differential Revision: http://reviews.llvm.org/D18907 llvm-svn: 265967	2016-04-11 17:15:30 +00:00
Adrian Prantl	7452093e99	Make the distinct DISubprogram in this testcase really distinct. llvm-svn: 265962	2016-04-11 16:58:40 +00:00
Adrian Prantl	d1f52b672a	Update discriminator testcases to use proper NoDebug CUs instead of omitting !llvm.dbg.cu. llvm-svn: 265961	2016-04-11 16:58:35 +00:00
Sanjay Patel	816ec8882a	[InstCombine] don't try to shift an illegal amount (PR26760) This is the straightforward fix for PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 But we still need to make some changes to generalize this helper function and then send the lshr case into here. llvm-svn: 265960	2016-04-11 16:50:32 +00:00
Adrian Prantl	b01a4d48ac	More upgrading of old- and very-old-style debug info in testcases. llvm-svn: 265953	2016-04-11 15:53:44 +00:00
Sanjoy Das	f9d88e650b	This reverts commit r265913 and r265912 See PR27315 r265913: "[IndVars] Eliminate op.with.overflow when possible" r265912: "[SCEV] See through op.with.overflow intrinsics" llvm-svn: 265950	2016-04-11 15:26:18 +00:00
Petar Jovanovic	e578e970cb	[mips] Make Static a default relocation model for MIPS codegen This change follows up defaults for GCC and Clang, so LLVM does not differ from them. While number of the test files are touched with this change, they all keep the old (expected) behaviour with the explicit option: "-relocation-model=pic" The tests that have not been touched are insensitive to relocation model. Differential Revision: http://reviews.llvm.org/D17995 llvm-svn: 265949	2016-04-11 15:24:23 +00:00
Daniel Sanders	a45d3e439f	[mips] Trivial corrections to range checked immediates. Summary: SYNC has a 5-bit unsigned immediate. Move MIPS16-specific pcrel16 operand to Mips16 files. Reviewers: vkalintiris Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18755 llvm-svn: 265947	2016-04-11 15:20:40 +00:00
Sanjay Patel	2d2d67994c	[InstCombine] replace test that no longer works as intended This is step 1 of refactoring to solve PR26760: https://llvm.org/bugs/show_bug.cgi?id=26760 llvm-svn: 265946	2016-04-11 15:19:44 +00:00
Ulrich Weigand	1bac911c58	[SystemZ] Add SVC instruction This is going to be useful for inline assembly only. Author: koriakin Differential Revision: http://reviews.llvm.org/D18952 llvm-svn: 265943	2016-04-11 14:35:39 +00:00
Teresa Johnson	2d5487cf44	[ThinLTO] Move summary computation from BitcodeWriter to new pass Summary: This is the first step in also serializing the index out to LLVM assembly. The per-module summary written to bitcode is moved out of the bitcode writer and to a new analysis pass (ModuleSummaryIndexWrapperPass). The pass itself uses a new builder class to compute index, and the builder class is used directly in places where we don't have a pass manager (e.g. llvm-as). Because we are computing summaries outside of the bitcode writer, we no longer can use value ids created by the bitcode writer's ValueEnumerator. This required changing the reference graph edge type to use a new ValueInfo class holding a union between a GUID (combined index) and Value* (permodule index). The Value* are converted to the appropriate value ID during bitcode writing. Also, this enables removal of the BitWriter library's dependence on the Analysis library that was previously required for the summary computation. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18763 llvm-svn: 265941	2016-04-11 13:58:45 +00:00
Oliver Stannard	c869e9158d	[ARM] Avoid switching ARM/Thumb mode on .arch/.cpu directive When we see a .arch or .cpu directive, we should try to avoid switching ARM/Thumb mode if possible. If we do have to switch modes, we also need to emit the correct mapping symbol for the new ISA. We did not do this previously, so could emit ARM code with Thumb mapping symbols (or vice-versa). The GAS behaviour is to always stay in the same mode, and to emit an error on any instructions seen when the current mode is not available on the current target. We can't represent that situation easily (we assume that Thumb mode is available if ModeThumb is set), so we differ from the GAS behaviour when switching to a target that can't support the old mode. I've added a warning for when this implicit mode-switch occurs. Differential Revision: http://reviews.llvm.org/D18955 llvm-svn: 265936	2016-04-11 13:06:28 +00:00
Ulrich Weigand	848a513d0a	[SystemZ] Support conditional indirect sibling calls via BCR This adds a conditional variant of CallBR instruction, CallBCR. Also, it can be fused with integer comparisons, resulting in one of the new C*BCall instructions. In addition to CallBRCL limitations, this has another one: it won't trigger if the function to call isn't already in %r1 - see f22 in the test for an example (it's also why the loads in tests are volatile). Author: koriakin Differential Revision: http://reviews.llvm.org/D18928 llvm-svn: 265933	2016-04-11 12:12:32 +00:00
Simon Pilgrim	ac0cac3b0d	[X86] Added extra widening tests for and/xor/or bit operations Add tests for bitcasting an illegal vector to/from a legal scalar Additional tests requested for D18944 llvm-svn: 265930	2016-04-11 11:10:36 +00:00
Simon Pilgrim	ff434aaa3a	[X86] Added extra widening tests for and/xor/or bit operations To make sure we're dealing with both cases of legal/illegal number of vector elements and legal/illegal vector element types llvm-svn: 265929	2016-04-11 10:58:52 +00:00
Simon Pilgrim	d839fe9cfb	[X86] Regenerated sdglue test checks llvm-svn: 265927	2016-04-11 10:22:05 +00:00
Simon Pilgrim	da730769fb	[X86] Added widening tests for and/xor/or bit operations Part of additional tests requested for D18944 llvm-svn: 265925	2016-04-11 10:16:27 +00:00
Andrey Turetskiy	9df334c28e	[X86] Restrict max long nop length for Lakemont. Restrict the max length of long nops for Lakemont to 7. Experiments on MCU benchmarks (Dhrystone, Coremark) show that this is the most optimal length. Differential Revision: http://reviews.llvm.org/D18897 llvm-svn: 265924	2016-04-11 10:07:36 +00:00
Sanjoy Das	a07ad647ee	[IndVars] Eliminate op.with.overflow when possible Summary: If we can prove that an op.with.overflow intrinsic does not overflow, we can get rid of the intrinsic, and replace it with non-wrapping arithmetic. Reviewers: atrick, regehr Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18685 llvm-svn: 265913	2016-04-10 22:50:31 +00:00
Sanjoy Das	3c529a40ca	[SCEV] See through op.with.overflow intrinsics Summary: This change teaches SCEV to see reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 llvm-svn: 265912	2016-04-10 22:50:26 +00:00
Simon Pilgrim	2d463b1a0f	[X86][AVX512] Add vector integer division by constant tests Added sdiv/srem and udiv/urem tests cases for 512 bit vectors. llvm-svn: 265903	2016-04-10 17:14:26 +00:00
Simon Pilgrim	d263fdc512	[X86][AVX512BW] Add support for v64i8 multiplies Extend the existing lowering of vXi8 multiplies to support v64i8 on avx512bw targets. I added the Lower512IntArith helper function to help with this - not sure how often this could be used in the future, but it seemed better than putting all that logic inside LowerMUL. Differential Revision: http://reviews.llvm.org/D18937 llvm-svn: 265902	2016-04-10 17:02:48 +00:00
Elena Demikhovsky	751ed0a06a	Loop vectorization with uniform load Vectorization cost of uniform load wasn't correctly calculated. As a result, a simple loop that loads a uniform value wasn't vectorized. Differential Revision: http://reviews.llvm.org/D18940 llvm-svn: 265901	2016-04-10 16:53:19 +00:00
Simon Pilgrim	bab9979ee9	[X86][AVX512] Regenerated mask op tests llvm-svn: 265898	2016-04-10 14:16:03 +00:00
Jeroen Ketema	562380a331	[OCaml] Expose the LLVM diagnostic handler Differential Revision: http://reviews.llvm.org/D18891 llvm-svn: 265897	2016-04-10 13:55:53 +00:00
Charles Davis	2f65f35c27	[CodeGen] Don't assume that fixed stack objects are aligned in a stack-realigned function. Summary: After we make the adjustment, we can assume that for local allocas, but not for stack parameters, the return address, or any other fixed stack object (which has a negative offset and therefore lies prior to the adjusted SP). Fixes PR26662. Reviewers: hfinkel, qcolombet, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D18471 llvm-svn: 265886	2016-04-09 23:34:42 +00:00
Davide Italiano	7aa47094b2	[MC] support TLSDESC and TLSCALL / GNU2 tls dialect Differential Revision: http://reviews.llvm.org/D18885 llvm-svn: 265881	2016-04-09 20:32:33 +00:00
Adrian Prantl	3891e9e859	Drop debug info for DISubprograms that are not referenced by anything This patch drops the debug info for all DISubprograms that are (a) not attached to an llvm::Function and (b) not indirectly reachable via inline scopes from any surviving Function and (c) not reachable from a type (i.e.: member functions). Background: I'm currently working on a patch to reverse the pointers between DICompileUnit and DISubprogram (for more info check Duncan's RFC on lazy-loading of debug info metadata http://lists.llvm.org/pipermail/llvm-dev/2016-March/097419.html). The idea is to remove the list of subprograms from DICompileUnit and instead point to the owning compile unit from each DISubprogram. After doing this all DISubprograms fulfilling the above criteria will be implicitly dropped unless we go through an extra effort to preserve them. http://reviews.llvm.org/D18477 <rdar://problem/25256815> llvm-svn: 265876	2016-04-09 18:10:22 +00:00
Sanjay Patel	4abae4e0fa	[x86] use BMI 'andn' for logic + compare ops With BMI, we can use 'andn' to save an instruction when the result is only used in a compare. This is related to one of the potential sequences to check 'isfinite' in: https://llvm.org/bugs/show_bug.cgi?id=27164 Differential Revision: http://reviews.llvm.org/D18910 llvm-svn: 265875	2016-04-09 16:02:52 +00:00
Simon Pilgrim	1cc5712763	[X86][XOP] Support for VPPERM 2-input shuffle mask decoding This patch adds support for decoding XOP VPPERM instruction when it represents a basic shuffle. The mask decoding required the existing MCInstrLowering code to be updated to support binary shuffles - the implementation now matches what is done in X86InstrComments.cpp. Differential Revision: http://reviews.llvm.org/D18441 llvm-svn: 265874	2016-04-09 14:51:26 +00:00
Sanjoy Das	dd77e1e6a5	Maintain calling convention when inling calls to llvm.deoptimize The behavior here was buggy -- we'd forget the calling convention after inlining a callsite calling llvm.deoptimize. llvm-svn: 265867	2016-04-09 00:22:59 +00:00
Adrian Prantl	5992a72b4d	Support the Nodebug emission kind for DICompileUnits. Sample-based profiling and optimization remarks currently remove DICompileUnits from llvm.dbg.cu to suppress the emission of debug info from them. This is somewhat of a hack and only borderline legal IR. This patch uses the recently introduced NoDebug emission kind in DICompileUnit to achieve the same result without breaking the Verifier. A nice side-effect of this change is that it is now possible to combine NoDebug and regular compile units under LTO. http://reviews.llvm.org/D18808 <rdar://problem/25427165> llvm-svn: 265861	2016-04-08 22:43:03 +00:00
Tim Shen	0012756489	[SSP] Remove llvm.stackprotectorcheck. This is a cleanup patch for SSP support in LLVM. There is no functional change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't actually lowering it in SelectBasicBlock; rather, it adds check code in FinishBasicBlock, ignoring the position where the intrinsic is inserted (See FindSplitPointForStackProtector()). llvm-svn: 265851	2016-04-08 21:26:31 +00:00
Sanjay Patel	c0a627524d	[x86] show missed opportunities to use andn llvm-svn: 265850	2016-04-08 21:26:11 +00:00
Sanjay Patel	a699ecc55c	[x86] regenerate checks for BMI tests llvm-svn: 265841	2016-04-08 20:29:39 +00:00
James Y Knight	fec5c64b63	Add trailing colons to labels in a test. This will avoid matching on the FILENAME if it happened to contain, say, "f4" anywhere in the file path. llvm-svn: 265837	2016-04-08 19:49:03 +00:00
Nirav Dave	66f485f4e2	Fix Load Control Dependence in MemCpy Generation In Memcpy lowering we had missed a dependence from the load of the operation to successor operations. This causes us to potentially construct an in initial DAG with a memory dependence not fully represented in the chain sub-DAG but rather require looking at the entire DAG breaking alias analysis by allowing incorrect repositioning of memory operations. To work around this, r200033 changed DAGCombiner::GatherAllAliases to be conservative if any possible issues to happen. Unfortunately this check forbade many non-problematic situations as well. For example, it's common for incoming argument lowering to add a non-aliasing load hanging off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would find that other (unvisited) use of the EntryNode chain, and just give up entirely. Furthermore, the check was incomplete: it would not actually detect all such potentially problematic DAG constructions, because GatherAllAliases did not guarantee to visit all chain nodes going up to the root EntryNode. This is in general fine -- giving up early will just miss a potential optimization, not generate incorrect results. But, for this non-chain dependency detection code, it's possible that you could have a load attached to a higher-up chain node than any which were visited. If that load aliases your store, but the only dependency is through the value operand of a non-aliasing store, it would've been missed by this code, and potentially reordered. With the dependence added, this check can be removed and Alias Analysis can be much more aggressive. This fixes code quality regression in the Consecutive Store Merge cleanup (D14834). Test Change: ppc64-align-long-double.ll now may see multiple serializations of its stores Differential Revision: http://reviews.llvm.org/D18062 llvm-svn: 265836	2016-04-08 19:44:40 +00:00
Kevin B. Smith	e0a6fc3bcc	[X86] Fix PR23155 by turning on X86FixupBWInsts by default. Differential Revision: http://reviews.llvm.org/D18866 llvm-svn: 265830	2016-04-08 18:58:29 +00:00
Sanjoy Das	87b9e1b727	Propagate Undef in llvm.cos Intrinsic Summary: The llvm cos intrinsic currently does not propagate undef's. This change transforms cos(undef) to null value or 0. There are 2 test cases added as well. Patch by Anna Thomas! Reviewers: sanjoy Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18863 llvm-svn: 265825	2016-04-08 18:21:11 +00:00
Colin LeMahieu	efe3732883	Revert r265817 lld tests need to be addressed. llvm-svn: 265822	2016-04-08 18:15:37 +00:00
Colin LeMahieu	4a1975ba8e	[llvm-objdump] Printing hex instead of dec by default Differential Revision: http://reviews.llvm.org/D18770 llvm-svn: 265817	2016-04-08 17:55:03 +00:00
Ulrich Weigand	fa2dffbc1a	[SystemZ] Support conditional sibling calls via BRCL This adds a conditional variant of CallJG instruction, CallBRCL. It can be used for conditional sibling calls. Unfortunately, due to IfCvt limitations, it only really works well for functions without arguments. Author: koriakin Differential Revision: http://reviews.llvm.org/D18864 llvm-svn: 265814	2016-04-08 17:22:19 +00:00
Quentin Colombet	bd19c8a39e	[AArch64] Add a test case for the default mapping of RegBankSelect. llvm-svn: 265811	2016-04-08 17:11:51 +00:00
David Majnemer	56737722e4	[InstCombine] Fix miscompile in FoldSPFofSPF We had a select of a cast of a select but attempted to replace the outer select with the inner select dispite their incompatible types. Patch by Anton Korobeynikov! This fixes PR27236. llvm-svn: 265805	2016-04-08 16:51:49 +00:00
Quentin Colombet	876ddf8107	[MIR] Teach the parser how to deal with register banks. llvm-svn: 265802	2016-04-08 16:40:43 +00:00
David Majnemer	60c6abc3cc	[LoopVectorize] Register cloned assumptions InstCombine cannot effectively remove redundant assumptions without them registered in the assumption cache. The vectorizer can create identical assumptions but doesn't register them with the cache, resulting in slower compile times because InstCombine tries to reason about a lot more assumptions. Fix this by registering the cloned assumptions. llvm-svn: 265800	2016-04-08 16:37:10 +00:00
Sam Parker	2d5126cdf5	[ARM] Enable SMLAW[B\|T] and SMLUW[B\|T] instruction selection Added ISelDAGToDAG functions to enable selection of the smlawb, smlawt, smulwb and smulwt instructions for the ARM backend. Also updated the smul CodeGen test and removed the smulw one. Differential Revision: http://reviews.llvm.org/D18892 llvm-svn: 265793	2016-04-08 16:02:53 +00:00
Hans Wennborg	5a7723c7a2	Revert r265547 "Recommit r265309 after fixed an invalid memory reference bug happened" It caused PR27275: "ARM: Bad machine code: Using an undefined physical register" Also reverting the following commits that were landed on top: r265610 "Fix the compare-clang diff error introduced by r265547." r265639 "Fix the sanitizer bootstrap error in r265547." r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]" llvm-svn: 265790	2016-04-08 15:17:43 +00:00
Simon Pilgrim	872dd6c3fe	[X86][SSE] Added 32-bit tests for vector lzcnt/tzcnt tests v2i64 tests are particularly bad on 32-bit targets. llvm-svn: 265789	2016-04-08 15:01:31 +00:00
Silviu Baranga	6f444dfd55	Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV This re-commits r265535 which was reverted in r265541 because it broke the windows bots. The problem was that we had a PointerIntPair which took a pointer to a struct allocated with new. The problem was that new doesn't provide sufficient alignment guarantees. This pattern was already present before r265535 and it just happened to work. To fix this, we now separate the PointerToIntPair from the ExitNotTakenInfo struct into a pointer and a bool. Original commit message: Summary: When the backedge taken codition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265786	2016-04-08 14:29:09 +00:00
Chuang-Yu Cheng	98c1894755	CXX_FAST_TLS calling convention: performance improvement for PPC64 This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allcoator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781	2016-04-08 12:04:32 +00:00
Jeroen Ketema	ad659c3400	[llvm-c] Expose LLVMContextGetDiagnostic{Handler,Context} Differential Revision: http://reviews.llvm.org/D18820 llvm-svn: 265773	2016-04-08 09:19:02 +00:00
Zlatko Buljan	53a037f5cc	[mips][microMIPS] Add CodeGen support for ADD, ADDIU, ADDU and DADD* instructions Differential Revision: http://reviews.llvm.org/D16454 llvm-svn: 265772	2016-04-08 07:27:26 +00:00
Dmitry Polukhin	24c4f28ac9	[IFUNC] Fix ifunc-asm.ll test It seems that llc cannot be called used in assembler tests so test that checks asm for particular target needs to be moved to codegen. llvm-svn: 265770	2016-04-08 06:45:19 +00:00
Valery Pykhtin	f5cf6fba3f	[AMDGPU] Add some VI disassembler tests missing from previous autogeneration due to different filecheck prefix. NFC. llvm-svn: 265769	2016-04-08 05:42:20 +00:00
Duncan P. N. Exon Smith	4ec55f8ab6	Reapply "ValueMapper: Treat LocalAsMetadata more like function-local Values" This reverts commit r265765, reapplying r265759 after changing a call from LocalAsMetadata::get to ValueAsMetadata::get (and adding a unit test). When a local value is mapped to a constant (like "i32 %a" => "i32 7"), the new debug intrinsic operand may no longer be pointing at a local. http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/19020/ The previous coommit message follows: -- This is a partial re-commit -- maybe more of a re-implementation -- of r265631 (reverted in r265637). This makes RF_IgnoreMissingLocals behave (almost) consistently between the Value and the Metadata hierarchy. In particular: - MapValue returns nullptr or "metadata !{}" for missing locals in MetadataAsValue/LocalAsMetadata bridging paris, depending on the RF_IgnoreMissingLocals flag. - MapValue doesn't memoize LocalAsMetadata-related results. - MapMetadata no longer deals with LocalAsMetadata or RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but I realized during testing it would make the patch simpler with no loss of generality.) r265631 went too far, making both functions universally ignore RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt. Reassociate (and possibly other passes) don't currently maintain dominates-use invariants for metadata operands, resulting in IR like this: define void @foo(i32 %arg) { call void @llvm.some.intrinsic(metadata i32 %x) %x = add i32 1, i32 %arg } If the inliner chooses to inline @foo into another function, then RemapInstruction will call `MapValue(metadata i32 %x)` and assert that the return is not nullptr. I've filed PR27273 to add a Verifier check and fix the underlying problem in the optimization passes. As a workaround, return `!{}` instead of nullptr for unmapped LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match the behaviour of r265631. Original commit message: ValueMapper: Make LocalAsMetadata match function-local Values Start treating LocalAsMetadata similarly to function-local members of the Value hierarchy in MapValue and MapMetadata. - Don't memoize them. - Return nullptr if they are missing. This also cleans up ConstantAsMetadata to stop listening to the RF_IgnoreMissingLocals flag. llvm-svn: 265768	2016-04-08 03:13:22 +00:00
Duncan P. N. Exon Smith	805873148a	Revert "ValueMapper: Treat LocalAsMetadata more like function-local Values" This reverts commit r265759, since even this limited version breaks some bots: http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/3311 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/17696 This also reverts r265761 "ValueMapper: Unduplicate RF_NoModuleLevelChanges check, NFC", since I had trouble separating it from r265759. llvm-svn: 265765	2016-04-08 00:56:21 +00:00
Sanjoy Das	5ce3272833	Don't IPO over functions that can be de-refined Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 llvm-svn: 265762	2016-04-08 00:48:30 +00:00
Adrian Prantl	3e9c88753b	DwarfDebug: Support floating point constants in location lists. This patch closes a gap in the DWARF backend that caused LLVM to drop debug info for floating point variables that were constant for part of their scope. Floating point constants are emitted as one or more DW_OP_constu joined via DW_OP_piece. This fixes a regression caught by the LLDB testsuite that I introduced in r262247 when we stopped blindly expanding the range of singular DBG_VALUEs to span the entire scope and started to emit location lists with accurate ranges instead. Also deletes a now-impossible testcase (debug-loc-empty-entries). <rdar://problem/25448338> llvm-svn: 265760	2016-04-08 00:38:37 +00:00
Duncan P. N. Exon Smith	267185ec92	ValueMapper: Treat LocalAsMetadata more like function-local Values This is a partial re-commit -- maybe more of a re-implementation -- of r265631 (reverted in r265637). This makes RF_IgnoreMissingLocals behave (almost) consistently between the Value and the Metadata hierarchy. In particular: - MapValue returns nullptr or "metadata !{}" for missing locals in MetadataAsValue/LocalAsMetadata bridging paris, depending on the RF_IgnoreMissingLocals flag. - MapValue doesn't memoize LocalAsMetadata-related results. - MapMetadata no longer deals with LocalAsMetadata or RF_IgnoreMissingLocals at all. (This wasn't in r265631 at all, but I realized during testing it would make the patch simpler with no loss of generality.) r265631 went too far, making both functions universally ignore RF_IgnoreMissingLocals. This broke building (e.g.) compiler-rt. Reassociate (and possibly other passes) don't currently maintain dominates-use invariants for metadata operands, resulting in IR like this: define void @foo(i32 %arg) { call void @llvm.some.intrinsic(metadata i32 %x) %x = add i32 1, i32 %arg } If the inliner chooses to inline @foo into another function, then RemapInstruction will call `MapValue(metadata i32 %x)` and assert that the return is not nullptr. I've filed PR27273 to add a Verifier check and fix the underlying problem in the optimization passes. As a workaround, return `!{}` instead of nullptr for unmapped LocalAsMetadata when RF_IgnoreMissingLocals is unset. Otherwise, match the behaviour of r265631. Original commit message: ValueMapper: Make LocalAsMetadata match function-local Values Start treating LocalAsMetadata similarly to function-local members of the Value hierarchy in MapValue and MapMetadata. - Don't memoize them. - Return nullptr if they are missing. This also cleans up ConstantAsMetadata to stop listening to the RF_IgnoreMissingLocals flag. llvm-svn: 265759	2016-04-08 00:33:44 +00:00
Davide Italiano	08f8f21b91	[IR/Verifier] Fix (yet another) crash. We need to check that if we reference a retainedType from DICompileUnit we're actually referencing a DICompositeType. llvm-svn: 265752	2016-04-08 00:01:32 +00:00
Amaury Sechet	c53ad4f3b2	Do not select EhPad BB in MachineBlockPlacement when there is regular BB to schedule Summary: EHPad BB are not entered the classic way and therefor do not need to be placed after their predecessors. This patch make sure EHPad BB are not chosen amongst successors to form chains, and are selected as last resort when selecting the best candidate. EHPad are scheduled in reverse probability order in order to have them flow into each others naturally. Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17625 llvm-svn: 265726	2016-04-07 21:29:39 +00:00
Jan Vesely	43b7b5b846	AMDGPU/SI: Implement atomic load/store for i32 and i64 Standard load/store instructions with GLC bit set. Reviewers: tstellardAMD, arsenm Differential Revision: http://reviews.llvm.org/D18760 llvm-svn: 265709	2016-04-07 19:23:11 +00:00
Tom Stellard	9112758077	AMDGPU/SI: Add latency for export instructions Reviewers: arsenm, nhaehnle Subscribers: nhaehnle, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18599 llvm-svn: 265708	2016-04-07 18:30:05 +00:00
Simon Pilgrim	fba9352f31	[X86][SSE] Added bitmask pattern shuffle tests Based on OR(AND(MASK,V0),AND(~MASK,V1)) style patterns llvm-svn: 265697	2016-04-07 17:23:55 +00:00
Kevin B. Smith	3802c4af59	[X86]: Fix for PR27251. Differential Revision: http://reviews.llvm.org/D18850 llvm-svn: 265690	2016-04-07 16:15:34 +00:00
Ulrich Weigand	2eb027d21f	[SystemZ] Implement conditional returns Return is now considered a predicable instruction, and is converted to a newly-added CondReturn (which maps to BCR to %r14) instruction by the if conversion pass. Also, fused compare-and-branch transform knows about conditional returns, emitting the proper fused instructions for them. This transform triggers on a lot of tests, hence the huge diffstat. The changes are mostly jX to br %r14 -> bXr %r14. Author: koriakin Differential Revision: http://reviews.llvm.org/D17339 llvm-svn: 265689	2016-04-07 16:11:44 +00:00
Ulrich Weigand	6e6966460a	[GVN] Fix handling of sub-byte types in big-endian mode When GVN wants to re-interpret an already available value in a smaller type, it needs to right-shift the value on big-endian systems to ensure the correct bytes are accessed. The shift value is the difference of the sizes of the two types. This is correct as long as both types occupy multiples of full bytes. However, when one of them is a sub-byte type like i1, this no longer holds true: we still need to shift, but only to access the correct byte. Accessing bits within the byte requires no shift in either endianness; e.g. an i1 resides in the least-significant bit of its containing byte on both big- and little-endian systems. Therefore, the appropriate shift value to be used is the difference of the storage sizes of the two types. This is already handled correctly in one place where such a shift takes place (GetStoreValueForLoad), but is incorrect in two other places: GetLoadValueForLoad and CoerceAvailableValueToLoadType. This patch changes both places to use the storage size as well. Differential Revision: http://reviews.llvm.org/D18662 llvm-svn: 265684	2016-04-07 15:45:02 +00:00
Ehsan Amiri	4701a91e59	[PPC] Enable transformations in PPCPassConfig::addIRPasses at O2 http://reviews.llvm.org/D18562 A large number of testcases has been modified so they pass after this test. One testcase is deleted, because I realized even after undoing the original change that was committed with this testcase, the testcase still passes. So I removed it. The change to one other testcase (test/CodeGen/PowerPC/pr25802.ll) is an arbitrary change to keep it passing. Given the original intention of the testcase, and the fact that fixing it will require some time to change the testcase, we concluded that this quick change will be enough. llvm-svn: 265683	2016-04-07 15:30:55 +00:00
Valery Pykhtin	e23b6deb01	[AMDGPU] fix readlane/readfirstlane src vgpr operand type. For VGPR_32 operand disassembler expects a VGPR register encoded as 0..255 (enum8 src operand). readfirstlane/readline actually has enum9 operand and this change fixes VGPR_32 to VS_32 (enum9 encoding). Differential Revision: http://reviews.llvm.org/D18696 llvm-svn: 265670	2016-04-07 13:41:51 +00:00
Dmitry Polukhin	af16b958c0	Fix test/Assembler/ifunc-asm.ll test on hexagon-elf bots Temporary disable llc test, it seems that such test should be in some other directory. llvm-svn: 265669	2016-04-07 13:18:43 +00:00
Dmitry Polukhin	a1feff7024	[GCC] Attribute ifunc support in llvm This patch add support for GCC attribute((ifunc("resolver"))) for targets that use ELF as object file format. In general ifunc is a special kind of function alias with type @gnu_indirect_function. Patch for Clang http://reviews.llvm.org/D15524 Differential Revision: http://reviews.llvm.org/D15525 llvm-svn: 265667	2016-04-07 12:32:19 +00:00
Simon Pilgrim	d54bae6525	[X86][SSE] Add support for VZEXT constant folding llvm-svn: 265646	2016-04-07 07:52:45 +00:00
Valery Pykhtin	de04805e9f	[AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support. Reenable reverted r265550 with endianness issue fixed. Variables of endian-aware types such as ulittle32_t should be explicitly casted to their natural equivalent types before passing it as vararg to printf like functions (format in my case). Added lit config file depending on AMDGPU target as the testcase uses assembler. Differential revision: http://reviews.llvm.org/D16998 llvm-svn: 265645	2016-04-07 07:24:01 +00:00
Ahmed Bougacha	1cf67fb9cb	[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC. Re-apply r265450 which caused PR27245 and was reverted in r265559 because of a wrong generalization: the fetch_and_add->add_and_fetch combine only works in specific, but pretty common, cases: (icmp slt x, 0) -> (icmp sle (add x, 1), 0) (icmp sge x, 0) -> (icmp sgt (add x, 1), 0) (icmp sle x, 0) -> (icmp slt (sub x, 1), 0) (icmp sgt x, 0) -> (icmp sge (sub x, 1), 0) Original Message: We only generate LOCKed versions of add/sub when the result is unused. It often happens that the result is used, but only by a comparison. We can optimize those out by reusing EFLAGS, which lets us use the proper instructions, instead of having to fallback to LXADD. Instead of doing this as an MI peephole (as we do for the other non-LOCKed (really, non-MR) forms), do it in ISel. It becomes quite tricky later. This also makes it eventually possible to stop expanding and/or/xor if the only user is an icmp (also see D18141). This uses the LOCK ISD opcodes added by r262244. Differential Revision: http://reviews.llvm.org/D17633 llvm-svn: 265636	2016-04-07 02:07:10 +00:00
Ahmed Bougacha	70bde5445b	[X86] Refresh and tweak EFLAGS reuse tests. NFC. The non-1 and EQ/NE tests were misguided. llvm-svn: 265635	2016-04-07 02:06:53 +00:00
Hans Wennborg	ab16be799c	Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)" Third time's the charm? The previous attempt (r265345) caused ASan test failures on X86, as broken CFI caused stack traces to not work. This version of the patch makes sure not to merge with stack adjustments that have CFI, and to not add merged instructions' offests to the CFI about to be generated. This is already covered by the lit tests; I just got the expectations wrong previously. llvm-svn: 265623	2016-04-07 00:05:49 +00:00
Mike Aizatsky	70ea45306a	[sancov] enabling coverage edge pruning by default. Differential Revision: http://reviews.llvm.org/D18844 llvm-svn: 265615	2016-04-06 23:24:37 +00:00
Kevin Enderby	3fcdf6ae2a	Thread Expected<...> up from createMachOObjectFile() to allow llvm-objdump to produce a real error message Produce the first specific error message for a malformed Mach-O file describing the problem instead of the generic message for object_error::parse_failed of "Invalid data was encountered while parsing the file”. Many more good error messages will follow after this first one. This is built on Lang Hames’ great work of adding the ’Error' class for structured error handling and threading Error through MachOObjectFile construction. And making createMachOObjectFile return Expected<...> . So to to get the error to the llvm-obdump tool, I changed the stack of these methods to also return Expected<...> : object::ObjectFile::createObjectFile() object::SymbolicFile::createSymbolicFile() object::createBinary() Then finally in ParseInputMachO() in MachODump.cpp the error can be reported and the specific error message can be printed in llvm-objdump and can be seen in the existing test case for the existing malformed binary but with the updated error message. Converting these interfaces to Expected<> from ErrorOr<> does involve touching a number of places. To contain the changes for now use of errorToErrorCode() and errorOrToExpected() are used where the callers are yet to be converted. Also there some were bugs in the existing code that did not deal with the old ErrorOr<> return values. So now with Expected<> since they must be checked and the error handled, I added a TODO and a comment: “// TODO: Actually report errors helpfully” and a call something like consumeError(ObjOrErr.takeError()) so the buggy code will not crash since needed to deal with the Error. Note there is one fix also needed to lld/COFF/InputFiles.cpp that goes along with this that I will commit right after this. So expect lld not to built after this commit and before the next one. llvm-svn: 265606	2016-04-06 22:14:09 +00:00
Michael Zolotukhin	97567e141e	[LoopUnroll] Fix the way we update DT after complete unrolling. Updating dominators for exit-blocks of the unrolled loops is not enough, as shown in PR27157. The proper way is to update dominators for all dominance-children of original loop blocks. llvm-svn: 265605	2016-04-06 21:47:12 +00:00
Ehsan Amiri	322eca3849	[PPC] Use VSX/FP Facility integer load when an integer load's only users are conversion to FP http://reviews.llvm.org/D18405 When the integer value loaded is never used directly as integer we should use VSX or Floating Point Facility integer loads and avoid extra direct move llvm-svn: 265593	2016-04-06 20:12:29 +00:00
Sanjay Patel	6cc488004d	regenerate checks llvm-svn: 265591	2016-04-06 19:58:06 +00:00
Nicolai Haehnle	df3a20cd80	AMDGPU: Add a shader calling convention This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589	2016-04-06 19:40:20 +00:00
Davide Italiano	5f1c87bf07	[IRVerifier] Don't crash on invalid DIFile inside DISubprogram. r265515, this time with the correct fix. file inside DISubprogram is not mandatory. llvm-svn: 265586	2016-04-06 18:46:39 +00:00
Evgeniy Stepanov	268826a287	[gold] Save bitcode for module partitions (save-temps + split codegen). llvm-svn: 265583	2016-04-06 18:32:13 +00:00
Hans Wennborg	6849f8f15f	Revert r265450 "[X86] Reuse EFLAGS and form LOCKed ops when only user is SETCC." It caused ASan 32-bit tests to hang (PR27245). llvm-svn: 265559	2016-04-06 16:44:38 +00:00
Fiona Glaser	16332ba861	LoopUnroll: only allow non-modulo Partial unrolling when Runtime=true Patch by Evgeny Stupachenko <evstupac@gmail.com>. llvm-svn: 265558	2016-04-06 16:43:45 +00:00
Valery Pykhtin	1dcb91b4de	Revert "[AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support." This reverts commit r265550. There're problems with endianness on dumping instruction bytes. Need to find out how to use support::ulittle32_t type properly. llvm-svn: 265554	2016-04-06 16:30:21 +00:00
Hans Wennborg	a7e396b5ef	Revert "Re-commit r265039 "[X86] Merge adjacent stack adjustments in eliminateCallFramePseudoInstr (PR27140)"" It seems to be causing ASan tests to crash, probably due to miscompiling the run-time somehow. llvm-svn: 265551	2016-04-06 16:10:20 +00:00
Valery Pykhtin	bd90c60afb	[AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support. Differential revision: http://reviews.llvm.org/D16998 llvm-svn: 265550	2016-04-06 15:55:10 +00:00
Wei Mi	18293bef4e	Recommit r265309 after fixed an invalid memory reference bug happened when DenseMap growed and moved memory. I verified it fixed the bootstrap problem on x86_64-linux-gnu but I cannot verify whether it fixes the bootstrap error on clang-ppc64be-linux. I will watch the build-bot result closely. Replace analyzeSiblingValues with new algorithm to fix its compile time issue. The patch is to solve PR17409 and its duplicates. analyzeSiblingValues is a N x N complexity algorithm where N is the number of siblings generated by reg splitting. Although it causes siginificant compile time issue when N is large, it is also important for performance since it removes redundent spills and enables rematerialization. To solve the compile time issue, the patch removes analyzeSiblingValues and replaces it with lower cost alternatives containing two parts. The first part creates a new spill hoisting method in postOptimization of register allocation. It does spill hoisting at once after all the spills are generated instead of inside every instance of selectOrSplit. The second part queries the define expr of the original register for rematerializaiton and keep it always available during register allocation even if it is already dead. It deletes those dead instructions only in postOptimization. With the two parts in the patch, it can remove analyzeSiblingValues without sacrificing performance. Differential Revision: http://reviews.llvm.org/D15302 llvm-svn: 265547	2016-04-06 15:41:07 +00:00
Silviu Baranga	a393baf1fd	Revert r265535 until we know how we can fix the bots llvm-svn: 265541	2016-04-06 14:06:32 +00:00
Sam Kolton	ff90c60a78	[AMDGPU] AsmParser: disable DPP for unsupported instructions. New dpp tests. Fix v_nop_dpp. Summary: 1. Disable DPP encoding for instructions that do not support it: - VOP1: - v_readfirstlane_b32 - v_clrexcp - v_movreld_b32 - v_movrels_b32 - v_movrelsd_b32 - VOP2: - v_madmk_f16/32 - v_madak_f16/32 - VOPC, VINTRP, VOP3 2. Fix DPP for v_nop 3. New DPP tests for VOP1 and VOP2 instructions Reviewers: nhaustov, tstellarAMD, vpykhtin Subscribers: tstellarAMD, arsenm Differential Revision: http://reviews.llvm.org/D18552 llvm-svn: 265538	2016-04-06 13:29:59 +00:00
Silviu Baranga	72b4a4a330	[SCEV] Introduce a guarded backedge taken count and use it in LAA and LV Summary: When the backedge taken codition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 llvm-svn: 265535	2016-04-06 13:18:26 +00:00
Chuang-Yu Cheng	6e1408a891	[ppc64] Temporary disable sibling call optimization on ppc64 due to breaking test case r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap test. http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708 llvm-svn: 265528	2016-04-06 10:48:36 +00:00
David Majnemer	12fd50410d	[SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has undefined behavior for negative numbers other than -0.0 (which allows for better optimization, because there is no need to worry about errno being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt." This means that it's unsafe to replace sqrt with llvm.sqrt unless the call is annotated with nnan. Thanks to Hal Finkel for pointing this out! llvm-svn: 265521	2016-04-06 07:04:53 +00:00
Davide Italiano	22680e1c5c	Revert "[IRVerifier] Don't crash on invalid DIFile inside DISubprogram." This reverts commit r265515 as lots of tests need to be fixed before this actually can go in. llvm-svn: 265517	2016-04-06 04:34:38 +00:00
Davide Italiano	2deceb0339	[IRVerifier] Don't crash on invalid DIFile inside DISubprogram. llvm-svn: 265515	2016-04-06 03:57:47 +00:00
Davide Italiano	8dc23a3cb5	[IRVerifier] Avoid crashing on an invalid compile unit. llvm-svn: 265514	2016-04-06 03:07:58 +00:00
Duncan P. N. Exon Smith	6f2e37429a	ValueMapper: Fix delayed blockaddress handling after r265273 r265273 added Mapper::mapBlockAddress, which delays mapping a blockaddress value until the function has a body. The condition was backwards, and should be checking Function::empty instead of GlobalValue::isDeclaration. llvm-svn: 265508	2016-04-06 02:25:12 +00:00
Duncan P. N. Exon Smith	29883866a4	AsmParser: Don't crash on unresolved !tbaa Instead of crashing, give a nice error. As a drive-by, fix the location associated with the errors for unresolved metadata (the location was off by one token). llvm-svn: 265507	2016-04-06 02:06:40 +00:00
Chuang-Yu Cheng	2e5973ef74	[ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abi This patch enable sibling call optimization on ppc64 ELFv1/ELFv2 abi, and add a couple of test cases. This patch also passed llvm/clang bootstrap test, and spec2006 build/run/result validation. Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617 Great thanks to Tom's (tjablin) help, he contributed a lot to this patch. Thanks Hal and Kit's invaluable opinions! Reviewers: hfinkel kbarton http://reviews.llvm.org/D16315 llvm-svn: 265506	2016-04-06 02:04:38 +00:00
Chuang-Yu Cheng	024a623c55	[Power9] Implement add-pc, multiply-add, modulo, extend-sign-shift, random number, set bool, and dfp test significance This patch implement the following instructions: - addpcis subpcis - maddhd maddhdu maddld - modsw moduw modsd modud - darn - extswsli extswsli. - setb - dtstsfi dtstsfiq Total 15 instructions Reviewers: nemanjai hfinkel tjablin amehsan kbarton http://reviews.llvm.org/D17885 llvm-svn: 265505	2016-04-06 01:47:02 +00:00
Chuang-Yu Cheng	eaf4b3d75c	[Power9] Implement copy-paste, msgsync, slb, and stop instructions This patch implements the following BookII and Book III instructions: - copy copy_first cp_abort paste paste. paste_last - msgsync - slbieg slbsync - stop Total 10 instructions Reviewers: nemanjai hfinkel tjablin amehsan kbarton llvm-svn: 265504	2016-04-06 01:46:45 +00:00
Sanjoy Das	65a60670e8	Lower @llvm.experimental.deoptimize as a noreturn call While preserving the return value for @llvm.experimental.deoptimize at the IR level is useful during mid-level optimization, doing so at the machine instruction level requires generating some extra code and a return that is non-ideal. This change has LLVM lower ``` %val = call @llvm.experimental.deoptimize ret %val ``` to effectively ``` call @__llvm_deoptimize() unreachable ``` instead. llvm-svn: 265502	2016-04-06 01:33:49 +00:00
David Majnemer	25d03dbcde	[SLPVectorizer] Vectorize libcalls of sqrt We didn't realize that we could transform the libcall into a vectorized intrinsic. llvm-svn: 265493	2016-04-06 00:14:59 +00:00
Davide Italiano	ea04026c13	[DebugInfo] Fix tests so that each subprogram belongs to a CU. llvm-svn: 265490	2016-04-05 23:37:08 +00:00
Sanjoy Das	49e974b33b	[RS4GC] Better codegen for deoptimize calls Don't emit a gc.result for a statepoint lowered from @llvm.experimental.deoptimize since the call into __llvm_deoptimize is effectively noreturn. Instead follow the corresponding gc.statepoint with an "unreachable". llvm-svn: 265485	2016-04-05 23:18:35 +00:00
Manman Ren	802cd6f9d7	Swift Calling Convention: swiftcc for ARM. Differential Revision: http://reviews.llvm.org/D18769 llvm-svn: 265482	2016-04-05 22:44:44 +00:00
Evgeniy Stepanov	dde29e2799	Faster stack-protector for Android/AArch64. Bionic has a defined thread-local location for the stack protector cookie. Emit a direct load instead of going through __stack_chk_guard. llvm-svn: 265481	2016-04-05 22:41:50 +00:00

... 2 3 4 5 6 ...

35778 Commits