llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	f2f5626b84	[X86][AVX] Fixed triple/arch clash in test case We were specifying a x64 triple and then overriding with a x86 arch. llvm-svn: 262398	2016-03-01 21:33:08 +00:00
Matt Arsenault	b36d462fac	DAGCombiner: Turn truncate of a bitcasted vector to an extract On AMDGPU where operations i64 operations are often bitcasted to v2i32 and back, this pattern shows up regularly where it breaks some expected combines on i64, such as load width reducing. This fixes some test failures in a future commit when i64 loads are changed to promote. llvm-svn: 262397	2016-03-01 21:31:53 +00:00
Jacques Pienaar	ea9f25a740	[lanai] Add ELF enum value and relocations. Add ELF enum value and relocations for Lanai backed. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17008 llvm-svn: 262394	2016-03-01 21:21:42 +00:00
Kit Barton	e725669483	[Power9] Implement new vector compare, extract, insert instructions This change implements the following vector operations: - Vector Compare Not Equal - vcmpneb(.) vcmpneh(.) vcmpnew(.) - vcmpnezb(.) vcmpnezh(.) vcmpnezw(.) - Vector Extract Unsigned - vextractub vextractuh vextractuw vextractd - vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx - Vector Insert - vinsertb vinserth vinsertw vinsertd 26 instructions. Phabricator: http://reviews.llvm.org/D15916 llvm-svn: 262392	2016-03-01 20:51:57 +00:00
Geoff Berry	a0df341082	Revert "[AArch64] Fix isLegalAddImmediate() to return true for valid negative values." Revert r262248 in an attempt to fix the clang-native-aarch64-full bot and to investigate a performance regression in SingleSource/Benchmarks/CoyoteBench/huffbench llvm-svn: 262388	2016-03-01 20:28:52 +00:00
Vasileios Kalintiris	36901dd1c3	Revert "[mips] Promote the result of SETCC nodes to GPR width." This reverts commit r262316. It seems that my change breaks an out-of-tree chromium buildbot, so I'm reverting this in order to investigate the situation further. llvm-svn: 262387	2016-03-01 20:25:43 +00:00
Owen Anderson	7ea02fc787	Fix an issue where fast math flags were dropped during scalarization. Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. llvm-svn: 262376	2016-03-01 19:35:52 +00:00
Justin Lebar	b5ca00a58d	[NVPTX] Use different, convergent MIs for convergent calls. Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level. Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all. Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls. Reviewers: jholewinski Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17423 llvm-svn: 262373	2016-03-01 19:24:03 +00:00
David Majnemer	791b88b6da	[X86] Elide references to _chkstk for dynamic allocas The _chkstk function is called by the compiler to probe the stack in an order consistent with Windows' expectations. However, it is possible to elide the call to _chkstk and manually adjust the stack pointer if we can prove that the allocation is fixed size and smaller than the probe size. This shrinks chrome.dll, chrome_child.dll and chrome.exe by a cummulative ~133 KB. Differential Revision: http://reviews.llvm.org/D17679 llvm-svn: 262370	2016-03-01 19:20:23 +00:00
David Majnemer	45ebda4278	[Verifier] Don't abort on invalid cleanuprets Code in visitEHPadPredecessors assume a little too much about the validity of a cleanupret with an invalid cleanuppad operand. llvm-svn: 262364	2016-03-01 18:59:50 +00:00
Simon Atanasyan	f69c7e5382	[DebugInfo] Dump CIE augmentation data as a list of hex bytes CIE augmentation data might contain non-printable characters. The patch prints the data as a list of hex bytes. Differential Revision: http://reviews.llvm.org/D17759 llvm-svn: 262361	2016-03-01 18:38:05 +00:00
Matt Arsenault	03dac8d8e4	DAGCombiner: Turn extract of bitcasted integer into truncate This reduces the number of bitcast nodes and generally cleans up the DAG when bitcasting between integers and vectors everywhere. llvm-svn: 262358	2016-03-01 18:01:37 +00:00
Changpeng Fang	24f035af32	AMDGPU/SI: Implement DS_PERMUTE/DS_BPERMUTE Instruction Definitions and Intrinsics Summary: This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics, which are new since VI. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17614 llvm-svn: 262356	2016-03-01 17:51:23 +00:00
Michael Zuckerman	433b241570	[LLVM][AVX512] PSRL{DI\|QI} Change imm8 to int Differential Revision: http://reviews.llvm.org/D17713 llvm-svn: 262353	2016-03-01 17:46:32 +00:00
Hans Wennborg	e64cf9dddb	[X86] Check that attribute parameters match for tail calls (PR26590) In the code below on 32-bit targets, x would previously get forwarded to g() without sign-extension to 32 bits as required by the parameter attribute. void g(signed short); void f(unsigned short x) { g(x); } llvm-svn: 262352	2016-03-01 17:45:23 +00:00
Petar Jovanovic	6315f3f9b7	Revert "calculate builtin_object_size if argument is a removable pointer" Revert r262337 as "check-llvm ubsan" step failed on sanitizer-x86_64-linux-fast buildbot. llvm-svn: 262349	2016-03-01 16:50:08 +00:00
Petar Jovanovic	8aef99aa86	calculate builtin_object_size if argument is a removable pointer This patch fixes calculating correct value for builtin_object_size function when pointer is used only in builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 llvm-svn: 262337	2016-03-01 14:39:55 +00:00
Petr Pavlu	7ad9ec9fcf	[LTO] Fix error reporting from lto_module_create_in_local_context() Function lto_module_create_in_local_context() would previously rely on the default LLVMContext being created for it by LTOModule::makeLTOModule(). This context exits the program on error and is not arranged to update sLastStringError in tools/lto/lto.cpp. Function lto_module_create_in_local_context() now creates an LLVMContext by itself, sets it up correctly to its needs and then passes it to LTOModule::createInLocalContext() which takes ownership of the context and keeps it present for the lifetime of the returned LTOModule. Function LTOModule::makeLTOModule() is modified to take a reference to LLVMContext (instead of a pointer) and no longer creates a default context when nullptr is passed to it. Method LTOModule::createInContext() that takes a pointer to LLVMContext is removed because it allows to pass a nullptr to it. Instead LTOModule::createFromBuffer() (that takes a reference to LLVMContext) should be used. Differential Revision: http://reviews.llvm.org/D17715 llvm-svn: 262330	2016-03-01 13:13:49 +00:00
Michael Zuckerman	7878888690	[AVX512][PSRAQ][PSRAD] Change imm8 to int. Differential Revision: http://reviews.llvm.org/D17692 llvm-svn: 262320	2016-03-01 11:36:23 +00:00
Amjad Aboud	719325fe11	Disallow generating vzeroupper before return instruction (iret) in interrupt handler function. This resolves https://llvm.org/bugs/show_bug.cgi?id=26412 Differential Revision: http://reviews.llvm.org/D17542 llvm-svn: 262319	2016-03-01 11:32:03 +00:00
Vasileios Kalintiris	3a8f7f9e31	[mips] Promote the result of SETCC nodes to GPR width. Summary: This patch modifies the existing comparison, branch, conditional-move and select patterns, and adds new ones where needed. Also, the updated SLT{u,i,iu} set of instructions generate a GPR width result. The majority of the code changes in the Mips back-end fix the wrong assumption that the result of SETCC nodes always produce an i32 value. The changes in the common code path account for the fact that in 64-bit MIPS targets, i1 is promoted to i32 instead of i64. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D10970 llvm-svn: 262316	2016-03-01 10:08:01 +00:00
Nikolay Haustov	ea8febde04	[TableGen] AsmMatcher: Skip optional operands in the midle of instruction if it is not present Previosy, if actual instruction have one of optional operands then other optional operands listed before this also should be presented. For example instruction v_fract_f32 v0, v1, mul:2 have one optional operand - OMod and do not have optional operand clamp. Previously this was not allowed because clamp is listed before omod in AsmString: string AsmString = "v_fract_f32$vdst, $src0_modifiers$clamp$omod"; Making this work required some hacks (both OMod and Clamp match classes have same PredicateMethod). Now, if MatchInstructionImpl meets formal optional operand that is not presented in actual instruction it skips this formal operand and tries to match current actual operand with next formal. Patch by: Sam Kolton Review: http://reviews.llvm.org/D17568 [AMDGPU] Assembler: Check immediate types for several optional operands in predicate methods With this change you should place optional operands in order specified by asm string: clamp -> omod offset -> glc -> slc -> tfe Fixes for several tests. Depends on D17568 Patch by: Sam Kolton Review: http://reviews.llvm.org/D17644 llvm-svn: 262314	2016-03-01 08:34:43 +00:00
Nikolay Haustov	95b4fcd377	AsmParser: Fix nested .irp/.irpc Count .irp/.irpc in parseMacroLikeBody similar to .rept Update tests. Review: http://reviews.llvm.org/D17707 llvm-svn: 262313	2016-03-01 08:18:28 +00:00
Matt Arsenault	59b8b77405	AMDGPU: Set HasExtractBitInsn This currently does not have the control over the bitwidth, and there are missing optimizations to reduce the integer to 32-bit if it can be. But in most situations we do want the sinking to occur. llvm-svn: 262296	2016-03-01 04:58:17 +00:00
David Majnemer	cb305dea1c	[WinEH] Allocate the registration node before the catch objects The CatchObjOffset is relative to the end of the EH registration node for 32-bit x86 WinEH targets. A special sentinel value, 0, is used to indicate that no catch object should be initialized. This means that a catch object allocated immediately before the registration node would be assigned a CatchObjOffset of 0, leading the runtime to believe that a catch object should not be initialized. To handle this, allocate the registration node prior to any other frame object. This will ensure that catch objects will not be allocated before the registration node. This fixes PR26757. Differential Revision: http://reviews.llvm.org/D17689 llvm-svn: 262294	2016-03-01 04:30:16 +00:00
David Majnemer	f08579f5a8	[Verifier] Diagnose when unwinding out of cycles of blocks Generally speaking, this can only happen with unreachable code. However, neglecting to check for this condition would lead us to loop forever. llvm-svn: 262284	2016-03-01 01:19:05 +00:00
Adam Nemet	948775196d	[LLE] Add testcase for the fix in r262267 llvm-svn: 262280	2016-03-01 00:50:14 +00:00
Sanjay Patel	6f2c01f712	[x86, InstCombine] transform more x86 masked loads to LLVM intrinsics Continuation of: http://reviews.llvm.org/rL262269 llvm-svn: 262273	2016-02-29 23:59:00 +00:00
Sanjay Patel	98a71505f5	[x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics The intended effect of this patch in conjunction with: http://reviews.llvm.org/rL259392 http://reviews.llvm.org/rL260145 is that customers using the AVX intrinsics in C will benefit from combines when the load mask is constant: __m128 mload_zeros(float f) { return _mm_maskload_ps(f, _mm_set1_epi32(0)); } __m128 mload_fakeones(float f) { return _mm_maskload_ps(f, _mm_set1_epi32(1)); } __m128 mload_ones(float f) { return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000)); } __m128 mload_oneset(float f) { return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0)); } ...so none of the above will actually generate a masked load for optimized code. This is the masked load counterpart to: http://reviews.llvm.org/rL262064 llvm-svn: 262269	2016-02-29 23:16:48 +00:00
David Majnemer	fe2f7f367a	[Verifier] Handle more funclet edge cases This change makes the verifier a little more paranoid. It was possible to trick the verifier into crashing or infinite looping. llvm-svn: 262268	2016-02-29 22:56:36 +00:00
Adrian Prantl	a349714bf9	Document an anomaly in this testcase. llvm-svn: 262264	2016-02-29 22:28:16 +00:00
Paul Robinson	a908e7bd4d	Reapply r262092: [FileCheck] Abort if -NOT is combined with another suffix. Combinations of suffixes that look useful are actually ignored; complaining about them will avoid mistakes. Differential Revision: http://reviews.llvm.org/D17587 llvm-svn: 262263	2016-02-29 22:13:03 +00:00
Sanjoy Das	999dc75c12	[Verifier] Minor fix to error message; NFC llvm-svn: 262262	2016-02-29 22:04:25 +00:00
Colin LeMahieu	ab9eca4d9f	[Hexagon] As a size optimization, not lazy extending TPREL or DTPREL variants since they're usually in range. llvm-svn: 262258	2016-02-29 21:21:56 +00:00
Adrian Prantl	c0a85eca6c	Fixup MIPS testcase after r262247 and make it a little more robust. llvm-svn: 262249	2016-02-29 20:25:10 +00:00
Geoff Berry	f5ba61d18c	[AArch64] Fix isLegalAddImmediate() to return true for valid negative values. Reviewers: t.p.northover, jmolloy Subscribers: mcrosier, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D17463 llvm-svn: 262248	2016-02-29 19:53:22 +00:00
Adrian Prantl	fb2add2be1	Fix PR26585 by improving the promotion of DBG_VALUEs to DW_AT_locations. When a variable is described by a single DBG_VALUE instruction we can often use a more efficient inline DW_AT_location instead of using a location list. This commit makes the heuristic that decides when to apply this optimization stricter by also verifying that the DBG_VALUE is live at the entry of the function (instead of just checking that it is valid until the end of the function). <rdar://problem/24611008> llvm-svn: 262247	2016-02-29 19:49:46 +00:00
Steven Wu	f2fe0141ca	Rename embedded bitcode section in MachO Summary: Rename the section embeds bitcode from ".llvmbc,.llvmbc" to "__LLVM,__bitcode". The new name matches MachO section naming convention. Reviewers: rafael, pcc Subscribers: davide, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D17388 llvm-svn: 262245	2016-02-29 19:40:10 +00:00
David Majnemer	e60ee3b8ce	[WinEH] Make setjmp work correctly with EH 32-bit X86 EH on Windows utilizes a stack of registration nodes allocated and deallocated on entry/exit. A registration node contains a bunch of EH personality specific information like which try-state we are currently in. Because a setjmp target allows control flow from arbitrary program points, there is no way to ensure that the try-state we are in is correctly updated once we transfer control. MSVC compatible compilers, like MSVC and ICC, utilize runtime helpers to reinitialize the try-state when a longjmp occurs. This is implemented by adding additional arguments to _setjmp3: the desired try-state and a helper routine to update the try-state. Differential Revision: http://reviews.llvm.org/D17721 llvm-svn: 262241	2016-02-29 19:16:03 +00:00
Nemanja Ivanovic	1a5706ca1b	Fix for PR26180 Corresponds to Phabricator review: http://reviews.llvm.org/D16592 This fix includes both an update to how we handle the "generic" CPU on LE systems as well as Anton's fix for the Fast Isel issue. llvm-svn: 262233	2016-02-29 16:42:27 +00:00
Daniel Sanders	03a8d2f8ec	[mips] Range check uimm20 and fixed a bug this revealed. Summary: The bug was that dextu's operand 3 would print 0-31 instead of 32-63 when printing assembly. This came up when replacing MipsInstPrinter::printUnsignedImm() with a version that could handle arbitrary bit widths. MipsAsmPrinter::printUnsignedImm*() don't seem to be used so they have been removed. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15521 llvm-svn: 262231	2016-02-29 16:06:38 +00:00
Vasileios Kalintiris	29620aca3e	[mips] Do not use SLL for ANY_EXTEND nodes as the high bits are undefined. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15420 llvm-svn: 262230	2016-02-29 15:58:12 +00:00
Daniel Sanders	611eb82953	[mips] Make isel select the correct DEXT variant up front. Summary: Previously, it would always select DEXT and substitute any invalid matches for DEXTU/DEXTM during MipsMCCodeEmitter::encodeInstruction(). This works but causes problems when adding range checked immediates to IAS. Now isel selects the correct variant up front. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D16810 llvm-svn: 262229	2016-02-29 15:26:54 +00:00
Rafael Espindola	8d6fbc3a4e	IRObject: Mark extern_weak as weak. llvm-svn: 262222	2016-02-29 14:26:06 +00:00
Benjamin Kramer	6bb15021b3	[InstSimplify] Restore fsub 0.0, (fsub 0.0, X) ==> X optzn I accidentally removed this in r262212 but there was no test coverage to detect it. llvm-svn: 262215	2016-02-29 12:18:25 +00:00
Daniel Sanders	90f0d0b8e3	[mips] Make symbols an acceptable branch target when expanding compare-to-immediate-and-branch macros. Reviewers: vkalintiris Subscribers: llvm-commits, vkalintiris, dim, seanbruno, dsanders Differential Revision: http://reviews.llvm.org/D15369 llvm-svn: 262213	2016-02-29 11:24:49 +00:00
Benjamin Kramer	f5b2a47ac6	[InstSimplify] fsub 0.0, (fsub -0.0, X) ==> X is only safe if signed zeros are ignored. Only allow fsub -0.0, (fsub -0.0, X) ==> X without nsz. PR26746. llvm-svn: 262212	2016-02-29 11:12:23 +00:00
Chandler Carruth	8b5a7419b8	[PM] Wire up optimization levels and default pipeline construction APIs in the PassBuilder. These are really just stubs for now, but they give a nice API surface that Clang or other tools can start learning about and enabling for experimentation. I've also wired up parsing various synthetic module pass names to generate these set pipelines. This allows the pipelines to be combined with other passes and have their order controlled, with clear separation between the kind of canned pipeline, and the level of optimization to be used within that canned pipeline. The most interesting part of this patch is almost certainly the spec for the different optimization levels. I don't think we can ever have hard and fast rules that would make it easy to determine whether a particular optimization makes sense at a particular level -- it will always be in large part a judgement call. But hopefully this will outline the expected rationale that should be used, and the direction that the pipelines should be taken. Much of this was based on a long llvm-dev discussion I started years ago to try and crystalize the intent behind these pipelines, and now, at long long last I'm returning to the task of actually writing it down somewhere that we can cite and try to be consistent with. Differential Revision: http://reviews.llvm.org/D12826 llvm-svn: 262196	2016-02-28 22:16:03 +00:00
JF Bastien	3a0814ac1a	WebAssembly: fix test Operand order seems to have changed, the new one is nicer. llvm-svn: 262180	2016-02-28 15:44:54 +00:00
Michael Zuckerman	96836fc81c	[AVX512][PSLLW ][PSLLV] Change imm8 to int Differential Revision: http://reviews.llvm.org/D17684 llvm-svn: 262176	2016-02-28 07:32:10 +00:00
Xinliang David Li	985ff20a9c	[PGO] Remove redundant counter copies for avail_extern functions. Differential Revision: http://reviews.llvm.org/D17654 llvm-svn: 262157	2016-02-27 23:11:30 +00:00
Matt Arsenault	3a61985b2f	AMDGPU: More bits of frame index are known to be zero The maximum private allocation for the whole GPU is 4G, so the maximum possible index for a single workitem is the maximum size divided by the smallest granularity for a dispatch. This increases the number of known zero high bits, which enables more offset folding. The maximum private size per workitem with this is 128M but may be smaller still. llvm-svn: 262153	2016-02-27 20:26:57 +00:00
Matt Arsenault	982224cfb8	DAGCombiner: Don't unnecessarily swap operands in ReassociateOps In the case where op = add, y = base_ptr, and x = offset, this transform: (op y, (op x, c1)) -> (op (op x, y), c1) breaks the canonical form of add by putting the base pointer in the second operand and the offset in the first. This fix is important for the R600 target, because for some address spaces the base pointer and the offset are stored in separate register classes. The old pattern caused the ISel code for matching addressing modes to put the base pointer and offset in the wrong register classes, which required no-trivial code transformations to fix. llvm-svn: 262148	2016-02-27 19:57:45 +00:00
Chris Dewhurst	0a2c033e2d	Addition of tests to previous check-in. Tests for coprocessor register usage in Sparc. Previous check-in message was: The patch adds missing registers and instructions to complete all the registers supported by the Sparc v8 manual. These are all co-processor registers, with the exception of the floating-point deferred-trap queue register. Although these will not be lowered automatically by any instructions, it allows the use of co-processor instructions implemented by inline-assembly. Code Reviewed at http://reviews.llvm.org/D17133, with the exception of a very small change in brace placement in SparcInstrInfo.td, which was formerly causing a problem in the disassembly of the %fq register. llvm-svn: 262135	2016-02-27 12:52:26 +00:00
Simon Pilgrim	83e76327e8	[X86][AVX] vpermilvar.pd mask element indices only use bit1 llvm-svn: 262134	2016-02-27 12:51:46 +00:00
Chris Dewhurst	053826af69	The patch adds missing registers and instructions to complete all the registers supported by the Sparc v8 manual. These are all co-processor registers, with the exception of the floating-point deferred-trap queue register. Although these will not be lowered automatically by any instructions, it allows the use of co-processor instructions implemented by inline-assembly. Code Reviewed at http://reviews.llvm.org/D17133, with the exception of a very small change in brace placement in SparcInstrInfo.td, which was formerly causing a problem in the disassembly of the %fq register. llvm-svn: 262133	2016-02-27 12:49:59 +00:00
Simon Pilgrim	a9a7bf68ee	[X86][AVX] Added AVX1 target shuffle combine tests llvm-svn: 262132	2016-02-27 12:33:08 +00:00
Chandler Carruth	30811a4dde	[PM] Loosen the regex for the proxy template name even further to cope with 'class' keywords in the template arguments and other silliness. llvm-svn: 262130	2016-02-27 11:07:16 +00:00
Chandler Carruth	08a25ce0e3	[PM] Use a boring regex instead of explicitly naming the analysis manager as some compilers print the typedef name and others print the "canonical" name of the underlying class template. This isn't really an important artifact of the test anyways so it seems fine to just loosen the test assertions here. llvm-svn: 262129	2016-02-27 10:48:14 +00:00
Chandler Carruth	2a54094d40	[PM] Provide two templates for the two directionalities of analysis manager proxies and use those rather than repeating their definition four times. There are real differences between the two directions: outer AMs are const and don't need to have invalidation tracked. But every proxy in a particular direction is identical except for the analysis manager type and the IR unit they proxy into. This makes them prime candidates for nice templates. I've started introducing explicit template instantiation declarations and definitions as well because we really shouldn't be emitting all this everywhere. I'm going to go back and add the same for the other templates like this in a follow-up patch. I've left the analysis manager as an opaque type rather than using two IR units and requiring it to be an AnalysisManager template specialization. I think its important that users retain the ability to provide their own custom analysis management layer and provided it has the appropriate API everything should Just Work. llvm-svn: 262127	2016-02-27 10:38:10 +00:00
Matt Arsenault	360d244d5b	DAGCombiner: Relax sqrt NaN folding check This is OK for +0 since compares to +/-0 give the same result. llvm-svn: 262125	2016-02-27 09:38:05 +00:00
Matt Arsenault	274d34e725	AMDGPU: Add s_sleep intrinsic llvm-svn: 262120	2016-02-27 08:53:52 +00:00
Matt Arsenault	61738cbcb6	AMDGPU: Implement readcyclecounter This matches the behavior of the HSAIL clock instruction. s_realmemtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. llvm-svn: 262119	2016-02-27 08:53:46 +00:00
Sean Silva	ea399f0242	[instrprof] Use __{start,stop}_SECNAME on PS4 too. Summary: The PS4 linker seems to handle this fine. Hi David, it seems that indeed most ELF linkers support __{start,stop}_SECNAME, as our proprietary linker does as well. This follows the pattern of r250679 w.r.t. the testing. Maggie, Phillip, Paul: I've tested this with the PS4 SDK 3.5 toolchain prerelease and it seems to work fine. Reviewers: davidxl Subscribers: probinson, phillip.power, MaggieYi Differential Revision: http://reviews.llvm.org/D17672 llvm-svn: 262112	2016-02-27 06:01:26 +00:00
Kostya Serebryany	3c767db3c5	[libFuzzer] don't emit callbacks to sanitizer run-time in -fsanitize-coverage=trace-pc mode; update libFuzzer doc for previous commit llvm-svn: 262110	2016-02-27 05:45:12 +00:00
Chandler Carruth	ad8cb382fa	[LICM] Teach LICM how to handle cases where the alias set tracker was merged into a loop that was subsequently unrolled (or otherwise nuked). In this case it can't merge in the ASTs for any remaining nested loops, it needs to re-add their instructions dircetly. The fix is very isolated, but I've pulled the code for merging blocks into the AST into a single place in the process. The only behavior change is in the case which would have crashed before. This fixes a crash reported by Mikael Holmen on the list after r261316 restored much of the loop pass pipelining and allowed us to actually do this kind of nested transformation sequenc. I've taken that test case and further reduced it into the somewhat twisty maze of loops in the included test case. This does in fact trigger the bug even in this reduced form. llvm-svn: 262108	2016-02-27 04:34:07 +00:00
Mike Aizatsky	0d202ffa7c	[sancov] print_coverage_points command. Differential Revision: http://reviews.llvm.org/D17670 llvm-svn: 262104	2016-02-27 02:21:44 +00:00
Reid Kleckner	892ae2e2b6	[InstCombine] Be more conservative about removing stackrestore We ended up removing a save/restore pair around an inalloca call, leading to a miscompile in Chromium. llvm-svn: 262095	2016-02-27 00:53:54 +00:00
Paul Robinson	4b618dcc93	Revert r262092, caught LLD tests llvm-svn: 262093	2016-02-26 23:44:10 +00:00
Paul Robinson	abcfa39566	[FileCheck] Abort if -NOT is combined with another suffix. Combinations of suffixes that look useful actually are ignored; complaining about them will avoid mistakes. Differential Revision: http://reviews.llvm.org/D17587 llvm-svn: 262092	2016-02-26 23:34:02 +00:00
Cong Hou	e0eb8bfe37	Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64. llvm-svn: 262091	2016-02-26 23:25:30 +00:00
Ahmed Bougacha	0c95decaaa	[X86] Move an encoding test from CodeGen to MC. NFC. llvm-svn: 262089	2016-02-26 23:00:03 +00:00
Ahmed Bougacha	ccf38fd0e2	[X86] Delete old redundant test. NFC. llvm-svn: 262088	2016-02-26 23:00:00 +00:00
Philip Reames	adf0e35308	[LVI] Extend select handling to catch min/max/clamp idioms Most of this is fairly straight forward. Add handling for min/max via existing matcher utility and ConstantRange routines. Add handling for clamp by exploiting condition constraints on inputs. Note that I'm only handling two constant ranges at this point. It would be reasonable to consider treating overdefined as a full range if the instruction is typed as an integer, but that should be a separate change. Differential Revision: http://reviews.llvm.org/D17184 llvm-svn: 262085	2016-02-26 22:53:59 +00:00
Kit Barton	915c5ecee1	[PPC] Legalize FNEG on PPC when possible Currently we always expand ISD::FNEG. For v4f32 and v2f64 vector types VSX has native support for this opcode Phabricator: http://reviews.llvm.org/D17647 llvm-svn: 262079	2016-02-26 21:59:44 +00:00
Sanjay Patel	fc7e7ebf36	[x86, InstCombine] transform x86 AVX2 masked stores to LLVM intrinsics Replicate everything for integers...because x86. Continuation of: http://reviews.llvm.org/rL262064 llvm-svn: 262077	2016-02-26 21:51:44 +00:00
Paul Robinson	1d412f6457	Reapply r262054 with triple fix. llvm-svn: 262069	2016-02-26 21:18:34 +00:00
Kit Barton	93612ec5f2	Power9] Implement new vsx instructions: compare and conversion This change implements the following vsx instructions: Quad/Double-Precision Compare: xscmpoqp xscmpuqp xscmpexpdp xscmpexpqp xscmpeqdp xscmpgedp xscmpgtdp xscmpnedp xvcmpnedp(.) xvcmpnesp(.) Quad-Precision Floating-Point Conversion xscvqpdp(o) xscvdpqp xscvqpsdz xscvqpswz xscvqpudz xscvqpuwz xscvsdqp xscvudqp xscvdphp xscvhpdp xvcvhpsp xvcvsphp xsrqpi xsrqpix xsrqpxp 28 instructions Phabricator: http://reviews.llvm.org/D16709 llvm-svn: 262068	2016-02-26 21:11:55 +00:00
Sanjay Patel	1ace99351f	[x86, InstCombine] transform x86 AVX masked stores to LLVM intrinsics The intended effect of this patch in conjunction with: http://reviews.llvm.org/rL259392 http://reviews.llvm.org/rL260145 is that customers using the AVX intrinsics in C will benefit from combines when the store mask is constant: void mstore_zero_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(0), v); } void mstore_fake_ones_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(1), v); } void mstore_ones_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set1_epi32(0x80000000), v); } void mstore_one_set_elt_mask(float f, __m128 v) { _mm_maskstore_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0), v); } ...so none of the above will actually generate a masked store for optimized code. Differential Revision: http://reviews.llvm.org/D17485 llvm-svn: 262064	2016-02-26 21:04:14 +00:00
Paul Robinson	d68c435a5d	Revert r262054 on one file that fails sometimes. llvm-svn: 262060	2016-02-26 20:41:07 +00:00
Paul Robinson	51fa0a87c3	Fix tests that used CHECK-NEXT-NOT and CHECK-DAG-NOT. FileCheck actually doesn't support combo suffixes. Differential Revision: http://reviews.llvm.org/D17588 llvm-svn: 262054	2016-02-26 19:40:34 +00:00
Nirav Dave	2993854bb4	Fix Sparc 32bit Lowering to rebundle up v2i32 values. Summary: Fix LowerCall to rebundle v2i32 values after lowering and add testcase Reviewers: jyknight Subscribers: llvm-commits, jyknight Differential Revision: http://reviews.llvm.org/D17615 llvm-svn: 262048	2016-02-26 18:55:22 +00:00
Sanjay Patel	155193c3aa	[x86, AVX] fold 'isPositive' 256-bit vector integer operations (PR26701) This extends the fold introduced with: http://reviews.llvm.org/rL262036 llvm-svn: 262047	2016-02-26 18:42:50 +00:00
Sanjay Patel	334685b486	[x86, AVX] add 256-bit tests llvm-svn: 262044	2016-02-26 18:07:58 +00:00
Sanjay Patel	4402a32b32	[x86, SSE] fold 'isPositive' vector integer operations (PR26701) This is one of the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 Shift and negate is what InstCombine appears to prefer, so I've started with that pattern. Note that the 'pcmpeq' instructions are always generating the negative one for the actual 'pcmpgt' comparison in each case (side note: why isn't there an alias mnemonic for that?). Differential Revision: http://reviews.llvm.org/D17630 llvm-svn: 262036	2016-02-26 16:56:03 +00:00
Reid Kleckner	70c9bc71d4	[WinEH] Fix funclet return block clobber mask placement MBB slot index intervals are half open, not closed. getMBBEndIndex() returns the slot index of the start of the next block in layout order. Placing a register mask there is incorrect if the successor of the funclet return is not laid out after the return. Clang generates IR for catch bodies before generating the following normal code, so we never noticed this issue until the D frontend authors filed a bug about it. Instead, we can put the clobber mask on the last instruction of the funclet return block. We still aren't using a register mask operand on the CATCHRET instruction because it would cause PEI to spill all CSRs, including XMM regs, in the prologue. Fixes PR26679. llvm-svn: 262035	2016-02-26 16:53:19 +00:00
Chris Dewhurst	829b104dc2	Reverting breaking change. Sorry. llvm-svn: 262007	2016-02-26 12:20:10 +00:00
Chris Dewhurst	9c3bf91d6e	Reviewed at reviews.llvm.org/D17133 llvm-svn: 262005	2016-02-26 11:46:47 +00:00
Chandler Carruth	3a63435551	[PM] Introduce CRTP mixin base classes to help define passes and analyses in the new pass manager. These just handle really basic stuff: turning a type name into a string statically that is nice to print in logs, and getting a static unique ID for each analysis. Sadly, the format of passes in anonymous namespaces makes using their names in tests really annoying so I've customized the names of the no-op passes to keep tests sane to read. This is the first of a few simplifying refactorings for the new pass manager that should reduce boilerplate and confusion. llvm-svn: 262004	2016-02-26 11:44:45 +00:00
Nikolay Haustov	2f684f1347	[AMDGPU] Assembler: Basic support for MIMG Add parsing and printing of image operands. Matches legacy sp3 assembler. Change image instruction order to have data/image/sampler operands in the beginning. This is needed because optional operands in MC are always last. Update SITargetLowering for new order. Add basic MC test. Update CodeGen tests. Review: http://reviews.llvm.org/D17574 llvm-svn: 261995	2016-02-26 09:51:05 +00:00
Simon Pilgrim	cf5352db84	[X86][F16C] Added native IR half/float conversion tests. Placeholder tests until we start improving native vector support. llvm-svn: 261989	2016-02-26 08:52:29 +00:00
David Blaikie	f1958da1c3	llvm-dwp: provide diagnostics for duplicate DWO IDs These diagnostics aren't perfect - in the case of merging several dwos into dwps and those dwps into more dwps - just getting the message about the original source file name might not be much help (since it's the same in both dwos, by definition - but doesn't tell you which chain of dwps to backtrack) It might be worth adding the DW_AT_dwo_id to the split debug info to improve the diagnostic experience - might help track down the duplicates better. llvm-svn: 261988	2016-02-26 07:30:15 +00:00
David Blaikie	5d6d4dc306	llvm-dwp: Support empty .dwo files Though a bit odd, this is handy for a few reasons - for example, in a build system that wants consistent input/output of build steps, but where split-dwarf might be overriden/disabled by the user on a per-file basis. llvm-svn: 261987	2016-02-26 07:04:58 +00:00
Craig Topper	d50b5f8abc	[X86] Add test cases for r261977 and fix a grammatical error. llvm-svn: 261983	2016-02-26 06:50:24 +00:00
Haicheng Wu	5539f852ae	[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors() This change tries to find more opportunities to thread over basic blocks. llvm-svn: 261981	2016-02-26 06:06:04 +00:00
Hongbin Zheng	b8bb0d8813	Another fix the testcase introduced by r261903 - Add the missing matches llvm-svn: 261971	2016-02-26 03:41:47 +00:00
Matthias Braun	9dcd65f478	MachineCopyPropagation: Catch copies of the form A<-B;A<-B Differential Revision: http://reviews.llvm.org/D17475 llvm-svn: 261966	2016-02-26 03:18:55 +00:00
Matthias Braun	e39ff70685	MachineCopyPropagation: Keep scanning through instructions with regmasks This also simplifies the code by removing the overly conservative NoInterveningSideEffect() function. This function checked: - That the two copies belong to the same block: We only process one block at a time and clear our maps in between it is impossible to find a copy from a different block. - There is no terminator between the two copy instructions: This is not allowed anyway (the MachineVerifier would complain) - Does not have instructions with hasUnmodeledSideEffects() or isCall() set: Even for those instructuction we must have all clobbers/defs of registers explicit as an operand. If the register is explicitely clobbered we would never come to the point of checking for NoInterveningSideEffect() anyway. (I also checked this with a temporary build of the test-suite with all potentially failing conditions in NoInterveningSideEffect() turned into asserts) Differential Revision: http://reviews.llvm.org/D17474 llvm-svn: 261965	2016-02-26 03:18:50 +00:00
Xinliang David Li	23682e9cab	[PGO] Add test case to ensure covmap section is not allocatable. Differential Revision: http://reviews.llvm.org/D17324 llvm-svn: 261959	2016-02-26 03:05:10 +00:00
Mike Aizatsky	5971f18133	[sancov] Pruning full dominator blocks from instrumentation. Summary: This is the first simple attempt to reduce number of coverage- instrumented blocks. If a basic block dominates all its successors, then its coverage information is useless to us. Ingore such blocks if santizer-coverage-prune-tree option is set. Differential Revision: http://reviews.llvm.org/D17626 llvm-svn: 261949	2016-02-26 01:17:22 +00:00
Sanjay Patel	7ed9361896	[x86, SSE] add tests to show missing pcmp folds llvm-svn: 261948	2016-02-26 01:14:27 +00:00
David Majnemer	08dd52dc75	[WinEH] Don't remove unannotated inline-asm calls Inline-asm calls aren't annotated with funclet bundle operands because they don't throw and cannot be inlined through. We shouldn't require them to bear an funclet bundle operand. llvm-svn: 261942	2016-02-26 00:04:25 +00:00
Hemant Kulkarni	de1152f444	Reverts change r261907 and r261918 llvm-svn: 261927	2016-02-25 20:47:07 +00:00
Hongbin Zheng	8c70ab75a0	Use regex in testcase, do not fail windows bots llvm-svn: 261922	2016-02-25 19:16:40 +00:00
Hemant Kulkarni	2a834115bf	[llvm-readobj] Enable GNU style sections and relocations printing http://reviews.llvm.org/D17523 llvm-svn: 261907	2016-02-25 18:02:00 +00:00
Hongbin Zheng	bc53977a0d	Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261904	2016-02-25 17:54:25 +00:00
Hongbin Zheng	751337faa7	Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261903	2016-02-25 17:54:15 +00:00
Hongbin Zheng	3f97840721	Introduce analysis pass to compute PostDominators in the new pass manager. NFC Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261902	2016-02-25 17:54:07 +00:00
Tim Northover	aa35bd26c7	ARM: disallow pc as a base register in Thumb2 memory ops. These should all be deferring to the "OP (literal)" variant according to the ARM ARM. llvm-svn: 261895	2016-02-25 16:54:52 +00:00
Hongbin Zheng	66b19fbc4e	Revert "Introduce analysis pass to compute PostDominators in the new pass manager. NFC" This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018. llvm-svn: 261891	2016-02-25 16:45:53 +00:00
Hongbin Zheng	ad782ce3f7	Revert "Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC" This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2. llvm-svn: 261890	2016-02-25 16:45:46 +00:00
Hongbin Zheng	921fabf34b	Revert "Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC" This reverts commit 8228b4d374edeb4cc0c5fddf6e1ab876918ee126. llvm-svn: 261889	2016-02-25 16:45:37 +00:00
Hongbin Zheng	2fa386fd6c	Introduce RegionInfoAnalysis, which compute Region Tree in the new PassManager. NFC Differential Revision: http://reviews.llvm.org/D17571 llvm-svn: 261884	2016-02-25 16:33:26 +00:00
Hongbin Zheng	237197ba63	Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC Differential Revision: http://reviews.llvm.org/D17570 llvm-svn: 261883	2016-02-25 16:33:15 +00:00
Hongbin Zheng	a0273a04f5	Introduce analysis pass to compute PostDominators in the new pass manager. NFC Differential Revision: http://reviews.llvm.org/D17537 llvm-svn: 261882	2016-02-25 16:33:06 +00:00
Nikolay Haustov	161a158e5c	[AMDGPU] Disassembler: Support for all VOP1 instructions. Support all instructions with VOP1 encoding with 32 or 64-bit operands for VI subtarget: VGPR_32 and VReg_64 operand register classes VS_32 and VS_64 operand register classes with inline and literal constants Tests for VOP1 instructions. Patch by: skolton Reviewers: arsenm, tstellarAMD Review: http://reviews.llvm.org/D17194 llvm-svn: 261878	2016-02-25 16:09:14 +00:00
Igor Breger	45ef10f110	AVX512F: Add GATHER/SCATTER assembler Intel syntax tests for knl/skx/avx . Change memory operand parser handling. Differential Revision: http://reviews.llvm.org/D17564 llvm-svn: 261862	2016-02-25 13:30:17 +00:00
Hrvoje Varga	46458d0bcc	[mips][microMIPS] Implement DINSU, DINSM, DINS instructions Differential Revision: http://reviews.llvm.org/D16181 llvm-svn: 261860	2016-02-25 12:53:29 +00:00
Chandler Carruth	395fe57374	[PM] Add the IR unit type to the pass manager's logging and make all of the testing more more explicit. This will currently fail on platforms without support for getTypeName. While an assert failure seems too harsh, I'm hoping we're OK with the regression test failure, and I'd like to find out about what platforms actually exist in this state if there are any so we can get implementations in place for them. But if we just can't fix all the host compilers to have a reasonably portable variant of getTypeName and are worried about xfailing this test on those platforms, I can add the horrible regular expression magic to make the tests support "unknown" here as well. llvm-svn: 261853	2016-02-25 10:27:39 +00:00
Simon Pilgrim	e4178ae510	[X86][SSE3] Added combine support for MOVDDUP/MOVSHDUP/MOVSLDUP target shuffles Now that PerformShuffleCombine can handle unary shuffles. llvm-svn: 261843	2016-02-25 09:12:12 +00:00
NAKAMURA Takumi	51db1e0991	Revert r260064, "Disable llvm/test/tools/llvm-profdata/value-prof.proftext on win32 for now. Investigating." It seems unreproducible any more for me. llvm-svn: 261842	2016-02-25 08:50:26 +00:00
Justin Bogner	eecc3c826a	PM: Implement a basic loop pass manager This creates the new-style LoopPassManager and wires it up with dummy and print passes. This version doesn't support modifying the loop nest at all. It will be far easier to discuss and evaluate the approaches to that with this in place so that the boilerplate is out of the way. llvm-svn: 261831	2016-02-25 07:23:08 +00:00
Elena Demikhovsky	e5bbca6ae2	Optimized loading (zextload) of i1 value from memory. This patch is a partial revert of https://llvm.org/svn/llvm-project/llvm/trunk@237793. Extra "and" causes performance degradation. We assume that i1 is stored in zero-extended form. And store operation is responsible for zeroing upper bits. Differential Revision: http://reviews.llvm.org/D17541 llvm-svn: 261828	2016-02-25 07:05:12 +00:00
Justin Bogner	08154bf3d2	IR: Make the X / undef -> undef fold match the comment The constant folding for sdiv and udiv has a big discrepancy between the comments and the code, which looks like a typo. Currently, we're folding X / undef pretty inconsistently: 0 / undef -> undef C / undef -> 0 undef / undef -> 0 Whereas the comments state we do X / undef -> undef. The logic that returns zero is actually commented as doing undef / X -> 0, despite that the LHS isn't undef in many of the cases that hit it. llvm-svn: 261813	2016-02-25 01:02:18 +00:00
Junmo Park	161dc1c605	[CodeGenPrepare] Remove load-based heuristic Summary: Both the hardware and LLVM have changed since 2012. Now, load-based heuristic don't show big differences any more on OoO cores. There is no notable regressons and improvements on spec2000/2006. (Cortex-A57, Core i5). Reviewers: spatel, zansari Differential Revision: http://reviews.llvm.org/D16836 llvm-svn: 261809	2016-02-25 00:23:27 +00:00
Cong Hou	ce16649d49	Move test/CodeGen/Generic/pr26652.ll to test/CodeGen/X86/pr26652.ll and test it only on X86. llvm-svn: 261807	2016-02-25 00:12:18 +00:00
Cong Hou	4ce0280a41	Detecte vector reduction operations just before instruction selection. (This is the second attemp to commit this patch, after fixing pr26652 & pr26653). This patch detects vector reductions before instruction selection. Vector reductions are vectorized reduction operations, and for such operations we have freedom to reorganize the elements of the result as long as the reduction of them stay unchanged. This will enable some reduction pattern recognition during instruction combine such as SAD/dot-product on X86. A flag is added to SDNodeFlags to mark those vector reduction nodes to be checked during instruction combine. To detect those vector reductions, we search def-use chains starting from the given instruction, and check if all uses fall into two categories: 1. Reduction with another vector. 2. Reduction on all elements. in which 2 is detected by recognizing the pattern that the loop vectorizer generates to reduce all elements in the vector outside of the loop, which includes several ShuffleVector and one ExtractElement instructions. Differential revision: http://reviews.llvm.org/D15250 llvm-svn: 261804	2016-02-24 23:40:36 +00:00
Sanjay Patel	8ad2c4eeb0	add tests to show missing bitcasted logic transform llvm-svn: 261799	2016-02-24 22:31:18 +00:00
Anna Zaks	40148f1716	[asan] Do not instrument globals in the special "LLVM" sections llvm-svn: 261794	2016-02-24 22:12:18 +00:00
Matthias Braun	aca625a4fe	MachineInstr: Respect register aliases in clearRegiserKills() This fixes bugs in copy elimination code in llvm. It slightly changes the semantics of clearRegisterKills(). This is appropriate because: - Users in lib/CodeGen/MachineCopyPropagation.cpp and lib/Target/AArch64RedundantCopyElimination.cpp and lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it (see included testcase). - All other users in llvm are unaffected (they pass TRI==nullptr) - (Kill flags are optional anyway so removing too many shouldn't hurt.) Differential Revision: http://reviews.llvm.org/D17554 llvm-svn: 261763	2016-02-24 19:21:48 +00:00
Tim Northover	ca8e7e2e23	AArch64: remove CRC feature from Cyclone. Turns out we don't actually support those instructions. llvm-svn: 261759	2016-02-24 18:10:17 +00:00
Simon Pilgrim	cd25a2bef0	[X86][SSSE3] Added target shuffle combine tests for SSE3/SSSE3 specific shuffles. Allows us to test SSSE3 PSHUFB intrinsic. llvm-svn: 261753	2016-02-24 17:08:59 +00:00
Sanjay Patel	34ad6b32ee	remove fixme comment that was fixed with r261750 llvm-svn: 261752	2016-02-24 17:08:29 +00:00
Sanjay Patel	dbbaca0e1b	[InstCombine] enable optimization of casted vector xor instructions This is part of the payoff for the refactoring in: http://reviews.llvm.org/rL261649 http://reviews.llvm.org/rL261707 In addition to removing a pile of duplicated code, the xor case was missing the optimization for vector types because it checked "SrcTy->isIntegerTy()" rather than "SrcTy->isIntOrIntVectorTy()" like 'and' and 'or' were already doing. This solves part of: https://llvm.org/bugs/show_bug.cgi?id=26702 llvm-svn: 261750	2016-02-24 17:00:34 +00:00
Sanjay Patel	3a74fb51af	add test to show missing bitcasted vector xor fold llvm-svn: 261748	2016-02-24 16:34:29 +00:00
Anton Korobeynikov	064dbac212	`MSP430InstrInfo::loadRegFromStackSlot` forgets to set register def. Summary: For instance, compiling the below results in a panic: ``` llc: ../lib/CodeGen/InlineSpiller.cpp:1140: bool (anonymous namespace)::InlineSpiller::foldMemoryOperand(ArrayRef<std::pair<MachineInstr , unsigned int> >, llvm::MachineInstr ): Assertion `MO->isDead() && "Cannot fold physreg def"' failed. #0 0x00007f50fbcf353e llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:321:15 #1 0x00007f50fbcf3929 PrintStackTraceSignalHandler(void) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:380:1 #2 0x00007f50fbcf22a3 llvm::sys::RunSignalHandlers() /home/h/3rd/llvm/build/../lib/Support/Signals.cpp:45:5 #3 0x00007f50fbcf3bb4 SignalHandler(int) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:210:1 #4 0x00007f50fa87a180 (/lib/x86_64-linux-gnu/libc.so.6+0x35180) #5 0x00007f50fa87a107 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x35107) #6 0x00007f50fa87b4e8 abort (/lib/x86_64-linux-gnu/libc.so.6+0x364e8) #7 0x00007f50fa873226 (/lib/x86_64-linux-gnu/libc.so.6+0x2e226) #8 0x00007f50fa8732d2 (/lib/x86_64-linux-gnu/libc.so.6+0x2e2d2) #9 0x00007f50fddd9287 (anonymous namespace)::InlineSpiller::foldMemoryOperand(llvm::ArrayRef<std::pair<llvm::MachineInstr, unsigned int> >, llvm::MachineInstr) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1141:21 #10 0x00007f50fddd9ee9 (anonymous namespace)::InlineSpiller::spillAroundUses(unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1286:9 #11 0x00007f50fddd388b (anonymous namespace)::InlineSpiller::spillAll() /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1338:21 #12 0x00007f50fddd221d (anonymous namespace)::InlineSpiller::spill(llvm::LiveRangeEdit&) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1391:3 #13 0x00007f50fdfd921b (anonymous namespace)::RAGreedy::selectOrSplitImpl(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&, llvm::SmallSet<unsigned int, 16u, std::less<unsigned int> >&, unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2555:5 #14 0x00007f50fdfd647b (anonymous namespace)::RAGreedy::selectOrSplit(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2221:12 #15 0x00007f50fdfc89f9 llvm::RegAllocBase::allocatePhysRegs() /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocBase.cpp:110:14 #16 0x00007f50fdfd6337 (anonymous namespace)::RAGreedy::runOnMachineFunction(llvm::MachineFunction&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2611:3 #17 0x00007f50fded33ee llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/CodeGen/MachineFunctionPass.cpp:43:3 #18 0x00007f50fd6cdc6f llvm::FPPassManager::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1550:23 #19 0x00007f50fd6cdf85 llvm::FPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1571:16 #20 0x00007f50fd6ce71a (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1627:23 #21 0x00007f50fd6ce246 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1730:16 #22 0x00007f50fd6cec31 llvm::legacy::PassManager::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1761:3 #23 0x0000000000415bdc compileModule(char, llvm::LLVMContext&) /home/h/3rd/llvm/build/../tools/llc/llc.cpp:405:5 #24 0x0000000000414571 main /home/h/3rd/llvm/build/../tools/llc/llc.cpp:211:13 #25 0x00007f50fa866b45 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b45) #26 0x0000000000414296 _start (/home/h/3rd/llvm/build/bin/llc+0x414296) Stack dump: 0. Program arguments: ./bin/llc -mtriple msp430 loadstore.ll 1. Running pass 'Function Pass Manager' on module 'loadstore.ll'. 2. Running pass 'Greedy Register Allocator' on function '@inc' ``` Original IR: ```llvm %struct.VeryLarge = type { i8, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } ; Function Attrs: norecurse nounwind define void @inc(%struct.VeryLarge noalias nocapture sret %agg.result, %struct.VeryLarge* byval align 1 %s) #0 { entry: %p0 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 0 %0 = load i8, i8* %p0, align 1, !tbaa !1 %p1 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 1 %1 = load i32, i32* %p1, align 1, !tbaa !6 %p2 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 2 %2 = load i32, i32* %p2, align 1, !tbaa !7 %p3 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 3 %3 = load i32, i32* %p3, align 1, !tbaa !8 %p4 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 4 %4 = load i32, i32* %p4, align 1, !tbaa !9 %p5 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 5 %5 = load i32, i32* %p5, align 1, !tbaa !10 %p6 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 6 %6 = load i32, i32* %p6, align 1, !tbaa !11 %p7 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 7 %7 = load i32, i32* %p7, align 1, !tbaa !12 %p8 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 8 %8 = load i32, i32* %p8, align 1, !tbaa !13 %p9 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 9 %9 = load i32, i32* %p9, align 1, !tbaa !14 %p10 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 10 %10 = load i32, i32* %p10, align 1, !tbaa !15 %p11 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 11 %11 = load i32, i32* %p11, align 1, !tbaa !16 %p12 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 12 %12 = load i32, i32* %p12, align 1, !tbaa !17 %p13 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 13 %13 = load i32, i32* %p13, align 1, !tbaa !18 %p14 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 14 %14 = load i32, i32* %p14, align 1, !tbaa !19 %p15 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 15 %15 = load i32, i32* %p15, align 1, !tbaa !20 %p16 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 16 %16 = load i32, i32* %p16, align 1, !tbaa !21 %p17 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 17 %17 = load i32, i32* %p17, align 1, !tbaa !22 %p18 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 18 %18 = load i32, i32* %p18, align 1, !tbaa !23 %p19 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 19 %19 = load i32, i32* %p19, align 1, !tbaa !24 %p20 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 20 %20 = load i32, i32* %p20, align 1, !tbaa !25 %p21 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 21 %21 = load i32, i32* %p21, align 1, !tbaa !26 %p22 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 22 %22 = load i32, i32* %p22, align 1, !tbaa !27 %p23 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 23 %23 = load i32, i32* %p23, align 1, !tbaa !28 %p24 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 24 %24 = load i32, i32* %p24, align 1, !tbaa !29 %p25 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 25 %25 = load i32, i32* %p25, align 1, !tbaa !30 %p26 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 26 %26 = load i32, i32* %p26, align 1, !tbaa !31 %p27 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 27 %27 = load i32, i32* %p27, align 1, !tbaa !32 %p28 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 28 %28 = load i32, i32* %p28, align 1, !tbaa !33 %p29 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 29 %29 = load i32, i32* %p29, align 1, !tbaa !34 %p30 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 30 %30 = load i32, i32* %p30, align 1, !tbaa !35 %p31 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 31 %31 = load i32, i32* %p31, align 1, !tbaa !36 %p32 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 32 %32 = load i32, i32* %p32, align 1, !tbaa !37 %add = add i8 %0, 1 store i8 %add, i8* %p0, align 1, !tbaa !1 %add2 = add i32 %1, 2 store i32 %add2, i32* %p1, align 1, !tbaa !6 %add3 = add i32 %2, 3 store i32 %add3, i32* %p2, align 1, !tbaa !7 %add4 = add i32 %3, 4 store i32 %add4, i32* %p3, align 1, !tbaa !8 %add5 = add i32 %4, 5 store i32 %add5, i32* %p4, align 1, !tbaa !9 %add6 = add i32 %5, 6 store i32 %add6, i32* %p5, align 1, !tbaa !10 %add7 = add i32 %6, 7 store i32 %add7, i32* %p6, align 1, !tbaa !11 %add8 = add i32 %7, 8 store i32 %add8, i32* %p7, align 1, !tbaa !12 %add9 = add i32 %8, 9 store i32 %add9, i32* %p8, align 1, !tbaa !13 %add10 = add i32 %9, 10 store i32 %add10, i32* %p9, align 1, !tbaa !14 %add11 = add i32 %10, 11 store i32 %add11, i32* %p10, align 1, !tbaa !15 %add12 = add i32 %11, 12 store i32 %add12, i32* %p11, align 1, !tbaa !16 %add13 = add i32 %12, 13 store i32 %add13, i32* %p12, align 1, !tbaa !17 %add14 = add i32 %13, 14 store i32 %add14, i32* %p13, align 1, !tbaa !18 %add15 = add i32 %14, 15 store i32 %add15, i32* %p14, align 1, !tbaa !19 %add16 = add i32 %15, 16 store i32 %add16, i32* %p15, align 1, !tbaa !20 %add17 = add i32 %16, 17 store i32 %add17, i32* %p16, align 1, !tbaa !21 %add18 = add i32 %17, 18 store i32 %add18, i32* %p17, align 1, !tbaa !22 %add19 = add i32 %18, 19 store i32 %add19, i32* %p18, align 1, !tbaa !23 %add20 = add i32 %19, 20 store i32 %add20, i32* %p19, align 1, !tbaa !24 %add21 = add i32 %20, 21 store i32 %add21, i32* %p20, align 1, !tbaa !25 %add22 = add i32 %21, 22 store i32 %add22, i32* %p21, align 1, !tbaa !26 %add23 = add i32 %22, 23 store i32 %add23, i32* %p22, align 1, !tbaa !27 %add24 = add i32 %23, 24 store i32 %add24, i32* %p23, align 1, !tbaa !28 %add25 = add i32 %24, 25 store i32 %add25, i32* %p24, align 1, !tbaa !29 %add26 = add i32 %25, 26 store i32 %add26, i32* %p25, align 1, !tbaa !30 %add27 = add i32 %26, 27 store i32 %add27, i32* %p26, align 1, !tbaa !31 %add28 = add i32 %27, 28 store i32 %add28, i32* %p27, align 1, !tbaa !32 %add29 = add i32 %28, 29 store i32 %add29, i32* %p28, align 1, !tbaa !33 %add30 = add i32 %29, 30 store i32 %add30, i32* %p29, align 1, !tbaa !34 %add31 = add i32 %30, 31 store i32 %add31, i32* %p30, align 1, !tbaa !35 %add32 = add i32 %31, 32 store i32 %add32, i32* %p31, align 1, !tbaa !36 %add33 = add i32 %32, 33 store i32 %add33, i32* %p32, align 1, !tbaa !37 %33 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %agg.result, i32 0, i32 0 call void @llvm.memcpy.p0i8.p0i8.i32(i8* %33, i8* %p0, i32 129, i32 1, i1 false), !tbaa.struct !38 ret void } ; Function Attrs: argmemonly nounwind declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i32, i1) #1 attributes #0 = { norecurse nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } attributes #1 = { argmemonly nounwind } !llvm.ident = !{!0} !0 = !{!"clang version 3.8.0 (git://github.com/llvm-mirror/clang 40ef2b7531472c41212c4719a9294aeb7bddebbc) (git://github.com/llvm-mirror/llvm c601eaf55606dfb9ad372b514b77aa00d1409be1)"} !1 = !{!2, !3, i64 0} !2 = !{!"", !3, i64 0, !5, i64 1, !5, i64 5, !5, i64 9, !5, i64 13, !5, i64 17, !5, i64 21, !5, i64 25, !5, i64 29, !5, i64 33, !5, i64 37, !5, i64 41, !5, i64 45, !5, i64 49, !5, i64 53, !5, i64 57, !5, i64 61, !5, i64 65, !5, i64 69, !5, i64 73, !5, i64 77, !5, i64 81, !5, i64 85, !5, i64 89, !5, i64 93, !5, i64 97, !5, i64 101, !5, i64 105, !5, i64 109, !5, i64 113, !5, i64 117, !5, i64 121, !5, i64 125} !3 = !{!"omnipotent char", !4, i64 0} !4 = !{!"Simple C/C++ TBAA"} !5 = !{!"int", !3, i64 0} !6 = !{!2, !5, i64 1} !7 = !{!2, !5, i64 5} !8 = !{!2, !5, i64 9} !9 = !{!2, !5, i64 13} !10 = !{!2, !5, i64 17} !11 = !{!2, !5, i64 21} !12 = !{!2, !5, i64 25} !13 = !{!2, !5, i64 29} !14 = !{!2, !5, i64 33} !15 = !{!2, !5, i64 37} !16 = !{!2, !5, i64 41} !17 = !{!2, !5, i64 45} !18 = !{!2, !5, i64 49} !19 = !{!2, !5, i64 53} !20 = !{!2, !5, i64 57} !21 = !{!2, !5, i64 61} !22 = !{!2, !5, i64 65} !23 = !{!2, !5, i64 69} !24 = !{!2, !5, i64 73} !25 = !{!2, !5, i64 77} !26 = !{!2, !5, i64 81} !27 = !{!2, !5, i64 85} !28 = !{!2, !5, i64 89} !29 = !{!2, !5, i64 93} !30 = !{!2, !5, i64 97} !31 = !{!2, !5, i64 101} !32 = !{!2, !5, i64 105} !33 = !{!2, !5, i64 109} !34 = !{!2, !5, i64 113} !35 = !{!2, !5, i64 117} !36 = !{!2, !5, i64 121} !37 = !{!2, !5, i64 125} !38 = !{i64 0, i64 1, !39, i64 1, i64 4, !40, i64 5, i64 4, !40, i64 9, i64 4, !40, i64 13, i64 4, !40, i64 17, i64 4, !40, i64 21, i64 4, !40, i64 25, i64 4, !40, i64 29, i64 4, !40, i64 33, i64 4, !40, i64 37, i64 4, !40, i64 41, i64 4, !40, i64 45, i64 4, !40, i64 49, i64 4, !40, i64 53, i64 4, !40, i64 57, i64 4, !40, i64 61, i64 4, !40, i64 65, i64 4, !40, i64 69, i64 4, !40, i64 73, i64 4, !40, i64 77, i64 4, !40, i64 81, i64 4, !40, i64 85, i64 4, !40, i64 89, i64 4, !40, i64 93, i64 4, !40, i64 97, i64 4, !40, i64 101, i64 4, !40, i64 105, i64 4, !40, i64 109, i64 4, !40, i64 113, i64 4, !40, i64 117, i64 4, !40, i64 121, i64 4, !40, i64 125, i64 4, !40} !39 = !{!3, !3, i64 0} !40 = !{!5, !5, i64 0} ``` Reviewers: asl Subscribers: qcolombet Differential Revision: http://reviews.llvm.org/D17441 llvm-svn: 261746	2016-02-24 15:15:02 +00:00
Simon Pilgrim	3b6feeaa7c	[X86][SSE41] Combine vector blends with zero Part 2 of 2 This patch add support for combining target shuffles into blends-with-zero. Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261745	2016-02-24 15:14:21 +00:00
Simon Pilgrim	dd01f70085	[X86][SSE41] Combine insertion of zero scalars into vector blends with zero Part 1 of 2 This patch attempts to replace the insertion of zero scalars with a vector blend with zero, avoiding the use of the integer insertion instructions (which are particularly slow on many targets). (Part 2 will add support for combining multiple blends-with-zero). Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261743	2016-02-24 14:53:27 +00:00
Simon Pilgrim	e1bb13d67a	[X86][SSE] Fixed vector rotation test name typo Rotation of 16i6 vector not 8i16 vector - copy+paste is not your friend llvm-svn: 261733	2016-02-24 11:39:13 +00:00
David Majnemer	ec72e37220	[SimplifyCFG] Do not blindly remove unreachable blocks DeleteDeadBlock was called indiscriminately, leading to cleanuprets with undef cleanuppad references. Instead, try to drain the BB of most of it's instructions if it is unreachable. We can then remove the BB if it solely consists of a terminator (and maybe some phis). llvm-svn: 261731	2016-02-24 10:02:16 +00:00
David Majnemer	e2ad73759d	[CodeView] Describe variables live in x87 registers We didn't have a mapping from LLVM's x87 floating point registers to CodeView's encoding. llvm-svn: 261730	2016-02-24 10:01:24 +00:00
Michael Zuckerman	a1f2d27da2	[LLVM][AVX512][PSHUFHW ][PSHUFLW ] Change imm8 to int Differential Revision: http://reviews.llvm.org/D17538 llvm-svn: 261725	2016-02-24 08:39:05 +00:00
Igor Breger	c7ba5699c5	AVX512: Add vpmovzxbw/d/q ,vpmovzxw/d/q ,vpmovzxbdq lowering patterns that support 256bit inputs like AVX patterns ( that are disable in case HasVLX , see SS41I_pmovx_avx2_patterns). Differential Revision: http://reviews.llvm.org/D17504 llvm-svn: 261724	2016-02-24 08:15:20 +00:00
Derek Schuff	f9c0a5c377	Revert "[WebAssembly] Stackify code emitted by eliminateFrameIndex" This reverts r261685 due to wasm test breakage. llvm-svn: 261702	2016-02-23 22:13:21 +00:00
Sanjay Patel	55a2b24410	minimize test and use FileCheck llvm-svn: 261701	2016-02-23 22:03:44 +00:00
Derek Schuff	b21570cc1d	[WebAssembly] Stackify code emitted by eliminateFrameIndex llvm-svn: 261685	2016-02-23 21:25:17 +00:00
Tim Northover	b9edc5ca21	ARM: fix handling of movw/movt relocations with addend. We were emitting only one half of a the paired relocations needed for these instructions because we decided that an offset needed a scattered relocation. In fact, movw/movt relocations can be paired without being scattered. llvm-svn: 261679	2016-02-23 20:20:23 +00:00
Geoff Berry	27b1ded41a	[AArch64] Generate csinv instruction more often Reviewers: t.p.northover, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17546 llvm-svn: 261675	2016-02-23 19:34:13 +00:00
Hans Wennborg	d3661cd140	Revert r261633 "Supporting all entities declared in lexical scope in LLVM debug info." This and the corresponding Clang change caused PR26715. llvm-svn: 261671	2016-02-23 19:17:03 +00:00
Derek Schuff	4b3bb213b2	[WebAssembly] Implement red zone for user stack Implements a mostly-conventional redzone for the userspace stack. Because we have unsigned load/store offsets we continue to use a local SP subtracted from the incoming SP but do not write it back to memory. Differential Revision: http://reviews.llvm.org/D17525 llvm-svn: 261662	2016-02-23 18:13:07 +00:00
David Majnemer	223538f764	[WinEH] Don't inline an 'unwinds to caller' cleanupret into funclets which locally unwind It is problematic if the inlinee has a cleanupret which unwinds to caller and we inline it into a call site which doesn't unwind. If the funclet unwinds anywhere other than to the caller, then we will give the funclet two unwind destinations. This will result in a verifier failure. Seeing as how the caller wasn't an invoke (which would locally unwind) and that the funclet cannot unwind to caller, we must conclude that an 'unwind to caller' cleanupret is dynamically unreachable. This fixes PR26698. Differential Revision: http://reviews.llvm.org/D17536 llvm-svn: 261656	2016-02-23 17:11:04 +00:00
Geoff Berry	a1c6269c91	[AArch64] Fix fastcc -tailcallopt epilog code generation. Summary: Fix a bug in epilog generation where the incoming stack arguments were not being popped for fastcc functions when -tailcallopt was passed. Reviewers: t.p.northover, mcrosier, jmolloy, rengolin Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16894 llvm-svn: 261650	2016-02-23 16:54:36 +00:00
Amjad Aboud	fc8f296782	Supporting all entities declared in lexical scope in LLVM debug info. Differential Revision: http://reviews.llvm.org/D15976 llvm-svn: 261633	2016-02-23 13:36:51 +00:00
Alexander Kornienko	d95b5d4490	Remove a space after a trailing backslash. llvm-svn: 261629	2016-02-23 11:19:56 +00:00
Nikolay Haustov	2a62b3c244	[AMDGPU] Fix operands of S_BFE_U64 and S_BFM_B64 src1 of s_bfe_u64 is 32-bit (same as s_bfe_i64). src0 and src1 of s_bfm_b64 are 32-bit. Update tests. Review: http://reviews.llvm.org/D17480 Reviewers: arsenm llvm-svn: 261621	2016-02-23 09:19:14 +00:00
Igor Breger	1360f4db47	AVX512: Fix predicate of AVX pcmpeqw/b , pcmpgtb/w/d instructions . AVX512 version of this instructions return result in kmask register, so AVX patterns should not be disabled. Differential Revision: http://reviews.llvm.org/D17517 llvm-svn: 261619	2016-02-23 08:55:33 +00:00
David Majnemer	17525aba8a	[WinEH] Visit 'unwind to caller' catchswitches nested in catchswitches We had the right logic for the nested cleanuppad case but omitted it for catchswitches. llvm-svn: 261615	2016-02-23 07:18:15 +00:00
Dehao Chen	f84b630044	Add prefix based function layout when profile is available. Summary: If a function is hot, put it in text.hot section. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17532 llvm-svn: 261607	2016-02-23 03:39:24 +00:00
Duncan P. N. Exon Smith	b3613fce19	Revert "Add prefix based function layout when profile is available." This reverts commit r261582, since this bot has been broken for four hours: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/19399/ llvm-svn: 261604	2016-02-23 02:28:40 +00:00
Matt Arsenault	d38272b214	AMDGPU: Add failing testcase for register coalescer llvm-svn: 261592	2016-02-22 23:45:42 +00:00
Krzysztof Parzyszek	e261e5ac47	More detailed dependence test between volatile and non-volatile accesses Differential Revision: http://reviews.llvm.org/D16857 llvm-svn: 261589	2016-02-22 23:07:43 +00:00
Dehao Chen	6c73b49911	Set function entry count as 0 if sample profile is not found for the function. Summary: This change makes the sample profile's behavior consistent with instr profile. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17522 llvm-svn: 261587	2016-02-22 22:46:21 +00:00
David Majnemer	964b70d559	[X86] Create mergeable constant pool entries for AVX We supported creating mergeable constant pool entries for smaller constants but not for 32-byte AVX constants. llvm-svn: 261584	2016-02-22 22:23:11 +00:00
Dehao Chen	c5f76f7347	Add prefix based function layout when profile is available. Summary: If a function is hot, put it in text.hot section. Reviewers: davidxl Subscribers: eraman, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17460 llvm-svn: 261582	2016-02-22 22:14:14 +00:00
Derek Schuff	27e3b8a6e3	[WebAssembly] Fix writeback of stack pointer with dynamic alloca Previously the stack pointer was only written back to memory in the prolog. But this is wrong for dynamic allocas, for which target-independent codegen handles SP updates after the prolog (and possibly even in another BB). Instead update the SP global in ADJCALLSTACKDOWN which is generated after the SP update sequence. This will have further refinements when we add red zone support. llvm-svn: 261579	2016-02-22 21:57:17 +00:00
Adam Nemet	fb31d580ea	[LoopDataPrefetch] Make it testable with opt Summary: Since this is an IR pass it's nice to be able to write tests without llc. This is the counterpart of the llc test under CodeGen/PowerPC/loop-data-prefetch.ll. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17464 llvm-svn: 261578	2016-02-22 21:41:22 +00:00
Michael Zolotukhin	d734bea8ba	[LoopUnrolling] Fix a bug introduced in r259869 (PR26688). The issue was that we only required LCSSA rebuilding if the immediate parent-loop had values used outside of it. The fix is to enaable the same logic for all outer loops, not only immediate parent. llvm-svn: 261575	2016-02-22 21:21:45 +00:00
Matt Arsenault	d85e5a1303	AMDGPU: Fix alignments in test I don't think this test was intending to test unaligned load/store. Change it to use the natural alignment to avoid regressing. Also adds missing SI checks. llvm-svn: 261571	2016-02-22 21:04:23 +00:00
Matt Arsenault	fa67bdbde0	AMDGPU/R600: Implement allowsMisalignedMemoryAccess This avoids some test regressions in a future commit when unaligned operations are expanded when they have custom lowering. llvm-svn: 261570	2016-02-22 21:04:16 +00:00
Philip Reames	ce38c2ddf6	[RS4GC] "Constant fold" the rs4gc-split-vector-values flag This flag was part of a migration to a new means of handling vectors-of-points which was described in the llvm-dev thread "FYI: Relocating vector of pointers". The old code path has been off by default for a while without complaints, so time to cleanup. llvm-svn: 261569	2016-02-22 21:01:28 +00:00
Tim Northover	d32f8e60bf	ARM: sink atomic release barrier as far as possible into cmpxchg. DMB instructions can be expensive, so it's best to avoid them if possible. In atomicrmw operations there will always be an attempted store so a release barrier is always needed, but in the cmpxchg case we can delay the DMB until we know we'll definitely try to perform a store (and so need release semantics). In the strong cmpxchg case this isn't quite free: we must duplicate the LDREX instructions to skip the barrier on subsequent iterations. The basic outline becomes: ldrex rOld, [rAddr] cmp rOld, rDesired bne Ldone dmb Lloop: strex rRes, rNew, [rAddr] cbz rRes Ldone ldrex rOld, [rAddr] cmp rOld, rDesired beq Lloop Ldone: So we'll skip this version for strong operations in "minsize" functions. llvm-svn: 261568	2016-02-22 20:55:50 +00:00
Philip Reames	79fa9b75c0	[RS4GC] Revert optimization attempt due to memory corruption This change reverts "246133 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers" and a follow on bugfix 12575. As pointed out in pr25846, this code suffers from a memory corruption bug. Since I'm (empirically) not going to get back to this any time soon, simply reverting the problematic change is the right answer. llvm-svn: 261565	2016-02-22 20:45:56 +00:00
Dan Gohman	3b09d279be	[WebAssembly] Teach address folding to fold bitwise-or nodes. LLVM converts adds into ors when it can prove that the operands don't share any non-zero bits. Teach address folding to recognize or instructions with constant operands with this property that can be folded into addresses as if they were adds. llvm-svn: 261562	2016-02-22 20:04:02 +00:00
Tom Stellard	d93a34f714	[AMDGPU][llvm-mc] Support for 32-bit inline literals Patch by: Artem Tamazov Summary: Note: Support for 64-bit inline literals TBD Added: Support of abs/neg modifiers for literals (incomplete; parsing TBD). Added: Some TODO comments. Reworked/clarity: rename isInlineImm() to isInlinableImm() Reworked/robustness: disallow BitsToFloat() with undefined value in isInlinableImm() Reworked/reuse: isSSrc32/64(), isVSrc32/64() Tests added. Reviewers: tstellarAMD, arsenm Subscribers: vpykhtin, nhaustov, SamWot, arsenm Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D17204 llvm-svn: 261559	2016-02-22 19:17:56 +00:00
Tom Stellard	82fc153baa	[AMDGPU] [llvm-mc] [VI] Fix encoding of LDS/GDS instructions. Patch by: Artem Tamazov Summary: Tests added. Reviewers: tstellarAMD, arsenm Subscribers: vpykhtin, SamWot, #llvm-amdgpu-spb Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D17271 llvm-svn: 261558	2016-02-22 19:17:53 +00:00
Justin Lebar	ccbd8f5a02	Revert "[attrs] Handle convergent CallSites." This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll. llvm-svn: 261549	2016-02-22 18:24:43 +00:00
Nemanja Ivanovic	a8ef3c9b86	Fix for PR26690 take 2 This is what was meant to be in the initial commit to fix this bug. The parens were missing. This commit also adds a test case for the bug and has undergone full testing on PPC and X86. llvm-svn: 261546	2016-02-22 18:04:00 +00:00
Justin Lebar	7bf9187abb	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Reviewers: chandlerc, jingyue Subscribers: hfinkel, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17317 llvm-svn: 261544	2016-02-22 17:51:35 +00:00
Justin Lebar	5b82c9ba31	Don't tail-duplicate blocks that contain convergent instructions. Summary: Convergent instrs shouldn't be made control-dependent on other values, but this is basically the whole point of tail duplication. So just bail if we see a convergent instruction. Reviewers: iteratee Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits Differential Revision: http://reviews.llvm.org/D17320 llvm-svn: 261540	2016-02-22 17:50:52 +00:00
Dan Gohman	595e8ab22d	[WebAssembly] Properly ignore llvm.dbg.value instructions. llvm-svn: 261538	2016-02-22 17:45:20 +00:00
Zoran Jovanovic	d665a66b0f	[mips] added support for trunc macro Author: obucina Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D15745 llvm-svn: 261529	2016-02-22 16:00:23 +00:00
Igor Breger	252c2d9680	AVX512F: Add assembler Intel syntax tests for knl, fix minor bugs. Differential Revision: http://reviews.llvm.org/D17498 llvm-svn: 261521	2016-02-22 12:37:41 +00:00
Igor Breger	4511e76e5c	AVX512: Fix scalar mem operands. Differential Revision: http://reviews.llvm.org/D17500 llvm-svn: 261520	2016-02-22 11:48:27 +00:00
Elena Demikhovsky	9914dbd11b	Allow setting MaxRerollIterations above 16 By Ayal Zaks. Differential Revision http://reviews.llvm.org/D17258 llvm-svn: 261517	2016-02-22 09:38:28 +00:00
Kevin B. Smith	47e64abe1a	[X86] More test updates to support fixup-byte-word-insts optimization either on or off. Differential Revisions: http://reviews.llvm.org/D17458 llvm-svn: 261505	2016-02-22 01:27:56 +00:00
Simon Pilgrim	e9093adae0	[X86][AVX] Add shuffle masking support for EltsFromConsecutiveLoads Add support for the case where we have a consecutive load (which must include the first + last elements) with a mixture of undef/zero elements. We load the vector and then apply a shuffle to clear the zero'd elements. Differential Revision: http://reviews.llvm.org/D17297 llvm-svn: 261490	2016-02-21 19:15:48 +00:00
Sanjoy Das	aa63dc0e9a	Fix LLVM's handling and detection of skylake and cannonlake CPUs Summary: - Rename `"skylake"` == SkylakeServerProc to `"skylake-avx512"` - Change `"skylake"` to denote SkylakeClientProc - Fix the detection of cpu family 6 and model 94 to be SkylakeClientProc instead of SkylakeServerProc - Remove the `"cnl"` for CannonLake Reviewers: craig.topper, delena Subscribers: zansari, echristo, qcolombet, RKSimon, spatel, DavidKreitzer, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17090 llvm-svn: 261482	2016-02-21 17:12:03 +00:00
Simon Pilgrim	b1cc4d6d69	[InstCombine] Added SSE41 roundss/roundsd demanded vector elements invec tests llvm-svn: 261472	2016-02-21 14:50:27 +00:00
Simon Pilgrim	e2ada8d3a7	[InstCombine] Added XOP frczss/vfrczsd demanded vector elements tests llvm-svn: 261469	2016-02-21 12:45:36 +00:00
Simon Pilgrim	2ed8686a28	[InstCombine] Added SSE41 roundss/roundsd demanded vector elements tests llvm-svn: 261468	2016-02-21 12:40:39 +00:00
Dan Gohman	27a11eefcc	[WebAssembly] Support physical registers in the rewrite-to-discard optimization. llvm-svn: 261465	2016-02-21 03:27:22 +00:00
David Majnemer	a3ea407d48	[X86] Use the correct alignment for COMDAT constant pool entries COFF doesn't have sections with mergeable contents. Instead, each constant pool entry ends up in a COMDAT section. The linker, when choosing between COMDAT sections, doesn't choose the max alignment of the two sections. You just get whatever alignment was on the section. If one constant needed a higher alignment in one object file from another one, then we will get into trouble if the linker chooses the lower alignment one. Instead, lets promote the alignment of the constant pool entry to make sure we don't use an under aligned constant with an instruction which assumed otherwise. This fixes PR26680. llvm-svn: 261462	2016-02-21 01:30:30 +00:00
Simon Pilgrim	471efd244a	[InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element llvm-svn: 261460	2016-02-20 23:17:35 +00:00
Dan Gohman	02c0871abd	[WebAssembly] Handle CopyToReg nodes with flag results in LowerCopyToReg. llvm-svn: 261457	2016-02-20 23:09:44 +00:00
Simon Pilgrim	6a0dca535a	[InstCombine] Added SSE/SSE2 comparison intrinsics demanded vector elements tests llvm-svn: 261454	2016-02-20 22:41:31 +00:00
Derek Schuff	90dbb8cfc3	[WebAssembly] Write stack pointer back to memory when FP is used The stack pointer is bumped when there is a frame pointer or when there are static-size objects, but was only getting written back when there were static-size objects. llvm-svn: 261453	2016-02-20 22:18:47 +00:00
Derek Schuff	dc5f6aa4bb	[WebAssembly] Stackify function prologs and epilogs The instructions are the same, but fewer locals are used. Differential Revision: http://reviews.llvm.org/D17428 llvm-svn: 261452	2016-02-20 21:46:50 +00:00
Simon Pilgrim	d4768fa314	[InstCombine] Added some SSE/SSE2 demanded vector elements tests llvm-svn: 261451	2016-02-20 21:44:48 +00:00
Simon Pilgrim	d765c0b8b9	[X86][AVX] Added test case for PR22359 llvm-svn: 261444	2016-02-20 19:21:20 +00:00
Simon Pilgrim	79a14dd3d1	[X86] Regenerated pr16360.ll llvm-svn: 261440	2016-02-20 17:56:45 +00:00

... 2 3 4 5 6 ...

34956 Commits