llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	d2d50cba2a	[AVX-512] Add PAVGB/PAVGW to load folding tables. llvm-svn: 295035	2017-02-14 06:54:57 +00:00
Mikael Holmen	ece84cd10c	[LSR] Pointers with different address spaces are considered incompatible. Summary: Function isCompatibleIVType is already used as a guard before the call to SE.getMinusSCEV(OperExpr, PrevExpr); in LSRInstance::ChainInstruction. getMinusSCEV requires the expressions to be of the same type, so we now consider two pointers with different address spaces to be incompatible, since it is possible that the pointers in fact have different sizes. Reviewers: qcolombet, eli.friedman Reviewed By: qcolombet Subscribers: nhaehnle, Ka-Ka, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D29885 llvm-svn: 295033	2017-02-14 06:37:42 +00:00
Peter Collingbourne	002c2d5380	ThinLTOBitcodeWriter: Write available_externally copies of VCP eligible functions to merged module. Differential Revision: https://reviews.llvm.org/D29701 llvm-svn: 295021	2017-02-14 03:42:38 +00:00
Philip Reames	b2bca7e309	[LICM] Make store promotion work in the face of unordered atomics Extend our store promotion code to deal with unordered atomic accesses. Ordered atomics continue to be unhandled. Most of the change is straight-forward, the only complicated bit is in the reasoning around mixing of atomic and non-atomic memory access. Rather than trying to reason about the complex semantics in these cases, I simply disallowed promotion when both atomic and non-atomic accesses are present. This is conservatively correct. It seems really tempting to just promote all access to atomics, but the original accesses might have been conditional. Since we can't lower an arbitrary atomic type, it might not be safe to promote all access to atomic. Consider a loop like the following: while(b) { load i128 ... if (can lower i128 atomic) store atomic i128 ... else store i128 } It could be there's no race on the location and thus the code is perfectly well defined even if we can't lower a i128 atomically. It's not clear we need to be this conservative - arguably the program above is brocken since it can't be lowered unless the branch is folded - but I didn't want to have to fix any fallout which might result. Differential Revision: https://reviews.llvm.org/D15592 llvm-svn: 295015	2017-02-14 01:38:31 +00:00
Andrew Kaylor	709f1c2a9b	[X86] Add MXCSR register This adds MXCSR to the set of recognized registers for X86 targets and updates the instructions that read or write it. I do not intend for all of the various floating point instructions that implicitly use the control bits or update the status bits of this register to ever have that usage modeled by default. However, when constrained floating point modes (such as strict FP exception status modeling or dynamic rounding modes) are enabled, implicit use/def information for MXCSR will be added to those instructions. Until those additional updates are made this should cause (almost?) no functional changes. Theoretically, this will prevent instructions like LDMXCSR and STMXCSR from being moved past one another, but that should be prevented anyway and I haven't found a case where it is happening now. Differential Revision: https://reviews.llvm.org/D29903 llvm-svn: 295004	2017-02-13 23:38:52 +00:00
Sanjay Patel	4f74216da0	[FunctionAttrs] try to extend nonnull-ness of arguments from a callsite back to its parent function As discussed here: http://lists.llvm.org/pipermail/llvm-dev/2016-December/108182.html ...we should be able to propagate 'nonnull' info from a callsite back to its parent. The original motivation for this patch is our botched optimization of "dyn_cast" (PR28430), but this won't solve that problem. The transform is currently disabled by default while we wait for clang to work-around potential security problems: http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html Differential Revision: https://reviews.llvm.org/D27855 llvm-svn: 294998	2017-02-13 23:10:51 +00:00
Amaury Sechet	3422b539c8	Revert autogenerated check result for test/CodeGen/X86/atomic-minmax-i6432.ll as they don't regenerate cleanly. llvm-svn: 294996	2017-02-13 23:00:23 +00:00
Tim Northover	48dfa1a6ed	GlobalISel: represent atomic loads & stores via the MachineMemOperand. Also make sure the AArch64 backend doesn't try to convert them into normal loads and stores. llvm-svn: 294993	2017-02-13 22:14:16 +00:00
Tim Northover	b73e309071	MIR: parse & print the atomic parts of a MachineMemOperand. We're going to need them very soon for GlobalISel. llvm-svn: 294992	2017-02-13 22:14:08 +00:00
Arnold Schwaighofer	8f3df731dc	swiftcc: Don't emit tail calls from callers with swifterror parameters Backends don't support this yet. They would have to move to the swifterror register before the tail call to make sure it is live-in to the call. rdar://30495920 llvm-svn: 294982	2017-02-13 19:58:28 +00:00
Peter Collingbourne	2b33f65317	IR: Type ID summary extensions for WPD; thread summary into WPD pass. Make the whole thing testable by adding YAML I/O support for the WPD summary information and adding some negative tests that exercise the YAML support. Differential Revision: https://reviews.llvm.org/D29782 llvm-svn: 294981	2017-02-13 19:26:18 +00:00
Alexey Bataev	7bed48e7a3	[SLP] Test for extractelement cost fix. llvm-svn: 294980	2017-02-13 19:08:19 +00:00
Taewook Oh	06a2128cfa	Make MachineBasicBlock::updateTerminator to update DebugLoc as well Summary: Currently MachineBasicBlock::updateTerminator simply drops DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example: ``` 1 extern int bar(int x); 2 3 int foo(int begin, int end) { 4 int i; 5 int ret = 0; 6 for ( 7 i = begin ; 8 i != end ; 9 i++) 10 { 11 ret += bar(i); 12 } 13 return ret; 14 } ``` Below is a bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3: ``` define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 { entry: %cmp6 = icmp eq i32* %begin, %end, !dbg !9 br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12 for.body.preheader: ; preds = %entry br label %for.body, !dbg !13 for.body: ; preds = %for.body.preheader, %for.body %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ] %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ] %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15 %call = tail call i32 @bar(i32 %0), !dbg !19 %add = add nsw i32 %call, %ret.08, !dbg !20 %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21 %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9 br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22 for.end.loopexit: ; preds = %for.body br label %for.end, !dbg !24 for.end: ; preds = %for.end.loopexit, %entry %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ] ret i32 %ret.0.lcssa, !dbg !24 } ``` where ``` !12 = !DILocation(line: 6, column: 3, scope: !11) ``` . As you can see, the terminator of 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3. Howerver, after the execution of 'MachineBlock::updateTerminator' function, which is triggered by MachineSinking pass, the DebugLoc info is dropped as below (see there's no debug-location for JNE_1): ``` bb.0.entry: successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000) liveins: %rdi, %rsi %6 = COPY %rsi %5 = COPY %rdi %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9 JNE_1 %bb.1.for.body.preheader, implicit %eflags ``` This patch addresses this issue and make newly created branch instructions to keep debug-location info. Reviewers: aprantl, MatzeB, craig.topper, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D29596 llvm-svn: 294976	2017-02-13 18:15:31 +00:00
Matthew Simpson	659f92e2aa	Revert "[LV] Extend trunc optimization to all IVs with constant integer steps" This reverts commit r294967. This patch caused execution time slowdowns in a few LLVM test-suite tests, as reported by the clang-cmake-aarch64-quick bot. I'm reverting to investigate. llvm-svn: 294973	2017-02-13 18:02:35 +00:00
Quentin Colombet	fbae5fcb96	[FastISel] Add a diagnostic to warm on fallback. This is consistent with what we do for GlobalISel. That way, it is easy to see whether or not FastISel is able to fully select a function. At some point we may want to switch that to an optimization remark. llvm-svn: 294970	2017-02-13 17:38:59 +00:00
James Molloy	0ae2202235	[ARM] Fix crash caused by r294945 I'd missed a creator of FCMP nodes - duplicateCmp(). Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite. llvm-svn: 294968	2017-02-13 17:18:00 +00:00
Matthew Simpson	7b7f40297f	[LV] Extend trunc optimization to all IVs with constant integer steps This patch extends the optimization of truncations whose operand is an induction variable with a constant integer step. Previously we were only applying this optimization to the primary induction variable. However, the cost model assumes the optimization is applied to the truncation of all integer induction variables (even regardless of step type). The transformation is now applied to the other induction variables, and I've updated the cost model to ensure it is better in sync with the transformation we actually perform. Differential Revision: https://reviews.llvm.org/D29847 llvm-svn: 294967	2017-02-13 16:48:00 +00:00
Simon Dardis	d9858dfdee	[mips] Fix failing test. llvm-svn: 294966	2017-02-13 16:42:35 +00:00
Simon Dardis	509da1a46d	[mips] divide macro instruction cleanup. Clean up the implementation of divide macro expansion by getting rid of a FIXME regarding magic numbers and branch instructions. Match GAS' behaviour for expansion of ddiv / div in the two and three operand cases. Add the two operand alias for MIPSR6. Finally, optimize macro expansion cases where the divisior is the $zero register. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D29887 llvm-svn: 294960	2017-02-13 16:06:48 +00:00
Davide Italiano	513dfaa0a3	[PM] Hook up the instrumented PGO machinery in the new PM. Differential Revision: https://reviews.llvm.org/D29308 llvm-svn: 294955	2017-02-13 15:26:22 +00:00
Simon Pilgrim	ce2cb2d968	[X86][SSE] Add v4f32 and v2f64 extract to store tests llvm-svn: 294952	2017-02-13 14:20:13 +00:00
Sanne Wouda	490d4a6da6	[CodeGen] fix alignment of JUMPTABLE_INSTS on v8M.base Summary: The attached test case fails with "fatal error: error in backend: misaligned pc-relative fixup value" as the jump table is misaligned. The EmitAlignment existed already for ARM and Thumb-1 code, but was missing for Thumb-2. The test checks that the fatal error disappears when generating an obj file, as well as checking the align directive is there when producing an asm file. Reviewers: rengolin, grosbach, t.p.northover, jmolloy, SjoerdMeijer, samparker Reviewed By: samparker Subscribers: samparker, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D29650 llvm-svn: 294950	2017-02-13 14:07:45 +00:00
James Molloy	92497542e7	[Thumb-1] TBB generation: spot redefinitions of index register We match a sequence of 3-4 instructions into a tTBB pseudo. One of our checks is that a particular register in that sequence is killed (so it can be clobbered by the pseudo). We weren't noticing if an errant MOV or other instruction had infiltrated the sequence we were walking. If it had, and it defined the register we've already identified as killed, it makes it live across the tBR_JT and thus unclobberable. Notice this case and bail out. llvm-svn: 294949	2017-02-13 14:07:39 +00:00
Simon Pilgrim	0de807f878	[X86][SSE] Add more thorough extract to store tests Added v4i32 and v2i64 tests and test on i686 as well as x86_64. llvm-svn: 294946	2017-02-13 13:40:12 +00:00
James Molloy	d508789668	[ARM] Use VCMP, not VCMPE, for floating point equality comparisons When generating a floating point comparison we currently unconditionally generate VCMPE. This has the sideeffect of setting the cumulative Invalid bit in FPSCR if any of the operands are QNaN. It is expected that use of a relational predicate on a QNaN value should raise Invalid. Quoting from the C standard: The relational and equality operators support the usual mathematical relationships between numeric values. For any ordered pair of numeric values exactly one of relationships the less, greater, equal and is true. Relational operators may raise the floating-point exception when argument values are NaNs. The standard doesn't explicitly state the expectation for equality operators, but the implication and obvious expectation is that equality operators should not raise Invalid on a QNaN input, as those predicates are wholly defined on unordered inputs (to return not equal). Therefore, add a new operand to ARMISD::FPCMP and FPCMPZ indicating if QNaN should raise Invalid, and pipe that through to TableGen. llvm-svn: 294945	2017-02-13 12:32:47 +00:00
Pierre Gousseau	796e0d6df1	[X86] Improve readability of test/CodeGen/X86/lzcnt-zext-cmp.ll by adding a common check prefix ALL. NFC. llvm-svn: 294938	2017-02-13 09:57:17 +00:00
Alexey Bataev	e8b1536e21	[SLP] Fix for PR31690: Allow using of extra values in horizontal reductions. Currently, LLVM supports vectorization of horizontal reduction instructions with initial value set to 0. Patch supports vectorization of reduction with non-zero initial values. Also, it supports a vectorization of instructions with some extra arguments, like: ``` float f(float x[], int a, int b) { float p = a % b; p += x[0] + 3; for (int i = 1; i < 32; i++) p += x[i]; return p; } ``` Patch allows vectorization of this kind of horizontal reductions. Differential Revision: https://reviews.llvm.org/D29727 llvm-svn: 294934	2017-02-13 08:01:26 +00:00
Craig Topper	3668bde371	[DAGCombiner] Teach DAG combine that inserting an extract_subvector result into the same location of a an undef vector can just use the original input to the extract. llvm-svn: 294932	2017-02-13 04:53:33 +00:00
Craig Topper	680c73e7ab	[X86] Genericize the handling of INSERT_SUBVECTOR from an EXTRACT_SUBVECTOR to support 512-bit vectors with 128-bit or 256-bit subvectors. We now detect that both the extract and insert indices are non-zero and convert to a shuffle. This will be lowered as a blend for 256-bit vectors or as a vshuf operations for 512-bit vectors. llvm-svn: 294931	2017-02-13 04:53:29 +00:00
Craig Topper	aa46204ed9	[DAGCombiner] Remove the half vector width check for the combine of EXTRACT_SUBVECTOR from an INSERT_SUBVECTOR. This gives more parallelism opportunities for AVX-512 when dealing with 128-bit extracts from 512-bit vectors. llvm-svn: 294930	2017-02-12 23:49:49 +00:00
Daniel Berlin	1bcd504a88	NewGVN: Update a number of xfailed tests to either be correct or note why they fail. llvm-svn: 294928	2017-02-12 23:28:06 +00:00
Daniel Berlin	2ef385d019	NewGVN: We really pass TBAA if we enable DCE and fix the test. Note that GVN eliminates no-use readonly/readnone calls, even if they are not marked nounwind. NewGVN only eliminates them if they are marked nounwind, and thus, trivially dead. llvm-svn: 294927	2017-02-12 23:24:47 +00:00
Sanjay Patel	0557a44287	[TargetLowering] fix SETCC SETLT folding with FP types The bug was introduced with: https://reviews.llvm.org/rL294863 ...and manifests as a selection failure in x86, but that's actually another bug. This fix prevents wrong codegen with -0.0, but in the more common case when we have NSZ and NNAN (-ffast-math), we should still be able to fold this setcc/compare. llvm-svn: 294924	2017-02-12 23:07:52 +00:00
Daniel Berlin	86eab15f2b	NewGVN: Apply the fast math flags fix in r267113 to NewGVN as well. llvm-svn: 294922	2017-02-12 22:25:20 +00:00
Daniel Berlin	dbe8264c93	PredicateInfo: Handle critical edges Summary: This adds support for placing predicateinfo such that it affects critical edges. This fixes the issues mentioned by Nuno on the mailing list. Depends on D29519 Reviewers: davide, nlopes Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29606 llvm-svn: 294921	2017-02-12 22:12:20 +00:00
Craig Topper	6eca3170a8	[AVX-512] Add VPEXTRD/Q to load folding tables. llvm-svn: 294905	2017-02-12 18:47:37 +00:00
Sanjay Patel	45b7e69fef	[InstCombine] fold icmp sgt/slt (add nsw X, C2), C --> icmp sgt/slt X, (C - C2) I found one special case of this transform for 'slt 0', so I removed that and added the general transform. Alive code to check correctness: Name: slt_no_overflow Pre: WillNotOverflowSignedSub(C1, C2) %a = add nsw i8 %x, C2 %b = icmp slt %a, C1 => %b = icmp slt %x, C1 - C2 Name: sgt_no_overflow Pre: WillNotOverflowSignedSub(C1, C2) %a = add nsw i8 %x, C2 %b = icmp sgt %a, C1 => %b = icmp sgt %x, C1 - C2 http://rise4fun.com/Alive/MH Differential Revision: https://reviews.llvm.org/D29774 llvm-svn: 294898	2017-02-12 16:40:30 +00:00
Sanjay Patel	97e4b98749	[ValueTracking] use nonnull argument attribute to eliminate null checks Enhancing value tracking's analysis of null-ness was suggested in D27855, so here's a first attempt at that. This is part of solving: https://llvm.org/bugs/show_bug.cgi?id=28430 Differential Revision: https://reviews.llvm.org/D28204 llvm-svn: 294897	2017-02-12 15:35:34 +00:00
Simon Pilgrim	4cd841757a	[X86][AVX2] Add support for combining target shuffles to VPMOVZX Initial 256-bit vector support - 512-bit support requires extra checks for AVX512BW support (PMOVZXBW) that will be handled in a future patch. llvm-svn: 294896	2017-02-12 14:31:23 +00:00
Dorit Nuzman	eac89d736c	[LV/LoopAccess] Check statically if an unknown dependence distance can be proven larger than the loop-count This fixes PR31098: Try to resolve statically data-dependences whose compile-time-unknown distance can be proven larger than the loop-count, instead of resorting to runtime dependence checking (which are not always possible). For vectorization it is sufficient to prove that the dependence distance is >= VF; But in some cases we can prune unknown dependence distances early, and even before selecting the VF, and without a runtime test, by comparing the distance against the loop iteration count. Since the vectorized code will be executed only if LoopCount >= VF, proving distance >= LoopCount also guarantees that distance >= VF. This check is also equivalent to the Strong SIV Test. Reviewers: mkuper, anemet, sanjoy Differential Revision: https://reviews.llvm.org/D28044 llvm-svn: 294892	2017-02-12 09:32:53 +00:00
Davide Italiano	77d42eac64	[LTO] Remove useless redirection from test. NFCI. llvm-svn: 294889	2017-02-12 05:43:25 +00:00
Chandler Carruth	719ffe1a66	[PM] Add devirtualization-based iteration utility into the new PM's default pipeline. A clang with this patch built with ASan and asserts can build all of the test-suite as well, so it seems to not uncover any latent problems. Differential Revision: https://reviews.llvm.org/D29853 llvm-svn: 294888	2017-02-12 05:38:04 +00:00
Chandler Carruth	e87fc8cb71	[PM] Enable GlobalsAA in the new PM's pipeline by default. All the invalidation issues and bugs in this seem to be fixed, it has survived a full build of the test suite plus SPEC with asserts and ASan enabled on the Clang binary used. Differential Revision: https://reviews.llvm.org/D29815 llvm-svn: 294887	2017-02-12 05:34:04 +00:00
Davide Italiano	6cb6f997d8	[lib/LTO] Add support for hotness optremarks in the new API. llvm-svn: 294885	2017-02-12 05:05:35 +00:00
Davide Italiano	1e30b3d7be	[LTO] Simplify this test quite a bit, @func2 is unused/unneeded. llvm-svn: 294884	2017-02-12 03:47:54 +00:00
Davide Italiano	ebd471974a	[lib/LTO] Initial support for optimization remarks in the new API. llvm-svn: 294882	2017-02-12 03:31:30 +00:00
Craig Topper	04840ab752	[X86] Update test case I missed in r294876. llvm-svn: 294878	2017-02-11 23:23:11 +00:00
Craig Topper	1c37e991e6	[X86] Move code for using blendi for insert_subvector out to an isel pattern. This gives the DAG combiner more opportunity to optimize without needing to dig through the blend. llvm-svn: 294876	2017-02-11 22:57:12 +00:00
Simon Pilgrim	755d9127f5	[X86][SSE] Use VSEXT/VZEXT constant folding for SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG Preparatory step for PR31712 llvm-svn: 294874	2017-02-11 22:47:06 +00:00
Simon Pilgrim	437d64c49e	[X86][SSE] Improve VSEXT/VZEXT constant folding. Generalize VSEXT/VZEXT constant folding to work with any target constant bits source not just BUILD_VECTOR . llvm-svn: 294873	2017-02-11 21:55:24 +00:00
Amaury Sechet	cafc256fd4	Fix atomic-minmax-i6432.ll . llvm-svn: 294867	2017-02-11 19:34:11 +00:00
Amaury Sechet	42fb927438	Regen expected tests result. NFC llvm-svn: 294866	2017-02-11 19:27:15 +00:00
Sanjay Patel	63499b61c9	[TargetLowering] check for sign-bit comparisons in SimplifyDemandedBits I don't know if anything other than x86 vectors is affected by this change, but this may allow us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises from the fact that blendv* instructions only use the sign-bit when deciding which vector element to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already exists in x86 lowering; this demanded bits change just enables the transform to fire more often. The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, but I've explained the likely steps in this series here: https://llvm.org/bugs/show_bug.cgi?id=11210 Differential Revision: https://reviews.llvm.org/D29687 llvm-svn: 294863	2017-02-11 18:01:55 +00:00
Amaury Sechet	9df26d330f	Fix typo in test filename. NFC llvm-svn: 294860	2017-02-11 17:48:49 +00:00
Craig Topper	255343483d	[AVX-512] Add VPMINS/MINU/MAXS/MAXU instructions to load folding tables. llvm-svn: 294858	2017-02-11 17:35:28 +00:00
Simon Pilgrim	86a95c1ff7	[X86][3DNow!] Add tests to ensure PFMAX/PFMIN are not commuted. llvm-svn: 294848	2017-02-11 14:01:37 +00:00
Simon Pilgrim	6411a0ebed	[X86][3DNow!] Enable PFSUB<->PFSUBR commutation llvm-svn: 294847	2017-02-11 13:51:14 +00:00
Simon Pilgrim	4ead1d4aa9	[X86][3DNow!] Enable commutation for PFADD/PFMUL/PFCMPEQ/PAVGUSB/PMULHRW All commutations confirmed to give identical results - note PFMAX/PFMIN do not PFSUB<->PFSUBR should be commutable as well llvm-svn: 294846	2017-02-11 13:32:55 +00:00
Simon Pilgrim	6b4a5134af	[X86][3DNow!] Add tests showing missed commutation opportunities. llvm-svn: 294845	2017-02-11 13:00:32 +00:00
Daniel Berlin	b79f53669a	NewGVN: Clean up how we handle the INITIAL class so that everything in it is dead or unreachable, as it should be. This also makes the leader of INITIAL undef, enabling us to handle irreducibility properly. Summary: This lets us verify, more than we do now, that we didn't screw up value numbering. Reviewers: davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D29842 llvm-svn: 294844	2017-02-11 12:48:50 +00:00
Simon Pilgrim	8158816efe	[X86][XOP] Regenerate XOP commutation tests. Added 32-bit tests as well. llvm-svn: 294841	2017-02-11 12:30:59 +00:00
Simon Pilgrim	008ba63e04	[X86][SSE] Regenerate float comparison commutation tests. llvm-svn: 294840	2017-02-11 12:29:56 +00:00
Simon Pilgrim	0d8632f089	[X86] Regenerate CLMUL commutation tests. llvm-svn: 294839	2017-02-11 12:23:22 +00:00
Craig Topper	1f6153bab4	[AVX-512] Add VPINSRB/W/D/Q instructions to load folding tables. llvm-svn: 294830	2017-02-11 07:01:40 +00:00
Craig Topper	3afa777f10	[AVX-512] Add VPSADBW instructions to load folding tables. llvm-svn: 294827	2017-02-11 06:24:03 +00:00
Evgeny Stupachenko	5f3d9b6c09	The patch fixes r294821 Summary: Update register match for windows testing From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294825	2017-02-11 05:39:00 +00:00
Craig Topper	464b8cb244	[X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available. Seems the execution dependency pass likes to use FP instructions when most of the consuming code is integer if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available. This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp. Overall I think this produces better results in the modified test cases. llvm-svn: 294824	2017-02-11 05:32:57 +00:00
Evgeny Stupachenko	fe6f548d2d	Fix PR23384 (under "-lsr-insns-cost" option) Summary: The patch adds instructions number generated by a solution to LSR cost under "-lsr-insns-cost" option. Reviewers: qcolombet, hfinkel Differential Revision: http://reviews.llvm.org/D28307 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 294821	2017-02-11 02:57:43 +00:00
Ahmed Bougacha	8425f453ef	[ARM] Make f16 interleaved accesses expensive. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Teach the cost model to consider f16 interleaved operations as expensive. Otherwise, we are all but guaranteed to end up with a large block of scalarized vector code. llvm-svn: 294819	2017-02-11 01:53:04 +00:00
Ahmed Bougacha	fc979dc9dd	[ARM] Don't lower f16 interleaved accesses. There are no vldN/vstN f16 variants, even with +fullfp16. We could use the i16 variants, but, in practice, even with +fullfp16, the f16 sequence leading to the i16 shuffle usually gets scalarized. We'd need to improve our support for f16 codegen before getting there. Reject f16 interleaved accesses. If we try to emit the f16 intrinsics, we'll just end up with a selection failure. llvm-svn: 294818	2017-02-11 01:53:00 +00:00
Ahmed Bougacha	f37fb89edc	[ARM] Unique some redundant CHECK lines. NFC. llvm-svn: 294817	2017-02-11 01:52:57 +00:00
Wei Mi	8f20e63a20	[LSR] Recommit: Allow formula containing Reg for SCEVAddRecExpr related with outerloop. The recommit includes some changes of testcases. No functional change to the patch. In RateRegister of existing LSR, if a formula contains a Reg which is a SCEVAddRecExpr, and this SCEVAddRecExpr's loop is an outerloop, the formula will be marked as Loser and dropped. Suppose we have an IR that %for.body is outerloop and %for.body2 is innerloop. LSR only handle inner loop now so only %for.body2 will be handled. Using the logic above, formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1reg({0,+,1}<%for.body2>) will be dropped no matter what because reg({1,+, %size}<%for.body>) is a SCEVAddRecExpr type reg related with outerloop. Only formula like reg(%array) + 1reg({{1,+, %size}<%for.body>,+,1}<nuw><nsw><%for.body2>) will be kept because the SCEVAddRecExpr related with outerloop is folded into the initial value of the SCEVAddRecExpr related with current loop. But in some cases, we do need to share the basic induction variable reg{0 ,+, 1}<%for.body2> among LSR Uses to reduce the final total number of induction variables used by LSR, so we don't want to drop the formula like reg(%array) + reg({1,+, %size}<%for.body>) + 1*reg({0,+,1}<%for.body2>) unconditionally. From the existing comment, it tries to avoid considering multiple level loops at the same time. However, existing LSR only handles innermost loop, so for any SCEVAddRecExpr with a loop other than current loop, it is an invariant and will be simple to handle, and the formula doesn't have to be dropped. Differential Revision: https://reviews.llvm.org/D26429 llvm-svn: 294814	2017-02-11 00:50:23 +00:00
Krzysztof Parzyszek	f9015e62fd	[Hexagon] Introduce Hexagon V62 llvm-svn: 294805	2017-02-10 23:46:45 +00:00
Davide Italiano	95a8707de8	[tests] Be explicit about the files we want to remove. Hopefully Windows will stop whining after this change. llvm-svn: 294801	2017-02-10 22:55:37 +00:00
Peter Collingbourne	be9ffaacfa	IR: Function summary extensions for whole-program devirtualization pass. The summary information includes all uses of llvm.type.test and llvm.type.checked.load intrinsics that can be used to devirtualize calls, including any constant arguments for virtual constant propagation. Differential Revision: https://reviews.llvm.org/D29734 llvm-svn: 294795	2017-02-10 22:29:38 +00:00
Davide Italiano	62092aeb42	[LTO] Make these tests robust across multiple iterations. Same as r294784, but for regular LTO. llvm-svn: 294789	2017-02-10 22:11:06 +00:00
Nico Weber	ee0b0ec935	Revert r294532, it caused PR31935 llvm-svn: 294787	2017-02-10 21:57:30 +00:00
Yaxun Liu	ba01ed00fe	Fix invalid addrspacecast due to combining alloca with global var For function-scope variables with large initialisation list, FE usually generates a global variable to hold the initializer, then generates memcpy intrinsic to initialize the alloca. InstCombiner::visitAllocaInst identifies such allocas which are accessed only by reading and replaces them with the global variable. This is done by casting the global variable to the type of the alloca and replacing all references. However, when the global variable is in a different address space which is disjoint with addr space 0 (e.g. for IR generated from OpenCL, global variable cannot be in private addr space i.e. addr space 0), casting the global variable to addr space 0 results in invalid IR for certain targets (e.g. amdgpu). To fix this issue, when the global variable is not in addr space 0, instead of casting it to addr space 0, this patch chases down the uses of alloca until reaching the load instructions, then replaces load from alloca with load from the global variable. If during the chasing bitcast and GEP are encountered, new bitcast and GEP based on the global variable are generated and used in the load instructions. Differential Revision: https://reviews.llvm.org/D27283 llvm-svn: 294786	2017-02-10 21:46:07 +00:00
Davide Italiano	d6979b8c38	[ThinLTO] Make this test more robust across multiple runs. The yaml emitter files are left around otherwise. llvm-svn: 294784	2017-02-10 21:35:31 +00:00
Dehao Chen	fb02f7140a	Encode duplication factor from loop vectorization and loop unrolling to discriminator. Summary: This patch starts the implementation as discuss in the following RFC: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html When optimization duplicates code that will scale down the execution count of a basic block, we will record the duplication factor as part of discriminator so that the offline process tool can find the duplication factor and collect the accurate execution frequency of the corresponding source code. Two important optimization that fall into this category is loop vectorization and loop unroll. This patch records the duplication factor for these 2 optimizations. The recording will be guarded by a flag encode-duplication-in-discriminators, which is off by default. Reviewers: probinson, aprantl, davidxl, hfinkel, echristo Reviewed By: hfinkel Subscribers: mehdi_amini, anemet, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26420 llvm-svn: 294782	2017-02-10 21:09:07 +00:00
Ahmed Bougacha	2e275e272f	[X86] Bitcast subvector before broadcasting it. Since r274013, we've been looking through bitcasts on broadcast inputs. In the scalar-folding case (from a load, build_vector, or sc2vec), the input type didn't matter, as we'd simply bitcast the resulting scalar back. However, when broadcasting a 128-bit-lane-aligned element, we create an EXTRACT_SUBVECTOR. Use proper types, by creating an extract_subvector of the original input type. llvm-svn: 294774	2017-02-10 19:51:47 +00:00
Kevin Enderby	dc412ccc41	Yet another fix llvm-objdump so it picks a good CPU based for Mach-O files, in this case for CPU_SUBTYPE_ARM64_ALL. For this cpusubtype it should default to a cyclone CPU to give proper disassembly without a -mcpu= flag. rdar://27767188 llvm-svn: 294771	2017-02-10 19:27:10 +00:00
Tim Northover	0e01170c79	GlobalISel: drop lifetime intrinsics during translation. We don't use them yet and they just cause problems. llvm-svn: 294770	2017-02-10 19:10:38 +00:00
Simon Pilgrim	39f8da3823	[X86][AVX512] Add vector rotate tests for AVX512 targets AVX512 does have vector rotate instructions, but we don't lower to them yet llvm-svn: 294766	2017-02-10 18:06:11 +00:00
Amaury Sechet	280ad2cebb	Autogenerate results for test/CodeGen/X86/peep-test-4.ll . NFC llvm-svn: 294765	2017-02-10 17:57:48 +00:00
Amaury Sechet	f6308cfe87	Autogenerate results for test/CodeGen/X86/pr14314.ll . NFC llvm-svn: 294764	2017-02-10 17:57:46 +00:00
John Brawn	e60f4e4b8d	[ARM] Fix incorrect mask bits in MSR encoding for write_register intrinsic In the encoding of system registers in the M-class MSR instruction the mask bits should be 2 for registers that don't take a _<bits> qualifier (the instruction is unpredictable otherwise), and should also be 2 if the register takes a _<bits> qualifier but it's not present as no _<bits> is an alias for _nzcvq. Differential Revision: https://reviews.llvm.org/D29828 llvm-svn: 294762	2017-02-10 17:41:08 +00:00
Amaury Sechet	c8587e4257	Use autogenerate check in CodeGen/X86/pr16031.ll . NFC llvm-svn: 294761	2017-02-10 17:26:21 +00:00
Amaury Sechet	3b87944433	Check full codegen in CodeGen/X86/i256-add.ll NFC llvm-svn: 294756	2017-02-10 16:34:17 +00:00
Matthew Simpson	df124a7569	[LV] Remove type restriction for vector phi creation We previously only created a vector phi node for an induction variable if its type matched the type of the canonical induction variable. Differential Revision: https://reviews.llvm.org/D29776 llvm-svn: 294755	2017-02-10 16:15:26 +00:00
Krzysztof Parzyszek	a72fad980c	[Hexagon] Replace instruction definitions with auto-generated ones llvm-svn: 294753	2017-02-10 15:33:13 +00:00
Rafael Espindola	be99157127	Move some error handling down to MCStreamer. This makes sure we get the same redefinition rules regardless of who is printing (asm parser, codegen) and to what (asm, obj). This fixes an unintentional regression in r293936. llvm-svn: 294752	2017-02-10 15:13:12 +00:00
Simon Pilgrim	a3362a1c9e	[X86][SSE] Added chained FDIV test cases for D26855 Tests to demonstrate throughput-latency decision between div and rcp on faster hardware such as Haswell llvm-svn: 294750	2017-02-10 14:56:12 +00:00
Simon Pilgrim	bfb1747806	[DAGCombine] Allow vector constant folding of any value type before type legalization The patch comes in 2 parts: 1 - it makes use of the SelectionDAG::NewNodesMustHaveLegalTypes flag to tell when it can safely constant fold illegal types. 2 - it correctly resets SelectionDAG::NewNodesMustHaveLegalTypes at the start of each call to SelectionDAGISel::CodeGenAndEmitDAG so all the pre-legalization stages can make use of it - not just the first basic block that gets handled. Fix for PR30760 Differential Revision: https://reviews.llvm.org/D29568 llvm-svn: 294749	2017-02-10 14:37:25 +00:00
Simon Pilgrim	c371159aac	[X86][SSE] Add support for extracting target constants from BUILD_VECTOR In some cases we call getTargetConstantBitsFromNode for nodes that haven't been lowered from BUILD_VECTOR yet Note: We're getting very close to being able to move most of the constant extraction code from getTargetShuffleMaskIndices into getTargetConstantBitsFromNode llvm-svn: 294746	2017-02-10 14:04:11 +00:00
Chandler Carruth	7bc6028d7d	[PM] Relax the patterns used in the new test I added because some compilers don't print the typedef name. llvm-svn: 294729	2017-02-10 08:48:50 +00:00
Chandler Carruth	f425292721	[PM] Fix a bug in the new loop PM when handling functions with no loops. Without any loops, we don't even bother to build the standard analyses used by loop passes. Without these, we can't run loop analyses or invalidate them properly. Unfortunately, we did these things in the wrong order which would allow a loop analysis manager's proxy to be built but then not have the standard analyses built. When we went to do the invalidation in the proxy thing would fall apart. In the test case provided, it would actually crash. The fix is to carefully check for loops first, and to in fact build the standard analyses before building the proxy. This allows it to correctly trigger invalidation for those standard analyses. An alternative might seem to be to look at whether there are any loops when doing invalidation, but this doesn't work when during the loop pipeline run we delete the last loop. I've even included that as a test case. It is both simpler and more robust to defer building the proxy until there are definitely the standard set of analyses and indeed loops. This bug was uncovered by enabling GlobalsAA in the pipeline. llvm-svn: 294728	2017-02-10 08:26:58 +00:00
Igor Breger	b4442f34cd	[X86][GlobalISel] Add general-purpose Register Bank Summary: [X86][GlobalISel] Add general-purpose Register Bank. Add trivial handling of G_ADD legalization . Add Regestry Bank selection for COPY and G_ADD instructions Reviewers: rovka, zvi, ab, t.p.northover, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, mgorny, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29771 llvm-svn: 294723	2017-02-10 07:05:56 +00:00
Philip Reames	578dafbd8b	[LoopUnswitch] Remove BFI usage (dead code) Chandler mentioned at the last social that the need for BFI in the new pass manager was causing a slight hiccup for this pass. Given this code has been checked in, but off for over a year, it makes sense to just remove it for now. Note that there's nothing wrong with the general idea - it's actually a quite good one - and once we have the infrastructure in place to implement this without the full recompuation on every loop, we absolutely should. llvm-svn: 294715	2017-02-10 06:12:06 +00:00
Eric Christopher	0824096cc0	Temporarily revert "For X86-64 linux and PPC64 linux align int128 to 16 bytes." until we can get better TargetMachine::isCompatibleDataLayout to compare - otherwise we can't code generate existing bitcode without a string equality data layout. This reverts commit r294702. llvm-svn: 294709	2017-02-10 04:35:32 +00:00
Eric Christopher	42b9248803	For X86-64 linux and PPC64 linux align int128 to 16 bytes. For other platforms we should find out what they need and likely make the same change, however, a smaller additional change is easier for platforms we know have it specified in the ABI. As part of this rewrite some of the handling in the backends for data layout and update a bunch of testcases. Based on a patch by Simonas Kazlauskas! llvm-svn: 294702	2017-02-10 03:32:21 +00:00
Wei Ding	205bfdb3e9	AMDGPU : Add trap handler support. Differential Revision: http://reviews.llvm.org/D26010 llvm-svn: 294692	2017-02-10 02:15:29 +00:00
Stanislav Mekhanoshin	6dec24316b	[AMDGPU] Override PSet for M0 This change returns empty PSet list for M0 register. Otherwise its PSet as defined by tablegen is SReg_32. This results in incorrect register pressure calculation every time an instruction uses M0. Such uses count as SReg_32 PSet and inadequately increase pressure on SGPRs. Differential Revision: https://reviews.llvm.org/D29798 llvm-svn: 294691	2017-02-10 02:07:58 +00:00
David L. Jones	e072cf51da	Update test/CodeGen/X86/sse-align-10.ll to use FileCheck instead of grep Patch by Jorge Gorbe (lethalantidote). Differential Revision: https://reviews.llvm.org/D29797 llvm-svn: 294686	2017-02-10 01:35:31 +00:00
Michael J. Spencer	788b10ecbc	[LoadCombine] Change test to not use instcombine. llvm-svn: 294682	2017-02-10 00:44:08 +00:00
Chandler Carruth	0ede22e1c0	[PM] Add Argument Promotion to the pass pipeline. This needs explicit requires of the optimization remark emission before loop pass pipelines containing LICM as we no longer get it from the inliner -- Argument Promotion may invalidate it. Technically the inliner could also have broken this, but it never came up in testing. Differential Revision: https://reviews.llvm.org/D29595 llvm-svn: 294670	2017-02-09 23:54:57 +00:00
Davide Italiano	fc0d442cf1	[NewGVN] Fix test so that it doesn't rely on InstCombine anymore. llvm-svn: 294668	2017-02-09 23:48:10 +00:00
Chandler Carruth	addcda483e	[PM] Port ArgumentPromotion to the new pass manager. Now that the call graph supports efficient replacement of a function and spurious reference edges, we can port ArgumentPromotion to the new pass manager very easily. The old PM-specific bits are sunk into callbacks that the new PM simply doesn't use. Unlike the old PM, the new PM simply does argument promotion and afterward does the update to LCG reflecting the promoted function. Differential Revision: https://reviews.llvm.org/D29580 llvm-svn: 294667	2017-02-09 23:46:27 +00:00
Peter Collingbourne	17febdbb25	WholeProgramDevirt: Check that VCP candidate functions are defined before evaluating them. This was crashing before. llvm-svn: 294666	2017-02-09 23:46:26 +00:00
George Burgess IV	ccf11c2f9f	[ARM] Add support for armv7ve triple in llvm (PR31358). Gcc supports target armv7ve which is armv7-a with virtualization extensions. This change adds support for this in llvm for gcc compatibility. Also remove redundant FeatureHWDiv, FeatureHWDivARM for a few models as this is specified automatically by FeatureVirtualization. Patch by Manoj Gupta. Differential Revision: https://reviews.llvm.org/D29472 llvm-svn: 294661	2017-02-09 23:29:14 +00:00
Sanjay Patel	f38bab73aa	[InstCombine] allow (X * C2) << C1 --> X * (C2 << C1) for vectors This fold already existed for vectors but only when 'C1' was a splat constant (but 'C2' could be any constant). There were no tests for any vector constants, so I'm adding a test that shows non-splat constants for both operands. llvm-svn: 294650	2017-02-09 23:13:04 +00:00
Peter Collingbourne	ef089bdb4b	X86: Introduce relocImm-based patterns for cmp. Differential Revision: https://reviews.llvm.org/D28690 llvm-svn: 294636	2017-02-09 22:02:28 +00:00
Matt Arsenault	0699ef39ce	AMDGPU: Add pass to expand memcpy/memmove/memset llvm-svn: 294635	2017-02-09 22:00:42 +00:00
Peter Collingbourne	d7dd65ad7c	X86: Teach X86InstrInfo::analyzeCompare to recognize compares of symbols. This requires that we communicate to X86InstrInfo::optimizeCompareInstr that the second operand is neither a register nor an immediate. The way we do that is by setting CmpMask to zero. Note that there were already instructions where the second operand was not a register nor an immediate, namely X86::SUB*rm, so also set CmpMask to zero for those instructions. This seems like a latent bug, but I was unable to trigger it. Differential Revision: https://reviews.llvm.org/D28621 llvm-svn: 294634	2017-02-09 21:58:24 +00:00
Michael J. Spencer	714d9d22ad	[LoadCombine] Fix combining of loads which span an aliasing store. Fixes PR31517 Differential Revision: https://reviews.llvm.org/D28922 llvm-svn: 294632	2017-02-09 21:46:49 +00:00
Sanjay Patel	ae3b43e488	[InstCombine] use m_APInt to allow demanded bits analysis on splat constants llvm-svn: 294628	2017-02-09 21:43:06 +00:00
Konstantin Zhuravlyov	fd87137710	[AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of using switch statement Differential Revision: https://reviews.llvm.org/D29741 llvm-svn: 294627	2017-02-09 21:33:23 +00:00
Sanjay Patel	5bcb2d97f0	[InstCombine] add test for demanded bits with splat vector constants; NFC llvm-svn: 294625	2017-02-09 21:33:19 +00:00
Saleem Abdulrasool	864bd176a6	test: adjust the test for the BSD format The padding for ld64 changes the header to include the padding. Adjust the test to account for this. llvm-svn: 294619	2017-02-09 20:06:30 +00:00
Frederic Riss	1488766bdf	[dsymutil] Fix handling of empty CUs in LTO links. r288399 introduced the DIEUnit class, and in the process broke the corner case where dsymutil generates an empty CU during an LTO link. This restores the logic and adds a test for the corner case. llvm-svn: 294618	2017-02-09 19:41:55 +00:00
Sanjoy Das	74bda4d591	[JumpThreading] Thread through guards Summary: This patch allows JumpThreading also thread through guards. Virtually, guard(cond) is equivalent to the following construction: if (cond) { do something } else {deoptimize} Yet it is not explicitly converted into IFs before lowering. This patch enables early threading through guards in simple cases. Currently it covers the following situation: if (cond1) { // code A } else { // code B } // code C guard(cond2) // code D If there is implication cond1 => cond2 or !cond1 => cond2, we can transform this construction into the following: if (cond1) { // code A // code C } else { // code B // code C guard(cond2) } // code D Thus, removing the guard from one of execution branches. Patch by Max Kazantsev! Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29620 llvm-svn: 294617	2017-02-09 19:40:22 +00:00
Saleem Abdulrasool	111cd669e9	Object: pad out BSD archive members to 8-bytes ld64 requires its archive members to be 8-byte aligned for 64-bit content and 4-byte aligned for 32-bit content. Opt for the larger alignment requirement. This ensures that ld64 can consume archives generated by llvm-ar. Thanks to Kevin Enderby for the hint about the ld64/cctools behaviours! Resolves PR28361! llvm-svn: 294615	2017-02-09 19:29:35 +00:00
Geoff Berry	7e320c2485	[SelectionDAG] Fix bugs in inverted condition splitting code. Summary: Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by Mikael Holmen. Handle non-canonicalized xor not operation correctly (was assuming operand 0 was always the non-constant operand) and check that the negated condition is also in the same block as the original and/or instruction (as is done for and/or operands already) before proceeding with optimization. Reviewers: bogner, MatzeB, qcolombet Subscribers: mcrosier, uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D29680 llvm-svn: 294605	2017-02-09 18:28:17 +00:00
Sanjay Patel	b36e1f0223	[InstCombine] add tests for icmp with add nsw; NFC llvm-svn: 294601	2017-02-09 18:12:39 +00:00
Kevin Enderby	5879a48c17	Tweak the implementation of llvm-objdump’s -objc-meta-data option so that it works when the ObjC metadata sections end up in the __DATA_CONST or __DATA_DIRTY segments. rdar://26315238 llvm-svn: 294599	2017-02-09 17:56:26 +00:00
Simon Pilgrim	b25f60210f	[X86][BMI2] Regenerate mulx tests llvm-svn: 294598	2017-02-09 17:54:51 +00:00
David Bozier	93e773e9be	Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function" this reverts revision r294590 as it broke some buildbots. llvm-svn: 294593	2017-02-09 15:40:14 +00:00
Artur Pilipenko	0e4583b56c	Add DAGCombiner load combine tests for partially available values If some of the trailing or leading bytes of a load combine pattern are zeroes we can combine the pattern to a load + zext and shift. Currently we don't support it, so the tests check the current codegen without load combine. This change will make the patch to support this kind of combine a bit more clear. llvm-svn: 294591	2017-02-09 15:13:40 +00:00
David Bozier	6a44b7c2eb	[Stack Protection] Add diagnostic information for why stack protection was applied to a function Stack Smash Protection is not completely free, so in hot code, the overhead it causes can cause performance issues. By adding diagnostic information for which function have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function. This change adds an SSP-specific DiagnosticInfo class and uses of it to the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled. Patch by: James Henderson Differential Revision: https://reviews.llvm.org/D29023 llvm-svn: 294590	2017-02-09 15:08:40 +00:00
Rafael Espindola	dc1c3011fd	Make it possible to set SHF_LINK_ORDER explicitly. This will make it possible to add support for gcing user metadata (asan for example). llvm-svn: 294589	2017-02-09 14:59:20 +00:00
Pierre Gousseau	6953b32475	[X86][btver2] PR31902: Fix a crash in combineOrCmpEqZeroToCtlzSrl under fast math. In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" by "isNullConstant" to account for floating point constants. Differential Revision: https://reviews.llvm.org/D29756 llvm-svn: 294588	2017-02-09 14:43:58 +00:00
Simon Pilgrim	05ac1f70be	[X86][SSE] Added extra FMA/NO-FMA reciprocal test cases for D26855 Test for expected codegen for nr reciprocal cases with/without FMA llvm-svn: 294587	2017-02-09 14:14:06 +00:00
Diana Picus	7232af352f	[ARM] GlobalISel: Lower single precision FP args Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets because we don't support libcalls yet. llvm-svn: 294584	2017-02-09 13:09:59 +00:00
Artur Pilipenko	4a64031954	[DAGCombiner] Support non-zero offset in load combine Enable folding patterns which load the value from non-zero offset: i8 a = ... i32 val = a[4] \| (a[5] << 8) \| (a[6] << 16) \| (a[7] << 24) => i32 val = ((i32*)(a+4)) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D29394 llvm-svn: 294582	2017-02-09 12:06:01 +00:00
Simon Pilgrim	563e23e66e	[X86][SSE] Attempt to break register dependencies during lowerBuildVector LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into a UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register. This patch attempts to break the register dependency by either always zeroing the vector before hand or (if we're inserting to the 0'th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))) which lowers to (V)MOVD and performs a similar function. Additionally (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD. On pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros to avoid the vector zeroing at the cost of a scalar zero extension, which can probably be brought over to the other cases in a future patch in some cases (load folding etc.) Differential Revision: https://reviews.llvm.org/D29720 llvm-svn: 294581	2017-02-09 11:50:19 +00:00
Igor Breger	ed43f15637	Add new tests for EXTRACT_VECTOR_ELT (vector of packed i8/16/i32/i64/ps/pd data) llvm-svn: 294565	2017-02-09 07:39:19 +00:00
Craig Topper	50f3d1452c	[X86] Clzero intrinsic and its addition under znver1 This patch does the following. 1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero 2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1) 3. Adds the clzero feature under znver1 architecture. 4. The custom inserter is added in Lowering. 5. A testcase is added to check the intrinsic. 6. The clzero instruction is added to assembler test. Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me. Differential revision: https://reviews.llvm.org/D29385 llvm-svn: 294558	2017-02-09 04:27:34 +00:00
Arnold Schwaighofer	26f016f143	SwiftCC: swifterror register cannot be as the base register Functions that have a dynamic alloca require a base register which is defined to be X19 on AArch64 and r6 on ARM. We have defined the swifterror register to be the same register. Use a different callee save register for swifterror instead: X21 on AArch64 R8 on ARM rdar://30433803 llvm-svn: 294551	2017-02-09 01:52:17 +00:00
Peter Collingbourne	58c90c0c80	LowerTypeTests: Change a few vtable globals in tests to constants. It turns out that some of our negative tests were not in fact providing the test coverage we expected: they were passing because the vtables were failing an early check that they were constant. Fix this by changing the globals in these tests to constants. llvm-svn: 294550	2017-02-09 01:48:24 +00:00
Wolfgang Pieb	458b4e7c46	Reapply r294356 ("Keep track of spilled variables in LiveDebugValues"). Was reverted with r294447 due to undefined behavior with negative offsets in DBG_VALUE instructions. llvm-svn: 294532	2017-02-08 23:46:59 +00:00
Tim Northover	e041841811	GlobalISel: legalize G_FPOW to a libcall on AArch64. There's no instruction to implement it. llvm-svn: 294531	2017-02-08 23:23:39 +00:00
Tim Northover	b38b4e2464	GlobalISel: translate @llvm.pow intrinsic to G_FPOW. It'll usually be immediately legalized back to a libcall, but occasionally something can be done with it so we'd just as well enable that flexibility from the start. llvm-svn: 294530	2017-02-08 23:23:32 +00:00
Mike Aizatsky	4705ae936d	[sancov] using comdat only when it is enabled Differential Revision: https://reviews.llvm.org/D29733 llvm-svn: 294529	2017-02-08 23:12:46 +00:00
Arnold Schwaighofer	db7bbcbe78	[ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns' We mark X0 as preserved by a call that passes the returned parameter. x0 = ... fun(x0) // no implicit def of x0 This no longer is valid if we pass the parameter in a different register then the returned value as is the case with a swiftself parameter (passed in x20). x20 = ... fun(x20) // there should be an implict def of x8 rdar://30425845 llvm-svn: 294527	2017-02-08 22:30:47 +00:00
Sanjay Patel	a62bc44f67	[InstCombine] add tests to show information-losing add nsw/nuw transforms; NFC llvm-svn: 294524	2017-02-08 22:14:11 +00:00
Amara Emerson	c3a4b282bb	Revert r294437 as it broke an asan buildbot. llvm-svn: 294523	2017-02-08 21:41:16 +00:00
Tim Northover	9dd78f8a6d	GlobalISel: select G_[SU]MULH on AArch64. Hopefully this'll be nuked by tablegen pretty soon, but until then it's reasonably important for supporting C++ operator new[]. llvm-svn: 294520	2017-02-08 21:22:25 +00:00
Tim Northover	0a9b27933a	GlobalISel: expand mul-with-overflow into mul-hi on AArch64. AArch64 has specific instructions to multiply two numbers at double the width and produce the high part of the result. These can be used to implement LLVM's mul.with.overflow instructions fairly simply. Helps with C++ operator new[]. llvm-svn: 294519	2017-02-08 21:22:15 +00:00
Mike Aizatsky	401d369328	[sancov] specifying comdat for sancov constructors Differential Revision: https://reviews.llvm.org/D29662 llvm-svn: 294517	2017-02-08 21:20:33 +00:00
Peter Collingbourne	28ffd3261f	ThinLTOBitcodeWriter: Strip debug info from merged module. This module will contain nothing but vtable definitions and (soon) available_externally function definitions, so there is no point in keeping debug info in the module. Differential Revision: https://reviews.llvm.org/D28913 llvm-svn: 294511	2017-02-08 20:44:00 +00:00
Alexey Bataev	0674fe39e5	[SLP] Additional test to check correct work of horizontal reductions, NFC. llvm-svn: 294505	2017-02-08 19:52:46 +00:00
Elena Demikhovsky	5267edd3e3	[Loop Vectorizer] Cost-based decision for vectorization form of memory instruction. Making the cost model selecting between Interleave, GatherScatter or Scalar vectorization form of memory instruction. The right decision should be done for non-consecutive memory access instrcuctions that may have more than one vectorization solution. This patch includes the following changes: - Cost Model calculates the cost of Load/Store vector form and choose the better option between Widening, Interleave, GatherScactter and Scalarization. Cost Model keeps the widening decision. - Arrays of Uniform and Scalar values are moved from Legality to Cost Model. - Cost Model collects Uniforms and Scalars per VF. The collection is based on CM decision map of Loadis/Stores vectorization form. - Vectorization of memory instruction is performed according to the CM decision. Differential Revision: https://reviews.llvm.org/D27919 llvm-svn: 294503	2017-02-08 19:25:23 +00:00
Simon Pilgrim	696e27e1ec	[X86][SSE] Regenerate scalar integer conversions to float tests llvm-svn: 294499	2017-02-08 19:01:27 +00:00
Reid Kleckner	e332a5b670	Fix inline-asm-diags.ll on Windows, give it a triple to avoid WoA thumb confusion llvm-svn: 294496	2017-02-08 18:17:21 +00:00
Tim Northover	e9600d861c	GlobalISel: select G_VASTART on iOS AArch64. The AAPCS ABI is substantially more complicated so that's coming in a separate patch. For now we can generate correct code for iOS though. llvm-svn: 294493	2017-02-08 17:57:27 +00:00
Tim Northover	f19d467ff6	GlobalISel: translate @llvm.va_start intrinsic. Because we need to preserve the memory access being performed we need a separate instruction to represent this. llvm-svn: 294492	2017-02-08 17:57:20 +00:00
Adrian Prantl	a5bf2d7003	Fix bitcode upgrade for DIGlobalVariables with a var: field. This is a follow-up to https://reviews.llvm.org/D29349. It turns out that NeedUpgradeToDIGlobalVariableExpression is always necessary when we encountered a version==0 record because it may always be referenced via a list of globals in a DICompileUnit. My tests weren't good enough to catch this though. To trigger this case, we need much older bitcode produced by LLVM around version 3.7. <rdar://problem/30404262> Differential Revision: https://reviews.llvm.org/D29693 llvm-svn: 294488	2017-02-08 17:44:43 +00:00
Sanjay Patel	d11a03b263	[InstCombine] add test for missed vector icmp fold; NFC Also, move the related existing scalar test to a renamed file where I'm planning to add more icmp-add tests. llvm-svn: 294487	2017-02-08 17:37:17 +00:00
Sanne Wouda	fc674bcb12	Move inline asm diags tests to an ARM directory. The assembler syntaxes (and parsers) differ too much to expect this test to pass for all of them. llvm-svn: 294475	2017-02-08 16:48:35 +00:00
Sanne Wouda	7e101936b6	Fix inline asm diagnostics test. Don't depend on X86 everywhere. Fix the original problem with a reg-exp for the column number. llvm-svn: 294468	2017-02-08 16:14:01 +00:00
Sanjay Patel	28ef27e3dc	[x86] add AVX512vl target for more coverage; NFC llvm-svn: 294462	2017-02-08 15:22:52 +00:00
Sanne Wouda	2933875cc2	[Assembler] Enable nicer diagnostics for inline assembly. Fixed test. Summary: Enables source location in diagnostic messages from the backend. This is after parsing, during finalization. This requires the SourceMgr, the inline assembly string buffer, and DiagInfo to still be alive after EmitInlineAsm returns. This patch creates a single SourceMgr for inline assembly inside the AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one SourceMgr per call to EmitInlineAsm would make it difficult for MCContext to figure out in which SourceMgr the SMLoc is located, while a single SourceMgr can figure it out if it has multiple buffers. The Str argument to EmitInlineAsm is copied into a buffer and owned by the inline asm SourceMgr. This ensures that DiagHandlers won't print garbage. (Clang emits a "note: instantiated into assembly here", which refers to this string.) The AsmParser gets destroyed before finalization, which means that the DiagHandlers the AsmParser installs into the SourceMgr will be stale. Restore the saved DiagHandlers. Since now we're using just one SourceMgr for multiple inline asm strings, we need to tell the AsmParser which buffer it needs to parse currently. Hand a buffer id -- returned from SourceMgr:: AddNewSourceBuffer -- to the AsmParser. Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29441 llvm-svn: 294458	2017-02-08 14:48:05 +00:00
Amaury Sechet	887117fb3d	Add test case for pr31890. NFC llvm-svn: 294455	2017-02-08 14:35:48 +00:00
Igor Laevsky	900ffa34c8	[InstCombineCalls] Unfold element atomic memcpy instruction Differential Revision: https://reviews.llvm.org/D28909 llvm-svn: 294453	2017-02-08 14:32:04 +00:00
Diana Picus	e79e5ee244	Fix test to work on swift/cyclone too I forgot to remove the neonfp target feature from the test, which means we'd have trouble selecting VADDS on targets that have neonfp enabled by default. llvm-svn: 294451	2017-02-08 14:23:30 +00:00
Konstantin Zhuravlyov	9f89ede107	[AMDGPU] Add target information that is required by tools to metadata Differential Revision: https://reviews.llvm.org/D28760#fb670e28 llvm-svn: 294449	2017-02-08 14:05:23 +00:00
Diana Picus	79add417b4	Revert "[Assembler] Enable nicer diagnostics for inline assembly." This reverts commit r294433 because it seems it broke the buildbots. llvm-svn: 294448	2017-02-08 14:02:16 +00:00
NAKAMURA Takumi	a3bc043caa	Revert r294356, "DebugInfo: Track spilled variables in LiveDebugValues" It caused undefined behavior in VarLoc. As far as I investigated, - VarLoc::VarLoc() treats negative offset value as InvalidKind. Consider the case that (int64_t)MI.getOperand(1).getImm() is negative and whether it satisfies ((uint64_t)Offset < (1ULL << 32)). - Comparison operators in VarLoc behave undefined since VarLoc::Loc.Hash is uninitialized in case of InvalidKind. I guess Offset (in VarLoc) could be made aware of signed, but I am not sure. So I have reverted it for now. llvm-svn: 294447	2017-02-08 13:49:28 +00:00
Diana Picus	e7ab088a0e	Move test from r294430 to target-specific directory The test is X86-specific, and it broke on the ARM bots because they don't build the X86 target. llvm-svn: 294446	2017-02-08 13:48:08 +00:00
Diana Picus	4fa83c03fd	[ARM] GlobalISel: Add FPR reg bank Add a register bank for floating point values and select simple instructions using them (add, copies from GPR). This assumes that the hardware can cope with a single precision add (VADDS) instruction, so the legalizer will treat G_FADD as legal and the instruction selector will refuse to select if the hardware doesn't support it. In the future we'll want to be more careful about this, and legalize to libcalls if we have to use soft float. llvm-svn: 294442	2017-02-08 13:23:04 +00:00
Amara Emerson	fecdb36f92	[AArch64][TableGen] Skip tied result operands for InstAlias This patch checks the number of operands in the resulting instruction instead of just the alias, then skips over tied operands when generating the printing method. This allows us to generate the preferred assembly syntax for the AArch64 'ins' instruction, which should always be displayed as 'mov' according to the ARMARM. Several unit tests have changed as a result, but only to reflect the preferred disassembly. Some other InstAlias patterns (movk/bic/orr) needed a slight adjustment to stop them becoming the default and breaking other unit tests. Patch by Graham Hunter. Differential Revision: https://reviews.llvm.org/D29219 llvm-svn: 294437	2017-02-08 11:28:08 +00:00
Dylan McKay	db370bd6ef	[AVR] XFAIL a set of failing CodeGen tests There are about 3 underlying bugs causing the tests to fail. On top of that, some tests just we're 'generic' enough. i.e. 32-bit registers. llvm-svn: 294434	2017-02-08 10:24:18 +00:00
Sanne Wouda	09adc245ea	[Assembler] Enable nicer diagnostics for inline assembly. Summary: Enables source location in diagnostic messages from the backend. This is after parsing, during finalization. This requires the SourceMgr, the inline assembly string buffer, and DiagInfo to still be alive after EmitInlineAsm returns. This patch creates a single SourceMgr for inline assembly inside the AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one SourceMgr per call to EmitInlineAsm would make it difficult for MCContext to figure out in which SourceMgr the SMLoc is located, while a single SourceMgr can figure it out if it has multiple buffers. The Str argument to EmitInlineAsm is copied into a buffer and owned by the inline asm SourceMgr. This ensures that DiagHandlers won't print garbage. (Clang emits a "note: instantiated into assembly here", which refers to this string.) The AsmParser gets destroyed before finalization, which means that the DiagHandlers the AsmParser installs into the SourceMgr will be stale. Restore the saved DiagHandlers. Since now we're using just one SourceMgr for multiple inline asm strings, we need to tell the AsmParser which buffer it needs to parse currently. Hand a buffer id -- returned from SourceMgr:: AddNewSourceBuffer -- to the AsmParser. Reviewers: rnk, grosbach, compnerd, rengolin, rovka, anemet Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29441 llvm-svn: 294433	2017-02-08 10:20:07 +00:00
Sam Parker	5fba45ad11	Use dynamic symbols for ELF disassembly Disassembly currently begins from addresses obtained from the objects symbol table. For ELF, add the dynamic symbols to the list if no static symbols are available so that we can more successfully disassemble stripped binaries. Differential Revision: https://reviews.llvm.org/D29632 llvm-svn: 294430	2017-02-08 09:44:18 +00:00
Chandler Carruth	1497710f52	[ArgPromote] Delete a test that makes no sense (any more). This test is under 'ArgumentPromotion' but there are no arguments that get promoted in the test case, so there seems to be no point. Also, there are no assertions about the output at all, so this seems like something we should just delete given the low value. llvm-svn: 294428	2017-02-08 08:54:08 +00:00
Chandler Carruth	2af523f20c	[ArgPromote] Clean up a crash test case by rinsing it through opt, renaming things to at least have somewhat spelled out names, and even have meaningful names where I could guess at what they should be. Also add FileCheck assertions that we're actually doing what we set out to do for some of the tests, for example not promoting a type that would result in infinite promotion. llvm-svn: 294426	2017-02-08 08:47:35 +00:00
Chandler Carruth	102fa92b4e	[ArgPromote] Actually add FileCheck to a test that I actually updated to have nice CHECK patterns instead of relying on a coarse 'not grep' check. Sorry that I missed this the first time through. llvm-svn: 294422	2017-02-08 08:04:02 +00:00
Chandler Carruth	9e44e08953	[ArgPromote] Actually run FileCheck on this test. The CHECK lines are already there, just waiting to, well, be checked. =] llvm-svn: 294421	2017-02-08 08:01:14 +00:00
Matt Arsenault	cb3fa37c7e	LSR: Check atomic instruction pointer operands llvm-svn: 294410	2017-02-08 06:44:58 +00:00
Matt Arsenault	417e0072d6	AMDGPU: Enable InferAddressSpaces llvm-svn: 294408	2017-02-08 06:16:04 +00:00
Craig Topper	3fd463a15a	[X86] Add test for clflushopt intrinsic and only enable it to be selected if the feature flag is set. llvm-svn: 294407	2017-02-08 05:45:46 +00:00
Craig Topper	e0ac7f3beb	[X86] Remove PCOMMIT instruction support since Intel has deprecated this instruction with no plans to release products with it. Intel's documentation for the deprecation https://software.intel.com/en-us/blogs/2016/09/12/deprecate-pcommit-instruction llvm-svn: 294405	2017-02-08 05:45:39 +00:00
Amaury Sechet	4b946916ac	[DAGCombiner] Push truncate through adde when the carry isn't used. Summary: As per title. Reviewers: mkuper, spatel, bkramer, RKSimon, zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D29528 llvm-svn: 294394	2017-02-08 00:32:36 +00:00
Sanjoy Das	ec892139bd	[IRCE] Add a missing invariant check Currently IRCE relies on the loops it transforms to be (semantically) of the form: for (i = START; i < END; i++) ... or for (i = START; i > END; i--) ... However, we were not verifying the presence of the START < END entry check (i.e. check before the first iteration). We were only verifying that the backedge was guarded by (i + 1) < END. Usually this would work "fine" since (especially in Java) most loops do actually have the START < END check, but of course that is not guaranteed. llvm-svn: 294375	2017-02-07 23:59:07 +00:00
Simon Pilgrim	39c138cc76	[X86][SSE] Add SSE2 build vector insertion tests llvm-svn: 294365	2017-02-07 22:23:12 +00:00
Simon Pilgrim	90ee0b2786	[X86][SSE] Add additional v4i32/v8i16/v16i8 build vector insertion tests With particular interest in cases where we don't make use of implicit zeroing or fail to break register dependencies llvm-svn: 294363	2017-02-07 22:03:37 +00:00
Wolfgang Pieb	02f329370f	DebugInfo: Track spilled variables in LiveDebugValues When variables are spilled to the stack by the register allocator, keep track of their debug locations in LiveDebugValues and insert DBG_VALUE instructions at the appropriate place. Ensure that the locations are propagated down the dominator tree via the existing mechanisms. Reviewer: aprantl Differential Revision: https://reviews.llvm.org/D29500 llvm-svn: 294356	2017-02-07 21:23:15 +00:00
Kevin Enderby	86d8bd1da5	Fix a typo in an error message for a check of invalid Mach-O files where it was printing the field name fileoff instead of filesize. The original check was added in r278557. This was found in tracking down the problem that lead to the fix in r293842 - [dsymutil] Fix __LINKEDIT vmsize in dsymutil upgrade path rdar://30386075 llvm-svn: 294354	2017-02-07 21:20:44 +00:00
Daniel Berlin	439042b7ad	Add PredicateInfo utility and printing pass Summary: This patch adds a utility to build extended SSA (see "ABCD: eliminating array bounds checks on demand"), and an intrinsic to support it. This is then used to get functionality equivalent to propagateEquality in GVN, in NewGVN (without having to replace instructions as we go). It would work similarly in SCCP or other passes. This has been talked about a few times, so i built a real implementation and tried to productionize it. Copies are inserted for operands used in assumes and conditional branches that are based on comparisons (see below for more) Every use affected by the predicate is renamed to the appropriate intrinsic result. E.g. %cmp = icmp eq i32 %x, 50 br i1 %cmp, label %true, label %false true: ret i32 %x false: ret i32 1 will become %cmp = icmp eq i32, %x, 50 br i1 %cmp, label %true, label %false true: ; Has predicate info ; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 } %x.0 = call @llvm.ssa_copy.i32(i32 %x) ret i32 %x.0 false: ret i23 1 (you can use -print-predicateinfo to get an annotated-with-predicateinfo dump) This enables us to easily determine what operations are affected by a given predicate, and how operations affected by a chain of predicates. Reviewers: davide, sanjoy Subscribers: mgorny, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D29519 Update for review comments Fix a bug Nuno noticed where we are giving information about and/or on edges where the info is not useful and easy to use wrong Update for review comments llvm-svn: 294351	2017-02-07 21:10:46 +00:00
Hans Wennborg	819e3e02a9	[X86] Disable conditional tail calls (PR31257) They are currently modelled incorrectly (as calls, which clobber registers, confusing e.g. Machine Copy Propagation). Reverting until we figure out the proper solution. llvm-svn: 294348	2017-02-07 20:37:45 +00:00
Tim Northover	d0d025ae45	GlobalISel: translate @llvm.va_end intrinsic. Turns out no-one actually cares about this one (at least) in tree so we can just drop it entirely. llvm-svn: 294345	2017-02-07 20:08:59 +00:00
Matthew Simpson	3877f397cd	[LV] Add new ARM/AArch64 interleaved access cost model tests (NFC) llvm-svn: 294342	2017-02-07 19:34:24 +00:00
Sanjoy Das	2f63cbcc0c	[ImplicitNullCheck] Extend Implicit Null Check scope by using stores Summary: This change allows usage of store instruction for implicit null check. Memory Aliasing Analisys is not used and change conservatively supposes that any store and load may access the same memory. As a result re-ordering of store-store, store-load and load-store is prohibited. Patch by Serguei Katkov! Reviewers: reames, sanjoy Reviewed By: sanjoy Subscribers: atrick, llvm-commits Differential Revision: https://reviews.llvm.org/D29400 llvm-svn: 294338	2017-02-07 19:19:49 +00:00
Matthew Simpson	1cd02f13a5	[LV] Simplify ARM/AArch64 interleaved access cost model tests (NFC) This patch removes unneeded instructions from the existing ARM/AArch64 interleaved access cost model tests. I'll be adding a similar set of tests in a follow-on patch to increase coverage. llvm-svn: 294336	2017-02-07 19:17:44 +00:00
Nemanja Ivanovic	17aeb5a260	[PowerPC][Altivec] Add vnot extended mnemonic Adds the vnot extended mnemonic for the vnor instruction. Committing on behalf of brunoalr (Bruno Rosa). Differential Revision: https://reviews.llvm.org/D29225 llvm-svn: 294330	2017-02-07 18:57:29 +00:00
Alexander Timofeev	a3dace3619	[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track lane masks. Differential revision: https://reviews.llvm.org/D29442 llvm-svn: 294324	2017-02-07 17:57:48 +00:00
Krzysztof Parzyszek	c8d676ef72	[Hexagon] Remove encoding bits from mapped instructions - Map A2_zxtb to A2_andir. - Map PS_call_nr J2_call. - Map A2_tfr[t\|f][new] to A2_padd[t\|f][new]. Patch by Colin LeMahieu. llvm-svn: 294320	2017-02-07 17:42:11 +00:00
Reid Kleckner	828f3179c2	Fix my GVNHoist test case from r294317 llvm-svn: 294319	2017-02-07 17:35:53 +00:00
Adrian Prantl	e37d314464	Fix the bitcode upgrade for DIGlobalVariable in a DIImportedEntity context. The bitcode upgrade for DIGlobalVariable unconditionally wrapped DIGlobalVariables in a DIGlobalVariableExpression. When a DIGlobalVariable is referenced by a DIImportedEntity, however, this is wrong. This patch fixes the bitcode upgrade by deferring the creation of DIGlobalVariableExpressions until we know the context of the DIGlobalVariable. <rdar://problem/30134279> Differential Revision: https://reviews.llvm.org/D29349 llvm-svn: 294318	2017-02-07 17:35:41 +00:00
Reid Kleckner	79e37d517c	Revert "[GVNHoist] Merge DebugLoc metadata on hoisted instructions" This reverts commit r294250. It caused PR31891. Add a test case that shows that inlinable calls retain location information with an accurate scope. llvm-svn: 294317	2017-02-07 17:31:13 +00:00

... 2 3 4 5 6 ...

42966 Commits