llvm-project

Author	SHA1	Message	Date
Stepan Dyatkovskiy	157bb42e27	Fix for PR18102. The issue comes from DAGCombiner::MergeConsecutiveStores, more precisely from mem-ops sequence sorting. Consider how MergeConsecutiveStores works for the following example: store i8 1, a[0] store i8 2, a[1] store i8 3, a[1] ; a[1] again. return ; DAG starts here 1. The method collects all 3 stores. 2. It sorts them by distance from the base pointer (farthest with highest index). 3. It takes the first consecutive non-overlapping stores and (if possible) replaces them with a single store instruction. The point is, we can't determine here which 'store' instruction would be second after sorting ('store 2' or 'store 3'). It happens that 'store 3' would be second, and 'store 2' would be third. So after merging we have the following result: store i16 (1 | 3 << 8), base ; is a[0] but bit-casted to i16 store i8 2, a[1] So we actually swapped 'store 3' and 'store 2' and got the wrong contents in a[1]. Fix: in the sort routine, also take the mem-op sequence number into account. llvm-svn: 200201	2014-01-27 09:18:31 +00:00
Kevin Qin	fb9871ff50	[AArch64 NEON] Fix pattern match failed on FP_ROUND from v1f128 to v1f64. llvm-svn: 200109	2014-01-26 02:19:35 +00:00
Hal Finkel	dbebb52a2f	Disable the use of TBAA when using AA in CodeGen There are currently two issues, of which I currently know, that prevent TBAA from being correctly usable in CodeGen: 1. Stack coloring does not update TBAA when merging allocas. This is easy enough to fix, but is not the largest problem. 2. CGP inserts ptrtoint/inttoptr pairs when sinking address computations. Because BasicAA does not handle inttoptr, we'll often miss basic type punning idioms that we need to catch so we don't miscompile real-world code (like LLVM). I don't yet have a small test case for this, but this fixes self hosting a non-asserts build of LLVM on PPC64 when using -enable-aa-sched-mi and -misched=shuffle. llvm-svn: 200093	2014-01-25 19:24:54 +00:00
Hal Finkel	9b2617a5a8	Add combiner-aa-only-func (debug only) This option (which is !NDEBUG only) allows restricting the use of alias analysis in DAGCombiner to a specific function. This has proved extremely valuable for isolating bugs related to this feature, and mirrors the misched-only-func option provided by the new instruction scheduler. llvm-svn: 200088	2014-01-25 17:32:39 +00:00
Hal Finkel	5fb07341f1	Improve descriptions of combiner-alias-analysis and combiner-global-alias-analysis llvm-svn: 200087	2014-01-25 17:32:37 +00:00
Juergen Ributzka	f26beda7c7	Revert "Revert "Add Constant Hoisting Pass" (r200034)" This reverts commit r200058 and adds the using directive for ARMTargetTransformInfo to silence two g++ overload warnings. llvm-svn: 200062	2014-01-25 02:02:55 +00:00
Hans Wennborg	4d67a2e85a	Revert "Add Constant Hoisting Pass" (r200034) This commit caused -Woverloaded-virtual warnings. The two new TargetTransformInfo::getIntImmCost functions were only added to the superclass, and to the X86 subclass. The other targets were not updated, and the warning highlighted this by pointing out that e.g. ARMTTI::getIntImmCost was hiding the two new getIntImmCost variants. We could pacify the warning by adding "using TargetTransformInfo::getIntImmCost" to the various subclasses, or turning it off, but I suspect that it's wrong to leave the functions unimplemented in those targets. The default implementations return TCC_Free, which I don't think is right e.g. for ARM. llvm-svn: 200058	2014-01-25 01:18:18 +00:00
Juergen Ributzka	4f3df4ad64	Add Constant Hoisting Pass Retry commit r200022 with a fix for the build bot errors. Constant expressions have (unlike instructions) module scope use lists and therefore may have users in different functions. The fix is to simply ignore these out-of-function uses. llvm-svn: 200034	2014-01-24 20:18:00 +00:00
Hal Finkel	51a9838049	Fix DAGCombiner::GatherAllAliases to account for non-chain dependencies DAGCombiner::GatherAllAliases, which is only used when AA use is enabled during DAGCombine, had a fundamentally incorrect assumption for which this change compensates. GatherAllAliases, which is used to find aliasing predecessor chain nodes (so that a better chain can be selected for a load or store to enable subsequent optimizations) assumed that walking up the chain would always catch all possibly-aliasing loads and stores. This is not true: To really find all aliases, we also need to search for aliases through the value operand of a store, etc. Consider the following situation: Token1 = ... L1 = load Token1, %52 S1 = store Token1, L1, %51 L2 = load Token1, %52+8 S2 = store Token1, L2, %51+8 Token2 = Token(S1, S2) L3 = load Token2, %53 S3 = store Token2, L3, %52 L4 = load Token2, %53+8 S4 = store Token2, L4, %52+8 If we search for aliases of S3 (which stores to address %52), and we look only through the chain, then we'll miss the trivial dependence on L1 (which loads from %52). We then might change all loads and stores to use Token1 as their chain operand, which could result in copying %53 into %52 before copying %52 into %51 (which should happen first). The problem is, however, that searching for such data dependencies can become expensive, and the cost is not directly related to the chain depth. Instead, we'll rule out such configurations by insisting that we've visited all chain users (except for users of the original chain, which is not necessary). When doing this, we need to look through nodes we don't care about (otherwise, things like register copies will interfere with trivial use cases). Unfortunately, I don't have a small test case for this problem. Creating the underlying situation is not hard (a pair of memcpys will do it), but arranging for the default instruction schedule to be incorrect is very fragile. This unbreaks self hosting on PPC64 when using -mllvm -combiner-global-alias-analysis -mllvm -combiner-alias-analysis. llvm-svn: 200033	2014-01-24 20:12:02 +00:00
Juergen Ributzka	50e7e80d00	Revert "Add Constant Hoisting Pass" This reverts commit r200022 to unbreak the build bots. llvm-svn: 200024	2014-01-24 18:40:30 +00:00
Hal Finkel	ccc18e1330	Restrict FindBetterChain DAG combines to unindexed nodes These transformations obviously won't work for indexed (pre/post-inc) loads and stores. In practice, I'm not sure there is any benefit to enabling them for indexed nodes because other transformations that these might enable likely also won't handle indexed nodes. I don't have an in-tree test case that hits this problem, but an upcoming bug fix will make it much more likely. llvm-svn: 200023	2014-01-24 18:25:26 +00:00
Juergen Ributzka	38b67d0caf	Add Constant Hoisting Pass This pass identifies expensive constants to hoist and coalesces them to better prepare them for SelectionDAG-based code generation. This works around the limitations of the basic-block-at-a-time approach. First it scans all instructions for integer constants and calculates their cost. If the constant can be folded into the instruction (the cost is TCC_Free) or the cost is just a simple operation (TCC_Basic), then we don't consider it expensive and leave it alone. This is the default behavior and the default implementation of getIntImmCost will always return TCC_Free. If the cost is more than TCC_Basic, then the integer constant can't be folded into the instruction and it might be beneficial to hoist the constant. Similar constants are coalesced to reduce register pressure and materialization code. When a constant is hoisted, it is also hidden behind a bitcast to force it to be live-out of the basic block. Otherwise the constant would be just duplicated and each basic block would have its own copy in the SelectionDAG. The SelectionDAG recognizes such constants as opaque and doesn't perform certain transformations on them, which would create a new expensive constant. This optimization is only applied to integer constants in instructions and simple (this means not nested) constant cast expressions. For example: %0 = load i64* inttoptr (i64 big_constant to i64*) Reviewed by Eric llvm-svn: 200022	2014-01-24 18:23:08 +00:00
Alp Toker	cb40291100	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. llvm-svn: 200018	2014-01-24 17:20:08 +00:00
Owen Anderson	77e4d44411	Revert r162101 and replace it with a solution that works for targets where the pointer type is illegal. This is a horrible bit of code. We're calling a simplification routine in the middle of type legalization. We tell the simplification routine that it's running after legalization, but some of the types it will encounter will be illegal! The fix is only to invoke the simplification if the types in question were legal, so that none of its invariants will be violated. llvm-svn: 199847	2014-01-22 22:34:17 +00:00
Elena Demikhovsky	9d56f1e0e5	AVX512: combining setcc and zext is wrong on AVX512 because vector compare instruction puts result in mask register. llvm-svn: 199798	2014-01-22 12:26:19 +00:00
Owen Anderson	fb00d5bc7c	Allow SMUL_LOHI and UMUL_LOHI to be narrowed to MUL on targets where MUL is Custom rather than Legal. Even if the target is doing some kind of expansion for MUL, it's pretty much guaranteed to be more efficient than whatever it does for SMUL_LOHI or UMUL_LOHI! llvm-svn: 199678	2014-01-20 18:41:34 +00:00
Andrea Di Biagio	d7c03ec348	[DAGCombiner] Fix a wrong check in method SimplifyVBinOp. This fixes a regression introduced by r199135. Revision 199135 tried to simplify part of the logic in method DAGCombiner::SimplifyVBinOp by introducing calls to method BuildVectorSDNode::isConstant(). However, that revision wrongly changed the check performed by method SimplifyVBinOp to identify dag nodes that can be folded. Before revision 199135, that method only tried to simplify vector binary operations if both operands were build_vector of Constant/ConstantFP/Undef only. After revision 199135, method SimplifyVBinOp also tried to simplify vector binary operations with only one constant operand. This fixes the problem by restoring the old behavior of SimplifyVBinOp. llvm-svn: 199328	2014-01-15 19:51:32 +00:00
Jakob Stoklund Olesen	b6b35a4955	Always let value types influence register classes. When creating a virtual register for a def, the value type should be used to pick the register class. If we only use the register class constraint on the instruction, we might pick a too large register class. Some registers can store values of different sizes. For example, the x86 xmm registers can hold f32, f64, and 128-bit vectors. The three different value sizes are represented by register classes with identical register sets: FR32, FR64, and VR128. These register classes have different spill slot sizes, so it is important to use the right one. The register class constraint on an instruction doesn't necessarily care about the size of the value it's defining. The value type determines that. This fixes a problem where InstrEmitter was picking 32-bit register classes for 64-bit values on SPARC. llvm-svn: 199187	2014-01-14 06:18:38 +00:00
Juergen Ributzka	6840282c99	[DAG] Refactor ReassociateOps - no functional change intended. llvm-svn: 199146	2014-01-13 21:49:25 +00:00
Juergen Ributzka	7384405f23	[DAG] Teach DAG to also reassociate vector operations This commit teaches DAG to reassociate vector ops, which in turn enables constant folding of vector op chains that appear later on during custom lowering and DAG combine. Reviewed by Andrea Di Biagio llvm-svn: 199135	2014-01-13 20:51:35 +00:00
Andrew Trick	7daf6a45f4	Hide the pre-RA-sched= option. This is a very confusing option for a feature that will go away. -enable-misched is exposed instead to help triage issues with the new scheduler. llvm-svn: 199133	2014-01-13 20:08:27 +00:00
Nico Rieck	b5262d6d8f	Fix non-deterministic SDNodeOrder-dependent codegen Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050	2014-01-12 14:09:17 +00:00
Alp Toker	798060e006	Fix 'ned' typo in doc comment Patch by Jasper Neumann! llvm-svn: 199007	2014-01-11 14:01:43 +00:00
Richard Sandiford	15cfc1c33c	Handle masked rotate amounts At the moment we expect rotates to have the form: (or (shl X, Y), (shr X, Z)) where Y == bitsize(X) - Z or Z == bitsize(X) - Y. This form means that the (or ...) is undefined for Y == 0 or Z == 0. This undefinedness can be avoided by using Y == (C * bitsize(X) - Z) & (bitsize(X) - 1) or Z == (C * bitsize(X) - Y) & (bitsize(X) - 1) for any integer C (including 0, the most natural choice). llvm-svn: 198861	2014-01-09 10:56:42 +00:00
Richard Sandiford	0f264db3c6	Match the InstCombine form of rotates by X+C InstCombine converts (sub 32, (add X, C)) into (sub 32-C, X), so a rotate left of a 32-bit Y by X+C could appear as either: (or (shl Y, (add X, C)), (shr Y, (sub 32, (add X, C)))) without InstCombine or: (or (shl Y, (add X, C)), (shr Y, (sub 32-C, X))) with it. We already matched the first form. This patch handles the second too. llvm-svn: 198860	2014-01-09 10:49:40 +00:00
Chandler Carruth	d48cdbf0c3	Put the functionality for printing a value to a raw_ostream as an operand into the Value interface just like the core print method is. That gives a more consistent organization to the IR printing interfaces -- they are all attached to the IR objects themselves. Also, update all the users. This removes the 'Writer.h' header which contained only a single function declaration. llvm-svn: 198836	2014-01-09 02:29:41 +00:00
Andrea Di Biagio	23df4e4a2d	Teach the DAGCombiner how to fold 'vselect' dag nodes according to the following two rules: 1) fold (vselect (build_vector AllOnes), A, B) -> A 2) fold (vselect (build_vector AllZeros), A, B) -> B llvm-svn: 198777	2014-01-08 18:33:04 +00:00
Richard Sandiford	95c864d9bd	[DAGCombiner] Factor duplicated rotate code into a separate function No functional change intended. llvm-svn: 198768	2014-01-08 15:40:47 +00:00
Chandler Carruth	9aca918df9	Move the LLVM IR asm writer header files into the IR directory, as they are part of the core IR library in order to support dumping and other basic functionality. Rename the 'Assembly' include directory to 'AsmParser' to match the library name and the only functionality left there -- printing has been in the core IR library for quite some time. Update all of the #includes to match. All of this started because I wanted to have the layering in good shape before I started adding support for printing LLVM IR using the new pass infrastructure, and commandline support for the new pass infrastructure. llvm-svn: 198688	2014-01-07 12:34:26 +00:00
Kevin Qin	5cd73c9e0a	[AArch64 NEON] Fix invalid constant used in vselect condition. There is a wrong assumption that the vector element type and the type of each ConstantSDNode in the build_vector were the same. However, when promoting the integer operand of a legally typed build_vector, the operand type and the vector element type do not need to be the same (see method 'DAGTypeLegalizer::PromoteIntOp_BUILD_VECTOR' in LegalizeIntegerTypes.cpp). In the AArch64 backend, the following dag sequence: C0: i1 = Constant<0> C1: i1 = Constant<-1> V: v8i1 = BUILD_VECTOR C1, C1, C0, C0, C0, C0, C0, C0 is type-legalized into: NewC0: i32 = Constant<0> NewC1: i32 = Constant<1> V: v8i8 = BUILD_VECTOR NewC1, NewC1, NewC0, NewC0, NewC0, NewC0, NewC0, NewC0 Force a getZeroExtend to VTBits to ensure that the new constant is correct. llvm-svn: 198582	2014-01-06 02:26:10 +00:00
Bill Wendling	908bf814e7	Refactor function that checks that __builtin_returnaddress's argument is constant. This moves the check up into the parent class so that all targets can use it without having to copy (and keep in sync) the same error message. llvm-svn: 198579	2014-01-06 00:43:20 +00:00
Kevin Qin	ede9ce1933	Fix a bug in DAGCombiner about zero-extend after setcc. For the AArch64 backend, if DAGCombiner sees "sext(setcc)", it combines them into a single setcc with an extended value type. Then, if it sees "zext(setcc)", it assumes the setcc is Vxi1 and tries to create "(and (vsetcc), (1, 1, ...))". When the setcc isn't Vxi1, DAGCombiner creates a wrong node and wrong code is emitted. llvm-svn: 198190	2013-12-30 02:05:13 +00:00
Andrea Di Biagio	46dcddb350	Teach DAGCombiner how to fold a SIGN_EXTEND_INREG of a BUILD_VECTOR of ConstantSDNodes (or UNDEFs) into a simple BUILD_VECTOR. For example, given the following sequence of dag nodes: i32 C = Constant<1> v4i32 V = BUILD_VECTOR C, C, C, C v4i32 Result = SIGN_EXTEND_INREG V, ValueType:v4i1 The SIGN_EXTEND_INREG node can be folded into a build_vector since the vector in input is a BUILD_VECTOR of constants. The optimized sequence is: i32 C = Constant<-1> v4i32 Result = BUILD_VECTOR C, C, C, C llvm-svn: 198084	2013-12-27 20:20:28 +00:00
Josh Magee	22b8ba2d67	[stackprotector] Use analysis from the StackProtector pass for stack layout in PEI and LocalStackSlot passes. This changes the MachineFrameInfo API to use the new SSPLayoutKind information produced by the StackProtector pass (instead of a boolean flag) and updates a few pass dependencies (to preserve the SSP analysis). The stack layout follows the same approach used prior to this change - i.e., only LargeArray stack objects will be placed near the canary and everything else will be laid out normally. After this change, structures containing large arrays will also be placed near the canary - a case previously missed by the old implementation. Out of tree targets will need to update their usage of MachineFrameInfo::CreateStackObject to remove the MayNeedSP argument. The next patch will implement the rules for sspstrong and sspreq. The end goal is to support ssp-strong stack layout rules. WIP. Differential Revision: http://llvm-reviews.chandlerc.com/D2158 llvm-svn: 197653	2013-12-19 03:17:11 +00:00
Jim Grosbach	ea2db453dd	Add a machine code print in DEBUG() following instruction selection. Make debugging ISel a bit easier by printing out a dump of the generated code at the end. llvm-svn: 197456	2013-12-17 02:01:10 +00:00
Juergen Ributzka	e82947539e	[Stackmap] Liveness Analysis Pass This optional register liveness analysis pass can be enabled with either -enable-stackmap-liveness, -enable-patchpoint-liveness, or both. The pass traverses each basic block in a machine function. For each basic block the instructions are processed in reversed order and if a patchpoint or stackmap instruction is encountered the current live-out register set is encoded as a register mask and attached to the instruction. Later on during stackmap generation the live-out register mask is processed and also emitted as part of the stackmap. This information is optional and intended for optimization purposes only. This will enable a client of the stackmap to reason about the registers it can use and which registers need to be preserved. Reviewed by Andy llvm-svn: 197317	2013-12-14 06:53:06 +00:00
Andrew Trick	7bcb0100df	Revert "Liveness Analysis Pass" This reverts commit r197254. This was an accidental merge of Juergen's patch. It will be checked in shortly, but wasn't meant to go in quite yet. Conflicts: include/llvm/CodeGen/StackMaps.h lib/CodeGen/StackMaps.cpp test/CodeGen/X86/stackmap-liveness.ll llvm-svn: 197260	2013-12-13 18:57:20 +00:00
Andrew Trick	e8cba373a3	Grow the stackmap/patchpoint format to hold 64-bit IDs. llvm-svn: 197255	2013-12-13 18:37:10 +00:00
Andrew Trick	8d6a658430	Liveness Analysis Pass llvm-svn: 197254	2013-12-13 18:37:03 +00:00
Benjamin Kramer	671a596282	SelectionDAG: Fix a typo. Found by "cppcheck". PR18208. llvm-svn: 197047	2013-12-11 16:36:09 +00:00
Richard Sandiford	d1093636cc	Extend (truncate (load)) folding DAGCombiner could fold (truncate (load)) -> smaller load if the original load was the width of the truncation result or wider. This patch extends it to handle cases where the original load was narrower (and so the extension type stays the same). llvm-svn: 197030	2013-12-11 11:37:27 +00:00
Reid Kleckner	ee08897fb8	Reland "Fix miscompile of MS inline assembly with stack realignment" This re-lands commit r196876, which was reverted in r196879. The tests have been fixed to pass on platforms with a stack alignment larger than 4. Update to clang side tests will land shortly. llvm-svn: 196939	2013-12-10 18:27:32 +00:00
Richard Sandiford	9afe613d12	Add TargetLowering::prepareVolatileOrAtomicLoad One unusual feature of the z architecture is that the result of a previous load can be reused indefinitely for subsequent loads, even if a cache-coherent store to that location is performed by another CPU. A special serializing instruction must be used if you want to force a load to be reattempted. Since volatile loads are not supposed to be omitted in this way, we should insert a serializing instruction before each such load. The same goes for atomic loads. The patch implements this at the IR->DAG boundary, in a similar way to atomic fences. It is a no-op for targets other than SystemZ. llvm-svn: 196905	2013-12-10 10:36:34 +00:00
Reid Kleckner	0a9509f080	Revert "Fix miscompile of MS inline assembly with stack realignment" This reverts commit r196876. Its tests failed on the bots, so I'll figure it out tomorrow. llvm-svn: 196879	2013-12-10 05:31:27 +00:00
Reid Kleckner	7f10a8cd45	Fix miscompile of MS inline assembly with stack realignment For stack frames requiring realignment, three pointers may be needed: - ebp to address incoming arguments - esi (could be any callee-saved register) to address locals - esp to address outgoing arguments We would use esi unconditionally without verifying that it did not conflict with inline assembly. This change doesn't do the verification, it simply emits a fatal error on functions that use stack realignment, dynamic SP adjustments, and inline assembly. Because stack realignment is common on Windows, we also no longer assume that MS inline assembly clobbers esp. Instead, we analyze the inline instructions for implicit definitions and check if esp is there. If so, we require the use of a base pointer and consider it in the condition above. Mostly fixes PR16830, but we could try harder to find a non-conflicting base pointer. Reviewers: sunfish Differential Revision: http://llvm-reviews.chandlerc.com/D1317 llvm-svn: 196876	2013-12-10 05:12:23 +00:00
Nadav Rotem	6eee080450	Fix PR18162 - Incorrect assertion assumed that the SDValue resno is zero. llvm-svn: 196858	2013-12-10 01:13:59 +00:00
Alp Toker	f907b891da	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Rafael Espindola	d50dbc783b	Try harder to get consistent floating point results. This just extends the existing hack. It should be enough to get a reproducible bootstrap on 32 bits. I will open a bug to track getting a real fix for this. llvm-svn: 196462	2013-12-05 04:14:33 +00:00
Andrew Trick	391dbadb51	StackMap: Implement support for DirectMemRefOp. A Direct stack map location records the address of a frame index. This address is itself the value that the runtime requested. This differs from IndirectMemRefOp locations, which refer to stack locations from which the requested values must be loaded. Direct locations can directly communicate the address of an alloca, while IndirectMemRefOp locations handle register spills. For example: entry: %a = alloca i64... llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a) Since both the alloca and stackmap intrinsic are in the entry block, and the intrinsic takes the address of the alloca, the runtime can assume that LLVM will not substitute the alloca with any intervening value. This must be verified by the runtime by checking that the stack map's location is a Direct location type. The runtime can then determine the alloca's relative location on the stack immediately after compilation, or at any time thereafter. This differs from Register and Indirect locations, because the runtime can only read the values in those locations when execution reaches the instruction address of the stack map. llvm-svn: 195712	2013-11-26 02:03:25 +00:00
Bill Wendling	9200bb08f9	Unrevert r195599 with testcase fix. I'm not sure how it was checking for the wrong values... PR18023. llvm-svn: 195670	2013-11-25 18:05:22 +00:00
Amara Emerson	f59125f5bb	Revert r195599 as it broke the builds. llvm-svn: 195636	2013-11-25 11:24:18 +00:00
Daniel Sanders	b021c6fdbd	Fixed tryFoldToZero() for vector types that need expansion. Summary: Moved the requirement for SelectionDAG::getConstant() to return legally typed nodes slightly earlier. There were two optional DAGCombine passes that were missed out and were required to produce type-legal DAGs. Simplified a code-path in tryFoldToZero() to use SelectionDAG::getConstant(). This provides support for both promoted and expanded vector types whereas the previous code only supported promoted vector types. Fixes a "Type for zero vector elements is not legal" assertion detected by an llvm-stress generated test. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2251 llvm-svn: 195635	2013-11-25 11:14:43 +00:00
Bill Wendling	e3c48709ed	Don't look past volatile loads. A volatile load should block us from trying to coalesce stores. PR18023 llvm-svn: 195599	2013-11-25 05:01:21 +00:00
Paul Robinson	d89125a5d8	Teach ISel not to optimize 'optnone' functions (revised). Improvements over r195317: - Set/restore EnableFastISel flag instead of just running FastISel within SelectAllBasicBlocks; the flag is checked in various places, and FastISel won't run properly if those places don't do the right thing. - Test looks for normal ISel versus FastISel behavior, and not something more subtle that doesn't work everywhere. Based on work by Andrea Di Biagio. llvm-svn: 195491	2013-11-22 19:11:24 +00:00
Andrew Trick	4a1abb7ab5	patchpoint: factor SD builder code for live vars. Plain stackmap also optimizes Constant values now. llvm-svn: 195488	2013-11-22 19:07:36 +00:00
Andrew Trick	a2428e0f40	patchpoint: eliminate hard coded operand indices. llvm-svn: 195487	2013-11-22 19:07:33 +00:00
Tom Stellard	06c67bcbe4	SelectionDAG: Optimize expansion of vec_type = BITCAST scalar_type The legalizer can now do this type of expansion for more type combinations without loading and storing to and from the stack. NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195398	2013-11-22 00:41:05 +00:00
Tom Stellard	9cbd2c5581	Split SETCC if VSELECT requires splitting too. This patch is a rewrite of the original patch committed in r194542. Instead of relying on the type legalizer to do the splitting for us, we now perform the splitting ourselves in the DAG combiner. This is necessary for the case where the vector mask is a legal type after promotion and still wouldn't require splitting. Patch by: Juergen Ributzka NOTE: This is a candidate for the 3.4 branch. llvm-svn: 195397	2013-11-22 00:39:23 +00:00
Daniel Sanders	edc071b815	Add support for legalizing SETNE/SETEQ by inverting the condition code and the result of the comparison. Summary: LegalizeSetCCCondCode can now legalize SETEQ and SETNE by returning the inverse condition and requesting that the caller invert the result of the condition. The caller of LegalizeSetCCCondCode must handle the inverted CC, and they do so as follows: SETCC, BR_CC: Invert the result of the SETCC with SelectionDAG::getNOT() SELECT_CC: Swap the true/false operands. This is necessary for MSA which lacks an integer SETNE instruction. Reviewers: resistor CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2229 llvm-svn: 195355	2013-11-21 13:24:49 +00:00
NAKAMURA Takumi	43aa939625	Revert r195317 (and r195333), "Teach ISel not to optimize 'optnone' functions." It broke, at least, i686 target. It is reproducible with "llc -mtriple=i686-unknown". FYI, it didn't appear to add either "-O0" or "-fast-isel". llvm-svn: 195339	2013-11-21 10:55:15 +00:00
Paul Robinson	b379efeb53	Teach ISel not to optimize 'optnone' functions. Based on work by Andrea Di Biagio. llvm-svn: 195317	2013-11-21 06:33:32 +00:00
Jack Carter	d4b22dcbf3	long line correction llvm-svn: 195179	2013-11-20 00:32:32 +00:00
Jack Carter	5c0af48a11	long lines and white space correction llvm-svn: 195170	2013-11-19 23:43:22 +00:00
Juergen Ributzka	b34871027f	[DAG] Refactor vector splitting code in SelectionDAG. No functional change intended. Reviewed by Tom llvm-svn: 195156	2013-11-19 21:20:17 +00:00
Andrew Trick	1f54e805f2	Fix patchpoint comments. llvm-svn: 195103	2013-11-19 05:05:43 +00:00
Benjamin Kramer	bb1dd73d3e	DAGCombiner: Partially revert r192795, getNOT was fixed not to create illegal constants. llvm-svn: 194959	2013-11-17 10:40:03 +00:00
Matt Arsenault	64283bd99c	Use more getZExtOrTruncs llvm-svn: 194945	2013-11-17 02:31:26 +00:00
Matt Arsenault	873bb3ea86	Use getZExtOrTrunc instead of repeating the same logic. llvm-svn: 194944	2013-11-17 02:24:21 +00:00
Matt Arsenault	36f5eb5949	Use right address space pointer size llvm-svn: 194940	2013-11-17 00:06:39 +00:00
Matt Arsenault	dfb3e7092e	Fix assert on unaligned access to global with different address space size. llvm-svn: 194934	2013-11-16 20:50:54 +00:00
Matt Arsenault	19231e630e	Fix codegen for null different sized pointer. llvm-svn: 194932	2013-11-16 20:24:41 +00:00
Bob Wilson	9f3e6b25ee	Avoid illegal integer promotion in fastisel Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when: sext(a) + sext(b) != sext(a + b) Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280> Patch by Duncan Exon Smith! llvm-svn: 194840	2013-11-15 19:09:27 +00:00
Daniel Sanders	50b8041066	Fix illegal DAG produced by SelectionDAG::getConstant() for v2i64 type Summary: When getConstant() is called for an expanded vector type, it is split into multiple scalar constants which are then combined using appropriate build_vector and bitcast operations. In addition to the usual big/little endian differences, the case where the element-order of the vector does not have the same endianness as the elements themselves is also accounted for. For example, for v4i32 on big-endian MIPS, the byte-order of the vector is <3210,7654,BA98,FEDC>. For little-endian, it is <0123,4567,89AB,CDEF>. Handling this case turns out to be a nop since getConstant() returns a splatted vector (so reversing the element order doesn't change the value). This fixes a number of cases in MIPS MSA where calling getConstant() during operation legalization introduces illegal types (e.g. to legalize v2i64 UNDEF into a v2i64 BUILD_VECTOR of illegal i64 zeros). It should also handle bigger differences between illegal and legal types such as legalizing v2i64 into v8i16. lowerMSASplatImm() in the MIPS backend no longer needs to avoid calling getConstant() so this function has been updated in the same patch. For the sake of transparency, the steps I've taken since the review are: * Added 'virtual' to isVectorEltOrderLittleEndian() as requested. This revealed that the MIPS tests were falsely passing because a polymorphic function was not actually polymorphic in the reviewed patch. * Fixed the tests that were now failing. This involved deleting the code to handle the MIPS MSA element-order (which was previously doing a byte-order swap instead of an element-order swap). This left isVectorEltOrderLittleEndian() unused and it was deleted. * Fixed build failures caused by rebasing beyond r194467-r194472. These build failures involved the bset, bneg, and bclr instructions added in these commits using lowerMSASplatImm() in a way that was no longer valid after this patch. Some of these were fixed by calling SelectionDAG::getConstant() instead, others were fixed by a new function getBuildVectorSplat() that provided the removed functionality of lowerMSASplatImm() in a more sensible way. Reviewers: bkramer Reviewed By: bkramer CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1973 llvm-svn: 194811	2013-11-15 12:56:49 +00:00
Matt Arsenault	c5559bb14b	Add target hook to prevent folding some bitcasted loads. This is to avoid this transformation in some cases: fold (conv (load x)) -> (load (conv*)x) On architectures that don't natively support some vector loads efficiently casting the load to a smaller vector of larger types and loading is more efficient. Patch by Micah Villmow. llvm-svn: 194783	2013-11-15 04:42:23 +00:00
Matt Arsenault	b03bd4d96b	Add addrspacecast instruction. Patch by Michele Scandale! llvm-svn: 194760	2013-11-15 01:34:59 +00:00
Andrew Trick	561f2218e0	Minor extension to llvm.experimental.patchpoint: don't require a call. If a null call target is provided, don't emit a dummy call. This allows the runtime to reserve as little nop space as it needs without the requirement of emitting a call. llvm-svn: 194676	2013-11-14 06:54:10 +00:00
Juergen Ributzka	34c652d34d	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. This patch reapplies r193676 with an additional fix for the Hexagon backend. The SystemZ backend has already been fixed by r194148. The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 194542	2013-11-13 01:57:54 +00:00
Daniel Sanders	a1840d2f88	Vector forms of SHL, SRA, and SRL can be constant folded using SimplifyVBinOp too Reviewers: dsanders Reviewed By: dsanders CC: llvm-commits, nadav Differential Revision: http://llvm-reviews.chandlerc.com/D1958 llvm-svn: 194393	2013-11-11 17:23:41 +00:00
Juergen Ributzka	87ed906b2e	[Stackmap] Materialize the jump address within the patchpoint noop slide. This patch moves the jump address materialization inside the noop slide. This enables patching of the materialization itself or its complete removal. This patch also adds the ability to define scratch registers that can be used safely by the code called from the patchpoint intrinsic. At least one scratch register is required, because that one is used for the materialization of the jump address. This patch depends on D2009. Differential Revision: http://llvm-reviews.chandlerc.com/D2074 Reviewed by Andy llvm-svn: 194306	2013-11-09 01:51:33 +00:00
Juergen Ributzka	9969d3e6e8	[Stackmap] Add AnyReg calling convention support for patchpoint intrinsic. The idea of the AnyReg Calling Convention is to provide the call arguments in registers, but not to force them to be placed in a particular order into a specified set of registers. Instead it is up to the register allocator to assign any register as it sees fit. The same applies to the return value (if applicable). Differential Revision: http://llvm-reviews.chandlerc.com/D2009 Reviewed by Andy llvm-svn: 194293	2013-11-08 23:28:16 +00:00
Andrew Trick	6664df12fb	Slightly change the way stackmap and patchpoint intrinsics are lowered. MorphNodeTo is not safe to call during DAG building. It eagerly deletes dependent DAG nodes which invalidates the NodeMap. We could expose a safe interface for morphing nodes, but I don't think it's worth it. Just create a new MachineNode and replaceAllUsesWith. My understanding of the SD design has been that we want to support early target opcode selection. That isn't very well supported, but generally works. It seems reasonable to rely on this feature even if it isn't widely used. llvm-svn: 194102	2013-11-05 22:44:04 +00:00
Juergen Ributzka	359c532d36	[Stackmap] Remove erroneous assert. llvm-svn: 193871	2013-11-01 17:53:27 +00:00
Aaron Ballman	2b7a733b16	Commenting out this assert because it is causing the build bots to fail. This effectively reverts r193861, but needs to be fixed as part of r193769. llvm-svn: 193862	2013-11-01 15:12:23 +00:00
Aaron Ballman	96321aa523	Fixing an order of evaluation error in an assert. llvm-svn: 193861	2013-11-01 14:53:14 +00:00
Andrew Trick	153ebe6d2a	Add support for stack map generation in the X86 backend. Originally implemented by Lang Hames. llvm-svn: 193811	2013-10-31 22:11:56 +00:00
Andrew Trick	74f4c749cf	Lower stackmap intrinsics directly to their target opcode in the DAG builder. llvm-svn: 193769	2013-10-31 17:18:24 +00:00
Andrew Trick	d4d1d9c06e	whitespace llvm-svn: 193765	2013-10-31 17:18:07 +00:00
Jim Grosbach	7236678687	Legalize: Improve legalization of long vector extends. When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. 
If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and then continue legalizing the rest of the operation normally. The result is that this operates as a new, more efficient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727	2013-10-31 00:20:48 +00:00
Matt Arsenault	2ba54c3d90	Fix CodeGen for unaligned loads with address spaces llvm-svn: 193721	2013-10-30 23:30:05 +00:00
Juergen Ributzka	3bd686d493	Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." Now Hexagon and SystemZ are not happy with it :-( llvm-svn: 193677	2013-10-30 06:36:19 +00:00
Juergen Ributzka	6ad05d6b95	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. This mask usually has the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 193676	2013-10-30 05:48:18 +00:00
Alp Toker	6a03374526	Fix "existant" typos llvm-svn: 193579	2013-10-29 02:35:28 +00:00
Richard Sandiford	981fdeb477	[DAGCombiner] Respect volatility when checking for aliases Making useAA() default to true for SystemZ showed that the combiner alias analysis wasn't handling volatile accesses. This hit many of the SystemZ tests, but I arbitrarily picked one for the purpose of this patch. llvm-svn: 193518	2013-10-28 12:00:00 +00:00
Richard Sandiford	39c1ce4dc1	Keep TBAA info when rewriting SelectionDAG loads and stores Most SelectionDAG code drops the TBAA info when creating a new form of a load and store (e.g. during legalization, or when converting a plain load to an extending one). This patch tries to catch all cases where the TBAA information can legitimately be carried over. The patch adds alternative forms of getLoad() and getExtLoad() that take a MachineMemOperand instead of individual fields. (The corresponding getTruncStore() already exists.) The idea is to use the MachineMemOperand forms when all fields are carried over (size, pointer info, isVolatile, isNonTemporal, alignment and TBAA info). If some adjustment is being made, e.g. to narrow the load, then we still pass the individual fields but also pass the TBAA info. llvm-svn: 193517	2013-10-28 11:17:59 +00:00
Tim Northover	a564d329c2	LegalizeDAG: allow libcalls for max/min atomic operations ARM processors without ldrex/strex need to be able to make libcalls for all atomic operations, including the newer min/max versions. The alternative would probably be expanding these operations in terms of cmpxchg (as x86 does always), but in the configurations where this matters code-size tends to be paramount so the libcall is more desirable. llvm-svn: 193398	2013-10-25 09:30:20 +00:00
Nadav Rotem	d369d4bdf9	Optimize concat_vectors(X, undef) -> scalar_to_vector(X). This optimization is not SSE specific so I am moving it to DAGco. The new scalar_to_vector dag node exposed a missing pattern in the AArch64 target that I needed to add. llvm-svn: 193393	2013-10-25 06:41:18 +00:00
Tom Stellard	8d7d4deafe	SelectionDAG: Pass along the original argument/element type in ISD::InputArg For some targets, it is useful to be able to look at the original type of an argument without having to dig through the original IR. This also fixes a bug in SelectionDAGBuilder where InputArg.PartOffset was not taking into account the offset of structure elements. Patch by: Justin Holewinski Tom Stellard: - Changed the type of ArgVT to EVT, so it can store non-simple types like v3i32. llvm-svn: 193214	2013-10-23 00:44:24 +00:00
Wan Xiaofei	2f8dc08b8c	Using FoldingSet in SelectionDAG::getVTList. VTList has a long life cycle through the module and getVTList is called frequently. The current getVTList uses a sequential search over a std::vector, which is inefficient in big modules. This patch uses FoldingSet to implement a hashing mechanism for the search. Reviewer: Nadav Rotem Test: Pass unit tests & LNT test suite llvm-svn: 193150	2013-10-22 08:02:02 +00:00
Matt Arsenault	b768912db8	Fix CodeGen for different size address space GEPs llvm-svn: 193111	2013-10-21 20:03:54 +00:00
Matt Arsenault	bbd24901cf	Reuse variable llvm-svn: 193107	2013-10-21 19:24:15 +00:00
Bill Schmidt	3684fdd59f	[PATCH] Fix PR17168 (DAG scheduler inserts DBG_VALUE before PHI with fast-isel) PR17168 describes a test case that fails when compiling for debug with fast-isel. Investigation showed that the test was failing because a DBG_VALUE machine instruction was placed prior to a PHI. For this problem to occur requires the following: * Compile for debug * Compile with fast-isel * In a block B, fast-isel must partially succeed before punting to DAG-isel * B must start with a PHI * The first unhandled node in the DAG must not generate a machine instruction * A debug value with an order less than that of that first node exists When all of these circumstances apply, the existing test that an instruction was not inserted won't fire. Currently it tests whether the block is empty, or whether the last instruction generated is a phi. When fast-isel has partially succeeded, the last instruction generated will not be a phi. Instead, we need to check whether the current insert position is immediately following a phi. This patch adds that check, and adds the test case from the PR as a regression test. llvm-svn: 192976	2013-10-18 14:20:11 +00:00
David Majnemer	451b7dd1ef	CodeGen: Emit a libcall if the target doesn't support 16-byte wide atomics There are targets that support i128 sized scalars but cannot emit instructions that modify them directly. The proper thing to do is to emit a libcall. This fixes PR17481. llvm-svn: 192957	2013-10-18 08:03:43 +00:00
Richard Sandiford	95f7ba988b	Replace sra with srl if a single sign bit is required E.g. (and (sra (i32 x) 31) 2) -> (and (srl (i32 x) 30) 2). llvm-svn: 192884	2013-10-17 11:16:57 +00:00
Andrea Di Biagio	561badf717	Fix edge condition in DAGCombiner to improve codegen of shift sequences. When canonicalizing dags according to the rule (shl (zext (shr X, c1) ), c1) ==> (zext (shl (shr X, c1), c1)) remember to add the new shl dag to the DAGCombiner worklist of nodes. If we don't explicitly add it to the worklist of nodes to visit, we may not trigger later on the rule that folds the shift left + logical shift right into a AND instruction with bitmask. llvm-svn: 192883	2013-10-17 11:02:58 +00:00
Jack Carter	d4e9615d1c	[projects/test-suite] White space and long line fixes. No functionality changes. llvm-svn: 192863	2013-10-17 01:34:33 +00:00
Benjamin Kramer	00eb07b791	DAGCombiner: Don't fold xor into not if getNOT would introduce an illegal constant. This happens e.g. with <2 x i64> -1 on x86_32. It cannot be generated directly because i64 is illegal. It would be nice if getNOT would handle this transparently, but I don't see a way to generate a legal constant there right now. Fixes PR17487. llvm-svn: 192795	2013-10-16 14:16:19 +00:00
Richard Sandiford	374a0e50c4	Handle (shl (anyext (shr ...))) in SimplifyDemandedBits This is really an extension of the current (shl (shr ...)) -> shl optimization. The main difference is that certain upper bits must also not be demanded. The motivating examples are the first two in the testcase, which occur in llvmpipe output. llvm-svn: 192783	2013-10-16 10:26:19 +00:00
David Blaikie	6004dbc9fa	Fix indenting. That wasn't confusing /at all/... llvm-svn: 192617	2013-10-14 20:15:04 +00:00
Elena Demikhovsky	82a46ebe0a	Fixed a bug in dynamic memory allocation on the stack. The alignment of allocated space was wrong, see Bugzilla 17345. Done by Zvi Rackover <zvi.rackover@intel.com>. llvm-svn: 192573	2013-10-14 07:26:51 +00:00
Will Dietz	ae726a93e3	TargetLowering: Don't index into empty string. (This is triggered by current lit tests) llvm-svn: 192549	2013-10-13 03:08:49 +00:00
Quentin Colombet	de0e06234c	[DAGCombiner] Reapply load slicing (192471) with a test that explicitly sets sse4.2 support. This should fix the buildbots. Original commit message: [DAGCombiner] Slice a big load into two loads when the elements are next to each other in memory and the target has paired load and performs post-isel loads combining. E.g., this optimization will transform something like this: a = load i64* addr b = trunc i64 a to i32 c = lshr i64 a, 32 d = trunc i64 c to i32 into: b = load i32* addr1 d = load i32* addr2 Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and performs post-isel loads combining. One should overload TargetLowering::hasPairedLoad to provide this information. The default is false. <rdar://problem/14477220> llvm-svn: 192476	2013-10-11 18:29:42 +00:00
Quentin Colombet	5aee63d9e3	[DAGCombiner] Revert load slicing (r192471), until I figure out why it fails on ubuntu. llvm-svn: 192474	2013-10-11 18:17:17 +00:00
Quentin Colombet	41dc258f71	[DAGCombiner] Slice a big load into two loads when the elements are next to each other in memory and the target has paired load and performs post-isel loads combining. E.g., this optimization will transform something like this: a = load i64* addr b = trunc i64 a to i32 c = lshr i64 a, 32 d = trunc i64 c to i32 into: b = load i32* addr1 d = load i32* addr2 Where addr1 = addr2 +/- sizeof(i32), if the target supports paired load and performs post-isel loads combining. One should overload TargetLowering::hasPairedLoad to provide this information. The default is false. <rdar://problem/14477220> llvm-svn: 192471	2013-10-11 18:01:14 +00:00
Matt Arsenault	a98c3b1816	Use getPointerSizeInBits() rather than 8 * getPointerSize() llvm-svn: 192386	2013-10-10 19:09:05 +00:00
Craig Topper	a7afa71494	Fix some assert messages to say the correct opcode name. Looks like one assert got copy and pasted to many places. llvm-svn: 192078	2013-10-06 22:38:19 +00:00
Craig Topper	a1bbc323fa	Add OPC_CheckChildSame0-3 to the DAG isel matcher. This replaces sequences of MoveChild, CheckSame, MoveParent. Saves 846 bytes from the X86 DAG isel matcher, ~300 from ARM, ~840 from Hexagon. llvm-svn: 192026	2013-10-05 05:38:16 +00:00
Adrian Prantl	f01b562a15	Debug info: Don't crash in SelectionDAGISel when a vreg that is being pointed to by a dbg_value belonging to a function argument is eliminated during instruction selection. rdar://problem/15094721. llvm-svn: 192011	2013-10-05 00:08:27 +00:00
Hal Finkel	dbc7a8a8a3	Fix DAGCombiner::visitFP_EXTEND to ignore indexed loads DAGCombiner::visitFP_EXTEND will apply the following transformation: fold (fpext (load x)) -> (fpext (fptrunc (extload x))) but the implementation does not handle indexed loads (pre/post inc.), but did not specifically ignore them either (unlike for extending loads, which it already ignored), causing an assert when the transformation was applied to an indexed load. This is the minimal fix for correctness (causing the transformation to be skipped for indexed loads). Unfortunately, I don't have an in-tree test case. llvm-svn: 191989	2013-10-04 22:18:12 +00:00
Craig Topper	d9a6cc031d	Revert r191940 to see if it fixes the build bots. llvm-svn: 191941	2013-10-04 05:52:17 +00:00
Craig Topper	a2efe9ebc6	Add OPC_CheckChildSame0-3 to the DAG isel matcher. This replaces sequences of MoveChild, CheckSame, MoveParent. Saves 846 bytes from the X86 DAG isel matcher, ~300 from ARM, ~840 from Hexagon. llvm-svn: 191940	2013-10-04 05:22:20 +00:00
Jin-Gu Kang	0bf8241d4b	Added code to check whether the target supports specific DAG combines for rotates. The corresponding DAG patterns are as follows: "DAGCombiner::MatchRotate" function in DAGCombiner.cpp Pattern1 // fold (or (shl (ext x), (ext y)), // (srl (ext x), (ext (sub 32, y)))) -> // (ext (rotl x, y)) // fold (or (shl (ext x), (ext y)), // (srl (ext x), (ext (sub 32, y)))) -> // (ext (rotr x, (sub 32, y))) Pattern2 // fold (or (shl (ext x), (ext (sub 32, y))), // (srl (ext x), (ext y))) -> // (ext (rotl x, y)) // fold (or (shl (ext x), (ext (sub 32, y))), // (srl (ext x), (ext y))) -> // (ext (rotr x, (sub 32, y))) llvm-svn: 191905	2013-10-03 15:58:48 +00:00
Rafael Espindola	44fee4e0eb	Remove several unused variables. Patch by Alp Toker. llvm-svn: 191757	2013-10-01 13:32:03 +00:00
Tom Stellard	6aada32dc4	SelectionDAG: Clarify comments from r191600 llvm-svn: 191724	2013-10-01 02:09:00 +00:00
Benjamin Kramer	c3c807b3bf	Allocate AtomicSDNode operands in SelectionDAG's allocator to stop leakage. SDNode destructors are never called. As an optimization use AtomicSDNode's internal storage if we have a small number of operands. llvm-svn: 191636	2013-09-29 11:18:56 +00:00
Robert Wilhelm	f0cfb83bb4	Fix spelling intruction -> instruction. llvm-svn: 191610	2013-09-28 11:46:15 +00:00
Tom Stellard	45015d9796	SelectionDAG: Silence unused variable warning on release builds llvm-svn: 191604	2013-09-28 03:10:17 +00:00
Tom Stellard	5694d3090a	SelectionDAG: Improve legalization of SELECT_CC with illegal condition codes SelectionDAG will now attempt to invert an illegal condition in order to find a legal one and if that doesn't work, it will attempt to swap the operands using the inverted condition. There are no new test cases for this, but a number of the existing R600 tests hit this path. llvm-svn: 191602	2013-09-28 02:50:43 +00:00
Tom Stellard	cd42818d86	SelectionDAG: Try to expand all condition codes using getCCSwappedOperands() This is useful for targets like R600, which only support GT, GE, NE, and EQ condition codes as it removes the need to handle unsupported condition codes in target specific code. There are no tests with this commit, but R600 has been updated to take advantage of this new feature, so its existing selectcc tests are now testing the swapped operands path. llvm-svn: 191601	2013-09-28 02:50:38 +00:00
Tom Stellard	08690a146f	SelectionDAG: Clean up LegalizeSetCCCondCode() function Interpreting the results of this function is not very intuitive, so I cleaned it up to make it more clear whether or not a SETCC op was legalized and how it was legalized (either by swapping LHS and RHS or replacing with AND/OR). This patch does change functionality in the LHS and RHS swapping case, but unfortunately there are no in-tree tests for this. However, this patch is a prerequisite for R600 to take advantage of the LHS and RHS swapping, so tests will be added in subsequent commits. llvm-svn: 191600	2013-09-28 02:50:32 +00:00
Andrea Di Biagio	56ce9c4e78	Re-apply the change from r191393 with fix for pr17380. This change fixes the problem reported in pr17380 and re-add the dagcombine transformation ensuring that the value types are always legal if the transformation is triggered after Legalization took place. Added the test case from pr17380. llvm-svn: 191509	2013-09-27 11:37:05 +00:00
Andrea Di Biagio	549d6605a0	Revert r191393 since it caused pr17380. llvm-svn: 191438	2013-09-26 16:54:01 +00:00
Amara Emerson	b4ad2f396a	[ARM] Use the load-acquire/store-release instructions optimally in AArch32. Patch by Artyom Skrobov. llvm-svn: 191428	2013-09-26 12:22:36 +00:00
Andrew Trick	71e8bb6d1d	Added temp flag -misched-bench for staging in default changes. llvm-svn: 191423	2013-09-26 05:53:35 +00:00
Andrew Trick	6f5aad7a24	whitespace llvm-svn: 191422	2013-09-26 05:53:31 +00:00
Andrea Di Biagio	9f3313109f	Teach DAGCombiner how to canonicalize dags according to the rule (shl (zext (shr A, X)), X) => (zext (shl (shr A, X), X)). The rule only triggers when there are no other uses of the zext to avoid materializing more instructions. This helps the DAGCombiner understand that the shl/shr sequence can then be converted into an and instruction. llvm-svn: 191393	2013-09-25 19:01:01 +00:00
Eli Friedman	a961d694e2	Add missing check to SETCC optimization. PR17338. llvm-svn: 191337	2013-09-24 22:50:14 +00:00
Benjamin Kramer	64bdb29a83	DAGCombiner: Unify rotate matching for extended and unextended amounts. No functionality change, lots of indentation changes. llvm-svn: 191303	2013-09-24 14:21:28 +00:00
Jiangning Liu	63dc840fc5	Initial support for Neon scalar instructions. Patch by Ana Pazos. 1.Added support for v1ix and v1fx types. 2.Added Scalar Pairwise Reduce instructions. 3.Added initial implementation of Scalar Arithmetic instructions. llvm-svn: 191263	2013-09-24 02:47:27 +00:00
Michael Gottesman	5e3600c1ce	[stackprotector] Allow for copies from vreg -> vreg to be in a terminator sequence. Sometimes a copy from a vreg -> vreg sneaks into the middle of a terminator sequence. It is safe to slice this into the stack protector success bb. This fixes PR16979. llvm-svn: 191260	2013-09-24 01:50:26 +00:00
Kay Tiong Khoo	9195a5b081	fix typo: than -> then llvm-svn: 191214	2013-09-23 18:43:51 +00:00
Tim Northover	31d093c705	ISelDAG: spot chain cycles involving MachineNodes Previously, the DAGISel function WalkChainUsers was spotting that it had entered already-selected territory by whether a node was a MachineNode (amongst other things). Since it's fairly common practice to insert MachineNodes during ISelLowering, this was not the correct check. Looking around, it seems that other nodes get their NodeId set to -1 upon selection, so this makes sure the same thing happens to all MachineNodes and uses that characteristic to determine whether we should stop looking for a loop during selection. This should fix PR15840. llvm-svn: 191165	2013-09-22 08:21:56 +00:00
Juergen Ributzka	f043a65327	Revert "SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too." This reverts commit r191130. llvm-svn: 191138	2013-09-21 15:09:46 +00:00
Juergen Ributzka	e9a80fc912	SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too. The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask for the given target. This mask usually has the same size as the VSELECT return type (except for Intel KNL). Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. llvm-svn: 191130	2013-09-21 04:55:18 +00:00
David Blaikie	9d117ab7ef	Add braces to suppress Clang's dangling-else warning. These violations were introduced in r191049 llvm-svn: 191059	2013-09-20 00:33:11 +00:00
Kai Nacke	d09bb4614b	PR16726: extend rol/ror matching C-like languages promote types like unsigned short to unsigned int before performing an arithmetic operation. Currently the rotate matcher in the DAGCombiner does not consider this situation. This commit extends the DAGCombiner so that the pattern (or (shl ([az]ext x), (ext y)), (srl ([az]ext x), (ext (sub 32, y)))) is folded into ([az]ext (rotl x, y)) The matching is restricted to aext and zext because in these cases the upper bits are either undefined or known. Test case is included. This fixes PR16726. llvm-svn: 191049	2013-09-19 23:00:28 +00:00
Kai Nacke	2d967b2751	Revert PR16726: extend rol/ror matching There is a buildbot failure. Need to investigate this. llvm-svn: 191048	2013-09-19 22:53:36 +00:00
Kai Nacke	4eaf6444fa	PR16726: extend rol/ror matching C-like languages promote types like unsigned short to unsigned int before performing an arithmetic operation. Currently the rotate matcher in the DAGCombiner does not consider this situation. This commit extends the DAGCombiner so that the pattern (or (shl ([az]ext x), (ext y)), (srl ([az]ext x), (ext (sub 32, y)))) is folded into ([az]ext (rotl x, y)) The matching is restricted to aext and zext because in these cases the upper bits are either undefined or known. Test case is included. This fixes PR16726. llvm-svn: 191045	2013-09-19 22:36:39 +00:00
Benjamin Kramer	d443e4a080	DAGCombiner: Don't fold vector muls with constants that look like a splat of a power of 2 but differ in bit width. PR17283. llvm-svn: 191000	2013-09-19 13:28:20 +00:00
Adrian Prantl	262bcf4584	Debug info: Get rid of the VLA indirection hack in FastISel. Use the DIVariable::isIndirect() flag set by the frontend instead of guessing whether to set the machine location's indirection bit. Paired commit with CFE. llvm-svn: 190961	2013-09-18 22:08:59 +00:00
Serge Pavlov	8ec39992c1	Added documentation to getMemsetStores. llvm-svn: 190866	2013-09-17 16:24:42 +00:00
Quentin Colombet	d30a9585b8	[SelectionDAG] Teach the vector scalarizer about TRUNCATE. When a truncate node defines a legal vector type but uses an illegal vector type, the legalization process was splitting the vector until <1 x vector> type, but then it was failing to scalarize the node because it did not know how to handle TRUNCATE. <rdar://problem/14989896> llvm-svn: 190830	2013-09-17 00:26:56 +00:00
Adrian Prantl	db3e26d193	Debug info: Fix PR16736 and rdar://problem/14990587. A DBG_VALUE is register-indirect iff the first operand is a register _and_ the second operand is an immediate. llvm-svn: 190821	2013-09-16 23:29:03 +00:00
Hal Finkel	31658834e6	Prevent assert in CombinerGlobalAA with null values DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we can't use AA in this case (if we try, then the casting code in AA will assert). llvm-svn: 190763	2013-09-15 02:19:49 +00:00
Matt Arsenault	bc08ddba58	Remove pointless assertion after r190376 llvm-svn: 190565	2013-09-12 01:07:49 +00:00
Benjamin Kramer	079b96e6f7	Revert "Give internal classes hidden visibility." It works with clang, but GCC has different rules so we can't make all of those hidden. This reverts commit r190534. llvm-svn: 190536	2013-09-11 18:05:11 +00:00
Benjamin Kramer	6a44af3629	Give internal classes hidden visibility. Worth 100k on a linux/x86_64 Release+Asserts clang. llvm-svn: 190534	2013-09-11 17:42:27 +00:00
Eli Friedman	8f06d55697	Rename variables for consistency. No functional change. llvm-svn: 190466	2013-09-11 00:41:02 +00:00
Eli Friedman	78bffa5767	Fix unused variables. llvm-svn: 190448	2013-09-10 23:18:14 +00:00
Matt Arsenault	d232222f34	Don't use getSetCCResultType for creating a vselect The vselect mask isn't a setcc. This breaks in the case when the result of getSetCCResultType is larger than the vector operands e.g. %tmp = select i1 %cmp <2 x i8> %a, <2 x i8> %b when getSetCCResultType returns <2 x i32>, the assertion that the (MaskTy.getSizeInBits() == Op1.getValueType().getSizeInBits()) is hit. No test since I don't think I can hit this with any of the current targets. The R600/SI implementation would break, since it returns a vector of i1 for this, but it doesn't reach ExpandSELECT for other reasons. llvm-svn: 190376	2013-09-10 00:41:56 +00:00
Jack Carter	170a5f2983	white spaces and long lines llvm-svn: 190358	2013-09-09 22:02:08 +00:00
Bob Wilson	e407736a06	Revert patches to add case-range support for PR1255. The work on this project was left in an unfinished and inconsistent state. Hopefully someone will eventually get a chance to implement this feature, but in the meantime, it is better to put things back the way they were. I have left support in the bitcode reader to handle the case-range bitcode format, so that we do not lose bitcode compatibility with the llvm 3.3 release. This reverts the following commits: 155464, 156374, 156377, 156613, 156704, 156757, 156804 156808, 156985, 157046, 157112, 157183, 157315, 157384, 157575, 157576, 157586, 157612, 157810, 157814, 157815, 157880, 157881, 157882, 157884, 157887, 157901, 158979, 157987, 157989, 158986, 158997, 159076, 159101, 159100, 159200, 159201, 159207, 159527, 159532, 159540, 159583, 159618, 159658, 159659, 159660, 159661, 159703, 159704, 160076, 167356, 172025, 186736 llvm-svn: 190328	2013-09-09 19:14:35 +00:00
Tim Northover	950fcc0577	SelectionDAG: create correct BooleanContent constants Occasionally DAGCombiner can spot that a SETCC operation is completely redundant and reduce it to "all true" or "all false". If this happens to a vector, the value produced has to take account of what a normal comparison would have produced, which may be an all-1s bitmask. The fix in SelectionDAG.cpp is tested, however, as far as I can see the code in TargetLowering.cpp is possibly unreachable and almost certainly irrelevant when triggered so there are no tests. However, I believe it's still clearly the right change and may save someone else some hassle if it suddenly becomes reachable. So I'm doing it anyway. llvm-svn: 190147	2013-09-06 12:38:12 +00:00
Hal Finkel	5ef4dccdce	Use TargetSubtargetInfo::useAA() in DAGCombine This uses the TargetSubtargetInfo::useAA() function to control the defaults of the -combiner-alias-analysis and -combiner-global-alias-analysis options. llvm-svn: 189564	2013-08-29 03:29:55 +00:00
Juergen Ributzka	11c52c601a	Fix a typo and coding style of a previous commit. No functional change. llvm-svn: 189526	2013-08-28 22:33:58 +00:00
Tim Northover	819bfb5a25	DAGCombiner: make sure or/shl/srl really has zero high bits before forming bswap We want to convert code like (or (srl N, 8), (shl N, 8)) into (srl (bswap N), const), but this is only valid if the bits above 16 on the source pattern are 0. The checks we were doing on this were slightly wrong before. llvm-svn: 189348	2013-08-27 13:46:45 +00:00
Owen Anderson	a0260f848d	Remove an over-zealous assertion. A pointer type could be illegal if the target is prepared to custom-legalize pointer operands. This assertion was evaluated before the target would have a chance to do so, making it impossible. llvm-svn: 189299	2013-08-27 00:28:23 +00:00
Tom Stellard	838e2344ec	SelectionDAG: Remove unnecessary uses of TargetLowering::getPointerTy() If we have a binary operation like ISD::ADD, we can set the result type equal to the result type of one of its operands rather than using TargetLowering::getPointerTy(). Also, any use of DAG.getIntPtrConstant(C) as an operand for a binary operation can be replaced with: DAG.getConstant(C, OtherOperand.getValueType()); llvm-svn: 189227	2013-08-26 15:06:10 +00:00
Tom Stellard	7da047c9fb	SelectionDAG: Use correct pointer size when splitting vector stores llvm-svn: 189224	2013-08-26 15:05:55 +00:00
Tom Stellard	fd155828ed	SelectionDAG: Use correct pointer size when lowering function arguments v2 This adds minimal support to the SelectionDAG for handling address spaces with different pointer sizes. The SelectionDAG should now correctly lower pointer function arguments to the correct size as well as generate the correct code when lowering getelementptr. This patch also updates the R600 DataLayout to use 32-bit pointers for the local address space. v2: - Add more helper functions to TargetLoweringBase - Use CHECK-LABEL for tests llvm-svn: 189221	2013-08-26 15:05:36 +00:00
Benjamin Kramer	b12cf01908	Add a function object to compare the first or second component of a std::pair. Replace instances of this scattered around the code base. llvm-svn: 189169	2013-08-24 12:54:27 +00:00
Michael Gottesman	20f25eb958	[stack protector] Work around an issue with the BMOVPCB_CALL instruction on ARM by disabling does not return on __stack_chk_fail. This is to fix the bots while I look to see if there is something I can do here. rdar://14811848 llvm-svn: 189076	2013-08-22 23:45:24 +00:00
Michael Gottesman	1adac3582d	[stackprotector] When finding the split point to splice off the end of a parentmbb into a successmbb, include any DBG_VALUE MI. Fix for PR16954. llvm-svn: 188987	2013-08-22 05:40:50 +00:00
Tom Stellard	1b2c2d8414	SelectionDAG: Make sure stores are always added to the LegalizedNodes list When truncated vector stores were being custom lowered in VectorLegalizer::LegalizeOp(), the old (illegal) and new (legal) node pair was not being added to the LegalizedNodes list. Instead of the legalized result being passed to VectorLegalizer::TranslateLegalizeResult(), the result was being passed back into VectorLegalizer::LegalizeOp(), which ended up adding a (new, new) pair to the list instead. This was causing an assertion failure when a custom lowered truncated vector store was the last instruction in a basic block and the VectorLegalizer was unable to find it in the LegalizedNodes list when updating the DAG root. llvm-svn: 188953	2013-08-21 22:42:58 +00:00
Juergen Ributzka	3db39dc1ae	Teach BaseIndexOffset::match to identify base pointers in loops. The small utility function that pattern matches Base + Index + Offset patterns for loads and stores fails to recognize the base pointer for loads/stores from/into an array at offset 0 inside a loop. As a result DAGCombiner::MergeConsecutiveStores was not able to merge all stores. This commit fixes the issue by adding an additional pattern match and also a test case. Reviewer: Nadav llvm-svn: 188936	2013-08-21 21:53:38 +00:00
Richard Sandiford	6f6d55161b	[SystemZ] Use SRST to optimize memchr SystemZTargetLowering::emitStringWrapper() previously loaded the character into R0 before the loop and made R0 live on entry. I'd forgotten that allocatable registers weren't allowed to be live across blocks at this stage, and it confused LiveVariables enough to cause a miscompilation of f3 in memchr-02.ll. This patch instead loads R0 in the loop and leaves LICM to hoist it after RA. This is actually what I'd tried originally, but I went for the manual optimisation after noticing that R0 often wasn't being hoisted. This bug forced me to go back and look at why, now fixed as r188774. We should also try to optimize null checks so that they test the CC result of the SRST directly. The select between null and the SRST GPR result could then usually be deleted as dead. llvm-svn: 188779	2013-08-20 09:38:48 +00:00
Michael Gottesman	f7e1203d95	Remove unused variables that crept in. llvm-svn: 188761	2013-08-20 07:17:27 +00:00
Michael Gottesman	b27f0f1f6b	Teach selectiondag how to handle the stackprotectorcheck intrinsic. Previously, generation of stack protectors was done exclusively in the pre-SelectionDAG Codegen LLVM IR Pass "Stack Protector". This necessitated splitting basic blocks at the IR level to create the success/failure basic blocks in the tail of the basic block in question. As a result of this, calls that would have qualified for the sibling call optimization were no longer eligible for optimization since said calls were no longer right in the "tail position" (i.e. the immediate predecessor of a ReturnInst instruction). Then it was noticed that since the sibling call optimization causes the callee to reuse the caller's stack, if we could delay the generation of the stack protector check until later in CodeGen after the sibling call decision was made, we get both the tail call optimization and the stack protector check! A few goals in solving this problem were: 1. Preserve the architecture independence of stack protector generation. 2. Preserve the normal IR level stack protector check for platforms like OpenBSD for which we support platform specific stack protector generation. The main problem that guided the present solution is that one can not solve this problem in an architecture independent manner at the IR level only. This is because: 1. The decision on whether or not to perform a sibling call on certain platforms (for instance i386) requires lower level information related to available registers that can not be known at the IR level. 2. Even if the previous point were not true, the decision on whether to perform a tail call is done in LowerCallTo in SelectionDAG which occurs after the Stack Protector Pass. As a result, one would need to put the relevant callinst into the stack protector check success basic block (where the return inst is placed) and then move it back later at SelectionDAG/MI time before the stack protector check if the tail call optimization failed. 
The MI level option was nixed immediately since it would require platform specific pattern matching. The SelectionDAG level option was nixed because SelectionDAG only processes one IR level basic block at a time implying one could not create a DAG Combine to move the callinst. To get around this problem a few things were realized: 1. While one can not handle multiple IR level basic blocks at the SelectionDAG Level, one can generate multiple machine basic blocks for one IR level basic block. This is how we handle bit tests and switches. 2. At the MI level, tail calls are represented via a special return MIInst called "tcreturn". Thus if we know the basic block in which we wish to insert the stack protector check, we get the correct behavior by always inserting the stack protector check right before the return statement. This is a "magical transformation" since no matter where the stack protector check intrinsic is, we always insert the stack protector check code at the end of the BB. Given the aforementioned constraints, the following solution was devised: 1. On platforms that do not support SelectionDAG stack protector check generation, allow for the normal IR level stack protector check generation to continue. 2. On platforms that do support SelectionDAG stack protector check generation: a. Use the IR level stack protector pass to decide if a stack protector is required/which BB we insert the stack protector check in by reusing the logic already therein. If we wish to generate a stack protector check in a basic block, we place a special IR intrinsic called llvm.stackprotectorcheck right before the BB's returninst or if there is a callinst that could potentially be sibling call optimized, before the call inst. b. Then when a BB with said intrinsic is processed, we codegen the BB normally via SelectBasicBlock. In said process, when we visit the stack protector check, we do not actually emit anything into the BB. 
Instead, we just initialize the stack protector descriptor class (which involves stashing information/creating the success mbb and the failure mbb if we have not created one for this function yet) and export the guard variable that we are going to compare. c. After we finish selecting the basic block, in FinishBasicBlock if the StackProtectorDescriptor attached to the SelectionDAGBuilder is initialized, we first find a splice point in the parent basic block before the terminator and then splice the terminator of said basic block into the success basic block. Then we code-gen a new tail for the parent basic block consisting of the two loads, the comparison, and finally two branches to the success/failure basic blocks. We conclude by code-gening the failure basic block if we have not code-gened it already (all stack protector checks we generate in the same function use the same failure basic block). llvm-svn: 188755	2013-08-20 07:00:16 +00:00
Hal Finkel	0c5c01aa4a	Add a llvm.copysign intrinsic This adds a llvm.copysign intrinsic; we already have Libfunc recognition for copysign (which is turned into the FCOPYSIGN SDAG node). In order to autovectorize calls to copysign in the loop vectorizer, we need a corresponding intrinsic as well. In addition to the expected changes to the language reference, the loop vectorizer, BasicTTI, and the SDAG builder (the intrinsic is transformed into an FCOPYSIGN node, just like the function call), this also adds FCOPYSIGN to a few lists in LegalizeVector{Ops,Types} so that vector copysigns can be expanded. In TargetLoweringBase::initActions, I've made the default action for FCOPYSIGN be Expand for vector types. This seems correct for all in-tree targets, and I think is the right thing to do because, previously, there was no way to generate vector-valued FCOPYSIGN nodes (and most targets don't specify an action for vector-typed FCOPYSIGN). llvm-svn: 188728	2013-08-19 23:35:46 +00:00
Paul Redmond	62f840f46a	Improve the widening of integral binary vector operations - split WidenVecRes_Binary into WidenVecRes_Binary and WidenVecRes_BinaryCanTrap - WidenVecRes_BinaryCanTrap preserves the original behaviour for operations that can trap - WidenVecRes_Binary simply widens the operation and improves codegen for 3-element vectors by allowing widening and promotion on x86 (matches the behaviour of unary and ternary operation widening) - use WidenVecRes_Binary for operations on integers. Reviewed by: nrotem llvm-svn: 188699	2013-08-19 20:01:35 +00:00
Hal Finkel	e4eb78188c	Add ExpandFloatOp_FCOPYSIGN to handle ppcf128-related expansions We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the sum of two doubles, and the first double must have the larger magnitude, we can take the sign from the first double. As a result, in addition to fixing the crash, this is also an optimization. llvm-svn: 188655	2013-08-19 06:55:37 +00:00
Jim Grosbach	06c2a68125	ARM: Fix more fast-isel verifier failures. Teach the generic instruction selection helper functions to constrain the register classes of their input operands. For non-physical register references, the generic code needs to be careful not to mess that up when replacing references to result registers. As the comment indicates for MachineRegisterInfo::replaceRegWith(), it's important to call constrainRegClass() first. rdar://12594152 llvm-svn: 188593	2013-08-16 23:37:31 +00:00
Richard Sandiford	0dec06a28c	[SystemZ] Use SRST to implement strlen and strnlen It would also make sense to use it for memchr; I'm working on that now. llvm-svn: 188547	2013-08-16 11:41:43 +00:00
Richard Sandiford	bb83a50f57	[SystemZ] Use MVST to implement strcpy and stpcpy llvm-svn: 188546	2013-08-16 11:29:37 +00:00
Richard Sandiford	ca23271010	[SystemZ] Use CLST to implement strcmp llvm-svn: 188544	2013-08-16 11:21:54 +00:00
Richard Sandiford	e3827751e2	[SystemZ] Fix handling of 64-bit memcmp results Generalize r188163 to cope with return types other than MVT::i32, just as the existing visitMemCmpCall code did. I've split this out into a subroutine so that it can be used for other upcoming patches. I also noticed that I'd used the wrong API to record the out chain. It's a load that uses DAG.getRoot() rather than getRoot(), so the out chain should go on PendingLoads. I don't have a testcase for that because we don't do any interesting scheduling on z yet. llvm-svn: 188540	2013-08-16 10:55:47 +00:00
Craig Topper	d9c2783d8f	Replace getValueType().getSimpleVT() with getSimpleValueType(). llvm-svn: 188442	2013-08-15 02:44:19 +00:00
Jim Grosbach	327ccc787e	DAG: Combine (and (setne X, 0), (setne X, -1)) -> (setuge (add X, 1), 2) A common idiom is to use zero and all-ones as sentinel values and to check for both in a single conditional ("x != 0 && x != (unsigned)-1"). That generates code, for i32, like: testl %edi, %edi setne %al cmpl $-1, %edi setne %cl andb %al, %cl With this transform, we generate the simpler: incl %edi cmpl $1, %edi seta %al Similar improvements for other integer sizes and on other platforms. In general, combining the two setcc instructions into one is better. rdar://14689217 llvm-svn: 188315	2013-08-13 21:30:58 +00:00
Michael Gottesman	7a8017290a	Update makeLibCall to return both the call and the chain associated with the libcall instead of just the call. This allows us to specify libcalls that return void. LowerCallTo returns a pair with the return value of the call as the first element and the chain associated with the return value as the second element. If we lower a call that has a void return value, LowerCallTo returns an SDValue with a NULL SDNode and the chain for the call. Thus makeLibCall by just returning the first value makes it impossible for you to set up the chain so that the call is not eliminated as dead code. I also updated all references to makeLibCall to reflect the new return type. llvm-svn: 188300	2013-08-13 17:54:56 +00:00
Michael Gottesman	3923bec37b	Fixed SelectionDAGBuilder.h C++ filetype declaration to use the canonical C++ instead of c++. llvm-svn: 188203	2013-08-12 21:02:02 +00:00
Richard Sandiford	564681c88d	[SystemZ] Use CLC and IPM to implement memcmp For now this is restricted to fixed-length comparisons with a length in the range [1, 256], as for memcpy() and MVC. llvm-svn: 188163	2013-08-12 10:28:10 +00:00
Craig Topper	0ecb26a79e	Change asserts at the top of getVectorShuffle to check that LHS and RHS have the same type as the result. Previously the asserts were only checking that RHS and LHS were the same type and had the same element type as the result. All downstream code for ISD::VECTOR_SHUFFLE requires the types to be the same. Also removed one unnecessary check of matched element counts that was present in the code. llvm-svn: 188051	2013-08-09 04:37:24 +00:00
Craig Topper	9a39b07a60	Remove AllUndef check from one of the loops in getVectorShuffle. It was already handled by the 'AllLHS && AllRHS' check after the previous loop. llvm-svn: 187965	2013-08-08 08:03:12 +00:00
Craig Topper	309dfefb6f	Optimize mask generation for one of the DAG combiner shufflevector cases. llvm-svn: 187961	2013-08-08 07:38:55 +00:00
Hal Finkel	171817ee8a	Add ISD::FROUND for libm round() All libm floating-point rounding functions, except for round(), had their own ISD nodes. Recent PowerPC cores have an instruction for round(), and so here I'm adding ISD::FROUND so that round() can be custom lowered as well. For the most part, this is straightforward. I've added an intrinsic and a matching ISD node just like those for nearbyint() and friends. The SelectionDAG pattern I've named frnd (because ISD::FP_ROUND has already claimed fround). This will be used by the PowerPC backend in a follow-up commit. llvm-svn: 187926	2013-08-07 22:49:12 +00:00
Tom Stellard	d42c594960	TargetLowering: Add getVectorIdxTy() function v2 This virtual function can be implemented by targets to specify the type to use for the index operand of INSERT_VECTOR_ELT, EXTRACT_VECTOR_ELT, INSERT_SUBVECTOR, EXTRACT_SUBVECTOR. The default implementation returns the result from TargetLowering::getPointerTy() The previous code was using TargetLowering::getPointerTy() for vector indices, because this is guaranteed to be legal on all targets. However, using TargetLowering::getPointerTy() can be a problem for targets with pointer sizes that differ across address spaces. On such targets, when vectors need to be loaded or stored to an address space other than the default 'zero' address space (which is the address space assumed by TargetLowering::getPointerTy()), having an index that is a different size than the pointer can lead to inefficient pointer calculations, (e.g. 64-bit adds for a 32-bit address space). There is no intended functionality change with this patch. llvm-svn: 187748	2013-08-05 22:22:01 +00:00
Eric Christopher	e6656ac870	Fix crashing on invalid inline asm with matching constraints. For a testcase like the following: typedef unsigned long uint64_t; typedef struct { uint64_t lo; uint64_t hi; } blob128_t; void add_128_to_128(const blob128_t in, blob128_t res) { asm ("PAND %1, %0" : "+Q"(res) : "Q"(in)); } where we'll fail to allocate the register for the output constraint, our matching input constraint will not find a register to match, and could try to search past the end of the current operands array. On the idea that we'd like to attempt to keep compilation going to find more errors in the module, change the error cases when we're visiting inline asm IR to return immediately and avoid trying to create a node in the DAG. This leaves us with only a single error message per inline asm instruction, but allows us to safely keep going in the general case. llvm-svn: 187470	2013-07-31 01:26:24 +00:00
Eric Christopher	029af15086	Reflow this to be easier to read. llvm-svn: 187459	2013-07-30 22:50:44 +00:00
Quentin Colombet	6bf4baa408	[DAGCombiner] insert_vector_elt: Avoid building a vector twice. This patch prevents the following combine when the input vector is used more than once. insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx => build_vector elt0, ..., NewEltIdx, ..., eltN The reasons are: - Building a vector may be expensive, so try to reuse the existing part of a vector instead of creating a new one (think big vectors). - elt0 to eltN now have two users instead of one. This may prevent some other optimizations. llvm-svn: 187396	2013-07-30 00:24:09 +00:00
Nick Lewycky	0b68245ec8	Reimplement isPotentiallyReachable to make nocapture deduction much stronger. Adds unit tests for it too. Split BasicBlockUtils into an analysis-half and a transforms-half, and put the analysis bits into a new Analysis/CFG.{h,cpp}. Promote isPotentiallyReachable into llvm::isPotentiallyReachable and move it into Analysis/CFG. llvm-svn: 187283	2013-07-27 01:24:00 +00:00
Justin Holewinski	d3f2035a3c	Add a target legalize hook for SplitVectorOperand (again) CustomLowerNode was not being called during SplitVectorOperand, meaning custom legalization could not be used by targets. This also adds a test case for NVPTX that depends on this custom legalization. Differential Revision: http://llvm-reviews.chandlerc.com/D1195 Attempt to fix the buildbots by making the X86 test I just added platform independent llvm-svn: 187202	2013-07-26 13:28:29 +00:00
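
Note on 327ccc787e (r188315): the combine (and (setne X, 0), (setne X, -1)) -> (setuge (add X, 1), 2) relies on unsigned wrap-around, which sends all-ones to 0 and zero to 1, so only the two sentinel values end up below 2. The following is a minimal sketch of that arithmetic in plain C++, not the DAGCombiner code itself. All function names here are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Source form from the commit message: x != 0 && x != (unsigned)-1.
bool notSentinelTwoCompares(uint32_t x) {
  return x != 0u && x != UINT32_MAX;
}

// Combined form: add 1, then one unsigned "greater or equal to 2" compare.
// Unsigned wrap-around maps all-ones to 0 and zero to 1, so exactly the
// two sentinel values fail the comparison.
bool notSentinelOneCompare(uint32_t x) {
  return x + 1u >= 2u;
}

// Exhaustively compare both forms at 16 bits, where enumerating the whole
// input space is cheap. The cast back to uint16_t models the i16 add.
bool formsAgreeForAllI16() {
  for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
    const uint16_t x = static_cast<uint16_t>(v);
    const bool two = (x != 0) && (x != 0xFFFF);
    const bool one = static_cast<uint16_t>(x + 1) >= 2;
    if (two != one)
      return false;
  }
  return true;
}

// Spot-check the 32-bit forms at the boundaries and run the i16 sweep.
void checkSentinelCombine() {
  assert(formsAgreeForAllI16());
  const uint32_t samples[] = {0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u,
                              UINT32_MAX - 1u, UINT32_MAX};
  for (uint32_t x : samples)
    assert(notSentinelTwoCompares(x) == notSentinelOneCompare(x));
}
```

Calling checkSentinelCombine() from a test driver exercises both widths. The 16-bit sweep is exhaustive, while the 32-bit check only samples boundary values.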

6324 Commits