llvm-project

Commit Graph

Author	SHA1	Message	Date
Ahmed Bougacha	16547c4e31	[CodeGen] Round [SU]INT_TO_FP result when promoting from f16. If we don't, values that aren't precisely representable in f16 could be used as-is in a promoted f32 operation, which would produce incorrect results. AArch64 had the correct behavior; add a focused test. Fixes http://llvm.org/PR26871 llvm-svn: 268700	2016-05-06 00:58:00 +00:00
Justin Bogner	b012699741	SDAG: Rename Select->SelectImpl and repurpose Select as returning void This is a step towards removing the rampant undefined behaviour in SelectionDAG, which is a part of llvm.org/PR26808. We rename SelectionDAGISel::Select to SelectImpl and update targets to match, and then change Select to return void and consolidate the sketchy behaviour we're trying to get away from there. Next, we'll update backends to implement `void Select(...)` instead of SelectImpl and eventually drop the base Select implementation. llvm-svn: 268693	2016-05-05 23:19:08 +00:00
Justin Bogner	465886ece1	SDAG: Remove OPC_MarkGlueResults and associated logic. NFC This opcode never happens in practice, and yet the logic we have in place to handle it would be undefined behaviour if we ever executed it. Remove it rather than trying to refactor code that's never reached. llvm-svn: 268692	2016-05-05 22:37:45 +00:00
Sanjay Patel	c91351c2b7	clean up; NFCI llvm-svn: 268564	2016-05-04 22:39:36 +00:00
Simon Pilgrim	1f5ad702f8	[SelectionDAG] BITREVERSE vector legalization of bit operations (REAPPLIED) Some vector bit operations are promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use a new TLI helper isOperationLegalOrCustomOrPromote instead, allowing the SSE implementations to stay on the simd unit. Differential Revision: http://reviews.llvm.org/D19805 llvm-svn: 268561	2016-05-04 22:08:51 +00:00
Simon Pilgrim	1a14f0d25c	Revert r268504 llvm-svn: 268526	2016-05-04 17:49:14 +00:00
Simon Pilgrim	b97c06210b	[SelectionDAG] BITREVERSE vector legalization of bit operations Vector bit operations are typically promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use isOperationLegalOrPromote instead, allowing the SSE implementations to stay on the simd unit. Differential Revision: http://reviews.llvm.org/D19805 llvm-svn: 268504	2016-05-04 15:01:13 +00:00
Craig Topper	3fc0e668ff	[CodeGen] Add some space optimized forms of EmitNode and MorphNodeTo that implicitly indicate the number of result VTs. This shaves about 16K off the X86 matching table taking it down to about 470K. Overall this reduces the llc binary size with all in-tree targets by about 40K. llvm-svn: 268365	2016-05-03 05:54:13 +00:00
Wolfgang Pieb	56aa4b0629	DebugInfo: Avoid propagating incorrect debug locations in SelectionDAG via CSE. Summary: When SelectionDAG performs CSE it is possible that the context's source location is different from that of the selected node. This can lead to incorrect line number records. We update the debug location to the one that occurs earlier in the instruction sequence. This fixes PR21006. Reviewers: echristo, sdmitrouk Subscribers: jevinskie, asl, llvm-commits Differential Revision: http://reviews.llvm.org/D12094 llvm-svn: 268323	2016-05-02 22:50:51 +00:00
Eric Christopher	94a9ee65c6	Fix grammar and correct comment - the debug information wasn't incorrect, rather suboptimal. llvm-svn: 268211	2016-05-02 05:30:26 +00:00
Craig Topper	e3c1e225d7	[CodeGen] Add OPC_MoveChild0-OPC_MoveChild7 opcodes to isel matching tables to optimize table size. Shaves about 12K off the X86 matcher table. llvm-svn: 268209	2016-05-02 01:53:30 +00:00
Igor Breger	110af565c7	getelementptr instruction, support index vector of EVT. Differential Revision: http://reviews.llvm.org/D19775 llvm-svn: 268195	2016-05-01 13:29:12 +00:00
Matt Arsenault	ab2232cf73	DAGCombiner: Reduce truncated shl width llvm-svn: 268094	2016-04-29 19:53:16 +00:00
Simon Pilgrim	464f1f3bea	Use SelectionDAG::getTargetConstant* helper functions. NFC. Instead of SelectionDAG::getConstant directly to make it more obvious that we're creating target constants. llvm-svn: 268074	2016-04-29 17:42:45 +00:00
Filipe Cabecinhas	0da9937517	Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050	2016-04-29 15:22:48 +00:00
Gerolf Hoflehner	50426191d7	[DAGCombiner] Follow coding convention for function name (NFC) llvm-svn: 267745	2016-04-27 17:27:16 +00:00
Nico Weber	e69b9548b8	Revert r267649, it caused PR27539. llvm-svn: 267723	2016-04-27 15:16:54 +00:00
Cong Hou	6f879d9eb1	Detects the SAD pattern on X86 so that much better code will be emitted once the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649	2016-04-27 01:29:18 +00:00
Ahmed Bougacha	128f8732a5	[CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI. Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606	2016-04-26 21:15:30 +00:00
Marcin Koscielnicki	1c1af6ef77	[PR27390] [CodeGen] Reject indexed loads in CombinerDAG. visitAND, when folding and (load) forgets to check which output of an indexed load is involved, happily folding the updated address output on the following testcase: target datalayout = "e-m:e-i64:64-n32:64" target triple = "powerpc64le-unknown-linux-gnu" %typ = type { i32, i32 } define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) { %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1 %1 = load i32, i32* %b, align 4 %2 = ptrtoint i32* %b to i64 %3 = and i64 %2, -35184372088833 %4 = inttoptr i64 %3 to i32* %_msld = load i32, i32* %4, align 4 %zzz = add i32 %1, %_msld ret i32 %zzz } Fix this by checking ResNo. I've found a few more places that currently neglect to check for indexed load, and tightened them up as well, but I don't have test cases for them. In fact, they might not be triggerable at all, at least with current targets. Still, better safe than sorry. Differential Revision: http://reviews.llvm.org/D19202 llvm-svn: 267420	2016-04-25 15:43:44 +00:00
Gerolf Hoflehner	01b3a6184a	[MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098) The original patch caused crashes because it could derefence a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The targets decides whether the machine combiner should optimize for throughput or latency. - Supports for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328	2016-04-24 05:14:01 +00:00
Craig Topper	36c133159a	[CodeGen] Teach DAG combine to fold select_cc seteq X, 0, sizeof(X), ctlz_zero_undef(X) -> ctlz(X). InstCombine already does this for IR and X86 pattern matches this during isel. A follow up commit will remove the X86 patterns to allow this to be tested. llvm-svn: 267325	2016-04-24 04:38:32 +00:00
Craig Topper	7e5fad66f3	[CodeGen] When promoting CTTZ operations to larger type, don't insert a select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part. llvm-svn: 267280	2016-04-23 05:20:47 +00:00
Matt Arsenault	3b748d76f6	DAGCombiner: Relax alignment restriction when changing store type If the target allows the alignment, this should be OK. llvm-svn: 267217	2016-04-22 21:01:41 +00:00
Matt Arsenault	629d12de70	DAGCombiner: Relax alignment restriction when changing load type If the target allows the alignment, this should still be OK. llvm-svn: 267209	2016-04-22 20:21:36 +00:00
Daniel Sanders	591c379563	Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64 It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127	2016-04-22 09:37:26 +00:00
Gerolf Hoflehner	b32f11fc62	[MachineCombiner] Support for floating-point FMA on ARM64 Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The targets decides whether the machine combiner should optimize for throughput or latency. - Supports for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098	2016-04-22 02:15:19 +00:00
Matt Arsenault	7846d885ed	LegalizeDAG: Move unaligned load/store expansion to TLI When custom lowered, this is not called if the store is custom lowered. Move it to be a utility function so targets can easily expand unaligned accesses when custom lowering. llvm-svn: 267029	2016-04-21 18:19:11 +00:00
Matt Arsenault	8d1052f55c	DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component If the extracted bits are restricted to the upper half or lower half, this can be truncated. llvm-svn: 267024	2016-04-21 18:03:06 +00:00
Craig Topper	52cb5ec36f	[SelectionDAG] Teach LegalizeVectorOps to directly Expand CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to CTTZ/CTLZ directly if those ops are Legal/Custom instead of deferring it to LegalizeOps. This is needed to support CTTZ/CTLZ Custom correctly since LegalizeOps would be too late to do the custom lowering. llvm-svn: 266951	2016-04-21 04:43:57 +00:00
Tim Shen	a1d8bc5597	[PPC, SSP] Support PowerPC Linux stack protection. llvm-svn: 266809	2016-04-19 20:14:52 +00:00
Tim Shen	e885d5e4d3	[SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD With this change, ideally IR pass can always generate llvm.stackguard call to get the stack guard; but for now there are still IR form stack guard customizations around (see getIRStackGuard()). Future SSP customization should go through LOAD_STACK_GUARD. There is a behavior change: stack guard values are not CSEed anymore, since we should never reuse the value in case that it has been spilled (and corrupted). See ssp-guard-spill.ll. This also cause the change of stack size and codegen in X86 and AArch64 test cases. Ideally we'd like to know if the guard created in llvm.stackprotector() gets spilled or not. If the value is spilled, discard the value and reload stack guard; otherwise reuse the value. This can be done by teaching register allocator to know how to rematerialize LOAD_STACK_GUARD and force a rematerialization (which seems hard), or check for spilling in expandPostRAPseudo. It only makes sense when the stack guard is a global variable, which requires more instructions to load. Anyway, this seems to go out of the scope of the current patch. llvm-svn: 266806	2016-04-19 19:40:37 +00:00
Mehdi Amini	b550cb1750	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' \| xargs grep -L 'IndexedMap[<]' \| xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Hans Wennborg	07f6d3a893	Switch lowering: don't add incoming PHI values from skipped bit test MBB's (PR27135) After r245976, LLVM will skip the last bit test case if knows it will always be true. However, we would still erroneously update PHI nodes with incoming values from the MBB that would perform the final bit test, causing -verify-machineinstrs to fail. llvm-svn: 266479	2016-04-15 21:45:30 +00:00
Hans Wennborg	c944c13dc1	SelectionDAGISel: rangeify a loop llvm-svn: 266478	2016-04-15 21:45:09 +00:00
Reid Kleckner	28865809fe	Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely included across all LLVM backends. It turns out that there only a handful of TUs that actually care about DI operands on MachineInstrs. After this change, touching DebugInfoMetadata.h and rebuilding llc only needs 112 actions instead of 542. llvm-svn: 266351	2016-04-14 18:29:59 +00:00
David Majnemer	0f26b0aeb4	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower to @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279	2016-04-14 07:13:24 +00:00
Matt Arsenault	9cd90712f0	AMDGPU: Implement canonicalize Also add generic DAG node for it. llvm-svn: 266272	2016-04-14 01:42:16 +00:00
Matthias Braun	46b0f03e12	TargetLowering: Factor out common code for tail call eligibility checking; NFC llvm-svn: 266270	2016-04-14 01:10:42 +00:00
Nirav Dave	2477491a92	Cleanup Store Merging in UseAA case This patch fixes a bug (PR26827) when using anti-aliasing in store merging. This sets the chain users of the component stores to point to the new store instead of the component stores chain parent. Reviewers: jyknight Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18909 llvm-svn: 266217	2016-04-13 17:27:26 +00:00
Ahmed Bougacha	7ac86c47d2	[CodeGen] Remove constant-folding dead code. NFC. This code was specific to vector operations with scalar operands: all the opcodes in FoldValue (via FoldConstantArithmetic) can't match those criteria. Replace it with an assert if that ever changes: at that point, we might need to add back a splat BUILD_VECTOR. llvm-svn: 266100	2016-04-12 18:15:39 +00:00
Philip Reames	92d1f0cb6d	Introduce an GCRelocateInst class [NFC] Previously, we were using isGCRelocate predicates. Using a subclass of IntrinsicInst is far more idiomatic. The refactoring also enables a couple of minor simplifications and code sharing. llvm-svn: 266098	2016-04-12 18:05:10 +00:00
Simon Pilgrim	82e54871d0	[DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) anytime before LegalizeVectorOprs xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well. The main aim with this to improve the ability to recognise bitmasks that can be converted to shuffles. I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result. Differential Revision: http://reviews.llvm.org/D18944 llvm-svn: 265998	2016-04-11 21:10:33 +00:00
Tom Stellard	52686e4182	TargetRegisterInfo: Add getRegAsmName() Summary: The motivation for this new function is to move an invalid assumption about the relationship between the names of register definitions in tablegen files and their assembly names into TargetRegisterInfo, so that we can begin working on fixing this assumption. The current problem is that if you have a register definition in TableGen like: def MYReg0 : Register<"r0", 0>; The function TargetLowering::getRegForInlineAsmConstraint() derives the assembly name from the tablegen name: "MyReg0" rather than the given assembly name "r0". This is working, because on most targets the tablegen name and the assembly names are case insensitive matches for each other (e.g. def EAX : X86Reg<"eax", ...> getRegAsmName() will allow targets to override this default assumption and return the correct assembly name. Reviewers: echristo, hfinkel Subscribers: SamWot, echristo, hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D15614 llvm-svn: 265955	2016-04-11 16:21:12 +00:00
Sanjay Patel	4abae4e0fa	[x86] use BMI 'andn' for logic + compare ops With BMI, we can use 'andn' to save an instruction when the result is only used in a compare. This is related to one of the potential sequences to check 'isfinite' in: https://llvm.org/bugs/show_bug.cgi?id=27164 Differential Revision: http://reviews.llvm.org/D18910 llvm-svn: 265875	2016-04-09 16:02:52 +00:00
Tim Shen	0012756489	[SSP] Remove llvm.stackprotectorcheck. This is a cleanup patch for SSP support in LLVM. There is no functional change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't actually lowering it in SelectBasicBlock; rather, it adds check code in FinishBasicBlock, ignoring the position where the intrinsic is inserted (See FindSplitPointForStackProtector()). llvm-svn: 265851	2016-04-08 21:26:31 +00:00
Nirav Dave	66f485f4e2	Fix Load Control Dependence in MemCpy Generation In Memcpy lowering we had missed a dependence from the load of the operation to successor operations. This causes us to potentially construct an in initial DAG with a memory dependence not fully represented in the chain sub-DAG but rather require looking at the entire DAG breaking alias analysis by allowing incorrect repositioning of memory operations. To work around this, r200033 changed DAGCombiner::GatherAllAliases to be conservative if any possible issues to happen. Unfortunately this check forbade many non-problematic situations as well. For example, it's common for incoming argument lowering to add a non-aliasing load hanging off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would find that other (unvisited) use of the EntryNode chain, and just give up entirely. Furthermore, the check was incomplete: it would not actually detect all such potentially problematic DAG constructions, because GatherAllAliases did not guarantee to visit all chain nodes going up to the root EntryNode. This is in general fine -- giving up early will just miss a potential optimization, not generate incorrect results. But, for this non-chain dependency detection code, it's possible that you could have a load attached to a higher-up chain node than any which were visited. If that load aliases your store, but the only dependency is through the value operand of a non-aliasing store, it would've been missed by this code, and potentially reordered. With the dependence added, this check can be removed and Alias Analysis can be much more aggressive. This fixes code quality regression in the Consecutive Store Merge cleanup (D14834). Test Change: ppc64-align-long-double.ll now may see multiple serializations of its stores Differential Revision: http://reviews.llvm.org/D18062 llvm-svn: 265836	2016-04-08 19:44:40 +00:00
JF Bastien	800f87a871	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
Sanjoy Das	65a60670e8	Lower @llvm.experimental.deoptimize as a noreturn call While preserving the return value for @llvm.experimental.deoptimize at the IR level is useful during mid-level optimization, doing so at the machine instruction level requires generating some extra code and a return that is non-ideal. This change has LLVM lower ``` %val = call @llvm.experimental.deoptimize ret %val ``` to effectively ``` call @__llvm_deoptimize() unreachable ``` instead. llvm-svn: 265502	2016-04-06 01:33:49 +00:00
Manman Ren	e221a870d3	Swift Calling Convention: swifterror target-independent change. At IR level, the swifterror argument is an input argument with type ErrorObject*. For targets that support swifterror, we want to optimize it to behave as an inout value with type ErrorObject; it will be passed in a fixed physical register. The main idea is to track the virtual registers for each swifterror value. We define swifterror values as AllocaInsts with swifterror attribute or a function argument with swifterror attribute. In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before handling the basic blocks. When iterating over all basic blocks in RPO, before actually visiting the basic block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when there are multiple predecessors or to simply propagate them. There, we create a virtual register for each swifterror value in the entry block. For predecessors that are not yet visited, we create virtual registers to hold the swifterror values at the end of the predecessor. The assignments are saved in SwiftErrorWorklist and will be materialized at the end of visiting the basic block. When visiting a load from a swifterror value, we copy from the current virtual register assignment. When visiting a store to a swifterror value, we create a virtual register to hold the swifterror value and update SwiftErrorMap to track the current virtual register assignment. Differential Revision: http://reviews.llvm.org/D18108 llvm-svn: 265433	2016-04-05 18:13:16 +00:00
Sanjay Patel	769b5fd546	fix typos; NFC llvm-svn: 265356	2016-04-04 22:45:56 +00:00
Manman Ren	9bfd0d03e9	Swift Calling Convention: add swifterror attribute. A ``swifterror`` attribute can be applied to a function parameter or an AllocaInst. This commit does not include any target-specific change. The target-specific optimization will come as a follow-up patch. Differential Revision: http://reviews.llvm.org/D18092 llvm-svn: 265189	2016-04-01 21:41:15 +00:00
Sanjoy Das	9d41a8f269	Don't use an i64 return type with webkit_jscc Re-enable an assertion enabled by Justin Lebar in rL265092. rL265092 was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc does not like non-i64 return types. Change the test case to not do that. llvm-svn: 265099	2016-04-01 02:51:21 +00:00
Justin Lebar	98981e5573	Revert "Protect some assertions with NDEBUG rather than DEBUG()." This reverts r265092, because it breaks CodeGen/X86/deopt-intrinsic.ll. llvm-svn: 265093	2016-04-01 01:23:23 +00:00
Justin Lebar	c814e8e4ab	Protect some assertions with NDEBUG rather than DEBUG(). DEBUG() only runs if you pass -debug, but these assertions are generally useful. llvm-svn: 265092	2016-04-01 01:09:12 +00:00
Sanjay Patel	4d71160d5d	fix typo; NFC llvm-svn: 265054	2016-03-31 21:00:48 +00:00
Nirav Dave	83ce54aac2	Prevent X86ISelLowering from merging volatile loads Change isConsecutiveLoads to check that loads are non-volatile as this is a requirement for any load merges. Propagate change to two callers. Reviewers: RKSimon Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18546 llvm-svn: 265013	2016-03-31 13:40:55 +00:00
Matt Arsenault	46ba31650e	LegalizeDAG: Don't replace vector store with integer if not legal For the same reason as the corresponding load change. Note that ExpandStore is completely broken for non-byte sized element vector stores, but preserve the current broken behavior which has tests for it. The behavior should be the same, but now introduces a new typed store that is incorrectly split later rather than doing it directly. llvm-svn: 264928	2016-03-30 21:15:18 +00:00
Matt Arsenault	a4b1b6ea05	LegalizeDAG: Don't replace vector load with integer unless legal On AMDGPU we want to be able to promote i64/f64 loads to v2i32. If the access is unaligned, this would conclude that since i64 is legal, it would convert it back to i64 and there is an endless legalization loop. Extract the logic for scalarizing the load into a new TargetLowering function, where this can also replace the custom function AMDGPU has for this. llvm-svn: 264927	2016-03-30 21:15:10 +00:00
Nirav Dave	2aab7f4358	Add support for no-jump-tables Add function soft attribute to the generation of Jump Tables in CodeGen as initial step towards clang support of gcc's no-jump-table support Reviewers: hans, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18321 llvm-svn: 264756	2016-03-29 17:46:23 +00:00
Manman Ren	f46262e0b7	Swift Calling Convention: add swiftself attribute. Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754	2016-03-29 17:37:21 +00:00
Kyle Butt	5e241b11ed	[Codegen] Decrease minimum jump table density. Minimum density for both optsize and non optsize are now options -sparse-jump-table-density (default 10) for non optsize functions -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues llvm-svn: 264689	2016-03-29 00:23:41 +00:00
Junmo Park	a26e93bcec	Minor code cleanup. NFC. llvm-svn: 264505	2016-03-26 06:04:55 +00:00
Nirav Dave	fa250cad37	Prevent construction of cycle in DAG store merge When merging stores in DAGCombiner, add check to ensure that no dependenices exist that would cause the construction of a cycle in our DAG. This may happen if one store has a data dependence on another instruction (e.g. a load) which itself has a (chain) dependence on another store being merged. These stores cannot be merged safely and doing so results in a cycle that is discovered in LegalizeDAG. This test is only done in cases where Antialias analysis is used (UseAA) as non-AA store merge candidates will be merged logically after all loads which have been checked to not alias. Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight Subscribers: llvm-commits, tberghammer, danalbert, srhines Differential Revision: http://reviews.llvm.org/D18336 llvm-svn: 264461	2016-03-25 21:06:30 +00:00
Manman Ren	9dd8c14674	CXX TLS: collect return blocks after SelectAllBasicBlocks. It is incorrect to get the corresponding MBB for a ReturnInst before SelectAllBasicBlocks since SelectAllBasicBlocks can change the correspondence between a ReturnInst and the MBB it is in. PR27062 llvm-svn: 264358	2016-03-24 23:21:29 +00:00
Sanjoy Das	fd3eaa8c5c	Reduce code duplication by extracting out a helper function; NFC llvm-svn: 264355	2016-03-24 22:51:49 +00:00
Sanjoy Das	731c67fed2	Lower varargs correctly in deopt bundle lowering Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg. llvm-svn: 264354	2016-03-24 22:37:52 +00:00
Sanjoy Das	df9ae70f49	Add lowering support for llvm.experimental.deoptimize Summary: Only adds support for "naked" calls to llvm.experimental.deoptimize. Support for round-tripping through RewriteStatepointsForGC will come as a separate patch (should be simpler than this one). Reviewers: reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18429 llvm-svn: 264329	2016-03-24 20:23:29 +00:00
Sanjoy Das	c0c59fe14e	[Statepoints] Fix yet another issue around gc pointer uniqueing Given that StatepointLowering now uniques derived pointers before putting them in the per-statepoint spill map, we may end up with missing entries for derived pointers when we visit a gc.relocate on a pointer that was de-duplicated away. Fix this by keeping two maps, one mapping gc pointers to their de-duplicated values, and one mapping a de-duplicated value to the slot it is spilled in. llvm-svn: 264320	2016-03-24 18:57:39 +00:00
Sanjoy Das	42f91a9959	Minor cosmestic changes (NFC) - Reflow comments - Rename function llvm-svn: 264319	2016-03-24 18:57:31 +00:00
Tim Northover	4498eff9bb	CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS. If the operation's type has been promoted during type legalization, we need to account for the fact that the high bits of the comparison operand are likely unspecified. The LHS is usually zero-extended, but MIPS sign extends it, so we have to be slightly careful. Patch by Simon Dardis. llvm-svn: 264296	2016-03-24 15:38:38 +00:00
Pirama Arumuga Nainar	dc45aef2d8	Remove unsafe AssertZext after promoting result of FP_TO_FP16 Summary: Some target lowerings of FP_TO_FP16, for instance ARM's vcvtb.f16.f32 instruction, do not guarantee that the top 16 bits are zeroed out. Remove the unsafe AssertZext and add tests to exercise this. Reviewers: jmolloy, sbaranga, kristof.beyls, aadg Subscribers: llvm-commits, srhines, aemerson Differential Revision: http://reviews.llvm.org/D18426 llvm-svn: 264285	2016-03-24 14:06:03 +00:00
Justin Bogner	c35c10593b	SelectionDAG: Remove a tautological dyn_cast. NFC Index is already a StoreSDNode, so this dyn_cast doesn't do anything. llvm-svn: 264177	2016-03-23 18:15:33 +00:00
Sanjoy Das	a5b2972977	Remove stale comment llvm-svn: 264131	2016-03-23 02:28:35 +00:00
Sanjoy Das	ac53dc7520	[StatepointLowering] Don't do two DenseMap lookups; nfci llvm-svn: 264130	2016-03-23 02:24:15 +00:00
Sanjoy Das	7edbef316b	[StatepointLowering] Minor NFC cleanups - Use auto - Name variables in LLVM style - Use llvm::find instead of std::find - Blank lines between declarations llvm-svn: 264129	2016-03-23 02:24:13 +00:00
Sanjoy Das	4cd746ebe0	[StatepointLowering] Minor nfc refactoring Now that StatepointLoweringInfo represents base pointers, derived pointers and gc relocates as SmallVectors and not ArrayRefs, we no longer need to allocate "backing storage" on stack in LowerStatepoint. So elide the backing storage, and inline the trivial body of getIncomingStatepointGCValues. llvm-svn: 264128	2016-03-23 02:24:10 +00:00
Sanjoy Das	e58ca59cf4	[StatepointLowering] Schedule gc relocates before uniqueing them Otherwise we can see an "unexpected" gc.relocate that we uniqued away. llvm-svn: 264127	2016-03-23 02:24:07 +00:00
Simon Pilgrim	c6f5fe3d69	[SelectionDAG] Ensure constant folded legalized vector element types are compatible with the BUILD_VECTOR type Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected. llvm-svn: 264085	2016-03-22 19:59:53 +00:00
Tim Northover	b49a8a9dbb	CodeGen: check return types match when emitting tail call to builtin. We were just completely ignoring the types when determining whether we could safely emit a libcall as a tail call. This is clearly wrong. Theoretically, we could dig deeper looking for incidental matches (much like the generic code in Analysis.cpp does), but it's probably not worth it for the few libcalls that exist. llvm-svn: 264084	2016-03-22 19:14:38 +00:00
Sanjoy Das	eb5037cadc	Allow lowering call sites with both funclets and deopt state Lowering funclets is a no-op, so we can just go ahead and lower the deopt state. llvm-svn: 264078	2016-03-22 18:10:39 +00:00
Sanjoy Das	6b535630a1	Add a hasOperandBundlesOtherThan helper, and use it; NFC llvm-svn: 264072	2016-03-22 17:51:25 +00:00
Simon Pilgrim	25fb4177fb	[X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Reapplied with a fix for PR26953 (missing vector widening legalization). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 264062	2016-03-22 16:22:08 +00:00
Sanjoy Das	38bfc22161	Add "first class" lowering for deopt operand bundles Summary: After this change, deopt operand bundles can be lowered directly by SelectionDAG into STATEPOINT instructions (which are then lowered to a call or sequence of nop, with an associated __llvm_stackmaps entry0. This obviates the need to round-trip deoptimization state through gc.statepoint via RewriteStatepointsForGC. Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin Subscribers: sanjoy, mcrosier, majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D18257 llvm-svn: 264015	2016-03-22 00:59:13 +00:00
Silviu Baranga	46030585b3	[DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes Summary: extract_vector_elt can cause an implicit any_ext if the types don't match. When processing the following pattern: (and (extract_vector_elt (load ([non_ext\|any_ext\|zero_ext] V))), c) DAGCombine was ignoring the possible extend, and sometimes removing the AND even though it was required to maintain some of the bits in the result to 0, resulting in a miscompile. This change fixes the issue by limiting the transformation only to cases where the extract_vector_elt doesn't perform the implicit extend. Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18247 llvm-svn: 263935	2016-03-21 11:43:46 +00:00
Manman Ren	2828c57b6f	[CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86. Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855	2016-03-18 23:38:49 +00:00
Sanjoy Das	3a02019fbc	[SelectionDAG] Remove visitStatepoint; NFC This way we have a single entry point into StatepointLowering. The method was a direct dispatch to LowerStatepoint anyway. llvm-svn: 263682	2016-03-17 00:47:14 +00:00
Sanjoy Das	43e33d61c6	Fix indentation; NFC llvm-svn: 263672	2016-03-16 23:11:21 +00:00
Sanjoy Das	70697ff74d	Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC Summary: This is a step towards implementing "direct" lowering of calls and invokes with deopt operand bundles into STATEPOINT nodes (as opposed to having them mandatorily pass through RewriteStatepointsForGC, which is the case today). This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint` helper function that is able to lower a "statepoint like thing", and uses it to lower `gc.statepoint` calls. This is an NFC now, but in a later change we will use `LowerAsStatepoint` to directly lower calls and invokes with operand bundles without going through an intermediate `gc.statepoint` IR representation. FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add support for lowering non gc.statepoints, right now it is fairly tightly coupled with an IR level `gc.statepoint`. Reviewers: reames, pgavlin, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18106 llvm-svn: 263671	2016-03-16 23:08:00 +00:00
James Y Knight	f44fc5219f	Tweak some atomics functions in preparation for larger changes; NFC. - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665	2016-03-16 22:12:04 +00:00
Sanjoy Das	19c6159833	[SelectionDAG] Extract out populateCallLoweringInfo; NFC SelectionDAGBuilder::populateCallLoweringInfo is now used instead of SelectionDAGBuilder::lowerCallOperands. The populateCallLoweringInfo interface is more composable in face of design changes like http://reviews.llvm.org/D18106 llvm-svn: 263663	2016-03-16 20:49:31 +00:00
Simon Pilgrim	b5a20f0fec	Removed trailing whitespace llvm-svn: 263650	2016-03-16 18:37:44 +00:00
Sanjoy Das	c11460e051	[StatepointLowering] Move an assertion; NFCI Instead of running an explicit loop over `gc.relocate` calls hanging off of a `gc.statepoint`, assert the validity of the type of the value being relocated in `visitRelocate`. llvm-svn: 263516	2016-03-15 01:16:31 +00:00
Eric Christopher	da8b3f1914	Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware" as it seems to be causing crashes during code generation in halide. PR forthcoming. This reverts commit r263303. llvm-svn: 263512	2016-03-14 23:59:57 +00:00
Sanjay Patel	7506852709	[DAG] use !isUndef() ; NFCI llvm-svn: 263453	2016-03-14 18:09:43 +00:00
Sanjay Patel	5719584129	[DAG] use isUndef() ; NFCI llvm-svn: 263448	2016-03-14 17:28:46 +00:00
Sanjoy Das	ecf96c9516	Make gc relocates more strongly typed; NFC Don't use a `Value ` where we can use a stronger `GCRelocateInst ` type. llvm-svn: 263327	2016-03-12 02:54:27 +00:00
Simon Pilgrim	33d57c7547	[X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions. We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT. Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718). Differential Revision: http://reviews.llvm.org/D17932 llvm-svn: 263303	2016-03-11 22:18:05 +00:00
Simon Pilgrim	61eb49e437	[X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns. Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion). Differential Revision: http://reviews.llvm.org/D17691 llvm-svn: 263159	2016-03-10 20:40:26 +00:00
Tom Stellard	9f2e00de7b	SelectionDAG: Fix a crash on inline asm when output register supports multiple types Summary: The code in SelectionDAG did not handle the case where the register type and output types were different, but had the same size. Reviewers: arsenm, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17940 llvm-svn: 263022	2016-03-09 16:02:52 +00:00

1 2 3 4 5 ...

7601 Commits