llvm-project

Commit Graph

Author	SHA1	Message	Date
Nemanja Ivanovic	eebbcb6d57	[PowerPC] Cannonicalize applicable vector shift immediates as swaps This patch corresponds to review: http://reviews.llvm.org/D21358 Vector shifts that have the same semantics as a vector swap are cannonicalized as such to provide additional opportunities for swap removal optimization to remove unnecessary swaps. llvm-svn: 275168	2016-07-12 12:16:27 +00:00
Eric Christopher	cd7194629b	Use the class version of getPointerTy rather than getting back to ourselves via a call through the DAG. llvm-svn: 274721	2016-07-07 01:49:59 +00:00
Eric Christopher	317df66f15	Use the class definition for useSoftFloat. llvm-svn: 274720	2016-07-07 01:49:57 +00:00
Eric Christopher	2454a3b4e7	Rename argument for consistency. llvm-svn: 274717	2016-07-07 01:08:23 +00:00
Eric Christopher	e0d09ba443	Remove the plumbing for isDarwinABI from EmitTailCallLoadFPAndRetAddr. llvm-svn: 274716	2016-07-07 01:08:21 +00:00
Eric Christopher	606a268bed	Use the MachineFunction that we've already queried for in the function. llvm-svn: 274715	2016-07-07 01:08:19 +00:00
Eric Christopher	327e440c6c	Remove the plumbing for isDarwinABI from the PrepareTailCall hierarchy. llvm-svn: 274714	2016-07-07 01:08:17 +00:00
Eric Christopher	ade4eed8a7	Remove the plumbing of 64-bitness from PrepareTailCall and functions called by it. llvm-svn: 274711	2016-07-07 00:39:32 +00:00
Eric Christopher	c16ccbe731	Sink call to get the MachineFunction into EmitTailCallStoreFPAndRetAddr and remove the argument. llvm-svn: 274710	2016-07-07 00:39:30 +00:00
Eric Christopher	b976a392e5	Remove unnecessary subtarget parameters in PPCTargetLowering. llvm-svn: 274709	2016-07-07 00:39:27 +00:00
Sanjay Patel	9cc21ac412	fix typo; NFC llvm-svn: 274636	2016-07-06 16:42:46 +00:00
Nemanja Ivanovic	44513e545f	[PowerPC] - Legalize vector types by widening instead of integer promotion This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 3, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). llvm-svn: 274535	2016-07-05 09:22:29 +00:00
Duncan P. N. Exon Smith	e4f5e4f4d1	CodeGen: Use MachineInstr& in TargetLowering, NFC This is a mechanical change to make TargetLowering API take MachineInstr& (instead of MachineInstr), since the argument is expected to be a valid MachineInstr. In one case, changed a parameter from MachineInstr to MachineBasicBlock::iterator, since it was used as an insertion point. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. llvm-svn: 274287	2016-06-30 22:52:52 +00:00
Rafael Espindola	db6bd02185	Delete unused includes. NFC. llvm-svn: 274225	2016-06-30 12:19:16 +00:00
Duncan P. N. Exon Smith	9cfc75c214	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189	2016-06-30 00:01:54 +00:00
Rafael Espindola	a99ccfce1a	Drop support for creating $stubs. They are created by ld64 since OS X 10.5. llvm-svn: 274130	2016-06-29 14:59:50 +00:00
Nick Lewycky	9980075133	NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'. Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test. llvm-svn: 273984	2016-06-28 01:45:05 +00:00
Rafael Espindola	3beef8d6db	Move shouldAssumeDSOLocal to Target. Should fix the shared library build. llvm-svn: 273958	2016-06-27 23:15:57 +00:00
Rafael Espindola	21d22a01ea	Use the isPositionIndependent predicate. NFC. llvm-svn: 273875	2016-06-27 14:05:43 +00:00
Rafael Espindola	e1d255f05c	Simplify getLabelAccessInfo. It now takes a IsPIC flag instead of computing and returning it. llvm-svn: 273871	2016-06-27 12:56:02 +00:00
Rafael Espindola	53fd425e06	Refactor duplicated code. NFC. llvm-svn: 273595	2016-06-23 18:43:06 +00:00
Rafael Espindola	928a95d0b0	Use shouldAssumeDSOLocal. With this it handle -fPIE. llvm-svn: 273499	2016-06-22 22:09:17 +00:00
Rafael Espindola	45bb5c69a0	Extract a few variables to make 'if' smaller. NFC. llvm-svn: 273497	2016-06-22 21:56:34 +00:00
Krzysztof Parzyszek	e116d500a7	[SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee The setCallee function will set the number of fixed arguments based on the size of the argument list. The FixedArgs parameter was often explicitly set to 0, leading to a lack of consistent value for non- vararg functions. Differential Revision: http://reviews.llvm.org/D20376 llvm-svn: 273403	2016-06-22 12:54:25 +00:00
NAKAMURA Takumi	fd92154b20	Reformat blank lines. llvm-svn: 273131	2016-06-20 01:05:15 +00:00
NAKAMURA Takumi	ae7c97d39d	Trailing whitespace. llvm-svn: 273130	2016-06-20 00:49:20 +00:00
NAKAMURA Takumi	fe1202c4cb	Untabify. llvm-svn: 273129	2016-06-20 00:37:41 +00:00
Davide Italiano	4cccc488b7	[Codegen] Change PICLevel. We convert `Default` to `NotPIC` so that target independent code can reason about this correctly. Differential Revision: http://reviews.llvm.org/D21394 llvm-svn: 273024	2016-06-17 18:07:14 +00:00
Benjamin Kramer	1d67ac5639	[PPC] Strength-reduce SmallVectors into arrays. No functionality change intended. llvm-svn: 272999	2016-06-17 13:15:10 +00:00
Benjamin Kramer	bdc4956bac	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Hal Finkel	1fb10e846a	[PowerPC] Fix a DAG replacement bug in PPCTargetLowering::DAGCombineExtBoolTrunc While promoting nodes in PPCTargetLowering::DAGCombineExtBoolTrunc, it is possible for one of the nodes to be replaced by another. To make sure we do not visit the deleted nodes, and to make sure we visit the replacement nodes, use a list of HandleSDNodes to track the to-be-promoted nodes during the promotion process. The same fix has been applied to the analogous code in PPCTargetLowering::DAGCombineTruncBoolExt. Fixes PR26985. llvm-svn: 269272	2016-05-12 04:00:56 +00:00
Nemanja Ivanovic	6e29baf7f5	[Power9] Add support for -mcpu=pwr9 in the back end This patch corresponds to review: http://reviews.llvm.org/D19683 Simply adds the bits for being able to specify -mcpu=pwr9 to the back end. llvm-svn: 268950	2016-05-09 18:54:58 +00:00
Strahinja Petrovic	e682b80b8b	[PowerPC] fix register alignment for long double type This patch fixes register alignment for long double type in soft float mode. Before this patch alignment was 8 and this patch changes it to 4. Differential Revision: http://reviews.llvm.org/D18034 llvm-svn: 268909	2016-05-09 12:27:39 +00:00
Nemanja Ivanovic	1a2b2f03e7	[PowerPC] Generate VSX version of splat word This patch corresponds to review: http://reviews.llvm.org/D18592 It allows the PPC back end to generate the xxspltw instruction where we previously only emitted vspltw. llvm-svn: 268516	2016-05-04 16:04:02 +00:00
Guozhi Wei	fa3e04298b	[PPC] Enable shuffling of VSX vectors This patch fixes PR27078 by enabling shuffling of vectors if VSX is available. llvm-svn: 268064	2016-04-29 17:00:54 +00:00
Craig Topper	33772c5375	[CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853	2016-04-28 03:34:31 +00:00
Ahmed Bougacha	128f8732a5	[CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI. Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606	2016-04-26 21:15:30 +00:00
Marcin Koscielnicki	0cfb612413	[PowerPC] Add support for llvm.thread.pointer Differential Revision: http://reviews.llvm.org/D19304 llvm-svn: 267546	2016-04-26 10:37:22 +00:00
Chuang-Yu Cheng	0600e8d759	[ppc64] Reenable sibling call optimization on ppc64 since fixed tsan library tail-call issue print-stack-trace.cc test failure of compiler-rt has been fixed by r266869 (http://reviews.llvm.org/D19148), so reenable sibling call optimization on ppc64 Reviewers: nemanjai kbarton llvm-svn: 267527	2016-04-26 07:38:24 +00:00
Tim Shen	a1d8bc5597	[PPC, SSP] Support PowerPC Linux stack protection. llvm-svn: 266809	2016-04-19 20:14:52 +00:00
Nirav Dave	1f51c334ca	Fix typing on generated LXV2DX/STXV2DX instructions [PPC] Previously when casting generic loads to LXV2DX/ST instructions we would leave the original load return type in place allowing for an assertion failure when we merge two equivalent LXV2DX nodes with different types. This fixes PR27350. Reviewers: nemanjai Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19133 llvm-svn: 266438	2016-04-15 15:01:38 +00:00
Chuang-Yu Cheng	98c1894755	CXX_FAST_TLS calling convention: performance improvement for PPC64 This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allcoator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781	2016-04-08 12:04:32 +00:00
JF Bastien	800f87a871	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
Ehsan Amiri	322eca3849	[PPC] Use VSX/FP Facility integer load when an integer load's only users are conversion to FP http://reviews.llvm.org/D18405 When the integer value loaded is never used directly as integer we should use VSX or Floating Point Facility integer loads and avoid extra direct move llvm-svn: 265593	2016-04-06 20:12:29 +00:00
Chuang-Yu Cheng	6e1408a891	[ppc64] Temporary disable sibling call optimization on ppc64 due to breaking test case r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap test. http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708 llvm-svn: 265528	2016-04-06 10:48:36 +00:00
Chuang-Yu Cheng	2e5973ef74	[ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abi This patch enable sibling call optimization on ppc64 ELFv1/ELFv2 abi, and add a couple of test cases. This patch also passed llvm/clang bootstrap test, and spec2006 build/run/result validation. Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617 Great thanks to Tom's (tjablin) help, he contributed a lot to this patch. Thanks Hal and Kit's invaluable opinions! Reviewers: hfinkel kbarton http://reviews.llvm.org/D16315 llvm-svn: 265506	2016-04-06 02:04:38 +00:00
Hal Finkel	fc35391f2b	[PowerPC] Add a late MI-level pass for QPX load/splat simplification Chapter 3 of the QPX manual states that, "Scalar floating-point load instructions, defined in the Power ISA, cause a replication of the source data across all elements of the target register." Thus, if we have a load followed by a QPX splat (from the first lane), the splat is redundant. This adds a late MI-level pass to remove the redundant splats in some of these cases (specifically when both occur in the same basic block). This optimization is scheduled just prior to post-RA scheduling. It can't happen before anything that might replace the load with some already-computed quantity (i.e. store-to-load forwarding). llvm-svn: 265047	2016-03-31 20:39:41 +00:00
Hal Finkel	851b33a0b1	[PowerPC] Load two floats directly instead of using one 64-bit integer load When dealing with complex<float>, and similar structures with two single-precision floating-point numbers, especially when such things are being passed around by value, we'll sometimes end up loading both float values by extracting them from one 64-bit integer load. It looks like this: t13: i64,ch = load<LD8[%ref.tmp]> t0, t6, undef:i64 t16: i64 = srl t13, Constant:i32<32> t17: i32 = truncate t16 t18: f32 = bitcast t17 t19: i32 = truncate t13 t20: f32 = bitcast t19 The problem, especially before the P8 where those bitcasts aren't legal (and get expanded via the stack), is that it would have been better to use two floating-point loads directly. Here we add a target-specific DAGCombine to do just that. In short, we turn: ld 3, 0(5) stw 3, -8(1) rldicl 3, 3, 32, 32 stw 3, -4(1) lfs 3, -4(1) lfs 0, -8(1) into: lfs 3, 4(5) lfs 0, 0(5) llvm-svn: 264988	2016-03-31 02:56:05 +00:00
Hal Finkel	fa7057a415	[PowerPC] Refactor popcnt[dw] target features Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line. llvm-svn: 264690	2016-03-29 01:36:01 +00:00
Hal Finkel	7059d41622	[PowerPC] On the A2, popcnt[dw] are very slow The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles. I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because: 1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference) 2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect). llvm-svn: 264600	2016-03-28 17:52:08 +00:00
Eric Christopher	b979d51afa	Finish the incomplete 'd' inline asm constraint support for PPC by making sure we give it a register and mark it as a register constraint. llvm-svn: 264340	2016-03-24 21:04:52 +00:00
Nemanja Ivanovic	5ebc92dbe1	[PowerPC] Disable direct moves for extractelement and bitcast in 32-bit mode This patch corresponds to review: http://reviews.llvm.org/D17711 It disables direct moves on these operations in 32-bit mode since the patterns assume 64-bit registers. The final patch is slightly different from the Phabricator review as the bitcast operations needed to be disabled in 32-bit mode as well. This fixes PR26617. llvm-svn: 264282	2016-03-24 13:40:33 +00:00
James Y Knight	f44fc5219f	Tweak some atomics functions in preparation for larger changes; NFC. - Rename getATOMIC to getSYNC, as llvm will soon be able to emit both '__sync' libcalls and '__atomic' libcalls, and this function is for the '__sync' ones. - getInsertFencesForAtomic() has been replaced with shouldInsertFencesForAtomic(Instruction), so that the decision can be made per-instruction. This functionality will be used soon. - emitLeadingFence/emitTrailingFence are no longer called if shouldInsertFencesForAtomic returns false, and thus don't need to check the condition themselves. llvm-svn: 263665	2016-03-16 22:12:04 +00:00
Sanjay Patel	7506852709	[DAG] use !isUndef() ; NFCI llvm-svn: 263453	2016-03-14 18:09:43 +00:00
Sanjay Patel	5719584129	[DAG] use isUndef() ; NFCI llvm-svn: 263448	2016-03-14 17:28:46 +00:00
Kit Barton	a1d6a6f1de	[PPC] backend changes to generate xvabs[s,d]p and xvnabs[s,d]p instructions This has to be committed before the FE changes Phabricator: http://reviews.llvm.org/D17837 llvm-svn: 263035	2016-03-09 17:48:01 +00:00
Nemanja Ivanovic	1a5706ca1b	Fix for PR26180 Corresponds to Phabricator review: http://reviews.llvm.org/D16592 This fix includes both an update to how we handle the "generic" CPU on LE systems as well as Anton's fix for the Fast Isel issue. llvm-svn: 262233	2016-02-29 16:42:27 +00:00
Kit Barton	915c5ecee1	[PPC] Legalize FNEG on PPC when possible Currently we always expand ISD::FNEG. For v4f32 and v2f64 vector types VSX has native support for this opcode Phabricator: http://reviews.llvm.org/D17647 llvm-svn: 262079	2016-02-26 21:59:44 +00:00
Ahmed Bougacha	93cff7fb82	[CodeGen] Document and use getConstant's splat-building feature. NFC. Differential Revision: http://reviews.llvm.org/D17229 llvm-svn: 260901	2016-02-15 18:07:29 +00:00
Manuel Jacob	5f6eaac611	GlobalValue: use getValueType() instead of getType()->getPointerElementType(). Reviewers: mjacob Subscribers: jholewinski, arsenm, dsanders, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16260 llvm-svn: 257999	2016-01-16 20:30:46 +00:00
Alexander Kornienko	175a7cbf3f	Refactor: Simplify boolean conditional return statements in lib/Target/PowerPC Summary: Use clang-tidy to simplify boolean conditional return statements Reviewers: uweigand, rafael, wschmidt Subscribers: craig.topper, llvm-commits Patch by Richard Thomson! Differential Revision: http://reviews.llvm.org/D9984 llvm-svn: 256493	2015-12-28 13:38:42 +00:00
Nemanja Ivanovic	8922476bcb	Bitcasts between FP and INT values using direct moves This patch corresponds to review: http://reviews.llvm.org/D15286 This patch was meant to land in revision 255246, but I accidentally uploaded the patch that corresponds to http://reviews.llvm.org/D15372 in that revision accidentally. Thereby, this patch is the actual Bitcasts using direct moves patch, whereas http://reviews.llvm.org/rL255246 actually corresponds to http://reviews.llvm.org/D15372. llvm-svn: 255649	2015-12-15 14:50:34 +00:00
Petar Jovanovic	280f7101e8	[Power PC] llvm soft float support for ppc32 This is the second in a set of patches for soft float support for ppc32, it enables soft float operations. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D13700 llvm-svn: 255516	2015-12-14 17:57:33 +00:00
Chad Rosier	bc9d4f9947	[PPC] Early exit loop. NFC. llvm-svn: 255497	2015-12-14 14:44:06 +00:00
Nemanja Ivanovic	74e31bc929	Patch to fix a crash in the PowerPC back end due to ISD::ROTL and ISD::ROTR not being expanded. Test case included. llvm-svn: 254501	2015-12-02 10:36:24 +00:00
Yury Gribov	d7dbb66eb8	Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics. The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intendend for use in combination with @llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines. Patch by Max Ostapenko. Differential Revision: http://reviews.llvm.org/D14983 llvm-svn: 254404	2015-12-01 11:40:55 +00:00
Artyom Skrobov	314ee04268	Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC) Summary: Many target lowerings copy-paste the code to test SDValues for known constants. This code can instead be shared in SelectionDAG.cpp, and reused in the targets. Reviewers: MatzeB, andreadb, tstellarAMD Subscribers: arsenm, jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D14945 llvm-svn: 254085	2015-11-25 19:41:11 +00:00
Cong Hou	1938f2eb98	Let SelectionDAG start to use probability-based interface to add successors. The patch in http://reviews.llvm.org/D13745 is broken into four parts: 1. New interfaces without functional changes. 2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights. 3. Use new interfaces in all other passes. 4. Remove old interfaces. This the second patch above. In this patch SelectionDAG starts to use probability-based interfaces in MBB to add successors but other MC passes are still using weight-based interfaces. Therefore, we need to maintain correct weight list in MBB even when probability-based interfaces are used. This is done by updating weight list in probability-based interfaces by treating the numerator of probabilities as weights. This change affects many test cases that check successor weight values. I will update those test cases once this patch looks good to you. Differential revision: http://reviews.llvm.org/D14361 llvm-svn: 253965	2015-11-24 08:51:23 +00:00
Joseph Tremoulet	f748c8937e	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
Nemanja Ivanovic	be5f0c04f1	Fix for bootstrap bug introduced in r244921 This revision has introduced an issue that only affects bootstrapped compiler when it is printing the ASM. It turns out that the new code path taken due to legalizing a scalar_to_vector of i64 -> v2i64 exposes a missing check in a micro optimization to change a load followed by a scalar_to_vector into a load and splat instruction on PPC. llvm-svn: 251798	2015-11-02 14:01:11 +00:00
Hal Finkel	bdd292ae22	[PowerPC] Don't return unsupported register classes for asm constraints As a follow-up to r251566, do the same for the other optionally-supported register classes (mostly for vector registers). Don't return an unavailable register class (which would cause an assert later), but fail cleanly when provided an unsupported inline asm constraint. llvm-svn: 251575	2015-10-28 23:03:45 +00:00
Hal Finkel	34d4149452	[PowerPC] Cleanly reject asm crbit constraint with -crbits When crbits are disabled, cleanly reject the constraint (return the register class only to cause an assert later). llvm-svn: 251566	2015-10-28 22:25:52 +00:00
Duncan P. N. Exon Smith	ac65b4c422	PowerPC: Remove implicit ilist iterator conversions, NFC llvm-svn: 250787	2015-10-20 01:07:37 +00:00
Nemanja Ivanovic	d389657399	Vector element extraction without stack operations on Power 8 This patch corresponds to review: http://reviews.llvm.org/D12032 This patch builds onto the patch that provided scalar to vector conversions without stack operations (D11471). Included in this patch: - Vector element extraction for all vector types with constant element number - Vector element extraction for v16i8 and v8i16 with variable element number - Removal of some unnecessary COPY_TO_REGCLASS operations that ended up unnecessarily moving things around between registers Not included in this patch (will be in upcoming patch): - Vector element extraction for v4i32, v4f32, v2i64 and v2f64 with variable element number - Vector element insertion for variable/constant element number Testing is provided for all extractions. The extractions that are not implemented yet are just placeholders. llvm-svn: 249822	2015-10-09 11:12:18 +00:00
Nemanja Ivanovic	2c84b29464	Addition of interfaces the BE to conform to Table A-2 of ELF V2 ABI V1.1 This patch corresponds to review: http://reviews.llvm.org/D13191 Back end portion of the fifth round of additions to altivec.h. llvm-svn: 248809	2015-09-29 17:41:53 +00:00
NAKAMURA Takumi	10c80e7996	Prune trailing whitespaces. llvm-svn: 248265	2015-09-22 11:19:03 +00:00
NAKAMURA Takumi	0a7d0ad95f	Untabify. llvm-svn: 248264	2015-09-22 11:15:07 +00:00
NAKAMURA Takumi	a9cb538a74	Reformat blank lines. llvm-svn: 248263	2015-09-22 11:14:39 +00:00
NAKAMURA Takumi	84965031a7	Reformat comment lines. llvm-svn: 248262	2015-09-22 11:14:12 +00:00
NAKAMURA Takumi	70ad98aca4	Reformat. llvm-svn: 248261	2015-09-22 11:13:55 +00:00
NAKAMURA Takumi	bf9cc7f30b	Fix utf8 chars. llvm-svn: 248259	2015-09-22 11:10:08 +00:00
Sanjay Patel	a260701bbb	propagate fast-math-flags on DAG nodes After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is one test case in this patch to prove that point. This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF ( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes. This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the current global settings. Differential Revision: http://reviews.llvm.org/D12095 llvm-svn: 247815	2015-09-16 16:31:21 +00:00
Hal Finkel	e6702ca0e2	[PowerPC] Try harder to find a base+offset when looking for consecutive accesses When forming permutation-based unaligned vector loads, we need to know whether it is valid to read ahead of the requested address by a full vector length. Doing so is more efficient (and allows for more CSE with later loads), but could trigger a page fault if invalid. To determine validity, we look for other loads in the same block that access the relevant address range. The relevant point here is that we need to do this as part of the process of forming permutation-based vector loads, and this happens quite early in the SDAG pipeline - specifically before many of the address calculations are fully canonicalized. As a result, we need to try harder to recognize base+offset address computations, because they still might appear as chain of adds (base+offset+offset, for example). To account for this, we'll look through chains of adds, accumulating the constant offsets. llvm-svn: 246813	2015-09-03 22:37:44 +00:00
Hal Finkel	99d95328d6	[PowerPC] Compute the MMO offset for an unaligned load with signed arithmetic If you compute the MMO offset using unsigned arithmetic, you end up with a large positive offset instead of a small negative one. In theory, this could cause bad instruction-scheduling decisions later. I noticed this by inspection from the debug output, and using that for the regression test is the best I can do right now. llvm-svn: 246805	2015-09-03 21:12:15 +00:00
Hal Finkel	77c8b7ffd3	[PowerPC] Don't always consider P8Altivec-only masks in LowerVECTOR_SHUFFLE LowerVECTOR_SHUFFLE needs to decide whether to pass a vector shuffle off to the TableGen-generated matching code, and it does this by testing the same predicates used by the TableGen files. Unfortunately, when we added new P8Altivec-only predicates, we started universally testing them in LowerVECTOR_SHUFFLE, and if then matched when targeting a system prior to a P8, we'd end up with a selection failure. llvm-svn: 246675	2015-09-02 16:52:37 +00:00
Hal Finkel	a2cdbce661	[PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands There were really two problems here. The first was that we had the truth tables for signed i1 comparisons backward. I imagine these are not very common, but if you have: setcc i1 x, y, LT this has the '0 1' and the '1 0' results flipped compared to: setcc i1 x, y, ULT because, in the signed case, '1 0' is really '-1 0', and the answer is not the same as in the unsigned case. The second problem was that we did not have patterns (at all) for the unsigned comparisons select_cc nodes for i1 comparison operands. This was the specific cause of PR24552. These had to be added (and a missing Altivec promotion added as well) to make sure these function for all types. I've added a bunch more test cases for these patterns, and there are a few FIXMEs in the test case regarding code-quality. Fixes PR24552. llvm-svn: 246400	2015-08-30 22:12:50 +00:00
Hal Finkel	be78c25acb	[PowerPC] Fix the int2fp(fp2int(x)) DAGCombine to ignore ppc_fp128 This DAGCombine was creating custom SDAG nodes with an illegal ppc_fp128 operand type because it was triggering on f64/f32 int2fp(fp2int(ppc_fp128 x)), but shouldn't (it should only apply to f32/f64 types). The result was a crash. llvm-svn: 245530	2015-08-20 01:18:20 +00:00
Nemanja Ivanovic	5f1cea4141	Temporary fix for the self-host failures introduced by rL244921. This revision has introduced an issue that only affects bootstrapped compiler when it is printing the ASM. I am working on resolving the issue, but in the meantime, I'm disabling the legalization of scalar_to_vector operation for v2i64 and the associated testing until I can get this fixed. llvm-svn: 245481	2015-08-19 19:04:47 +00:00
Nemanja Ivanovic	1c39ca6501	Scalar to vector conversions using direct moves This patch corresponds to review: http://reviews.llvm.org/D11471 It improves the code generated for converting a scalar to a vector value. With direct moves from GPRs to VSRs, we no longer require expensive stack operations for this. Subsequent patches will handle the reverse case and more general operations between vectors and their scalar elements. llvm-svn: 244921	2015-08-13 17:40:44 +00:00
Alex Lorenz	e40c8a2b26	PseudoSourceValue: Replace global manager with a manager in a machine function. This commit removes the global manager variable which is responsible for storing and allocating pseudo source values and instead it introduces a new manager class named 'PseudoSourceValueManager'. Machine functions now own an instance of the pseudo source value manager class. This commit also modifies the 'get...' methods in the 'MachinePointerInfo' class to construct pseudo source values using the instance of the pseudo source value manager object from the machine function. This commit updates calls to the 'get...' methods from the 'MachinePointerInfo' class in a lot of different files because those calls now need to pass in a reference to a machine function to those methods. This change will make it easier to serialize pseudo source values as it will enable me to transform the mips specific MipsCallEntry PseudoSourceValue subclass into two target independent subclasses. Reviewers: Akira Hatanaka llvm-svn: 244693	2015-08-11 23:09:45 +00:00
Bill Schmidt	42ddd71120	[PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask Given certain shuffle-vector masks, LLVM emits splat instructions which splat the wrong bytes from the source register. The issue is that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp does not ensure that the splat pattern found is requesting bytes that are aligned on an EltSize boundary. This patch detects this situation as not a valid splat mask, resulting in a permute being generated instead of a splat. Patch and test case by Tyler Kenney, cleaned up a bit by me. This is a simple bug fix that would be good to incorporate into 3.7. llvm-svn: 243519	2015-07-29 14:31:57 +00:00
Sanjay Patel	1dd15598cf	fix TLI's combineRepeatedFPDivisors interface to return the minimum user threshold This fix was suggested as part of D11345 and is part of fixing PR24141. With this change, we can avoid walking the uses of a divisor node if the target doesn't want the combineRepeatedFPDivisors transform in the first place. There is no NFC-intended other than that. Differential Revision: http://reviews.llvm.org/D11531 llvm-svn: 243498	2015-07-28 23:05:48 +00:00
Chih-Hung Hsieh	1e859582d6	Implement target independent TLS compatible with glibc's emutls.c. The 'common' section TLS is not implemented. Current C/C++ TLS variables are not placed in common section. DWARF debug info to get the address of TLS variables is not generated yet. clang and driver changes in http://reviews.llvm.org/D10524 Added -femulated-tls flag to select the emulated TLS model, which will be used for old targets like Android that do not support ELF TLS models. Added TargetLowering::LowerToTLSEmulatedModel as a target-independent function to convert a SDNode of TLS variable address to a function call to __emutls_get_address. Added into lib/Target//ISelLowering.cpp to call LowerToTLSEmulatedModel for TLSModel::Emulated. Although all targets supporting ELF TLS models are enhanced, emulated TLS model has been tested only for Android ELF targets. Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for emulated TLS variables. Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls. TODO: Add proper DIE for emulated TLS variables. Added new unit tests with emulated TLS. Differential Revision: http://reviews.llvm.org/D10522 llvm-svn: 243438	2015-07-28 16:24:05 +00:00
Pete Cooper	2e20147403	Revert "Add const to some Type* parameters which didn't need to be mutable. NFC." This reverts commit r243146. Feedback from Craig Topper and David Blaikie was that we don't put const on Type as it has no mutable state. llvm-svn: 243282	2015-07-27 17:15:24 +00:00
Pete Cooper	098f7c1fcb	Add const to some Type* parameters which didn't need to be mutable. NFC. We were only getting the size of the type which doesn't need to modify the type. llvm-svn: 243146	2015-07-24 19:19:26 +00:00
Pete Cooper	0debbdc872	Use foreach loops for StructType::elements(). NFC. We had a few places where we did for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) { but those could instead do for (auto *EltTy : STy->elements()) { llvm-svn: 243136	2015-07-24 18:55:49 +00:00
Bill Schmidt	54cced54a6	[PowerPC] v4i32 is a VSRCRegClass I was looking at some vector code generation and kept seeing unnecessary vector copies into the Altivec half of the VSX registers. I discovered that we overlooked v4i32 when adding the register classes for VSX; we only added v4f32 and v2f64. This means that anything that canonicalizes into v4i32 (which is a LOT of stuff) ends up being forced into VRRC on its way to VSRC. The fix is one line. The rest of the patch is fixing up some test cases whose code generation has changed as a result. This seems like it would be a good candidate for backport to 3.7. llvm-svn: 242442	2015-07-16 21:14:07 +00:00
Bill Schmidt	1e77bb12b4	[PPC64LE] Fix vec_sld semantics for little endian The vec_sld interface provides access to the vsldoi instruction. Unlike most of the vec_* interfaces, we do not attempt to change the generated code for vec_sld based on the endian mode. It is too difficult to correctly infer the desired semantics because of different element types, and the corrected instruction sequence is expensive, involving loading a permute control vector and performing a generalized permute. For GCC, this was implemented as "Don't touch the vec_sld" implementation. When it came time for the LLVM implementation, I did the same thing. However, this was hasty and incorrect. In LLVM's version of altivec.h, vec_sld was previously defined in terms of the vec_perm interface. Because vec_perm semantics are adjusted for little endian, this means that leaving vec_sld untouched causes it to generate something different for LE than for BE. Not good. This back-end patch accompanies the changes to altivec.h that change vec_sld's behavior for little endian. Those changes mean that we see slightly different code in the back end when trying to recognize a VSLDOI instruction in isVSLDOIShuffleMask. In particular, a ShuffleKind of 1 (where the two inputs are identical) must now be treated the same way as a ShuffleKind of 2 (little endian with different inputs) when little endian mode is in force. This is because ShuffleKind of 1 is defined using big-endian numbering. This has a ripple effect on LowerBUILD_VECTOR, where we create our own internal VSLDOI instructions. Because these are a ShuffleKind of 1, they will now have their shift amounts subtracted from 16 when recognizing the shuffle mask. To avoid problems we have to subtract them from 16 again before creating the VSLDOI instructions. There are a couple of other uses of BuildVSLDOI, but these do not need to be modified because the shift amount is 8, which is unchanged when subtracted from 16. llvm-svn: 242296	2015-07-15 15:45:30 +00:00
Hal Finkel	cbf08925ef	[PowerPC] Make use of the TargetRecip system r238842 added the TargetRecip system for controlling use of reciprocal estimates for sqrt and division using a set of parameters that can be set by the frontend. Clang now supports a sophisticated -mrecip option, and this will allow that option to effectively control the relevant code-generation functionality of the PPC backend. llvm-svn: 241985	2015-07-12 02:33:57 +00:00
Hal Finkel	965cea5670	[PowerPC] Support the nest parameter attribute This adds support for the 'nest' attribute, which allows the static chain register to be set for functions calls under non-Darwin PPC/PPC64 targets. r11 is the chain register (which the PPC64 ELF ABI calls the "environment pointer"). For indirect calls under PPC64 ELFv1, this would normally be loaded from the function descriptor, but providing an explicit 'nest' parameter will override that process and use the value provided. This allows __builtin_call_with_static_chain to work as expected on PowerPC. llvm-svn: 241984	2015-07-12 00:37:44 +00:00

1 2 3 4 5 ...

1165 Commits