llvm-project

Commit Graph

Author	SHA1	Message	Date
Evan Cheng	21b0348199	Disable the Thumb no-return call optimization: mov lr, pc b.w _foo The "mov" instruction doesn't set bit zero to one, it's putting incorrect value in lr. It messes up backtraces. rdar://12663632 llvm-svn: 167657	2012-11-10 02:09:05 +00:00
Chad Rosier	66bb178eef	Revert r167620; this can be implemented using an existing CL option. llvm-svn: 167622	2012-11-09 18:25:27 +00:00
Chad Rosier	332fc75b2c	Add support for -mstrict-align compiler option for ARM targets. rdar://12340498 llvm-svn: 167620	2012-11-09 17:29:38 +00:00
Chad Rosier	1ec8e404fc	Mark the Int_eh_sjlj_dispatchsetup pseudo instruction as clobbering all registers. Previously, the register we being marked as implicitly defined, but not killed. In some cases this would cause the register scavenger to spill a dead register. Also, use an empty register mask to simplify the logic and to reduce the memory footprint. rdar://12592448 llvm-svn: 167499	2012-11-06 23:05:24 +00:00
Quentin Colombet	8e1fe84c3c	Vext Lowering was missing opportunities llvm-svn: 167318	2012-11-02 21:32:17 +00:00
Quentin Colombet	5799e9f66c	Change ForceSizeOpt attribute into MinSize attribute llvm-svn: 167020	2012-10-30 16:32:52 +00:00
Quentin Colombet	3ee56a3bf5	[code size][ARM] Emit regular call instructions instead of the move, branch sequence llvm-svn: 166854	2012-10-27 01:10:17 +00:00
Stepan Dyatkovskiy	dab8043048	ARM: Removed extra stack frame object for fixed byval arguments, VarArgsStyleRegisters invocation was reworked due to some improper usage in past. PR14099 also demonstrates it. llvm-svn: 166273	2012-10-19 08:23:06 +00:00
Stepan Dyatkovskiy	e59a920b0c	Issue: Stack is formed improperly for long structures passed as byval arguments for EABI mode. If we took AAPCS reference, we can found the next statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have structure with doubles (9 double fields) and 3 Core unused registers (r1, r2, r3): caller should use r2 and r3 registers only. Currently r1,r2,r3 set is used, but it is invalid. Callee VA routine should also use r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up SP address with ADD+BFC operations. Fix: Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and 8 byte alignment, we waste odd registers then. P.S.: I also improved LDRB_POST_IMM regression test. Since ldrb instruction will not generated by current regression test after this patch. llvm-svn: 166018	2012-10-16 07:16:47 +00:00
Silviu Baranga	b14097000b	Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. llvm-svn: 165929	2012-10-15 09:41:32 +00:00
Manman Ren	7e48b252e7	ARM: tail-call inside a function where part of a byval argument is on caller's local frame causes problem. For example: void f(StructToPass s) { g(&s, sizeof(s)); } will cause problem with tail-call since part of s is passed via registers and saved in f's local frame. When g tries to access s, part of s may be corrupted since f's local frame is popped out before the tail-call. The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for the caller. This is a conservative approach, if we can prove the address of s or part of s is not taken and passed to g, it should be okay to perform tail-call. rdar://12442472 llvm-svn: 165853	2012-10-12 23:39:43 +00:00
Jim Grosbach	30af442a84	ARM: Mark VSELECT as 'expand'. The backend already pattern matches to form VBSL when it can. We may want to teach it to use the vbsl intrinsics at some point to prevent machine licm from mucking with this, but using the Expand is completely correct. http://llvm.org/bugs/show_bug.cgi?id=13831 http://llvm.org/bugs/show_bug.cgi?id=13961 Patch by Peter Couperus <peter.couperus@st.com>. llvm-svn: 165845	2012-10-12 22:59:21 +00:00
Stepan Dyatkovskiy	283baa0027	Fix for LDRB instruction: SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer that described in .td. 7 ops is needed, but SDNode with only 6 is created. In more details: In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset operand is defined as am2offset_imm. am2offset_imm is complex parameter type, and actually it consists from dummy register and imm itself. As I understood trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy register was not added to SDNode, and it cause crash in Peephole Optimizer pass. The problem fixed by setting up additional dummy reg when emitting LDRB_POST_IMM instruction. llvm-svn: 165617	2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy	f13dbb8e24	Issue description: SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack objects and byval parameters. So loading byval parameters from stack may be inserted before it will be stored, since these operations are treated as independent. Fix: Currently ARMTargetLowering::LowerFormalArguments saves byval registers with FixedStack MachinePointerInfo. To fix the problem we need to store byval registers with MachinePointerInfo referenced to first the "byval" parameter. Also commit adds two new fields to the InputArg structure: Function's argument index and InputArg's part offset in bytes relative to the start position of Function's argument. E.g.: If function's argument is 128 bit width and it was splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index, but different offset values. llvm-svn: 165616	2012-10-10 11:37:36 +00:00
Bill Wendling	c9b22d735a	Create enums for the different attributes. We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488	2012-10-09 07:45:08 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
Bob Wilson	e8a549cd92	Add LLVM support for Swift. llvm-svn: 164899	2012-09-29 21:43:49 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00
James Molloy	9e98ef1c59	Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM. llvm-svn: 164685	2012-09-26 09:48:32 +00:00
Evan Cheng	90ae8f8442	Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte aligned address. Based on patch by David Peixotto. Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment hints. rdar://12090772, rdar://12238782 llvm-svn: 164089	2012-09-18 01:42:45 +00:00
Silviu Baranga	b47bb94f93	This patch introduces A15 as a target in LLVM. llvm-svn: 163803	2012-09-13 15:05:10 +00:00
Craig Topper	3e41a5bb31	Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458	2012-09-08 04:58:43 +00:00
Jakob Stoklund Olesen	e45e22b20f	Custom DAGCombine for and/or/xor are for all ARMs. The 'select' transformations apply to all ARM architectures and don't require hasV6T2Ops. llvm-svn: 163396	2012-09-07 17:34:15 +00:00
James Molloy	9d30dc2432	Fix self-host; ensure signedness is consistent. llvm-svn: 163306	2012-09-06 10:32:08 +00:00
James Molloy	49bdbce8e1	Improve codegen for BUILD_VECTORs on ARM. If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base. llvm-svn: 163304	2012-09-06 09:55:02 +00:00
Arnold Schwaighofer	f00fb1c581	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136	2012-09-04 14:37:49 +00:00
Jakob Stoklund Olesen	d3bda3c5b9	Fix a couple of typos in EmitAtomic. Thumb2 instructions are mostly constrained to rGPR, not tGPR which is for Thumb1. rdar://problem/12203728 llvm-svn: 162968	2012-08-31 02:08:34 +00:00
Jakob Stoklund Olesen	710093e360	Use a SmallPtrSet to dedup successors in EmitSjLjDispatchBlock. The test case ARM/2011-05-04-MultipleLandingPadSuccs.ll was creating duplicate successor list entries. llvm-svn: 162222	2012-08-20 20:52:03 +00:00
Jakob Stoklund Olesen	e1014e7b98	Remove the CAND/COR/CXOR custom ISD nodes and their select code. These nodes are no longer needed because the peephole pass can fold CMOV+AND into ANDCC etc. llvm-svn: 162179	2012-08-18 21:49:50 +00:00
Jakob Stoklund Olesen	dded061f85	Also combine zext/sext into selects for ARM. This turns common i1 patterns into predicated instructions: (add (zext cc), x) -> (select cc (add x, 1), x) (add (sext cc), x) -> (select cc (add x, -1), x) For a function like: unsigned f(unsigned s, int x) { return s + (x>0); } We now produce: cmp r1, #0 it gt addgt.w r0, r0, #1 Instead of: movs r2, #0 cmp r1, #0 it gt movgt r2, #1 add r0, r2 llvm-svn: 162177	2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen	aab43dbfbb	Also pass logical ops to combineSelectAndUse. Add these transformations to the existing add/sub ones: (and (select cc, -1, c), x) -> (select cc, x, (and, x, c)) (or (select cc, 0, c), x) -> (select cc, x, (or, x, c)) (xor (select cc, 0, c), x) -> (select cc, x, (xor, x, c)) The selects can then be transformed to a single predicated instruction by peephole. This transformation will make it possible to eliminate the ISD::CAND, COR, and CXOR custom DAG nodes. llvm-svn: 162176	2012-08-18 21:25:16 +00:00
Jakob Stoklund Olesen	c1dee482c8	Add comment, clean up code. No functional change. llvm-svn: 162107	2012-08-17 16:59:09 +00:00
Jakob Stoklund Olesen	c19bf0282d	Handle ARM MOVCC optimization in PeepholeOptimizer. Use the target independent select analysis hooks. llvm-svn: 162060	2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen	6cb96120f1	Fold predicable instructions into MOVCC / t2MOVCC. The ARM select instructions are just predicated moves. If the select is the only use of an operand, the instruction defining the operand can be predicated instead, saving one instruction and decreasing register pressure. This implementation can turn AND/ORR/EOR instructions into their corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to predicate any instruction, but we don't yet support predicated instructions in SSA form. llvm-svn: 161994	2012-08-15 22:16:39 +00:00
Evan Cheng	eec6bc6270	Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029 llvm-svn: 161962	2012-08-15 17:44:53 +00:00
Nadav Rotem	3a94c545cf	Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user. rdar://11876519 llvm-svn: 161775	2012-08-13 18:52:44 +00:00
Arnold Schwaighofer	b73da9453c	Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM architecture It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7 thumb O3. llvm-svn: 161736	2012-08-12 05:11:56 +00:00
Craig Topper	4fa625fda7	Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed. llvm-svn: 161735	2012-08-12 03:16:37 +00:00
Arnold Schwaighofer	81b2eec1ab	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581	2012-08-09 15:25:52 +00:00
Bob Wilson	3e6fa462f3	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Eric Christopher	b3322364e4	Add support for the ARM GHC calling convention, this patch was in 3.0, but somehow managed to be dropped later. Patch by Karel Gardas. llvm-svn: 161226	2012-08-03 00:05:53 +00:00
Jim Grosbach	6df755cc4e	ARM: Don't assume an SDNode is a constant. Before accessing a node as a ConstandSDNode, make sure it actually is one. No testcase of non-trivial size. rdar://11948669 llvm-svn: 160735	2012-07-25 17:02:47 +00:00
Andrew Trick	a22cdb713b	Fix ARMTargetLowering::isLegalAddImmediate to consider thumb encodings. Based on Evan's suggestion without a commitable test. llvm-svn: 160441	2012-07-18 18:34:27 +00:00
Andrew Trick	bc325168c3	whitespace llvm-svn: 160440	2012-07-18 18:34:24 +00:00
Manman Ren	6e1fd46fdf	ARM: use NOEN loads and stores if possible when handling struct byval. This change is to be enabled in clang. rdar://9877866 llvm-svn: 158684	2012-06-18 22:23:48 +00:00
Manman Ren	e0763c7472	ARM: optimization for sub+abs. This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551	2012-06-15 21:32:12 +00:00
Bill Wendling	4b79647a6e	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Manman Ren	e873552091	ARM: properly handle alignment for struct byval. Factor out the expansion code into a function. This change is to be enabled in clang. rdar://9877866 llvm-svn: 157830	2012-06-01 19:33:18 +00:00
Manman Ren	9f9111651e	ARM: support struct byval in llvm We handle struct byval by inserting a pseudo op, which will be expanded to a loop at ExpandISelPseudos. A separate patch for clang will be submitted to enable struct byval. rdar://9877866 llvm-svn: 157793	2012-06-01 02:44:42 +00:00
Justin Holewinski	aa58397b3c	Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall to pass around a struct instead of a large set of individual values. This cleans up the interface and allows more information to be added to the struct for future targets without requiring changes to each and every target. NV_CONTRIB llvm-svn: 157479	2012-05-25 16:35:28 +00:00
Jakob Stoklund Olesen	691ae3388f	Use the right register class for LDRrs. llvm-svn: 157152	2012-05-20 06:38:47 +00:00
Benjamin Kramer	e31f31e5c0	Add a new target hook "predictableSelectIsExpensive". This will be used to determine whether it's profitable to turn a select into a branch when the branch is likely to be predicted. Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM. I'm not entirely happy with the name of this flag, suggestions welcome ;) llvm-svn: 156233	2012-05-05 12:49:14 +00:00
Matt Beaumont-Gay	e82ab6baa7	Pacify GCC's -Wreturn-type llvm-svn: 156189	2012-05-04 18:34:27 +00:00
Hans Wennborg	aea412008e	Make ARM and Mips use TargetMachine::getTLSModel() This moves the logic for selecting a TLS model to a single place, instead of the previous three (ARM, Mips, and X86 which already uses this function). llvm-svn: 156162	2012-05-04 09:40:39 +00:00
Bob Wilson	9245c93656	Don't introduce illegal types when creating vmull operations. <rdar://11324364> ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. llvm-svn: 155824	2012-04-30 16:53:34 +00:00
Craig Topper	c7242e054d	Convert more uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent. llvm-svn: 155188	2012-04-20 07:30:17 +00:00
Evan Cheng	d0007f3c83	Handle llvm.fma.* intrinsics. rdar://10914096 llvm-svn: 154439	2012-04-10 21:40:28 +00:00
Evan Cheng	f8bad08001	Fix a long standing tail call optimization bug. When a libcall is emitted legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Chad Rosier	e0e38f61a5	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather then a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340	2012-04-09 20:32:02 +00:00
Chad Rosier	99cbde9e82	Update comments and remove unnecessary isVolatile() check. llvm-svn: 154336	2012-04-09 19:38:15 +00:00
Jim Grosbach	0c509fa6bf	Tidy up. 80 columns. llvm-svn: 154226	2012-04-06 23:43:50 +00:00
Chandler Carruth	8a102c21e3	There is no portable std::abs overload for int64_t, use the llvm::abs64 which exists for this purpose. llvm-svn: 154199	2012-04-06 20:10:52 +00:00
Jakob Stoklund Olesen	967b86a0a2	Allow negative immediates in ARM and Thumb2 compares. ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. llvm-svn: 154183	2012-04-06 17:45:04 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Evan Cheng	a40d40602c	ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249 llvm-svn: 153717	2012-03-30 01:24:39 +00:00
Lang Hames	591cdaf2ee	Try using vmov.i32 to materialize FP32 constants that can't be materialized by vmov.f32. llvm-svn: 153696	2012-03-29 21:56:11 +00:00
Craig Topper	f6e7e12f75	Remove unnecessary llvm:: qualifications llvm-svn: 153500	2012-03-27 07:21:54 +00:00
Craig Topper	5fa0caafc0	Prune includes and replace uses of ARMRegisterInfo.h with ARMBaeRegisterInfo.h llvm-svn: 153422	2012-03-26 00:45:15 +00:00
Craig Topper	07720d8dcd	Replace uses of ARMBaseInstrInfo and ARMTargetMachine with the Base versions. llvm-svn: 153421	2012-03-25 23:49:58 +00:00
Anton Korobeynikov	3edd854d64	Perform mul combine when multiplying wiht negative constants. Patch by Weiming Zhao! This fixes PR12212 llvm-svn: 153049	2012-03-19 19:19:50 +00:00
Craig Topper	188ed9d56e	Reorder includes to match coding standards. Fix an issue or two exposed by that. llvm-svn: 152978	2012-03-17 07:33:42 +00:00
Lang Hames	c35ee8b54a	Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824	2012-03-15 18:49:02 +00:00
Craig Topper	bef78fc2ee	Convert more static tables of registers used by calling convention to uint16_t to reduce space. llvm-svn: 152538	2012-03-11 07:57:25 +00:00
Craig Topper	420525ce3b	Use uint16_t to store registers in callee saved register tables to reduce size of static data. llvm-svn: 151996	2012-03-04 03:33:22 +00:00
Evan Cheng	d12af5dc69	Neuter the optimization I implemented with r107852 and r108258 which turn some floating point equality comparisons into integer ones with -ffast-math. The issue is the optimization causes +0.0 != -0.0. Now the optimization is only done when one side is known to be 0.0. The other side's sign bit is masked off for the comparison. rdar://10964603 llvm-svn: 151861	2012-03-01 23:27:13 +00:00
Evan Cheng	65f9d19c4f	Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call. llvm-svn: 151645	2012-02-28 18:51:51 +00:00
Daniel Dunbar	ee7b899343	Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630	2012-02-28 15:36:07 +00:00
Evan Cheng	87c7b09d8d	Some ARM implementaions, e.g. A-series, does return stack prediction. That is, the processor keeps a return addresses stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt RAS and causes 100% return misprediction so LLVM should use a unconditional branch instead. i.e. mov lr, pc b _foo The "mov lr, pc" is issued in order to get proper backtrace. rdar://8979299 llvm-svn: 151623	2012-02-28 06:42:03 +00:00
Jakob Stoklund Olesen	fa7a53746c	Switch ARM target to register masks. I'll let the buildbots determine the compile time improvements from this change, but 464.h264ref has 5% faster codegen at -O2. This patch does cause some assembly changes. Branch folding can make different decisions about calls with dead return values. CriticalAntiDepBreaker may choose different registers because its liveness tracking is affected. MachineCopyPropagation may sometimes leave a dead copy behind. llvm-svn: 151331	2012-02-24 01:19:29 +00:00
Dan Gohman	d4a77c4682	When emitting a cmp with 0 for a lowered select, mask out the high bits of the value carying the boolean condition, as their contents are undefined. This fixes rdar://10887484. llvm-svn: 151310	2012-02-24 00:09:36 +00:00
Evan Cheng	f258a15bdf	Canonicalize (srl (bswap x), 16) to (rotr (bswap x), 16) if the high 16 bits of x are zero. This optimizes rev + lsr 16 to rev16. rdar://10750814 llvm-svn: 151230	2012-02-23 02:58:19 +00:00
Evan Cheng	e87681cf34	Optimize a couple of common patterns involving conditional moves where the false value is zero. Instead of a cmov + op, issue an conditional op instead. e.g. cmp r9, r4 mov r4, #0 moveq r4, #1 orr lr, lr, r4 should be: cmp r9, r4 orreq lr, lr, #1 That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y). It's possible to extend this to ADD and SUB but I don't think they are common. rdar://8659097 llvm-svn: 151224	2012-02-23 01:19:06 +00:00
Craig Topper	760b134ffa	Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified. llvm-svn: 151134	2012-02-22 05:59:10 +00:00
Evan Cheng	0460ae8d80	Proper support for a bastardized darwin-eabi hybird ABI. llvm-svn: 151083	2012-02-21 20:46:00 +00:00
James Molloy	547d4c0662	Improve generated code for extending loads and some trunc stores on ARM. Teach TargetSelectionDAG about lengthening loads for vector types and set v4i8 as legal. Allow FP_TO_UINT for v4i16 from v4i32. llvm-svn: 150956	2012-02-20 09:24:05 +00:00
Bill Wendling	05d6f2ff1e	Don't reserve the R0 and R1 registers here. We don't use these registers, and marking them as "live-in" into a BB ruins some invariants that the back-end tries to maintain. llvm-svn: 150437	2012-02-13 23:47:16 +00:00
Jason W Kim	c7f4841769	Make valgrind happy. llvm-svn: 150251	2012-02-10 16:07:59 +00:00
Craig Topper	e55c556a24	Convert assert(0) to llvm_unreachable llvm-svn: 149961	2012-02-07 02:50:20 +00:00
Anton Korobeynikov	d0c46550fe	Cleanups for EABI standard functions llvm-svn: 149195	2012-01-29 09:11:50 +00:00
Anton Korobeynikov	1b42e64280	Use base AAPCS for varargs functions even for AAPCS-VFP CC llvm-svn: 149194	2012-01-29 09:06:09 +00:00
David Blaikie	46a9f016c5	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
David Blaikie	5d8e42755c	Refactor variables unused under non-assert builds (& remove two entirely unused variables). llvm-svn: 148230	2012-01-16 05:17:39 +00:00
Benjamin Kramer	339ced4e34	Return an ArrayRef from ShuffleVectorSDNode::getMask and push it through CodeGen. llvm-svn: 148218	2012-01-15 13:16:05 +00:00
Jakob Stoklund Olesen	083dbdca7f	Match SelectionDAG logic for enabling movt. Darwin doesn't do static, and ELF targets only support static. llvm-svn: 147740	2012-01-07 20:49:15 +00:00
Benjamin Kramer	6898db6269	Remove VectorExtras. This unused helper was written for a type of API that is discouraged now. llvm-svn: 147738	2012-01-07 19:42:13 +00:00
Bob Wilson	1a74de9504	Add variants of the dispatchsetup pseudo for Thumb and !VFP. <rdar://10620138> My change r146949 added register clobbers to the eh_sjlj_dispatchsetup pseudo instruction, but on Thumb1 some of those registers cannot be used. This caused massive failures on the testsuite when compiling for Thumb1. While fixing that, I noticed that the eh_sjlj_setjmp instruction has a "nofp" variant, and I realized that dispatchsetup needs the same thing, so I have added that as well. llvm-svn: 147204	2011-12-22 23:39:48 +00:00
Eli Friedman	c9bf1b1bff	Make check a bit more strict so we don't call ARM_AM::getFP32Imm with a value that isn't a 32-bit value. (This is just to be safe; I don't think this actually causes any issues in practice.) llvm-svn: 146700	2011-12-15 22:56:53 +00:00
Chandler Carruth	637cc6a8aa	Initial CodeGen support for CTTZ/CTLZ where a zero input produces an undefined result. This adds new ISD nodes for the new semantics, selecting them when the LLVM intrinsic indicates that the undef behavior is desired. The new nodes expand trivially to the old nodes, so targets don't actually need to do anything to support these new nodes besides indicating that they should be expanded. I've done this for all the operand types that I could figure out for all the targets. Owners of various targets, please review and let me know if any of these are incorrect. Note that the expand behavior is conservatively correct, and exactly matches LLVM's current behavior with these operations. Ideally this patch will not change behavior in any way. For example the regtest suite finds the exact same instruction sequences coming out of the code generator. That's why there are no new tests here -- all of this is being exercised by the existing test suite. Thanks to Duncan Sands for reviewing the various bits of this patch and helping me get the wrinkles ironed out with expanding for each target. Also thanks to Chris for clarifying through all the discussions that this is indeed the approach he was looking for. That said, there are likely still rough spots. Further review much appreciated. llvm-svn: 146466	2011-12-13 01:56:10 +00:00
Stepan Dyatkovskiy	4683740967	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Third attempt: simplified checks in test for armv7-apple-darwin11. llvm-svn: 146341	2011-12-11 14:35:48 +00:00
Chad Rosier	6641294e3b	Revert r146322 to appease buildbots. Original commit message: Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146328	2011-12-10 19:55:03 +00:00
Stepan Dyatkovskiy	df0b779e9f	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146322	2011-12-10 08:42:24 +00:00
Eli Friedman	4e36a934dc	Splats can contain undef's; make sure to handle them correctly. PR11526. llvm-svn: 146299	2011-12-09 23:54:42 +00:00
Daniel Dunbar	c09e4593b2	Revert r146143, "Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2).", it is failing tests. llvm-svn: 146157	2011-12-08 17:32:18 +00:00
Stepan Dyatkovskiy	a4bcf27dae	Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). llvm-svn: 146143	2011-12-08 07:55:03 +00:00
Evan Cheng	7f8e563a69	Add bundle aware API for querying instruction properties and switch the code generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if all of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026	2011-12-07 07:15:52 +00:00
Nick Lewycky	50f02cb21b	Move global variables in TargetMachine into new TargetOptions class. As an API change, now you need a TargetOptions object to create a TargetMachine. Clang patch to follow. One small functionality change in PTX. PTX had commented out the machine verifier parts in their copy of printAndVerify. That now calls the version in LLVMTargetMachine. Users of PTX who need verification disabled should rely on not passing the command-line flag to enable it. llvm-svn: 145714	2011-12-02 22:16:29 +00:00
Benjamin Kramer	7ba71be392	Move code into anonymous namespaces. llvm-svn: 145154	2011-11-26 23:01:57 +00:00
Bob Wilson	f6d1728d8f	Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602> The EmitBasePointerRecalculation function has 2 problems, one minor and one fatal. The minor problem is that it inserts the code at the setjmp instead of in the dispatch block. The fatal problem is that at the point where this code runs, we don't know whether there will be a base pointer, so the entire function is a no-op. The base pointer recalculation needs to be handled as it was before, by inserting a pseudo instruction that gets expanded late. Most of the support for the old approach is still here, but it no longer has any connection to the eh_sjlj_dispatchsetup intrinsic. Clean up the parts related to the intrinsic and just generate the pseudo instruction directly. llvm-svn: 144781	2011-11-16 07:11:57 +00:00
Jay Foad	0745e645e0	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144631	2011-11-15 07:24:32 +00:00
Evan Cheng	7ca4b6eb5c	Add vmov.f32 to materialize f32 immediate splats which cannot be handled by integer variants. rdar://10437054 llvm-svn: 144608	2011-11-15 02:12:34 +00:00
Eli Friedman	c4a001478c	Make sure to expand SIGN_EXTEND_INREG for NEON vectors. PR11319, round 3. llvm-svn: 144361	2011-11-11 03:16:38 +00:00
Eli Friedman	2d4055b683	Make sure we correctly unroll conversions between v2f64 and v2i32 on ARM. llvm-svn: 144241	2011-11-09 23:36:02 +00:00
Lang Hames	b85fcd07df	Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported. Add support for trimming constants to GetDemandedBits. This fixes some funky constant generation that occurs when stores are expanded for targets that don't support unaligned stores natively. llvm-svn: 144102	2011-11-08 18:56:23 +00:00
Pete Cooper	82cd9e81fc	Added invariant field to the DAG.getLoad method and changed all calls. When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses llvm-svn: 144100	2011-11-08 18:42:53 +00:00
Eli Friedman	6f84fed675	Make sure to mark vector extload's as expand on ARM. Fixes PR11319. llvm-svn: 144057	2011-11-08 01:43:53 +00:00
Dan Gohman	198b7ffc11	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Lang Hames	1f4603d498	Fixed parameter name. llvm-svn: 143594	2011-11-02 23:37:04 +00:00
Lang Hames	9929c423a1	Try to lower memset/memcpy/memmove to vector instructions on ARM where the alignment permits. llvm-svn: 143582	2011-11-02 22:52:45 +00:00
Dan Gohman	9b9c970148	Revert r143206, as there are still some failing tests. llvm-svn: 143262	2011-10-29 00:41:52 +00:00
Dan Gohman	73057ad24f	Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206	2011-10-28 17:55:38 +00:00
Duncan Sands	225a7037d6	Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188	2011-10-28 09:55:57 +00:00
Dan Gohman	4db3f7dd83	Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177	2011-10-28 01:29:32 +00:00
Lang Hames	c47e283430	Make sure short memsets on ARM lower to stores, even when optimizing for size. llvm-svn: 143055	2011-10-26 20:56:52 +00:00
James Molloy	dd9137aa56	Revert r142530 at least temporarily while a discussion is had on llvm-commits regarding exactly how much optsize should optimize for size over performance. llvm-svn: 143023	2011-10-26 08:53:19 +00:00
Bill Wendling	1414bc5a14	Use a worklist to prevent the iterator from becoming invalidated because of the 'removeSuccessor' call. Noticed in a Release+Asserts+Check buildbot. llvm-svn: 143018	2011-10-26 07:16:18 +00:00
Evan Cheng	043c9d3f7a	Revert part of r142530. The patch potentially hurts performance especially on Darwin platforms where -Os means optimize for size without hurting performance. llvm-svn: 143002	2011-10-26 01:17:44 +00:00
Eli Friedman	a5e244c08d	Don't crash on variable insertelement on ARM. PR10258. llvm-svn: 142871	2011-10-24 23:08:52 +00:00
Dan Gohman	4ed1afa51d	Change this overloaded use of Sched::Latency to be an overloaded use of Sched::ILP instead, as Sched::Latency is going away. llvm-svn: 142813	2011-10-24 17:55:11 +00:00
Bill Wendling	94e6643fce	The different flavors of ARM have different valid subsets of registers. Check that the set of callee-saved registers is correct for the specific platform. <rdar://problem/10313708> & ctor_dtor_count & ctor_dtor_count-2 llvm-svn: 142706	2011-10-22 00:29:28 +00:00
Bill Wendling	cf7bdf4438	Add missing operand. <rdar://problem/10313323> llvm-svn: 142615	2011-10-20 20:37:11 +00:00
James Molloy	2d768fd379	Use literal pool loads instead of MOVW/MOVT for materializing global addresses when optimizing for size. On spec/gcc, this caused a codesize improvement of ~1.9% for ARM mode and ~4.9% for Thumb(2) mode. This is codesize including literal pools. The pools themselves doubled in size for ARM mode and quintupled for Thumb mode, leaving suggestion that there is still perhaps redundancy in LLVM's use of constant pools that could be decreased by sharing entries. Fixes PR11087. llvm-svn: 142530	2011-10-19 14:11:07 +00:00
Bill Wendling	2977a15ab1	Make sure we emit the 'movw' and 'movt' only if it's supported. Otherwise, use a constant pool. llvm-svn: 142485	2011-10-19 09:24:02 +00:00
Bill Wendling	7c1634556d	Remove some dead code. llvm-svn: 142484	2011-10-19 09:04:11 +00:00
Bill Wendling	94f60018e0	Emit the MOVT instruction only if the # LPads is > 64K. llvm-svn: 142460	2011-10-18 23:19:55 +00:00
Bill Wendling	64e6bfc16c	For Thumb mode, we need to use a constant pool if the value is too large to be used with the CMP instruction. llvm-svn: 142458	2011-10-18 23:11:05 +00:00
Bill Wendling	4969dcdef9	Use the integer compare when the value is small enough. Use the "move into a register and then compare against that" method when it's too large. We have to move the value into the register in the "movw, movt" pair of instructions. llvm-svn: 142440	2011-10-18 22:52:20 +00:00
Bill Wendling	85833f71c6	Use the integer compare when the value is small enough. Use the "move into a register and then compare against that" method when it's too large. We have to move the value into the register in the "movw, movt" pair of instructions. llvm-svn: 142437	2011-10-18 22:49:07 +00:00
Bill Wendling	973c817cde	The value we're comparing against may be too large for the ARM CMP instruction. Move the value into a register and then use that for the CMP. <rdar://problem/10305266> llvm-svn: 142431	2011-10-18 22:11:18 +00:00
Bill Wendling	b2a703d352	The immediate may be too large for the CMP instruction. Move it into a register and use that in the CMP. <rdar://problem/10305266> llvm-svn: 142429	2011-10-18 21:55:58 +00:00
Andrew Trick	88b2450adc	Use ARM/t2PseudoInst class from ARM/Thumb2 special adds/subs patterns. Clean up the patterns, fix comments, and avoid confusing both tools and coders. Note that the special adds/subs SelectionDAG nodes no longer have the dummy cc_out operand. llvm-svn: 142397	2011-10-18 19:18:52 +00:00
Bob Wilson	93b0f7b319	Use isIntN and isUIntN to check for valid signed/unsigned numbers. llvm-svn: 142395	2011-10-18 18:46:49 +00:00
Andrew Trick	3f07c429b5	whitespace llvm-svn: 142394	2011-10-18 18:40:53 +00:00
Bill Wendling	617075fcf6	A landing pad could have more than one predecessor. In that case, we want that predecessor to remove the jump to it as well. Delay clearing the 'landing pad' flag until after the jumps have been removed. (There is an implicit assumption in several modules that an MBB which jumps to a landing pad has only two successors.) <rdar://problem/10304224> llvm-svn: 142390	2011-10-18 18:30:49 +00:00
Bob Wilson	9258b76d8d	Fix incorrect check for sign-extended constant BUILD_VECTOR. <rdar://problem/10298332> llvm-svn: 142371	2011-10-18 17:34:51 +00:00
Duncan Sands	d278d35b13	Fix a bunch of unused variable warnings when doing a release build with gcc-4.6. llvm-svn: 142350	2011-10-18 12:44:00 +00:00
Bill Wendling	510fbcd440	Don't renumber the blocks here. This could cause problems later on if another pass renumbers the blocks again. llvm-svn: 142258	2011-10-17 21:32:56 +00:00
Bill Wendling	f7f223f69e	Add a call to EmitSjLjDispatchBlock. Once the intrinsics are marked as having a custom inserter, it will call this method to emit the dispatch table into the machine function. llvm-svn: 142245	2011-10-17 20:37:20 +00:00
Bill Wendling	26d2780d07	Add comment explaining that the order of processing doesn't matter here. llvm-svn: 142176	2011-10-17 05:25:09 +00:00

1 2 3 4 5 ...

958 Commits