llvm-project

Author	SHA1	Message	Date
Craig Topper	0498b88d48	Post process ADC/SBB and use a shorter encoding if they use a sign extended immediate. llvm-svn: 177243	2013-03-18 03:34:55 +00:00
Craig Topper	7e9a1cb199	Refactor some duplicated code into helper functions. llvm-svn: 177242	2013-03-18 02:53:34 +00:00
Craig Topper	612f7bfa4d	Add X86 code emitter support AVX encoded MRMDestReg instructions. Previously we weren't skipping the VVVV encoded register. Based on patch by Michael Liao. llvm-svn: 177221	2013-03-16 03:44:31 +00:00
Jakob Stoklund Olesen	63bff2eb39	Define more SchedWrites for annotating X86 instructions. Since almost all X86 instructions can fold loads, use a multiclass to define register/memory pairs of SchedWrites. An X86FoldableSchedWrite represents the register version of an instruction. It holds a reference to the SchedWrite to use when the instruction folds a load. This will be used inside multiclasses that define rr and rm instruction versions together. llvm-svn: 177210	2013-03-16 00:02:17 +00:00
Eric Christopher	8996c5d469	Silence anonymous type in anonymous union warnings. llvm-svn: 177135	2013-03-15 00:42:55 +00:00
Nadav Rotem	adfa5eaf8c	Unaligned loads should use the VMOVUPS opcode. llvm-svn: 177130	2013-03-14 23:49:44 +00:00
Jakob Stoklund Olesen	712366821a	Prepare for adding InstrSchedModel annotations to X86 instructions. The new InstrSchedModel is easier to use than the instruction itineraries. It will be used to model instruction latency and throughput in modern Intel microarchitectures like Sandy Bridge. InstrSchedModel should be able to coexist with instruction itinerary classes, but for cleanliness we should switch the Atom processor model to the new InstrSchedModel as well. llvm-svn: 177122	2013-03-14 22:42:17 +00:00
Chad Rosier	4b54f594b4	[fast-isel] The X86FastISel::FastLowerArguments function doesn't properly handle the win64 calling convention. rdar://13423768 llvm-svn: 177113	2013-03-14 21:25:04 +00:00
Craig Topper	ba82429826	Fix the name of a variable to match its declaration. Fixes build failure from r177014. llvm-svn: 177015	2013-03-14 07:47:43 +00:00
Craig Topper	872999737d	Fix a bug in the calculation of the VEX.B bit for FMA4 rr with the VEX.W bit set. The VEX.B was being calculated from the wrong operand. Fixes at least some portion of PR14185. llvm-svn: 177014	2013-03-14 07:40:52 +00:00
Craig Topper	a66d81d521	Teach X86 MC instruction lowering that VMOVAPSrr and other VEX-encoded register to register moves should be switched from using the MRMSrcReg form to the MRMDestReg form if the source register is a 64-bit extended register and the destination register is not. This allows the instruction to be encoded using the 2-byte VEX form instead of the 3-byte VEX form. The GNU assembler has similar behavior. llvm-svn: 177011	2013-03-14 07:09:57 +00:00
Michael Liao	20d287044c	Fix PR15309 - Fix the typo on type checking llvm-svn: 177010	2013-03-14 06:57:42 +00:00
Kevin Enderby	f15856ebb4	Fixes disassembler crashes on 2013 Haswell RTM instructions. rdar://13318048 llvm-svn: 176828	2013-03-11 21:17:13 +00:00
Tom Stellard	b1588fc057	DAGCombiner: Use correct value type for checking legality of BR_CC v3 LegalizeDAG.cpp uses the value of the comparison operands when checking the legality of BR_CC, so DAGCombiner should do the same. v2: - Expand more BR_CC value types for NVPTX v3: - Expand correct BR_CC value types for Hexagon, Mips, and XCore. llvm-svn: 176694	2013-03-08 15:36:57 +00:00
Benjamin Kramer	2c3d0df8ee	X86: Fold EXTRACT_SUBVECTORs of a BUILD_VECTOR into a smaller BUILD_VECTOR. That can usually be lowered efficiently and is common in sandybridge code. It would be nice to do this in DAGCombiner but we can't insert arbitrary BUILD_VECTORs this late. Fixes PR15462. llvm-svn: 176634	2013-03-07 18:48:40 +00:00
Michael Liao	d5cac37dc5	Fix two remaining issues after fixing PR15355 when CMOV is not available - Phi nodes should be replaced/updated after lowering CMOV into branch because 'mainMBB' updating operand in Phi node is changed. - Add EFLAGS in livein before lowering the 2nd CMOV. It's necessary as we will reuse the EFLAGS generated before the 1st lowered CMOV, which won't clobber EFLAGS. However, we need to explicitly specify that. - '-attr=-cmov' test cases are added. llvm-svn: 176598	2013-03-07 01:01:29 +00:00
Michael Liao	da22b30be5	Fix PR15355 - Clear 'mayStore' flag when loading from the atomic variable before the spin loop - Clear kill flag from one use to multiple use in registers forming the address to that atomic variable - don't use a physical register as live-in register in BB (neither entry nor landing pad.) by copying it into virtual register (patch by Cameron Zwarich) llvm-svn: 176538	2013-03-06 00:17:04 +00:00
David Sehr	4c8979cd4d	The current X86 NOP padding uses one long NOP followed by the remainder in one-byte NOPs. If the processor actually executes those NOPs, as it sometimes does with aligned bundling, this can have a performance impact. From my micro-benchmarks run on my one machine, a 15-byte NOP followed by twelve one-byte NOPs is about 20% worse than a 15 followed by a 12. This patch changes NOP emission to emit as many 15-byte (the maximum) as possible followed by at most one shorter NOP. llvm-svn: 176464	2013-03-05 00:02:23 +00:00
Preston Gurd	485296d1e8	Bypass Slow Divides * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442	2013-03-04 18:13:57 +00:00
Arnold Schwaighofer	20ef54f4c1	X86 cost model: Adjust cost for custom lowered vector multiplies This matters for example in following matrix multiply: int **mmult(int rows, int cols, int **m1, int **m2, int **m3) { int i, j, k, val; for (i=0; i<rows; i++) { for (j=0; j<cols; j++) { val = 0; for (k=0; k<cols; k++) { val += m1[i][k] * m2[k][j]; } m3[i][j] = val; } } return(m3); } Taken from the test-suite benchmark Shootout. We estimate the cost of the multiply to be 2 while we generate 9 instructions for it and end up being quite a bit slower than the scalar version (48% on my machine). Also, properly differentiate between avx1 and avx2. On avx-1 we still split the vector into 2 128bits and handle the subvector muls like above with 9 instructions. Only on avx-2 will we have a cost of 9 for v4i64. I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an add instead of a mul because with a mul we now no longer vectorize. I did verify that the mul would be indeed more expensive when vectorized with 3 kernels: for (i ...) r += a[i] * 3; for (i ...) m1[i] = m1[i] * 3; // This matches the test case in avx1.ll and a matrix multiply. In each case the vectorized version was considerably slower. radar://13304919 llvm-svn: 176403	2013-03-02 04:02:52 +00:00
Michael Liao	6af16fc3b7	Fix PR10475 - ISD::SHL/SRL/SRA must have either both scalar or both vector operands but TLI.getShiftAmountTy() so far only returns a scalar type. As a result, backend logic assuming that breaks. - Rename the original TLI.getShiftAmountTy() to TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to return target-specified scalar type or the same vector type as the 1st operand. - Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar type. llvm-svn: 176364	2013-03-01 18:40:30 +00:00
Duncan Sands	2cb41d372c	GCC thinks that this variable might be used uninitialized (it isn't). llvm-svn: 176341	2013-03-01 09:46:03 +00:00
Yiannis Tsiouris	d4842e5ee9	Re-format comments (and check commit access) llvm-svn: 176270	2013-02-28 16:59:10 +00:00
Nadav Rotem	08ab877cc7	Revert r176166 because it broke one of the lit tests. llvm-svn: 176171	2013-02-27 05:56:20 +00:00
Nadav Rotem	85e1211fbf	std::string to StringRef. llvm-svn: 176166	2013-02-27 05:23:56 +00:00
Chad Rosier	1b33e8d63e	[fast-isel] Make sure the FastLowerArguments function checks to make sure the arguments type is a simple type. rdar://13290455 llvm-svn: 176066	2013-02-26 01:05:31 +00:00
Michael Liao	609a527286	Refine fix to PR10499, no functionality change - Put expensive checking after simple one llvm-svn: 176060	2013-02-25 23:16:36 +00:00
Michael Liao	ab97668061	Fix PR10499 - Check whether SSE is available before lowering all 1s vector building with PCMPEQD, which is only available from SSE2 llvm-svn: 176058	2013-02-25 23:01:03 +00:00
Chad Rosier	a92ef4ba5b	[fast-isel] Add X86FastIsel::FastLowerArguments to handle functions with 6 or fewer scalar integer (i32 or i64) arguments. It completely eliminates the need for SDISel for trivial functions. Also, add the new llc -fast-isel-abort-args option, which is similar to -fast-isel-abort option, but for formal argument lowering. llvm-svn: 176052	2013-02-25 21:59:35 +00:00
Chad Rosier	669bb3ee77	[ms-inline asm] Add support for the pushad/popad mnemonics. rdar://13254235 llvm-svn: 176036	2013-02-25 19:06:27 +00:00
Nadav Rotem	b532fca92c	Revert r169638 because it broke Mesa llvmpipe tests. Fix PR15239. llvm-svn: 175985	2013-02-24 07:09:35 +00:00
Benjamin Kramer	ee23dcb461	X86: Disable cmov-memory patterns on subtargets without cmov. Fixes PR15115. llvm-svn: 175962	2013-02-23 10:40:58 +00:00
Peter Collingbourne	7b57621fb3	x86_64: designate most general purpose and SSE registers as callee save under coldcc llvm-svn: 175911	2013-02-22 19:19:44 +00:00
Eli Bendersky	8da87163ca	Move the eliminateCallFramePseudoInstr method from TargetRegisterInfo to TargetFrameLowering, where it belongs. Incidentally, this allows us to delete some duplicated (and slightly different!) code in TRI. There are potentially other layering problems that can be cleaned up as a result, or in a similar manner. The refactoring was OK'd by Anton Korobeynikov on llvmdev. Note: this touches the target interfaces, so out-of-tree targets may be affected. llvm-svn: 175788	2013-02-21 20:05:00 +00:00
Eli Bendersky	e93249befa	getX86SubSuperRegister has a special mode with High=true for i64 which exists solely to enable it to call itself for i8 with some registers. The proposed patch simplifies the function somewhat to make the High bit only meaningful for the i8 mode, which makes sense. No functional difference (getX86SubSuperRegister is not getting called from anywhere outside with i64 and High=true). llvm-svn: 175762	2013-02-21 16:40:18 +00:00
Jim Grosbach	d2037eb1ee	MCParser: Update method names per coding guidelines. s/AddDirectiveHandler/addDirectiveHandler/ s/ParseMSInlineAsm/parseMSInlineAsm/ s/ParseIdentifier/parseIdentifier/ s/ParseStringToEndOfStatement/parseStringToEndOfStatement/ s/ParseEscapedString/parseEscapedString/ s/EatToEndOfStatement/eatToEndOfStatement/ s/ParseExpression/parseExpression/ s/ParseParenExpression/parseParenExpression/ s/ParseAbsoluteExpression/parseAbsoluteExpression/ s/CheckForValidSection/checkForValidSection/ http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly No functional change intended. llvm-svn: 175675	2013-02-20 22:21:35 +00:00
Jim Grosbach	341ad3e72a	Update TargetLowering ivars for name policy. http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly ivars should be camel-case and start with an upper-case letter. A few in TargetLowering were starting with a lower-case letter. No functional change intended. llvm-svn: 175667	2013-02-20 21:13:59 +00:00
Chad Rosier	a018cfd10c	[ms-inline asm] Make the comment a bit more verbose. llvm-svn: 175641	2013-02-20 18:03:44 +00:00
Elena Demikhovsky	0ccdd1315b	I optimized the following patterns: sext <4 x i1> to <4 x i64> sext <4 x i8> to <4 x i64> sext <4 x i16> to <4 x i64> I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns: (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT))) The sext_in_reg (v4i32 x) may be lowered to shl+sar operations. The "sar" does not exist on 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution. I also added the cost of these operations to the AVX costs table. llvm-svn: 175619	2013-02-20 12:42:54 +00:00
Chad Rosier	45a52fa097	[ms-inline asm] Force the use of a base pointer if the MachineFunction includes MS-style inline assembly. This is a follow-on to r175334. Forcing a FP to be emitted doesn't ensure it will be used. Therefore, force the base pointer as well. We now treat MS inline assembly in the same way we treat functions with dynamic stack realignment and VLAs. This guarantees the BP will be used to reference parameters and locals. rdar://13218191 llvm-svn: 175576	2013-02-19 23:50:45 +00:00
Jakub Staszak	e167cf5c4d	Add obvious constantness. llvm-svn: 175560	2013-02-19 21:54:59 +00:00
Benjamin Kramer	1cb826b0ad	Clean up HiPE prologue emission a bit and avoid signed arithmetic tricks. No intended functionality change. llvm-svn: 175536	2013-02-19 17:32:57 +00:00
Rafael Espindola	1c040b5788	Move LLVM_LIBRARY_VISIBILITY for consistency with what was done to PPCJITInfo.cpp in r175394. llvm-svn: 175531	2013-02-19 17:14:33 +00:00
Eli Bendersky	b0b13b22a3	Make pass name more precise and fix comment. llvm-svn: 175525	2013-02-19 16:38:32 +00:00
Craig Topper	f371e89264	Fix capitalization in comment to match function name. llvm-svn: 175497	2013-02-19 07:43:59 +00:00
Jakub Staszak	1f199a0ef2	Use array_pod_sort instead of std::sort. llvm-svn: 175472	2013-02-18 23:18:22 +00:00
NAKAMURA Takumi	3a8002f61d	X86FrameLowering.cpp: Fixup. Sorry for the breakage. llvm-svn: 175467	2013-02-18 23:15:21 +00:00
NAKAMURA Takumi	a614ec7e6f	X86FrameLowering.cpp: Fix a warning in -Asserts. [-Wunused-variable] llvm-svn: 175464	2013-02-18 23:08:49 +00:00
Chad Rosier	441e81287f	Remove a useless assert. llvm-svn: 175463	2013-02-18 22:20:16 +00:00
Benjamin Kramer	5c6e653b72	Fix a 32/64 bit incompatibility in the HiPE prologue generation. llvm-svn: 175458	2013-02-18 21:45:01 +00:00
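
The NOP padding change in 4c8979cd4d can be illustrated with a small sketch. This is not the code from the patch: every name below (`MaxNopLength`, `NopPlan`, `planNopPadding`, `newNopCount`, `oldNopCount`) is hypothetical, and the "old" count only follows the behavior as the commit message describes it (one long NOP, then one-byte NOPs for the remainder).

```cpp
// Sketch only: all names here are hypothetical and not taken from LLVM.
constexpr unsigned MaxNopLength = 15; // maximum NOP size cited in the commit message

struct NopPlan {
  unsigned FullNops;   // how many MaxNopLength-byte NOPs to emit
  unsigned TailLength; // size of the one trailing NOP, or 0 if none is needed
};

// New scheme: as many maximum-length NOPs as fit, then at most one shorter NOP.
constexpr NopPlan planNopPadding(unsigned Count) {
  return NopPlan{Count / MaxNopLength, Count % MaxNopLength};
}

// Number of NOP instructions emitted under the new scheme.
constexpr unsigned newNopCount(unsigned Count) {
  return Count / MaxNopLength + (Count % MaxNopLength != 0 ? 1 : 0);
}

// Number of NOP instructions under the old scheme as the message describes it:
// one long NOP, then the remainder as one-byte NOPs.
constexpr unsigned oldNopCount(unsigned Count) {
  return Count == 0 ? 0
                    : (Count <= MaxNopLength ? 1 : 1 + (Count - MaxNopLength));
}
```

For the 27-byte case mentioned in the message, `oldNopCount(27)` is 13 (a 15-byte NOP plus twelve one-byte NOPs), while `planNopPadding(27)` yields one 15-byte NOP and a 12-byte tail, so `newNopCount(27)` is 2.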

9087 Commits
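
As a rough illustration of the check behind 0498b88d48 (shorter ADC/SBB encoding for sign-extended immediates), the sketch below tests whether an immediate survives truncation to 8 bits and sign extension back. The helper names are hypothetical and the patch itself may be structured quite differently; the byte counts assume the 32-bit operand forms, where the imm8 encoding (opcode 0x83) carries a 1-byte immediate and the imm32 encoding (opcode 0x81) carries 4 bytes.

```cpp
#include <cstdint>

// Hypothetical helper: true when Imm is representable as a sign-extended
// 8-bit immediate, i.e. it lies in [-128, 127].
constexpr bool fitsInSignExtendedImm8(int64_t Imm) {
  return Imm >= -128 && Imm <= 127;
}

// Hypothetical helper: immediate size in bytes for a 32-bit ADC/SBB with an
// immediate operand, choosing the imm8 form whenever the value allows it.
constexpr unsigned immediateBytes32(int64_t Imm) {
  return fitsInSignExtendedImm8(Imm) ? 1 : 4;
}
```

Under these assumptions an immediate of -1 takes the short form, while 128 does not fit and keeps the 4-byte immediate.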