llvm-project

Commit Graph

Author	SHA1	Message	Date
Hans Wennborg	7036e503d7	Fix "Not having LAHF/SAHF" assert. It wants to assert that the subtarget is 64-bit, not the register. llvm-svn: 255703	2015-12-15 23:21:46 +00:00
Sanjay Patel	271efcdf20	[x86] inline calls to fmaxf / llvm.maxnum.f32 using maxss (PR24475) This patch improves on the suggested codegen from PR24475: https://llvm.org/bugs/show_bug.cgi?id=24475 but only for the fmaxf() case to start, so we can sort out any bugs before extending to fmin, f64, and vectors. The fmax / maxnum definitions provide us flexibility for signed zeros, so the only thing we have to worry about in this replacement sequence is NaN handling. Note 1: It may be better to implement this as lowerFMAXNUM(), but that exposes a problem: SelectionDAGBuilder::visitSelect() transforms compare/select instructions into FMAXNUM nodes if we declare FMAXNUM legal or custom. Perhaps that should be checking for NaN inputs or global unsafe-math before transforming? As it stands, that bypasses a big set of optimizations that the x86 backend already has in PerformSELECTCombine(). Note 2: The v2f32 test reveals another bug; the vector is extended to v4f32, so we have completely unnecessary operations happening on undef elements of the vector. Differential Revision: http://reviews.llvm.org/D15294 llvm-svn: 255700	2015-12-15 23:11:43 +00:00
Reid Kleckner	d7045faa10	[WinEH] Remove unused intrinsic llvm.x86.seh.restoreframe We can clean this up now that we have the X86 CATCHRET instruction to restore the FP, SP, and BP. llvm-svn: 255677	2015-12-15 21:41:34 +00:00
Michael Kuperstein	53946bf8c6	[X86] MOVPC32r should only emit CFI adjustments when needed We only want to emit CFI adjustments when actually using DWARF. This fixes PR25828. Differential Revision: http://reviews.llvm.org/D15522 llvm-svn: 255664	2015-12-15 18:50:32 +00:00
Hans Wennborg	08d5905bac	[X86] Smaller code for materializing 32-bit 1 and -1 constants "movl $-1, %eax" is 5 bytes, "xorl %eax, %eax; decl %eax" is 3 bytes. This commit makes LLVM use the latter when optimizing for size. Differential Revision: http://reviews.llvm.org/D14971 llvm-svn: 255656	2015-12-15 17:10:28 +00:00
Asaf Badouh	5acf66ff97	[x86] adding PKU feature flag the feature flag is essential for RDPKRU and WRPKRU instruction more about the instruction can be found in the SDM rev 56, vol 2 from http://www.intel.com/sdm Differential Revision: http://reviews.llvm.org/D15491 llvm-svn: 255644	2015-12-15 13:35:29 +00:00
Elena Demikhovsky	6015f5c823	Type legalizer for masked gather and scatter intrinsics. Full type legalizer that works with all vectors length - from 2 to 16, (i32, i64, float, double). This intrinsic, for example void @llvm.masked.scatter.v2f32(<2 x float>%data , <2 x float*>%ptrs , i32 align , <2 x i1>%mask ) requires type widening for data and type promotion for mask. Differential Revision: http://reviews.llvm.org/D13633 llvm-svn: 255629	2015-12-15 08:40:41 +00:00
Quentin Colombet	25b43f3624	[X86] Add relaxation logic for SBB instructions. Prior to this patch, we would wrongly stick to the variant with imm8 encoding even when the relocation could not fit that size. rdar://problem/23785506 llvm-svn: 255583	2015-12-15 00:09:23 +00:00
Quentin Colombet	2cb8a51c1f	[X86] Add relaxation logic for ADC instructions. Prior to this patch, we would wrongly stick to the variant with imm8 encoding even when the relocation could not fit that size. rdar://problem/23785506 llvm-svn: 255570	2015-12-14 23:12:40 +00:00
Chih-Hung Hsieh	7993e18e80	[X86] Part 2 to fix x86-64 fp128 calling convention. Part 1 was submitted in http://reviews.llvm.org/D15134. Changes in this part: * X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class. * X86CallingConv.td: Pass f128 values in XMM registers or on stack. * X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td: Add instruction selection patterns for f128. * X86ISelLowering.cpp: When target has MMX registers, configure MVT::f128 in FR128RegClass, with TypeSoftenFloat action, and custom actions for some opcodes. Add missed cases of MVT::f128 in places that handle f32, f64, or vector types. Add TODO comment to support f128 type in inline assembly code. * SelectionDAGBuilder.cpp: Fix infinite loop when f128 type can have VT == TLI.getTypeToTransformTo(Ctx, VT). * Add unit tests for x86-64 fp128 type. Differential Revision: http://reviews.llvm.org/D11438 llvm-svn: 255558	2015-12-14 22:08:36 +00:00
Yaron Keren	45ea8fa1f4	Save several std::string constructions using llvm::Twine. llvm-svn: 255535	2015-12-14 19:28:40 +00:00
Michael Zuckerman	02ecd43c63	[X86][inline asm] support even directive The .even directive aligns content to an even-numbered address. In at&t syntax .even In Microsoft syntax even (without the dot). Differential Revision: http://reviews.llvm.org/D15413 llvm-svn: 255462	2015-12-13 17:07:23 +00:00
Simon Pilgrim	3e0c022aed	Fix line endings llvm-svn: 255459	2015-12-13 12:49:48 +00:00
Simon Pilgrim	052191dd82	[X86][AVX512] Added support for VMOVQ shuffle comments llvm-svn: 255442	2015-12-12 21:46:23 +00:00
David Majnemer	8a1c45d6e8	[IR] Reformulate LLVM's EH funclet IR While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies: - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts. - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticeable effect on the quality of our code generation. - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets. - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront. Let's clean this up a bit by doing the following: - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable. - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH. - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them. - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information. N.B. The state numbering code for the CLR has been updated but the veracity of its output cannot be spoken for. An expert should take a look to make sure the results are reasonable. Reviewers: rnk, JosephTremoulet, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D15139 llvm-svn: 255422	2015-12-12 05:38:55 +00:00
Chen Li	1b26b9ec9d	[X86ISelLowering] Add additional support for multiplication-to-shift conversion. Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this. Reviewers: craig.topper, RKSimon Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14603 llvm-svn: 255415	2015-12-12 01:04:15 +00:00
Chen Li	02ef2e1385	Revert rL255391: [X86ISelLowering] Add additional support for multiplication-to-shift conversion. because it broke buildbot. llvm-svn: 255395	2015-12-12 00:08:37 +00:00
Chen Li	e8f9387e0c	[X86ISelLowering] Add additional support for multiplication-to-shift conversion. Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this. Reviewers: craig.topper, RKSimon Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14603 llvm-svn: 255391	2015-12-11 23:39:32 +00:00
Matthias Braun	60d69e2865	CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness() computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analyzePhysRegs() are hard to understand with respect to subregisters. This commit: Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, also renames the fields to avoid silent breakage of third-party code (and improve the grammar). Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling it and removing workarounds for the bug. This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033 Differential Revision: http://reviews.llvm.org/D15320 llvm-svn: 255362	2015-12-11 19:42:09 +00:00
Matt Arsenault	fbd9bbfda3	Start replacing vector_extract/vector_insert with extractelt/insertelt These are redundant pairs of nodes defined for INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT. insertelement/extractelement are slightly closer to the corresponding C++ node name, and has stricter type checking so prefer it. Update targets to only use these nodes where it is trivial to do so. AArch64, ARM, and Mips all have various type errors on simple replacement, so they will need work to fix. Example from AArch64: def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8), (i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>; Which is trying to do sext_inreg i8, i8. llvm-svn: 255359	2015-12-11 19:20:16 +00:00
Cong Hou	59898d8c68	[X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1. Previously in the conversion cost table there are no entries for integer-integer conversions on SSE2. This will result in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers are counted from the result of running llc on the new test case in this patch. Differential revision: http://reviews.llvm.org/D15132 llvm-svn: 255315	2015-12-11 00:31:39 +00:00
Craig Topper	8e44b9a4d1	[X86] Fix a couple cases where bitwise and logical operations were being mixed. NFC llvm-svn: 255224	2015-12-10 06:09:41 +00:00
Quentin Colombet	5d2f7cfd44	[X86] Enable shrink-wrapping by default, but keep it disabled for stack frames without a frame pointer when unwind may happen. This is a workaround for a bug in the way we emit the CFI directives for frameless unwind information. See PR25614. llvm-svn: 255175	2015-12-09 23:08:18 +00:00
Vyacheslav Klochkov	a3cd08b05c	X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes. Reviewer: Simon Pilgrim. Differential Revision: http://reviews.llvm.org/D15317 llvm-svn: 255080	2015-12-09 00:12:13 +00:00
Simon Pilgrim	323e00d9c7	[X86][AVX] Fold loads + splats into broadcast instructions On AVX and AVX2, BROADCAST instructions can load a scalar into all elements of a target vector. This patch improves the lowering of 'splat' shuffles of a loaded vector into a broadcast - currently the lowering only works for cases where we are splatting the zero'th element, which is now generalised to any element. Fix for PR23022 Differential Revision: http://reviews.llvm.org/D15310 llvm-svn: 255061	2015-12-08 22:17:11 +00:00
Tim Northover	614e8ff855	X86: produce more friendly errors during MachO relocation handling llvm-svn: 255036	2015-12-08 18:31:35 +00:00
Sanjay Patel	a6bdd70f4b	don't repeat function names in comments; NFC llvm-svn: 254930	2015-12-07 19:31:34 +00:00
Sanjay Patel	f9bdb872bd	remove redundant check: optForSize() includes a check for the minsize attribute; NFCI llvm-svn: 254925	2015-12-07 19:13:40 +00:00
Elena Demikhovsky	291fe0159f	AVX-512: Fixed a bug in FP logic operation lowering FP logic instructions are supported in DQ extension on AVX-512 target. I use integer operations instead. Added tests. I also enabled FABS in this patch in order to check ANDPS. The operations are FOR, FXOR, FAND, FANDN. The instructions that are supported for 512-bit vectors under DQ are: VORPS/PD, VXORPS/PD, VANDPS/PD, VANDNPS/PD. Differential Revision: http://reviews.llvm.org/D15110 llvm-svn: 254913	2015-12-07 14:33:34 +00:00
Elena Demikhovsky	33e61eceb4	AVX-512: Fixed masked load / store instruction selection for KNL. Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store. This intrinsic comes with all legal types: <8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru), but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only. All data operands should be widened to 512-bit vector. The mask operand should be widened to v16i1 with zeroes. Differential Revision: http://reviews.llvm.org/D15265 llvm-svn: 254909	2015-12-07 13:39:24 +00:00
Igor Breger	3ab6f17530	AVX-512: implement kunpck intrinsics. Differential Revision: http://reviews.llvm.org/D14821 llvm-svn: 254908	2015-12-07 13:25:18 +00:00
Marina Yatsina	497d44a081	[X86] Adding support for FWORD type for MS inline asm Adding support for FWORD type for MS inline asm. Differential Revision: http://reviews.llvm.org/D15268 llvm-svn: 254904	2015-12-07 13:09:20 +00:00
Marina Yatsina	1d1aa0b0a8	[X86] Add support for loopz, loopnz for Intel syntax According to x86 spec, loopz and loopnz should be supported for Intel syntax, where loopz is equivalent to loope and loopnz is equivalent to loopne. Differential Revision: http://reviews.llvm.org/D15148 llvm-svn: 254877	2015-12-06 15:31:47 +00:00
Asaf Badouh	41ecf460fa	[X86][AVX512] add vmovss/sd missing encoding Differential Revision: http://reviews.llvm.org/D14701 llvm-svn: 254875	2015-12-06 13:26:56 +00:00
Michael Kuperstein	77ce9d3b1a	[X86] Always generate precise CFA adjustments. This removes the code path that generate "synchronous" (only correct at call site) CFA. We will probably want to re-introduce it once we are capable of emitting different .eh_frame and .debug_frame sections. Differential Revision: http://reviews.llvm.org/D14948 llvm-svn: 254874	2015-12-06 13:06:20 +00:00
Igor Breger	076dfe5c12	AVX512: support AVX512BW Intrinsic in 32bit mode. Differential Revision: http://reviews.llvm.org/D15076 llvm-svn: 254873	2015-12-06 11:35:18 +00:00
Simon Pilgrim	4ba5969224	[X86][ADX] Added memory folding patterns and stack folding tests llvm-svn: 254844	2015-12-05 07:27:50 +00:00
Craig Topper	e5e035a3a8	Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef. llvm-svn: 254843	2015-12-05 07:13:35 +00:00
Simon Pilgrim	5a64d98303	[X86][FMA4] Explicitly set the domain of FMA4 float/double scalar instructions Both were defaulting to the float domain - now matches the packed instructions. llvm-svn: 254841	2015-12-05 07:07:42 +00:00
Hans Wennborg	fbf2822e6d	Add FeatureLAHFSAHF to amdfam10 as well. llvm-svn: 254801	2015-12-04 23:32:19 +00:00
Hans Wennborg	5000ce8a63	X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supported These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 llvm-svn: 254793	2015-12-04 23:00:33 +00:00
Sanjay Patel	1640c54593	fix formatting; NFC llvm-svn: 254739	2015-12-04 17:51:55 +00:00
Manman Ren	19c7bbe3b7	[CXX TLS calling convention] Add CXX TLS calling convention. This commit adds a new target-independent calling convention for C++ TLS access functions. It aims to minimize overhead in the caller by preserving as many registers as possible. The target-specific implementation for X86-64 is defined as following: Arguments are passed as for the default C calling convention The same applies for the return value(s) The callee preserves all GPRs - except RAX and RDI The access function makes C-style TLS function calls in the entry and exit block, C-style TLS functions save a lot more registers than normal calls. The added calling convention ties into the existing implementation of the C-style TLS functions, so we can't simply use existing calling conventions such as preserve_mostcc. rdar://9001553 llvm-svn: 254737	2015-12-04 17:40:13 +00:00
Alexey Bataev	7cf324772f	LEA code size optimization pass (Part 1): Remove redundant address recalculations, by Andrey Turetsky Add new x86 pass which replaces address calculations in load or store instructions with def register of existing LEA (must be in the same basic block), if the LEA calculates address that differs only by a displacement. Works only with -Os or -Oz. Differential Revision: http://reviews.llvm.org/D13294 llvm-svn: 254712	2015-12-04 10:53:15 +00:00
JF Bastien	580b6572b5	X86InstrInfo::copyPhysReg: workaround reg liveness Summary: computeRegisterLiveness and analyzePhysReg are currently getting confused about liveness in some cases, breaking copyPhysReg's calculation of whether AX is dead in some cases. Work around this issue temporarily by assuming that AX is always live. See detail in: https://llvm.org/bugs/show_bug.cgi?id=25033#c7 And associated bugs PR24535 PR25033 PR24991 PR24992 PR25201. This workaround makes the code correct but slightly inefficient, but it seems to confuse the machine instr verifier which now thinks EAX was undefined in some cases where it's being conservatively saved / restored. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15198 llvm-svn: 254680	2015-12-04 01:18:17 +00:00
Chih-Hung Hsieh	ed7d81e5d4	[X86] Part 1 to fix x86-64 fp128 calling convention. Almost all these changes are conditioned and only apply to the new x86-64 f128 type configuration, which will be enabled in a follow up patch. They are required together to make new f128 work. If there is any error, we should fix or revert them as a whole. These changes should have no impact to current configurations. * Relax type legalization checks to accept new f128 type configuration, whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has TLI.isTypeLegal true. * Relax GetSoftenedFloat to return in some cases f128 type SDValue, which is TLI.isTypeLegal but not "softened" to i128 node. * Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration, to generate optimized bitwise operators for libm functions. * Enhance related Lower* functions to handle f128 type. * Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions to keep new f128 type in register, and convert f128 operators to library calls. * Fix Combiner, Emitter, Legalizer routines that did not handle f128 type. * Add ExpandConstant to handle i128 constants, ExpandNode to handle ISD::Constant node. * Add one more parameter to getCommonSubClass and firstCommonClass, to guarantee that returned common sub class will contain the specified simple value type. This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp. * Fix infinite loop in getTypeLegalizationCost when f128 is the value type. * Fix printOperand to handle null operand. * Enhance ISD::BITCAST node to handle f128 constant. * Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes. * Enhance X86AsmPrinter to emit f128 values in comments. Differential Revision: http://reviews.llvm.org/D15134 llvm-svn: 254653	2015-12-03 22:02:40 +00:00
Reid Kleckner	93fc520339	[X86] Put no-op ADJCALLSTACK markers around all dynamic lowerings Summary: These ADJCALLSTACK markers don't generate code, but they keep dynamic alloca code that calls chkstk out of the prologue. This slightly pessimizes inalloca calls by preventing some register copy coalescing, but I can live with that. Reviewers: qcolombet Subscribers: hans, llvm-commits Differential Revision: http://reviews.llvm.org/D15200 llvm-svn: 254645	2015-12-03 20:46:59 +00:00
Marina Yatsina	4b1aea0802	[X86] MS inline asm: produce error when encountering "<type> ptr <reg name>" Currently "<type> ptr <reg name>" treated as <reg name> in MS inline asm, ignoring the "<type> ptr" completely and possibly ignoring the intention of the user. Fixed llvm to produce an error when encountering "<type> ptr <reg name>" operands. For example: andpd xmm1,xmmword ptr xmm1 --> andpd xmm1, xmm1 though andpd has 2 possible matching formats - andpd xmm, xmm/m128 Patch by: ziv.izhar@intel.com Differential Revision: http://reviews.llvm.org/D14607 llvm-svn: 254607	2015-12-03 12:17:03 +00:00
Marina Yatsina	90d9ffa7d6	[X86] Add support for fcomip, fucomip for Intel syntax According to x86 spec, fcomip and fucomip should be supported for Intel syntax. Differential Revision: http://reviews.llvm.org/D15104 llvm-svn: 254595	2015-12-03 08:55:33 +00:00
David Majnemer	70497c696a	Move EH-specific helper functions to a more appropriate place No functionality change is intended. llvm-svn: 254562	2015-12-02 23:06:39 +00:00
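
The multiplication-to-shift conversion described in the 1b26b9ec9d / e8f9387e0c entries above can be illustrated with a small standalone sketch. This is not the X86ISelLowering code: the names `ShiftForm`, `ShiftRewrite`, `classifyMulConstant`, and `applyRewrite` are hypothetical, it only handles positive constants (as the commit message states for the patch), and it does not model the check for whether LEA + SHL or LEA + LEA would already cover the multiplication.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical names for illustration only; none of these exist in LLVM.
enum class ShiftForm { ShlAdd, ShlSub };

struct ShiftRewrite {
  ShiftForm form;  // ShlAdd: (x << N) + x,  ShlSub: (x << N) - x
  unsigned shift;  // N
};

constexpr bool isPowerOfTwo(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Exact log2 of a power of two.
constexpr unsigned log2Exact(uint64_t v) {
  unsigned n = 0;
  while (v > 1) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Classify a positive constant C as 2^N + 1 or 2^N - 1 with N >= 1.
// Returns std::nullopt when neither form applies.
std::optional<ShiftRewrite> classifyMulConstant(uint64_t c) {
  if (c < 3)
    return std::nullopt;
  if (isPowerOfTwo(c - 1))  // C == 2^N + 1
    return ShiftRewrite{ShiftForm::ShlAdd, log2Exact(c - 1)};
  if (isPowerOfTwo(c + 1))  // C == 2^N - 1 (c + 1 wrapping to 0 is rejected)
    return ShiftRewrite{ShiftForm::ShlSub, log2Exact(c + 1)};
  return std::nullopt;
}

// Evaluate the rewritten expression; the result equals x * C modulo 2^64.
uint64_t applyRewrite(uint64_t x, ShiftRewrite r) {
  assert(r.shift < 64 && "shift amount must be smaller than the bit width");
  const uint64_t shifted = x << r.shift;
  return r.form == ShiftForm::ShlAdd ? shifted + x : shifted - x;
}
```

For example, 17 is classified as 2^4 + 1, so x * 17 becomes (x << 4) + x, and 7 is classified as 2^3 - 1, so x * 7 becomes (x << 3) - x. A constant such as 10 matches neither form and is left alone.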


12468 Commits