llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	27c12e088e	[X86] Allow masks with more than 6 bits set on the x << (y & mask) optimization for the 64-bit memory shifts. llvm-svn: 308657	2017-07-20 19:29:58 +00:00
Craig Topper	33225ef314	[X86] Use SARX/SHLX/SHLX instructions for (shift x (and y, (BitWidth-1))) Fixes PR33841. llvm-svn: 308591	2017-07-20 06:19:55 +00:00
Serge Pavlov	d526b13e61	Add extra operand to CALLSEQ_START to keep frame part set up previously Using arguments with attribute inalloca creates problems for verification of machine representation. This attribute instructs the backend that the argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size stored in CALLSEQ_START in this case does not count the size of this argument. However CALLSEQ_END still keeps total frame size, as caller can be responsible for cleanup of entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame size and the difference is treated by MachineVerifier as stack error. Currently there is no way to distinguish this case from actual errors. This patch adds additional argument to CALLSEQ_START and its target-specific counterparts to keep size of stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate actual frame size associated with frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get the second mandatory argument. It affects all targets that use frame pseudo instructions and touched many files although the changes are uniform. - Access to frame properties are implemented using special instructions rather than calls getOperand(N).getImm(). For X86 and ARM such replacement was made previously. - Changes that reflect appearance of additional argument of frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves frame size using method, which reports sum of frame parts initialized inside frame instruction pair and outside it. The patch implements approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests failed with machine verifier enabled and listed in PR27481. Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527	2017-05-09 13:35:13 +00:00
Craig Topper	d0af7e8ab8	[SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620	2017-04-28 05:31:46 +00:00
Simon Pilgrim	68168d17b9	Spelling mistakes in comments. NFCI. Based on corrections mentioned in patch for clang for PR27635 llvm-svn: 299072	2017-03-30 12:59:53 +00:00
Craig Topper	e4d5aa7efc	[X86] Cleanup the AddedComplexity values on move immediate instructions. NFC This makes the values a little more consistent between similar instruction and reduces the values some. This results in better grouping in the isel table saving a few bytes. llvm-svn: 298043	2017-03-17 05:59:54 +00:00
Peter Collingbourne	ef089bdb4b	X86: Introduce relocImm-based patterns for cmp. Differential Revision: https://reviews.llvm.org/D28690 llvm-svn: 294636	2017-02-09 22:02:28 +00:00
Nikolai Bozhenov	3a8d108b2b	[x86] Fixing PR28755 by precomputing the address used in CMPXCHG8B The bug arises during register allocation on i686 for CMPXCHG8B instruction when base pointer is needed. CMPXCHG8B needs 4 implicit registers (EAX, EBX, ECX, EDX) and a memory address, plus ESI is reserved as the base pointer. With such constraints the only way register allocator would do its job successfully is when the addressing mode of the instruction requires only one register. If that is not the case - we are emitting additional LEA instruction to compute the address. It fixes PR28755. Patch by Alexander Ivchenko <alexander.ivchenko@intel.com> Differential Revision: https://reviews.llvm.org/D25088 llvm-svn: 287875	2016-11-24 13:23:35 +00:00
Peter Collingbourne	32ab3a817d	Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.", with a fix for 32-bit x86. Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions that take a global address operand. llvm-svn: 286420	2016-11-09 23:53:43 +00:00
Peter Collingbourne	a9cadeddd4	Revert r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate." Suspected to be the cause of a sanitizer-windows bot failure: Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420 llvm-svn: 286385	2016-11-09 18:17:50 +00:00
Peter Collingbourne	4c15db45e4	X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate. A relocatable immediate is either an immediate operand or an operand that can be relocated by the linker to an immediate, such as a regular symbol in non-PIC code. Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands of type "imm32_su". Remove a number of now-redundant patterns. Differential Revision: https://reviews.llvm.org/D25812 llvm-svn: 286384	2016-11-09 17:51:58 +00:00
Simon Pilgrim	46f119a59f	[X86] Use implicit masking of SHLD/SHRD shift double instructions Similar to the regular shift instructions, SHLD/SHRD only use the bottom bits of the shift value llvm-svn: 277341	2016-08-01 12:11:43 +00:00
David L Kreitzer	8b959e5cfa	Avoid unnecessary 32-bit to 64-bit zero extensions following 32-bit CMOV instructions on x86_64. The 32-bit CMOV implicitly zero extends. Differential Revision: https://reviews.llvm.org/D22941 llvm-svn: 277148	2016-07-29 15:09:54 +00:00
Eli Friedman	17e8ea18e9	[X86] Fix stupid typo in isel lowering. Apparently someone miscounted the number of zeros in the immediate. Fixes https://llvm.org/bugs/show_bug.cgi?id=28544 . llvm-svn: 275376	2016-07-14 05:48:25 +00:00
Rafael Espindola	68760387df	Delete the IsStatic predicate. In all its uses it was equivalent to IsNotPIC. llvm-svn: 273943	2016-06-27 21:09:14 +00:00
Kevin B. Smith	ed0b620a65	[X86]: Add a pattern that uses GR16_ABCD rather than GR32_ABCD to avoid falsely marking whole 32 bit register as live. Differential Revision: http://reviews.llvm.org/D20649 llvm-svn: 271341	2016-05-31 22:00:12 +00:00
Hans Wennborg	8eb336c14e	Re-commit r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions" with an additional fix to make RegAllocFast ignore undef physreg uses. It would previously get confused about the "push %eax" instruction's use of eax. That method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate as well, but since that runs after register-allocation, we didn't run into the RegAllocFast issue before. llvm-svn: 269949	2016-05-18 16:10:17 +00:00
Hans Wennborg	759af30109	Revert r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions" Seems to have broken the Windows ASan bot. Reverting while investigating. llvm-svn: 269833	2016-05-17 20:38:56 +00:00
Hans Wennborg	c3fb51171e	X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions This patch moves the expansion of WIN_ALLOCA pseudo-instructions into a separate pass that walks the CFG and lowers the instructions based on a conservative estimate of the offset between the stack pointer and the lowest accessed stack address. The goal is to reduce binary size and run-time costs by removing calls to _chkstk. While it doesn't fix all the code quality problems with inalloca calls, it's an incremental improvement for PR27076. Differential Revision: http://reviews.llvm.org/D20263 llvm-svn: 269828	2016-05-17 20:13:29 +00:00
Craig Topper	7b5925a5b6	[X86] Fix a bug in LOCK arithmetic operation pattern matching where the wrong immediate predicate check was being used for 64-bit instructions with 8-bit immediates. This didn't cause a bug because the order of the patterns ensured that the 64-bit instructions with 32-bit immediates were selected first. llvm-svn: 268212	2016-05-02 05:44:21 +00:00
Quentin Colombet	d6dbec4c6f	[X86] Fix the lowering of TLS calls. The callseq_end node must be glued with the TLS calls, otherwise, the generic code will miss the uses of the returned value and will mark it dead. Moreover, TLSCall 64-bit pseudo must not set an implicit-use on RDI, the pseudo uses the symbol address at this point not RDI and the lowering will do the right thing. llvm-svn: 267797	2016-04-27 21:37:37 +00:00
Hans Wennborg	5f916d3df4	[X86] Use "and $0" and "orl $-1" to store 0 and -1 when optimizing for minsize 64-bit, 32-bit and 16-bit move-immediate instructions are 7, 6, and 5 bytes, respectively, whereas and/or with 8-bit immediate is only three bytes. Since these instructions imply an additional memory read (which the CPU could elide, but we don't think it does), restrict these patterns to minsize functions. Differential Revision: http://reviews.llvm.org/D18374 llvm-svn: 264440	2016-03-25 18:11:31 +00:00
Hans Wennborg	4ae5119eeb	X86: Use push-pop for materializing 8-bit immediates for minsize (take 2) This is the same as r255936, with added logic for avoiding clobbering of the red zone (PR26023). Differential Revision: http://reviews.llvm.org/D18246 llvm-svn: 264375	2016-03-25 01:10:56 +00:00
Quentin Colombet	cf9732b417	[X86] Make sure we do not clobber RBX with cmpxchg when used as a base pointer. cmpxchg[8\|16]b uses RBX as one of its argument. In other words, using this instruction clobbers RBX as it is defined to hold one the input. When the backend uses dynamically allocated stack, RBX is used as a reserved register for the base pointer. Reserved registers have special semantic that only the target understands and enforces, because of that, the register allocator don’t use them, but also, don’t try to make sure they are used properly (remember it does not know how they are supposed to be used). Therefore, when RBX is used as a reserved register but defined by something that is not compatible with that use, the register allocator will not fix the surrounding code to make sure it gets saved and restored properly around the broken code. This is the responsibility of the target to do the right thing with its reserved register. To fix that, when the base pointer needs to be preserved, we use a different pseudo instruction for cmpxchg that save rbx. That pseudo takes two more arguments than the regular instruction: - One is the value to be copied into RBX to set the proper value for the comparison. - The other is the virtual register holding the save of the value of RBX as the base pointer. This saving is done as part of isel (i.e., we emit a copy from rbx). cmpxchg_save_rbx <regular cmpxchg args>, input_for_rbx_reg, save_of_rbx_as_bp This gets expanded into: rbx = copy input_for_rbx_reg cmpxchg <regular cmpxchg args> rbx = save_of_rbx_as_bp Note: The actual modeling of the pseudo is a bit more complicated to make sure the interferes that appears after the pseudo gets expanded are properly modeled before that expansion. This fixes PR26883. llvm-svn: 263325	2016-03-12 02:25:27 +00:00
Ahmed Bougacha	bb5d7d7ed8	[X86] Move the ATOMIC_LOAD_OP ISel from DAGToDAG to ISelLowering. NFCI. This is long-standing dirtiness, as acknowledged by r77582: The current trick is to select it into a merge_values with the first definition being an implicit_def. The proper solution is to add new ISD opcodes for the no-output variant. Doing this before selection will let us combine away some constructs. Differential Revision: http://reviews.llvm.org/D17659 llvm-svn: 262244	2016-02-29 19:28:07 +00:00
Elena Demikhovsky	e5bbca6ae2	Optimized loading (zextload) of i1 value from memory. This patch is a partial revert of https://llvm.org/svn/llvm-project/llvm/trunk@237793. Extra "and" causes performance degradation. We assume that i1 is stored in zero-extended form. And store operation is responsible for zeroing upper bits. Differential Revision: http://reviews.llvm.org/D17541 llvm-svn: 261828	2016-02-25 07:05:12 +00:00
Davide Italiano	228978c0dc	[X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled. TLSADDR nodes are lowered into actuall calls inside MC. In order to prevent shrink-wrapping from pushing prologue/epilogue past them (which result in TLS variables being accessed before the stack frame is set up), we put markers, so that the stack gets adjusted properly. Thanks to Quentin Colombet for guidance/help on how to fix this problem! llvm-svn: 261387	2016-02-20 00:44:47 +00:00
Craig Topper	e00bffbc13	[X86] Make MOV32ri64 a post-RA pseudo instead of a CodeGenOnly instruction. It was only needed for rematerialization. llvm-svn: 256818	2016-01-05 07:44:14 +00:00
Craig Topper	9583f51348	[X86] Add OpSize32 to OR32mrLocked instruction to match the normal OR32mr instruction. llvm-svn: 256817	2016-01-05 07:44:11 +00:00
David Majnemer	869be0a4a6	Revert "[X86] Use push-pop for materializing small constants under 'minsize'" The red zone consists of 128 bytes beyond the stack pointer so that the allocation of objects in leaf functions doesn't require decrementing rsp. In r255656, we introduced an optimization that would cheaply materialize certain constants via push/pop. Push decrements the stack pointer and stores it's result at what is now the top of the stack. However, this means that using push/pop would encroach on the red zone. PR26023 gives an example where this corrupts an object in the red zone. llvm-svn: 256808	2016-01-05 02:32:06 +00:00
Hans Wennborg	a6a2e512cf	[X86] Use push-pop for materializing small constants under 'minsize' Use the 3-byte (4 with REX prefix) push-pop sequence for materializing small constants. This is smaller than using a mov (5, 6 or 7 bytes depending on size and REX prefix), but it's likely to be slower, so only used for 'minsize'. This is a follow-up to r255656. Differential Revision: http://reviews.llvm.org/D15549 llvm-svn: 255936	2015-12-17 23:18:39 +00:00
Hans Wennborg	08d5905bac	[X86] Smaller code for materializing 32-bit 1 and -1 constants "movl $-1, %eax" is 5 bytes, "xorl %eax, %eax; decl %eax" is 3 bytes. This commit makes LLVM use the latter when optimizing for size. Differential Revision: http://reviews.llvm.org/D14971 llvm-svn: 255656	2015-12-15 17:10:28 +00:00
Chih-Hung Hsieh	7993e18e80	[X86] Part 2 to fix x86-64 fp128 calling convention. Part 1 was submitted in http://reviews.llvm.org/D15134. Changes in this part: * X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class. * X86CallingConv.td: Pass f128 values in XMM registers or on stack. * X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td: Add instruction selection patterns for f128. * X86ISelLowering.cpp: When target has MMX registers, configure MVT::f128 in FR128RegClass, with TypeSoftenFloat action, and custom actions for some opcodes. Add missed cases of MVT::f128 in places that handle f32, f64, or vector types. Add TODO comment to support f128 type in inline assembly code. * SelectionDAGBuilder.cpp: Fix infinite loop when f128 type can have VT == TLI.getTypeToTransformTo(Ctx, VT). * Add unit tests for x86-64 fp128 type. Differential Revision: http://reviews.llvm.org/D11438 llvm-svn: 255558	2015-12-14 22:08:36 +00:00
Reid Kleckner	420f0542cc	[WinEH] Remove isBarrier from instructions that do not return Fixes machine verification failures with David's latest EH change. llvm-svn: 252541	2015-11-09 23:34:42 +00:00
David Majnemer	2652b75700	[WinEH] Don't emit CATCHRET from visitCatchPad Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528	2015-11-09 23:07:48 +00:00
Reid Kleckner	51460c139e	[WinEH] Split EH_RESTORE out of CATCHRET for 32-bit EH This adds the EH_RESTORE x86 pseudo instr, which is responsible for restoring the stack pointers: EBP and ESP, and ESI if stack realignment is involved. We only need this on 32-bit x86, because on x64 the runtime restores CSRs for us. Previously we had to keep the CATCHRET instruction around during SEH so that we could convince X86FrameLowering to restore our frame pointers. Now we can split these instructions earlier. This was confusing, because we had a return instruction which wasn't really a return and was ultimately going to be removed by X86FrameLowering. This change also simplifies X86FrameLowering, which really shouldn't be building new MBBs. No observable functional change currently, but with the new register mask stuff in D14407, CATCHRET will become a register allocator barrier, and our existing tests rely on us having reasonable register allocation around SEH. llvm-svn: 252266	2015-11-06 01:49:05 +00:00
JF Bastien	2cdd5e4710	x86: preserve flags when folding atomic operations D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. I fixed some of this issue in D13680 but had missed INC/DEC. This patch adds the missing EFLAGS definition. llvm-svn: 250438	2015-10-15 18:24:52 +00:00
Sanjay Patel	85030aa1bd	function names should start with a lower case letter; NFC llvm-svn: 250174	2015-10-13 16:23:00 +00:00
JF Bastien	986ed68eed	x86: preserve flags when folding atomic operations Summary: D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. This patch adds the missing EFLAGS definition. Floating point operations don't set flags, the subsequent fadd optimization is therefore correct. The same applies for surrounding load/store optimizations. Reviewers: rsmith, rtrieu Subscribers: llvm-commits, reames, morisset Differential Revision: http://reviews.llvm.org/D13680 llvm-svn: 250135	2015-10-13 00:28:47 +00:00
Craig Topper	d69d495333	[X86] Remove unnecessary AddComplexity directive. The instruction is already wrapped in the equivalent earlier. NFC llvm-svn: 249369	2015-10-06 02:50:21 +00:00
David Majnemer	f828a0ccc7	[WinEH] Make FuncletLayout more robust against catchret Catchret transfers control from a catch funclet to an earlier funclet. However, it is not completely clear which funclet the catchret target is part of. Make this clear by stapling the catchret target's funclet membership onto the CATCHRET SDAG node. llvm-svn: 249052	2015-10-01 18:44:59 +00:00
Reid Kleckner	5b8a46e771	[WinEH] Make funclet return instrs pseudo instrs This makes catchret look more like a branch, and less like a weird use of BlockAddress. It also lets us get away from llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label arithmetic. llvm-svn: 247936	2015-09-17 20:43:47 +00:00
Reid Kleckner	7878391208	[WinEH] Add codegen support for cleanuppad and cleanupret All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32: extern "C" int printf(const char *, ...) noexcept; struct Dtor { ~Dtor() { printf("~Dtor\n"); } }; void has_cleanup() { Dtor o; throw 42; } int main() { try { has_cleanup(); } catch (int) { printf("caught it\n"); } } Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219	2015-09-10 00:25:23 +00:00
Reid Kleckner	df1295173f	[WinEH] Emit prologues and epilogues for funclets Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent funciton via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092	2015-09-08 22:44:41 +00:00
Reid Kleckner	0e2882345d	[WinEH] Add some support for code generating catchpad We can now run 32-bit programs with empty catch bodies. The next step is to change PEI so that we get funclet prologues and epilogues. llvm-svn: 246235	2015-08-27 23:27:47 +00:00
Michael Kuperstein	6e3fee07f7	[X86] Remove references to _ftol2 As of r245924, _ftol2 is no longer used for fptoui on MS platforms. Remove the dead code associated with it. llvm-svn: 245925	2015-08-25 07:58:33 +00:00
JF Bastien	0f8a99b62f	x86: NFC remove needless InstrCompiler cast Summary: The casts from String to PatFrag weren't needed if we instead provided an SDNode. This fix was suggested by @pete in D11382. Subscribers: pete, llvm-commits Differential Revision: http://reviews.llvm.org/D11788 llvm-svn: 244167	2015-08-05 23:15:37 +00:00
JF Bastien	8662083770	x86 atomic: optimize a.store(reg op a.load(acquire), release) Summary: PR24191 finds that the expected memory-register operations aren't generated when relaxed { load ; modify ; store } is used. This is similar to PR17281 which was addressed in D4796, but only for memory-immediate operations (and for memory orderings up to acquire and release). This patch also handles some floating-point operations. Reviewers: reames, kcc, dvyukov, nadav, morisset, chandlerc, t.p.northover, pete Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11382 llvm-svn: 244128	2015-08-05 21:04:59 +00:00
Rafael Espindola	494a381ede	Use small encodings for constants when possible. llvm-svn: 242493	2015-07-17 00:57:52 +00:00
Rafael Espindola	36b718fc74	Avoid a Symbol -> Name -> Symbol conversion. Before this we were producing a TargetExternalSymbol from a MCSymbol. That meant extracting the symbol name and fetching the symbol again down the pipeline. This patch adds a DAG.getMCSymbol that lets the MCSymbol pass unchanged on the DAG. Doing so removes the need for MO_NOPREFIX and fixes the root cause of pr23900, allowing r240130 to be committed again. llvm-svn: 240300	2015-06-22 17:46:53 +00:00

1 2 3 4

197 Commits