llvm-project

Commit Graph

Author	SHA1	Message	Date
Guy Blank	2bdc74a471	[X86][FastISel] Use a COPY from K register to a GPR instead of a K operation The KORTEST was introduced due to a bug where a TEST instruction used a K register. but, turns out that the opposite case of KORTEST using a GPR is now happening The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR. Differential Revision: https://reviews.llvm.org/D24953 llvm-svn: 282580	2016-09-28 11:22:17 +00:00
Craig Topper	dfc4fc9f02	[AVX-512] Teach fastisel load/store handling to use EVEX encoded instructions for 128/256-bit vectors and scalar single/double. Still need to fix the register classes to allow the extended range of registers. llvm-svn: 280682	2016-09-05 23:58:40 +00:00
Craig Topper	428169a5d6	[X86] Make some static arrays of opcodes const and shrink to uint16_t. NFC llvm-svn: 280649	2016-09-05 07:14:21 +00:00
Guy Blank	9ae797a798	[AVX512][FastISel] Do not use K registers in TEST instructions In some cases, FastIsel was emitting TEST instruction with K reg input, which is illegal. Changed to using KORTEST when dealing with K regs. Differential Revision: https://reviews.llvm.org/D23163 llvm-svn: 279393	2016-08-21 08:02:27 +00:00
Justin Bogner	cd1d5aaf2e	Replace a few more "fall through" comments with LLVM_FALLTHROUGH Follow up to r278902. I had missed "fall through", with a space. llvm-svn: 278970	2016-08-17 20:30:52 +00:00
Justin Bogner	b03fd12cef	Replace "fallthrough" comments with LLVM_FALLTHROUGH This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902	2016-08-17 05:10:15 +00:00
Matthias Braun	941a705b7b	MachineFunction: Return reference for getFrameInfo(); NFC getFrameInfo() never returns nullptr so we should use a reference instead of a pointer. llvm-svn: 277017	2016-07-28 18:40:00 +00:00
Nico Weber	8d66df15f4	Teach fast isel about the win64 calling convention. This mostly just works. Vectorcall rets are still not supported. The win64_eh test change is because fast isel doesn't use rsi for temporary computations, so it doesn't need to be pushed. The test case I'm changing was originally added to test pushes, but by now there are other test cases in that file exercising that code path. https://reviews.llvm.org/D22422 llvm-svn: 275607	2016-07-15 20:18:37 +00:00
Nico Weber	ecdf45b1e6	Teach fast isel calls and rets about stdcall. stdcall is callee-pop like thiscall, so the thiscall changes already did most of the work for this. This change only opts stdcall in and adds tests. llvm-svn: 275414	2016-07-14 13:54:26 +00:00
Nico Weber	af7e8465e1	Teach fast isel about thiscall (and callee-pop) calls. http://reviews.llvm.org/D22315 llvm-svn: 275360	2016-07-14 01:52:51 +00:00
Nico Weber	c7bf646a99	Teach FastISel about thiscall (and, hence, about callee-pop). http://reviews.llvm.org/D22115 llvm-svn: 275135	2016-07-12 01:30:35 +00:00
Elena Demikhovsky	ad0a56f3da	Re-commit of 274613. The prev commit failed on compilation. A minor change in one pattern in lib/Target/X86/X86InstrAVX512.td fixes the failure. llvm-svn: 274626	2016-07-06 14:15:43 +00:00
Elena Demikhovsky	02ced295aa	Reverted 274613 due to compilation failue. llvm-svn: 274615	2016-07-06 09:11:49 +00:00
Elena Demikhovsky	5a4f2476fd	AVX-512: Optimization for patterns with i1 scalar type The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc". I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction. I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions. This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173. Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails). Differential revision: http://reviews.llvm.org/D21956 llvm-svn: 274613	2016-07-06 09:01:20 +00:00
Rafael Espindola	db6bd02185	Delete unused includes. NFC. llvm-svn: 274225	2016-06-30 12:19:16 +00:00
Duncan P. N. Exon Smith	9cfc75c214	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189	2016-06-30 00:01:54 +00:00
David Majnemer	d770877328	Switch more loops to be range-based This makes the code a little more concise, no functional change is intended. llvm-svn: 273644	2016-06-24 04:05:21 +00:00
Benjamin Kramer	bdc4956bac	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Simon Pilgrim	35c06a0282	[X86][SSE] Add general lowering of nontemporal vector loads (fixed bad merge) Currently the only way to use the (V)MOVNTDQA nontemporal vector loads instructions is through the int_x86_sse41_movntdqa style builtins. This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch. We currently still fold nontemporal loads into suitable instructions, we should probably look at removing this (and nontemporal stores as well) or at least make the target's folding implementation aware that its dealing with a nontemporal memory transaction. There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets. Differential Review: http://reviews.llvm.org/D20965 llvm-svn: 272011	2016-06-07 13:47:23 +00:00
Igor Breger	61e628591f	[AVX512] Fix load opcode for fast isel. Differential Revision: http://reviews.llvm.org/D21067 llvm-svn: 272006	2016-06-07 13:08:45 +00:00
Craig Topper	048a08af66	[AVX512] Add 512-bit load/stores to fast isel. llvm-svn: 271486	2016-06-02 04:51:37 +00:00
Craig Topper	ca9c0801e1	[X86] Add AVX 256-bit load and stores to fast isel. I'm not sure why this was missing for so long. This also exposed that we were picking floating point 256-bit VMOVNTPS for some integer types in normal isel for AVX1 even though VMOVNTDQ is available. In practice it doesn't matter due to the execution dependency fix pass, but it required extra isel patterns. Fixing that in a follow up commit. llvm-svn: 271481	2016-06-02 04:19:45 +00:00
Craig Topper	6611188633	[X86] Use uint16_t for a couple arrays of instruction opcodes. NFC llvm-svn: 271480	2016-06-02 04:19:42 +00:00
Rafael Espindola	c7e9813228	Refactor X86 symbol access classification. This refactors the logic in X86 to avoid code duplication. It also splits it in two steps: it first decides if a symbol is local to the DSO and then uses that information to decide how to access it. The first part is implemented by shouldAssumeDSOLocal. It is not in any way specific to X86. In a followup patch I intend to move it to somewhere common and reused it in other backends. llvm-svn: 270209	2016-05-20 12:20:10 +00:00
Rafael Espindola	ab03eb007c	Record a TargetMachine instead of a Reloc::Model. Addresses r270095's code review. llvm-svn: 270147	2016-05-19 22:07:57 +00:00
Rafael Espindola	46107b9e62	Remember the relocation model. NFC. This avoids passing a TargetMachine in a few places. llvm-svn: 270095	2016-05-19 18:49:29 +00:00
Rafael Espindola	cb2d266360	Style fixes. NFC. llvm-svn: 270093	2016-05-19 18:34:20 +00:00
David Majnemer	2c5aeabedd	[X86] Lower zext i1 arguments i1 is now a legal type for X86 with AVX512. There were some paths in X86FastISel which were not quite ready to see an i1 value: they were not quite sure how to deal with sign/zero extends for call arguments. DTRT by extending to i8 for zeroext and bailing out of FastISel for signext. This fixes PR27591. llvm-svn: 268470	2016-05-04 00:22:23 +00:00
Quentin Colombet	bf200688de	[X86][FastISel] Make sure we use the right register class when we select stores. llvm-svn: 267806	2016-04-27 22:33:42 +00:00
Manman Ren	1c3f65a18c	Swift Calling Convention: use %RAX for sret. We don't need to copy the sret argument into %rax upon return. rdar://25671494 llvm-svn: 267579	2016-04-26 18:08:06 +00:00
Asaf Badouh	89406d1815	[X86] enable PIE for functions Call locally defined function directly for PIE/fPIE Differential Revision: http://reviews.llvm.org/D19226 llvm-svn: 266863	2016-04-20 08:32:57 +00:00
Reid Kleckner	28865809fe	Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely included across all LLVM backends. It turns out that there only a handful of TUs that actually care about DI operands on MachineInstrs. After this change, touching DebugInfoMetadata.h and rebuilding llc only needs 112 actions instead of 542. llvm-svn: 266351	2016-04-14 18:29:59 +00:00
Manman Ren	5751814eda	Swift Calling Convention: swifterror target support. Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997	2016-04-11 21:08:06 +00:00
Manman Ren	f8bdd88cd9	Swift Calling Convention: add swiftcc. Differential Revision: http://reviews.llvm.org/D17863 llvm-svn: 265480	2016-04-05 22:41:47 +00:00
Manman Ren	f46262e0b7	Swift Calling Convention: add swiftself attribute. Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754	2016-03-29 17:37:21 +00:00
Craig Topper	cf65c62737	[X86] Use MCPhysReg and uint16_t for static arrays of registers and opcodes respectively should reduce size tiny bit. NFC llvm-svn: 262458	2016-03-02 04:42:31 +00:00
Ahmed Bougacha	68a8efa374	[X86][FastISel] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR. FastISel counterpart to r259448. llvm-svn: 259449	2016-02-02 01:44:03 +00:00
Manman Ren	ed967f3752	CXX_FAST_TLS calling convention: performance improvement for x86-64. This is the same change on x86-64 as r255821 on AArch64. rdar://9001553 llvm-svn: 257428	2016-01-12 01:08:46 +00:00
Dimitry Andric	227b928abc	Fix several accidental DOS line endings in source files Summary: There are a number of files in the tree which have been accidentally checked in with DOS line endings. Convert these to native line endings. There are also a few files which have DOS line endings on purpose, and I have set the svn:eol-style property to 'CRLF' on those. Reviewers: joerg, aaron.ballman Subscribers: aaron.ballman, sanjoy, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15848 llvm-svn: 256707	2016-01-03 17:22:03 +00:00
Michael Kuperstein	2ea81baf3a	[X86] Better support for the MCU psABI (LLVM part) This adds support for the MCU psABI in a way different from r251223 and r251224, basically reverting most of these two patches. The problem with the approach taken in r251223/4 is that it only handled libcalls that originated from the backend. However, the mid-end also inserts quite a few libcalls and assumes these use the platform's default calling convention. The previous patch tried to insert inregs when necessary both in the FE and, somewhat hackily, in the CG. Instead, we now define a new default calling convention for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does. Differential Revision: http://reviews.llvm.org/D15054 llvm-svn: 256494	2015-12-28 14:39:21 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.llvm\.memset.)i32\ [0-9]\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, / isVolatile / false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, / isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Sanjoy Das	b11b440f8e	[IR] Add bounds checking to paramHasAttr Summary: This is intended to make a later change simpler. Note: adding this bounds checking required fixing `X86FastISel`. As far I can tell I've preserved original behavior but a careful review will be appreciated. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14304 llvm-svn: 252073	2015-11-04 20:33:45 +00:00
Duncan P. N. Exon Smith	d77de6495e	X86: Remove implicit ilist iterator conversions, NFC llvm-svn: 250741	2015-10-19 21:48:29 +00:00
Simon Pilgrim	5b65f28fe7	[X86][FastISel] Teach how to select SSE4A nontemporal stores. Add FastISel support for SSE4A scalar float / double non-temporal stores Follow up to D13698 Differential Revision: http://reviews.llvm.org/D13773 llvm-svn: 250610	2015-10-17 13:04:42 +00:00
Andrea Di Biagio	c47edbef4c	[x86][FastISel] Teach how to select nontemporal stores. This patch teaches x86 fast-isel how to select nontemporal stores. On x86, we can use MOVNTI for nontemporal stores of doublewords/quadwords. Instructions (V)MOVNTPS/PD/DQ can be used for SSE2/AVX aligned nontemporal vector stores. Before this patch, fast-isel always selected 'movd/movq' instead of 'movnti' for doubleword/quadword nontemporal stores. In the case of nontemporal stores of aligned vectors, fast-isel always selected movaps/movapd/movdqa instead of movntps/movntpd/movntdq. With this patch, if we use SSE2/AVX intrinsics for nontemporal stores we now always get the expected (V)MOVNT instructions. The lack of fast-isel support for nontemporal stores was spotted when analyzing the -O0 codegen for nontemporal stores. Differential Revision: http://reviews.llvm.org/D13698 llvm-svn: 250285	2015-10-14 10:03:13 +00:00
Andrea Di Biagio	77f62652c1	Reapply r249121 : "[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types." This patch teaches FastIsel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i31> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation propagating the register associated to %V. Originally reviewed here: http://reviews.llvm.org/D13347 llvm-svn: 249147	2015-10-02 16:08:05 +00:00
Andrea Di Biagio	45874e67a1	Revert: [FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. r249121 caused a Clang test failure (avx2-buitins.c). Revert r249121 while I keep investigating on the reason why that test failed. llvm-svn: 249124	2015-10-02 13:06:19 +00:00
Andrea Di Biagio	cb33456122	[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types. This patch teaches FastIsel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i31> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastIsel miss: %1 = bitcast <4 x i31> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation propagating the register associated to %V. Differential Revision: http://reviews.llvm.org/D13347 llvm-svn: 249121	2015-10-02 12:45:37 +00:00
Jeroen Ketema	740f9d79ca	Arguments spilled on the stack before a function call may have alignment requirements, for example in the case of vectors. These requirements are exploited by the code generator by using move instructions that have similar alignment requirements, e.g., movaps on x86. Although the code generator properly aligns the arguments with respect to the displacement of the stack pointer it computes, the displacement itself may cause misalignment. For example if we have %3 = load <16 x float>, <16 x float>* %1, align 64 call void @bar(<16 x float> %3, i32 0) the x86 back-end emits: movaps 32(%ecx), %xmm2 movaps (%ecx), %xmm0 movaps 16(%ecx), %xmm1 movaps 48(%ecx), %xmm3 subl $20, %esp <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards movaps %xmm3, (%esp) <-- movaps requires 16-byte alignment, while %esp is not aligned as such. movl $0, 16(%esp) calll __bar To solve this, we need to make sure that the computed value with which the stack pointer is changed is a multiple af the maximal alignment seen during its computation. With this change we get proper alignment: subl $32, %esp movaps %xmm3, (%esp) Differential Revision: http://reviews.llvm.org/D12337 llvm-svn: 248786	2015-09-29 10:12:57 +00:00

1 2 3 4 5 ...

540 Commits