Chris Lattner
ba3ba8fa1f
some peepholes that should match horizontal add/sub operations.
...
llvm-svn: 163103
2012-09-03 02:58:21 +00:00
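Illustrative aside, not from the commit: SSE3's haddps/hsubps are the horizontal ops such peepholes would match. A sketch in AT&T syntax:
haddps %xmm1, %xmm0   # xmm0 = { a0+a1, a2+a3, b0+b1, b2+b3 }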
Benjamin Kramer
57003a6768
Add a note for -ffast-math optimization of vector norm.
...
llvm-svn: 153031
2012-03-19 00:43:34 +00:00
Benjamin Kramer
863683c590
This is now implemented.
...
llvm-svn: 146258
2011-12-09 15:45:57 +00:00
Lang Hames
de7ab801cc
Add a natural stack alignment field to TargetData, and prevent InstCombine from
...
promoting allocas to preferred alignments that exceed the natural
alignment. This avoids some potentially expensive dynamic stack realignments.
The natural stack alignment is set in target data strings via the "S<size>"
option. Size is in bits and must be a multiple of 8. The natural stack alignment
defaults to "unspecified" (represented by a zero value), and the "unspecified"
value does not prevent any alignment promotions. Target maintainers that care
about avoiding promotions should explicitly add the "S<size>" option to their
target data strings.
llvm-svn: 141599
2011-10-10 23:42:08 +00:00
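A sketch of the option described above, with an illustrative value: appending S128 declares a 128-bit (16-byte) natural stack alignment, as on typical x86-64 targets; the rest of the string is only an example:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"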
Benjamin Kramer
69affe6a94
Add a note about SSE4.1 roundss/roundsd.
...
llvm-svn: 125438
2011-02-12 17:58:16 +00:00
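For reference, an illustrative use, not from the commit: the SSE4.1 round instructions take an immediate rounding-mode control. In AT&T syntax:
roundss $1, %xmm0, %xmm0   # imm 1 = round toward -infinity (floor) on the scalar single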
Chris Lattner
5cac0f71ca
update this.
...
llvm-svn: 113116
2010-09-05 20:22:09 +00:00
Chris Lattner
aecf47a5cb
we should pattern match the SSE complex arithmetic ops.
...
llvm-svn: 112109
2010-08-25 23:31:42 +00:00
Chris Lattner
a42202e0e4
random improvement for variable shift codegen.
...
llvm-svn: 111813
2010-08-23 17:30:29 +00:00
Jakob Stoklund Olesen
98ee37d878
Remove obsolete README_SSE note.
...
We are generating movaps for all XMM register copies, including scalar
floating point values. This is known to be at least as good as movss and movsd
for all known architectures up to and including Nehalem because it avoids a
partial register stall.
The SSEDomainFix pass will switch movaps to movdqa when appropriate (i.e., when
operands come from the integer unit). We don't know that switching movaps to
movapd has any benefit.
The same applies to andps -> pand.
llvm-svn: 108096
2010-07-11 17:13:42 +00:00
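A sketch of the copy choice described above, in AT&T syntax:
movaps %xmm1, %xmm0   # full-register copy, even for a scalar; no partial register stall
movss  %xmm1, %xmm0   # merges into the low lane of %xmm0, creating the dependency movaps avoids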
Chris Lattner
7b909ac785
some notes about suboptimal insertps's
...
llvm-svn: 107613
2010-07-05 05:48:41 +00:00
Eli Friedman
ceb13f2af3
Remove some already-fixed README entries.
...
llvm-svn: 105377
2010-06-03 01:47:31 +00:00
Eli Friedman
a59b7a72b9
Remove README entry which no longer compiles to something sane.
...
llvm-svn: 105376
2010-06-03 01:16:51 +00:00
Dan Gohman
6f34abd092
Floating-point add, sub, and mul are now spelled fadd, fsub, and fmul,
...
respectively.
llvm-svn: 97531
2010-03-02 01:11:08 +00:00
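The new spellings, for reference:
%sum = fadd float %a, %b   ; was: add float %a, %b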
Dan Gohman
4a618827de
Fix "the the" and similar typos.
...
llvm-svn: 95781
2010-02-10 16:03:48 +00:00
Chris Lattner
cf11e602a2
add a note from PR6194
...
llvm-svn: 95649
2010-02-09 05:45:29 +00:00
Chris Lattner
fb5670fc16
move the PR6214 microoptzn to this file.
...
llvm-svn: 95299
2010-02-04 07:32:01 +00:00
Chris Lattner
3eb76c23dd
this is an SSE-specific issue.
...
llvm-svn: 93373
2010-01-13 23:29:11 +00:00
Chris Lattner
e84a7911c4
Bill implemented this.
...
llvm-svn: 63752
2009-02-04 19:09:07 +00:00
Chris Lattner
553fd7e1eb
add a note, this is why we're faster at SciMark-MonteCarlo with
...
SSE disabled.
llvm-svn: 63751
2009-02-04 19:08:01 +00:00
Evan Cheng
f31f288863
The memory alignment requirement on some of the mov{h|l}p{d|s} patterns is 16 bytes. That is overly strict. These instructions read / write f64 memory locations without an alignment requirement.
...
llvm-svn: 63195
2009-01-28 08:35:02 +00:00
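An illustration of the relaxed requirement, in AT&T syntax:
movlpd (%eax), %xmm0   # 8-byte f64 load into the low half; needs no 16-byte alignment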
Chris Lattner
9a8eb0d534
add a note
...
llvm-svn: 56391
2008-09-20 19:17:53 +00:00
Chris Lattner
f076d5eea8
add a note
...
llvm-svn: 54964
2008-08-19 00:41:02 +00:00
Evan Cheng
3fc2372d3a
- Fix an x86 vector isel bug: illegal transformation of a vector_shuffle into a
...
shift.
- Add a readme entry for a missing vector_shuffle optimization that results in
awful codegen.
llvm-svn: 52740
2008-06-25 20:52:59 +00:00
Evan Cheng
8647b875cc
This is done.
...
llvm-svn: 51526
2008-05-24 00:10:13 +00:00
Evan Cheng
04d24edcbb
Use movlps / movhps to modify the low / high half of a 16-byte memory location.
...
llvm-svn: 51501
2008-05-23 21:23:16 +00:00
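A sketch of the idea, in AT&T syntax:
movlps %xmm0, (%eax)    # update only the low 8 bytes in place
movhps %xmm0, 8(%eax)   # update only the high 8 bytes in place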
Dan Gohman
66eea1b9b3
Elaborate on the entry on integer vector multiplication by constants.
...
llvm-svn: 51491
2008-05-23 18:05:39 +00:00
Evan Cheng
d25cb8e0d2
New entry.
...
llvm-svn: 51487
2008-05-23 17:28:11 +00:00
Chris Lattner
3546c2b4e4
we compile multiply-by-constant into horrible code. Doesn't SSE4 have some
...
instruction for doing this?
llvm-svn: 51473
2008-05-23 04:29:53 +00:00
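Aside, not from the commit: SSE4.1's pmulld, a packed 32-bit multiply, is the likely candidate:
pmulld %xmm1, %xmm0   # each 32-bit lane: xmm0[i] *= xmm1[i]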
Chris Lattner
03ce206143
add a note
...
llvm-svn: 51062
2008-05-13 19:56:20 +00:00
Chris Lattner
d17f58ae6e
add a note
...
llvm-svn: 51060
2008-05-13 18:48:54 +00:00
Evan Cheng
1120279ae6
Instead of a vector load, shuffle, and element extract, load the element from the address with an offset.
...
pshufd $1, (%rdi), %xmm0
movd %xmm0, %eax
=>
movl 4(%rdi), %eax
llvm-svn: 51026
2008-05-13 08:35:03 +00:00
Evan Cheng
3f40c69083
On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16.
...
llvm-svn: 51019
2008-05-13 00:54:02 +00:00
Evan Cheng
b980f6fb3d
Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.
...
llvm-svn: 51008
2008-05-12 23:04:07 +00:00
Anton Korobeynikov
a38e72d247
Add note
...
llvm-svn: 50959
2008-05-11 14:33:15 +00:00
Chris Lattner
aeb23a8a34
add a note, this is actually not too bad to implement.
...
llvm-svn: 49466
2008-04-10 05:54:50 +00:00
Chris Lattner
c692188075
move the x86-32 part of PR2108 here.
...
llvm-svn: 49465
2008-04-10 05:37:47 +00:00
Chris Lattner
b6387c8a74
Finish implementing a readme entry: when inserting an i64 variable
...
into a vector of zeros or undef, and when the top part is obviously
zero, we can just use movd + shuffle. This allows us to compile
vec_set-B.ll into:
_test3:
movl $1234567, %eax
andl 4(%esp), %eax
movd %eax, %xmm0
ret
instead of:
_test3:
subl $28, %esp
movl $1234567, %eax
andl 32(%esp), %eax
movl %eax, (%esp)
movl $0, 4(%esp)
movq (%esp), %xmm0
addl $28, %esp
ret
llvm-svn: 48090
2008-03-09 05:42:06 +00:00
Chris Lattner
93930dc28c
add a note
...
llvm-svn: 48064
2008-03-09 01:08:22 +00:00
Chris Lattner
eef374c197
Implement a readme entry, compiling
...
#include <xmmintrin.h>
__m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);}
into:
movl $1, %eax
movd %eax, %xmm0
ret
instead of a constant pool load.
llvm-svn: 48063
2008-03-09 01:05:04 +00:00
Chris Lattner
35adf46967
This one looks easy, add a note.
...
llvm-svn: 48055
2008-03-08 22:32:39 +00:00
Chris Lattner
a76e23a935
move these to the appropriate file
...
llvm-svn: 48054
2008-03-08 22:28:45 +00:00
Chris Lattner
7c08a01698
Evan implemented this.
...
llvm-svn: 47948
2008-03-05 17:11:51 +00:00
Chris Lattner
2acd0c25f6
add a note
...
llvm-svn: 47939
2008-03-05 07:22:39 +00:00
Chris Lattner
a70df9e2ee
Evan implemented these.
...
llvm-svn: 47828
2008-03-02 18:05:14 +00:00
Chris Lattner
eb63b09206
upgrade some entries, remove stuff that is done.
...
llvm-svn: 47109
2008-02-14 06:19:02 +00:00
Nate Begeman
eea32990a9
readme updates
...
llvm-svn: 47051
2008-02-13 07:06:12 +00:00
Nate Begeman
2d77e8e446
Enable SSE4 codegen and pattern matching.
...
Add some notes to the README.
llvm-svn: 46949
2008-02-11 04:19:36 +00:00
Chris Lattner
2e4719ec55
add a note
...
llvm-svn: 46413
2008-01-27 07:31:41 +00:00
Chris Lattner
2dd23b9f32
Add some notes.
...
llvm-svn: 46405
2008-01-26 20:12:07 +00:00
Chris Lattner
d2b8a36f0e
One readme entry is done, one is really easy (Evan, want to investigate
...
eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn
may be done (if shufps is better than pinsrw, Evan, please review), and
we already know about LICM of simple instructions.
llvm-svn: 45407
2007-12-29 19:31:47 +00:00