Chris Lattner
ee7e6f42f8
lcall and ljmp always default to lcalll and ljmpl. This finally
...
wraps up r8418316
llvm-svn: 113949
2010-09-15 05:30:20 +00:00
Chris Lattner
09bfe645f6
apparently jmpl $1,$2 is an alias for ljmpl, similiarly
...
for call. Add this.
llvm-svn: 113948
2010-09-15 05:25:21 +00:00
Chris Lattner
6757eae45e
Disambiguate lcall/ljmp to the 32-bit version. This happens
...
even in 64-bit mode apparently.
llvm-svn: 113945
2010-09-15 05:14:54 +00:00
Chris Lattner
5be87c619b
fix the encoding of sldt GR16 to have the 0x66 prefix, and
...
add sldt GR32, which isn't documented in the intel manual
but which gas accepts. Part of rdar://8418316
llvm-svn: 113938
2010-09-15 04:45:10 +00:00
Chris Lattner
6b40b0def1
implement aliases for shld/shrd, part of rdar://8418316
...
llvm-svn: 113937
2010-09-15 04:37:18 +00:00
Chris Lattner
4bd21710b6
fix rdar://8431880 - rcl/rcr with no shift amount not recognized
...
llvm-svn: 113936
2010-09-15 04:33:27 +00:00
Chris Lattner
81ce173860
add various broken forms of fnstsw. I didn't add the %rax
...
version because it adds a prefix and makes even less sense
than the other broken forms. This wraps up rdar://8431422
llvm-svn: 113932
2010-09-15 04:15:16 +00:00
Chris Lattner
7df35dbd19
add some aliases for f[u]comi, part of rdar://8431422
...
llvm-svn: 113930
2010-09-15 04:08:38 +00:00
Chris Lattner
4dbcba0082
add a bunch of aliases for fp operations with no operand,
...
rdar://8431422
llvm-svn: 113929
2010-09-15 04:04:33 +00:00
Chris Lattner
d28452d94a
Diagnose invalid instructions like "incl" with "too few operands for instruction"
...
instead of crashing. This fixes:
rdar://8431815 - crash when invalid operand is one that isn't present
llvm-svn: 113921
2010-09-15 03:50:11 +00:00
Jim Grosbach
95ede4d3b6
trailing whitespace
...
llvm-svn: 113915
2010-09-15 01:01:45 +00:00
Chris Lattner
5f2311dc29
add a terrible hack to allow out with dx is parens, a gas bug.
...
This fixes PR8114
llvm-svn: 113894
2010-09-14 23:34:29 +00:00
Michael J. Spencer
93c9b2ea93
Revert "CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally."
...
This reverts commit r113632
Conflicts:
cmake/modules/AddLLVM.cmake
llvm-svn: 113819
2010-09-13 23:59:48 +00:00
Dale Johannesen
1eea351920
Fix typos. 128-bit PSHUFB takes 128-bit memory op.
...
v8i16 is not an MMX type; put it where it belongs.
llvm-svn: 113785
2010-09-13 21:15:43 +00:00
John Thompson
1094c80281
Added skeleton for inline asm multiple alternative constraint support.
...
llvm-svn: 113766
2010-09-13 18:15:37 +00:00
Chris Lattner
1bbb14ab8f
add a missed cmov alias, part of rdar://8416805
...
llvm-svn: 113693
2010-09-11 17:08:22 +00:00
Chris Lattner
3340c3e86c
add support for all the setCC aliases. Part of rdar://8416805
...
llvm-svn: 113692
2010-09-11 17:06:05 +00:00
Chris Lattner
b47c042e09
add support for pushfd/popfd which are aliases for pushfl/popfl.
...
This fixes rdar://8408129 - pushfd and popfd get invalid instruction mnemonic errors
llvm-svn: 113690
2010-09-11 16:39:16 +00:00
Chris Lattner
30561aba20
implement rdar://8407928 - support for in/out with a missing "a" register.
...
llvm-svn: 113689
2010-09-11 16:32:12 +00:00
Chris Lattner
a2a9d16b78
fix the asmparser so that the target is responsible for skipping to
...
the end of the line on a parser error, allowing skipping to happen
for syntactic errors but not for semantic errors. Before we would
miss emitting a diagnostic about the second line, because we skipped
it due to the semantic error on the first line:
foo %eax
bar %al
This fixes rdar://8414033 - llvm-mc ignores lines after an invalid instruction mnemonic errors
llvm-svn: 113688
2010-09-11 16:18:25 +00:00
Michael J. Spencer
dc38d36ccb
CMake: Get rid of LLVMLibDeps.cmake and export the libraries normally.
...
llvm-svn: 113632
2010-09-10 21:14:25 +00:00
Bill Wendling
798725b24d
Reapply r113585. The msvc machine is mercurial.
...
llvm-svn: 113610
2010-09-10 20:20:28 +00:00
Bill Wendling
a9c9aaa839
r113585 was causing clang-i686-xp-msvc9 to fail in mysterious ways that I can't
...
understand (the log file was no help).
llvm-svn: 113605
2010-09-10 19:20:47 +00:00
Bill Wendling
638a098f72
Mark the sse_load_f32 and sse_load_f64 load patterns as having memoperands so
...
that the memoperands are properly set after DAG building and general mucking
about.
llvm-svn: 113585
2010-09-10 10:34:22 +00:00
Bruno Cardoso Lopes
e8501a468c
Add one more pattern to fallback movddup
...
llvm-svn: 113522
2010-09-09 18:48:34 +00:00
Roman Divacky
3b727f55aa
Make ELF OS ABI dependent on the OS from target triple.
...
llvm-svn: 113508
2010-09-09 17:57:50 +00:00
Dale Johannesen
0ec303b97b
Move remaining MMX instructions from SSE to MMX.
...
llvm-svn: 113501
2010-09-09 17:13:07 +00:00
Dale Johannesen
5f4a6f295c
Move most MMX instructions (defined as anything that
...
uses MMX, even if it also uses other things) from InstrSSE
into InstrMMX. No (intended) functional change.
llvm-svn: 113462
2010-09-09 01:02:39 +00:00
Chris Lattner
28a9c2f89a
fix rdar://8407548, I missed the commuted form of xchg/test without a suffix.
...
llvm-svn: 113427
2010-09-08 22:27:05 +00:00
Chris Lattner
d7aba234c2
fix wonky formatting.
...
llvm-svn: 113426
2010-09-08 22:22:10 +00:00
Chris Lattner
8ead237758
fix bugs in push/pop segment support, rdar://8407242
...
llvm-svn: 113422
2010-09-08 22:13:08 +00:00
Dale Johannesen
0d2e6ad504
Add intrinsic-based patterns for MMX PINSRW and PEXTRW.
...
llvm-svn: 113420
2010-09-08 22:08:40 +00:00
Dale Johannesen
e54dba94f9
Check in forgotten file. Should fix build.
...
llvm-svn: 113409
2010-09-08 21:09:48 +00:00
Dale Johannesen
4dae01781f
Slight cleanup, use only one form of MMXI_binop_rm_int.
...
llvm-svn: 113406
2010-09-08 20:54:00 +00:00
Dale Johannesen
d79bb127dd
Add intrinsic forms of mmx<->sse conversions. Notes:
...
Omission of memory form of PI2PD is intentional; this
does not use an MMX register and does not put the chip
into MMX mode (PI2PS, oddly enough, does).
Operands of PI2PS follow the gcc builtin, not Intel.
llvm-svn: 113388
2010-09-08 19:15:38 +00:00
Bruno Cardoso Lopes
99a9f4661a
Minor change. Fix comments and remove unused and redundant code
...
llvm-svn: 113378
2010-09-08 18:12:31 +00:00
Bruno Cardoso Lopes
f7fee1c185
x86 vector shuffle lowering now relies only on target specific
...
nodes to emit shuffles and don't do isel mask matching anymore.
- Add the selection of the remaining shuffle opcode (movddup)
- Introduce two new functions to "recognize" where we may get
potential folds and add several comments to them explaining why
they are not yet in the desidered shape.
- Add more patterns to fallback the case where we select
a specific shuffle opcode as if it could fold a load, but it
can't, so remap to a valid instruction.
- Add a couple of FIXMEs to address in the following days once
there's a good solution to the current folding problem.
llvm-svn: 113369
2010-09-08 17:43:25 +00:00
Chris Lattner
2907d2e419
add support for the commuted form of the test instruction, rdar://8018260.
...
llvm-svn: 113352
2010-09-08 05:51:12 +00:00
Chris Lattner
a9ca7837e4
implement proper support for sysret{,l,q}, rdar://8403907
...
llvm-svn: 113350
2010-09-08 05:45:34 +00:00
Chris Lattner
063363fa80
implement the iret suite of instructions properly,
...
fixing rdar://8403974
llvm-svn: 113349
2010-09-08 05:38:31 +00:00
Chris Lattner
086a83afb1
add support for instruction prefixes on the same line as the instruction,
...
implementing rdar://8033482 and PR7254.
llvm-svn: 113348
2010-09-08 05:17:37 +00:00
Chris Lattner
91689c1d0f
change the MC "ParseInstruction" interface to make it the
...
implementation's job to check for and lex the EndOfStatement
marker.
llvm-svn: 113347
2010-09-08 05:10:46 +00:00
Chris Lattner
8caea68a4f
gas accepts xchg <mem>, <reg> as a synonym for xchg <reg>, <mem>.
...
Add this to the mc assembler, fixing PR8061
llvm-svn: 113346
2010-09-08 04:53:27 +00:00
Chris Lattner
4703cb4a96
fix the encoding of the "jump on *cx" family of instructions,
...
rdar://8061602
llvm-svn: 113343
2010-09-08 04:30:51 +00:00
Bruno Cardoso Lopes
6b1d62c529
Factor out some x86 vector shuffle rewriting and add comments about the direction the shuffle lowering is heading to
...
llvm-svn: 113286
2010-09-07 21:03:14 +00:00
Bruno Cardoso Lopes
7c483028fb
Move code around to prepare for moving some of the logic together to another function
...
llvm-svn: 113267
2010-09-07 20:20:27 +00:00
Bill Wendling
353802114f
Add an MVT::x86mmx type. It will take the place of all current MMX vector types.
...
llvm-svn: 113261
2010-09-07 20:03:56 +00:00
Evan Cheng
5444b36e01
Remove a dead comment.
...
llvm-svn: 113259
2010-09-07 20:01:10 +00:00
Bruno Cardoso Lopes
5a45db3e6c
decouple MMX check from regular splat checks. Some refactoring is coming, and MMX should be left alone to be easily removed after moving to intrinsics
...
llvm-svn: 113247
2010-09-07 18:41:45 +00:00
Bruno Cardoso Lopes
4f5d4b4a6e
Remove now useless check, because the code can be matched below, no need to leave it for isel
...
llvm-svn: 113242
2010-09-07 18:29:03 +00:00
Bruno Cardoso Lopes
c9b3316fea
Minor change. Since the checks are equivalent, use isMMX
...
llvm-svn: 113239
2010-09-07 18:24:00 +00:00
Dale Johannesen
605acfe533
Add patterns for MMX that use the new intrinsics.
...
Enable palignr intrinsic.
These may need adjustment for a new VT in due course.
llvm-svn: 113233
2010-09-07 18:10:56 +00:00
Bruno Cardoso Lopes
f0ea222255
Remove unused target specific node
...
llvm-svn: 113224
2010-09-07 17:38:55 +00:00
Benjamin Kramer
1ecb978214
Don't leak the old operand when transforming "sldt" into "sldtw".
...
llvm-svn: 113200
2010-09-07 14:40:58 +00:00
Chris Lattner
30bb384944
add missing cmov aliases, this resolves rdar://8208499
...
llvm-svn: 113189
2010-09-07 00:05:45 +00:00
Chris Lattner
3ae9398d5f
remove duplicated entry
...
llvm-svn: 113188
2010-09-06 23:57:24 +00:00
Chris Lattner
7ece716da2
"sldt <mem>" is ambiguous in 64-bit mode, but should
...
always be disambiguated as sldtw. sldtw and sldtq with
a mem operands have the same effect, but sldtw is more
compact. Force it to sldtw, resolving rdar://8017530
llvm-svn: 113186
2010-09-06 23:51:44 +00:00
Chris Lattner
415e04fad2
fix rdar://8017621 - llvm-mc can't guess encoding for "push $(1000)"
...
llvm-svn: 113184
2010-09-06 23:40:56 +00:00
Chris Lattner
34e366b45c
fix the operand constraints of the immediate form of in/out,
...
allowing unsigned 8-bit operands. This fixes rdar://8208481
llvm-svn: 113182
2010-09-06 23:29:05 +00:00
Chris Lattner
339cc7bfef
in the case where an instruction only has one implementation
...
of a mneumonic, report operand errors with better location
info. For example, we now report:
t.s:6:14: error: invalid operand for instruction
cwtl $1
^
but we fail for common cases like:
t.s:11:4: error: invalid operand for instruction
addl $1, $1
^
because we don't know if this is supposed to be the reg/imm or imm/reg
form.
llvm-svn: 113178
2010-09-06 22:11:18 +00:00
Chris Lattner
628fbecf4f
Now that we know if we had a total fail on the instruction mnemonic,
...
give a more detailed error. Before:
t.s:11:4: error: unrecognized instruction
addl $1, $1
^
t.s:12:4: error: unrecognized instruction
f2efqefa $1
^
After:
t.s:11:4: error: invalid operand for instruction
addl $1, $1
^
t.s:12:4: error: invalid instruction mnemonic 'f2efqefa'
f2efqefa $1
^
This fixes rdar://8017912 - llvm-mc says "unrecognized instruction" when it means "invalid operands"
llvm-svn: 113176
2010-09-06 21:54:15 +00:00
Chris Lattner
31c63fb518
simplify the hacks around jrcxz.
...
llvm-svn: 113167
2010-09-06 20:10:12 +00:00
Chris Lattner
b4be28f33d
have tblgen detect when an instruction would have matched, but
...
failed because a subtarget feature was not enabled. Use this to
remove a bunch of hacks from the X86AsmParser for rejecting things
like popfl in 64-bit mode. Previously these hacks weren't needed,
but were important to get a message better than "invalid instruction"
when used in the wrong mode.
This also fixes bugs where pushal would not be rejected correctly in
32-bit mode (just pusha).
llvm-svn: 113166
2010-09-06 20:08:02 +00:00
Chris Lattner
a22a368e7c
change MatchInstructionImpl to return an enum instead of bool.
...
llvm-svn: 113165
2010-09-06 19:22:17 +00:00
Chris Lattner
3e4582ada5
have AsmMatcherEmitter.cpp produce the hunk of code that gets included
...
into the middle of the class, and rework how the different sections of
the generated file are conditionally included for simplicity.
llvm-svn: 113163
2010-09-06 19:11:01 +00:00
Roman Divacky
e1278b57f9
Redefine LOOP* instructions from I to Ii8PCRel as they take an i8 argument.
...
llvm-svn: 113158
2010-09-06 18:43:14 +00:00
Chris Lattner
4cfbcdc7b6
random cleanups
...
llvm-svn: 113157
2010-09-06 18:32:06 +00:00
Chris Lattner
5cac0f71ca
update this.
...
llvm-svn: 113116
2010-09-05 20:22:09 +00:00
Chris Lattner
eeba0c73e5
implement rdar://6653118 - fastisel should fold loads where possible.
...
Since mem2reg isn't run at -O0, we get a ton of reloads from the stack,
for example, before, this code:
int foo(int x, int y, int z) {
return x+y+z;
}
used to compile into:
_foo: ## @foo
subq $12, %rsp
movl %edi, 8(%rsp)
movl %esi, 4(%rsp)
movl %edx, (%rsp)
movl 8(%rsp), %edx
movl 4(%rsp), %esi
addl %edx, %esi
movl (%rsp), %edx
addl %esi, %edx
movl %edx, %eax
addq $12, %rsp
ret
Now we produce:
_foo: ## @foo
subq $12, %rsp
movl %edi, 8(%rsp)
movl %esi, 4(%rsp)
movl %edx, (%rsp)
movl 8(%rsp), %edx
addl 4(%rsp), %edx ## Folded load
addl (%rsp), %edx ## Folded load
movl %edx, %eax
addq $12, %rsp
ret
Fewer instructions and less register use = faster compiles.
llvm-svn: 113102
2010-09-05 02:18:34 +00:00
Chris Lattner
65b48b5dfc
zap dead code.
...
llvm-svn: 113073
2010-09-04 18:12:00 +00:00
Bruno Cardoso Lopes
c6accda78e
Remove the last bit of isShuffleMaskLegal checks and improve the comment regarding mmx shuffles
...
llvm-svn: 113059
2010-09-04 02:58:56 +00:00
Bruno Cardoso Lopes
731bcc1abf
make explicit that we not handle several mmx shuffles
...
llvm-svn: 113058
2010-09-04 02:50:13 +00:00
Bruno Cardoso Lopes
20779ee157
Emit target specific nodes to handle palignr. Do not touch it for MMX versions yet.
...
llvm-svn: 113056
2010-09-04 02:36:07 +00:00
Bruno Cardoso Lopes
cff7cd18ab
Emit target specific nodes to handle splats starting at zero indicies
...
llvm-svn: 113055
2010-09-04 02:02:14 +00:00
Bruno Cardoso Lopes
95759917eb
Emit target specific nodes for isPSHUFHWMask and isPSHUFLWMask
...
llvm-svn: 113050
2010-09-04 01:36:45 +00:00
Bruno Cardoso Lopes
2b57008c72
Emit target specific nodes for isSHUFPMask
...
llvm-svn: 113048
2010-09-04 01:22:57 +00:00
Bruno Cardoso Lopes
2f7af36134
Previous isMOVLMask matching already emits targets nodes, remove check
...
llvm-svn: 113047
2010-09-04 00:50:08 +00:00
Bruno Cardoso Lopes
9f8e704151
One more check from the original isShuffleMaskLegal goes away
...
llvm-svn: 113045
2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes
16959372bb
Remove a duplicated but useless check that i've inserted in the previous commit.
...
llvm-svn: 113044
2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes
44578d38d3
Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef
...
llvm-svn: 113043
2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes
7829d0e74b
Remove check for unpckh mask
...
llvm-svn: 113035
2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes
d1dacc57aa
Remove check for unpckl mask
...
llvm-svn: 113034
2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes
207b9d6218
Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start
...
checking each standalone condition and decide whether emit target
specific nodes or remove the condition if it's already matched before.
llvm-svn: 113031
2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes
2bef20eda7
Reapply considered harmfull part of rr112934 and r112942.
...
"Use target specific nodes instead of relying in unpckl and
unpckh pattern fragments during isel time. Also place a
depth limit in getShuffleScalarElt.
llvm-svn: 113020
2010-09-03 22:09:41 +00:00
Dale Johannesen
367afb5a00
Remove the rest of the nonexistent 64-bit AVX instructions.
...
Bruno, please review.
llvm-svn: 113014
2010-09-03 21:23:00 +00:00
Bruno Cardoso Lopes
a750d994fe
Reapply last harmless part of r112934, the pattern fragment to match X86Unpcklpd
...
llvm-svn: 113009
2010-09-03 20:44:26 +00:00
Bruno Cardoso Lopes
fe8717c573
Reintroduce a simple function refactoring done in r112934, also without any functionality changes
...
llvm-svn: 113008
2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes
48e589b122
Reapply piecies of r112942 and r112934 which don't do
...
functional changes
llvm-svn: 113007
2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes
6979cf0808
Reapply Fix comment
...
llvm-svn: 113006
2010-09-03 19:55:05 +00:00
Daniel Dunbar
6f3da24d70
Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced
...
some infinite loop and select failures.
- Apologies for eager reverting, but its branch day.
llvm-svn: 113000
2010-09-03 19:38:11 +00:00
Daniel Dunbar
f1aacd55c0
Revert r112938 "Fix comment", which depends on r112934, which introduced some
...
infinite loop and select failures.
llvm-svn: 112999
2010-09-03 19:38:08 +00:00
Daniel Dunbar
0ffe4db45c
Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh
...
mask pattern fragment", which depends on r112934, which introduced some infinite
loop and select failures.
llvm-svn: 112998
2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes
d6634a5b2e
AVX doesn't support mm operations neither its instrinsics.
...
The AVX versions of PALIGN and PABS* should only exist for
128-bit. Remove the unnecessary stuff.
llvm-svn: 112944
2010-09-03 02:08:45 +00:00
Bruno Cardoso Lopes
a85ec10483
Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment
...
llvm-svn: 112942
2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes
adc6bca2dd
Fix comment
...
llvm-svn: 112938
2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes
cce44678b4
- Use specific nodes to match unpckl masks.
...
- Teach getShuffleScalarElt how to handle more target
specific nodes, so the DAGCombine can make use of it.
- Add another hack to avoid the node update problem
during legalization. More description on the comments
llvm-svn: 112934
2010-09-03 01:24:00 +00:00
Jakob Stoklund Olesen
08aede2538
Don't call Predicate_* from X86 target.
...
llvm-svn: 112921
2010-09-03 00:35:18 +00:00
Anton Korobeynikov
a5a645559c
Properly emit __chkstk call instead of __alloca on non-mingw windows targets.
...
Patch by Cameron Esfahani!
llvm-svn: 112902
2010-09-02 23:03:46 +00:00
Bruno Cardoso Lopes
02a05a6a89
Move insertps mask decoding to header file
...
llvm-svn: 112896
2010-09-02 22:43:39 +00:00
Anton Korobeynikov
a689c5b2c0
Revert win64 changes. They seem to be incomplete
...
llvm-svn: 112885
2010-09-02 22:31:32 +00:00
Anton Korobeynikov
56291f7e53
Properly allocate win64 shadow reg area.
...
Patch by Jan Sjodin!
llvm-svn: 112875
2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes
814a69c330
Move decoding of insertps back to avoid unused warnings in x86 isel lowering, and fix movlhps/movhlps to decode 4 elements shuffles
...
llvm-svn: 112869
2010-09-02 21:51:11 +00:00
Dan Gohman
3c9b5f394b
Don't narrow the load and store in a load+twiddle+store sequence unless
...
there are clearly no stores between the load and the store. This fixes
this miscompile reported as PR7833.
This breaks the test/CodeGen/X86/narrow_op-2.ll optimization, which is
safe, but awkward to prove safe. Move it to X86's README.txt.
llvm-svn: 112861
2010-09-02 21:18:42 +00:00
Bruno Cardoso Lopes
c79f50170a
Move x86 specific shuffle mask decoding to its own header, it's also going to be used elsewhere. Also trim trailing whitespaces
...
llvm-svn: 112846
2010-09-02 18:40:13 +00:00
Bruno Cardoso Lopes
489613f1e5
Replace unpckl_undef and unpckh_undef matching with target specific opcodes
...
llvm-svn: 112806
2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes
e4e4be3885
Move condition out to prepare for more matching
...
llvm-svn: 112805
2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes
bf7fd146c7
Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it
...
llvm-svn: 112804
2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes
6a7f634487
become more strict about when it's safe to use X86ISD::MOVLPS
...
llvm-svn: 112799
2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes
04c25c15c7
Revert r112689, avoid those kind of checks cause they mess up with mmx
...
llvm-svn: 112760
2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes
fea81b4831
Using target specific nodes for shuffle nodes makes the mask
...
check more strict, breaking some cases not checked in the
testsuite, but also exposes some foldings not done before,
as this example:
movaps (%rdi), %xmm0
movaps (%rax), %xmm1
movaps %xmm0, %xmm2
movss %xmm1, %xmm2
shufps $36, %xmm2, %xmm0
now is generated as:
movaps (%rdi), %xmm0
movaps %xmm0, %xmm1
movlps (%rax), %xmm1
shufps $36, %xmm1, %xmm0
llvm-svn: 112753
2010-09-01 22:33:20 +00:00
Bruno Cardoso Lopes
b3825216ce
Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
...
llvm-svn: 112694
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
6aaebe877b
minor change, simplify some logic
...
llvm-svn: 112689
2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes
2b025707a2
Move some functions around so they can be used for some other to come function
...
llvm-svn: 112687
2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes
4b56d87290
Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
...
llvm-svn: 112661
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes
61996ef835
Use x86 specific MOVSHDUP node and add more patterns to match it
...
llvm-svn: 112657
2010-08-31 22:22:11 +00:00
Jakob Stoklund Olesen
33e9fce2d6
Make %EFLAGS unallocatable.
...
No CCR virtual registers should exist, and %EFLAGS is used in ways that can
surprise RegAllocFast.
llvm-svn: 112650
2010-08-31 21:51:07 +00:00
Bruno Cardoso Lopes
5de15ce468
Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments
...
llvm-svn: 112644
2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes
03e4c35302
Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
...
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes
dfd9dd5d75
Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles
...
llvm-svn: 112570
2010-08-31 02:26:40 +00:00
Eli Friedman
f75de6eae7
A couple of small missed optimizations.
...
llvm-svn: 112411
2010-08-29 05:07:40 +00:00
Chris Lattner
38ccc8b884
add a bunch more common shuffles to the instprinter.
...
llvm-svn: 112397
2010-08-29 03:08:08 +00:00
Chris Lattner
7a05e6dca2
I have manually decoded the imm field of an insertps one too many
...
times. This patch causes llc and llvm-mc (which both default to
verbose-asm) to print out comments after a few common shuffle
instructions which indicates the shuffle mask, e.g.:
insertps $113, %xmm3, %xmm0 ## xmm0 = zero,xmm0[1,2],xmm3[1]
unpcklps %xmm1, %xmm0 ## xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
pshufd $1, %xmm1, %xmm1 ## xmm1 = xmm1[1,0,0,0]
This is carefully factored to keep the information extraction (of the
shuffle mask) separate from the printing logic. I plan to move the
extraction part out somewhere else at some point for other parts of
the x86 backend that want to introspect on the behavior of shuffles.
llvm-svn: 112387
2010-08-28 20:42:31 +00:00
Chris Lattner
94656b1c8c
fix the buildvector->insertp[sd] logic to not always create a redundant
...
insertp[sd] $0, which is a noop. Before:
_f32: ## @f32
pshufd $1, %xmm1, %xmm2
pshufd $1, %xmm0, %xmm3
addss %xmm2, %xmm3
addss %xmm1, %xmm0
## kill: XMM0<def> XMM0<kill> XMM0<def>
insertps $0, %xmm0, %xmm0
insertps $16, %xmm3, %xmm0
ret
after:
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm3
addss %xmm1, %xmm3
movdqa %xmm2, %xmm0
insertps $16, %xmm3, %xmm0
ret
The extra movs are due to a random (poor) scheduling decision.
llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner
bcb6090ad0
fix the BuildVector -> unpcklps logic to not do pointless shuffles
...
when the top elements of a vector are undefined. This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined. For example, on:
_Complex float f32(_Complex float A, _Complex float B) {
return A+B;
}
We used to produce (with SSE2, SSE4.1+ uses insertps):
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $16, %xmm2, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm0
addss %xmm1, %xmm0
pshufd $16, %xmm0, %xmm1
movdqa %xmm2, %xmm0
unpcklps %xmm1, %xmm0
ret
We now produce:
_f32: ## @f32
movdqa %xmm0, %xmm2
addss %xmm1, %xmm2
pshufd $1, %xmm1, %xmm1
pshufd $1, %xmm0, %xmm3
addss %xmm1, %xmm3
movaps %xmm2, %xmm0
unpcklps %xmm3, %xmm0
ret
This implements rdar://8368414
llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner
96db6e66f4
improve comments in the unpcklps generating logic, introduce
...
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose. No functionality change.
llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes
a982aa24ef
Clean up the logic of vector shuffles -> vector shifts.
...
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.
llvm-svn: 112348
2010-08-28 02:46:39 +00:00
Anton Korobeynikov
c0b36921c2
Properly handle passing of FP stuff to varargs function on Win64:
...
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!
llvm-svn: 112262
2010-08-27 14:43:06 +00:00
Daniel Dunbar
1844a71e66
X86: Fix an encoding issue with LOCK_ADD64mr, which could lead to very hard to find miscompiles with the integrated assembler.
...
llvm-svn: 112250
2010-08-27 01:30:14 +00:00
Jim Grosbach
6a77066913
Simplify eliminateFrameIndex() interface back down now that PEI doesn't need
...
to try to re-use scavenged frame index reference registers. rdar://8277890
llvm-svn: 112241
2010-08-26 23:32:16 +00:00
Bruno Cardoso Lopes
e25ba0c7c2
zap the now unused MVT::getIntVectorWithNumElements
...
llvm-svn: 112218
2010-08-26 20:53:12 +00:00
Bob Wilson
a967c42a3d
Fix comment typos.
...
llvm-svn: 112202
2010-08-26 18:08:11 +00:00
Chris Lattner
eb2cc0ce0e
implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
...
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Chris Lattner
cc60609cb4
fix sse1 only codegen in x86-64 mode, which is something we
...
apparently try to support.
llvm-svn: 112168
2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes
184eaea855
Fix PR7748 without using microsoft extensions
...
llvm-svn: 112128
2010-08-26 01:02:53 +00:00
Chris Lattner
aecf47a5cb
we should pattern match the SSE complex arithmetic ops.
...
llvm-svn: 112109
2010-08-25 23:31:42 +00:00
Bruno Cardoso Lopes
d4085f6e91
Revert this for now, PUNPCKLDQ dont operate on v4f32
...
llvm-svn: 112090
2010-08-25 21:26:37 +00:00
Daniel Dunbar
3d148ac089
X86: Fix misencode of RI64mi8. This fixes OpenSSL / x86_64-apple-darwin10 / clang -O3.
...
llvm-svn: 112089
2010-08-25 21:11:02 +00:00
Benjamin Kramer
f1f2133ac0
Remove dead recursive function. Yay for clang -Wunused-function.
...
llvm-svn: 112060
2010-08-25 17:27:58 +00:00
Anton Korobeynikov
b3b53ecac0
Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there.
...
Mark _alloca call as clobberring EFLAGS, otherwise some DCE might remove
other flags-clobberring stuff (e.g. cmp instructions) occuring after
_alloca call.
llvm-svn: 112034
2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes
0770d25758
PUNPCKLDQ should also be used for v4f32
...
llvm-svn: 112020
2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes
2e45d522c1
teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests
...
llvm-svn: 112017
2010-08-25 02:35:37 +00:00
Daniel Dunbar
1c8d777c93
MC/X86: Tweak imul recognition, previous hack only applies for the imul form
...
taking immediates.
llvm-svn: 111950
2010-08-24 19:37:56 +00:00
Daniel Dunbar
09392785b4
MC/X86: Add custom hack for recognizing "imul $12, %eax" and friends.
...
llvm-svn: 111947
2010-08-24 19:24:18 +00:00
Daniel Dunbar
94b84a19b9
MC/X86: Warn on scale factors > 1 without index register, instead of erroring,
...
for 'as' compatibility.
llvm-svn: 111945
2010-08-24 19:13:38 +00:00
Dan Gohman
c88fda477a
Fix X86's isLegalAddressingMode to recognize that static addresses
...
need not be RIP-relative in small mode.
llvm-svn: 111917
2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes
758d7b1f5c
Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments
...
llvm-svn: 111890
2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes
264d90fff7
Start using target speficic nodes for shuffles: pshufhw and pshuflw
...
llvm-svn: 111837
2010-08-23 20:41:02 +00:00
Gabor Greif
21fed6616c
tyops
...
llvm-svn: 111835
2010-08-23 20:30:51 +00:00
Chris Lattner
58bd73a5a7
Add a new llvm.x86.int intrinsic, allowing access to the
...
x86 int and int3 instructions. Patch by Peter Housel!
llvm-svn: 111831
2010-08-23 19:39:25 +00:00
Chris Lattner
a42202e0e4
random improvement for variable shift codegen.
...
llvm-svn: 111813
2010-08-23 17:30:29 +00:00