simplifying them and exposing more information to tblgen. It would be nice
if other target authors adopted this as well, particularly arm since it has fastisel.
llvm-svn: 129676
kind of predicate: one that is specific to imm nodes. The predicate function
specified here just checks an int64_t directly instead of messing around with
SDNode's. The virtue of this is that it means that fastisel and other things
can reason about these predicates.
llvm-svn: 129675
structure and fix some fixmes. We now have a TreePredicateFn class
that handles all of the decoding of these things. This is an internal
cleanup that has no impact on the code generated by tblgen.
llvm-svn: 129670
their carry depenedencies with MVT::Flag operands) and use clean and beautiful
EFLAGS dependences instead.
We do this by changing the modelling of SBB/ADC to have EFLAGS input and outputs
(which is what requires the previous scheduler change) and change X86 ISelLowering
to custom lower ADDC and friends down to X86ISD::ADD/ADC/SUB/SBB nodes.
With the previous series of changes, this causes no changes in the testsuite, woo.
llvm-svn: 122213
the output to the correct register. Fixes a hidden problem uncovered
by the last patch where we'd try to DAG combine our MVT::Other node
oddly.
llvm-svn: 121358
backend that they were all implemented except umul. This one fell back
to the default implementation that did a hi/lo multiply and compared the
top. Fix this to check the overflow flag that the 'mul' instruction
sets, so we can avoid an explicit test. Now we compile:
void *func(long count) {
return new int[count];
}
into:
__Z4funcl: ## @_Z4funcl
movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00]
movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
mulq %rcx ## encoding: [0x48,0xf7,0xe1]
seto %cl ## encoding: [0x0f,0x90,0xc1]
testb %cl, %cl ## encoding: [0x84,0xc9]
movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8]
jmp __Znam ## TAILCALL
instead of:
__Z4funcl: ## @_Z4funcl
movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00]
movq %rdi, %rax ## encoding: [0x48,0x89,0xf8]
mulq %rcx ## encoding: [0x48,0xf7,0xe1]
testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2]
movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff]
cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8]
jmp __Znam ## TAILCALL
Other than the silly seto+test, this is using the o bit directly, so it's going in the right
direction.
llvm-svn: 120935
different forms of this instruction (movw/movl/movq) which we reported
as being ambiguous. Since they all do the same thing, gas just picks the
one with the shortest encoding. Follow its lead here.
This implements rdar://8208615
llvm-svn: 118362
exposed:
GAS doesn't accept "fcomip %st(1)", it requires "fcomip %st(1), %st(0)"
even though st(0) is implicit in all other fp stack instructions.
Fortunately, there is an alias for fcomip named "fcompi" and gas does
accept the default argument for the alias (boggle!).
As such, switch the canonical form of this instruction to "pi" instead
of "ip". This makes the code generator and disassembler generate pi,
avoiding the gas bug.
llvm-svn: 118356
shift-by-1 instructions, where the asmstring doesn't contain
the implicit 1. It turns out that a bunch of these rotate
instructions were completely broken because they used 1
instead of $1.
This fixes assembly mismatches on "rclb $1, %bl" and friends,
where we used to generate the 3 byte form, we now generate the
proper 2-byte form.
llvm-svn: 118355
operand list instead of the operand list redundantly declared on the alias
or instruction.
With this change, we finally remove the ins/outs list on the alias. Before:
def : InstAlias<(outs GR16:$dst), (ins GR8 :$src),
"movsx $src, $dst",
(MOVSX16rr8W GR16:$dst, GR8:$src)>;
After:
def : InstAlias<"movsx $src, $dst",
(MOVSX16rr8W GR16:$dst, GR8:$src)>;
This also makes the alias mechanism more general and powerful, which will
be exploited in subsequent patches.
llvm-svn: 118329
aliases installed and working. They now work when the
matched pattern and the result instruction have exactly
the same operand list.
This is now enough for us to define proper aliases for
movzx and movsx, implementing rdar://8017633 and PR7459.
Note that we do not accept instructions like:
movzx 0(%rsp), %rsi
GAS accepts this instruction, but it doesn't make any
sense because we don't know the size of the memory
operand. It could be 8/16/32 bits.
llvm-svn: 117901
directives, allowing things like this:
def : MnemonicAlias<"pop", "popl">, Requires<[In32BitMode]>;
def : MnemonicAlias<"pop", "popq">, Requires<[In64BitMode]>;
Move the rest of the X86 MnemonicAliases over to the .td file.
llvm-svn: 117830
be more complete. These are only expected to be used by llvm-mc with assembly
source so there is no pattern, [], in the .td files. Most are being added to
X86InstrInfo.td as Chris suggested and only comments about register uses are
added. Suggestions welcome on the .td changes as I'm not sure on every detail
of the x86 records. More missing instructions will be coming.
llvm-svn: 116716
The reg-reg copies were no longer being generated since copyPhysReg copies
physical registers only.
The loads and stores are not necessary - The TC constraint is imposed by the
TAILJMP and TCRETURN instructions, there should be no need for constrained loads
and stores.
llvm-svn: 116314
is general goodness because it allows ORs to be converted to LEA to avoid
inserting copies. However, this is bad because it makes the generated .s
file less obvious and gives valgrind heartburn (tons of false positives in
bitfield code).
While the general fix should be in valgrind, we can at least try to avoid
emitting ADD instructions that *don't* get promoted to LEA. This is more
work because it requires introducing pseudo instructions to represents
"add that knows the bits are disjoint", but hey, people really love valgrind.
This fixes this testcase:
https://bugs.kde.org/show_bug.cgi?id=242137#c20
the add r/i cases are coming next.
llvm-svn: 116007
else in X86), and add support for pavgusb. This is apparently the
only instruction (other than movsx) that is preventing ffmpeg from building
with clang.
If someone else is interested in banging out the rest of the 3DNow!
instructions, it should be quite easy now.
llvm-svn: 115466
x86-32: 32-bit calls were named "call" not "calll". 64-bit calls were correctly
named "callq", so this only impacted x86-32.
This fixes rdar://8456370 - llvm-mc rejects 'calll'
This also exposes that mingw/64 is generating a 32-bit call instead of a 64-bit call,
I will file a bugzilla.
llvm-svn: 114534
by having X86DAGToDAGISel::SelectAddr get passed in the parent node
of the operand match (the load/store/atomic op) and having it get
the address space from that, instead of having special FS/GS addr
mode operations that require duplicating the entire instruction set
to support.
This makes FS and GS relative accesses *far* more predictable and
work much better. It also simplifies the X86 backend a bit, more
to come.
There is still a pending issue with nodes like ISD::PREFETCH and
X86ISD::FLD, which really should be MemSDNode's but aren't.
llvm-svn: 114491
Mark _alloca call as clobberring EFLAGS, otherwise some DCE might remove
other flags-clobberring stuff (e.g. cmp instructions) occuring after
_alloca call.
llvm-svn: 112034
we are using AVX and no AVX version of the desired intruction is present,
this is better for incremental dev (without fallbacks it's easier to spot
what's missing). Not sure this is the best hack thought (we can also disable
all HasSSE* predicates by dinamically marking them 'false' if AVX is present)
llvm-svn: 109434
instruction, we only want to allow the one for the current subtarget.
- This also fixes suffix matching for jmp instructions, because it eliminates
the ambiguity between 'jmpl' and 'jmpq'.
llvm-svn: 108746
notes:
- The instructions are being added with dummy placeholder patterns using some 256
specifiers, this is not meant to work now, but since there are some multiclasses
generic enough to accept them, when we go for codegen, the stuff will be already
there.
- Add VEX encoding bits to support YMM
- Add MOVUPS and MOVAPS in the first round
- Use "Y" as suffix for those Instructions: MOVUPSYrr, ...
- All AVX instructions in X86InstrSSE.td will move soon to a new X86InstrAVX
file.
llvm-svn: 107996
jumps where possible and turning the TAILCALL marker in the instruction
asm string into a proper comment.
This eliminates a FIXME and is on the path to finishing:
rdar://7639610 - eliminate encoding and asm info for TAILJMPd TAILJMPr TAILJMPn, etc.
However, I can't eliminate the encodings for these instructions because the JIT
still exists and has its own copy of the encoder, sigh.
llvm-svn: 107946
like all other instructions, even though a segment is not
allowed. This resolves a bunch of gross hacks in the
encoder and makes LEA more consistent with the rest of the
instruction set.
No functionality change.
llvm-svn: 107934
and operand renaming to help.
The giant turn the constraints on and selectively turn it off
should probably be inverted at some point since it's just largely
50/50.
llvm-svn: 106367
prefix byte problem as in r104062.
- As a total hack to keep the TAILCALL markers in the output, which some tests depend on, this invents a new TAILJMP_1 instruction.
llvm-svn: 104120
and %rcr_, leaving just %cr_ which is what people expect.
Updated the disassembler to support this unified register set.
Added a testcase to verify that the registers continue to be
decoded correctly.
llvm-svn: 103196
caused the a pushl instruction to be incorrectly encoding using only two bytes
of immediate, causing the following 2 instruction bytes to be part of the 32-bit
immediate value. Also fixed the one byte form of push to be used when the
immediate would fit in a signed extended byte. Lastly changed the names to not
include the 32 of PUSH32 since they actually push the size of the stack pointer.
llvm-svn: 102951
a new subtarget option for AES and check for the support. Add "westmere"
line of processors and add AES-NI support to the core i7.
Add a couple of TODOs for information I couldn't verify.
llvm-svn: 100231
addl $12, %esp
popl %esi
popl %edi
popl %ebx
popl %ebp
jmpl *__Block_deallocator-L1$pb(%esi) # TAILCALL
The problem is the global base register is assigned GR32 register class. TCRETURNmi needs the registers making up the address mode to have the GR32_TC register class.
The *proper* fix is for X86DAGToDAGISel::getGlobalBaseReg() to return a copy from the global base register of the machine function rather than returning the register itself. But that has the potential of causing it to be coalesced to a more restrictive register class: GR32_TC. It can introduce additional copies and spills. For something as important the PIC base, it's not worth it especially since this is not an issue on 64-bit.
llvm-svn: 99455
ISD node. The only change in the generated isel code are comments
like:
< // Src: (X86dec_flag:i16 GR16:i16:$src)
---
> // Src: (X86dec_flag:i16:i32 GR16:i16:$src)
because now it knows that X86dec_flag returns both an i16 (for the result)
and an i32 (for EFLAGS) in this case. Wewt.
llvm-svn: 99369
to input patterns, we can fix X86ISD::CMP and X86ISD::BT as taking
two inputs (which have to be the same type) and *returning an i32*.
This is how the SDNodes get made in the graph, but we weren't able
to model it this way due to deficiencies in the pattern language.
Now we can change things like this:
def UCOM_FpIr80: FpI_<(outs), (ins RFP80:$lhs, RFP80:$rhs), CompareFP,
- [(X86cmp RFP80:$lhs, RFP80:$rhs),
- (implicit EFLAGS)]>; // CC = ST(0) cmp ST(i)
+ [(set EFLAGS, (X86cmp RFP80:$lhs, RFP80:$rhs))]>;
and fix terrible crimes like this:
-def : Pat<(parallel (X86cmp GR8:$src1, 0), (implicit EFLAGS)),
+def : Pat<(X86cmp GR8:$src1, 0),
(TEST8rr GR8:$src1, GR8:$src1)>;
This relies on matching the result of TEST8rr (which is EFLAGS, which is
an implicit def) to the result of X86cmp, an i32.
llvm-svn: 98903
These instructions technically define AL,AH, but a trick in X86ISelDAGToDAG
reads AX in order to avoid reading AH with a REX instruction.
Fix PR6489.
llvm-svn: 97742
that they are not destination type specific. This allows
tblgen to factor them and the type check is redundant with
what the isel does anyway.
llvm-svn: 97629
Move some utility TableGen defs, classes, etc. into a common file so
they may be used my multiple pattern files. We will use this for
the AVX specification to help with the transition from the current
SSE specification.
llvm-svn: 95727
Lock prefix, Repeat string operation prefixes and the Segment override prefixes.
Also added versions of the move string and store string instructions without the
repeat prefixes to X86InstrInfo.td. And finally marked the rep versions of
move/store string records in X86InstrInfo.td as isCodeGenOnly = 1 so tblgen is
happy building the disassembler files.
llvm-svn: 95252
something totally broken and parsing them as immediates, but the .td file also
had the wrong match class so things sortof worked. Except, that is, that we
would parse
movl $0, %eax
as
movl 0, %eax
Feel free to guess how well that worked.
llvm-svn: 94869
new AsmPrinter. This is perhaps less elegant than describing them
in terms of MOV32r0 and subreg operations, but it allows the
current register to rematerialize them.
llvm-svn: 93158
1. CMPXCHG8B and CMPXCHG16B did not specify implicit physical register defs and uses.
2. LCMPXCHG8B is loading 64 bit memory, not 32 bit.
llvm-svn: 92985
instead use the appropriate subreggy thing. This generates identical
code on some large apps (thanks to Evan's cross class coalescing
stuff he did back in july). This means that MOV16r0 can go away
completely in the future soon.
llvm-svn: 91972
be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.
movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0
instead of
cvtss2sd (%rdi), %xmm0
An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0
llvm-svn: 91672
Note that "hasDotLocAndDotFile"-style debug info was already broken;
people wanting this functionality should implement it in the
AsmPrinter/DwarfWriter code.
llvm-svn: 89711
bunch of associated comments, because it doesn't have anything to do
with DAGs or scheduling. This is another step in decoupling MachineInstr
emitting from scheduling.
llvm-svn: 85517
All of these "subreg32" modifier instructions are handled
explicitly by the MCInst lowering phase. If they got to
the asmprinter, they would explode. They should eventually
be replace with correct use of subregs.
llvm-svn: 84526