With r274952 and r275201 in place there are no cases left where a
forward liveness analysis yields different results than a backward one.
So we can remove the forward stepping logic.
Differential Revision: http://reviews.llvm.org/D22083
llvm-svn: 275204
This bug (llvm.org/PR28124) was introduced by r237977, which refactored
the tail call sequence to be generated in two passes instead of one.
Unfortunately, the stack adjustment produced by the first pass was not
recognized by X86FrameLowering::mergeSPUpdates() in all cases, causing
code such as the following, which clobbers the return address, to be
generated:
popl %edi
popl %edi
pushl %eax
jmp tailcallee # TAILCALL
To fix the problem, the entire stack adjustment is performed in
X86ExpandPseudo::ExpandMI() for tail calls.
Patch by Magnus Lång <margnus1@gmail.com>
Differential Revision: http://reviews.llvm.org/D21325
llvm-svn: 275103
It is an optimization pass, and should not run at -O0. Especially since Fast RA
will not do the required register coalescing anyway, so it's a loss even from
the optimization standpoint.
This also works around (but doesn't quite fix) PR28489.
llvm-svn: 275099
An identity COPY like this:
%AL = COPY %AL, %EAX<imp-def>
has no semantic effect, but encodes liveness information: Further users
of %EAX only depend on this instruction even though it does not define
the full register.
Replace the COPY with a KILL instruction in those cases to maintain this
liveness information. (This reverts a small part of r238588 but this
time adds a comment explaining why a KILL instruction is useful).
llvm-svn: 274952
Until we have a better way to extract constants through bitcasted build vectors (and how to handle undefs of partial lanes etc.) at least accept build vectors that are all zeroes.
llvm-svn: 274833
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.
This fixes PR28146.
The original commit tried inserting an 8bit-subreg into a GR32 (not GR32_ABCD)
which was not appreciated by fast regalloc on 32-bit.
llvm-svn: 274802
These tests don't actually care about the internal opcode number, but have to
be updated whenever we add a new one for GlobalISel. That's bad.
llvm-svn: 274774
xorl + setcc is generally the preferred sequence due to the partial register
stall setcc + movzbl suffers from. As a bonus, it also encodes one byte smaller.
This fixes PR28146.
Differential Revision: http://reviews.llvm.org/D21774
llvm-svn: 274692
The patch removes redundant kmov instructions (not all, we still have a lot of work here) and redundant "and" instructions after "setcc".
I use "AssertZero" marker between X86ISD::SETCC node and "truncate" to eliminate extra "and $1" instruction.
I also changed zext, aext and trunc patterns in the .td file. It allows to remove extra "kmov" instruictions.
This patch fixes https://llvm.org/bugs/show_bug.cgi?id=28173.
Fast ISEL mode is not supported correctly for AVX-512. ICMP/FCMP scalar instruction should return result in k-reg. It will be fixed in one of the next patches. I redirected handling of "cmp" to the DAG builder mode. (The code looks worse in one specific test case, but without this fix the new patch fails).
Differential revision: http://reviews.llvm.org/D21956
llvm-svn: 274613