allowing it to simplify the crazy constantexprs in the testcases
down to something sensible. This allows -std-compile-opts to
completely "devirtualize" the pointers to member functions in
the testcase from PR5176.
llvm-svn: 84368
to be more general and understand more varieties of loops.
Teach CodePlacementOpt to reorganize the basic blocks of a loop so that
they are contiguous. This also includes a fair amount of logic for preserving
fall-through edges while doing so. This fixes a BranchFolding-ism where blocks
which can't be made to use a fall-through edge and don't conveniently fit
anywhere nearby get tossed out to the end of the function.
llvm-svn: 84295
Update testcases that rely on malloc insts being present.
Also prematurely remove MallocInst handling from IndMemRemoval and RaiseAllocations to help pass tests in this incremental step.
llvm-svn: 84292
header is just the entry block to the loop, and it needn't be at
the top of the loop in the code layout.
Remove the code that suppressed loop alignment for outer loops,
so that outer loops are aligned.
llvm-svn: 84158
cannot alias the GEP. GEP pointer alias rule states this clearly:
A pointer value formed from a getelementptr instruction is associated with the
addresses associated with the first operand of the getelementptr.
llvm-svn: 84079
producing any stores at all for a long time, but ".store." was in some
IR instruction names until recently. This removal caused the test to
start failing. Just make it reject any stores.
llvm-svn: 83895
it to hold the address of an sret return value, for x86-64 ABI purposes.
Also, fix the test that was originally intended to test this to actually
test it, using FileCheck.
llvm-svn: 83853
input the the mul is a zext from bool, just that it is all zeros
other than the low bit. This fixes some phase ordering issues
that would cause us to miss some xforms in mul.ll when the worklist
is visited differently.
llvm-svn: 83794
For now the metadata of sinked/hoisted instructions is still wrong, but that'll
be fixed when instructions will have debug metadata directly attached.
llvm-svn: 83786
done by condprop, but do it in a much more general form. The
basic idea is that we can do a limited form of tail duplication
in the case when we have a branch on a phi. Moving the branch
up in to the predecessor block makes instruction selection
much easier and encourages chained jump threadings.
llvm-svn: 83759
from GVN, this also speeds it up, inserts fewer PHI nodes (see the
testcase) and allows it to remove more loads (due to fewer PHI nodes
standing in the way).
llvm-svn: 83746
when one of the bits being tested would end up being the sign bit in the
narrower type, and a signed comparison is being performed, since this would
change the result of the signed comparison. This fixes PR5132.
llvm-svn: 83670
and that will make Caller too big to inline, see if it
might be better to inline Caller into its callers instead.
This situation is described in PR 2973, although I haven't
tried the specific case in SPASS.
llvm-svn: 83602
verbose-asm mode, print comments instead. This eliminates a non-comment
difference between verbose-asm mode and non-verbose-asm mode.
Also, factor out the relevant code out of all the targets and into
target-independent code.
llvm-svn: 83392
instruction. This makes it re-materializable.
Thumb2 will split it back out into two instructions so IT pass will generate the
right mask. Also, this expose opportunies to optimize the movw to a 16-bit move.
llvm-svn: 82982
physical registers. This is especially critical for the later two since they
start the live interval of a super-register. e.g.
%DO<def> = INSERT_SUBREG %D0<undef>, %S0<kill>, 1
If this instruction is eliminated, the register scavenger will not be happy as
D0 is not defined previously.
This fixes PR5055.
llvm-svn: 82968
the PassManager code into a regular verifyAnalysis method.
Also, reorganize loop verification. Make the LoopPass infrastructure
call verifyLoop as needed instead of having LoopInfo::verifyAnalysis
check every loop in the function after each looop pass. Add a new
command-line argument, -verify-loop-info, to enable the expensive
full checking.
llvm-svn: 82952
that are phi nodes. Also tighten up FoldOpIntoPhi to treat constantexpr
operands to phis just like other variables, avoiding moving constantexpr
computations around.
Patch by Daniel Dunbar.
llvm-svn: 82913
allows matching and remembering a string and then matching and
verifying that the string occurs later in the file.
Change X86/xor.ll to use this in some cases where the test was
checking for an arbitrary register allocation decision.
llvm-svn: 82891
which have no defs anywhere in the function. In particular, this fixes sinking
of instructions that reference RIP on x86-64, which is currently being modeled
as a register.
llvm-svn: 82815
- Allocate MachineMemOperands and MachineMemOperand lists in MachineFunctions.
This eliminates MachineInstr's std::list member and allows the data to be
created by isel and live for the remainder of codegen, avoiding a lot of
copying and unnecessary translation. This also shrinks MemSDNode.
- Delete MemOperandSDNode. Introduce MachineSDNode which has dedicated
fields for MachineMemOperands.
- Change MemSDNode to have a MachineMemOperand member instead of its own
fields with the same information. This introduces some redundancy, but
it's more consistent with what MachineInstr will eventually want.
- Ignore alignment when searching for redundant loads for CSE, but remember
the greatest alignment.
Target-specific code which previously used MemOperandSDNodes with generic
SDNodes now use MemIntrinsicSDNodes, with opcodes in a designated range
so that the SelectionDAG framework knows that MachineMemOperand information
is available.
llvm-svn: 82794
regex and matching it instead of trying to match chunks at a time.
Matching chunks at a time broke with check lines like
CHECK: foo {{.*}}bar
because the .* would eat the entire rest of the line and bar would
never match.
Now we just escape the fixed strings for the user, so that something
like:
CHECK: a() {{.*}}???
is matched as:
CHECK: {{a\(\) .*\?\?\?}}
transparently "under the covers".
llvm-svn: 82779
For the AAPCS ABI, SP must always be 4-byte aligned, and at any "public
interface" it must be 8-byte aligned. For the older ARM APCS ABI, the stack
alignment is just always 4 bytes. For X86, we currently align SP at
entry to a function (e.g., to 16 bytes for Darwin), but no stack alignment
is needed at other times, such as for a leaf function.
After discussing this with Dan, I decided to go with the approach of adding
a new "TransientStackAlignment" field to TargetFrameInfo. This value
specifies the stack alignment that must be maintained even in between calls.
It defaults to 1 except for ARM, where it is 4. (Some other targets may
also want to set this if they have similar stack requirements. It's not
currently required for PPC because it sets targetHandlesStackFrameRounding
and handles the alignment in target-specific code.) The existing StackAlignment
value specifies the alignment upon entry to a function, which is how we've
been using it anyway.
llvm-svn: 82767
LiveVariables add implicit kills to correctly track partial register kills. This works well enough and is fairly accurate. But coalescer can make it impossible to maintain these markers. e.g.
BL <ga:sss1>, %R0<kill,undef>, %S0<kill>, %R0<imp-def>, %R1<imp-def,dead>, %R2<imp-def,dead>, %R3<imp-def,dead>, %R12<imp-def,dead>, %LR<imp-def,dead>, %D0<imp-def>, ...
...
%reg1031<def> = FLDS <cp#1>, 0, 14, %reg0, Mem:LD4[ConstantPool]
...
%S0<def> = FCPYS %reg1031<kill>, 14, %reg0, %D0<imp-use,kill>
When reg1031 and S0 are coalesced, the copy (FCPYS) will be eliminated the the implicit-kill of D0 is lost. In this case it's possible to move the marker to the FLDS. But in many cases, this is not possible. Suppose
%reg1031<def> = FOO <cp#1>, %D0<imp-def>
...
%S0<def> = FCPYS %reg1031<kill>, 14, %reg0, %D0<imp-use,kill>
When FCPYS goes away, the definition of S0 is the "FOO" instruction. However, transferring the D0 implicit-kill to FOO doesn't work since it is the def of D0 itself. We need to fix this in another time by introducing a "kill" pseudo instruction to track liveness.
Disabling the assertion is not ideal, but machine verifier is doing that job now. It's important to know double-def is not a miscomputation since it means a register should be free but it's not tracked as free. It's a performance issue instead.
llvm-svn: 82677
of the defs are processed.
Also fix a implicit_def propagation bug: a implicit_def of a physical register
should be applied to uses of the sub-registers.
llvm-svn: 82616
take into consideration that the result of an invoke is only valid in
the normal dest, not the unwind dest. This caused 'PHINode::hasConstantValue'
to return true in an invalid situation, causing mem2reg to delete a phi that
was actually needed. This caused a crash building 483.xalancbmk.
llvm-svn: 82491
variable increment / decrement slighter high priority.
This has major impact on some micro-benchmarks. On MultiSource/Applications
and spec tests, it's a minor win. It also reduce 256.bzip instruction count
by 8%, 55 on 164.gzip on i386 / Darwin.
llvm-svn: 82485
And fix a bug with the behavior of min/max instructions formed from
fcmp uge comparisons.
Also, use FiniteOnlyFPMath() for this code instead of UnsafeFPMath,
as it is more specific.
llvm-svn: 82466
from a piece of a large store when both are in the same block.
This allows clang to compile the testcase in PR4216 to this code:
_test_bitfield:
movl 4(%esp), %eax
movl %eax, %ecx
andl $-65536, %ecx
orl $32962, %eax
andl $40186, %eax
orl %ecx, %eax
ret
This is not ideal, but is a whole lot better than the code produced
by llvm-gcc:
_test_bitfield:
movw $-32574, %ax
orw 4(%esp), %ax
andw $-25350, %ax
movw %ax, 4(%esp)
movw 7(%esp), %cx
shlw $8, %cx
movzbl 6(%esp), %edx
orw %cx, %dx
movzwl %dx, %ecx
shll $16, %ecx
movzwl %ax, %eax
orl %ecx, %eax
ret
and dramatically better than that produced by gcc 4.2:
_test_bitfield:
pushl %ebx
call L3
"L00000000001$pb":
L3:
popl %ebx
movl 8(%esp), %eax
leal 0(,%eax,4), %edx
sarb $7, %dl
movl %eax, %ecx
andl $7168, %ecx
andl $-7201, %ebx
movzbl %dl, %edx
andl $1, %edx
sall $5, %edx
orl %ecx, %ebx
orl %edx, %ebx
andl $24, %eax
andl $-58336, %ebx
orl %eax, %ebx
orl $32962, %ebx
movl %ebx, %eax
popl %ebx
ret
llvm-svn: 82439
so that nonlocal and partially redundant loads can use it as well.
The testcase shows examples of craziness this can handle. This triggers
*many* times in 176.gcc.
llvm-svn: 82403
(and load -> load) when the base pointers must alias but when
they are different types. This occurs very very frequently in
176.gcc and other code that uses bitfields a lot.
llvm-svn: 82399
we pushed the beginning of the interval back 1, so the
interval would overlap with inputs that die. We were
also pushing the end of the interval back 1, though,
which means the earlyclobber didn't overlap with other
output operands. Don't do this. PR 4964.
llvm-svn: 82342
getSymbolForDwarfGlobalReference is smart enough to know that it
needs to register the stub it references with MachineModuleInfoMachO,
so that it gets emitted at the end of the file.
Move stub emission from X86ATTAsmPrinter::doFinalization to the
new X86ATTAsmPrinter::EmitEndOfAsmFile asmprinter hook. The important
thing here is that EmitEndOfAsmFile is called *after* the ehframes are
emitted, so we get all the stubs.
This allows us to remove a gross hack from the asmprinter where it would
"just know" that it needed to output stubs for personality functions.
Now this is all driven from a consistent interface.
The testcase change is just reordering the expected output now that the
stubs come out after the ehframe instead of before.
This also unblocks other changes that Bill wants to make.
llvm-svn: 82269
where the induction variable has a non-unit stride, such as {0,+,2}, and
there are expressions such as {1,+,2} inside the loop formed with
or or add nsw operators.
llvm-svn: 82151
more than one phi, since that leads to higher register pressure on
entry to the phi. This is especially problematic when the phi is in
a loop header, as it increases register pressure throughout the loop.
llvm-svn: 81993