allows appropriate backends to generate a sqrt instruction.
On x86, this isn't done at -O0 because we go through
FastISel instead. This is a behavior change from before
this series of sqrt patches started. I think this is OK
considering that compile speed is most important at -O0, but
could be convinced otherwise.
llvm-svn: 82778
For the AAPCS ABI, SP must always be 4-byte aligned, and at any "public
interface" it must be 8-byte aligned. For the older ARM APCS ABI, the stack
alignment is just always 4 bytes. For X86, we currently align SP at
entry to a function (e.g., to 16 bytes for Darwin), but no stack alignment
is needed at other times, such as for a leaf function.
After discussing this with Dan, I decided to go with the approach of adding
a new "TransientStackAlignment" field to TargetFrameInfo. This value
specifies the stack alignment that must be maintained even in between calls.
It defaults to 1 except for ARM, where it is 4. (Some other targets may
also want to set this if they have similar stack requirements. It's not
currently required for PPC because it sets targetHandlesStackFrameRounding
and handles the alignment in target-specific code.) The existing StackAlignment
value specifies the alignment upon entry to a function, which is how we've
been using it anyway.
llvm-svn: 82767
interest for this, as it currently reserves a register rather than using
the scavenger for matierializing constants as needed.
Instead of scavenging registers on the fly while eliminating frame indices,
new virtual registers are created, and then a scavenged collectively in a
post-pass over the function. This isolates the bits that need to interact
with the scavenger, and sets the stage for more intelligent use, and reuse,
of scavenged registers.
For the time being, this is disabled by default. Once the bugs are worked out,
the current scavenging calls in replaceFrameIndices() will be removed and
the post-pass scavenging will be the default. Until then,
-enable-frame-index-scavenging enables the new code. Currently, only the
Thumb1 back end is set up to use it.
llvm-svn: 82734
LocalAreaOffset. (We don't have any of those right now.)
PEI::calculateFrameObjectOffsets includes the absolute value of the
LocalAreaOffset in the cumulative offset value used to calculate the
stack frame size. It then adds the raw value of the LocalAreaOffset
to the stack size. For a StackGrowsDown target, that raw value is negative
and has the effect of cancelling out the absolute value that was added
earlier, but that obviously won't work for a StackGrowsUp target. Change
to subtract the absolute value of the LocalAreaOffset.
llvm-svn: 82693
LiveVariables add implicit kills to correctly track partial register kills. This works well enough and is fairly accurate. But coalescer can make it impossible to maintain these markers. e.g.
BL <ga:sss1>, %R0<kill,undef>, %S0<kill>, %R0<imp-def>, %R1<imp-def,dead>, %R2<imp-def,dead>, %R3<imp-def,dead>, %R12<imp-def,dead>, %LR<imp-def,dead>, %D0<imp-def>, ...
...
%reg1031<def> = FLDS <cp#1>, 0, 14, %reg0, Mem:LD4[ConstantPool]
...
%S0<def> = FCPYS %reg1031<kill>, 14, %reg0, %D0<imp-use,kill>
When reg1031 and S0 are coalesced, the copy (FCPYS) will be eliminated the the implicit-kill of D0 is lost. In this case it's possible to move the marker to the FLDS. But in many cases, this is not possible. Suppose
%reg1031<def> = FOO <cp#1>, %D0<imp-def>
...
%S0<def> = FCPYS %reg1031<kill>, 14, %reg0, %D0<imp-use,kill>
When FCPYS goes away, the definition of S0 is the "FOO" instruction. However, transferring the D0 implicit-kill to FOO doesn't work since it is the def of D0 itself. We need to fix this in another time by introducing a "kill" pseudo instruction to track liveness.
Disabling the assertion is not ideal, but machine verifier is doing that job now. It's important to know double-def is not a miscomputation since it means a register should be free but it's not tracked as free. It's a performance issue instead.
llvm-svn: 82677
The machine code verifier did not check for explicit operands correctly. It
used MachineInstr::getNumExplicitOperands, but that method may cheat and use
the declared count in the TargetInstrDesc.
Now we check the explicit operands one at a time in visitMachineOperand.
llvm-svn: 82652
default implementation. Update comment on the default version, which made it
sound like most targets override it. Currently only X86 and SystemZ override
this method.
llvm-svn: 82651
of the defs are processed.
Also fix a implicit_def propagation bug: a implicit_def of a physical register
should be applied to uses of the sub-registers.
llvm-svn: 82616
two different places for printing MachineMemOperands.
Drop the virtual from Value::dump and instead give Value a
protected virtual hook that can be overridden by subclasses
to implement custom printing. This lets printing be more
consistent, and simplifies printing of PseudoSourceValue
values.
llvm-svn: 82599