can access the hi regs. Our prologue and epilogue code doesn't know how to
properly handle save/restore of the hi regs, so things go badly when we alloc
them.
llvm-svn: 84982
reasonable code like Codegen/ARM/2009-02-27-SpillerBug.ll, producing
identical output except for superior formatting of constant pool entries.
llvm-svn: 84582
Leave Inst{11-8}, which represents the starting byte index of the extracted
result in the concatenation of the operands and is left unspecified.
Patch by Johnny Chen.
llvm-svn: 84572
appropriate restore location for the spill as well as perform the actual
save and restore.
The Thumb1 target uses this to make sure R12 is not clobbered while a spilled
scavenger register is live there.
llvm-svn: 84554
lowering stuff. We can now compile hello world to:
_main:
stm ,
mov r7, sp
sub sp, sp, #4
mov r0, #0
str r0,
ldr r0,
bl _printf
ldr r0,
mov sp, r7
ldm ,
Almost looks like arm code :)
llvm-svn: 84542
stack slots and giving them different PseudoSourceValue's did not fix the
problem of post-alloc scheduling miscompiling llvm itself.
- Apply Dan's conservative workaround by assuming any non fixed stack slots can
alias other memory locations. This means a load from spill slot #1 cannot
move above a store of spill slot #2.
- Enable post-alloc scheduling for x86 at optimization leverl Default and above.
llvm-svn: 84424
I can see with the original code was that I forgot that this runs after
type legalization and hence the result type will always be i32. (Custom
legalization of EXTRACT_VECTOR_ELT is only enabled for vector types with
8- and 16-bit elements.)
Regarding the FIXME comment: any information about sign and zero-extension
should be captured by separate extension operations. The DAG combiner should
handle those to produce either VGETLANEu or VGETLANEs, and that seems to be
working now. If there are cases that we're missing, let me know.
llvm-svn: 84218
In the case where there are no good places to put constants and we fall back
upon inserting unconditional branches to make new blocks, allow all constant
pool references in range of those blocks to put constants there, even if that
means resetting the "high water marks" for those references. This will still
terminate because you can't keep splitting blocks forever, and in the bad
cases where we have to split blocks, it is important to avoid splitting more
than necessary.
llvm-svn: 84202
as expressions, code for parsing a few arm specific directives (still needs
the MCStreamer calls for these). Some clean up of the operand parsing code
and adding some comments.
llvm-svn: 84201
When ARMConstantIslandPass cannot find any good locations (i.e., "water") to
place constants, it falls back to inserting unconditional branches to make a
place to put them. My recent change exposed a problem in this area. We may
sometimes append to the same block more than one unconditional branch. The
symptoms of this are that the generated assembly has a branch to an undefined
label and running llc with -debug will cause a seg fault.
This happens more easily since my change to prevent CPEs from moving from
lower to higher addresses as the algorithm iterates, but it could have
happened before. The end of the block may be in range for various constant
pool references, but the insertion point for new CPEs is not right at the end
of the block -- it is at the end of the CPEs that have already been placed
at the end of the block. The insertion point could be out of range. When
that happens, the fallback code will always append another unconditional
branch if the end of the block is in range.
The fix is to only append an unconditional branch if the block does not
already end with one. I also removed a check to see if the constant pool load
instruction is at the end of the block, since that is redundant with
checking if the end of the block is in-range.
There is more to be done here, but I think this fixes the immediate problem.
llvm-svn: 84172
Also fixed a couple of coding style things that crept in. And added more
to the temporary hacked up ARMAsmParser::MatchInstruction() method for testing.
llvm-svn: 84040
before its reference is only supported on ARM has not been true for a while.
In fact, until recently, that was only supported for Thumb. Besides that,
CPEs are always a multiple of 4 bytes in size, so inserting a CPE should have
no effect on Thumb alignment.
llvm-svn: 83916
MultiSource/Benchmarks/MiBench/automotive-susan test. The failure has
since been masked by an unrelated change (just randomly), so I don't have
a testcase for this now. Radar 7291928.
The situation where this happened is that a constant pool entry (CPE) was
placed at a lower address than the load that referenced it. There were in
fact 2 CPEs placed at adjacent addresses and referenced by 2 loads that were
close together in the code. The distance from the loads to the CPEs was
right at the limit of what they could handle, so that only one of the CPEs
could be placed within range. On every iteration, the first CPE was found
to be out of range, causing a new CPE to be inserted. The second CPE had
been in range but the newly inserted entry pushed it too far away. Thus the
second CPE was also replaced by a new entry, which in turn pushed the first
CPE out of range. Etc.
Judging from some comments in the code, the initial implementation of this
pass did not support CPEs placed _before_ their references. In the case
where the CPE is placed at a higher address, the key to making the algorithm
terminate is that new CPEs are only inserted at the end of a group of adjacent
CPEs. This is implemented by removing a basic block from the "WaterList"
once it has been used, and then adding the newly inserted CPE block to the
list so that the next insertion will come after it. This avoids the ping-pong
effect where CPEs are repeatedly moved to the beginning of a group of
adjacent CPEs. This does not work when going backwards, however, because the
entries at the end of an adjacent group of CPEs are closer than the CPEs
earlier in the group.
To make this pass terminate, we need to maintain a property that changes can
only happen in some sort of monotonic fashion. The fix used here is to require
that the CPE for a particular constant pool load can only move to lower
addresses. This is a very simple change to the code and should not cause
any significant degradation in the results.
llvm-svn: 83902
lists. Changed ARMAsmParser::MatchRegisterName to return -1 instead of 0 on
errors so 0-15 values could be returned as register numbers. Also added the
rest of the arm register names to the currently hacked up version to allow more
testing. Some changes to ARMAsmParser::ParseOperand to give different errors
for things not yet supported and some additions to the hacked
ARMAsmParser::MatchInstruction to allow more testing for now.
llvm-svn: 83673
with writeback, things like "sp!", etc. Also added some more stuff to the
temporarily hacked methods ARMAsmParser::MatchRegisterName and
ARMAsmParser::MatchInstruction to allow more parser testing.
llvm-svn: 83477
a virtual register to eliminate a frame index, it can return that register
and the constant stored there to PEI to track. When scavenging to allocate
for those registers, PEI then tracks the last-used register and value, and
if it is still available and matches the value for the next index, reuses
the existing value rather and removes the re-materialization instructions.
Fancier tracking and adjustment of scavenger allocations to keep more
values live for longer is possible, but not yet implemented and would likely
be better done via a different, less special-purpose, approach to the
problem.
eliminateFrameIndex() is modified so the target implementations can return
the registers they wish to be tracked for reuse.
ARM Thumb1 implements and utilizes the new mechanism. All other targets are
simply modified to adjust for the changed eliminateFrameIndex() prototype.
llvm-svn: 83467
operands. Some parsing of arm memory operands for preindexing and postindexing
forms including with register controled shifts. This is a work in progress.
llvm-svn: 83424
verbose-asm mode, print comments instead. This eliminates a non-comment
difference between verbose-asm mode and non-verbose-asm mode.
Also, factor out the relevant code out of all the targets and into
target-independent code.
llvm-svn: 83392
spill slot. When frame references are via the frame pointer, they will be
negative, but Thumb1 load/store instructions only allow positive immediate
offsets. Instead, Thumb1 will spill to R12.
llvm-svn: 83336
the new predicates I added) instead of going through a context and doing a
pointer comparison. Besides being cheaper, this allows a smart compiler
to turn the if sequence into a switch.
llvm-svn: 83297
to emit target-specific things at the beginning of the asm output. This
fixes a problem for PPC, where the text sections are not being kept together
as expected. The base class doInitialization code calls DW->BeginModule()
which emits a bunch of DWARF section directives. The PPC doInitialization
code then emits all the TEXT section directives, with the intention that they
will be kept together. But as I understand it, the Darwin assembler treats
the default TEXT section as a special case and moves it to the beginning of
the file, which means that all those DWARF sections are in the middle of
the text. With this change, the EmitStartOfAsmFile hook is called before
the DWARF section directives are emitted, so that all the PPC text section
directives come out right at the beginning of the file.
llvm-svn: 83176
section directives. This causes the assembler to put the text sections at
the beginning of the object file, which helps work around a limitation of the
Darwin ARM relocations. Radar 7255355.
llvm-svn: 83127
unused DECLARE instruction.
KILL is not yet used anywhere, it will replace TargetInstrInfo::IMPLICIT_DEF
in the places where IMPLICIT_DEF is just used to alter liveness of physical
registers.
llvm-svn: 83006
instruction. This makes it re-materializable.
Thumb2 will split it back out into two instructions so IT pass will generate the
right mask. Also, this expose opportunies to optimize the movw to a 16-bit move.
llvm-svn: 82982
- Allocate MachineMemOperands and MachineMemOperand lists in MachineFunctions.
This eliminates MachineInstr's std::list member and allows the data to be
created by isel and live for the remainder of codegen, avoiding a lot of
copying and unnecessary translation. This also shrinks MemSDNode.
- Delete MemOperandSDNode. Introduce MachineSDNode which has dedicated
fields for MachineMemOperands.
- Change MemSDNode to have a MachineMemOperand member instead of its own
fields with the same information. This introduces some redundancy, but
it's more consistent with what MachineInstr will eventually want.
- Ignore alignment when searching for redundant loads for CSE, but remember
the greatest alignment.
Target-specific code which previously used MemOperandSDNodes with generic
SDNodes now use MemIntrinsicSDNodes, with opcodes in a designated range
so that the SelectionDAG framework knows that MachineMemOperand information
is available.
llvm-svn: 82794
naming scheme used in SelectionDAG, where there are multiple kinds
of "target" nodes, but "machine" nodes are nodes which represent
a MachineInstr.
llvm-svn: 82790
For the AAPCS ABI, SP must always be 4-byte aligned, and at any "public
interface" it must be 8-byte aligned. For the older ARM APCS ABI, the stack
alignment is just always 4 bytes. For X86, we currently align SP at
entry to a function (e.g., to 16 bytes for Darwin), but no stack alignment
is needed at other times, such as for a leaf function.
After discussing this with Dan, I decided to go with the approach of adding
a new "TransientStackAlignment" field to TargetFrameInfo. This value
specifies the stack alignment that must be maintained even in between calls.
It defaults to 1 except for ARM, where it is 4. (Some other targets may
also want to set this if they have similar stack requirements. It's not
currently required for PPC because it sets targetHandlesStackFrameRounding
and handles the alignment in target-specific code.) The existing StackAlignment
value specifies the alignment upon entry to a function, which is how we've
been using it anyway.
llvm-svn: 82767
interest for this, as it currently reserves a register rather than using
the scavenger for matierializing constants as needed.
Instead of scavenging registers on the fly while eliminating frame indices,
new virtual registers are created, and then a scavenged collectively in a
post-pass over the function. This isolates the bits that need to interact
with the scavenger, and sets the stage for more intelligent use, and reuse,
of scavenged registers.
For the time being, this is disabled by default. Once the bugs are worked out,
the current scavenging calls in replaceFrameIndices() will be removed and
the post-pass scavenging will be the default. Until then,
-enable-frame-index-scavenging enables the new code. Currently, only the
Thumb1 back end is set up to use it.
llvm-svn: 82734