Daniel Dunbar.
- Reordered the fields in the ARMOperand Mem struct to make the struct smaller,
making bools into 1-bit fields and putting the MCExpr* fields adjacent to
each other (see the sketch after this list).
- Added doxygen comments to a number of places in ARMAsmParser.cpp.
- Changed the name of ARMAsmParser::ParseRegister() to MaybeParseRegister() and
added the bool ParseWriteBack parameter.
- Changed ARMAsmParser::ParseMemory() to call MaybeParseRegister().
- Added ARMAsmParser::ParseMemoryOffsetReg to factor out parsing the offset of a
memory operand, and used it for parsing both preindexed and postindexed
addressing forms in ARMAsmParser::ParseMemory().
- Changed the first argument to ParseShift() to a reference.
- Changed ParseShift() to check for Rrx first and return early to reduce nesting.
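A minimal sketch of the struct layout change from the first bullet, with
illustrative field names rather than the exact ARMOperand members:

    class MCExpr;
    struct Mem {
      const MCExpr *Offset;        // MCExpr* fields kept adjacent
      const MCExpr *ShiftAmount;
      unsigned OffsetRegNum;
      bool Writeback : 1;          // bools packed into 1-bit fields
      bool Negative  : 1;          // so they share a single word
    };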
llvm-svn: 85632
void f(int a1, int a2, int a3, int a4, int a5, ...)
In ARMTargetLowering::LowerFormalArguments, if the function has 4 or
more regular arguments, we used to set VarArgsFrameIndex using an
offset of 0, which is only correct if the function has exactly 4
regular arguments.
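A hedged sketch of the offset computation this implies (illustrative only;
the actual LowerFormalArguments logic differs in detail). With the first 4
arguments passed in r0-r3, any further named arguments already occupy stack
slots, so the varargs area must begin after them:

    // Hypothetical helper: offset of the varargs area from the
    // incoming SP, given the number of named (regular) arguments.
    unsigned varArgsOffset(unsigned NumRegularArgs) {
      if (NumRegularArgs <= 4)
        return 0;                      // all named args are in r0-r3
      return (NumRegularArgs - 4) * 4; // skip named args on the stack
    }

For the f() above, the area starts at offset 4, not 0, because a5 is
already on the stack.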
llvm-svn: 85590
bunch of associated comments, because it doesn't have anything to do
with DAGs or scheduling. This is another step in decoupling MachineInstr
emitting from scheduling.
llvm-svn: 85517
use it to control tail merging when there is a tradeoff between performance
and code size. When there is only 1 instruction in the common tail, we have
been merging. That can be good for code size but is a definite loss for
performance. Now we will avoid tail merging in that case when the
optimization level is "Aggressive", i.e., "-O3". Radar 7338114.
Since the IfConversion pass invokes BranchFolding, it too needs to know
the optimization level. Note that I removed the RegisterPass instantiation
for IfConversion because it required a default constructor. If someone
wants to keep that for some reason, we can add a default constructor with
a hard-wired optimization level.
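A sketch of the heuristic described above (illustrative names, not the
exact BranchFolding interface):

    // Require at least two instructions in the common tail at -O3;
    // merging a single instruction saves size but costs a branch.
    unsigned minCommonTailLength(bool AggressiveOpt) {
      return AggressiveOpt ? 2 : 1;
    }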
llvm-svn: 85346
can access the hi regs. Our prologue and epilogue code doesn't know how to
properly handle save/restore of the hi regs, so things go badly when we
allocate them.
llvm-svn: 84982
reasonable code like Codegen/ARM/2009-02-27-SpillerBug.ll, producing
identical output except for superior formatting of constant pool entries.
llvm-svn: 84582
Leave Inst{11-8}, which represents the starting byte index of the extracted
result in the concatenation of the operands, unspecified.
Patch by Johnny Chen.
llvm-svn: 84572
appropriate restore location for the spill as well as perform the actual
save and restore.
The Thumb1 target uses this to make sure R12 is not clobbered while a spilled
scavenger register is live there.
llvm-svn: 84554
lowering stuff. We can now compile hello world to:
_main:
stm ,
mov r7, sp
sub sp, sp, #4
mov r0, #0
str r0,
ldr r0,
bl _printf
ldr r0,
mov sp, r7
ldm ,
Almost looks like arm code :)
llvm-svn: 84542
stack slots and giving them different PseudoSourceValue's did not fix the
problem of post-alloc scheduling miscompiling llvm itself.
- Apply Dan's conservative workaround by assuming any non-fixed stack slots can
alias other memory locations. This means a load from spill slot #1 cannot
move above a store to spill slot #2 (see the sketch after this list).
- Enable post-alloc scheduling for x86 at optimization level Default and above.
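A sketch of the conservative aliasing rule in the first bullet, using
hypothetical types rather than the actual LLVM interface:

    // Only fixed stack slots have known, distinct locations; any other
    // pair of stack references is assumed to alias, keeping spill-slot
    // loads and stores ordered.
    struct StackSlotRef { bool IsFixed; int FrameIndex; };

    bool mayAlias(const StackSlotRef &A, const StackSlotRef &B) {
      if (A.IsFixed && B.IsFixed)
        return A.FrameIndex == B.FrameIndex; // distinct fixed slots: no alias
      return true;                           // otherwise stay conservative
    }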
llvm-svn: 84424
I can see with the original code was that I forgot that this runs after
type legalization and hence the result type will always be i32. (Custom
legalization of EXTRACT_VECTOR_ELT is only enabled for vector types with
8- and 16-bit elements.)
Regarding the FIXME comment: any information about sign and zero-extension
should be captured by separate extension operations. The DAG combiner should
handle those to produce either VGETLANEu or VGETLANEs, and that seems to be
working now. If there are cases that we're missing, let me know.
llvm-svn: 84218
In the case where there are no good places to put constants and we fall back
upon inserting unconditional branches to make new blocks, allow all constant
pool references in range of those blocks to put constants there, even if that
means resetting the "high water marks" for those references. This will still
terminate because you can't keep splitting blocks forever, and in the bad
cases where we have to split blocks, it is important to avoid splitting more
than necessary.
llvm-svn: 84202
as expressions, and code for parsing a few ARM-specific directives (still
needs the MCStreamer calls for these). Also some cleanup of the operand
parsing code and some added comments.
llvm-svn: 84201
When ARMConstantIslandPass cannot find any good locations (i.e., "water") to
place constants, it falls back to inserting unconditional branches to make a
place to put them. My recent change exposed a problem in this area. We may
sometimes append to the same block more than one unconditional branch. The
symptoms of this are that the generated assembly has a branch to an undefined
label and running llc with -debug will cause a seg fault.
This happens more easily since my change to prevent CPEs from moving from
lower to higher addresses as the algorithm iterates, but it could have
happened before. The end of the block may be in range for various constant
pool references, but the insertion point for new CPEs is not right at the end
of the block -- it is at the end of the CPEs that have already been placed
at the end of the block. The insertion point could be out of range. When
that happens, the fallback code will always append another unconditional
branch if the end of the block is in range.
The fix is to only append an unconditional branch if the block does not
already end with one. I also removed a check to see if the constant pool load
instruction is at the end of the block, since that is redundant with
checking if the end of the block is in-range.
There is more to be done here, but I think this fixes the immediate problem.
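A sketch of the fix, with hypothetical types standing in for
MachineBasicBlock and MachineInstr:

    #include <vector>
    struct Instr { bool IsUncondBranch; };

    // Append the fallback unconditional branch only when the block does
    // not already end with one, so iterating cannot stack up branches.
    bool shouldAppendBranch(const std::vector<Instr> &Block) {
      return Block.empty() || !Block.back().IsUncondBranch;
    }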
llvm-svn: 84172
Also fixed a couple of coding style things that crept in. And added more
to the temporarily hacked-up ARMAsmParser::MatchInstruction() method for testing.
llvm-svn: 84040
before its reference is only supported on ARM has not been true for a while.
In fact, until recently, that was only supported for Thumb. Besides that,
CPEs are always a multiple of 4 bytes in size, so inserting a CPE should have
no effect on Thumb alignment.
llvm-svn: 83916
MultiSource/Benchmarks/MiBench/automotive-susan test. The failure has
since been masked by an unrelated change (just randomly), so I don't have
a testcase for this now. Radar 7291928.
The situation where this happened is that a constant pool entry (CPE) was
placed at a lower address than the load that referenced it. There were in
fact 2 CPEs placed at adjacent addresses and referenced by 2 loads that were
close together in the code. The distance from the loads to the CPEs was
right at the limit of what they could handle, so that only one of the CPEs
could be placed within range. On every iteration, the first CPE was found
to be out of range, causing a new CPE to be inserted. The second CPE had
been in range but the newly inserted entry pushed it too far away. Thus the
second CPE was also replaced by a new entry, which in turn pushed the first
CPE out of range. Etc.
Judging from some comments in the code, the initial implementation of this
pass did not support CPEs placed _before_ their references. In the case
where the CPE is placed at a higher address, the key to making the algorithm
terminate is that new CPEs are only inserted at the end of a group of adjacent
CPEs. This is implemented by removing a basic block from the "WaterList"
once it has been used, and then adding the newly inserted CPE block to the
list so that the next insertion will come after it. This avoids the ping-pong
effect where CPEs are repeatedly moved to the beginning of a group of
adjacent CPEs. This does not work when going backwards, however, because the
entries at the end of an adjacent group of CPEs are closer than the CPEs
earlier in the group.
To make this pass terminate, we need to maintain a property that changes can
only happen in some sort of monotonic fashion. The fix used here is to require
that the CPE for a particular constant pool load can only move to lower
addresses. This is a very simple change to the code and should not cause
any significant degradation in the results.
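A sketch of the monotonicity requirement (illustrative helper, not the
in-tree code):

    // Accept a new placement for a constant pool entry only if it does
    // not move the entry to a higher address; offsets can then only
    // decrease, so the iteration must terminate.
    bool isCPEMoveAllowed(unsigned OldOffset, unsigned NewOffset) {
      return NewOffset <= OldOffset;
    }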
llvm-svn: 83902
lists. Changed ARMAsmParser::MatchRegisterName to return -1 instead of 0 on
errors so 0-15 values could be returned as register numbers. Also added the
rest of the ARM register names to the currently hacked-up version to allow more
testing. Some changes to ARMAsmParser::ParseOperand to give different errors
for things not yet supported and some additions to the hacked
ARMAsmParser::MatchInstruction to allow more testing for now.
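A hedged sketch of the MatchRegisterName change (the name table here is
simplified to just r0-r15):

    #include "llvm/ADT/StringRef.h"
    using llvm::StringRef;

    // Returns the register number, or -1 if the name is not a register;
    // 0 is now a valid result (r0) rather than the error value.
    int MatchRegisterName(StringRef Name) {
      unsigned Num;
      if (Name.size() >= 2 && Name[0] == 'r' &&
          !Name.substr(1).getAsInteger(10, Num) && Num <= 15)
        return static_cast<int>(Num);
      return -1;
    }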
llvm-svn: 83673
with writeback, things like "sp!", etc. Also added some more stuff to the
temporarily hacked methods ARMAsmParser::MatchRegisterName and
ARMAsmParser::MatchInstruction to allow more parser testing.
llvm-svn: 83477
a virtual register to eliminate a frame index, it can return that register
and the constant stored there to PEI to track. When scavenging to allocate
for those registers, PEI then tracks the last-used register and value, and
if it is still available and matches the value for the next index, reuses
the existing value rather than re-materializing it and removes the redundant
re-materialization instructions.
Fancier tracking and adjustment of scavenger allocations to keep more
values live for longer is possible, but not yet implemented and would likely
be better done via a different, less special-purpose, approach to the
problem.
eliminateFrameIndex() is modified so the target implementations can return
the registers they wish to be tracked for reuse.
ARM Thumb1 implements and utilizes the new mechanism. All other targets are
simply modified to adjust for the changed eliminateFrameIndex() prototype.
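A sketch of the reuse tracking described above, with hypothetical names
(the real change returns the register and constant through the modified
eliminateFrameIndex() prototype):

    // PEI remembers the last materialized (register, constant) pair; if
    // the next frame index needs the same constant and the register is
    // still available, the new re-materialization can be deleted.
    struct Tracked { unsigned Reg; long Value; bool Valid; };

    bool tryReuse(const Tracked &Last, long Needed, unsigned &RegOut) {
      if (Last.Valid && Last.Value == Needed) {
        RegOut = Last.Reg; // reuse the live value
        return true;       // caller removes the redundant instructions
      }
      return false;
    }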
llvm-svn: 83467
operands. Some parsing of ARM memory operands for preindexing and postindexing
forms, including those with register-controlled shifts. This is a work in
progress.
llvm-svn: 83424
verbose-asm mode, print comments instead. This eliminates a non-comment
difference between verbose-asm mode and non-verbose-asm mode.
Also, factor out the relevant code out of all the targets and into
target-independent code.
llvm-svn: 83392
spill slot. When frame references are via the frame pointer, they will be
negative, but Thumb1 load/store instructions only allow positive immediate
offsets. Instead, Thumb1 will spill to R12.
llvm-svn: 83336
the new predicates I added) instead of going through a context and doing a
pointer comparison. Besides being cheaper, this allows a smart compiler
to turn the if sequence into a switch.
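A toy illustration of why predicates help (hypothetical Type class, not
the real LLVM one): each predicate reduces to an integer comparison on a
type ID, so a chain of them can be lowered to a switch, unlike pointer
comparisons against singleton type objects fetched through a context.

    struct Type {
      enum TypeID { IntegerTyID, FloatTyID, DoubleTyID } ID;
      bool isFloatTy()  const { return ID == FloatTyID; }
      bool isDoubleTy() const { return ID == DoubleTyID; }
    };

    bool isFloatingPoint(const Type *T) {
      return T->isFloatTy() || T->isDoubleTy(); // a switch on ID
    }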
llvm-svn: 83297
to emit target-specific things at the beginning of the asm output. This
fixes a problem for PPC, where the text sections are not being kept together
as expected. The base class doInitialization code calls DW->BeginModule()
which emits a bunch of DWARF section directives. The PPC doInitialization
code then emits all the TEXT section directives, with the intention that they
will be kept together. But as I understand it, the Darwin assembler treats
the default TEXT section as a special case and moves it to the beginning of
the file, which means that all those DWARF sections are in the middle of
the text. With this change, the EmitStartOfAsmFile hook is called before
the DWARF section directives are emitted, so that all the PPC text section
directives come out right at the beginning of the file.
llvm-svn: 83176
section directives. This causes the assembler to put the text sections at
the beginning of the object file, which helps work around a limitation of the
Darwin ARM relocations. Radar 7255355.
llvm-svn: 83127
unused DECLARE instruction.
KILL is not yet used anywhere, it will replace TargetInstrInfo::IMPLICIT_DEF
in the places where IMPLICIT_DEF is just used to alter liveness of physical
registers.
llvm-svn: 83006
instruction. This makes it re-materializable.
Thumb2 will split it back out into two instructions so the IT pass will
generate the right mask. Also, this exposes opportunities to optimize the
movw to a 16-bit move.
llvm-svn: 82982
- Allocate MachineMemOperands and MachineMemOperand lists in MachineFunctions.
This eliminates MachineInstr's std::list member and allows the data to be
created by isel and live for the remainder of codegen, avoiding a lot of
copying and unnecessary translation. This also shrinks MemSDNode.
- Delete MemOperandSDNode. Introduce MachineSDNode which has dedicated
fields for MachineMemOperands.
- Change MemSDNode to have a MachineMemOperand member instead of its own
fields with the same information. This introduces some redundancy, but
it's more consistent with what MachineInstr will eventually want.
- Ignore alignment when searching for redundant loads for CSE, but remember
the greatest alignment.
Target-specific code which previously used MemOperandSDNodes with generic
SDNodes now use MemIntrinsicSDNodes, with opcodes in a designated range
so that the SelectionDAG framework knows that MachineMemOperand information
is available.
llvm-svn: 82794
naming scheme used in SelectionDAG, where there are multiple kinds
of "target" nodes, but "machine" nodes are nodes which represent
a MachineInstr.
llvm-svn: 82790
For the AAPCS ABI, SP must always be 4-byte aligned, and at any "public
interface" it must be 8-byte aligned. For the older ARM APCS ABI, the stack
alignment is just always 4 bytes. For X86, we currently align SP at
entry to a function (e.g., to 16 bytes for Darwin), but no stack alignment
is needed at other times, such as for a leaf function.
After discussing this with Dan, I decided to go with the approach of adding
a new "TransientStackAlignment" field to TargetFrameInfo. This value
specifies the stack alignment that must be maintained even in between calls.
It defaults to 1 except for ARM, where it is 4. (Some other targets may
also want to set this if they have similar stack requirements. It's not
currently required for PPC because it sets targetHandlesStackFrameRounding
and handles the alignment in target-specific code.) The existing StackAlignment
value specifies the alignment upon entry to a function, which is how we've
been using it anyway.
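A simplified sketch of the new field (the real TargetFrameInfo carries
more state):

    class TargetFrameInfo {
      unsigned StackAlignment;          // required at entry to a function
      unsigned TransientStackAlignment; // required even between calls
    public:
      TargetFrameInfo(unsigned StackAlign, unsigned TransientAlign = 1)
          : StackAlignment(StackAlign),
            TransientStackAlignment(TransientAlign) {}
      unsigned getStackAlignment() const { return StackAlignment; }
      unsigned getTransientStackAlignment() const {
        return TransientStackAlignment; // 4 for ARM, 1 by default
      }
    };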
llvm-svn: 82767
interest for this, as it currently reserves a register rather than using
the scavenger for materializing constants as needed.
Instead of scavenging registers on the fly while eliminating frame indices,
new virtual registers are created and then scavenged collectively in a
post-pass over the function. This isolates the bits that need to interact
with the scavenger, and sets the stage for more intelligent use, and reuse,
of scavenged registers.
For the time being, this is disabled by default. Once the bugs are worked out,
the current scavenging calls in replaceFrameIndices() will be removed and
the post-pass scavenging will be the default. Until then,
-enable-frame-index-scavenging enables the new code. Currently, only the
Thumb1 back end is set up to use it.
llvm-svn: 82734
VLDM/VSTM instructions, and without this check, the code assumes that an
offset is allowed, as it would be with VLDR/VSTR. The asm printer,
however, silently drops the offset, producing incorrect code. Since the
address register in this case is either the stack or frame pointer, the
spill location ends up conflicting with some other stack slot or with
outgoing arguments on the stack.
llvm-svn: 81879
parses the .word directive as 4 bytes, and ARMAsmParser::ParseInstruction will
give an error if it is called. Broke out the test of the .word directive into
two different test cases, one for x86 and one for ARM.
llvm-svn: 81817
the MCInst path of the asmprinter. Instead, pull comment printing
out of the autogenerated asmprinter into each target that uses the
autogenerated asmprinter. This causes code duplication into each
target, but in a way that will be easier to clean up later when more
asmprinter stuff is commonized into the base AsmPrinter class.
This also fixes an xcore strangeness where it inserted two tabs
before every instruction.
llvm-svn: 81396
asm printer into the "printInstruction" routine. This
fixes a problem where the experimental asmprinter would
drop debug labels in some cases, and fixes issues on ppc/xcore
where pseudo instructions like "mr" didn't get debug locs properly.
It is annoying that this moves the call from one place into each
target, but a future set of more invasive refactorings will fix
that problem.
llvm-svn: 81377
makes an egregious hack somewhat more palatable. Bringing the LSDA forward
and making it a GV available for reference would be even better, but is
beyond the scope of what I'm looking to solve at this point.
Objective-C++ code could generate function names that broke the previous
scheme. This fixes that.
llvm-svn: 80649
The instructions can be selected directly from the intrinsics. We will need
to add some ARM-specific nodes for VLD/VST of 3 and 4 128-bit vectors, but
those are not yet implemented.
llvm-svn: 80117
all Darwin targets; could be split into separate tests for
the chip subdirectories, but from Chris' last mail on testing
I assume he'd rather have only one test. Generic seems to be
the best available, maybe there should be a Darwin subdirectory?
llvm-svn: 79877
MachineInstr and MachineOperand. This required eliminating a
bunch of stuff that was using DOUT; I hope that Bill doesn't
mind me stealing his fun. ;-)
llvm-svn: 79813
several things other than Neon vector lane numbers. For inline assembly
operands with a "c" print code, check that they really are immediates.
llvm-svn: 79676
This is derived from a patch by Anton Korzh. I modified it to recognize
the VEXT shuffles during legalization and lower them to a target-specific
DAG node.
llvm-svn: 79428
- Drop the Candidates argument and fix all callers. Now that RegScavenger
tracks available registers accurately, there is no need to restrict the
search.
- Make sure that no aliases of the found register are in use. This was a potential bug.
llvm-svn: 79369
support unaligned mem access only for certain types. (Should it be size
instead?)
ARMv7 supports unaligned access for i16 and i32; some v6 variants support it
as well.
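A hedged sketch of a per-type hook along these lines (hypothetical helper,
not the exact TargetLowering signature):

    enum SimpleVT { i8, i16, i32, i64 };

    bool allowsUnalignedMemAccess(bool HasV7Ops, SimpleVT VT) {
      if (!HasV7Ops)
        return false;                // (some v6 variants also qualify)
      return VT == i16 || VT == i32; // v7 handles unaligned i16/i32
    }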
llvm-svn: 79127
libcall. Take advantage of this in the ARM backend to rectify broken
choice of CC when hard float is in effect. PIC16 may want to see if
it could be of use in MakePIC16Libcall, which works unchanged.
Patch by Sandeep!
llvm-svn: 79033
target-specific VDUPLANE nodes. This allows the subreg handling for the
quad-register version to be done easily with Pats in the .td file, instead
of with custom code in ARMISelDAGToDAG.cpp.
llvm-svn: 78993
x86_64-apple-darwin10.
--- Reverse-merging r78895 into '.':
U test/CodeGen/PowerPC/2008-12-12-EH.ll
U lib/Target/DarwinTargetAsmInfo.cpp
--- Reverse-merging r78892 into '.':
U include/llvm/Target/DarwinTargetAsmInfo.h
U lib/Target/X86/X86TargetAsmInfo.cpp
U lib/Target/X86/X86TargetAsmInfo.h
U lib/Target/ARM/ARMTargetAsmInfo.h
U lib/Target/ARM/ARMTargetMachine.cpp
U lib/Target/ARM/ARMTargetAsmInfo.cpp
U lib/Target/PowerPC/PPCTargetAsmInfo.cpp
U lib/Target/PowerPC/PPCTargetAsmInfo.h
U lib/Target/PowerPC/PPCTargetMachine.cpp
G lib/Target/DarwinTargetAsmInfo.cpp
llvm-svn: 78919
pair instead of from a virtual method on TargetMachine. This cuts the final
ties of TargetAsmInfo to TargetMachine, meaning that MC can now use
TargetAsmInfo.
llvm-svn: 78802
"inlineasmstart/end" strings so that the contents of the directive
are separate from the comment character. This lets elf targets
get #APP/#NOAPP for free even if they don't use "#" as the comment
character. This also allows hoisting the darwin stuff up to the
shared TAI class.
llvm-svn: 78737
version. This allows TAI implementations to specify the directive to use
based on the mode being codegen'd for.
The real fix for this is to remove JumpTableDirective, but I don't feel
like diving into the jumptable snarl just now.
llvm-svn: 78709
match a base-only address, i.e., [r], since Thumb2 requires an offset register
field. For those, use [r + imm12] where the immediate is zero.
Note the generated assembly code does not look any different after the patch.
But the bug would have broken the JIT (if there is Thumb2 support) and it can
break later passes which expect the address mode to be well-formed.
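A sketch of the fix with illustrative types: a base-only reference is
rewritten into the well-formed base-plus-immediate form with a zero offset.

    struct T2AddrMode { unsigned BaseReg; unsigned Imm12; };

    // [r] alone is not a complete Thumb2 addressing mode; use [r, #0]
    // so the JIT encoder and later passes see all the fields.
    T2AddrMode selectBaseOnly(unsigned BaseReg) {
      return T2AddrMode{BaseReg, 0};
    }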
llvm-svn: 78658
the overloaded vector types allowed floating-point or integer vector elements.
Most of these operations actually depend on the element type, so bitcasting
was not an option.
If you include the vpadd intrinsics that I updated earlier, this gets rid
of 20 intrinsics.
llvm-svn: 78646
and short. Well, it's kinda short. Definitely nasty and brutish.
The front-end generates the register/unregister calls into the SjLj runtime,
call-site indices and landing pad dispatch. The back end fills in the LSDA
with the call-site information provided by the front end. Catch blocks are
not yet implemented.
Built on Darwin and verified no llvm-core "make check" regressions.
llvm-svn: 78625
instead of syntactically as a string. This means that it keeps track of the
segment, section, flags, etc directly and asmprints them in the right format.
This also includes parsing and validation support for llvm-mc and
"attribute(section)", so we should now start getting errors about invalid
section attributes from the compiler instead of the assembler on darwin.
Still todo:
1) Uniquing of darwin mcsections
2) Move all the Darwin stuff out to MCSectionMachO.[cpp|h]
3) there are a few FIXMEs, for example what is the syntax to get the
S_GB_ZEROFILL segment type?
llvm-svn: 78547