1. Switch from an std::set to a SmallPtrSet for visited chain nodes.
2. Do not force the recursive flattening of token factor nodes, regardless of
use count.
3. Immediately process newly created TokenFactor nodes.
Also, improve combiner-aa by teaching it that loads to non-overlapping offsets
of relatively aligned objects cannot alias.
These changes result in a >5x speedup for combiner-aa on most testcases.
llvm-svn: 81816
The gist of this is if source of some of the copies that feed into a phi join is defined by the phi join, we'd like to eliminate them. However, if any of the non-identity source overlaps the live interval of the phi join then the coalescer won't be able to coalesce them. The early coalescer's job is to eliminate the identity copies by partially-coalescing the two live intervals.
llvm-svn: 81796
full AsmPrinter, and change TargetRegistry to keep track
of registered MCInstPrinters.
llvm-mc is still linking in the entire
target foo to get the code emitter stuff, but this is an
important step in the right direction.
llvm-svn: 81754
Move GetMBBSymbol up to AsmPrinter and make printBasicBlockLabel use it so that
we only have one place that decides what to name bb labels. Hopefully various
clients of printBasicBlockLabel can start using GetMBBSymbol instead.
llvm-svn: 81652
object, the timer it creates was not being deleted. Since the
timer belonged to a static timer group, the timer group would
be destroyed on shutdown, and would notice and complain that
not all timers it contained were destroyed.
llvm-svn: 81533
from the exception tables. However, Duncan explained why it's a can of worms to
do it the GCC way. I went back to doing it the LLVM way and added Duncan's
explanation so that I don't do this again in the future.
llvm-svn: 81434
like what GCC outputs. The mysterious code to insert padding wasn't in GCC at
all. I modified the TType base offset code to calculate the offset like GCC
does, though.
llvm-svn: 81424
code within it was the same inside and out. There's still a problem of the
TypeInfoSize should be the size of the TType format encoding (at least that's
what GCC thinks it should be).
llvm-svn: 81417
Basically, this patch is working towards removing the hard-coded values that are
output for the CIE. In particular, the CIE augmentation and the CIE augmentation
size. Both of these should be calculated. In the process, I was able to make a
bunch of code simpler.
The encodings for the personality, LSDA, and FDE in the CIE are still not
correct. They should be generated either from target-specific callbacks (blech!)
or grokked from first-principles.
llvm-svn: 81404
the MCInst path of the asmprinter. Instead, pull comment printing
out of the autogenerated asmprinter into each target that uses the
autogenerated asmprinter. This causes code duplication into each
target, but in a way that will be easier to clean up later when more
asmprinter stuff is commonized into the base AsmPrinter class.
This also fixes an xcore strangeness where it inserted two tabs
before every instruction.
llvm-svn: 81396
to instructions instead of zero extended ones. This makes the asmprinter
print signed values more consistently. This apparently only really affects
the X86 backend.
llvm-svn: 81265
instruction to insert before can be end(). getDebugLoc on
end() returns an invalid value, therefore use the debug
loc of the call instruction, and give it to InsertLabel.
llvm-svn: 81207
from floating-point to integer first, and bitcast the result
back to floating-point. Previously, this test was passing by
falling back to SelectionDAG lowering. The resulting code isn't
as nice, but it's correct and CodeGen now stays on the fast path.
llvm-svn: 81171
a new class, MachineInstrIndex, which hides arithmetic details from
most clients. This is a step towards allowing the register allocator
to update/insert code during allocation.
llvm-svn: 81040
for the complicated case where one register is tied to multiple destinations.
This avoids the extra scan of instruction operands that was introduced by
my recent change. I also pulled some code out into a separate
TryInstructionTransform method, added more comments, and renamed some
variables.
Besides all those changes, this takes care of a FIXME in the code regarding
an assumption about there being a single tied use of a register when
converting to a 3-address form. I'm not aware of cases where that assumption
is violated, but the code now only attempts to transform an instruction,
either by commuting its operands or by converting to a 3-address form,
for the simple case where there is a single pair of tied operands.
llvm-svn: 80945
avoid reloads by reusing clobbered registers.
This was causing issues in 256.bzip2 when compiled with PIC for
a while (starting at r78217), though the problem has since been masked.
llvm-svn: 80872
tied to different source registers, the TwoAddressInstructionPass needs to
be smarter. Change it to check before replacing a source register whether
that source register is tied to a different destination register, and if so,
defer handling it until a subsequent iteration.
llvm-svn: 80654
makes an eggregious hack somewhat more palatable. Bringing the LSDA forward
and making it a GV available for reference would be even better, but is
beyond the scope of what I'm looking to solve at this point.
Objective C++ code could generate function names that broke the previous
scheme. This fixes that.
llvm-svn: 80649
Shared landing pads run into trouble with SJLJ, as the dispatch table is
mapped to call sites, and merging the pads will throw that off. There needs
to be a one-to-one mapping of landing pad exception table entries to invoke
call points.
Detecting the shared pad during lowering of SJLJ info insn't sufficient, as
the dispatch function may still need separate destinations to properly
handle phi-nodes.
llvm-svn: 80530
encodings.
- Make some of the values emitted by the FDEs dependent upon the pointer
size. This is in line with how GCC does things. And it has the benefit of
working for Darwin in 64-bit mode now.
llvm-svn: 80428
A include/llvm/ADT/iterator.cmake
U autoconf/configure.ac
--- Reverse-merging r80161 into '.':
U cmake/config-ix.cmake
--- Reverse-merging r80171 into '.':
U Makefile
--- Reverse-merging r80173 into '.':
U configure
U include/llvm/Config/config.h.in
--- Reverse-merging r80180 into '.':
A include/llvm/ADT/iterator.h.in
Despite common miscomceptions, iterator.h is alive and well. It broke the build
bots for several hours. And yet no one bothered to look at them.
Gabor and Doug, please review your changes and make sure that they actually
build before resubmitting them.
llvm-svn: 80197
forcing them down into various .cpp files.
This change also:
1. Renames TimeValue::toString() and Path::toString() to ::str()
for similarity with the STL.
2. Removes all stream insertion support for sys::Path, forcing
clients to call .str().
3. Removes a use of Config/alloca.h from bugpoint, using smallvector
instead.
4. Weans llvm-db off <iostream>
sys::Path really needs to be gutted, but I don't have the desire to
do it at this point.
llvm-svn: 79869
When undoing a reuse in ReuseInfo::GetRegForReload, check if it was only a
sub-register being used. The MachineOperand::getSubReg() method is only valid
for virtual registers, so we have to recover the sub-register index manually.
llvm-svn: 79855
MachineInstr and MachineOperand. This required eliminating a
bunch of stuff that was using DOUT, I hope that bill doesn't
mind me stealing his fun. ;-)
llvm-svn: 79813
instead of as two bools. Use this to add a F_Append flag
which has the obvious behavior.
Other unrelated changes conflated into this patch:
1. REmove EH stuff from llvm-dis and llvm-as, the try blocks
are dead.
2. Simplify the filename inference code in llvm-as/llvm-dis,
because raw_fd_ostream does the right thing with '-'.
3. Switch machine verifier to use raw_ostream instead of ostream
(Which is the thing that needed append in the first place).
llvm-svn: 79807
be of (dynamically) constant values, so races on it are immaterial. We just need
to ensure that at least one write has completed before return the pointer into it.
With this change, parllc exhibits essentially no overhead on 403.gcc.
llvm-svn: 79708
- Drop the Candidates argument and fix all callers. Now that RegScavenger
tracks available registers accurately, there is no need to restict the
search.
- Make sure that no aliases of the found register are in use. This was a potential bug.
llvm-svn: 79369
remove RemoveDuplicateSuccessor, as it is no longer necessary, and because
it breaks assumptions made in
MachineBasicBlock::isOnlyReachableByFallthrough.
Convert test/CodeGen/X86/omit-label.ll to FileCheck and add a testcase
for PR4732.
test/CodeGen/Thumb2/thumb2-ifcvt2.ll sees a diff with this commit due to
it being bugpoint-reduced to the point where it doesn't matter what the
condition for the branch is.
Add some more interesting code to
test/CodeGen/X86/2009-08-06-branchfolder-crash.ll, which is the testcase
that originally motivated the RemoveDuplicateSuccessor code, to help
verify that the original problem isn't being re-broken.
llvm-svn: 79338
MCAsmStreamer. Based on this, eliminate the current section from AsmPrinter.
While I'm at it, clean up the last of the horrible "switch to null section" stuff
and add an assert. This change is in preparation for completely eliminating
asmprinter::switchtosection.
llvm-svn: 79324
more properly belong. This allows removing the front-end conditionalized
SJLJ code, and cleans up the generated IR considerably. All of the
infrastructure code (calling _Unwind_SjLj_Register/Unregister, etc) is
added by the SjLjEHPrepare pass.
llvm-svn: 79250
doing it directly. This requires const'izing a bunch of stuff that
took sections, but this seems like the right semantic thing to do:
emitting a label to a section shouldn't mutate the MCSection object
itself, for example.
llvm-svn: 79227
If two uses of a CopyFromReg want different regclasses, first try a common
sub-class, then fall back on the copy emitted in AddRegisterOperand. There is
no need for an assert here. The cross-class joiner usually cleans up nicely.
llvm-svn: 79193
It is legal for an inline asm operand to use an earlyclobber register if the
use operand is tied to the earlyclobber operand. The issue is discussed here:
http://gcc.gnu.org/ml/gcc/1999-04n/msg00431.html
We should perhaps let only the machine code verifier worry about these finer
details. EarlyClobber operands are not really interesting to the scavenger.
This fixes PR4528 for the third time.
llvm-svn: 79122
In a naked function, the flag is never set and getPristineRegs() returns an
empty list. That means naked functions are able to clobber callee saved
registers, but that is the whole point of naked functions.
This fixes PR4716.
llvm-svn: 79096
In the included test case, a stack load was not included in DistanceMap. That
caused TransferDeadness to ignore the instruction, leading to a scavenger
assert.
llvm-svn: 79090
libcall. Take advantage of this in the ARM backend to rectify broken
choice of CC when hard float is in effect. PIC16 may want to see if
it could be of use in MakePIC16Libcall, which works unchanged.
Patch by Sandeep!
llvm-svn: 79033
TargetAsmInfo. This eliminates a dependency on TargetMachine.h from
TargetRegistry.h, which technically was a layering violation.
- Clients probably can only sensibly pass in the same TargetAsmInfo as the
TargetMachine has, but there are only limited clients of this API.
llvm-svn: 78928
pair instead of from a virtual method on TargetMachine. This cuts the final
ties of TargetAsmInfo to TargetMachine, meaning that MC can now use
TargetAsmInfo.
llvm-svn: 78802
"inlineasmstart/end" strings so that the contents of the directive
are separate from the comment character. This lets elf targets
get #APP/#NOAPP for free even if they don't use "#" as the comment
character. This also allows hoisting the darwin stuff up to the
shared TAI class.
llvm-svn: 78737
The register scavenger maintains a DistanceMap that maps MI pointers to their
distance from the top of the current MBB. The DistanceMap is built
incrementally in forward() and in bulk in findFirstUse(). It is used by
scavengeRegister() to determine which candidate register has the longest
unused interval.
Unfortunately the DistanceMap contents can become outdated. The first time
scavengeRegister() is called, the DistanceMap is filled to cover the MBB. If
then instructions are inserted in the MBB (as they always are following
scavengeRegister()), the recorded distances are too short. This causes bad
behaviour in the included test case where a register use /after/ the current
position is ignored because findFirstUse() thinks is is /before/ the current
position. A "using an undefined register" assertion follows promptly.
The fix is to build a fresh DistanceMap at the top of scavengeRegister(), and
discard it after use. This means that DistanceMap is no longer needed as a
RegScavenger member variable, and forward() doesn't need to update it.
The fix then discloses issue number two in the same test case: The candidate
search in scavengeRegister() finds a CSR that has been saved in the prologue,
but is currently unused. It would be both inefficient and wrong to spill such
a register in the emergency spill slot. In the present case, the emergency
slot restore is placed immediately before the normal epilogue restore, leading
to a "Redefining a live register" assertion.
Fix number two: When scavengerRegister() stumbles upon an unused register that
is overwritten later in the MBB, return that register early. It is important
to verify that the register is defined later in the MBB, otherwise it might be
an unspilled CSR.
llvm-svn: 78650
and short. Well, it's kinda short. Definitely nasty and brutish.
The front-end generates the register/unregister calls into the SjLj runtime,
call-site indices and landing pad dispatch. The back end fills in the LSDA
with the call-site information provided by the front end. Catch blocks are
not yet implemented.
Built on Darwin and verified no llvm-core "make check" regressions.
llvm-svn: 78625
MERGE_VALUES nodes. Replacing the result values with the
operands in one MERGE_VALUES node may cause another
MERGE_VALUES node be CSE'd with the first one, and bring
its uses along, so that the first one isn't dead, as this
code expects. Fix this by iterating until the node is
really dead. This fixes PR4699.
llvm-svn: 78619
This definitely slows down asm output so put it under an -asm-exuberant
flag.
This information is useful when doing static analysis of performance
issues.
llvm-svn: 78567
2. Move section switch printing to MCSection virtual method which takes a
TAI. This eliminates textual formatting stuff from TLOF.
3. Eliminate SwitchToSectionDirective, getSectionFlagsAsString, and
TLOFELF::AtIsCommentChar.
llvm-svn: 78510
A TAI hook is appropriate in this case because this is just an
asm syntax issue, not a semantic difference. TLOF should model
the semantics of the section.
llvm-svn: 78498
Blackfin supports and/or/xor on i32 but not on i16. Teach
DAGCombiner::SimplifyBinOpWithSameOpcodeHands to not produce illegal nodes
after legalize ops.
llvm-svn: 78497
Handle large integers, x86_fp80, ConstantAggregateZero, and two more ConstantExpr:
GetElementPtr and IntToPtr
Set SHF_MERGE bit for mergeable strings
Avoid zero initialized strings to be classified as a bss symbol
Don't allow common symbols to be classified as STB_WEAK
Add a constant to be used as a global value offset in data relocations
llvm-svn: 78476
Also don't dereference old pointers after they have been deleted causing
random crashes when enabling the machine code verifier.
Ahem...
I have not included a test case for the crash. It hapened when enabling the
verifier on CodeGen/X86/2009-08-06-branchfolder-crash.ll.
The crash depends on an MBB being allocated at the same address as a
previously deleted MBB. I don't think that can be reproduced reliably.
llvm-svn: 78472
Now there is no special treatment of instructions that redefine part of a
super-register. Instead, the super-register is marked with <imp-use,kill> and
<imp-def>. For instance, from LowerSubregs on ARM:
subreg: CONVERTING: %Q1<def> = INSERT_SUBREG %Q1<undef>, %D1<kill>, 5
subreg: %D2<def> = FCPYD %D1<kill>, 14, %reg0, %Q1<imp-def>
subreg: CONVERTING: %Q1<def> = INSERT_SUBREG %Q1, %D0<kill>, 6
subreg: %D3<def> = FCPYD %D0<kill>, 14, %reg0, %Q1<imp-use,kill>, %Q1<imp-def>
llvm-svn: 78466
Verify that early clobber registers and their aliases are not used.
All changes to RegsAvailable are now done as a transaction so the order of
operands makes no difference.
The included test case is from PR4686. It has behaviour that was dependent on the order of operands.
llvm-svn: 78465
- start support for new PEI w/reg alloc, allow running RS from emit{Pro,Epi}logue() target hooks.
- fix minor issue with recursion detection.
llvm-svn: 78318
and high-bits values in ways that weren't correct for integer
types wider than 64 bits. This fixes a miscompile in
PPMacroExpansion.cpp in clang on x86-64.
llvm-svn: 78295
a dirty hack and isn't need anymore since the last x86 code emitter patch)
- Add a target-dependent modifier to addend calculation
- Use R_X86_64_32S relocation for X86::reloc_absolute_word_sext
- Use getELFSectionFlags whenever possible
- fix getTextSection to use TLOF and emit the right text section
- Handle global emission for static ctors, dtors and Type::PointerTyID
- Some minor fixes
llvm-svn: 78176
Instead of awkwardly encoding calling-convention information with ISD::CALL,
ISD::FORMAL_ARGUMENTS, ISD::RET, and ISD::ARG_FLAGS nodes, TargetLowering
provides three virtual functions for targets to override:
LowerFormalArguments, LowerCall, and LowerRet, which replace the custom
lowering done on the special nodes. They provide the same information, but
in a more immediately usable format.
This also reworks much of the target-independent tail call logic. The
decision of whether or not to perform a tail call is now cleanly split
between target-independent portions, and the target dependent portion
in IsEligibleForTailCallOptimization.
This also synchronizes all in-tree targets, to help enable future
refactoring and feature work.
llvm-svn: 78142
When LowerExtract eliminates an EXTRACT_SUBREG with a kill flag, it moves the
kill flag to the place where the sub-register is killed. This can accidentally
overlap with the use of a sibling sub-register, and we have trouble.
In the test case we have this code:
Live Ins: %R0 %R1 %R2
%R2L<def> = EXTRACT_SUBREG %R2<kill>, 1
%R2H<def> = LOAD16fi <fi#-1>, 0, Mem:LD(2,4) [FixedStack-1 + 0]
%R1L<def> = EXTRACT_SUBREG %R1<kill>, 1
%R0L<def> = EXTRACT_SUBREG %R0<kill>, 1
%R0H<def> = ADD16 %R2H<kill>, %R2L<kill>, %AZ<imp-def>, %AN<imp-def>, %AC0<imp-def>, %V<imp-def>, %VS<imp-def>
subreg: CONVERTING: %R2L<def> = EXTRACT_SUBREG %R2<kill>, 1
subreg: eliminated!
subreg: killed here: %R0H<def> = ADD16 %R2H, %R2L, %R2<imp-use,kill>, %AZ<imp-def>, %AN<imp-def>, %AC0<imp-def>, %V<imp-def>, %VS<imp-def>
The kill flag on %R2 is moved to the last instruction, and the live range overlaps with the definition of %R2H:
*** Bad machine code: Redefining a live physical register ***
- function: f
- basic block: 0x18358c0 (#0)
- instruction: %R2H<def> = LOAD16fi <fi#-1>, 0, Mem:LD(2,4) [FixedStack-1 + 0]
Register R2H was defined but already live.
The fix is to replace EXTRACT_SUBREG with IMPLICIT_DEF instead of eliminating
it completely:
subreg: CONVERTING: %R2L<def> = EXTRACT_SUBREG %R2<kill>, 1
subreg: replace by: %R2L<def> = IMPLICIT_DEF %R2<kill>
Note that these IMPLICIT_DEF instructions survive to the asm output. It is
necessary to fix the stack-color-with-reg test case because of that.
llvm-svn: 78093
Implicit operands no longer get a free pass: Imp-use requires a live register
and imp-def requires a dead register.
There is also no special rule allowing redefinition of a sub-register when the
super-register is live. The super register must have imp-kill+imp-def operands
instead.
llvm-svn: 78090
killed by another operand.
There is probably a better fix. Either 1) scavenger can look at other operands, or
2) livevariables can be smarter about kill markers. Patches welcome.
llvm-svn: 78072
TLI.computeMaskedBitsForTargetNode from ComputeMaskedBits, since
the former may call back into the latter. This fixes a major
compile time problem on a testcase that happnened to hit this
in a particularly bad way, PR4643.
llvm-svn: 78023
When LowerSubregsInstructionPass::LowerInsert eliminates an INSERT_SUBREG
instriction because it is an identity copy, make sure that the same registers
are alive before and after the elimination.
When the super-register is marked <undef> this requires inserting an
IMPLICIT_DEF instruction to make sure the super register is live.
Fix a related bug where a kill flag on the inserted sub-register was not transferred properly.
Finally, clear the undef flag in MachineInstr::addRegisterKilled. Undef implies dead and kill implies live, so they cant both be valid.
llvm-svn: 77989
Allow imp-def and imp-use of anything in the scavenger asserts, just like the machine code verifier.
Allow redefinition of a sub-register of a live register.
llvm-svn: 77904
support. This isn't immediately interesting, because Legalize
ends up lowering SELECT_CC if the target doesn't support it,
but this simplifies the process.
Also, if the SELECT_CC would be expanded in Legalize, it
can potentially end up with two copies of the condition
expression. By leaving it as SELECT+SETCC, the SELECT can be
expanded into two SELECTs that use a single SETCC.
The two comparisons are usually CSE'd, but depending on
when various expressions get legalized, the comparison
expression could involve calls to library functions, such
that the comparison expression may not be able to be CSE'd.
This will be needed by a future patch.
llvm-svn: 77896
getLSDASection() to be more specific. This makes it pretty obvious
that the ELF LSDA section is being specified wrong in PIC mode. We're
probably getting a lot of startup-time relocations to a readonly page,
which is expensive and bad.
Someone who cares about ELF C++ should investigate this.
llvm-svn: 77847