Commit Graph

13970 Commits

Author SHA1 Message Date
Hal Finkel 8a31138521 Fix DAGCombine to deal with ext-conversion of pre/post_inc loads.
The test case for this will come with the PPC indexed preinc loads commit.

llvm-svn: 158822
2012-06-20 15:42:48 +00:00
Aaron Ballman 421a5ba06d Fixing a compiler warning in MSVC 10.
llvm-svn: 158820
2012-06-20 14:44:44 +00:00
Chandler Carruth c60fbe6b58 Fix two rather subtle internal vs. external linker issues.
I'll admit I'm not entirely satisfied with this change, but it seemed
the cleanest option. Other suggestions quite welcome

The issue is that the traits specializations have static methods which
return the typedef'ed PHI_iterator type. In both the IR and MI layers
this is typedef'ed to a custom iterator class defined in an anonymous
namespace giving the types and the functions returning them internal
linkage. However, because the traits specialization is defined in the
'llvm' namespace (where it has to be, specialized template lives there),
and is in turn used in the templated implementation of the SSAUpdater.
This led to the linkage conflict that Clang now warns about.

The simplest solution to me was just to define the PHI_iterator as
a nested class inside the trait specialization. That way it still
doesn't get scoped widely, it can't be accidentally reused somewhere,
etc. This is a little gross just because nested class definitions are
a little gross, but the alternatives seem more ad-hoc.

llvm-svn: 158799
2012-06-20 08:39:30 +00:00
Andrew Trick ff2ed7b687 A new algorithm for computing LoopInfo. Temporarily disabled.
-stable-loops enables a new algorithm for generating the Loop
forest. It differs from the original algorithm in a few respects:

- Not determined by use-list order.
- Initially guarantees RPO order of block and subloops.
- Linear in the number of CFG edges.
- Nonrecursive.

I didn't want to change the LoopInfo API yet, so the block lists are
still inclusive. This seems strange to me, and it means that building
LoopInfo is not strictly linear, but it may not be a problem in
practice. At least the block lists start out in RPO order now. In the
future we may add an attribute or wrapper analysis that allows other
passes to assume RPO order.

The primary motivation of this work was not to optimize LoopInfo, but
to allow reproducing performance issues by decomposing the compilation
stages. I'm often unable to do this with the current LoopInfo, because
the loop tree order determines Loop pass order. Serializing the IR
tends to invert the order, which reverses the optimization order. This
makes it nearly impossible to debug interdependent loop optimizations
such as LSR.

I also believe this will provide more stable performance results across time.

llvm-svn: 158790
2012-06-20 05:23:33 +00:00
Andrew Trick cda51d430d Move the implementation of LoopInfo into LoopInfoImpl.h.
The implementation only needs inclusion from LoopInfo.cpp and
MachineLoopInfo.cpp. Clients of the interface should only include the
interface. This makes the interface readable and speeds up rebuilds
after modifying the implementation.

llvm-svn: 158787
2012-06-20 03:42:09 +00:00
Jakob Stoklund Olesen 3802bbf35e Add regunit liveness support to LiveIntervals::handleMove().
When LiveIntervals is tracking fixed interference in regunits, make sure
to update those intervals as well. Currently guarded by -live-regunits.

llvm-svn: 158766
2012-06-19 23:50:18 +00:00
Chad Rosier 651f9a485a Tidy up.
llvm-svn: 158762
2012-06-19 23:37:57 +00:00
Chad Rosier 7369692790 Add an ensureMaxAlignment() function to MachineFrameInfo (analogous to
ensureAlignment() in MachineFunction).  Also, drop setMaxAlignment() in
favor of this new function.  This creates a main entry point to setting
MaxAlignment, which will be helpful for future work.  No functionality
change intended.

llvm-svn: 158758
2012-06-19 22:59:12 +00:00
Lang Hames 39fb1d08dc Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen 2db1125b15 80 col.
llvm-svn: 158755
2012-06-19 22:50:53 +00:00
Jakob Stoklund Olesen 0f855e4263 Implement PPCInstrInfo::isCoalescableExtInstr().
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.

This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.

This is related to PR5997.

llvm-svn: 158743
2012-06-19 21:14:34 +00:00
Jakob Stoklund Olesen 8eb9905a7c Style: Don't reuse variables for multiple purposes.
No functional change.

llvm-svn: 158742
2012-06-19 21:10:18 +00:00
Rafael Espindola ca3e0ee8b3 Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00
Hal Finkel 8eac009633 Allow up to 64 functional units per processor itinerary.
This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 158679
2012-06-18 21:08:18 +00:00
Benjamin Kramer b9f84bb0ce Guard private fields that are unused in Release builds with #ifndef NDEBUG.
llvm-svn: 158608
2012-06-16 21:48:13 +00:00
Jakob Stoklund Olesen 38a6fbf933 Remove final verification in RABasic.
We now have a proper machine code verifier pass between register
allocation and rewriting.

llvm-svn: 158577
2012-06-15 23:48:48 +00:00
Jakob Stoklund Olesen 45c1f9976c Print out register number in InlineSpiller.
llvm-svn: 158575
2012-06-15 23:47:09 +00:00
Jakob Stoklund Olesen 13dffcb766 Accept null PhysReg arguments to checkRegMaskInterference.
Calling checkRegMaskInterference(VirtReg) checks if VirtReg crosses any
regmask operands, regardless of the registers they clobber.

llvm-svn: 158563
2012-06-15 22:24:22 +00:00
Bill Wendling 4fd966347a Remove assignments which aren't used afterwards.
llvm-svn: 158535
2012-06-15 19:30:42 +00:00
Jakob Stoklund Olesen 5767ad727c Use regunit liveness in RegisterCoalescer when it is available.
We only do very limited physreg coalescing now, but we still merge
virtual registers into reserved registers.

llvm-svn: 158526
2012-06-15 17:36:48 +00:00
Akira Hatanaka 1b420ac4c8 Make machine verifier check the first instruction of the last bundle instead of
the last instruction of a basic block.

llvm-svn: 158468
2012-06-14 20:51:13 +00:00
Lang Hames a33db65bd9 Make comment slightly more helpful.
llvm-svn: 158467
2012-06-14 20:37:15 +00:00
Andrew Trick 45877fa011 misched: disable SSA check pending PR13112.
llvm-svn: 158461
2012-06-14 17:48:49 +00:00
Andrew Trick 344fb64fa3 sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi".

llvm-svn: 158380
2012-06-13 02:39:03 +00:00
Andrew Trick 5b90645abb sched: Avoid trivially redundant DAG edges. Take the one with higher latency.
llvm-svn: 158379
2012-06-13 02:39:00 +00:00
Andrew Trick 3e465fb225 misched: When querying RegisterPressureTracker, always save current and max pressure.
llvm-svn: 158340
2012-06-11 23:42:23 +00:00
Andrew Trick d054bd833a misched: regpressure getMaxPressureDelta, revert accidental checkin.
llvm-svn: 158339
2012-06-11 23:42:20 +00:00
Benjamin Kramer 0748008df5 Allocate the contents of DwarfDebug's StringMaps in a single big BumpPtrAllocator.
llvm-svn: 158265
2012-06-09 10:34:15 +00:00
Andrew Trick fc8ce08be3 Register pressure: added getPressureAfterInstr.
llvm-svn: 158256
2012-06-09 02:16:58 +00:00
Jakob Stoklund Olesen c26fbbfba5 Sketch a LiveRegMatrix analysis pass.
The LiveRegMatrix represents the live range of assigned virtual
registers in a Live interval union per register unit. This is not
fundamentally different from the interference tracking in RegAllocBase
that both RABasic and RAGreedy use.

The important differences are:

- LiveRegMatrix tracks interference per register unit instead of per
  physical register. This makes interference checks cheaper and
  assignments slightly more expensive. For example, the ARM D7 reigster
  has 24 aliases, so we would check 24 physregs before assigning to one.
  With unit-based interference, we check 2 units before assigning to 2
  units.

- LiveRegMatrix caches regmask interference checks. That is currently
  duplicated functionality in RABasic and RAGreedy.

- LiveRegMatrix is a pass which makes it possible to insert
  target-dependent passes between register allocation and rewriting.
  Such passes could tweak the register assignments with interference
  checking support from LiveRegMatrix.

Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix.

llvm-svn: 158255
2012-06-09 02:13:10 +00:00
Jakob Stoklund Olesen be336295cd Also compute MBB live-in lists in the new rewriter pass.
This deduplicates some code from the optimizing register allocators, and
it means that it is now possible to change the register allocators'
solutions simply by editing the VirtRegMap between the register
allocator pass and the rewriter.

llvm-svn: 158249
2012-06-09 00:14:47 +00:00
Jakob Stoklund Olesen 1224312f5b Reintroduce VirtRegRewriter.
OK, not really. We don't want to reintroduce the old rewriter hacks.

This patch extracts virtual register rewriting as a separate pass that
runs after the register allocator. This is possible now that
CodeGen/Passes.cpp can configure the full optimizing register allocator
pipeline.

The rewriter pass uses register assignments in VirtRegMap to rewrite
virtual registers to physical registers, and it inserts kill flags based
on live intervals.

These finalization steps are the same for the optimizing register
allocators: RABasic, RAGreedy, and PBQP.

llvm-svn: 158244
2012-06-08 23:44:45 +00:00
Evan Cheng c5adccab1a Start implementing pre-ra if-converter: using speculation and selects to eliminate branches.
llvm-svn: 158234
2012-06-08 21:53:50 +00:00
Andrew Trick 423fa6faee TargetInstrInfo hooks implemented in codegen should be declared pure virtual.
llvm-svn: 158233
2012-06-08 21:52:38 +00:00
Andrew Trick 596af1b02e Fix Target->Codegen dependence.
Bulk move of TargetInstrInfo implementation into
TargetInstrInfoImpl. This is dirty because the code isn't part of
TargetInstrInfoImpl class, nor should it be, because the methods are
not target hooks. However, it's the current mechanism for keeping
libTarget useful outside the backend. You'll get a not-so-nice link
error if you invoke a TargetInstrInfo method that depends on CodeGen.

The TargetInstrInfoImpl class should probably be removed since it
doesn't really solve this problem.

To really fix this, we probably need separate interfaces for the
CodeGen/nonCodeGen sides of TargetInstrInfo.

llvm-svn: 158212
2012-06-08 17:23:27 +00:00
Pete Cooper cd72016cab Move terminator machine verification to check MachineBasicBlock::instr_iterator instead of MBB::iterator
llvm-svn: 158154
2012-06-07 17:41:39 +00:00
Manman Ren 9c9641812c Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Jakob Stoklund Olesen 00e7dffefb Properly verify liveness with bundled machine instructions.
Bundles should be treated as one atomic transaction when checking
liveness. That is how the register allocator (and VLIW targets) treats
bundles.

llvm-svn: 158116
2012-06-06 22:34:30 +00:00
Andrew Trick 05ff4667eb Move RegisterClassInfo.h.
Allow targets to access this API. It's required for RegisterPressure.

llvm-svn: 158102
2012-06-06 20:29:31 +00:00
Andrew Trick 88517f608c Move RegisterPressure.h.
Make it a general utility for use by Targets.

llvm-svn: 158097
2012-06-06 19:47:35 +00:00
Benjamin Kramer 009b1c1cf1 Round 2 of dead private variable removal.
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields.
- gtest.

llvm-svn: 158096
2012-06-06 19:47:08 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Jakob Stoklund Olesen f435b1867d Remove dead debug option -disable-rematerialization.
Remat has been stable for years, and it isn't done by
LiveIntervalAnalysis any longer. (See LiveRangeEdit).

llvm-svn: 158079
2012-06-06 16:22:41 +00:00
Benjamin Kramer 3de5d40f4d Stop leaking RegScavengers from TailDuplication.
llvm-svn: 158069
2012-06-06 13:53:41 +00:00
Jakob Stoklund Olesen c141ba584e Move LiveUnionArray into LiveIntervalUnion.h
It is useful outside RegAllocBase.

llvm-svn: 158041
2012-06-05 23:57:30 +00:00
Jakob Stoklund Olesen 46d229c573 Don't print register names in LiveIntervalUnion::print().
Soon we'll be making LiveIntervalUnions for register units as well.

This was the only place using the RepReg member, so just remove it.

llvm-svn: 158038
2012-06-05 23:07:19 +00:00
Matt Beaumont-Gay 7ba769bedd Suppress -Wunused-variable in -Asserts build
llvm-svn: 158037
2012-06-05 23:00:03 +00:00
Jakob Stoklund Olesen f3f7d6f6e2 Simplify LiveInterval::print().
Don't print out the register number and spill weight, making the TRI
argument unnecessary.

This allows callers to interpret the reg field. It can currently be a
virtual register, a physical register, a spill slot, or a register unit.

llvm-svn: 158031
2012-06-05 22:51:54 +00:00
Jakob Stoklund Olesen 12e03dae44 Add experimental support for register unit liveness.
Instead of computing a live interval per physreg, LiveIntervals can
compute live intervals per register unit. This makes impossible the
confusing situation where aliasing registers could have overlapping live
intervals. It should also make fixed interferernce checking cheaper
since registers have fewer register units than aliases.

Live intervals for regunits are computed on demand, using MRI use-def
chains and the new LiveRangeCalc class. Only regunits live in to ABI
blocks are precomputed during LiveIntervals::runOnMachineFunction().

The regunit liveness computations don't depend on LiveVariables.

llvm-svn: 158029
2012-06-05 22:02:15 +00:00
Jakob Stoklund Olesen 989b3b1516 Implement LiveRangeCalc::extendToUses() and createDeadDefs().
These LiveRangeCalc methods are to be used when computing a live range
from scratch.

llvm-svn: 158027
2012-06-05 21:54:09 +00:00
Andrew Trick 4b037005d2 MachineInstr::eraseFromParent fix for removing bundled instrs.
Patch by Ivan Llopard.

llvm-svn: 158025
2012-06-05 21:44:23 +00:00
Andrew Trick 4544606c71 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

llvm-svn: 158021
2012-06-05 21:11:27 +00:00
Lang Hames a59100cc08 Add a new intrinsic: llvm.fmuladd. This intrinsic represents a multiply-add
expression (a * b + c) that can be implemented as a fused multiply-add (fma)
if the target determines that this will be more efficient. This intrinsic
will be used to implement FP_CONTRACT support and an aggressive FMA formation
mode.

If your target has a fast FMA instruction you should override the
isFMAFasterThanMulAndAdd method in TargetLowering to return true.

llvm-svn: 158014
2012-06-05 19:07:46 +00:00
Andrew Trick 73d7736b17 misched: Added MultiIssueItineraries.
This allows a subtarget to explicitly specify the issue width and
other properties without providing pipeline stage details for every
instruction.

llvm-svn: 157979
2012-06-05 03:44:40 +00:00
Andrew Trick a88d46e818 sdsched: Use the right heuristics when -mcpu is not provided and we have no itinerary.
Use ILP heuristics for long latency instrs if no scoreboard exists.

llvm-svn: 157978
2012-06-05 03:44:34 +00:00
Andrew Trick ed7c96d7d9 misched: Allow disabling scoreboard hazard checking for subtargets with a
valid itinerary but no pipeline stages.

An itinerary can contain useful scheduling information without specifying pipeline stages for each instruction.

llvm-svn: 157977
2012-06-05 03:44:32 +00:00
Andrew Trick d36adece50 misched: comments from code review.
llvm-svn: 157975
2012-06-05 03:44:26 +00:00
Jakob Stoklund Olesen 345528944c Remove the last remat-related code from LiveIntervalAnalysis.
Rematerialization is handled by LiveRangeEdit now.

llvm-svn: 157974
2012-06-05 01:06:15 +00:00
Jakob Stoklund Olesen 9e27e2621a Stop using LiveIntervals::isReMaterializable().
It is an old function that does a lot more than required by
CalcSpillWeights, which was the only remaining caller.

The isRematerializable() function never actually sets the isLoad
argument, so don't try to compute that.

llvm-svn: 157973
2012-06-05 01:06:12 +00:00
Jakob Stoklund Olesen 188d830405 Delete dead code.
llvm-svn: 157963
2012-06-04 23:01:41 +00:00
Jakob Stoklund Olesen 11fb248aa6 Switch LiveIntervals member variable to LLVM naming standards.
No functional change.

llvm-svn: 157957
2012-06-04 22:39:14 +00:00
Jakob Stoklund Olesen 5ef0e0b262 Pass context pointers to LiveRangeCalc::reset().
Remove the same pointers from all the other LiveRangeCalc functions,
simplifying the interface.

llvm-svn: 157941
2012-06-04 18:21:16 +00:00
Nadav Rotem b7bb72e4f3 Remove the "-promote-elements" flag. This flag is now enabled by default.
llvm-svn: 157925
2012-06-04 11:27:21 +00:00
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy 0e46d8a08c PR1255: case ranges.
IntRange converted from struct to class. So main change everywhere is replacement of ".Low/High" with ".getLow/getHigh()"

llvm-svn: 157884
2012-06-02 09:42:43 +00:00
Stepan Dyatkovskiy 9549f5894b PR1255: case ranges.
IntegersSubsetGeneric, IntegersSubsetMapping: added IntTy template parameter, that allows use either APInt or IntItem. This change allows to write unittest for these classes.

llvm-svn: 157880
2012-06-02 07:26:00 +00:00
Akira Hatanaka 6f3b2a670f Fix a bug in the code which custom-lowers truncating stores in LegalizeDAG.
Check that the SDValue TargetLowering::LowerOperation returns is not null
before replacing the original node with the returned node.

llvm-svn: 157873
2012-06-02 01:10:34 +00:00
Jakob Stoklund Olesen 54038d796c Switch all register list clients to the new MC*Iterator interface.
No functional change intended.

Sorry for the churn. The iterator classes are supposed to help avoid
giant commits like this one in the future. The TableGen-produced
register lists are getting quite large, and it may be necessary to
change the table representation.

This makes it possible to do so without changing all clients (again).

llvm-svn: 157854
2012-06-01 23:28:30 +00:00
Jakob Stoklund Olesen ca487d2183 Remove physreg support from adjustCopiesBackFrom and removeCopyByCommutingDef.
After physreg coalescing was disabled, these functions can't do anything
useful with physregs anyway.

llvm-svn: 157849
2012-06-01 22:38:19 +00:00
Jakob Stoklund Olesen 9b09cf0c11 Simplify some more getAliasSet callers.
MCRegAliasIterator can include Reg itself in the list.

llvm-svn: 157848
2012-06-01 22:38:17 +00:00
Jakob Stoklund Olesen 92a0083944 Switch some getAliasSet clients to MCRegAliasIterator.
MCRegAliasIterator can optionally visit the register itself, allowing
for simpler code.

llvm-svn: 157837
2012-06-01 20:36:54 +00:00
Manman Ren e873552091 ARM: properly handle alignment for struct byval.
Factor out the expansion code into a function.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 157830
2012-06-01 19:33:18 +00:00
Stepan Dyatkovskiy 66305749f1 PR1255: case ranges.
IntegersSubset devided into IntegersSubsetGeneric and into IntegersSubset itself. The first has no references to ConstantInt and works with IntItem only.
IntegersSubsetMapping also made generic. Here added second template parameter "IntegersSubsetTy" that allows to use on of two IntegersSubset types described below.

llvm-svn: 157815
2012-06-01 16:17:57 +00:00
Chris Lattner cc84e6d2b5 quick fix for PR13006, will check in testcase later.
llvm-svn: 157813
2012-06-01 15:02:52 +00:00
Chris Lattner 466076b95f enhance the logic for looking through tailcalls to look through transparent casts
in multiple-return value scenarios, like what happens on X86-64 when returning
small structs.

llvm-svn: 157800
2012-06-01 05:29:15 +00:00
Chris Lattner 182fe3eef1 enhance getNoopInput to know about vector<->vector bitcasts of legal
types, as well as int<->ptr casts.  This allows us to tailcall functions
with some trivial casts between the call and return (i.e. because the
return types disagree).

llvm-svn: 157798
2012-06-01 05:16:33 +00:00
Chris Lattner 4f3615de97 rearrange some logic, no functionality change.
llvm-svn: 157796
2012-06-01 05:01:15 +00:00
Eric Christopher 1cf3338bb4 Add support for enum forward declarations.
Part of rdar://11570854

llvm-svn: 157786
2012-06-01 00:22:32 +00:00
Manman Ren 9bccb64e56 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
2012-05-31 17:20:29 +00:00
Jakob Stoklund Olesen 05e2245fc6 Prioritize smaller register classes for urgent evictions.
It helps compile exotic inline asm. In the test case, normal GR32
virtual registers use up eax-edx so the final GR32_ABCD live range has
no registers left. Since all the live ranges were tiny, we had no way of
prioritizing the smaller register class.

This patch allows tiny unspillable live ranges to be evicted by tiny
unspillable live ranges from a smaller register class.

<rdar://problem/11542429>

llvm-svn: 157715
2012-05-30 21:46:58 +00:00
Owen Anderson 0eda3e1de6 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
llvm-svn: 157708
2012-05-30 18:54:50 +00:00
Owen Anderson c7aaf523e1 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
llvm-svn: 157707
2012-05-30 18:50:39 +00:00
Chad Rosier fba46a64aa Remove extra space.
llvm-svn: 157706
2012-05-30 18:47:55 +00:00
Jakob Stoklund Olesen 3a48c06456 Remove some redundant tests.
An empty list is not represented as a null pointer. Let TRI do its own
shortcuts.

llvm-svn: 157702
2012-05-30 18:38:56 +00:00
Evan Cheng bc2453dd3d Teach taildup to update livein set. rdar://11538365
llvm-svn: 157663
2012-05-30 00:42:39 +00:00
Evan Cheng 50954fb3e1 If-converter models predicated defs as read + write. The read should be marked as 'undef' since it may not already be live. This appeases -verify-machineinstrs.
llvm-svn: 157662
2012-05-30 00:42:02 +00:00
Bob Wilson 33e5188c27 Add an insertPass API to TargetPassConfig. <rdar://problem/11498613>
Besides adding the new insertPass function, this patch uses it to
enhance the existing -print-machineinstrs so that the MachineInstrs
after a specific pass can be printed.

Patch by Bin Zeng!

llvm-svn: 157655
2012-05-30 00:17:12 +00:00
Evan Cheng 76f6e2671a Optional def can be either a def or a use (of reg0).
llvm-svn: 157640
2012-05-29 19:40:44 +00:00
Lang Hames e256f71937 Clear the entering, exiting and internal ranges of a bundle before collecting
ranges for the instruction about to be bundled. This fixes a bug in an external
project where an assertion was triggered due to spurious 'multiple defs' within
the bundle.

Patch by Ivan Llopard. Thanks Ivan!

llvm-svn: 157632
2012-05-29 18:19:54 +00:00
Stepan Dyatkovskiy 58107dd547 ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping.
llvm-svn: 157612
2012-05-29 12:26:47 +00:00
Peter Collingbourne 913869be45 Add llvm.fabs intrinsic.
llvm-svn: 157594
2012-05-28 21:48:37 +00:00
Stepan Dyatkovskiy e3e19cbb13 PR1255: Case Ranges
Implemented IntItem - the wrapper around APInt. Why not to use APInt item directly right now?
1. It will very difficult to implement case ranges as series of small patches. We got several large and heavy patches. Each patch will about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst you will need to changes at the same time all Readers,Writers and absolutely all passes that uses SwitchInst.
2. We can implement APInt pool inside and save memory space. E.g. we use several switches that works with 256 bit items (switch on signatures, or strings). We can avoid value duplicates in this case.
3. IntItem can be easyly easily replaced with APInt.
4. Currenly we can interpret IntItem both as ConstantInt and as APInt. It allows to provide SwitchInst methods that works with ConstantInt for non-updated passes.

Why I need it right now? Currently I need to update SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so peaces of code
ConstantInt *V = ...;
if (V->getValue().ugt(AnotherV->getValue()) {
  ...
}
will look awful. Much more better this way:
IntItem V = ConstantIntVal->getValue();
if (AnotherV < V) {
}

Of course any reviews are welcome.

P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (allows to map individual subsets of integers to the BasicBlocks).
Since in future these classes will founded on APInt, it will possible to use them in more generic ways.

llvm-svn: 157576
2012-05-28 12:39:09 +00:00
Peter Collingbourne 4d358b55fa Have getOrCreateSubprogramDIE store the DIE for a subprogram
definition in the map before calling itself to retrieve the
DIE for the declaration.  Without this change, if this causes
getOrCreateSubprogramDIE to be recursively called on the definition,
it will create multiple DIEs for that definition.  Fixes PR12831.

llvm-svn: 157541
2012-05-27 18:36:44 +00:00
Benjamin Kramer abb3fa69b4 Missed parens.
llvm-svn: 157527
2012-05-27 10:56:55 +00:00
Benjamin Kramer 4b8f8e75e6 r157525 didn't work, just disable iterator checking.
This is obviosly right but I don't see how to do this with proper vector
iterators without building a horrible mess of workarounds.

llvm-svn: 157526
2012-05-27 10:24:52 +00:00
Benjamin Kramer 48ff2751c1 SDAGBuilder: Avoid iterator invalidation harder.
vector.begin()-1 is invalid too.

llvm-svn: 157525
2012-05-27 09:44:52 +00:00
Benjamin Kramer 5aad872f8c SDAGBuilder: Don't create an invalid iterator when there is only one switch case.
Found by libstdc++'s debug mode.

llvm-svn: 157522
2012-05-26 21:19:12 +00:00
Benjamin Kramer f2beccf6b4 SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights.
SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move
the most likely condition to the front so it is checked first and the others can
be skipped. This is currently not as effective as it could be because SimplifyCFG
destroys profiling metadata when merging branches and switches. Merging branch
weight metadata is tricky though.

This code touches at most 3 cases so I didn't use a proper sorting algorithm.

llvm-svn: 157521
2012-05-26 20:01:32 +00:00
Benjamin Kramer 484f4247aa ScoreboardHazardRecognizer: Remove dead conditional in debug code.
Negative cycles are filtered out earlier.

llvm-svn: 157514
2012-05-26 11:37:37 +00:00
Justin Holewinski aa58397b3c Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

llvm-svn: 157479
2012-05-25 16:35:28 +00:00
Andrew Trick 4e7f6a7702 misched: trace formatting
llvm-svn: 157455
2012-05-25 02:02:39 +00:00
Eli Friedman 315a0c79f3 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
Kaelyn Uhrain 85d8f0cba8 Silence unused variable warnings from when assertions are disabled.
llvm-svn: 157438
2012-05-24 23:37:49 +00:00
Andrew Trick a306a8a844 misched: Use the same scheduling heuristics with -misched-topdown/bottomup.
(except the part about choosing direction)

llvm-svn: 157437
2012-05-24 23:11:17 +00:00
Andrew Trick 79d3eecbb4 misched: Trace regpressure.
llvm-svn: 157429
2012-05-24 22:11:14 +00:00
Andrew Trick a8ad5f7c7b misched: Give each ReadyQ a unique ID
llvm-svn: 157428
2012-05-24 22:11:12 +00:00
Andrew Trick 61f1a278b8 misched: Added ScoreboardHazardRecognizer.
The Hazard checker implements in-order contraints, or interlocked
resources. Ready instructions with hazards do not enter the available
queue and are not visible to other heuristics.

The major code change is the addition of SchedBoundary to encapsulate
the state at the top or bottom of the schedule, including both a
pending and available queue.

The scheduler now counts cycles in sync with the hazard checker. These
are minimum cycle counts based on known hazards.

Targets with no itinerary (x86_64) currently remain at cycle 0. To fix
this, we need to provide some maximum issue width for all targets. We
also need to add the concept of expected latency vs. minimum latency.

llvm-svn: 157427
2012-05-24 22:11:09 +00:00
Andrew Trick ca47335461 misched: Release bottom roots in reverse order.
llvm-svn: 157426
2012-05-24 22:11:05 +00:00
Andrew Trick dd375dd34a misched: rename ReadyQ class
llvm-svn: 157425
2012-05-24 22:11:03 +00:00
Andrew Trick f378617773 misched: copy comments so compareRPDelta is readable by itself.
llvm-svn: 157424
2012-05-24 22:11:01 +00:00
Andrew Trick d5326aea81 regpressure: Added RegisterPressure::dump
llvm-svn: 157423
2012-05-24 22:10:59 +00:00
Andrew Trick b2c172e20a regpressure: physreg livein/out fix
llvm-svn: 157422
2012-05-24 22:10:57 +00:00
Craig Topper 9520719b9b Mark some static arrays as const.
llvm-svn: 157377
2012-05-24 06:35:32 +00:00
Jakob Stoklund Olesen 0ce90494e6 Add a last resort tryInstructionSplit() to RAGreedy.
Live ranges with a constrained register class may benefit from splitting
around individual uses. It allows the remaining live range to use a
larger register class where it may allocate. This is like spilling to a
different register class.

This is only attempted on constrained register classes.

<rdar://problem/11438902>

llvm-svn: 157354
2012-05-23 22:37:27 +00:00
Bill Wendling e351e8c52d Forgot to reverse conditional.
llvm-svn: 157349
2012-05-23 22:12:50 +00:00
Bill Wendling 041793c452 Reduce indentation by early detection of 'continue'. No functionality change.
llvm-svn: 157348
2012-05-23 22:09:50 +00:00
Jakob Stoklund Olesen 5b8f476037 Correctly deal with identity copies in RegisterCoalescer.
Now that the coalescer keeps live intervals and machine code in sync at
all times, it needs to deal with identity copies differently.

When merging two virtual registers, all identity copies are removed
right away. This means that other identity copies must come from
somewhere else, and they are going to have a value number.

Deal with such copies by merging the value numbers before erasing the
copy instruction. Otherwise, we leave dangling value numbers in the live
interval.

This fixes PR12927.

llvm-svn: 157340
2012-05-23 20:21:06 +00:00
Patrik Hägglund 94537c2a06 Small fix for the debug output from PBQP (PR12822).
llvm-svn: 157319
2012-05-23 12:12:58 +00:00
Eric Christopher c49643586b Add support for C++11 enum classes in llvm.
Part of rdar://11496790

llvm-svn: 157303
2012-05-23 00:09:20 +00:00
Eric Christopher d42b92f5c3 Untabify and 80-col.
llvm-svn: 157274
2012-05-22 18:45:24 +00:00
Eric Christopher 775cbd2b47 Formatting consistency.
llvm-svn: 157273
2012-05-22 18:45:18 +00:00
Jakob Stoklund Olesen 924279ca0e Only erase virtregs with no uses left.
Also make sure registers aren't erased twice if the dead def mentions
the register twice.

This fixes PR12911.

llvm-svn: 157254
2012-05-22 14:52:12 +00:00
Owen Anderson f2118ea826 Fix use of an unitialized value in the LegalizeOps expansion for ISD::SUB. No in-tree targets exercise this path.
Patch by Micah Villmow.

llvm-svn: 157215
2012-05-21 22:39:20 +00:00
Chad Rosier 5d1f5d2be3 Typo.
llvm-svn: 157195
2012-05-21 17:13:41 +00:00
Jakob Stoklund Olesen 29268b50f2 Give a small negative bias to giant edge bundles.
This helps compile time when the greedy register allocator splits live
ranges in giant functions. Without the bias, we would try to grow
regions through the giant edge bundles, usually to find out that the
region became too big and expensive.

If a live range has many uses in blocks near the giant bundle, the small
negative bias doesn't make a big difference, and we still consider
regions including the giant edge bundle.

Giant edge bundles are usually connected to landing pads or indirect
branches.

llvm-svn: 157174
2012-05-21 03:11:23 +00:00
Jakob Stoklund Olesen a7c3d2f902 Clear kill flags on the fly when joining intervals.
With physreg joining out of the way, it is easy to recognize the
instructions that need their kill flags cleared while testing for
interference.

This allows us to skip the final scan of all instructions for an 11%
speedup of the coalescer pass.

llvm-svn: 157169
2012-05-20 21:41:05 +00:00
Jakob Stoklund Olesen 2f06a6579c Constrain regclasses in PeepholeOptimizer.
It can be necessary to restrict to a sub-class before accessing
sub-registers.

llvm-svn: 157164
2012-05-20 18:42:55 +00:00
Jakob Stoklund Olesen 00f07dec0c Constrain register classes in TailDup.
When rewriting operands, make sure the new registers have a compatible
register class.

llvm-svn: 157163
2012-05-20 18:42:51 +00:00
Peter Collingbourne 8eb05fd093 When legalising shifts, do not pre-build a list of operands which
may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve
the other operands when calling UpdateNodeOperands.  Fixes PR12889.

llvm-svn: 157162
2012-05-20 18:36:15 +00:00
Benjamin Kramer 76004e69a6 Plug a leak when using MCJIT.
Found by valgrind.

llvm-svn: 157160
2012-05-20 17:24:08 +00:00
Benjamin Kramer a7c2c41c3c Use TargetMachine's register info instead of creating a new one and leaking it.
llvm-svn: 157155
2012-05-20 11:24:27 +00:00
Jakob Stoklund Olesen 1f1c6add10 Properly constrain register classes for sub-registers.
Not all GR64 registers have sub_8bit sub-registers.

llvm-svn: 157150
2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen a103a516c6 Properly constrain register classes in 2-addr.
X86 has 2-addr instructions with different constraints on the tied def
and use operands. One is GR32, one is GR32_NOSP.

llvm-svn: 157149
2012-05-20 06:38:32 +00:00
Jakob Stoklund Olesen b8f950650b Missed a push_back in r157147.
llvm-svn: 157148
2012-05-20 05:28:53 +00:00
Jakob Stoklund Olesen d0a38a8daa Avoid deleting extra copies when RegistersDefinedFromSameValue is true.
This function adds copies to be erased to DupCopies, avoid also adding
them to DeadCopies.

llvm-svn: 157147
2012-05-20 04:52:48 +00:00
Jakob Stoklund Olesen 64d82b74dd Fix build bots.
Avoid looking at the operands of a potentially erased instruction.

llvm-svn: 157146
2012-05-20 03:57:12 +00:00
Jakob Stoklund Olesen 02d83e3b8b LiveRangeQuery simplifies shrinkToUses().
llvm-svn: 157145
2012-05-20 02:54:52 +00:00
Jakob Stoklund Olesen abc8c3d3ce Use LiveRangeQuery in ScheduleDAGInstrs.
llvm-svn: 157144
2012-05-20 02:44:38 +00:00
Jakob Stoklund Olesen 58165b92e6 Eliminate some uses of struct LiveRange.
That struct ought to be a LiveInterval implementation detail.

llvm-svn: 157143
2012-05-20 02:44:36 +00:00
Jakob Stoklund Olesen 2aeead4bf6 Use LiveRangeQuery instead of getLiveRangeContaining().
llvm-svn: 157142
2012-05-20 02:44:33 +00:00
Jakob Stoklund Olesen 4e1e43a355 Simplify overlap check.
llvm-svn: 157137
2012-05-19 23:59:27 +00:00
Jakob Stoklund Olesen a34a69ce0c Fix 12892.
Dead code elimination during coalescing could cause a virtual register
to be split into connected components. The following rewriting would be
confused about the already joined copies present in the code, but
without a corresponding value number in the live range.

Erase all joined copies instantly when joining intervals such that the
MI and LiveInterval representations are always in sync.

llvm-svn: 157135
2012-05-19 23:34:59 +00:00
Jakob Stoklund Olesen e59d0c3252 Remove the late DCE in RegisterCoalescer.
Dead code and joined copies are now eliminated on the fly, and there is
no need for a post pass.

This makes the coalescer work like other modern register allocator
passes: Code is changed on the fly, there is no pending list of changes
to be committed.

llvm-svn: 157132
2012-05-19 21:02:31 +00:00
Jakob Stoklund Olesen 25ced18407 Erase joined copies immediately.
The late dead code elimination is no longer necessary.

The test changes are cause by a register hint that can be either %rdi or
%rax. The choice depends on the use list order, which this patch changes.

llvm-svn: 157131
2012-05-19 20:54:07 +00:00
Jakob Stoklund Olesen 1b707c8817 Fix an ancient bug in removeCopyByCommutingDef().
Before rewriting uses of one value in A to register B, check that there
are no tied uses. That would require multiple A values to be rewritten.

This bug can't bite in the current version of the code for a fairly
subtle reason: A tied use would have caused 2-addr to insert a copy
before the use. If the copy has been coalesced, it will be found by the
same loop changed by this patch, and the optimization is aborted.

This was exposed by 400.perlbench and lua after applying a patch that
deletes joined copies aggressively.

llvm-svn: 157130
2012-05-19 20:54:03 +00:00
Jakob Stoklund Olesen d05148ba89 Collect inflatable virtual registers on the fly.
There is no reason to defer the collection of virtual registers whose
register class may be replaced with a larger class.

llvm-svn: 157125
2012-05-19 19:25:00 +00:00
Jakob Stoklund Olesen 900f58441d Eliminate dead code after remat.
This will remove the original def once it has no more uses.

llvm-svn: 157104
2012-05-19 05:25:59 +00:00
Jakob Stoklund Olesen dcffc626c0 Don't remat during updateRegDefsUses().
Remaining virtreg->physreg copies were rematerialized during
updateRegDefsUses(), but we already do the same thing in joinCopy() when
visiting the physreg copy instruction.

Eliminate the preserveSrcInt argument to reMaterializeTrivialDef(). It
is now always true.

llvm-svn: 157103
2012-05-19 05:25:56 +00:00
Jakob Stoklund Olesen 06dc721203 Immediately erase trivially useless copies.
There is no need for these instructions to stick around since they are
known to be not dead.

llvm-svn: 157102
2012-05-19 05:25:53 +00:00
Jakob Stoklund Olesen 82d77e8145 Run proper recursive dead code elimination during coalescing.
Dead copies cause problems because they are trivial to coalesce, but
removing them gived the live range a dangling end point. This patch
enables full dead code elimination which trims live ranges to their uses
so end points don't dangle.

DCE may erase multiple instructions. Put the pointers in an ErasedInstrs
set so we never risk visiting erased instructions in the work list.

There isn't supposed to be any dead copies entering RegisterCoalescer,
but they do slip by as evidenced by test/CodeGen/X86/coalescer-dce.ll.

llvm-svn: 157101
2012-05-19 05:25:50 +00:00
Jakob Stoklund Olesen e5bbe37950 Allow LiveRangeEdit to be created with a NULL parent.
The dead code elimination with callbacks is still useful.

llvm-svn: 157100
2012-05-19 05:25:46 +00:00
Jakob Stoklund Olesen 3834dae65d Modernize naming convention for class members.
No functional change.

llvm-svn: 157079
2012-05-18 22:10:15 +00:00
Jakob Stoklund Olesen b686a2cebd Move all work list processing to copyCoalesceWorkList().
This will make it possible to filter out erased instructions later.

llvm-svn: 157073
2012-05-18 21:09:40 +00:00
Jim Grosbach 4b63d2ae1d Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

llvm-svn: 157062
2012-05-18 19:12:01 +00:00
Eric Christopher e2b36ce24a Remove duplicate code that we could just fallthrough to.
llvm-svn: 157060
2012-05-18 18:24:15 +00:00
Jakob Stoklund Olesen b954b91ada Simplify RegisterCoalescer::copyCoalesceInMBB().
It is no longer necessary to separate VirtCopies, PhysCopies, and
ImpDefCopies. Implicitly defined copies are extremely rare after we
added the ProcessImplicitDefs pass, and physical register copies are not
joined any longer.

llvm-svn: 157059
2012-05-18 18:21:48 +00:00
Jakob Stoklund Olesen d78d7b05ae Remove support for PhysReg joining.
This has been disabled for a while, and it is not a feature we want to
support. Copies between physical and virtual registers are eliminated by
good hinting support in the register allocator. Joining virtual and
physical registers is really a form of register allocation, and the
coalescer is not properly equipped to do that. In particular, it cannot
backtrack coalescing decisions, and sometimes that would cause it to
create programs that were impossible to register allocate, by exhausting
a small register class.

It was also very difficult to keep track of the live ranges of aliasing
registers when extending the live range of a physreg. By disabling
physreg joining, we can let fixed physreg live ranges remain constant
throughout the register allocator super-pass.

One type of physreg joining remains: A virtual register that has a
single value which is a copy of a reserved register can be merged into
the reserved physreg. This always lowers register pressure, and since we
don't compute live ranges for reserved registers, there are no problems
with aliases.

llvm-svn: 157055
2012-05-18 17:18:58 +00:00
Stepan Dyatkovskiy b638ee0ed3 Recommited reworked r156804:
SelectionDAGBuilder::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.

llvm-svn: 157046
2012-05-18 08:32:28 +00:00
Evan Cheng 22d405f57b Teach two-address pass to update the "source" map so it doesn't perform a
non-profitable commute using outdated info. The test case would still fail
because of poor pre-RA schedule. That will be fixed by MI scheduler.

rdar://11472010

llvm-svn: 157038
2012-05-18 01:33:51 +00:00
Andrew Trick 6a50baa26e comments
llvm-svn: 157020
2012-05-17 22:37:09 +00:00
Andrew Trick 276a3e8c46 misched: trace ReadyQ.
llvm-svn: 157007
2012-05-17 18:35:13 +00:00
Andrew Trick 2202577d80 misched: Added 3-level regpressure back-off.
Introduce the basic strategy for register pressure scheduling.

1) Respect target limits at all times.

2) Indentify critical register classes (pressure sets).
   Track pressure within the scheduled region.
   Avoid increasing scheduled pressure for critical registers.

3) Avoid exceeding the max pressure of the region prior to scheduling.

Added logic for picking between the top and bottom ready Q's based on
regpressure heuristics.

Status: functional but needs to be asjusted to achieve good results.
llvm-svn: 157006
2012-05-17 18:35:10 +00:00
Andrew Trick 47a1feaea0 comment
llvm-svn: 157005
2012-05-17 18:35:07 +00:00
Andrew Trick 1c646ac68b regpressure: Fix getMaxUpwardPressureDelta.
llvm-svn: 157004
2012-05-17 18:35:05 +00:00
Andrew Trick 463b2f1f04 misched: fix liveness iterators
llvm-svn: 157003
2012-05-17 18:35:03 +00:00
Andrew Trick 7d90035b0b whitespace
llvm-svn: 157002
2012-05-17 18:35:00 +00:00
Jakob Stoklund Olesen c3553ffc70 Never clear <undef> flags on already joined copies.
RegisterCoalescer set <undef> flags on all operands of copy instructions
that are scheduled to be removed. This is so they won't affect
shrinkToUses() by introducing false register reads.

Make sure those <undef> flags are never cleared, or shrinkToUses() could
cause live intervals to end at instructions about to be deleted.

This would be a lot simpler if RegisterCoalescer could just erase joined
copies immediately instead of keeping all the to-be-deleted instructions
around.

This fixes PR12862. Unfortunately, bugpoint can't create a sane test
case for this. Like many other coalescer problems, this failure depends
of a very fragile series of events.

<rdar://problem/11474428>

llvm-svn: 157001
2012-05-17 18:32:42 +00:00
Jakob Stoklund Olesen 14a8745990 Fix a verifier bug.
Make sure useless (def-only) intervals also get verified.

llvm-svn: 157000
2012-05-17 18:32:40 +00:00
Bill Wendling 27489fe014 Relax the requirement that the exception object must be an instruction. During
bugpoint-ing, it may turn into something else.

llvm-svn: 156998
2012-05-17 17:59:51 +00:00
Stepan Dyatkovskiy 96d0c925e9 SelectionDAGBuilder: CaseBlock, CaseRanges and CaseCmp changed representation of Low and High from signed to unsigned. Since unsigned ints usually simpler, faster and allows to reduce some extra signed bit checks needed before <,>,<=,>= comparisons.
llvm-svn: 156985
2012-05-17 08:56:30 +00:00
Jakob Stoklund Olesen ab4828390c Set sub-register <undef> flags more accurately.
When widening an existing <def,reads-undef> operand to a super-register,
it may be necessary to clear the <undef> flag because the wider register
is now read-modify-write through the instruction.

Conversely, it may be necessary to add an <undef> flag when the
coalescer turns a full-register def into a sub-register def, but the
larger register wasn't live before the instruction.

This happens in test/CodeGen/ARM/coalesce-subregs.ll, but the test
is too small for the <undef> flags to affect the generated code.

llvm-svn: 156951
2012-05-16 21:22:35 +00:00
Duncan Sands 49080cd9a1 Fix a thinko in DisintegrateMERGE_VALUES. Patch by Xiaoyi Guo.
llvm-svn: 156909
2012-05-16 07:57:18 +00:00
Jakob Stoklund Olesen 984997b3a0 Enable sub-sub-register copy coalescing.
It is now possible to coalesce weird skewed sub-register copies by
picking a super-register class larger than both original registers. The
included test case produces code like this:

  vld2.32 {d16, d17, d18, d19}, [r0]!
  vst2.32 {d18, d19, d20, d21}, [r0]

We still perform interference checking as if it were a normal full copy
join, so this is still quite conservative. In particular, the f1 and f2
functions in the included test case still have remaining copies because
of false interference.

llvm-svn: 156878
2012-05-15 23:31:35 +00:00
Jakob Stoklund Olesen a1626369b6 Teach RegisterCoalescer to handle symmetric sub-register copies.
It is possible to coalesce two overlapping registers to a common
super-register that it larger than both of the original registers.

The important difference is that it may be necessary to rewrite DstReg
operands as well as SrcReg operands because the sub-register index has
changed.

This behavior is still disabled by CoalescerPair.

llvm-svn: 156869
2012-05-15 22:26:28 +00:00
Jakob Stoklund Olesen 385970f290 Handle NewReg==OldReg in renameRegister().
This can happen when widening a virtual register to a super-register
class.

llvm-svn: 156867
2012-05-15 22:20:27 +00:00
Jakob Stoklund Olesen 1c6a2223d4 We never call adjustCopiesBackFrom() for partial copies.
There is no need to look at an always null SrcIdx.

llvm-svn: 156866
2012-05-15 22:18:49 +00:00
Jakob Stoklund Olesen 71673b4faf Extend the CoalescerPair interface to handle symmetric sub-register copies.
Now both SrcReg and DstReg can be sub-registers of the final coalesced
register.

CoalescerPair::setRegisters still rejects such copies because
RegisterCoalescer doesn't yet handle them.

llvm-svn: 156848
2012-05-15 20:09:43 +00:00
Andrew Trick da01ba37e0 Add -enable-aa-sched-mi, off by default, for AliasAnalysis inside MachineScheduler.
This feature avoids creating edges in the scheduler's dependence graph
for non-aliasing memory operations according to whichever alias
analysis is available. It has been fully tested in Hexagon. Before
making this default, it needs to be extended to handle multiple
MachineMemOperands, compile time needs more evaluation, and
benchmarking on X86 and ARM is needed.

Patch by Sergei Larin!

llvm-svn: 156842
2012-05-15 18:59:41 +00:00
Jim Grosbach c3b0427921 Allow MCCodeEmitter access to the target MCRegisterInfo.
Add the MCRegisterInfo to the factories and constructors.

Patch by Tom Stellard <Tom.Stellard@amd.com>.

llvm-svn: 156828
2012-05-15 17:35:52 +00:00
Stepan Dyatkovskiy e01e9863c5 Rejected r156804 due to buildbots failures.
llvm-svn: 156808
2012-05-15 06:50:18 +00:00
Stepan Dyatkovskiy d450d3fa12 SelectionDAGBuilder::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.
llvm-svn: 156804
2012-05-15 05:09:41 +00:00
Jakob Stoklund Olesen a13fd12872 Don't access MO reference after invalidating operand list.
This should unbreak llvm-x86_64-linux.

llvm-svn: 156778
2012-05-14 21:30:58 +00:00
Jakob Stoklund Olesen dc2e0cd44a Fix PR12821.
RAFast must add an <imp-def> operand when it is rewriting a sub-register
def that isn't a read-modify-write.

llvm-svn: 156777
2012-05-14 21:10:25 +00:00
Dan Gohman 164fe18cfe Rename @llvm.debugger to @llvm.debugtrap.
llvm-svn: 156774
2012-05-14 18:58:10 +00:00
Jakob Stoklund Olesen 165473247f Don't look for empty live ranges in the unions.
Empty live ranges represent undef and still get allocated, but they
won't appear in LiveIntervalUnions.

Patch by Patrik Hägglund!

llvm-svn: 156685
2012-05-12 00:33:28 +00:00
Chad Rosier a33015d4e0 Revert 156658.
llvm-svn: 156662
2012-05-11 23:21:01 +00:00
Chad Rosier e40f5d3ee0 [fast-isel] Fast-isel doesn't use the expect intrinsic.
llvm-svn: 156658
2012-05-11 23:10:58 +00:00
Manman Ren dc8ad0058f ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156599
2012-05-11 01:30:47 +00:00
Dan Gohman dfab443ae8 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.

llvm-svn: 156593
2012-05-11 00:19:32 +00:00
Andrew Trick c5d7008f27 misched: Print machineinstrs with -debug-only=misched
llvm-svn: 156576
2012-05-10 21:06:21 +00:00
Andrew Trick 419eae2db7 misched: tracing register pressure heuristics.
llvm-svn: 156575
2012-05-10 21:06:19 +00:00
Andrew Trick 7ee9de51f2 misched: Add register pressure backoff to ConvergingScheduler.
Prioritize the instruction that comes closest to keeping pressure
under the target's limit. Then prioritize instructions that avoid
increasing the max pressure in the scheduled region. The max pressure
heuristic is a tad aggressive. Later I'll fix it to consider the
unscheduled pressure as well.

WIP: This is mostly functional but untested and not likely to do much good yet.
llvm-svn: 156574
2012-05-10 21:06:16 +00:00
Andrew Trick 795c1120a6 misched: Release only unscheduled nodes into ReadyQ.
llvm-svn: 156573
2012-05-10 21:06:14 +00:00
Andrew Trick 95dafd8b31 misched: Added ReadyQ container wrapper for Top and Bottom Queues.
llvm-svn: 156572
2012-05-10 21:06:12 +00:00
Andrew Trick 4add42f439 misched: Introducing Top and Bottom register pressure trackers during scheduling.
llvm-svn: 156571
2012-05-10 21:06:10 +00:00
Andrew Trick 75812f815c RegPressure: API for speculatively checking instruction pressure.
Added getMaxExcessUpward/DownwardPressure. They somewhat abuse the
tracker by speculatively handling an instruction out of order. But it
is convenient for now. In the future, we will cache each instruction's
pressure contribution to make this efficient.

llvm-svn: 156561
2012-05-10 19:11:52 +00:00
Andrew Trick 1df762abf4 RegPressure: fix array index iteration style.
llvm-svn: 156560
2012-05-10 19:11:49 +00:00
Manman Ren b555b382bd Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.

llvm-svn: 156556
2012-05-10 18:49:43 +00:00
Manman Ren c860887b2d ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156550
2012-05-10 16:48:21 +00:00
Eric Christopher 8d2a77de63 Fix thinko in conditional.
Part of rdar://11352000 and should bring the buildbots back.

llvm-svn: 156421
2012-05-08 21:24:39 +00:00
Jim Grosbach 92f6adc8be DAGCombiner should not change the type of an extract_vector index.
When a combine twiddles an extract_vector, care should be take to preserve
the type of the index operand. No luck extracting a reasonable testcase,
unfortunately.

rdar://11391009

llvm-svn: 156419
2012-05-08 20:56:07 +00:00
Akira Hatanaka fd82286e62 Formatting fixes.
Patch by Jack Carter.

llvm-svn: 156409
2012-05-08 19:14:42 +00:00
Eric Christopher 4d25052a9a Handle OpDeref in case it comes in as a register operand.
Part of rdar://11352000

llvm-svn: 156405
2012-05-08 18:56:00 +00:00
Jakob Stoklund Olesen 952b4c11fe Extract methods for joining physregs.
No functional change.

llvm-svn: 156345
2012-05-08 00:08:35 +00:00
Jakob Stoklund Olesen 9e8ae6c37f Naming convention and whitespace. No functional change.
llvm-svn: 156342
2012-05-07 23:46:16 +00:00
Jakob Stoklund Olesen 98595b5a61 Coalesce subreg-subreg copies.
At least some of them:

  %vreg1:sub_16bit = COPY %vreg2:sub_16bit; GR64:%vreg1, GR32: %vreg2

Previously, we couldn't figure out that the above copy could be
eliminated by coalescing %vreg2 with %vreg1:sub_32bit.

The new getCommonSuperRegClass() hook makes it possible.

This is not very useful yet since the unmodified part of the destination
register usually interferes with the source register. The coalescer
needs to understand sub-register interference checking first.

llvm-svn: 156334
2012-05-07 22:57:55 +00:00
Jakob Stoklund Olesen 3c52f0281f Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).

So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.

Patch by Yiannis Tsiouris!

llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Owen Anderson ab63d84252 Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.
llvm-svn: 156324
2012-05-07 20:51:25 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Jakob Stoklund Olesen e326ed33a8 Make sure findRepresentativeClass picks the widest super-register.
We want the representative register class to contain the largest
super-registers available. This makes the function less sensitive to the
register class numbering.

llvm-svn: 156220
2012-05-04 22:53:28 +00:00
Jakob Stoklund Olesen e89496fe63 Remove extra comma in debug output.
llvm-svn: 156219
2012-05-04 22:53:26 +00:00
Jakob Stoklund Olesen 75fbe90839 Use SuperRegClassIterator for findRepresentativeClass().
The masks returned by SuperRegClassIterator are computed automatically
by TableGen. This is better than depending on the manually specified
SuperRegClasses.

llvm-svn: 156147
2012-05-04 02:19:22 +00:00
Evan Cheng b64e7b778b Fix two-address pass's aggressive instruction commuting heuristics. It's meant
to catch cases like:
 %reg1024<def> = MOV r1
 %reg1025<def> = MOV r0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

By commuting ADD, it let coalescer eliminate all of the copies. However, there
was a bug in the heuristics where it ended up commuting the ADD in:

 %reg1024<def> = MOV r0
 %reg1025<def> = MOV 0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

That did no benefit but rather ensure the last MOV would not be coalesced.

rdar://11355268

llvm-svn: 156048
2012-05-03 01:45:13 +00:00
Andrew Trick 32aea358e1 Added TargetRegisterInfo::getAllocatableClass.
The ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.

This ensures that targets define register classes as intended.

llvm-svn: 156046
2012-05-03 01:14:37 +00:00
Owen Anderson 41b0665b5b Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
llvm-svn: 156029
2012-05-02 22:17:40 +00:00
Owen Anderson b5f167c660 Teach DAG combine that multiplication by 1.0 can always be constant folded.
llvm-svn: 156023
2012-05-02 21:32:35 +00:00
Jim Grosbach edcb868fe3 Tidy up. Naming conventions.
llvm-svn: 155960
2012-05-01 23:21:41 +00:00
Jakub Staszak cd2353402d Use dyn_cast instead of checking opcode and cast.
llvm-svn: 155957
2012-05-01 23:06:00 +00:00
Bill Wendling b6b50c6638 Strip the pointer casts off of allocas so that the selection DAG can find them.
PR10799

llvm-svn: 155954
2012-05-01 22:50:45 +00:00
Sirish Pande 94212168fc Target independent Hexagon Packetizer fix.
llvm-svn: 155947
2012-05-01 21:28:30 +00:00
Bill Wendling b12f16e75f Change the PassManager from a reference to a pointer.
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468

llvm-svn: 155902
2012-05-01 08:27:43 +00:00
Jakub Staszak cec09b2594 Add some constantness. No functionality change.
llvm-svn: 155859
2012-04-30 23:41:30 +00:00
Benjamin Kramer db25381a54 RegisterPressure: ArrayRefize some functions for better readability. No functionality change.
llvm-svn: 155795
2012-04-29 18:52:56 +00:00
Jakob Stoklund Olesen 6053899aa0 Don't update spill weights when joining intervals.
We don't compute spill weights until after coalescing anyway.

llvm-svn: 155766
2012-04-28 19:19:11 +00:00
Jakob Stoklund Olesen 4fe0e1908e Spring cleaning - Delete dead code.
llvm-svn: 155765
2012-04-28 19:19:07 +00:00
Andrew Trick 833f04962a Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice.
This time, also fix the caller of AddGlue to properly handle
incomplete chains. AddGlue had failure modes, but shamefully hid them
from its caller. It's luck ran out.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155749
2012-04-28 01:03:23 +00:00
Andrew Trick 7a773ec053 Temporarily revert r155668: Fix the SD scheduler to avoid gluing.
This definitely caused regression with ARM -mno-thumb.

llvm-svn: 155743
2012-04-27 22:55:59 +00:00
Andrew Trick 03fa574af5 Fix the SD scheduler to avoid gluing the same node twice.
DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
beter fix is to bail out from bad gluing at the time we detect it.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155668
2012-04-26 21:48:25 +00:00
Jakob Stoklund Olesen 01f201f484 Remove more dead code.
llvm-svn: 155566
2012-04-25 18:01:30 +00:00
Jakob Stoklund Olesen 983dd43b15 Remove the -disable-cross-class-join option.
Cross-class joins have been normal and fully supported for a while now.
With TableGen generating the getMatchingSuperRegClass() hook, they are
unlikely to cause problems again.

llvm-svn: 155552
2012-04-25 16:17:50 +00:00
Jakob Stoklund Olesen d11cf9677f Cross-class joining is winning.
Remove the heuristic for disabling cross-class joins. The greedy
register allocator can handle the narrow register classes, and when it
splits a live range, it can pick a larger register class.

Benchmarks were unaffected by this change.

<rdar://problem/11302212>

llvm-svn: 155551
2012-04-25 16:17:47 +00:00
Andrew Trick 4d4b5469ab Fix a naughty header include that breaks "installed" builds.
llvm-svn: 155486
2012-04-24 20:36:19 +00:00
Evan Cheng 2d14d8aca1 MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144
llvm-svn: 155470
2012-04-24 19:06:55 +00:00
Andrew Trick 26bdff9b82 cmake: new file
llvm-svn: 155460
2012-04-24 18:06:49 +00:00
Andrew Trick 9e9a9f1465 misched: DAG builder must special case earlyclobber
llvm-svn: 155459
2012-04-24 18:04:41 +00:00
Andrew Trick c3ea00565f misched: try (not too hard) to place debug values where they belong
llvm-svn: 155458
2012-04-24 18:04:37 +00:00
Andrew Trick cc45a28320 misched: ignore debug values during scheduling
llvm-svn: 155457
2012-04-24 18:04:34 +00:00
Andrew Trick 88639928bd misched: DAG builder support for tracking register pressure within the current scheduling region.
The DAG builder is a convenient place to do it. Hopefully this is more
efficient than a separate traversal over the same region.

llvm-svn: 155456
2012-04-24 17:56:43 +00:00
Andrew Trick 3cd53a1a52 RegisterPressure: A utility for computing register pressure within a
MachineInstr sequence.

This uses the new target interface for tracking register pressure
using pressure sets to model overlapping register classes and
subregisters.

RegisterPressure results can be tracked incrementally or stored at
region boundaries. Global register pressure can be deduced from local
RegisterPressure results if desired.

This is an early, somewhat untested implementation. I'm working on
testing it within the context of a register pressure reducing
MachineScheduler.

llvm-svn: 155454
2012-04-24 17:53:35 +00:00
Bill Wendling f1b14b719f Look for the 'Is Simulated' module flag. This indicates that the program is compiled to run on a simulator.
llvm-svn: 155435
2012-04-24 11:03:50 +00:00
Preston Gurd 9a0914753a This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395
2012-04-23 21:39:35 +00:00
Chandler Carruth af0f8bf595 Temporarily revert r155364 until the upstream review can complete, per
the stated developer policy.

llvm-svn: 155373
2012-04-23 18:28:57 +00:00
Sirish Pande 995c8dbfd2 Hexagon Packetizer's target independent fix.
llvm-svn: 155364
2012-04-23 17:49:09 +00:00
Elena Demikhovsky 8d7e56c409 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Nadav Rotem 31caa27bf5 Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors.
llvm-svn: 155296
2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen d114da6004 Fix PR12599.
The X86 target is editing the selection DAG while isel is selecting
nodes following a topological ordering. When the DAG hacking triggers
CSE, nodes can be deleted and bad things happen.

llvm-svn: 155257
2012-04-20 23:36:09 +00:00
Jakob Stoklund Olesen e3a891cf08 Make ISelPosition a local variable.
Now that multiple DAGUpdateListeners can be active at the same time,
ISelPosition can become a local variable in DoInstructionSelection.

We simply register an ISelUpdater with CurDAG while ISelPosition exists.

llvm-svn: 155249
2012-04-20 22:08:50 +00:00
Jakob Stoklund Olesen beb9469d5c Register DAGUpdateListeners with SelectionDAG.
Instead of passing listener pointers to RAUW, let SelectionDAG itself
keep a linked list of interested listeners.

This makes it possible to have multiple listeners active at once, like
RAUWUpdateListener was already doing. It also makes it possible to
register listeners up the call stack without controlling all RAUW calls
below.

DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG
list of active listeners.

llvm-svn: 155248
2012-04-20 22:08:46 +00:00
Jakob Stoklund Olesen 7111a630d5 Print <def,read-undef> to avoid confusion.
The <undef> flag on a def operand only applies to partial register
redefinitions. Only print the flag when relevant, and print it as
<def,read-undef> to make it clearer what it means.

llvm-svn: 155239
2012-04-20 21:45:33 +00:00
Andrew Trick 51ee936101 New and improved comment.
llvm-svn: 155229
2012-04-20 20:24:33 +00:00
Andrew Trick 1eb4a0da55 SparseSet: Add support for key-derived indexes and arbitrary key types.
This nicely handles the most common case of virtual register sets, but
also handles anticipated cases where we will map pointers to IDs.

The goal is not to develop a completely generic SparseSet
template. Instead we want to handle the expected uses within llvm
without any template antics in the client code. I'm adding a bit of
template nastiness here, and some assumption about expected usage in
order to make the client code very clean.

The expected common uses cases I'm designing for:
- integer keys that need to be reindexed, and may map to additional
  data
- densely numbered objects where we want pointer keys because no
  number->object map exists.

llvm-svn: 155227
2012-04-20 20:05:28 +00:00
Andrew Trick 7405c6d57a misched: initialize BB
llvm-svn: 155226
2012-04-20 20:05:21 +00:00
Andrew Trick a11810ad60 Allow targets to select the default scheduler by name.
llvm-svn: 155090
2012-04-19 01:34:10 +00:00
Chandler Carruth b415bf98f0 This reverts a long string of commits to the Hexagon backend. These
commits have had several major issues pointed out in review, and those
issues are not being addressed in a timely fashion. Furthermore, this
was all committed leading up to the v3.1 branch, and we don't need piles
of code with outstanding issues in the branch.

It is possible that not all of these commits were necessary to revert to
get us back to a green state, but I'm going to let the Hexagon
maintainer sort that out. They can recommit, in order, after addressing
the feedback.

Reverted commits, with some notes:

Primary commit r154616: HexagonPacketizer
  - There are lots of review comments here. This is the primary reason
    for reverting. In particular, it introduced large amount of warnings
    due to a bad construct in tablegen.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154622: CMake fixes
    - r154660: Fix numerous build warnings in release builds.
  - Please don't resubmit this until the three commits above are
    included, and the issues in review addressed.

Primary commit r154695: Pass to replace transfer/copy ...
  - Reverted to minimize merge conflicts. I'm not aware of specific
    issues with this patch.

Primary commit r154703: New Value Jump.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154703: Remove iostream usage
    - r154758: Fix CMake builds
    - r154759: Fix build warnings in release builds
  - Please incorporate these fixes and and review feedback before
    resubmitting.

Primary commit r154829: Hexagon V5 (floating point) support.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154841: Remove unused variable (fixing build warnings)

There are also accompanying Clang commits that will be reverted for
consistency.

llvm-svn: 155047
2012-04-18 21:31:19 +00:00
Pete Cooper 8998657c64 LiveIntervalUpdate validators weren't recorded after the calls to std::for_each. Turns out std::for_each doesn't update the variable passed in for the functor but instead copy constructs a new one.
llvm-svn: 155041
2012-04-18 20:29:17 +00:00
Joel Jones 828531f798 Fixes a problem in instruction selection with testing whether or not the
transformation:

(X op C1) ^ C2 --> (X op C1) & ~C2 iff (C1&C2) == C2

should be done.  

This change has been tested:
 Using a debug+asserts build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt
   using this clang to build a release version of clang
 Using the release+asserts clang-with-clang build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt

Checking in because Evan wants it checked in.  Test case forthcoming after
scrubbing.

llvm-svn: 154955
2012-04-17 22:23:10 +00:00
Lang Hames aef9178301 SlotIndexes used to store the index list in a crufty custom linked-list. I can't
for the life of me remember why I wrote it this way, but I can't see any good
reason for it now. This patch replaces the custom linked list with an ilist.

This change should preserve the existing numberings exactly, so no generated code
should change (if it does, file a bug!).

llvm-svn: 154904
2012-04-17 04:15:51 +00:00
Eric Christopher a8caa739de Make comment here more clear.
llvm-svn: 154878
2012-04-16 23:54:23 +00:00
Chandler Carruth 1f5580b6f3 Fix updateTerminator to be resiliant to degenerate terminators where
both fallthrough and a conditional branch target the same successor.
Gracefully delete the conditional branch and introduce any unconditional
branch needed to reach the actual successor. This fixes memory
corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests.

Also, while I'm here fix a latent bug I spotted by inspection. I never
applied the same fundamental fix to this fallthrough successor finding
logic that I did to the logic used when there are no conditional
branches. As a consequence it would have selected landing pads had they
be aligned in just the right way here. I don't have a test case as
I spotted this by inspection, and the previous time I found this
required have of TableGen's source code to produce it. =/ I hate backend
bugs. ;]

Thanks to Jim Grosbach for helping me reason through this and reviewing
the fix.

llvm-svn: 154867
2012-04-16 22:03:00 +00:00
Chandler Carruth 4190b507c5 Flip the new block-placement pass to be on by default.
This is mostly to test the waters. I'd like to get results from FNT
build bots and other bots running on non-x86 platforms.

This feature has been pretty heavily tested over the last few months by
me, and it fixes several of the execution time regressions caused by the
inlining work by preventing inlining decisions from radically impacting
block layout.

I've seen very large improvements in yacr2 and ackermann benchmarks,
along with the expected noise across all of the benchmark suite whenever
code layout changes. I've analyzed all of the regressions and fixed
them, or found them to be impossible to fix. See my email to llvmdev for
more details.

I'd like for this to be in 3.1 as it complements the inliner changes,
but if any failures are showing up or anyone has concerns, it is just
a flag flip and so can be easily turned off.

I'm switching it on tonight to try and get at least one run through
various folks' performance suites in case SPEC or something else has
serious issues with it. I'll watch bots and revert if anything shows up.

llvm-svn: 154816
2012-04-16 13:49:17 +00:00
Chandler Carruth 8c0b41d656 Add a somewhat hacky heuristic to do something different from whole-loop
rotation. When there is a loop backedge which is an unconditional
branch, we will end up with a branch somewhere no matter what. Try
placing this backedge in a fallthrough position above the loop header as
that will definitely remove at least one branch from the loop iteration,
where whole loop rotation may not.

I haven't seen any benchmarks where this is important but loop-blocks.ll
tests for it, and so this will be covered when I flip the default.

llvm-svn: 154812
2012-04-16 13:33:36 +00:00
Chandler Carruth 8c74c7b1c6 Tweak the loop rotation logic to check whether the loop is naturally
laid out in a form with a fallthrough into the header and a fallthrough
out of the bottom. In that case, leave the loop alone because any
rotation will introduce unnecessary branches. If either side looks like
it will require an explicit branch, then the rotation won't add any, do
it to ensure the branch occurs outside of the loop (if possible) and
maximize the benefit of the fallthrough in the bottom.

llvm-svn: 154806
2012-04-16 09:31:23 +00:00
Hal Finkel e0cf6397fd Remove dead SD nodes after the combining pass. Fixes PR12201.
llvm-svn: 154786
2012-04-16 03:33:22 +00:00
Chandler Carruth ccc7e42b1f Rewrite how machine block placement handles loop rotation.
This is a complex change that resulted from a great deal of
experimentation with several different benchmarks. The one which proved
the most useful is included as a test case, but I don't know that it
captures all of the relevant changes, as I didn't have specific
regression tests for each, they were more the result of reasoning about
what the old algorithm would possibly do wrong. I'm also failing at the
moment to craft more targeted regression tests for these changes, if
anyone has ideas, it would be welcome.

The first big thing broken with the old algorithm is the idea that we
can take a basic block which has a loop-exiting successor and a looping
successor and use the looping successor as the layout top in order to
get that particular block to be the bottom of the loop after layout.
This happens to work in many cases, but not in all.

The second big thing broken was that we didn't try to select the exit
which fell into the nearest enclosing loop (to which we exit at all). As
a consequence, even if the rotation worked perfectly, it would result in
one of two bad layouts. Either the bottom of the loop would get
fallthrough, skipping across a nearer enclosing loop and thereby making
it discontiguous, or it would be forced to take an explicit jump over
the nearest enclosing loop to earch its successor. The point of the
rotation is to get fallthrough, so we need it to fallthrough to the
nearest loop it can.

The fix to the first issue is to actually layout the loop from the loop
header, and then rotate the loop such that the correct exiting edge can
be a fallthrough edge. This is actually much easier than I anticipated
because we can handle all the hard parts of finding a viable rotation
before we do the layout. We just store that, and then rotate after
layout is finished. No inner loops get split across the post-rotation
backedge because we check for them when selecting the rotation.

That fix exposed a latent problem with our exitting block selection --
we should allow the backedge to point into the middle of some inner-loop
chain as there is no real penalty to it, the whole point is that it
*won't* be a fallthrough edge. This may have blocked the rotation at all
in some cases, I have no idea and no test case as I've never seen it in
practice, it was just noticed by inspection.

Finally, all of these fixes, and studying the loops they produce,
highlighted another problem: in rotating loops like this, we sometimes
fail to align the destination of these backwards jumping edges. Fix this
by actually walking the backwards edges rather than relying on loopinfo.

This fixes regressions on heapsort if block placement is enabled as well
as lots of other cases where the previous logic would introduce an
abundance of unnecessary branches into the execution.

llvm-svn: 154783
2012-04-16 01:12:56 +00:00
Nadav Rotem 02ef0c3524 When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type.
llvm-svn: 154764
2012-04-15 15:08:09 +00:00
Andrew Trick 97d5b9cca6 misched: Added CanHandleTerminators.
This is a special flag for targets that really want their block
terminators in the DAG. The default scheduler cannot handle this
correctly, so it becomes the specialized scheduler's responsibility to
schedule terminators.

llvm-svn: 154712
2012-04-13 23:29:54 +00:00
Benjamin Kramer 330970d658 Reduce malloc traffic in DwarfAccelTable
- Don't copy offsets into HashData, the underlying vector won't change once the table is finalized.
- Allocate HashData and HashDataContents in a BumpPtrAllocator.
- Allocate string map entries in the same allocator.
- Random cleanups.

llvm-svn: 154694
2012-04-13 20:06:17 +00:00
Sirish Pande b486144c12 HexagonPacketizer patch.
llvm-svn: 154616
2012-04-12 21:06:38 +00:00
Nadav Rotem 9d376b6578 Reapply 154397. Original message:
Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154490
2012-04-11 08:26:11 +00:00
Craig Topper 692d584910 Fix an overly indented line. Remove an 'else' after an 'if' that returns.
llvm-svn: 154479
2012-04-11 04:55:51 +00:00
Craig Topper bc680061e8 Inline implVisitAluOverflow by introducing a nested switch to convert the intrinsic to an nodetype.
llvm-svn: 154478
2012-04-11 04:34:11 +00:00
Craig Topper 3ef01cdb2e Optimize code a bit by calling push_back only once in some loops. Reduces compiled code size a bit.
llvm-svn: 154473
2012-04-11 03:06:35 +00:00
Jakob Stoklund Olesen 645bdd4b69 Tweak MachineLICM heuristics for cheap instructions.
Allow cheap instructions to be hoisted if they are register pressure
neutral or better. This happens if the instruction is the last loop use
of another virtual register.

Only expensive instructions are allowed to increase loop register
pressure.

llvm-svn: 154455
2012-04-11 00:00:28 +00:00
Jakob Stoklund Olesen a3e86a604a Only check for PHI uses inside the current loop.
Hoisting a value that is used by a PHI in the loop will introduce a
copy because the live range is extended to cross the PHI.

The same applies to PHIs in exit blocks.

Also use this opportunity to make HasLoopPHIUse() non-recursive.

llvm-svn: 154454
2012-04-11 00:00:26 +00:00
Owen Anderson 6f1ee1634d Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at sound point.
Zap a testcase that this allows us to completely fold away.

llvm-svn: 154447
2012-04-10 22:46:53 +00:00
Duncan Sands 4f53074cca Add a comment noting that the fdiv -> fmul conversion won't generate
multiplication by a denormal, and some tests checking that.

llvm-svn: 154431
2012-04-10 20:35:27 +00:00
Eric Christopher e9abba71fe To ensure that we have more accurate line information for a block
don't elide the branch instruction if it's the only one in the block,
otherwise it's ok.

PR9796 and rdar://11215207

llvm-svn: 154417
2012-04-10 18:18:10 +00:00
Owen Anderson 3efc8f22bd Revert r154397, which was causing make check failures on the buildbots.
llvm-svn: 154414
2012-04-10 18:02:12 +00:00
Nadav Rotem 065564d85a Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154397
2012-04-10 14:58:31 +00:00
Chandler Carruth 68062617a6 Make a somewhat subtle change in the logic of block placement. Sometimes
the loop header has a non-loop predecessor which has been pre-fused into
its chain due to unanalyzable branches. In this case, rotating the
header into the body of the loop in order to place a loop exit at the
bottom of the loop is a Very Bad Idea as it makes the loop
non-contiguous.

I'm working on a good test case for this, but it's a bit annoynig to
craft. I should get one shortly, but I'm submitting this now so I can
begin the (lengthy) performance analysis process. An initial run of LNT
looks really, really good, but there is too much noise there for me to
trust it much.

llvm-svn: 154395
2012-04-10 13:35:57 +00:00
Anton Korobeynikov 4d1220de34 Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (workarounded)

llvm-svn: 154394
2012-04-10 13:22:49 +00:00
Evan Cheng 136861d994 Make the code slightly more palatable.
llvm-svn: 154378
2012-04-10 03:15:18 +00:00
Evan Cheng f8bad08001 Fix a long standing tail call optimization bug. When a libcall is emitted
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370
2012-04-10 01:51:00 +00:00
Rafael Espindola 1d9672bdce Don't try to zExt just to check if an integer constant is zero, it might
not fit in a i64.

llvm-svn: 154364
2012-04-10 00:16:22 +00:00
Akira Hatanaka 8483a6c47d Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.

llvm-svn: 154341
2012-04-09 20:32:12 +00:00
Lang Hames 3ad11ff90f Patch r153892 for PR11861 apparently broke an external project (see PR12493).
This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when
rescheduling instructions in TryInstructionTransform. Hopefully this will fix
PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after
the copy that unties the operands is emitted (this seems to be a more
appropriate fix for that issue anyway).

llvm-svn: 154338
2012-04-09 20:17:30 +00:00
Rafael Espindola 8f62b3248e Pattern match a setcc of boolean value with 0 as a truncate.
llvm-svn: 154322
2012-04-09 16:06:03 +00:00
Craig Topper 9c3da316ec Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out.
llvm-svn: 154309
2012-04-09 07:19:09 +00:00
Craig Topper e5893f64e8 Remove unnecessary 'else' on an 'if' that always returns
llvm-svn: 154308
2012-04-09 05:59:53 +00:00
Craig Topper e3ad4834ae Optimize code slightly. No functionality change.
llvm-svn: 154307
2012-04-09 05:55:33 +00:00
Craig Topper 5894fe430a Replace some explicit checks with asserts for conditions that should never happen.
llvm-svn: 154305
2012-04-09 05:16:56 +00:00
Craig Topper 6148fe65e8 Optimize code a bit. No functional change intended.
llvm-svn: 154299
2012-04-08 23:15:04 +00:00
Benjamin Kramer bb6ff08766 Silence sign-compare warning.
llvm-svn: 154297
2012-04-08 19:04:45 +00:00
Duncan Sands 2f1dc3814b Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.

llvm-svn: 154296
2012-04-08 18:08:12 +00:00
Craig Topper c8e2d91a58 Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently.
llvm-svn: 154295
2012-04-08 17:53:33 +00:00
Chandler Carruth 16f0ebcbb5 Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.

llvm-svn: 154292
2012-04-08 17:20:55 +00:00
Chandler Carruth bed1abf9ca Remove an over zealous assert. The assert was trying to catch places
where a chain outside of the loop block-set ended up in the worklist for
scheduling as part of the contiguous loop. However, asserting the first
block in the chain is in the loop-set isn't a valid check -- we may be
forced to drag a chain into the worklist due to one block in the chain
being part of the loop even though the first block is *not* in the loop.
This occurs when we have been forced to form a chain early due to
un-analyzable branches.

No test case here as I have no idea how to even begin reducing one, and
it will be hopelessly fragile. We have to somehow end up with a loop
header of an inner loop which is a successor of a basic block with an
unanalyzable pair of branch instructions. Ow. Self-host triggers it so
it is unlikely it will regress.

This at least gets block placement back to passing selfhost and the test
suite. There are still a lot of slowdown that I don't like coming out of
block placement, although there are now also a lot of speedups. =[ I'm
seeing swings in both directions up to 10%. I'm going to try to find
time to dig into this and see if we can turn this on for 3.1 as it does
a really good job of cleaning up after some loops that degraded with the
inliner changes.

llvm-svn: 154287
2012-04-08 14:37:02 +00:00
Chandler Carruth 49158908dc Add a debug-only 'dump' method to the BlockChain structure to ease
debugging.

llvm-svn: 154286
2012-04-08 14:37:01 +00:00
Craig Topper d024cef233 Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper e09d1c5c48 Remove 'else' after 'if' that ends in return.
llvm-svn: 154267
2012-04-07 21:23:41 +00:00
Nadav Rotem 71d07ae5cb 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.

llvm-svn: 154266
2012-04-07 21:19:08 +00:00
Duncan Sands 5f8397a934 Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.

llvm-svn: 154265
2012-04-07 20:04:00 +00:00
Eric Christopher aec8a82694 Patch to set is_stmt a little better for prologue lines in a function.
This enables debuggers to see what are interesting lines for a
breakpoint rather than any line that starts a function.

rdar://9852092

llvm-svn: 154120
2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen 37492eac8c Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IC update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

llvm-svn: 154119
2012-04-05 20:30:20 +00:00
Owen Anderson a6eebf6013 Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection.
llvm-svn: 154113
2012-04-05 18:50:32 +00:00
Pete Cooper d7290700e6 REG_SEQUENCE expansion to COPY instructions wasn't taking account of sub register indices on the source registers. No simple test case
llvm-svn: 154051
2012-04-04 21:03:25 +00:00
Pete Cooper 8a3dc0ed8c f16 FREM can now be legalized by promoting to f32
llvm-svn: 154039
2012-04-04 19:36:31 +00:00
Jakob Stoklund Olesen 92fd79a639 Remove spurious debug output.
llvm-svn: 154032
2012-04-04 18:23:38 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Craig Topper 4c7d995029 Remove default case from switch that was already covering all cases.
llvm-svn: 153996
2012-04-04 04:42:42 +00:00
Pete Cooper e7bff68a5e Removed useless switch for default case when switch was covering all the enum values
llvm-svn: 153984
2012-04-04 00:53:04 +00:00
Pete Cooper 9511ec86f9 Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
llvm-svn: 153976
2012-04-03 22:57:55 +00:00
Pete Cooper b98934cf72 Removed one last bad continue statement meant to be removed in r153914.
llvm-svn: 153975
2012-04-03 22:18:49 +00:00
Chad Rosier 2a02fe1bb2 Fix an issue in SimplifySetCC() specific to vector comparisons.
When folding X == X we need to check getBooleanContents() to determine if the
result is a vector of ones or a vector of negative ones. 

I tried creating a test case, but the problem seems to only be exposed on a
much older version of clang (around r144500).
rdar://10923049

llvm-svn: 153966
2012-04-03 20:11:24 +00:00
Eric Christopher b81e2b403c Fix thinko check for number of operands to be the one that actually
might have more than 19 operands. Add a testcase to make sure I
never screw that up again.

Part of rdar://11026482

llvm-svn: 153961
2012-04-03 17:55:42 +00:00
Eric Christopher 34164196af Add a line number for the scope of the function (starting at the first
brace) so that we get more accurate line number information about the
declaration of a given function and the line where the function
first starts.

Part of rdar://11026482

llvm-svn: 153916
2012-04-03 00:43:49 +00:00
Pete Cooper 4f0dbb27d9 Fixes to r153903. Added missing explanation of behaviour when the VirtRegMap is NULL. Also changed it in this case to just avoid updating the map, but live ranges or intervals will still get updated and created
llvm-svn: 153914
2012-04-03 00:28:46 +00:00
Pete Cooper 3ca96f9950 Moved LiveRangeEdit.h so that it can be called from other parts of the backend, not just libCodeGen
llvm-svn: 153906
2012-04-02 22:44:18 +00:00
Jakob Stoklund Olesen 291007b055 Allocate virtual registers in ascending order.
This is just the fallback tie-breaker ordering, the main allocation
order is still descending size.

Patch by Shamil Kurmangaleev!

llvm-svn: 153904
2012-04-02 22:30:39 +00:00
Pete Cooper 2bde2f42b1 Refactored the LiveRangeEdit interface so that MachineFunction, TargetInstrInfo, MachineRegisterInfo, LiveIntervals, and VirtRegMap are all passed into the constructor and stored as members instead of passed in to each method.
llvm-svn: 153903
2012-04-02 22:22:53 +00:00
Owen Anderson 98f2c0c384 Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901
2012-04-02 22:10:29 +00:00
Lang Hames aaafacd07e During two-address lowering, rescheduling an instruction does not untie
operands. Make TryInstructionTransform return false to reflect this.
Fixes PR11861.

llvm-svn: 153892
2012-04-02 19:58:43 +00:00
Eric Christopher ad9fe8955a Turn on the accelerator tables for Darwin.
llvm-svn: 153880
2012-04-02 17:58:52 +00:00
Nadav Rotem 702f080767 Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.

llvm-svn: 153864
2012-04-02 07:11:12 +00:00
Craig Topper 54bfde79db Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo.
llvm-svn: 153860
2012-04-02 06:09:36 +00:00
Nadav Rotem b078350872 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Lang Hames 652f21274f Fix typo.
llvm-svn: 153846
2012-04-01 19:27:25 +00:00
Andrew Trick 779b32a44e misched: Add finalizeScheduler to complete the target interface.
llvm-svn: 153827
2012-04-01 07:24:23 +00:00
Rafael Espindola 80c540e656 Teach CodeGen's version of computeMaskedBits to understand the range metadata.
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.

llvm-svn: 153817
2012-03-31 18:14:00 +00:00
Bill Wendling 9f829f1cc4 If we have a VLA that has a "use" in a metadata node that's then used
here but it has no other uses, then we have a problem. E.g.,

  int foo (const int *x) {
    char a[*x];
    return 0;
  }

If we assign 'a' a vreg and fast isel later on has to use the selection
DAG isel, it will want to copy the value to the vreg. However, there are
no uses, which goes counter to what selection DAG isel expects.
<rdar://problem/11134152>

llvm-svn: 153705
2012-03-30 00:02:55 +00:00
Eric Christopher 70e1bd8872 Add support for objc property decls according to the page at:
http://llvm.org/docs/SourceLevelDebugging.html#objcproperty

including type and DECL. Expand the metadata needed accordingly.

rdar://11144023

llvm-svn: 153639
2012-03-29 08:42:56 +00:00
Jakob Stoklund Olesen c3e80cc885 Enable machine code verification in the entire code generator.
Some targets still mess up the liveness information, but that isn't
verified after MRI->invalidateLiveness().

The verifier can still check other useful things like register classes
and CFG, so it should be enabled after all passes.

llvm-svn: 153615
2012-03-28 23:54:28 +00:00
Jakob Stoklund Olesen d1bd8fba13 Enable machine code verification after PreSched2 passes.
The late scheduler depends on accurate liveness information if it is
breaking anti-dependencies, so we should be able to verify it.

Relax the terminator checking in the machine code verifier so it can
handle the basic blocks created by if conversion.

llvm-svn: 153614
2012-03-28 23:31:15 +00:00
Jakob Stoklund Olesen e433c68d7c Also verify after ExpandPostRAPseudos.
llvm-svn: 153599
2012-03-28 20:49:30 +00:00
Jakob Stoklund Olesen 341e06f8d5 Enable machine code verification after the late machine optimization passes.
Branch folding invalidates liveness and disables liveness verification
on some targets.

llvm-svn: 153597
2012-03-28 20:47:37 +00:00
Jakob Stoklund Olesen b21df32cf5 Skip liveness verification when MRI->tracksLiveness() is false.
Extract the liveness verification into its own method.

This makes it possible to run the machine code verifier after liveness
information is no longer required to be valid.

llvm-svn: 153596
2012-03-28 20:47:35 +00:00
Jakob Stoklund Olesen 8e58c90f51 Allow removeLiveIn to be called with a register that isn't live-in.
This avoids the silly double search:

  if (isLiveIn(Reg))
    removeLiveIn(Reg);

llvm-svn: 153592
2012-03-28 20:11:42 +00:00
Pete Cooper 148ebb8802 Fixed commuteInstructions bug where if its called pre-regalloc the subreg indices weren't commuted
llvm-svn: 153579
2012-03-28 17:02:22 +00:00
Eric Christopher 24a6298512 More debug output.
llvm-svn: 153571
2012-03-28 07:34:36 +00:00
Eric Christopher 7285c7d51d Fix the output of the DW_TAG_friend tag to include DW_AT_friend
and not the rest of the member tag.

Fixes PR11695

llvm-svn: 153570
2012-03-28 07:34:31 +00:00
Lang Hames 5544bf1b8a Use a SmallVector and linear lookup instead of a DenseSet - SourceMap values
will always be tiny sets, so DenseSet is overkill (SmallSet won't work as we
need iteration support). 

llvm-svn: 153529
2012-03-27 19:10:45 +00:00
Eric Christopher 7ed2efca6a Use DW_AT_low_pc for a single entry point into a routine.
Fixes PR10105

llvm-svn: 153524
2012-03-27 18:35:54 +00:00
Jakob Stoklund Olesen 6c08534aff Print SSA and liveness tracking flags in MF::print().
llvm-svn: 153518
2012-03-27 17:17:16 +00:00
Jakob Stoklund Olesen d1664a1571 Branch folding may invalidate liveness.
Branch folding can use a register scavenger to update liveness
information when required. Don't do that if liveness information is
already invalid.

llvm-svn: 153517
2012-03-27 17:06:09 +00:00
Chris Lattner 1cc25e8a40 fix what looks like a real logic bug, found by PVS-Studio (part of PR12357)
llvm-svn: 153513
2012-03-27 16:27:21 +00:00
Jakob Stoklund Olesen 9c1ad5cb7d Add an MRI::tracksLiveness() flag.
Late optimization passes like branch folding and tail duplication can
transform the machine code in a way that makes it expensive to keep the
register liveness information up to date. There is a fuzzy line between
register allocation and late scheduling where the liveness information
degrades.

The MRI::tracksLiveness() flag makes the line clear: While true,
liveness information is accurate, and can be used for register
scavenging. Once the flag is false, liveness information is not
accurate, and can only be used as a hint.

Late passes generally don't need the liveness information, but they will
sometimes use the register scavenger to help update it. The scavenger
enforces strict correctness, and we have to spend a lot of code to
update register liveness that may never be used.

llvm-svn: 153511
2012-03-27 15:13:58 +00:00
Evan Cheng 7fede87349 Post-ra LICM should take care not to hoist an instruction that would clobber a
register that's read by the preheader terminator.

rdar://11095580

llvm-svn: 153492
2012-03-27 01:50:58 +00:00
Lang Hames 551662bf5d During MachineCopyPropagation a register may be the source operand of multiple
copies being considered for removal. Make sure to track all of the copies,
rather than just the most recent encountered, by holding a DenseSet instead of
an unsigned in SrcMap.

No test case - couldn't reduce something with a sane size.

llvm-svn: 153487
2012-03-27 00:44:47 +00:00
Lang Hames 95e021faf5 Add a debug option to dump PBQP graphs during register allocation.
llvm-svn: 153483
2012-03-26 23:07:23 +00:00
Eric Christopher 0925c62c74 Use the file in the inlined die rather than the compile unit for
backtrace locations.

Testcase forthcoming, but I wanted to get some testing here.

Should fix:

PR12323
PR12314
rdar://11091100

llvm-svn: 153471
2012-03-26 21:38:38 +00:00