Commit Graph

13970 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen 4021a7bf25 Add a -new-live-intervals experimental option.
This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.

llvm-svn: 160892
2012-07-27 20:58:46 +00:00
Jakob Stoklund Olesen bc65e8f94e Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

llvm-svn: 160888
2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen 35400b1dda Use an otherwise unused variable.
llvm-svn: 160798
2012-07-26 19:42:56 +00:00
Jakob Stoklund Olesen f9029fef2a Start scaffolding for a MachineTraceMetrics analysis pass.
This is still a work in progress.

Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.

The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:

- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.

These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.

Initially, this will be used by the early if-conversion pass.

llvm-svn: 160796
2012-07-26 18:38:11 +00:00
Dan Gohman 0b3d782933 Add a floor intrinsic.
llvm-svn: 160791
2012-07-26 17:43:27 +00:00
Manman Ren cc1dc6dc11 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.

llvm-svn: 160749
2012-07-25 18:28:13 +00:00
Jakob Stoklund Olesen cef9a618b1 Preserve 2-addr constraints in ConnectedVNInfoEqClasses.
When a live range splits into multiple connected components, we would
arbitrarily assign <undef> uses to component 0. This is wrong when the
use is tied to a def that gets assigned to a different component:

  %vreg69<def> = ADD8ri %vreg68<undef>, 1

The use and def must get the same virtual register.

Fix this by assigning <undef> uses to the same component as the value
defined by the instruction, if any:

  %vreg69<def> = ADD8ri %vreg69<undef>, 1

This fixes PR13402. The PR has a test case which I am not including
because it is unlikely to keep exposing this behavior in the future.

llvm-svn: 160739
2012-07-25 17:15:15 +00:00
Jakob Stoklund Olesen c6fd3deee6 Verify two-address constraints more carefully.
Include <undef> operands and virtual registers after leaving SSA form.

llvm-svn: 160734
2012-07-25 16:49:11 +00:00
Craig Topper 17300940ae Change llvm_unreachable in SplitVectorOperand to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type.
llvm-svn: 160661
2012-07-24 04:11:21 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Nadav Rotem 9056076cab Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619
2012-07-23 07:59:50 +00:00
Craig Topper 2694c05e86 Tidy up. Fix indentation and remove trailing whitespace.
llvm-svn: 160617
2012-07-23 05:38:07 +00:00
Craig Topper b49546a3b3 Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. For instance 256-bit AVX intrinsics without having AVX enabled.
llvm-svn: 160616
2012-07-23 04:34:49 +00:00
Benjamin Kramer 5be8f60126 Remove unused private member variables uncovered by the recent changes to clang's -Wunused-private-field.
llvm-svn: 160583
2012-07-20 22:05:57 +00:00
Jakob Stoklund Olesen e2cfd0d45a Avoid folding loads that are unsafe to move.
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

llvm-svn: 160575
2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen f62c07f147 Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

llvm-svn: 160571
2012-07-20 20:49:53 +00:00
Pete Cooper dcf94db677 Fix crash in machine verifier when trying to print the def of a register which has no def
llvm-svn: 160531
2012-07-19 23:40:38 +00:00
Benjamin Kramer f364a63c3e Replace some explicit compare loops with std::equal.
No functionality change.

llvm-svn: 160501
2012-07-19 10:46:05 +00:00
Galina Kistanova aaf9735951 Fixed few warnings.
llvm-svn: 160493
2012-07-19 04:50:12 +00:00
Bill Wendling d163405df8 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Chandler Carruth 985454e0ac Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

llvm-svn: 160443
2012-07-18 18:58:22 +00:00
Nuno Lopes 2151497dca ignore 'invoke @llvm.donothing', but still keep the edge to the continuation BB
llvm-svn: 160411
2012-07-18 00:07:17 +00:00
Evan Cheng e6a3b03ee0 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Jakob Stoklund Olesen 0ef031186c Add some trace output to TwoAddressInstructionPass.
llvm-svn: 160380
2012-07-17 17:57:23 +00:00
Benjamin Kramer 7c1598caaa Remove unused variable.
llvm-svn: 160372
2012-07-17 17:00:11 +00:00
Nadav Rotem 277a40bc0a Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.

llvm-svn: 160357
2012-07-17 09:07:37 +00:00
Evan Cheng 780f9b5f92 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng 47d7be9578 Make sure constant bitwidth is <= 64 bit before calling getSExtValue().
llvm-svn: 160350
2012-07-17 07:47:50 +00:00
Evan Cheng f579beca6d This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Nadav Rotem 60f7904db7 Minor cleanup and docs.
llvm-svn: 160311
2012-07-16 18:56:39 +00:00
Nadav Rotem 839a06e9d7 Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305
2012-07-16 18:34:53 +00:00
Nadav Rotem 3050e07108 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235
2012-07-15 20:39:08 +00:00
Nadav Rotem a62368c965 Refactor the code that checks that all operands of a node are UNDEFs.
Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160229
2012-07-15 08:38:23 +00:00
Chandler Carruth db5536f09d Reapply r160194, switching to use LV information for finding local kills.
The notable fix is to look at any dependencies attached to the kill
instruction (or other instructions between MI nad the kill) where the
dependencies are specific to the register in question.

The old code implicitly handled this by rejecting the transform if *any*
other uses were found within the block, but after the start point. The
new code directly finds the kill, and has to re-use the existing
dependency scan to check for non-kill uses.

This was caught by self-host, but I found the bug via inspection and use
of absurd assert scaffolding to compute the kills in two ways and
compare them. So I have no useful testcase for this other than
"bootstrap". I'd work harder to reduce a test case if this particular
code were likely to live for a long time.

Thanks to Benjamin Kramer for reviewing the fix itself.

llvm-svn: 160228
2012-07-15 03:29:46 +00:00
Nadav Rotem 018921002e Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Jakob Stoklund Olesen 8f324a2cc8 Account for early-clobber reload instructions.
No test case, there are no in-tree targets that require this.

llvm-svn: 160219
2012-07-14 18:45:35 +00:00
Jakob Stoklund Olesen 3d604ab933 Be more verbose when detecting dominance problems.
Catch uses of undefined physregs that haven't been added to basic block
live-in lists. Run the verifier to pinpoint the problem.

Also run the verifier when a virtual register use is not jointly
dominated by defs.

llvm-svn: 160207
2012-07-13 23:39:05 +00:00
Chandler Carruth 9c97cd5672 Revert r160194, which switched to use LV information for finding local
kills.

This is causing miscompiles that I'm working on tracking down.

llvm-svn: 160196
2012-07-13 22:23:32 +00:00
Chandler Carruth 58c470dc68 Use the LiveVariables information to efficiently get local kills. This
removes the largest scaling problem in the test cases from PR13225 when
ASan is switched to insert basic blocks in the natural CFG order.

It may also solve some scaling problems for more normal code with large
numbers of basic blocks and variables.

llvm-svn: 160194
2012-07-13 21:18:38 +00:00
Jim Grosbach 1af8c8060c Provide function name in 'Cannot select' fatal error.
When dumping the DAG for a fatal 'Cannot select' back-end error, also
provide the name of the function the construct is in. Useful when dealing
with large testcases, as the next step is to llvm-extract the function
in question to get a small(er) testcase.

llvm-svn: 160152
2012-07-13 00:29:09 +00:00
Eric Christopher bf57091f8b The end of the prologue should be marked with is_stmt.
Fixes PR13303.

Patch by Paul Robinson!

llvm-svn: 160148
2012-07-12 23:30:25 +00:00
Duncan Sands 671cc2575d The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector, it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116
2012-07-12 09:01:35 +00:00
Evan Cheng b17122859b InstrEmitter::EmitSubregNode() optimize extract_subreg in this case:
r1025 = s/zext r1024, 4
r1026 = extract_subreg r1025, 4

to a copy:
r1026 = copy r1024

This is correct. However it uses TII->isCoalescableExtInstr() which can return
true for instructions which essentially does a sext_in_reg so this can end up
with an illegal copy where the source and destination register classes do not
match. Add a check to avoid it. Sorry, no test case possible at this time.

rdar://11849816

llvm-svn: 160059
2012-07-11 18:55:07 +00:00
Nadav Rotem 2a148668b6 Rename many of the Tmp1, Tmp2, Tmp3 variables to names such as Chain, Value, Ptr, etc.
No functionality change.

llvm-svn: 160042
2012-07-11 11:02:16 +00:00
Benjamin Kramer 9488100d46 Remove unused variable.
llvm-svn: 160040
2012-07-11 09:39:04 +00:00
Nadav Rotem de6fd282ef Refactor the DAG Legalizer by extracting the legalization of
Load and Store nodes into their own functions.
No functional change.

llvm-svn: 160037
2012-07-11 08:52:09 +00:00
Owen Anderson b8844d6744 Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.

llvm-svn: 160036
2012-07-11 06:38:55 +00:00
Jakob Stoklund Olesen bc90a4ea82 Require and preserve LoopInfo for early if-conversion.
It will surely be needed by heuristics.

llvm-svn: 160027
2012-07-10 22:39:56 +00:00
Chandler Carruth 2207f76cd4 Teach the LiveInterval::join function to use the fast merge algorithm,
generalizing its implementation sufficiently to support this value
number scenario as well.

This cuts out another significant performance hit in large functions
(over 10k basic blocks, etc), especially those with "natural" CFG
structures.

llvm-svn: 160026
2012-07-10 22:25:21 +00:00
Jakob Stoklund Olesen 02638392c1 Run early if-conversion in domtree post-order.
This ordering allows nested if-conversion without using a work list, and
it makes it possible to update the dominator tree on the fly as well.

Any erased basic blocks will always be dominated by the current
post-order position, so the domtree can be pruned without invalidating
the iterator.

llvm-svn: 160025
2012-07-10 22:18:23 +00:00
Chandler Carruth 77d940011d Fix a bug where I didn't test for an empty range before inspecting the
back of it.

I don't have anything even remotely close to a test case for this. It
only broke two build bots, both of them doing bootstrap builds, one of
them a dragonegg bootstrap. It doesn't break for me when I bootstrap
either. It doesn't reproduce every time or on many machines during the
bootstrap. Many thanks to Duncan Sands who got the exact command (and
stage of the bootstrap) which failed on the dragonegg bootstrap and
managed to get it to trigger under valgrind with debug symbols. The fix
was then found by inspection.

llvm-svn: 159993
2012-07-10 15:41:33 +00:00
Nadav Rotem d908ddc186 Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Chandler Carruth e18614dd17 Add an efficient merge operation to LiveInterval and use it to avoid
quadratic behavior when performing pathological merges. Fixes the core
element of PR12652.

There is only one user of addRangeFrom left: join. I'm hoping to
refactor further in a future patch and have join use this merge
operation as well.

llvm-svn: 159982
2012-07-10 05:16:17 +00:00
Chandler Carruth ac766b9b42 Teach LiveIntervals how to verify themselves and start using it in some
of the trick merge routines. This adds a layer of testing that was
necessary when implementing more efficient (and complex) merge logic for
this datastructure.

No functionality changed here.

llvm-svn: 159981
2012-07-10 05:06:03 +00:00
Andrew Trick c50f06487c indentation
llvm-svn: 159958
2012-07-09 20:43:01 +00:00
Owen Anderson d4b841f8f9 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
Andrew Trick 87255e340e I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraties for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex contraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Chad Rosier 879c34f45a Whitespace.
llvm-svn: 159839
2012-07-06 17:44:22 +00:00
Chad Rosier 88d53eae56 [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
llvm-svn: 159837
2012-07-06 17:33:39 +00:00
Alexey Samsonov 39602781f6 Fix PR13202 and a regtest.
DwarfDebug class could generate the same (inlined) DIVariable twice:
1) when trying to find abstract debug variable for a concrete inlined instance.
2) when explicitly collecting info for variables that were optimized out.

This change makes sure that this duplication won't happen and makes
Clang pass "gdb.opt/inline-locals" test from gdb testsuite.

Reviewed by Eric Christopher.

llvm-svn: 159811
2012-07-06 08:45:08 +00:00
Jakob Stoklund Olesen 3f1bb93cab Add some comments suggested in code review.
llvm-svn: 159800
2012-07-06 02:31:22 +00:00
Chandler Carruth 1088676476 Optimize extendIntervalEndTo a tiny bit by saving one call through the
vector erase. No functionality changed.

llvm-svn: 159746
2012-07-05 12:40:45 +00:00
Chandler Carruth 264854f9a0 Finish fixing the MachineOperand hashing, providing a nice modern
hash_value overload for MachineOperands. This addresses a FIXME
sufficient for me to remove it, and cleans up the code nicely too.

The important changes to the hashing logic:
- TargetFlags are now included in all of the hashes. These were complete
  missed.
- Register operands have their subregisters and whether they are a def
  included in the hash.
- We now actually hash all of the operand types. Previously, many
  operand types were simply *dropped on the floor*. For example:
  - Floating point immediates
  - Large integer immediates (>64-bit)
  - External globals!
  - Register masks
  - Metadata operands
- It removes the offset from the block-address hash; I'm a bit
  suspicious of this, but isIdenticalTo doesn't consider the offset for
  black addresses.

Any patterns involving these entities could have triggered extreme
slowdowns in MachineCSE or PHIElimination. Let me know if there are PRs
you think might be closed now... I'm looking myself, but I may miss
them.

llvm-svn: 159743
2012-07-05 11:06:22 +00:00
Duncan Sands 71dacd09fe All cases are covered, no need for a default. This deals with the
corresponding clang warning.

llvm-svn: 159742
2012-07-05 10:14:33 +00:00
Chandler Carruth 1d5d23106e The hash function for MI expressions, used by MachineCSE, is really
broken. This patch fixes the superficial problems which lead to the
intractably slow compile times reported in PR13225.

The specific issue is that we were failing to include the *offset* of
a global variable in the hash code. Oops. This would in turn cause all
MIs which were only distinguishable due to operating on different
offsets of a global variable to produce identical hash functions. In
some of the test cases attached to the PR I saw hash table activity
where there were O(1000) probes-per-lookup *on average*. A very few
entries were responsible for most of these probes.

There is still quite a bit more to do here. The ad-hoc layering of data
in MachineOperands makes them *extremely* brittle to hash correctly.
We're missing quite a few other cases, the only ones I've fixed here are
the specific MO types which were allowed through the assert() in
getOffset().

llvm-svn: 159741
2012-07-05 10:03:57 +00:00
Duncan Sands 0552a2cad2 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Nick Lewycky 765c699370 Remove ParentMap. You can just ask the domnode for its parent. No functionality
change.

Move the "Not profitable, avoid CSE!" debug message next to where we fail the
check for profitability and use a different message for avoiding CSE due to
being in different register classes.

llvm-svn: 159729
2012-07-05 06:19:21 +00:00
Jakob Stoklund Olesen c300ef0e50 Allow trailing physreg RegisterSDNode operands on non-variadic instructions.
Also allow trailing register mask operands on non-variadic both
MachineSDNodes and MachineInstrs.

The extra physreg RegisterSDNode operands are added to the MI as
<imp-use> operands. This makes it possible to have non-variadic call
instructions.

Call and return instructions really are non-variadic, the argument
registers should only be used implicitly - they are not part of the
encoding.

llvm-svn: 159727
2012-07-04 23:53:23 +00:00
Jakob Stoklund Olesen adb50a7a09 Print SlotIndexes when available for -print-machineinstrs.
llvm-svn: 159726
2012-07-04 23:53:19 +00:00
Jakob Stoklund Olesen 2d827d628e Allow multiple terminators to read virtual registers.
Find the kill as the last terminator to read SrcReg.

Patch by Philipp Brüschweiler!

llvm-svn: 159722
2012-07-04 19:52:05 +00:00
Jakob Stoklund Olesen 29506f5e6d Make sure -print-machineinstrs applies to the first pass as well.
llvm-svn: 159720
2012-07-04 19:28:27 +00:00
Stepan Dyatkovskiy 7ff588f986 Reverted r156659, due to probable performance regressions, DenseMap should be used here:
IntegersSubsetMapping
  - Replaced type of Items field from std::list with std::map. In neares future I'll test it with DenseMap and do the correspond replacement
    if possible.

llvm-svn: 159703
2012-07-04 05:53:05 +00:00
Eric Christopher ef9d710ea6 Reduce some code duplication.
llvm-svn: 159701
2012-07-04 02:02:18 +00:00
Matt Beaumont-Gay 11d08b2e22 Fix some ascii art in a comment to not have trailing backslashes (inspiration
from IfConversion.cc), and fix some spelling and grammar in the surrounding
prose.

llvm-svn: 159699
2012-07-04 01:09:45 +00:00
Jakob Stoklund Olesen f8a63a1507 Add an experimental early if-conversion pass, off by default.
This pass performs if-conversion on SSA form machine code by
speculatively executing both sides of the branch and using a cmov
instruction to select the result. This can help lower the number of
branch mispredictions on architectures like x86 that don't have
predicable instructions.

The current implementation is very aggressive, and causes regressions on
mosts tests. It needs good heuristics that have yet to be implemented.

llvm-svn: 159694
2012-07-04 00:09:54 +00:00
Stepan Dyatkovskiy 8b0c97e0dd Part of r159527. Splitted into series of patches and gone with fixed PR13256:
IntegersSubsetMapping
  - Replaced type of Items field from std::list with std::map. In neares future I'll test it with DenseMap and do the correspond replacement
    if possible.

llvm-svn: 159659
2012-07-03 13:46:45 +00:00
Eric Christopher b65acc61a5 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Chandler Carruth 34263a0c95 All glory to address sanitizer. ;]
It appears to have caught a use-after-free introduced as by r159567
and/or friends which call 'addPass' from many more places. The bug in
'addPass' doesn't appear to be new, and was spotted by inspection when
ASan shown a bright light of a stacktrace at these functions.

Hopefully this will fix the ASan failure -- I have no test case other
than running an ASan-built clang over the test suite.

llvm-svn: 159614
2012-07-02 22:56:41 +00:00
Evan Cheng 39e90029a2 Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
llvm-svn: 159611
2012-07-02 22:39:56 +00:00
Andrew Trick 2f26b34806 misched: allow NULL InstrItineraries.
llvm-svn: 159599
2012-07-02 21:55:12 +00:00
Eric Christopher dd8638fb3e Turn an assert into an error to make it a bit more friendly.
Part of rdar://6880388 and rdar://11766377

llvm-svn: 159590
2012-07-02 21:16:43 +00:00
Bob Wilson cac3b90633 Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

llvm-svn: 159570
2012-07-02 19:48:45 +00:00
Bob Wilson a3f9fa710a Move assertion with TargetPassConfig's Initialized flag.
llvm-svn: 159569
2012-07-02 19:48:39 +00:00
Bob Wilson b9b693650a Consistently use AnalysisID types in TargetPassConfig.
This makes it possible to just use a zero value to represent "no pass", so
the phony NoPassID global variable is no longer needed.

llvm-svn: 159568
2012-07-02 19:48:37 +00:00
Bob Wilson bbd38dd9c0 Add all codegen passes to the PassManager via TargetPassConfig.
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging.  No functionality change.

llvm-svn: 159567
2012-07-02 19:48:31 +00:00
Manman Ren 72098b2c91 Added assertion in getVRegDef of MachineRegisterInfo to make sure the virtual
register does not have multiple definitions. Modified TwoAddressInstructionPass
to use getUniqueVRegDef instead of getVRegDef.

llvm-svn: 159545
2012-07-02 18:55:36 +00:00
Andrew Trick f161e391f8 Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary."
Reapplies r159406 with minor cleanup. The regressions appear to have been spurious.

llvm-svn: 159541
2012-07-02 18:10:42 +00:00
Stepan Dyatkovskiy 8b9ecca42d IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ragnes compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Rafael Espindola a77d31d7fd Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

llvm-svn: 159509
2012-07-01 17:08:01 +00:00
Rafael Espindola efab16d43b Handle implicit_defs in the register coalescer. I am still trying to produce
a reduced testcase, but this fixes pr13209.

llvm-svn: 159479
2012-06-30 01:45:55 +00:00
Manman Ren 6fa76dc0e0 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.

llvm-svn: 159465
2012-06-29 21:33:59 +00:00
Jakob Stoklund Olesen 3e3cdecf98 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Jakob Stoklund Olesen da9ea1d6bc Check for extra kill flags on live-out virtual registers.
This would previously get reported as the misleading "Virtual register
def doesn't dominate all uses."

llvm-svn: 159460
2012-06-29 21:00:00 +00:00
Manman Ren c146589aa4 Add getUniqueVRegDef to MachineRegisterInfo.
This comes in handy during peephole optimization.

llvm-svn: 159453
2012-06-29 19:16:05 +00:00
Alexey Samsonov 6e7e6b646b Cleanup in DwarfDebug - fix a typo and remove two unused functions
llvm-svn: 159433
2012-06-29 16:04:14 +00:00
Chandler Carruth aafe0918bc Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h
This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.

I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.

I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.

Thanks to Bill and Eric for giving the green light for this bit of cleanup.

llvm-svn: 159421
2012-06-29 12:38:19 +00:00
Bill Wendling f799efdedc The DIBuilder class is just a wrapper around debug info creation
(a.k.a. MDNodes). The module doesn't belong in Analysis. Move it to the VMCore
instead.

llvm-svn: 159414
2012-06-29 08:32:07 +00:00
Andrew Trick 51a8cf77b8 Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."
This reverts commit r159406. I noticed a performance regression so I'll back out for now.

llvm-svn: 159411
2012-06-29 07:10:41 +00:00
Andrew Trick 8c9e6728b3 misched: avoid scheduling instructions that can't be dispatched.
llvm-svn: 159408
2012-06-29 03:23:24 +00:00
Andrew Trick ce27bb999d misched: count micro-ops toward the issue limit.
llvm-svn: 159407
2012-06-29 03:23:22 +00:00
Andrew Trick 1f50152b2d Make NumMicroOps a variable in the subtarget's instruction itinerary.
The TargetInstrInfo::getNumMicroOps API does not change, but soon it
will be used by MachineScheduler. Now each subtarget can specify the
number of micro-ops per itinerary class. For ARM, this is currently
always dynamic (-1), because it is used for load/store multiple which
depends on the number of register operands.

Zero is now a valid number of micro-ops. This can be used for
nop pseudo-instructions or instructions that the hardware can squash
during dispatch.

llvm-svn: 159406
2012-06-29 03:23:18 +00:00
Nuno Lopes ec9653b363 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Jim Grosbach e0c10d8b86 'Promote' vector [su]int_to_fp should widen elements.
Teach vector legalization how to honor Promote for int to float
conversions. The code checking whether to promote the operation knew
to look at the operand, but the actual promotion code didn't. This
fixes that. The operand is promoted up via [zs]ext.

rdar://11762659

llvm-svn: 159378
2012-06-28 21:03:44 +00:00
Bill Wendling e38859dc8e Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Jakob Stoklund Olesen 59a0d3243b Allow targets to inject passes before the virtual register rewriter.
Such passes can be used to tweak the register assignments in a
target-dependent way, for example to avoid write-after-write
dependencies.

llvm-svn: 159209
2012-06-26 17:09:29 +00:00
Chandler Carruth 9139f44d23 Update a bunch of stale comments that dated from when this folled the
very first (and worst) placement algorithm. These should now more
accurately reflect the reality of the pass.

llvm-svn: 159185
2012-06-26 05:16:37 +00:00
Andrew Trick fb2ba3e1cb Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Evan Cheng 4c6f917d34 Make sure type is not extended or untyped before create a constant of the type. No test case. Found by inspection.
llvm-svn: 159179
2012-06-26 01:19:33 +00:00
Jakob Stoklund Olesen a57fc12ec9 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instruction in predecessor blocks.

llvm-svn: 159150
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen eb49566447 Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149
2012-06-25 18:12:18 +00:00
Jakob Stoklund Olesen 70ed924e18 Teach PHIElimination to handle <undef> operands.
When a PHI use is <undef>, don't emit a copy in the predecessor block,
but insert an IMPLICIT_DEF instruction instead. This ensures that
virtual register uses are always jointly dominated by defs, even if some
of them are IMPLICIT_DEF.

llvm-svn: 159121
2012-06-25 03:36:12 +00:00
Jakob Stoklund Olesen 6b556f824d Handle <undef> operands in TwoAddressInstructionPass.
When the source register to a 2-addr instruction is undefined, there is
no need to attempt any transformations - simply replace the source
register with the destination register.

This also comes up when lowering IMPLICIT_DEF instructions - make sure
the <undef> flag is moved to the new partial register def operand:

  %vreg8<def> = INSERT_SUBREG %vreg9<undef>, %vreg0<kill>, sub_16bit
rewrite undef:
  %vreg8<def> = INSERT_SUBREG %vreg8<undef>, %vreg0<kill>, sub_16bit
convert to:
  %vreg8:sub_16bit<def,read-undef> = COPY %vreg0<kill>

llvm-svn: 159120
2012-06-25 03:27:12 +00:00
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Pete Cooper fe212e762f DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Jakob Stoklund Olesen 502e4c6ac4 Teach LiveVariables to handle <undef> operands.
It's simple: Don't treat <undef> operands as uses, and don't assume a
virtual register has a defining instruction unless a real use has been
seen.

llvm-svn: 159061
2012-06-23 02:23:00 +00:00
Jakob Stoklund Olesen a127fc780a Remove ProcessImplicitDefs.h which was unused.
The ProcessImplicitDefs class can be local to its implementation file.

llvm-svn: 159041
2012-06-22 22:27:36 +00:00
Jakob Stoklund Olesen b033dede17 Also verify the def index for early clobbers.
llvm-svn: 159039
2012-06-22 22:23:58 +00:00
Jakob Stoklund Olesen 4fa84ba8b9 Delete a boring statistic.
llvm-svn: 159030
2012-06-22 20:40:15 +00:00
Jakob Stoklund Olesen c61edda0ab Store live intervals in an IndexedMap.
It is both smaller and faster than DenseMap.

llvm-svn: 159029
2012-06-22 20:37:52 +00:00
Hal Finkel 8db5547252 Revert r158679 - use case is unclear (and it increases the memory footprint).
Original commit message:
    Allow up to 64 functional units per processor itinerary.

    This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
    This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 159027
2012-06-22 20:27:13 +00:00
Jakob Stoklund Olesen 48828bb402 Fix a crash in --debug code.
Don't try to print out the live range of a physreg.

llvm-svn: 159021
2012-06-22 19:51:41 +00:00
Jakob Stoklund Olesen 48a1647c93 Don't depend on live ranges being present.
DBG_VALUE instructions could be referring to non-existing virtual
registers.

llvm-svn: 159020
2012-06-22 18:51:35 +00:00
Jakob Stoklund Olesen 8a833649e5 Simplify handleMove() a bit.
There is no need to check for physreg live ranges. They don't exist any
more.

llvm-svn: 159019
2012-06-22 18:38:57 +00:00
Jakob Stoklund Olesen 37e797fedc Stop computing physreg live ranges.
Everyone is using on-demand regunit ranges now.

llvm-svn: 159018
2012-06-22 18:20:50 +00:00
Jakob Stoklund Olesen bbad269a3e Remove some redundant LIS->hasInterval() checks.
These functions only operate on virtual registers now, and they all have
live ranges.

llvm-svn: 159015
2012-06-22 17:49:44 +00:00
Jakob Stoklund Olesen 7809578cfe Use MRI::isConstantPhysReg() to check remat feasibility.
Don't depend on LiveIntervals::hasInterval() to determine if a physreg
is reserved and constant.

llvm-svn: 159013
2012-06-22 17:31:01 +00:00
Jakob Stoklund Olesen 3244963ecc Use regunit liveness to guide LiveDebugVariables.
This should produce the same results as using physreg liveness directly.

llvm-svn: 159009
2012-06-22 17:15:32 +00:00
Jakob Stoklund Olesen b1b3e4aa58 Remove LiveIntervals::trackingRegUnits().
With regunit liveness permanently enabled, this function would always
return true.

Also remove now obsolete code for checking physreg interference.

llvm-svn: 159006
2012-06-22 16:46:44 +00:00
Rafael Espindola ea59166190 Remove another duplicated variable. We only need one to tell us if the linker
knows dwarf or not.

llvm-svn: 158993
2012-06-22 13:32:49 +00:00
Rafael Espindola d7bdaf5795 Fix a FIXME: DwarfRequiresRelocationForSectionOffset is the same as
DwarfUsesRelocationsAcrossSections.

llvm-svn: 158992
2012-06-22 13:24:07 +00:00
Nick Lewycky 33da33676f Emit relocations for DW_AT_location entries on systems which need it. This is
a recommit of r127757. Fixes PR9493. Patch by Paul Robinson!

llvm-svn: 158957
2012-06-22 01:25:12 +00:00
Lang Hames b8650f106a Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
Jack Carter c457f62033 The inline asm operand modifier 'n' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Negate the immediate constant

Several Architectures such as x86 have local implementations
of operand modifier 'n' which go beyond the above description
slightly. This won't affect them.

Affected files:

    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'n' to the switch cases.

    test/CodeGen/Generic/asm-large-immediate.ll
        Generic compiled test (x86 for me)

    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter
llvm-svn: 158939
2012-06-21 21:37:54 +00:00
Pete Cooper 5b61422d80 Fix potential crash if DAGCombine on stores sees a half type
llvm-svn: 158927
2012-06-21 18:00:39 +00:00
Jack Carter b2fd5f66b4 The inline asm operand modifier 'c' is suppose
to be generic across architectures. It has the
following description in the gnu sources:

    Substitute immediate value without immediate syntax

Several Architectures such as x86 have local implementations
of operand modifier 'c' which go beyond the above description
slightly. To make use of the generic modifiers without overriding
local implementation one can make a call to the base class method
for AsmPrinter::PrintAsmOperand() in the locally derived method's 
"default" case in the switch statement. That way if it is already
defined locally the generic version will never get called.

This change is needed when test/CodeGen/generic/asm-large-immediate.ll
failed on a native Mips board. The test was assuming a generic
implementation was in place.

Affected files:

    lib/Target/Mips/MipsAsmPrinter.cpp:
        Changed the default case to call the base method.
    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'c' to the switch cases.
    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributer: Jack Carter
llvm-svn: 158925
2012-06-21 17:14:46 +00:00
Evan Cheng 8c2ad81238 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the node was replaced in place.
rdar://11714607

llvm-svn: 158900
2012-06-21 05:56:05 +00:00
Jakob Stoklund Olesen 58713de545 Update regunits in RegisterCoalescer::reMaterializeTrivialDef.
Old code would only update physreg live intervals.

llvm-svn: 158881
2012-06-21 00:09:15 +00:00
Jakob Stoklund Olesen 37a1338a16 Remove spurious typedefs.
llvm-svn: 158878
2012-06-20 23:54:18 +00:00
Jakob Stoklund Olesen 1911a0203d Remove the RenderMachineFunction HTML output pass.
I don't think anyone has been using this functionality for a while, and
it is getting in the way of refactoring now.

llvm-svn: 158876
2012-06-20 23:47:58 +00:00
Jakob Stoklund Olesen 51c63e64e3 Remove the -live-regunits command line option.
Register allocators depend on it being permanently enabled now.

llvm-svn: 158873
2012-06-20 23:31:34 +00:00
Jakob Stoklund Olesen 781e0b9fd7 Fix some more LiveInterval enumerations.
Deterministically enumerate the virtual registers instead.

llvm-svn: 158872
2012-06-20 23:23:59 +00:00
Jakob Stoklund Olesen 2d2dec96e0 Remove LiveIntervalUnions from RegAllocBase.
They are living in LiveRegMatrix now.

llvm-svn: 158868
2012-06-20 22:52:29 +00:00
Jakob Stoklund Olesen 96eebf0b14 Convert RAGreedy to LiveRegMatrix interference checking.
Stop depending on the LiveIntervalUnions in RegAllocBase, they are about
to be removed.

The changes are mostly replacing register alias iterators with regunit
iterators, and querying LiveRegMatrix instrad of RegAllocBase.

InterferenceCache is converted to work with per-regunit
LiveIntervalUnions, and it checks fixed regunit interference separately,
using the fixed live intervals provided by LiveIntervalAnalysis.

The local splitting helper calcGapWeights() is also considering fixed
regunit interference which is kept on the side now.

llvm-svn: 158867
2012-06-20 22:52:26 +00:00
Jakob Stoklund Olesen 03b87d5aaa Convert RABasic to using LiveRegMatrix interference checking.
Stop using the LiveIntervalUnions provided by RegAllocBase, they will be
removed soon.

llvm-svn: 158866
2012-06-20 22:52:24 +00:00
Jakob Stoklund Olesen effc6b2d18 Enable register unit liveness by default.
Soon we won't need to compute live intervals for physical registers.

llvm-svn: 158865
2012-06-20 22:52:22 +00:00
Jakob Stoklund Olesen bfa664eaae Teach PBQPBuilder::build() about regunit interference.
Filter out physreg candidates with regunit interferrence.
Also compute regmask interference more efficiently.

llvm-svn: 158864
2012-06-20 22:32:05 +00:00
Jakob Stoklund Olesen a1f43dcdb8 Avoid iterating with LiveIntervals::iterator.
That is a DenseMap iterator keyed by pointers, so the iteration order is
nondeterministic.

I would like to replace the DenseMap with an IndexedMap which doesn't
allow iteration.

llvm-svn: 158856
2012-06-20 21:25:05 +00:00
Pete Cooper fe5b84b404 Add users of a MERGE_VALUE node to the worklist to process again when the node is removed. Sorry, no test case. Foudn it by inspection of the code
llvm-svn: 158839
2012-06-20 19:35:43 +00:00
Jakob Stoklund Olesen 833308d785 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

llvm-svn: 158831
2012-06-20 18:00:57 +00:00
Jakob Stoklund Olesen d702e8fddf Delete dead code.
llvm-svn: 158827
2012-06-20 16:38:50 +00:00
Hal Finkel 8a31138521 Fix DAGCombine to deal with ext-conversion of pre/post_inc loads.
The test case for this will come with the PPC indexed preinc loads commit.

llvm-svn: 158822
2012-06-20 15:42:48 +00:00
Aaron Ballman 421a5ba06d Fixing a compiler warning in MSVC 10.
llvm-svn: 158820
2012-06-20 14:44:44 +00:00
Chandler Carruth c60fbe6b58 Fix two rather subtle internal vs. external linker issues.
I'll admit I'm not entirely satisfied with this change, but it seemed
the cleanest option. Other suggestions quite welcome

The issue is that the traits specializations have static methods which
return the typedef'ed PHI_iterator type. In both the IR and MI layers
this is typedef'ed to a custom iterator class defined in an anonymous
namespace giving the types and the functions returning them internal
linkage. However, because the traits specialization is defined in the
'llvm' namespace (where it has to be, specialized template lives there),
and is in turn used in the templated implementation of the SSAUpdater.
This led to the linkage conflict that Clang now warns about.

The simplest solution to me was just to define the PHI_iterator as
a nested class inside the trait specialization. That way it still
doesn't get scoped widely, it can't be accidentally reused somewhere,
etc. This is a little gross just because nested class definitions are
a little gross, but the alternatives seem more ad-hoc.

llvm-svn: 158799
2012-06-20 08:39:30 +00:00
Andrew Trick ff2ed7b687 A new algorithm for computing LoopInfo. Temporarily disabled.
-stable-loops enables a new algorithm for generating the Loop
forest. It differs from the original algorithm in a few respects:

- Not determined by use-list order.
- Initially guarantees RPO order of block and subloops.
- Linear in the number of CFG edges.
- Nonrecursive.

I didn't want to change the LoopInfo API yet, so the block lists are
still inclusive. This seems strange to me, and it means that building
LoopInfo is not strictly linear, but it may not be a problem in
practice. At least the block lists start out in RPO order now. In the
future we may add an attribute or wrapper analysis that allows other
passes to assume RPO order.

The primary motivation of this work was not to optimize LoopInfo, but
to allow reproducing performance issues by decomposing the compilation
stages. I'm often unable to do this with the current LoopInfo, because
the loop tree order determines Loop pass order. Serializing the IR
tends to invert the order, which reverses the optimization order. This
makes it nearly impossible to debug interdependent loop optimizations
such as LSR.

I also believe this will provide more stable performance results across time.

llvm-svn: 158790
2012-06-20 05:23:33 +00:00
Andrew Trick cda51d430d Move the implementation of LoopInfo into LoopInfoImpl.h.
The implementation only needs inclusion from LoopInfo.cpp and
MachineLoopInfo.cpp. Clients of the interface should only include the
interface. This makes the interface readable and speeds up rebuilds
after modifying the implementation.

llvm-svn: 158787
2012-06-20 03:42:09 +00:00
Jakob Stoklund Olesen 3802bbf35e Add regunit liveness support to LiveIntervals::handleMove().
When LiveIntervals is tracking fixed interference in regunits, make sure
to update those intervals as well. Currently guarded by -live-regunits.

llvm-svn: 158766
2012-06-19 23:50:18 +00:00
Chad Rosier 651f9a485a Tidy up.
llvm-svn: 158762
2012-06-19 23:37:57 +00:00
Chad Rosier 7369692790 Add an ensureMaxAlignment() function to MachineFrameInfo (analogous to
ensureAlignment() in MachineFunction).  Also, drop setMaxAlignment() in
favor of this new function.  This creates a main entry point to setting
MaxAlignment, which will be helpful for future work.  No functionality
change intended.

llvm-svn: 158758
2012-06-19 22:59:12 +00:00
Lang Hames 39fb1d08dc Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen 2db1125b15 80 col.
llvm-svn: 158755
2012-06-19 22:50:53 +00:00
Jakob Stoklund Olesen 0f855e4263 Implement PPCInstrInfo::isCoalescableExtInstr().
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.

This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.

This is related to PR5997.

llvm-svn: 158743
2012-06-19 21:14:34 +00:00
Jakob Stoklund Olesen 8eb9905a7c Style: Don't reuse variables for multiple purposes.
No functional change.

llvm-svn: 158742
2012-06-19 21:10:18 +00:00
Rafael Espindola ca3e0ee8b3 Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00
Hal Finkel 8eac009633 Allow up to 64 functional units per processor itinerary.
This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 158679
2012-06-18 21:08:18 +00:00
Benjamin Kramer b9f84bb0ce Guard private fields that are unused in Release builds with #ifndef NDEBUG.
llvm-svn: 158608
2012-06-16 21:48:13 +00:00
Jakob Stoklund Olesen 38a6fbf933 Remove final verification in RABasic.
We now have a proper machine code verifier pass between register
allocation and rewriting.

llvm-svn: 158577
2012-06-15 23:48:48 +00:00
Jakob Stoklund Olesen 45c1f9976c Print out register number in InlineSpiller.
llvm-svn: 158575
2012-06-15 23:47:09 +00:00
Jakob Stoklund Olesen 13dffcb766 Accept null PhysReg arguments to checkRegMaskInterference.
Calling checkRegMaskInterference(VirtReg) checks if VirtReg crosses any
regmask operands, regardless of the registers they clobber.

llvm-svn: 158563
2012-06-15 22:24:22 +00:00
Bill Wendling 4fd966347a Remove assignments which aren't used afterwards.
llvm-svn: 158535
2012-06-15 19:30:42 +00:00
Jakob Stoklund Olesen 5767ad727c Use regunit liveness in RegisterCoalescer when it is available.
We only do very limited physreg coalescing now, but we still merge
virtual registers into reserved registers.

llvm-svn: 158526
2012-06-15 17:36:48 +00:00
Akira Hatanaka 1b420ac4c8 Make machine verifier check the first instruction of the last bundle instead of
the last instruction of a basic block.

llvm-svn: 158468
2012-06-14 20:51:13 +00:00
Lang Hames a33db65bd9 Make comment slightly more helpful.
llvm-svn: 158467
2012-06-14 20:37:15 +00:00
Andrew Trick 45877fa011 misched: disable SSA check pending PR13112.
llvm-svn: 158461
2012-06-14 17:48:49 +00:00
Andrew Trick 344fb64fa3 sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi".

llvm-svn: 158380
2012-06-13 02:39:03 +00:00
Andrew Trick 5b90645abb sched: Avoid trivially redundant DAG edges. Take the one with higher latency.
llvm-svn: 158379
2012-06-13 02:39:00 +00:00
Andrew Trick 3e465fb225 misched: When querying RegisterPressureTracker, always save current and max pressure.
llvm-svn: 158340
2012-06-11 23:42:23 +00:00
Andrew Trick d054bd833a misched: regpressure getMaxPressureDelta, revert accidental checkin.
llvm-svn: 158339
2012-06-11 23:42:20 +00:00
Benjamin Kramer 0748008df5 Allocate the contents of DwarfDebug's StringMaps in a single big BumpPtrAllocator.
llvm-svn: 158265
2012-06-09 10:34:15 +00:00
Andrew Trick fc8ce08be3 Register pressure: added getPressureAfterInstr.
llvm-svn: 158256
2012-06-09 02:16:58 +00:00
Jakob Stoklund Olesen c26fbbfba5 Sketch a LiveRegMatrix analysis pass.
The LiveRegMatrix represents the live range of assigned virtual
registers in a Live interval union per register unit. This is not
fundamentally different from the interference tracking in RegAllocBase
that both RABasic and RAGreedy use.

The important differences are:

- LiveRegMatrix tracks interference per register unit instead of per
  physical register. This makes interference checks cheaper and
  assignments slightly more expensive. For example, the ARM D7 reigster
  has 24 aliases, so we would check 24 physregs before assigning to one.
  With unit-based interference, we check 2 units before assigning to 2
  units.

- LiveRegMatrix caches regmask interference checks. That is currently
  duplicated functionality in RABasic and RAGreedy.

- LiveRegMatrix is a pass which makes it possible to insert
  target-dependent passes between register allocation and rewriting.
  Such passes could tweak the register assignments with interference
  checking support from LiveRegMatrix.

Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix.

llvm-svn: 158255
2012-06-09 02:13:10 +00:00
Jakob Stoklund Olesen be336295cd Also compute MBB live-in lists in the new rewriter pass.
This deduplicates some code from the optimizing register allocators, and
it means that it is now possible to change the register allocators'
solutions simply by editing the VirtRegMap between the register
allocator pass and the rewriter.

llvm-svn: 158249
2012-06-09 00:14:47 +00:00
Jakob Stoklund Olesen 1224312f5b Reintroduce VirtRegRewriter.
OK, not really. We don't want to reintroduce the old rewriter hacks.

This patch extracts virtual register rewriting as a separate pass that
runs after the register allocator. This is possible now that
CodeGen/Passes.cpp can configure the full optimizing register allocator
pipeline.

The rewriter pass uses register assignments in VirtRegMap to rewrite
virtual registers to physical registers, and it inserts kill flags based
on live intervals.

These finalization steps are the same for the optimizing register
allocators: RABasic, RAGreedy, and PBQP.

llvm-svn: 158244
2012-06-08 23:44:45 +00:00
Evan Cheng c5adccab1a Start implementing pre-ra if-converter: using speculation and selects to eliminate branches.
llvm-svn: 158234
2012-06-08 21:53:50 +00:00
Andrew Trick 423fa6faee TargetInstrInfo hooks implemented in codegen should be declared pure virtual.
llvm-svn: 158233
2012-06-08 21:52:38 +00:00
Andrew Trick 596af1b02e Fix Target->Codegen dependence.
Bulk move of TargetInstrInfo implementation into
TargetInstrInfoImpl. This is dirty because the code isn't part of
TargetInstrInfoImpl class, nor should it be, because the methods are
not target hooks. However, it's the current mechanism for keeping
libTarget useful outside the backend. You'll get a not-so-nice link
error if you invoke a TargetInstrInfo method that depends on CodeGen.

The TargetInstrInfoImpl class should probably be removed since it
doesn't really solve this problem.

To really fix this, we probably need separate interfaces for the
CodeGen/nonCodeGen sides of TargetInstrInfo.

llvm-svn: 158212
2012-06-08 17:23:27 +00:00
Pete Cooper cd72016cab Move terminator machine verification to check MachineBasicBlock::instr_iterator instead of MBB::iterator
llvm-svn: 158154
2012-06-07 17:41:39 +00:00
Manman Ren 9c9641812c Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Jakob Stoklund Olesen 00e7dffefb Properly verify liveness with bundled machine instructions.
Bundles should be treated as one atomic transaction when checking
liveness. That is how the register allocator (and VLIW targets) treats
bundles.

llvm-svn: 158116
2012-06-06 22:34:30 +00:00
Andrew Trick 05ff4667eb Move RegisterClassInfo.h.
Allow targets to access this API. It's required for RegisterPressure.

llvm-svn: 158102
2012-06-06 20:29:31 +00:00
Andrew Trick 88517f608c Move RegisterPressure.h.
Make it a general utility for use by Targets.

llvm-svn: 158097
2012-06-06 19:47:35 +00:00
Benjamin Kramer 009b1c1cf1 Round 2 of dead private variable removal.
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields.
- gtest.

llvm-svn: 158096
2012-06-06 19:47:08 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Jakob Stoklund Olesen f435b1867d Remove dead debug option -disable-rematerialization.
Remat has been stable for years, and it isn't done by
LiveIntervalAnalysis any longer. (See LiveRangeEdit).

llvm-svn: 158079
2012-06-06 16:22:41 +00:00
Benjamin Kramer 3de5d40f4d Stop leaking RegScavengers from TailDuplication.
llvm-svn: 158069
2012-06-06 13:53:41 +00:00
Jakob Stoklund Olesen c141ba584e Move LiveUnionArray into LiveIntervalUnion.h
It is useful outside RegAllocBase.

llvm-svn: 158041
2012-06-05 23:57:30 +00:00
Jakob Stoklund Olesen 46d229c573 Don't print register names in LiveIntervalUnion::print().
Soon we'll be making LiveIntervalUnions for register units as well.

This was the only place using the RepReg member, so just remove it.

llvm-svn: 158038
2012-06-05 23:07:19 +00:00
Matt Beaumont-Gay 7ba769bedd Suppress -Wunused-variable in -Asserts build
llvm-svn: 158037
2012-06-05 23:00:03 +00:00
Jakob Stoklund Olesen f3f7d6f6e2 Simplify LiveInterval::print().
Don't print out the register number and spill weight, making the TRI
argument unnecessary.

This allows callers to interpret the reg field. It can currently be a
virtual register, a physical register, a spill slot, or a register unit.

llvm-svn: 158031
2012-06-05 22:51:54 +00:00
Jakob Stoklund Olesen 12e03dae44 Add experimental support for register unit liveness.
Instead of computing a live interval per physreg, LiveIntervals can
compute live intervals per register unit. This makes impossible the
confusing situation where aliasing registers could have overlapping live
intervals. It should also make fixed interferernce checking cheaper
since registers have fewer register units than aliases.

Live intervals for regunits are computed on demand, using MRI use-def
chains and the new LiveRangeCalc class. Only regunits live in to ABI
blocks are precomputed during LiveIntervals::runOnMachineFunction().

The regunit liveness computations don't depend on LiveVariables.

llvm-svn: 158029
2012-06-05 22:02:15 +00:00
Jakob Stoklund Olesen 989b3b1516 Implement LiveRangeCalc::extendToUses() and createDeadDefs().
These LiveRangeCalc methods are to be used when computing a live range
from scratch.

llvm-svn: 158027
2012-06-05 21:54:09 +00:00