Commit Graph

45179 Commits

Author SHA1 Message Date
Chris Lattner b5e15d1907 fix PR9013, an infinite loop in instcombine.
llvm-svn: 123968
2011-01-21 05:29:50 +00:00
Chris Lattner f4ca47bda8 update obsolete comment.
llvm-svn: 123965
2011-01-21 05:08:26 +00:00
Nick Lewycky 6a083cf820 Don't try to pull vector bitcasts that change the number of elements through
a select. A vector select is pairwise on each element so we'd need a new
condition with the right number of elements to select on. Fixes PR8994.

llvm-svn: 123963
2011-01-21 02:30:43 +00:00
Michael J. Spencer 0324b67267 Object: Fix type punned pointer issues by making DataRefImpl a union and using intptr_t.
llvm-svn: 123962
2011-01-21 02:27:02 +00:00
Nick Lewycky 39b12c059d Add a constant folding of casts from zero to zero. Fixes PR9011!
While here, I'd like to complain about how vector is not an aggregate type
according to llvm::Type::isAggregateType(), but they're listed under aggregate
types in the LangRef and zero vectors are stored as ConstantAggregateZero.

llvm-svn: 123956
2011-01-21 01:12:09 +00:00
Evan Cheng 028ccbfcbf Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative
value, the "add pc" must be CSE'ed at the same time. We could follow the same
approach as T2 by adding pseudo instructions that combine the ldr + "add pc".
But the better approach is to use movw + movt (which I will enable soon), so
I'll leave this as a TODO.

llvm-svn: 123949
2011-01-20 23:55:07 +00:00
Tobias Grosser f07426b40d Implement requiredTransitive
The PassManager did not implement the transitivity of requiredTransitive. This
was unnoticed since 2006.

llvm-svn: 123942
2011-01-20 21:03:22 +00:00
Bruno Cardoso Lopes e965f06f7f Fix the encoding and parsing of clrex instruction
llvm-svn: 123936
2011-01-20 19:18:32 +00:00
Bruno Cardoso Lopes ef8cab9079 Change instruction names for consistency
llvm-svn: 123930
2011-01-20 18:36:07 +00:00
Bruno Cardoso Lopes d8f9b37f31 Add cdp/cdp2 instructions for thumb/thumb2
llvm-svn: 123929
2011-01-20 18:32:09 +00:00
Bruno Cardoso Lopes 33461ecc82 - Use a more appropriate name for Owen's ARM Parser isMCR hack since the same operands can be present
in cdp/cdp2 instructions. Also increase the hack with cdp/cdp2 instructions.
- Fix the encoding of cdp/cdp2 instructions for ARM (no thumb and thumb2 yet) and add testcases for t
hem.

llvm-svn: 123927
2011-01-20 18:06:58 +00:00
Jakob Stoklund Olesen 8a46e26b8e SplitKit requires that all defs are in place before calling useIntv().
The value mapping gets confused about which original values have multiple new
definitions so they may need phi insertions.

This could probably be simplified by letting enterIntvBefore() take a live range
to be added following the instruction. As long as the range stays inside the
same basic block, value mapping shouldn't be a problem.

llvm-svn: 123926
2011-01-20 17:45:23 +00:00
Jakob Stoklund Olesen 04e6b3bd21 Add LiveIntervalMap::dumpCache() to print out the cache used by the ssa update algorithm.
llvm-svn: 123925
2011-01-20 17:45:20 +00:00
Bruno Cardoso Lopes 4d4b490fb7 Add mcr*2 and mr*c2 support to thumb2 targets
llvm-svn: 123919
2011-01-20 16:58:48 +00:00
Bruno Cardoso Lopes cf99dc7eb9 Add mcr* and mr*c support to thumb targets
llvm-svn: 123917
2011-01-20 16:35:57 +00:00
Kalle Raiskila 6e5a54b36c Allow sign-extending of i8 and i16 to i128 on SPU.
llvm-svn: 123912
2011-01-20 15:49:06 +00:00
Duncan Sands 8fb2c3827c At -O123 the early-cse pass is run before instcombine has run. According to my
auto-simplier the transform most missed by early-cse is (zext X) != 0 -> X != 0.
This patch adds this transform and some related logic to InstructionSimplify
and removes some of the logic from instcombine (unfortunately not all because
there are several situations in which instcombine can improve things by making
new instructions, whereas instsimplify is not allowed to do this).  At -O2 this
often results in more than 15% more simplifications by early-cse, and results in
hundreds of lines of bitcode being eliminated from the testsuite.  I did see some
small negative effects in the testsuite, for example a few additional instructions
in three programs.  One program, 483.xalancbmk, got an additional 35 instructions,
which seems to be due to a function getting an additional instruction and then
being inlined all over the place.

llvm-svn: 123911
2011-01-20 13:21:55 +00:00
Bruno Cardoso Lopes 32f9b756a3 Refactor mcr* and mr*c instructions into classes with the same encoding. No functionality change.
llvm-svn: 123910
2011-01-20 13:17:59 +00:00
Eric Christopher 37c4a8be72 My editor's indent went crazy. Fix.
llvm-svn: 123909
2011-01-20 08:56:34 +00:00
Eric Christopher 785db078b4 Expand invalid return values for umulo and smulo. Handle these similarly
to add/sub by doing the normal operation and then checking for overflow
afterwards. This generally relies on the DAG handling the later invalid
operations as well.

Fixes the 64-bit part of rdar://8622122 and rdar://8774702.

llvm-svn: 123908
2011-01-20 08:54:28 +00:00
Evan Cheng 7af85533f8 Correct itinerary entry for t2MOV_pic_ga_add_pc.
llvm-svn: 123907
2011-01-20 08:43:03 +00:00
Evan Cheng b8b0ad80a8 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.

llvm-svn: 123905
2011-01-20 08:34:58 +00:00
Michael J. Spencer b60a18dea8 Object: Add ELF support.
llvm-svn: 123896
2011-01-20 06:38:47 +00:00
Michael J. Spencer 8e90adaf24 Object: Add COFF Support.
llvm-svn: 123895
2011-01-20 06:38:34 +00:00
Andrew Trick 2cd1f0beb6 Selection DAG scheduler register pressure heuristic fixes.
Added a check for already live regs before claiming HighRegPressure.
Fixed a few cases of checking the wrong number of successors.
Added some tracing until these heuristics are better understood.

llvm-svn: 123892
2011-01-20 06:21:59 +00:00
Jakob Stoklund Olesen 4060abb4b9 Check that a live range exists before shortening it. This fixes PR8989.
The live range may have been deleted earlier because of rematerialization.

llvm-svn: 123891
2011-01-20 06:20:02 +00:00
Jakob Stoklund Olesen 145755f1d6 Add hidden -verify-coalescing to run the machine code verifier before and after
register coalescing.

llvm-svn: 123890
2011-01-20 06:20:00 +00:00
Venkatraman Govindaraju 058e12476c Sparc backend: Implements a delay slot filler that attempt to fill delay slots
with useful instructions.

llvm-svn: 123884
2011-01-20 05:08:26 +00:00
Cameron Zwarich 050eec1d1d Update a comment.
llvm-svn: 123879
2011-01-20 03:58:43 +00:00
Jakob Stoklund Olesen 5acd4a6453 Fix bug found by new clang warning.
llvm-svn: 123872
2011-01-20 02:43:19 +00:00
Eric Christopher b2139f655b Use only one API at a time.
llvm-svn: 123866
2011-01-20 01:29:23 +00:00
Eric Christopher bb14f65672 If we can, lower the multiply part of a umulo/smulo call to a libcall
with an invalid type then split the result and perform the overflow check
normally.

Fixes the 32-bit parts of rdar://8622122 and rdar://8774702.

llvm-svn: 123864
2011-01-20 00:29:24 +00:00
Devang Patel 2d9e532a3a Fix debug info for merged global.
llvm-svn: 123862
2011-01-20 00:02:16 +00:00
Jakob Stoklund Olesen 79be8aecba Divert Hopfield network debug output. It is very noisy.
llvm-svn: 123859
2011-01-19 23:14:59 +00:00
Jakob Stoklund Olesen 509089f5b6 Don't accidentally leave small gaps in the live ranges when leaving the active
interval after an instruction. The leaveIntvAfter() method only adds liveness
from the instruction's boundary index to the inserted copy.

Ideally, SplitKit should be smarter about this, perhaps by combining useIntv()
and leaveIntvAfter() into one method that guarantees continuity.

llvm-svn: 123858
2011-01-19 23:14:56 +00:00
Jim Grosbach 493c0fbde8 Make sure to propogate the error code when we fail to parse a modifier.
llvm-svn: 123857
2011-01-19 23:06:07 +00:00
Devang Patel 8698f09dbd Fix register address expression. Patch by Ken Dyck.
llvm-svn: 123856
2011-01-19 23:04:47 +00:00
Jakob Stoklund Olesen 9fb04015ff Implement RAGreedy::splitAroundRegion and remove loop splitting.
Region splitting includes loop splitting as a subset, and it is more generic.
The splitting heuristics for variables that are live in more than one block are
now:

1. Try to create a region that covers multiple basic blocks.
2. Try to create a new live range for each block with multiple uses.
3. Spill.

Steps 2 and 3 are similar to what the standard spiller is doing.

llvm-svn: 123853
2011-01-19 22:11:48 +00:00
Nick Lewycky 5c901f3489 Similarly, analyze truncate through multiply.
llvm-svn: 123842
2011-01-19 18:56:00 +00:00
Nick Lewycky 5143f0f09b Add a missed SCEV fold that is required to continue analyzing the IR produced
by indvars through the scev expander.

trunc(add x, y) --> add(trunc x, y). Currently SCEV largely folds the other way
which is probably wrong, but preserved to minimize churn. Instcombine doesn't
do this fold either, demonstrating a missed optz'n opportunity on code doing
add+trunc+add.

llvm-svn: 123838
2011-01-19 16:59:46 +00:00
Bruno Cardoso Lopes d6335ce508 Fix the encoding of mrrc and mcrr family of instructions. Also add testcases for mcr and mrc
llvm-svn: 123837
2011-01-19 16:56:52 +00:00
Rafael Espindola fc355bc070 Add unnamed_addr when we can show that address of a global is not used.
llvm-svn: 123834
2011-01-19 16:32:21 +00:00
Nick Lewycky e9ea75e3fc Add a missing SCEV simplification sext(zext x) --> zext x.
llvm-svn: 123832
2011-01-19 15:56:12 +00:00
Daniel Dunbar e0cd9ac096 ARM/ISel: Factor out isScaledConstantInRange() helper.
llvm-svn: 123823
2011-01-19 15:12:16 +00:00
Andrew Trick 43f2563114 For ARM subtargets with useNEONForSinglePrecisionFP, double count uses
of the floating point types less than 64-bits. It's somewhat of a temporary
hack but forces more accurate modeling of register pressure and results
in fewer spills.

llvm-svn: 123811
2011-01-19 02:35:27 +00:00
Andrew Trick 5eb0a30216 whitespace
llvm-svn: 123810
2011-01-19 02:26:13 +00:00
Evan Cheng 68aec147b7 Don't forget to emit the load from indirect symbol when using movw + movt to materialize GA indirect symbols.
llvm-svn: 123809
2011-01-19 02:16:49 +00:00
Bruno Cardoso Lopes 2082057b18 Create two new generic classes to represent the following VMRS/VMSR variations:
vmrs  reg, fpexc
vmrs  reg, fpsid
vmsr  fpexc, reg
vmsr  fpsid, reg

llvm-svn: 123783
2011-01-18 21:58:20 +00:00
Bruno Cardoso Lopes cba727f291 Fix MRS encoding for arm and thumb.
llvm-svn: 123778
2011-01-18 21:31:35 +00:00
Bruno Cardoso Lopes e86a7ad01a Fix the encoding of t2ISB by using the right class and also parse it correctly
llvm-svn: 123776
2011-01-18 21:17:09 +00:00
Dan Gohman 44da55b7be Teach BasicAA to return PartialAlias in cases where both pointers
are pointing to the same object, one pointer is accessing the entire
object, and the other is access has a non-zero size. This prevents
TBAA from kicking in and saying NoAlias in such cases.

llvm-svn: 123775
2011-01-18 21:16:06 +00:00
Jakob Stoklund Olesen 267f6c1ab2 Add RAGreedy methods for splitting live ranges around regions.
Analyze the live range's behavior entering and leaving basic blocks. Compute an
interference pattern for each allocation candidate, and use SpillPlacement to
find an optimal region where that register can be live.

This code is still not enabled.

llvm-svn: 123774
2011-01-18 21:13:27 +00:00
Bruno Cardoso Lopes e6290ccf9b Follow the current hack set and enable the correct parsing of bkpt while in thumb mode.
llvm-svn: 123772
2011-01-18 20:55:11 +00:00
Chris Lattner 86d56c651d fix rdar://8878965, a regression I introduced with the recent
llvm.objectsize changes.

llvm-svn: 123771
2011-01-18 20:53:04 +00:00
Bruno Cardoso Lopes 7f639c11d7 Add support for parsing and encoding ARM's official syntax for the BFI instruction
llvm-svn: 123770
2011-01-18 20:45:56 +00:00
Jim Grosbach ec86bac8b3 Add a FIXME.
llvm-svn: 123769
2011-01-18 19:59:19 +00:00
Bruno Cardoso Lopes 95dbfac459 Ensure Mips::GP is properly reloaded after a function call. Patch by Sasa Stankovic
llvm-svn: 123768
2011-01-18 19:50:18 +00:00
Bruno Cardoso Lopes b02a9dfa55 Negative zero is not legal on mips. Patch by Sasa Stankovic
llvm-svn: 123766
2011-01-18 19:41:41 +00:00
Bruno Cardoso Lopes ac517fa9f7 Handle (i32,i32) => f64 in a cleaner way. Patch by Sasa Stankovic
llvm-svn: 123763
2011-01-18 19:38:25 +00:00
Bruno Cardoso Lopes 4dc73fa075 Add support for mips32 madd and msub instructions. Patch by Akira Hatanaka
llvm-svn: 123760
2011-01-18 19:29:17 +00:00
Duncan Sands 99589d07e9 For completeness, generalize the (X + Y) - Y -> X transform and add X - (X + 1) -> -1.
These were not recommended by my auto-simplifier since they don't fire often enough.
However they do fire from time to time, for example they remove one subtraction from
the final bitcode for 483.xalancbmk.

llvm-svn: 123755
2011-01-18 11:50:19 +00:00
Duncan Sands 9b8e2bd8ef Simplify (X<<1)-X into X. According to my auto-simplier this is the most common missed
simplification in fully optimized code.  It occurs sporadically in the testsuite, and
many times in 403.gcc: the final bitcode has 131 fewer subtractions after this change.
The reason that the multiplies are not eliminated is the same reason that instcombine
did not catch this: they are used by other instructions (instcombine catches this with
a more general transform which in general is only profitable if the operands have only
one use).

llvm-svn: 123754
2011-01-18 09:24:58 +00:00
Chris Lattner a56c8279e8 add a note
llvm-svn: 123752
2011-01-18 07:47:48 +00:00
Venkatraman Govindaraju c386f8a1f6 SPARC backend: Modified LowerCall and LowerFormalArguments so that they use CallingConv assignments.
llvm-svn: 123749
2011-01-18 06:09:55 +00:00
Cameron Zwarich dfc547d181 Remove an unnecessary #include.
llvm-svn: 123748
2011-01-18 06:07:18 +00:00
Cameron Zwarich 6b0c4c9b6c Move DominanceFrontier from VMCore to Analysis.
llvm-svn: 123747
2011-01-18 06:06:27 +00:00
Daniel Dunbar 62ea26fb6f McARM: Use accessors where appropriate.
llvm-svn: 123746
2011-01-18 05:55:27 +00:00
Daniel Dunbar bcd8eb0bac McARM: Fill in ASMOperand::dump() for memory operands.
llvm-svn: 123745
2011-01-18 05:55:21 +00:00
Daniel Dunbar 510740eea7 McARM: Make ARMOperand use a union where appropriate.
llvm-svn: 123744
2011-01-18 05:55:15 +00:00
Cameron Zwarich ce25e88218 There is no point in verifying an analysis that is never updated.
llvm-svn: 123743
2011-01-18 05:44:04 +00:00
Daniel Dunbar f5164f40c5 McARM: Unify ParseMemory() successfull return.
llvm-svn: 123740
2011-01-18 05:34:24 +00:00
Daniel Dunbar 1d5e954965 McARM: Early exit on failure (NEFC).
llvm-svn: 123739
2011-01-18 05:34:17 +00:00
Daniel Dunbar 7ed455990d McARM: Always keep an offset expression, if used (instead of assuming == 0 if used but not present), and simplify logic.
Also, clean up various non-sensicalisms in isMemModeRegThumb() and isMemModeImmThumb().

llvm-svn: 123738
2011-01-18 05:34:11 +00:00
Daniel Dunbar 5d99420e11 McARM: Add a variety of asserts on the sanity of memory operands.
llvm-svn: 123737
2011-01-18 05:34:05 +00:00
Daniel Dunbar d8da9e0fe6 McARM: Use a consistent marker for not-set OffsetRegNum.
llvm-svn: 123736
2011-01-18 05:33:57 +00:00
Cameron Zwarich fc210c79b7 Convert a std::map to a DenseMap for another 1.7% speedup on -scalarrepl.
llvm-svn: 123732
2011-01-18 04:50:38 +00:00
Cameron Zwarich 6968c41ac8 Make a std::vector a SmallVector<*, 32> like the other vectors in the same
function. This seems to be about a 1.5% speedup of -scalarrepl on test-suite
with SPEC2000 and SPEC2006.

llvm-svn: 123731
2011-01-18 04:41:32 +00:00
Rafael Espindola ecd5b9abe9 Reduce indentation and remove commented out code.
llvm-svn: 123729
2011-01-18 04:36:06 +00:00
Cameron Zwarich 66f3c66b9d Remove some now-unused DominanceFrontier methods.
llvm-svn: 123726
2011-01-18 04:21:57 +00:00
Cameron Zwarich b703654edc Remove code for updating dominance frontiers and some outdated references to
dominance and post-dominance frontiers.

llvm-svn: 123725
2011-01-18 04:11:31 +00:00
Cameron Zwarich 4694e69540 Remove outdated references to dominance frontiers.
llvm-svn: 123724
2011-01-18 03:53:26 +00:00
Daniel Dunbar 66e91d4a58 McARM: Start marking T2 address operands as such, for the benefit of the parser.
llvm-svn: 123722
2011-01-18 03:06:03 +00:00
Daniel Dunbar f413213a48 Support/CommandLine: Add "Did you mean" print for mismatched operands.
llvm-svn: 123717
2011-01-18 01:59:24 +00:00
Eric Christopher 542f8a5221 The stub routine that we're calling uses test and so clobbers
the flags.

llvm-svn: 123712
2011-01-18 01:37:20 +00:00
Chris Lattner ea4e983d70 minor change to rafael's recent patches: if something is
constant but requires a unique address, we can still put it in a
readonly section, just not a mergable one.

llvm-svn: 123711
2011-01-18 01:23:44 +00:00
Jeffrey Yasskin 249fcd4499 Remove unused variables found by gcc-4.6's -Wunused-but-set-variable.
llvm-svn: 123707
2011-01-18 00:51:23 +00:00
Stuart Hastings 4fa832aab0 Remove checking that prevented overlapping CALLSEQ_START/CALLSEQ_END
ranges, add legalizer support for nested calls.  Necessary for ARM
byval support.  Radar 7662569.

llvm-svn: 123704
2011-01-18 00:09:27 +00:00
NAKAMURA Takumi 8a07451a6e Windows/PathV2.inc: For CryptAcquireContext(), CRYPT_VERIFYCONTEXT may be specified for easy use.
llvm-svn: 123687
2011-01-17 22:41:34 +00:00
NAKAMURA Takumi 53f893af54 Windows/PathV2.inc: MoveFileEx() can behave like Posix's mv(1) to specify MOVEFILE_COPY_ALLOWED | MOVEFILE_REPLACE_EXISTING.
llvm-svn: 123686
2011-01-17 22:41:25 +00:00
NAKAMURA Takumi bb4ea1fef9 lib/Support/Windows/Signals.inc: "Showstopper" dialogs may be suppressed with SetErrorMode() on Windows 7.
llvm-svn: 123685
2011-01-17 22:41:15 +00:00
Owen Anderson 459e079912 Remove dead code, that I apparently wrote a while back. We seem to be doing well enough
without whatever this was trying to do.  When/if someone has the time to do some empirical
evaluations, it might be worth it to figure out what this code was trying to do and see if
it's worth resurrecting/fixing.

llvm-svn: 123684
2011-01-17 22:39:54 +00:00
Douglas Gregor 69e6206b19 Add a missing <cctype> include, from Joerg Sonnenberger!
llvm-svn: 123670
2011-01-17 19:17:01 +00:00
Benjamin Kramer 45d183ccf0 Fix an off-by-one error in ctpop combining.
llvm-svn: 123664
2011-01-17 18:00:28 +00:00
Cameron Zwarich b410858a5f Roll r123609 back in with two changes that fix test failures with expensive
checks enabled:

1) Use '<' to compare integers in a comparison function rather than '<='.

2) Use the uniqued set DefBlocks rather than Info.DefiningBlocks to initialize
the priority queue.

The speedup of scalarrepl on test-suite + SPEC2000 + SPEC2006 is a bit less, at
just under 16% rather than 17%.

llvm-svn: 123662
2011-01-17 17:38:41 +00:00
Michael J. Spencer 992efd12a7 Archive: Fix temp path names.
llvm-svn: 123660
2011-01-17 16:43:30 +00:00
Michael J. Spencer b74799090a Support/raw_ostream: Fix uninitalized variable in raw_fd_ostream constructor.
llvm-svn: 123643
2011-01-17 15:53:12 +00:00
Jay Foad fe87364215 Remove useless Tag enumeration.
llvm-svn: 123623
2011-01-17 15:18:06 +00:00
Kalle Raiskila 06c6d5cdb6 Split up RotateShift itinerary in SPU.
'rotq*' and 'shlq*' instructions go to the odd pipeline,
wheras the inter-vector equivalents 'rot*', 'shl*' go 
to the even.

llvm-svn: 123622
2011-01-17 13:33:19 +00:00
Benjamin Kramer 24c5184dca Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0.
This shaves off 4 popcounts from the hacked 186.crafty source.

This is enabled even when a native popcount instruction is available. The
combined code is one operation longer but it should be faster nevertheless.

llvm-svn: 123621
2011-01-17 12:04:57 +00:00
Kalle Raiskila 7e7b4ac751 Don't crash SPU BE with memory accesses with big alignmnet.
llvm-svn: 123620
2011-01-17 11:59:20 +00:00
Evan Cheng dfce83c8f5 Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g.
movw    r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4))
        movt    r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4))
LPC0_0:
        add     r0, pc, r0

It's not yet enabled by default as some tests are failing. I suspect bugs in
down stream tools.

llvm-svn: 123619
2011-01-17 08:03:18 +00:00
Cameron Zwarich 67431d7943 Roll out r123609 due to failures on the llvm-x86_64-linux-checks bot.
llvm-svn: 123618
2011-01-17 07:26:51 +00:00
Cameron Zwarich 814cd9233e Eliminate the use of dominance frontiers in PromoteMemToReg. In addition to
eliminating a potentially quadratic data structure, this also gives a 17%
speedup when running -scalarrepl on test-suite + SPEC2000 + SPEC2006. My initial
experiment gave a greater speedup around 25%, but I moved the dominator tree
level computation from dominator tree construction to PromoteMemToReg.

Since this approach to computing IDFs has a much lower overhead than the old
code using precomputed DFs, it is worth looking at using this new code for the
second scalarrepl pass as well.

llvm-svn: 123609
2011-01-17 01:08:59 +00:00
Michael J. Spencer 5ce56081c7 UnRevert "Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.""
llvm-svn: 123605
2011-01-16 23:39:59 +00:00
Michael J. Spencer ec202ee69a Fix rename.
llvm-svn: 123604
2011-01-16 22:18:41 +00:00
Anton Korobeynikov 27fc8f6467 Provide instruction sizes for ARMv5 variants of MUL instructions.
This fixes PR8987

llvm-svn: 123598
2011-01-16 21:28:33 +00:00
Anders Carlsson 6a5171ba68 Update README.txt to remove the DAE enhancement.
llvm-svn: 123597
2011-01-16 21:26:15 +00:00
Anders Carlsson d3db83349e Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undefvalue instead.
llvm-svn: 123596
2011-01-16 21:25:33 +00:00
Michael J. Spencer 405e958ac3 UnRevert "Revert the archive part of "Support/PathV2: Add identify_magic.""
This reverts commit dd103021a889a986a181ce36ed7b0e8dc9b645e1.

llvm-svn: 123595
2011-01-16 21:13:51 +00:00
Michael J. Spencer c2caa2133c Revert the archive part of "Support/PathV2: Add identify_magic."
llvm-svn: 123593
2011-01-16 19:56:42 +00:00
Chris Lattner 7c9f4c9c2b tidy up a comment, as suggested by duncan
llvm-svn: 123590
2011-01-16 17:46:19 +00:00
Rafael Espindola cba4c33949 Only put unnamed_addr constants in mergeable sections. Fixes PR8297.
llvm-svn: 123585
2011-01-16 17:19:34 +00:00
Rafael Espindola 751677a040 Don't merge two constants if we care about the address of both.
This fixes the original testcase in PR8927. It also causes a clang
binary built with a patched clang to increase in size by 0.21%.

We can probably get some of the size back by writing a pass that
detects that a global never has its pointer compared and adds
unnamed_addr to it (maybe extend global opt). It is also possible that
there are some other cases clang could add unnamed_addr to.

I will investigate extending globalopt next.

llvm-svn: 123584
2011-01-16 17:05:09 +00:00
Jay Foad bbb91f2b22 Simplify the construction and destruction of Uses. Simplify
User::dropHungOffUses().

llvm-svn: 123580
2011-01-16 15:30:52 +00:00
Chris Lattner 35a2e65bcb fix PR8514, a bug where the "heroic" transformation of shift/and
into and/shift would cause nodes to move around and a dangling pointer
to happen.  The code tried to avoid this with a HandleSDNode, but 
got the details wrong.

llvm-svn: 123578
2011-01-16 08:48:11 +00:00
Jay Foad 59809c7a62 Move the implementation of the User class into a new source file,
User.cpp.

llvm-svn: 123575
2011-01-16 08:10:57 +00:00
Chris Lattner e5f8de8639 fix PR8932, a case where arg promotion could infinitely promote.
llvm-svn: 123574
2011-01-16 08:09:24 +00:00
Chris Lattner ed1fb92cfe simplify a little
llvm-svn: 123573
2011-01-16 07:11:21 +00:00
Chris Lattner c326ebd118 add some commentary
llvm-svn: 123572
2011-01-16 06:39:44 +00:00
Chris Lattner 6fab2e9418 if an alloca is only ever accessed as a unit, and is accessed with load/store instructions,
then don't try to decimate it into its individual pieces.  This will just make a mess of the
IR and is pointless if none of the elements are individually accessed.  This was generating
really terrible code for std::bitset (PR8980) because it happens to be lowered by clang
as an {[8 x i8]} structure instead of {i64}.

The testcase now is optimized to:

define i64 @test2(i64 %X) {
  br label %L2

L2:                                               ; preds = %0
  ret i64 %X
}

before we generated:

define i64 @test2(i64 %X) {
  %sroa.store.elt = lshr i64 %X, 56
  %1 = trunc i64 %sroa.store.elt to i8
  %sroa.store.elt8 = lshr i64 %X, 48
  %2 = trunc i64 %sroa.store.elt8 to i8
  %sroa.store.elt9 = lshr i64 %X, 40
  %3 = trunc i64 %sroa.store.elt9 to i8
  %sroa.store.elt10 = lshr i64 %X, 32
  %4 = trunc i64 %sroa.store.elt10 to i8
  %sroa.store.elt11 = lshr i64 %X, 24
  %5 = trunc i64 %sroa.store.elt11 to i8
  %sroa.store.elt12 = lshr i64 %X, 16
  %6 = trunc i64 %sroa.store.elt12 to i8
  %sroa.store.elt13 = lshr i64 %X, 8
  %7 = trunc i64 %sroa.store.elt13 to i8
  %8 = trunc i64 %X to i8
  br label %L2

L2:                                               ; preds = %0
  %9 = zext i8 %1 to i64
  %10 = shl i64 %9, 56
  %11 = zext i8 %2 to i64
  %12 = shl i64 %11, 48
  %13 = or i64 %12, %10
  %14 = zext i8 %3 to i64
  %15 = shl i64 %14, 40
  %16 = or i64 %15, %13
  %17 = zext i8 %4 to i64
  %18 = shl i64 %17, 32
  %19 = or i64 %18, %16
  %20 = zext i8 %5 to i64
  %21 = shl i64 %20, 24
  %22 = or i64 %21, %19
  %23 = zext i8 %6 to i64
  %24 = shl i64 %23, 16
  %25 = or i64 %24, %22
  %26 = zext i8 %7 to i64
  %27 = shl i64 %26, 8
  %28 = or i64 %27, %25
  %29 = zext i8 %8 to i64
  %30 = or i64 %29, %28
  ret i64 %30
}

In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough
PHIs are in play that instcombine backs off.  It's better to not generate this stuff
in the first place.

llvm-svn: 123571
2011-01-16 06:18:28 +00:00
Chris Lattner 7cd8cf7d24 Use an irbuilder to get some trivial constant folding when doing a store
of a constant.

llvm-svn: 123570
2011-01-16 05:58:24 +00:00
Chris Lattner adb1a233b1 remove a dead check, this was needed before we had an explicit veto on uses of phis.
llvm-svn: 123569
2011-01-16 05:37:55 +00:00
Chris Lattner d55581ded8 enhance FoldOpIntoPhi in instcombine to try harder when a phi has
multiple uses.  In some cases, all the uses are the same operation,
so instcombine can go ahead and promote the phi.  In the testcase
this pushes an add out of the loop.

llvm-svn: 123568
2011-01-16 05:28:59 +00:00
Evan Cheng 572756ac11 Spill R4 if it's going to be used to restore SP from FP.
llvm-svn: 123567
2011-01-16 05:14:33 +00:00
Chris Lattner ea7131a062 remove the AllowAggressive argument to FoldOpIntoPhi. It is forced to false in the
first line of the function because it isn't a good idea, even for compares.

llvm-svn: 123566
2011-01-16 05:14:26 +00:00
Chris Lattner ff2e737714 more cleanups: use the IR builder.
llvm-svn: 123565
2011-01-16 05:08:00 +00:00
Chris Lattner 25ce280511 tidy up code.
llvm-svn: 123564
2011-01-16 04:37:29 +00:00
Owen Anderson 4e54efd625 Improve the safety of my globalopt enhancement by ensuring that the bitcast
of the stored value to the new store type is always.  Also, add a testcase.

llvm-svn: 123563
2011-01-16 04:33:33 +00:00
Chris Lattner 08f43456c9 fix PR8983, a broken assertion.
llvm-svn: 123562
2011-01-16 03:43:53 +00:00
Venkatraman Govindaraju 1b0e2cbf3f Implement AnalyzeBranch in Sparc Backend.
llvm-svn: 123561
2011-01-16 03:15:11 +00:00
Chris Lattner 218092e68e fix PR8981, a crash trying to form a conditional inc with a floating point compare.
llvm-svn: 123560
2011-01-16 02:56:53 +00:00
Chris Lattner 2d186574a6 reapply my fix for PR8961 with a tweak to properly handle
multi-instruction sequences like calls.  Many thanks to Jakob for
finding a testcase.

llvm-svn: 123559
2011-01-16 02:27:38 +00:00
Chris Lattner 8b4952fcf7 simplify this code, it is still broken but will follow up on llvm-commits.
llvm-svn: 123558
2011-01-16 02:05:10 +00:00
Michael J. Spencer 2ff30b84f8 Revert "Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1."
llvm-svn: 123557
2011-01-16 01:43:22 +00:00
Chandler Carruth ef28abefd0 Simplify a README.txt entry significantly to expose the core issue.
llvm-svn: 123556
2011-01-16 01:40:23 +00:00
Chris Lattner 1e209b87ad remove the partial specialization pass. It is unmaintained and has bugs.
llvm-svn: 123554
2011-01-16 00:27:10 +00:00
Michael J. Spencer a0ce763290 Archive: Replace all internal uses of PathV1 with PathV2. The external API still uses PathV1.
llvm-svn: 123551
2011-01-15 21:43:37 +00:00
Benjamin Kramer bec03ea725 Add an assert so we don't silently miscompile ctpop for bit widths > 128.
llvm-svn: 123549
2011-01-15 21:19:37 +00:00
Michael J. Spencer 94b2ab3556 Support/PathV2: Add identify_magic.
llvm-svn: 123548
2011-01-15 20:39:36 +00:00
Benjamin Kramer fff2517edc Reimplement CTPOP legalization with the "best" algorithm from
http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel

In a silly microbenchmark on a 65 nm core2 this is 1.5x faster than the old
code in 32 bit mode and about 2x faster in 64 bit mode. It's also a lot shorter,
especially when counting 64 bit population on a 32 bit target.

I hope this is fast enough to replace Kernighan-style counting loops even when
the input is rather sparse.

llvm-svn: 123547
2011-01-15 20:30:30 +00:00
Michael J. Spencer 7887466adc Support/PathV2: Implement has_magic in terms of get_magic.
llvm-svn: 123545
2011-01-15 18:52:41 +00:00
Michael J. Spencer ee1699c362 Support/PathV2: Implement get_magic.
llvm-svn: 123544
2011-01-15 18:52:33 +00:00
Nick Lewycky 4a1ff16b29 Add missing whitespace.
llvm-svn: 123543
2011-01-15 18:42:52 +00:00
Nick Lewycky 0296a481f9 Make constmerge a two-pass algorithm so that it won't miss merging
opporuntities. Fixes PR8978.

llvm-svn: 123541
2011-01-15 18:14:21 +00:00
Benjamin Kramer ed5f2e504e Try to unbreak selfhost.
llvm-svn: 123537
2011-01-15 11:25:34 +00:00
Nick Lewycky 540f9536c8 Add a cache that protects mergefunc's internals from more surprises in DenseSet.
Also, replace tabs with spaces. Yes, it's 2011.

llvm-svn: 123535
2011-01-15 10:16:23 +00:00
Nick Lewycky 367f98f000 Teach LazyValueInfo that allocas aren't NULL. Over all of llvm-test, this saves
half a million non-local queries, each of which would otherwise have triggered a
linear scan over a basic block.

Also fix a fixme for memory intrinsics which dereference pointers. With this,
we prove that a pointer is non-null because it was dereferenced by an intrinsic
112 times in llvm-test.

llvm-svn: 123533
2011-01-15 09:16:12 +00:00
Rafael Espindola 489e505adf Allow unnamed_addr on declarations.
llvm-svn: 123529
2011-01-15 08:15:00 +00:00
Chris Lattner af26390790 temporarily revert r123526. While working on a follow-on patch I
realize that ConstantFoldTerminator doesn't preserve dominfo.

llvm-svn: 123527
2011-01-15 07:51:19 +00:00
Chris Lattner 8df83c4a24 fix rdar://8785296 - -fcatch-undefined-behavior generates inefficient code
The basic issue is that isel (very reasonably!) expects conditional branches
to be folded, so CGP leaving around a bunch dead computation feeding
conditional branches isn't such a good idea.  Just fold branches on constants
into unconditional branches.

llvm-svn: 123526
2011-01-15 07:36:13 +00:00
Chris Lattner ee588defc6 simplify code, no functionality change.
llvm-svn: 123525
2011-01-15 07:29:01 +00:00
Chris Lattner 1b93be501d Now that instruction optzns can update the iterator as they go, we can
have objectsize folding recursively simplify away their result when it
folds.  It is important to catch this here, because otherwise we won't
eliminate the cross-block values at isel and other times.

llvm-svn: 123524
2011-01-15 07:25:29 +00:00
Chris Lattner 7a2771440f make the current instruction iterator an ivar, allowing xforms that
potentially invalidate it (like inline asm lowering) to be sunk into
their proper place, cleaning up a ton of code.

llvm-svn: 123523
2011-01-15 07:14:54 +00:00
Chris Lattner 9c10d587f6 implement an instcombine xform that canonicalizes casts outside of and-with-constant operations.
This fixes rdar://8808586 which observed that we used to compile:


union xy {
        struct x { _Bool b[15]; } x;
        __attribute__((packed))
        struct y {
                __attribute__((packed)) unsigned long b0to7;
                __attribute__((packed)) unsigned int b8to11;
                __attribute__((packed)) unsigned short b12to13;
                __attribute__((packed)) unsigned char b14;
        } y;
};

struct x
foo(union xy *xy)
{
        return xy->x;
}

into:

_foo:                                   ## @foo
	movq	(%rdi), %rax
	movabsq	$1095216660480, %rcx    ## imm = 0xFF00000000
	andq	%rax, %rcx
	movabsq	$-72057594037927936, %rdx ## imm = 0xFF00000000000000
	andq	%rax, %rdx
	movzbl	%al, %esi
	orq	%rdx, %rsi
	movq	%rax, %rdx
	andq	$65280, %rdx            ## imm = 0xFF00
	orq	%rsi, %rdx
	movq	%rax, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rdx, %rsi
	movl	%eax, %edx
	andl	$-16777216, %edx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rdx
	orq	%rcx, %rdx
	movabsq	$280375465082880, %rcx  ## imm = 0xFF0000000000
	movq	%rax, %rsi
	andq	%rcx, %rsi
	orq	%rdx, %rsi
	movabsq	$71776119061217280, %r8 ## imm = 0xFF000000000000
	andq	%r8, %rax
	orq	%rsi, %rax
	movzwl	12(%rdi), %edx
	movzbl	14(%rdi), %esi
	shlq	$16, %rsi
	orl	%edx, %esi
	movq	%rsi, %r9
	shlq	$32, %r9
	movl	8(%rdi), %edx
	orq	%r9, %rdx
	andq	%rdx, %rcx
	movzbl	%sil, %esi
	shlq	$32, %rsi
	orq	%rcx, %rsi
	movl	%edx, %ecx
	andl	$-16777216, %ecx        ## imm = 0xFFFFFFFFFF000000
	orq	%rsi, %rcx
	movq	%rdx, %rsi
	andq	$16711680, %rsi         ## imm = 0xFF0000
	orq	%rcx, %rsi
	movq	%rdx, %rcx
	andq	$65280, %rcx            ## imm = 0xFF00
	orq	%rsi, %rcx
	movzbl	%dl, %esi
	orq	%rcx, %rsi
	andq	%r8, %rdx
	orq	%rsi, %rdx
	ret

We now compile this into:

_foo:                                   ## @foo
## BB#0:                                ## %entry
	movzwl	12(%rdi), %eax
	movzbl	14(%rdi), %ecx
	shlq	$16, %rcx
	orl	%eax, %ecx
	shlq	$32, %rcx
	movl	8(%rdi), %edx
	orq	%rcx, %rdx
	movq	(%rdi), %rax
	ret

A small improvement :-)

llvm-svn: 123520
2011-01-15 06:32:33 +00:00
Chris Lattner e20dd530d0 one more instcombine variant that is needed to work with future changes,
no functionality change currently.

llvm-svn: 123517
2011-01-15 05:50:18 +00:00
Chris Lattner 497459d5fd fix typo
llvm-svn: 123516
2011-01-15 05:42:47 +00:00
Chris Lattner f3c4eefff8 Catch ~x < cst just like ~x < ~y, we currently handle this through
means that are about to disappear.

llvm-svn: 123515
2011-01-15 05:41:33 +00:00
Chris Lattner 311aa63c87 reduce indentation
llvm-svn: 123514
2011-01-15 05:40:29 +00:00
Eric Christopher cc385c0c97 80-col.
llvm-svn: 123505
2011-01-15 00:25:09 +00:00
Chris Lattner b68ec5c339 Generalize LoadAndStorePromoter a bit and switch LICM
to use it.

llvm-svn: 123501
2011-01-15 00:12:35 +00:00
Bob Wilson b7a3c42eae Fix a comment.
llvm-svn: 123497
2011-01-15 00:09:18 +00:00
Eric Christopher 2af9551ebf Fix 80-cols.
llvm-svn: 123494
2011-01-14 23:50:53 +00:00
Ted Kremenek e92b6e436d Update CMake build.
llvm-svn: 123491
2011-01-14 22:58:11 +00:00
Ted Kremenek b5241b2b59 'HiReg' is written but never read. Nuke its
declaration and its assignments.

Found by clang static analyzer.

llvm-svn: 123486
2011-01-14 22:34:13 +00:00
Owen Anderson 3e2f6cf7ae Fix a false-positive warning.
llvm-svn: 123480
2011-01-14 22:31:13 +00:00
Dan Gohman abac063b7a Delete an assignment to ThisBB which isn't needed, and tidy up some
comments.

llvm-svn: 123479
2011-01-14 22:26:16 +00:00
Owen Anderson 9eb7cb48e4 Enhance GlobalOpt to be able evaluate initializers that involve stores through
bitcasts, at least in simple cases.  This fixes clang's CodeGenCXX/virtual-base-dtor.cpp

llvm-svn: 123477
2011-01-14 22:19:20 +00:00
Anton Korobeynikov 9be547cfd3 Add a possibility to switch between CFI directives- and table-based frame description emission. Currently all the backends use table-based stuff.
llvm-svn: 123476
2011-01-14 21:58:08 +00:00
Anton Korobeynikov 4d9de6be4b Cleanup
llvm-svn: 123475
2011-01-14 21:57:58 +00:00
Anton Korobeynikov b46ef57de5 Add CFI directives-based frame information emission. Not hooked yet.
llvm-svn: 123474
2011-01-14 21:57:53 +00:00
Anton Korobeynikov 61d167e92b Split stuff as a preparation for CFI directives-based frame information emission
llvm-svn: 123473
2011-01-14 21:57:45 +00:00
Anton Korobeynikov e2bea1c82e Use common style for .cfi directives
llvm-svn: 123472
2011-01-14 21:57:39 +00:00
Andrew Trick 9ccce77893 Support for precise scheduling of the instruction selection DAG,
disabled in this checkin. Sorry for the large diffs due to
refactoring. New functionality is all guarded by EnableSchedCycles.

Scheduling the isel DAG is inherently imprecise, but we give it a best
effort:
- Added MayReduceRegPressure to allow stalled nodes in the queue only
  if there is a regpressure need.
- Added BUHasStall to allow checking for either dependence stalls due to
  latency or resource stalls due to pipeline hazards.
- Added BUCompareLatency to encapsulate and standardize the heuristics
  for minimizing stall cycles (vs. reducing register pressure).
- Modified the bottom-up heuristic (now in BUCompareLatency) to
  prioritize nodes by their depth rather than height. As long as it
  doesn't stall, height is irrelevant. Depth represents the critical
  path to the DAG root.
- Added hybrid_ls_rr_sort::isReady to filter stalled nodes before
  adding them to the available queue.

Related Cleanup: most of the register reduction routines do not need
to be templates.

llvm-svn: 123468
2011-01-14 21:11:41 +00:00
Chris Lattner b498f9aff3 switch SRoA to use LoadAndStorePromoter instead of its own copy of the code.
llvm-svn: 123457
2011-01-14 19:50:47 +00:00
Chris Lattner 95294b8796 Add a new LoadAndStorePromoter class, which implements the general
"promote a bunch of load and stores" logic, allowing the code to
be shared and reused.

llvm-svn: 123456
2011-01-14 19:36:13 +00:00
Duncan Sands d6f1a9584d Turn X-(X-Y) into Y. According to my auto-simplifier this is the most common
simplification present in fully optimized code (I think instcombine fails to
transform some of these when "X-Y" has more than one use).  Fires here and
there all over the test-suite, for example it eliminates 8 subtractions in
the final IR for 445.gobmk, 2 subs in 447.dealII, 2 in paq8p etc.

llvm-svn: 123442
2011-01-14 15:26:10 +00:00
Duncan Sands 571fd9a606 Factorize common code out of the InstructionSimplify shift logic. Add in
threading of shifts over selects and phis while there.  This fires here and
there in the testsuite, to not much effect.  For example when compiling spirit
it fires 5 times, during early-cse, resulting in 6 more cse simplifications,
and 3 more terminators being folded by jump threading, but the final bitcode
doesn't change in any interesting way: other optimizations would have caught
the opportunity anyway, only later.

llvm-svn: 123441
2011-01-14 14:44:12 +00:00
Chris Lattner 9987a6f49b split SROA into two passes: one that uses DomFrontiers (-scalarrepl)
and one that uses SSAUpdater (-scalarrepl-ssa)

llvm-svn: 123436
2011-01-14 08:13:00 +00:00
Jay Foad 1d4a8fe156 Remove casts between Value** and Constant**, which won't work if a
static_cast from Constant* to Value* has to adjust the "this" pointer.
This is groundwork for PR889.

llvm-svn: 123435
2011-01-14 08:07:43 +00:00
Chris Lattner 543384efb4 Implement full support for promoting allocas to registers using SSAUpdater
instead of DomTree/DomFrontier.  This may be interesting for reducing compile 
time.  This is currently disabled, but seems to work just fine.

When this is enabled, we eliminate two runs of dominator frontier, one in the
"early per-function" optimizations and one in the "interlaced with inliner"
function passes.

llvm-svn: 123434
2011-01-14 07:50:47 +00:00
Jakob Stoklund Olesen ab3d6ecbd2 Try for the third time to teach getFirstTerminator() about debug values.
This time let's rephrase to trick gcc-4.3 into not miscompiling.

llvm-svn: 123432
2011-01-14 06:33:45 +00:00
Chris Lattner e93e4f118c revert my fastisel patch again which apparently still gives the
llvm-gcc-i386-linux-selfhost buildbot heartburn...

llvm-svn: 123431
2011-01-14 06:14:33 +00:00
Chris Lattner 5ca1391003 reapply r123414 now that the botz are calmed down and the fix is already in.
llvm-svn: 123427
2011-01-14 04:24:28 +00:00
Chris Lattner 90f3a9a1c7 indentation
llvm-svn: 123426
2011-01-14 04:23:53 +00:00
Evan Cheng d4a5c05c97 Completed :lower16: / :upper16: support for movw / movt pairs on Darwin.
- Fixed :upper16: fix up routine. It should be shifting down the top 16 bits first.
- Added support for Thumb2 :lower16: and :upper16: fix up.
- Added :upper16: and :lower16: relocation support to mach-o object writer.

llvm-svn: 123424
2011-01-14 02:38:49 +00:00
Jakob Stoklund Olesen c38102889f Revert r123419. It still breaks llvm-gcc-i386-linux-selfhost.
llvm-svn: 123423
2011-01-14 02:12:54 +00:00
Chris Lattner 21a64979f1 r123414 broke llvm-gcc bootstrap apparently, revert
llvm-svn: 123422
2011-01-14 02:07:32 +00:00
Chris Lattner 3be81e9bd7 Set the insertion point correctly for instructions generated by load folding:
they should go *before* the new instruction not after it. 

llvm-svn: 123420
2011-01-14 01:33:40 +00:00
Jakob Stoklund Olesen c0767e029d Try again to teach getFirstTerminator() about debug values.
Fix some callers to better deal with debug values.

llvm-svn: 123419
2011-01-14 01:17:53 +00:00
Duncan Sands 7f60dc1eb0 Move some shift transforms out of instcombine and into InstructionSimplify.
While there, I noticed that the transform "undef >>a X -> undef" was wrong.
For example if X is 2 then the top two bits must be equal, so the result can
not be anything.  I fixed this in the constant folder as well.  Also, I made
the transform for "X << undef" stronger: it now folds to undef always, even
though X might be zero.  This is in accordance with the LangRef, but I must
admit that it is fairly aggressive.  Also, I added "i32 X << 32 -> undef"
following the LangRef and the constant folder, likewise fairly aggressive.

llvm-svn: 123417
2011-01-14 00:37:45 +00:00
Chris Lattner 0c34cb429e fix PR8961 - a fast isel miscompilation where we'd insert a new instruction
after sext's generated for addressing that got folded.  Previously we compiled
test5 into:

_test5:                                 ## @test5
## BB#0:
        movq    -8(%rsp), %rax          ## 8-byte Reload
        movq    (%rdi,%rax), %rdi
        addq    %rdx, %rdi
        movslq  %esi, %rax
        movq    %rax, -8(%rsp)          ## 8-byte Spill
        movq    %rdi, %rax
        ret

which is insane and wrong.  Now we produce:

_test5:                                 ## @test5
## BB#0:
	movslq	%esi, %rax
	movq	(%rdi,%rax), %rax
	addq	%rdx, %rax
	ret

llvm-svn: 123414
2011-01-14 00:01:01 +00:00
Jakob Stoklund Olesen 088b30aa48 Better terminator avoidance.
This approach also works when the terminator doesn't have a slot index. (Which
can happen??)

llvm-svn: 123413
2011-01-13 23:35:53 +00:00
Evan Cheng 52899a9c34 Add comment about Thumb2 fixup comments being completely bogus.
llvm-svn: 123411
2011-01-13 23:27:39 +00:00
Tobias Grosser b1d11c19da Add single entry / single exit accessors.
Add methods for accessing the (single) entry / exit edge of a region. If no such
edge exists, null is returned.  Both accessors return the start block of the
corresponding edge. The edge can finally be formed by utilizing
Region::getEntry() or Region::getExit();

Contributed by: Andreas Simbuerger <simbuerg@fim.uni-passau.de>

llvm-svn: 123410
2011-01-13 23:18:04 +00:00
Owen Anderson a098d1505d Recognize alternative register names like ip -> r12.
Fixes <rdar://problem/8857982>.

llvm-svn: 123409
2011-01-13 22:50:36 +00:00
Jakob Stoklund Olesen bbb1a54b84 Fix a few more places that should use MBB::getLastNonDebugInstr().
llvm-svn: 123408
2011-01-13 22:47:43 +00:00
Chris Lattner b6c3aff1cb typo
llvm-svn: 123406
2011-01-13 22:11:56 +00:00
Chris Lattner b9cdf393a4 memcpy + metadata = bliss :)
llvm-svn: 123405
2011-01-13 22:08:15 +00:00
Owen Anderson c3c7f5dd56 Add support to the ARM MC infrastructure to support mcr and friends. This requires supporting
the symbolic immediate names used for these instructions, fixing their pretty-printers, and
adding proper encoding information for them.

With this, we can properly pretty-print and encode assembly like:
	mrc p15, #0, r3, c13, c0, #3

Fixes <rdar://problem/8857858>.

llvm-svn: 123404
2011-01-13 21:46:02 +00:00
Evan Cheng 0447d30939 Relax an assertion. On archs like ARM, an immediate field may be scattered. So it's possible for some bits of every 8 bits to be encoded already, and the rest still needs to be fixed up.
llvm-svn: 123403
2011-01-13 21:45:26 +00:00