Commit Graph

5783 Commits

Author SHA1 Message Date
Evan Cheng c3d1aca657 - Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term.
Also added more comments to explain why it is generally ok to return true.
- Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to
be true for loaded source (memcpy) or zero constants (memset). The poor name
choice is probably some kind of legacy issue.

llvm-svn: 169954
2012-12-12 01:32:07 +00:00
Manman Ren 82751a105c DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion
rdar://12838504

llvm-svn: 169951
2012-12-12 01:13:50 +00:00
Evan Cheng 04e5518783 Avoid using lossy load / stores for memcpy / memset expansion. e.g.
f64 load / store on non-SSE2 x86 targets.

llvm-svn: 169944
2012-12-12 00:42:09 +00:00
Evan Cheng eb54240dc2 Replace TargetLowering::isIntImmLegal() with
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.

llvm-svn: 169929
2012-12-11 23:26:14 +00:00
Patrik Hagglund e98b7a0389 Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
llvm-svn: 169854
2012-12-11 11:14:33 +00:00
Patrik Hagglund b31465b09b Change RegVT in BitTestBlock and RegsForValue, to contain MVTs,
instead of EVTs.

llvm-svn: 169851
2012-12-11 10:24:48 +00:00
Patrik Hagglund ad432a8e70 Change TargetLowering::getTypeForExtArgOrReturn to take and return
MVTs, instead of EVTs.

Accordingly, add bitsLT (and similar) to MVT.

llvm-svn: 169850
2012-12-11 10:20:51 +00:00
Patrik Hagglund d34337495e Change a parameter of TargetLowering::getVectorTypeBreakdown to MVT,
from EVT.

llvm-svn: 169849
2012-12-11 10:16:19 +00:00
Patrik Hagglund 03e9628cfa Change TargetLowering::RegisterTypeForVT to contain MVTs, instead of
EVTs.

llvm-svn: 169848
2012-12-11 10:09:23 +00:00
Patrik Hagglund c50489e203 Change TargetLowering::TransformToType to contain MVTs, instead of
EVTs.

llvm-svn: 169847
2012-12-11 10:05:04 +00:00
Patrik Hagglund 8d2e7cf561 Change TargetLowering::findRepresentativeClass to take an MVT, instead
of EVT.

llvm-svn: 169845
2012-12-11 09:57:18 +00:00
Patrik Hagglund ffb60f7c08 Change TargetLowering::getTypeToPromoteTo to take and return MVTs,
instead of EVTs.

llvm-svn: 169844
2012-12-11 09:54:23 +00:00
Patrik Hagglund a970281106 Change TargetLowering::isCondCodeLegal to take an MVT, instead of EVT.
llvm-svn: 169843
2012-12-11 09:51:27 +00:00
Patrik Hagglund e3bec6365a Change TargetLowering::getCondCodeAction to take an MVT, instead of
EVT.

llvm-svn: 169842
2012-12-11 09:48:14 +00:00
Patrik Hagglund 7ffcd226dd Change TargetLowering::getTruncStoreAction to take MVTs, instead of EVTs.
llvm-svn: 169841
2012-12-11 09:42:24 +00:00
Patrik Hagglund cbc9d4d0f9 Change TargetLowering::getLoadExtAction to take an MVT, instead of EVT.
llvm-svn: 169840
2012-12-11 09:39:09 +00:00
Patrik Hagglund 40e1afe970 Change TargetLowering::setTypeAction to take an MVT, instead fo EVT.
llvm-svn: 169839
2012-12-11 09:32:56 +00:00
Patrik Hagglund 57b1694df1 Change TargetLowering::getRepRegClassFor to take an MVT, instead of
EVT.

Accordingly, change RegDefIter to contain MVTs instead of EVTs.

llvm-svn: 169838
2012-12-11 09:31:43 +00:00
Patrik Hagglund 3708e548f8 Change TargetLowering::getRegClassFor to take an MVT, instead of EVT.
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.

This is the first, in a series of patches.

llvm-svn: 169837
2012-12-11 09:10:33 +00:00
Chandler Carruth b27041c50b Fix a miscompile in the DAG combiner. Previously, we would incorrectly
try to reduce the width of this load, and would end up transforming:

  (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
  (truncate (zextload i32 <ptr+4> as i64) to i32)

We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:

  movswl 6(...),%eax

Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:

  movl 6(...), %eax

llvm-svn: 169802
2012-12-11 00:36:57 +00:00
Chad Rosier df42cf39ab Fall back to the selection dag isel to select tail calls.
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles.  Testing with the external/internal nightly
test suite reveals no change in compile time performance.  Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures.  All tests were performed on my x86 machine.
I'll monitor our arm testers to ensure no regressions occur there.

In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally.  While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.

Part of rdar://12553082

llvm-svn: 169796
2012-12-11 00:18:02 +00:00
Evan Cheng 79e2ca90bc Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies.
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned load / stores more aggressively if target hooks indicates they
   are "fast".
5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078

llvm-svn: 169791
2012-12-10 23:21:26 +00:00
Eric Christopher 200dd760fa Fix a coding style nit.
llvm-svn: 169776
2012-12-10 22:00:20 +00:00
Tom Stellard 30e2aa5015 LegalizeDAG: Allow type promotion of scalar loads
llvm-svn: 169773
2012-12-10 21:41:58 +00:00
Tom Stellard b785bd776c LegalizeDAG: Allow type promotion for scalar stores
llvm-svn: 169772
2012-12-10 21:41:54 +00:00
Craig Topper d8005db486 Teach DAG combine to handle vector add/sub with vectors of all 0s.
llvm-svn: 169727
2012-12-10 08:12:29 +00:00
Craig Topper 5ea3bdd75b Remove extra blank line.
llvm-svn: 169692
2012-12-09 08:20:52 +00:00
Craig Topper a183ddb0fe Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined.
llvm-svn: 169684
2012-12-08 22:49:19 +00:00
Evan Cheng 9ec512d768 Replace r169459 with something safer. Rather than having computeMaskedBits to
understand target implementation of any_extend / extload, just generate
zero_extend in place of any_extend for liveouts when the target knows the
zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz).

rdar://12771555

llvm-svn: 169536
2012-12-06 19:13:27 +00:00
Nadav Rotem ac450eb59e Fix a bug in the code that merges consecutive stores. Previously we did not
check if loads that happen in between stores alias with the first store in the
chain, only with the second store onwards.

llvm-svn: 169516
2012-12-06 17:34:13 +00:00
Evan Cheng 5213139f48 Let targets provide hooks that compute known zero and ones for any_extend
and extload's. If they are implemented as zero-extend, or implicitly
zero-extend, then this can enable more demanded bits optimizations. e.g.

define void @foo(i16* %ptr, i32 %a) nounwind {
entry:
  %tmp1 = icmp ult i32 %a, 100
  br i1 %tmp1, label %bb1, label %bb2
bb1:
  %tmp2 = load i16* %ptr, align 2
  br label %bb2
bb2:
  %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ]
  %cmp = icmp ult i16 %tmp3, 24
  br i1 %cmp, label %bb3, label %exit
bb3:
  call void @bar() nounwind
  br label %exit
exit:
  ret void
}

This compiles to the followings before:
        push    {lr}
        mov     r2, #0
        cmp     r1, #99
        bhi     LBB0_2
@ BB#1:                                 @ %bb1
        ldrh    r2, [r0]
LBB0_2:                                 @ %bb2
        uxth    r0, r2
        cmp     r0, #23
        bhi     LBB0_4
@ BB#3:                                 @ %bb3
        bl      _bar
LBB0_4:                                 @ %exit
        pop     {lr}
        bx      lr

The uxth is not needed since ldrh implicitly zero-extend the high bits. With
this change it's eliminated.

rdar://12771555

llvm-svn: 169459
2012-12-06 01:28:01 +00:00
Chandler Carruth 802d755533 Sort includes for all of the .h files under the 'lib' tree. These were
missed in the first pass because the script didn't yet handle include
guards.

Note that the script is now able to handle all of these headers without
manual edits. =]

llvm-svn: 169224
2012-12-04 07:12:27 +00:00
Jakub Staszak ae551a853d Simplify code. No functionality change.
llvm-svn: 169198
2012-12-04 01:00:52 +00:00
Jakub Staszak bac8ae6506 Use dyn_cast instead of isa and cast. No functionality change.
llvm-svn: 169196
2012-12-04 00:50:06 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Nadav Rotem 1157e1410c Allow merging multiple store sequences on the same chain.
llvm-svn: 169111
2012-12-02 17:14:09 +00:00
Justin Holewinski edec332437 Cleanup recent addition of DAGTypeLegalizer::SplitVecOp_VSELECT
llvm-svn: 168932
2012-11-29 19:42:09 +00:00
Justin Holewinski 0ac49bf846 Teach the legalizer how to handle operands for VSELECT nodes
If we need to split the operand of a VSELECT, it must be the mask operand. We
split the entire VSELECT operand with EXTRACT_SUBVECTOR.

llvm-svn: 168883
2012-11-29 14:26:28 +00:00
Justin Holewinski bc45119b44 Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors
For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>.

llvm-svn: 168882
2012-11-29 14:26:24 +00:00
Nadav Rotem 307d767177 When combining consecutive stores allow loads in between the stores, if the loads do not alias.
llvm-svn: 168832
2012-11-29 00:00:08 +00:00
Craig Topper 79bd205d8c Refactor to make helper method static.
llvm-svn: 168557
2012-11-25 08:08:58 +00:00
Craig Topper 268b62288e Remove duplicate check of LimitFloatPrecision. It was already checked earlier before IsExp10 could be set to true.
llvm-svn: 168553
2012-11-25 00:48:58 +00:00
Craig Topper 8571944cf1 Factor common code out of individual if blocks into common tail.
llvm-svn: 168551
2012-11-25 00:15:07 +00:00
Craig Topper d374694b07 Remove redundant calls to getCurDebugLoc in visitIntrinsicCall. It's already called at the start of the function and captured in a local variable.
llvm-svn: 168548
2012-11-24 23:05:23 +00:00
Craig Topper d2638c1894 Refactor a bit to make some helper methods static.
llvm-svn: 168546
2012-11-24 18:52:06 +00:00
Craig Topper 4a98175800 Factor some common code out of individual if blocks.
llvm-svn: 168538
2012-11-24 08:22:37 +00:00
Craig Topper bef254ab16 Refactor a bit to make some helper functions static.
llvm-svn: 168524
2012-11-23 18:38:31 +00:00
Patrik Hägglund f77cc055cd Cleanup: Simplify loop end logic in computeRegisterProperties().
llvm-svn: 168507
2012-11-23 08:35:04 +00:00
Lang Hames e9541c820a llvm.fmuladd.* lowering should be checking isOperationLegalOrCustom, rather than
isOperationLegal. Thanks to Craig Topper for pointing this out.

llvm-svn: 168485
2012-11-22 03:31:45 +00:00
Eli Friedman 30834940ec Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus.
llvm-svn: 168240
2012-11-17 01:52:46 +00:00
Craig Topper ed756c5fc8 Remove conditions from 'else if' that were guaranteed by preceding 'if'.
llvm-svn: 168191
2012-11-16 20:01:39 +00:00
Craig Topper 3669de4c97 Factor out the final FADD that's common to multiple code paths in the visitLog* functions.
llvm-svn: 168183
2012-11-16 19:08:44 +00:00
Craig Topper ae89426f07 Factor some common code to reduce compile size.
llvm-svn: 168143
2012-11-16 07:48:23 +00:00
Eli Friedman e6385e61b5 Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing
case to vector legalization so this actually works.

Patch by Pete Couperus.  Fixes PR12540.

llvm-svn: 168107
2012-11-15 22:44:27 +00:00
Craig Topper 61d045781a Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics.
llvm-svn: 168025
2012-11-15 06:51:10 +00:00
Rafael Espindola c79532d101 Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333.
llvm-svn: 167912
2012-11-14 05:08:56 +00:00
Duncan Sands b8d3caf65a Codegen support for arbitrary vector getelementptrs.
llvm-svn: 167830
2012-11-13 13:01:58 +00:00
Andrew Trick 108c88c5b7 misched: Allow subtargets to enable misched and dependent options.
This allows me to begin enabling (or backing out) misched by default
for one subtarget at a time. To run misched we typically want to:
- Disable SelectionDAG scheduling (use the source order scheduler)
- Enable more aggressive coalescing (until we decide to always run the coalescer this way)
- Enable MachineScheduler pass itself.

Disabling PostRA sched may follow for some subtargets.

llvm-svn: 167826
2012-11-13 08:47:29 +00:00
Andrew Trick f1ff84c64e misched: Infrastructure for weak DAG edges.
This adds support for weak DAG edges to the general scheduling
infrastructure in preparation for MachineScheduler support for
heuristics based on weak edges.

llvm-svn: 167738
2012-11-12 19:28:57 +00:00
Andrew Trick baeaabb2d0 ScheduleDAG interface. Added OrderKind to distinguish nonregister dependencies.
This is in preparation for adding "weak" DAG edges, but generally
simplifies the design.

llvm-svn: 167435
2012-11-06 03:13:46 +00:00
Owen Anderson 15fd6ac4ba Be careful not to optimize a SELECT_CC into a SETCC post-legalization if the SETCC node would be illegal.
llvm-svn: 167344
2012-11-03 00:17:26 +00:00
Manman Ren 3d5af279b1 OutputArg: added an index of the original argument to match the change to
InputArg in r165616.

This will enable us to get the actual type for both InputArg and OutputArg.

rdar://9932559

llvm-svn: 167265
2012-11-01 23:49:58 +00:00
Chandler Carruth 5da3f0512e Revert the majority of the next patch in the address space series:
r165941: Resubmit the changes to llvm core to update the functions to
         support different pointer sizes on a per address space basis.

Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.

However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.

In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.

In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.

This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.

llvm-svn: 167222
2012-11-01 09:14:31 +00:00
Chandler Carruth 7ec5085e01 Revert the series of commits starting with r166578 which introduced the
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.

These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.

Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)

After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.

Summary of reverted revisions:

r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
         Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
         since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
         on the address space.
llvm-svn: 167221
2012-11-01 08:07:29 +00:00
Owen Anderson b351c8d692 Add a few more simple fast-math constant propagations and cancellations.
llvm-svn: 167200
2012-11-01 02:00:53 +00:00
Chad Rosier 909f6a035f [inline asm] Get the mayLoad/mayStore directly from the MIOp_ExtraInfo operand.
llvm-svn: 167050
2012-10-30 20:39:19 +00:00
Chad Rosier 86f6050c54 Add a comment for r167040.
llvm-svn: 167046
2012-10-30 20:01:12 +00:00
Chad Rosier 9e1274fb48 [inline asm] Implement mayLoad and mayStore for inline assembly. In general,
the MachineInstr MayLoad/MayLoad flags are based on the tablegen implementation.
For inline assembly, however, we need to compute these based on the constraints.

Revert r166929 as this is no longer needed, but leave the test case in place. 
rdar://12033048 and PR13504

llvm-svn: 167040
2012-10-30 19:11:54 +00:00
Ulrich Weigand 3abb34389d In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.

llvm-svn: 166958
2012-10-29 18:35:49 +00:00
Micah Villmow 51e7246cb4 Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this!
llvm-svn: 166596
2012-10-24 17:25:11 +00:00
Micah Villmow 6a8f3f9e20 Delete a directory that wasn't supposed to be checked in yet.
llvm-svn: 166591
2012-10-24 17:20:04 +00:00
Micah Villmow 12d9127833 Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
2012-10-24 15:52:52 +00:00
Michael Liao 5922979e49 Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.

llvm-svn: 166546
2012-10-24 04:14:18 +00:00
Jakub Staszak a6addc2741 Keep coding standard. Don't evaluate getNumOperands() every time.
llvm-svn: 166531
2012-10-24 00:38:25 +00:00
Michael Liao 6d106b7bfd Clean up code and put transformation on (build_vec (ext x)) into a helper func
llvm-svn: 166519
2012-10-23 23:06:52 +00:00
Nadav Rotem 33e034a4b3 Make the indirect branch optimization deterministic. No functionality change.
Patch by Daniel Reynaud.

llvm-svn: 166501
2012-10-23 21:05:33 +00:00
Benjamin Kramer a74129adad Symbol hygiene: Make sure declarations and definitions match, make helper functions static.
llvm-svn: 166376
2012-10-20 12:53:26 +00:00
Shuxin Yang 1479fcdef1 1. Remove noreturn attribute from __builtin_debugtrap().
(The change at Clang side was committed in r166345)

2. Cosmetic change in order to conform to coding standards. 

llvm-svn: 166350
2012-10-19 23:00:20 +00:00
Shuxin Yang cdde059a34 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 

llvm-svn: 166300
2012-10-19 20:11:16 +00:00
Michael Liao 2c2358036d Simplify condition checking as CONCAT assume all inputs of the same type.
llvm-svn: 166260
2012-10-19 03:17:00 +00:00
Nadav Rotem d5f8859672 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091

llvm-svn: 166196
2012-10-18 18:06:48 +00:00
Michael Liao 3ac8201ea4 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.

llvm-svn: 166155
2012-10-17 23:45:54 +00:00
Michael Liao c87d98dbc8 Revert r166049
- In general, it's unsafe for this transformation.

llvm-svn: 166135
2012-10-17 22:41:15 +00:00
Michael Liao 7a442c8031 Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.

llvm-svn: 166125
2012-10-17 20:48:33 +00:00
Evan Cheng 839fb650b2 Add a really faster pre-RA scheduler (-pre-RA-sched=linearize). It doesn't use
any scheduling heuristics nor does it build up any scheduling data structure
that other heuristics use. It essentially linearize by doing a DFA walk but
it does handle glues correctly.

IMPORTANT: it probably can't handle all the physical register dependencies so
it's not suitable for x86. It also doesn't deal with dbg_value nodes right now
so it's definitely is still WIP.

rdar://12474515

llvm-svn: 166122
2012-10-17 19:39:36 +00:00
Michael Liao 19006206a1 Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
llvm-svn: 166049
2012-10-16 19:38:35 +00:00
Jakob Stoklund Olesen 57e310613c Freeze the reserved registers as soon as isel is complete.
Also provide an MRI::getReservedRegs() function to access the frozen
register set, and isReserved() and isAllocatable() methods to test
individual registers.

The various implementations of TRI::getReservedRegs() are quite
complicated, and many passes need to look at the reserved register set.
This patch makes it possible for these passes to use the cached copy in
MRI, avoiding a lot of malloc traffic and repeated calculations.

llvm-svn: 165982
2012-10-15 21:33:06 +00:00
Micah Villmow 4bb926d91d Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.
llvm-svn: 165941
2012-10-15 16:24:29 +00:00
Ulrich Weigand 9aa51d1a2c Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCAST
On PowerPC, a bitcast of <16 x i8> to i128 may run through a code
path in ExpandRes_BITCAST that attempts to do an intermediate
bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts
of the resulting i128 by pairing up two of those i32 vector elements
each.  The code already recognizes that on a big-endian system, the
first two vector elements form the Hi part, and the final two vector
elements form the Lo part (vice-versa from the little-endian situation).

However, we also need to take endianness into account when forming each
of those separate pairs:  on a big-endian system, vector element 0 is
the *high* part of the pair making up the Hi part of the result, and
vector element 1 is the low part of the pair.  The code currently always
uses vector element 0 as the low part and vector element 1 as the high
part, as is appropriate for little-endian platforms only.

This patch fixes this by swapping the vector elements as they are
paired up as appropriate.

llvm-svn: 165802
2012-10-12 15:42:58 +00:00
Evan Cheng 21c4adcdd8 Legalizer optimize a pair of div / mod to a call to divrem libcall if they are
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.

rdar://12481395

llvm-svn: 165778
2012-10-12 01:15:47 +00:00
Micah Villmow 0c61134d8d Revert 165732 for further review.
llvm-svn: 165747
2012-10-11 21:27:41 +00:00
Micah Villmow 083189730e Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.
llvm-svn: 165726
2012-10-11 17:21:41 +00:00
Michael Liao 6b49c2f69c Follow the same routine to add target float expansion hook
llvm-svn: 165707
2012-10-11 07:22:01 +00:00
Micah Villmow 0242b9b543 Add in support for expansion of all of the comparison operations to the absolute minimum required set. This allows a backend to expand any arbitrary set of comparisons as long as a minimum set is supported.
The minimum set of required instructions is ISD::AND, ISD::OR, ISD::SETO(or ISD::SETOEQ) and ISD::SETUO(or ISD::SETUNE). Everything is expanded into one of two patterns:
Pattern 1: (LHS CC1 RHS) Opc (LHS CC2 RHS)
Pattern 2: (LHS CC1 LHS) Opc (RHS CC2 RHS)

llvm-svn: 165655
2012-10-10 20:50:51 +00:00
Michael Liao effae0c8e1 Add alternative support for FP_ROUND from v2f32 to v2f64
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
  rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
  to convert it into a target-specific X86ISD::VFPEXT to work around this
  constraints. This patch also reverts a previous attempt to fix this issue by
  recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
  reduces the overhead of supporting non-power-2 vector FP extend.

llvm-svn: 165625
2012-10-10 16:32:15 +00:00
Stepan Dyatkovskiy f13dbb8e24 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 

llvm-svn: 165616
2012-10-10 11:37:36 +00:00
Bill Wendling 8ccd6ca199 Use the attribute enums to query if a parameter has an attribute.
llvm-svn: 165550
2012-10-09 21:38:14 +00:00
Micah Villmow 89021e4740 Add in the first step of the multiple pointer support. This adds in support to the data layout for specifying a per address space pointer size.
The next step is to update the optimizers to allow them to optimize the different address spaces with this information.

llvm-svn: 165505
2012-10-09 16:06:12 +00:00
Bill Wendling c9b22d735a Create enums for the different attributes.
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.

llvm-svn: 165488
2012-10-09 07:45:08 +00:00
Nadav Rotem 35315fea70 Refactor the AddrMode class out of TLI to its own header file.
This class is used by LSR and a number of places in the codegen.
This is the first step in de-coupling LSR from TLI, and creating
a new interface in between them.

llvm-svn: 165455
2012-10-08 23:06:34 +00:00