Commit Graph

81952 Commits

Author SHA1 Message Date
Jim Grosbach 9d8f6f3d9d ARM: Tweak tADDrSP definition for consistent operand order.
Make the operand order of the instruction match that of the asm syntax.

llvm-svn: 155747
2012-04-27 23:51:33 +00:00
Derek Schuff a99b168145 Revert r155745
llvm-svn: 155746
2012-04-27 23:37:41 +00:00
Derek Schuff bbf8b83e90 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

llvm-svn: 155745
2012-04-27 23:27:17 +00:00
Jakob Stoklund Olesen 5f0d1b462c Track worst case alignment padding more accurately.
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:

  RoundUpToAlignment(Offset + UnknownPadding)

This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the woorst case alignment padding
wss already included.

This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.

The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.

<rdar://problem/11339352>

llvm-svn: 155744
2012-04-27 22:58:38 +00:00
Andrew Trick 7a773ec053 Temporarily revert r155668: Fix the SD scheduler to avoid gluing.
This definitely caused regression with ARM -mno-thumb.

llvm-svn: 155743
2012-04-27 22:55:59 +00:00
Craig Topper 0fa6c7e593 Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements.
llvm-svn: 155742
2012-04-27 22:54:43 +00:00
Chad Rosier 32c2178ef3 Add x86-specific DAG combine to simplify:
x == -y --> x+y == 0
 x != -y --> x+y != 0

On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je    .LBB0_2
to
   addl    %esi, %edi
   je    .L4

This case is correctly handled for ARM with "cmn".

Patch by Manman Ren.
rdar://11245199
PR12545

llvm-svn: 155739
2012-04-27 22:33:25 +00:00
Michael J. Spencer 6033113e35 [Support/YAMLParser] Fix ASan found bugs.
llvm-svn: 155735
2012-04-27 21:12:20 +00:00
Craig Topper 42cd8d2c00 Tidy up spacing.
llvm-svn: 155733
2012-04-27 21:05:09 +00:00
Evan Cheng 73fd08d5bd Make test less fragile.
llvm-svn: 155732
2012-04-27 20:48:18 +00:00
Hal Finkel 27c3246169 Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.).
Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).

llvm-svn: 155729
2012-04-27 19:34:00 +00:00
David Blaikie 84e4b39995 Change recurse depth limit to uint32 to fix warning.
llvm-svn: 155727
2012-04-27 19:30:32 +00:00
David Blaikie 4a5d29509f Switch to c-style comments in a C file.
llvm-svn: 155726
2012-04-27 19:30:29 +00:00
Dan Gohman dae3349ac2 Miscellaneous accumulated cleanups.
llvm-svn: 155725
2012-04-27 18:56:31 +00:00
Lang Hames ea001225c1 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.

llvm-svn: 155724
2012-04-27 18:51:24 +00:00
Mon P Wang 6120cfb8cd Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks.
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow
issues. <rdar://problem/11286839>.

llvm-svn: 155722
2012-04-27 18:09:28 +00:00
Dan Gohman 1ccecdb2fd Reapply r155682, making constant folding more consistent, with a fix to work
properly with how the code handles all-undef PHI nodes.

llvm-svn: 155721
2012-04-27 17:50:22 +00:00
Richard Barton 82f95ea2ad Fix ARM assembly parsing for upper case condition codes on IT instructions.
llvm-svn: 155720
2012-04-27 17:34:01 +00:00
Jim Grosbach 691f4dd923 Remove a docs reference to the CBackend.
llvm-svn: 155716
2012-04-27 16:29:22 +00:00
Benjamin Kramer 6cff5ad411 Missed some register numbers.
llvm-svn: 155706
2012-04-27 12:21:46 +00:00
Benjamin Kramer b1a17c425a Update edis test for r155704.
llvm-svn: 155705
2012-04-27 12:14:03 +00:00
Benjamin Kramer 913da4b261 X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.

Fixes PR6679. Patch by Christoph Erhardt!

llvm-svn: 155704
2012-04-27 12:07:43 +00:00
Evgeniy Stepanov b7ff9b1599 Update config.sub in the sample project.
This change replaces projects/sample/autoconf/config.sub with a copy of
autoconf/config.sub.

llvm-svn: 155703
2012-04-27 10:27:32 +00:00
Kostya Serebryany 5a464f03d3 [asan] small optimization: do not emit "x+0" instructions
llvm-svn: 155701
2012-04-27 10:04:53 +00:00
Richard Barton f435b09eaf Refactor IT handling not to store the bottom bit of the condition code in the mask operand in the MCInst.
llvm-svn: 155700
2012-04-27 08:42:59 +00:00
NAKAMURA Takumi 6008dfdb70 Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors"
It broke stage2 build. stage1/clang sometimes crashed.

llvm-svn: 155699
2012-04-27 07:59:20 +00:00
Kostya Serebryany a1259778b4 [tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov
llvm-svn: 155698
2012-04-27 07:31:53 +00:00
Craig Topper e57b49ee16 Add mcpu to tests to prevent them from using AVX instructions on Sandy Bridge after r155618.
llvm-svn: 155696
2012-04-27 07:11:58 +00:00
Evan Cheng 1ec87ee096 Implement a bastardized ABI.
llvm-svn: 155686
2012-04-27 02:11:10 +00:00
Evan Cheng f52003de56 - thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't suppport 32-bit Thumb2
instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541

llvm-svn: 155685
2012-04-27 01:27:19 +00:00
Dan Gohman 90f3798f26 Use ConstantExpr::getExtractElement when constant-folding vectors
instead of getAggregateElement. This has the advantage of being
more consistent and allowing higher-level constant folding to
procede even if an inner extract element cannot be folded.

Make ConstantFoldInstruction call ConstantFoldConstantExpression
on the instruction's operands, making it more consistent with 
ConstantFoldConstantExpression itself. This makes sure that
ConstantExprs get TargetData-aware folding before being handed
off as operands for further folding.

This causes more expressions to be folded, but due to a known
shortcoming in constant folding, this currently has the side effect
of stripping a few more nuw and inbounds flags in the non-targetdata
side of constant-fold-gep.ll. This is mostly harmless.

This fixes rdar://11324230.

llvm-svn: 155682
2012-04-27 00:54:36 +00:00
Jakob Stoklund Olesen c90abc8956 Break up getProfitableChainIncrement().
The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().

Also cache the ExprBase in IVChain to avoid frequent recomputations.

No functional change intended.

llvm-svn: 155676
2012-04-26 23:33:11 +00:00
Jakob Stoklund Olesen a0337d7bd9 Turn IVChain into a struct.
No functional change intended.

llvm-svn: 155675
2012-04-26 23:33:09 +00:00
Chad Rosier 7813dcee30 Add instcombine patterns for the following transformations:
(x & y) | (x ^ y) -> x | y 
 (x & y) + (x ^ y) -> x | y 

Patch by Manman Ren.
rdar://10770603

llvm-svn: 155674
2012-04-26 23:29:14 +00:00
Evan Cheng 86bd889942 DumpSegment64Command() wasn't returning correct result. Caught by static analyzer. rdar://11329354
llvm-svn: 155669
2012-04-26 22:07:28 +00:00
Andrew Trick 03fa574af5 Fix the SD scheduler to avoid gluing the same node twice.
DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
beter fix is to bail out from bad gluing at the time we detect it.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155668
2012-04-26 21:48:25 +00:00
Ted Kremenek b62cf212b2 Defensively guard against calling malloc() with a size of zero.
llvm-svn: 155661
2012-04-26 20:54:27 +00:00
Jim Grosbach 3d6c629e26 ARM: Thumb ldr(literal) base address alignment is 32-bits.
The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.

rdar://11314619

llvm-svn: 155659
2012-04-26 20:48:12 +00:00
Joerg Sonnenberger 18ad5b28b7 Add note about returns_twice magic removal from LLVM itself.
llvm-svn: 155657
2012-04-26 20:10:07 +00:00
Preston Gurd 81290f4be5 Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when llvm auto detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655
2012-04-26 19:52:27 +00:00
Michael J. Spencer e734f5417f [CMake] Restructure how Clang, Polly and other external projects get included.
While making lld build under the tools directory I decided to refactor how this
works.

There is now a macro, add_llvm_external_project, which takes the name of the
expected subdirectory. This sets up two CMake options.

 * LLVM_EXTERNAL_${NAME}_SOURCE_DIR
     This is the path to the source. It defaults to
     ${CMAKE_CURRENT_SOURCE_DIR}/${name}.
 * LLVM_EXTERNAL_${NAME}_BUILD
     Enable and disable building the tool as part of LLVM.

I chose LLVM_EXTERNAL_${NAME} as a prefix so they all show up together in the
GUI.

llvm-svn: 155654
2012-04-26 19:43:35 +00:00
Michael J. Spencer a6c2c29152 [Support/YAML] Properly fix unitialized variable warning by inserting a
'REPLACEMENT CHARACTER' (U+FFFD) when getAsInteger fails.

llvm-svn: 155653
2012-04-26 19:27:11 +00:00
Stepan Dyatkovskiy 3ee22ba6ca Fixed SmallMap test. The order of items is undefined in DenseMap. So being checking the increment for big mode, we can only check that all items are in map.
llvm-svn: 155651
2012-04-26 18:45:24 +00:00
Tim Northover 3de97b7a86 Use VLD1 in NEON extenting-load patterns instead of VLDR.
On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.

llvm-svn: 155630
2012-04-26 08:46:29 +00:00
Tim Northover 6699a60b0e Test commit.
llvm-svn: 155626
2012-04-26 08:24:07 +00:00
Craig Topper 08ccfbe57b Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.
llvm-svn: 155618
2012-04-26 06:40:15 +00:00
Chandler Carruth 739ef80fd7 Teach the reassociate pass to fold chains of multiplies with repeated
elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to at least a decent job on a very diverse range of
inputs.

Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!

Credit to Richard Smith for the core algorithm, and helping code the
patch itself.

llvm-svn: 155616
2012-04-26 05:30:30 +00:00
Evan Cheng 8a8e9d1b63 Specify cpu to unbreak tests.
llvm-svn: 155604
2012-04-26 01:38:10 +00:00
Evan Cheng 9f7ad310b5 If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.

rdar://11318438

llvm-svn: 155601
2012-04-26 01:13:36 +00:00
Bill Wendling 0156f44a68 Don't forget to reset 'first operand' flag when we're setting the MDNodeOperand value.
llvm-svn: 155599
2012-04-26 00:38:42 +00:00