Commit Graph

22636 Commits

Author SHA1 Message Date
Nadav Rotem 7411623fd8 Implement the cost of abnormal x86 instruction lowering as a table.
llvm-svn: 167395
2012-11-05 19:32:46 +00:00
Hal Finkel 4f24c621d9 Add support for the PowerPC-specific inline asm Z constraint and y modifier.
The Z constraint specifies an r+r memory address, and the y modifier expands
to the "r, r" in the asm string. For this initial implementation, the base
register is forced to r0 (which has the special meaning of 0 for r+r addressing
on PowerPC) and the full address is taken in the second register. In the
future, this should be improved.

llvm-svn: 167388
2012-11-05 18:18:42 +00:00
Adhemerval Zanella c4182d1890 [PATCH] PowerPC: Expand load extend vector operations
This patch expands the SEXTLOAD, ZEXTLOAD, and EXTLOAD operations for
vector types when altivec is enabled.

llvm-svn: 167386
2012-11-05 17:15:56 +00:00
Craig Topper 3b530ea605 Remove alignments from folding tables for scalar FMA4 instructions.
llvm-svn: 167366
2012-11-04 04:40:08 +00:00
Akira Hatanaka da1980f697 [mips] Set flag neverHasSideEffects flag on floating point conversion
instructions.

llvm-svn: 167348
2012-11-03 00:53:12 +00:00
Nadav Rotem c2345cbe73 X86 CostModel: Add support for a some of the common arithmetic instructions for SSE4, AVX and AVX2.
llvm-svn: 167347
2012-11-03 00:39:56 +00:00
Akira Hatanaka 7828331329 [mips] Set flag isAsCheapAsAMove flag on instruction LUi.
llvm-svn: 167345
2012-11-03 00:26:02 +00:00
Akira Hatanaka 5852e3b800 [mips] Stop reserving register AT and use register scavenger when a scratch
register is needed.

llvm-svn: 167341
2012-11-03 00:05:43 +00:00
Akira Hatanaka 654e3b40f5 [mips] Do not reserve all 64-bit registers, but only the ones which need to be
reserved. Without this fix, RegScavenger::getRegsAvailable incorrectly
returns an empty set of integer registers.

llvm-svn: 167335
2012-11-02 23:36:01 +00:00
Nadav Rotem 23848f8f1d Add a stub for the x86 cost model impl. Implement a basic cost rule for inserting/extracting from XMM registers.
llvm-svn: 167333
2012-11-02 23:27:16 +00:00
Nadav Rotem 919b5aab34 Scalar Bitcasts and Truncs are usually free
llvm-svn: 167323
2012-11-02 21:47:47 +00:00
Quentin Colombet 8e1fe84c3c Vext Lowering was missing opportunities
llvm-svn: 167318
2012-11-02 21:32:17 +00:00
Akira Hatanaka 949f8d890d [mips] Use register number instead of name to print register $AT.
llvm-svn: 167315
2012-11-02 21:26:03 +00:00
Akira Hatanaka 97b43d8bdf [mips] Add function MipsFrameLowering::estimateStackSize.
This function estimates stack size and will be called before
PrologEpilogInserter scans the callee-saved registers.

llvm-svn: 167313
2012-11-02 21:10:22 +00:00
Akira Hatanaka 719df2874c [mips] Add member field MipsFunctionInfo::IncomingArgSize which holds the size
of the incoming argument area.

llvm-svn: 167312
2012-11-02 21:03:58 +00:00
Akira Hatanaka 0dfbf1262b [mips] Delete MipsFunctionInfo::EmitNOAT. Unconditionally print directive
"set .noat" so that the assembler doesn't issue warnings when register $AT is
used.

llvm-svn: 167310
2012-11-02 20:56:25 +00:00
Pranav Bhandarkar 34b601804e Use the relationship models infrastructure to add two relations - getPredOpcode
and getPredNewOpcode. The first relates non predicated instructions with their
predicated forms and the second relates predicated instructions with their
predicate-new forms.

Patch by Jyotsna Verma!

llvm-svn: 167243
2012-11-01 19:13:23 +00:00
Chandler Carruth 5da3f0512e Revert the majority of the next patch in the address space series:
r165941: Resubmit the changes to llvm core to update the functions to
         support different pointer sizes on a per address space basis.

Despite this commit log, this change primarily changed stuff outside of
VMCore, and those changes do not carry any tests for correctness (or
even plausibility), and we have consistently found questionable or flat
out incorrect cases in these changes. Most of them are probably correct,
but we need to devise a system that makes it more clear when we have
handled the address space concerns correctly, and ideally each pass that
gets updated would receive an accompanying test case that exercises that
pass specificaly w.r.t. alternate address spaces.

However, from this commit, I have retained the new C API entry points.
Those were an orthogonal change that probably should have been split
apart, but they seem entirely good.

In several places the changes were very obvious cleanups with no actual
multiple address space code added; these I have not reverted when
I spotted them.

In a few other places there were merge conflicts due to a cleaner
solution being implemented later, often not using address spaces at all.
In those cases, I've preserved the new code which isn't address space
dependent.

This is part of my ongoing effort to clean out the partial address space
code which carries high risk and low test coverage, and not likely to be
finished before the 3.2 release looms closer. Duncan and I would both
like to see the above issues addressed before we return to these
changes.

llvm-svn: 167222
2012-11-01 09:14:31 +00:00
Chandler Carruth 7ec5085e01 Revert the series of commits starting with r166578 which introduced the
getIntPtrType support for multiple address spaces via a pointer type,
and also introduced a crasher bug in the constant folder reported in
PR14233.

These commits also contained several problems that should really be
addressed before they are re-committed. I have avoided reverting various
cleanups to the DataLayout APIs that are reasonable to have moving
forward in order to reduce the amount of churn, and minimize the number
of commits that were reverted. I've also manually updated merge
conflicts and manually arranged for the getIntPtrType function to stay
in DataLayout and to be defined in a plausible way after this revert.

Thanks to Duncan for working through this exact strategy with me, and
Nick Lewycky for tracking down the really annoying crasher this
triggered. (Test case to follow in its own commit.)

After discussing with Duncan extensively, and based on a note from
Micah, I'm going to continue to back out some more of the more
problematic patches in this series in order to ensure we go into the
LLVM 3.2 branch with a reasonable story here. I'll send a note to
llvmdev explaining what's going on and why.

Summary of reverted revisions:

r166634: Fix a compiler warning with an unused variable.
r166607: Add some cleanup to the DataLayout changes requested by
         Chandler.
r166596: Revert "Back out r166591, not sure why this made it through
         since I cancelled the command. Bleh, sorry about this!
r166591: Delete a directory that wasn't supposed to be checked in yet.
r166578: Add in support for getIntPtrType to get the pointer type based
         on the address space.
llvm-svn: 167221
2012-11-01 08:07:29 +00:00
Michael Liao 70a99c8e19 Cleanup another place redundant SP maintained
llvm-svn: 167209
2012-11-01 03:47:50 +00:00
Shuxin Yang 01efdd6c28 (For X86) Enhancement to add-carray/sub-borrow (adc/sbb) optimization.
The adc/sbb optimization is to able to convert following expression
into a single adc/sbb instruction:
  (ult) ... = x + 1 // where the ult is unsigned-less-than comparison
  (ult) ... = x - 1

  This change is to flip the "x >u y" (i.e. ugt comparison) in order 
to expose the adc/sbb opportunity.

llvm-svn: 167180
2012-10-31 23:11:48 +00:00
Nadav Rotem 6d7d39783d Fix a bug in the cost calculation of vector casts. Detect situations where bitcasts cost zero.
llvm-svn: 167170
2012-10-31 20:52:26 +00:00
Akira Hatanaka 4f5ef21869 [mips] Set isAsCheapAsAMove flag on ADDiu and DADDiu, which enables
re-materialization of immediate loads.

llvm-svn: 167153
2012-10-31 18:37:55 +00:00
Reed Kotler 27a7229c47 Implement ADJCALLSTACKUP and ADJCALLSTACKDOWN
llvm-svn: 167107
2012-10-31 05:21:10 +00:00
Craig Topper 8cd3b07a51 Add scalar forms of FMA4 VFNMSUB/VFNMADD to folding tables. Patch from Cameron McInally.
llvm-svn: 167106
2012-10-31 04:59:46 +00:00
Michael Liao e2d7e4e8e5 Clean up redundant SP register maintained in X86 TLI
llvm-svn: 167104
2012-10-31 04:14:09 +00:00
Bill Schmidt 9953cf294b This patch addresses an ABI compatibility issue with empty aggregate
parameters.  Examples of these are:

  struct { } a;
  union { } b[256];
  int a[0];

An empty aggregate has an address, although dereferencing that address is
pointless.  When passed as a parameter, an empty aggregate does not consume
a protocol register, nor does it consume a doubleword in the parameter save
area.  Passing an empty aggregate by reference passes an address just as
for any other aggregate.  Returning an empty aggregate uses GPR3 as a hidden
address of the return value location, just as for any other aggregate.

The patch modifies PPCTargetLowering::LowerFormalArguments_64SVR4 and
PPCTargetLowering::LowerCall_64SVR4 to properly skip empty aggregate
parameters passed by value.  The handling of return values and by-reference
parameters was already correct.

Built on powerpc64-unknown-linux-gnu and tested with no new regressions.
A test case is included to test proper handling of empty aggregate
parameters on both sides of the function call protocol.

llvm-svn: 167090
2012-10-31 01:15:05 +00:00
Manman Ren 6b223a4f06 X86 SSE: update rsqrtss and rcpss to use two source operands and
the first source operand is tied to the destination operand.

This is to accurately model the corresponding instructions where the upper
bits are unmodified.

rdar://12558838
PR14221

llvm-svn: 167064
2012-10-30 23:53:59 +00:00
Manman Ren acb8becc73 X86 MMX: optimize transfer from mmx to i32
We used to generate a store (movq) + a load.
Now we use movd.

rdar://9946746

llvm-svn: 167056
2012-10-30 22:15:38 +00:00
Akira Hatanaka 9c962c02e4 [mips] Allow tail-call optimization for vararg functions and functions which
use the caller's stack.

llvm-svn: 167048
2012-10-30 20:16:31 +00:00
Akira Hatanaka 4866fe14e2 Add code for saving formal argument information to MipsFunctionInfo. This
information will be used by IsEligibleForTailCallOptimization to determine
whether a call can be tail-call optimized.

llvm-svn: 167043
2012-10-30 19:37:25 +00:00
Akira Hatanaka 6233cf565f Add definition of function MipsTargetLowering::passArgOnStack which emits nodes
for passing a function call argument on a stack.

llvm-svn: 167041
2012-10-30 19:23:25 +00:00
Akira Hatanaka 8e50aba5f9 Do not do tail-call optimization if target is mips16.
llvm-svn: 167039
2012-10-30 19:07:58 +00:00
Adhemerval Zanella 5c043aeb1b PowerPC: Expand FSRQT for vector types
This patch expands FSQRT for floating point vector types when altivec is
used.

llvm-svn: 167034
2012-10-30 18:29:42 +00:00
Michael Liao 83a77c3288 Enable ELF machine type to be specified explicitly in X86 backend
llvm-svn: 167027
2012-10-30 17:33:39 +00:00
Quentin Colombet 5799e9f66c Change ForceSizeOpt attribute into MinSize attribute
llvm-svn: 167020
2012-10-30 16:32:52 +00:00
Adhemerval Zanella 56775e0f13 PowerPC: More support for Altivec compare operations
This patch adds more support for vector type comparisons using altivec.
It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector
types for comparison operators ==, !=, >, >=, <, and <=.

llvm-svn: 167015
2012-10-30 13:50:19 +00:00
Hans Wennborg f3254838e4 Use TargetTransformInfo to control switch-to-lookup table transformation
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.

This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.

llvm-svn: 167011
2012-10-30 11:23:25 +00:00
Hal Finkel d0b95b0961 Remove an invalid assert in TargetTransformImpl
getCastInstrCost had an assert prohibiting scalar to vector casts. Such casts,
however, are allowed. This should make the vectorizer buildbot happier.

llvm-svn: 166998
2012-10-30 02:41:57 +00:00
Jim Grosbach 4739f2eb19 ARM: Better disassembly for pc-relative LDR.
When the operand is a plain immediate rather than a label, print it
as [pc, #imm] like we do for the Thumb2 wide encoding variant.

rdar://12154503

llvm-svn: 166991
2012-10-30 01:04:51 +00:00
Reed Kotler a811753716 Change mips16 delay slot jumps to non delay slot forms by default.
We will make them delay slot forms if there is something that can be
placed in the delay slot during a separate pass. Mips16 extended instructions
cannot be placed in delay slots.

llvm-svn: 166990
2012-10-30 00:54:49 +00:00
Jakub Staszak a3d8e9974a Re-commit r166971. I reverted it to quickly, when buildbots didn't have a chance
to test it with chapni's fix (-mattr=+avx).

llvm-svn: 166985
2012-10-30 00:01:57 +00:00
Kevin Enderby 6fd9624843 Fix ARM's b.w instruction for thumb 2 and the encoding T4. The branch target
is 24 bits not 20 and the decoding needed to correctly handle converting the
J1 and J2 bits to their I1 and I2 values to reconstruct the displacement. 

llvm-svn: 166982
2012-10-29 23:27:20 +00:00
Jakub Staszak d74cb61d86 Revert r166971. It causes buildbot failure. To be investigated.
llvm-svn: 166979
2012-10-29 23:13:50 +00:00
Jakub Staszak c3a92131dc Remove unused variable.
llvm-svn: 166973
2012-10-29 22:04:32 +00:00
Jakub Staszak 9c361bdfeb Simplify code. No functionality change.
llvm-svn: 166972
2012-10-29 22:02:26 +00:00
Jakub Staszak c8f4825ba6 Allow to fold vector load if there is more than one bitcast, so in the case:
%0 = load <8 x i16>* %dest
%1 = shufflevector <8 x i16> %0, <8 x i16> %in,
      <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14>
store <8 x i16> %1, <8 x i16>* %dest

We get:
  vmovlpd (%eax), %xmm0, %xmm0

instead of:
  vmovaps (%eax), %xmm1
  vmovsd  %xmm1, %xmm0, %xmm0

No extra test-case is added. I just fixed the existing one
(also it uses FileCheck now).

llvm-svn: 166971
2012-10-29 21:56:35 +00:00
Bill Schmidt bd4ac26973 This patch solves a problem with passing varargs parameters under the PPC64
ELF ABI.

A varargs parameter consisting of a single-precision floating-point value,
or of a single-element aggregate containing a single-precision floating-point
value, must be passed in the low-order (rightmost) four bytes of the
doubleword stack slot reserved for that parameter.  If there are GPR protocol
registers remaining, the parameter must also be mirrored in the low-order
four bytes of the reserved GPR.

Prior to this patch, such parameters were being passed in the high-order
four bytes of the stack slot and the mirrored GPR.

The patch adds a new test case to verify the correct code generation.

llvm-svn: 166968
2012-10-29 21:18:16 +00:00
Reed Kotler 740981e35c Implement patterns for extloadi8 and extloadi16
llvm-svn: 166960
2012-10-29 19:39:04 +00:00
Chad Rosier 1bbaa449ad [ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is
equivalent to [expr1 + expr2].  See test cases for more examples.
rdar://12470392

llvm-svn: 166949
2012-10-29 18:01:54 +00:00
Michael Liao ad0b69fe3e Fix PR14204
- Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled.

llvm-svn: 166947
2012-10-29 17:57:12 +00:00
Joerg Sonnenberger 2b86e48b3a Fix typo
llvm-svn: 166945
2012-10-29 17:56:15 +00:00
Ulrich Weigand 0de4a1e4ae Allow i32/i64 for 'f' constraint on PowerPC.
This fixes PR12757.

llvm-svn: 166943
2012-10-29 17:49:34 +00:00
Hans Wennborg aad8ad1c36 Minor style fixes for TargetTransformationInfo and TargetTransformImpl
llvm-svn: 166936
2012-10-29 16:26:52 +00:00
Reed Kotler aebb8b034c Expand all atomic ops for mips16.
llvm-svn: 166935
2012-10-29 16:16:54 +00:00
NAKAMURA Takumi 4bd79920be PPCSubtarget.h: Add explicit braces.
llvm-svn: 166932
2012-10-29 15:51:42 +00:00
NAKAMURA Takumi 70b25de24e PPCSubtarget.h: Whitespace.
llvm-svn: 166931
2012-10-29 15:51:35 +00:00
Bill Schmidt bbc661e572 This patch adds alignment information for long double to the 64-bit PowerPC
ELF subtarget.

The existing logic is used as a fallback to avoid any changes to the Darwin
ABI.  PPC64 ELF now has two possible data layout strings: one for FreeBSD,
which requires 8-byte alignment, and a default string that requires
16-byte alignment.

I've added a test for PPC64 Linux to verify the 16-byte alignment.  If
somebody wants to add a separate test for FreeBSD, that would be great.

Note that there is a companion patch to update the alignment information
in Clang, which I am committing now as well.

llvm-svn: 166928
2012-10-29 14:59:36 +00:00
Duncan Sands ac8448e0d0 Silence a GCC warning about comparing signed and unsigned types.
llvm-svn: 166922
2012-10-29 11:29:53 +00:00
Nadav Rotem 42f73c8e4d Calling TLI->getNumRegisters creates a circular dependency when building LLVM using cmake.
Get the number of registers by calling getTypeLegalizationCost.

PR14199.

llvm-svn: 166911
2012-10-29 05:28:35 +00:00
Reed Kotler e6c31579be Implement brind operator for mips16.
llvm-svn: 166903
2012-10-28 23:08:07 +00:00
Rafael Espindola d957cb2584 Remove TargetELFWriterInfo.
All the credit goes to Jan Voung for noticing it was dead!

llvm-svn: 166902
2012-10-28 21:34:43 +00:00
Reed Kotler 3589dd74ac This patch is for the implementation of mips16 complex pattern addr16.
Previously mips16 was sharing the pattern addr which is used for mips32
and mips64. This had a number of problems:
1) Storing and loading byte and halfword quantities for mips16 has particular
problems due to the primarily non mips16 nature of SP. When we must
load/store byte/halfword stack objects in a function, we must create a mips16
alias register for SP. This functionality is tested in stchar.ll.
2) We need to have an FP register under certain conditions (such as 
dynamically sized alloca). We use mips16 register S0 for this purpose.
In this case, we also use this register when accessing frame objects so this
issue also affects the complex pattern addr16. This functionality is
tested in alloca16.ll.

The Mips16InstrInfo.td has been updated to use addr16 instead of addr.

The complex pattern C++ function for addr has been copied to addr16 and
updated to reflect the above issues.

llvm-svn: 166897
2012-10-28 06:02:37 +00:00
Quentin Colombet 3ee56a3bf5 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
llvm-svn: 166854
2012-10-27 01:10:17 +00:00
Reed Kotler 7e4d9969cb Implement MipsHi for mips16
llvm-svn: 166852
2012-10-27 00:57:14 +00:00
Akira Hatanaka 6a124a84dc [mips] Do not tail-call optimize vararg functions or functions with byval
arguments.

This is rather conservative and should be fixed later to be more aggressive.

llvm-svn: 166851
2012-10-27 00:56:56 +00:00
Akira Hatanaka 2c07f1f140 [mips] Make sure FuncArg doesn't advance when OrigArgIndex is the same as in the
previous iteration.

llvm-svn: 166850
2012-10-27 00:44:39 +00:00
Akira Hatanaka ac8c669985 Use the methods and classes that were added to simplify LowerCall and
LowerFormalArguments in MipsTargetLowering.

No functionality change intended.

llvm-svn: 166846
2012-10-27 00:29:43 +00:00
Akira Hatanaka 2a13402a66 Add method MipsTargetLowering::writeVarArgRegs which copies argument registers
of vararg functions back to the stack.

llvm-svn: 166844
2012-10-27 00:21:13 +00:00
Akira Hatanaka 35f55b1622 Add method MipsTargetLowering::passByValArg.
This method emits nodes for passing byval arguments in registers and stack.
This has the same functionality as existing functions PassByValArg64 and
WriteByValArg which will be deleted later.

llvm-svn: 166843
2012-10-27 00:16:36 +00:00
Akira Hatanaka 25dad19f0e Add method MipsTargetLowering::copyByValRegs.
This method copies byval arguments passed in registers onto the stack and has
the same functionality as existing functions CopyMips64ByValRegs and
ReadByValArg which will be deleted later.

llvm-svn: 166841
2012-10-27 00:10:18 +00:00
Akira Hatanaka 4a3711d077 Add class MipsCC which provides methods used to analyze formal and call
arguments and inquire about calling convention information.

llvm-svn: 166840
2012-10-26 23:56:38 +00:00
Akira Hatanaka e485c65642 Delete MipsFunctionInfo::InArgFIRange.
llvm-svn: 166837
2012-10-26 23:49:51 +00:00
Nadav Rotem afae78edab Refactor the VectorTargetTransformInfo interface.
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.

Port the LoopVectorizer to the new API.

The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.

llvm-svn: 166836
2012-10-26 23:49:28 +00:00
Jakob Stoklund Olesen 1f06e7f00e Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers."
Keep the integer_insertelement test case, the new coalescer can handle
this kind of lane insertion without help from pseudo-instructions.

llvm-svn: 166835
2012-10-26 23:39:46 +00:00
Kaelyn Uhrain 271fbb6445 Avoid an unused-variable warning when asserts are disabled.
llvm-svn: 166834
2012-10-26 23:28:41 +00:00
Reed Kotler b650f6bbe7 implement mips16 tls global addr
llvm-svn: 166827
2012-10-26 22:57:32 +00:00
Chad Rosier 8e71f7c2d8 [ms-inline asm] Add a comment.
llvm-svn: 166819
2012-10-26 22:01:25 +00:00
Jakob Stoklund Olesen e38018314e 80 col.
llvm-svn: 166818
2012-10-26 21:46:57 +00:00
Jakob Stoklund Olesen 410eae51f1 Remove ARMBaseRegisterInfo::isReservedReg().
It is just as easy to use MRI::isReserved() now.

llvm-svn: 166817
2012-10-26 21:43:05 +00:00
Jakob Stoklund Olesen e46a1046c0 Add GPRPair Register class to ARM.
Some instructions in ARM require 2 even-odd paired GPRs. This
patch adds support for such register class.

Patch by Weiming Zhao!

llvm-svn: 166816
2012-10-26 21:29:15 +00:00
Jakob Stoklund Olesen 09d69f5b0f Remove the canCombineSubRegIndices() target hook.
The new coalescer can already do all of this, so there is no need to
duplicate the efforts.

llvm-svn: 166813
2012-10-26 20:38:19 +00:00
Chad Rosier 5859356d80 [ms-inline asm] Emit an error for unsupported SIZE and LENGTH directives.
Part of rdar://12576868

llvm-svn: 166792
2012-10-26 18:32:44 +00:00
Chad Rosier 11c42f2d2c [ms-inline asm] Add support for the TYPE operator.
Part of rdar://12576868

llvm-svn: 166790
2012-10-26 18:04:20 +00:00
Reed Kotler 4e1c629567 (no commit message)
llvm-svn: 166780
2012-10-26 16:18:19 +00:00
Chad Rosier e2f03771c4 [ms-inline asm] Have the target AsmParser create the asmrewrite for the offsetof
operator.

llvm-svn: 166779
2012-10-26 16:09:20 +00:00
Renato Golin 4dab6a1b7c Better handling of OpcodeToISD using enum/switch.
Patch by Pasi Parviainen <pasi.parviainen@iki.fi>

llvm-svn: 166773
2012-10-26 12:24:52 +00:00
Adhemerval Zanella 0f9cff1ab8 PowerPC: Fix for rldcl/rldicl/rldicr MC emission
This patch fixes the rldcl/rldicl/rldicr instruction emission. The issue is
the MDForm_1 instruction defines the PowerISA MB field from 'rldicl'
with the name MBE, but RLDCL/RLDICL/RLDICR definition uses as 'MB'.

It end up by generatint the 'rldicl' enconding at 
'lib/Target/PowerPC/PPCGenMCCodeEmitter.inc' to use the fourth argument as the
third. The patch changes it by adjusting to use the fourth argument as
intended.

Fixes PR14180.

llvm-svn: 166770
2012-10-26 12:09:58 +00:00
Nicolas Geoffray 4027f238eb Fix CPP backend for method attributes by creating a block where a new AttrBuilder is defined for each attribute.
llvm-svn: 166762
2012-10-26 09:14:38 +00:00
Reed Kotler 287f0449a2 Implement carry for subtract/add for mips16
llvm-svn: 166755
2012-10-26 04:46:26 +00:00
Hal Finkel 9dd045f178 Add VectorTargetTransform::getNumberOfParts.
As discussed on IRC, add VectorTargetTransform::getNumberOfParts
to provide a stable interface to the vector legalization splitting factor.

llvm-svn: 166751
2012-10-26 04:28:02 +00:00
Reed Kotler e47873ab89 implement large (>16 bit) constant loading.
llvm-svn: 166749
2012-10-26 03:09:34 +00:00
Chad Rosier 240b7b963a [ms-inline asm] Perform field lookups with the dot operator.
llvm-svn: 166724
2012-10-25 21:51:10 +00:00
Reed Kotler 097556d6bd implement mips16 patterns for select nodes
llvm-svn: 166721
2012-10-25 21:33:30 +00:00
Chad Rosier f0e8720054 [ms-inline asm] Add support for creating AsmRewrites in the target specific
AsmParser logic.  To be used/tested in a subsequent commit.

llvm-svn: 166714
2012-10-25 20:41:34 +00:00
Nadav Rotem 8b749b2364 Minor cleanups.
llvm-svn: 166706
2012-10-25 18:17:48 +00:00
Chad Rosier 911c1f38b0 [ms-inline asm] Add error handling to the ParseIntelDotOperator() function.
llvm-svn: 166698
2012-10-25 17:37:43 +00:00
Adhemerval Zanella 1be10dc732 This patch fixes the MC object emission of 'nop' for external function calls
and also fixes the R_PPC64_TOC16 and R_PPC64_TOC16_DS relocation offset.
The 'nop' is needed so a restore TOC instruction (ld r2,40(r1)) can be placed
by the linker to correct restore the TOC of previous function.

Current code has two issues: it defines in PPCInstr64Bit.td file a LDinto_toc
and LDtoc_restore as a DSForm_1 with DS_RA=0 where it should be
DS=2 (the 8 bytes displacement of the TOC saving). It also wrongly emits a
MC intruction using an uint32_t value while the PPC::BL8_NOP_ELF
and PPC::BLA8_NOP_ELF are both uint64_t (because of the following 'nop').

This patch corrects the remaining ExecutionEngine using MCJIT:

ExecutionEngine/2002-12-16-ArgTest.ll
ExecutionEngine/2003-05-07-ArgumentTest.ll
ExecutionEngine/2005-12-02-TailCallBug.ll
ExecutionEngine/hello.ll
ExecutionEngine/hello2.ll
ExecutionEngine/test-call.ll

llvm-svn: 166682
2012-10-25 14:29:13 +00:00
Bill Schmidt 6ed3b99f43 This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.

llvm-svn: 166680
2012-10-25 13:38:09 +00:00
Adhemerval Zanella f2aceda854 Initial TOC support for PowerPC64 object creation
This patch adds initial PPC64 TOC MC object creation using the small mcmodel
(a single 64K TOC) adding the some TOC relocations (R_PPC64_TOC,
R_PPC64_TOC16, and R_PPC64_TOC16DS).

The addition of 'undefinedExplicitRelSym' hook on 'MCELFObjectTargetWriter'
is meant to avoid the creation of an unreferenced ".TOC." symbol (used in
the .odp creation) as well to set the R_PPC64_TOC relocation target as the
temporary ".TOC." symbol. On PPC64 ABI, the R_PPC64_TOC relocation should
not point to any symbol.

llvm-svn: 166677
2012-10-25 12:27:42 +00:00
Michael Liao c6696b04db Atom has SIMD instruction set extension up to SSSE3
llvm-svn: 166665
2012-10-25 07:06:48 +00:00
Michael Liao 6d810bd9b8 Clean up where SlotSize should be used instead of pointer size.
llvm-svn: 166664
2012-10-25 06:29:14 +00:00
Nadav Rotem 4a87683a41 Implement a basic cost model for vector and scalar instructions.
llvm-svn: 166642
2012-10-24 23:47:38 +00:00
Chad Rosier 5dcb4664f2 [ms-inline asm] Add support for parsing the '.' operator. Given,
[register].field

The operator returns the value at the location pointed to by register plus the
offset of field within its structure or union.  This patch only handles
immediate fields (i.e., [eax].4).  The original displacement has to be a
MCConstantExpr as well.
Part of rdar://12470415 and rdar://12470514

llvm-svn: 166632
2012-10-24 22:21:50 +00:00
Chad Rosier 6844ea09fa Tidy up. No functional change intended.
llvm-svn: 166630
2012-10-24 22:13:37 +00:00
Evan Cheng 59ed7d45a6 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385

llvm-svn: 166613
2012-10-24 19:53:01 +00:00
Micah Villmow bf3eeb2dfc Add some cleanup to the DataLayout changes requested by Chandler.
llvm-svn: 166607
2012-10-24 18:36:13 +00:00
Nadav Rotem 2289f2c932 Implement a basic VectorTargetTransformInfo interface to be used by the loop and bb vectorizers for modeling the cost of instructions.
llvm-svn: 166593
2012-10-24 17:22:41 +00:00
Chad Rosier 91c8266200 [ms-inline asm] Create a register operand, rather than a memory operand when we
see the offsetof operator.  Previously, we were matching something like MOVrm
in the front-end and later matching MOVrr in the back-end.  This change makes
things more consistent.  It also fixes cases where we can't match against a 
memory operand as the source (test cases coming).
Part of rdar://12470317

llvm-svn: 166592
2012-10-24 17:22:29 +00:00
Micah Villmow 12d9127833 Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
2012-10-24 15:52:52 +00:00
Elena Demikhovsky d6afb03bc9 Special calling conventions for Intel OpenCL built-in library.
llvm-svn: 166566
2012-10-24 14:46:16 +00:00
Michael Liao c5af149e70 Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.

llvm-svn: 166545
2012-10-24 04:09:32 +00:00
Akira Hatanaka 868b3a333b [mips] Make sure sret argument is returned in register V0.
llvm-svn: 166539
2012-10-24 02:10:54 +00:00
Rafael Espindola 4e6e537314 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

llvm-svn: 166536
2012-10-24 01:58:48 +00:00
Chad Rosier a623524487 [ms-inline asm] Offset operator - the size should be based on the size of a
pointer, not the size of the variable.
Part of rdar://12470317

llvm-svn: 166526
2012-10-23 23:42:06 +00:00
Chad Rosier eac2b2003e [ms-inline asm] Clean up comment.
llvm-svn: 166525
2012-10-23 23:34:28 +00:00
Chad Rosier 146310a1c1 [ms-inline asm] When parsing inline assembly we set the base register to a
non-zero value as we don't know the actual value at this point.  This is
necessary to get the matching correct in some cases.  However, the actual value
set as the base register doesn't matter, since we're just matching not emitting.

llvm-svn: 166523
2012-10-23 23:31:33 +00:00
Kevin Enderby dccdac6a06 Make branch heavy code for generating marked up disassembly simpler
and easier to read by adding a couple helper functions.  Suggestion by
Chandler Carruth and seconded by Meador Inge!

llvm-svn: 166515
2012-10-23 22:52:52 +00:00
Michael Liao 2843625bb5 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.

llvm-svn: 166504
2012-10-23 21:40:15 +00:00
Matt Beaumont-Gay bdcebd323a Silence -Wsign-compare
llvm-svn: 166494
2012-10-23 19:46:36 +00:00
Chad Rosier 37e755cee2 [ms-inline asm] Add an implementation of the offset operator. This is a follow
on patch to r166433.
rdar://12470317

llvm-svn: 166488
2012-10-23 17:43:43 +00:00
Michael Liao c03c03d56e Add custom UINT_TO_FP from v4i8/v4i16/v8i8/v8i16 to v4f32/v8f32
- Replace v4i8/v8i8 -> v8f32 DAG combine with custom lowering to reduce
  DAG combine overhead.
- Extend the support to v4i16/v8i16 as well.

llvm-svn: 166487
2012-10-23 17:36:08 +00:00
Michael Liao 1be96bb5ce Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Bill Schmidt 57d6de5fd9 This is another TLC patch for separating code for the Darwin and ELF ABIs
for the PowerPC target, and factoring the results.  This will ease future
maintenance of both subtargets.

PPCTargetLowering::LowerCall_Darwin_Or_64SVR4() has grown a lot of special-case
code for the different ABIs, making maintenance difficult.  This is getting
worse as we repair errors in the 64-bit ELF ABI implementation, while avoiding
changes to the Darwin ABI logic.  This patch splits the routine into
LowerCall_Darwin() and LowerCall_64SVR4(), allowing both versions to be
significantly simplified.  I've factored out chunks of similar code where it
made sense to do so.  I also performed similar factoring on
LowerFormalArguments_Darwin() and LowerFormalArguments_64SVR4().

There are no functional changes in this patch, and therefore no new test
cases have been developed.

Built and tested on powerpc64-unknown-linux-gnu with no new regressions.

llvm-svn: 166480
2012-10-23 15:51:16 +00:00
Reed Kotler 164bb37c7b implement setXX patterns
llvm-svn: 166459
2012-10-23 01:35:48 +00:00
Bill Wendling 12cda50f1f When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>

llvm-svn: 166448
2012-10-22 23:30:04 +00:00
Kevin Enderby 62183c4e18 Add support for annotated disassembly output for X86 and arm.
Per the October 12, 2012 Proposal for annotated disassembly output sent out by
Jim Grosbach this set of changes implements this for X86 and arm.  The llvm-mc
tool now has a -mdis option to produced the marked up disassembly and a couple
of small example test cases have been added.

rdar://11764962

llvm-svn: 166445
2012-10-22 22:31:46 +00:00
Chad Rosier 5bca3f9b8e [ms-inline asm] Add the isOffsetOf() function.
Part of rdar://12470317

llvm-svn: 166436
2012-10-22 19:50:35 +00:00
Chad Rosier c14ed95da4 [ms-inline asm] Add support for parsing the offset operator. Callback for
CodeGen in the front-end not implemented yet.
rdar://12470317

llvm-svn: 166433
2012-10-22 19:42:52 +00:00
Chad Rosier f1f6a72901 [ms-inline asm] Reset the opcode prior to parsing a statement.
llvm-svn: 166349
2012-10-19 22:57:33 +00:00
Akira Hatanaka 0c7d131a7b [mips] Use 64-bit registers to return an sret pointer if target ABI is N64.
llvm-svn: 166344
2012-10-19 22:11:40 +00:00
Akira Hatanaka 90131ac26c [mips] Add code to do tail call optimization.
Currently, it is enabled only if option "enable-mips-tail-calls" is given and
all of the callee's arguments are passed in registers.

llvm-svn: 166342
2012-10-19 21:47:33 +00:00
Akira Hatanaka 59a32e91f9 [mips] Fix TAILCALL's operand node type.
llvm-svn: 166341
2012-10-19 21:30:15 +00:00
Akira Hatanaka c046243488 [mips] Delete MipsFunctionInfo::MaxCallFrameSize which is no longer used.
llvm-svn: 166339
2012-10-19 21:18:38 +00:00
Akira Hatanaka d03d68a3ba [mips] Add tail call instructions.
llvm-svn: 166338
2012-10-19 21:14:34 +00:00
Akira Hatanaka 0c5e357d87 [mips] Make the branch nodes used in jump instructions a template parameter.
llvm-svn: 166337
2012-10-19 21:11:03 +00:00
Akira Hatanaka 91318df0cc Add node and enum for mips tail call.
llvm-svn: 166318
2012-10-19 20:59:39 +00:00
Chad Rosier 0f48c55e70 [ms-inline asm] Have the TargetParser callback to Sema to determine the size of
a memory operand.  Retain this information and then add the sizing directives
to the IR.  This allows the backend to do proper instruction selection.

llvm-svn: 166316
2012-10-19 20:57:14 +00:00
Shuxin Yang cdde059a34 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 

llvm-svn: 166300
2012-10-19 20:11:16 +00:00
Michael Liao 4b7ccfcaad Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
  sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
  INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
  (so far 1) elements being inserted.

llvm-svn: 166288
2012-10-19 17:15:18 +00:00
Stepan Dyatkovskiy dab8043048 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.

llvm-svn: 166273
2012-10-19 08:23:06 +00:00
Nadav Rotem 5dc203e8f4 Reapply the TargerTransformInfo changes, minus the changes to LSR and Lowerinvoke.
llvm-svn: 166248
2012-10-18 23:22:48 +00:00
Kevin Enderby b23926d395 Fix a bug where a 32-bit address with the high bit does not get symbolicated
because the value is incorrectly being signed extended when passed to
SymbolLookUp().

llvm-svn: 166234
2012-10-18 21:49:18 +00:00
Ulrich Weigand d34b5bd610 This patch fixes failures in the SingleSource/Regression/C/uint64_to_float
test case on PowerPC caused by rounding errors when converting from a 64-bit
integer to a single-precision floating point. The reason for this are
double-rounding effects, since on PowerPC we have to convert to an
intermediate double-precision value first, which gets rounded to the
final single-precision result.

The patch fixes the problem by preparing the 64-bit integer so that the
first conversion step to double-precision will always be exact, and the
final rounding step will result in the correctly-rounded single-precision
result.  The generated code sequence is equivalent to what GCC would generate.

When -enable-unsafe-fp-math is in effect, that extra effort is omitted
and we accept possible rounding errors (just like GCC does as well).

llvm-svn: 166178
2012-10-18 13:16:11 +00:00
Bob Wilson d6d9ccca38 Temporarily revert the TargetTransform changes.
The TargetTransform changes are breaking LTO bootstraps of clang.  I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.

This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741

llvm-svn: 166168
2012-10-18 05:43:52 +00:00
Reed Kotler 6743924a32 Add conditional branch instructions and their patterns.
llvm-svn: 166134
2012-10-17 22:29:54 +00:00
Jakob Stoklund Olesen 0736442683 Merge MRI::isPhysRegOrOverlapUsed() into isPhysRegUsed().
All callers of these functions really want the isPhysRegOrOverlapUsed()
functionality which also checks aliases. For historical reasons, targets
without register aliases were calling isPhysRegUsed() instead.

Change isPhysRegUsed() to also check aliases, and switch all
isPhysRegOrOverlapUsed() callers to isPhysRegUsed().

llvm-svn: 166117
2012-10-17 18:44:18 +00:00
Jakob Stoklund Olesen a10c09804d Check for empty YMM use-def lists in X86VZeroUpper.
The previous MRI.isPhysRegUsed(YMM0) would also return true when the
function contains a call to a function that may clobber YMM0. That's
most of them.

Checking the use-def chains allows us to skip functions that don't
explicitly mention YMM registers.

llvm-svn: 166110
2012-10-17 17:52:35 +00:00
Anton Korobeynikov 0a69176ce0 Fix fallout from RegInfo => FrameLowering refactoring on MSP430.
Patch by Job Noorman!

llvm-svn: 166108
2012-10-17 17:37:11 +00:00
Michael Liao cef9541dac Check SSSE3 instead of SSE4.1
- All shuffle insns required, especially PSHUB, are added in SSSE3.

llvm-svn: 166086
2012-10-17 03:59:18 +00:00
Michael Liao 6f7206132f Fix setjmp on models with non-Small code model nor non-Static relocation model
- MBB address is only valid as an immediate value in Small & Static
  code/relocation models. On other models, LEA is needed to load IP address of
  the restore MBB.
- A minor fix of MBB in MC lowering is added as well to enable target
  relocation flag being propagated into MC.

llvm-svn: 166084
2012-10-17 02:22:27 +00:00
Michael Liao 02ca34541e Support v8f32 to v8i8/vi816 conversion through custom lowering
- Add custom FP_TO_SINT on v8i16 (and v8i8 which is legalized as v8i16 due to
  vector element-wise widening) to reduce DAG combiner and its overhead added
  in X86 backend.

llvm-svn: 166036
2012-10-16 18:14:11 +00:00
Bill Schmidt 48081cad0d This patch addresses PR13949.
For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8
bytes are to be passed in the low-order bits ("right-adjusted") of the
doubleword register or memory slot assigned to them.  A previous patch
addressed this for aggregates passed in registers.  However, small
aggregates passed in the overflow portion of the parameter save area are
still being passed left-adjusted.

The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the
caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on
the callee side.  The main fix on the callee side simply extends
existing logic for 1- and 2-byte objects to 1- through 7-byte objects,
and correcting a constant left over from 32-bit code.  There is also a
fix to a bogus calculation of the offset to the following argument in
the parameter save area.

On the caller side, again a constant left over from 32-bit code is
fixed.  Additionally, some code for 1, 2, and 4-byte objects is
duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only.  The
LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to
handle both ABIs, and I propose to separate this into two functions in a
future patch, at which time the duplication can be removed.

The patch adds a new test (structsinmem.ll) to demonstrate correct
passing of structures of all seven sizes.  Eight dummy parameters are
used to force these structures to be in the overflow portion of the
parameter save area.

As a side effect, this corrects the case when aggregates passed in
registers are saved into the first eight doublewords of the parameter
save area:  Previously they were stored left-justified, and now are
properly stored right-justified.  This requires changing the expected
output of existing test case structsinregs.ll.

llvm-svn: 166022
2012-10-16 13:30:53 +00:00
Stepan Dyatkovskiy e59a920b0c Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 

llvm-svn: 166018
2012-10-16 07:16:47 +00:00
NAKAMURA Takumi 1705a999fa Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".

Original message since r165661:

My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the *ORIGINAL* condition code.

llvm-svn: 166017
2012-10-16 06:28:34 +00:00
Craig Topper 2a3f77585f Move X86MCInstLower class definition into implementation file. It's not needed outside.
llvm-svn: 166014
2012-10-16 06:01:50 +00:00
Bill Wendling 4f69e1483b Pass in the context to the Attributes::get method.
llvm-svn: 166007
2012-10-16 05:20:51 +00:00
Michael Liao 97bf363a9e Add __builtin_setjmp/_longjmp supprt in X86 backend
- Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also
  used as a light-weight replacement of setjmp/longjmp which are used to
  implementation continuation, user-level threading, and etc. The support added
  in this patch ONLY addresses this usage and is NOT intended to support SjLj
  exception handling as zero-cost DWARF exception handling is used by default
  in X86.

llvm-svn: 165989
2012-10-15 22:39:43 +00:00
Jim Grosbach 54c7432e22 ARM: v1i64 and v2i64 VBSL intrinsic support.
rdar://12502028

llvm-svn: 165981
2012-10-15 21:23:40 +00:00
Bill Wendling 50d27849f6 Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change.
llvm-svn: 165960
2012-10-15 20:35:56 +00:00
Chad Rosier f3bc599680 [ms-inline asm] If we parsed a statement and the opcode is valid, then it's an instruction.
llvm-svn: 165955
2012-10-15 19:08:18 +00:00
Chad Rosier 499d4a1468 [ms-inline asm] Update the end loc for ParseIntelMemOperand.
llvm-svn: 165947
2012-10-15 17:26:38 +00:00
Chad Rosier ca0ada1aeb [ms-inline asm] Use incoming argument rather than hard coding to false.
llvm-svn: 165945
2012-10-15 16:50:34 +00:00
Micah Villmow 4bb926d91d Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.
llvm-svn: 165941
2012-10-15 16:24:29 +00:00
Adhemerval Zanella ef206f19a4 PowerPC: add EmitTCEntry class for TOC creation
This patch replaces the EmitRawText by a EmitTCEntry class (specialized for
each Streamer) in PowerPC64 TOC entry creation.

llvm-svn: 165940
2012-10-15 15:43:14 +00:00
Silviu Baranga b14097000b Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
llvm-svn: 165929
2012-10-15 09:41:32 +00:00
Bill Wendling d079a446d7 Attributes Rewrite
Convert the internal representation of the Attributes class into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object. The
Attributes class then becomes a thin wrapper around this opaque
object. Eventually, the internal representation will be expanded to include
attributes that represent code generation options, etc.

llvm-svn: 165917
2012-10-15 04:46:55 +00:00
Bill Wendling 9e1eb4d1b9 Don't pass in an Attributes object to something that expects an integral value.
llvm-svn: 165887
2012-10-14 03:27:15 +00:00
Benjamin Kramer 35480284e7 X86: Disable long nops for all cpus prior to pentiumpro/i686.
llvm-svn: 165878
2012-10-13 17:28:35 +00:00
Benjamin Kramer ecd15d7f6c X86: Fix accidentally swapped operands.
llvm-svn: 165871
2012-10-13 12:50:19 +00:00
Benjamin Kramer d6b9362fc2 X86: Promote i8 cmov when both operands are coming from truncates of the same width.
X86 doesn't have i8 cmovs so isel would emit a branch. Emitting branches at this
level is often not a good idea because it's too late for many optimizations to
kick in. This solution doesn't add any extensions (truncs are free) and tries
to avoid introducing partial register stalls by filtering direct copyfromregs.

I'm seeing a ~10% speedup on reading a random .png file with libpng15 via
graphicsmagick on x86_64/westmere, but YMMV depending on the microarchitecture.

llvm-svn: 165868
2012-10-13 10:39:49 +00:00
Chad Rosier 4996355592 [ms-inline asm] Remove the MatchInstruction() function. Previously, this was
the interface between the front-end and the MC layer when parsing inline
assembly.  Unfortunately, this is too deep into the parsing stack. Specifically,
we're unable to handle target-independent assembly (i.e., assembly directives,
labels, etc.).  Note the MatchAndEmitInstruction() isn't the correct
abstraction either.  I'll be exposing target-independent hooks shortly, so this
is really just a cleanup.

llvm-svn: 165858
2012-10-13 00:26:04 +00:00
Manman Ren 7e48b252e7 ARM: tail-call inside a function where part of a byval argument is on caller's
local frame causes problem.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.

The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.

rdar://12442472

llvm-svn: 165853
2012-10-12 23:39:43 +00:00
Chad Rosier 4453e8453e [ms-inline asm] Capitalize per coding standard.
llvm-svn: 165847
2012-10-12 23:09:25 +00:00
Jim Grosbach 30af442a84 ARM: Mark VSELECT as 'expand'.
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

llvm-svn: 165845
2012-10-12 22:59:21 +00:00
Chad Rosier 2f480a8a50 [ms-inline asm] Use the new API introduced in r165830 in lieu of the
MapAndConstraints vector.  Also remove the unused Kind argument.

llvm-svn: 165833
2012-10-12 22:53:36 +00:00
Reed Kotler cf11c59e2f Div, Rem int/unsigned int
llvm-svn: 165783
2012-10-12 02:01:09 +00:00
Sean Silva 506a1c5a58 Remove unnecessary classof()'s
isa<> et al. automatically infer when the cast is an upcast (including a
self-cast), so these are no longer necessary.

llvm-svn: 165767
2012-10-11 23:30:49 +00:00
Micah Villmow 0c61134d8d Revert 165732 for further review.
llvm-svn: 165747
2012-10-11 21:27:41 +00:00
Micah Villmow 083189730e Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.
llvm-svn: 165726
2012-10-11 17:21:41 +00:00
Bill Schmidt 22162470ba This patch addresses PR13947.
For function calls on the 64-bit PowerPC SVR4 target, each parameter
is mapped to as many doublewords in the parameter save area as
necessary to hold the parameter.  The first 13 non-varargs
floating-point values are passed in registers; any additional
floating-point parameters are passed in the parameter save area.  A
single-precision floating-point parameter (32 bits) must be mapped to
the second (rightmost, low-order) word of its assigned doubleword
slot.

Currently LLVM violates this ABI requirement by mapping such a
parameter to the first (leftmost, high-order) word of its assigned
doubleword slot.  This is internally self-consistent but will not
interoperate correctly with libraries compiled with an ABI-compliant
compiler.

This patch corrects the problem by adjusting the parameter addressing
on both sides of the calling convention.

llvm-svn: 165714
2012-10-11 15:38:20 +00:00
David Chisnall 6a00ab4b5e Expose move to/from coprocessor instructions in MIPS64 mode.
Note: [D]M{T,F}CP2 is just a recommended encoding.  Vendors often provide a
custom CP2 that interprets instructions differently and may wish to add their
own instructions that use this opcode.  We should ensure that this is easy to
do.  I will probably add a 'has custom CP{0-3}' subtarget flag to make this
easy: We want to avoid the GCC situation where every MIPS vendor makes a custom
fork that breaks every other MIPS CPU and so can't be merged upstream.

llvm-svn: 165711
2012-10-11 10:21:34 +00:00
NAKAMURA Takumi da0730c2d7 Revert r165661, "Patch by Shuxin Yang <shuxin.llvm@gmail.com>."
It broke stage2 clang and test-suite/MultiSource/Benchmarks/mediabench/g721/g721encode.

llvm-svn: 165692
2012-10-11 02:02:05 +00:00
Evan Cheng 60a25a571e Change MachineInstrBuilder::addDisp to copy over target flags by default.
llvm-svn: 165677
2012-10-11 00:15:48 +00:00
Evan Cheng 0f2565fd35 Add isel patterns for v2f32 / v4f32 neon.vbsl intrinsics. rdar://12471808
llvm-svn: 165673
2012-10-10 23:06:34 +00:00
Nadav Rotem ffd3bf4124 Add getters for the MIPS TargetTransform classes
llvm-svn: 165670
2012-10-10 22:45:53 +00:00
David Blaikie e249f18e5d Remove unused member variable introduced in r165665.
llvm-svn: 165669
2012-10-10 22:38:21 +00:00
Nadav Rotem e10328737d Add a new interface to allow IR-level passes to access codegen-specific information.
llvm-svn: 165665
2012-10-10 22:04:55 +00:00
Nadav Rotem 17418964f8 Patch by Shuxin Yang <shuxin.llvm@gmail.com>.
Original message:

The attached is the fix to radar://11663049. The optimization can be outlined by following rules:

   (select (x != c), e, c) -> select (x != c), e, x),
   (select (x == c), c, e) -> select (x == c), x, e)
where the <c> is an integer constant.

 The reason for this change is that : on x86, conditional-move-from-constant needs two instructions;
however, conditional-move-from-register need only one instruction.

  While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase.

  The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource".

llvm-svn: 165661
2012-10-10 21:31:55 +00:00
Bill Schmidt b9bc47409d When generating spill and reload code for vector registers on PowerPC,
the compiler makes use of GPR0.  However, there are two flavors of
GPR0 defined by the target:  the 32-bit GPR0 (R0) and the 64-bit GPR0
(X0).  The spill/reload code makes use of R0 regardless of whether we
are generating 32- or 64-bit code.

This patch corrects the problem in the obvious manner, using X0 and
ADDI8 for 64-bit and R0 and ADDI for 32-bit.

llvm-svn: 165658
2012-10-10 21:25:01 +00:00
Bill Schmidt 38d9458720 The PowerPC VRSAVE register has been somewhat of an odd beast since
the Altivec extensions were introduced.  Its use is optional, and
allows the compiler to communicate to the operating system which
vector registers should be saved and restored during a context switch.
In practice, this information is ignored by the various operating
systems using the SVR4 ABI; the kernel saves and restores the entire
register state.  Setting the VRSAVE register is no longer performed by
the AIX XL compilers, the IBM i compilers, or by GCC on Power Linux
systems.  It seems best to avoid this logic within LLVM as well.

This patch avoids generating code to update and restore VRSAVE for the
PowerPC SVR4 ABIs (32- and 64-bit).  The code remains in place for the
Darwin ABI.

llvm-svn: 165656
2012-10-10 20:54:15 +00:00
Michael Liao e999b865dd Add support for FP_ROUND from v2f64 to v2f32
- Due to the current matching vector elements constraints in
  ISD::FP_ROUND, rounding from v2f64 to v4f32 (after legalization from
  v2f32) is scalarized. Add a customized v2f32 widening to convert it
  into a target-specific X86ISD::VFPROUND to work around this
  constraints.

llvm-svn: 165631
2012-10-10 16:53:28 +00:00
Michael Liao effae0c8e1 Add alternative support for FP_ROUND from v2f32 to v2f64
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
  rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
  to convert it into a target-specific X86ISD::VFPEXT to work around this
  constraints. This patch also reverts a previous attempt to fix this issue by
  recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
  reduces the overhead of supporting non-power-2 vector FP extend.

llvm-svn: 165625
2012-10-10 16:32:15 +00:00
Stepan Dyatkovskiy 283baa0027 Fix for LDRB instruction:
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.

7 ops is needed, but SDNode with only 6 is created.

In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.

The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.

llvm-svn: 165617
2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy f13dbb8e24 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 

llvm-svn: 165616
2012-10-10 11:37:36 +00:00
Bill Wendling bbcdf4e2a5 Remove the final bits of Attributes being declared in the Attribute
namespace. Use the attribute's enum value instead. No functionality change
intended.

llvm-svn: 165610
2012-10-10 07:36:45 +00:00
Andrew Trick dd79f0fcea misched: Use the TargetSchedModel interface wherever possible.
Allows the new machine model to be used for NumMicroOps and OutputLatency.

Allows the HazardRecognizer to be disabled along with itineraries.

llvm-svn: 165603
2012-10-10 05:43:09 +00:00
Andrew Trick d9296ec2b6 whitespace
llvm-svn: 165601
2012-10-10 05:43:01 +00:00
Reed Kotler 0f2e44a1cb Reorder some parts of the td file to by in alphabetical order
llvm-svn: 165590
2012-10-10 01:58:16 +00:00
Akira Hatanaka 9c8dcfc73a Implement MipsTargetLowering::CanLowerReturn.
Patch by Sasa Stankovic. 

llvm-svn: 165585
2012-10-10 01:27:09 +00:00