Commit Graph

22043 Commits

Author SHA1 Message Date
Tim Northover f66181530f Implement NEON domain switching for scalar <-> S-register vmovs on ARM
llvm-svn: 162094
2012-08-17 11:32:52 +00:00
Craig Topper 31625574db Use nested switch to select arguments to reduce calls to EmitPCMP.
llvm-svn: 162089
2012-08-17 07:15:56 +00:00
Craig Topper 602e1abe0d Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to reduce to only a single call to it thus allowing it to be inlined by the compiler.
llvm-svn: 162088
2012-08-17 06:55:11 +00:00
Craig Topper f6add7e667 Remove unnecessary include of ARMGenInstrInfo.inc.
llvm-svn: 162086
2012-08-17 06:21:09 +00:00
Jakob Stoklund Olesen 0ea1fce6b4 Add ADD and SUB to the predicable ARM instructions.
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

llvm-svn: 162061
2012-08-16 23:21:55 +00:00
Jakob Stoklund Olesen c19bf0282d Handle ARM MOVCC optimization in PeepholeOptimizer.
Use the target independent select analysis hooks.

llvm-svn: 162060
2012-08-16 23:14:20 +00:00
Roman Divacky 2039a987c4 Revert r162034, r162035 and r162037.
llvm-svn: 162039
2012-08-16 19:07:59 +00:00
Roman Divacky 9d38fc8ddc Define and handle additional fixup kinds. By Adhemerval Zanella.
llvm-svn: 162037
2012-08-16 18:37:52 +00:00
Roman Divacky 1faf5b07c6 Fix typo and grammar. By Adhemerval Zanella.
llvm-svn: 162032
2012-08-16 18:19:29 +00:00
Jush Lu 26088cb30e [arm-fast-isel] Add support for fastcc.
Without fastcc support, the caller just falls through to CallingConv::C
for fastcc, but callee still uses fastcc, this inconsistency of calling
convention is a problem, and fastcc support can fix it.

llvm-svn: 162013
2012-08-16 05:15:53 +00:00
Anitha Boyapati af3e98347f Patch to enable FMA on bdver2 target. Make XOP feature enable FMA4 as well.
llvm-svn: 162012
2012-08-16 04:04:02 +00:00
Anitha Boyapati 426feb61b9 (no commit message)
llvm-svn: 162010
2012-08-16 03:50:04 +00:00
Akira Hatanaka 89d50b3957 Add Android ABI to Mips backend to handle functions returning vectors of four
floats.

llvm-svn: 162008
2012-08-16 03:48:05 +00:00
Jakob Stoklund Olesen 6cb96120f1 Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

llvm-svn: 161994
2012-08-15 22:16:39 +00:00
Evan Cheng eec6bc6270 Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029
llvm-svn: 161962
2012-08-15 17:44:53 +00:00
Jakob Stoklund Olesen 2ec0c41e01 Add missing Rfalse operand to the predicated pseudo-instructions.
When predicating this instruction:

  Rd = ADD Rn, Rm

We need an extra operand to represent the value given to Rd when the
predicate is false:

  Rd = ADDCC Rfalse, Rn, Rm, pred

The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.

Previously, Rd and Rn were tied, but that is not required.

Compare to MOVCC:

  Rd = MOVCC Rfalse, Rtrue, pred

llvm-svn: 161955
2012-08-15 16:17:24 +00:00
Anton Korobeynikov c6d945b11a The names of VFP variants of half-to-float conversion instructions were
reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
NEON variants of same instructions are fine.

llvm-svn: 161907
2012-08-14 23:36:01 +00:00
Eric Christopher 5f61a7498b This needs braces. Spotted by Bill.
llvm-svn: 161906
2012-08-14 23:32:15 +00:00
Michael Liao 06f6fe875a minor fix of X86ISD::VSEXT_MOVL dump
llvm-svn: 161902
2012-08-14 22:53:17 +00:00
Michael Liao 34107b9177 fix PR11334
- FP_EXTEND only support extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.

llvm-svn: 161894
2012-08-14 21:24:47 +00:00
Jim Grosbach ecaef49f59 Switch the fixed-length disassembler to be table-driven.
Refactor the TableGen'erated fixed length disassemblmer to use a
table-driven state machine rather than a massive set of nested
switch() statements.

As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:

Time to compile at -O2 (averaged w/ hot caches):
  Previous: 35.5s
  New:       8.9s

TEXT size:
  Previous: 447,251
  New:      297,661

Builds in 25% of the time previously required and generates code 66% of
the size.

Execution time of the disassembler is only slightly slower (7% disassembling
10 million ARM instructions, 19.6s vs 21.0s). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.

llvm-svn: 161888
2012-08-14 19:06:05 +00:00
Craig Topper 925a281b00 Factor duplicate calls to getUNDEF in several functions.
llvm-svn: 161860
2012-08-14 08:18:43 +00:00
Craig Topper d0d4b11f66 Re-factor intrinsic lowering to combine common parts of similar intrinsics. Reduces compiled code size a little bit.
llvm-svn: 161859
2012-08-14 07:43:25 +00:00
Jakob Stoklund Olesen 702bcc3bcf Remove the TII::scheduleTwoAddrSource() hook.
It never does anything when running 'make check', and it get's in the
way of updating live intervals in 2-addr.

The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.

When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.

llvm-svn: 161794
2012-08-13 21:52:57 +00:00
Manman Ren d6c8270eaa ARM: enable struct byval for AAPCS-VFP.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 161789
2012-08-13 21:22:50 +00:00
Arnold Schwaighofer 0bb7f23cfc [Hexagon] Don't mark callee saved registers as clobbered by a tail call
This was causing unnecessary spills/restores of callee saved registers.

Fixes PR13572.

Patch by Pranav Bhandarkar!

llvm-svn: 161778
2012-08-13 19:54:01 +00:00
Nadav Rotem 3a94c545cf Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user.
rdar://11876519

llvm-svn: 161775
2012-08-13 18:52:44 +00:00
Manman Ren 959acb106b X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.

PR13576

llvm-svn: 161769
2012-08-13 18:29:41 +00:00
Eric Christopher 7d8b53c1f8 Add support for the %H output modifier.
Patch by Weiming Zhao.

llvm-svn: 161768
2012-08-13 18:18:52 +00:00
Manman Ren e90e94f117 X86: when auto-detecting the subtarget features, make sure use IsIntel to detect
Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.

llvm-svn: 161763
2012-08-13 17:26:46 +00:00
Tim Northover 5aaa7fde94 Use correct loads for vector types during extending-load operations.
Previously, we used VLD1.32 in all cases, however there are both 16 and 64-bit
accesses being selected, so we need to use an appropriate width load in those
cases.

llvm-svn: 161748
2012-08-13 09:06:31 +00:00
Craig Topper 4e5eb72735 Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and putting an a couple if conditions in a better order.
llvm-svn: 161746
2012-08-13 03:42:38 +00:00
Craig Topper 5145a0d967 Refactor code a bit to share commonalities. No functional change intended.
llvm-svn: 161745
2012-08-13 02:34:03 +00:00
Craig Topper ff6e4d1928 Fix an unused variable warning from r161742.
llvm-svn: 161743
2012-08-13 01:26:45 +00:00
Craig Topper a7aaa62d54 Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result.
llvm-svn: 161742
2012-08-13 01:23:55 +00:00
Craig Topper 3d2b271362 Remove call to setOperationAction for SETCC of v4f32. SETCC returns an integer type not an FP type.
llvm-svn: 161738
2012-08-12 05:31:32 +00:00
Craig Topper 498228d089 Remove unnecessary call to setOperationAction for SETCC of v2i64 under SSE42. It was already called for the same under SSE2.
llvm-svn: 161737
2012-08-12 05:15:16 +00:00
Arnold Schwaighofer b73da9453c Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM
architecture

It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.

llvm-svn: 161736
2012-08-12 05:11:56 +00:00
Craig Topper 4fa625fda7 Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed.
llvm-svn: 161735
2012-08-12 03:16:37 +00:00
Craig Topper 10a8bf3b8c Make replace many calls to getSizeInBits() with is128BitVector/is256BitVector
llvm-svn: 161734
2012-08-12 02:23:29 +00:00
Craig Topper 03d2787275 Use MVT.isXBitVector instead of EVT.isXBitVector when setting up operation actions. Compiles to smaller code.
llvm-svn: 161733
2012-08-12 00:34:56 +00:00
Michael Liao e7e828fd64 fix PR13577, an issue introduced by r161687
- FCMOV only supports a subset of X86 conditions. Skip boolean
  simplification if X86 condition is not valid for FCMOV.
- add a minimal test case for PR13577.

llvm-svn: 161732
2012-08-11 23:47:06 +00:00
Craig Topper b5bcf58ba1 Move setOperationAction for CONCAT_VECTORS for 256-bit vectors into loop since all 256-bit types are supported.
llvm-svn: 161730
2012-08-11 22:34:26 +00:00
Craig Topper 490c45c06c Tidy up indentation. No functional change.
llvm-svn: 161727
2012-08-11 17:53:00 +00:00
Craig Topper 55406d9f78 Fix a cast that was casting away 'const' unnecessarily
llvm-svn: 161726
2012-08-11 17:46:16 +00:00
Craig Topper 22cb0c572b Add a couple default: llvm_unreachable() to some switch statements. Fix a bad message in an existing llvm_unreachable.
llvm-svn: 161725
2012-08-11 17:44:14 +00:00
Manman Ren 1acb6707cd X86: when we are auto-detecting the subtarget features, make sure we turn on
FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge.

FeatureFastUAMem is already on if we pass in nehalem or westmere as a command
argument.

rdar: 7252306
llvm-svn: 161717
2012-08-10 23:43:32 +00:00
Manman Ren e201e27eb1 ARM: enable struct byval for AAPCS.
This change is to be enabled in clang.

rdar://9877866
PR://13350

llvm-svn: 161693
2012-08-10 20:39:38 +00:00
Michael Liao 5248e9913f add X86-specific DAG optimization to simplify boolean test
- if a boolean test (X86ISD::CMP or X86ISD:SUB) checks a boolean value
  generated from X86ISD::SETCC, try to simplify the boolean value
  generation and checking by reusing the original EFLAGS with proper
  condition code
- add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places
  consuming EFLAGS

part of patches fixing PR12312

llvm-svn: 161687
2012-08-10 19:58:13 +00:00
Michael Liao ea7d906b0f remove tailing whitespaces and test commit
llvm-svn: 161664
2012-08-10 14:39:24 +00:00
Joerg Sonnenberger aa2f801ca3 Add some missing includes for the build against stdcxx.
llvm-svn: 161657
2012-08-10 10:53:56 +00:00
Eric Christopher 6ac277ce91 Remove getARMRegisterNumbering and replace with calls into
the register info for getEncodingValue. This builds on the
small patch of yesterday to set HWEncoding in the register
file.

One (deprecated) use was turned into a hard number to avoid
needing register info in the old JIT.

llvm-svn: 161628
2012-08-09 22:10:21 +00:00
Jakob Stoklund Olesen 4238a89db8 Don't modify MO while use_iterator is still pointing to it.
llvm-svn: 161626
2012-08-09 22:08:24 +00:00
Chad Rosier 9cb988f3aa [ms-inline asm] Extend the MC AsmParser API to match MCInsts (but not emit).
This new API will be used by clang to parse ms-style inline asms.

One goal of this project is to use this style of inline asm for targets other
then x86.  Therefore, this API needs to be implemented for non-x86 targets at
some point in the future.

llvm-svn: 161624
2012-08-09 22:04:55 +00:00
Jack Carter 120a30a732 Another 32 to 64 bit sign extension bug.
The fields in the td definition were switched.

llvm-svn: 161607
2012-08-09 19:43:18 +00:00
Arnold Schwaighofer 81b2eec1ab Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 161581
2012-08-09 15:25:52 +00:00
Eric Christopher 245f9b5552 This field isn't used anymore, use it with HWEncoding instead.
llvm-svn: 161564
2012-08-09 01:39:32 +00:00
Jakob Stoklund Olesen 978c1280a5 Don't use getNextOperandForReg().
This way of using getNextOperandForReg() was unlikely to work as
intended. We don't give any guarantees about the order of operands in
the use-def chains, so looking only at operands following a given
operand in the chain doesn't make sense.

llvm-svn: 161542
2012-08-08 23:44:04 +00:00
Andrew Trick 352abc19a5 Added MispredictPenalty to SchedMachineModel.
This replaces an existing subtarget hook on ARM and allows standard
CodeGen passes to potentially use the property.

llvm-svn: 161471
2012-08-08 02:44:16 +00:00
Andrew Trick 207c569cf3 whitespace
llvm-svn: 161469
2012-08-08 02:44:08 +00:00
Manman Ren 1be131ba27 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled
by peephole now.

rdar://11873276

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen 3b9a442841 Don't scan physreg use-def chains looking for a PIC base.
We can't rematerialize a PIC base after register allocation anyway, and
scanning physreg use-def chains is very expensive in a function with
many calls.

<rdar://problem/12047515>

llvm-svn: 161461
2012-08-08 00:40:47 +00:00
Evan Cheng fbdd25c135 X86 cmp lowering is looking past truncate on the condition node. It should only
do so when the high bits are known zero. This caused a subtle miscompilation.

rdar://12027825 

llvm-svn: 161451
2012-08-07 22:21:00 +00:00
Hal Finkel 895a5f5d12 Add a comment about mftb vs. mfspr on PPC.
Thanks to Alex Rosenberg for the suggestion.

llvm-svn: 161428
2012-08-07 17:04:20 +00:00
Bill Wendling 0acd0c0a69 Revert r161371. Removing the 'const' before Type is a "good thing".
--- Reverse-merging r161371 into '.':
U    include/llvm/Target/TargetData.h
U    lib/Target/TargetData.cpp

llvm-svn: 161394
2012-08-07 05:51:59 +00:00
Jack Carter f4946cfbb9 The define for 64 bit sign extension neglected to
initialize fields of the class that it used.

The result was nonsense code.

Before:
0000000000000000 <foo>:
   0:    00441100     0x441100
   4:    03e00008     jr    ra
   8:    00000000     nop

After:
0000000000000000 <foo>:
   0:    00041000     sll    v0,a0,0x0
   4:    03e00008     jr    ra
   8:    00000000     nop 

llvm-svn: 161377
2012-08-07 00:35:22 +00:00
Bill Wendling 654cd4aaee Constify the Type parameter to some methods (which are const anyway).
llvm-svn: 161371
2012-08-07 00:26:35 +00:00
Andrew Trick e0c83b1f3b Allow x86 subtargets to use the GenericModel defined in X86Schedule.td.
This allows codegen passes to query properties like
InstrItins->SchedModel->IssueWidth. It also ensure's that
computeOperandLatency returns the X86 defaults for loads and "high
latency ops". This should have no significant impact on existing
schedulers because X86 defaults happen to be the same as global
defaults.

llvm-svn: 161370
2012-08-07 00:25:30 +00:00
Jack Carter 4c58381c3a Mips relocation R_MIPS_64 relocates a 64 bit double word.
I hit this in a very large program (spirit.cpp), but 
have not figured out how to make a small make check
test for it.

llvm-svn: 161366
2012-08-07 00:01:14 +00:00
Jack Carter 612c66314c The Mips64InstrInfo.td definitions DynAlloc64 LEA_ADDiu64
were using a class defined for 32 bit instructions and 
thus the instruction was for addiu instead of daddiu.

This was corrected by adding the instruction opcode as a 
field in the  base class to be filled in by the defs.

llvm-svn: 161359
2012-08-06 23:29:06 +00:00
Jack Carter 84491abb20 Mips relocations R_MIPS_HIGHER and R_MIPS_HIGHEST.
These 2 relocations gain access to the 
highest and the second highest 16 bits
of a 64 bit object.

R_MIPS_HIGHER %higher(A+S)
The %higher(x) function is [ (((long long) x + 0x80008000LL) >> 32) & 0xffff ]. 

R_MIPS_HIGHEST %highest(A+S)
The %highest(x) function is [ (((long long) x + 0x800080008000LL) >> 48) & 0xffff ]. 

llvm-svn: 161348
2012-08-06 21:26:03 +00:00
Hal Finkel 33e529d56b MFTB on PPC64 should really be encoded using MFSPR.
The MFTB instruction itself is being phased out, and its functionality
is provided by MFSPR. According to the ISA docs, using MFSPR works on all known
chips except for the 601 (which did not have a timebase register anyway)
and the POWER3.

Thanks to Adhemerval Zanella for pointing this out!

llvm-svn: 161346
2012-08-06 21:21:44 +00:00
Eric Christopher 22738d00a3 Add support for the OpenBSD for Bitrig.
Patch by David Hill.

llvm-svn: 161344
2012-08-06 20:52:18 +00:00
Roman Divacky 7d6e08560b Remove empty overrides of processFunctionBeforeFrameFinalized().
llvm-svn: 161328
2012-08-06 18:14:18 +00:00
Craig Topper ab47fe4e16 Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Craig Topper 6d0408d3a5 Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern.
llvm-svn: 161306
2012-08-05 00:36:57 +00:00
Craig Topper 43ee9fae92 Use a COPY node instead of an explicit MOVA opcode in the custom insterter for pcmpestrm/pcmpistrm. Allows the register allocator to handle it better and prevent wasted identity moves.
llvm-svn: 161305
2012-08-05 00:17:48 +00:00
Hal Finkel 70381a7b18 Add readcyclecounter lowering on PPC64.
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.

llvm-svn: 161302
2012-08-04 14:10:46 +00:00
Anton Korobeynikov ef731edf53 Skip impdef regs during eabi save/restore list emission to workaround PR11902
llvm-svn: 161301
2012-08-04 13:25:58 +00:00
Anton Korobeynikov 3a4fdfeceb Recognize vst1.64 / vld1.64 with 3 and 4 regs as load from / store to stack stuff
(this corresponds by spilling/reloading regs in DTriple / DQuad reg classes).
No testcase, found by inspection.

llvm-svn: 161300
2012-08-04 13:22:14 +00:00
Anton Korobeynikov 218aaf6d04 Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377

llvm-svn: 161299
2012-08-04 13:16:12 +00:00
Akira Hatanaka 22bec282e9 1. Redo mips16 instructions to avoid multiple opcodes for same instruction.
Change these to patterns.
2. Add another 16 instructions.

Patch by Reed Kotler.

llvm-svn: 161272
2012-08-03 22:57:02 +00:00
Gabor Greif a1529b6ca4 allow 'make CPPFLAGS=<something>' work again
this makes this hack a bit more bearable
for poor souls who need to pass custom
preprocessor flags to the build process

llvm-svn: 161240
2012-08-03 13:31:24 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Bob Wilson c740e3f0d1 Add new getLibFunc method to TargetLibraryInfo.
This just provides a way to look up a LibFunc::Func enum value for a
function name.  Alphabetize the enums and function names so we can use a
binary search.

llvm-svn: 161231
2012-08-03 04:06:22 +00:00
Jush Lu 4705da9020 [arm-fast-isel] Add support for shl, lshr, and ashr.
llvm-svn: 161230
2012-08-03 02:37:48 +00:00
Eric Christopher b3322364e4 Add support for the ARM GHC calling convention, this patch was in 3.0,
but somehow managed to be dropped later.

Patch by Karel Gardas.

llvm-svn: 161226
2012-08-03 00:05:53 +00:00
Jim Grosbach 5d6d015969 ARM: Tidy up. Remove unused template parameters.
llvm-svn: 161222
2012-08-02 22:08:27 +00:00
Jim Grosbach b79c33ef55 ARM: More InstAlias refactors to use #NAME#.
llvm-svn: 161220
2012-08-02 21:59:52 +00:00
Jim Grosbach 6d27ad62a8 ARM: Refactor instaliases using TableGen support for #NAME#.
Now that TableGen supports references to NAME w/o it being explicitly
referenced in the definition's own name, use that to simplify
assembly InstAlias definitions in multiclasses.

llvm-svn: 161218
2012-08-02 21:50:41 +00:00
Manman Ren ba8122cc25 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207
2012-08-02 19:37:32 +00:00
Akira Hatanaka fab8929459 Move the code that creates instances of MipsInstrInfo and MipsFrameLowering out
of MipsTargetMachine.cpp.

llvm-svn: 161191
2012-08-02 18:21:47 +00:00
Akira Hatanaka fffad897f2 Set transient stack alignment in constructor of MipsFrameLowering and re-enable
test o32_cc_vararg.ll.

llvm-svn: 161189
2012-08-02 18:15:13 +00:00
Jiangning Liu fa18005a4c Support fpv4 for ARM Cortex-M4.
llvm-svn: 161163
2012-08-02 08:35:55 +00:00
Jiangning Liu 6a43bf7d74 Fix #13035, a bug around Thumb instruction LDRD/STRD with negative #0 offset index issue.
llvm-svn: 161162
2012-08-02 08:29:50 +00:00
Jiangning Liu 288e1af8c8 Fix #13138, a bug around ARM instruction DSB encoding and decoding issue.
llvm-svn: 161161
2012-08-02 08:21:27 +00:00
Jiangning Liu 10dd40e42d Fix #13241, a bug around shift immediate operand for ARM instruction ADR.
llvm-svn: 161159
2012-08-02 08:13:13 +00:00
Manman Ren 5759d01230 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Manman Ren 4059145396 X86: mark GATHER instructios as mayLoad
llvm-svn: 161143
2012-08-01 23:28:59 +00:00
Jim Grosbach 8724d0fd99 ARM: Remove redundant instalias.
llvm-svn: 161134
2012-08-01 20:33:05 +00:00
Jim Grosbach 96e8a8dc6d Clean up formatting.
llvm-svn: 161133
2012-08-01 20:33:02 +00:00
Jim Grosbach b437a8c5d5 Tidy up.
llvm-svn: 161132
2012-08-01 20:33:00 +00:00
Chad Rosier 24c19d20c0 Whitespace.
llvm-svn: 161122
2012-08-01 18:39:17 +00:00
Elena Demikhovsky 3cb3b0045c Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Craig Topper b8aec08819 Add more indirection to the disassembler tables to reduce amount of space used to store the operand types and encodings. Store only the unique combinations in a separate table and store indices in the instruction table. Saves about 32K of static data.
llvm-svn: 161101
2012-08-01 07:39:18 +00:00
Akira Hatanaka 0820f0ca8e Implement MipsJITInfo::replaceMachineCodeForFunction.
No new test case is added.
This patch makes test JITTest.FunctionIsRecompiledAndRelinked pass on mips
platform.

Patch by Petar Jovanovic.

llvm-svn: 161098
2012-08-01 02:29:24 +00:00
Akira Hatanaka 4a240b0f9d Remove unused variable.
llvm-svn: 161095
2012-08-01 00:37:53 +00:00
Akira Hatanaka 88d76cfd7a Implement MipsSERegisterInfo::eliminateCallFramePseudoInstr. The function emits
instructions that decrement and increment the stack pointer before and after a
call when the function does not have a reserved call frame.

llvm-svn: 161093
2012-07-31 23:52:55 +00:00
Akira Hatanaka cb37e13fa7 Add definitions of two subclasses of MipsRegisterInfo, Mips16RegisterInfo and
MipsSERegisterInfo.

llvm-svn: 161092
2012-07-31 23:41:32 +00:00
Akira Hatanaka d1c43cee24 Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and
MipsSEFrameLowering.

Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be
reserved if there is a call with a large call frame or there are variable sized
objects on the stack.

llvm-svn: 161090
2012-07-31 22:50:19 +00:00
Akira Hatanaka 2c64adf672 Add Mips16InstrInfo.cpp and MipsSEInstrInfo.cpp to CMakeLists.txt.
llvm-svn: 161083
2012-07-31 22:11:05 +00:00
Akira Hatanaka b7fa3c9db0 Add definitions of two subclasses of MipsInstrInfo, MipsInstrInfo (for mips16),
and MipsSEInstrInfo (for mips32/64).

llvm-svn: 161081
2012-07-31 21:49:49 +00:00
Akira Hatanaka 30651805c4 Delete mips64 target machine classes. mips target machines can be used in place
of them.

llvm-svn: 161080
2012-07-31 21:39:17 +00:00
Akira Hatanaka 02de0e4425 Let PEI::calculateFrameObjectOffsets compute the final stack size rather than
computing it in MipsFrameLowering::emitPrologue.

llvm-svn: 161078
2012-07-31 21:28:49 +00:00
Akira Hatanaka 33a25af5a8 Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering.
The frame object which points to the dynamically allocated area will not be
needed after changes are made to cease reserving call frames.

llvm-svn: 161076
2012-07-31 20:54:48 +00:00
Akira Hatanaka a66d676b20 Define ADJCALLSTACKDOWN/UP nodes. These nodes are emitted regardless of whether
or not it is in mips16 mode. Define MipsPseudo (mode-independant pseudo) and
PseudoSE (mips32/64 pseudo) classes.

llvm-svn: 161071
2012-07-31 19:13:07 +00:00
Akira Hatanaka 3a810eda91 Change name of class MipsInst to InstSE to distinguish it from mips16's
instruction class. SE stands for standard encoding.

llvm-svn: 161069
2012-07-31 18:55:01 +00:00
Akira Hatanaka beda2241a4 When store nodes or memcpy nodes are created to copy the function call
arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and
integer offset operands rather than frame object operands.

llvm-svn: 161068
2012-07-31 18:46:41 +00:00
Chad Rosier 710be7df71 [x86 frame lowering] In 32-bit mode, use ESI as the base pointer.
Previously, we were using EBX, but PIC requires the GOT to be in EBX before 
function calls via PLT GOT pointer.

llvm-svn: 161066
2012-07-31 18:29:21 +00:00
Akira Hatanaka 4ce7c4060d Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as
single-precision load and store.

Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect
to map unaligned floating point load/store nodes to these instructions.

llvm-svn: 161063
2012-07-31 18:16:49 +00:00
Craig Topper c2efce404e Make INSTRUCTION_SPECIFIER_FIELDS match X86DisassemblerCommon.h. Also remove trailing whitespace.
llvm-svn: 161029
2012-07-31 05:18:26 +00:00
Craig Topper fb39f97d4c Tidy up trailing whitespace
llvm-svn: 161027
2012-07-31 04:58:05 +00:00
Craig Topper 5f33d90214 Tidy up trailing whitespace
llvm-svn: 161026
2012-07-31 04:38:27 +00:00
Kevin Enderby 5c490f1b8f Fix a bug in ARMMachObjectWriter::RecordRelocation() in ARMMachObjectWriter.cpp
where the other_half of the movt and movw relocation entries needs to get set
and only with the 16 bits of the other half.

rdar://10038370

llvm-svn: 160978
2012-07-30 18:46:15 +00:00
Craig Topper efd97044a3 Mark MOVZX16/MOVSX16 as neverHasSideEffects/mayLoad
llvm-svn: 160953
2012-07-30 07:14:07 +00:00
Craig Topper c6b7ef61f4 Mark MOVZX32_NOREX as isCodeGenOnly and neverHasSideEffects. The isCodeGenOnly change allows special detection of _NOREX instructions to be removed from tablegen disassembler code.
llvm-svn: 160951
2012-07-30 06:48:11 +00:00
Craig Topper 14eac5dda8 Give VCVTTPD2DQ priority over CVTTPD2DQ.
llvm-svn: 160942
2012-07-30 02:20:32 +00:00
Craig Topper f881d385da Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1.
llvm-svn: 160941
2012-07-30 02:14:02 +00:00
Craig Topper 415b3586d0 Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern.
llvm-svn: 160939
2012-07-30 01:38:57 +00:00
Craig Topper 28402efcb6 Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns.
llvm-svn: 160938
2012-07-29 23:26:34 +00:00
Craig Topper b6767f3acd Move more SSE/AVX convert instruction patterns into their definitions.
llvm-svn: 160937
2012-07-29 22:30:06 +00:00
Manman Ren f87dd7c01b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Craig Topper fc93281c07 Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions.
llvm-svn: 160922
2012-07-28 18:59:19 +00:00
Craig Topper 024797b9a2 Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects.
llvm-svn: 160921
2012-07-28 18:36:39 +00:00
Manman Ren 0fa3ab88ba X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Craig Topper 44f9b5343d Make CVTSS2SI instruction definition consistent with CVTSD2SI.
llvm-svn: 160914
2012-07-28 08:28:23 +00:00
Craig Topper 1c1aef07b8 Fix up memory load types for SSE scalar convert intrinsic patterns.
llvm-svn: 160913
2012-07-28 07:59:59 +00:00
Manman Ren 32367c063b X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.

llvm-svn: 160912
2012-07-28 03:15:46 +00:00
Akira Hatanaka 97ba7696f8 Pass the correct call frame size to callseq_start node. This is needed to
replace uses of function getMaxCallFrameSize defined in MipsFunctionInfo with
the one MachineFrameInfo has.

llvm-svn: 160841
2012-07-26 23:27:01 +00:00
Jakob Stoklund Olesen 7cd08536c2 Remove the X86 sub_ss and sub_sd sub-register indexes completely.
llvm-svn: 160833
2012-07-26 23:07:20 +00:00
Jakob Stoklund Olesen 77cd55b4ee Remove the last mentions of sub_ss and sub_sd from patterns.
I'll remove these two sub-register indexes shortly.

llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen b96d0b4e08 Eliminate sub_ss, sub_sd from broadcast patterns.
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).

llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen 206b825f5c Eliminate more sub_ss / sub_sd patterns.
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.

llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen 75d17b0577 Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.

llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen ceee4a9d0c Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Craig Topper c7690ac7ac Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Akira Hatanaka 64626fc20f Fix call setup for PIC.
Patch by Reed Kotler.

llvm-svn: 160774
2012-07-26 02:24:43 +00:00
Jim Grosbach 6df755cc4e ARM: Don't assume an SDNode is a constant.
Before accessing a node as a ConstandSDNode, make sure it actually is one.
No testcase of non-trivial size.

rdar://11948669

llvm-svn: 160735
2012-07-25 17:02:47 +00:00
Nuno Lopes 89702e94b5 make all Emit*() functions consult the TargetLibraryInfo information before creating a call to a library function.
Update all clients to pass the TLI information around.
Previous draft reviewed by Eli.

llvm-svn: 160733
2012-07-25 16:46:31 +00:00
Rafael Espindola 73173c55c2 Fix typos. Thanks to Matt Beaumont-Gay for noticing it.
llvm-svn: 160731
2012-07-25 15:42:45 +00:00
Rafael Espindola 11c38b9657 When a return struct pointer is passed in registers, the called has nothing
to pop.

llvm-svn: 160725
2012-07-25 13:41:10 +00:00
Rafael Espindola 2caee7f4d2 Factor a long list of conditions into a predicate function. No functionality
change.

llvm-svn: 160724
2012-07-25 13:35:45 +00:00
Akira Hatanaka 5a69c235ae Eliminate the stack slot used to save the global base register.
The long branch pass (fixed in r160601) no longer uses the global base register
to compute addresses of branch destinations, so it is not necessary to reserve
a slot on the stack.

llvm-svn: 160703
2012-07-25 03:16:47 +00:00
Kevin Enderby 216ac31971 Fix a bug in the x86 disassembler's symbolic disassembly support for Jcc-Jump
if Condition Is Met instuctions that was not correctly determining the target
instruction.

So for a jne rel32 instruction:

% cat x.s
.byte 0x0f, 0x85, 0x09, 0x00, 0x00, 0x00
% as x.s

it was incorrectly deterining the target:

% otool -q -tv a.out 
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xd

and with the fix it gets this correct as:

% otool -q -tv a.out
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xf

rdar://11505997

llvm-svn: 160694
2012-07-24 21:40:01 +00:00
Nuno Lopes 342cf787ef add a few more functions to TargetLibraryInfo:
fputc, memchr, memcmp, putchar, puts, strchr, strncmp

llvm-svn: 160690
2012-07-24 21:00:36 +00:00
David Chisnall 5b8c1680de ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.

llvm-svn: 160687
2012-07-24 20:04:16 +00:00
Nuno Lopes 20f5a7aeb7 TargetLibraryInfo: add strn?cat, strn?cpy, and strn?len
llvm-svn: 160678
2012-07-24 17:25:06 +00:00
Akira Hatanaka 45da9e2653 Fix function MipsCodeEmitter::emitExternalSymbolAddress to pass test
ExecutionEngine/test-fp.ll.
 
Patch by Petar Jovanovic.

llvm-svn: 160653
2012-07-24 00:08:26 +00:00
Akira Hatanaka 26e9ecb7a3 Add basic ability to setup call frame, and make procedure calls.
Hello world will compile and execute with this patch.

Patch by Reed Kotler.

llvm-svn: 160651
2012-07-23 23:45:54 +00:00
Akira Hatanaka adec58c091 Add comment for relocations MO_HIGHER and HIGHEST in MipsBaseInfo.h.
llvm-svn: 160636
2012-07-23 19:19:20 +00:00
Micah Villmow 9eedce1e7c Test revert of test changes.
llvm-svn: 160632
2012-07-23 16:42:45 +00:00
Micah Villmow 780c24f19c Test commit.
llvm-svn: 160631
2012-07-23 16:37:24 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Akira Hatanaka f72efdb62f Fix Mips long branch pass.
This pass no longer requires that the global pointer value be saved to the
stack or register since it uses bal instruction to compute branch distance.

llvm-svn: 160601
2012-07-21 03:30:44 +00:00
Akira Hatanaka 6035fe78c7 Add HIGHER and HIGHEST relocations to Mips backend.
llvm-svn: 160599
2012-07-21 03:09:04 +00:00
Akira Hatanaka b49c68a65d Revert accidental commit.
llvm-svn: 160598
2012-07-21 02:20:33 +00:00
Akira Hatanaka f73e362758 Add VK_Mips_HIGHER and VK_Mips_HIGHEST to MCSymbolRefExpr::VariantKind.
Test case will be added later when long branch patch is checked in.

llvm-svn: 160597
2012-07-21 02:15:19 +00:00
Craig Topper 0b94e46ce3 Don't use implicit register operands to calculate L-bit for AVX instructions. Needed because super reg defs and kills are added as implicit operands on 128-bit instructions. Fixes PR13349. Patch by Jose Fonseca.
llvm-svn: 160543
2012-07-20 07:03:46 +00:00
Preston Gurd 8e082688a1 Adds the family codes for the Midview Atom processors so that the
Atom buildbot will auto-detect Atom.

llvm-svn: 160521
2012-07-19 19:05:37 +00:00
Sebastian Pop 221e07e140 default to use -mv4 when no version of Hexagon has been specified
This fixes a bunch of make check failures of the form:

Unknown Architecture Version.
UNREACHABLE executed at ../lib/Target/Hexagon/HexagonSubtarget.cpp:60!

llvm-svn: 160518
2012-07-19 18:24:50 +00:00
Jush Lu e67e07b901 [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 160500
2012-07-19 09:49:00 +00:00
Bill Wendling 723444e767 Remove tabs.
llvm-svn: 160483
2012-07-19 00:25:04 +00:00
Bill Wendling 318f03f56f Remove tabs.
llvm-svn: 160479
2012-07-19 00:15:11 +00:00
Bill Wendling ea6397f67b Remove tabs.
llvm-svn: 160477
2012-07-19 00:11:40 +00:00
Bill Wendling 2b07965042 Remove tabs.
llvm-svn: 160476
2012-07-19 00:06:06 +00:00
Manman Ren d0a4ee8427 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129

llvm-svn: 160454
2012-07-18 21:40:01 +00:00
Preston Gurd f0a48ec8f1 This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!

llvm-svn: 160451
2012-07-18 20:49:17 +00:00
Andrew Trick a22cdb713b Fix ARMTargetLowering::isLegalAddImmediate to consider thumb encodings.
Based on Evan's suggestion without a commitable test.

llvm-svn: 160441
2012-07-18 18:34:27 +00:00
Andrew Trick bc325168c3 whitespace
llvm-svn: 160440
2012-07-18 18:34:24 +00:00
Nadav Rotem 4c12245b3a The vbroadcast family of instructions has 'fallback patterns' in case where the
load source operand is used by multiple nodes. The v2i64 broadcast was emulated
by shuffling the two lower i32 elements to the upper two.
We had a bug in the immediate used for the broadcast.
Replacing 0 to 0x44.
0x44 means [01|00|01|00] which corresponds to the correct lane.

Patch by Michael Kuperstein.

llvm-svn: 160430
2012-07-18 08:14:48 +00:00
Jack Carter a62ba82825 Mips specific inline asm operand modifier 'M':
Print the high order register of a double word register operand.

In 32 bit mode, a 64 bit double word integer will be represented
by 2 32 bit registers. This modifier causes the high order register
to be used in the asm expression. It is useful if you are using 
doubles in assembler and continue to control register to variable
relationships.

This patch also fixes a related bug in a previous patch:

    case 'D': // Second part of a double word register operand
    case 'L': // Low order register of a double word register operand
    case 'M': // High order register of a double word register operand

I got 'D' and 'M' confused. The second part of a double word operand
will only match 'M' for one of the endianesses. I had 'L' and 'D'
be the opposite twins when 'L' and 'M' are.

llvm-svn: 160429
2012-07-18 06:41:36 +00:00
Craig Topper 6bf3ed454a Remove tab characters.
llvm-svn: 160425
2012-07-18 04:59:16 +00:00
Craig Topper 8532423268 Fix typo in error message and remove some tab characters.
llvm-svn: 160423
2012-07-18 04:36:35 +00:00
Craig Topper 01deb5f2df Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Joel Jones b84f7bea09 More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160410
2012-07-18 00:02:16 +00:00
Akira Hatanaka f640f040d1 Clean up Mips16InstrFormats.td and Mips16InstrInfo.td.
Patch by Reed Kotler.

llvm-svn: 160403
2012-07-17 22:55:34 +00:00
Evan Cheng e6a3b03ee0 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Evan Cheng 780f9b5f92 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng f579beca6d This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Evan Cheng 75315b877c For something like
uint32_t hi(uint64_t res)
{
        uint_32t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimize away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The resulting x86
code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set. i.e. X >  0x0ffffffff -> (X >> 32) >= 1

rdar://11866926

llvm-svn: 160312
2012-07-16 19:35:43 +00:00
Tom Stellard 1be1aa84ec Revert "AMDGPU: Add core backend files for R600/SI codegen v6"
This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea.

llvm-svn: 160303
2012-07-16 18:19:53 +00:00
Tom Stellard 95bd0be903 Revert "Build script changes for R600/SI Codegen v6"
This reverts commit e3013202259ed1e006c21817c63cf25d75982721.

llvm-svn: 160301
2012-07-16 18:19:46 +00:00
Tom Stellard 151dc338e4 Revert "Target/AMDGPU/R600KernelParameters.cpp: Fix two includes, <llvm/IRBuilder.h> and <llvm/TypeBuilder.h>"
This reverts commit 0258a6bdd30802f5cc0e8e57c8e768fde2aef590.

llvm-svn: 160299
2012-07-16 18:19:41 +00:00
Tom Stellard 1bd3012505 Revert "Target/AMDGPU: [CMake] Fix dependencies. 1) Add intrinsics_gen. Add AMDGPUCommonTableGen."
This reverts commit ebc934ba32ee71abbb8f0f2eb6a0fbaa613ba0d2.

llvm-svn: 160298
2012-07-16 18:19:40 +00:00
Tom Stellard 781853e11f Revert "Target/AMDGPU/R600KernelParameters.cpp: Don't use "and", "or" as conditional operator..."
This reverts commit 29f28bc14ad5a907f5dc849f004fafeec0aab33a.

llvm-svn: 160297
2012-07-16 18:19:38 +00:00
Tom Stellard 2e007de42d Revert "Target/AMDGPU/AMDILIntrinsicInfo.cpp: Use llvm_unreachable() in nonreturn function, instead of assert(0)."
This reverts commit 4ba4acc1bc2561b944a571edbb6a2dc78e357dfe.

llvm-svn: 160296
2012-07-16 18:19:37 +00:00
Tom Stellard f65e78b2fa Revert "Target/AMDGPU: Fix includes, or msvc build failed."
This reverts commit fef4aa1b16fcf7a472559abbbcf4c1adc9eb5ca6.

llvm-svn: 160295
2012-07-16 18:19:32 +00:00
Chad Rosier 10e8207c9e With r160248 in place this code is no longer needed.
llvm-svn: 160293
2012-07-16 17:42:13 +00:00
NAKAMURA Takumi 96cc5e5bf9 Target/AMDGPU: Fix includes, or msvc build failed.
llvm-svn: 160280
2012-07-16 15:43:50 +00:00
NAKAMURA Takumi dc4261794f Target/AMDGPU/AMDILIntrinsicInfo.cpp: Use llvm_unreachable() in nonreturn function, instead of assert(0).
llvm-svn: 160279
2012-07-16 15:43:09 +00:00