Commit Graph

101845 Commits

Author SHA1 Message Date
Jim Grosbach 4a1a9ce5e6 ARM: cortex-m0 doesn't support unaligned memory access.
Unlike other v6+ processors, cortex-m0 never supports unaligned accesses.
From the v6m ARM ARM:

"A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned
access occurs."

rdar://16491560

llvm-svn: 205452
2014-04-02 19:28:13 +00:00
Jim Grosbach df1e05bb8a Make some range based loop types more explicit.
No functional change, but more readable code.

llvm-svn: 205451
2014-04-02 19:28:08 +00:00
Kai Nacke 13673ac704 [mips] Add more Octeon cnMips instructions
Adds the instructions ext/ext32/cins/cins32.
It also changes pop/dpop to accept the two operand version and
adds a simple pattern to generate baddu.
Tests for the two operand versions (including baddu/dmul/dpop/pop)
and the code generation pattern for baddu are included.

Reviewed by: Daniel.Sanders@imgtec.com

llvm-svn: 205449
2014-04-02 18:40:43 +00:00
Jim Grosbach 20b0790df7 [C++11,ARM64] Range based for and explicit 'override' in STP cleanup.
No functional change intended.

llvm-svn: 205446
2014-04-02 18:00:59 +00:00
Jim Grosbach 05abd709f3 [C++11,ARM64] Range based for loops in constant promotion.
No functional change intended.

llvm-svn: 205445
2014-04-02 18:00:56 +00:00
Jim Grosbach 7dc9edeaa5 [C++11,ARM64] Range based for loops in load/store pair optimizer.
No functional change intended.

llvm-svn: 205444
2014-04-02 18:00:53 +00:00
Jim Grosbach 020e657790 [C++11,ARM64] Range based for loops in target lowering.
No functional change intended.

llvm-svn: 205443
2014-04-02 18:00:51 +00:00
Jim Grosbach 91f1f47751 [C++11,ARM64] Range based for loops in frame lowering.
No functional change intended.

llvm-svn: 205442
2014-04-02 18:00:49 +00:00
Jim Grosbach f39d752b03 [C++11,ARM64] Range based for loops in pseudo expansion.
No functional change intended.

llvm-svn: 205441
2014-04-02 18:00:46 +00:00
Jim Grosbach 673825ebac [C++11,ARM64] Range based for loops for LOH
No functional change intended.

llvm-svn: 205440
2014-04-02 18:00:44 +00:00
Jim Grosbach 2539c3d07a [C++11,ARM64] Range based for loops TLS cleanup.
No functional change intended.

llvm-svn: 205439
2014-04-02 18:00:41 +00:00
Jim Grosbach 0d0c5a614a [C++11,ARM64] Range based for loops in branch relaxation.
No functional change intended.

llvm-svn: 205438
2014-04-02 18:00:39 +00:00
Jim Grosbach 1c762ca9bd [C++11,ARM64] Range based for loops in address type promotion.
No functional change intended.

llvm-svn: 205437
2014-04-02 18:00:36 +00:00
Quentin Colombet 7bf9d8cd13 [ARM64][CollectLOH] Remove the link to the radar from the comments.
llvm-svn: 205435
2014-04-02 16:40:49 +00:00
Simon Atanasyan 3ee21b014b [yaml2obj][ELF] Convert some static functions into class members to
reduce number of arguments.

No functional changes.

llvm-svn: 205434
2014-04-02 16:34:54 +00:00
Simon Atanasyan ee7776d681 [yaml2obj][ELF] Remove unused typedef.
No functional changes.

llvm-svn: 205433
2014-04-02 16:34:48 +00:00
Simon Atanasyan 220c54a0e3 [yaml2obj][ELF] Move section index to the ELFState class.
No functional changes.

llvm-svn: 205432
2014-04-02 16:34:40 +00:00
Simon Atanasyan ee882239cb [yaml2obj][ELF] Remove relationship between ELFState
and ContiguousBlobAccumulator classes. Pass ContiguousBlobAccumulator to
the handleSymtabSectionHeader function directly.

No functional changes.

llvm-svn: 205431
2014-04-02 16:34:34 +00:00
Oliver Stannard b14c625111 ARM: Add support for segmented stacks
Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.

llvm-svn: 205430
2014-04-02 16:10:33 +00:00
Adrian Prantl a731cf0018 clarify comment
llvm-svn: 205429
2014-04-02 15:49:45 +00:00
Adrian Prantl f79621e440 fix a comment to use ASCII aprostrophes.
llvm-svn: 205428
2014-04-02 15:49:37 +00:00
Tim Northover 6d69168ffd ARM64: use GOT for weak symbols & PIC.
Weak symbols cannot use the small code model's usual ADRP sequences since the
instruction simply may not be able to encode a value of 0.

This redirects them to use the GOT, which hopefully linkers are able to cope
with even in the static relocation model.

llvm-svn: 205426
2014-04-02 14:39:11 +00:00
Tim Northover 0d80f70530 ARM64: fix lowering of fp128 fptosi/fptoui
We were creating libcall nodes that returned an MVT::f128, when these
particular operations actually return an int of some stripe.

llvm-svn: 205425
2014-04-02 14:39:07 +00:00
Tim Northover 670df3d937 SLPVectorizer: compare entire intrinsic for SLP compatibility.
Some Intrinsics are overloaded to the extent that return type equality (all
that's been checked up to now) does not guarantee that the arguments are the
same. In these cases SLP vectorizer should not recurse into the operands, which
can be achieved by comparing them as "Function *" rather than simply the ID.

llvm-svn: 205424
2014-04-02 14:39:02 +00:00
Tim Northover ebd37ab382 ARM64: make sure first argument to INSERT_SUBVECTOR has right type.
Again, coalescing and other optimisations swiftly made the MachineInstrs
consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was
produced.

llvm-svn: 205423
2014-04-02 14:38:58 +00:00
Tim Northover 5e3a484e3b ARM64: convert fp16 narrowing ISel to pseudo-instruction
The previous attempt was fine with optimisations, but was actually rather
cavalier with its types. When compiled at -O0, it produced invalid COPY
MachineInstrs.

llvm-svn: 205422
2014-04-02 14:38:54 +00:00
Job Noorman f7da105f39 Mark FPB as a reserved register when needed.
llvm-svn: 205421
2014-04-02 13:13:56 +00:00
Rafael Espindola b1b49789d0 Work around gold bug http://sourceware.org/PR16794.
llvm-svn: 205416
2014-04-02 12:15:20 +00:00
Renato Golin d93295ea56 Remove duplicated DMB instructions
ARM specific optimiztion, finding places in ARM machine code where 2 dmbs
follow one another, and eliminating one of them.

Patch by Reinoud Elhorst.

llvm-svn: 205409
2014-04-02 09:03:43 +00:00
Yaron Keren 2895496852 Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU()
and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the
four Windows environments in Triple.h.

Suggestion by Saleem Abdulrasool!

llvm-svn: 205393
2014-04-02 04:27:51 +00:00
Hal Finkel b0ebdc0f43 [LoopVectorizer] Count dependencies of consecutive pointers as uniforms
For the purpose of calculating the cost of the loop at various vectorization
factors, we need to count dependencies of consecutive pointers as uniforms
(which means that the VF = 1 cost is used for all overall VF values).

For example, the TSVC benchmark function s173 has:
  ...
  %3 = add nsw i64 %indvars.iv, 16000
  %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3
  ...
and we must realize that the add will be a scalar in order to correctly deduce
it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all
dependencies of a consecutive pointer must be a scalar (uniform), and so we
simply need to add all consecutive pointers to the worklist that currently
detects collects uniforms.

Fixes PR19296.

llvm-svn: 205387
2014-04-02 02:34:49 +00:00
David Blaikie 326e1fa13b Adjust comments regarding non-relocated abbrev offset in debug_info.dwo
I'm not sure the comment in the implementation really adds a lot of
value (it's clear that we emit zero when no symbol is provided, but it
doesn't explain why we would do that). Happy to iterate.

llvm-svn: 205386
2014-04-02 02:04:51 +00:00
David Blaikie 94c1d7f174 Split debug_loc and debug_loc.dwo emission into two separate functions
Based on code review feedback from Eric Christopher on r204697

llvm-svn: 205385
2014-04-02 01:50:20 +00:00
David Blaikie 0a456de5a2 DebugInfo: Introduce DebugLocList to encapsulate a list of DebugLocEntries and an MC Label to refer to them
This removes the magic-number-esque code creating/retrieving the same
label for a debug_loc entry from two places and removes the last small
piece of reusable logic from emitDebugLoc so that there will be less
duplication when refactoring it into two functions (one for debug_loc,
the other for debug_loc.dwo).

llvm-svn: 205382
2014-04-02 01:43:18 +00:00
Quentin Colombet 3c2b13b258 [ARM64][CollectLOH] Add some comments to explain how the LOHs
framework works (for the compiler part), since the design
document is not available.

llvm-svn: 205379
2014-04-02 01:02:28 +00:00
Adrian Prantl 3c5453cb6e Add a doxygen comment to DebugLocEntry::Merge.
llvm-svn: 205374
2014-04-01 23:34:45 +00:00
David Blaikie 6fa9966ee6 DebugLocEntry: Actually merge the loc entry when returning true.
Seems we didn't have any test coverage for merging... awesome. So I
added some - but hit an llvm-objdump bug while I was there. I'm choosing
not to shave that yak right now.

Code review feedback/bug catch by Adrian Prantl in r205360.

llvm-svn: 205373
2014-04-01 23:19:23 +00:00
David Blaikie 91567b6700 Fix accidental fallthrough in DebugLocEntry::hasSameValueOrLocation
No test case (this would invoke UB by examining uninitialized members,
etc, at best - and this code is apparently untested anyway - I'm about
to fix that)

Code review feedback from Adrian Prantl on r205360.

llvm-svn: 205367
2014-04-01 22:25:09 +00:00
David Blaikie c2af77b027 Remove unused function DebugLocEntry::isEmpty
llvm-svn: 205365
2014-04-01 22:06:18 +00:00
David Blaikie d306baf572 Refactor out the comparison of the location/value in a DebugLocEntry
llvm-svn: 205364
2014-04-01 22:04:07 +00:00
David Blaikie 98caf8bc64 Add inequality operator for MachineLocation.
Fixes the build I broke in r205360

llvm-svn: 205361
2014-04-01 21:54:52 +00:00
David Blaikie 1275e4f026 DebugInfo: Split DebugLocEntry into its own file.
It seems big enough that it deserves its own file - but it is header
only, so there's no need for another cpp file, etc.

llvm-svn: 205360
2014-04-01 21:49:04 +00:00
Adrian Prantl 6b444c5c8e Add a comment about the DIDescriptor class hierarchy.
llvm-svn: 205358
2014-04-01 21:04:24 +00:00
Adrian Prantl 75ce62acef DwarfDebug: Prevent DebugLocEntry merging from coalescing two different
constants into only the first one.

rdar://14874886.

llvm-svn: 205357
2014-04-01 21:04:18 +00:00
Hal Finkel 9e0baa6d3a [PowerPC] Add some missing VSX bitcast patterns
llvm-svn: 205352
2014-04-01 19:24:27 +00:00
Yaron Keren 48d68d439a If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 and
Environment == Triple::MSVC so it will never be MinGW or Cygwin.

llvm-svn: 205349
2014-04-01 18:52:55 +00:00
Hal Finkel 2eed29f3c8 Implement X86TTI::getUnrollingPreferences
This provides an initial implementation of getUnrollingPreferences for x86.
getUnrollingPreferences is used by the generic (concatenation) unroller, which
is distinct from the unrolling done by the loop vectorizer. Many modern x86
cores have some kind of uop cache and loop-stream detector (LSD) used to
efficiently dispatch small loops, and taking full advantage of this requires
unrolling small loops (small here means 10s of uops).

These caches also have limits on the number of taken branches in the loop, and
so we also cap the loop unrolling factor based on the maximum "depth" of the
loop. This is currently calculated with a partial DFS traversal (partial
because it will stop early if the path length grows too much). This is still an
approximation, and one that is both conservative (because it does not account
for branches eliminated via block placement) and optimistic (because it is only
recording the maximum depth over minimum paths). Nevertheless, because the
loops that fit in these uop caches are so small, it is not clear how much the
details matter.

The original set of patches posted for review produced the following test-suite
performance results (from the TSVC benchmark) at that time:
  ControlLoops-dbl - 13% speedup
  ControlLoops-flt - 15% speedup
  Reductions-dbl - 7.5% speedup

llvm-svn: 205348
2014-04-01 18:50:34 +00:00
Hal Finkel 6386cb8d4d Add some additional fields to TTI::UnrollingPreferences
In preparation for an upcoming commit implementing unrolling preferences for
x86, this adds additional fields to the UnrollingPreferences structure:

 - PartialThreshold and PartialOptSizeThreshold - Like Threshold and
   OptSizeThreshold, but used when not fully unrolling. These are necessary
   because we need different thresholds for full unrolling from those used when
   partially unrolling (the full unrolling thresholds are generally going to be
   larger).

 - MaxCount - A cap on the unrolling factor when partially unrolling. This can
   be used by a target to prevent the unrolled loop from exceeding some
   resource limit independent of the loop size (such as number of branches).

There should be no functionality change for any in-tree targets.

llvm-svn: 205347
2014-04-01 18:50:30 +00:00
Hal Finkel b4e001cc81 Use TopTTI->getGEPCost from within getUserCost
The implementation of getUserCost had duplicated (and hard-coded) the default
logic in getGEPCost. Instead, it is better to use getGEPCost directly, which
limits the default logic to the implementation of one function, and allows
targets to override the behavior.

No functionality change intended.

llvm-svn: 205346
2014-04-01 18:50:06 +00:00
Kai Nacke af47f60f83 [mips] Add Octeon cnMips instructions mtmX and mtpX
Adds the Octeon cnMips instructions "load multiplier register MPLx" and "load product register Px".
Includes tests.

Reviews by: Daniel.Sanders@imgtec.com

llvm-svn: 205343
2014-04-01 18:35:26 +00:00
Reid Kleckner 101102711d Support segmented stacks on Win64
Identical to Win32 method except the GS segment register is used for TLS
instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14.

llvm-svn: 205342
2014-04-01 18:34:21 +00:00
Matt Arsenault 553751b9bc Fix missing RUN line in test
llvm-svn: 205341
2014-04-01 18:34:13 +00:00
Yaron Keren 136fe7db46 isTargetWindows() renamed to isTargetKnownWindowsMSVC()
to reflect its current functionality.

Based on Takumi NAKAMURA suggestion.

llvm-svn: 205338
2014-04-01 18:15:34 +00:00
Matt Arsenault e407ae9846 Make isSetCCEquivalent respect the TargetBooleanContents
llvm-svn: 205336
2014-04-01 18:13:26 +00:00
Matt Arsenault 6310c3f667 Add helpers for checking if a value is a target boolean constant.
llvm-svn: 205335
2014-04-01 18:13:22 +00:00
David Blaikie 0e84adc621 DebugInfo: Factor out common functionality for rendering debug_loc and debug_loc.dwo location list entries
In preparation for refactoring this function into two, one for
debug_loc, one for debug_loc.dwo.

llvm-svn: 205324
2014-04-01 16:17:41 +00:00
David Blaikie 7f1f8742ea Cleanup remaining use of removed variable to fix the build
llvm-svn: 205323
2014-04-01 16:13:29 +00:00
David Blaikie e12ab1276d Simplify debug_loc.dwo handling slightly.
llvm-svn: 205322
2014-04-01 16:09:49 +00:00
Christian Pirker dc9ff75554 ARM: rename ARMle/ARMbe with ARMLE/ARMBE, and Thumble/Thumbbe with ThumbLE/ThumbBE
llvm-svn: 205317
2014-04-01 15:19:30 +00:00
Tim Northover 0feb91ef15 ARM: teach LLVM that Cortex-A7 is very similar to A8.
llvm-svn: 205314
2014-04-01 14:10:07 +00:00
Aaron Ballman 8bf5a548ea Attempting to fix r205124, which had failed asserts when built with MSVC.
Suggestion from Yaron Keren.

llvm-svn: 205313
2014-04-01 13:56:35 +00:00
Tim Northover 1351030801 ARM: add cyclone CPU with ZeroCycleZeroing feature.
The Cyclone CPU is similar to swift for most LLVM purposes, but does have two
preferred instructions for zeroing a VFP register. This teaches LLVM about
them.

llvm-svn: 205309
2014-04-01 13:22:02 +00:00
Daniel Sanders 21bce30fdc [mips] Renamed ParseAnyRegisterWithoutDollar to MatchAnyRegisterWithoutDollar
This is for consistency with other functions. The Parse* functions consume
tokens and the Match* functions don't.

No functional change.

llvm-svn: 205305
2014-04-01 12:35:23 +00:00
Aaron Ballman 0947bb20d8 Fixing an MSVC warning about widening the result of a 32-bit shift implicitly. No functional change intended.
llvm-svn: 205304
2014-04-01 12:24:25 +00:00
Tim Northover 4f1dd58e2e ARM64: add intrinsic for pmull (p64 x p64 = p128) operations.
llvm-svn: 205302
2014-04-01 12:22:37 +00:00
Aaron Ballman d1726ee8fa Fixing warnings in the MSVC build. No functional changes intended.
llvm-svn: 205301
2014-04-01 12:22:20 +00:00
Daniel Sanders ffd8436d6c [mips] Extend ParseJumpTarget to support the full symbol expression syntax.
Summary:
This should fix the issues the D3222 caused in lld. Testcase is based on
the one that failed in the buildbot.

Depends on D3233

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3234

llvm-svn: 205298
2014-04-01 10:41:48 +00:00
Daniel Sanders 315386c083 [mips] Use AsmLexer::peekTok() to resolve the conflict between $reg and $sym
Summary:
Parsing registers no longer consume the $ token before it's confirmed whether it really has a register or not, therefore it's no longer impossible to match symbols if registers were tried first.

Depends on D3232

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3233

llvm-svn: 205297
2014-04-01 10:40:14 +00:00
Daniel Sanders 0993457891 [mips] Hoist Parser.Lex() calls out of MatchAnyRegisterNameWithoutDollar()
Summary:
No functional change

Depends on D3222

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3232

llvm-svn: 205295
2014-04-01 10:37:46 +00:00
Tim Northover ff179ba3d3 ARM64: add patterns for more lane-wise ld1/st1 operations.
llvm-svn: 205294
2014-04-01 10:37:09 +00:00
Tim Northover d8d613b979 ARM64: fix bug in ld3r (1d) SelectionDAG.
llvm-svn: 205293
2014-04-01 10:37:03 +00:00
Daniel Sanders b50ccf8e26 [mips] Rewrite MipsAsmParser and MipsOperand.
Summary:
Highlights:
- Registers are resolved much later (by the render method).
  Prior to that point, GPR32's/GPR64's are GPR's regardless of register
  size. Similarly FGR32's/FGR64's/AFGR64's are FGR's regardless of register
  size or FR mode. Numeric registers can be anything.
- All registers are parsed the same way everywhere (even when handling
  symbol aliasing)
  - One consequence is that all registers can be specified numerically
    almost anywhere (e.g. $fccX, $wX). The exception is symbol aliasing
    but that can be easily resolved.
- Removes the need for the hasConsumedDollar hack
- Parenthesis and Bracket suffixes are handled generically
- Micromips instructions are parsed directly instead of going through the
  standard encodings first.
- rdhwr accepts all 32 registers, and the following instructions that previously
  xfailed now work:
    ddiv, ddivu, div, divu, cvt.l.[ds], se[bh], wsbh, floor.w.[ds], c.ngl.d,
    c.sf.s, dsbh, dshd, madd.s, msub.s, nmadd.s, nmsub.s, swxc1
- Diagnostics involving registers point at the correct character (the $)
- There's only one kind of immediate in MipsOperand. LSA immediates are handled
  by the predicate and renderer.

Lowlights:
- Hardcoded '$zero' in the div patterns is handled with a hack.
  MipsOperand::isReg() will return true for a k_RegisterIndex token
  with Index == 0 and getReg() will return ZERO for this case. Note that it
  doesn't return ZERO_64 on isGP64() targets.
- I haven't cleaned up all of the now-unused functions.
  Some more of the generic parser could be removed too (integers and relocs
  for example).
- insve.df needed a custom decoder to handle the implicit fourth operand that
  was needed to make it parse correctly. The difficulty was that the matcher
  expected a Token<'0'> but gets an Imm<0>. Adding an implicit zero solved this.

Reviewers: matheusalmeida, vmedic

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D3222

llvm-svn: 205292
2014-04-01 10:35:28 +00:00
Renato Golin 33f973a43a Recover TableGen/LangRef, make it official
Making the new TableGen documentation official and marking the old file as
"Moved". Also, reverting the original LangRef as the normative formal
description of the language, while keeping the "new" LangRef as LangIntro
for the less inlcined to reading language grammars.

We should remove TableGenFundamentals.rst one day, but for now, just a
warning that it moved will have to do, while we make sure there are no more
links to it from elsewhere.

llvm-svn: 205289
2014-04-01 09:51:49 +00:00
Alexey Volkov 1328b28dc6 [x86] Do not convert to cmp32 for Atom arch by Sergey Okunev
Differential Revision: http://llvm-reviews.chandlerc.com/D2824

llvm-svn: 205288
2014-04-01 08:13:07 +00:00
David Blaikie 3464161070 DebugInfo: Avoid creating unnecessary/empty line tables and remove the special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference
This moves one case of raw text checking down into the MCStreamer
interfaces in the form of a virtual function, even if we ultimately end
up consolidating on the one-or-many line tables issue one day, this is
nicer in the interim. This just generally streamlines a bunch of use
cases into a common code path.

llvm-svn: 205287
2014-04-01 08:07:52 +00:00
David Blaikie 8bf66c4f3f DebugInfo: Emit relocation to debug_line section when emitting asm for asm
I don't think this is reachable by any frontend (why would you transform
asm to asm+debug info?) but it helps tidy up some of this code, avoid
the weird special case of "emit the first CU, store the label, then emit
the rest" in MCDwarfLineTable::Emit by instead having the
DWARF-for-assembly case use the same codepath as DwarfDebug.cpp, by
registering the label of the debug_line section, thus causing it to be
emitted. (with a special case in asm output to just emit the label since
asm output uses the .loc directives, etc, rather than the debug_loc
directly)

llvm-svn: 205286
2014-04-01 07:35:52 +00:00
Adrian Prantl 762bdd5118 Remove FIXMEs. The scope of a Variable is always a lexical scope; there is
nothing to be gained from switching this over to a DIScopeRef.

llvm-svn: 205281
2014-04-01 03:50:01 +00:00
Adrian Prantl d09ba23faf LTO type uniquing: store the Decl field of a DIImportedEntity as a DIRef.
No other functionality changes, DIBuilder testcase is included in a paired
CFE commit.

This relaxes the assertion in isScopeRef to also accept subclasses of
DIScope.

llvm-svn: 205279
2014-04-01 03:41:04 +00:00
Adrian Prantl c3264f393d Add a comment about type-uniquing ObjC types.
llvm-svn: 205277
2014-04-01 03:40:59 +00:00
David Blaikie f78cbb5b44 Comment to describe the debug_loc.dwo constants
Code review feedback from Eric Christopher on r204697

llvm-svn: 205268
2014-03-31 23:50:20 +00:00
Saleem Abdulrasool d3bafe3e41 MCJIT: ensure that cygwin is identified properly
Cygwin is now a proper environment rather than an OS.  This updates the MCJIT
tests to avoid execution on Cygwin.  This fixes native cygwin tests.

llvm-svn: 205266
2014-03-31 23:42:23 +00:00
Hal Finkel 86b3064f2b Move partial/runtime unrolling late in the pipeline
The generic (concatenation) loop unroller is currently placed early in the
standard optimization pipeline. This is a good place to perform full unrolling,
but not the right place to perform partial/runtime unrolling. However, most
targets don't enable partial/runtime unrolling, so this never mattered.

However, even some x86 cores benefit from partial/runtime unrolling of very
small loops, and follow-up commits will enable this. First, we need to move
partial/runtime unrolling late in the optimization pipeline (importantly, this
is after SLP and loop vectorization, as vectorization can drastically change
the size of a loop), while keeping the full unrolling where it is now. This
change does just that.

llvm-svn: 205264
2014-03-31 23:23:51 +00:00
Duncan P. N. Exon Smith acb367bd62 lit: Set a base directory for compiler-rt tests
Setting this parameter enables llvm-lit to run on source directories for
compiler-rt test suites that implement magic in their lit.cfg.

<rdar://problem/16458307>

llvm-svn: 205262
2014-03-31 23:14:10 +00:00
Arnold Schwaighofer 15262e6703 Revert "SLPVectorizer: Ignore users that are insertelements we can reschedule them"
This reverts commit r205018.

Conflicts:
	lib/Transforms/Vectorize/SLPVectorizer.cpp
	test/Transforms/SLPVectorizer/X86/insert-element-build-vector.ll

This is breaking libclc build.

llvm-svn: 205260
2014-03-31 23:05:56 +00:00
Joerg Sonnenberger deb1e4adc1 Shifting into the sign bit is UB as discussed on IRC. Explicitly use the
BitWord type for the constants to avoid this.

llvm-svn: 205257
2014-03-31 22:53:57 +00:00
Juergen Ributzka e117992f00 [Stackmaps] Update the stackmap format to use 64-bit relocations for the function address and properly align all entries.
This commit updates the stackmap format to version 1 to indicate the
reorganizaion of several fields. This was done in order to align stackmap
entries to their natural alignment and to minimize padding.

Fixes <rdar://problem/16005902>

llvm-svn: 205254
2014-03-31 22:14:04 +00:00
Adam Nemet 10c4ce2584 [X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as well
Pretty obvious follow-on to r205159 to also handle conversion from double
besides float.

Fixes <rdar://problem/16373208>

llvm-svn: 205253
2014-03-31 21:54:48 +00:00
Matt Arsenault d6c4326786 R600/SI: Remove leftover pattern splitting 64-bit ors.
It's now matched to the scalar 64-bit or and split later if
necessary.'

llvm-svn: 205252
2014-03-31 21:46:46 +00:00
Manman Ren 63efd8e7e6 Register allocator: set CSRFirstUseCost to 5 for ARM64.
A value of 5 means if we have a split or spill option that has a really
low cost (1 << 14 is the entry frequency), we will choose to spill
or split the really cold path before using a callee-saved register.

This gives us the performance benefit on SPECInt2k and is also conservative.

rdar://16162005

llvm-svn: 205248
2014-03-31 21:06:36 +00:00
Matt Arsenault f751d6272d Change shouldSplitVectorElementType to better match the description.
Pass the entire vector type, and not just the element.

llvm-svn: 205247
2014-03-31 20:54:58 +00:00
Rui Ueyama f4d29bc05b Fix MSVC warning.
This patch is to fix the following warning when compiled with MSVC 64 bit.

  warning C4334: '<<' : result of 32-bit shift implicitly converted to 64
  bits (was 64-bit shift intended?)

llvm-svn: 205245
2014-03-31 20:04:37 +00:00
Matt Arsenault d7bdcc46a6 R600/SI: Implement shouldConvertConstantLoadToIntImm
llvm-svn: 205244
2014-03-31 19:54:27 +00:00
Hal Finkel b811b6d0d1 Add an optional ability to expand larger BUILD_VECTORs with shuffles
This adds the ability to expand large (meaning with more than two unique
defined values) BUILD_VECTOR nodes in terms of SCALAR_TO_VECTOR and (legal)
vector shuffles. There is now no limit of the size we are capable of expanding
this way, although we don't currently do this for vectors with many unique
values because of the default implementation of TLI's
shouldExpandBuildVectorWithShuffles function.

There is currently no functional change to any existing targets because the new
capabilities are not used unless some target overrides the TLI
shouldExpandBuildVectorWithShuffles function. As a result, I've not included a
test case for the new functionality in this commit, but regression tests will
(at least) be added soon when I commit support for the PPC QPX vector
instruction set.

The benefit of committing this now is that it makes the
shouldExpandBuildVectorWithShuffles callback, which had to be added for other
reasons regardless, fully functional. I suspect that other targets will
also benefit from tuning the heuristic.

llvm-svn: 205243
2014-03-31 19:42:55 +00:00
Matt Arsenault 378bf9c68b R600: Compute masked bits for min and max
llvm-svn: 205242
2014-03-31 19:35:33 +00:00
Rafael Espindola ee1c342ef9 Don't relocate with sections if there might be a paired relocation.
llvm-svn: 205240
2014-03-31 19:00:23 +00:00
Daniel Sanders e34a120285 Revert: [mips] Rewrite MipsAsmParser and MipsOperand.' due to buildbot errors in lld tests.
It's currently unable to parse 'sym + imm' without surrounding parenthesis.

llvm-svn: 205237
2014-03-31 18:51:43 +00:00
Matt Arsenault 4c53717787 R600: Add BFE, BFI, and BFM intrinsics to help with writing tests.
llvm-svn: 205236
2014-03-31 18:21:18 +00:00
Matt Arsenault b34583661b R600: Add target nodes for BFM and BFI
llvm-svn: 205235
2014-03-31 18:21:13 +00:00
Saleem Abdulrasool 2070088bef ARM: fix typo
llvm-svn: 205233
2014-03-31 18:09:10 +00:00
Rafael Espindola c627a8750a Now that this test is assembly, make the checks a bit stronger.
This will be used for a followup patch.

llvm-svn: 205232
2014-03-31 18:01:50 +00:00