Commit Graph

112534 Commits

Author SHA1 Message Date
Chris Bieneman d1d9430a05 Refactoring llvm command line parsing and option registration.
Summary:
The primary goal of this patch is to remove the need for MarkOptionsChanged(). That goal is accomplished by having addOption and removeOption properly sort the options.

This patch puts the new add and remove functionality on a CommandLineParser class that is a placeholder. Some of the functionality in this class will need to be merged into the OptionRegistry, and other bits can hopefully be in a better abstraction.

This patch also removes the RegisteredOptionList global, and the need for cl::Option objects to be linked list nodes.

The changes in CommandLineTest.cpp are required because these changes shift when we validate that options are not duplicated. Before this change duplicate options were only found during certain cl API calls (like cl::ParseCommandLine). With this change duplicate options are found during option construction.

Reviewers: dexonsmith, chandlerc, pete

Reviewed By: pete

Subscribers: pete, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D7132

llvm-svn: 227345
2015-01-28 19:00:25 +00:00
Alex Rosenberg d558ffaf69 Assume code ownership for the PS4 to ensure patches get reviewed, per the Developer Policy.
llvm-svn: 227340
2015-01-28 18:33:39 +00:00
Bjorn Steinbrink 88b2b57cc9 Fix build breakage caused by memory leaks in llvm-c-test
I accidently introduced those in r227319.

llvm-svn: 227339
2015-01-28 18:32:31 +00:00
Colin LeMahieu 19ed07c75a [Hexagon] Deleting a lot of old variants of intrinsics and updating references.
llvm-svn: 227338
2015-01-28 18:29:11 +00:00
Frederic Riss d34551833f [dsymutil] Add DwarfLinker class.
It's an empty shell for now. It's main method just opens the debug
map objects and parses their Dwarf info. Test that we at least do
that correctly.

llvm-svn: 227337
2015-01-28 18:27:01 +00:00
Colin LeMahieu 39b846ce0f [Hexagon] Converting XTYPE/BIT intrinsic patterns and adding tests.
llvm-svn: 227335
2015-01-28 18:06:23 +00:00
Sanjay Patel 9bb601856e use SDValue methods directly instead of getNode()->* ; NFCI
llvm-svn: 227334
2015-01-28 18:01:31 +00:00
Rafael Espindola a05b3b73a4 Simplify code. NFC.
llvm-svn: 227333
2015-01-28 17:54:19 +00:00
Colin LeMahieu fe03c9a678 [Hexagon] Replacing XTYPE/SHIFT intrinsic patternss. Adding tests and missing instructions with tests.
llvm-svn: 227330
2015-01-28 17:37:59 +00:00
Jozef Kolek e10a02ecf0 [mips][microMIPS] Implement LWGP instruction
Differential Revision: http://reviews.llvm.org/D6650

llvm-svn: 227325
2015-01-28 17:27:26 +00:00
Colin LeMahieu fdbc5adbb6 [Hexagon] Replacing intrinsics for halfword adds and max/min word/dword.
llvm-svn: 227322
2015-01-28 17:06:40 +00:00
Colin LeMahieu c17902b89b [Hexagon] Replacing old intrinsic tests with organized versions that match the reference manual.
llvm-svn: 227321
2015-01-28 16:58:05 +00:00
Bjorn Steinbrink a09ac0085d Fix LLVMSetMetadata and LLVMAddNamedMetadataOperand for single value MDNodes
Summary:
MetadataAsValue uses a canonical format that strips the MDNode if it
contains only a single constant value. This triggers an assertion when
trying to cast the value to a MDNode.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7165

llvm-svn: 227319
2015-01-28 16:35:59 +00:00
Simon Atanasyan e13a9624c2 [ELFYAML] Provide explicit value for relocation addendums in the test
The `Addend` is an optional field of the `Relocation` YAML record. But
we do not provide its default value while reading it from a YAML file
and so it might keep uninitialized.

I am going to fix the code by a separate commit. We might either make
this field mandatory (at least for .rela sections) or specify 0 as
a default value explicitly.

llvm-svn: 227318
2015-01-28 16:22:50 +00:00
Michael Kuperstein 90e08320c9 [x32] Change the condition from bitness to LP64 for TCRETURNdi64.
TCRETURNmi64, which was mistakenly changed in r227307 will wait for another day.

llvm-svn: 227317
2015-01-28 16:11:35 +00:00
Tom Stellard 40ce8af4a5 R600: Move DataLayout to AMDGPUTargetMachine
This is a follow up to r227113.

It is now required to use the amdgcn target for SI and newer GPUs.

llvm-svn: 227316
2015-01-28 16:04:26 +00:00
Tom Stellard eba5648ad2 R600: Use a Southern Islands GPU as the default for the amdgcn target
llvm-svn: 227314
2015-01-28 15:38:42 +00:00
Hal Finkel 34c94d5caa Correct the AggressiveAntiDepBreaker's handling of subregisters defining super registers
As the AggressiveAntiDepBreaker iterated backward through a scheduling region,
we must leave super registers live through subregister definitions so that all
relevant subregister definitions are renamed together. The problem was that we
were also discarding sub-register use locations as the sub-registers are
redefined. The result is that we'd rename the super register along with some,
but not all, subregister definitions.

  R0_D = {R0_L, R1_L}
  R0_L = {R0_S, R1_S}

  %R0_L<def> = TRLi9 16, pred:8, pred:%noreg
  %R1_L<def> = LSRLrr %R1_L<kill>, %R0_S, pred:8, pred:%noreg
  %R0_L<def> = LSRLrr %R2_L, %R0_S, pred:8, pred:%noreg, %R0_L<imp-use,kill>
  %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg
  %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg
  %R4_D<def> = ASRDrr %R0_D<kill>, %R6_S

  Anti:   %R4_D<def> = ASRDrr %R0_D<kill>, %R6_S
   Def Groups: R4_D=g213->g215(via R4_S)->g214(via R4_L)->g216(via R5_S)->g216(via R4_L)->g217(via R5_L)
   Use Groups: R0_D=g0->g218(last-use) R1_L->g219(last-use) R6_S=g204->g220(last-use)
  Anti:   %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg
   Def Groups: R0_L=g208->g209(via R0_S)->g218(via R0_D)->g210(via R1_S)->g210(via R0_D)
   Antidep reg: R0_L (real dependency)
   Use Groups: R0_L=g210->g224(last-use) R0_S->g225(last-use) R1_S->g226(last-use)
  Anti:   %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg
   Def Groups: R1_L=g219->g210(via R0_D)
   Antidep reg: R1_L (real dependency)
   Use Groups: R1_L=g210->g229(last-use)
  Anti:   %R0_L<def> = LSRLrr %R2_L, %R0_S, pred:8, pred:%noreg, %R0_L<imp-use,kill>
   Def Groups: R0_L=g224->g225(via R0_S)->g210(via R0_D)->g226(via R1_S)->g226(via R0_D)
   Antidep reg: R0_L Use Groups: R2_L=g192 R0_S=g226->g230(last-use) R0_L=g226->g231(last-use) R1_S->g232(last-use)
  Anti:   %R1_L<def> = LSRLrr %R1_L<kill>, %R0_S, pred:8, pred:%noreg
   Def Groups: R1_L=g229->g226(via R0_D)
   Antidep reg: R1_L Use Groups: R1_L=g226->g233(last-use) R0_S=g230
  Anti:   %R0_L<def> = TRLi9 16, pred:8, pred:%noreg
   Def Groups: R0_L=g231->g230(via R0_S)->g226(via R0_D)->g232(via R1_S)->g232(via R0_D)
   Antidep reg: R0_L
   Rename Candidates for Group g232:
    R0_D: elcInt64Regs :: R0_D R1_D R2_D R3_D R4_D R5_D R8_D R9_D R10_D R11_D R12_D R13_D R14_D R15_D R16_D R17_D R18_D R19_D R20_D R21_D R22_D R23_D R24_D R25_D
    R0_L: elcIntRegs :: R0_L R1_L R2_L R3_L R4_L R5_L R8_L R9_L R10_L R11_L R12_L R13_L R14_L R15_L R16_L R17_L R18_L R19_L R20_L R21_L R22_L R23_L R24_L R25_L
    R0_S: elcShrtRegs elcShrtRegs :: R0_S R1_S R2_S R3_S R4_S R5_S R8_S R9_S R10_S R11_S R12_S R13_S R14_S R15_S R16_S R17_S R18_S R19_S R20_S R21_S R22_S R23_S R24_S R25_S
   Find Registers: [R12_D: R12_D R12_L R12_S]
   Breaking anti-dependence edge on R0_L: R0_D->R12_D(1 refs) R0_L->R12_L(2 refs) R0_S->R12_S(2 refs)
   Use Groups:
  ...

  %R12_L<def> = TRLi9 16, pred:8, pred:%noreg
  %R1_L<def> = LSRLrr %R1_L<kill>, %R12_S, pred:8, pred:%noreg
  %R0_L<def> = LSRLrr %R2_L<kill>, %R12_S, pred:8, pred:%noreg, %R12_L<imp-use>
  %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg
  %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg
  %R4_D<def> = ASRDrr %R12_D<kill>, %R6_S

With this change, we now produce:

  Anti:   %R4_D<def> = ASRDrr %R0_D<kill>, %R6_S
   Def Groups: R4_D=g213->g215(via R4_S)->g214(via R4_L)->g216(via R5_S)->g216(via R4_L)->g217(via R5_L)
   Use Groups: R0_D=g0->g218(last-use) R1_L->g219(last-use) R6_S=g204->g220(last-use)
  Anti:   %R0_L<def> = ANDLri %R0_L<kill>, 2047, pred:8, pred:%noreg
   Def Groups: R0_L=g208->g209(via R0_S)->g218(via R0_D)->g210(via R1_S)->g210(via R0_D)
   Antidep reg: R0_L (real dependency)
   Use Groups: R0_L=g210
  Anti:   %R1_L<def> = ANDLri %R1_L<kill>, 2047, pred:8, pred:%noreg
   Def Groups: R1_L=g219->g210(via R0_D)
   Antidep reg: R1_L (real dependency)
   Use Groups: R1_L=g210
  Anti:   %R0_L<def> = LSRLrr %R2_L, %R0_S, pred:8, pred:%noreg, %R0_L<imp-use,kill>
   Def Groups: R0_L=g210->g210(via R0_D)->g210(via R0_D)
   Antidep reg: R0_L Use Groups: R2_L=g192 R0_S=g210 R0_L=g210
  Anti:   %R1_L<def> = LSRLrr %R1_L<kill>, %R0_S, pred:8, pred:%noreg
   Def Groups: R1_L=g210->g210(via R0_D)
   Antidep reg: R1_L Use Groups: R1_L=g210 R0_S=g210
  Anti:   %R0_L<def> = TRLi9 16, pred:8, pred:%noreg
   Def Groups: R0_L=g210->g210(via R0_D)->g210(via R0_D)
   Antidep reg: R0_L
   Rename Candidates for Group g210:
    R0_D: elcInt64Regs :: R0_D R1_D R2_D R3_D R4_D R5_D R8_D R9_D R10_D R11_D R12_D R13_D R14_D R15_D R16_D R17_D R18_D R19_D R20_D R21_D R22_D R23_D R24_D R25_D
    R0_L: elcIntRegs elcIntAIRegs elcIntRegs elcIntRegs elcIntRegs elcIntRegs :: R0_L R1_L R2_L R3_L R4_L R5_L R8_L R9_L R10_L R11_L R12_L R13_L R14_L R15_L R16_L R17_L R18_L R19_L R20_L R21_L R22_L R23_L R24_L R25_L
    R1_L: elcIntRegs elcIntRegs elcIntRegs elcIntRegs elcIntRegs :: R0_L R1_L R2_L R3_L R4_L R5_L R8_L R9_L R10_L R11_L R12_L R13_L R14_L R15_L R16_L R17_L R18_L R19_L R20_L R21_L R22_L R23_L R24_L R25_L
    R0_S: elcShrtRegs elcShrtRegs :: R0_S R1_S R2_S R3_S R4_S R5_S R8_S R9_S R10_S R11_S R12_S R13_S R14_S R15_S R16_S R17_S R18_S R19_S R20_S R21_S R22_S R23_S R24_S R25_S
   Find Registers: [R12_D: R12_D R12_L R13_L R12_S]
   Breaking anti-dependence edge on R0_L: R0_D->R12_D(1 refs) R0_L->R12_L(7 refs) R1_L->R13_L(5 refs) R0_S->R12_S(2 refs)
   Use Groups:
  ...

  %R12_L<def> = TRLi9 16, pred:8, pred:%noreg
  %R13_L<def> = LSRLrr %R13_L<kill>, %R12_S, pred:8, pred:%noreg
  %R12_L<def> = LSRLrr %R2_L<kill>, %R12_S<kill>, pred:8, pred:%noreg, %R12_L<imp-use,kill>
  %R13_L<def> = ANDLri %R13_L<kill>, 2047, pred:8, pred:%noreg
  %R12_L<def> = ANDLri %R12_L<kill>, 2047, pred:8, pred:%noreg
  %R4_D<def> = ASRDrr %R12_D, %R6_S, %R12_L<imp-def>, %R12_S<imp-def>, %R13_S<imp-def>

As demonstrated by this example, this is also somewhat unfortunate, because
there is actually no need to rename the super register in this case (it is
fully covered by later subregister definitions), but we don't seem to track
enough information here to exploit that either.

Thanks to Daniil Troshkov for reporting the issue. The debug outputs in this
commit message are from Daniil.

llvm-svn: 227311
2015-01-28 14:44:14 +00:00
Michael Kuperstein 951995821a [X86] Reduce some 32-bit imuls into lea + shl
Reduce integer multiplication by a constant of the form k*2^c, where k is in {3,5,9} into a lea + shl. Previously it was only done for imulq on 64-bit platforms, but it makes sense for imull and 32-bit as well.

Differential Revision: http://reviews.llvm.org/D7196

llvm-svn: 227308
2015-01-28 14:08:22 +00:00
Michael Kuperstein f387611ac2 [x32] Enable sibcall optimization on x32.
This includes two things:
1) Fix TCRETURNdi and TCRETURN64di patterns to check the right thing (LP64 as opposed to target bitness).
2) Allow LEA64_32 in MatchingStackOffset.

llvm-svn: 227307
2015-01-28 13:38:48 +00:00
Sean Silva 52c7dcd55d [docs] Use slightly more proper .rst markup
Again, I'd like to emphasize to everyone that this sort of markup change
is *not* what you should be concerned about when writing docs. Focus on
*content*.

I applaud Chandler for focusing on the fantastic content of this new
section!

llvm-svn: 227305
2015-01-28 10:36:41 +00:00
Sean Silva b1548edf25 [docs] [cleanup] No need for a comment around C++11 override
llvm-svn: 227304
2015-01-28 10:26:29 +00:00
Elena Demikhovsky 7b0dd39db6 AVX-512: Added FMA intrinsics with rounding mode
By Asaf Badouh and Elena Demikhovsky

Added special nodes for rounding: FMADD_RND, FMSUB_RND..
It will prevent merge between nodes with rounding and other standard nodes.

llvm-svn: 227303
2015-01-28 10:21:27 +00:00
Craig Topper 7d3c6d307a [X86] Teach disassembler to handle illegal immediates on AVX512 integer compare instructions.
llvm-svn: 227302
2015-01-28 10:09:56 +00:00
Craig Topper 6772eac490 [X86] Merge printSSECC and printAVXCC. They only differed by an assertion.
llvm-svn: 227301
2015-01-28 10:09:52 +00:00
Chandler Carruth 16b670ec20 [LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack
tracing code.

Managed static was just insane overhead for this. We took memory fences
and external function calls in every path that pushed a pretty stack
frame. This includes a multitude of layers setting up and tearing down
passes, the parser in Clang, everywhere. For the regression test suite
or low-overhead JITs, this was contributing to really significant
overhead.

Even the LLVM ThreadLocal is really overkill here because it uses
pthread_{set,get}_specific logic, and has careful code to both allocate
and delete the thread local data. We don't actually want any of that,
and this code in particular has problems coping with deallocation. What
we want is a single TLS pointer that is valid to use during global
construction and during global destruction, any time we want. That is
exactly what every host compiler and OS we use has implemented for
a long time, and what was standardized in C++11. Even though not all of
our host compilers support the thread_local keyword, we can directly use
the platform-specific keywords to get the minimal functionality needed.
Provided this limited trial survives the build bots, I will move this to
Compiler.h so it is more widely available as a light weight if limited
alternative to the ThreadLocal class. Many thanks to David Majnemer for
helping me think through the implications across platforms and craft the
MSVC-compatible syntax.

The end result is *substantially* faster. When running llc in a tight
loop over a small IR file targeting the aarch64 backend, this improves
its performance by over 10% for me. It also seems likely to fix the
remaining regressions seen by JIT users with threading enabled.

This may actually have more impact on real-world compile times due to
the use of the pretty stack tracing utility throughout the rest of Clang
or LLVM, but I've not collected any detailed measurements.

llvm-svn: 227300
2015-01-28 09:52:14 +00:00
Chandler Carruth 5b0d3e3f3a [LPM] A targeted but somewhat horrible fix to the legacy pass manager's
querying of the pass registry.

The pass manager relies on the static registry of PassInfo objects to
perform all manner of its functionality. I don't understand why it does
much of this. My very vague understanding is that this registry is
touched both during static initialization *and* while each pass is being
constructed. As a consequence it is hard to make accessing it not
require a acquiring some lock. This lock ends up in the hot path of
setting up, tearing down, and invaliditing analyses in the legacy pass
manager.

On most systems you can observe this as a non-trivial % of the time
spent in 'ninja check-llvm'. However, I haven't really seen it be more
than 1% in extreme cases of compiling more real-world software,
including LTO.

Unfortunately, some of the GPU JITs are seeing this taking essentially
all of their time because they have very small IR running through
a small pass pipeline very many times (at least, this is the vague
understanding I have of it).

This patch tries to minimize the cost of looking up PassInfo objects by
leveraging the fact that the objects themselves are immutable and they
are allocated separately on the heap and so don't have their address
change. It also requires a change I made the last time I tried to debug
this problem which removed the ability to de-register a pass from the
registry. This patch creates a single access path to these objects
inside the PMTopLevelManager which memoizes the result of querying the
registry. This is somewhat gross as I don't really know if
PMTopLevelManager is the *right* place to put it, and I dislike using
a mutable member to memoize things, but it seems to work.

For long-lived pass managers this should completely eliminate
the cost of acquiring locks to look into the pass registry once the
memoized cache is warm. For 'ninja check' I measured about 1.5%
reduction in CPU time and in total time on a machine with 32 hardware
threads. For normal compilation, I don't know how much this will help,
sadly. We will still pay the cost while we populate the memoized cache.
I don't think it will hurt though, and for LTO or compiles with many
small functions it should still be a win. However, for tight loops
around a pass manager with many passes and small modules, this will help
tremendously. On the AArch64 backend I saw nearly 50% reductions in time
to complete 2000 cycles of spinning up and tearing down the pipeline.
Measurements from Owen of an actual long-lived pass manager show more
along the lines of 10% improvements.

Differential Revision: http://reviews.llvm.org/D7213

llvm-svn: 227299
2015-01-28 09:47:21 +00:00
Elena Demikhovsky 45f0448081 Fold fcmp in cases where value is provably non-negative. By Arch Robison.
This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where:
 - the predicate is olt or uge
 - the first operand is provably not less than zero
 - the second operand is zero
The motivation for handling these cases optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y)..

http://reviews.llvm.org/D6972

llvm-svn: 227298
2015-01-28 08:03:58 +00:00
David Majnemer d312c374df llvm-ar: Remove unimplemented -N option from -help
This fixes PR22358.

llvm-svn: 227296
2015-01-28 06:00:01 +00:00
Chandler Carruth b81dfa6378 [LPM] Stop using the string based preservation API. It is an
abomination.

For starters, this API is incredibly slow. In order to lookup the name
of a pass it must take a memory fence to acquire a pointer to the
managed static pass registry, and then potentially acquire locks while
it consults this registry for information about what passes exist by
that name. This stops the world of LLVMs in your process no matter
how little they cared about the result.

To make this more joyful, you'll note that we are preserving many passes
which *do not exist* any more, or are not even analyses which one might
wish to have be preserved. This means we do all the work only to say
"nope" with no error to the user.

String-based APIs are a *bad idea*. String-based APIs that cannot
produce any meaningful error are an even worse idea. =/

I have a patch that simply removes this API completely, but I'm hesitant
to commit it as I don't really want to perniciously break out-of-tree
users of the old pass manager. I'd rather they just have to migrate to
the new one at some point. If others disagree and would like me to kill
it with fire, just say the word. =]

llvm-svn: 227294
2015-01-28 04:57:56 +00:00
Eric Christopher 6c901623c0 Migrate AArch64 except for TTI and AsmPrinter away from getSubtargetImpl.
llvm-svn: 227293
2015-01-28 03:51:33 +00:00
Chandler Carruth 064dc3333f Introduce a section to the programmers manual about type hierarchies,
polymorphism, and virtual dispatch.

This is essentially trying to explain the emerging design techniques
being used in LLVM these days somewhere more accessible than the
comments on a particular piece of infrastructure. It covers the
"concepts-based polymorphism" that caused some confusion during initial
reviews of the new pass manager as well as the tagged-dispatch mechanism
used pervasively in LLVM and Clang.

Perhaps most notably, I've tried to provide some criteria to help
developers choose between these options when designing new pieces of
infrastructure.

Differential Revision: http://reviews.llvm.org/D7191

llvm-svn: 227292
2015-01-28 03:04:54 +00:00
David Blaikie e245228903 Add description to assert
llvm-svn: 227291
2015-01-28 02:43:15 +00:00
David Blaikie fa1a3c7cf5 PR22356: DebugInfo: Handle the size of a member where the type of that member is a typedef (or other sugar) of a declaration.
llvm-svn: 227290
2015-01-28 02:34:53 +00:00
Lang Hames 33c9433ed4 Revert r227247 and r227228: "Add weak symbol support to RuntimeDyld".
This has wider implications than I expected when I reviewed the patch: It can
cause JIT crashes where clients have used the default value for AbortOnFailure
during symbol lookup. I'm currently investigating alternative approaches and I
hope to have this back in tree soon.

llvm-svn: 227287
2015-01-28 01:30:37 +00:00
Zachary Turner 49693b40c5 [llvm-pdbdump] Add basic symbol dumping.
This adds two command line options:

--symbols dumps a list of all symbols found in the PDB.
--symbol-details dumps the same list, but with detailed information
                 for every symbol such as type, attributes, etc.

llvm-svn: 227286
2015-01-28 01:22:33 +00:00
Reid Kleckner 4af6415237 Move EH personality type classification to Analysis/LibCallSemantics.h
Summary:
Also add enum types for __C_specific_handler and _CxxFrameHandler3 for
which we know a few things.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7214

llvm-svn: 227284
2015-01-28 01:17:38 +00:00
Zachary Turner 4287b94988 [llvm-pdbdump] Add support for printing source files and compilands.
This adds two command line options to llvm-pdbdump.

--source-files prints a flat list of all source files in the PDB.

--compilands prints a list of all compilands (e.g. object files)
             that the PDB knows about, and for each one, a list of
             source files that the compiland is composed of as well
             as a hash of the original source file.

llvm-svn: 227276
2015-01-28 00:33:00 +00:00
Zachary Turner 2145a67c7c [llvm-pdbdump] Print more friendly names for enum values.
llvm-svn: 227275
2015-01-28 00:32:49 +00:00
Quentin Colombet 308b171318 Revert r227242 - Merge vector stores into wider vector stores (PR21711).
This commit creates infinite loop in DAG combine for in the LLVM test-suite
for aarch64 with mcpu=cylcone (just having neon may be enough to expose this).

llvm-svn: 227272
2015-01-27 23:58:01 +00:00
Petar Jovanovic 4a11849034 [mips] Use __clear_cache builtin instead of cacheflush()
Use __clear_cache builtin instead of cacheflush() in
Unix Memory::InvalidateInstructionCache().

Differential Revision: http://reviews.llvm.org/D7198

llvm-svn: 227269
2015-01-27 23:30:18 +00:00
Zachary Turner 4016f78dbd Run dos2unix against llvm-pdbdump.
llvm-svn: 227262
2015-01-27 23:02:23 +00:00
Saleem Abdulrasool c44d71b8df SymbolRewriter: allow rewriting with comdats
COMDATs must be identically named to the symbol.  When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs.  This corrects the behaviour and
adds test cases for this.

llvm-svn: 227261
2015-01-27 22:57:39 +00:00
Saleem Abdulrasool 9769b18cba SymbolRewriter: prevent unnecessary rewrite
The rewrite for the pattern based rewrite is unnecessary if the existing name
matches the pattern.

llvm-svn: 227260
2015-01-27 22:57:35 +00:00
Zachary Turner 7058dfc6cc Add support for dumping debug tables to llvm-pdbdump.
PDB stores some of its data in streams and some in tables.
This patch teaches llvm-pdbdump to dump basic summary data
for the debug tables.

In support of this, this patch also adds some DIA helper
classes, such as a wrapper around an IDiaSymbol interface,
as well as helpers for outputting various enumerations to
a raw_ostream.

llvm-svn: 227257
2015-01-27 22:40:14 +00:00
Sanjay Patel b1ca4e48d4 remove function names from comments; NFC
llvm-svn: 227256
2015-01-27 22:26:56 +00:00
Chris Bieneman 6816936287 Re-landing changes to use ArrayRef instead of SmallVectorImpl, and new API test.
This contains the changes from r227148 & r227154, and also fixes to the test case to properly clean up the stack options.

llvm-svn: 227255
2015-01-27 22:21:06 +00:00
Kostya Serebryany bfa3f9d82f [fuzzer] properly enable asan's coverage feedback
llvm-svn: 227254
2015-01-27 22:19:55 +00:00
Sanjay Patel 6b280777b7 fix typos; NFC
llvm-svn: 227253
2015-01-27 22:16:52 +00:00
Kostya Serebryany d53b43fe11 Add a Fuzzer library
Summary:
A simple genetic in-process coverage-guided fuzz testing library.

I've used this fuzzer to test clang-format
(it found 12+ bugs, thanks djasper@ for the fixes!)
and it may also help us test other parts of LLVM.
So why not keep it in the LLVM repository?

I plan to add the cmake build rules later (in a separate patch, if that's ok)
and also add a clang-format-fuzzer target.

See README.txt for details.

Test Plan: Tests will follow separately.

Reviewers: djasper, chandlerc, rnk

Reviewed By: rnk

Subscribers: majnemer, ygribov, dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D7184

llvm-svn: 227252
2015-01-27 22:08:41 +00:00
Ahmed Bougacha 1ac9356524 [SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.
This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).

The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.

The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification.  This was introduced in a
refactoring (r225640) to match the original behavior.

However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.

For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW.  When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
  stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
  and simplifies the first stpcpy to a memcpy.  We now have
  two memcpys.

llvm-svn: 227250
2015-01-27 21:52:16 +00:00
Sanjoy Das dcf2651043 Teach IRCE to look at branch weights when recognizing range checks
Splitting a loop to make range checks redundant is profitable only if
the range check "never" fails. Make this fact a part of recognizing a
range check -- a branch is a range check only if it is expected to
pass (via branch_weights metadata).

Differential Revision: http://reviews.llvm.org/D7192

llvm-svn: 227249
2015-01-27 21:38:12 +00:00
Alexey Samsonov 533948088e Revert "[x86] Combine x86mmx/i64 to v2i64 conversion to use scalar_to_vector"
This reverts commits r226953 and r226974.

llvm-svn: 227248
2015-01-27 21:34:11 +00:00
Keno Fischer f2107249da [ExecutionEngine] Fix r227228 tests on Windows
On Windows, we're running MCJIT with ELF, so the module needs to have
its Triple explicitly adjusted.

llvm-svn: 227247
2015-01-27 21:33:25 +00:00
Kevin Enderby 9a50944ca0 dd the option, -link-opt-hints to llvm-objdump used with -macho to print the
Mach-O AArch64 linker optimization hints for ADRP code optimization.

llvm-svn: 227246
2015-01-27 21:28:24 +00:00
Sanjay Patel bcf62f2fa2 Merge vector stores into wider vector stores (PR21711)
This patch resolves part of PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ).

The 'f3' test case in that report presents a situation where we have two 128-bit
stores extracted from a 256-bit source vector. 

Instead of producing this:

vmovaps %xmm0, (%rdi)
vextractf128    $1, %ymm0, 16(%rdi)

This patch merges the 128-bit stores into a single 256-bit store:

vmovups %ymm0, (%rdi)

Differential Revision: http://reviews.llvm.org/D7208

llvm-svn: 227242
2015-01-27 20:50:27 +00:00
Zachary Turner fcb14ad7a3 Add llvm-pdbdump to tools.
llvm-pdbdump is a tool which can be used to dump the contents
of Microsoft-generated PDB files.  It makes use of the Microsoft
DIA SDK, which is a COM based library designed specifically for
this purpose.

The initial commit of this tool dumps the raw bytes from PDB data
streams.  Future commits will dump more semantic information such
as types, symbols, source files, etc similar to the types of
information accessible via llvm-dwarfdump.

Reviewed by: Aaron Ballman, Reid Kleckner, Chandler Carruth
Differential Revision: http://reviews.llvm.org/D7153

llvm-svn: 227241
2015-01-27 20:46:21 +00:00
Dmitry Vyukov 91ffdec3ec tsan: properly instrument unaligned accesses
If a memory access is unaligned, emit __tsan_unaligned_read/write
callbacks instead of __tsan_read/write.
Required to change semantics of __tsan_unaligned_read/write to not do the user memory.
But since they were unused (other than through __sanitizer_unaligned_load/store) this is fine.
Fixes long standing issue 17:
https://code.google.com/p/thread-sanitizer/issues/detail?id=17

llvm-svn: 227231
2015-01-27 20:19:17 +00:00
Ramkumar Ramachandra ca49c036ba overloaded-intrinsic-name: exercise anyptr on struct
No other test I know shows how struct names are mangled in overloaded
intrinsic functions.

Differential Revision: http://reviews.llvm.org/D7037

llvm-svn: 227229
2015-01-27 20:03:08 +00:00
Keno Fischer 88cc26811b [ExecutionEngine] Add weak symbol support to RuntimeDyld
Support weak symbols by first looking up if there is an externally visible symbol we can find,
and only if that fails using the one in the object file we're loading.

Reviewed By: lhames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6950

llvm-svn: 227228
2015-01-27 20:02:31 +00:00
Keno Fischer 5f92a08fc0 [ExecutionEngine] FindFunctionNamed: Skip declarations
Summary:
Basically all other methods that look up functions by name skip them if they are mere declarations.
Do the same in FindFunctionNamed.

Reviewers: lhames

Reviewed By: lhames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7068

llvm-svn: 227227
2015-01-27 19:29:00 +00:00
Kai Nacke e024539ea0 [mips] Add range checks and transformation to octeon instructions in AsmParser.
This patch adds range checks to the immediate operands of octeon
instructions in the AsmParser. Like gas, it applies the following
transformations if the immediate is to large:

bbit0 $8, 42, foo => bbit032 $8, 10, foo
bbit1 $8, 46, foo => bbit132 $8, 14, foo
cins $8, $31, 32, 31 => cins32 $8, $31, 0, 31
exts $7, $4, 54, 9 => exts32 $7, $4, 22, 9

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D7080

llvm-svn: 227225
2015-01-27 19:11:28 +00:00
Kostya Serebryany 869b8675c8 Add cmake flag LLVM_USE_SANITIZE_COVERAGE
Summary:
When LLVM_USE_SANITIZE_COVERAGE=YES
and one of the sanitizers is used -fsanitize-coverage=3 will be added
to build flag. This will be used to run a coverage-guided fuzzer on various
llvm libraries.

Test Plan: n/a

Reviewers: rnk

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7116

llvm-svn: 227216
2015-01-27 17:59:28 +00:00
Marek Olsak 75170778ec R600/SI: Enable all tests that pass on VI without changes
llvm-svn: 227214
2015-01-27 17:27:15 +00:00
Marek Olsak 794ff8392e R600/SI: Fix MIN3/MAX3 on VI, define MED3
llvm-svn: 227213
2015-01-27 17:25:15 +00:00
Marek Olsak 367447c255 R600/SI: Don't set patterns for chip-specific instructions while having pseudos
Only pseudos have patterns on them.

Also don't set the asm string for VINTRP_Pseudo. All pseudos should have empty
asm.

This matches what all other multiclasses do.

llvm-svn: 227212
2015-01-27 17:25:11 +00:00
Marek Olsak 0c1f8812f5 R600/SI: Add VI versions of LDS atomics
Each class is split into two: one adds let statements around non-pseudos,
and the other one specifies the parameters.

llvm-svn: 227211
2015-01-27 17:25:07 +00:00
Marek Olsak 19d9e1f459 R600/SI: Add VI versions of MUBUF atomics
llvm-svn: 227210
2015-01-27 17:25:02 +00:00
Marek Olsak ee98b1177c R600/SI: Add VI versions of MUBUF loads and stores
This enables a lot of existing patterns for VI.

llvm-svn: 227209
2015-01-27 17:24:58 +00:00
Marek Olsak 7ef6db49ac R600/SI: Add pseudos for MUBUF loads and stores
This defines the SI versions only, so it shouldn't change anything.

There are no changes other than using the new multiclasses, adding missing
mayLoad/mayStore, and formatting fixes.

llvm-svn: 227208
2015-01-27 17:24:54 +00:00
Andrea Di Biagio 086cbc37ad [InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag.
This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by
a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared.

Added test InstCombine/select-cmp-cttz-ctlz.ll.

llvm-svn: 227197
2015-01-27 15:58:14 +00:00
Evgeniy Stepanov 3fdfc7b1b3 [sancov] Fix unspecified constructor order between sancov and asan.
Sanitizer coverage constructor must run after asan constructor (for each DSO).
Bump constructor priority to guarantee that.

llvm-svn: 227195
2015-01-27 15:01:22 +00:00
Manuel Jacob 8220addad7 Add a FIXME in SelectionDAGBuilder before an assert that is valid only on X86.
When lowering memcpy, memset or memmove, this assert checks whether the pointer
operands are in an address space < 256 which means "user defined address space"
on X86.  However, this notion of "user defined address space" does not exist
for other targets.

llvm-svn: 227191
2015-01-27 13:14:35 +00:00
Eric Christopher 337262068f Replace some uses of getSubtargetImpl with the cached version
off of the MachineFunction or with the version that takes a
Function reference as an argument.

llvm-svn: 227185
2015-01-27 08:48:42 +00:00
Chandler Carruth 7b011df6f2 [PM] Clean up file banner comments prior to refactoring this code.
llvm-svn: 227182
2015-01-27 08:28:33 +00:00
Eric Christopher 7592b0c5e6 Have the PBQP register allocator use the subtarget on the MachineFunction.
(and remove an extraneous private).

llvm-svn: 227181
2015-01-27 08:27:06 +00:00
Eric Christopher e005826526 Remove some extraneous includes.
llvm-svn: 227180
2015-01-27 08:27:03 +00:00
Eric Christopher ad6bedb43e Fix build failure with pointer vs reference.
NB: Saving files after editing helps.
llvm-svn: 227178
2015-01-27 08:00:42 +00:00
Eric Christopher 2c63549386 Update a few calls to getSubtarget<> to either be getSubtargetImpl
when we didn't need the cast to the base class or the cached version
off of the subtarget.

llvm-svn: 227176
2015-01-27 07:54:39 +00:00
Eric Christopher 36d9273128 Clean up the AArch64 store pair suppression pass initialization
and remove and unnecessary class variable.

llvm-svn: 227175
2015-01-27 07:54:36 +00:00
Eric Christopher 3d4276f053 The subtarget is cached on the MachineFunction. Access it directly.
llvm-svn: 227173
2015-01-27 07:31:29 +00:00
Eric Christopher e38c8d4aa9 Migrate SeparateConstOffsetFromGEP to use a Function with
getSubtarget.

llvm-svn: 227172
2015-01-27 07:16:37 +00:00
David Majnemer 4c82daea60 LoopRotate: Don't walk the uses of a Constant
LoopRotate wanted to avoid live range interference by looking at the
uses of a Value in the loop latch and seeing if any lied outside of the
loop.  We would wrongly perform this operation on Constants.

This fixes PR22337.

llvm-svn: 227171
2015-01-27 06:21:43 +00:00
Eric Christopher b9f60c17dc Remove unused include.
llvm-svn: 227170
2015-01-27 05:58:44 +00:00
Richard Trieu 15ac9363a7 Revert r227148 & r227154 which added a test which infinitely loops.
r227148 added test CommandLineTest.HideUnrelatedOptionsMulti which repeatedly
outputs two following lines:

-tool: CommandLine Error: Option 'test-option-1' registered more than once!
-tool: CommandLine Error: Option 'test-option-2' registered more than once!

r227154 depends on changes from r227148

llvm-svn: 227167
2015-01-27 03:03:47 +00:00
Chandler Carruth e378a016e8 [PM] Run clang-format over this header to clean up the very few)
divergent formatting issues. This should prevent any format-only diffs
from sneaking into subsequent changes to port TTI to the new pass
manager.

llvm-svn: 227165
2015-01-27 02:20:43 +00:00
Chandler Carruth 406f332174 [PM] Switch a doxygen comment to the standard format. NFC
llvm-svn: 227164
2015-01-27 02:20:41 +00:00
Chandler Carruth d649c0ad56 [PM] Refactor the core logic to run EarlyCSE over a function into an
object that manages a single run of this pass.

This was already essentially how it worked. Within the run function, it
would point members at *stack local* allocations that were only live for
a single run. Instead, it seems much cleaner to have a utility object
whose lifetime is clearly bounded by the run of the pass over the
function and can use member variables in a more direct way.

This also makes it easy to plumb the analyses used into it from the pass
and will make it re-usable with the new pass manager.

No functionality changed here, its just a refactoring.

llvm-svn: 227162
2015-01-27 01:34:14 +00:00
Eric Christopher 349d5886e5 MachineRegisterInfo can access TII off of the MachineFunction's
subtarget and so doesn't need the TargetMachine or to access via
getSubtargetImpl. Update all callers.

llvm-svn: 227160
2015-01-27 01:15:16 +00:00
Eric Christopher 4e048eeb2a Migrate AtomicExpandPass and DwarfEHPrepare to using a Function-ized getSubtargetImpl.
llvm-svn: 227159
2015-01-27 01:04:42 +00:00
Eric Christopher 7aebb32dd3 Fix unsigned/signed comparison warning.
llvm-svn: 227158
2015-01-27 01:01:39 +00:00
Eric Christopher fccff37b53 Migrate CodeGenPrepare to use the Function based getSubtarget
code.

llvm-svn: 227157
2015-01-27 01:01:38 +00:00
Eric Christopher 2b214e7ad3 Grab the TargetLowering info from the DAG rather than querying for
a subtarget.

llvm-svn: 227156
2015-01-27 01:01:36 +00:00
Eric Christopher 6a524f96de Remove extraneous period.
llvm-svn: 227155
2015-01-27 01:01:34 +00:00
Chris Bieneman fd3dbd9403 One more fix to the new API to fix const-correctness.
llvm-svn: 227154
2015-01-27 00:42:00 +00:00
Adrian Prantl ad49878697 Replace this testcase with an even shorter one provided by dblaikie.
llvm-svn: 227152
2015-01-27 00:22:17 +00:00
Chad Rosier f9327d6fe9 Commoning of target specific load/store intrinsics in Early CSE.
Phabricator revision: http://reviews.llvm.org/D7121
Patch by Sanjin Sijaric <ssijaric@codeaurora.org>!

llvm-svn: 227149
2015-01-26 22:51:15 +00:00
Chris Bieneman c333e577fe Pete Cooper suggested the new API should use ArrayRef instead of SmallVectorImpl. Also adding a test case.
llvm-svn: 227148
2015-01-26 22:50:47 +00:00
Philip Reames 219b15e85f Add test cases for PRE w/volatile loads
These tests check that the combination of 227110 (cross block query inst) and 227112 (volatile load semantics) work together properly to allow PRE in cases where a loop contains a volatile access.

llvm-svn: 227146
2015-01-26 22:40:44 +00:00
Simon Pilgrim 0629ba1ad9 [X86][SSE] Float comparisons can sometimes be safely commuted
For ordered, unordered, equal and not-equal tests, packed float and double comparison instructions can be safely commuted without affecting the results. This patch checks the comparison mode of the (v)cmpps + (v)cmppd instructions and commutes the result if it can.

Differential Revision: http://reviews.llvm.org/D7178

llvm-svn: 227145
2015-01-26 22:29:24 +00:00
Zachary Turner 02991af057 Have the UTF conversion wrappers append a null terminator.
This is especially useful for the UTF8 -> UTF16 direction, since
there is no equivalent of llvm::SmallString<> for wide characters.
This means that anyone who wants a null terminated string is forced
to manually push and pop their own null terminator.

Reviewed by: Reid Kleckner.

llvm-svn: 227143
2015-01-26 22:05:50 +00:00
Simon Pilgrim 9b7c00352d [X86][PCLMUL] Enable commutation for PCLMUL instructions
Patch to allow (v)pclmulqdq to be commuted - swaps the src registers and inverts the immediate (low/high) src mask.

Differential Revision: http://reviews.llvm.org/D7180

llvm-svn: 227141
2015-01-26 22:00:18 +00:00
Chris Bieneman 0104325776 Add new HideUnrelatedOptions API that takes a SmallVectorImpl.
Need a new API for clang-modernize that allows specifying a list of option categories to remain visible. This will allow clang-modernize to move off getRegisteredOptions.

llvm-svn: 227140
2015-01-26 21:57:29 +00:00
Simon Pilgrim 106abe47d6 Line endings fix. NFC.
llvm-svn: 227138
2015-01-26 21:28:32 +00:00
Matt Arsenault 7d734f44a3 R600: Cleanup or test
Fix broken check lines, use multiple check prefixes,
add an additional test for i1 or.

llvm-svn: 227137
2015-01-26 21:16:10 +00:00
Simon Pilgrim a2f1c40575 Line endings fix. NFC.
llvm-svn: 227136
2015-01-26 21:15:42 +00:00
Alexei Starovoitov 3c8465acb2 bpf: fix build due to 'Move DataLayout back to the TargetMachine'
commit r227113 moved DataLayout

llvm-svn: 227133
2015-01-26 20:43:15 +00:00
Bruno Cardoso Lopes 3da5b126a5 [x86][MMX] Rename and cleanup tests: arith, intrinsics and shuffle
- Rename mmx-builtins to mmx-intrinsics to match other intrinsic test naming.
- Remove tests that duplicate functionality from mmx-intrinsics.ll.
- Move arith related tests to mmx-arith.ll.
- MMX related shuffle goes to vector-shuffle-mmx.ll.

llvm-svn: 227130
2015-01-26 20:06:51 +00:00
Hans Wennborg b64cb271dc SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable
The range check would get optimized away later, but we might as well not emit
them in the first place.

http://reviews.llvm.org/D6471

llvm-svn: 227126
2015-01-26 19:52:34 +00:00
Hans Wennborg 6800008f04 SimplifyCFG: don't remove unreachable default switch destinations
An unreachable default destination can be exploited by other optimizations and
allows for more efficient lowering. Both the SDag switch lowering and
LowerSwitch can exploit unreachable defaults.

Also make TurnSwitchRangeICmp handle switches with unreachable default.
This is kind of separate change, but it cannot be tested without the change
above, and I don't want to land the change above without this since that would
regress other tests.

Differential Revision: http://reviews.llvm.org/D6471

llvm-svn: 227125
2015-01-26 19:52:32 +00:00
Hans Wennborg 90b827cae2 Make ConstantFoldTerminator() handle switches with unreachable default.
Tested by Transforms/SimplifyCFG/switch-to-br.ll's @unreachable function.

Differential Revision: http://reviews.llvm.org/D6471

llvm-svn: 227124
2015-01-26 19:52:24 +00:00
Justin Holewinski d4d2e9bd0e [NVPTX] Generate a more optimal sequence for select of i1
Instead of creating a pattern like "(p && a) || ((!p) && b)",
just expand the i8 operands to i32 and perform the selp on them.

Fixes PR22246

llvm-svn: 227123
2015-01-26 19:52:20 +00:00
Reid Kleckner d8cb6b00c5 Add a UTF8 to UTF16 conversion wrapper for use in the pdb dumper
This can also be used instead of the WindowsSupport.h ConvertUTF8ToUTF16
helpers, but that will require massaging some character types. The
Windows support routines want wchar_t output, but wchar_t is often 32
bits on non-Windows OSs.

llvm-svn: 227122
2015-01-26 19:51:00 +00:00
Eric Christopher b11a1b7b2c Cache the lookup of TargetLowering in the atomic expand pass.
llvm-svn: 227121
2015-01-26 19:45:40 +00:00
Ahmed Bougacha 9a9e1a59ce [SelectionDAG] Fix assert message copypasta. NFC.
llvm-svn: 227119
2015-01-26 19:31:42 +00:00
Eric Christopher 59063140a7 Add a FIXME about preferred alignment to DataLayout.
Essentially DataLayout is global and affects the layout of ABI
level objects. Preferred alignment could change on a per function
basis as we change CPU features.

llvm-svn: 227118
2015-01-26 19:19:04 +00:00
Justin Holewinski 23df659e6d [NVPTX] Handle floating-point conversion patterns that are not explicitly ordered or unordered
Fixes PR22322

llvm-svn: 227117
2015-01-26 19:11:20 +00:00
Alex Rosenberg b9fefdd215 Use a different encoding for debugtrap on PS4.
llvm-svn: 227116
2015-01-26 19:09:27 +00:00
Eric Christopher 8b7706517c Move DataLayout back to the TargetMachine from TargetSubtargetInfo
derived classes.

Since global data alignment, layout, and mangling is often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be layed out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.

*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.

llvm-svn: 227113
2015-01-26 19:03:15 +00:00
Philip Reames a7ad6a589c Refine memory dependence's notion of volatile semantics
According to my reading of the LangRef, volatiles are only ordered with respect to other volatiles. It is entirely legal and profitable to forward unrelated loads over the volatile load. This patch implements this for GVN by refining the transition rules MemoryDependenceAnalysis uses when encountering a volatile.

The added test cases show where the extra flexibility is profitable for local dependence optimizations. I have a related change (227110) which will extend this to non-local dependence (i.e. PRE), but that's essentially orthogonal to the semantic change in this patch. I have tested the two together and can confirm that PRE works over a volatile load with both changes.  I will be submitting a PRE w/volatiles test case seperately in the near future.

Differential Revision: http://reviews.llvm.org/D6901

llvm-svn: 227112
2015-01-26 18:54:27 +00:00
Sanjay Patel 805bc02c2b Model sqrtsd as a binary operation with one source operand tied to the destination (PR14221)
This patch fixes the following miscompile:

define void @sqrtsd(<2 x double> %a) nounwind uwtable ssp {
  %0 = tail call <2 x double> @llvm.x86.sse2.sqrt.sd(<2 x double> %a) nounwind 
  %a0 = extractelement <2 x double> %0, i32 0
  %conv = fptrunc double %a0 to float
  %a1 = extractelement <2 x double> %0, i32 1
  %conv3 = fptrunc double %a1 to float
  tail call void @callee2(float %conv, float %conv3) nounwind
  ret void
}

Current codegen:

sqrtsd	%xmm0, %xmm1        ## high element of %xmm1 is undef here
xorps	%xmm0, %xmm0
cvtsd2ss	%xmm1, %xmm0
shufpd	$1, %xmm1, %xmm1
cvtsd2ss	%xmm1, %xmm1 ## operating on undef value
jmp	_callee

This is a continuation of http://llvm.org/viewvc/llvm-project?view=revision&revision=224624 ( http://reviews.llvm.org/D6330 ) 
which was itself a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ).

All of these patches are partial fixes for PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 ); 
this should be the final patch needed to resolve that bug.

Differential Revision: http://reviews.llvm.org/D6885

llvm-svn: 227111
2015-01-26 18:42:16 +00:00
Philip Reames 32351455f6 Pass QueryInst down through non-local dependency calculation
This change is mostly motivated by exposing information about the original query instruction to the actual scanning work in getPointerDependencyFrom when used by GVN PRE. In a follow up change, I will use this to be more precise with regards to the semantics of volatile instructions encountered in the scan of a basic block.

Worth noting, is that this change (despite appearing quite simple) is not semantically preserving. By providing more information to the helper routine, we allow some optimizations to kick in that weren't previously able to (when called from this code path.) In particular, we see that treatment of !invariant.load becomes more precise. In theory, we might see a difference with an ordered/atomic instruction as well, but I'm having a hard time actually finding a test case which shows that.

Test wise, I've included new tests for !invariant.load which illustrate this difference. I've also included some updated TBAA tests which highlight that this change isn't needed for that optimization to kick in - it's handled inside alias analysis itself. 

Eventually, it would be nice to factor the !invariant.load handling inside alias analysis as well.

Differential Revision: http://reviews.llvm.org/D6895

llvm-svn: 227110
2015-01-26 18:39:52 +00:00
Philip Reames 56a03938f7 Revert GCStrategy ownership changes
This change reverts the interesting parts of 226311 (and 227046).  This change introduced two problems, and I've been convinced that an alternate approach is preferrable anyways.

The bugs were:
- Registery appears to require all users be within the same linkage unit.  After this change, asking for "statepoint-example" in Transform/ would sometimes get you nullptr, whereas asking the same question in CodeGen would return the right GCStrategy.  The correct long term fix is to get rid of the utter hack which is Registry, but I don't have time for that right now.  227046 appears to have been an attempt to fix this, but I don't believe it does so completely.
- GCMetadataPrinter::finishAssembly was being called more than once per GCStrategy.  Each Strategy was being added to the GCModuleInfo multiple times.

Once I get time again, I'm going to split GCModuleInfo into the gc.root specific part and a GCStrategy owning Analysis pass.  I'm probably also going to kill off the Registry.  Once that's done, I'll move the new GCStrategyAnalysis and all built in GCStrategies into Analysis.  (As original suggested by Chandler.)  This will accomplish my original goal of being able to access GCStrategy from Transform/  without adding all of the builtin GCs to IR/.  

llvm-svn: 227109
2015-01-26 18:26:35 +00:00
Zachary Turner 39571b37a3 Teach raw_ostream to support hex formatting without a prefix '0x'.
Previously using format_hex() would always print a 0x prior to the
hex characters.  This allows this to be optional, so that one can
choose to print (e.g.) 255 as either 0xFF or just FF.

Differential Revision: http://reviews.llvm.org/D7151

llvm-svn: 227108
2015-01-26 18:21:33 +00:00
Alex Rosenberg f298f16ccf Remove trailing whitespace. NFC ®
llvm-svn: 227105
2015-01-26 18:02:18 +00:00
Alex Rosenberg 8108ef0a93 Remove trailing whitespace.
Also test commit email processing by including this char: '®'

llvm-svn: 227103
2015-01-26 17:35:56 +00:00
Eric Christopher a576281694 Move the Mips target to storing the ABI in the TargetMachine rather
than on MipsSubtargetInfo.

This required a bit of massaging in the MC level to handle this since
MC is a) largely a collection of disparate classes with no hierarchy,
and b) there's no overarching equivalent to the TargetMachine, instead
only the subtarget via MCSubtargetInfo (which is the base class of
TargetSubtargetInfo).

We're now storing the ABI in both the TargetMachine level and in the
MC level because the AsmParser and the TargetStreamer both need to
know what ABI we have to parse assembly and emit objects. The target
streamer has a pointer to the one in the asm parser and is updated
when the asm parser is created. This is fragile as the FIXME comment
notes, but shouldn't be a problem in practice since we always
create an asm parser before attempting to emit object code via the
assembler. The TargetMachine now contains the ABI so that the DataLayout
can be constructed dependent upon ABI.

All testcases have been updated to use the -target-abi command line
flag so that we can set the ABI without using a subtarget feature.

Should be no change visible externally here.

llvm-svn: 227102
2015-01-26 17:33:46 +00:00
Eric Christopher 6e4ed49d79 Store the passed in CPU name string so that it can be accessed later.
llvm-svn: 227101
2015-01-26 17:33:30 +00:00
Daniel Berlin 16f7a52628 Fix incorrect partial aliasing
Update testcases

llvm-svn: 227099
2015-01-26 17:31:17 +00:00
Daniel Berlin 8f10e387bb Fix delegation
llvm-svn: 227098
2015-01-26 17:30:39 +00:00
Sanjay Patel 078b612de1 fix line-endings; NFC
llvm-svn: 227095
2015-01-26 17:21:36 +00:00
Michael J. Spencer f5215652d5 [Support][Windows] Disable error dialog boxes when stack trace printing is enabled.
llvm-svn: 227094
2015-01-26 17:05:02 +00:00
Chris Bieneman 831fc5e87d Putting all the standard tool options into a "Generic" category.
Summary:
This puts all the options that CommandLine.cpp implements into a category so that the APIs to hide options can not hide based on the generic category instead of string matching a partial list of argument strings.

This patch is pretty simple and straight forward but it does impact the -help output of all tools using cl::opt. Specifically the options implemented in CommandLine.cpp (help, help-list, help-hidden, help-list-hidden, print-options, print-all-options, version) are all grouped together into an Option category, and these options are never hidden by the cl::HideUnrelatedOptions API.

Reviewers: dexonsmith, chandlerc, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7150

llvm-svn: 227093
2015-01-26 16:56:00 +00:00
Alex Rosenberg 89814b4762 [MC] The PS4's ELF OSABI value is the same as FreeBSD.
llvm-svn: 227091
2015-01-26 15:42:07 +00:00
Alex Rosenberg 93460674df Teach the autoconf machinery about the PS4 triple.
(I think the last checkin, r227060, got lost from the mailing lists because of the (R) in the comment.)

llvm-svn: 227090
2015-01-26 15:25:05 +00:00
Vasileios Kalintiris ef96a8ecd6 [mips] Enable arithmetic and binary operations for the i128 data type.
Summary:
This patch adds support for some operations that were missing from
128-bit integer types (add/sub/mul/sdiv/udiv... etc.). With these
changes we can support the __int128_t and __uint128_t data types
from C/C++.

Depends on D7125

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7143

llvm-svn: 227089
2015-01-26 12:33:22 +00:00
Vasileios Kalintiris 2ed214f387 [mips] Add tests for bitwise binary and integer arithmetic operators.
Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7125

llvm-svn: 227087
2015-01-26 12:04:40 +00:00
Joerg Sonnenberger 429edc1780 The canonical CPU variant for ARM according to config.guess uses a
suffix it seems:

    # ./config.guess
    earmv7hfeb-unknown-netbsd7.99.4

Extend the triple parsing to support this. Avoid running the ARM parser
multiple times because StringSwitch is not lazy.

Reviewers: Renato Golin, Tim Northover

Differential Revision: http://reviews.llvm.org/D7166

llvm-svn: 227085
2015-01-26 11:41:48 +00:00
Vladimir Medic 0516a5b686 When disassembler meets compact jump instructions for r6 it crashes as the access to operands array is out of range. This patch removes dedicated decoder method that wrongly handles decoding of these instructions.
llvm-svn: 227084
2015-01-26 10:33:43 +00:00
Vasileios Kalintiris 30c5451fbc Revert "[mips] Fix assertion on i128 addition/subtraction on MIPS64"
This reverts commit r227003. Support for addition/subtraction and
various other operations for the i128 data type will be added in a
future commit based on the review D7143.

llvm-svn: 227082
2015-01-26 09:53:30 +00:00
NAKAMURA Takumi 2081cefdbe Revert llvm/test/MC/ELF/noexec.s in r227074, "Fix a problem where the AArch64 ELF assembler was failing with"
It should be split into target-specific location.

llvm-svn: 227080
2015-01-26 09:30:29 +00:00
Erik Eckstein 98df6da740 SLPVectorizer: fix wrong scheduling of atomic load/stores.
This fixes PR22306.

llvm-svn: 227077
2015-01-26 09:07:04 +00:00
Eric Christopher d4d2bbe769 Correct the header guard for MipsABIInfo.h.
llvm-svn: 227076
2015-01-26 08:19:53 +00:00
Eric Christopher c8c76853b9 Fix a problem where the AArch64 ELF assembler was failing with
-no-exec-stack. This was due to it not deriving from the correct
asm info base class and missing the override for the exec
stack section query. Added another line to the noexec test
line to make sure this doesn't regress.

llvm-svn: 227074
2015-01-26 06:32:17 +00:00
Craig Topper d1e1d106ca [X86] Change comparision immediate type to i8 in test cases for AVX512 floating point comparisons. The type was already changed in the definitions and was being auto upgraded to the new type.
llvm-svn: 227064
2015-01-25 23:26:12 +00:00
Craig Topper 29f2e95185 [X86] Use i8 immediate for comparison type on AVX512 packed integer instructions. This matches floating point equivalents. Includes autoupgrade support to convert old code.
llvm-svn: 227063
2015-01-25 23:26:02 +00:00
Alex Rosenberg 38babd1eeb Add the triple for the Sony Playstation®4.
Lots more to follow.

llvm-svn: 227060
2015-01-25 22:46:59 +00:00
Adrian Prantl 40cb819c6f Debug info: Fix PR22296 by omitting the DW_AT_location if we lost the
physical register that is described in a DBG_VALUE.

In the testcase the DBG_VALUE describing "p5" becomes unavailable
because the register its address is in is clobbered and we (currently)
aren't smart enough to realize that the value is rematerialized immediately
after the DBG_VALUE and/or is actually a stack slot.

llvm-svn: 227056
2015-01-25 19:04:08 +00:00
Bill Schmidt 258ac3cc56 [PowerPC] Revert ppc64le-aggregates.ll test changes from r227053
It appears we have different behavior with and without -mcpu=pwr8 even
with ppc64le defaulting to POWER8.  The failure appears as follows:

/home/bb/cmake-llvm-x86_64-linux/llvm-project/llvm/test/CodeGen/PowerPC/ppc64le-aggregates.ll:268:14: error: expected string not found in input
; CHECK-DAG: lfs 1, 0([[REG]])
             ^
<stdin>:497:11: note: scanning from here
 ld 3, .LC1@toc@l(3)
          ^
<stdin>:497:11: note: with variable "REG" equal to "3"
 ld 3, .LC1@toc@l(3)
          ^
<stdin>:514:2: note: possible intended match here
 lfs 1, 0(4)
 ^

Reverting this particular test case change.  Nemanja, please have a look
at the reason for the failure.

llvm-svn: 227055
2015-01-25 18:18:54 +00:00
Bill Schmidt 279cabb450 [PowerPC] Reset the baseline for ppc64le to be equivalent to pwr8
Test by Nemanja Ivanovic.

Since ppc64le implies POWER8 as a minimum, it makes sense that the
same features are included. Since the pwr8 processor model will likely
be getting new features until the implementation is complete, I
created a new list to add these updates to. This will include them in
both pwr8 and ppc64le.

Furthermore, it seems that it would make sense to compose the feature
lists for other processor models (pwr3 and up). Per discussion in the
review, I will make this change in a subsequent patch.

In order to test the changes, I've added an additional run step to
test cases that specify -march=ppc64le -mcpu=pwr8 to omit the -mcpu
option. Since the feature lists are the same, the behaviour should be
unchanged.

llvm-svn: 227053
2015-01-25 18:05:42 +00:00
Simon Atanasyan 4685c839d4 [docs] Add link to the MIPS 64-bit ELF object file specification
llvm-svn: 227050
2015-01-25 16:20:30 +00:00
NAKAMURA Takumi 4834009898 Instantiate Registry<GCStrategy> in LLVMCore, to let it available on Win32 DLL.
llvm-svn: 227046
2015-01-25 15:05:36 +00:00
Simon Atanasyan 5d19c67a68 [ELFYAML] Support mips64 relocation record format in yaml2obj/obj2yaml
MIPS64 ELF file has a very specific relocation record format. Each
record might specify up to three relocation operations. So the `r_info`
field in fact consists of three relocation type sub-fields and optional
code of "special" symbols.

http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf
page 40

The patch implements support of the MIPS64 relocation record format in
yaml2obj/obj2yaml tools by introducing new optional Relocation fields:
Type2, Type3, and SpecSym. These fields are recognized only if the
object/YAML file relates to the MIPS64 target.

Differential Revision: http://reviews.llvm.org/D7136

llvm-svn: 227044
2015-01-25 13:29:25 +00:00
Elena Demikhovsky 1a603b3f13 AVX-512: Changes in operations on masks registers for KNL and SKX
- Added KSHIFTB/D/Q for skx
- Added KORTESTB/D/Q for skx
- Fixed store operation for v8i1 type for KNL
- Store size of v8i1, v4i1 and v2i1 are changed to 8 bits

llvm-svn: 227043
2015-01-25 12:47:15 +00:00
NAKAMURA Takumi c17696022d Orc/IRCompileLayer.h: Avoid non-static initializer.
llvm-svn: 227042
2015-01-25 11:41:56 +00:00
NAKAMURA Takumi 2fb9a523ab OrcJIT: Avoid non-static initializers.
llvm-svn: 227041
2015-01-25 11:41:49 +00:00
NAKAMURA Takumi 94a6bef31f Orc/LLVMBuild.txt: Prune redundant "Target" in libdeps.
llvm-svn: 227040
2015-01-25 11:41:41 +00:00
Craig Topper ca8e179bc2 [X86] Give scalar VRNDSCALE instructions priority in AVX512 mode.
llvm-svn: 227039
2015-01-25 08:49:22 +00:00
Craig Topper 1d60952401 Simplify a multiclass. No functional change.
llvm-svn: 227038
2015-01-25 08:49:19 +00:00
Craig Topper e3155c96ee Remove tab characters. NFC
llvm-svn: 227036
2015-01-25 08:45:32 +00:00
Elena Demikhovsky a3232f764e Implemented cost model for masked load/store operations.
llvm-svn: 227035
2015-01-25 08:44:46 +00:00
Craig Topper 53a846764c [X86] Replace i32i8imm on SSE/AVX instructions with i32u8imm which will make the assembler bounds check them. It will also make them print as unsigned.
llvm-svn: 227032
2015-01-25 02:21:16 +00:00
Craig Topper fc946a0e6f [X86] Use u8imm in several places that used i32i8imm that don't require an i32 type.
llvm-svn: 227031
2015-01-25 02:21:13 +00:00
Craig Topper e7f6cf437c Remove tab characters. NFC.
llvm-svn: 227030
2015-01-25 02:21:11 +00:00
Chandler Carruth db0809fff3 [PM] Remove the restricted visibility from the instcombine worklist. Now
that library consumers access the instcombine pass directly, they also
(transitively) access the worklist. Also, it would need to be used
directly in order to have a useful utility if we ever want that.

This should fix some warnings since I moved this code. Sorry for the
trouble.

llvm-svn: 227025
2015-01-25 00:30:05 +00:00
Lang Hames b5be1addce Remove a few more redundant ExecutionEngine regression tests.
llvm-svn: 227021
2015-01-24 22:41:13 +00:00
Charlie Turner d415cc8198 Fixup debug information references.
llvm-svn: 227020
2015-01-24 21:51:21 +00:00
Charlie Turner 6cba064414 Update references to lines of code count.
The number of lines of code in Kaleidoscope has risen from the
previously reported 700 to 986 according to the cloc tool. This tools
was run on the toy.cpp file from Chapter 8.

llvm-svn: 227019
2015-01-24 21:51:17 +00:00
Justin Bogner db7afdbb01 InstrProf: Add operator!= to coverage counters
I'll use this in clang shortly. Also makes the operator definition
style more consistent in this class.

llvm-svn: 227018
2015-01-24 21:13:23 +00:00
Justin Bogner 3c0f124a74 llvm-cov: Only combine segments if they overlap exactly
If two coverage segments cover the same area we need to combine them,
as per r218432. OTOH, just because they start at the same place
doesn't mean they cover the same area. This fixes the check to be more
exact about this.

This is pretty hard to test right now. The frontend doesn't currently
emit regions that start at the same place but don't overlap, but some
upcoming work changes this.

llvm-svn: 227017
2015-01-24 20:58:52 +00:00
Patrik Hagglund d03dad8b4b Revert r227013 "Add visibility attribute for InstCombinePass (r226987)."
Buildbot breakage.
http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/21749

llvm-svn: 227016
2015-01-24 20:35:36 +00:00
Saleem Abdulrasool fc07c72674 CodeGen: drive-by formatting clean ups
Minor tweaks to whitespace formatting that I noticed was off.  NFC.

llvm-svn: 227014
2015-01-24 20:19:45 +00:00
Patrik Hagglund 87aada2d3c Add visibility attribute for InstCombinePass (r226987).
Warning by gcc:
'llvm::InstCombinePass' declared with greater visibility than the type of its field 'llvm::InstCombinePass::Worklist' [-Wattributes]

llvm-svn: 227013
2015-01-24 20:06:53 +00:00
Benjamin Kramer 4d50b6c306 DebugInfo: Fix use after return found by asan.
llvm-svn: 227012
2015-01-24 19:55:23 +00:00
Lang Hames 069ff67b1a [Orc] Add TransformUtils to Orc's dependency list.
Patch by Jan Vesely. Thanks Jan!

llvm-svn: 227011
2015-01-24 19:00:09 +00:00
Lang Hames 73e55395e8 Remove a number of redundant ExecutionEngine regression tests.
These tests used to test the legacy JIT but since that has been removed they're
just redundantly testing MCJIT. Remove them and just leave their counterparts in
test/ExecutionEngine/MCJIT.

llvm-svn: 227010
2015-01-24 18:49:51 +00:00
Alexei Starovoitov 8a3d0c1557 bpf: add missing lit.local.cfg
llvm-svn: 227009
2015-01-24 18:20:52 +00:00
Alexei Starovoitov e4c8c807bb BPF backend
Summary:
V8->V9:
- cleanup tests

V7->V8:
- addressed feedback from David:
- switched to range-based 'for' loops
- fixed formatting of tests

V6->V7:
- rebased and adjusted AsmPrinter args
- CamelCased .td, fixed formatting, cleaned up names, removed unused patterns
- diffstat: 3 files changed, 203 insertions(+), 227 deletions(-)

V5->V6:
- addressed feedback from Chandler:
- reinstated full verbose standard banner in all files
- fixed variables that were not in CamelCase
- fixed names of #ifdef in header files
- removed redundant braces in if/else chains with single statements
- fixed comments
- removed trailing empty line
- dropped debug annotations from tests
- diffstat of these changes:
  46 files changed, 456 insertions(+), 469 deletions(-)

V4->V5:
- fix setLoadExtAction() interface
- clang-formated all where it made sense

V3->V4:
- added CODE_OWNERS entry for BPF backend

V2->V3:
- fix metadata in tests

V1->V2:
- addressed feedback from Tom and Matt
- removed top level change to configure (now everything via 'experimental-backend')
- reworked error reporting via DiagnosticInfo (similar to R600)
- added few more tests
- added cmake build
- added Triple::bpf
- tested on linux and darwin

V1 cover letter:
---------------------
recently linux gained "universal in-kernel virtual machine" which is called
eBPF or extended BPF. The name comes from "Berkeley Packet Filter", since
new instruction set is based on it.
This patch adds a new backend that emits extended BPF instruction set.

The concept and development are covered by the following articles:
http://lwn.net/Articles/599755/
http://lwn.net/Articles/575531/
http://lwn.net/Articles/603983/
http://lwn.net/Articles/606089/
http://lwn.net/Articles/612878/

One of use cases: dtrace/systemtap alternative.

bpf syscall manpage:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b4fc1a460f3017e958e6a8ea560ea0afd91bf6fe

instruction set description and differences vs classic BPF:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/filter.txt

Short summary of instruction set:
- 64-bit registers
  R0      - return value from in-kernel function, and exit value for BPF program
  R1 - R5 - arguments from BPF program to in-kernel function
  R6 - R9 - callee saved registers that in-kernel function will preserve
  R10     - read-only frame pointer to access stack
- two-operand instructions like +, -, *, mov, load/store
- implicit prologue/epilogue (invisible stack pointer)
- no floating point, no simd

Short history of extended BPF in kernel:
interpreter in 3.15, x64 JIT in 3.16, arm64 JIT, verifier, bpf syscall in 3.18, more to come in the future.

It's a very small and simple backend.
There is no support for global variables, arbitrary function calls, floating point, varargs,
exceptions, indirect jumps, arbitrary pointer arithmetic, alloca, etc.
From C front-end point of view it's very restricted. It's done on purpose, since kernel
rejects all programs that it cannot prove safe. It rejects programs with loops
and with memory accesses via arbitrary pointers. When kernel accepts the program it is
guaranteed that program will terminate and will not crash the kernel.

This patch implements all 'must have' bits. There are several things on TODO list,
so this is not the end of development.
Most of the code is a boiler plate code, copy-pasted from other backends.
Only odd things are lack or < and <= instructions, specialized load_byte intrinsics
and 'compare and goto' as single instruction.
Current instruction set is fixed, but more instructions can be added in the future.

Signed-off-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>

Subscribers: majnemer, chandlerc, echristo, joerg, pete, rengolin, kristof.beyls, arsenm, t.p.northover, tstellarAMD, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D6494

llvm-svn: 227008
2015-01-24 17:51:26 +00:00
Daniel Sanders 9a4f2c55df [mips] Fix 'jumpy' debug line info around calls.
Summary:
At the moment, address calculation is taking the debug line info from the
address node (e.g. TargetGlobalAddress). When a function is called multiple
times, this results in output of the form:

  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .. address calculation ..
  .loc $second_call_location
  .. function call ..
  .loc $first_call_location
  .. address calculation ..
  .loc $third_call_location
  .. function call ..

This patch makes address calculations for function calls take the debug line
info for the call node and results in output of the form:
  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .loc $second_call_location
  .. address calculation ..
  .. function call ..
  .loc $third_call_location
  .. address calculation ..
  .. function call ..

All other address calculations continue to use the address node.

Test Plan: Fixes test/DebugInfo/multiline.ll on a mips host.

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D7050

llvm-svn: 227005
2015-01-24 14:35:11 +00:00
Sylvestre Ledru 450f97dcb9 Update of the gold-plugin.cpp code to match Chandler's changes (r226981)
llvm-svn: 227004
2015-01-24 13:59:08 +00:00
Daniel Sanders 5a9225b262 [mips] Fix assertion on i128 addition/subtraction on MIPS64
Summary:
In addition to the included tests, this fixes
test/CodeGen/Generic/i128-addsub.ll on a mips64 host.

Reviewers: atanasyan, sagar, vmedic

Reviewed By: vmedic

Subscribers: sdkie, llvm-commits

Differential Revision: http://reviews.llvm.org/D6610

llvm-svn: 227003
2015-01-24 12:58:10 +00:00
Andrea Di Biagio 8381475a75 [DAG] Fix wrong canonicalization performed on shuffle nodes.
This fixes a regression introduced by r226816.
When replacing a splat shuffle node with a constant build_vector,
make sure that the new build_vector has a valid number of elements.

Thanks to Patrik Hagglund for reporting this problem and providing a
small reproducible.

llvm-svn: 227002
2015-01-24 11:54:29 +00:00
Chandler Carruth 9dea5cdb8e [PM] General doxygen and comment cleanup for this pass.
llvm-svn: 227001
2015-01-24 11:44:32 +00:00
Chandler Carruth 7253bba458 [PM] Reformat this code with clang-format so that I can use clang-format
when refactoring for the new pass manager without introducing too many
formatting changes into meaning full diffs.

llvm-svn: 227000
2015-01-24 11:33:55 +00:00
Chandler Carruth 43e590e51f [PM] Port LowerExpectIntrinsic to the new pass manager.
This just lifts the logic into a static helper function, sinks the
legacy pass to be a trivial wrapper of that helper fuction, and adds
a trivial wrapper for the new PM as well. Not much to see here.

I switched a test case to run in both modes, but we have to strip the
dead prototypes separately as that pass isn't in the new pass manager
(yet).

llvm-svn: 226999
2015-01-24 11:13:02 +00:00
Chandler Carruth c3bf5bd8cf [PM] Change LowerExpectIntrinsic to actually return true when it has
changed the IR. This is particularly easy as we can just look for the
existence of any expect intrinsic at all to know whether we've changed
the IR.

llvm-svn: 226998
2015-01-24 11:12:57 +00:00
Chandler Carruth 6eb60eb5c9 [PM] Use a more appropriate name for the statistics variable in
lower-expect, as we don't have 'if's in the IR and we use it for
switches as well.

llvm-svn: 226997
2015-01-24 10:57:25 +00:00
Chandler Carruth d12741e0a9 [PM] Switch tihs code to use a range based for loop over the function.
We can't switch the loop over the instructions because it needs to
early-increment the iterator.

llvm-svn: 226996
2015-01-24 10:57:19 +00:00
Chandler Carruth 3f5e7b1fb6 [PM] Use a SmallVector instead of std::vector to avoid heap allocations
for small switches, and avoid using a complex loop to set up the
weights.

We know what the baseline weights will be so we can just resize the
vector to contain all that value and clobber the one slot that is
likely. This seems much more direct than the previous code that tested
at every iteration, and started off by zeroing the vector.

llvm-svn: 226995
2015-01-24 10:47:13 +00:00
Chandler Carruth 0012c778a4 [PM] Pull the two helpers for this pass into static functions. There are
no members for them to use.

Also, make them accept references as there is no possibility of a null
pointer.

llvm-svn: 226994
2015-01-24 10:39:24 +00:00
Chandler Carruth 579c5c45c2 [PM] Add a basic doxygen comment for this pass.
llvm-svn: 226993
2015-01-24 10:32:53 +00:00
Chandler Carruth 0ea746bf9f [PM] Clean up the formatting of the LowerExpectIntrinsic pass prior to
refactoring its code.

llvm-svn: 226992
2015-01-24 10:30:14 +00:00
Chandler Carruth 72793727cc [PM] Move the LowerExpectIntrinsic pass to the Scalar library.
It was already in the Scalar header and referenced extensively as being
in this library, the source file was just in the utils directory for
some reason. No actual functionality changed. I noticed as it didn't
make sense to add a pass header to the utils headers.

llvm-svn: 226991
2015-01-24 10:18:47 +00:00
Yunzhong Gao a8cf495a15 If we see UTF-8 BOM sequence at the beginning of a response file, we shall
remove these bytes before parsing.

Phabricator Revision: http://reviews.llvm.org/D7156

llvm-svn: 226988
2015-01-24 04:23:08 +00:00
Chandler Carruth 83ba269e4b [PM] Port instcombine to the new pass manager!
This is exciting as this is a much more involved port. This is
a complex, existing transformation pass. All of the core logic is shared
between both old and new pass managers. Only the access to the analyses
is separate because the actual techniques are separate. This also uses
a bunch of different and interesting analyses and is the first time
where we need to use an analysis across an IR layer.

This also paves the way to expose instcombine utility functions. I've
got a static function that implements the core pass logic over
a function which might be mildly interesting, but more interesting is
likely exposing a routine which just uses instructions *already in* the
worklist and combines until empty.

I've switched one of my favorite instcombine tests to run with both as
well to make sure this keeps working.

llvm-svn: 226987
2015-01-24 04:19:17 +00:00
Filipe Cabecinhas de968ecb05 [Bitcode] Diagnose errors instead of asserting from bad input
Eventually we can make some of these pass the error along to the caller.

Reports a fatal error if:
We find an invalid abbrev record
We try to get an invalid abbrev number
We can't fill the current word due to an EOF

Fixed an invalid bitcode test to check for output with FileCheck

Bugs found with afl-fuzz

llvm-svn: 226986
2015-01-24 04:15:05 +00:00
Chandler Carruth c0291865ed [PM] Rework how the TargetLibraryInfo pass integrates with the new pass
manager to support the actual uses of it. =]

When I ported instcombine to the new pass manager I discover that it
didn't work because TLI wasn't available in the right places. This is
a somewhat surprising and/or subtle aspect of the new pass manager
design that came up before but I think is useful to be reminded of:

While the new pass manager *allows* a function pass to query a module
analysis, it requires that the module analysis is already run and cached
prior to the function pass manager starting up, possibly with
a 'require<foo>' style utility in the pass pipeline. This is an
intentional hurdle because using a module analysis from a function pass
*requires* that the module analysis is run prior to entering the
function pass manager. Otherwise the other functions in the module could
be in who-knows-what state, etc.

A somewhat surprising consequence of this design decision (at least to
me) is that you have to design a function pass that leverages
a module analysis to do so as an optional feature. Even if that means
your function pass does no work in the absence of the module analysis,
you have to handle that possibility and remain conservatively correct.
This is a natural consequence of things being able to invalidate the
module analysis and us being unable to re-run it. And it's a generally
good thing because it lets us reorder passes arbitrarily without
breaking correctness, etc.

This ends up causing problems in one case. What if we have a module
analysis that is *definitionally* impossible to invalidate. In the
places this might come up, the analysis is usually also definitionally
trivial to run even while other transformation passes run on the module,
regardless of the state of anything. And so, it follows that it is
natural to have a hard requirement on such analyses from a function
pass.

It turns out, that TargetLibraryInfo is just such an analysis, and
InstCombine has a hard requirement on it.

The approach I've taken here is to produce an analysis that models this
flexibility by making it both a module and a function analysis. This
exposes the fact that it is in fact safe to compute at any point. We can
even make it a valid CGSCC analysis at some point if that is useful.
However, we don't want to have a copy of the actual target library info
state for each function! This state is specific to the triple. The
somewhat direct and blunt approach here is to turn TLI into a pimpl,
with the state and mutators in the implementation class and the query
routines primarily in the wrapper. Then the analysis can lazily
construct and cache the implementations, keyed on the triple, and
on-demand produce wrappers of them for each function.

One minor annoyance is that we will end up with a wrapper for each
function in the module. While this is a bit wasteful (one pointer per
function) it seems tolerable. And it has the advantage of ensuring that
we pay the absolute minimum synchronization cost to access this
information should we end up with a nice parallel function pass manager
in the future. We could look into trying to mark when analysis results
are especially cheap to recompute and more eagerly GC-ing the cached
results, or we could look at supporting a variant of analyses whose
results are specifically *not* cached and expected to just be used and
discarded by the consumer. Either way, these seem like incremental
enhancements that should happen when we start profiling the memory and
CPU usage of the new pass manager and not before.

The other minor annoyance is that if we end up using the TLI in both
a module pass and a function pass, those will be produced by two
separate analyses, and thus will point to separate copies of the
implementation state. While a minor issue, I dislike this and would like
to find a way to cleanly allow a single analysis instance to be used
across multiple IR unit managers. But I don't have a good solution to
this today, and I don't want to hold up all of the work waiting to come
up with one. This too seems like a reasonable thing to incrementally
improve later.

llvm-svn: 226981
2015-01-24 02:06:09 +00:00
Richard Smith baa128ea5a Bring the modules buildbot back to life after r226940.
llvm-svn: 226980
2015-01-24 01:55:52 +00:00
Kuba Brecka a62bbccad0 Reverting r226937: lit: Make MCJIT's supported arch check case insensitive
The r226937 commit causes ASan lit tests to be all skipped on OS X.

llvm-svn: 226979
2015-01-24 01:42:44 +00:00
Quentin Colombet 29f553398f [AArch64][LoadStoreOptimizer] Form LDPSW when possible.
This patch adds the missing LD[U]RSW variants to the load store optimizer, so
that we generate LDPSW when possible.

<rdar://problem/19583480>

llvm-svn: 226978
2015-01-24 01:25:54 +00:00
Lang Hames f4ff3435e7 [Orc] Add some missing headers to the CompileOnDemandLayer.h
llvm-svn: 226975
2015-01-24 00:45:11 +00:00
Bruno Cardoso Lopes ddcc2e31a7 [x86] Fix a comment
llvm-svn: 226974
2015-01-24 00:22:04 +00:00
Lang Hames 17e6caee63 [Orc] Add orcjit to the dependencies list in the Makefile for lli.
This should fix a few more broken bots.

llvm-svn: 226973
2015-01-24 00:01:29 +00:00
Tom Stellard edd188c459 R600/SI: Emit .hsa.version section for amdhsa OS
llvm-svn: 226970
2015-01-23 23:59:08 +00:00
Reid Kleckner 3d4638b391 Fix assertion when C++ EH filters are present in functions using SEH
Should fix PR22305.

llvm-svn: 226969
2015-01-23 23:51:25 +00:00
Adrian Prantl 70f2a736db Address more review comments for DIExpression::iterator.
- input_iterator
- define an operator->
- make constructors private were possible

llvm-svn: 226967
2015-01-23 23:40:47 +00:00
Justin Bogner cf3063b4a3 InstrProf: debug dumps should go to dbgs(), not outs()
llvm-svn: 226964
2015-01-23 23:28:30 +00:00
Justin Bogner 0b9858dca5 llvm-cov: Don't use llvm::outs() in library code
Nothing in lib/ should be using llvm::outs() directly. Thread it in
from the caller instead.

llvm-svn: 226961
2015-01-23 23:09:27 +00:00
Justin Bogner 000b5223e4 llvm-cov: Use range-for (NFC)
llvm-svn: 226960
2015-01-23 22:57:02 +00:00
Reid Kleckner f4ebbc6825 mips: Fix "XPASS" test results by removing 'not' commands
These tests are asserting and crashing for me, and 'not' sees that as a
non-zero exit code instead of a signal code for obscure Windows reasons.
This causes the test to pass, giving me an unclean 'ninja check'.

The test is already XFAILd, so just run the test without 'not' and let
lit handle the failure.

llvm-svn: 226958
2015-01-23 22:55:31 +00:00
Bruno Cardoso Lopes 56567f9135 [x86] Combine x86mmx/i64 to v2i64 conversion to use scalar_to_vector
Handle the poor codegen for i64/x86xmm->v2i64 (%mm -> %xmm) moves. Instead of
using stack store/load pair to do the job, use scalar_to_vector directly, which
in the MMX case can use movq2dq. This was the current behavior prior to
improvements for vector legalization of extloads in r213897.

This commit fixes the regression and as a side-effect also remove some
unnecessary shuffles.

In the new attached testcase, we go from:

pshufw  $-18, (%rdi), %mm0
movq    %mm0, -8(%rsp)
movq    -8(%rsp), %xmm0
pshufd  $-44, %xmm0, %xmm0
movd    %xmm0, %eax
...

To:

pshufw  $-18, (%rdi), %mm0
movq2dq %mm0, %xmm0
movd    %xmm0, %eax
...

Differential Revision: http://reviews.llvm.org/D7126
rdar://problem/19413324

llvm-svn: 226953
2015-01-23 22:44:16 +00:00
Justin Bogner 011c742535 llvm-cov: clang-format the GCOV files (NFC)
llvm-svn: 226952
2015-01-23 22:38:01 +00:00
Reid Kleckner 7b23b43bfe Fix the MSVC build with the new Orc JIT APIs
llvm-svn: 226949
2015-01-23 22:25:47 +00:00
Michael J. Spencer 58e419593c [YAMLIO] Dirty hack: Force integral conversion to allow strong typedefs to convert.
llvm-svn: 226948
2015-01-23 22:24:57 +00:00
Lang Hames 28452d85c8 [Orc] Remove a bunch of constructors from ObjectLinkingLayer.
These constructors were causing trouble for MSVC and older GCCs. This should
fix more of the build failures from r226940.

llvm-svn: 226946
2015-01-23 22:11:07 +00:00
Tom Stellard 20f6c0732f R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select()
We used to do this promotion during DAG legalization, but this
caused an infinite loop in ExpandUnalignedLoad() because it assumed
that i64 loads were legal if i64 was a legal type.

It also seems better to report i64 loads as legal, since they actually
are and we were just promoting them to simplify our tablegen files.

llvm-svn: 226945
2015-01-23 22:05:45 +00:00
Michael J. Spencer e368a62676 [Object][ELF] Test unknown type.
llvm-svn: 226943
2015-01-23 21:58:09 +00:00
Michael J. Spencer 731cae3839 [YAMLIO] Add support for numeric values in enums.
llvm-svn: 226942
2015-01-23 21:57:50 +00:00
Lang Hames d235df0aaa [Orc] LLVMLinkInOrcMCJITReplacement shouldn't be in the anonymous namespace.
This should fix some of the builder errors from r226940.

llvm-svn: 226941
2015-01-23 21:49:12 +00:00
Lang Hames 93de2a12a3 [Orc] New JIT APIs.
This patch adds a new set of JIT APIs to LLVM. The aim of these new APIs is to
cleanly support a wider range of JIT use cases in LLVM, and encourage the
development and contribution of re-usable infrastructure for LLVM JIT use-cases.

These APIs are intended to live alongside the MCJIT APIs, and should not affect
existing clients.

Included in this patch:

1) New headers in include/llvm/ExecutionEngine/Orc that provide a set of
   components for building JIT infrastructure.
   Implementation code for these headers lives in lib/ExecutionEngine/Orc.

2) A prototype re-implementation of MCJIT (OrcMCJITReplacement) built out of the
   new components.

3) Minor changes to RTDyldMemoryManager needed to support the new components.
   These changes should not impact existing clients.

4) A new flag for lli, -use-orcmcjit, which will cause lli to use the
   OrcMCJITReplacement class as its underlying execution engine, rather than
   MCJIT itself.

Tests to follow shortly.

Special thanks to Michael Ilseman, Pete Cooper, David Blaikie, Eric Christopher,
Justin Bogner, and Jim Grosbach for extensive feedback and discussion.

llvm-svn: 226940
2015-01-23 21:25:00 +00:00
Adrian Prantl 4bd0a0c3ed Move the accessor functions from DIExpression::iterator into a wrapper
DIExpression::Operand, so we can write range-based for loops.

Thanks to David Blaikie for the idea.

llvm-svn: 226939
2015-01-23 21:24:41 +00:00
Reid Kleckner 5f5036a073 lit: Make MCJIT's supported arch check case insensitive
Should make the tests run when using CMake on systems where 'uname -p'
reports "amd64", such as FreeBSD.

Should fix PR21559.

llvm-svn: 226937
2015-01-23 21:11:40 +00:00
Kevin Enderby 479ee6135d Fix the problem with llvm-objdump and -archive-headers in printing the archive header size field.
This problem showed up with the clang-cmake-armv7-a15-full bot.  Thanks to Renato Golin for his help.

llvm-svn: 226936
2015-01-23 21:02:44 +00:00
Alexei Starovoitov 4ea2f606a8 [mips] fix spelling of 'disassembler'
trivial first commit

llvm-svn: 226935
2015-01-23 21:00:08 +00:00
Hans Wennborg ae9c971a2f LowerSwitch: replace unreachable default with popular case destination
SimplifyCFG currently does this transformation, but I'm planning to remove that
to allow other passes, such as this one, to exploit the unreachable default.

This patch takes care to keep track of what case values are unreachable even
after the transformation, allowing for more efficient lowering.

Differential Revision: http://reviews.llvm.org/D6697

llvm-svn: 226934
2015-01-23 20:43:51 +00:00
Colin LeMahieu bc2f47a76e [Objdump] Output information about common symbols in a way closer to GNU objdump.
llvm-svn: 226932
2015-01-23 20:06:24 +00:00
Ramkumar Ramachandra c3c8b27616 [emacs] llvm-mode: fix parens, font-lock i*
In llvm-mode, with electric-pair-mode turned on, typing a literal '['
would print out '[[', and '(' would print a '(('. This was a very
annoying bug caused by overzealous syntax-table entries: the parens are
already part of the '(' and ')' class by default. Fix this.

While at it, notice that i32, i64, i1 etc. are not font-locked despite a
clear intent to do so. The issue is that regexp-opt doesn't accept
regular expressions. So, spell out the common literal integers with
different widths.

Differential Revision: http://reviews.llvm.org/D7036

llvm-svn: 226931
2015-01-23 19:45:35 +00:00
Kevin Enderby 69fe98da14 Add the option, -data-in-code, to llvm-objdump used with -macho to print the Mach-O data in code table.
llvm-svn: 226921
2015-01-23 18:52:17 +00:00
Reid Kleckner 5cc1569c54 Classify functions by EH personality type rather than using the triple
This mostly reverts commit r222062 and replaces it with a new enum. At
some point this enum will grow at least for other MSVC EH personalities.

Also beefs up the way we were sniffing the personality function.
Previously we would emit the Itanium LSDA despite using
__C_specific_handler.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D6987

llvm-svn: 226920
2015-01-23 18:49:01 +00:00
Adrian Prantl 577feba44b Debug Info / PR22309: Allow union types to be emitted as unsigned constants.
llvm-svn: 226919
2015-01-23 18:01:39 +00:00
Eric Christopher a1c6e0c8ce Remove some local variables in place of just querying for them
in the couple of asserts.

llvm-svn: 226917
2015-01-23 17:22:44 +00:00
Toma Tabacu c405c82214 [mips] Add new error message and improve testing for parsing the .module directive.
Summary:
We used to silently ignore any empty .module's and we used to give an error saying that we found
an "unexpected token at start of statement" when the value of the option wasn't an identifier (e.g. if it was a number).

We now give an error saying that we "expected .module option identifier" in both of those cases.

I also fixed the other tests in mips-abi-bad.s, which all seemed to be broken.


Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7095

llvm-svn: 226905
2015-01-23 10:40:19 +00:00
Jyoti Allur f1d7050a25 This patch fixes issue with lowering below mentioned pattern :-
_foo:
        smull	 r0, r1, r1, r0
	smull	 r2, r3, r3, r2
	adds	r0, r2, r0
	adc	r1, r3, r1
	bx	lr

to

_foo:
        smull	 r0, r1, r1, r0
	smlal	 r0, r1, r3, r2
	bx	lr

llvm-svn: 226904
2015-01-23 09:10:03 +00:00
Craig Topper 0271d10d35 [x86] Change u8imm operands to always print as unsigned. This makes shuffle masks and the like make way more sense.
llvm-svn: 226902
2015-01-23 08:00:59 +00:00
Mehdi Amini 5059813c2d DAGCombine: always constant fold FMA when target disable FP exceptions
Summary: When trying to constant fold an FMA in the DAG, getNode()
fails to fold the FMA if an operand is not finite. In this case this
patch allows the constant folding if !TLI->hasFloatingPointExceptions()

Reviewers: resistor

Reviewed By: resistor

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D6912

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 226901
2015-01-23 07:07:20 +00:00
Lang Hames ac92bfc505 [ADT] Add move operations to SmallVector<T,N> from SmallVectorImpl<T>.
This makes it possible to move between SmallVectors of different sizes.

Thanks to Dave Blaikie and Duncan Smith for patch feedback.

llvm-svn: 226899
2015-01-23 06:25:17 +00:00
Craig Topper f9c6598a9c Fix 80 column violation
llvm-svn: 226898
2015-01-23 06:18:35 +00:00
Craig Topper 46469aa4da [X86] Add IntrNoMem to the AVX512 conflict intrinsics.
llvm-svn: 226897
2015-01-23 06:11:45 +00:00
Rafael Espindola 5fa925ebf6 Add STB_GNU_UNIQUE to the ELF writer.
This lets llvm-mc assemble files produced by gcc.

llvm-svn: 226895
2015-01-23 04:44:35 +00:00
NAKAMURA Takumi 7ca79f7f96 Prune an out-of-date \param since r226476. [-Wdocumentation]
llvm-svn: 226890
2015-01-23 01:05:12 +00:00
NAKAMURA Takumi 2bbc90cca5 Reformat.
llvm-svn: 226888
2015-01-23 01:02:07 +00:00
NAKAMURA Takumi f6eee4ad67 MipsAsmParser.cpp: Suppress a warning introduced in r226657. [-Wunused-variable]
llvm-svn: 226887
2015-01-23 01:01:52 +00:00
Jan Vesely 5f715d36a7 R600: Try to use lower types for 64bit division if possible
v2: add and enable tests for SI

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 226881
2015-01-22 23:42:43 +00:00
Jan Vesely 6269e3ca2f SelectionDAG: Add KnownBits and SignBits computation for EXTRACT_ELEMENT
v2: use getZExtValue
    add missing break
    codestyle

v3: add few more comments

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 226880
2015-01-22 23:42:41 +00:00
Jan Vesely f7987ca5a7 R600: Simplify LowerUDIVREM
optimizations can handle removing the Hi part operations.
The generated code is identical for R600, ~10% icount reduction for SI

v2: rebase

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 226879
2015-01-22 23:42:39 +00:00
Duncan P. N. Exon Smith 68ab023ef7 IR: Change GenericDwarfNode::getHeader() to StringRef
Simplify the API to use a `StringRef` directly rather than exposing the
`MDString` bits underneath.

llvm-svn: 226876
2015-01-22 23:10:55 +00:00
Duncan P. N. Exon Smith e8b5e49ffd IR: DwarfNode => DebugNode, NFC
These things are potentially used for non-DWARF data (see the discussion
in PR22235), so take the `Dwarf` out of the name.  Since the new name
gives fewer clues, update the doxygen to properly describe what they
are.

llvm-svn: 226874
2015-01-22 22:47:44 +00:00
Simon Pilgrim 7e6d573e87 [X86][AVX] Added (V)MOVDDUP / (V)MOVSLDUP / (V)MOVSHDUP memory folding + tests.
Minor tweak now that D7042 is complete, we can enable stack folding for (V)MOVDDUP and do proper testing.

Added missing AVX ymm folding patterns and fixed alignment for AVX VMOVSLDUP / VMOVSHDUP.

llvm-svn: 226873
2015-01-22 22:39:59 +00:00
Simon Pilgrim c976e8eef4 Line endings fixes. NFC.
llvm-svn: 226872
2015-01-22 22:27:37 +00:00
Simon Pilgrim 507b37fcbb [X86][SSE] Simplified PSUBUS tests
Removed loops from PSUBUS tests - ensures folding is tested. Also renamed SSE2 tests SSSE3 to match cpu.

This is a follow up commit agreed in http://reviews.llvm.org/D7094

llvm-svn: 226871
2015-01-22 22:19:58 +00:00