Commit Graph

106628 Commits

Author SHA1 Message Date
NAKAMURA Takumi 15ac9af4f4 Fix llvm/test/DebugInfo/X86/recursive_inlining.ll to use %llc_dwarf.
llvm-svn: 215181
2014-08-08 02:24:05 +00:00
NAKAMURA Takumi 40da267976 AArch64InstrInfo.cpp: Fix \param(s). [-Wdocumentation]
llvm-svn: 215180
2014-08-08 02:04:18 +00:00
Anton Yartsev 671dff1e49 [tablegen] - Eliminate memory leaks in TGParser.cpp
Ugly solution indicating that a refactoring is necessary to get the ownership under control.

llvm-svn: 215176
2014-08-08 00:29:54 +00:00
Adam Nemet 7d498629f1 [AVX512] Add zero-masking variant to AVX512_masking multiclass
This completes one item from the todo-list of r215125 "Generate masking
instruction variants with tablegen".

The AddedComplexity is needed just like for the k variant.

Added a codegen test based on valignq.

llvm-svn: 215173
2014-08-07 23:53:38 +00:00
Gerolf Hoflehner ea96a3d336 Fix for multi-line comment warning
llvm-svn: 215169
2014-08-07 23:19:55 +00:00
Adam Nemet fa1f7201fc [AVX512] Add codegen test for the masking variant of valign
The AddedComplexity is needed just like in avx512_perm_3src.  There may be a
bug in the complexity computation...

llvm-svn: 215168
2014-08-07 23:18:18 +00:00
Akira Hatanaka 5acc58fcfb [stack protector] Look through bitcasts to get global variable
__stack_chk_guard.

Handle the case where the pointer operand of the load instruction that loads the
stack guard is not a global variable but instead a bitcast.

%StackGuard = load i8** bitcast (i64** @__stack_chk_guard to i8**)
call void @llvm.stackprotector(i8* %StackGuard, i8** %StackGuardSlot)

Original test case provided by Ana Pazos.

This fixes PR20558.

llvm-svn: 215167
2014-08-07 23:08:24 +00:00
Adrian Prantl 80c8b2742f Make these regexes stricter by disallowing any additional characters in the output.
Thanks to dblaikie for pointing this out!

llvm-svn: 215166
2014-08-07 23:04:07 +00:00
Arnold Schwaighofer 4fb3c47456 SLPVectorizer: Use the type of the value loaded/stored to get the ABI alignment
We were using the pointer type which is incorrect.

llvm-svn: 215162
2014-08-07 22:47:27 +00:00
Adrian Prantl 03c67849d1 Add a separate testcase for a DWARF expression describing a value in a
subregister.

llvm-svn: 215161
2014-08-07 22:44:34 +00:00
Adrian Prantl 26e66b155f Reflow this comment.
llvm-svn: 215160
2014-08-07 22:44:24 +00:00
David Blaikie 09fdfabdda DebugInfo: Fix overwriting/loss of inlined arguments to recursively inlined functions.
Due to an unnecessary special case, inlined arguments that happened to
be from the same function as they were inlined into were misclassified
as non-inline arguments and would overwrite the non-inlined arguments.

Assert that we never overwrite a function's arguments, and stop
misclassifying inlined arguments as non-inline arguments to fix this
issue.

Excuse the rather crappy test case - handcrafted IR might do better, or
someone who understands better how to tickle the inliner to create a
recursive inlining situation like this (though it may also be necessary
to tickle the variable in a particular way to cause it to be recorded in
the MMI side table and go down this particular path for location
information).

llvm-svn: 215157
2014-08-07 22:22:49 +00:00
Reed Kotler 87048a4c9e fix materialization of one bit constants and global values which are accessed through
a base GOT entry.

Summary:
get tip of tree mips fast-isel to pass test-suite

Two bugs were fixed:

1) one bit booleans were treated as 1 bit signed integers and so the literal '1' could become sign extended.
2) mips uses got for pic but in certain cases, as with string constants for example, many items can be referenced from the same got entry and this case was not handled properly.

Test Plan: test-suite

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: mcrosier

Differential Revision: http://reviews.llvm.org/D4801

llvm-svn: 215155
2014-08-07 22:09:01 +00:00
Eric Christopher b9fd9ed37e Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.

Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reverts commits r215111, 215115, 215116, 215117, 215136.

llvm-svn: 215154
2014-08-07 22:02:54 +00:00
Gerolf Hoflehner b5220dc779 Debugging Utility - optional ability for dumping critical path length
llvm-svn: 215153
2014-08-07 21:49:44 +00:00
Gerolf Hoflehner 97c383bc36 MachineCombiner Pass for selecting faster instruction sequence on AArch64
Re-commit of r214832,r21469 with a work-around that
avoids the previous problem with gcc build compilers

The work-around is to use SmallVector instead of ArrayRef
of basic blocks in preservesResourceLen()/MachineCombiner.cpp

llvm-svn: 215151
2014-08-07 21:40:58 +00:00
Kevin Enderby ae2a9a236f Add two missing ARM cpusubtypes to the switch statement in
MachOObjectFile::getArch(uint32_t CPUType, uint32_t CPUSubType) .

Upcoming changes will cause existing test cases to use this but
I wanted to check in this obvious change separately.

llvm-svn: 215150
2014-08-07 21:30:25 +00:00
Owen Anderson 6c19ab1b5d Fix a case in SROA where lifetime intrinsics could inhibit alloca promotion. In
this case, the code path dealing with vector promotion was missing the explicit
checks for lifetime intrinsics that were present on the corresponding integer
promotion path.

llvm-svn: 215148
2014-08-07 21:07:35 +00:00
Lang Hames 4ea28e2294 [MCJIT] Replace a c-style cast with reinterpret_cast + static_cast.
C-style casts (and reinterpret_casts) result in implementation defined
values when a pointer is cast to a larger integer type. On some platforms
this was leading to bogus address computations in RuntimeDyldMachOAArch64.

This should fix http://llvm.org/PR20501.

llvm-svn: 215143
2014-08-07 20:41:57 +00:00
Richard Smith 56579b6324 Remove Support/IncludeFile.h and its only user. This is actively harmful, since
it breaks the modules builds (where CallGraph.h can be quite reasonably
transitively included by an unimported portion of a module, and CallGraph.cpp
not linked in), and appears to have been entirely redundant since PR780 was
fixed back in 2008.

If this breaks anything, please revert; I have only tested this with a single
configuration, and it's possible that this is still somehow fixing something
(though I doubt it, since no other similar file uses this mechanism any more).

llvm-svn: 215142
2014-08-07 20:41:17 +00:00
Rafael Espindola cc847b6376 Fix test failure on ARM.
llvm-svn: 215140
2014-08-07 20:33:06 +00:00
Richard Smith c54302a1d0 [modules] Update module map workaround to cope with the problematic file having
been relocated.

llvm-svn: 215139
2014-08-07 20:27:08 +00:00
Frederic Riss e6bb1871eb test commit: remove trailing whitespace.
llvm-svn: 215138
2014-08-07 20:04:00 +00:00
Rafael Espindola e20470d399 Remove a few XFAILs.
These tests now pass with MCJIT.

llvm-svn: 215136
2014-08-07 19:35:22 +00:00
Akira Hatanaka bbd33f6766 [Branch probability] Recompute branch weights of tail-merged basic blocks.
BranchFolderPass was not correctly setting the basic block branch weights when
tail-merging created or merged blocks. This patch recomutes the weights of
tail-merged blocks using the following formula:

branch_weight(merged block to successor j) =
sum(block_frequency(bb) * branch_probability(bb -> j))

bb is a block that is in the set of merged blocks.

<rdar://problem/16256423>

llvm-svn: 215135
2014-08-07 19:30:13 +00:00
Joerg Sonnenberger 54c340b76a Add the majority of the remaining SPE instructions.
llvm-svn: 215131
2014-08-07 18:52:39 +00:00
Justin Bogner 1b9f936f91 FileCheck: Add a flag to allow checking empty input
Currently FileCheck errors out on empty input. This is usually the
right thing to do, but makes testing things like "this command does
not emit some error message" hard to test. This usually leads to
people using "command 2>&1 | count 0" instead, and then the bots that
use guard malloc fail a few hours later.

By adding a flag to FileCheck that allows empty inputs, we can make
tests that consist entirely of "CHECK-NOT" lines feasible.

llvm-svn: 215127
2014-08-07 18:40:37 +00:00
Joerg Sonnenberger f74c74693c Indent
llvm-svn: 215126
2014-08-07 18:05:32 +00:00
Adam Nemet 2e2537f665 [AVX512] Generate masking instruction variants with tablegen
After adding the masking variants to several instructions, I have decided to
experiment with generating these from the non-masking/unconditional
variant. This will hopefully reduce the amount repetition that we currently
have in order to define an instruction with all its variants (for a reg/mem
instruction this would be 6 instruction defs and 2 Pat<> for the intrinsic).

The patch is the first cut that is currently only applied to valignd/q to make
the patch small.

A few notes on the approach:

  * In order to stitch together the dag for both the conditional and the
  unconditional patterns I pass the RHS of the set rather than the full
  pattern (set dest, RHS).
  * Rather than subclassing each instruction base class (e.g. AVX512AIi8),
  with a masking variant which wouldn't scale, I derived the masking
  instructions from a new base class AVX512 (this is just I<> with
  Requires<HasAVX512>).  The instructions derive from this now, plus a new set
  of classes that add the format bits and everything else that instruction
  base class provided (i.e. AVX512AIi8 vs. AVX512AIi8Base).

I hope we can go incrementally from here.  I expect that:

  * We will need different variants of the masking class.  One example is
  instructions requiring three vector sources.  In this case we tie one of the
  source operands to dest rather than a new implicit source operand ($src0)
  * Add the zero-masking variant
  * Add more AVX512*Base classes as new uses are added

I've looked at X86.td.expanded before and after to make sure that nothing got
lost for valignd/q.

llvm-svn: 215125
2014-08-07 17:53:55 +00:00
NAKAMURA Takumi 17ae2fca4c llvm/test/tools/llvm-objdump: Reorganize target-dependent some tests.
llvm-svn: 215122
2014-08-07 17:17:19 +00:00
Rafael Espindola a3ddbc9d23 Fix the ocaml bindings.
llvm-svn: 215117
2014-08-07 14:48:13 +00:00
Rafael Espindola ea9c317000 fix configure+make build
llvm-svn: 215116
2014-08-07 14:38:49 +00:00
Rafael Espindola f8b27c41e8 Nuke the old JIT.
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.

Thanks to Lang Hames for making MCJIT a good replacement!

llvm-svn: 215111
2014-08-07 14:21:18 +00:00
Joerg Sonnenberger 84d35dfe96 Add mfasr and mtasr
llvm-svn: 215110
2014-08-07 13:35:34 +00:00
Joerg Sonnenberger 853feaa808 Add mfrtcu and mfrtcl instructions
llvm-svn: 215109
2014-08-07 13:16:58 +00:00
Joerg Sonnenberger 1837a7b4fa Support mttbl and mttbu mnemonic
llvm-svn: 215108
2014-08-07 13:06:23 +00:00
Joerg Sonnenberger a3d4dc9eb4 Add RFID instruction.
llvm-svn: 215105
2014-08-07 12:39:59 +00:00
Joerg Sonnenberger 83ef5c7753 Fix Itineray class of rfi
llvm-svn: 215104
2014-08-07 12:35:16 +00:00
Joerg Sonnenberger 6ae087abc6 Spell e500 feature in lower case.
llvm-svn: 215103
2014-08-07 12:31:28 +00:00
Joerg Sonnenberger 39f095ae5a Add first bunch of SPE instructions. As they overlap with Altivec, mark
them as parser-only until the disassembler is extended to handle
predicates properly.

llvm-svn: 215102
2014-08-07 12:18:21 +00:00
Alexander Kornienko 7151ad7762 Insert parens to avoid a warning:
suggest parentheses around arithmetic in operand of '^' [-Wparentheses]

llvm-svn: 215101
2014-08-07 12:09:34 +00:00
Aaron Ballman b677f7ac4b Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). No functional changes intended.
llvm-svn: 215100
2014-08-07 12:07:33 +00:00
Daniel Sanders 449344315f [mips] Add assembler support for .set msa/nomsa directive.
Summary:
These directives are used to toggle whether the assembler accepts MSA-specific instructions or not.

Patch by Matheus Almeida and Toma Tabacu.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4783

llvm-svn: 215099
2014-08-07 12:03:36 +00:00
Pavel Chupin 124889243a Fix lld-x86_64-win7 Build #11969
llvm-svn: 215097
2014-08-07 11:09:59 +00:00
Chandler Carruth 4e8fcbd3fd [x86] Fix another miscompile found through fuzz testing the new vector
shuffle lowering.

This is closely related to the previous one. Here we failed to use the
source offset when swapping in the other case -- where we end up
swapping the *final* shuffle. The cause of this bug is a bit different:
I simply wasn't thinking about the fact that this mask is actually
a slice of a wide mask and thus has numbers that need SourceOffset
applied. Simple fix. Would be even more simple with an algorithm-y thing
to use here, but correctness first. =]

llvm-svn: 215095
2014-08-07 10:37:35 +00:00
Chandler Carruth e206385e99 [x86] Fix another miscompile in the new vector shuffle lowering found
via the fuzz tester.

Here I missed an offset when round-tripping a value through a shuffle
mask. I got it right 2 lines below. See a problem? I do. ;] I'll
probably be adding a little "swap" algorithm which accepts a range and
two values and swaps those values where they occur in the range. Don't
really have a name for it, let me know if you do.

llvm-svn: 215094
2014-08-07 10:14:27 +00:00
Chandler Carruth 78494364d1 [x86] Fix another miscompile in the new vector shuffle lowering found
through the new fuzzer.

This one is great: bad operator precedence led the modulus to happen at
the wrong point. All the asserts didn't fire because there were usually
the right values past the end of the 4 element region we were looking
at. Probably could have gotten a crash here with ASan + fuzzing, but the
correctness tests pinpointed this really nicely.

llvm-svn: 215092
2014-08-07 09:45:02 +00:00
Pavel Chupin f55eb450e5 [x32] Use ebp/esp as frame and stack pointer
Summary:
Since pointers are 32-bit on x32 we can use ebp and esp as frame and stack
pointer. Some operations like PUSH/POP and CFI_INSTRUCTION still
require 64-bit register, so using 64-bit MachineFramePtr where required.

X86_64 NaCl uses 64-bit frame/stack pointers, however it's been found that
both isTarget64BitLP64 and isTarget64BitILP32 are true for NaCl. Addressing
this issue here as well by making isTarget64BitLP64 false.

Also mark hasReservedSpillSlot unreachable on X86. See inlined comments.

Test Plan: Add one new simple test and upgrade 2 existing with x32 target case.

Reviewers: nadav, dschuff

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D4617

llvm-svn: 215091
2014-08-07 09:41:19 +00:00
Chandler Carruth 27046758de [x86] Fix a miscompile in the new shuffle lowering found through the new
fuzz testing.

The function which tested for adjacency did what it said on the tin, but
when I called it, I wanted it to do something more thorough: I wanted to
know if the *pairs* of shuffle elements were adjacent and started at
0 mod 2. In one place I had the decency to try to test for this, but in
the other it was completely skipped, miscompiling this test case. Fix
this by making the helper actually do what I wanted it to do everywhere
I called it (and removing the now redundant code in one place).

I *really* dislike the name "canWidenShuffleElements" for this
predicate. If anyone can come up with a better name, please let me know.
The other name I thought about was "canWidenShuffleMask" but is it
really widening the mask to reduce the number of lanes shuffled? I don't
know. Naming things is hard.

llvm-svn: 215089
2014-08-07 08:11:31 +00:00
Pete Cooper 9b90dc7c6a Update Tablegen documents given that binary literals are now sized
llvm-svn: 215088
2014-08-07 05:47:13 +00:00
Pete Cooper 94891ddf0d Update BitRecTy::convertValue to allow if expressions with bit values on both sides of the if
llvm-svn: 215087
2014-08-07 05:47:10 +00:00
Pete Cooper 0bf1ea72ee Change the { } expression in tablegen to accept sized binary literals which are not just 0 and 1.
It also allows nested { } expressions, as now that they are sized, we can merge pull bits from the nested value.

In the current behaviour, everything in { } must have been convertible to a single bit.
However, now that binary literals are sized, its useful to be able to initialize a range of bits.

So, for example, its now possible to do

bits<8> x = { 0, 1, { 0b1001 }, 0, 0b0 }

llvm-svn: 215086
2014-08-07 05:47:07 +00:00
Pete Cooper 2cfdfe5882 Change BitsInit to inherit from TypedInit.
This is useful in a later patch where binary literals such as 0b000 will become BitsInit values instead of IntInit values.

llvm-svn: 215085
2014-08-07 05:47:04 +00:00
Pete Cooper 2597764ad9 Change TableGen so that binary literals such as 0b001 are now sized.
Instead of these becoming an integer literal internally, they now become bits<n> values.

Prior to this change, 0b001 was 1 bit long.  This is confusing as clearly the user gave 3 bits.
This new type holds both the literal value and the size, and so can ensure sizes match on initializers.

For example, this used to be legal

bits<1> x = 0b00;

but now it must be written as

bits<2> x = 0b00;

llvm-svn: 215084
2014-08-07 05:47:00 +00:00
Pete Cooper 99ad2a3b67 TableGen: Change { } to only accept bits<n> entries when n == 1.
Prior to this change, it was legal to do something like

  bits<2> opc = { 0, 1 };
  bits<2> opc2 = { 1, 0 };
  bits<2> a = { opc, opc2 };

This involved silently dropping bits from opc and opc2 which is very hard to debug.

Now the above test would be an error.  Having tested with an assert, none of LLVM/clang was relying on this behaviour.

Thanks to Adam Nemet for the above test.

llvm-svn: 215083
2014-08-07 05:46:57 +00:00
Pete Cooper c18261d467 Fix a whole bunch of binary literals which were the wrong size. All were being silently zero extended to the correct width.
The commit after this changes { } and 0bxx literals to be of type bits<n> and not int.  This means we need to write exactly the right number of bits, and not rely on the values being silently zero extended for us.

llvm-svn: 215082
2014-08-07 05:46:54 +00:00
Chandler Carruth ae42d7d020 Add an option to the shuffle fuzzer that lets you fuzz exclusively
within a single bit-width of vectors. This is particularly useful for
when you know you have bugs in a certain area and want to find simpler
test cases than those produced by an open-ended fuzzing that ends up
legalizing the vector in addition to shuffling it.

llvm-svn: 215056
2014-08-07 04:49:54 +00:00
Bill Wendling fcb526020a Use the minor number for the revision numbers.
llvm-svn: 215055
2014-08-07 04:21:45 +00:00
Chandler Carruth 7f3facb7e6 Add a vector shuffle fuzzer.
This is a python script which for a given seed generates a random
sequence of random shuffles of a random vector width. It embeds this
into a function and emits a main function which calls the test routine
and checks that the results (where defined) match the obvious results.

I'll be using this to drive out miscompiles from the new vector shuffle
logic now that it is clean of any crashes I can find with llvm-stress.

Note, my python skills are very poor. Sorry if this is terrible code,
and feel free to tell me how I should write this or just patch it as
necessary.

The tests generated try to be very portable and use boring C routines.
It technically will mis-declare the C routines and pass 32-bit integers
to parametrs that expect 64-bit integers. If someone wants to fix this
and has less terrible ideas of how to do it, I'm all ears. Fortunately,
this "just works" for x86. =]

llvm-svn: 215054
2014-08-07 04:13:51 +00:00
Justin Bogner 989cbc9058 DebugInfo: Make a test more portable
mach-o doesn't like sections without segments, and elf is perfectly
happy with commas in section names, so use a Darwin-like section name.

Suggestion by Eric Christopher.

llvm-svn: 215052
2014-08-07 03:47:28 +00:00
Saleem Abdulrasool 64a8cc7d0d MC: split Win64EHUnwindEmitter into a shared streamer
This changes Win64EHEmitter into a utility WinEH UnwindEmitter that can be
shared across multiple architectures and a target specific bit which is
overridden (Win64::UnwindEmitter).  This enables sharing the section selection
code across X86 and the intended use in ARM for emitting unwind information for
Windows on ARM.

llvm-svn: 215050
2014-08-07 02:59:41 +00:00
Quentin Colombet 0233d49574 [X86][SchedModel] Fixed missing/wrong scheduling model found by code inspection.
Source: Agner Fog's Instruction tables.

Related to <rdar://problem/15607571>

llvm-svn: 215045
2014-08-07 00:20:44 +00:00
Kevin Enderby c959562092 Add the -mcpu= option to llvm-objdump for use with the disassemblers.
Also make the disassembler created with the Mach-O parser (the -m option)
pick up the Target specific attributes specified with -mattr option.

llvm-svn: 215032
2014-08-06 23:24:41 +00:00
Reid Kleckner ce63b791fe MC X86: Accept ".att_syntax prefix" and diagnose noprefix
Fixes PR18916.  I don't think we need to implement support for either
hybrid syntax.  Nobody should write Intel assembly with '%' prefixes on
their registers or AT&T assembly without them.

llvm-svn: 215031
2014-08-06 23:21:13 +00:00
David Blaikie ff3dd1701c Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.""
This reverts commit r214761.

Revert while Reid investigates & provides a reproduction for an
assertion failure for this on Windows.

llvm-svn: 214999
2014-08-06 22:30:12 +00:00
Sanjay Patel b63e43c931 fix typo
llvm-svn: 214995
2014-08-06 21:08:38 +00:00
Yaron Keren f394f83f82 getNewMemBuffer memsets the buffer to zeros,
the caller don't have to initialize it.

llvm-svn: 214994
2014-08-06 20:59:09 +00:00
Sanjay Patel cd47959eb6 Fix a test that has no checks.
X86 doesn't have fneg, so check for xor.

Differential Revision: http://reviews.llvm.org/D4812

llvm-svn: 214992
2014-08-06 20:45:30 +00:00
Matt Arsenault a6dc6c281c R600: Cleanup fadd and fsub tests
llvm-svn: 214991
2014-08-06 20:27:55 +00:00
Rui Ueyama c487f7728e Revert "r214897 - Remove dead zero store to calloc initialized memory"
It broke msan.

llvm-svn: 214989
2014-08-06 19:30:38 +00:00
Eric Christopher b5217507c7 Remove the target machine from CCState. Previously it was only used
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.

llvm-svn: 214988
2014-08-06 18:45:26 +00:00
Adrian Prantl 364d13170a Improve performance of calculateDbgValueHistory.
In r210492 the logic of calculateDbgValueHistory was changed to end
register variable live ranges at the end of MBB conditionally on
the fact that the register was or not clobbered by the function body.

This requires an initial scan of all the operands of the function
to collect all clobbered registers. In a second pass over all
instructions, we compare this set with the set of clobbered
registers for the current MachineInstruction. This modification
incurred a compilation time regression on some benchmarks: the
debug info emission phase takes ~10% more time.

While a small performance hit is unavoidable due to the initial
scan requirement, we can improve the situation by avoiding to
create too many temporary sets and just use lambdas to work directly
on the result of the initial scan.

Fixes <rdar://problem/17884104>

Patch by Frederic Riss!

llvm-svn: 214987
2014-08-06 18:41:24 +00:00
Adrian Prantl e2d637597c Cleanup collectChangingRegs
The handling of the epilogue is best expressed as an early exit and
there is no reason to look for register defs in DbgValue MIs.

Patch by Frederic Riss!

llvm-svn: 214986
2014-08-06 18:41:19 +00:00
David Blaikie 6e9477af8c DebugInfo: Fix ranges+gmlt test case to actually exercise the gmlt situation.
Originally this test case tested the specified behavior (that -gmlt
would not produce DW_AT_ranges and that when no CU DW_AT_ranges were
produced, no debug_ranges section (not even an empty list) would be
produced) but then the ranges emission code was improved not to create
ranges of a single element (instead favoring high_pc/low_pc) and so this
test case no longer exercised the -gmlt portion of the behavior.

This caused me some confusion when reading the comments and trying to
update this test case for future changes to -gmlt. I've made this test
resilient to those changes (by using the {{DW_TAG|NULL}} pattern to
block the end of the attribute search at the end of the CU's attribute
list without mandating that it must (or must not) be followed by another
tag (the future changes to -gmlt should produce no subprograms in this
CU))

Fix the test case to have two functions in distinct sections to force
the use of DW_AT_ranges.

llvm-svn: 214985
2014-08-06 18:24:19 +00:00
Reid Kleckner 2daa731bab Add a triple to this test to get the right IR mangling
llvm-svn: 214982
2014-08-06 18:09:15 +00:00
Reid Kleckner 61bac93faa Don't count inreg params when mangling fastcall functions
This is consistent with MSVC.

llvm-svn: 214981
2014-08-06 18:09:04 +00:00
Reid Kleckner e41d957028 Round up the size of byval arguments to MinAlign
Otherwise we can end up with an argument frame size that is not a
multiple of stack slot size, which is very awkward.

This fixes PR20547, which was a bug in x86_64 Sys V vararg handling.
However, it's much easier to test this with x86 callee-cleanup
functions, which previously ended in "retl $6" instead of "retl $8".

This does affect behavior of all backends, but it presumably fixes the
same bug in all of them.

llvm-svn: 214980
2014-08-06 17:57:23 +00:00
Duncan P. N. Exon Smith 04642a4972 UseListOrder: Use std::vector
I initially used a `SmallVector<>` for `UseListOrder::Shuffle`, which
was a silly choice.  When I realized my error I quickly rolled a custom
data structure.

This commit simplifies it to a `std::vector<>`.  Now that I've had a
chance to measure performance, this data structure isn't part of a
bottleneck, so the additional complexity is unnecessary.

This is part of PR5680.

llvm-svn: 214979
2014-08-06 17:36:08 +00:00
Chad Rosier b481bdfec4 [AArch64] Add a few isTarget* API to AArch64 Subtarget.
llvm-svn: 214977
2014-08-06 16:56:58 +00:00
Chad Rosier 5b78a642e4 Add test case omitted in r214974.
llvm-svn: 214975
2014-08-06 16:06:41 +00:00
Chad Rosier afe7c93c7f [AArch64] Fix OS ABI flag for aarch64-linux-gnu target.
For triple aarch64-linux-gnu we were incorrectly setting IRIX.
For triple aarch64 we are correctly setting SYSV.

Patch by Ana Pazos <apazos@codeaurora.org>.

llvm-svn: 214974
2014-08-06 16:05:02 +00:00
Sanjay Patel d26358e12d use register iterators that include self to reduce code duplication in CriticalAntiDepBreaker
This patch addresses 2 FIXME comments that I added to CriticalAntiDepBreaker while fixing PR20020.

Initialize an MCSubRegIterator and an MCRegAliasIterator to include the self reg.

Assuming that works as advertised, there should be functional difference with this patch, just less code.

Also, remove the associated asserts - we're setting those values just before, so the asserts don't do anything meaningful.

Differential Revision: http://reviews.llvm.org/D4566

llvm-svn: 214973
2014-08-06 15:58:15 +00:00
Robert Khasanov 3c30c4bdec [AVX512] Added load/store instructions to Register2Memory opcode tables.
Added lowering tests for load/store.

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 214972
2014-08-06 15:40:34 +00:00
James Molloy 99917946da [AArch64] Add a testcase for r214957.
llvm-svn: 214965
2014-08-06 13:31:32 +00:00
James Molloy 568da0990e Add a new option -run-slp-after-loop-vectorization.
This swaps the order of the loop vectorizer and the SLP/BB vectorizers. It is disabled by default so we can do performance testing - ideally we want to change to having the loop vectorizer running first, and the SLP vectorizer using its leftovers instead of the other way around.

llvm-svn: 214963
2014-08-06 12:56:19 +00:00
Tim Northover 2a417b96d4 ARM: do not generate BLX instructions on Cortex-M CPUs.
Particularly on MachO, we were generating "blx _dest" instructions on M-class
CPUs, which don't actually exist. They happen to get fixed up by the linker
into valid "bl _dest" instructions (which is why such a massive issue has
remained largely undetected), but we shouldn't rely on that.

llvm-svn: 214959
2014-08-06 11:13:14 +00:00
Tim Northover d4d294dd51 ARM-MachO: materialize callee address correctly on v4t.
llvm-svn: 214958
2014-08-06 11:13:06 +00:00
James Molloy f089ab70f4 [AArch64] Conditional selects are expensive on out-of-order cores.
Specifically Cortex-A57. This probably applies to Cyclone too but I haven't enabled it for that as I can't test it.

This gives ~4% improvement on SPEC 174.vpr, and ~1% in 471.omnetpp.

llvm-svn: 214957
2014-08-06 10:42:18 +00:00
Chandler Carruth c3927cd8c9 [x86] Fix two independent miscompiles in the process of getting the same
test case to actually generate correct code.

The primary miscompile fixed here is that we weren't correctly handling
in-place elements in one half of a single-input v8i16 shuffle when
moving a dword of elements from that half to the other half. Some times,
we would clobber the in-place elements in forming the dword to move
across halves.

The fix to this involves forcibly marking the in-place inputs even when
there is no need to gather them into a dword, and to much more carefully
re-arrange the elements when grouping them into a dword to move across
halves. With these two changes we would generate correct shuffles for
the test case, but found another miscompile. There are also some random
perturbations of the generated shuffle pattern in SSE2. It looks like
a wash; more instructions in some cases fewer in others.

The second miscompile would corrupt the results into nonsense. This is
a buggy pattern in one of the added DAG combines. Mapping elements
through a PSHUFD when pairing redundant half-shuffles is *much* harder
than this code makes it out to be -- it requires reasoning about *all*
of where the input is used in the PSHUFD, not just one part of where it
is used. Plus, we can't combine a half shuffle *into* a PSHUFD but the
code didn't guard against it. I think this was just a bad idea and I've
just removed that aspect of the combine. No tests regress as
a consequence so seems OK.

llvm-svn: 214954
2014-08-06 10:16:36 +00:00
Chandler Carruth 8f23ba26d2 [x86] Switch to a formulation of a for loop that is much more obviously
not corrupting the mask by mutating it more times than intended. No
functionality changed (the results were non-overlapping so the old
version "worked" but was non-obvious).

llvm-svn: 214953
2014-08-06 10:16:33 +00:00
Adam Nemet 5ec912881f [X86] Fixes commit r214890 to match the posted patch
This was another fallout from my local rebase where something went wrong :(

llvm-svn: 214951
2014-08-06 07:13:12 +00:00
Matt Arsenault 515c24b7e0 Correct comment
llvm-svn: 214945
2014-08-06 00:44:25 +00:00
Peter Collingbourne df240b252a [dfsan] Try not to create too many additional basic blocks in functions which
already have a large number of blocks. Works around a performance issue with
the greedy register allocator.

llvm-svn: 214944
2014-08-06 00:33:40 +00:00
Matt Arsenault d5f4de27b6 R600: Increase nearby load scheduling threshold.
This partially fixes weird looking load scheduling
in memcpy test. The load clustering doesn't seem
particularly smart, but this method seems to be partially
deprecated so it might not be worth trying to fix.

llvm-svn: 214943
2014-08-06 00:29:49 +00:00
Matt Arsenault c10853f29f R600/SI: Implement areLoadsFromSameBasePtr
This currently has a noticable effect on the kernel argument loads.
LDS and global loads are more problematic, I think because of how copies
are currently inserted to ensure that the address is a VGPR.

llvm-svn: 214942
2014-08-06 00:29:43 +00:00
Quentin Colombet 33ea1681ce [X86][SchedModel] Fixed some wrong scheduling model found by code inspection.
Source: Agner Fog's Instruction tables.

Related to <rdar://problem/15607571>

llvm-svn: 214940
2014-08-06 00:22:39 +00:00
David Blaikie fb0412f039 DebugInfo: Assert that any CU for which debug_loc lists are emitted, has at least one range.
This was coming in weird debug info that had variables (and hence
debug_locs) but was in GMLT mode (because it was missing the 13th field
of the compile_unit metadata) so no ranges were constructed. We should
always have at least one range for any CU with a debug_loc in it -
because the range should cover the debug_loc.

The assertion just ensures that the "!= 1" range case inside the
subsequent loop doesn't get entered for the case where there are no
ranges at all, which should never reach here in the first place.

llvm-svn: 214939
2014-08-06 00:21:25 +00:00
David Blaikie cabf54a313 DebugInfo: Fix a bunch of tests that, owing to their compile_unit metadata not including a 13th field, had some subtle behavior.
Without the 13th field, the "emission kind" field defaults to 0 (which
is not equal to either of the values of the emission kind enum (1 ==
full debug info, 2 == line tables only)).

In this particular instance, the comparison with "FullDebugInfo" was
done when adding elements to the ranges list - so for these test cases
no values were added to the ranges list.

This got weirder when emitting debug_loc entries as the addresses should
be relative to the range of the CU if the CU has only one range (the
reasonable assumption is that if we're emitting debug_loc lists for a CU
that CU has at least one range - but due to the above situation, it has
zero) so the ranges were emitted relative to the start of the section
rather than relative to the start of the CU's singular range.

Fix these tests by accounting for the difference in the description of
debug_loc entries (in some cases making the test ignorant to these
differences, in others adding the extra label difference expression,
etc) or the presence/absence of high/low_pc on the CU, and add the 13th
field to their CUs to enable proper "full debug info" emission here.

In a future commit I'll fix up a bunch of other test cases that are not
so rigorously depending on this behavior, but still doing similarly
weird things due to the missing 13th field.

llvm-svn: 214937
2014-08-05 23:57:31 +00:00
Matt Arsenault 1070511847 R600/SI: Add definitions for ds_read2st64_ / ds_write2st64_
llvm-svn: 214936
2014-08-05 23:53:20 +00:00
JF Bastien ac8b66b32c Fix typos in comments and doc
Committing http://reviews.llvm.org/D4798 for Robin Morisset (morisset@google.com)

llvm-svn: 214934
2014-08-05 23:27:34 +00:00
David Blaikie e1a26a624d DebugInfo: Move the reference to the CU from the location list entry to the list itself, since it is constant across an entire list.
This simplifies construction and usage while making the data structure
smaller. It was a holdover from the days when we didn't have a separate
DebugLocList and all we had was a flat list of DebugLocEntries.

llvm-svn: 214933
2014-08-05 23:14:16 +00:00
Rafael Espindola b8141d55b9 Remove a virtual function from TargetMachine. NFC.
llvm-svn: 214929
2014-08-05 22:10:21 +00:00
Jonathan Roelofs ef84bda531 Re-apply r214881: Fix return sequence on armv4 thumb
This reverts r214893, re-applying r214881 with the test case relaxed a bit to
satiate the build bots.

POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748

llvm-svn: 214928
2014-08-05 21:32:21 +00:00
Lang Hames ae17268a7e [MCJIT] Make llvm-rtdyld check RuntimeDyld's error state when running in -verify
mode.

This will cause -verify mode to report failure when RuntimeDyld encounters an
internal error (e.g. overflows in relocation computations). Previously we had
let these errors slip past unreported.

llvm-svn: 214925
2014-08-05 20:51:46 +00:00
Bill Schmidt 42a6936c78 [PowerPC] Swap arguments and adjust shift count for vsldoi on little endian
Commits r213915 and r214718 fix recognition of shuffle masks for vmrg*
and vpku*um instructions for a little-endian target, by swapping the
input arguments.  The vsldoi instruction requires similar treatment,
and also needs its shift count adjusted for little endian.

Reviewed by Ulrich Weigand.

This is a bug fix candidate for release 3.5 (and hopefully the last of
those for PowerPC).

llvm-svn: 214923
2014-08-05 20:47:25 +00:00
Sanjay Patel 1954f2e924 Improved test cases that were added with r214892.
1. Added ':' to CHECK-LABELs
2. Added more CHECKs
3. Added CHECK-NEXTs
4. Added verbose hex immediate comments to CHECKs

llvm-svn: 214921
2014-08-05 20:16:35 +00:00
Rafael Espindola f9e52cf015 Don't internalize all but main by default.
This is mostly a cleanup, but it changes a fairly old behavior.

Every "real" LTO user was already disabling the silly internalize pass
and creating the internalize pass itself. The difference with this
patch is for "opt -std-link-opts" and the C api.

Now to get a usable behavior out of opt one doesn't need the funny
looking command line:

opt -internalize -disable-internalize -internalize-public-api-list=foo,bar -std-link-opts

llvm-svn: 214919
2014-08-05 20:10:38 +00:00
Rafael Espindola c03b6e7880 Add a test showing the interaction of linker scripts and plugin.
In particular, the linker script is processed early enough for function g
to be internalized.

llvm-svn: 214916
2014-08-05 19:56:53 +00:00
Chandler Carruth a746239be3 [x86] Fix a crasher due to shuffles which cancel each other out and add
a test case.

We also miscompile this test case which is showing a serious flaw in the
single-input v8i16 shuffle code. I've left the specific instruction
checks FIXME-ed out until I can address the bug in the single-input
code, but I wanted to separate out a significant functionality change to
produce correct code from a very simple and targeted crasher fix.

The miscompile problem stems from keeping track of inputs by value
rather than by index. As a consequence of doing this, we can't reliably
update those inputs because they might swap and we can't detect this
without copying the mask.

The blend code now uses indices for the input lists and this seems
strictly better. It also should make it easier to sort things and do
other cleanups. I think the time has come to simplify The Great Lambda
here.

llvm-svn: 214914
2014-08-05 18:45:49 +00:00
Duncan P. N. Exon Smith 6a6e9cb50c Remove dead code in condition
Whether or not it's appropriate, labels have been first-class types
since r51511.

llvm-svn: 214908
2014-08-05 18:22:58 +00:00
NAKAMURA Takumi ca562297d9 X86CodeEmitter.cpp: Add SEH_Epilogue to ignored list for legacy JIT, corresponding to r214775.
llvm-svn: 214905
2014-08-05 18:04:15 +00:00
Adam Nemet c04f3f9f73 [X86] Improve comments for r214888
A rebase somehow ate my comments. This restores them.

llvm-svn: 214903
2014-08-05 17:58:49 +00:00
Matt Arsenault 6532520fbf R600/SI: Use register class instead of list of registers
I'm not sure if this has any consequence or not.

llvm-svn: 214902
2014-08-05 17:52:40 +00:00
Matt Arsenault 2549bb4b83 R600/SI: Add exec_lo and exec_hi subregisters.
This allows accessing an SReg subregister with a normal subregister
index, instead of getting a machine verifier error.

Also be sure to include all of these subregisters in SReg_32.
This fixes inferring SGPR instead of SReg when finding a
super register class.

llvm-svn: 214901
2014-08-05 17:52:37 +00:00
Duncan P. N. Exon Smith 5a511b59c5 BitcodeReader: Fix non-determinism in use-list order
`BasicBlockFwdRefs` (and `BlockAddrFwdRefs` before it) was being emptied
in a non-deterministic order.  When predicting use-list order I've
worked around this another way, but even when parsing lazily (and we
can't recreate use-list order) use-lists should be deterministic.

Make them so by using a side-queue of functions with forward-referenced
blocks that gets visited in order.

llvm-svn: 214899
2014-08-05 17:49:48 +00:00
Philip Reames 00c9b6461f Remove dead zero store to calloc initialized memory
Optimize the following IR:

%1 = tail call noalias i8* @calloc(i64 1, i64 4)
%2 = bitcast i8* %1 to i32*
; This store is dead and should be removed
store i32 0, i32* %2, align 4

Memory returned by calloc is guaranteed to be zero initialized. If the value being stored is the constant zero (and the store is not otherwise observable across threads), we can delete the store.  If the store is to an out of bounds address, it is undefined and thus also removable.

Reviewed By: nicholas

Differential Revision: http://reviews.llvm.org/D3942

llvm-svn: 214897
2014-08-05 17:48:20 +00:00
Jonathan Roelofs 064eb5a177 Revert r214881 because it broke lots of build-bots
llvm-svn: 214893
2014-08-05 17:36:05 +00:00
Sanjay Patel 8e5beb6edb Optimize vector fabs of bitcasted constant integer values.
Allow vector fabs operations on bitcasted constant integer values to be optimized
in the same way that we already optimize scalar fabs.

So for code like this:
%bitcast = bitcast i64 18446744069414584320 to <2 x float> ; 0xFFFF_FFFF_0000_0000
%fabs = call <2 x float> @llvm.fabs.v2f32(<2 x float> %bitcast)
%ret = bitcast <2 x float> %fabs to i64

Instead of generating something like this:

movabsq (constant pool loadi of mask for sign bits)
vmovq   (move from integer register to vector/fp register)
vandps  (mask off sign bits)
vmovq   (move vector/fp register back to integer return register)

We should generate:

mov     (put constant value in return register)

I have also removed a redundant clause in the first 'if' statement:
N0.getOperand(0).getValueType().isInteger()

is the same thing as:
IntVT.isInteger()

Testcases for x86 and ARM added to existing files that deal with vector fabs.
One existing testcase for x86 removed because it is no longer ideal.

For more background, please see:
http://reviews.llvm.org/D4770

And:
http://llvm.org/bugs/show_bug.cgi?id=20354

Differential Revision: http://reviews.llvm.org/D4785

llvm-svn: 214892
2014-08-05 17:35:22 +00:00
Adam Nemet fd2161b710 [AVX512] Add masking variant and intrinsics for valignd/q
This is similar to what I did with the two-source permutation recently.  (It's
almost too similar so that we should consider generating the masking variants
with some tablegen help.)

Both encoding and intrinsic tests are added as well.  For the latter, this is
what the IR that the intrinsic test on the clang side generates.

Part of <rdar://problem/17688758>

llvm-svn: 214890
2014-08-05 17:23:04 +00:00
Adam Nemet 4688a2e5cb [X86] Increase X86_MAX_OPERANDS from 5 to 6
This controls the number of operands in the disassembler's x86OperandSets
table.  The entries describe how the operand is encoded and its type.

Not to surprisingly 5 operands is insufficient for AVX512.  Consider
VALIGNDrrik in the next patch.  These are its operand specifiers:

  { /* 328 */
    { ENCODING_DUP, TYPE_DUP1 },
    { ENCODING_REG, TYPE_XMM512 },
    { ENCODING_WRITEMASK, TYPE_VK8 },
    { ENCODING_VVVV, TYPE_XMM512 },
    { ENCODING_RM_CD64, TYPE_XMM512 },
    { ENCODING_IB, TYPE_IMM8 },
  },

llvm-svn: 214889
2014-08-05 17:23:01 +00:00
Adam Nemet 164b07fbfe [X86] Add lowering to VALIGN
This was currently part of lowering to PALIGNR with some special-casing to
make interlane shifting work.  Since AVX512F has interlane alignr (valignd/q)
and AVX512BW has vpalignr we need to support both of these *at the same time*,
e.g. for SKX.

This patch breaks out the common code and then add support to check both of
these lowering options from LowerVECTOR_SHUFFLE.

I also added some FIXMEs where I think the AVX512BW and AVX512VL additions
should probably go.

llvm-svn: 214888
2014-08-05 17:22:59 +00:00
Adam Nemet 2f10cc699d [X86] Separate DAG node for valign and palignr
They have different semantics (valign is interlane while palingr is intralane)
and palingr is still needed even in the AVX512 context.  According to the
latest spec AVX512BW provides these.

llvm-svn: 214887
2014-08-05 17:22:55 +00:00
Adam Nemet d00a05e3e2 [AVX512] alignr: Use suffix rather than name argument to multiclass
Again no functional change.  This prepares for the suffix to be used with the
intrinsic matching.

llvm-svn: 214886
2014-08-05 17:22:52 +00:00
Adam Nemet f92139dd61 [AVX512] Pull everything alignr-related into the multiclass
The packed integer pattern becomes the DAG pattern for rri and the packed
float, another Pat<> inside the multiclass.

No functional change.

llvm-svn: 214885
2014-08-05 17:22:50 +00:00
Adam Nemet 1c752d8f5e Wrap long lines
llvm-svn: 214884
2014-08-05 17:22:47 +00:00
Jonathan Roelofs f5fad3767b Fix return sequence on armv4 thumb
POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:

  POP {r3}
  ADD sp, #offset
  BX r3

This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:

  MOV ip, r3
  POP {r3}
  ADD sp, #offset
  MOV lr, r3
  MOV r3, ip
  BX lr

http://reviews.llvm.org/D4748

llvm-svn: 214881
2014-08-05 17:13:17 +00:00
David Blaikie b706b58e78 Partially revert r214761 that asserted that all concrete debug info variables had DIEs, due to a failure on Darwin.
I'll work on a reduction and fix after this.

llvm-svn: 214880
2014-08-05 16:47:23 +00:00
David Blaikie c74ffa9cab Improve test for merged global debug info by using llvm-dwarfdump.
It's a bit of a tradeoff, since llvm-dwarfdump doesn't print the name of
the global symbol being used as an address in the addressing mode, but
this avoids the dependence on hardcoded set labels that keep changing
(5+ commits over the last few years that each update the set label as it
changes due to other, unrelated differences in output). This could've,
instead, been changed to match the set name then match the name in the
string pool but that would present other issues (needing to skip over
the sets that weren't of interest, etc) and checking that the addresses
(granted, without relocations applied - so it's not the whole story)
match in the two variable location descriptions seems sufficient and
fairly stable here.

There are a few similar other tests with similar label dependence that
I'll update soonish.

llvm-svn: 214878
2014-08-05 16:20:25 +00:00
Joerg Sonnenberger c4ce42980e Add accessors for the PPC 403 bank registers.
llvm-svn: 214875
2014-08-05 15:45:15 +00:00
Renato Golin 877b9b3513 Add tests for cp10/cp11 on ARMv5/6
Tests for ARMv7/8 are already on diagnostics.s

llvm-svn: 214872
2014-08-05 15:29:41 +00:00
Keith Walker 1045717584 Specify that the thumb setend and blx <immed> instructions are not valid on an m-class target
llvm-svn: 214871
2014-08-05 15:11:59 +00:00
Keith Walker 292aa3d5f7 Define stc2/stc2l/ldc2/ldc2l as thumb2 instructions
llvm-svn: 214868
2014-08-05 14:58:05 +00:00
Joerg Sonnenberger 936a4c8ceb Accessors for SSR2 and SSR3 on PPC 403.
llvm-svn: 214867
2014-08-05 14:53:05 +00:00
Tom Stellard 229d5e669b R600/SI: Update MUBUF assembly string to match AMD proprietary compiler
llvm-svn: 214866
2014-08-05 14:48:12 +00:00
Tom Stellard b37f797678 R600/SI: Avoid generating REGISTER_LOAD instructions.
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.

llvm-svn: 214865
2014-08-05 14:40:52 +00:00
Joerg Sonnenberger 412471271e Add dci/ici instructions for PPC 476 and friends.
llvm-svn: 214864
2014-08-05 14:40:32 +00:00
Joerg Sonnenberger 048284e1b6 Add mftblo and mftbhi for PPC 4xx.
llvm-svn: 214863
2014-08-05 14:18:16 +00:00
Joerg Sonnenberger 9dedceb71d Add lswi / stswi for assembler use with a warning to not add patterns
for them.

llvm-svn: 214862
2014-08-05 13:34:01 +00:00
Yi Kong e56de69500 AArch64: Add support for instruction prefetch intrinsic
Instruction prefetch is not implemented for AArch64, it is incorrectly
translated into data prefetch instruction.

Differential Revision: http://reviews.llvm.org/D4777

llvm-svn: 214860
2014-08-05 12:46:47 +00:00
James Molloy 2b8933c354 Teach the SLP Vectorizer that keeping some values live over a callsite can have a cost.
Some types, such as 128-bit vector types on AArch64, don't have any callee-saved registers. So if a value needs to stay live over a callsite, it must be spilled and refilled. This cost is now taken into account.

llvm-svn: 214859
2014-08-05 12:30:34 +00:00
Chandler Carruth 183771bd8e [x86] Reformat some code I moved around in a prior commit but left
poorly formatted. Sorry about that.

llvm-svn: 214853
2014-08-05 10:35:30 +00:00
Joerg Sonnenberger 6b41a9900a Allow binary and for tblgen math.
llvm-svn: 214851
2014-08-05 09:43:25 +00:00
Chandler Carruth 947cef191d [x86] Fix a crash and wrong-code bug in the new vector lowering all
found by a single test reduced out of a failure on llvm-stress.

The start of the problem (and the crash) came when we tried to use
a find of a non-used slot in the move-to half of the move-mask as the
target for two bad-half inputs. While if lucky this will be the first of
a pair of slots which we can place the bad-half inputs into, it isn't
actually guaranteed. This really isn't surprising, not sure what I was
thinking. The correct way to find the two unused slots is to look for
one of the *used* slots. We know it isn't that pair, and we can use some
modular arithmetic to find the other pair by masking off the odd bit and
adding 2 modulo 4. With this, we reliably found a viable pair of slots
for the bad-half inputs.

Sadly, that wasn't enough. We also had a wrong code bug that surfaced
when I reduced the test case for this where we would use the same slot
twice for the two bad inputs. This is because both of the bad inputs
could be in odd slots originally and thus the mod-2 mapping would
actually be the same. The whole point of the weird indexing into the
pair of empty slots was to try to leverage when the end result needed
the two bad-half inputs to be paired in a dword and pre-pair them in the
correct orrientation. This is less important with the powerful combining
we're now doing, and also easier and more reliable to achieve be noting
that we add the bad-half inputs in order. Thus, if they are in a dword
pair, the low part of that will be the first input in the sequence.
Always putting that in the low element will just do the right thing in
addition to computing the correct result.

Test case added. =]

llvm-svn: 214849
2014-08-05 08:19:21 +00:00
Juergen Ributzka 9503327756 [FastIsel][AArch64] Fix previous commit r214844 (Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.)
The original code would fail for unsupported value types like i1, i8, and i16.
This fix changes the code to only create a sub-register copy for i64 value types
and all other types (i1/i8/i16/i32) just use the source register without any
modifications.

getRegClassFor() is now guarded by the i64 value type check, that guarantees
that we always request a register for a valid value type.

llvm-svn: 214848
2014-08-05 07:31:30 +00:00
Juergen Ributzka a126d1ef3c [FastISel][AArch64] Implement the FastLowerArguments hook.
This implements basic argument lowering for AArch64 in FastISel. It only
handles a small subset of the C calling convention. It supports simple
arguments that can be passed in GPR and FPR registers.

This should cover most of the trivial cases without falling back to
SelectionDAG.

This fixes <rdar://problem/17890986>.

llvm-svn: 214846
2014-08-05 05:43:48 +00:00
Kevin Qin ec100526e3 Revert "r214832 - MachineCombiner Pass for selecting faster instruction"
It broke compiling of most Benchmark and internal test, as clang got
clashed by segmentation fault or assertion.

llvm-svn: 214845
2014-08-05 05:43:47 +00:00
Juergen Ributzka 51f5326e25 [FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended.
llvm-svn: 214844
2014-08-05 05:43:44 +00:00
Juergen Ributzka 384c3b5c03 Provide convenient access to the zext/sext attributes of function arguments. NFC.
llvm-svn: 214843
2014-08-05 05:43:41 +00:00
Eric Christopher fc6de428c8 Have MachineFunction cache a pointer to the subtarget to make lookups
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.

Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.

llvm-svn: 214838
2014-08-05 02:39:49 +00:00
Gerolf Hoflehner 4dbf44b9d8 MachineCombiner Pass for selecting faster instruction
sequence on AArch64

Re-commit of r214669 without changes to test cases
LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and
LLVM:: CodeGen/AArch64/dp-3source.ll
This resolves the reported compfails of the original commit.

llvm-svn: 214832
2014-08-05 01:16:13 +00:00
Joerg Sonnenberger 755ffa9b54 Add TCR register access
llvm-svn: 214826
2014-08-04 23:53:42 +00:00
Joerg Sonnenberger 5995e0021d Add PPC 603's tlbld and tlbli instructions.
llvm-svn: 214825
2014-08-04 23:49:45 +00:00
Renato Golin bc0b0378c5 Allow CP10/CP11 operations on ARMv5/v6
Those registers are VFP/NEON and vector instructions should be used instead,
but old cores rely on those co-processors to enable VFP unwinding. This change
was prompted by the libc++abi's unwinding routine and is also present in many
legacy low-level bare-metal code that we ought to compile/assemble.

Fixing bug PR20025 and allowing PR20529 to proceed with a fix in libc++abi.

llvm-svn: 214802
2014-08-04 23:21:56 +00:00
Bill Schmidt f04e998e00 [PPC64LE] Fix wrong IR for vec_sld and vec_vsldoi
My original LE implementation of the vsldoi instruction, with its
altivec.h interfaces vec_sld and vec_vsldoi, produces incorrect
shufflevector operations in the LLVM IR.  Correct code is generated
because the back end handles the incorrect shufflevector in a
consistent manner.

This patch and a companion patch for Clang correct this problem by
removing the fixup from altivec.h and the corresponding fixup from the
PowerPC back end.  Several test cases are also modified to reflect the
now-correct LLVM IR.

llvm-svn: 214800
2014-08-04 23:21:01 +00:00
Kevin Enderby e3c13468bf Enable Darwin vararg parameters support in assembler macros.
Duplicate the vararg tests for linux and add a tests which mixed
vararg arguments with darwin positional parameters.

Patch by: Janne Grunau <j@jannau.net>

llvm-svn: 214799
2014-08-04 23:14:37 +00:00
Pedro Artigas ec7cbd7d14 Changed the liveness tracking in the RegisterScavenger
to use register units instead of registers.

reviewed by Jakob Stoklund Olesen.

llvm-svn: 214798
2014-08-04 23:07:49 +00:00
Joerg Sonnenberger 51cf733427 Add simplified aliases for access to DCCR, ICCR, DEAR and ESR
llvm-svn: 214797
2014-08-04 22:56:42 +00:00
Andrew Trick fd4f32be90 Fix SmallDenseMap assignment operator.
Self assignment would lead to buckets of garbage, causing quadratic probing to hang.

llvm-svn: 214790
2014-08-04 22:18:25 +00:00
Juergen Ributzka 53533e885a [FastISel][AArch64] Fix shift lowering for i8 and i16 value types.
This fix changes the parameters #r and #s that are passed to the UBFM/SBFM
instruction to get the zero/sign-extension for free.

The original problem was that the shift left would use the 32-bit shift even for
i8/i16 value types, which could leave the upper bits set with "garbage" values.

The arithmetic shift right on the other side would use the wrong MSB as sign-bit
to determine what bits to shift into the value.

This fixes <rdar://problem/17907720>.

llvm-svn: 214788
2014-08-04 21:49:51 +00:00
Justin Bogner ac021bac4e IR: Fix up doxygen comment for LLVMContext::diagnose
This comment was referring to the DiagnosticSeverity with RS_
prefixes, but they're actually DS_. I've also modernized the comment
style since I was changing it anyway.

llvm-svn: 214787
2014-08-04 21:49:15 +00:00
Chandler Carruth 40dbd382ad [SDAG] Fix a really, really terrible bug in the DAG combiner.
This code is completely wrong. It is also dead, as if it were to *ever*
run, it would crash. Fortunately, after my work to the combiner, it is
at least *possible* to reach the code, and llvm-stress has found a test
case. Thanks to Patrick for reporting.

It would be really good if anyone who remembers how this code works and
what it was intended to do could add some more obvious test coverage
instead of my completely contrived and reduced test case. My test case
was so brittle I left a bread crumb comment in it to help the next
person to stumble on it and not know what it was actually testing for.

llvm-svn: 214785
2014-08-04 21:29:59 +00:00
Joerg Sonnenberger 6c3e38522a tlbre / tlbwe / tlbsx / tlbsx. variants for the PPC 4xx CPUs.
llvm-svn: 214784
2014-08-04 21:28:22 +00:00
Eric Christopher 5f11f5f979 Reorder to keep data and routines separate and to keep a couple of
similar routines close to each other.

llvm-svn: 214782
2014-08-04 21:25:44 +00:00
Eric Christopher d913448b38 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Eric Christopher acc8ef273b Reimplement the temporary non-const getSubtargetImpl routine so
that we can avoid implementing it on every target. Thanks to Richard
Smith for the suggestions!

llvm-svn: 214780
2014-08-04 21:24:07 +00:00
Chad Rosier 5908ab4dd6 [AArch64] Extend the number of scalar instructions supported in the AdvSIMD
scalar integer instruction pass.

This is a patch I had lying around from a few months ago.  The pass is
currently disabled by default, so nothing to interesting.

llvm-svn: 214779
2014-08-04 21:20:25 +00:00
Joerg Sonnenberger 6d05a2b461 MC uses .lcomm now, so adjust.
llvm-svn: 214776
2014-08-04 21:06:00 +00:00
Reid Kleckner e704010450 Fix failure to invoke exception handler on Win64
When the last instruction prior to a function epilogue is a call, we
need to emit a nop so that the return address is not in the epilogue IP
range.  This is consistent with MSVC's behavior, and may be a workaround
for a bug in the Win64 unwinder.

Differential Revision: http://reviews.llvm.org/D4751

Patch by Vadim Chugunov!

llvm-svn: 214775
2014-08-04 21:05:27 +00:00
David Blaikie 16409f23f8 Correct the emission kind constants committed in r214771
llvm-svn: 214772
2014-08-04 20:36:00 +00:00
David Blaikie f851712509 Document the "emission kind" field of the DICompileUnit in LLVM's Source Level Debugging metadata.
llvm-svn: 214771
2014-08-04 20:32:48 +00:00
Joerg Sonnenberger 6e842b34a0 Recognize mftbl as alias for mftb, for symmetry with mttb.
llvm-svn: 214769
2014-08-04 20:28:34 +00:00
Joerg Sonnenberger 906ae46d57 Add a sentence that all entries should include an email address.
Add one for Greg Clayton, Peter Collingbourne, Tobias Grosser and
Jakob Olesen based on recent commits.

llvm-svn: 214762
2014-08-04 19:33:25 +00:00
David Blaikie 448c066eea Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."
Originally reverted in r213432 with flakey failures on an ASan self-host
build. After reduction it seems to be the same issue fixed in r213805
(ArgPromo + DebugInfo: Handle updating debug info over multiple
applications of argument promotion) and r213952 (by having
LiveDebugVariables strip dbg_value intrinsics in functions that are not
described by debug info). Though I cannot explain why this failure was
flakey...

llvm-svn: 214761
2014-08-04 19:30:08 +00:00
Matt Arsenault fa097f8f3d R600/SI: Fix definitions for ds_read2 / ds_write2 instructions.
These were just wrong, using the wrong register classes
and store2 was missing an operand.

llvm-svn: 214756
2014-08-04 18:49:22 +00:00
Joerg Sonnenberger 5f233fc723 Rename PPCLinuxMCAsmInfo to PPCELFMCAsmInfo to better reflect the
systems it represents.

llvm-svn: 214755
2014-08-04 18:46:13 +00:00
Joerg Sonnenberger b604f82cb8 Allow .lcomm with alignment on ELF targets.
llvm-svn: 214754
2014-08-04 18:45:10 +00:00
Alex Lorenz 1193b5e272 Coverage: add HasCodeBefore flag to a mapping region.
This flag will be used by the coverage tool to help 
compute the execution counts for each line in a source file.

Differential Revision: http://reviews.llvm.org/D4746

llvm-svn: 214740
2014-08-04 18:00:51 +00:00
Eric Christopher 34aaf970e2 Move the R600 intrinsic support back to the target machine - there's
nothing subtarget dependent about the intrinsic support in any
backend as far as I can tell.

llvm-svn: 214738
2014-08-04 17:37:43 +00:00
Justin Bogner 487e764b58 Path: Stop claiming path::const_iterator is bidirectional
path::const_iterator claims that it's a bidirectional iterator, but it
doesn't satisfy all of the contracts for a bidirectional iterator.
For example, n3376 24.2.5 p6 says "If a and b are both dereferenceable,
then a == b if and only if *a and *b are bound to the same object",
but this doesn't work with how we stash and recreate Components.

This means that our use of reverse_iterator on this type is invalid
and leads to many of the valgrind errors we're hitting, as explained
by Tilmann Scheller here:

    http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140728/228654.html

Instead, we admit that path::const_iterator is only an input_iterator,
and implement a second input_iterator for path::reverse_iterator (by
changing const_iterator::operator-- to reverse_iterator::operator++).
All of the uses of this just traverse once over the path in one
direction or the other anyway.

llvm-svn: 214737
2014-08-04 17:36:41 +00:00
Joerg Sonnenberger 5002fb5337 Refactor SPRG instructions.
llvm-svn: 214733
2014-08-04 17:26:15 +00:00
Akira Hatanaka e457f3e17a [X86] Place parentheses around "isMask_32(STReturns) && N <= 2".
This corrects r214672, which was committed to silence a gcc warning.

llvm-svn: 214732
2014-08-04 17:23:38 +00:00
Joerg Sonnenberger 7405210418 Add support for m[ft][di]bat[ul] instructions.
llvm-svn: 214731
2014-08-04 17:07:41 +00:00
Matt Arsenault 329eda3b82 Use the known address space constant rather than checking it
llvm-svn: 214729
2014-08-04 16:55:35 +00:00
Matt Arsenault efb3d5347d R600: Remove unused include
llvm-svn: 214728
2014-08-04 16:55:33 +00:00
Eric Christopher eba9167e7b Add a dummy subtarget to the CPP backend target machine. This will
allow us to forward all of the standard TargetMachine calls to the
subtarget and still return null as we were before.

llvm-svn: 214727
2014-08-04 16:40:55 +00:00
Joerg Sonnenberger 0b2ebcb49d Add features for PPC 4xx and e500/e500mc instructions.
Move the test cases for them into separate files.

llvm-svn: 214724
2014-08-04 15:47:38 +00:00
Ulrich Weigand 983341d3f3 [PowerPC] Add target triple to vec_urem_const.ll test case
This should hopefully fix build bots on other architectures.

llvm-svn: 214721
2014-08-04 14:55:26 +00:00
Robert Khasanov 7ca7df0bf9 [SKX] Enabling load/store instructions: encoding
Instructions: VMOVAPD, VMOVAPS, VMOVDQA8, VMOVDQA16, VMOVDQA32,VMOVDQA64, VMOVDQU8, VMOVDQU16, VMOVDQU32,VMOVDQU64, VMOVUPD, VMOVUPS,

Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>

llvm-svn: 214719
2014-08-04 14:35:15 +00:00
Ulrich Weigand cc9909b881 [PowerPC] Swap arguments to vpkuhum/vpkuwum on little-endian
In commit r213915, Bill fixed little-endian usage of vmrgh* and vmrgl*
by swapping the input arguments.  As it turns out, the exact same fix
is also required for the vpkuhum/vpkuwum patterns.

This fixes another regression in llvmpipe when vector support is
enabled.

Reviewed by Bill Schmidt.

llvm-svn: 214718
2014-08-04 13:53:40 +00:00
Aaron Ballman c36c6abc45 Improving the name of the function parameter, which happens to solve two likely-less-than-useful MSVC warnings: warning C4258: 'I' : definition from the for loop is ignored; the definition from the enclosing scope is used.
llvm-svn: 214717
2014-08-04 13:51:27 +00:00
Ulrich Weigand 51eccec5d9 [PowerPC] MULHU/MULHS are not legal for vector types
I ran into some test failures where common code changed vector division
by constant into a multiply-high operation (MULHU).  But these are not
implemented by the back-end, so we failed to recognize the insn.

Fixed by marking MULHU/MULHS as Expand for vector types.

llvm-svn: 214716
2014-08-04 13:27:12 +00:00
Daniel Sanders a5cb453cd3 Fixed accidental use of reserved identifier in r214709.
llvm-svn: 214715
2014-08-04 13:27:03 +00:00
Ulrich Weigand c4cc7febb0 [PowerPC] Fix and improve vector comparisons
This patch refactors code generation of vector comparisons.

This fixes a wrong code-gen bug for ISD::SETGE for floating-point types,
and improves generated code for vector comparisons in general.

Specifically, the patch moves all logic deciding how to implement vector
comparisons into getVCmpInst, which gets two extra boolean outputs
indicating to its caller whether its needs to swap the input operands
and/or negate the result of the comparison.  Apart from implementing
these two modifications as directed by getVCmpInst, there is no need
to ever implement vector comparisons in any other manner; in particular,
there is never a need to perform two separate comparisons (e.g. one for
equal and one for greater-than, as code used to do before this patch).

Reviewed by Bill Schmidt.

llvm-svn: 214714
2014-08-04 13:13:57 +00:00
Daniel Sanders f0df221d76 [mips] Add assembler support for '.set mipsX'.
Summary:
This patch also fixes an issue with the way the Mips assembler enables/disables architecture
features. Before this patch, the assembler never disabled feature bits. For example,
.set mips64
.set mips32r2

would result in the 'OR' of mips64 with mips32r2 feature bits which isn't right.
Unfortunately this isn't trivial to fix because there's not an easy way to clear
feature bits as the algorithm in MCSubtargetInfo (ToggleFeature) only clears the bits
that imply the feature being cleared and not the implied bits by the feature (there's a
better explanation to the code I added).

Patch by Matheus Almeida and updated by Toma Tabacu

Reviewers: vmedic, matheusalmeida, dsanders

Reviewed By: dsanders

Subscribers: tomatabacu, llvm-commits

Differential Revision: http://reviews.llvm.org/D4123

llvm-svn: 214709
2014-08-04 12:20:00 +00:00
NAKAMURA Takumi d587e20cec TargetInstrInfo::genAlternativeCodeSequence(): Fix a couple of \param(s). [-Wdocumentation]
llvm-svn: 214708
2014-08-04 10:23:22 +00:00
Chandler Carruth 0e2ddb2790 [x86] Just unilaterally prefer SSSE3-style PSHUFB lowerings over clever
use of PACKUS. It's cleaner that way.

I looked at implementing clever combine-based folding of PACKUS chains
into PSHUFB but it is quite hard and doesn't seem likely to be worth it.
The most annoying part would be detecting that the correct masking had
been done to use PACKUS-style instructions as a blend operation rather
than there being any saturating as is indicated by its name. We generate
really nice code for what few test cases I've come up with that aren't
completely contrived for this by just directly prefering PSHUFB and so
let's go with that strategy for now. =]

llvm-svn: 214707
2014-08-04 10:17:35 +00:00
Chandler Carruth 06e6f1cae2 [x86] Implement more aggressive use of PACKUS chains for lowering common
patterns of v16i8 shuffles.

This implements one of the more important FIXMEs for the SSE2 support in
the new shuffle lowering. We now generate the optimal shuffle sequence
for truncate-derived shuffles which show up essentially everywhere.

Unfortunately, this exposes a weakness in other parts of the shuffle
logic -- we can no longer form PSHUFB here. I'll add the necessary
support for that and other things in a subsequent commit.

llvm-svn: 214702
2014-08-04 09:40:02 +00:00
Benjamin Kramer 2abde4f9d7 Update links to the gcc and java documentation that 404'd.
llvm-svn: 214700
2014-08-04 09:26:40 +00:00
Kevin Qin f31ecf3fea Revert "r214669 - MachineCombiner Pass for selecting faster instruction"
This commit broke "make check" for several hours, so get it reverted.

llvm-svn: 214697
2014-08-04 05:10:33 +00:00
NAKAMURA Takumi 56bc3419a3 MemoryBuffer: Don't use mmap when FileSize is multiple of 4k on Cygwin.
On Cygwin, getpagesize() returns 64k(AllocationGranularity).

In r214580, the size of X86GenInstrInfo.inc became 1499136.

FIXME: We should reorganize again getPageSize() on Win32.
MapFile allocates address along AllocationGranularity but view is mapped by physical page.

llvm-svn: 214681
2014-08-04 01:43:37 +00:00