Commit Graph

30774 Commits

Author SHA1 Message Date
Rafael Espindola 5f7ade26d0 objdump: Don't print a (always 0) size for MachO symbols.
Only common symbol on MachO and COFF have a size.

For COFF we already had a custom format.

For MachO, there is no native objdump and we were printing it as ELF. Now
we only print the sizes for symbols that actually have them.

llvm-svn: 240422
2015-06-23 15:45:38 +00:00
Toma Tabacu d88d79c79d [mips] [IAS] Add partial support for the ULHU pseudo-instruction.
Summary:
This only adds support for ULHU of an immediate address with/without a source register.
It does not include support for ULHU of the address of a symbol.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9671

llvm-svn: 240410
2015-06-23 14:39:42 +00:00
Petar Jovanovic b7915a1f0b [mips64] Emit correct addend for some PC-relative relocations
So far, LLVM has not emitted correct addend for N64 and N32 ABI. This patch
fixes that. It also removes fixup from MCJIT for R_MIPS_PC16 relocation.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D10565

llvm-svn: 240404
2015-06-23 13:54:42 +00:00
Daniel Jasper 41de8027b1 Revert r240302 ("Bring r240130 back.").
This causes errors like:

  ld: error: blah.o: requires dynamic R_X86_64_PC32 reloc against '' which
  may overflow at runtime; recompile with -fPIC
  blah.cc:function f(): error: undefined reference to ''
  blah.o:g(): error: undefined reference to ''

I have not yet come up with an appropriate reproduction.

llvm-svn: 240394
2015-06-23 11:31:32 +00:00
Daniel Sanders 70b5908d39 [mips] llvm-readobj can parse .MIPS.abiflags. No need to check the bytes.
Summary:

Reviewers: atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10538

llvm-svn: 240392
2015-06-23 10:11:36 +00:00
Elena Demikhovsky 5e2f8c4231 AVX-512: Added all forms of VPABS instruction
Added all intrinsics, tests for encoding, tests for intrinsics.

llvm-svn: 240386
2015-06-23 08:19:46 +00:00
Justin Bogner 47ab1fa6d6 test: Move target dependent test in their own folder for c API test
Dissasembly tests depends on target. The problem is that it disable
all tests if all targets are not compiled. This moves things around in
order to get target specific code in a target specific folder.

Patch by Amaury Sechet. Thanks!

llvm-svn: 240380
2015-06-23 06:46:54 +00:00
Weiming Zhao f1abad57da Fix PR13851: Preserve metadata for the unswitched branch
This patch copies the metadata of the unswitched branch to the newly
crreated branch in loop unswitch pass.

llvm-svn: 240378
2015-06-23 05:31:09 +00:00
Rafael Espindola 14522db74f Add a test for the previous commit.
This shows how two symbols at the same address are handled.

llvm-svn: 240374
2015-06-23 03:42:44 +00:00
David Majnemer 726901b638 [InstCombine] Optimize subtract of selects into a select of a sub
This came up when examining some code generated by clang's IRGen for
certain member pointers.

llvm-svn: 240369
2015-06-23 02:49:24 +00:00
Rafael Espindola aeef0618b9 Fix tests when X86 is not enabled.
llvm-svn: 240368
2015-06-23 02:45:44 +00:00
Rafael Espindola a4a4093ed8 Compute correct symbol sizes for MachO and COFF.
Before this would dump from the symbol start to the end of the section.

llvm-svn: 240367
2015-06-23 02:20:37 +00:00
Sanjay Patel e79b43a01f [x86] generalize reassociation optimization in machine combiner to 2 instructions
Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass
to reassociate the following sequence to reduce the critical path:

A = ? op ?
B = A op X
C = B op Y
-->
A = ? op ?
B = X op Y
C = A op B

'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could
be any associative math/logic op (see TODO in code comment).

This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of
a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth
reassociating them.

This generalization has a compile-time cost because we can now match more instruction sequences
and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't
improve the critical path.

For example, in the new test case:

A = M div N
B = A add X
C = B add Y

We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path:

A = M div N
B = A add Y
C = B add X

We need the combiner to reject that pattern but select this:

A = M div N
B = X add Y
C = B add A

Differential Revision: http://reviews.llvm.org/D10460

llvm-svn: 240361
2015-06-23 00:39:40 +00:00
Evgeniy Stepanov 9e0d41ab09 Fix PR23914.
r226830 moved the declaration of Buf to a nested scope, resulting
in a dangling reference (in StringRef Name), and a use-after-free.

llvm-svn: 240357
2015-06-22 23:36:03 +00:00
Adam Nemet f530b329c7 [LoopDist] Improve variable names and comments in LoopVersioning class, NFC
As with the previous patch, the goal is to turn the class into a general
loop-versioning class.  This patch removes any references to loop
distribution.

llvm-svn: 240352
2015-06-22 22:59:40 +00:00
Pawel Bylica e6fd8c4232 Revert r240291: causes problems in self-hosted builds.
llvm-svn: 240343
2015-06-22 21:54:07 +00:00
Peter Collingbourne ea45d834e0 Linker: Do not expect comdat to exist in source module.
llvm-svn: 240341
2015-06-22 21:46:51 +00:00
Frederic Riss ebc162a766 [Object] Search for architecures by name in MachOUniversalBinary::getObjectForArch()
The reason we need to search by name rather than by Triple::ArchType
is to handle subarchitecture correclty. There is no different ArchType
for the x86_64h architecture (it identifies itself as x86_64), or for
the various ARM subarches. The only way to get to the subarch slice
in an universal binary is to search by name.

This issue led to hard to debug and transient symbolication failures
in Asan tests (it mostly works, because the files are very similar).

This also affects the Profiling infrastucture as it is the other user
of that API.

Reviewers: samsonov, bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10604

llvm-svn: 240339
2015-06-22 21:33:24 +00:00
Pawel Bylica 776b553438 Set missing x86 arch in a CodeGen regression test.
Fixes the regression test added in r240291.

llvm-svn: 240336
2015-06-22 21:18:10 +00:00
Simon Pilgrim c5f409c1ec [X86][AVX2] Added missing stack folding tests for vpshufhw/vpshuflw
llvm-svn: 240332
2015-06-22 21:10:42 +00:00
Tom Stellard f0296cee9b R600/SI: Use ELF64 format instead of ELF32
Reviewers: arsenm, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10392

llvm-svn: 240331
2015-06-22 21:03:54 +00:00
Tom Stellard 3aed34e947 R600: Use EM_AMDGPU for the ELF Machine type
Reviewers: arsenm, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10390

llvm-svn: 240330
2015-06-22 21:03:52 +00:00
Ahmed Bougacha ed3c4d1a3d [X86] Teach load folding to accept scalar _Int users of MOVSS/MOVSD.
The _Int instructions are special, in that they operate on the full
VR128 instead of FR32.  The load folding then looks at MOVSS, at the
user, and bails out when it sees a size mismatch.

What we really know is that the rm_Int instructions don't load the
higher lanes, so folding is fine.

This happens for the straightforward intrinsic code, e.g.:

    _mm_add_ss(a, _mm_load_ss(p));

Fixes PR23349.

Differential Revision: http://reviews.llvm.org/D10554

llvm-svn: 240326
2015-06-22 20:51:51 +00:00
Alex Lorenz 91370c5d62 MIR Serialization: Introduce a lexer for machine instructions.
This commit adds a function that tokenizes the string containing
the machine instruction. This commit also adds a struct called 
'MIToken' which is used to represent the lexer's tokens.

Reviewers: Sean Silva

Differential Revision: http://reviews.llvm.org/D10521

llvm-svn: 240323
2015-06-22 20:37:46 +00:00
Peter Collingbourne de26a918c1 SafeStack: Create the unsafe stack pointer on demand.
This avoids creating an unnecessary undefined reference on targets such as
NVPTX that require such references to be declared in asm output.

llvm-svn: 240321
2015-06-22 20:26:54 +00:00
Pete Cooper 80d21cb40d Change .thumb_set to have the same error checks as .set.
According to the documentation, .thumb_set is 'the equivalent of a .set directive'.

We didn't have equivalent behaviour in terms of all the errors we could throw, for
example, when a symbol is redefined.

This change refactors parseAssignment so that it can be used by .set and .thumb_set
and implements tests for .thumb_set for all the errors thrown by that method.

Reviewed by Rafael Espíndola.

llvm-svn: 240318
2015-06-22 19:35:57 +00:00
Sanjay Patel 09b2c890af [x86] set default reciprocal (division and square root) codegen to match GCC
D8982 ( checked in at http://reviews.llvm.org/rL239001 ) added command-line 
options to allow reciprocal estimate instructions to be used in place of
divisions and square roots.

This patch changes the default settings for x86 targets to allow that recip
codegen (except for scalar division because that breaks too much code) when
using -ffast-math or its equivalent. 

This matches GCC behavior for this kind of codegen.

Differential Revision: http://reviews.llvm.org/D10396

llvm-svn: 240310
2015-06-22 18:29:44 +00:00
Sanjoy Das 6f567a4b79 [FaultMaps] Add a parser for the __llvm__faultmaps section.
Summary:
The parser is exercised by llvm-objdump using -print-fault-maps.  As is
probably obvious, the code itself was "heavily inspired" by
http://reviews.llvm.org/D10434.

Reviewers: reames, atrick, JosephTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10491

llvm-svn: 240304
2015-06-22 18:03:02 +00:00
Rafael Espindola 2d6bae2e09 Bring r240130 back.
Now that pr23900 is fixed, we can bring it back with no changes.

Original message:

Make all temporary symbols unnamed.

What this does is make all symbols that would otherwise start with a .L
(or L on MachO) unnamed.

Some of these symbols still show up in the symbol table, but we can just
make them unnamed.

In order to make sure we produce identical results when going thought assembly,
all .L (not just the compiler produced ones), are now unnamed.

Running llc on llvm-as.opt.bc, the peak memory usage goes from 208.24MB to
205.57MB.

llvm-svn: 240302
2015-06-22 17:52:52 +00:00
Alex Lorenz 8e0a1b4857 MIR Serialization: Serialize machine instruction names.
This commit implements initial machine instruction serialization. It
serializes machine instruction names. The instructions are represented
using a YAML sequence of string literals and are a part of machine
basic block YAML mapping.

This commit introduces a class called 'MIParser' which will be used to
parse the machine instructions and operands.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10481

llvm-svn: 240295
2015-06-22 17:02:30 +00:00
Pawel Bylica 06407c0320 Fix shl folding in DAG combiner.
Summary: The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared.

Test Plan: A regression test included.

Reviewers: andreadb

Reviewed By: andreadb

Subscribers: andreadb, test, llvm-commits

Differential Revision: http://reviews.llvm.org/D10602

llvm-svn: 240291
2015-06-22 15:58:11 +00:00
Rafael Espindola bdf509aaaf Add a triple to the test to fix it on some hosts.
The slp vectorizer doesn't optimize this case in 32 bits.

Fixes PR23453.

llvm-svn: 240289
2015-06-22 15:44:20 +00:00
Toma Tabacu 8e0316d439 [mips] [IAS] Add support for LAReg with identical source and destination register operands.
Summary: In this case, we're supposed to load the immediate in AT and then ADDu it with the source register and put it in the destination register.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9367

llvm-svn: 240278
2015-06-22 13:10:23 +00:00
Elena Demikhovsky 55a997437c AVX-512: added VPSHUFB instruction - all SKX forms
Added intrinsics and encoding tests.

llvm-svn: 240277
2015-06-22 13:00:42 +00:00
Toma Tabacu fb9d125592 [mips] [IAS] Add support for LASym with identical source and destination register operands.
Summary:
In this case, we're supposed to load the address of the symbol in AT and then ADDu it with the source register and
put it in the destination register.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9366

llvm-svn: 240273
2015-06-22 12:08:39 +00:00
Elena Demikhovsky ba5ab328e5 AVX-512: All forms of VCOPMRESS VEXPAND instructions,
encoding tests.

llvm-svn: 240272
2015-06-22 11:16:30 +00:00
Elena Demikhovsky d78609a7ac Reverted AVX-512 vector shuffle
llvm-svn: 240258
2015-06-22 09:01:15 +00:00
Michael Kuperstein fc21951cd7 [X86] Allow more call sequences to use push instructions for argument passing
This allows more call sequences to use pushes instead of movs when optimizing for size.
In particular, calling conventions that pass some parameters in registers (e.g. thiscall) are now supported.

Differential Revision: http://reviews.llvm.org/D10500

llvm-svn: 240257
2015-06-22 08:31:22 +00:00
Elena Demikhovsky e77566112c AVX-512: Added intrinsics for VPERMT2W/D/Q/PS/PD and
VPERMI2W/D/Q/PS/PD instructions.
Added tests.

llvm-svn: 240256
2015-06-22 06:45:48 +00:00
Rafael Espindola ff373d2c73 Add the testcase from pr23900.
llvm-svn: 240253
2015-06-22 01:29:24 +00:00
Duncan P. N. Exon Smith 3a73d9e067 AsmPrinter: Don't emit empty .debug_loc entries
If we don't know how to represent a .debug_loc entry, skip the entry
entirely rather than emitting an empty one.  Similarly, if a .debug_loc
list has no entries, don't create the list.

We still want to create the variables, just in an optimized-out form
that doesn't have a DW_AT_location.

llvm-svn: 240244
2015-06-21 16:54:56 +00:00
Simon Pilgrim fd704fe895 [X86][SSE] Added missing stack folding test for CVTSD2SS instruction.
llvm-svn: 240241
2015-06-21 16:07:47 +00:00
Hans Wennborg 6ed81cbcdb Switch lowering: add heuristic for filling leaf nodes in the weight-balanced binary search tree
Sparse switches with profile info are lowered as weight-balanced BSTs. For
example, if the node weights are {1,1,1,1,1,1000}, the right-most node would
end up in a tree by itself, bringing it closer to the top.

However, a leaf in this BST can contain up to 3 cases, and having a single
case in a leaf node as in the example means the tree might become
unnecessarily high.

This patch adds a heauristic to the pivot selection algorithm that moves more
cases into leaf nodes unless that would lower their rank. It still doesn't
yield the optimal tree in every case, but I believe it's conservatibely correct.

llvm-svn: 240224
2015-06-20 17:14:07 +00:00
Simon Pilgrim 056cbfe58d [X86][SSE] Fix PerformSExtCombine bug that accessed the wrong return value of an aggregate type.
Fix to rL237885 to ensure that it accesses the correct return value of an aggregate type.

llvm-svn: 240223
2015-06-20 16:19:24 +00:00
Simon Pilgrim d862e0f33a [X86][SSE][CostModel] Added full set of sitofp/uitofp costings for SSE2/AVX/AVX2/AVX512F.
Merged separate (but equivalent) SSE2/AVX512F tests.

Removed codegen tests since these are already done better in test/CodeGen/X86.

The actual cost values still need to be updated to match recent codegen improvements.

llvm-svn: 240219
2015-06-20 14:58:01 +00:00
Peter Collingbourne e3d2447f79 Use correct escaping for semicolon on Windows.
llvm-svn: 240207
2015-06-20 01:28:20 +00:00
Peter Collingbourne 2e06b7d198 LibDriver tests require x86 target.
llvm-svn: 240205
2015-06-20 01:14:37 +00:00
Peter Collingbourne 7070827be1 LibDriver: implement /libpath and $LIB; ignore /ignore and /machine.
llvm-svn: 240203
2015-06-20 00:57:12 +00:00
Nico Weber 67e715ff7d Revert 240130, it caused crashes (repro in PR23900).
llvm-svn: 240193
2015-06-19 23:43:47 +00:00
Sanjoy Das 18c9dd31de [CallGraph] Given -print-callgraph a stable printing order.
Summary:
Since FunctionMap has llvm::Function pointers as keys, the order in
which the traversal happens can differ from run to run, causing spurious
FileCheck failures.  Have CallGraph::print sort the CallGraphNodes by
name before printing them.

Reviewers: bogner, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10575

llvm-svn: 240191
2015-06-19 23:20:31 +00:00
Rafael Espindola 3dc0d05bf4 Improve error handling of getRelocationAddend.
This patch changes getRelocationAddend to use ErrorOr and considers it an error
to try to get the addend of a REL section.

If, for example, a x86_64 file has a REL section, that file is corrupted and
we should reject it.

Using ErrorOr is not ideal since we check the section type once per relocation
instead of once per section.

Checking once per section would involve getRelocationAddend just asserting and
callers checking the section before iterating over the relocations.

In any case, this is an improvement and includes a test.

llvm-svn: 240176
2015-06-19 20:58:43 +00:00
Alex Lorenz 00302df3fe MIR Parser: report an error when a basic block isn't found.
This commit reports an error when the MIR parser can't find
a basic block with the machine basic block's name.

llvm-svn: 240174
2015-06-19 20:12:03 +00:00
Alex Lorenz 4f093bf1ce MIR Serialization: Serialize the list of machine basic blocks with simple attributes.
This commit implements the initial serialization of machine basic blocks in a
machine function. Only the simple, scalar MBB attributes are serialized. The 
reference to LLVM IR's basic block is preserved when that basic block has a name.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10465

llvm-svn: 240145
2015-06-19 17:43:07 +00:00
Michael Zolotukhin 4d8ffa082c [SLP] Vectorize for all-constant entries.
Differential Revision: http://reviews.llvm.org/D10531

llvm-svn: 240144
2015-06-19 17:40:15 +00:00
Matt Arsenault 5eb5eb59fc AMDGPU: Fix some places missed in rename
llvm-svn: 240143
2015-06-19 17:39:03 +00:00
Rafael Espindola 284a750c5f Make all temporary symbols unnamed.
What this does is make all symbols that would otherwise start with a .L
(or L on MachO) unnamed.

Some of these symbols still show up in the symbol table, but we can just
make them unnamed.

In order to make sure we produce identical results when going thought assembly,
all .L (not just the compiler produced ones), are now unnamed.

Running llc on llvm-as.opt.bc, the peak memory usage goes from 208.24MB to
205.57MB.

llvm-svn: 240130
2015-06-19 12:16:55 +00:00
Ahmed Bougacha 9a9094260d [ARM] Look through concat when lowering in-place shuffles (VZIP, ..)
Currently, we canonicalize shuffles that produce a result larger than
their operands with:
  shuffle(concat(v1, undef), concat(v2, undef))
->
  shuffle(concat(v1, v2), undef)

because we can access quad vectors (see PerformVECTOR_SHUFFLECombine).

This is useful in the general case, but there are special cases where
native shuffles produce larger results: the two-result ops.

We can look through the concat when lowering them:
  shuffle(concat(v1, v2), undef)
->
  concat(VZIP(v1, v2):0, :1)

This lets us generate the native shuffles instead of scalarizing to
dozens of VMOVs.

Differential Revision: http://reviews.llvm.org/D10424

llvm-svn: 240118
2015-06-19 02:32:35 +00:00
Ahmed Bougacha 7dbea8cec9 [ARM] Add D-sized vtrn/vuzp/vzip tests, and cleanup. NFC.
llvm-svn: 240114
2015-06-19 02:15:34 +00:00
Eric Christopher 572e03a396 Fix "the the" in comments.
llvm-svn: 240112
2015-06-19 01:53:21 +00:00
Alex Lorenz 82a9a7e42c MIR Serialization: Reenable one of the MIRParser tests by reverting r239805.
The test 'llvm/test/CodeGen/MIR/machine-function.mir' was disabled on 
x86 msc18 in r239805 as it failed. My commit r240054 have fixed the
problem, so this commit reverts the commit that disabled the test as
it should pass now. 

llvm-svn: 240074
2015-06-18 22:46:27 +00:00
Rafael Espindola 9ac06a0e6b Improve the --expand-relocs handling of MachO.
In a relocation target can take 3 basic forms

* A r_value in scattered relocations.
* A symbol in external relocations.
* A section is non-external relocations.

Have the dump reflect that. With this change we go from

CHECK-NEXT:       Extern: 0
CHECK-NEXT:       Type: X86_64_RELOC_SUBTRACTOR (5)
CHECK-NEXT:       Symbol: 0x2
CHECK-NEXT:       Scattered: 0

To just

// CHECK-NEXT:       Type: X86_64_RELOC_SUBTRACTOR (5)
// CHECK-NEXT:       Section: __data (2)

Since the relocation is with a section, we print the seciton name and don't
need to say that it is not scattered or external.

Someone motivated can add further special cases for things like
ARM64_RELOC_ADDEND and ARM_RELOC_PAIR.

llvm-svn: 240073
2015-06-18 22:38:20 +00:00
Yi Jiang e0b3499db7 Avoid redundant select node in early if-conversion pass
llvm-svn: 240072
2015-06-18 22:34:09 +00:00
Hans Wennborg 67d492a544 Switch lowering: enable whole-switch jump tables at -O0.
To same compile time, the analysis to find dense case-clusters in switches is
not done at -O0. However, when the whole switch is dense enough, it is easy to
turn it into a jump table, resulting in much faster code with no extra effort.

llvm-svn: 240071
2015-06-18 22:22:30 +00:00
Rafael Espindola cf022ba270 Pass --expand-relocs to a few more tests.
llvm-svn: 240069
2015-06-18 22:12:47 +00:00
Sanjay Patel c3e018e6fd add test to show suboptimal load merging behavior
llvm-svn: 240063
2015-06-18 21:34:26 +00:00
Simon Pilgrim de94fa6438 [X86][SSE][CostModel] Fixed uitofp/sitofp cost target tests to specify sse2/avx2/avx512f directly instead of via a cpu model.
llvm-svn: 240062
2015-06-18 21:26:01 +00:00
Sanjay Patel 9fce2bc7b1 fixed to test attributes and use better checks
1. Used update_llc_test_checks.py to tighten checks
2. Fixed triple (nothing Darwin-specific here)
3. Replaced CPU specifiers with attributes
4. Fixed comments
5. Removed IvyBridge run because it did not add any coverage

llvm-svn: 240058
2015-06-18 21:12:24 +00:00
Rafael Espindola aaaa575f71 Use --expand-relocs in a test. It will make the next change easier to read.
llvm-svn: 240053
2015-06-18 20:57:35 +00:00
Colin LeMahieu d2158755eb [Hexagon] Printing packet brackets when asm printing and adding a number of tests that test packet brackets.
llvm-svn: 240051
2015-06-18 20:43:50 +00:00
Sanjoy Das c65d43e649 [CallGraph] Teach the CallGraph about non-leaf intrinsics.
Summary:
Currently intrinsics don't affect the creation of the call graph.
This is not accurate with respect to statepoint and patchpoint
intrinsics -- these do call (or invoke) LLVM level functions.

This change fixes this inconsistency by adding a call to the external
node for call sites that call these non-leaf intrinsics.  This coupled
with the fact that these intrinsics also escape the function pointer
they call gives us a conservatively correct call graph.

Reviewers: reames, chandlerc, atrick, pgavlin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10526

llvm-svn: 240039
2015-06-18 19:28:26 +00:00
David Majnemer 46c852e438 [CodeGen] Don't emit a random reference to the personality function
This should fix issues we've been seeing with Darwin.

llvm-svn: 240036
2015-06-18 18:31:46 +00:00
James Y Knight f90346f8f6 [SPARC] Repair GOT references to internal symbols.
They had been getting emitted as a section + offset reference, which
is bogus since the value needs to be the offset within the GOT, not
the actual address of the symbol's object.

Differential Revision: http://reviews.llvm.org/D10441

llvm-svn: 240020
2015-06-18 15:05:15 +00:00
Rafael Espindola f14eec8d78 Convert a few tests to use llvm-mc.
llvm-svn: 240017
2015-06-18 13:39:07 +00:00
Simon Pilgrim 1739421893 [X86][AVX2] Added AVX2 SINT_TO_FP/UINT_TO_FP tests
llvm-svn: 240013
2015-06-18 12:32:28 +00:00
Asaf Badouh 81f03c30a5 [AVX512]
add instructions: VPAVGB and VPAVGW


review
http://reviews.llvm.org/D10504

llvm-svn: 240012
2015-06-18 12:30:53 +00:00
Elena Demikhovsky d3057e5e37 AVX-512: (fixed) Added encoding of all forms of VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.

llvm-svn: 240003
2015-06-18 08:56:19 +00:00
Elena Demikhovsky 4f13f3f9b8 reverted 239999 due to test failures
llvm-svn: 240001
2015-06-18 08:06:49 +00:00
Elena Demikhovsky 975a637cd9 AVX-512: Added encoding of all forms of VPERMT2W/D/Q/PS/PD
and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.

llvm-svn: 239999
2015-06-18 07:29:40 +00:00
Benjamin Kramer c6e8bfc41d [AsmPrinter] Make isRepeatedByteSequence smarter about odd integer types
- zext the value to alloc size first, then check if the value repeats
  with zero padding included. If so we can still emit a .space
- Do the checking with APInt.isSplat(8), which handles non-pow2 types
- Also handle large constants (bit width > 64)
- In a ConstantArray all elements have the same type, so it's sufficient
  to check the first constant recursively and then just compare if all
  following constants are the same by pointer compare

llvm-svn: 239977
2015-06-17 23:55:17 +00:00
Simon Pilgrim 3aa039a4a8 [X86][SSE] Improved support for vector i16 to float conversions.
Added explicit sign extension for v4i16/v8i16 to v4i32/v8i32 before conversion to floats. Matches existing support for v4i8/v8i8.

Follow up to D10433

llvm-svn: 239966
2015-06-17 22:43:34 +00:00
Jingyue Wu cd3afea451 Add NVPTXLowerAlloca pass to convert alloca'ed memory to local address
Summary:
This is done by first adding two additional instructions to convert the
alloca returned address to local and convert it back to generic. Then
replace all uses of alloca instruction with the converted generic
address. Then we can rely NVPTXFavorNonGenericAddrSpace pass to combine
the generic addresscast and the corresponding Load, Store, Bitcast, GEP
Instruction together.

Patched by Xuetian Weng (xweng@google.com). 

Test Plan: test/CodeGen/NVPTX/lower-alloca.ll

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: meheff, broune, eliben, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10483

llvm-svn: 239964
2015-06-17 22:31:02 +00:00
David Majnemer 7fddeccb8b Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

llvm-svn: 239940
2015-06-17 20:52:32 +00:00
Ahmed Bougacha f32991461f [CodeGenPrepare] Generalize inserted set from truncs to any inst.
It's been used before to avoid infinite loops caused by separate CGP
optimizations undoing one another.  We found one more such issue
caused by r238054.  To avoid it, generalize the "InsertedTruncs"
set to any inst, and use it to avoid touching those again.

llvm-svn: 239938
2015-06-17 20:44:32 +00:00
Colin LeMahieu bb71f7d251 [Hexagon] Adding a number of other tests for min/max instructions and loading i1s.
llvm-svn: 239935
2015-06-17 20:29:33 +00:00
Peter Collingbourne 4fc603ded3 LowerBitSets: Do not assign names to aliases of unnamed bitset element objects.
The restriction on unnamed aliases was removed in r239921. Mostly reverts
r239590, but we keep the test.

llvm-svn: 239923
2015-06-17 18:31:02 +00:00
Rafael Espindola 54fc298bbc Allow aliases to be unnamed.
If globals can be unnamed, there is no reason for aliases to be different.

The restriction was there since the original implementation in r36435. I
can only guess it was there because of the old bison parser for the old
alias syntax.

llvm-svn: 239921
2015-06-17 17:53:31 +00:00
Colin LeMahieu ca8a82d5c7 [Hexagon] Adding some compare tests, fixing existing XFAILed tests, and removing mcpu=hexagonv4 since that's the minimum version anyway.
llvm-svn: 239917
2015-06-17 17:19:05 +00:00
Diego Novillo 8c49a57266 Add documentation for new backedge mass propagation in irregular loops.
Tweak test cases and rename headerIndexFor -> getHeaderIndex.

llvm-svn: 239915
2015-06-17 16:28:22 +00:00
Benjamin Kramer 58675d4f84 [MC/Dwarf] Encode DW_CFA_advance_loc in target endianess.
This matches GNU as output.

llvm-svn: 239911
2015-06-17 15:14:35 +00:00
Toma Tabacu f712ede932 [mips] [IAS] Add support for expanding LASym with a source register operand.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9348

llvm-svn: 239910
2015-06-17 14:31:51 +00:00
Toma Tabacu 1a1083285c [mips] [IAS] Add support for the B{L,G}{T,E}(U) branch pseudo-instructions.
Summary:
This does not include support for the immediate variants of these pseudo-instructions.
Fixes llvm.org/PR20968.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D8537

llvm-svn: 239905
2015-06-17 13:20:24 +00:00
Toma Tabacu 9e7b90c244 [mips] [IAS] Fix LA with relative label operands.
Summary:
Call MCSymbolRefExpr::create() with a MCSymbol* argument, not with a StringRef
of the Symbol's name, in order to avoid creating invalid temporary symbols for
relative labels (e.g. {$,.L}tmp00, {$,.L}tmp10 etc.).

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10498

llvm-svn: 239901
2015-06-17 12:30:37 +00:00
Toma Tabacu 6a1e0eb27d [mips] [IAS] Add test for SW with relative label operands. NFC.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10497

llvm-svn: 239899
2015-06-17 11:46:37 +00:00
Toma Tabacu 07c97b3b7e [mips] [IAS] Fix LW with relative label operands.
Summary:
Previously, MCSymbolRefExpr::create() was called with a StringRef of the symbol
name, which it would then search for in the Symbols StringMap (from MCContext).

However, relative labels (which are temporary symbols) are apparently not stored
in the Symbols StringMap, so we end up creating a new {$,.L}tmp symbol
({$,.L}tmp00, {$,.L}tmp10 etc.) each time we create an MCSymbolRefExpr by
passing in the symbol name as a StringRef.

Fortunately, there is a version of MCSymbolRefExpr::create() which takes an
MCSymbol* and we already have an MCSymbol* at that point, so we can just pass
that in instead of the StringRef.

I also removed the local StringRef calls to MCSymbolRefExpr::create() from
expandMemInst(), as those cases can be handled by evaluateRelocExpr() anyway.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9938

llvm-svn: 239897
2015-06-17 10:43:45 +00:00
Igor Breger dfcc3d31a7 AVX-512: cvtusi2ss/d intrinsics.
Change builtin function name and signature ( add third parameter - rounding mode ).
Added tests for intrinsics.

Differential Revision: http://reviews.llvm.org/D10473

llvm-svn: 239888
2015-06-17 07:23:57 +00:00
Matthias Braun 8321006d44 Revert "AArch64: Use CMP;CCMP sequences for and/or/setcc trees."
The patch triggers a miscompile on SPEC 2006 403.gcc with the (ref)
200.i and scilab.i inputs. I opened PR23866 to track analysis of this.

This reverts commit r238793.

llvm-svn: 239880
2015-06-17 04:02:32 +00:00
Colin LeMahieu be99a02b1b [Hexagon] Adding MC ELF streamer and updating addend relocation test which shows correct ELF symbol.
llvm-svn: 239876
2015-06-17 03:06:16 +00:00
Sanjay Patel 0848a8be92 Add some tests based on PR21711
These were originally added in r227242,
but that patch was reverted because it
caused a failure on AArch64.

llvm-svn: 239860
2015-06-16 22:37:50 +00:00
Simon Atanasyan 6e07e9305b [llvm-readobj] Print MIPS .reginfo section content
llvm-svn: 239856
2015-06-16 21:47:43 +00:00
Simon Pilgrim cae7b94cbd [X86][SSE] Vectorize v2i32 to v2f64 conversions
This patch enables support for the conversion of v2i32 to v2f64 to use the CVTDQ2PD xmm instruction and stay on the SSE unit instead of scalarizing, sign extending to i64 and using CVTSI2SDQ scalar conversions.

Differential Revision: http://reviews.llvm.org/D10433

llvm-svn: 239855
2015-06-16 21:40:28 +00:00
Philip Reames c25df11614 Reapply 239795 - [InstCombine] Propagate non-null facts to call parameters
The original change broke clang side tests.  I will be submitting those momentarily.  This change includes post commit feedback on the original change from from Pete Cooper.

Original Submission comments:
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.

Differential Revision: http://reviews.llvm.org/D9132

llvm-svn: 239849
2015-06-16 20:24:25 +00:00
Rafael Espindola c6afe0d4e9 Improve handling of end of file in the bitcode reader.
Before this patch the bitcode reader would read a module from a file
that contained in order:

* Any number of non MODULE_BLOCK sub blocks.
* One MODULE_BLOCK
* Any number of non MODULE_BLOCK sub blocks.
* 4 '\n' characters to handle OS X's ranlib.

Since we support lazy reading of modules, any information that is relevant
for the module has to be in the MODULE_BLOCK or before it. We don't gain
anything from checking what is after.

This patch then changes the reader to stop once the MODULE_BLOCK has been
successfully parsed.

This avoids the ugly special case for .bc files in an archive and makes it
easier to embed bitcode files.

llvm-svn: 239845
2015-06-16 20:03:39 +00:00
Diego Novillo 9a779623d9 Fix PR 23525 - Separate header mass propagation in irregular loops.
Summary:
When propagating mass through irregular loops, the mass flowing through
each loop header may not be equal. This was causing wrong frequencies
to be computed for irregular loop headers.

Fixed by keeping track of masses flowing through each of the headers in
an irregular loop. To do this, we now keep track of per-header backedge
weights. After the loop mass is distributed through the loop, the
backedge weights are used to re-distribute the loop mass to the loop
headers.

Since each backedge will have a mass proportional to the different
branch weights, the loop headers will end up with a more approximate
weight distribution (as opposed to the current distribution that assumes
that every loop header is the same).

Reviewers: dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10348

llvm-svn: 239843
2015-06-16 19:10:58 +00:00
Igor Laevsky 8f3fa0ec63 [Statepoints] Test only change. Check that statepoint lowering didn't generate more than expected amount of spills.
See http://reviews.llvm.org/D10402 for related discussion.

llvm-svn: 239842
2015-06-16 19:07:05 +00:00
Frederic Riss 40baa0aad4 Have MachOObjectFile::isValidArch() accept armv7
llvm-svn: 239833
2015-06-16 17:37:03 +00:00
Alex Lorenz 5ef16b8a7c MIR Parser: Report an error when a machine function doesn't have a corresponding function.
This commit reports an error when a machine function from a MIR file that contains
LLVM IR can't find a function with the same name in the loaded LLVM IR module.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10468

llvm-svn: 239831
2015-06-16 17:06:29 +00:00
Rafael Espindola 35f6faed67 Add a test for padded bitcode files.
llvm-svn: 239829
2015-06-16 16:36:15 +00:00
Kit Barton 4f79f96fd7 Properly handle the mftb instruction.
The mftb instruction was incorrectly marked as deprecated in the PPC
Backend. Instead, it should not be treated as deprecated, but rather be
implemented using the mfspr instruction. A similar patch was put into GCC last
year. Details can be found at:

https://sourceware.org/ml/binutils/2014-11/msg00383.html.
This change will replace instances of the mftb instruction with the mfspr
instruction for all CPUs except 601 and pwr3. This will also be the default
behaviour.

Additional details can be found in:

https://llvm.org/bugs/show_bug.cgi?id=23680

Phabricator review: http://reviews.llvm.org/D10419

llvm-svn: 239827
2015-06-16 16:01:15 +00:00
Matt Arsenault ed891b5561 Revert "Revert "Fix merges of non-zero vector stores""
Reapply r239539. Don't assume the collected number of
stores is the same vector size. Just take the first N
stores to fill the vector.

llvm-svn: 239825
2015-06-16 15:51:48 +00:00
Benjamin Kramer 1ee59cba5d [InstSimplify] Allow folding of fdiv X, X with just NaNs ignored
Any combination of +-inf/+-inf is NaN so it's already ignored with
nnan and we can skip checking for ninf. Also rephrase logic in comments
a bit.

llvm-svn: 239821
2015-06-16 14:57:29 +00:00
Daniel Sanders 58405d856e [mips][ias] Expand on r238751 to cover as many relocs as possible.
Summary:
Relocs that can be converted from absolute to PC-relative now do so if IsPCRel
is true. Relocs that require PC-relative now call llvm_unreachable() if IsPCRel
is false and similarly those that require absolute assert that IsPCRel is false.

Note that while it looks like some relocs (e.g. R_MIPS_26) can be converted into
the MIPS32r6/MIPS64r6 relocs (R_MIPS_PC*_S2), it isn't actually valid to do so.

Placeholders have been left in the testcase for unsupported relocs and relocs
that cannot be generated at the moment.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits, rafael

Differential Revision: http://reviews.llvm.org/D10184

llvm-svn: 239817
2015-06-16 13:46:26 +00:00
Daniel Sanders c535d93b47 [llvm-mc] The object form of the GNU triple should be the same as the string form.
Summary:
GetTarget() may modify TripleName without also updating TheTriple.
This can lead to situations where the MCObjectStreamer has a different triple
to the rest of LLVM.

This inconsistency caused sparc-little-endian.s to pass on Windows because most
of LLVM had sparcel-pc-win32 while MCObjectStreamer had "". I believe the same
kind of thing was also true of Darwin.

Reviewers: rengolin

Reviewed By: rengolin

Subscribers: llvm-commits, rengolin, rafael

Differential Revision: http://reviews.llvm.org/D10450

llvm-svn: 239808
2015-06-16 09:57:38 +00:00
Asaf Badouh 02d126cb9d [AVX512] add integer min/max intrinsics support.
review:
http://reviews.llvm.org/D10439

llvm-svn: 239806
2015-06-16 08:39:27 +00:00
NAKAMURA Takumi f68c7a27f4 Disable llvm/test/CodeGen/MIR/machine-function.mir on x86 msc18 for now. Investigating.
The emission was as below;

  ---
  name:            foo
  alignment:       31428584
  exposesReturnsTwice: true
  hasInlineAsm:    false
  ...
  ---
  name:            bar
  alignment:       1701667182
  exposesReturnsTwice: false
  hasInlineAsm:    false
  ...
  ---
  name:            func
  alignment:       8
  exposesReturnsTwice: false
  hasInlineAsm:    false
  ...
  ---
  name:            func2
  alignment:       16
  exposesReturnsTwice: true
  hasInlineAsm:    true
  ...

llvm-svn: 239805
2015-06-16 06:57:35 +00:00
Elena Demikhovsky 77f0e9f662 X86: optimized i64 vector multiply with constant
When we multiply two 64-bit vectors, we extract lower and upper part and use the PMULUDQ instruction.
When one of the operands is a constant, the upper part may be zero, we know this at compile time.
Example: %a = mul <4 x i64> %b, <4 x i64> < i64 5, i64 5, i64 5, i64 5>.
I'm checking the value of the upper part and prevent redundant "multiply", "shift" and "add" operations.

llvm-svn: 239802
2015-06-16 06:07:24 +00:00
Philip Reames 1a6305f313 Revert 239795
I forgot to update some clang test cases.  I'll fix and resubmit tomorrow.

llvm-svn: 239800
2015-06-16 01:20:53 +00:00
Ahmed Bougacha 8c7754b965 [AArch64] Generalize extract-high DUP extension to MOVI/MVNI.
These are really immediate DUPs, and suffer from the same problem
with long instructions with a high/2 variant (e.g. smull).

By extending a MOVI (or DUP, before this patch), we can avoid an ext
on the other operand of the long instruction, e.g. turning:
    ext.16b v0, v0, v0, #8
    movi.4h v1, #0x53
    smull.4s  v0, v0, v1
into:
    movi.8h v1, #0x53
    smull2.4s  v0, v0, v1

While there, add a now-necessary combine to fold (VT NVCAST (VT x)).

llvm-svn: 239799
2015-06-16 01:18:14 +00:00
Ahmed Bougacha d300722b93 [AArch64] Robustize neon-2velem-high test. NFC.
llvm-svn: 239798
2015-06-16 01:05:39 +00:00
Philip Reames dfc29fba60 [InstCombine] Propagate non-null facts to call parameters
If a parameter to a function is known non-null, use the existing parameter attributes to record that fact at the call site. This has no optimization benefit by itself - that I know of - but is an enabling change for http://reviews.llvm.org/D9129.

Differential Revision: http://reviews.llvm.org/D9132

llvm-svn: 239795
2015-06-16 00:43:54 +00:00
Alex Lorenz 5b5f97537f MIR Serialization: Print and parse simple machine function attributes.
This commit serializes the simple, scalar attributes from the 
'MachineFunction' class.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10449

llvm-svn: 239790
2015-06-16 00:10:47 +00:00
Alex Lorenz 8e7a58d7cc MIR Serialization: Create dummy functions when the MIR file doesn't have LLVM IR.
This commit creates a dummy LLVM IR function with one basic block and an unreachable
instruction for each parsed machine function when the MIR file doesn't have LLVM IR.
This change is required as the machine function analysis pass creates machine
functions only for the functions that are defined in the current LLVM module.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10135

llvm-svn: 239778
2015-06-15 23:07:38 +00:00
Alex Lorenz fe2aa97bab MIR Serialization: Report an error when machine functions have the same name.
This commit reports an error when the MIR parser encounters a machine
function with the name that is the same as the name of a different
machine function.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10130

llvm-svn: 239774
2015-06-15 22:23:23 +00:00
Peter Collingbourne 58af6d1594 Add safestack attribute to LLVMAttribute enum and Go bindings. Correct
constants in commented-out part of LLVMAttribute enum. Add tests that verify
that the safestack attribute is only allowed as a function attribute.

llvm-svn: 239772
2015-06-15 22:16:51 +00:00
Colin LeMahieu ded2e90600 [Hexagon] Using readobj rather than objdump.
llvm-svn: 239770
2015-06-15 21:57:41 +00:00
Colin LeMahieu a071a8e5b6 [Hexagon] PC-relative offsets are relative to packet start rather than the offset of the relocation. Set relocation addend and check it's correct in the ELF.
llvm-svn: 239769
2015-06-15 21:52:13 +00:00
Simon Pilgrim aa9f712967 [X86][SSE] Added tests for vector i8/i16 to f32/f64 conversions
llvm-svn: 239767
2015-06-15 21:49:31 +00:00
Peter Collingbourne 82437bf7a5 Protection against stack-based memory corruption errors using SafeStack
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).

The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.

Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.

This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:

- Add the safestack function attribute, similar to the ssp, sspstrong and
  sspreq attributes.

- Add the SafeStack instrumentation pass that applies the safe stack to all
  functions that have the safestack attribute. This pass moves all unsafe local
  variables to the unsafe stack with a separate stack pointer, whereas all
  safe variables remain on the regular stack that is managed by LLVM as usual.

- Invoke the pass as the last stage before code generation (at the same time
  the existing cookie-based stack protector pass is invoked).

- Add unit tests for the safe stack.

Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.

Differential Revision: http://reviews.llvm.org/D6094

llvm-svn: 239761
2015-06-15 21:07:11 +00:00
Alex Lorenz 735c47ec3e MIR Serialization: Connect the machine function analysis pass to the MIR parser.
This commit connects the machine function analysis pass (which creates machine
functions) to the MIR parser, which will initialize the machine functions 
with the state from the MIR file and reconstruct the machine IR.

This commit introduces a new interface called 'MachineFunctionInitializer',
which can be used to provide custom initialization for the machine functions.

This commit also introduces a new diagnostic class called 
'DiagnosticInfoMIRParser' which is used for MIR parsing errors.
This commit modifies the default diagnostic handling in LLVMContext - now the
the diagnostics are printed directly into llvm::errs() so that the MIR parsing 
errors can be printed with colours.  

Reviewers: Justin Bogner

Differential Revision: http://reviews.llvm.org/D9928

llvm-svn: 239753
2015-06-15 20:30:22 +00:00
Sanjoy Das 784582f116 Add "REQUIRES: asserts" to test case that uses -debug-only
llvm-svn: 239748
2015-06-15 20:05:38 +00:00
Sanjoy Das 69fad0799e [CodeGen] Add a pass to fold null checks into nearby memory operations.
Summary:
This change adds an "ImplicitNullChecks" target dependent pass.  This
pass folds null checks into memory operation using the FAULTING_LOAD
pseudo-op introduced in previous patches.

Depends on D10197
Depends on D10199
Depends on D10200

Reviewers: reames, rnk, pgavlin, JosephTremoulet, atrick

Reviewed By: atrick

Subscribers: ab, JosephTremoulet, llvm-commits

Differential Revision: http://reviews.llvm.org/D10201

llvm-svn: 239743
2015-06-15 18:44:27 +00:00
Evgeny Astigeevich ff1f4be4c7 On behalf of Alexandros Lamprineas:
LLVM targeting aarch64 doesn't correctly produce aligned accesses for non-aligned
data at -O0/fast-isel (-mno-unaligned-access).
The root cause seems to be in fast-isel not producing unaligned access correctly
for -mno-unaligned-access.

The patch just aborts fast-isel for loads and stores when -mno-unaligned-access is
present. 
The regression test is updated to check this new test case (-mno-unaligned-access 
together with fast-isel).

Differential Revision: http://reviews.llvm.org/D10360

llvm-svn: 239732
2015-06-15 15:48:44 +00:00
Rafael Espindola 92200d237a gold-plugin: save the .o when given -save-temps.
The plugin now save the bitcode before and after optimizations and the
.o that is passed to the linker.

llvm-svn: 239726
2015-06-15 13:36:27 +00:00
Jingyue Wu 12b0c2835e [ValueTracking] do not overwrite analysis results already computed
Summary:
ValueTracking used to overwrite the analysis results computed from
assumes and dominating conditions. This patch fixes this issue.

Test Plan: test/Analysis/ValueTracking/assume.ll

Reviewers: hfinkel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10283

llvm-svn: 239718
2015-06-15 05:46:29 +00:00
Hao Liu 1c2e89a57a [AArch64] Delete two empty files, which should be removed by r239713.
llvm-svn: 239715
2015-06-15 02:56:40 +00:00
Hao Liu d0ca8d7edd [AArch64] Revert r239711 again. We need to discuss how to share code between AArch64 and ARM backend.
llvm-svn: 239713
2015-06-15 01:56:40 +00:00
Hao Liu cb070e3833 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Re-commit after adding "-aarch64-neon-syntax=generic" to fix the failure on OS X.
This patch was firstly committed in r239514, then reverted in r239544 because of a syntax incompatible failure on OS X.

llvm-svn: 239711
2015-06-15 01:35:49 +00:00
Benjamin Kramer 228680ded8 [InstSimplify] fsub nnan x, x -> 0.0 is valid without ninf
Both inf - inf and (-inf) - (-inf) are NaN, so it's already covered by
nnan.

llvm-svn: 239702
2015-06-14 21:01:20 +00:00
Benjamin Kramer 4f0524614e [InstSimplify] Add self-fdiv identities for -ffinite-math-only.
When NaNs and Infs are ignored we can fold
 X /  X -> 1.0
-X /  X -> -1.0
 X / -X -> -1.0

llvm-svn: 239701
2015-06-14 18:53:58 +00:00
Igor Breger 5e49697138 AVX-512: Implemented DAG lowering for shuff62x2/shufi62x2 instuctions ( Shuffle Packed Values at 128-bit Granularity )
Tests added , vector-shuffle-512-v8.ll test re-generated.

Differential Revision: http://reviews.llvm.org/D10300

llvm-svn: 239697
2015-06-14 13:07:47 +00:00
Michael Kuperstein e3de07a529 Add support for parsing the XOR operator in Intel syntax inline assembly.
Differential Revision: http://reviews.llvm.org/D10385
Patch by marina.yatsina@intel.com

llvm-svn: 239695
2015-06-14 12:59:45 +00:00
Igor Breger abe4a79b75 AVX-512: Implemented cvtsi2ss/d cvtusi2ss/d instructions with round control for KNL.
Added intrinsics for cvtsi2ss/d instructions.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D10430

llvm-svn: 239694
2015-06-14 12:44:55 +00:00
Colin LeMahieu b8575b14be [Hexagon] Adding some codegen tests and updating some to match spec.
llvm-svn: 239690
2015-06-13 21:46:39 +00:00
Simon Pilgrim d3f6427446 [DAGCombiner] Added BSWAP(BSWAP(x)) -> x combine pattern.
llvm-svn: 239682
2015-06-13 16:25:12 +00:00
Simon Pilgrim 011381d48b [DAGCombiner] Added BSWAP vector constant folding support.
llvm-svn: 239675
2015-06-13 14:08:15 +00:00
Tom Stellard 45bb48ea19 R600 -> AMDGPU rename
llvm-svn: 239657
2015-06-13 03:28:10 +00:00
Tim Northover 02cfdbb7f1 AArch64: map bare-metal arm64-macho triple to MachO MC layer.
Far better than an assertion about expecting ELF.

llvm-svn: 239647
2015-06-12 23:37:11 +00:00
Tom Stellard 12a1910e87 R600/SI: Add assembler support for FLAT instructions
- Add glc, slc, and tfe operands to flat instructions
- Add missing flat instructions
- Fix the encoding of flat_load_dwordx3 and flat_store_dwordx3.

llvm-svn: 239637
2015-06-12 20:47:06 +00:00
Colin LeMahieu 79ec06525e [Hexagon] Making intrinsic tests agnostic to register allocation. Narrowing intrinsic parameters to appropriate width.
llvm-svn: 239634
2015-06-12 19:57:32 +00:00
Rafael Espindola de28b7375f Don't depend on the interleaving of stdout and stderr.
That can change as we change the buffering.

llvm-svn: 239602
2015-06-12 12:20:03 +00:00
John Brawn d9e39d53b6 [ARM] Disabling vfp4 should disable fp16
ARMTargetParser::getFPUFeatures should disable fp16 whenever it
disables vfp4, as otherwise something like -mcpu=cortex-a7 -mfpu=none
leaves us with fp16 enabled (though the only effect that will have is
a wrong build attribute).

Differential Revision: http://reviews.llvm.org/D10397

llvm-svn: 239599
2015-06-12 09:38:51 +00:00
Peter Collingbourne 005354b1f4 LowerBitSets: Give names to aliases of unnamed bitset element objects.
It is valid for globals to be unnamed, but aliases must have a name. To avoid
creating invalid IR, we need to assign names to any aliases we create that
point to unnamed objects that have been moved into combined globals.

llvm-svn: 239590
2015-06-12 03:25:05 +00:00
Alexey Samsonov 9947e48cd1 [GVN] Use a simpler form of IRBuilder constructor.
Summary:
A side effect of this change is that it IRBuilder now automatically
created debug info locations for new instructions, which is the
same as debug location of insertion point. This is fine for the
functions in questions (GetStoreValueForLoad and
GetMemInstValueForLoad), as they are used in two situations:
  * GVN::processLoad, which tries to eliminate a load. In this case
    new instructions would have the same debug location as the load they
    eventually replace;
  * MaterializeAdjustedValue, which adds new instructions to the end
    of the basic blocks, which could later be used to replace the load
    definition. In this case we don't yet know the way the load would
    be eventually replaced (either by assembling the precomputed values
    via PHI, or by using them directly), so just using the basic block
    strategy seems to be reasonable. There is also a special case
    in the code that *would* adjust the location of the last
    instruction replacing the load definition to the location of the
    load.

Test Plan: regression test suite

Reviewers: echristo, dberlin, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10405

llvm-svn: 239585
2015-06-12 01:39:48 +00:00
Reid Kleckner 81d1cc00b7 [WinEH] Put finally pointers in the handler scope table field
We were putting them in the filter field, which is correct for 64-bit
but wrong for 32-bit.

Also switch the order of scope table entry emission so outermost entries
are emitted first, and fix an obvious state assignment bug.

llvm-svn: 239574
2015-06-11 23:37:18 +00:00
Reid Kleckner a9d6253572 [WinEH] Create an llvm.x86.seh.exceptioninfo intrinsic
This intrinsic is like framerecover plus a load. It recovers the EH
registration stack allocation from the parent frame and loads the
exception information field out of it, giving back a pointer to an
EXCEPTION_POINTERS struct. It's designed for clang to use in SEH filter
expressions instead of accessing the EXCEPTION_POINTERS parameter that
is available on x64.

This required a minor change to MC to allow defining a label variable to
another absolute framerecover label variable.

llvm-svn: 239567
2015-06-11 22:32:23 +00:00
Peter Collingbourne 82e657b509 Object: Prepend __imp_ when mangling a dllimport symbol in IRObjectFile.
We cannot prepend __imp_ in the IR mangler because a function reference may
be emitted unmangled in a constant initializer. The linker is expected to
resolve such references to thunks. This is covered by the new test case.

Strictly speaking we ought to emit two undefined symbols, one with __imp_ and
one without, as we cannot know which symbol the final object file will refer
to. However, this would require rather intrusive changes to IRObjectFile,
and lld works fine without it for now.

This reimplements r239437, which was reverted in r239502.

Differential Revision: http://reviews.llvm.org/D10400

llvm-svn: 239560
2015-06-11 21:42:18 +00:00
Alexey Samsonov 770f65ca6a Set proper debug location for branch added in BasicBlock::splitBasicBlock().
This improves debug locations in passes that do a lot of basic block
transformations. Important case is LoopUnroll pass, the test for correct
debug locations accompanies this change.

Test Plan: regression test suite

Reviewers: dblaikie, sanjoy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10367

llvm-svn: 239551
2015-06-11 18:25:54 +00:00
Rafael Espindola 65d37e64a9 This reverts commit r239529 and r239514.
Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions."
Revert "Fixing MSVC 2013 build error."

The  test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X.

llvm-svn: 239544
2015-06-11 17:30:33 +00:00
Reid Kleckner 2691c59e97 Revert "Fix merges of non-zero vector stores"
This reverts commit r239539.

It was causing SDAG assertions while building freetype.

llvm-svn: 239543
2015-06-11 17:25:24 +00:00
Matt Arsenault 91f90e694f SLSR: Pass address space to isLegalAddressingMode
This only updates one of the uses. The other is used in cases
that may never touch memory, so I'm not sure why this is even
calling it. Perhaps there should be a new, similar hook for such
cases or pass -1 for unknown address space.

llvm-svn: 239540
2015-06-11 16:13:39 +00:00
Matt Arsenault e23a063dc3 Fix merges of non-zero vector stores
Now actually stores the non-zero constant instead of 0.
I somehow forgot to include this part of r238108.

The test change was just an independent instruction order swap,
so just add another check line to satisfy CHECK-NEXT.

llvm-svn: 239539
2015-06-11 16:03:52 +00:00
Tom Stellard 53e015f37d R600/SI: Add -mcpu=bonaire to a test that uses flat address space
Flat instructions don't exist on SI, but there is a bug in the backend that
allows them to be selected.

llvm-svn: 239533
2015-06-11 14:51:46 +00:00
Toma Tabacu e1e460dbc5 Recommit "[mips] [IAS] Add support for BNE and BEQ with an immediate operand." (r239396).
Apparently, Arcanist didn't include some of my local changes in my previous
commit attempt.

llvm-svn: 239523
2015-06-11 10:36:10 +00:00
Zoran Jovanovic cdfcbe41f2 [mips][microMIPS] Implement ERET and ERETNC instructions
http://reviews.llvm.org/D10091

llvm-svn: 239522
2015-06-11 10:22:46 +00:00
Zoran Jovanovic 6b0dcd7b8c [mips] Change existing uimm10 operand to restrict the accepted immediates
http://reviews.llvm.org/D10312

llvm-svn: 239520
2015-06-11 09:51:58 +00:00
Zoran Jovanovic fcecf26092 [mips][microMIPSr6] Change disassembler tests to one line format
llvm-svn: 239519
2015-06-11 09:42:10 +00:00
Hao Liu 4566d18e89 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Add a pass AArch64InterleavedAccess to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsic. As Loop Vectorizor disables optimization on interleaved accesses by default, this optimization is also disabled by default. To enable it by "-aarch64-interleaved-access-opt=true"

E.g. Transform an interleaved load (Factor = 2):
       %wide.vec = load <8 x i32>, <8 x i32>* %ptr
       %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
       %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements
     Into:
       %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
       %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
       %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Transform an interleaved store (Factor = 2):
       %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
       store <8 x i32> %i.vec, <8 x i32>* %ptr
     Into:
       %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
       %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
       call void aarch64.neon.st2(%v0, %v1, %ptr)

llvm-svn: 239514
2015-06-11 09:05:02 +00:00
Simon Pilgrim 5965680d53 [X86][SSE] Vectorized i8 and i16 shift operators
This patch ensures that SHL/SRL/SRA shifts for i8 and i16 vectors avoid scalarization. It builds on the existing i8 SHL vectorized implementation of moving the shift bits up to the sign bit position and separating the 4, 2 & 1 bit shifts with several improvements:

1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - pre-SSE41 targets were masking + comparing with an 0x80 constant - we avoid this by using the fact that a set sign bit means a negative integer which can be compared against zero to then feed into VSELECT, avoiding the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of a i16 so that the i16 psraw instruction can be correctly used for sign extension - we have to do more work than for SHL/SRL but perf tests indicate that this is still beneficial.

The i16 implementation is similar but simpler than for i8 - we have to do 8, 4, 2 & 1 bit shifts but less shift masking is involved. SSE41 use of (v)pblendvb requires that the i16 shift amount is splatted to both bytes however.

Tested on SSE2, SSE41 and AVX machines.

Differential Revision: http://reviews.llvm.org/D9474

llvm-svn: 239509
2015-06-11 07:46:37 +00:00
Nemanja Ivanovic ea1db8a697 LLVM support for vector quad bit permute and gather instructions through builtins
This patch corresponds to review:
http://reviews.llvm.org/D10096

This is the back end portion of the patch related to D10095.
The patch adds the instructions and back end intrinsics for:
vbpermq
vgbbd

llvm-svn: 239505
2015-06-11 06:21:25 +00:00
Reid Kleckner c35e7f52ba Revert "Move dllimport name mangling to IR mangler."
This reverts commit r239437.

This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol
directly instead of using it to do an indirect function call.

llvm-svn: 239502
2015-06-11 01:31:48 +00:00
Peter Collingbourne 115fe37621 ArgumentPromotion: Drop sret attribute on functions that are only called directly.
If the first argument to a function is a 'this' argument and the second
has the sret attribute, the ArgumentPromotion pass may promote the 'this'
argument to more than one argument, violating the IR constraint that 'sret'
may only be applied to the first or second argument.

Although this IR constraint is arguably unnecessary, it highlighted the fact
that ArgPromotion does not need to preserve this attribute. Dropping the
attribute reduces register pressure in the backend by avoiding the register
copy required by sret. Because sret implies noalias, we also replace the
former with the latter.

Differential Revision: http://reviews.llvm.org/D10353

llvm-svn: 239488
2015-06-10 21:14:34 +00:00
Sanjay Patel 08829bac81 [x86] Add a reassociation optimization to increase ILP via the MachineCombiner pass
This is a reimplementation of D9780 at the machine instruction level rather than the DAG.

Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a
starting point; see the TODO comments) to increase ILP when it's safe to do so.

The code is closely based on the existing MachineCombiner optimization that is implemented
for AArch64.

This patch should not cause the kind of spilling tragedy that led to the reversion of r236031.

Differential Revision: http://reviews.llvm.org/D10321

llvm-svn: 239486
2015-06-10 20:32:21 +00:00
Reid Kleckner c87a6faba1 [WinEH] _except_handlerN uses 0 instead of 1 to indicate catch-all
Our usage of 1 was a holdover from __C_specific_handler.

llvm-svn: 239482
2015-06-10 18:14:07 +00:00
Alexey Samsonov 89645dfa4d [GVN] Set proper debug locations for some instructions created by GVN.
Determining proper debug locations for instructions created in
PHITransAddr is tricky. We use a simple approach here and simply copy
debug locations from instructions computing load address to
"corresponding" instructions re-creating the address computation
in predecessor basic blocks.

This may not always be correct, given all the rearrangement and
simplification going on, and debug locations may jump around a lot,
as the basic blocks we copy locations between may be very far from
each other.

Still, this would work good in most simple cases (e.g. when chain
of address computing instruction is short, or our mapping turns out
to be 1-to-1), and we desire to have *some* reasonable debug locations
associated with newly inserted instructions.

See http://reviews.llvm.org/D10351 review thread for more details.

Test Plan: regression test suite

Reviewers: spatel, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10351

llvm-svn: 239479
2015-06-10 17:37:38 +00:00
Colin LeMahieu 1e9d1d768c [Hexagon] Adding decoders for signed operands and ensuring all signed operand types disassemble correctly.
llvm-svn: 239477
2015-06-10 16:52:32 +00:00
Igor Laevsky 965bf6a3ce [Statepoints] Add test case to check that statepoint is marked with Throwable attribute.
Differential Revision: http://reviews.llvm.org/D10215

llvm-svn: 239473
2015-06-10 13:24:00 +00:00
Igor Laevsky 346ff628f7 [StatepointLowering] Reuse stack slots across basic blocks
During statepoint lowering we can sometimes avoid spilling of the value if we know that it was already spilled for previous statepoint.
We were doing this by checking if incoming statepoint value was lowered into load from stack slot. This was working only in boundaries of one basic block.

But instead of looking at the lowered node we can look directly at the llvm-ir value and if it was gc.relocate (or some simple modification of it) look up stack slot for it's derived pointer and reuse stack slot from it. This allows us to look across basic block boundaries.

Differential Revision: http://reviews.llvm.org/D10251

llvm-svn: 239472
2015-06-10 12:31:53 +00:00
Elena Demikhovsky 00c9ad5ec2 AVX-512: Fixed a bug in comparison of i1 vectors.
cmp eq should give kxnor instruction
cmp neq should give kxor 

https://llvm.org/bugs/show_bug.cgi?id=23631

llvm-svn: 239460
2015-06-10 06:49:28 +00:00
Reid Kleckner 673de15af9 [WinEH] Call llvm.stackrestore in __except blocks
We have to do this manually, the runtime only sets up ebp. Fixes a crash
when returning after catching an exception.

llvm-svn: 239451
2015-06-10 01:34:54 +00:00
Reid Kleckner 2bc93ca846 [WinEH] Emit .safeseh directives for all 32-bit exception handlers
Use a "safeseh" string attribute to do this. You would think we chould
just accumulate the set of personalities like we do on dwarf, but this
fails to account for the LSDA-loading thunks we use for
__CxxFrameHandler3. Each of those needs to make it into .sxdata as well.
The string attribute seemed like the most straightforward approach.

llvm-svn: 239448
2015-06-10 01:02:30 +00:00
NAKAMURA Takumi e6c1b56780 Add explicit -mtriple=arm-unknown to llvm/test/CodeGen/ARM/disable-tail-calls.ll, to satisfy *-win32.
llvm-svn: 239442
2015-06-09 23:33:25 +00:00
Alexey Samsonov b7f02d371f [BasicBlockUtils] Set debug locations for instructions created in SplitBlockPredecessors.
Test Plan: regression test suite

Reviewers: eugenis, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10343

llvm-svn: 239438
2015-06-09 22:10:29 +00:00
Peter Collingbourne 9fe51fdf18 Move dllimport name mangling to IR mangler.
This ensures that LTO clients see the correct external symbol name.

Differential Revision: http://reviews.llvm.org/D10318

llvm-svn: 239437
2015-06-09 22:09:53 +00:00
Jingyue Wu 75589ffcc2 [NVPTX] fix a crash bug in NVPTXFavorNonGenericAddrSpaces
Summary:
We used to assume V->RAUW only modifies the operand list of V's user.
However, if V and V's user are Constants, RAUW may replace and invalidate V's
user entirely.

This patch fixes the above issue by letting the caller replace the
operand instead of calling RAUW on Constants.

Test Plan: @nested_const_expr and @rauw in access-non-generic.ll

Reviewers: broune, jholewinski

Reviewed By: broune, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10345

llvm-svn: 239435
2015-06-09 21:50:32 +00:00
Peter Collingbourne bc05163f15 LibDriver, llvm-lib: introduce.
llvm-lib is intended to be a lib.exe compatible utility that also
understands bitcode. The implementation lives in a library so that
lld can use it to implement /lib.

Differential Revision: http://reviews.llvm.org/D10297

llvm-svn: 239434
2015-06-09 21:50:22 +00:00
Reid Kleckner f12c030f48 [WinEH] Add 32-bit SEH state table emission prototype
This gets all the handler info through to the asm printer and we can
look at the .xdata tables now. I've convinced one small catch-all test
case to work, but other than that, it would be a stretch to say this is
functional.

The state numbering algorithm avoids doing any scope reconstruction as
we do for C++ to simplify the implementation.

llvm-svn: 239433
2015-06-09 21:42:19 +00:00
Chad Rosier cf90acc104 [AArch64] Remove an overly conservative check when generating store pairs.
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.

Previously, the read of w1 (see below) prevented the formation of a stp.

        str      w0, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        str     w1, [x2, #4]
        ret

We now generate the following code.

        stp      w0, w1, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        ret

All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
Performance results for SPEC2K were within noise.

llvm-svn: 239432
2015-06-09 20:59:41 +00:00
Akira Hatanaka d9699bc7bd Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099

llvm-svn: 239427
2015-06-09 19:07:19 +00:00
Arnold Schwaighofer 7e226271a1 MergeFunctions: Don't replace a weak function use by another equivalent weak function
We don't know whether the weak functions definition is the definitive definition.

rdar://21303727

llvm-svn: 239422
2015-06-09 18:19:17 +00:00
David Blaikie 0ebe35b278 Revert "[DWARF] Fix a few corner cases in expression emission"
This reverts commit r239380 due to apparently GDB regressions:
http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/22562

llvm-svn: 239420
2015-06-09 18:01:51 +00:00
Samuel Antao cd50135a29 The constant initialization for globals in NVPTX is generated as an
array of bytes. The generation of this byte arrays was expecting 
the host to be little endian, which prevents big endian hosts to be 
used in the generation of the PTX code. This patch fixes the 
problem by changing the way the bytes are extracted so that it 
works for either little and big endian.

llvm-svn: 239412
2015-06-09 16:29:34 +00:00
Toma Tabacu 465acfd13c Recommit "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144).
Specified the llvm namespace for the 2 calls to make_unique() which caused
compilation errors in Visual Studio 2013.

llvm-svn: 239405
2015-06-09 13:33:26 +00:00
Elena Demikhovsky 6b62b659cb X86-MPX: Implemented encoding for MPX instructions.
Added encoding tests.

llvm-svn: 239403
2015-06-09 13:02:10 +00:00
Toma Tabacu 7977cfd52a Revert "[mips] [IAS] Add support for BNE and BEQ with an immediate operand." (r239396).
It was breaking buildbots.

llvm-svn: 239397
2015-06-09 10:43:49 +00:00
Toma Tabacu 5fa8fb5762 [mips] [IAS] Add support for BNE and BEQ with an immediate operand.
Summary:
For some branches, GAS accepts an immediate instead of the 2nd register operand.
We only implement this for BNE and BEQ for now. Other branch instructions can be added later, if needed.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: seanbruno, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D9666

llvm-svn: 239396
2015-06-09 10:34:31 +00:00
NAKAMURA Takumi d93855c82f llvm/test/DebugInfo/X86/expressions.ll: %llc_dwarf shouldn't be used with -mtriple, since %llc_dwarf implies the triple.
In this case, use plain "llc".

llvm-svn: 239390
2015-06-09 08:03:33 +00:00
Keno Fischer 979ae9f1f6 Move X86-only test case to appropriate directory
llvm-svn: 239384
2015-06-09 02:52:47 +00:00
Keno Fischer e34147ce2f [DWARF] Fix a few corner cases in expression emission
Summary: I noticed an object file with `DW_OP_reg4 DW_OP_breg4 0` as a DWARF expression,
which I traced to a missing break (and `++I`) in this code snippet.
While I was at it, I also added support for a few other corner cases
along the same lines that I could think of.

Test Plan: Hand-crafted test case to exercises these cases is included.

Reviewers: echristo, dblaikie, aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10302

llvm-svn: 239380
2015-06-09 01:53:59 +00:00
Anna Zaks 119046098a [asan] Prevent __attribute__((annotate)) triggering errors on Darwin
The following code triggers a fatal error in the compiler instrumentation
of ASan on Darwin because we place the attribute into llvm.metadata section,
which does not have the proper MachO section name.

void foo() __attribute__((annotate("custom")));
void foo() {;}

This commit reorders the checks so that we skip everything in llvm.metadata
first. It also removes the hard failure in case the section name does not
parse. That check will be done lower in the compilation pipeline anyway.

(Reviewed in http://reviews.llvm.org/D9093.)

llvm-svn: 239379
2015-06-09 00:58:08 +00:00
Matt Arsenault 705eb8f6b1 Implement computeKnownBits for min/max nodes
llvm-svn: 239378
2015-06-09 00:52:41 +00:00
Jingyue Wu 2e4d1dd0ed [NVPTX] run SROA after NVPTXFavorNonGenericAddrSpaces
Summary:
This cleans up most allocas NVPTXLowerKernelArgs emits for byval
parameters.

Test Plan: makes bug21465.ll more stronger to verify no redundant local load/store.

Reviewers: eliben, jholewinski

Reviewed By: eliben, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10322

llvm-svn: 239368
2015-06-09 00:05:56 +00:00
Arnold Schwaighofer 0302da614a MergeFunctions: Impose a total order on the replacement of functions
We don't want to replace function A by Function B in one module and Function B
by Function A in another module.

If these functions are marked with linkonce_odr we would end up with a function
stub calling B in one module and a function stub calling A in another module. If
the linker decides to pick these two we will have two stubs calling each other.

rdar://21265586

llvm-svn: 239367
2015-06-09 00:03:29 +00:00
Ranjeet Singh 10511a493e [AArch64] AsmParser should be case insensitive about accepting vector register names.
Differential Revision: http://reviews.llvm.org/D10320

llvm-svn: 239353
2015-06-08 21:32:16 +00:00
Rafael Espindola 4b23eb85f9 Fix a regression in .pop_section.
It was calling ChangeSection with the wrong current section, eventually leading
to a crash.

llvm-svn: 239335
2015-06-08 20:08:55 +00:00
Simon Pilgrim 4af289d0f2 [X86][SSE] Added lzcnt vector tests.
llvm-svn: 239333
2015-06-08 19:58:43 +00:00
Akira Hatanaka 4a61619ff5 [ARM] Pass a callback to FunctionPass constructors to enable skipping execution
on a per-function basis.

Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.

rdar://problem/20542263

Differential Revision: http://reviews.llvm.org/D8717

llvm-svn: 239325
2015-06-08 18:50:43 +00:00
Matthias Braun 6f8db0e1a7 X86: Reject register operands with obvious type mismatches.
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.

Related to rdar://21042280

Differential Revision: http://reviews.llvm.org/D10260

llvm-svn: 239309
2015-06-08 16:56:23 +00:00
Colin LeMahieu 6aca6f0be5 [Hexagon] Adding functionality for searching for compound instruction pairs. Compound instructions reduce slot resource requirements freeing those packet slots up for more instructions.
llvm-svn: 239307
2015-06-08 16:34:47 +00:00
Simon Pilgrim 4791f6d89b [DAGCombiner] Added CTLZ vector constant folding support.
llvm-svn: 239305
2015-06-08 16:19:00 +00:00
Javed Absar e1c7dc3ee2 ARM]: Add support for MMFR4_EL1 in assembler
This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler.
This register provides information about the implemented memory model and memory management support.

llvm-svn: 239302
2015-06-08 15:01:11 +00:00
Petar Jovanovic cf197f0bde [Mips64][mcjit] Add R_MIPS_PC32 relocation
This patch adds R_MIPS_PC32 relocation for Mips64.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D10235

llvm-svn: 239301
2015-06-08 14:10:23 +00:00
Igor Breger 00d9f8457b AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

Differential Revision: http://reviews.llvm.org/D10310

llvm-svn: 239300
2015-06-08 14:03:17 +00:00
Artur Pilipenko 7fad7e57e8 Minor refactoring of GEP handling in isDereferenceablePointer
For GEP instructions isDereferenceablePointer checks that all indices are constant and within bounds. Replace this index calculation logic to a call to accumulateConstantOffset. Separated from the http://reviews.llvm.org/D9791

Reviewed By: sanjoy

Differential Revision: http://reviews.llvm.org/D9874

llvm-svn: 239299
2015-06-08 11:58:13 +00:00
Silviu Baranga 98a137196a [LAA] Fix estimation of number of memchecks
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
 
Assuming we have w writes and r reads, currently this number is  estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.

This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.

Test Plan:
llvm test suite
spec2k
spec2k6

Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).

Reviewers: anemet

Reviewed By: anemet

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D10217

llvm-svn: 239295
2015-06-08 10:27:06 +00:00
Simon Pilgrim c789e1d57b [DAGCombiner] Added CTTZ vector constant folding support.
llvm-svn: 239293
2015-06-08 09:57:09 +00:00
Hao Liu 32c0539691 [LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses.
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
       a = A[i];         // load of even element
       b = A[i+1];       // load of odd element
       ...               // operations on a, b, c, d
       A[i] = c;         // store of even element
       A[i+1] = d;       // store of odd element
     }

  The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
     %wide.vec = load <8 x i32>, <8 x i32>* %ptr
     %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
     %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>

  The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
     %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> 
     store <8 x i32> %interleaved.vec, <8 x i32>* %ptr

This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'. 

llvm-svn: 239291
2015-06-08 06:39:56 +00:00
Hao Liu 751004a67d [LoopAccessAnalysis] Teach LAA to check the memory dependence between strided accesses.
Differential Revision: http://reviews.llvm.org/D9368

llvm-svn: 239285
2015-06-08 04:48:37 +00:00
Colin LeMahieu 14ec76eb63 [objdump] Moving PrintImmHex out of MachODump and in to llvm-objdump and setting instprinter appropriately.
llvm-svn: 239265
2015-06-07 21:07:17 +00:00
Simon Pilgrim 856fc94a2f [X86] Added tzcnt vector tests.
llvm-svn: 239264
2015-06-07 21:01:34 +00:00
Matt Arsenault e81944fd5e SeparateConstOffsetFromGEP: Pass address space to isLegalAddressingMode
llvm-svn: 239262
2015-06-07 20:17:44 +00:00
Simon Pilgrim 3a7718038d [X86] Added BitScanForward/BitScanReverse memory folding + tests
llvm-svn: 239257
2015-06-07 18:34:25 +00:00
Simon Pilgrim 19cefe6903 Fixed line endings
llvm-svn: 239253
2015-06-07 16:09:48 +00:00
Simon Pilgrim 68cd237f57 [DAGCombiner] Added CTPOP vector constant folding support.
Added tests to the existing SSE/AVX test files.

llvm-svn: 239252
2015-06-07 15:37:14 +00:00
Colin LeMahieu 229a1e69fc Teaching llvm-mc how to understand the defsym command line option. This allows integer-constant symbols to be defined on the command line and used during assembly.
llvm-svn: 239240
2015-06-07 01:46:24 +00:00
Colin LeMahieu 1c8c213529 [MC] Common symbols weren't being checked for redeclaration which allowed an assembly file to generate an assertion in setCommon(): !isCommon(). This change allows redeclaration as long as the size and alignment match exactly, otherwise report a fatal error.
llvm-svn: 239227
2015-06-06 20:12:40 +00:00
Sanjoy Das ad714b1af3 [LoopUnroll] Fix truncation bug in canUnrollCompletely.
Summary:
canUnrollCompletely takes `unsigned` values for `UnrolledCost` and
`RolledDynamicCost` but is passed in `uint64_t`s that are silently
truncated.  Because of this, when `UnrolledSize` is a large integer
that has a small remainder with UINT32_MAX, LLVM tries to completely
unroll loops with high trip counts.

Reviewers: mzolotukhin, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10293

llvm-svn: 239218
2015-06-06 05:24:10 +00:00
David Majnemer 1c297e66fb [CVP] Don't assume Constants of type i1 can be known to be true or false
CVP wants to analyze the condition operand of a select along an edge.
It succeeds in getting back a Constant but not a ConstantInt.  Instead,
it gets a ConstantExpr.  It then assumes that the Constant must be equal
to false because it isn't equal to true.

Instead, perform an additional comparison.

This fixes PR23752.

llvm-svn: 239217
2015-06-06 04:56:51 +00:00
David Majnemer 468f670021 [InstCombine] Don't miscompile select to poison
If we have (select a, b, c), it is sometimes valid to simplify this to a
single select operand.  However, doing so is only valid if the
computation doesn't inject poison into the computation.

It might be helpful to consider the following example:
  (select (icmp ne %i, INT_MAX), (add nsw %i, 1), INT_MIN)

The select is equivalent to (add %i, 1) but not (add nsw %i, 1).

Self hosting on x86_64 revealed that this occurs very, very rarely so
bailing out is hopefully pretty reasonable.

llvm-svn: 239215
2015-06-06 02:30:43 +00:00
Rafael Espindola f3d49b30b5 Handle 16 bit PC relative relocations.
Fixes pr23771.

llvm-svn: 239214
2015-06-06 02:29:56 +00:00
Frederic Riss 123303122d [dsymutil] Fix misspelled CHECK line.
llvm-svn: 239200
2015-06-05 23:46:18 +00:00
Frederic Riss 5a642079d7 [dsymutil] Add support for linking the debug_frame section.
Linking the debug frame section is actually very easy as we just have to
patch the start address in the FDE header and then copy the rest of the
FDE without even looking at it. The only small complexity comes from the
handling of the CIEs that we should unique across object file. This is
also really easy by using a StringMap keyed on the raw contents of the
CIE.

llvm-svn: 239198
2015-06-05 23:06:11 +00:00
Frederic Riss 4f5874a51f [dsymutil] Have the YAML deserialization rewrite the object address of symbols.
The main use of the YAML debug map format is for testing inside LLVM. If we have IR
files in the tests used to generate object files, then we obviously don't know the
addresses of the symbols inside the object files beforehand.

This change lets the YAML import lookup the addresses in the object files and rewrite
them. This will allow to have test that really don't need any binary input.

llvm-svn: 239189
2015-06-05 21:12:07 +00:00
Renato Golin 3dabb23384 Revert "[InstCombine] Rephrase fix to SimplifyWithOpReplaced"
This reverts commit r239141. This commit was an attempt to reintroduce
a previous patch that broke many self-hosting bots with clang timeouts,
but it still has slowdown issues, at least  on ARM, increasing the
compilation time (stage 2, clang's) by 5x.

llvm-svn: 239175
2015-06-05 18:24:12 +00:00
Sanjoy Das 72cb5e1087 [InstCombine] Fix PR23751.
PR23751 was caused by a missing ``break;`` in r234388.

llvm-svn: 239171
2015-06-05 18:04:42 +00:00
Peter Collingbourne 6679fc1a79 Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM."
as it caused miscompilations and assertion failures (PR23768,
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html).

llvm-svn: 239169
2015-06-05 18:01:28 +00:00
Fiona Glaser 666e352440 DAGCombiner: don't duplicate (fmul x, c) in visitFNEG if fneg is free
For targets with a free fneg, this fold is always a net loss if it
ends up duplicating the multiply, so definitely avoid it.

This might be true for some targets without a free fneg too, but
I'll leave that for future investigation.

llvm-svn: 239167
2015-06-05 17:52:34 +00:00
Chandler Carruth 9dabd14d59 [Unroll] Rework the naming and structure of the new unroll heuristics.
The new naming is (to me) much easier to understand. Here is a summary
of the new state of the world:

- '*Threshold' is the threshold for full unrolling. It is measured
  against the estimated unrolled cost as computed by getUserCost in TTI
  (or CodeMetrics, etc). We will exceed this threshold when unrolling
  loops where unrolling exposes a significant degree of simplification
  of the logic within the loop.
- '*PercentDynamicCostSavedThreshold' is the percentage of the loop's
  estimated dynamic execution cost which needs to be saved by unrolling
  to apply a discount to the estimated unrolled cost.
- '*DynamicCostSavingsDiscount' is the discount applied to the estimated
  unrolling cost when the dynamic savings are expected to be high.

When actually analyzing the loop, we now produce both an estimated
unrolled cost, and an estimated rolled cost. The rolled cost is notably
a dynamic estimate based on our analysis of the expected execution of
each iteration.

While we're still working to build up the infrastructure for making
these estimates, to me it is much more clear *how* to make them better
when they have reasonably descriptive names. For example, we may want to
apply estimated (from heuristics or profiles) dynamic execution weights
to the *dynamic* cost estimates. If we start doing that, we would also
need to track the static unrolled cost and the dynamic unrolled cost, as
only the latter could reasonably be weighted by profile information.

This patch is sadly not without functionality change for the new unroll
analysis logic. Buried in the heuristic management were several things
that surprised me. For example, we never subtracted the optimized
instruction count off when comparing against the unroll heursistics!
I don't know if this just got lost somewhere along the way or what, but
with the new accounting of things, this is much easier to keep track of
and we use the post-simplification cost estimate to compare to the
thresholds, and use the dynamic cost reduction ratio to select whether
we can exceed the baseline threshold.

The old values of these flags also don't necessarily make sense. My
impression is that none of these thresholds or discounts have been tuned
yet, and so they're just arbitrary placehold numbers. As such, I've not
bothered to adjust for the fact that this is now a discount and not
a tow-tier threshold model. We need to tune all these values once the
logic is ready to be enabled.

Differential Revision: http://reviews.llvm.org/D9966

llvm-svn: 239164
2015-06-05 17:01:43 +00:00
Frederic Riss c0866ad2c0 [dsymutil] Handle the -oso-prepend-path option when the input is a YAML debug map
All the tests using a YAML debug map will need this.

llvm-svn: 239163
2015-06-05 16:35:44 +00:00
Alexei Starovoitov 8cf9a4c472 [bpf] rename triple names bpf_be -> bpfeb
llvm-svn: 239162
2015-06-05 16:11:14 +00:00
Colin LeMahieu be8c453d58 [Hexagon] Reapply r239097 with tests corrected for shuffling and duplexing.
llvm-svn: 239161
2015-06-05 16:00:11 +00:00
John Brawn 985c04e8fa [ARM] Add support for -sp- FPUs and FPU none to TargetParser
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.

Differential Revision: http://reviews.llvm.org/D10238

llvm-svn: 239151
2015-06-05 13:31:19 +00:00
Simon Pilgrim 7de557e5a9 [X86][AVX2] Added tests for v32i8 vector shifts
Currently still scalarized, but D9474 should remedy that.

llvm-svn: 239146
2015-06-05 12:35:36 +00:00
Toma Tabacu 399a56d771 Revert "[mips] [IAS] Restore STI.FeatureBits in .set pop." (r239144).
This is breaking the Windows buildbots.

llvm-svn: 239145
2015-06-05 12:19:27 +00:00
Toma Tabacu 89ebf88ff3 [mips] [IAS] Restore STI.FeatureBits in .set pop.
Summary:
Only restoring AvailableFeatures is not enough and will lead to buggy behaviour.
For example, if we have a feature enabled and we ".set pop", the next time we try
to ".set" that feature nothing will happen because the "!(STI.getFeatureBits()[Feature])"
check will be false, because we didn't restore STI.FeatureBits.

In order to fix this, we need to make MipsAssemblerOptions remember the STI.FeatureBits
instead of the AvailableFeatures and then regenerate AvailableFeatures each time we ".set pop".
This is because, AFAIK, there is no way to convert from AvailableFeatures back to STI.FeatureBits,
but the reverse is possible by using ComputeAvailableFeatures(STI.FeatureBits).

I also moved the updating of AssemblerOptions inside the "if" statement in
setFeatureBits() and clearFeatureBits(), as there is no reason to update if
nothing changes.

Reviewers: dsanders, mkuper

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9156

llvm-svn: 239144
2015-06-05 11:48:54 +00:00
David Majnemer b58f32f7a8 [LoopVectorize] Don't crash on zero-sized types in isInductionPHI
isInductionPHI wants to calculate the stride based on the pointee size.
However, this is not possible when the pointee is zero sized.

This fixes PR23763.

llvm-svn: 239143
2015-06-05 10:52:40 +00:00
Andrea Di Biagio eb33134ce7 Simplify code; NFC.
Also, moved test cases from CodeGen/X86/fold-buildvector-bug.ll into
CodeGen/X86/buildvec-insertvec.ll and regenerated CHECK lines using
update_llc_test_checks.py.

llvm-svn: 239142
2015-06-05 10:29:55 +00:00
David Majnemer 6d8081835d [InstCombine] Rephrase fix to SimplifyWithOpReplaced
I don't have the IR which is causing the build bot breakage but I can
postulate as to why they are timing out:
1. SimplifyWithOpReplaced was stripping flags from the simplified value.
2. visitSelectInstWithICmp was overriding SimplifyWithOpReplaced because
   it's simplification wasn't correct.
3. InstCombine would revisit the add instruction and note that it can
   rederive the flags.
4. By modifying the value, we chose to revisit instructions which reuse
   the value.  One of the instructions is the original select, causing
   LLVM to never reach fixpoint.

Instead, strip the flags only when we are sure we are going to perform
the simplification.

llvm-svn: 239141
2015-06-05 09:57:57 +00:00
Daniel Jasper 917fa5ee66 Revert "[InstCombine] Don't miscompile safe increment idiom"
This is breaking a lot of build bots and is causing very long-running
compiles (infinite loops)?

Likely, we shouldn't return nullptr?

llvm-svn: 239139
2015-06-05 09:31:20 +00:00
Simon Pilgrim b4c562de87 [X86][SSE] Added tests for i8/i16 vector shifts
Currently still scalarized, but D9474 should remedy that.

llvm-svn: 239136
2015-06-05 08:24:23 +00:00
Alexey Samsonov 5c3ed00eb2 Revert "[Object, ELF] Fix segmentation fault in ELFFile::getSectionName()."
This reverts commit r239124.

llvm-svn: 239125
2015-06-04 23:58:31 +00:00
Alexey Samsonov 24558d5520 [Object, ELF] Fix segmentation fault in ELFFile::getSectionName().
Don't do a null dereference if .shstrtab section is missing.

llvm-svn: 239124
2015-06-04 23:40:23 +00:00