Commit Graph

128514 Commits

Author SHA1 Message Date
Quentin Colombet 1bd7504ef3 [MachineRegisterInfo] Add a method to set the size of a virtual register a posteriori.
This is required for mir testing.

llvm-svn: 262861
2016-03-07 21:41:39 +00:00
Amaury Sechet 5984dfe7c7 Small formating change in Core.cpp . NFC
llvm-svn: 262860
2016-03-07 21:39:20 +00:00
Quentin Colombet 70a9670d80 [MachineRegisterInfo] Get rid of the global-isel ifdefs.
One additional pointer is not a big deal size-wise and it makes
the code much nicer!

llvm-svn: 262856
2016-03-07 21:22:09 +00:00
Matt Arsenault 56356c8a9c AMDGPU: Remove a fixme for ptrrtoint handling
llvm-svn: 262854
2016-03-07 21:12:46 +00:00
Matt Arsenault 81d06015c6 AMDGPU: Move function only used by R600
llvm-svn: 262853
2016-03-07 21:10:13 +00:00
Matt Arsenault ceb2c06cbd DAGCombiner: Check legality before creating extract_vector_elt
Problem not hit by any in tree target.

llvm-svn: 262852
2016-03-07 21:10:09 +00:00
Justin Bogner bbab368e13 SelectionDAG: Remove some unused AtomicSDNode constructors. NFC
llvm-svn: 262849
2016-03-07 20:15:12 +00:00
Adam Nemet bb3680bd85 [LoopDataPrefetch] If prefetch distance is not set, skip pass
This lets select sub-targets enable this pass.  The patch implements the
idea from the recent llvm-dev thread:
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/94925

The goal is to enable the LoopDataPrefetch pass for the Cyclone
sub-target only within Aarch64.

Positive and negative tests will be included in an upcoming patch that
enables selective prefetching of large-strided accesses on Cyclone.

llvm-svn: 262844
2016-03-07 18:35:42 +00:00
Marina Yatsina 5f5de9f89b [ms-inline-asm][AVX512] Add ability to use k registers in MS inline asm + fix bag with curly braces
Until now curly braces could only be used in MS inline assembly to mark block start/end.
All curly braces were removed completely at a very early stage.
This approach caused bugs like:
"m{o}v eax, ebx" turned into "mov eax, ebx" without any error.

In addition, AVX-512 added special operands (e.g., k registers), which are also surrounded by curly braces that mark them as such.
Now, we need to keep the curly braces and identify at a later stage if they are marking block start/end (if so, ignore them), or surrounding special AVX-512 operands (if so, parse them as such).

This patch fixes the bug described above and enables the use of AVX-512 special operands.

This commit is the the llvm part of the patch.
The clang part of the review is: http://reviews.llvm.org/D17766
The llvm part of the review is: http://reviews.llvm.org/D17767

Differential Revision: http://reviews.llvm.org/D17767

llvm-svn: 262843
2016-03-07 18:11:16 +00:00
Adam Nemet 4896c7a82a [ScopedNoAliasAA] Make test basic.ll less confusing
Summary:
This testcase had me confused.  It made me believe that you can use
alias scopes and alias scopes list interchangeably with alias.scope and
noalias.  Both langref and the other testcase use scope lists so I went
looking.

Turns out using scope directly only happens to work by chance.  When
ScopedNoAliasAAResult::mayAliasInScopes traverses this as a scope list:

!1 = !{!1, !0, !"some scope"}

, the first entry is in fact a scope but only because the scope is
happened to be defined self-referentially to make it unique globally.

The remaining elements in the tuple (!0, !"some scope") are considered
as scopes but AliasScopeNode::getDomain will just bail on those without
any error.

This change avoids this ambiguity in the test but I've also been
wondering if we should issue some sort of a diagnostics.

Reviewers: dexonsmith, hfinkel

Subscribers: mssimpso, llvm-commits

Differential Revision: http://reviews.llvm.org/D16670

llvm-svn: 262841
2016-03-07 17:49:10 +00:00
Adam Nemet 81113ef68c Revert "Enable LoopLoadElimination by default"
This reverts commit r262250.

It causes SPEC2006/gcc to generate wrong result (166.s) in AArch64 when
running with *ref* data set.  The error happens with
"-Ofast -flto -fuse-ld=gold" or "-O3 -fno-strict-aliasing".

llvm-svn: 262839
2016-03-07 17:38:02 +00:00
Chandler Carruth af8321ecf7 [memdep] Switch to range based for loops.
llvm-svn: 262831
2016-03-07 15:12:57 +00:00
Chandler Carruth 9ca96384f3 [DFSan] Remove an overly aggressive assert reported in PR26068.
This code has been successfully used to bootstrap libc++ in a no-asserts
mode for a very long time, so the code that follows cannot be completely
incorrect. I've added a test that shows the current behavior for this
kind of code with DFSan. If it is desirable for DFSan to do something
special when processing an invoke of a variadic function, it can be
added, but we shouldn't keep an assert that we've been ignoring due to
release builds anyways.

llvm-svn: 262829
2016-03-07 14:05:09 +00:00
Chandler Carruth b32febe48e [memdep] Switch a function to return true on success instead of false.
This is much more clear and less surprising IMO. It also makes things
more consistent with the increasingly large chunk of LLVM code that
assumes true-on-success.

llvm-svn: 262826
2016-03-07 12:45:07 +00:00
Chandler Carruth 40e21f2a20 [memdep] Cleanup the implementation doxygen comments and remove
duplicated comments.

In several cases these had diverged making them especially nice to
canonicalize. I checked to make sure we weren't losing important
information of course.

llvm-svn: 262825
2016-03-07 12:30:06 +00:00
Chandler Carruth 78954164a9 [memdep] Finish cleaning up all of the comments' doxygen.
llvm-svn: 262824
2016-03-07 11:27:56 +00:00
Chandler Carruth 1fac9df95c [memdep] Switch from a hacky use of PointerIntPair and poorly chosen
arbitrary integers cast to Instruction pointers to a sum type over
Instruction * and a PointerEmbeddedInt.

No functionality changed.

Differential Revision: http://reviews.llvm.org/D15845

llvm-svn: 262823
2016-03-07 11:04:46 +00:00
Chandler Carruth 3d79dd9b06 [memdep] Update the comments' doxygen style and place them more clearly.
Just cleaning this up, no functionality changed. Next up will be moving
it to use the sum type instead of arbitrary "pointer"-like enums.

llvm-svn: 262822
2016-03-07 10:35:02 +00:00
Chandler Carruth 60fb1b4bd2 [memdep] Run clang-format over the header before porting it to
the new pass manager.

The port will involve substantial edits here, and would likely introduce
bad formatting if formatted in isolation, so just get all the formatting
up to snuff. I'll also go through and try to freshen the doxygen here as
well as modernizing some of the code.

llvm-svn: 262821
2016-03-07 10:19:30 +00:00
Craig Topper 267bdb2094 [CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC
llvm-svn: 262815
2016-03-07 07:29:12 +00:00
Mehdi Amini b923d641d0 Add a new insert_as() method to DenseMap and use it for ConstantUniqueMap
Just like the existing find_as() method, the new insert_as() accepts
an extra parameter which is used as a key to find the bucket in the
map.
When creating a Constant, we want to check the map before actually
creating the object. In this case we have to perform two queries to
the map, and this extra parameter can save recomputing the hash value
for the second query.

This is a reapply of r260458, that was reverted because it was
suspected to be the cause of instability of an internal bot, but
wasn't confirmed.

Differential Revision: http://reviews.llvm.org/D16268

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 262812
2016-03-07 00:51:00 +00:00
Mehdi Amini 67dfe09da4 Bitcode reader: Inline readAbbreviatedField in readRecord and move the enclosing loop in each case (NFC)
Summary: This make readRecord 20% faster, measured on an LTO build

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17911

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 262811
2016-03-07 00:38:09 +00:00
NAKAMURA Takumi 2de1b320a4 Revert r130657, "Windows/DynamicLibrary.inc: Clean up ELM_Callback. We may check the decl instead of the versions of individual libraries."
We may assume the type of 1st argument as PCSTR in PENUMLOADED_MODULES_CALLBACK. PSTR was in the ancient mingw32.

llvm-svn: 262810
2016-03-07 00:13:09 +00:00
Simon Pilgrim 253ca348b2 [X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target shuffle combining.
Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles.

This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed.

Removed non-constant pool mask decode path as we have no way of testing it right now.

Differential Revision: http://reviews.llvm.org/D17916

llvm-svn: 262809
2016-03-06 21:54:52 +00:00
Valery Pykhtin dc11054f20 [AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler.
Engages code from r262804.

Differential Revision: http://reviews.llvm.org/D17151

llvm-svn: 262808
2016-03-06 20:25:36 +00:00
Valery Pykhtin 50cd3c4ec7 fix sanitizer-ppc64be-linux failure for r262804
error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]

http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/930

llvm-svn: 262805
2016-03-06 15:13:54 +00:00
Valery Pykhtin 499a5c6323 [AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields
Differential Revision: http://reviews.llvm.org/D17150

llvm-svn: 262804
2016-03-06 13:27:13 +00:00
Igor Breger 4d94d4d5f7 AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element types on SKX
Differential Revision: http://reviews.llvm.org/D17913

llvm-svn: 262803
2016-03-06 12:38:58 +00:00
Wilfred Hughes c0531a4a21 Fix typo.
llvm-svn: 262802
2016-03-06 12:37:34 +00:00
Valery Pykhtin 0c6293da68 [AMDGPU] SOPxx instructions operand naming fixed in td files.
dst -> sdst
ssrcN -> srcN

Differential Revision: http://reviews.llvm.org/D17646

llvm-svn: 262801
2016-03-06 10:31:44 +00:00
Craig Topper 581c0087b9 [X86] Use high bits of return value from getEncoding instead of predicate functions to populate the REX and VEX prefix bits that extend register encodings. NFC
llvm-svn: 262800
2016-03-06 08:12:47 +00:00
Craig Topper faab5c68d4 [X86] Remove unnecessary masking. The assert above it already guaranteed it. NFC
llvm-svn: 262799
2016-03-06 08:12:44 +00:00
Craig Topper 5e038cf589 [X86] Use uint8_t instead of unsigned char as it shortens the code and more explicitly reflects the desired size.
llvm-svn: 262798
2016-03-06 08:12:42 +00:00
Igor Breger f1bd761e00 AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use kshiftw to implement VSHRI v4i1 , bits 15-4 is undef so the upper bits of v4i1 may not be zeroed. v4i1 should be zero_extend to v16i1 ( or any natively supported vector).
Differential Revision: http://reviews.llvm.org/D17763

llvm-svn: 262797
2016-03-06 07:46:03 +00:00
Saleem Abdulrasool 11bf1ac297 unitests: add some ARM TargetParser tests
The ARM TargetParser would construct invalid StringRefs.  This would cause
asserts to trigger.  Add some tests in LLVM to ensure that we dont regress on
this in the future.  Although there is a test for this in clang, this ensures
that the changes would get caught in the same repository.

llvm-svn: 262790
2016-03-06 04:50:55 +00:00
Alexander Kornienko 45c9a5beee [docs] Updated docs to work with Doxygen 1.8.11
llvm-svn: 262786
2016-03-06 03:50:08 +00:00
Simon Pilgrim 40e1a71cdd [X86][AVX] Improved VPERMILPS variable shuffle mask decoding.
Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool.

Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization

Followup to D17681

llvm-svn: 262784
2016-03-05 22:53:31 +00:00
Simon Pilgrim aa99331bad [X86] AMD Bobcat CPU (btver1) doesn't support XSAVE
btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't support XSAVE.

Differential Revision: http://reviews.llvm.org/D17683

llvm-svn: 262782
2016-03-05 22:00:50 +00:00
Saleem Abdulrasool 4208381016 Support: catch invalid accesses
It is possible to invoke these methods on an invalid input resulting in an
invalid substring construction.  It seems that we do not have unit tests for
these methods.  Tests to ensure that the invalid call is caught to follow in
clang.

Resolves PR26839.

llvm-svn: 262778
2016-03-05 20:00:44 +00:00
Saleem Abdulrasool fa8c6ed3fa ExecutionEngine: tweak debug log
Add a newline to separate the log message.  NFC.

llvm-svn: 262777
2016-03-05 20:00:41 +00:00
Yaron Keren ce608690e1 Replace GlobalScopeAsm[GlobalScopeAsm.size()-1] with GlobalScopeAsm.back(), NFC.
llvm-svn: 262775
2016-03-05 16:02:09 +00:00
Krzysztof Parzyszek 5c61d11a6d Add DAG mutation interface to the post-RA scheduler
Differential Revision: http://reviews.llvm.org/D17868

llvm-svn: 262774
2016-03-05 15:45:23 +00:00
Chandler Carruth 47dbdd9c31 [aa-eval] Enhance the comments to better describe the overview of why
this pass exists.

This is based on feedback received when moving this comment from the source
file to a new header file.

Differential Revision: http://reviews.llvm.org/D17476

llvm-svn: 262769
2016-03-05 08:20:15 +00:00
Matthias Braun 4797ec95e4 RegisterCoalescer: Remap subregister lanemasks before exchanging operands
Rematerializing and merging into a bigger register class at the same
time, requires the subregister range lanemasks getting remapped to the
new register class.

This fixes http://llvm.org/PR26805

llvm-svn: 262768
2016-03-05 04:36:13 +00:00
Matthias Braun 8de09aa0c5 RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flags
copy coalescing with enabled subregister liveness can reveal undef uses,
previously this was only checked for the SrcReg in updateRegDefsUses()
but we need to check DstReg as well.

llvm-svn: 262767
2016-03-05 04:36:10 +00:00
Matthias Braun 2cbfd9fff5 RegisterPressure: Small cleanup
llvm-svn: 262766
2016-03-05 04:36:08 +00:00
Quentin Colombet 2a7676b442 [X86] Fix the lowering of setjmp intrinsic on i386.
When the lowering of the setjmp intrinsic requires
a global base pointer to be set, make sure such pointer
gets defined by the CGBR pass.

This fixes PR26742.

llvm-svn: 262762
2016-03-05 00:31:04 +00:00
Quentin Colombet fb5be7a37f Add missing triple in my previous commit!
llvm-svn: 262760
2016-03-04 23:36:32 +00:00
Quentin Colombet 13b524597d [X86] Do not use cmpxchgXXb when we need the base pointer (RBX).
cmpxchgXXb uses RBX as one of its implicit argument. I.e., when
we use that instruction we need to clobber RBX. This is generally
fine, expect when RBX is a reserved register because in that case,
the register allocator will not track its value and will not
save and restore it when interferences occur.

rdar://problem/24851412

llvm-svn: 262759
2016-03-04 23:29:39 +00:00
Sanjay Patel 216b275994 [x86] add tests for masked loads with constant masks
llvm-svn: 262758
2016-03-04 23:28:07 +00:00
Mike Aizatsky 243fe2b3a0 [libfuzzer] adding std:string to allowed adaptable argument.
llvm-svn: 262757
2016-03-04 23:18:01 +00:00
David Majnemer 71a1c2c619 Fix build breakage
llvm-svn: 262756
2016-03-04 23:02:15 +00:00
David Majnemer d2f767d2f6 [X86] Support cleaning more than 2**16 bytes of stack
The x86 ret instruction has a 16 bit immediate indicating how many bytes
to pop off of the stack beyond the return address.

There is a problem when extremely large structs are passed by value: we
might not be able to fit the number of bytes to pop into the return
instruction.

To fix this, expand RET_FLAG a little later and use a special sequence
to clean the stack:

pop  %ecx     ; return address is now in %ecx
add  $n, %esp ; clean the stack
push %ecx     ; bring the return address back on the stack
ret           ; pop the return address and jmp to it's value

llvm-svn: 262755
2016-03-04 22:56:17 +00:00
Kostya Serebryany 5c3701c621 [libFuzzer] log less when re-loading files; fix a silly bug: when running single files actually run all of them, not just the first one
llvm-svn: 262754
2016-03-04 22:35:40 +00:00
Philip Reames a0c9f6e736 [LVI] Fix a bug which prevented use of !range metadata within a query
The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further.  The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor.

llvm-svn: 262751
2016-03-04 22:27:39 +00:00
Rong Xu ecdc98fdae [PGO] Add a commandline option to control number of the VP annotation metadata.
llvm-svn: 262750
2016-03-04 22:08:44 +00:00
Michael Kuperstein b89f0fa2a2 [DAGCombine] Fix divrem combine not to assume div/rem type is simple.
The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization 
combine.

This fixes PR26835.

Differential Revision: http://reviews.llvm.org/D17878

llvm-svn: 262746
2016-03-04 21:23:29 +00:00
Teresa Johnson 5d07531d02 Fix new gold test to specify emulation mode.
The thinlto_linkonceresolution.ll gold linker test introduced in r262727
included a target triple, but didn't set the emulation mode, which is
necessary since the default linker target may be different.

Patch by H.J. Lu

llvm-svn: 262745
2016-03-04 21:19:08 +00:00
Dan Gohman e6b81362e9 [WebAssembly] Add another possible code-size optimization to README.txt
llvm-svn: 262740
2016-03-04 20:09:57 +00:00
Renato Golin 175c6d6d95 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Tom Stellard 649b5db557 AMDGPU/SI: Add support for spiling SGPRs to scratch buffer
Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17592

llvm-svn: 262732
2016-03-04 18:31:18 +00:00
Teresa Johnson 3b8f6126ac Fix bot failure from r262721: unintented change in gold-plugin save-temps
The split code gen task ID should not be appended to save-temps output
file when the parallelism factor is 1 (not actually splitting).

llvm-svn: 262731
2016-03-04 18:16:00 +00:00
Sanjoy Das fefc4d50ed [Statepoint docs] Delete trailing whitespace
llvm-svn: 262730
2016-03-04 18:14:09 +00:00
Tom Stellard ebef6f9771 AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserter
Summary:
This allows us to use virtual registers when we need extra registers
for inserting spill instructions in SIRegisterInfo:eliminateFrameIndex().

Once all the frame indices have been eliminated, the
PrologEpilogueInserter does an extra pass over the program to replace
all virtual registers with physical ones.

This allows us to make more efficient use of our emergency spill slots,
so we only need to create one.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17591

llvm-svn: 262728
2016-03-04 18:02:01 +00:00
Teresa Johnson a17f2cd1a3 [ThinLTO] Ensure prevailing linkonce emitted as weak in ThinLTO backends
Summary:
Since IR files are all compiled into separate independent object files
in ThinLTO mode, the prevailing linkonce symbols must be emitted in its
object file even if it is no longer referenced there, e.g. if no
references remain in the module after inlining, since it may be
referenced by another ThinLTO compiled object file. This is done by
changing LDPR_PREVAILING_DEF_IRONLY* symbols to LDPR_PREVAILING_DEF,
which converts the prevailing linkonce to weak. We also don't need the
other prevailing IRONLY handling for internalization, which is not
currently performed for ThinLTO.

Test case included.

Reviewers: davidxl, rafael

Subscribers: rafael, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16173

llvm-svn: 262727
2016-03-04 17:48:35 +00:00
Krzysztof Parzyszek 51155fc0d1 [Hexagon] Fix lowering of calls with the return type of i1
This fixes an assertion in test/CodeGen/Hexagon/ifcvt-edge-weight.ll
when run with -debug-only=isel

llvm-svn: 262726
2016-03-04 17:38:05 +00:00
Zoran Jovanovic a68b67d1ed [mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated.
Author: milena.vujosevic.janicic
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D17373

llvm-svn: 262725
2016-03-04 17:34:31 +00:00
Teresa Johnson 7cffaf3ad0 [ThinLTO] Launch importing backends in parallel threads from gold plugin
Summary:
Launch ThinLTO backends (LTO and codegen pipelines with importing) in
parallel using a ThreadPool, after creating the combined index.
The number of threads is controlled by the existing -jobs gold plugin
option, or the hardware concurrency if not specified.

The old behavior of exiting after creating the combined index can be
invoked via a new thinlto-index-only plugin option.

This commit involves just the ThinLTO-specific pieces of D15390, the NFC
and other restructuring pieces were committed independently:
  r262677: Add hardware_concurrency interface to llvm::thread (NFC)
  r262719: Change split code gen to use ThreadPool
  r262721: Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC)

Reviewers: pcc, joker.eph, rafael

Subscribers: rafael, davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D15390

llvm-svn: 262724
2016-03-04 17:06:02 +00:00
Teresa Johnson a9f65554b0 Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC)
This is the NFC part remaining from D15390, which refactors the
current codegen() into a CodeGen class with various modular methods and
other helper functions that will be used by the follow-on ThinLTO piece.

llvm-svn: 262721
2016-03-04 16:36:06 +00:00
Teresa Johnson d84c7decb6 Change split code gen to use ThreadPool
Part of D15390.

llvm-svn: 262719
2016-03-04 15:39:13 +00:00
Simon Pilgrim 3c7e94208a [X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests
None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining

llvm-svn: 262718
2016-03-04 15:19:42 +00:00
Sam Kolton f51f4b8370 Test commit access
llvm-svn: 262714
2016-03-04 12:29:14 +00:00
Simon Pilgrim b4b90fb8d6 [X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw)
llvm-svn: 262710
2016-03-04 11:15:23 +00:00
Valery Pykhtin 824e804bf6 test commit
llvm-svn: 262709
2016-03-04 10:59:50 +00:00
Benjamin Kramer 4dbf3371bb Make headers self-contained again.
llvm-svn: 262702
2016-03-04 10:49:30 +00:00
Nikolay Haustov 5bf46ac150 AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics
These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the
GL_ARB_shader_image_load_store extension.

Initial change by Nicolai H.hnle

Differential Revision: http://reviews.llvm.org/D17401

llvm-svn: 262701
2016-03-04 10:39:50 +00:00
Justin Bogner 85ddad485c Annotate our undefined behaviour to sneak it past the sanitizers
We have known UB in some ilists where we static cast half nodes to
(larger) derived types and use the address. See llvm.org/PR26753.

This needs to be fixed, but in the meantime it'd be nice if running
ubsan didn't complain. This adds annotations in the two places where
ubsan complains while running check-all of a sanitized clang build.

llvm-svn: 262683
2016-03-04 01:52:47 +00:00
Easwaran Raman 588c68a87b Fix a memory leak.
llvm-svn: 262682
2016-03-04 01:18:40 +00:00
Justin Bogner 87feb4e64e CodeGen: Tune the SmallVector size in LiveRange
The vast majority of LiveRanges (ie, 4/5) have exactly 1 segment and 1
value number, and a good chunk of the rest have 2 of each, so
allocating space for 4 is wasteful. This is especially noticeable when
dealing with a very large number of vregs, and I have an internal case
where dropping this to 2 shaves over 5% off of peak memory when
compiling a particularly large function.

llvm-svn: 262681
2016-03-04 00:58:39 +00:00
Easwaran Raman 3b7a8246c9 Fix a use-after-free bug introduced in r262636
llvm-svn: 262679
2016-03-04 00:44:01 +00:00
Teresa Johnson a3135be77d Add hardware_concurrency interface to llvm::thread (NFC)
Part of D15390.

llvm-svn: 262677
2016-03-04 00:25:54 +00:00
Evgeniy Stepanov 330c5a60c7 [gold] Handle modules that are not included in the link.
Gold has a newly added LDPT_GET_SYMBOLS_V3 callback that can
distinguish between a module that is not included in the link, and
one that is included but has its entire interface preempted by others.

Fixes PR26674.

llvm-svn: 262676
2016-03-04 00:23:29 +00:00
Easwaran Raman 75c21a9428 Fix memory leak in tests.
llvm-svn: 262674
2016-03-03 23:55:41 +00:00
Mike Aizatsky b8627a89a6 [libfuzzer] arbitrary function adapter.
The adapter automates converting sequence of bytes into arbitrary
arguments.

Differential Revision: http://reviews.llvm.org/D17829

llvm-svn: 262673
2016-03-03 23:45:29 +00:00
Philip Reames 2e7383cc1e [docs] Add a description of current problem areas to the statepoint docs
Triggered by a question on llvm-dev about status

llvm-svn: 262671
2016-03-03 23:24:44 +00:00
Guozhi Wei 92e9d0e80e [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 262670
2016-03-03 23:21:38 +00:00
NAKAMURA Takumi f2b521ffc5 llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify explicit triple.
We will see it for targeting win32;

  LLVM ERROR: CPU: 'generic' does not support ARM mode execution!

llvm-svn: 262668
2016-03-03 22:38:39 +00:00
Kostya Serebryany e483ed2825 [libFuzzer] when interrupted, call _Exit() instead of exit()
llvm-svn: 262667
2016-03-03 22:36:37 +00:00
Simon Pilgrim f33cb61471 [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.
PSHUFB decoder was assuming that input was 128 or 256-bit vector only.

llvm-svn: 262661
2016-03-03 21:55:01 +00:00
Lang Hames 3b514554a2 [RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess.
The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a
linker-mangled symbol name, but it calls through to dlsym to do the lookup (via
DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled
symbol name.

Historically we've attempted to "demangle" by removing leading '_'s on all
platforms, and fallen back to an extra search if that failed. That's broken, as
it can cause symbols to resolve incorrectly on platforms that don't do mangling
if you query '_foo' and the process also happens to contain a 'foo'.

Fix this by demangling conditionally based on the host platform. That's safe
here because this function is specifically for symbols in the host process, so
the usual cross-process JIT looking concerns don't apply.

M    unittests/ExecutionEngine/ExecutionEngineTest.cpp
M    lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp

llvm-svn: 262657
2016-03-03 21:23:15 +00:00
Philip Reames b7270446cf [ValueTracking] "constant fold" an experimental hidden option
llvm-svn: 262648
2016-03-03 19:50:32 +00:00
Philip Reames 146307eb52 [ValueTracking] Remove dead code from an old experiment
This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.  

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.  

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.  

llvm-svn: 262646
2016-03-03 19:44:06 +00:00
Sanjay Patel 9bba75084b [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)
Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.

The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
  %bc = bitcast <4 x i32> %not to <2 x i64>
  %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
  %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
  ret <4 x i32> %bc2
}

Simplifies to the expected:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  ret <4 x i32> %lobit
}

Differential Revision: http://reviews.llvm.org/D17583

llvm-svn: 262645
2016-03-03 19:19:04 +00:00
Easwaran Raman fd6557e368 Fix breakage caused by r262636.
Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused))

llvm-svn: 262643
2016-03-03 18:53:20 +00:00
Sanjoy Das 3928910fe6 [ConstantRange] Rename test; NFC
llvm-svn: 262640
2016-03-03 18:31:33 +00:00
Sanjoy Das 724f5cf278 [SCEV] Prove no-overflow via constant ranges
Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.

llvm-svn: 262639
2016-03-03 18:31:29 +00:00
Sanjoy Das 11ef606f1d [SCEV] Be less eager about demoting zexts to sexts
After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges).  This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension.  This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

llvm-svn: 262638
2016-03-03 18:31:23 +00:00
Sanjoy Das f3867e64a8 [ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on ranges
This will be used in a later patch to ScalarEvolution.  Right now only
the unit tests exercise the newly added code.

llvm-svn: 262637
2016-03-03 18:31:16 +00:00
Easwaran Raman 3035719c86 Infrastructure for PGO enhancements in inliner
This patch provides the following infrastructure for PGO enhancements in inliner:

Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.

Differential Revision: http://reviews.llvm.org/D16381

llvm-svn: 262636
2016-03-03 18:26:33 +00:00
Simon Pilgrim abcee45b7a [X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS
The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic.

This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle.

Differential Revision: http://reviews.llvm.org/D17681

llvm-svn: 262635
2016-03-03 18:13:53 +00:00
Dehao Chen 57d1dda558 Use LineLocation instead of CallsiteLocation to index callsite profile.
Summary: With discriminator, LineLocation can uniquely identify a callsite without the need to specifying callee name. Remove Callee function name from the key, and put it in the value (FunctionSamples).

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17827

llvm-svn: 262634
2016-03-03 18:09:32 +00:00
Simon Pilgrim 022afe2538 [X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.
lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway.

llvm-svn: 262633
2016-03-03 17:54:35 +00:00
Simon Pilgrim 0107d24810 [X86] Pulled out repeated code testing for constant vector shift amount. NFCI.
llvm-svn: 262631
2016-03-03 17:35:43 +00:00
Amjad Aboud 0ce261d052 MCU target has its own ABI, however X86 interrupt handler calling convention overrides this ABI.
Fixed the ordering to check first for X86 interrupt handler then for MCU target.

Differential Revision: http://reviews.llvm.org/D17801

llvm-svn: 262628
2016-03-03 17:17:54 +00:00
Ahmed Bougacha 671795a985 [X86] Don't assume that shuffle non-mask operands starts at #0.
That's not the case for VPERMV/VPERMV3, which cover all possible
combinations (the C intrinsics use a different order; the AVX vs
AVX512 intrinsics are different still).

Since:
  r246981 AVX-512: Lowering for 512-bit vector shuffles.
VPERMV is recognized in getTargetShuffleMask.

This breaks assumptions in most callers, as they expect
the non-mask operands to start at index 0.
VPERMV has the mask as operand #0; VPERMV3 has it in the middle.

Instead of the faulty assumption, have getTargetShuffleMask return
its operands as well.

One alternative we considered was to change the operand order of
VPERMV, but we agreed to stick to the instruction order, as there
are more AVX512 weirdness to cover (vpermt2/vpermi2 in particular).

Differential Revision: http://reviews.llvm.org/D17041

llvm-svn: 262627
2016-03-03 16:53:50 +00:00
Matthew Simpson b840a6d6f4 [LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

llvm-svn: 262624
2016-03-03 16:12:01 +00:00
Sanjay Patel d6cb4ec2a2 [AArch64] fold 'isPositive' vector integer operations (PR26819)
This is one of the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26819

Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that
in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might
come from autovectorization even if it's unlikely to be produced from C NEON intrinsics.

The patch is based on the x86 equivalent:
http://reviews.llvm.org/rL262036

Differential Revision: http://reviews.llvm.org/D17834

llvm-svn: 262623
2016-03-03 15:56:08 +00:00
Igor Breger 639fde79b0 AVX512: Combine AND + TESTM instructions .
Differential Revision: http://reviews.llvm.org/D17844

llvm-svn: 262621
2016-03-03 14:18:38 +00:00
Renato Golin f824ced6a1 Making rem_crash.ll target-specific
This test failed in some ARM bots after a divmod change because it was
running on a native llc, instead of targeted one. This makes sure the test
is target-specific (as intended), and also copies to ARM and AArch64
directories. If it is also supposed to work on other architectures, I'll
leave as an exercise to the respective maintainers.

llvm-svn: 262620
2016-03-03 14:01:10 +00:00
Dylan McKay 4fd0d4af86 [AVR] Add calling convention parser tokens
Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser.

Reviewers: arsenm

Subscribers: dylanmckay, llvm-commits

Differential Revision: http://reviews.llvm.org/D16348

llvm-svn: 262600
2016-03-03 10:08:02 +00:00
Simon Pilgrim 91dd0a796c [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 262599
2016-03-03 09:43:28 +00:00
Renato Golin 3d78271eac Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

llvm-svn: 262594
2016-03-03 08:57:44 +00:00
Michael Zuckerman c4d054fa4a [LLVM][AVX512] PSRLWI Chnage imm8 to int
Differential Revision: http://reviews.llvm.org/D17753

llvm-svn: 262592
2016-03-03 08:54:05 +00:00
Matt Arsenault 5ba9718abe TTI: Fix not using overload of getIntrinsicInstrCost
This was always calling the generic version, so the target
custom implementation was never called.

llvm-svn: 262585
2016-03-03 05:43:49 +00:00
Junmo Park 6ba96fb431 [BranchFolding] Change function name related with merging MMOs. NFC
Summary:
Removing MMOs is not our prefer behavior any more.

Reviewers: mcrosier, reames
   
Differential Revision: http://reviews.llvm.org/D17668

llvm-svn: 262580
2016-03-03 03:57:20 +00:00
Tom Stellard cc7067a668 AMDGPU: Insert two S_NOP instructions for every high level source statement.
Patch by: Konstantin Zhuravlyov

Summary: Tools, such as debugger, need to pause execution based on user input (i.e. breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before first isa instruction of high level source statement, and one after last isa instruction of high level source statement. Further, debugger may replace S_NOP instructions with S_TRAP instructions based on user input.

Reviewers: tstellarAMD, arsenm

Subscribers: echristo, dblaikie, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17454

llvm-svn: 262579
2016-03-03 03:53:29 +00:00
Tom Stellard 600ca6fd39 AMDGPU/SI: Don't try to move scratch wave offset when there are no free SGPRs
Summary:
When there were no free SGPRs, we were trying to move this value into
some of the reserved registers which was causing a segmentation fault.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17590

llvm-svn: 262577
2016-03-03 03:45:09 +00:00
Hans Wennborg 153e4b0f11 [X86] Enable forwarding bool arguments in tail calls (PR26305)
The code was previously not able to track a boolean argument
at a call site back to the formal argument of the caller.

Differential Revision: http://reviews.llvm.org/D17786

llvm-svn: 262575
2016-03-03 02:06:32 +00:00
Tim Shen 6e676a84ad [PPCVSXFMAMutate] Temporarily disable this pass
llvm-svn: 262573
2016-03-03 01:27:35 +00:00
Philip Reames ae27b2380f [MBP] Renaming a confusing variable and add clarifying comments
Was discussed as part of http://reviews.llvm.org/D17830

llvm-svn: 262571
2016-03-03 00:58:43 +00:00
Jacques Pienaar 2a0641434a [lanai] Fixing file path used in test
llvm-svn: 262567
2016-03-03 00:30:02 +00:00
Matthias Braun 0f521c5430 TargetSchedule: Allow explicit Unsupported markers in InstRW
llvm-svn: 262549
2016-03-03 00:05:07 +00:00
Matthias Braun 42d9ad9c5b TableGen: Accept itinerary data when checking for schedmodel completeness
llvm-svn: 262548
2016-03-03 00:04:59 +00:00
Philip Reames 23d933982a [MBP] Avoid placing random blocks between loop preheader and header
If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code.

The particular reason for the rejection of the loop chain was that we were scanning predecessor of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge precessor was already part of the existing loop chain (oops!.

Differential Revision: http://reviews.llvm.org/D17830

llvm-svn: 262547
2016-03-03 00:01:42 +00:00
David Majnemer 1ef654024f [X86] Don't give catch objects a displacement of zero
Catch objects with a displacement of zero do not initialize a catch
object.  The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.

If we place an object at the top-of-stack, we will end up wit a
displacement of zero resulting in our catch object remaining
uninitialized.

Address this by creating our catch objects as fixed objects.  We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.

Differential Revision: http://reviews.llvm.org/D17823

llvm-svn: 262546
2016-03-03 00:01:25 +00:00
Sanjay Patel 840564973f [AArch64] add tests to demonstrate existing codegen for PR26819
llvm-svn: 262540
2016-03-02 23:22:03 +00:00
Matt Arsenault 8226fc4829 AMDGPU: Simplify boolean conditional return statements
Patch by Richard Thomson

llvm-svn: 262536
2016-03-02 23:00:21 +00:00
Philip Reames 02e1132afb [MBP] Remove overly verbose debug output
llvm-svn: 262531
2016-03-02 22:40:51 +00:00
Amaury Sechet 3b8b2ea2e1 Explode store of arrays in instcombine
Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a serie of stores for each element, allowing them to be optimized.

Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17828

llvm-svn: 262530
2016-03-02 22:36:45 +00:00
Davide Italiano 98be3f2be2 [llvm-nm] Restore the previous behaviour (pre r262525).
It broke some buildbots.

Pointy-hat to:  me

llvm-svn: 262529
2016-03-02 22:33:49 +00:00
Davide Italiano f156763cfa [llvm-nm] Fix rendering of -s grouping with all the othe options.
llvm-svn: 262525
2016-03-02 21:59:31 +00:00
Philip Reames b9688f4382 [MBP] Adjust debug output to be more focused and approachable
llvm-svn: 262522
2016-03-02 21:45:13 +00:00
Amaury Sechet 7cd3fe7db6 Unpack array of all sizes in InstCombine
Summary: This is another step toward improving fca support. This unpack load of array in a series of load to array's elements.

Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15890

llvm-svn: 262521
2016-03-02 21:28:30 +00:00
Daniel Berlin 6412002d24 Really fix ASAN leak/etc issues with MemorySSA unittests
llvm-svn: 262519
2016-03-02 21:16:28 +00:00
Kostya Serebryany 4394b31e1d [libFuzzer] add -Werror for libFuzzer build rule
llvm-svn: 262517
2016-03-02 21:08:16 +00:00
Daniel Berlin 989e601b26 Revert "Fix ASAN detected errors in code and test" (it was not meant to be committed yet)
This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95.

llvm-svn: 262512
2016-03-02 20:36:22 +00:00
Daniel Berlin 27ed1c2eb0 Fix ASAN detected errors in code and test
llvm-svn: 262511
2016-03-02 20:27:29 +00:00
Bob Wilson 9ab86aabba Add another test for the GlobalOpt change in r212079.
This is a test that Akira Hatanaka wrote to test GlobalOpt's handling of
aliases with GEP operands. David Majnemer independently made the same
change to GlobalOpt in r212079. Akira's test is a useful addition, so I'm
pulling it over from the llvm repo for Swift on GitHub.

llvm-svn: 262510
2016-03-02 20:02:25 +00:00
Kostya Serebryany 721f61a00e [libFuzzer] more trophies
llvm-svn: 262509
2016-03-02 19:45:10 +00:00
Renato Golin 93e42d9934 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262507
2016-03-02 19:35:45 +00:00
Reid Kleckner 65f9d9cd32 Revert "[X86] Elide references to _chkstk for dynamic allocas"
This reverts commit r262370.

It turns out there is code out there that does sequences of allocas
greater than 4K: http://crbug.com/591404

The goal of this change was to improve the code size of inalloca call
sequences, but we got tangled up in the mess of dynamic allocas.
Instead, we should come back later with a separate MI pass that uses
dominance to optimize the full sequence. This should also be able to
remove the often unneeded stacksave/stackrestore pairs around the call.

llvm-svn: 262505
2016-03-02 19:20:59 +00:00
Matthias Braun f290912d22 ARM: Introduce conservative load/store optimization mode
Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which
means that unaligned loads/stores do not trap and even extensive testing
will not catch these bugs. However the multi/double variants are not
affected by this bit and will still trap. In effect a more aggressive
load/store optimization will break existing (bad) code.

These bugs do not necessarily manifest in the broken code where the
misaligned pointer is formed but often later in perfectly legal code
where it is accessed. This means recompiling system libraries (which
have no alignment bugs) with a newer compiler will break existing
applications (with alignment bugs) that worked before.

So (under protest) I implemented this safe mode which limits the
formation of multi/double operations to cases that are not affected by
user code (stack operations like spills/reloads) or cases where the
normal operations trap anyway (floating point load/stores). It is
disabled by default.

Differential Revision: http://reviews.llvm.org/D17015

llvm-svn: 262504
2016-03-02 19:20:00 +00:00
Justin Bogner b2ecee9c31 SelectionDAG: Use correctly sized allocation functions for SDNodes
The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.

Since this particular allocator always overallocates it more or less
worked, but would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator so something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.

Part of llvm.org/PR26808.

llvm-svn: 262500
2016-03-02 19:01:11 +00:00
Geoff Berry 62c1a1e7c7 [AArch64] Enable non-leaf frame pointer elimination.
Summary:
This change enables frame pointer elimination in non-leaf functions.
The -fomit-frame-pointer option still needs to be used when compiling
via clang (or an equivalent method of not setting the
'no-frame-pointer-elim*' function attributes if generating llvm IR via
some other method) to take advantage of this optimization.

This change should be NFC when compiling via clang without
-fomit-frame-pointer.

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines

Differential Revision: http://reviews.llvm.org/D17730

llvm-svn: 262495
2016-03-02 17:58:31 +00:00
Chris Bieneman 7d942d73b8 [CMake] Add test-depends target to build dependencies of check-all
This is just another convenience target for bots to use. It enables isolation of building and testing.

llvm-svn: 262494
2016-03-02 17:56:30 +00:00
Reid Kleckner 9fac19f02e [cmake] Check the compiler version first
Otherwise users get messages from CheckAtomic about missing libatomic
instead of a sensible message that says "use GCC 4.7 or newer".

I structured the change along the lines of HandleLLVMStdlib.cmake, so
that the standalone build of Clang still gets the compiler version
check.

Reviewers: beanz

Differential Revision: http://reviews.llvm.org/D17789

llvm-svn: 262491
2016-03-02 16:42:56 +00:00
Chandler Carruth 12884f7f80 [AA] Hoist the logic to reformulate various AA queries in terms of other
parts of the AA interface out of the base class of every single AA
result object.

Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.

When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.

To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).

However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.

The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.

It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.

However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.

Differential Revision: http://reviews.llvm.org/D17329

llvm-svn: 262490
2016-03-02 15:56:53 +00:00
Simon Pilgrim 537907fd32 [X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from one of the inputs of a binary shuffle (punpcklbw)
llvm-svn: 262486
2016-03-02 14:16:50 +00:00
Michael Zuckerman 927fdaee88 [LLVM][AVX512]PSRAWI Change imm8 to int.
Differential Revision: http://reviews.llvm.org/D17705

llvm-svn: 262480
2016-03-02 12:05:07 +00:00
Simon Pilgrim c02b72627a [X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms
We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of.

This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well.

It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts.

Differential Revision: http://reviews.llvm.org/D17680

llvm-svn: 262478
2016-03-02 11:43:05 +00:00
Nikolay Haustov f2fbabe9c1 Revert "[AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields"
Build failure with clang.

llvm-svn: 262477
2016-03-02 11:16:56 +00:00
Nikolay Haustov f0f24628cb Revert "[AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler."
Build failure with clang.

llvm-svn: 262475
2016-03-02 10:54:21 +00:00
Nikolay Haustov 73447a9714 [AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler.
complementary patch to table-driven amd_kernel_code_t field parser/printer utility. lit tests passed.

Patch by: Valery Pykhtin

Differential Revision: http://reviews.llvm.org/D17151

llvm-svn: 262474
2016-03-02 10:36:30 +00:00
Nikolay Haustov 6c8c74969a [AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields
This is going to be used in .hsatext disassembler and can be used
in current assembler parser (lit tests passed on parsing).
Code using this helpers isn't included in this patch.

Benefits:

unified approach
fast field name lookup on parsing
Later I would like to enhance some of the field naming/syntax using this code.

Patch by: Valery Pykhtin

Differential Revision: http://reviews.llvm.org/D17150

llvm-svn: 262473
2016-03-02 10:36:25 +00:00
Dmitry Vyukov 2eed1218e5 libfuzzer: fix compiler warnings
- unused sigaction/setitimer result (used in assert)
- unchecked fscanf return value
- signed/unsigned comparison

llvm-svn: 262472
2016-03-02 09:54:40 +00:00
Craig Topper 1d3f4aefd6 [X86] Remove unnecessary call to isReg from emitter's DestMem handling for VEX prefix. The operand is always a register. NFC
llvm-svn: 262468
2016-03-02 07:32:45 +00:00
Craig Topper 6a7cd42213 [X86] Make X86MCCodeEmitter::DetermineREXPrefix locate operands more like how VEX prefix handling does.
llvm-svn: 262467
2016-03-02 07:32:43 +00:00
David Majnemer 5aadde1ecc [X86] Permit reading of the FLAGS register without it being previously defined
We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not be desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.

Differential Revision: http://reviews.llvm.org/D17782

llvm-svn: 262465
2016-03-02 06:46:52 +00:00
Craig Topper d4dabb3939 [X86] Remove assertion I accidentally left in.
llvm-svn: 262464
2016-03-02 06:35:22 +00:00
Craig Topper a267431fa6 [X86] Be more structured about how we capture the register number when it is encoded in bits 7:4 of the immediate.
For some instructions the register is not the last operand and the immediate handling had to detect this and hardcode the index to find it. It also required CurOp to be pointing at the last operand handled in the Form switch whereas for any instruction it would be pointing at the next operand.

Now we just capture the value in the Form switch when we know exactly where it is and the CurOp pointer can behave normally.

llvm-svn: 262462
2016-03-02 06:06:18 +00:00
Sanjoy Das dcd3a88e29 [SCEV] Minor naming, braces cleanup; NFC
llvm-svn: 262459
2016-03-02 04:52:22 +00:00
Craig Topper cf65c62737 [X86] Use MCPhysReg and uint16_t for static arrays of registers and opcodes respectively should reduce size tiny bit. NFC
llvm-svn: 262458
2016-03-02 04:42:31 +00:00
Matt Arsenault f2dcb4737b AMDGPU: Fix bug 26659.
Fix checking the same instruction twice instead of the
second branch that uses vccz. I don't think this matters
currently because s_branch_vccnz is always used currently.

llvm-svn: 262457
2016-03-02 04:12:39 +00:00
Matt Arsenault a266bd8760 AMDGPU: Cleanup suggested in bug 23960
llvm-svn: 262456
2016-03-02 04:05:14 +00:00
Matt Arsenault 5de68cbc4c Bug 20810: Use report_fatal_error instead of unreachable
llvm-svn: 262455
2016-03-02 03:33:55 +00:00
Sanjoy Das 6b017a11ba Add a comment with a rational for the unusual code structure
llvm-svn: 262454
2016-03-02 02:56:29 +00:00
Sanjoy Das eca1b53b95 Qualify getRangeForAffineAR with this-> for MSVC
llvm-svn: 262453
2016-03-02 02:44:08 +00:00
George Burgess IV e0e6e48b29 Attempt to fix ASAN failure in a MemorySSA test.
llvm-svn: 262452
2016-03-02 02:35:04 +00:00
Sanjoy Das 1168f93c2b Perturb code in an attempt to appease MSVC
For some reason MSVC seems to think I'm calling getConstant() from a
static context.  Try to avoid this issue by explicitly specifying
'this->' (though I'm not confident that this will actually work).

llvm-svn: 262451
2016-03-02 02:34:20 +00:00
Sanjoy Das 62a1c33929 More code permutation to appease MSVC
llvm-svn: 262449
2016-03-02 02:15:42 +00:00
Sanjoy Das 9e5ebf145c Remove "auto" to appease the MSVC bots
llvm-svn: 262448
2016-03-02 01:59:37 +00:00
Matt Arsenault 7d0a77b979 DAGCombiner: Make sure an integer is being truncated
llvm-svn: 262446
2016-03-02 01:36:51 +00:00
Sanjay Patel 5e4c46de6d revert r262424 because there's a *clang test* for AArch64 that checks -O3 asm output
that is broken by this change

llvm-svn: 262440
2016-03-02 01:04:09 +00:00
Daniel Berlin ab3aefeaa5 Fix SHARED_LIBS build
llvm-svn: 262439
2016-03-02 00:58:48 +00:00
Sanjoy Das bf73098472 [SCEV] Make getRange smarter around selects
Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q})
instead.

The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).

llvm-svn: 262438
2016-03-02 00:57:54 +00:00
Sanjoy Das b765b633cb [SCEV] Extract out a getRangeForAffineAR; NFC
Pure code-motion change.  Will be used later in making getRange more clever.

llvm-svn: 262437
2016-03-02 00:57:39 +00:00
Chris Bieneman 2d5e077165 [CMake] Add convenience target llvm-test-depends to build test dependencies.
This is useful when paired with the distribution targets to build prerequisites for running tests.

llvm-svn: 262428
2016-03-02 00:27:14 +00:00
Chris Bieneman 2b12f6765b [CMake] Add distribution target that is the "just-build" side of install-distribution
This is just a convenience target to allow limiting what you build.

llvm-svn: 262427
2016-03-02 00:27:12 +00:00
Sanjay Patel 147e927957 [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701)
As noted in the code comment, I don't think we can do the same transform that we do for
*scalar* integers comparisons to *vector* integers comparisons because it might pessimize
the general case. 

Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.

But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
 

llvm-svn: 262424
2016-03-01 23:55:18 +00:00
Dehao Chen 1012be120a Perform InstructioinCombiningPass before SampleProfile pass.
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17742

llvm-svn: 262419
2016-03-01 22:53:02 +00:00
Kostya Serebryany 3d95dd9149 [libFuzzer] deprecate exit_on_first flag
llvm-svn: 262417
2016-03-01 22:33:14 +00:00
David Blaikie f72dbc101c llvm-dwp: Add missing copyright notice to llvm-dwp.cpp
Addressing feedback on IRC by Sean Silva.

llvm-svn: 262416
2016-03-01 22:29:00 +00:00
Kostya Serebryany 228d5b1ce4 [libFuzzer] add generic signal handlers so that libFuzzer can report at least something if ASan is not handlig the signals for us. Remove abort_on_timeout flag.
llvm-svn: 262415
2016-03-01 22:19:21 +00:00
Simon Pilgrim a3ad9cd793 [X86][SSE41] Added missing fast-isel intrinsics tests
Match IR generated in clang/test/CodeGen/sse41-builtins.c

llvm-svn: 262412
2016-03-01 22:05:05 +00:00
Colin LeMahieu 2d497a0078 [NFC] Convert tabs to spaces.
llvm-svn: 262411
2016-03-01 22:05:03 +00:00
Simon Pilgrim 9c03d1313c [X86][XOP] Regenerated intrinsics tests
llvm-svn: 262410
2016-03-01 21:58:50 +00:00
Matthias Braun 37d884fdf4 AArch64: Reenable CompleteModel for A53, A57 and Kryo models
The fixes in r262393 completed them as well.

llvm-svn: 262408
2016-03-01 21:55:35 +00:00
Simon Pilgrim b1f5c62d5f [X86][AVX2] Regenerated 256-bit vector / 64-bit element permute tests
llvm-svn: 262406
2016-03-01 21:53:12 +00:00
Tim Northover d59b23a5ae Fix typo. NFC.
llvm-svn: 262405
2016-03-01 21:45:22 +00:00
Simon Pilgrim 89244ba84a [X86][AVX2] Regenerated horizontal add/sub tests
llvm-svn: 262403
2016-03-01 21:43:55 +00:00
Simon Pilgrim 46e57fd073 [X86][AVX2] Regenerated intrinsics tests
llvm-svn: 262401
2016-03-01 21:38:41 +00:00
Colin LeMahieu 5cb6eea664 [Hexagon] Modifying r262258 to only be in effect in the hand assembler path, not the integrated assembler.
llvm-svn: 262400
2016-03-01 21:37:41 +00:00
Matthias Braun a939bd07d1 TableGen: Display helpfull message for incomplete models.
llvm-svn: 262399
2016-03-01 21:36:12 +00:00
Simon Pilgrim f2f5626b84 [X86][AVX] Fixed triple/arch clash in test case
We were specifying a x64 triple and then overriding with a x86 arch. 

llvm-svn: 262398
2016-03-01 21:33:08 +00:00
Matt Arsenault b36d462fac DAGCombiner: Turn truncate of a bitcasted vector to an extract
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.

This fixes some test failures in a future commit when i64 loads
are changed to promote.

llvm-svn: 262397
2016-03-01 21:31:53 +00:00
Rafael Espindola 889e45601e Add LLVMBuild for ObjectYAML.
Should fix the DBUILD_SHARED_LIBS bots.

llvm-svn: 262396
2016-03-01 21:29:33 +00:00
David Blaikie ddb27369a0 Revert "llvm-dwp: Keep ObjectFiles alive until object emission their contents can be referenced directly rather than copied"
Accidentally committed.

This reverts commit r262389.

llvm-svn: 262395
2016-03-01 21:24:04 +00:00
Jacques Pienaar ea9f25a740 [lanai] Add ELF enum value and relocations.
Add ELF enum value and relocations for Lanai backed.

General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html).

Differential Revision: http://reviews.llvm.org/D17008

llvm-svn: 262394
2016-03-01 21:21:42 +00:00
Matthias Braun a6cfb6f682 AArch64: Add missing schedinfo, check completeness for cyclone
This adds some missing generic schedule info definitions, enables
completeness checking for cyclone and fixes a typo uncovered by that.

Differential Revision: http://reviews.llvm.org/D17748

llvm-svn: 262393
2016-03-01 21:20:31 +00:00
Kit Barton e725669483 [Power9] Implement new vector compare, extract, insert instructions
This change implements the following vector operations:

  - Vector Compare Not Equal
    - vcmpneb(.) vcmpneh(.) vcmpnew(.)
    - vcmpnezb(.) vcmpnezh(.) vcmpnezw(.)
  - Vector Extract Unsigned
    - vextractub vextractuh vextractuw vextractd
    - vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx
  - Vector Insert
    - vinsertb vinserth vinsertw vinsertd

26 instructions.

Phabricator: http://reviews.llvm.org/D15916
llvm-svn: 262392
2016-03-01 20:51:57 +00:00