Commit Graph

88146 Commits

Author SHA1 Message Date
Sanjoy Das f13900f8ac [IRCE] Reflow comments; NFC
llvm-svn: 262988
2016-03-09 02:34:15 +00:00
Dan Gohman d7a2eea619 [WebAssembly] Implement irreducible control flow.
This implements a very simple conservative transformation that doesn't
require more than linear code size growth. There's room for much more
optimization in this space.

llvm-svn: 262982
2016-03-09 02:01:14 +00:00
Sanjoy Das 97d19bd95f [SCEV] Slightly generalize getRangeViaFactoring
Building on the previous change, this generalizes
ScalarEvolution::getRangeViaFactoring to work with
{Ext(C?A:B)+k0,+,Ext(C?A:B)+k1} where Ext can be a zero extend, sign
extend or truncate operation, and k0 and k1 are constants.

llvm-svn: 262979
2016-03-09 01:51:02 +00:00
Sanjoy Das d3488c6060 [SCEV] Slightly generalize getRangeViaFactoring
This change generalizes ScalarEvolution::getRangeViaFactoring to work
with {Ext(C?A:B),+,Ext(C?A:B)} where Ext can be a zero extend, sign
extend or truncate operation.

llvm-svn: 262978
2016-03-09 01:50:57 +00:00
Mehdi Amini 7c4a1a8d48 libLTO: add a ThinLTOCodeGenerator on the model of LTOCodeGenerator.
This is intended to provide a parallel (threaded) ThinLTO scheme
for linker plugin use through the libLTO C API.

The intent of this patch is to provide a first implementation as a
proof-of-concept and allows linker to start supporting ThinLTO by
definiing the libLTO C API. Some part of the libLTO API are left
unimplemented yet. Following patches will add support for these.

The current implementation can link all clang/llvm binaries.

Differential Revision: http://reviews.llvm.org/D17066

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 262977
2016-03-09 01:37:22 +00:00
Mehdi Amini bd04e8fed6 FunctionIndex is not optional for renameModuleForThinLTO(), make it a reference (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 262976
2016-03-09 01:37:14 +00:00
Sanjay Patel b8d071bc8a use range-based for loop; NFCI
llvm-svn: 262956
2016-03-08 20:53:48 +00:00
Sanjay Patel f831fdb56a fix variable name; NFC
llvm-svn: 262953
2016-03-08 19:07:42 +00:00
Sanjay Patel 5c96723622 use range-based loop; NFCI
llvm-svn: 262952
2016-03-08 19:06:12 +00:00
Quentin Colombet 4340b55593 Revert r262759 and r262760.
The fix consisting in using the library call for atomic compare and swap when
the instruction is not safe to use may be incorrect. Indeed the library call may
not exist on all platform. In other words, we need a better fix! 

llvm-svn: 262943
2016-03-08 17:29:11 +00:00
Chad Rosier e40b9513a9 [AArch64] Add MMOs to unscaled pairs.
Test to be committed in follow up commit, per discussion in D17097.
http://reviews.llvm.org/D17097

llvm-svn: 262942
2016-03-08 17:16:38 +00:00
Sanjay Patel eaf06851d0 rangify, fix function names; NFCI
llvm-svn: 262940
2016-03-08 17:12:32 +00:00
Krzysztof Parzyszek cd99e364e3 Invoke DAG postprocessing in the post-RA scheduler
This was inadvertently omitted from r262774, which added the mutation
interface.

llvm-svn: 262939
2016-03-08 16:54:20 +00:00
Sanjay Patel 5b8d741632 don't repeat function names in documentation comments; NFC
llvm-svn: 262937
2016-03-08 16:26:39 +00:00
Artyom Skrobov 5ddea6a8e9 [ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC)
Reviewers: t.p.northover, grosbach, resistor

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17636

llvm-svn: 262936
2016-03-08 16:23:54 +00:00
Hans Wennborg e00b6e7249 Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG"
This caused PR26870.

llvm-svn: 262935
2016-03-08 16:21:41 +00:00
Krzysztof Parzyszek 1a1d78b86f Add DAG mutation interface to the DFA packetizer
llvm-svn: 262930
2016-03-08 15:33:51 +00:00
Igor Breger 999ac754f2 AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1.
Differential Revision: http://reviews.llvm.org/D17953

llvm-svn: 262929
2016-03-08 15:21:25 +00:00
Junmo Park 974eb0a96d Revert "[InstCombine] Combine A->B->A BitCast"
This reverts commit r262670 due to compile failure.

llvm-svn: 262916
2016-03-08 07:09:46 +00:00
Peter Collingbourne 3866cc5f69 Fix evaluation order. Spotted by Alexander Riccio!
llvm-svn: 262907
2016-03-08 03:50:36 +00:00
Kit Barton ba532dc816 [Power9] Implement new vsx instructions: load, store instructions for vector and scalar
We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to
implement this new patch.

This patch implements the following vsx instructions:

Vector load/store:
lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx
stxv stxvb16x stxvh8x stxvl stxvll stxvx
Scalar load/store:
lxsd lxssp lxsibzx lxsihzx
stxsd stxssp stxsibx stxsihx
21 instructions

Phabricator: http://reviews.llvm.org/D16919
llvm-svn: 262906
2016-03-08 03:49:13 +00:00
Dan Gohman 1402606477 [WebAssembly] Update for spec change from tableswitch to br_table.
Also note that the operand order changed; the default label is now listed
after the regular labels.

llvm-svn: 262903
2016-03-08 03:18:12 +00:00
Justin Bogner 671febc0f7 Re-apply "SelectionDAG: Store SDNode operands in an ArrayRecycler"
This re-applies r262886 with a fix for 32 bit platforms that have 8 byte
pointer alignment, effectively reverting r262892.

Original Message:

  Currently some SDNode operands are malloc'd, some are stored inline in
  subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
  This scheme is complex, inconsistent, and makes refactoring SDNodes
  fairly difficult.

  Instead, we can allocate all of the operands using an ArrayRecycler
  that wraps a BumpPtrAllocator. This keeps the cache locality when
  iterating operands, improves locality when iterating SDNodes without
  looking at operands, and vastly simplifies the ownership semantics.

  It also means we stop overallocating SDNodes by 2-3x and will make it
  simpler to fix the rampant undefined behaviour we have in how we
  mutate SDNodes from one kind to another (See llvm.org/pr26808).

  This is NFC other than the changes in memory behaviour, and I ran some
  LNT tests to make sure this didn't hurt compile time. Not many tests
  changed: there were a couple of 1-2% regressions reported, but there
  were more improvements (of up to 4%) than regressions.

llvm-svn: 262902
2016-03-08 03:14:29 +00:00
Quentin Colombet 5e63e78ca9 [MIR] Change the token name for '<' and '>' to be consitent with the LLVM IR parser.
Thanks to Ahmed Bougacha for noticing!

llvm-svn: 262899
2016-03-08 02:00:43 +00:00
Quentin Colombet f574ab292b [AArch64] Initialize GlobalISel as part of the target initialization.
llvm-svn: 262897
2016-03-08 01:45:36 +00:00
Quentin Colombet 39293d3aaa [GlobalISel] Introduce initializer method to support start/stop-after features.
llvm-svn: 262896
2016-03-08 01:38:55 +00:00
Quentin Colombet 050b211820 [MIR] Teach the parser/printer that generic virtual registers do not need a register class.
llvm-svn: 262893
2016-03-08 01:17:03 +00:00
Justin Bogner 7e6f09c28f Revert "SelectionDAG: Store SDNode operands in an ArrayRecycler"
Looks like the largest SDNode is different between 32 and 64 bit now,
so this is breaking 32 bit bots. Reverting while I figure out a fix.

This reverts r262886.

llvm-svn: 262892
2016-03-08 01:07:03 +00:00
Richard Smith c2a2830e94 A couple more UB fixes for C++14 sized deallocation.
llvm-svn: 262891
2016-03-08 00:59:44 +00:00
Quentin Colombet 287c6bb571 [MIR] Teach the parser how to parse complex types of generic machine instructions.
By complex types, I mean aggregate or vector types.

llvm-svn: 262890
2016-03-08 00:57:31 +00:00
Justin Bogner 6543a9385f SelectionDAG: Store SDNode operands in an ArrayRecycler
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.

Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.

It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).

This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.

llvm-svn: 262886
2016-03-08 00:39:51 +00:00
Quentin Colombet d655483944 [MIR] Teach the printer how to print complex types for generic machine instructions.
Before this change, we would get the type definition in the middle
of the instruction.
E.g., %0(48) = G_ADD %struct_alias = type { i32, i16 } %edi, %edi

Now, we have just the expected type name:
%0(48) = G_ADD %struct_alias %edi, %edi

llvm-svn: 262885
2016-03-08 00:38:01 +00:00
Quentin Colombet dafed5d7d8 [AsmParser] Expose an API to parse a string starting with a type.
Without actually parsing a type it is difficult to perdict where
the type definition ends. In other words, instead of expecting
the user of the parser API to hand over only the relevant bits
of the string being parsed, take the whole string, parse the type,
and get back the number of characters that have been read.

This will be used by the MIR testing infrastructure.

llvm-svn: 262884
2016-03-08 00:37:07 +00:00
Easwaran Raman b1bd398ceb Revert revisions 262636, 262643, 262679, and 262682.
llvm-svn: 262883
2016-03-08 00:36:35 +00:00
Quentin Colombet 12350a8e13 [MIR] Print the type of generic machine instructions.
llvm-svn: 262880
2016-03-08 00:29:15 +00:00
Quentin Colombet 851996778f [MIR] Teach the mir parser about types on generic machine instructions.
llvm-svn: 262879
2016-03-08 00:20:48 +00:00
Anna Zaks c1efa64c63 [tsan] Add support for pointer typed atomic stores, loads, and cmpxchg
TSan instrumentation functions for atomic stores, loads, and cmpxchg work on
integer value types. This patch adds casts before calling TSan instrumentation
functions in cases where the value is a pointer.

Differential Revision: http://reviews.llvm.org/D17833

llvm-svn: 262876
2016-03-07 23:16:23 +00:00
Quentin Colombet 41bea872dd [MachineInstr] Get rid of some GlobalISel ifdefs.
Now the type API is always available, but when global-isel is not
built the implementation does nothing.

Note: The implementation free of ifdefs is WIP and tracked here in PR26576.
llvm-svn: 262873
2016-03-07 22:47:23 +00:00
Quentin Colombet 774b1efa62 [IR] Provide an API to skip the details of a structured type when printed.
The mir infrastructure will need this for generic instructions and currently
this feature was only available through the anonymous TypePrinter class.

llvm-svn: 262869
2016-03-07 22:32:42 +00:00
Quentin Colombet 81e72b4d4e [AsmParser] Add a function to parse a standalone type.
This is useful for MIR serialization. Indeed generic machine instructions
must have a type and we don't want to duplicate the logic in the MIParser.

llvm-svn: 262868
2016-03-07 22:09:05 +00:00
Quentin Colombet 4e14a497a3 [MIR] Teach the MIPrinter about size for generic virtual registers.
llvm-svn: 262867
2016-03-07 21:57:52 +00:00
Matt Arsenault c89f2919a4 AMDGPU: Match more med3 integer patterns
llvm-svn: 262864
2016-03-07 21:54:48 +00:00
Quentin Colombet 2a831fb826 [MIR] Teach the parser how to handle the size of generic virtual registers.
llvm-svn: 262862
2016-03-07 21:48:43 +00:00
Quentin Colombet 1bd7504ef3 [MachineRegisterInfo] Add a method to set the size of a virtual register a posteriori.
This is required for mir testing.

llvm-svn: 262861
2016-03-07 21:41:39 +00:00
Amaury Sechet 5984dfe7c7 Small formating change in Core.cpp . NFC
llvm-svn: 262860
2016-03-07 21:39:20 +00:00
Quentin Colombet 70a9670d80 [MachineRegisterInfo] Get rid of the global-isel ifdefs.
One additional pointer is not a big deal size-wise and it makes
the code much nicer!

llvm-svn: 262856
2016-03-07 21:22:09 +00:00
Matt Arsenault 56356c8a9c AMDGPU: Remove a fixme for ptrrtoint handling
llvm-svn: 262854
2016-03-07 21:12:46 +00:00
Matt Arsenault 81d06015c6 AMDGPU: Move function only used by R600
llvm-svn: 262853
2016-03-07 21:10:13 +00:00
Matt Arsenault ceb2c06cbd DAGCombiner: Check legality before creating extract_vector_elt
Problem not hit by any in tree target.

llvm-svn: 262852
2016-03-07 21:10:09 +00:00
Adam Nemet bb3680bd85 [LoopDataPrefetch] If prefetch distance is not set, skip pass
This lets select sub-targets enable this pass.  The patch implements the
idea from the recent llvm-dev thread:
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/94925

The goal is to enable the LoopDataPrefetch pass for the Cyclone
sub-target only within Aarch64.

Positive and negative tests will be included in an upcoming patch that
enables selective prefetching of large-strided accesses on Cyclone.

llvm-svn: 262844
2016-03-07 18:35:42 +00:00
Marina Yatsina 5f5de9f89b [ms-inline-asm][AVX512] Add ability to use k registers in MS inline asm + fix bag with curly braces
Until now curly braces could only be used in MS inline assembly to mark block start/end.
All curly braces were removed completely at a very early stage.
This approach caused bugs like:
"m{o}v eax, ebx" turned into "mov eax, ebx" without any error.

In addition, AVX-512 added special operands (e.g., k registers), which are also surrounded by curly braces that mark them as such.
Now, we need to keep the curly braces and identify at a later stage if they are marking block start/end (if so, ignore them), or surrounding special AVX-512 operands (if so, parse them as such).

This patch fixes the bug described above and enables the use of AVX-512 special operands.

This commit is the the llvm part of the patch.
The clang part of the review is: http://reviews.llvm.org/D17766
The llvm part of the review is: http://reviews.llvm.org/D17767

Differential Revision: http://reviews.llvm.org/D17767

llvm-svn: 262843
2016-03-07 18:11:16 +00:00
Adam Nemet 81113ef68c Revert "Enable LoopLoadElimination by default"
This reverts commit r262250.

It causes SPEC2006/gcc to generate wrong result (166.s) in AArch64 when
running with *ref* data set.  The error happens with
"-Ofast -flto -fuse-ld=gold" or "-O3 -fno-strict-aliasing".

llvm-svn: 262839
2016-03-07 17:38:02 +00:00
Chandler Carruth af8321ecf7 [memdep] Switch to range based for loops.
llvm-svn: 262831
2016-03-07 15:12:57 +00:00
Chandler Carruth 9ca96384f3 [DFSan] Remove an overly aggressive assert reported in PR26068.
This code has been successfully used to bootstrap libc++ in a no-asserts
mode for a very long time, so the code that follows cannot be completely
incorrect. I've added a test that shows the current behavior for this
kind of code with DFSan. If it is desirable for DFSan to do something
special when processing an invoke of a variadic function, it can be
added, but we shouldn't keep an assert that we've been ignoring due to
release builds anyways.

llvm-svn: 262829
2016-03-07 14:05:09 +00:00
Chandler Carruth b32febe48e [memdep] Switch a function to return true on success instead of false.
This is much more clear and less surprising IMO. It also makes things
more consistent with the increasingly large chunk of LLVM code that
assumes true-on-success.

llvm-svn: 262826
2016-03-07 12:45:07 +00:00
Chandler Carruth 40e21f2a20 [memdep] Cleanup the implementation doxygen comments and remove
duplicated comments.

In several cases these had diverged making them especially nice to
canonicalize. I checked to make sure we weren't losing important
information of course.

llvm-svn: 262825
2016-03-07 12:30:06 +00:00
Chandler Carruth 60fb1b4bd2 [memdep] Run clang-format over the header before porting it to
the new pass manager.

The port will involve substantial edits here, and would likely introduce
bad formatting if formatted in isolation, so just get all the formatting
up to snuff. I'll also go through and try to freshen the doxygen here as
well as modernizing some of the code.

llvm-svn: 262821
2016-03-07 10:19:30 +00:00
Craig Topper 267bdb2094 [CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC
llvm-svn: 262815
2016-03-07 07:29:12 +00:00
Mehdi Amini b923d641d0 Add a new insert_as() method to DenseMap and use it for ConstantUniqueMap
Just like the existing find_as() method, the new insert_as() accepts
an extra parameter which is used as a key to find the bucket in the
map.
When creating a Constant, we want to check the map before actually
creating the object. In this case we have to perform two queries to
the map, and this extra parameter can save recomputing the hash value
for the second query.

This is a reapply of r260458, that was reverted because it was
suspected to be the cause of instability of an internal bot, but
wasn't confirmed.

Differential Revision: http://reviews.llvm.org/D16268

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 262812
2016-03-07 00:51:00 +00:00
Mehdi Amini 67dfe09da4 Bitcode reader: Inline readAbbreviatedField in readRecord and move the enclosing loop in each case (NFC)
Summary: This make readRecord 20% faster, measured on an LTO build

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17911

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 262811
2016-03-07 00:38:09 +00:00
NAKAMURA Takumi 2de1b320a4 Revert r130657, "Windows/DynamicLibrary.inc: Clean up ELM_Callback. We may check the decl instead of the versions of individual libraries."
We may assume the type of 1st argument as PCSTR in PENUMLOADED_MODULES_CALLBACK. PSTR was in the ancient mingw32.

llvm-svn: 262810
2016-03-07 00:13:09 +00:00
Simon Pilgrim 253ca348b2 [X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target shuffle combining.
Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles.

This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed.

Removed non-constant pool mask decode path as we have no way of testing it right now.

Differential Revision: http://reviews.llvm.org/D17916

llvm-svn: 262809
2016-03-06 21:54:52 +00:00
Valery Pykhtin dc11054f20 [AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler.
Engages code from r262804.

Differential Revision: http://reviews.llvm.org/D17151

llvm-svn: 262808
2016-03-06 20:25:36 +00:00
Valery Pykhtin 50cd3c4ec7 fix sanitizer-ppc64be-linux failure for r262804
error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]

http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/930

llvm-svn: 262805
2016-03-06 15:13:54 +00:00
Valery Pykhtin 499a5c6323 [AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields
Differential Revision: http://reviews.llvm.org/D17150

llvm-svn: 262804
2016-03-06 13:27:13 +00:00
Igor Breger 4d94d4d5f7 AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element types on SKX
Differential Revision: http://reviews.llvm.org/D17913

llvm-svn: 262803
2016-03-06 12:38:58 +00:00
Valery Pykhtin 0c6293da68 [AMDGPU] SOPxx instructions operand naming fixed in td files.
dst -> sdst
ssrcN -> srcN

Differential Revision: http://reviews.llvm.org/D17646

llvm-svn: 262801
2016-03-06 10:31:44 +00:00
Craig Topper 581c0087b9 [X86] Use high bits of return value from getEncoding instead of predicate functions to populate the REX and VEX prefix bits that extend register encodings. NFC
llvm-svn: 262800
2016-03-06 08:12:47 +00:00
Craig Topper faab5c68d4 [X86] Remove unnecessary masking. The assert above it already guaranteed it. NFC
llvm-svn: 262799
2016-03-06 08:12:44 +00:00
Craig Topper 5e038cf589 [X86] Use uint8_t instead of unsigned char as it shortens the code and more explicitly reflects the desired size.
llvm-svn: 262798
2016-03-06 08:12:42 +00:00
Igor Breger f1bd761e00 AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use kshiftw to implement VSHRI v4i1 , bits 15-4 is undef so the upper bits of v4i1 may not be zeroed. v4i1 should be zero_extend to v16i1 ( or any natively supported vector).
Differential Revision: http://reviews.llvm.org/D17763

llvm-svn: 262797
2016-03-06 07:46:03 +00:00
Simon Pilgrim 40e1a71cdd [X86][AVX] Improved VPERMILPS variable shuffle mask decoding.
Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool.

Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization

Followup to D17681

llvm-svn: 262784
2016-03-05 22:53:31 +00:00
Simon Pilgrim aa99331bad [X86] AMD Bobcat CPU (btver1) doesn't support XSAVE
btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't support XSAVE.

Differential Revision: http://reviews.llvm.org/D17683

llvm-svn: 262782
2016-03-05 22:00:50 +00:00
Saleem Abdulrasool 4208381016 Support: catch invalid accesses
It is possible to invoke these methods on an invalid input resulting in an
invalid substring construction.  It seems that we do not have unit tests for
these methods.  Tests to ensure that the invalid call is caught to follow in
clang.

Resolves PR26839.

llvm-svn: 262778
2016-03-05 20:00:44 +00:00
Saleem Abdulrasool fa8c6ed3fa ExecutionEngine: tweak debug log
Add a newline to separate the log message.  NFC.

llvm-svn: 262777
2016-03-05 20:00:41 +00:00
Krzysztof Parzyszek 5c61d11a6d Add DAG mutation interface to the post-RA scheduler
Differential Revision: http://reviews.llvm.org/D17868

llvm-svn: 262774
2016-03-05 15:45:23 +00:00
Matthias Braun 4797ec95e4 RegisterCoalescer: Remap subregister lanemasks before exchanging operands
Rematerializing and merging into a bigger register class at the same
time, requires the subregister range lanemasks getting remapped to the
new register class.

This fixes http://llvm.org/PR26805

llvm-svn: 262768
2016-03-05 04:36:13 +00:00
Matthias Braun 8de09aa0c5 RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flags
copy coalescing with enabled subregister liveness can reveal undef uses,
previously this was only checked for the SrcReg in updateRegDefsUses()
but we need to check DstReg as well.

llvm-svn: 262767
2016-03-05 04:36:10 +00:00
Matthias Braun 2cbfd9fff5 RegisterPressure: Small cleanup
llvm-svn: 262766
2016-03-05 04:36:08 +00:00
Quentin Colombet 2a7676b442 [X86] Fix the lowering of setjmp intrinsic on i386.
When the lowering of the setjmp intrinsic requires
a global base pointer to be set, make sure such pointer
gets defined by the CGBR pass.

This fixes PR26742.

llvm-svn: 262762
2016-03-05 00:31:04 +00:00
Quentin Colombet 13b524597d [X86] Do not use cmpxchgXXb when we need the base pointer (RBX).
cmpxchgXXb uses RBX as one of its implicit argument. I.e., when
we use that instruction we need to clobber RBX. This is generally
fine, expect when RBX is a reserved register because in that case,
the register allocator will not track its value and will not
save and restore it when interferences occur.

rdar://problem/24851412

llvm-svn: 262759
2016-03-04 23:29:39 +00:00
Mike Aizatsky 243fe2b3a0 [libfuzzer] adding std:string to allowed adaptable argument.
llvm-svn: 262757
2016-03-04 23:18:01 +00:00
David Majnemer 71a1c2c619 Fix build breakage
llvm-svn: 262756
2016-03-04 23:02:15 +00:00
David Majnemer d2f767d2f6 [X86] Support cleaning more than 2**16 bytes of stack
The x86 ret instruction has a 16 bit immediate indicating how many bytes
to pop off of the stack beyond the return address.

There is a problem when extremely large structs are passed by value: we
might not be able to fit the number of bytes to pop into the return
instruction.

To fix this, expand RET_FLAG a little later and use a special sequence
to clean the stack:

pop  %ecx     ; return address is now in %ecx
add  $n, %esp ; clean the stack
push %ecx     ; bring the return address back on the stack
ret           ; pop the return address and jmp to it's value

llvm-svn: 262755
2016-03-04 22:56:17 +00:00
Kostya Serebryany 5c3701c621 [libFuzzer] log less when re-loading files; fix a silly bug: when running single files actually run all of them, not just the first one
llvm-svn: 262754
2016-03-04 22:35:40 +00:00
Philip Reames a0c9f6e736 [LVI] Fix a bug which prevented use of !range metadata within a query
The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further.  The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor.

llvm-svn: 262751
2016-03-04 22:27:39 +00:00
Rong Xu ecdc98fdae [PGO] Add a commandline option to control number of the VP annotation metadata.
llvm-svn: 262750
2016-03-04 22:08:44 +00:00
Michael Kuperstein b89f0fa2a2 [DAGCombine] Fix divrem combine not to assume div/rem type is simple.
The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization 
combine.

This fixes PR26835.

Differential Revision: http://reviews.llvm.org/D17878

llvm-svn: 262746
2016-03-04 21:23:29 +00:00
Dan Gohman e6b81362e9 [WebAssembly] Add another possible code-size optimization to README.txt
llvm-svn: 262740
2016-03-04 20:09:57 +00:00
Renato Golin 175c6d6d95 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Tom Stellard 649b5db557 AMDGPU/SI: Add support for spiling SGPRs to scratch buffer
Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17592

llvm-svn: 262732
2016-03-04 18:31:18 +00:00
Tom Stellard ebef6f9771 AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserter
Summary:
This allows us to use virtual registers when we need extra registers
for inserting spill instructions in SIRegisterInfo:eliminateFrameIndex().

Once all the frame indices have been eliminated, the
PrologEpilogueInserter does an extra pass over the program to replace
all virtual registers with physical ones.

This allows us to make more efficient use of our emergency spill slots,
so we only need to create one.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17591

llvm-svn: 262728
2016-03-04 18:02:01 +00:00
Krzysztof Parzyszek 51155fc0d1 [Hexagon] Fix lowering of calls with the return type of i1
This fixes an assertion in test/CodeGen/Hexagon/ifcvt-edge-weight.ll
when run with -debug-only=isel

llvm-svn: 262726
2016-03-04 17:38:05 +00:00
Zoran Jovanovic a68b67d1ed [mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated.
Author: milena.vujosevic.janicic
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D17373

llvm-svn: 262725
2016-03-04 17:34:31 +00:00
Teresa Johnson d84c7decb6 Change split code gen to use ThreadPool
Part of D15390.

llvm-svn: 262719
2016-03-04 15:39:13 +00:00
Sam Kolton f51f4b8370 Test commit access
llvm-svn: 262714
2016-03-04 12:29:14 +00:00
Valery Pykhtin 824e804bf6 test commit
llvm-svn: 262709
2016-03-04 10:59:50 +00:00
Benjamin Kramer 4dbf3371bb Make headers self-contained again.
llvm-svn: 262702
2016-03-04 10:49:30 +00:00
Nikolay Haustov 5bf46ac150 AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics
These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the
GL_ARB_shader_image_load_store extension.

Initial change by Nicolai H.hnle

Differential Revision: http://reviews.llvm.org/D17401

llvm-svn: 262701
2016-03-04 10:39:50 +00:00
Easwaran Raman 588c68a87b Fix a memory leak.
llvm-svn: 262682
2016-03-04 01:18:40 +00:00
Easwaran Raman 3b7a8246c9 Fix a use-after-free bug introduced in r262636
llvm-svn: 262679
2016-03-04 00:44:01 +00:00
Mike Aizatsky b8627a89a6 [libfuzzer] arbitrary function adapter.
The adapter automates converting sequence of bytes into arbitrary
arguments.

Differential Revision: http://reviews.llvm.org/D17829

llvm-svn: 262673
2016-03-03 23:45:29 +00:00
Guozhi Wei 92e9d0e80e [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 262670
2016-03-03 23:21:38 +00:00
Kostya Serebryany e483ed2825 [libFuzzer] when interrupted, call _Exit() instead of exit()
llvm-svn: 262667
2016-03-03 22:36:37 +00:00
Simon Pilgrim f33cb61471 [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.
PSHUFB decoder was assuming that input was 128 or 256-bit vector only.

llvm-svn: 262661
2016-03-03 21:55:01 +00:00
Lang Hames 3b514554a2 [RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess.
The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a
linker-mangled symbol name, but it calls through to dlsym to do the lookup (via
DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled
symbol name.

Historically we've attempted to "demangle" by removing leading '_'s on all
platforms, and fallen back to an extra search if that failed. That's broken, as
it can cause symbols to resolve incorrectly on platforms that don't do mangling
if you query '_foo' and the process also happens to contain a 'foo'.

Fix this by demangling conditionally based on the host platform. That's safe
here because this function is specifically for symbols in the host process, so
the usual cross-process JIT looking concerns don't apply.

M    unittests/ExecutionEngine/ExecutionEngineTest.cpp
M    lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp

llvm-svn: 262657
2016-03-03 21:23:15 +00:00
Philip Reames b7270446cf [ValueTracking] "constant fold" an experimental hidden option
llvm-svn: 262648
2016-03-03 19:50:32 +00:00
Philip Reames 146307eb52 [ValueTracking] Remove dead code from an old experiment
This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.  

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.  

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.  

llvm-svn: 262646
2016-03-03 19:44:06 +00:00
Sanjay Patel 9bba75084b [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)
Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.

The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
  %bc = bitcast <4 x i32> %not to <2 x i64>
  %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
  %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
  ret <4 x i32> %bc2
}

Simplifies to the expected:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  ret <4 x i32> %lobit
}

Differential Revision: http://reviews.llvm.org/D17583

llvm-svn: 262645
2016-03-03 19:19:04 +00:00
Easwaran Raman fd6557e368 Fix breakage caused by r262636.
Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused))

llvm-svn: 262643
2016-03-03 18:53:20 +00:00
Sanjoy Das 724f5cf278 [SCEV] Prove no-overflow via constant ranges
Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.

llvm-svn: 262639
2016-03-03 18:31:29 +00:00
Sanjoy Das 11ef606f1d [SCEV] Be less eager about demoting zexts to sexts
After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges).  This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension.  This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

llvm-svn: 262638
2016-03-03 18:31:23 +00:00
Sanjoy Das f3867e64a8 [ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on ranges
This will be used in a later patch to ScalarEvolution.  Right now only
the unit tests exercise the newly added code.

llvm-svn: 262637
2016-03-03 18:31:16 +00:00
Easwaran Raman 3035719c86 Infrastructure for PGO enhancements in inliner
This patch provides the following infrastructure for PGO enhancements in inliner:

Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.

Differential Revision: http://reviews.llvm.org/D16381

llvm-svn: 262636
2016-03-03 18:26:33 +00:00
Simon Pilgrim abcee45b7a [X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS
The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic.

This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle.

Differential Revision: http://reviews.llvm.org/D17681

llvm-svn: 262635
2016-03-03 18:13:53 +00:00
Dehao Chen 57d1dda558 Use LineLocation instead of CallsiteLocation to index callsite profile.
Summary: With discriminator, LineLocation can uniquely identify a callsite without the need to specifying callee name. Remove Callee function name from the key, and put it in the value (FunctionSamples).

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17827

llvm-svn: 262634
2016-03-03 18:09:32 +00:00
Simon Pilgrim 022afe2538 [X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.
lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway.

llvm-svn: 262633
2016-03-03 17:54:35 +00:00
Simon Pilgrim 0107d24810 [X86] Pulled out repeated code testing for constant vector shift amount. NFCI.
llvm-svn: 262631
2016-03-03 17:35:43 +00:00
Amjad Aboud 0ce261d052 MCU target has its own ABI, however X86 interrupt handler calling convention overrides this ABI.
Fixed the ordering to check first for X86 interrupt handler then for MCU target.

Differential Revision: http://reviews.llvm.org/D17801

llvm-svn: 262628
2016-03-03 17:17:54 +00:00
Ahmed Bougacha 671795a985 [X86] Don't assume that shuffle non-mask operands starts at #0.
That's not the case for VPERMV/VPERMV3, which cover all possible
combinations (the C intrinsics use a different order; the AVX vs
AVX512 intrinsics are different still).

Since:
  r246981 AVX-512: Lowering for 512-bit vector shuffles.
VPERMV is recognized in getTargetShuffleMask.

This breaks assumptions in most callers, as they expect
the non-mask operands to start at index 0.
VPERMV has the mask as operand #0; VPERMV3 has it in the middle.

Instead of the faulty assumption, have getTargetShuffleMask return
its operands as well.

One alternative we considered was to change the operand order of
VPERMV, but we agreed to stick to the instruction order, as there
are more AVX512 weirdness to cover (vpermt2/vpermi2 in particular).

Differential Revision: http://reviews.llvm.org/D17041

llvm-svn: 262627
2016-03-03 16:53:50 +00:00
Matthew Simpson b840a6d6f4 [LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

llvm-svn: 262624
2016-03-03 16:12:01 +00:00
Sanjay Patel d6cb4ec2a2 [AArch64] fold 'isPositive' vector integer operations (PR26819)
This is one of the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26819

Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that
in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might
come from autovectorization even if it's unlikely to be produced from C NEON intrinsics.

The patch is based on the x86 equivalent:
http://reviews.llvm.org/rL262036

Differential Revision: http://reviews.llvm.org/D17834

llvm-svn: 262623
2016-03-03 15:56:08 +00:00
Igor Breger 639fde79b0 AVX512: Combine AND + TESTM instructions .
Differential Revision: http://reviews.llvm.org/D17844

llvm-svn: 262621
2016-03-03 14:18:38 +00:00
Dylan McKay 4fd0d4af86 [AVR] Add calling convention parser tokens
Summary: Adds the 'avr_intrcc' and 'avr_signalcc' IR calling convention tokens to the parser.

Reviewers: arsenm

Subscribers: dylanmckay, llvm-commits

Differential Revision: http://reviews.llvm.org/D16348

llvm-svn: 262600
2016-03-03 10:08:02 +00:00
Simon Pilgrim 91dd0a796c [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 262599
2016-03-03 09:43:28 +00:00
Renato Golin 3d78271eac Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

llvm-svn: 262594
2016-03-03 08:57:44 +00:00
Michael Zuckerman c4d054fa4a [LLVM][AVX512] PSRLWI Chnage imm8 to int
Differential Revision: http://reviews.llvm.org/D17753

llvm-svn: 262592
2016-03-03 08:54:05 +00:00
Junmo Park 6ba96fb431 [BranchFolding] Change function name related with merging MMOs. NFC
Summary:
Removing MMOs is not our prefer behavior any more.

Reviewers: mcrosier, reames
   
Differential Revision: http://reviews.llvm.org/D17668

llvm-svn: 262580
2016-03-03 03:57:20 +00:00
Tom Stellard cc7067a668 AMDGPU: Insert two S_NOP instructions for every high level source statement.
Patch by: Konstantin Zhuravlyov

Summary: Tools, such as debugger, need to pause execution based on user input (i.e. breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before first isa instruction of high level source statement, and one after last isa instruction of high level source statement. Further, debugger may replace S_NOP instructions with S_TRAP instructions based on user input.

Reviewers: tstellarAMD, arsenm

Subscribers: echristo, dblaikie, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17454

llvm-svn: 262579
2016-03-03 03:53:29 +00:00
Tom Stellard 600ca6fd39 AMDGPU/SI: Don't try to move scratch wave offset when there are no free SGPRs
Summary:
When there were no free SGPRs, we were trying to move this value into
some of the reserved registers which was causing a segmentation fault.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17590

llvm-svn: 262577
2016-03-03 03:45:09 +00:00
Hans Wennborg 153e4b0f11 [X86] Enable forwarding bool arguments in tail calls (PR26305)
The code was previously not able to track a boolean argument
at a call site back to the formal argument of the caller.

Differential Revision: http://reviews.llvm.org/D17786

llvm-svn: 262575
2016-03-03 02:06:32 +00:00
Tim Shen 6e676a84ad [PPCVSXFMAMutate] Temporarily disable this pass
llvm-svn: 262573
2016-03-03 01:27:35 +00:00
Philip Reames ae27b2380f [MBP] Renaming a confusing variable and add clarifying comments
Was discussed as part of http://reviews.llvm.org/D17830

llvm-svn: 262571
2016-03-03 00:58:43 +00:00
Philip Reames 23d933982a [MBP] Avoid placing random blocks between loop preheader and header
If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code.

The particular reason for the rejection of the loop chain was that we were scanning predecessor of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge precessor was already part of the existing loop chain (oops!.

Differential Revision: http://reviews.llvm.org/D17830

llvm-svn: 262547
2016-03-03 00:01:42 +00:00
David Majnemer 1ef654024f [X86] Don't give catch objects a displacement of zero
Catch objects with a displacement of zero do not initialize a catch
object.  The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.

If we place an object at the top-of-stack, we will end up wit a
displacement of zero resulting in our catch object remaining
uninitialized.

Address this by creating our catch objects as fixed objects.  We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.

Differential Revision: http://reviews.llvm.org/D17823

llvm-svn: 262546
2016-03-03 00:01:25 +00:00
Matt Arsenault 8226fc4829 AMDGPU: Simplify boolean conditional return statements
Patch by Richard Thomson

llvm-svn: 262536
2016-03-02 23:00:21 +00:00
Philip Reames 02e1132afb [MBP] Remove overly verbose debug output
llvm-svn: 262531
2016-03-02 22:40:51 +00:00
Amaury Sechet 3b8b2ea2e1 Explode store of arrays in instcombine
Summary: This is the last step toward supporting aggregate memory access in instcombine. This explodes stores of arrays into a serie of stores for each element, allowing them to be optimized.

Reviewers: joker.eph, reames, hfinkel, majnemer, mgrang

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17828

llvm-svn: 262530
2016-03-02 22:36:45 +00:00
Philip Reames b9688f4382 [MBP] Adjust debug output to be more focused and approachable
llvm-svn: 262522
2016-03-02 21:45:13 +00:00
Amaury Sechet 7cd3fe7db6 Unpack array of all sizes in InstCombine
Summary: This is another step toward improving fca support. This unpack load of array in a series of load to array's elements.

Reviewers: chandlerc, joker.eph, majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15890

llvm-svn: 262521
2016-03-02 21:28:30 +00:00
Daniel Berlin 6412002d24 Really fix ASAN leak/etc issues with MemorySSA unittests
llvm-svn: 262519
2016-03-02 21:16:28 +00:00
Kostya Serebryany 4394b31e1d [libFuzzer] add -Werror for libFuzzer build rule
llvm-svn: 262517
2016-03-02 21:08:16 +00:00
Daniel Berlin 989e601b26 Revert "Fix ASAN detected errors in code and test" (it was not meant to be committed yet)
This reverts commit 890bbccd600ba1eb050353d06a29650ad0f2eb95.

llvm-svn: 262512
2016-03-02 20:36:22 +00:00
Daniel Berlin 27ed1c2eb0 Fix ASAN detected errors in code and test
llvm-svn: 262511
2016-03-02 20:27:29 +00:00
Renato Golin 93e42d9934 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262507
2016-03-02 19:35:45 +00:00
Reid Kleckner 65f9d9cd32 Revert "[X86] Elide references to _chkstk for dynamic allocas"
This reverts commit r262370.

It turns out there is code out there that does sequences of allocas
greater than 4K: http://crbug.com/591404

The goal of this change was to improve the code size of inalloca call
sequences, but we got tangled up in the mess of dynamic allocas.
Instead, we should come back later with a separate MI pass that uses
dominance to optimize the full sequence. This should also be able to
remove the often unneeded stacksave/stackrestore pairs around the call.

llvm-svn: 262505
2016-03-02 19:20:59 +00:00
Matthias Braun f290912d22 ARM: Introduce conservative load/store optimization mode
Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which
means that unaligned loads/stores do not trap and even extensive testing
will not catch these bugs. However the multi/double variants are not
affected by this bit and will still trap. In effect a more aggressive
load/store optimization will break existing (bad) code.

These bugs do not necessarily manifest in the broken code where the
misaligned pointer is formed but often later in perfectly legal code
where it is accessed. This means recompiling system libraries (which
have no alignment bugs) with a newer compiler will break existing
applications (with alignment bugs) that worked before.

So (under protest) I implemented this safe mode which limits the
formation of multi/double operations to cases that are not affected by
user code (stack operations like spills/reloads) or cases where the
normal operations trap anyway (floating point load/stores). It is
disabled by default.

Differential Revision: http://reviews.llvm.org/D17015

llvm-svn: 262504
2016-03-02 19:20:00 +00:00
Justin Bogner b2ecee9c31 SelectionDAG: Use correctly sized allocation functions for SDNodes
The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.

Since this particular allocator always overallocates it more or less
worked, but would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator so something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.

Part of llvm.org/PR26808.

llvm-svn: 262500
2016-03-02 19:01:11 +00:00
Geoff Berry 62c1a1e7c7 [AArch64] Enable non-leaf frame pointer elimination.
Summary:
This change enables frame pointer elimination in non-leaf functions.
The -fomit-frame-pointer option still needs to be used when compiling
via clang (or an equivalent method of not setting the
'no-frame-pointer-elim*' function attributes if generating llvm IR via
some other method) to take advantage of this optimization.

This change should be NFC when compiling via clang without
-fomit-frame-pointer.

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines

Differential Revision: http://reviews.llvm.org/D17730

llvm-svn: 262495
2016-03-02 17:58:31 +00:00
Chandler Carruth 12884f7f80 [AA] Hoist the logic to reformulate various AA queries in terms of other
parts of the AA interface out of the base class of every single AA
result object.

Because this logic reformulates the query in terms of some other aspect
of the API, it would easily cause O(n^2) query patterns in alias
analysis. These could in turn be magnified further based on the number
of call arguments, and then further based on the number of AA queries
made for a particular call. This ended up causing problems for Rust that
were actually noticable enough to get a bug (PR26564) and probably other
places as well.

When originally re-working the AA infrastructure, the desire was to
regularize the pattern of refinement without losing any generality.
While I think it was successful, that is clearly proving to be too
costly. And the cost is needless: we gain no actual improvement for this
generality of making a direct query to tbaa actually be able to
re-use some other alias analysis's refinement logic for one of the other
APIs, or some such. In short, this is entirely wasted work.

To the extent possible, delegation to other API surfaces should be done
at the aggregation layer so that we can avoid re-walking the
aggregation. In fact, this significantly simplifies the logic as we no
longer need to smuggle the aggregation layer into each alias analysis
(or the TargetLibraryInfo into each alias analysis just so we can form
argument memory locations!).

However, we also have some delegation logic inside of BasicAA and some
of it even makes sense. When the delegation logic is baking in specific
knowledge of aliasing properties of the LLVM IR, as opposed to simply
reformulating the query to utilize a different alias analysis interface
entry point, it makes a lot of sense to restrict that logic to
a different layer such as BasicAA. So one aspect of the delegation that
was in every AA base class is that when we don't have operand bundles,
we re-use function AA results as a fallback for callsite alias results.
This relies on the IR properties of calls and functions w.r.t. aliasing,
and so seems a better fit to BasicAA. I've lifted the logic up to that
point where it seems to be a natural fit. This still does a bit of
redundant work (we query function attributes twice, once via the
callsite and once via the function AA query) but it is *exactly* twice
here, no more.

The end result is that all of the delegation logic is hoisted out of the
base class and into either the aggregation layer when it is a pure
retargeting to a different API surface, or into BasicAA when it relies
on the IR's aliasing properties. This should fix the quadratic query
pattern reported in PR26564, although I don't have a stand-alone test
case to reproduce it.

It also seems general goodness. Now the numerous AAs that don't need
target library info don't carry it around and depend on it. I think
I can even rip out the general access to the aggregation layer and only
expose that in BasicAA as it is the only place where we re-query in that
manner.

However, this is a non-trivial change to the AA infrastructure so I want
to get some additional eyes on this before it lands. Sadly, it can't
wait long because we should really cherry pick this into 3.8 if we're
going to go this route.

Differential Revision: http://reviews.llvm.org/D17329

llvm-svn: 262490
2016-03-02 15:56:53 +00:00
Michael Zuckerman 927fdaee88 [LLVM][AVX512]PSRAWI Change imm8 to int.
Differential Revision: http://reviews.llvm.org/D17705

llvm-svn: 262480
2016-03-02 12:05:07 +00:00
Simon Pilgrim c02b72627a [X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms
We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of.

This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well.

It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts.

Differential Revision: http://reviews.llvm.org/D17680

llvm-svn: 262478
2016-03-02 11:43:05 +00:00
Nikolay Haustov f2fbabe9c1 Revert "[AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields"
Build failure with clang.

llvm-svn: 262477
2016-03-02 11:16:56 +00:00
Nikolay Haustov f0f24628cb Revert "[AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler."
Build failure with clang.

llvm-svn: 262475
2016-03-02 10:54:21 +00:00
Nikolay Haustov 73447a9714 [AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler.
complementary patch to table-driven amd_kernel_code_t field parser/printer utility. lit tests passed.

Patch by: Valery Pykhtin

Differential Revision: http://reviews.llvm.org/D17151

llvm-svn: 262474
2016-03-02 10:36:30 +00:00
Nikolay Haustov 6c8c74969a [AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields
This is going to be used in .hsatext disassembler and can be used
in current assembler parser (lit tests passed on parsing).
Code using this helpers isn't included in this patch.

Benefits:

unified approach
fast field name lookup on parsing
Later I would like to enhance some of the field naming/syntax using this code.

Patch by: Valery Pykhtin

Differential Revision: http://reviews.llvm.org/D17150

llvm-svn: 262473
2016-03-02 10:36:25 +00:00
Dmitry Vyukov 2eed1218e5 libfuzzer: fix compiler warnings
- unused sigaction/setitimer result (used in assert)
- unchecked fscanf return value
- signed/unsigned comparison

llvm-svn: 262472
2016-03-02 09:54:40 +00:00
Craig Topper 1d3f4aefd6 [X86] Remove unnecessary call to isReg from emitter's DestMem handling for VEX prefix. The operand is always a register. NFC
llvm-svn: 262468
2016-03-02 07:32:45 +00:00
Craig Topper 6a7cd42213 [X86] Make X86MCCodeEmitter::DetermineREXPrefix locate operands more like how VEX prefix handling does.
llvm-svn: 262467
2016-03-02 07:32:43 +00:00
David Majnemer 5aadde1ecc [X86] Permit reading of the FLAGS register without it being previously defined
We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not be desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.

Differential Revision: http://reviews.llvm.org/D17782

llvm-svn: 262465
2016-03-02 06:46:52 +00:00
Craig Topper d4dabb3939 [X86] Remove assertion I accidentally left in.
llvm-svn: 262464
2016-03-02 06:35:22 +00:00
Craig Topper a267431fa6 [X86] Be more structured about how we capture the register number when it is encoded in bits 7:4 of the immediate.
For some instructions the register is not the last operand and the immediate handling had to detect this and hardcode the index to find it. It also required CurOp to be pointing at the last operand handled in the Form switch whereas for any instruction it would be pointing at the next operand.

Now we just capture the value in the Form switch when we know exactly where it is and the CurOp pointer can behave normally.

llvm-svn: 262462
2016-03-02 06:06:18 +00:00
Sanjoy Das dcd3a88e29 [SCEV] Minor naming, braces cleanup; NFC
llvm-svn: 262459
2016-03-02 04:52:22 +00:00
Craig Topper cf65c62737 [X86] Use MCPhysReg and uint16_t for static arrays of registers and opcodes respectively should reduce size tiny bit. NFC
llvm-svn: 262458
2016-03-02 04:42:31 +00:00
Matt Arsenault f2dcb4737b AMDGPU: Fix bug 26659.
Fix checking the same instruction twice instead of the
second branch that uses vccz. I don't think this matters
currently because s_branch_vccnz is always used currently.

llvm-svn: 262457
2016-03-02 04:12:39 +00:00
Matt Arsenault a266bd8760 AMDGPU: Cleanup suggested in bug 23960
llvm-svn: 262456
2016-03-02 04:05:14 +00:00
Matt Arsenault 5de68cbc4c Bug 20810: Use report_fatal_error instead of unreachable
llvm-svn: 262455
2016-03-02 03:33:55 +00:00
Sanjoy Das 6b017a11ba Add a comment with a rational for the unusual code structure
llvm-svn: 262454
2016-03-02 02:56:29 +00:00
Sanjoy Das eca1b53b95 Qualify getRangeForAffineAR with this-> for MSVC
llvm-svn: 262453
2016-03-02 02:44:08 +00:00
George Burgess IV e0e6e48b29 Attempt to fix ASAN failure in a MemorySSA test.
llvm-svn: 262452
2016-03-02 02:35:04 +00:00
Sanjoy Das 1168f93c2b Perturb code in an attempt to appease MSVC
For some reason MSVC seems to think I'm calling getConstant() from a
static context.  Try to avoid this issue by explicitly specifying
'this->' (though I'm not confident that this will actually work).

llvm-svn: 262451
2016-03-02 02:34:20 +00:00
Sanjoy Das 62a1c33929 More code permutation to appease MSVC
llvm-svn: 262449
2016-03-02 02:15:42 +00:00
Sanjoy Das 9e5ebf145c Remove "auto" to appease the MSVC bots
llvm-svn: 262448
2016-03-02 01:59:37 +00:00
Matt Arsenault 7d0a77b979 DAGCombiner: Make sure an integer is being truncated
llvm-svn: 262446
2016-03-02 01:36:51 +00:00
Sanjay Patel 5e4c46de6d revert r262424 because there's a *clang test* for AArch64 that checks -O3 asm output
that is broken by this change

llvm-svn: 262440
2016-03-02 01:04:09 +00:00
Sanjoy Das bf73098472 [SCEV] Make getRange smarter around selects
Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q})
instead.

The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).

llvm-svn: 262438
2016-03-02 00:57:54 +00:00
Sanjoy Das b765b633cb [SCEV] Extract out a getRangeForAffineAR; NFC
Pure code-motion change.  Will be used later in making getRange more clever.

llvm-svn: 262437
2016-03-02 00:57:39 +00:00
Sanjay Patel 147e927957 [InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701)
As noted in the code comment, I don't think we can do the same transform that we do for
*scalar* integers comparisons to *vector* integers comparisons because it might pessimize
the general case. 

Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.

But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
 

llvm-svn: 262424
2016-03-01 23:55:18 +00:00
Dehao Chen 1012be120a Perform InstructioinCombiningPass before SampleProfile pass.
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17742

llvm-svn: 262419
2016-03-01 22:53:02 +00:00
Kostya Serebryany 3d95dd9149 [libFuzzer] deprecate exit_on_first flag
llvm-svn: 262417
2016-03-01 22:33:14 +00:00
Kostya Serebryany 228d5b1ce4 [libFuzzer] add generic signal handlers so that libFuzzer can report at least something if ASan is not handlig the signals for us. Remove abort_on_timeout flag.
llvm-svn: 262415
2016-03-01 22:19:21 +00:00
Colin LeMahieu 2d497a0078 [NFC] Convert tabs to spaces.
llvm-svn: 262411
2016-03-01 22:05:03 +00:00
Matthias Braun 37d884fdf4 AArch64: Reenable CompleteModel for A53, A57 and Kryo models
The fixes in r262393 completed them as well.

llvm-svn: 262408
2016-03-01 21:55:35 +00:00
Colin LeMahieu 5cb6eea664 [Hexagon] Modifying r262258 to only be in effect in the hand assembler path, not the integrated assembler.
llvm-svn: 262400
2016-03-01 21:37:41 +00:00
Matt Arsenault b36d462fac DAGCombiner: Turn truncate of a bitcasted vector to an extract
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.

This fixes some test failures in a future commit when i64 loads
are changed to promote.

llvm-svn: 262397
2016-03-01 21:31:53 +00:00
Rafael Espindola 889e45601e Add LLVMBuild for ObjectYAML.
Should fix the DBUILD_SHARED_LIBS bots.

llvm-svn: 262396
2016-03-01 21:29:33 +00:00
Jacques Pienaar ea9f25a740 [lanai] Add ELF enum value and relocations.
Add ELF enum value and relocations for Lanai backed.

General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html).

Differential Revision: http://reviews.llvm.org/D17008

llvm-svn: 262394
2016-03-01 21:21:42 +00:00
Matthias Braun a6cfb6f682 AArch64: Add missing schedinfo, check completeness for cyclone
This adds some missing generic schedule info definitions, enables
completeness checking for cyclone and fixes a typo uncovered by that.

Differential Revision: http://reviews.llvm.org/D17748

llvm-svn: 262393
2016-03-01 21:20:31 +00:00
Kit Barton e725669483 [Power9] Implement new vector compare, extract, insert instructions
This change implements the following vector operations:

  - Vector Compare Not Equal
    - vcmpneb(.) vcmpneh(.) vcmpnew(.)
    - vcmpnezb(.) vcmpnezh(.) vcmpnezw(.)
  - Vector Extract Unsigned
    - vextractub vextractuh vextractuw vextractd
    - vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx
  - Vector Insert
    - vinsertb vinserth vinsertw vinsertd

26 instructions.

Phabricator: http://reviews.llvm.org/D15916
llvm-svn: 262392
2016-03-01 20:51:57 +00:00
Sanjay Patel 18988ae66c [x86] use getBitcast()
This isn't quite NFC because some of the SDLocs may change which could
cause scheduling differences. But no regression tests are affected and
there is no functional change intended.

llvm-svn: 262391
2016-03-01 20:47:02 +00:00
David Blaikie bcb1c7e369 Fix some warnings a bit harder/different
This is an alternate fix to 262378 and a fix to a pessimizing-move
warning.

llvm-svn: 262390
2016-03-01 20:41:17 +00:00
Geoff Berry a0df341082 Revert "[AArch64] Fix isLegalAddImmediate() to return true for valid negative values."
Revert r262248 in an attempt to fix the clang-native-aarch64-full
    bot and to investigate a performance regression in
    SingleSource/Benchmarks/CoyoteBench/huffbench

llvm-svn: 262388
2016-03-01 20:28:52 +00:00
Vasileios Kalintiris 36901dd1c3 Revert "[mips] Promote the result of SETCC nodes to GPR width."
This reverts commit r262316.

It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.

llvm-svn: 262387
2016-03-01 20:25:43 +00:00
Kit Barton 01cd2e7a4b New file to track implementation status of new POWER9 instructions
llvm-svn: 262386
2016-03-01 20:19:43 +00:00
Matthias Braun 17cb57995e TableGen: Check scheduling models for completeness
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:

- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model

Typical steps necessary to complete a model:

- Ensure all pseudo instructions that are expanded before machine
  scheduling (usually everything handled with EmitYYY() functions in
  XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
  resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.

Differential Revision: http://reviews.llvm.org/D17747

llvm-svn: 262384
2016-03-01 20:03:21 +00:00
Justin Lebar 2daaa1ceca [NVPTX] Annotate param loads/stores as mayLoad/mayStore.
Summary:
Tablegen was unable to determine that param loads/stores were actually
reading or writing from memory.  I think this isn't a problem in
practice for param stores, because those occur in a block right before
we make our call.  But param loads don't have to at the very beginning
of a function, so should be annotated as mayLoad so we don't incorrectly
optimize them.

Reviewers: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D17471

llvm-svn: 262381
2016-03-01 19:44:22 +00:00
Justin Lebar 536c8b7446 [NVPTX] Remove workaround for tablegen crash in NVPTXInstrInfo.td.
Summary: Looks like this was caused by a typo.

Reviewers: jholewinski

Subscribers: jholewinski, llvm-commits, tra

Differential Revision: http://reviews.llvm.org/D17357

llvm-svn: 262380
2016-03-01 19:44:20 +00:00
Reid Kleckner d2da0f0cac Fix -Wnon-virtual-dtor warnings
llvm-svn: 262378
2016-03-01 19:39:54 +00:00
Owen Anderson 7ea02fc787 Fix an issue where fast math flags were dropped during scalarization.
Most portions of InstCombine properly propagate fast math flags, but
apparently the vector scalarization section was overlooked.

llvm-svn: 262376
2016-03-01 19:35:52 +00:00
Sanjoy Das f1e9cae00e [SCEV] Minor cleanup: rename method, C++11'ify; NFC
llvm-svn: 262374
2016-03-01 19:28:01 +00:00
Justin Lebar b5ca00a58d [NVPTX] Use different, convergent MIs for convergent calls.
Summary:
Calls sometimes need to be convergent.  This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.

Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs.  But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.

Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.

Reviewers: jholewinski

Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17423

llvm-svn: 262373
2016-03-01 19:24:03 +00:00
Justin Lebar 93e7a9b91c [NVPTX] Nix hack used to emit '{' and '}' for NVPTX calls.
Summary: Tablegen understands backslash as an escape char; that's sufficient.

Reviewers: jholewinski

Subscribers: llvm-commits, tra, jholewinski

Differential Revision: http://reviews.llvm.org/D17432

llvm-svn: 262372
2016-03-01 19:24:00 +00:00
Justin Lebar 877f5acc60 [NVPTX] Reformat NVPTXInstrInfo.td, and add additional comments.
Summary:
Also simplify some of the embedded C++ logic.

No functional changes.

Reviewers: jholewinski

Subscribers: llvm-commits, tra, jholewinski

Differential Revision: http://reviews.llvm.org/D17354

llvm-svn: 262371
2016-03-01 19:23:30 +00:00
David Majnemer 791b88b6da [X86] Elide references to _chkstk for dynamic allocas
The _chkstk function is called by the compiler to probe the stack in an
order consistent with Windows' expectations.  However, it is possible to
elide the call to _chkstk and manually adjust the stack pointer if we
can prove that the allocation is fixed size and smaller than the probe
size.

This shrinks chrome.dll, chrome_child.dll and chrome.exe by a
cummulative ~133 KB.

Differential Revision: http://reviews.llvm.org/D17679

llvm-svn: 262370
2016-03-01 19:20:23 +00:00
Rafael Espindola ebd9193b57 Move ObjectYAML code to a new library.
It is only ever used by obj2yaml and yaml2obj. No point in linking it
everywhere.

llvm-svn: 262368
2016-03-01 19:15:06 +00:00
Sanjay Patel 8a37a988e6 fix function names; NFC
llvm-svn: 262367
2016-03-01 19:14:09 +00:00
David Majnemer 45ebda4278 [Verifier] Don't abort on invalid cleanuprets
Code in visitEHPadPredecessors assume a little too much about the
validity of a cleanupret with an invalid cleanuppad operand.

llvm-svn: 262364
2016-03-01 18:59:50 +00:00
Easwaran Raman 8832e5e2f5 Fix breakage caused by r262360.
llvm-svn: 262363
2016-03-01 18:59:11 +00:00
Daniel Berlin 83fc77b4c0 Add the beginnings of an update API for preserving MemorySSA
Summary:
This adds the beginning of an update API to preserve MemorySSA.  In particular,
this patch adds a way to remove memory SSA accesses when instructions are
deleted.

It also adds relevant unit testing infrastructure for MemorySSA's API.

(There is an actual user of this API, i will make that diff dependent on this one.  In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P)

Reviewers: hfinkel, reames, george.burgess.iv

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17157

llvm-svn: 262362
2016-03-01 18:46:54 +00:00
Simon Atanasyan f69c7e5382 [DebugInfo] Dump CIE augmentation data as a list of hex bytes
CIE augmentation data might contain non-printable characters.
The patch prints the data as a list of hex bytes.

Differential Revision: http://reviews.llvm.org/D17759

llvm-svn: 262361
2016-03-01 18:38:05 +00:00
Easwaran Raman 7c4f25d2ed Metadata support for profile summary.
This adds support to convert ProfileSummary object to Metadata and create a
ProfileSummary object from metadata. This would allow attaching profile summary
information to Module allowing optimization passes to use it.

llvm-svn: 262360
2016-03-01 18:30:58 +00:00
Matt Arsenault 03dac8d8e4 DAGCombiner: Turn extract of bitcasted integer into truncate
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.

llvm-svn: 262358
2016-03-01 18:01:37 +00:00
Matt Arsenault e55c1658ea Add isScalarInteger helper to EVT/MVT
llvm-svn: 262357
2016-03-01 18:01:28 +00:00
Changpeng Fang 24f035af32 AMDGPU/SI: Implement DS_PERMUTE/DS_BPERMUTE Instruction Definitions and Intrinsics
Summary:
  This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics,
which are new since VI.

Reviewers: tstellarAMD, arsenm

Subscribers: llvm-commits, arsenm

Differential Revision: http://reviews.llvm.org/D17614

llvm-svn: 262356
2016-03-01 17:51:23 +00:00
Kostya Serebryany f84df30e4f [libFuzzer] remove FuzzerSanitizerOptions.cpp
llvm-svn: 262354
2016-03-01 17:46:32 +00:00
Michael Zuckerman 433b241570 [LLVM][AVX512] PSRL{DI|QI} Change imm8 to int
Differential Revision: http://reviews.llvm.org/D17713

llvm-svn: 262353
2016-03-01 17:46:32 +00:00
Hans Wennborg e64cf9dddb [X86] Check that attribute parameters match for tail calls (PR26590)
In the code below on 32-bit targets, x would previously get forwarded to g()
without sign-extension to 32 bits as required by the parameter attribute.

  void g(signed short);
  void f(unsigned short x) {
    g(x);
  }

llvm-svn: 262352
2016-03-01 17:45:23 +00:00
Sanjay Patel 2ca144f14c fix documentation comments; NFC
llvm-svn: 262351
2016-03-01 17:25:35 +00:00
Petar Jovanovic 6315f3f9b7 Revert "calculate builtin_object_size if argument is a removable pointer"
Revert r262337 as "check-llvm ubsan" step failed on
sanitizer-x86_64-linux-fast buildbot.

llvm-svn: 262349
2016-03-01 16:50:08 +00:00
Sanjay Patel 9fea531fec function names start with a lowercase letter; NFC
llvm-svn: 262347
2016-03-01 16:17:48 +00:00
Nikolay Haustov e309e1415d [AMDGPU] Remove unused disassembler code.
llvm-svn: 262346
2016-03-01 16:02:40 +00:00
Rafael Espindola 5cd721ae12 Refactor duplicated code for linking with pthread.
llvm-svn: 262344
2016-03-01 15:54:40 +00:00
Nikolay Haustov 47a115cd41 [AMDGPU] Fix build warnings.
llvm-svn: 262338
2016-03-01 14:50:59 +00:00
Petar Jovanovic 8aef99aa86 calculate builtin_object_size if argument is a removable pointer
This patch fixes calculating correct value for builtin_object_size function
when pointer is used only in builtin_object_size function call and never
after that.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D17337

llvm-svn: 262337
2016-03-01 14:39:55 +00:00
Nikolay Haustov ac106add0f [AMDGPU] Disassembler code refactored + error messages.
Idea behind this change is to make code shorter and as much common for all targets as possible. Let's even accept more code than is valid for a particular target, leaving it for the assembler to sort out.

64bit instructions decoding added.

Error\warning messages on unrecognized instructions operands added, InstPrinter allowed to print invalid operands helping to find invalid/unsupported code.

The change is massive and hard to compare with previous version, so it makes sense just to take a look on the new version. As a bonus, with a few TD changes following, it disassembles the majority of instructions. Currently it fully disassembles >300K binary source of some blas kernel.

Previous TODOs were saved whenever possible.

Patch by: Valery Pykhtin

Differential Revision: http://reviews.llvm.org/D17720

llvm-svn: 262332
2016-03-01 13:57:29 +00:00
Petr Pavlu 7ad9ec9fcf [LTO] Fix error reporting from lto_module_create_in_local_context()
Function lto_module_create_in_local_context() would previously
rely on the default LLVMContext being created for it by
LTOModule::makeLTOModule(). This context exits the program on
error and is not arranged to update sLastStringError in
tools/lto/lto.cpp.

Function lto_module_create_in_local_context() now creates an
LLVMContext by itself, sets it up correctly to its needs and then
passes it to LTOModule::createInLocalContext() which takes
ownership of the context and keeps it present for the lifetime of
the returned LTOModule.

Function LTOModule::makeLTOModule() is modified to take a
reference to LLVMContext (instead of a pointer) and no longer
creates a default context when nullptr is passed to it. Method
LTOModule::createInContext() that takes a pointer to LLVMContext
is removed because it allows to pass a nullptr to it. Instead
LTOModule::createFromBuffer() (that takes a reference to
LLVMContext) should be used.

Differential Revision: http://reviews.llvm.org/D17715

llvm-svn: 262330
2016-03-01 13:13:49 +00:00
Michael Zuckerman 7878888690 [AVX512][PSRAQ][PSRAD] Change imm8 to int.
Differential Revision: http://reviews.llvm.org/D17692

llvm-svn: 262320
2016-03-01 11:36:23 +00:00
Amjad Aboud 719325fe11 Disallow generating vzeroupper before return instruction (iret) in interrupt handler function.
This resolves https://llvm.org/bugs/show_bug.cgi?id=26412

Differential Revision: http://reviews.llvm.org/D17542

llvm-svn: 262319
2016-03-01 11:32:03 +00:00
Simon Atanasyan 255689c3ad [MC][YAML] Rangify the loop. NFC
llvm-svn: 262317
2016-03-01 10:11:27 +00:00
Vasileios Kalintiris 3a8f7f9e31 [mips] Promote the result of SETCC nodes to GPR width.
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.

The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D10970

llvm-svn: 262316
2016-03-01 10:08:01 +00:00
Nikolay Haustov ea8febde04 [TableGen] AsmMatcher: Skip optional operands in the midle of instruction if it is not present
Previosy, if actual instruction have one of optional operands then other optional operands listed before this also should be presented.
For example instruction v_fract_f32 v0, v1, mul:2 have one optional operand - OMod and do not have optional operand clamp. Previously this was not allowed because clamp is listed before omod in AsmString:

string AsmString = "v_fract_f32$vdst, $src0_modifiers$clamp$omod";
Making this work required some hacks (both OMod and Clamp match classes have same PredicateMethod).

Now, if MatchInstructionImpl meets formal optional operand that is not presented in actual instruction it skips this formal operand and tries to match current actual operand with next formal.

Patch by: Sam Kolton

Review: http://reviews.llvm.org/D17568

[AMDGPU] Assembler: Check immediate types for several optional operands in predicate methods
With this change you should place optional operands in order specified by asm string:

clamp -> omod
offset -> glc -> slc -> tfe
Fixes for several tests.
Depends on D17568

Patch by: Sam Kolton

Review: http://reviews.llvm.org/D17644
llvm-svn: 262314
2016-03-01 08:34:43 +00:00
Nikolay Haustov 95b4fcd377 AsmParser: Fix nested .irp/.irpc
Count .irp/.irpc in parseMacroLikeBody similar to .rept
Update tests.

Review: http://reviews.llvm.org/D17707
llvm-svn: 262313
2016-03-01 08:18:28 +00:00
Craig Topper 073e947f1b [X86] Centralize the masking of TSFlags with FormMask into a variable earlier so we can stop masking in multiple places. NFC
llvm-svn: 262312
2016-03-01 07:15:59 +00:00
Craig Topper 5c8dc5f064 [X86] Localize a temporary variable into the cases its need in. NFC
llvm-svn: 262310
2016-03-01 06:42:48 +00:00
Craig Topper b8c29b4ae9 [X86] Be consistent about using pre/post increment/decrement in nearby code. NFC
llvm-svn: 262309
2016-03-01 06:42:46 +00:00
Craig Topper d40a55064f [X86] Combine some initialization code with variable declaration and comments. NFC
llvm-svn: 262301
2016-03-01 05:42:16 +00:00
Matt Arsenault a67c4916cf LegalizeDAG: Use correct ptr type when expanding unaligned load/store
This fixes regressions exposed in existing AMDGPU tests in a
future commit when all loads are custom lowered.

llvm-svn: 262299
2016-03-01 05:13:35 +00:00
Matt Arsenault d275fcabcb AMDGPU: Don't emit build_pair during udivrem legalization
Technically you aren't supposed to emit these after type legalization
for some reason, and we use vector extracts of bitcasted integers
as the canonical way to do this.

llvm-svn: 262298
2016-03-01 05:06:05 +00:00
Matt Arsenault f4dfc1a027 AMDGPU: Don't use estimated stack size when we know the real stack size
llvm-svn: 262297
2016-03-01 04:58:20 +00:00
Matt Arsenault 59b8b77405 AMDGPU: Set HasExtractBitInsn
This currently does not have the control over the bitwidth,
and there are missing optimizations to reduce the integer to
32-bit if it can be.

But in most situations we do want the sinking to occur.

llvm-svn: 262296
2016-03-01 04:58:17 +00:00
David Majnemer cb305dea1c [WinEH] Allocate the registration node before the catch objects
The CatchObjOffset is relative to the end of the EH registration node
for 32-bit x86 WinEH targets.  A special sentinel value, 0, is used to
indicate that no catch object should be initialized.

This means that a catch object allocated immediately before the
registration node would be assigned a CatchObjOffset of 0, leading the
runtime to believe that a catch object should not be initialized.

To handle this, allocate the registration node prior to any other frame
object.  This will ensure that catch objects will not be allocated
before the registration node.

This fixes PR26757.

Differential Revision: http://reviews.llvm.org/D17689

llvm-svn: 262294
2016-03-01 04:30:16 +00:00
David Majnemer f08579f5a8 [Verifier] Diagnose when unwinding out of cycles of blocks
Generally speaking, this can only happen with unreachable code.
However, neglecting to check for this condition would lead us to loop
forever.

llvm-svn: 262284
2016-03-01 01:19:05 +00:00
Adam Nemet b8486e5a32 [LAA] Add missing debug output
llvm-svn: 262279
2016-03-01 00:50:08 +00:00
Sanjay Patel 6f2c01f712 [x86, InstCombine] transform more x86 masked loads to LLVM intrinsics
Continuation of:
http://reviews.llvm.org/rL262269

llvm-svn: 262273
2016-02-29 23:59:00 +00:00
Adam Nemet efc091f457 [LLE] Fix a comment
llvm-svn: 262270
2016-02-29 23:21:12 +00:00
Sanjay Patel 98a71505f5 [x86, InstCombine] transform x86 AVX masked loads to LLVM intrinsics
The intended effect of this patch in conjunction with:
http://reviews.llvm.org/rL259392
http://reviews.llvm.org/rL260145

is that customers using the AVX intrinsics in C will benefit from combines when
the load mask is constant:

__m128 mload_zeros(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0));
}

__m128 mload_fakeones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(1));
}

__m128 mload_ones(float *f) {
  return _mm_maskload_ps(f, _mm_set1_epi32(0x80000000));
}

__m128 mload_oneset(float *f) {
  return _mm_maskload_ps(f, _mm_set_epi32(0x80000000, 0, 0, 0));
}

...so none of the above will actually generate a masked load for optimized code.

This is the masked load counterpart to:
http://reviews.llvm.org/rL262064

llvm-svn: 262269
2016-02-29 23:16:48 +00:00
David Majnemer fe2f7f367a [Verifier] Handle more funclet edge cases
This change makes the verifier a little more paranoid.  It was possible
to trick the verifier into crashing or infinite looping.

llvm-svn: 262268
2016-02-29 22:56:36 +00:00
Adam Nemet 83be06e529 [LLE] Fix SingleSource/Benchmarks/Polybench/stencils/jacobi-2d-imper with Polly
We can actually have dependences between accesses with different
underlying types.  Bail in this case.

A test will follow shortly.

llvm-svn: 262267
2016-02-29 22:53:59 +00:00
Eric Christopher 114fa1c3f6 Simplify some boolean conditional return statements in AArch64.
http://reviews.llvm.org/D9979

Patch by Richard Thomson (and some conflict resolution by me).

llvm-svn: 262266
2016-02-29 22:50:49 +00:00
Adrian Prantl dba58fbdd9 Improve the debug output of DwarfDebug::buildLocationList().
llvm-svn: 262265
2016-02-29 22:28:22 +00:00
Sanjoy Das 999dc75c12 [Verifier] Minor fix to error message; NFC
llvm-svn: 262262
2016-02-29 22:04:25 +00:00
Colin LeMahieu ab9eca4d9f [Hexagon] As a size optimization, not lazy extending TPREL or DTPREL variants since they're usually in range.
llvm-svn: 262258
2016-02-29 21:21:56 +00:00
Colin LeMahieu 9e5a9c32db [Hexagon] Missed member initialization causing ubsan failure.
llvm-svn: 262252
2016-02-29 20:42:25 +00:00
Adam Nemet dd9e637aca Enable LoopLoadElimination by default
Summary:
I re-benchmarked this and results are similar to original results in
D13259:

On ARM64:
  SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27%
  SingleSource/Benchmarks/Polybench/stencils/adi                   -19.78%

On x86:
  SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog  -27.14%

And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop
Distribution.

In terms of compile time, there is ~5% increase on both
SingleSource/Benchmarks/Misc/oourafft and
SingleSource/Benchmarks/Linkpack/linkpack-pc.  These are both very tiny
loop-intensive programs where SCEV computations dominates compile time.

The reason that time spent in SCEV increases has to do with the design
of the old pass manager.  If a transform pass does not preserve an
analysis we *invalidate* the analysis even if there was *no*
modification made by the transform pass.

This means that currently we don't take advantage of LLE and LV sharing
the same analysis (LAA) and unfortunately we recompute LAA *and* SCEV
for LLE.

(There should be a way to work around this limitation in the case of
SCEV and LAA since both compute things on demand and internally cache
their result.  Thus we could pretend that transform passes preserve
these analyses and manually invalidate them upon actual modification.
On the other hand the new pass manager is supposed to solve so I am not
sure if this is worthwhile.)

Reviewers: hfinkel, dberlin

Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16300

llvm-svn: 262250
2016-02-29 20:35:11 +00:00
Geoff Berry f5ba61d18c [AArch64] Fix isLegalAddImmediate() to return true for valid negative values.
Reviewers: t.p.northover, jmolloy

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D17463

llvm-svn: 262248
2016-02-29 19:53:22 +00:00
Adrian Prantl fb2add2be1 Fix PR26585 by improving the promotion of DBG_VALUEs to DW_AT_locations.
When a variable is described by a single DBG_VALUE instruction we can
often use a more efficient inline DW_AT_location instead of using a
location list.

This commit makes the heuristic that decides when to apply this
optimization stricter by also verifying that the DBG_VALUE is live at the
entry of the function (instead of just checking that it is valid until
the end of the function).

<rdar://problem/24611008>

llvm-svn: 262247
2016-02-29 19:49:46 +00:00
Steven Wu f2fe0141ca Rename embedded bitcode section in MachO
Summary:
Rename the section embeds bitcode from ".llvmbc,.llvmbc" to "__LLVM,__bitcode".
The new name matches MachO section naming convention.

Reviewers: rafael, pcc

Subscribers: davide, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D17388

llvm-svn: 262245
2016-02-29 19:40:10 +00:00
Ahmed Bougacha bb5d7d7ed8 [X86] Move the ATOMIC_LOAD_OP ISel from DAGToDAG to ISelLowering. NFCI.
This is long-standing dirtiness, as acknowledged by r77582:

    The current trick is to select it into a merge_values with
    the first definition being an implicit_def. The proper solution is
    to add new ISD opcodes for the no-output variant.

Doing this before selection will let us combine away some constructs.

Differential Revision: http://reviews.llvm.org/D17659

llvm-svn: 262244
2016-02-29 19:28:07 +00:00
Colin LeMahieu b9f1eae328 [Hexagon] Setting sign mismatch flag on expression instead of using bit tricks.
llvm-svn: 262243
2016-02-29 19:17:56 +00:00
Rong Xu 9e926e8b92 Minor code cleanup. NFC
llvm-svn: 262242
2016-02-29 19:16:04 +00:00
David Majnemer e60ee3b8ce [WinEH] Make setjmp work correctly with EH
32-bit X86 EH on Windows utilizes a stack of registration nodes
allocated and deallocated on entry/exit.  A registration node contains a
bunch of EH personality specific information like which try-state we are
currently in.

Because a setjmp target allows control flow from arbitrary program
points, there is no way to ensure that the try-state we are in is
correctly updated once we transfer control.

MSVC compatible compilers, like MSVC and ICC, utilize runtime helpers to
reinitialize the try-state when a longjmp occurs.  This is implemented
by adding additional arguments to _setjmp3: the desired try-state and
a helper routine to update the try-state.

Differential Revision: http://reviews.llvm.org/D17721

llvm-svn: 262241
2016-02-29 19:16:03 +00:00
Dehao Chen 939993ff2f Move discriminator assignment to the right place.
Summary: Now discriminator is assigned per-function instead of per-module.

Reviewers: davidxl, dnovillo

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D17664

llvm-svn: 262240
2016-02-29 18:59:48 +00:00
Colin LeMahieu 73cd686ce1 [Hexagon] Using MustExtend flag on expression instead of passing around bools.
llvm-svn: 262238
2016-02-29 18:39:51 +00:00
Adrian Prantl 693e8de0fa fix typo in comment
llvm-svn: 262236
2016-02-29 17:06:46 +00:00
Nemanja Ivanovic 1a5706ca1b Fix for PR26180
Corresponds to Phabricator review:
http://reviews.llvm.org/D16592

This fix includes both an update to how we handle the "generic" CPU on LE
systems as well as Anton's fix for the Fast Isel issue.

llvm-svn: 262233
2016-02-29 16:42:27 +00:00
Daniel Sanders 03a8d2f8ec [mips] Range check uimm20 and fixed a bug this revealed.
Summary:
The bug was that dextu's operand 3 would print 0-31 instead of 32-63 when
printing assembly. This came up when replacing
MipsInstPrinter::printUnsignedImm() with a version that could handle arbitrary
bit widths.

MipsAsmPrinter::printUnsignedImm*() don't seem to be used so they have been
removed.

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D15521

llvm-svn: 262231
2016-02-29 16:06:38 +00:00
Vasileios Kalintiris 29620aca3e [mips] Do not use SLL for ANY_EXTEND nodes as the high bits are undefined.
Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D15420

llvm-svn: 262230
2016-02-29 15:58:12 +00:00
Daniel Sanders 611eb82953 [mips] Make isel select the correct DEXT variant up front.
Summary:
Previously, it would always select DEXT and substitute any invalid matches
for DEXTU/DEXTM during MipsMCCodeEmitter::encodeInstruction(). This works
but causes problems when adding range checked immediates to IAS.

Now isel selects the correct variant up front.

Reviewers: vkalintiris

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D16810

llvm-svn: 262229
2016-02-29 15:26:54 +00:00
Rafael Espindola 8d6fbc3a4e IRObject: Mark extern_weak as weak.
llvm-svn: 262222
2016-02-29 14:26:06 +00:00
Benjamin Kramer 6bb15021b3 [InstSimplify] Restore fsub 0.0, (fsub 0.0, X) ==> X optzn
I accidentally removed this in r262212 but there was no test coverage to
detect it.

llvm-svn: 262215
2016-02-29 12:18:25 +00:00
Daniel Sanders 90f0d0b8e3 [mips] Make symbols an acceptable branch target when expanding compare-to-immediate-and-branch macros.
Reviewers: vkalintiris

Subscribers: llvm-commits, vkalintiris, dim, seanbruno, dsanders

Differential Revision: http://reviews.llvm.org/D15369

llvm-svn: 262213
2016-02-29 11:24:49 +00:00
Benjamin Kramer f5b2a47ac6 [InstSimplify] fsub 0.0, (fsub -0.0, X) ==> X is only safe if signed zeros are ignored.
Only allow fsub -0.0, (fsub -0.0, X) ==> X without nsz. PR26746.

llvm-svn: 262212
2016-02-29 11:12:23 +00:00
Chandler Carruth 8b5a7419b8 [PM] Wire up optimization levels and default pipeline construction APIs
in the PassBuilder.

These are really just stubs for now, but they give a nice API surface
that Clang or other tools can start learning about and enabling for
experimentation.

I've also wired up parsing various synthetic module pass names to
generate these set pipelines. This allows the pipelines to be combined
with other passes and have their order controlled, with clear separation
between the *kind* of canned pipeline, and the *level* of optimization
to be used within that canned pipeline.

The most interesting part of this patch is almost certainly the spec for
the different optimization levels. I don't think we can ever have hard
and fast rules that would make it easy to determine whether a particular
optimization makes sense at a particular level -- it will always be in
large part a judgement call. But hopefully this will outline the
expected rationale that should be used, and the direction that the
pipelines should be taken. Much of this was based on a long llvm-dev
discussion I started years ago to try and crystalize the intent behind
these pipelines, and now, at long long last I'm returning to the task of
actually writing it down somewhere that we can cite and try to be
consistent with.

Differential Revision: http://reviews.llvm.org/D12826

llvm-svn: 262196
2016-02-28 22:16:03 +00:00
NAKAMURA Takumi df0cd72657 [PM] Appease mingw32's auto-import DLL build with minimal tweaks, with fix for clang.
char AnalysisBase::ID should be declared as extern and defined in one module.

llvm-svn: 262188
2016-02-28 17:17:00 +00:00
Vasileios Kalintiris c9aaa3171d [mips] Remove unused function declarations from MipsRegisterInfo.h. NFC.
llvm-svn: 262187
2016-02-28 16:55:28 +00:00
NAKAMURA Takumi ca04a1f720 Revert r262185, "[PM] Appease mingw32's auto-import DLL build with minimal tweaks."
I'll rework soon.

llvm-svn: 262186
2016-02-28 16:54:06 +00:00
NAKAMURA Takumi de40e7437e [PM] Appease mingw32's auto-import DLL build with minimal tweaks.
char AnalysisBase::ID should be declared as extern and defined in one module.

llvm-svn: 262185
2016-02-28 16:38:46 +00:00
JF Bastien 1afd1e2baa WebAssembly: fix build
More API churn, experimental target got sad.

llvm-svn: 262179
2016-02-28 15:33:53 +00:00
Michael Zuckerman 96836fc81c [AVX512][PSLLW ][PSLLV] Change imm8 to int
Differential Revision: http://reviews.llvm.org/D17684

llvm-svn: 262176
2016-02-28 07:32:10 +00:00
Xinliang David Li 985ff20a9c [PGO] Remove redundant counter copies for avail_extern functions.
Differential Revision: http://reviews.llvm.org/D17654

llvm-svn: 262157
2016-02-27 23:11:30 +00:00
Duncan P. N. Exon Smith ebcce78f65 CodeGen: Remove an iterator => pointer conversion, NFC
Part of PR26753.

llvm-svn: 262154
2016-02-27 20:27:44 +00:00
Matt Arsenault 3a61985b2f AMDGPU: More bits of frame index are known to be zero
The maximum private allocation for the whole GPU is 4G,
so the maximum possible index for a single workitem is the
maximum size divided by the smallest granularity for a dispatch.

This increases the number of known zero high bits, which
enables more offset folding. The maximum private size per
workitem with this is 128M but may be smaller still.

llvm-svn: 262153
2016-02-27 20:26:57 +00:00
Duncan P. N. Exon Smith d6ebd07b8d CodeGen: Use MachineInstr& in InlineSpiller::rematerializeFor()
InlineSpiller::rematerializeFor() never uses its parameter as an
iterator, so take it by reference instead.  This removes an implicit
conversion from MachineBasicBlock::iterator to MachineInstr*.

llvm-svn: 262152
2016-02-27 20:23:14 +00:00
Duncan P. N. Exon Smith be8f8c4478 CodeGen: Update LiveIntervalAnalysis API to use MachineInstr&, NFC
These parameters aren't expected to be null, so take them by reference.

llvm-svn: 262151
2016-02-27 20:14:29 +00:00
Duncan P. N. Exon Smith fd8cc23220 CodeGen: Change MachineInstr to use MachineInstr&, NFC
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null.  Slowly inching
toward being able to fix PR26753.

llvm-svn: 262149
2016-02-27 20:01:33 +00:00
Matt Arsenault 982224cfb8 DAGCombiner: Don't unnecessarily swap operands in ReassociateOps
In the case where op = add, y = base_ptr, and x = offset, this
transform:

(op y, (op x, c1)) -> (op (op x, y), c1)

breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.

This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required no-trivial code transformations to fix.

llvm-svn: 262148
2016-02-27 19:57:45 +00:00
Duncan P. N. Exon Smith d3a7467221 CodeGen: Use MachineInstr& in HashMachineInstr, NFC
Also update HashEndOfMBB to take MachineBasicBlock&.

llvm-svn: 262146
2016-02-27 19:48:01 +00:00
Duncan P. N. Exon Smith 5e6e8c7a0a CodeGen: Use MachineInstr& in AntiDepBreaker API, NFC
Take parameters as MachineInstr& instead of MachineInstr* in
AntiDepBreaker API, since these are required to be non-null.  No
functionality change intended.  Looking toward PR26753.

llvm-svn: 262145
2016-02-27 19:33:37 +00:00
Duncan P. N. Exon Smith bd529fbb4a CodeGen: Assert valid MI in AntiDepBreaker::UpdateDbgValue
This already assumes a valid MI, since it dereferences the MI in an
assertion before checking for null.  At an explicit assert.

llvm-svn: 262144
2016-02-27 19:23:34 +00:00
Duncan P. N. Exon Smith 1f6624ae34 AArch64: Use MachineInstr& in guaranteesZeroRegInBlock(), NFC
llvm-svn: 262143
2016-02-27 19:12:54 +00:00
Duncan P. N. Exon Smith 5702287809 CodeGen: Update DFAPacketizer API to take MachineInstr&, NFC
In all but one case, change the DFAPacketizer API to take MachineInstr&
instead of MachineInstr*.  In DFAPacketizer::endPacket(), take
MachineBasicBlock::iterator.  Besides cleaning up the API, this is in
search of PR26753.

llvm-svn: 262142
2016-02-27 19:09:00 +00:00
Duncan P. N. Exon Smith f9ab416d70 WIP: CodeGen: Use MachineInstr& in MachineInstrBundle.h, NFC
Update APIs in MachineInstrBundle.h to take and return MachineInstr&
instead of MachineInstr* when the instruction cannot be null.  Besides
being a nice cleanup, this is tacking toward a fix for PR26753.

llvm-svn: 262141
2016-02-27 17:05:33 +00:00
JF Bastien 13d3b9b777 WebAssembly: fix build
It was broken by the work for PR26753.

llvm-svn: 262140
2016-02-27 16:38:23 +00:00
Renato Golin 9a5419ecf7 Revert "[sancov] do not instrument nodes that are full pre-dominators"
This reverts commit r262103, as it broke all ARM and AArch64 bots.

llvm-svn: 262139
2016-02-27 14:19:19 +00:00
Simon Pilgrim 9e10b1655c Tidyup for loops - don't repeat upper limit evaluation if you don't have to. NFCI.
llvm-svn: 262137
2016-02-27 13:26:58 +00:00
Chris Dewhurst 053826af69 The patch adds missing registers and instructions to complete all the registers supported by the Sparc v8 manual.
These are all co-processor registers, with the exception of the floating-point deferred-trap queue register.
Although these will not be lowered automatically by any instructions, it allows the use of co-processor
instructions implemented by inline-assembly.

Code Reviewed at http://reviews.llvm.org/D17133, with the exception of a very small change in brace placement in SparcInstrInfo.td,
which was formerly causing a problem in the disassembly of the %fq register.

llvm-svn: 262133
2016-02-27 12:49:59 +00:00
Simon Pilgrim 3b42ca0760 Strip trailing whitespace. NFCI.
llvm-svn: 262131
2016-02-27 11:49:16 +00:00
Chandler Carruth afcec4c55a [PM] Provide explicit instantiation declarations and definitions for the
PassManager and AnalysisManager template specializations as well.

llvm-svn: 262128
2016-02-27 10:45:35 +00:00
Chandler Carruth 2a54094d40 [PM] Provide two templates for the two directionalities of analysis
manager proxies and use those rather than repeating their definition
four times.

There are real differences between the two directions: outer AMs are
const and don't need to have invalidation tracked. But every proxy in
a particular direction is identical except for the analysis manager type
and the IR unit they proxy into. This makes them prime candidates for
nice templates.

I've started introducing explicit template instantiation declarations
and definitions as well because we really shouldn't be emitting all this
everywhere. I'm going to go back and add the same for the other
templates like this in a follow-up patch.

I've left the analysis manager as an opaque type rather than using two
IR units and requiring it to be an AnalysisManager template
specialization. I think its important that users retain the ability to
provide their own custom analysis management layer and provided it has
the appropriate API everything should Just Work.

llvm-svn: 262127
2016-02-27 10:38:10 +00:00
Matt Arsenault 360d244d5b DAGCombiner: Relax sqrt NaN folding check
This is OK for +0 since compares to +/-0 give the same result.

llvm-svn: 262125
2016-02-27 09:38:05 +00:00