Commit Graph

99511 Commits

Author SHA1 Message Date
Eric Christopher 42b9248803 For X86-64 linux and PPC64 linux align int128 to 16 bytes.
For other platforms we should find out what they need and likely
make the same change; however, a smaller incremental change is easier
for platforms where we know the alignment is specified in the ABI. As part
of this, rewrite some of the handling in the backends for data layout and
update a bunch of testcases.
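As a quick illustration of the effect (a minimal sketch; the 16-byte result assumes an x86-64 or PPC64 Linux target and a compiler containing this change):

  #include <iostream>

  int main() {
    // Expected to print 16 on x86-64/PPC64 Linux with a compiler that has this
    // change; older compilers print 8.
    std::cout << "alignof(__int128) = " << alignof(__int128) << '\n';
    return 0;
  }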

Based on a patch by Simonas Kazlauskas!

llvm-svn: 294702
2017-02-10 03:32:21 +00:00
Matt Arsenault b4493e909f AMDGPU: Fix trailing whitespace
llvm-svn: 294694
2017-02-10 02:42:31 +00:00
Wei Ding 205bfdb3e9 AMDGPU: Add trap handler support.
Differential Revision: http://reviews.llvm.org/D26010

llvm-svn: 294692
2017-02-10 02:15:29 +00:00
Stanislav Mekhanoshin 6dec24316b [AMDGPU] Override PSet for M0
This change returns an empty PSet list for the M0 register. Otherwise its
PSet as defined by tablegen is SReg_32. This results in incorrect
register pressure calculation every time an instruction uses M0:
such uses are counted against the SReg_32 PSet and needlessly increase
pressure on SGPRs.

Differential Revision: https://reviews.llvm.org/D29798

llvm-svn: 294691
2017-02-10 02:07:58 +00:00
Eric Fiselier 87c87f4c30 [CMake] Fix pthread handling for out-of-tree builds
LLVM defines `PTHREAD_LIB` which is used by AddLLVM.cmake and various projects
to correctly link the threading library when needed. Unfortunately
`PTHREAD_LIB` is defined by LLVM's `config-ix.cmake` file which isn't installed
and therefore can't be used when configuring out-of-tree builds. This causes
such builds to fail since `pthread` isn't being correctly linked.

This patch attempts to fix that problem by renaming and exporting
`LLVM_PTHREAD_LIB` as part of `LLVMConfig.cmake`. I renamed `PTHREAD_LIB`
because the old name seemed likely to cause collisions with downstream users of
`LLVMConfig.cmake`.

llvm-svn: 294690
2017-02-10 01:59:20 +00:00
Marcos Pividori a0b23b8e63 [libFuzzer] Export external functions on tests.
We need to export external functions so they are found when calling
GetProcAddress() on Windows. But we can't use `__declspec(dllexport)` because
we want the targets to be completely independent of the fuzz engines and not
depend on other header files. Also, we don't want to include platform-specific
code managed with conditional macros.
So the solution is to add the exported symbols with linker flags in cmake.

Differential revision: https://reviews.llvm.org/D29752

llvm-svn: 294688
2017-02-10 01:40:28 +00:00
Marcos Pividori 0ae27e80b0 [libFuzzer] Use dynamic loading for External Functions on Windows.
Replace weak aliases with dynamic loading.
Weak aliases were generating some problems when linking for MT on Windows. For
MT, compiler-rt's libraries are statically linked to the main executable, the
same as libFuzzer, so if we use weak aliases, we are providing two different
default implementations for the same weak function and the linker fails.

In this diff I reimplement ExternalFunctions() using dynamic loading, so it
works in both cases (MD and MT). Also, dynamic loading is simpler, since we are
not defining any auxiliary external function, and we don't need to deal with
weak aliases.
This is equivalent to the implementation using dlsym(RTLD_DEFAULT, FnName) for
Posix.
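A rough sketch of the lookup pattern described above (illustrative only, not the actual libFuzzer code; the helper name is made up, and on ELF the symbol must be exported, e.g. via -rdynamic, to be visible to dlsym):

  #if defined(_WIN32)
  #include <windows.h>
  // Mirror dlsym(RTLD_DEFAULT, Name) by searching the main executable.
  static void *GetFnPtr(const char *Name) {
    return reinterpret_cast<void *>(
        GetProcAddress(GetModuleHandleA(nullptr), Name));
  }
  #else
  #include <dlfcn.h>
  static void *GetFnPtr(const char *Name) {
    return dlsym(RTLD_DEFAULT, Name);
  }
  #endif

  int main() {
    // An optional hook is simply absent (nullptr) when the target doesn't define it.
    void *Hook = GetFnPtr("LLVMFuzzerInitialize");
    return Hook != nullptr ? 0 : 1;
  }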

Differential revision: https://reviews.llvm.org/D29751

llvm-svn: 294687
2017-02-10 01:35:46 +00:00
Eugene Zelenko 4b6ff6b86e [MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 294685
2017-02-10 01:33:54 +00:00
Dan Gohman df4f4d45f0 [WebAssembly] Pass an MCContext to WebAssemblyMCCodeEmitter. NFC.
llvm-svn: 294679
2017-02-10 00:14:42 +00:00
Matthias Braun 2bef2a08c0 Fix syntax error
llvm-svn: 294678
2017-02-10 00:09:20 +00:00
Matthias Braun 62e1e8531b ARMSubtarget.h: Change to one line per enum element; NFC
Change the syntax to have enum elements sorted alphabetically and one per
line, as that is more merge/cherry-pick friendly.

llvm-svn: 294677
2017-02-10 00:06:44 +00:00
Eric Christopher e4b10f5d37 Add an additional set of braces to deal with subobject initialization.
llvm-svn: 294674
2017-02-10 00:02:09 +00:00
Chandler Carruth 0ede22e1c0 [PM] Add Argument Promotion to the pass pipeline.
This needs an explicit require of optimization remark emission before
loop pass pipelines containing LICM, as we no longer get it from the
inliner -- Argument Promotion may invalidate it. Technically the inliner
could also have broken this, but it never came up in testing.

Differential Revision: https://reviews.llvm.org/D29595

llvm-svn: 294670
2017-02-09 23:54:57 +00:00
Chandler Carruth addcda483e [PM] Port ArgumentPromotion to the new pass manager.
Now that the call graph supports efficient replacement of a function and
spurious reference edges, we can port ArgumentPromotion to the new pass
manager very easily.

The old PM-specific bits are sunk into callbacks that the new PM simply
doesn't use. Unlike the old PM, the new PM simply does argument
promotion and afterward does the update to LCG reflecting the promoted
function.

Differential Revision: https://reviews.llvm.org/D29580

llvm-svn: 294667
2017-02-09 23:46:27 +00:00
Peter Collingbourne 17febdbb25 WholeProgramDevirt: Check that VCP candidate functions are defined before evaluating them.
This was crashing before.

llvm-svn: 294666
2017-02-09 23:46:26 +00:00
Chandler Carruth 1f8fcfeac5 [PM/LCG] Teach LCG to support spurious reference edges.
Somewhat amazingly, this only requires teaching it to clean them up when
deleting a dead function from the graph. And we already have exactly the
necessary data structures to do that in the parent RefSCCs.

This allows ArgPromote to work in a much simpler way by merely letting
reference edges linger in the graph after the causing IR is deleted. We
will clean up these edges when we run any function pass over the IR, but
don't remove them eagerly.

This avoids all of the quadratic update issues both in the current pass
manager and in my previous attempt with the new pass manager.

Differential Revision: https://reviews.llvm.org/D29579

llvm-svn: 294663
2017-02-09 23:30:14 +00:00
George Burgess IV ccf11c2f9f [ARM] Add support for armv7ve triple in llvm (PR31358).
GCC supports the target armv7ve, which is armv7-a with virtualization
extensions. This change adds support for it in LLVM for GCC
compatibility.

Also remove the redundant FeatureHWDiv, FeatureHWDivARM for a few models, as
these are implied automatically by FeatureVirtualization.

Patch by Manoj Gupta.

Differential Revision: https://reviews.llvm.org/D29472

llvm-svn: 294661
2017-02-09 23:29:14 +00:00
Chandler Carruth aaad9f84be [PM/LCG] Teach the LazyCallGraph how to replace a function without
disturbing the graph or having to update edges.

This is motivated by porting argument promotion to the new pass manager.
Because of how LLVM IR Function objects work, in order to change their
signature a new object needs to be created. This is efficient and
straightforward in the IR but previously was very hard to implement in
LCG. We could easily replace the function a node in the graph
represents. The challenging part is how to handle updating the edges in
the graph.

LCG previously used an edge to a raw function to represent a node that
had not yet been scanned for calls and references. This was the core
of its laziness. However, that model causes this kind of update to be
very hard:
1) The keys to look up an edge need to be `Function*`s that would all
   need to be updated when we update the node.
2) There will be some unknown number of edges that haven't transitioned
   from `Function*` edges to `Node*` edges.

All of this complexity isn't necessary. Instead, we can always build
a node around any function, always pointing edges at it and always using
it as the key to look up an edge. To maintain the laziness, we need to
sink the *edges* of a node into a secondary object and explicitly model
transitioning a node from empty to populated by scanning the function.
This design seems much cleaner in a number of ways, but importantly
there is now exactly *one* place where the `Function*` has to be
updated!

Some other cleanups that fall out of this include having something to
model the *entry* edges more accurately. Rather than hand-rolling parts
of the node in the graph itself, we have an explicit `EdgeSequence`
object that gives us exactly the functionality needed. We also have
a consistent place to define the edge iterators and can use them for
both the entry edges and the internal edges of the graph.

The API used to model the separation between a node and its edges is
intentionally very thin as most clients are expected to deal with nodes
that have populated edges. We model this exactly as an optional does
with an additional method to populate the edges when that is
a reasonable thing for a client to do. This is based on API design
suggestions from Richard Smith and David Blaikie, credit goes to them
for helping pick how to model this without it being either too explicit
or too implicit.

The patch is somewhat noisy due to shifting around iterator types and
new syntax for walking the edges of a node, but most of the
functionality change is in the `Edge`, `EdgeSequence`, and `Node` types.
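A toy sketch of the resulting shape (simplified stand-in types, not the real LazyCallGraph API; all names and members here are illustrative):

  #include <memory>
  #include <vector>

  struct Function {};                     // stand-in for llvm::Function
  struct Node;                            // forward declaration

  struct EdgeSequence {
    std::vector<Node *> Edges;            // call/reference edges, once scanned
  };

  // In the real graph this would walk the IR; here it just returns an empty set.
  static std::unique_ptr<EdgeSequence> scanFunctionForEdges(Function &) {
    return std::make_unique<EdgeSequence>();
  }

  struct Node {
    Function *F = nullptr;                // the ONE place the Function* lives
    std::unique_ptr<EdgeSequence> ES;     // null until populated

    bool isPopulated() const { return ES != nullptr; }
    EdgeSequence &populate() {            // scan lazily on first use
      if (!ES)
        ES = scanFunctionForEdges(*F);
      return *ES;
    }
    void replaceFunction(Function &NewF) { F = &NewF; }  // the single update point
  };

  int main() {
    Function F, G;
    Node N;
    N.F = &F;
    N.populate();                         // edges filled in on demand
    N.replaceFunction(G);                 // no edge rewriting needed
    return 0;
  }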

Differential Revision: https://reviews.llvm.org/D29577

llvm-svn: 294653
2017-02-09 23:24:13 +00:00
Dan Gohman b6afd2070a [WebAssembly] Refactor void return peephole using MaybeRewriteToFallthrough. NFC.
llvm-svn: 294652
2017-02-09 23:19:03 +00:00
Sanjay Patel f38bab73aa [InstCombine] allow (X * C2) << C1 --> X * (C2 << C1) for vectors
This fold already existed for vectors but only when 'C1' was a splat
constant (but 'C2' could be any constant). 

There were no tests for any vector constants, so I'm adding a test
that shows non-splat constants for both operands.  
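For reference, the scalar identity behind the fold, shown in C++ with wrap-around unsigned arithmetic (the concrete constants are just an example):

  #include <cassert>
  #include <cstdint>

  int main() {
    std::uint32_t X = 0x12345678u, C2 = 7u, C1 = 3u;
    // Both sides equal X * C2 * 2^C1 modulo 2^32, so the fold is exact.
    assert(((X * C2) << C1) == (X * (C2 << C1)));
    return 0;
  }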

llvm-svn: 294650
2017-02-09 23:13:04 +00:00
Peter Collingbourne cea1e4e79a De-duplicate some code for creating an AARGetter suitable for the legacy PM.
I'm about to use this in a couple more places.

Differential Revision: https://reviews.llvm.org/D29793

llvm-svn: 294648
2017-02-09 23:11:52 +00:00
Adrian McCarthy d6e091dcc5 Fix build break from r294633.
llvm-svn: 294642
2017-02-09 22:49:35 +00:00
Simon Pilgrim 7f0d7e08b2 [X86] Remove duplicate call to getValueType. NFCI.
llvm-svn: 294640
2017-02-09 22:35:59 +00:00
Peter Collingbourne ef089bdb4b X86: Introduce relocImm-based patterns for cmp.
Differential Revision: https://reviews.llvm.org/D28690

llvm-svn: 294636
2017-02-09 22:02:28 +00:00
Matt Arsenault 0699ef39ce AMDGPU: Add pass to expand memcpy/memmove/memset
llvm-svn: 294635
2017-02-09 22:00:42 +00:00
Peter Collingbourne d7dd65ad7c X86: Teach X86InstrInfo::analyzeCompare to recognize compares of symbols.
This requires that we communicate to X86InstrInfo::optimizeCompareInstr
that the second operand is neither a register nor an immediate. The way we
do that is by setting CmpMask to zero.

Note that there were already instructions where the second operand was neither
a register nor an immediate, namely X86::SUB*rm, so also set CmpMask to zero
for those instructions. This seems like a latent bug, but I was unable to
trigger it.

Differential Revision: https://reviews.llvm.org/D28621

llvm-svn: 294634
2017-02-09 21:58:24 +00:00
Adrian McCarthy 0beb3323c5 Introduce NativeRawSymbol for PDB reading.
This is a stub for a new concrete implementation of IPDBRawSymbol.
Nothing uses this implementation yet.  My plan is to
locally switch lldb-pdbdump from the DIA reader to the Native one
and flesh out the implementations of these method stubs in the order
they're needed.

llvm-svn: 294633
2017-02-09 21:51:19 +00:00
Michael J. Spencer 714d9d22ad [LoadCombine] Fix combining of loads which span an aliasing store.
Fixes PR31517

Differential Revision: https://reviews.llvm.org/D28922

llvm-svn: 294632
2017-02-09 21:46:49 +00:00
Peter Collingbourne 857aba4410 Rename LowerTypeTestsSummaryAction to PassSummaryAction. NFCI.
I intend to use the same type with the same semantics in the WholeProgramDevirt
pass.

Differential Revision: https://reviews.llvm.org/D29746

llvm-svn: 294629
2017-02-09 21:45:01 +00:00
Sanjay Patel ae3b43e488 [InstCombine] use m_APInt to allow demanded bits analysis on splat constants
llvm-svn: 294628
2017-02-09 21:43:06 +00:00
Konstantin Zhuravlyov fd87137710 [AMDGPU] Calculate number of min/max SGPRs/VGPRs for WavesPerEU instead of using switch statement
Differential Revision: https://reviews.llvm.org/D29741

llvm-svn: 294627
2017-02-09 21:33:23 +00:00
Daniel Berlin 73ad5cb9b1 Drop graph_ prefix
llvm-svn: 294621
2017-02-09 20:37:46 +00:00
Daniel Berlin 58a6e57394 GraphTraits: Add range versions of graph traits functions (graph_nodes, graph_children, inverse_graph_nodes, inverse_graph_children).
Summary:
Convert all obvious node_begin/node_end and child_begin/child_end
pairs to range-based for.

Sending for review in case someone has a good idea how to make
graph_children able to be inferred. It looks like it would require
changing GraphTraits to be two argument or something. I presume
inference does not happen because it would have to check every
GraphTraits in the world to see if the noderef types matched.

Note: This change was 3-staged with clang as well, which uses
Dominators/etc from LLVM.

Reviewers: chandlerc, tstellarAMD, dblaikie, rsmith

Subscribers: arsenm, llvm-commits, nhaehnle

Differential Revision: https://reviews.llvm.org/D29767

llvm-svn: 294620
2017-02-09 20:37:24 +00:00
Sanjoy Das 74bda4d591 [JumpThreading] Thread through guards
Summary:
This patch allows JumpThreading to also thread through guards.
Conceptually, guard(cond) is equivalent to the following construction:

  if (cond) { do something } else { deoptimize }

Yet it is not explicitly converted into IFs before lowering.
This patch enables early threading through guards in simple cases.
Currently it covers the following situation:

  if (cond1) {
    // code A
  } else {
    // code B
  }
  // code C
  guard(cond2)
  // code D

If there is an implication cond1 => cond2 or !cond1 => cond2, we can transform
this construction into the following:

  if (cond1) {
    // code A
    // code C
  } else {
    // code B
    // code C
    guard(cond2)
  }
  // code D

Thus, the guard is removed from one of the execution branches.

Patch by Max Kazantsev!

Reviewers: reames, apilipenko, igor-laevsky, anna, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29620

llvm-svn: 294617
2017-02-09 19:40:22 +00:00
Saleem Abdulrasool 111cd669e9 Object: pad out BSD archive members to 8-bytes
ld64 requires its archive members to be 8-byte aligned for 64-bit
content and 4-byte aligned for 32-bit content.  Opt for the larger
alignment requirement.  This ensures that ld64 can consume archives
generated by llvm-ar.
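The padding this implies is the usual round-up-to-a-multiple-of-8; a small sketch of the idea (not the llvm-ar code itself):

  #include <cassert>
  #include <cstdint>

  // Round a member's size up to the next multiple of 8 for ld64 compatibility.
  static std::uint64_t alignTo8(std::uint64_t Size) {
    return (Size + 7) & ~std::uint64_t(7);
  }

  int main() {
    assert(alignTo8(0) == 0);
    assert(alignTo8(1) == 8);
    assert(alignTo8(8) == 8);
    assert(alignTo8(13) == 16);
    return 0;
  }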

Thanks to Kevin Enderby for the hint about the ld64/cctools behaviours!

Resolves PR28361!

llvm-svn: 294615
2017-02-09 19:29:35 +00:00
Simon Pilgrim e0b5c2acbd Convert to for-range loop. NFCI.
llvm-svn: 294610
2017-02-09 18:52:24 +00:00
Geoff Berry 7e320c2485 [SelectionDAG] Fix bugs in inverted condition splitting code.
Summary:
Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by
Mikael Holmen.  Handle the non-canonicalized xor-not operation
correctly (the code was assuming operand 0 was always the non-constant operand)
and check that the negated condition is also in the same block as the
original and/or instruction (as is done for and/or operands already)
before proceeding with optimization.

Reviewers: bogner, MatzeB, qcolombet

Subscribers: mcrosier, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D29680

llvm-svn: 294605
2017-02-09 18:28:17 +00:00
Simon Pilgrim 6bf1bd3ed6 [X86][MMX] Remove the (long time) unused MMX_PINSRW ISD opcode.
llvm-svn: 294596
2017-02-09 17:08:47 +00:00
Saleem Abdulrasool d3faeaf8a2 Object: add a comment explaining a divergence
Add a note about the reason for the divergence from the specification
for ld64.  Addresses post-commit review comments from Davide.  NFC.

llvm-svn: 294594
2017-02-09 15:47:58 +00:00
David Bozier 93e773e9be Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function"
This reverts revision r294590 as it broke some buildbots.

llvm-svn: 294593
2017-02-09 15:40:14 +00:00
David Bozier 6a44b7c2eb [Stack Protection] Add diagnostic information for why stack protection was applied to a function
Stack Smash Protection is not completely free, so in hot code the overhead it introduces can cause performance issues. By adding diagnostic information about which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function.

This change adds an SSP-specific DiagnosticInfo class and uses it in the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled.
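For context, this is the classic shape of a function the stack protector heuristics flag (an illustrative example, not from the patch; whether it is actually instrumented depends on the -fstack-protector level in use):

  #include <cstring>

  // A fixed-size stack buffer is the canonical reason -fstack-protector fires;
  // an SSP remark would point at functions like this one and explain why.
  void copyName(const char *Src) {
    char Buf[64];
    std::strcpy(Buf, Src);   // unchecked copy into a stack buffer
  }

  int main() {
    copyName("hello");
    return 0;
  }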

Patch by: James Henderson

Differential Revision: https://reviews.llvm.org/D29023

llvm-svn: 294590
2017-02-09 15:08:40 +00:00
Rafael Espindola dc1c3011fd Make it possible to set SHF_LINK_ORDER explicitly.
This will make it possible to add support for GCing user metadata
(asan, for example).

llvm-svn: 294589
2017-02-09 14:59:20 +00:00
Pierre Gousseau 6953b32475 [X86][btver2] PR31902: Fix a crash in combineOrCmpEqZeroToCtlzSrl under fast math.
In combineOrCmpEqZeroToCtlzSrl, replace "getConstantOperand == 0" with "isNullConstant" to account for floating-point constants.

Differential Revision: https://reviews.llvm.org/D29756

llvm-svn: 294588
2017-02-09 14:43:58 +00:00
Diana Picus 7232af352f [ARM] GlobalISel: Lower single precision FP args
Both for aapcscc and aapcs_vfpcc. We currently filter out soft float targets
because we don't support libcalls yet.

llvm-svn: 294584
2017-02-09 13:09:59 +00:00
Artur Pilipenko 4a64031954 [DAGCombiner] Support non-zero offset in load combine
Enable folding patterns which load the value from a non-zero offset:

  i8 *a = ...
  i32 val = a[4] | (a[5] << 8) | (a[6] << 16) | (a[7] << 24)
=>
  i32 val = *((i32*)(a+4))

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D29394

llvm-svn: 294582
2017-02-09 12:06:01 +00:00
Simon Pilgrim 563e23e66e [X86][SSE] Attempt to break register dependencies during lowerBuildVector
LowerBuildVectorv16i8/LowerBuildVectorv8i16 insert values into an UNDEF vector if the build vector doesn't contain any zero elements, resulting in register dependencies with a previous use of the register.

This patch attempts to break the register dependency by either always zeroing the vector beforehand or (if we're inserting into the 0th element) by using VZEXT_MOVL(SCALAR_TO_VECTOR(i32 AEXT(Elt))), which lowers to (V)MOVD and performs a similar function. Additionally, (V)MOVD is a shorter instruction than PINSRB/PINSRW. We already do something similar for SSE41 PINSRD.

In pre-SSE41 LowerBuildVectorv16i8 we go a little further and use VZEXT_MOVL(SCALAR_TO_VECTOR(i32 ZEXT(Elt))) if the build vector contains zeros, avoiding the vector zeroing at the cost of a scalar zero extension. This can probably be brought over to the other cases in a future patch (load folding etc.).

Differential Revision: https://reviews.llvm.org/D29720

llvm-svn: 294581
2017-02-09 11:50:19 +00:00
Vitaly Buka 9987d98370 LVI: Fix use-of-uninitialized-value after r294463
BlockValueStack can be reallocated, making the reference e invalid.
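The underlying bug class is ordinary container reallocation invalidating a held reference; a minimal stand-alone illustration (not the LVI code itself):

  #include <vector>

  int main() {
    std::vector<int> BlockValues;
    BlockValues.push_back(1);
    int &E = BlockValues.back();      // reference into the vector's storage
    for (int I = 0; I < 1000; ++I)
      BlockValues.push_back(I);       // growth may reallocate; 'E' now dangles
    // Reading 'E' at this point is exactly the reported bug: its storage may
    // have been freed and reused. The fix is to re-look-up the element (or use
    // an index) after any operation that can grow the container.
    return BlockValues.front();
  }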

llvm-svn: 294572
2017-02-09 09:28:05 +00:00
Craig Topper 3cac763532 [X86] Remove the HLE feature flag.
We only implemented it for one of the three HLE instructions, and that instruction is also under the RTM flag. Clang only exposes the RTM flag on its command line.

llvm-svn: 294562
2017-02-09 06:51:02 +00:00
Craig Topper 86576bd921 [X86] Remove INVPCID and SMAP feature flags. They aren't currently used by any instructions and not tested.
If we implement intrinsics for their instructions in the future, the feature flags can be added back with proper testing.

llvm-svn: 294561
2017-02-09 06:50:59 +00:00
Craig Topper 50f3d1452c [X86] Clzero intrinsic and its addition under znver1
This patch does the following:

1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero (see the usage sketch below).
2. Identifies the clzero feature using cpuid info (Function 8000_0008; checks if EBX[0] = 1).
3. Adds the clzero feature under znver1 architecture.
4. The custom inserter is added in Lowering.
5. A testcase is added to check the intrinsic.
6. The clzero instruction is added to assembler test.

Patch by Ganesh Gopalasubramanian, with a couple of formatting tweaks, a disassembler test, and an update_llc_test.py run from me.
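A rough usage sketch of the builtin from item 1 (assumes a compiler with this patch and the clzero feature enabled, e.g. -march=znver1 or -mclzero):

  alignas(64) static char Buf[64];

  int main() {
    __builtin_ia32_clzero(Buf);   // zeroes the 64-byte cache line containing Buf
    return Buf[0];
  }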

Differential revision: https://reviews.llvm.org/D29385

llvm-svn: 294558
2017-02-09 04:27:34 +00:00