Commit Graph

241315 Commits

Author SHA1 Message Date
Simon Atanasyan 65b2461cfd [ELF][MIPS] Do not emit DT_REL[A]COUNT for MIPS targets
It looks like MIPS dynamic loader does not support RELCOUNT tag.
Both gold/bfd linkers does not emit this tag on MIPS. I will investigate
the problem further but for now it is better to behave like GNU linkers.

llvm-svn: 280630
2016-09-04 17:40:12 +00:00
Simon Pilgrim c228f5c195 [X86][SSE] Regenerate fcmp/uitofp combine tests
llvm-svn: 280629
2016-09-04 17:16:01 +00:00
Lang Hames 6c9bd78507 [ORC] Fix an unfinished comment.
llvm-svn: 280628
2016-09-04 16:31:41 +00:00
Sanjay Patel 6b4909749b [InstCombine] recode icmp fold in a vector-friendly way; NFC
The transform in question:
icmp (and (trunc W), C2), C1 -> icmp (and W, C2'), C1'

...is still not enabled for vectors, thus no functional change intended.
It's not clear to me if this is a good transform for vectors or even
scalars in general. Changing that behavior may be a follow-on patch.

llvm-svn: 280627
2016-09-04 14:32:15 +00:00
Hal Finkel f0bc9db96e [PowerPC] During branch relaxation, recompute padding offsets before each iteration
We used to compute the padding contributions to the block sizes during branch
relaxation only at the start of the transformation. As we perform branch
relaxation, we change the sizes of the blocks, and so the amount of inter-block
padding might change. Accordingly, we need to recompute the (alignment-based)
padding in between every iteration on our way toward the fixed point.

Unfortunately, I don't have a test case (and none was provided in the bug
report), and while this obviously seems needed, algorithmically, I don't have
any way of generating a small and/or non-fragile regression test.

llvm-svn: 280626
2016-09-04 14:18:29 +00:00
Igor Breger 7e2a0dfa0c revert r279960.
https://llvm.org/bugs/show_bug.cgi?id=30249

llvm-svn: 280625
2016-09-04 14:03:52 +00:00
Simon Pilgrim 9a36318c54 EOL fixes
llvm-svn: 280624
2016-09-04 13:30:46 +00:00
Simon Pilgrim 122b0de1c1 Strip trailing whitespace
llvm-svn: 280623
2016-09-04 13:28:46 +00:00
Joerg Sonnenberger 9bd9d98929 Test case for r280607 to check presence and sanity of the *_LOCK_FREE
macros.

llvm-svn: 280622
2016-09-04 11:21:27 +00:00
Kuba Brecka 224264ade0 [libcxx] Fix a data race in call_once
call_once is using relaxed atomic load to perform double-checked locking, which contains a data race. The fast-path load has to be an acquire atomic load.

Differential Revision: https://reviews.llvm.org/D24028

llvm-svn: 280621
2016-09-04 09:55:12 +00:00
Chandler Carruth ccd44939ef [PM] Revert r280447: Add a unittest for invalidating module analyses with an SCC pass.
This was mistakenly committed. The world isn't ready for this test, the
test code has horrible debugging code in it that should never have
landed in tree, it currently passes because of bugs elsewhere, and it
needs to be rewritten to not be susceptible to passing for the wrong
reasons.

I'll re-land this in a better form when the prerequisite patches land.

So sorry that I got this mixed into a series of commits that *were*
ready to land. I shouldn't have. =[ What's worse is that it stuck around
for so long and I discovered it while fixing the underlying bug that
caused it to pass.

llvm-svn: 280620
2016-09-04 08:42:31 +00:00
Chandler Carruth 11b3f60cd9 [LCG] Clean up and make NDEBUG verify calls more rigorous with
make_scope_exit now that we have that utility.

This makes the code much more clear and readable by isolating the check.
It also makes it easy to go through and make sure all the interesting
update routines have a start and end verify so we don't slowly let the
graph drift into an invalid state.

llvm-svn: 280619
2016-09-04 08:34:31 +00:00
Chandler Carruth 1f621f0a70 [LCG] A NFC refactoring to extract the logic for doing
a postorder-sequence based update after edge insertion into a generic
helper function.

This separates the SCC-specific logic into two fairly simple lambdas and
extracts the rest into a generic helper template function. I think this
is a net win on its own merits because it disentangles different pieces
of the algorithm. Now there is one place that does the two-step
partition to identify a set of newly connected components and at the
same time update the postorder sequence.

However, I'm also hoping to re-use this an upcoming patch to update
a cached post-order sequence of RefSCCs when doing the analogous update
to the RefSCC graph, and I don't want to have two copies.

The diff is quite messy but this really is just moving things around and
making types generic rather than specific.

llvm-svn: 280618
2016-09-04 08:34:24 +00:00
Dorit Nuzman abd15f69b2 [InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing
memcpy with ld/st.

When InstCombine replaces a memcpy with loads+stores it does not copy over the
llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes
that.

Differential Revision: https://reviews.llvm.org/D23499

llvm-svn: 280617
2016-09-04 07:49:39 +00:00
Lang Hames 3301c7eebd [ExecutionEngine] Move ObjectCache::anchor from MCJIT to ExecutionEngine.
ObjectCache is an ExecutionEngine utility, so its anchor belongs there. The
practical impact of this change is that ORC users no longer need to link MCJIT
to use ObjectCaches.

llvm-svn: 280616
2016-09-04 07:24:11 +00:00
Dorit Nuzman 7673ba7ac2 Test commit.
llvm-svn: 280615
2016-09-04 07:06:00 +00:00
Hal Finkel 73390c7acd [PowerPC] Zero-extend constants in FastISel
As it turns out, whether we zero-extend or sign-extend i8/i16 constants, which
are illegal types promoted to i32 on PowerPC, is a choice constrained by
assumptions within the infrastructure. Specifically, the logic in
FunctionLoweringInfo::ComputePHILiveOutRegInfo assumes that constant PHI
operands will be zero extended, and so, at least when materializing constants
that are PHI operands, we must do the same.

The rest of our fast-isel implementation does not appear to depend on the fact
that we were sign-extending i8/i16 constants, and all other targets also appear
to zero-extend small-bitwidth constants in fast-isel; we'll now do the same (we
had been doing this only for i1 constants, and sign-extending the others).

Fixes PR27721.

llvm-svn: 280614
2016-09-04 06:07:19 +00:00
Elad Cohen fb6358d2b5 [Modules] Add 'freestanding' to the 'requires-declaration' feature-list.
This adds support for modules that require (non-)freestanding
environment, such as the compiler builtin mm_malloc submodule.

Differential Revision: https://reviews.llvm.org/D23871

llvm-svn: 280613
2016-09-04 06:00:42 +00:00
Eric Fiselier 8e571b551e Apply curr_symbol.pass.cpp test fix to missed test case
llvm-svn: 280612
2016-09-04 04:09:25 +00:00
Craig Topper af0d63d2e7 [AVX-512] Remove masked integer add/sub/mull intrinsics and upgrade to native IR.
llvm-svn: 280611
2016-09-04 02:09:53 +00:00
Joseph Tremoulet e92e0a9042 Fix inliner funclet unwind memoization
Summary:
The inliner may need to determine where a given funclet unwinds to,
and this determination may depend on other funclets throughout the
funclet tree.  The code that performs this walk in getUnwindDestToken
memoizes results to avoid redundant computations.  In the case that
a funclet's unwind destination is derived from its ancestor, there's
code to walk back down the tree from the ancestor updating the memo
map of its descendants to record the unwind destination.  This change
fixes that code to account for the case that some descendant has a
different unwind destination, which can happen if that unwind dest
is a descendant of the EHPad being queried and thus didn't determine
its unwind destination.

Also update test inline-funclets.ll, which is supposed to cover such
scenarios, to include a case that fails an assertion without this fix
but passes with it.

Fixes PR29151.


Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D24117

llvm-svn: 280610
2016-09-04 01:23:20 +00:00
Joerg Sonnenberger b50b2fac9f Trailing dot that shouldn't have been committed.
llvm-svn: 280609
2016-09-04 00:51:02 +00:00
Eric Fiselier f49fe8f2b6 Fix bad locale test data when using the newest glibc
llvm-svn: 280608
2016-09-04 00:48:54 +00:00
Joerg Sonnenberger 82216f0faa PR 27200: Fix names of the atomic lock-free macros.
llvm-svn: 280607
2016-09-04 00:44:10 +00:00
Todd Fiala b1a503bd13 XFAIL TestGdbRemoteExitCode failing tests
Tracked by:
llvm.org/pr30271

llvm-svn: 280606
2016-09-04 00:43:10 +00:00
Marshall Clow 2a837eae39 Mark test as XFAIL for C++03, rather than providing a dummy pass.
llvm-svn: 280605
2016-09-04 00:37:06 +00:00
Todd Fiala e77fce0a50 [NFC] Darwin llgs support from Week of Code
This code represents the Week of Code work I did on bringing up
lldb-server LLGS support for Darwin.  It does not include the
Xcode project changes needed, as we don't want to throw that switch
until more support is implemented (i.e. this change is inert, no
build systems use it yet.  I've verified on Ubuntu 16.04, macOS
Xcode and macOS cmake builds).

This change does some minimal refactoring of code that is shared
with the Linux LLGS portion, moving it from NativeProcessLinux into
NativeProcessProtocol.  That code is also used by NativeProcessDarwin.

Current state on Darwin:
* Process launching is implemented.  (Attach is not).
  Launching on devices has not yet been tested (FBS/BKS might
  need a bit of work).
* Inferior waitpid monitoring and communication of exit status
  via MainLoop callback is implemented.
* Memory read/write, breakpoints, thread register context, etc.
  are not yet implemented.  This impacts process stop/resume, as
  the initial launch suspended immediately starts the process
  up and running because it doesn't know it is supposed to remain
  stopped.
* I implemented the equivalent of MachThreadList as
  NativeThreadListDarwin, in anticipation that we might want to
  factor out common parts into NativeThreadList{Protocol} and share
  some code here.  After writing it, though, the fallout from merging
  Mach Task/Process into a single concept plus some other minor
  changes makes the whole NativeThreadListDarwin concept nothing more
  than dead weight.  I am likely going to get rid of this class and
  just manage it directly in NativeProcessDarwin, much like I did
  for NativeProcessLinux.
* There is a stub-out call for starting a STDIO thread.  That will
  go away and adopt the MainLoop pselect-based IOObject reading.

I am developing the fully-integrated changes in the following repo,
which contains the necessary Xcode bits and the glue that enables
lldb-debugserver on a macOS system:

  https://github.com/tfiala/lldb/tree/llgs-darwin

This change also breaks out a few of the lldb-server tests into
their own directory, and adds some $qHostInfo tests (not sure why
I didn't write those tests back when I initially implemented that
on the Linux side).

llvm-svn: 280604
2016-09-04 00:18:56 +00:00
Craig Topper a57d2ca406 [X86] Combine some of the strings in autoupgrade code.
llvm-svn: 280603
2016-09-03 23:55:13 +00:00
Xinliang David Li 7a28a7fbd8 Cleanup : Use metadata preserving API for branch creation
Use the wrapper API in IRBuilder that does meta data copy
to create new branch in LoopUnswitch.

llvm-svn: 280602
2016-09-03 22:26:11 +00:00
Tobias Grosser 8d4cb1a060 ScopInfo: Do not derive assumptions from all GEP pointer instructions
... but instead rely on the assumptions that we derive for load/store
instructions.

Before we were able to delinearize arrays, we used GEP pointer instructions
to derive information about the likely range of induction variables, which
gave us more freedom during loop scheduling. Today, this is not needed
any more as we delinearize multi-dimensional memory accesses and as part
of this process also "assume" that all accesses to these arrays remain
inbounds. The old derive-assumptions-from-GEP code has consequently become
mostly redundant. We drop it both to clean up our code, but also to improve
compile time. This change reduces the scop construction time for 3mm in
no-asserts mode on my machine from 48 to 37 ms.

llvm-svn: 280601
2016-09-03 21:55:25 +00:00
Xinliang David Li 241e6c7086 [Profile] preserve branch metadata lowering select in CGP
CGP currently drops select's MD_prof profile data when
generating conditional branch which can lead to bad
code layout. The patch fixes the issue.

Differential Revision: http://reviews.llvm.org/D24169

llvm-svn: 280600
2016-09-03 21:26:36 +00:00
Mehdi Amini ebb3434850 Fix ThinLTO crash with debug info
Because the recent change about ODR type uniquing in the context,
we can reach types defined in another module during IR linking.
This triggered some assertions in case we IR link without starting
from an empty module. To alleviate that, we can self-map metadata
defined in the destination module so that they won't be visited.

Differential Revision: https://reviews.llvm.org/D23841

llvm-svn: 280599
2016-09-03 21:12:33 +00:00
Simon Pilgrim 3606d2346c Strip trailing whitespace
llvm-svn: 280598
2016-09-03 20:36:05 +00:00
Craig Topper f43e4a1728 [AVX-512] Remove masked integer mullo builtins and replace with native IR.
llvm-svn: 280597
2016-09-03 19:19:49 +00:00
Craig Topper 0e18976b8d [AVX-512] Remove masked integer add/sub builtins and replace with native IR.
llvm-svn: 280596
2016-09-03 18:29:35 +00:00
Matt Arsenault ac42ba8633 AMDGPU: Set sizes of spill pseudos
llvm-svn: 280595
2016-09-03 17:25:44 +00:00
Matt Arsenault 5ffe3e1d93 AMDGPU: Fix adding duplicate implicit exec uses
I'm not sure if this should be considered a bug in
copyImplicitOps or not, but implicit operands that are part
of the static instruction definition should not be copied.

llvm-svn: 280594
2016-09-03 17:25:39 +00:00
Craig Topper 907b580d72 [AVX-512] Add integer ADD/SUB instructions to load folding tables. Add an AVX512 stack folding test.
llvm-svn: 280593
2016-09-03 17:20:07 +00:00
Craig Topper 392cd0300d [AVX-512] Mark EVEX encoded vpcmpeq as commutable just like its AVX and SSE equivalent.
llvm-svn: 280592
2016-09-03 16:28:03 +00:00
Aaron Ballman bb18e91cdf Fix the attribute documentation build.
llvm-svn: 280591
2016-09-03 15:36:52 +00:00
Nicolai Haehnle 3bba6a8438 AMDGPU: Reduce the duration of whole-quad-mode
Summary:
This contains two changes that reduce the time spent in WQM, with the
intention of reducing bandwidth required by VMEM loads:

1. Sampling instructions by themselves don't need to run in WQM, only their
   coordinate inputs need it (unless of course there is a dependent sampling
   instruction). The initial scanInstructions step is modified accordingly.

2. When switching back from WQM to Exact, switch back as soon as possible.
   This affects the logic in processBlock.

This should always be a win or at best neutral.

There are also some cleanups (e.g. remove unused ExecExports) and some new
debugging output.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D22092

llvm-svn: 280590
2016-09-03 12:26:38 +00:00
Nicolai Haehnle a246dccc26 AMDGPU: Fix an interaction between WQM and polygon stippling
Summary:
This fixes a rare bug in polygon stippling with non-monolithic pixel shaders.

The underlying problem is as follows: the prolog part contains the polygon
stippling sequence, i.e. a kill. The main part then enables WQM based on the
_reduced_ exec mask, effectively undoing most of the polygon stippling.

Since we cannot know whether polygon stippling will be used, the main part
of a non-monolithic shader must always return to exact mode to fix this
problem.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23131

llvm-svn: 280589
2016-09-03 12:26:32 +00:00
Eric Fiselier ff94d25063 Fix PR30202 - notify_all_at_thread_exit seg faults if run from a raw pthread context.
Summary:
This patch allows threads not created using `std::thread` to use `std::notify_all_at_thread_exit` by ensuring the TL state has been initialized within `std::notify_all_at_thread_exit`.

Additionally this patch "fixes" a potential oddity in `__thread_local_pointer::reset(pointer)`, which would previously delete the old thread local data. However there should *never* be old thread local data because pthread *should* null it out on thread exit. Unfortunately it's possible that pthread failed to do this according to the spec:


> 
> Upon key creation, the value NULL shall be associated with the new key in all active threads. Upon thread creation, the value NULL shall be associated with all defined keys in the new thread.
> 
> An optional destructor function may be associated with each key value. At thread exit, if a key value has a non-NULL destructor pointer, and the thread has a non-NULL value associated with that key, the value of the key is set to NULL, and then the function pointed to is called with the previously associated value as its sole argument. The order of destructor calls is unspecified if more than one destructor exists for a thread when it exits.
> 
> If, after all the destructors have been called for all non-NULL values with associated destructors, there are still some non-NULL values with associated destructors, then the process is repeated. If, after at least {PTHREAD_DESTRUCTOR_ITERATIONS} iterations of destructor calls for outstanding non-NULL values, there are still some non-NULL values with associated destructors, implementations may stop calling destructors, or they may continue calling destructors until no non-NULL values with associated destructors exist, even though this might result in an infinite loop.

However if pthread fails to delete the value it is probably incorrect for us to do it. Destroying the value performs all of the "at thread exit" actions registered with it but we are way past "at thread exit".





Reviewers: mclow.lists, bcraig, EricWF

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D24159

llvm-svn: 280588
2016-09-03 08:07:40 +00:00
Niels Ole Salscheider b28cc458c3 Replace the Radeon GCN GPU family names by more descriptive ones
Differential Revision: https://reviews.llvm.org/D23957

llvm-svn: 280587
2016-09-03 07:13:54 +00:00
Matt Arsenault 46a0382ab2 AMDGPU: Do basic folding of class intrinsic
This allows more of the OCML builtin library to be
constant folded.

llvm-svn: 280586
2016-09-03 07:06:58 +00:00
Eric Fiselier ebbcfe7bfc memory_resource still needs init_priority when built with GCC 4.9
llvm-svn: 280585
2016-09-03 07:05:40 +00:00
Matt Arsenault 2510a31677 AMDGPU: Fix spilling of m0
readlane/writelane do not support using m0 as the output/input.
Constrain the register class of spill vregs to try to avoid this,
but also handle spilling of the physreg when necessary by inserting
an additional copy to a normal SGPR.

llvm-svn: 280584
2016-09-03 06:57:55 +00:00
Matt Arsenault f3d1a1a1b6 Improve debug error message with register name
llvm-svn: 280583
2016-09-03 06:57:49 +00:00
Craig Topper 892ce56901 [AVX-512] Add EVEX encoded VPCMPEQ and VPCMPGT to the load folding tables.
llvm-svn: 280581
2016-09-03 04:37:50 +00:00
Nico Weber 55fbe483d6 Add a test Aaron asked for that I forgot to add before landing r280578.
llvm-svn: 280580
2016-09-03 04:27:14 +00:00