Commit Graph

225415 Commits

Author SHA1 Message Date
Adam Nemet b0c4eae073 [LoopVectorize] Annotate versioned loop with noalias metadata
Summary:
Use the new LoopVersioning facility (D16712) to add noalias metadata in
the vector loop if we versioned with memchecks.  This can enable some
optimization opportunities further down the pipeline (see the included
test or the benchmark improvement quoted in D16712).

The test also covers the bug I had in the initial version in D16712.

The vectorizer did not previously use LoopVersioning.  The reason is
that the vectorizer performs its transformations in single shot.  It
creates an empty single-block vector loop that it then populates with
the widened, if-converted instructions.  Thus creating an intermediate
versioned scalar loop seems wasteful.

So this patch (rather than bringing in LoopVersioning fully) adds a
special interface to LoopVersioning to allow the vectorizer to add
no-alias annotation while still performing its own versioning.

As the vectorizer propagates metadata from the instructions in the
original loop to the vector instructions we also check the pointer in
the original instruction and see if LoopVersioning can add no-alias
metadata based on the issued memchecks.

Reviewers: hfinkel, nadav, mzolotukhin

Subscribers: mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17191

llvm-svn: 263744
2016-03-17 20:32:37 +00:00
Adam Nemet 5eccf07df3 [LoopVersioning] Annotate versioned loop with noalias metadata
Summary:
If we decide to version a loop to benefit a transformation, it makes
sense to record the now non-aliasing accesses in the newly versioned
loop.  This allows non-aliasing information to be used by subsequent
passes.

One example is 456.hmmer in SPECint2006 where after loop distribution,
we vectorize one of the newly distributed loops.  To vectorize we
version this loop to fully disambiguate may-aliasing accesses.  If we
add the noalias markers, we can use the same information in a later DSE
pass to eliminate some dead stores which amounts to ~25% of the
instructions of this hot memory-pipeline-bound loop.  The overall
performance improves by 18% on our ARM64.

The scoped noalias annotation is added in LoopVersioning.  The patch
then enables this for loop distribution.  A follow-on patch will enable
it for the vectorizer.  Eventually this should be run by default when
versioning the loop but first I'd like to get some feedback whether my
understanding and application of scoped noalias metadata is correct.

Essentially my approach was to have a separate alias domain for each
versioning of the loop.  For example, if we first version in loop
distribution and then in vectorization of the distributed loops, we have
a different set of memchecks for each versioning.  By keeping the scopes
in different domains they can conveniently be defined independently
since different alias domains don't affect each other.

As written, I also have a separate domain for each loop.  This is not
necessary and we could save some metadata here by using the same domain
across the different loops.  I don't think it's a big deal either way.

Probably the best is to review the tests first to see if I mapped this
problem correctly to scoped noalias markers.  I have plenty of comments
in the tests.

Note that the interface is prepared for the vectorizer which needs the
annotateInstWithNoAlias API.  The vectorizer does not use LoopVersioning
so we need a way to pass in the versioned instructions.  This is also
why the maps have to become part of the object state.

Also currently, we only have an AA-aware DSE after the vectorizer if we
also run the LTO pipeline.  Depending how widely this triggers we may
want to schedule a DSE toward the end of the regular pass pipeline.

Reviewers: hfinkel, nadav, ashutosh.nema

Subscribers: mssimpso, aemerson, llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16712

llvm-svn: 263743
2016-03-17 20:32:32 +00:00
Justin Bogner ae341c6e9b Bitcode: Error out instead of crashing on corrupt metadata
I hit a crash in the bitcode reader on some corrupt input where an
MDString had somehow been attached to an instruction instead of an
MDNode. This input is pretty bogus, but we shouldn't be crashing on bad
input here.

This change adds error handling in all of the places where we
currently have unchecked casts from Metadata to MDNode, which means
we'll error out instead of crashing for that sort of input.

Unfortunately, I don't have tests. Hitting this requires flipping bits
in the input bitcode, and committing corrupt binary files to catch
these cases is a bit too opaque and unmaintainable.

llvm-svn: 263742
2016-03-17 20:12:06 +00:00
Tim Northover 498c56c240 ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering.
llvm-svn: 263741
2016-03-17 20:10:28 +00:00
Reid Kleckner 4084504caa Revert "For MS ABI, emit dllexport friend functions defined inline in class"
This reverts commit r263738.

This appears to cause a failure in
CXX/temp/temp.decls/temp.friend/p1.cpp

llvm-svn: 263740
2016-03-17 20:06:58 +00:00
Kostya Serebryany c5575aabd6 [libFuzzer] deprecate several flags
llvm-svn: 263739
2016-03-17 19:59:39 +00:00
Reid Kleckner 0f6caf66e9 For MS ABI, emit dllexport friend functions defined inline in class
Summary: ...as that is apparently what MSVC does

Reviewers: rnk

Patch by Stephan Bergmann

Differential Revision: http://reviews.llvm.org/D15267

llvm-svn: 263738
2016-03-17 19:52:20 +00:00
Kostya Serebryany 23dbc390af [libFuzzer] add __attribute__((no_sanitize_memory)) to two functions that may be called from signal handler(s) or from msan. This will hopefully avoid msan false reports which I can't reproduce
llvm-svn: 263737
2016-03-17 19:42:35 +00:00
Michael J. Spencer 6979e74ce0 [msan fix] unitalized variable
llvm-svn: 263736
2016-03-17 19:16:54 +00:00
Stephane Sezer f81049184a Fix deadlock due to thread list locking in 'bt all' with obj-c
Summary:
The gdb-remote async thread cannot modify thread state while the main thread
holds a lock on the state. Don't use locking thread iteration for bt all.

Specifically, the deadlock manifests when lldb attempts to JIT code to
symbolicate objective c while backtracing. As part of this code path,
SetPrivateState() is called on an async thread. This async thread will
block waiting for the thread_list lock held by the main thread in
CommandObjectIterateOverThreads. The main thread will also block on the
async thread during DoResume (although with a timeout), leading to a
deadlock. Due to the timeout, the deadlock is not immediately apparent,
but the inferior will be left in an invalid state after the bt all completes,
and objective-c symbols will not be successfully resolved in the backtrace.

Reviewers: andrew.w.kaylor, jingham, clayborg

Subscribers: sas, lldb-commits

Differential Revision: http://reviews.llvm.org/D18075

Change by Francis Ricci <fjricci@fb.com>

llvm-svn: 263735
2016-03-17 18:52:41 +00:00
Guozhi Wei 7b390ec4cd [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 263734
2016-03-17 18:47:20 +00:00
Sanjoy Das c9058ca9e0 [Statepoints] Export a magic constant into a header; NFC
llvm-svn: 263733
2016-03-17 18:42:17 +00:00
David Blaikie fef5c5a27a Remove defaulted move ops, the type is zero-cost copyable anyway, so there's no need for specific move ops
(addresses MSVC build error, since MSVC 2013 can't generate default move
ops)

llvm-svn: 263732
2016-03-17 18:28:16 +00:00
Filipe Cabecinhas c80cb3d0c6 [lit] Enqueue tests on a separate thread to not hit limits on parallel queues
Summary:
The multiprocessing.Queue.put() call can hang if we try queueing all the
tests before starting to take them out of the queue.
The current implementation hangs if tests exceed 2^^15, on Mac OS X.
This might happen with a ninja check-all if one has a bunch of llvm
projects.

Reviewers: delcypher, bkramer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17609

llvm-svn: 263731
2016-03-17 18:27:33 +00:00
David Blaikie a9ff7a14ec Fix implicit copy ctor and copy assignment operator warnings when -Wdeprecated passed.
Fix implicit copy ctor and copy assignment operator warnings
when -Wdeprecated passed.

Patch by Don Hinton!

Differential Revision: http://reviews.llvm.org/D18123

llvm-svn: 263730
2016-03-17 18:05:07 +00:00
Valery Pykhtin c4546ec0cf [AMDGPU] add VI disassembler tests. NFC.
Autogenerated from the corresponding assembler tests with a few FIXME added (will fix soon).

Differential Revision: http://reviews.llvm.org/D18249

llvm-svn: 263729
2016-03-17 17:56:33 +00:00
Petar Jovanovic 0b44f24033 [PowerPC] Disable CTR loops optimization for soft float operations
This patch prevents CTR loops optimization when using soft float operations
inside loop body. Soft float operations use function calls, but function
calls are not allowed inside CTR optimized loops.

Patch by Aleksandar Beserminji.

Differential Revision: http://reviews.llvm.org/D17600

llvm-svn: 263727
2016-03-17 17:11:33 +00:00
Eugene Zelenko 05f7e6ae0d Fix Clang-tidy modernize-deprecated-headers warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D18231

llvm-svn: 263726
2016-03-17 17:02:25 +00:00
Derek Schuff d4207ba0f6 [WebAssembly] Stackify code emitted by eliminateFrameIndex and SP writeback
Summary:
MRI::eliminateFrameIndex can emit several instructions to do address
calculations; these can usually be stackified. Because instructions with
FI operands can have subsequent operands which may be expression trees,
find the top of the leftmost tree and insert the code before it, to keep
the LIFO property.

Also use stackified registers when writing back the SP value to memory
in the epilog; it's unnecessary because SP will not be used after the
epilog, and it results in better code.

Differential Revision: http://reviews.llvm.org/D18234

llvm-svn: 263725
2016-03-17 17:00:29 +00:00
David Majnemer 93bbc7cd66 [COFF] Use coff_section::getAlignment
Use LLVM's section alignment calculation instead of having LLD calculate
it.

llvm-svn: 263724
2016-03-17 16:58:08 +00:00
David Majnemer 07f7fe5a4c [COFF] Fix invalid alignment in tests
Some COFF tests used INT_MIN for the alignment of the directive section.
This is invalid; replace the alignment with something more sensible: 1.

llvm-svn: 263723
2016-03-17 16:56:31 +00:00
David Majnemer 511391feaa [COFF] Refactor section alignment calculation
Section alignment isn't completely trivial, let it live in one place so
that we may reuse it in LLVM.

llvm-svn: 263722
2016-03-17 16:55:18 +00:00
David Majnemer 62fed0c354 Forgot to commit this with r263692
llvm-svn: 263721
2016-03-17 16:55:11 +00:00
Changpeng Fang 234fcb81d3 AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute
Symmary:
  ds_permute/ds_bpermute do not read memory so s_waitcnt is not needed.

Reviewers
  arsenm, tstellarAMD

Subscribers
  llvm-commits, arsenm

Differential Revision:
  http://reviews.llvm.org/D18197

llvm-svn: 263720
2016-03-17 16:43:50 +00:00
Nicolai Haehnle 79cad857a0 AMDGPU: mark atomic instructions as sources of divergence
Summary:
As explained by the comment, threads will typically see different values
returned by atomic instructions even if the arguments are equal.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18156

llvm-svn: 263719
2016-03-17 16:21:59 +00:00
Benjamin Kramer f8bff1c31c Use a simpler set of mock headers for the vfs+modules crash recovery tests.
The System/ mock is large and too complex for this test. It can cause
the tests to fail in mysterious ways as it depends on the resource dir
being present, which is not really supported for driver tests (using
%clang instead of %clang_cc1). Copy the tree and trim out all the
%unnecessary fat.

llvm-svn: 263718
2016-03-17 16:19:51 +00:00
Simon Pilgrim 0f37fbac51 [X86][SSE] Simplified blend-with-zero combining
We were being too aggressive in trying to combine a shuffle into a blend-with-zero pattern, often resulting in a endless loop of contrasting combines

This patch stops the combine if we already have a blend in place (means we miss some domain corrections)

llvm-svn: 263717
2016-03-17 15:59:36 +00:00
Sanjay Patel 9e23fedaf0 propagate 'unpredictable' metadata on select instructions
This is similar to D18133 where we allowed profile weights on select instructions. 
This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects.

A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at:
http://reviews.llvm.org/rL263648

Differential Revision: http://reviews.llvm.org/D18220

llvm-svn: 263716
2016-03-17 15:30:52 +00:00
Saleem Abdulrasool 071a099102 ARM: Revert SVN r253865, 254158, fix windows division
The two changes together weakened the test and caused a regression with division
handling in MSVC mode.  They were applied to avoid an assertion being triggered
in the block frequency analysis.  However, the underlying problem was simply
being masked rather than solved properly.  Address the actual underlying problem
and revert the changes.  Rather than analyze the cause of the assertion, the
division failure was assumed to be an overflow.

The underlying issue was a subtle bug in the BB construction in the emission of
the div-by-zero check (WIN__DBZCHK).  We did not construct the proper successor
information in the basic blocks, nor did we update the PHIs associated with the
basic block when we split them.  This would result in assertions being triggered
in the block frequency analysis pass.

Although the original tests are being removed, the tests themselves performed
very little in terms of validation but merely tested that we did not assert when
generating code.  Update this with new tests that actually ensure that we do not
regress on the code generation.

llvm-svn: 263714
2016-03-17 14:10:49 +00:00
Daniel Jasper 97439926e4 clang-format: [JS] Make requoting of JavaScript string literals only
change affected ranges.

llvm-svn: 263713
2016-03-17 13:03:41 +00:00
Simon Atanasyan a8e9cb38ae [ELF][MIPS] Support R_MIPS_TLS_DTPREL_HI16/LO16 and R_MIPS_TLS_TPREL_HI16/LO16 relocations
That is initial and the most trivial patch to support TLS for MIPS targets.

llvm-svn: 263712
2016-03-17 12:36:08 +00:00
Simon Atanasyan 60d0555004 [ELF][MIPS] Add test case to check number of redundant entries in the local part of MIPS GOT. NFC.
llvm-svn: 263711
2016-03-17 12:36:00 +00:00
Daniel Jasper a4607e1b88 clang-format: [JS] Fix incorrect spacing around contextual keywords.
Before:
  x.of ();

After:
  x.of();

llvm-svn: 263710
2016-03-17 12:17:59 +00:00
Daniel Jasper 710f8493c8 clang-format: Slightly weaken AlignAfterOpenBracket=AlwaysBreak.
If a call takes a single argument, using AlwaysBreak can lead to lots
of wasted lines and additional indentation without improving the
readability in a significant way.

Before:
  caaaaaaaaaaaall(
      caaaaaaaaaaaall(
          caaaaaaaaaaaall(
              caaaaaaaaaaaaaaaaaaaaaaall(aaaaaaaaaaaaaa, aaaaaaaaa))));

After:
  caaaaaaaaaaaall(caaaaaaaaaaaall(caaaaaaaaaaaall(
      caaaaaaaaaaaaaaaaaaaaaaall(aaaaaaaaaaaaaa, aaaaaaaaa))));

llvm-svn: 263709
2016-03-17 12:00:22 +00:00
Simon Atanasyan 16f2460575 [llvm-objdump] Add REQUIRES x86 directive to fix buildbots
llvm-svn: 263708
2016-03-17 11:09:21 +00:00
George Rimar 177f7bcee8 Avoid using braces. NFC.
llvm-svn: 263707
2016-03-17 11:00:27 +00:00
Alexey Bataev e122da19ea [OPENMP 4.5] Allow to use private data members in 'copyprivate' clause.
OpenMP 4.5 allows privatization of non-static data members in non-static
member functions. This patch adds support of private data members in
'copyprivate' clauses.

llvm-svn: 263706
2016-03-17 10:50:17 +00:00
Simon Atanasyan 34223a7e5d [llvm-objdump] Add '0x' prefix to a target displacement number to accent its hex format
It might be hard to recognize a hexadecimal number without '0x' prefix.
Besides that '0x' prefix corresponds to GNU objdump behaviour.

Differential Revision: http://reviews.llvm.org/D18207

llvm-svn: 263705
2016-03-17 10:43:44 +00:00
Simon Atanasyan 58ee875296 [mips] Use `formatImm` call to print immediate value in the `MipsInstPrinter`
That allows, for example, to print hex-formatted immediates using
llvm-objdump --print-imm-hex command line option.

Differential Revision: http://reviews.llvm.org/D18195

llvm-svn: 263704
2016-03-17 10:43:36 +00:00
Scott Egerton d65377da78 [mips] Eliminate instances of "potentially uninitialised local variable" warnings, NFC
Summary:
This should eliminate all occurrences of this within LLVMMipsAsmParser.
This patch is in response to http://reviews.llvm.org/D17983. I was unable
to reproduce the warnings on my machine so please advise if this fixes the
warnings.

Reviewers: ariccio, vkalintiris, dsanders

Subscribers: dblaikie, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18087

llvm-svn: 263703
2016-03-17 10:37:51 +00:00
Wilfred Hughes 1bf1be9933 Remove obselete reference to TypeResolve from the tutorial.
TypeResolve went away in r134829 in 2011.

llvm-svn: 263702
2016-03-17 10:20:58 +00:00
Alexey Bataev a839dddf92 [OPENMP 4.0] Use 'declare reduction' constructs in 'reduction' clauses.
OpenMP 4.0 allows to define custom reduction operations using '#pragma
omp declare reduction' construct. Patch allows to use this custom
defined reduction operations in 'reduction' clauses.

llvm-svn: 263701
2016-03-17 10:19:46 +00:00
Wilfred Hughes b59b488e21 Minor grammar fix in kaleidoscope tutorial.
llvm-svn: 263700
2016-03-17 10:18:13 +00:00
Jonas Hahnfeld 626557be6b [libcxxabi] Disable cxa_thread_atexit_test if unavailable
The feature check is already in place when building the library but wasn't
honored for the tests.

Differential Revision: http://reviews.llvm.org/D18205

llvm-svn: 263699
2016-03-17 10:00:24 +00:00
Kuba Brecka 493028e8e2 Removing a non-intentional debug output that got committed in r263695.
llvm-svn: 263698
2016-03-17 09:27:40 +00:00
Wilfred Hughes aa1ea69c71 Further typo fixes in kaleidoscope tutorial.
llvm-svn: 263697
2016-03-17 09:26:45 +00:00
Wilfred Hughes 2d6b4e568e Fix typo in kaleidoscope tutorial.
llvm-svn: 263696
2016-03-17 09:09:07 +00:00
Kuba Brecka 4c80867ecf [sanitizer] On OS X, verify that interceptors work and abort if not, take 2
On OS X 10.11+, we have "automatic interceptors", so we don't need to use DYLD_INSERT_LIBRARIES when launching instrumented programs. However, non-instrumented programs that load TSan late (e.g. via dlopen) are currently broken, as TSan will still try to initialize, but the program will crash/hang at random places (because the interceptors don't work). This patch adds an explicit check that interceptors are working, and if not, it aborts and prints out an error message suggesting to explicitly use DYLD_INSERT_LIBRARIES.

TSan unit tests run with a statically linked runtime, where interceptors don't work. To avoid aborting the process in this case, the patch replaces `DisableReexec()` with a weak `ReexecDisabled()` function which is defined to return true in unit tests.

Differential Revision: http://reviews.llvm.org/D18212

llvm-svn: 263695
2016-03-17 08:37:25 +00:00
Junmo Park e19d6796ee Minor code cleanups. NFC.
llvm-svn: 263694
2016-03-17 06:41:27 +00:00
George Rimar 786e866fea [ELF] - -pie/--pic-executable option implemented
-pie
--pic-executable

Create a position independent executable.  This is currently only
 supported on ELF platforms.  Position independent executables are
 similar to shared libraries in that they are relocated by the
 dynamic linker to the virtual address the OS chooses for them
 (which can vary between invocations).  Like normal dynamically
 linked executables they can be executed and symbols defined in the
 executable cannot be overridden by shared libraries.

Differential revision: http://reviews.llvm.org/D18183

llvm-svn: 263693
2016-03-17 05:57:33 +00:00