Commit Graph

208684 Commits

Author SHA1 Message Date
Richard Smith d61d4acd70 [modules] Further simplification and speedup of redeclaration chain loading.
Instead of eagerly deserializing a list of DeclIDs when we load a module file
and doing a binary search to find the redeclarations of a decl, store a list of
redeclarations of each chain before the first declaration and load it directly.

llvm-svn: 245789
2015-08-22 20:13:39 +00:00
Eric Fiselier b17bb06914 [libcxx] Add new Sphinx documentation
Summary:
This patch adds Sphinx-based documentation to libc++. The goal is to make documentation for libc++ easier to write, since writing new documentation in HTML is cumbersome. This patch rewrites the main page for libc++ along with the instructions for using, building, and testing libc++.

The built documentation can be found and reviewed here: http://efcs.ca/libcxx-docs

To build the Sphinx documentation, specify the CMake options `-DLLVM_ENABLE_SPHINX=ON -DLIBCXX_INCLUDE_DOCS=ON`. This adds the makefile rule `docs-libcxx-html`.

Reviewers: chandlerc, mclow.lists, danalbert, jroelofs

Subscribers: silvas, cfe-commits

Differential Revision: http://reviews.llvm.org/D12129

llvm-svn: 245788
2015-08-22 19:40:49 +00:00
Jingyue Wu 284ebe237f [CUDA] Change initializer for CUDA device code based on CUDA documentation.
Summary:
According to CUDA documentation, global variables declared with __device__ or
__constant__ can be initialized from host code, so mark them as
externally initialized. Because __shared__ variables cannot have an
initialization as part of their declaration and since the value may be kept
across different kernel invocations, the value of __shared__ is effectively
undefined instead of zero-initialized.

Wrongly using a zero initializer may cause illegitimate optimizations, e.g.
removing an unused __constant__ variable because it's not updated in the device
code and the value is initialized with zero.

Test Plan: test/CodeGenCUDA/address-spaces.cu

Patch by Xuetian Weng

Reviewers: jholewinski, eliben, tra, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12241

llvm-svn: 245786
2015-08-22 05:49:28 +00:00
Jingyue Wu fcec09866a [NVPTX] Allow undef value as global initializer
Summary:
A __shared__ variable may now emit an undef value as its initializer; do not
report an error for that.

Test Plan: test/CodeGen/NVPTX/global-addrspace.ll

Patch by Xuetian Weng

Reviewers: jholewinski, tra, jingyue

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D12242

llvm-svn: 245785
2015-08-22 05:40:26 +00:00
Alexey Samsonov 4369a3f4ad Revert r245770 and r245777.
These changes break both the autoconf Mac OS X buildbot (linker errors
due to wrong Makefiles) and the CMake buildbot (safestack test failures).

llvm-svn: 245784
2015-08-22 05:15:55 +00:00
NAKAMURA Takumi 6fa8532041 [CMake] add_llvm_external_project: Just warn about nonexistent directories.
These entries were generated accidentally.

llvm-svn: 245783
2015-08-22 05:11:02 +00:00
NAKAMURA Takumi 04e2da526f [CMake] Make LLVM_EXTERNAL_*_SOURCE_DIR consistent against older buildsites.
If the corresponding in-tree subdirectory exists, just ignore the LLVM_EXTERNAL_* settings.
Otherwise, set LLVM_TOOL_*_BUILD ON/OFF properly according to LLVM_EXTERNAL_*.

This makes it easier to walk among old revisions *without* deleting CMakeCache.txt.

Before r242059, LLVM_EXTERNAL_* worked like this:

  if(EXISTS ${*_SOURCE_DIR}/CMakeLists.txt)
    set(*_BUILD ON CACHE)
    if(*_BUILD is ON)
      add_subdirectory(*_SOURCE_DIR)
    endif()
  endif()

llvm-svn: 245782
2015-08-22 04:53:52 +00:00
Peter Collingbourne c7b675f48c LTO: Maintain target triple, FeatureStr and CGOptLevel in the module or LTOCodeGenerator.
This makes it easier to create new TargetMachines on demand.

llvm-svn: 245781
2015-08-22 02:25:53 +00:00
Richard Smith 0144f5154b [modules] Remove some dead code after r245779.
llvm-svn: 245780
2015-08-22 02:09:38 +00:00
Richard Smith d8a83718f0 [modules] Rearrange how redeclaration chains are loaded, to remove a walk over
all modules and reduce the number of declarations we load when loading a
redeclaration chain.

The new approach is:
 * when loading the first declaration of an entity within a module file, we
   first load all declarations of the entity that were imported into that
   module file, and then load all the other declarations of that entity from
   that module file and build a suitable decl chain from them
 * when loading any other declaration of an entity, we first load the first
   declaration from the same module file

As before, we complete redecl chains through name lookup where necessary.

To make this work, I also had to change the way that template specializations
are stored -- it no longer suffices to track only canonical specializations; we
now emit all "first local" declarations when emitting a list of specializations
for a template.

On one testcase with several thousand imported module files, this reduces the
total runtime by 72%.

llvm-svn: 245779
2015-08-22 01:47:18 +00:00
Ahmed Bougacha 86da3d8d7c [ARM NEON] Remove special-case for f16 vcvt handling. NFCI.
We can use the 'H' typespec modifier to use 128-bit vectors directly
in the only two users of this special-case: the vcvt f16 intrinsics.
This also lets us use more meaningful prototype modifiers.

llvm-svn: 245778
2015-08-22 01:30:13 +00:00
Alexey Samsonov 74683fe519 Try to fix Mac build.
llvm-svn: 245777
2015-08-22 01:07:05 +00:00
Alexey Samsonov eb4fe7883f [TSan] Support __sanitizer_set_death_callback().
llvm-svn: 245776
2015-08-22 01:07:02 +00:00
Matt Arsenault 0a3ac1be43 AMDGPU: Allow specifying different opcode on VI for SMRD/SMEM
Although the basic s_load_* instructions happen to use the same
opcode, some of the special case SMRD instructions have
different opcodes.

llvm-svn: 245775
2015-08-22 00:54:31 +00:00
Matt Arsenault e8df879948 AMDGPU: Improve accuracy of instruction rates for some FP instructions
llvm-svn: 245774
2015-08-22 00:50:41 +00:00
Evgeniy Stepanov d6376d9875 [asan] Don't apply glibc-specific TLS calculations on Android.
This fixes an infinite recursion between GetTls and GetTlsSize on
Android-x86.

llvm-svn: 245773
2015-08-22 00:47:12 +00:00
Matt Arsenault 33010103b7 AMDGPU: Use DFS to avoid second loop over function
llvm-svn: 245772
2015-08-22 00:43:38 +00:00
John McCall ee04aebbdb When building a pseudo-object assignment, and the RHS is
a contextually-typed expression that semantic analysis will
probably need to invasively rewrite, don't include the
RHS OVE as a separate semantic expression, and check the
operation with the original RHS expression.

There are two contextually-typed expressions that can survive
to here: overloaded function references, which are at least
safe to double-emit, and C++11 initializer list expressions,
which are not at all safe to double-emit and which often
don't update the original syntactic InitListExpr with
implicit conversions to member types, etc.

This means that the original RHS may appear, undecorated by
an OVE, in the semantic expressions.  Fortunately, it will
only ever be used in a single place there, and I don't
believe there are clients that rely on being able to pick
out the original RHS from the semantic expressions.
But this could be problematic if there are clients that do
visit the entire tree and rely on not seeing the same
expression multiple times, once in the syntactic and once
in the semantic expressions.  This is a very fiddly part
of the compiler.

rdar://21801088

llvm-svn: 245771
2015-08-22 00:35:27 +00:00
Alexey Samsonov 8e38c71cb7 [Sanitizer] Dump coverage if we're killing the program with __sanitizer::Die().
Previously we had to call __sanitizer_cov_dump() from tool-specific
callbacks - instead, let sanitizer_common library handle this in a single place.

llvm-svn: 245770
2015-08-22 00:28:12 +00:00
Matt Arsenault c8d8e4ed76 AMDGPU: Make sure to run verifier after SIFixSGPRLiveRanges
llvm-svn: 245769
2015-08-22 00:19:34 +00:00
Matt Arsenault aba29d6ab1 AMDGPU: Improve debug printing in SIFixSGPRLiveRanges
llvm-svn: 245768
2015-08-22 00:19:25 +00:00
Matt Arsenault 6adf07a92e AMDGPU: Move CI instructions into CIInstructions.td
There are still a couple of CI patterns left in SIInstructions.

llvm-svn: 245767
2015-08-22 00:16:34 +00:00
Alexey Samsonov b6a604aea5 [DFSan] Remove nolibc build.
It's not used now, and is not even included in the "dfsan" target.

llvm-svn: 245766
2015-08-21 23:58:45 +00:00
Zachary Turner 21708cf970 Revert "Implement basic DidAttach and DidLaunch for DynamicLoaderWindowsDYLD."
This reverts commit 7749a10ddbe22767d0e055753c674fcde7f28d39.

This commit introduces about 15-20 new test failures with windows local
targets.

llvm-svn: 245765
2015-08-21 23:57:25 +00:00
Matt Arsenault f56872dc30 AMDGPU: Minor cleanups to help with f16 support
The main change is inverting the condition for the
operand class classes so that VT.Size == 16 uses VGPR_32
instead of 64.

llvm-svn: 245764
2015-08-21 23:49:51 +00:00
Ahmed Bougacha cd5b8a0235 [ARM NEON] Use the common naming scheme for vcvt f16 builtins. NFC.
We had "vcvt_f16" and "VCVT_HIGH_F16": for other FP types, this naming
is used for intrinsics with integer overloads. The FP->FP conversions,
on the other hand, use the full "vcvt_f32_f64" name instead.

Use the same naming convention for the f16<->f32 conversions.
While there, reorder the definitions a little bit.

llvm-svn: 245763
2015-08-21 23:34:20 +00:00
JF Bastien 057292a76c Improve the determinism of MergeFunctions
Summary:

Merge functions previously relied on unsigned comparisons of pointer values to
order functions. This caused observable non-determinism in the compiler for
large bitcode programs. Basically, opt -mergefuncs program.bc | md5sum produces
different hashes when run repeatedly on the same machine. Differing output was
observed on three large bitcodes, but it was less frequent on the smallest file.
It is possible that this only manifests on the large inputs, hence remaining
undetected until now.

This patch fixes this by removing (almost, see below) all places where
comparisons between pointers are used to order functions. Most of these changes
are local, but the comparison of global values requires assigning an identifier
to each global in the order it is visited. This is very similar to the way the
comparison function identifies Value*'s defined within a function. Because the
order of visiting the functions and their subparts is deterministic, the
identifiers assigned to the globals will be as well, and the order of functions
will be deterministic.

With these changes, there is no more observed non-determinism. There are also
only minor slowdowns (negligible to 4%) compared to the baseline, which are
likely a result of the fact that global comparisons involve hash lookups and not
just pointer comparisons.

The one caveat so far is that programs containing BlockAddress constants can
still be non-deterministic. It is not clear what the right solution is here. In
particular, even if the global numbers are used to order by function, we still
need a way to order the BasicBlock*'s. Unfortunately, we cannot just bail out
and fail to order the functions or consider them equal, because we require a
total order over functions. Note that programs with BlockAddress constants are
relatively rare, so the impact of leaving this in is minor as long as this pass
is opt-in.

Author: jrkoenig

Reviewers: nlewycky, jfb, dschuff

Subscribers: jevinskie, llvm-commits, chapuni

Differential revision: http://reviews.llvm.org/D12168

llvm-svn: 245762
2015-08-21 23:27:24 +00:00
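
The numbering idea described in r245762 can be illustrated with a small model. This is a minimal sketch, not code from the patch: `Global`, `GlobalNumbering`, and `visit_numbers` are hypothetical names, and Python object identity stands in for pointer values. Each global receives the next free number the first time it is visited, so the result depends only on visitation order and not on where the objects happen to live in memory.

```python
class Global:
    """Stand-in for a global value; hashed by identity, like a pointer key."""

    def __init__(self, name):
        self.name = name


class GlobalNumbering:
    """Hypothetical helper: hands out identifiers in first-visit order."""

    def __init__(self):
        self._ids = {}

    def number_of(self, global_value):
        # The first visit assigns the next free number; later visits reuse it.
        return self._ids.setdefault(global_value, len(self._ids))


def visit_numbers(globals_in_visit_order):
    """Return the identifier seen at each step of a deterministic traversal."""
    numbering = GlobalNumbering()
    return [numbering.number_of(g) for g in globals_in_visit_order]


# Two independent "runs" allocate fresh objects (different addresses) but
# visit them in the same order, so the assigned numbers agree.
f1, g1 = Global("f"), Global("g")
f2, g2 = Global("f"), Global("g")
run1 = visit_numbers([g1, f1, g1])
run2 = visit_numbers([g2, f2, g2])
```

Ordering functions by these numbers, rather than by address, is what removes the run-to-run variation in this model; the dictionary lookup is also a plausible analogue of the hash lookups the message blames for the small slowdown.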
Ahmed Bougacha 22a16965d6 [ARM NEON] Factor out FP-prototype checking. NFC.
llvm-svn: 245761
2015-08-21 23:24:18 +00:00
Adam Nemet 4e533ef7a9 [LAA] Hold bounds via ValueHandles during SCEV expansion
SCEV expansion can invalidate previously expanded values.  For example
in SCEVExpander::ReuseOrCreateCast, if we already have the requested
cast value but it's not at the desired location, a new cast is inserted
and the old cast will be invalidated.

Therefore, when expanding the bounds for the pointers, a later entry can
invalidate the IR value for an earlier one.  The fix is to store a value
handle rather than the value itself.

The newly added test has a more detailed description of how the bug
triggers.

This bug can have a negative but potentially highly variable performance
impact in Loop Distribution.  Because one of the bound values was
invalidated and is an undef expression now, InstCombine is free to
transform the array overlap check:

   Start0 <= End1 && Start1 <= End0

into:

   Start0 <= End1

So depending on the runtime location of the arrays, we would detect a
conflict and fall back on the original loop of the versioned loop.

Also tested compile time with SPEC2006 LTO bc files.

llvm-svn: 245760
2015-08-21 23:19:57 +00:00
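
The effect of the weakened overlap check in r245760 can be seen with concrete bounds. This is a minimal sketch under assumed names (`ranges_conflict` and `weakened_check` are illustrative, and plain integers stand in for the pointer bounds); it only models the two boolean expressions quoted in the message.

```python
def ranges_conflict(start0, end0, start1, end1):
    """Full check: the two closed ranges overlap."""
    return start0 <= end1 and start1 <= end0


def weakened_check(start0, end0, start1, end1):
    """What remains once the second comparison is folded away."""
    return start0 <= end1


# Array 0 placed below array 1: disjoint, but the weakened check still
# reports a conflict, so the versioned loop would fall back to the original.
low_first = (0, 50, 100, 199)

# Array 0 placed above array 1: both checks agree there is no conflict.
high_first = (100, 199, 0, 50)
```

Which of the two layouts occurs depends on where the arrays end up at run time, which matches the "highly variable" performance impact described in the message.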
Tyler Nowicki 552a62fabc Standardized 'failed' to 'Failed' in LoopVectorizationRequirements.
llvm-svn: 245759
2015-08-21 23:03:24 +00:00
Alexey Samsonov fc95c85cb5 [LSan] Support __sanitizer_set_death_callback in standalone LSan.
llvm-svn: 245758
2015-08-21 23:00:30 +00:00
Alex Lorenz ea788c4bf8 MIRLangRef: Add 'MIR Testing Guide' section.
llvm-svn: 245757
2015-08-21 22:58:33 +00:00
Peter Collingbourne 44ee84eec5 LTO: Change signature of LTOCodeGenerator::setCodePICModel() to take a Reloc::Model.
This allows us to remove a bunch of code in LTOCodeGenerator and llvm-lto
and has the side effect of improving error handling in the libLTO C API.

llvm-svn: 245756
2015-08-21 22:57:17 +00:00
Tom Stellard bd8a0856e2 AMDGPU/SI: Better handle s_wait insertion
We can wait on either VM, EXP or LGKM.
The waits are independent.

Without this patch, a wait inserted because of one of them
would also wait for all the previous others.
This patch makes s_wait only wait for the ones we need for the next
instruction.

Here's an example of subtle perf reduction this patch solves:

This is without the patch:

buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen
buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen
s_load_dwordx4 s[44:47], s[8:9], 0xc
s_waitcnt lgkmcnt(0)
buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen
s_load_dwordx4 s[48:51], s[8:9], 0x10
s_waitcnt vmcnt(1)
buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen

The s_waitcnt vmcnt(1) is useless.
It is added because the last buffer_load_format_xyzw needs s[44:47],
which is loaded by the first s_load_dwordx4. The inserted wait covers
all VM operations issued before that s_load_dwordx4.

Internally, 3 counters (for VM, EXP and LGKM) are updated after every
instruction. For example, buffer_load_format_xyzw will increase the VM
counter, and s_load_dwordx4 the LGKM one.

Without the patch, for every defined register,
the current 3 counters are stored, and are used to know
how long to wait when an instruction needs the register.

Because of that, the counters stored for s[44:47] imply that using the
register requires waiting for the previous buffer_load_format_xyzw
instructions.

Instead, this patch stores only the counters that matter for the
register, and puts zero for the other ones, since we don't need any
wait for them.

Patch by: Axel Davy

Differential Revision: http://reviews.llvm.org/D11883

llvm-svn: 245755
2015-08-21 22:47:27 +00:00
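
The counter bookkeeping described in r245755 can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the code of the pass: `WaitModel`, `issue`, `waits_for`, and `replay` are hypothetical names, the model ignores waits that were already performed (such as the earlier `s_waitcnt lgkmcnt(0)`), and it ignores hardware counter limits. A stored value of zero means "no wait needed for this counter".

```python
KINDS = ("VM", "EXP", "LGKM")


class WaitModel:
    """Toy model of per-register wait tracking (hypothetical, simplified)."""

    def __init__(self, per_kind_only):
        # per_kind_only=False models the old behaviour (store all counters);
        # per_kind_only=True models the patched behaviour.
        self.per_kind_only = per_kind_only
        self.issued = dict.fromkeys(KINDS, 0)
        self.defined_at = {}

    def issue(self, kind, defs=()):
        """Issue one instruction that bumps `kind` and defines `defs`."""
        self.issued[kind] += 1
        if self.per_kind_only:
            snapshot = {k: self.issued[k] if k == kind else 0 for k in KINDS}
        else:
            snapshot = dict(self.issued)
        for reg in defs:
            self.defined_at[reg] = snapshot

    def waits_for(self, reg):
        """Map each counter that needs a wait to the outstanding count allowed."""
        snap = self.defined_at[reg]
        return {k: self.issued[k] - snap[k] for k in KINDS if snap[k] > 0}


def replay(per_kind_only):
    """Replay the instruction kinds from the example in the message."""
    m = WaitModel(per_kind_only)
    m.issue("VM")                        # buffer_load_format_xyzw v[8:11]
    m.issue("VM")                        # buffer_load_format_xyzw v[12:15]
    m.issue("LGKM", defs=["s[44:47]"])   # s_load_dwordx4 s[44:47]
    m.issue("VM")                        # buffer_load_format_xyzw v[16:19]
    m.issue("LGKM", defs=["s[48:51]"])   # s_load_dwordx4 s[48:51]
    return m.waits_for("s[44:47]")       # needed by the last buffer_load


old_waits = replay(per_kind_only=False)
new_waits = replay(per_kind_only=True)
```

In this model the old scheme asks for a VM wait with one outstanding operation allowed, which corresponds to the `s_waitcnt vmcnt(1)` in the listing, while the patched scheme only tracks the LGKM counter for `s[44:47]`.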
Alexey Samsonov 540ac1aab4 [MSan] Deprecate __msan_set_death_callback() in favor of __sanitizer_set_death_callback().
llvm-svn: 245754
2015-08-21 22:45:12 +00:00
Sanjoy Das c86c162a58 Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0"
The original checkin was buggy; this change has a fix.

Original commit message:

[InstCombine] Transform A & (L - 1) u< L --> L != 0

Summary:

This transform is never a pessimization at the IR level (since it
replaces an `icmp` with another), and has potential payoffs:

 1. It may make the `icmp` fold away or become loop invariant.
 2. It may make the `A & (L - 1)` computation dead.

This shows up in Java, in range checks generated by array accesses of
the form `a[i & (a.length - 1)]`.

Reviewers: reames, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12210

llvm-svn: 245753
2015-08-21 22:22:37 +00:00
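
The equivalence behind the r245753 transform can be checked exhaustively at a small bit width. The following is a minimal sketch, not code from the patch: it models unsigned 8-bit arithmetic, and the names `BITS`, `lhs`, `rhs`, and `equivalent_everywhere` are illustrative assumptions.

```python
BITS = 8
MASK = (1 << BITS) - 1


def lhs(a, l):
    """(A & (L - 1)) u< L, with L - 1 wrapping like unsigned arithmetic."""
    return (a & ((l - 1) & MASK)) < l


def rhs(l):
    """L != 0"""
    return l != 0


def equivalent_everywhere():
    """Compare both sides for every pair of 8-bit unsigned values."""
    values = range(1 << BITS)
    return all(lhs(a, l) == rhs(l) for a in values for l in values)
```

The reasoning matches the two cases: when `L` is zero, `L - 1` wraps to all ones, so the left side becomes `A u< 0`, which is false; when `L` is non-zero, `A & (L - 1)` is at most `L - 1`, so the comparison is always true.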
David Blaikie 47bf5c019d Range-for-ify some things in GlobalMerge
llvm-svn: 245752
2015-08-21 22:19:06 +00:00
Zachary Turner 2156787b96 XFAIL pthreads test on Windows.
This test needs to be ported to C++ threads.

llvm-svn: 245751
2015-08-21 22:11:50 +00:00
Zachary Turner b20bafaec8 Fix TestPaths on Windows.
llvm-svn: 245750
2015-08-21 22:11:40 +00:00
Zachary Turner 7c9e1c0bef XFAIL the last Windows test that calls a function in the target.
llvm-svn: 245749
2015-08-21 22:11:31 +00:00
Zachary Turner 6df0cbc040 Skip TestCreateAfterAttach on Windows.
As with every other platform, this test occasionally hangs on
Windows.

llvm-svn: 245748
2015-08-21 22:11:21 +00:00
Zachary Turner 28ee0cd30e XFAIL Tests that require C++ exceptions on Windows.
clang-cl does not yet support C++ exceptions, so these tests will
not even compile.

Re-enabling these tests is tracked by llvm.org/pr24538

llvm-svn: 245747
2015-08-21 22:11:09 +00:00
David Blaikie 9ed57a9ef0 [opaque pointer types] Fix a few easy places in GlobalMerge that were accessing value types through pointee types
llvm-svn: 245746
2015-08-21 22:00:44 +00:00
Alex Lorenz c1136ef3b8 MIR Serialization: Serialize the pointer IR expression values in the machine
memory operands.

llvm-svn: 245745
2015-08-21 21:54:12 +00:00
Vedant Kumar 366dd9fd2b [ARM] Fix MachO CPU Subtype selection
Differential Revision: http://reviews.llvm.org/D12040

llvm-svn: 245744
2015-08-21 21:52:48 +00:00
Alex Lorenz 5d8b0bd9b0 MIRParser: Split the 'parseIRConstant' method into two methods. NFC.
One variant of this method can be reused when parsing the quoted IR pointer
expressions in the machine memory operands.

llvm-svn: 245743
2015-08-21 21:48:22 +00:00
David Blaikie d583b19569 [opaque pointer types] Push the passing of value types up from Function/GlobalVariable to GlobalObject
(coming next, pushing this up into GlobalValue, so it can store the
value type directly)

llvm-svn: 245742
2015-08-21 21:35:28 +00:00
Hal Finkel ff9639d6b7 [PowerPC] PPCVSXFMAMutate should not segfault on undef input registers
When PPCVSXFMAMutate would look at the input addend register, it would get its
input value number. This would fail, however, if the register was undef,
causing a segfault. Don't segfault (just skip such FMA instructions).

Fixes the test case from PR24542 (although that may have been over-reduced).

llvm-svn: 245741
2015-08-21 21:34:24 +00:00
Alex Lorenz 1de2acd3c2 AsmParser: Save and restore the parsing state for types using SlotMapping.
This commit extends the 'SlotMapping' structure and includes mappings for named
and numbered types in it. The LLParser is extended accordingly to fill out
those mappings at the end of module parsing.

This information is useful when we want to parse standalone constant values
at a later stage using the 'parseConstantValue' method. The constant values
can be constant expressions, which can contain references to types. In order
to parse such constant values, we have to restore the internal named and
numbered mappings for the types in LLParser, otherwise the parser will report
a parsing error. Therefore, this commit also introduces a new method called
'restoreParsingState' to LLParser, which uses the slot mappings to restore
some of its internal parsing state.

This commit is required to serialize constant value pointers in the machine
memory operands for the MIR format.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 245740
2015-08-21 21:32:39 +00:00
Bruno Cardoso Lopes 7a1483e7d1 [LVI] Use a SmallVector instead of SmallPtrSet. NFC
llvm-svn: 245739
2015-08-21 21:18:26 +00:00