We also disable this feature by default, as there are still some issues in
combination with invariant load hoisting that slipped through my initial
testing.
llvm-svn: 260025
Invariant load hoisting of memory accesses with non-canonical element
types lacks support for equivalence classes that contain elements of
different width/size. This support should be added, but to get our buildbots
back to green, we disable load hoisting for memory accesses with non-canonical
element size for now.
llvm-svn: 260023
Adjust the driver to invoke the linker more similar to gcc. -dynamic-linker is
only passed if -static and -shared are not part of the compiler (driver)
invocation. Replicate the passing of -export-rdynamic as per the GCC link spec:
%{!static: %{rdynamic:-export-dynamic} %{!shared:-dynamic-linker ...}}
This behaviour is consistent across all the targets that are supported, so no
need to conditionalise it on the target.
Resolves PR24245.
llvm-svn: 260019
The ivar ref would be transformed by the Typo Correction TreeTransform, but not
be owned, resulting in the source location being invalid. This would eventually
lead to an assertion in findCapturingExpr. Prevent this assertion from
triggering.
Resolves PR25113.
llvm-svn: 260017
We would previously assert in findCapturingExpr when performing a typo
correction resulting in an assignment of an ObjC property with a strong lifetype
specifier due to the expression not being rooted in the file (invalid SLoc)
during the retain cycle check on the typo-corrected expression. Handle the
expression type appropriately during the TreeTransform to ensure that we have a
source location associated with the expression.
Fixes PR26486.
llvm-svn: 260016
This patch is the first in a series of patches that's meant to better
support unordered_map. unordered_map has a special "value_type" that
differs from pair<const Key, Value>. In order to meet the EmplaceConstructible
and CopyInsertable requirements we need to teach __hash_table about this
special value_type.
This patch creates a "__hash_node_types" traits class that contains
all of the typedefs needed by the unordered containers and it's iterators.
These typedefs include ones for each node type and node pointer type,
as well as special typedefs for "unordered_map"'s value type.
As a result of this change all of the unordered containers now all support
incomplete types.
As a drive-by fix I changed the difference_type in __hash_table to always
be ptrdiff_t. There is a corresponding change to size_type but it cannot
take affect until an ABI break.
This patch will be followed up shortly with fixes for various unordered_map
fixes.
llvm-svn: 260012
Always use access-instruction pointer type to load the invariant values.
Otherwise mismatches between ScopArrayInfo element type and memory access
element type will result in invalid casts. These type mismatches are after
r259784 a lot more common and also arise with types of different size, which
have not been handled before.
Interestingly, this change actually simplifies the code, as we now have only
one code path that is always taken, rather then a standard code path for the
common case and a "fixup" code path that replaces the standard code path in
case of mismatching types.
llvm-svn: 260009
Summary:
This is the attribute purpose-made for e.g. __syncthreads. It appears
that NoDuplicate may not be sufficient to prevent Sink from touching a
call to __syncthreads.
Reviewers: jingyue, hfinkel
Subscribers: llvm-commits, jholewinski, jhen, rnk, tra, majnemer
Differential Revision: http://reviews.llvm.org/D16941
llvm-svn: 260005
It is common for the ivars for read-only assign properties to always be stored retained,
so don't warn for a release in dealloc for the ivar backing these properties.
llvm-svn: 259998
First step towards being able to decode AVX512 PMOVZX instructions without a massive bloat in the shuffle decode switch statement.
This should also make it easier to decode X86ISD::VZEXT target shuffles in the future.
llvm-svn: 259995
Summary:
Adds the linkage type to both the per-module and combined function
summaries, which subsumes the current islocal bit. This will eventually
be used to optimized linkage types based on global summary-based
analysis.
Reviewers: joker.eph
Subscribers: joker.eph, davidxl, llvm-commits
Differential Revision: http://reviews.llvm.org/D16943
llvm-svn: 259993
The previously implemented approach is to follow value definitions and
create write accesses ("push defs") while searching for uses. This
requires the same relatively validity- and requirement conditions to be
replicated at multiple locations (PHI instructions, other instructions,
uses by PHIs).
We replace this by iterating over the uses in a SCoP ("pull in
requirements"), and add writes only when at least one read has been
added. It turns out to be simpler code because each use is only iterated
over once and writes are added for the first access that reads it. We
need another iteration to identify escaping values (uses not in the
SCoP), which also makes the difference between such accesses more
obvious. As a side-effect, the order of scalar MemoryAccess can change.
Differential Revision: http://reviews.llvm.org/D15706
llvm-svn: 259987
Summary:
When alias analysis is uncertain about the aliasing between any two accesses,
it will return MayAlias. This uncertainty from alias analysis restricts LICM
from proceeding further. In cases where alias analysis is uncertain we might
use loop versioning as an alternative.
Loop Versioning will create a version of the loop with aggressive aliasing
assumptions in addition to the original with conservative (default) aliasing
assumptions. The version of the loop making aggressive aliasing assumptions
will have all the memory accesses marked as no-alias. These two versions of
loop will be preceded by a memory runtime check. This runtime check consists
of bound checks for all unique memory accessed in loop, and it ensures the
lack of memory aliasing. The result of the runtime check determines which of
the loop versions is executed: If the runtime check detects any memory
aliasing, then the original loop is executed. Otherwise, the version with
aggressive aliasing assumptions is used.
The pass is off by default and can be enabled with command line option
-enable-loop-versioning-licm.
Reviewers: hfinkel, anemet, chatur01, reames
Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga,
llvm-commits
Differential Revision: http://reviews.llvm.org/D9151
llvm-svn: 259986
user process dyld binary and/or a mach kernel binary image. By
default, it prefers the kernel if it finds both.
But if it finds two kernel binary images (which can happen when
random things are mapped into memory), it may pick the wrong
kernel image.
DynamicLoaderDarwinKernel has heuristics to find a kernel in memory;
once we've established that there is a kernel binary in memory,
call over to that class to see if it can find a kernel address via
its search methods. If it does, use that.
Some minor cleanups to DynamicLoaderDarwinKernel while I was at it.
<rdar://problem/24446112>
llvm-svn: 259983
Summary:
Different devices may in some cases require different code generation schemes in order to implement OpenMP. This is required not only for performance reasons, but also because it may not be possible to have the current (default) implementation working for these devices. E.g. GPU's cannot implement the same scheme a target such as powerpc or x86b would use, in the sense that it does not have the ability to fork threads, instead all the threads are always executing and need to be managed by the implementation.
This patch proposes a reorganization of the code in the OpenMP code generation to pave the way to have specialized implementation of OpenMP support. More than a "real" patch this is more a request for comments in order to understand if what is proposed is acceptable or if there are better/easier ways to do it.
In this patch part of the common OpenMP codegen infrastructure is moved to a new file under a new namespace (CGOpenMPCommon) so it can be shared between the default implementation and the specialized one. When CGOpenMPRuntime is created, an attempt to select a specialized implementation is done.
In the patch a specialization for nvptx targets is done which currently checks if the target is an OpenMP device and trap if it is not.
Let me know comments suggestions you may have.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev
Subscribers: Hahnfeld, cfe-commits, fraggamuffin, caomhin, jholewinski
Differential Revision: http://reviews.llvm.org/D16784
llvm-svn: 259977
It is possible for enums to be created as part of their own
declcontext. We need to cache a placeholder to avoid the type being
created twice before hitting the cache.
<rdar://problem/24493203>
llvm-svn: 259975
There is a legitimate use-case in clang where we need to replace a
temporary placeholder node with the temporary node that may be a
forward declaration.
<rdar://problem/24493203>
llvm-svn: 259973