With this patch, the profile summary data will be available in indexed
profile data file so that profiler reader/compiler optimizer can start
to make use of.
Differential Revision: http://reviews.llvm.org/D16258
llvm-svn: 259626
The return type of the __builtin___*StringMakeConstantString functions
is a pointer to a struct, so we need that struct to be visible to name
lookup so that we will correctly merge multiple declarations of that
type if they come from different modules.
Incidentally, to make this visible to name lookup we need to rename the
type to __NSConstantString, since the real NSConstantString is an
Objective-C interface type. This shouldn't affect anyone outside the
compiler since users of the constant string builtins cast the result
immediately to CFStringRef.
Since this struct type is otherwise implicitly created by the AST
context and cannot access namelookup, we make this a predefined type
and initialize it in Sema.
Note: this issue of builtins that refer to types not visible to name
lookup technically also affects other builtins (e.g. objc_msgSendSuper),
but in all other cases the builtin is a library builtin and the issue
goes away if you include the library that defines the types it uses,
unlike for these constant string builtins.
rdar://problem/24425801
llvm-svn: 259624
actual source name of the typedef or a vendor mangling) but at least it stands
a chance of demangling now.
We would also benefit from some test coverage of this (OpenCL +
__attribute__((overloadable)) is probably required to reach this).
llvm-svn: 259618
The purpose of PPCVSXFMAMutate is to elide copies by changing FMA forms
on PPC.
%vreg6<def> = COPY %vreg96
%vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg7
;v6 = v6 + v5 * v7
is replaced by
%vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg7, %vreg96
;v5 = v5 * v7 + v96
This was broken in the case where the target register was also used as a
multiplicand. Fix this case by checking for it and replacing both uses
with the copied register.
%vreg6<def> = COPY %vreg96
%vreg6<def,tied1> = XSMADDASP %vreg6<tied0>, %vreg5<kill>, %vreg6
;v6 = v6 + v5 * v6
is replaced by
%vreg5<def,tied1> = XSMADDMSP %vreg5<tied0>, %vreg96, %vreg96
;v5 = v5 * v96 + v96
llvm-svn: 259617
C++ ABI library for the same set of types for which we expect the C++ ABI
library to provide the RTTI.
Specifically:
1) __int128 and unsigned __int128 are now emitted into the ABI library. We
always expected them to be there but never actually made sure to emit them.
2) Do not expect OpenCL builtin types to have type info in the C++ ABI library.
Neither libc++abi nor libstdc++ puts them there when built with either GCC
or Clang.
This matches GCC's behavior.
llvm-svn: 259616
The register coalescer can rematerialize constants that define
more of a register than the copy it is going to replace was going
to do.
This is valid in the case the register was undef before the
copy happened.
This patch makes sure that all the subranges defined by the new
rematerialization instructions have at least a dead def.
Review: http://reviews.llvm.org/D16693
llvm-svn: 259614
track a source for. When we are pushing breakpoints and stepping past function prologues,
also push past code from line 0 immediately following the prologue end.
<rdar://problem/23730696>
llvm-svn: 259611
Summary:
LoopVersioning is a transform utility that transform passes can use to
run-time disambiguate may-aliasing accesses. I'd like to also expose as
pass to allow it to be unit-tested.
I am planning to add support for non-aliasing annotation in
LoopVersioning and I'd like to be able to write tests directly using
this pass.
(After that feature is done, the pass could also be used to look for
optimization opportunities that are hidden behind incomplete alias
information at compile time.)
The pass drives LoopVersioning in its default way which is to fully
disambiguate may-aliasing accesses no matter how many checks are
required.
Reviewers: hfinkel, ashutosh.nema, sbaranga
Subscribers: zzheng, mssimpso, llvm-commits, sanjoy
Differential Revision: http://reviews.llvm.org/D16612
llvm-svn: 259610
C++14 generic lambdas. It conflicts with the C++14 return type deduction
mechanism, and results in us failing to actually deduce the lambda's return
type in some cases.
llvm-svn: 259609
I don't understand how this worked before, but this fixes the recent test regressions on Windows in TestConsecutiveBreakpoints.py.
Differential Revision: http://reviews.llvm.org/D16825
llvm-svn: 259605
the details of the bug, but avoiding overloading llvm::cast with another
function template sidesteps it.
See gcc.gnu.org/PR58022 for details of the bug, and llvm.org/PR26362 for more
backgound on how it manifested in Clang. Patch by Igor Sugak!
llvm-svn: 259598
It can fail to open an output file for various reasons, including
lack of permission, too long filename, or the output file is not
a mmap'able file.
llvm-svn: 259596
Please see include/llvm/Transforms/Utils/MemorySSA.h for a description
of MemorySSA, and what it does.
Differential Revision: http://reviews.llvm.org/D7864
llvm-svn: 259595
Due to staleness in a patch I committed yesterday, the debug output was reporting overdefined cases as being undefined. Confusing to say the least. The mistake appears to have only effected the debug output thankfully.
llvm-svn: 259594
In general CUDA does not allow dynamic initialization of
global device-side variables. One exception is that CUDA allows
records with empty constructors as described in section E2.2.1 of
CUDA 7.5 Programming guide.
This patch applies initializer checks for all device-side variables.
Empty constructors are accepted, but no code is generated for them.
Differential Revision: http://reviews.llvm.org/D15305
llvm-svn: 259592
For ObjCXX, we can create a CastExpr with Kind being CK_UserDefinedConversion
and SubExpr being BlockExpr. Specifically one can return BlockExpr from
BuildCXXMemberCallExpr and the result can be used to build a CastExpr.
Fix the assumption in CastExpr::getSubExprAsWritten that SubExpr can only
be CXXMemberCallExpr.
rdar://problem/24364077
llvm-svn: 259591
Previously we were returning a tuple of (bool, skip_reason) from
the tuple function. This makes for some awkward code, especially
since a value of True for the first argument implies that the
second argument is None, and a value of False implies that the
second argument is not None. So it was basically redundant, and
with this patch we simply return the skip reason or None directly.
llvm-svn: 259590
In r259574 I fixed some of the issues with the mach header symbols
and DSO handles.
This is the next issue whereby the __mh_execute_header has to not
be dead stripped, and (to match ld64) should be dynamically referenced.
The test here should also have been added in r259574 to make sure that
we emit this symbol. But checking that it is not only emitted but also
has the correct reference type is fine.
llvm-svn: 259589
We support now code such as:
void multiple_types(char *Short, char *Float, char *Double) {
for (long i = 0; i < 100; i++) {
Short[i] = *(short *)&Short[2 * i];
Float[i] = *(float *)&Float[4 * i];
Double[i] = *(double *)&Double[8 * i];
}
}
To support such code we use as element type of the modeled array the smallest
element type of all original array accesses. Accesses with larger types are
modeled as multiple accesses with the smaller type.
For example the second load access is modeled as:
{ Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }
To support jscop-rewritable memory accesses we need each statement instance to
only be assigned a single memory location, which will be the address at which
we load the value. Currently we obtain this address by taking the lexmin of
the access function. We may consider keeping track of the memory location more
explicitly in the future.
llvm-svn: 259587
I introduced a declaration in 259583 to keep the diff readable. This change just moves the definition up to remove the declaration again.
llvm-svn: 259585
This patch uses the newly introduced 'intersect' utility (from 259461: [LVI] Introduce an intersect operation on lattice values) to simplify existing code in LVI.
While not introducing any new concepts, this change is probably not NFC. The common 'intersect' function is more powerful that the ad-hoc implementations we'd had in a couple of places. Given that, we may see optimizations triggering a bit more often.
llvm-svn: 259583