Summary:
This is important for e.g. the following case:
  void sync() { __syncthreads(); }
  void foo() {
    do_something();
    sync();
    do_something_else();
  }
Without this change, if the optimizer does not inline sync() (which it
won't because __syncthreads is also marked as noduplicate, for now
anyway), it is free to perform optimizations on sync() that it would not
be able to perform on __syncthreads(), because sync() is not marked as
convergent.
Similarly, we need a notion of convergent calls, since in the case when
we can't statically determine a call's target(s), we need to know
whether it's safe to perform optimizations around the call.
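For instance (an illustrative sketch, not from the patch; the function
pointer name is hypothetical):
  void (*callback)(void);
  void invoke_callback(void) {
    callback(); // target unknown; it may reach __syncthreads(), so the
                // call itself must be treated as convergent
  }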
This change is conservative; the optimizer will remove these attrs where
it can, see r260318, r260319.
Reviewers: majnemer
Subscribers: cfe-commits, jhen, echristo, tra
Differential Revision: http://reviews.llvm.org/D17056
llvm-svn: 261779
The assert is triggered because isObjCRetainableType() is called on the
canonicalized return type that has been stripped of the typedefs and
attributes attached to it. To fix this assert, this commit gets the
original return type from CurCodeDecl or BlockInfo and uses it instead
of the canonicalized type.
rdar://problem/24470031
Differential Revision: http://reviews.llvm.org/D16914
llvm-svn: 261151
Fix 'operator delete' being called with the wrong size: convert CGFunctionInfo to use TrailingObjects
and ask TrailingObjects to provide a working 'operator delete' for us.
llvm-svn: 260181
Clang's CodeGen has several paths which end up invoking or calling a
function. The one that we used for calls to __RTtypeid did not
appropriately annotate the call with a funclet bundle.
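An illustrative C++ sketch of the affected pattern (names are hypothetical):
  #include <typeinfo>
  struct Base { virtual ~Base(); };
  void f(Base &b) {
    try {
      throw 0;
    } catch (int) {
      (void)typeid(b); // __RTtypeid call emitted inside the catch funclet
    }
  }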
This fixes PR26329.
llvm-svn: 258877
This updates clang to use bundle operands to associate an invoke with
the funclet which it is contained within.
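For example (an illustrative sketch), the inner call below becomes an
invoke contained in the catch funclet and must carry a "funclet" operand
bundle:
  void g();
  void f() {
    try {
      g();
    } catch (...) {
      try {
        g(); // invoke inside the catchpad funclet
      } catch (...) {
      }
    }
  }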
Depends on D15517.
Differential Revision: http://reviews.llvm.org/D15518
llvm-svn: 255675
`pass_object_size` is our way of enabling `__builtin_object_size` to
produce high quality results without requiring inlining to happen
everywhere.
A link to the design doc for this attribute is available at the
Differential review link below.
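A usage sketch (the helper names are illustrative; see the docs for the
exact semantics):
  #include <stddef.h>
  #define bos(p) __builtin_object_size(p, 0)
  size_t checked_size(void *const p __attribute__((pass_object_size(0)))) {
    return bos(p); // uses the size captured at each call site, no inlining needed
  }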
Differential Revision: http://reviews.llvm.org/D13263
llvm-svn: 254554
This patch changes the generation of CGFunctionInfo to contain
the FunctionProtoType if it is available. This enables the code
generation for call instructions to look into this type for
exception information and therefore generate better quality
IR - it will not create invoke instructions for functions that
are known not to throw.
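For example (an illustrative sketch):
  void callee() noexcept;
  void caller() {
    try {
      callee(); // known not to throw: emitted as 'call', not 'invoke'
    } catch (...) {
    }
  }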
llvm-svn: 253926
The ``disable_tail_calls`` attribute instructs the backend to not
perform tail call optimization inside the marked function.
For example,
  int callee(int);
  int foo(int a) __attribute__((disable_tail_calls)) {
    return callee(a); // This call is not tail-call optimized.
  }
Note that this attribute is different from 'not_tail_called', which
prevents tail-call optimization to the marked function.
rdar://problem/8973573
Differential Revision: http://reviews.llvm.org/D12547
llvm-svn: 252986
This attribute is used to prevent tail-call optimizations to the marked
function. For example, in the following piece of code, foo1 will not be
tail-call optimized:
  int __attribute__((not_tail_called)) foo1(int);
  int foo2(int a) {
    return foo1(a); // Tail-call optimization is not performed.
  }
The attribute has effect only on statically bound calls. It has no
effect on indirect calls. Also, virtual functions and Objective-C
methods cannot be marked as 'not_tail_called'.
rdar://problem/22667622
Differential Revision: http://reviews.llvm.org/D12922
llvm-svn: 252369
This works around PR25162. The MSVC tables make it very difficult to
correctly inline a C++ destructor that contains try / catch. We've
attempted to address PR25162 in LLVM's backend, but it feels pretty
infeasible. MSVC and ICC both appear to avoid inlining such complex
destructors.
Long term, we want to fix this by making the inliner smart enough to
know when it is inlining into a cleanup, so it can inline simple
destructors (~unique_ptr and ~vector) while avoiding destructors
containing try / catch.
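An illustrative sketch of the problematic shape (names are hypothetical):
  struct Logger {
    void flush();
    ~Logger() {
      try {
        flush();
      } catch (...) {
        // swallow errors during cleanup
      }
    }
  };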
llvm-svn: 251576
Summary: It breaks the build for ASTMatchers.
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D13893
llvm-svn: 250827
Add an error when calling a builtin that requires features that don't
match the feature set of the function that they're being called from.
This ensures that we can effectively diagnose some[1] code that would
instead ICE in the backend with a failure to select message.
Example:
  __m128d foo(__m128d a, __m128d b) {
    return __builtin_ia32_addsubps(b, a);
  }
compiled for normal x86_64 via:
  clang -target x86_64-linux-gnu -c
would fail to compile in the back end because the normal subtarget
features for x86_64 only include sse2 and the builtin requires sse3.
[1] We're still not erroring on:
  __m128i bar(__m128i const *p) { return _mm_lddqu_si128(p); }
where we should fail and error on an always_inline function being
inlined into a function that doesn't support the subtarget features
required.
llvm-svn: 250473
The backend restores the stack pointer after recovering from an
exception. This is similar to r245879, but it doesn't try to use the
normal cleanup mechanism, so hopefully it won't cause the same breakage.
llvm-svn: 249640
Summary:
This change adds support for `__builtin_ms_va_list`, a GCC extension for
variadic `ms_abi` functions. The existing `__builtin_va_list` support is
inadequate for this because `va_list` is defined differently in the Win64
ABI vs. the System V/AMD64 ABI.
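A usage sketch (illustrative; consult the Clang docs for the exact
builtin spellings):
  __attribute__((ms_abi)) void msvariadic(int count, ...) {
    __builtin_ms_va_list ap;
    __builtin_ms_va_start(ap, count);
    int first = __builtin_va_arg(ap, int);
    __builtin_ms_va_end(ap);
    (void)first;
  }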
Depends on D1622.
Reviewers: rsmith, rnk, rjmccall
CC: cfe-commits
Differential Revision: http://reviews.llvm.org/D1623
llvm-svn: 247941
Use the function attribute "stackrealign" instead of the backend option
-force-align-stack.
Also, make changes to the driver so that -mno-stack-realign is no longer
an option exposed to the end-user that disallows stack realignment in
the backend.
Differential Revision: http://reviews.llvm.org/D11815
llvm-svn: 247451
The type of a member pointer is incomplete if it has no inheritance
model. This lets us reuse more general logic already embedded in clang.
llvm-svn: 247346
It seems that there is a small bug: we can't generate assume loads
when some virtual functions have internal visibility.
This reverts commit 982bb7d966947812d216489b3c519c9825cacbf2.
llvm-svn: 247332
Generating call assume(icmp %vtable, %global_vtable) after constructor
call for devirtualization purposes.
For more info go to:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html
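An illustrative sketch of the intent (names are hypothetical):
  struct A {
    virtual void run();
  };
  void test() {
    A *a = new A;
    a->run(); // assume(vptr == A's vtable) lets this become a direct call
  }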
Edit:
Fixed version because of PR24479.
Relanded after this patch was reverted because of a ScalarEvolution bug
(D12719) and after John McCall's big patch (Added Address) landed.
http://reviews.llvm.org/D11859
llvm-svn: 247199
We know that a reference can always be dereferenced. However, we don't
always know the number of bytes if the reference's pointee type is
incomplete. This case was correctly handled but we didn't consider the
case where the type is complete but we cannot calculate its size for ABI
specific reasons. In this specific case, a member pointer's size is
available only under certain conditions.
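An illustrative sketch of such a case under the Microsoft ABI:
  struct S; // no inheritance model chosen yet
  void use(int S::* const &mp); // size of 'int S::*' is not yet known, so
                                // no dereferenceable(N) on the reference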
This fixes PR24703.
llvm-svn: 247188
Previously, CodeGen did not check whether the found store instruction used
the ReturnValue as pointer operand or value operand. This led to wrong code
generation: in later stages (load-store elision code) the found store and
its operand would be erased, causing ReturnValue to become a <badref>.
The patch adds a check that makes sure that ReturnValue is the pointer
operand of the store instruction. A regression test is also added.
This fixes PR24386.
Differential Revision: http://reviews.llvm.org/D12400
llvm-svn: 247003
Introduce an Address type to bundle a pointer value with an
alignment. Introduce APIs on CGBuilderTy to work with Address
values. Change core APIs on CGF/CGM to traffic in Address where
appropriate. Require alignments to be non-zero. Update a ton
of code to compute and propagate alignment information.
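Conceptually (a minimal standalone sketch, not the actual class):
  #include <cassert>
  #include <cstdint>
  struct Address {
    void *Pointer;
    uint64_t Alignment; // in bytes; must be non-zero
    Address(void *P, uint64_t A) : Pointer(P), Alignment(A) {
      assert(A != 0 && "Address with zero alignment");
    }
  };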
As part of this, I've promoted CGBuiltin's EmitPointerWithAlignment
helper function to CGF and made use of it in a number of places in
the expression emitter.
The end result is that we should now be significantly more correct
when performing operations on objects that are locally known to
be under-aligned. Since alignment is not reliably tracked in the
type system, there are inherent limits to this, but at least we
are no longer confused by standard operations like derived-to-base
conversions and array-to-pointer decay. I've also fixed a large
number of bugs where we were applying the complete-object alignment
to a pointer instead of the non-virtual alignment, although most of
these were hidden by the very conservative approach we took with
member alignment.
Also, because IRGen now reliably asserts on zero alignments, we
should no longer be subject to an absurd but frustrating recurring
bug where an incomplete type would report a zero alignment and then
we'd naively do an alignmentAtOffset on it and emit code using an
alignment equal to the largest power-of-two factor of the offset.
We should also now be emitting much more aggressive alignment
attributes in the presence of over-alignment. In particular,
field access now uses alignmentAtOffset instead of min.
Several times in this patch, I had to change the existing
code-generation pattern in order to more effectively use
the Address APIs. For the most part, this seems to be a strict
improvement, like doing pointer arithmetic with GEPs instead of
ptrtoint. That said, I've tried very hard to not change semantics,
but it is likely that I've failed in a few places, for which I
apologize.
ABIArgInfo now always carries the assumed alignment of indirect and
indirect byval arguments. In order to cut down on what was already
a dauntingly large patch, I changed the code to never set align
attributes in the IR on non-byval indirect arguments. That is,
we still generate code which assumes that indirect arguments have
the given alignment, but we don't express this information to the
backend except where it's semantically required (i.e. on byvals).
This is likely a minor regression for those targets that did provide
this information, but it'll be trivial to add it back in a later
patch.
I partially punted on applying this work to CGBuiltin. Please
do not add more uses of the CreateDefaultAligned{Load,Store}
APIs; they will be going away eventually.
llvm-svn: 246985
every time it's called rather than attempting to cache the result.
It's unlikely to be called frequently and the overhead of using
it in the first place is already factored out.
llvm-svn: 246706
the main attribute and cache the results so we don't have to parse
a single attribute more than once.
This reapplies r246596 with a fix for an uninitialized class member,
and a couple of cleanups and formatting changes.
llvm-svn: 246610
Also:
- Add a typedef to make working with the result easier.
- Update callers to use the new function.
- Make initFeatureMap out of line.
llvm-svn: 246468
A couple of changes here:
a) Do less work in the case where we don't have a target attribute on the
function. We've already canonicalized the attributes for the function -
no need to do more work.
b) Use the newer canonicalized feature adding functions from TargetInfo
to do the work when we do have a target attribute, as sketched below. This
enables us to diagnose conflicting written attributes (only ppc does this
today) and also makes sure we get all of the features for a listed cpu
rather than just changing the cpu.
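For example (an illustrative sketch; the function name is hypothetical):
  __attribute__((target("avx,sse4.2")))
  int widened(int); // written features are merged with the canonical
                    // per-function feature set, not substituted for it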
Updated all testcases accordingly and added a new testcase to verify that
we'll error out on ppc if we have some incompatible options, using the
existing diagnostics framework there.
llvm-svn: 246195