This is a retry of rL340135 (reverted at rL340136 because of gcc host compiler crashing)
with 2 changes:
1. Move the code into a helper to reduce code duplication (and hopefully work-around the crash).
2. The original commit had a formatting bug in the docs (missing an underscore).
Original commit message:
This exposes the LLVM funnel shift intrinsics as more familiar bit rotation functions in clang
(when both halves of a funnel shift are the same value, it's a rotate).
We're free to name these as we want because we're not copying gcc, but if there's some other
existing art (eg, the microsoft ops that are modified in this patch) that we want to replicate,
we can change the names.
The funnel shift intrinsics were added here:
https://reviews.llvm.org/D49242
With improved codegen in:
https://reviews.llvm.org/rL337966https://reviews.llvm.org/rL339359
And basic IR optimization added in:
https://reviews.llvm.org/rL338218https://reviews.llvm.org/rL340022
...so these are expected to produce asm output that's equal or better to the multi-instruction
alternatives using primitive C/IR ops.
In the motivating loop example from PR37387:
https://bugs.llvm.org/show_bug.cgi?id=37387#c7
...we get the expected 'rolq' x86 instructions if we substitute the rotate builtin into the source.
Differential Revision: https://reviews.llvm.org/D50924
llvm-svn: 340137
Clang generates copy and dispose helper functions for each block literal
on the stack. Often these functions are equivalent for different blocks.
This commit makes changes to merge equivalent copy and dispose helper
functions and reduce code size.
To enable merging equivalent copy/dispose functions, the captured object
infomation is encoded into the helper function name. This allows IRGen
to check whether an equivalent helper function has already been emitted
and reuse the function instead of generating a new helper function
whenever a block is defined. In addition, the helper functions are
marked as linkonce_odr to enable merging helper functions that have the
same name across translation units and marked as unnamed_addr to enable
the linker's deduplication pass to merge functions that have different
names but the same content.
rdar://problem/42640608
Differential Revision: https://reviews.llvm.org/D50152
llvm-svn: 339438
Summary:
Introduces funclet-based unwinding for Objective-C and fixes an issue
where global blocks can't have their isa pointers initialised on
Windows.
After discussion with Dustin, this changes the name mangling of
Objective-C types to prevent a C++ catch statement of type struct X*
from catching an Objective-C object of type X*.
Reviewers: rjmccall, DHowett-MSFT
Reviewed By: rjmccall, DHowett-MSFT
Subscribers: mgrang, mstorsjo, smeenai, cfe-commits
Differential Revision: https://reviews.llvm.org/D50144
llvm-svn: 339428
Summary:
C and C++ are interesting languages. They are statically typed, but weakly.
The implicit conversions are allowed. This is nice, allows to write code
while balancing between getting drowned in everything being convertible,
and nothing being convertible. As usual, this comes with a price:
```
unsigned char store = 0;
bool consume(unsigned int val);
void test(unsigned long val) {
if (consume(val)) {
// the 'val' is `unsigned long`, but `consume()` takes `unsigned int`.
// If their bit widths are different on this platform, the implicit
// truncation happens. And if that `unsigned long` had a value bigger
// than UINT_MAX, then you may or may not have a bug.
// Similarly, integer addition happens on `int`s, so `store` will
// be promoted to an `int`, the sum calculated (0+768=768),
// and the result demoted to `unsigned char`, and stored to `store`.
// In this case, the `store` will still be 0. Again, not always intended.
store = store + 768; // before addition, 'store' was promoted to int.
}
// But yes, sometimes this is intentional.
// You can either make the conversion explicit
(void)consume((unsigned int)val);
// or mask the value so no bits will be *implicitly* lost.
(void)consume((~((unsigned int)0)) & val);
}
```
Yes, there is a `-Wconversion`` diagnostic group, but first, it is kinda
noisy, since it warns on everything (unlike sanitizers, warning on an
actual issues), and second, there are cases where it does **not** warn.
So a Sanitizer is needed. I don't have any motivational numbers, but i know
i had this kind of problem 10-20 times, and it was never easy to track down.
The logic to detect whether an truncation has happened is pretty simple
if you think about it - https://godbolt.org/g/NEzXbb - basically, just
extend (using the new, not original!, signedness) the 'truncated' value
back to it's original width, and equality-compare it with the original value.
The most non-trivial thing here is the logic to detect whether this
`ImplicitCastExpr` AST node is **actually** an implicit conversion, //or//
part of an explicit cast. Because the explicit casts are modeled as an outer
`ExplicitCastExpr` with some `ImplicitCastExpr`'s as **direct** children.
https://godbolt.org/g/eE1GkJ
Nowadays, we can just use the new `part_of_explicit_cast` flag, which is set
on all the implicitly-added `ImplicitCastExpr`'s of an `ExplicitCastExpr`.
So if that flag is **not** set, then it is an actual implicit conversion.
As you may have noted, this isn't just named `-fsanitize=implicit-integer-truncation`.
There are potentially some more implicit conversions to be warned about.
Namely, implicit conversions that result in sign change; implicit conversion
between different floating point types, or between fp and an integer,
when again, that conversion is lossy.
One thing i know isn't handled is bitfields.
This is a clang part.
The compiler-rt part is D48959.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=21530 | PR21530 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=37552 | PR37552 ]], [[ https://bugs.llvm.org/show_bug.cgi?id=35409 | PR35409 ]].
Partially fixes [[ https://bugs.llvm.org/show_bug.cgi?id=9821 | PR9821 ]].
Fixes https://github.com/google/sanitizers/issues/940. (other than sign-changing implicit conversions)
Reviewers: rjmccall, rsmith, samsonov, pcc, vsk, eugenis, efriedma, kcc, erichkeane
Reviewed By: rsmith, vsk, erichkeane
Subscribers: erichkeane, klimek, #sanitizers, aaron.ballman, RKSimon, dtzWill, filcab, danielaustin, ygribov, dvyukov, milianw, mclow.lists, cfe-commits, regehr
Tags: #sanitizers
Differential Revision: https://reviews.llvm.org/D48958
llvm-svn: 338288
With this change compiler generates alignment checks for wider range
of types. Previously such checks were generated only for the record types
with non-trivial default constructor. So the types like:
struct alignas(32) S2 { int x; };
typedef __attribute__((ext_vector_type(2), aligned(32))) float float32x2_t;
did not get checks when allocated by 'new' expression.
This change also optimizes the checks generated for the arrays created
in 'new' expressions. Previously the check was generated for each
invocation of type constructor. Now the check is generated only once
for entire array.
Differential Revision: https://reviews.llvm.org/D49589
llvm-svn: 338199
When an exception is thrown in a block copy helper function, captured
objects that have previously been copied should be destructed or
released. Similarly, captured objects that are yet to be released should
be released when an exception is thrown in a dispose helper function.
rdar://problem/42410255
Differential Revision: https://reviews.llvm.org/D49718
llvm-svn: 338041
As documented here: https://software.intel.com/en-us/node/682969 and
https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning
is an ICC feature that provides for function multiversioning.
This feature is implemented with two attributes: First, cpu_specific,
which specifies the individual function versions. Second, cpu_dispatch,
which specifies the location of the resolver function and the list of
resolvable functions.
This is valuable since it provides a mechanism where the resolver's TU
can be specified in one location, and the individual implementions
each in their own translation units.
The goal of this patch is to be source-compatible with ICC, so this
implementation diverges from the ICC implementation in a few ways:
1- Linux x86/64 only: This implementation uses ifuncs in order to
properly dispatch functions. This is is a valuable performance benefit
over the ICC implementation. A future patch will be provided to enable
this feature on Windows, but it will obviously more closely fit ICC's
implementation.
2- CPU Identification functions: ICC uses a set of custom functions to identify
the feature list of the host processor. This patch uses the cpu_supports
functionality in order to better align with 'target' multiversioning.
1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function
marked cpu_dispatch be an empty definition. This patch supports that as well,
however declarations are also permitted, since the linker will solve the
issue of multiple emissions.
Differential Revision: https://reviews.llvm.org/D47474
llvm-svn: 337552
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.
Note: This is what was intended to be committed in r336726
llvm-svn: 336729
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.
llvm-svn: 336726
This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.
Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.
To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future.
To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.
To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.
To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.
There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch.
Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.
Differential Revision: https://reviews.llvm.org/D48617
llvm-svn: 336583
Similarly to CFI on virtual and indirect calls, this implementation
tries to use program type information to make the checks as precise
as possible. The basic way that it works is as follows, where `C`
is the name of the class being defined or the target of a call and
the function type is assumed to be `void()`.
For virtual calls:
- Attach type metadata to the addresses of function pointers in vtables
(not the functions themselves) of type `void (B::*)()` for each `B`
that is a recursive dynamic base class of `C`, including `C` itself.
This type metadata has an annotation that the type is for virtual
calls (to distinguish it from the non-virtual case).
- At the call site, check that the computed address of the function
pointer in the vtable has type `void (C::*)()`.
For non-virtual calls:
- Attach type metadata to each non-virtual member function whose address
can be taken with a member function pointer. The type of a function
in class `C` of type `void()` is each of the types `void (B::*)()`
where `B` is a most-base class of `C`. A most-base class of `C`
is defined as a recursive base class of `C`, including `C` itself,
that does not have any bases.
- At the call site, check that the function pointer has one of the types
`void (B::*)()` where `B` is a most-base class of `C`.
Differential Revision: https://reviews.llvm.org/D47567
llvm-svn: 335569
Summary:
Because wasm control flow needs to be structured, using WinEH
instructions to support wasm EH brings several benefits. This patch
makes wasm EH uses Windows EH instructions, with some changes:
1. Because wasm uses a single catch block to catch all C++ exceptions,
this merges all catch clauses into a single catchpad, within which we
test the EH selector as in Itanium EH.
2. Generates a call to `__clang_call_terminate` in case a cleanup
throws. Wasm does not have a runtime to handle this.
3. In case there is no catch-all clause, inserts a call to
`__cxa_rethrow` at the end of a catchpad in order to unwind to an
enclosing EH scope.
Reviewers: majnemer, dschuff
Subscribers: jfb, sbc100, jgravelle-google, sunfish, cfe-commits
Differential Revision: https://reviews.llvm.org/D44931
llvm-svn: 333703
These intrinsics are used by MSVC's header files on AArch64 Windows as
well as AArch32, so we should support them for both targets. I've
factored them out of CodeGenFunction::EmitARMBuiltinExpr into separate
functions that EmitAArch64BuiltinExpr can call as well.
Reviewers: javed.absar, mstorsjo
Reviewed By: mstorsjo
Subscribers: kristof.beyls, cfe-commits
Differential Revision: https://reviews.llvm.org/D47476
llvm-svn: 333513
Introduced CreateMemTempWithoutCast and CreateTemporaryAllocaWithoutCast to emit alloca
without casting to default addr space.
ActiveFlag is a temporary variable emitted for clean up. It is defined as AllocaInst* type and there is
a cast to AlllocaInst in SetActiveFlag. An alloca casted to generic pointer causes assertion in
SetActiveFlag.
Since there is only load/store of ActiveFlag, it is safe to use the original alloca, therefore use
CreateMemTempWithoutCast is called.
Differential Revision: https://reviews.llvm.org/D47099
llvm-svn: 332982
lifetime.start/end expects pointer argument in alloca address space.
However in C++ a temporary variable is in default address space.
This patch changes API CreateMemTemp and CreateTempAlloca to
get the original alloca instruction and pass it lifetime.start/end.
It only affects targets with non-zero alloca address space.
Differential Revision: https://reviews.llvm.org/D45900
llvm-svn: 332593
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
function if a function delegates to another function.
Fix a bug introduced in r328731, which caused a struct with ObjC __weak
fields that was passed to a function to be destructed twice, once in the
callee function and once in another function the callee function
delegates to. To prevent this, keep track of the callee-destructed
structs passed to a function and disable their cleanups at the point of
the call to the delegated function.
This reapplies r331016, which was reverted in r331019 because it caused
an assertion to fail in EmitDelegateCallArg on a windows bot. I made
changes to EmitDelegateCallArg so that it doesn't try to deactivate
cleanups for structs that have trivial destructors (cleanups for those
structs are never pushed to the cleanup stack in EmitParmDecl).
rdar://problem/39194693
Differential Revision: https://reviews.llvm.org/D45382
llvm-svn: 331020
function if a function delegates to another function.
Fix a bug introduced in r328731, which caused a struct with ObjC __weak
fields that was passed to a function to be destructed twice, once in the
callee function and once in another function the callee function
delegates to. To prevent this, keep track of the callee-destructed
structs passed to a function and disable their cleanups at the point of
the call to the delegated function.
rdar://problem/39194693
Differential Revision: https://reviews.llvm.org/D45382
llvm-svn: 331016
Summary:
A clang builtin for xray typed events. Differs from
__xray_customevent(...) by the presence of a type tag that is vended by
compiler-rt in typical usage. This allows xray handlers to expand logged
events with their type description and plugins to process traced events
based on type.
This change depends on D45633 for the intrinsic definition.
Reviewers: dberris, pelikan, rnk, eizan
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D45716
llvm-svn: 330220
register destructor functions annotated with __attribute__((destructor))
using __cxa_atexit or atexit.
Register destructor functions annotated with __attribute__((destructor))
calling __cxa_atexit in a synthesized constructor function instead of
emitting references to the functions in a special section.
The primary reason for adding this option is that we are planning to
deprecate the __mod_term_funcs section on Darwin in the future. This
feature is enabled by default only on Darwin. Users who do not want this
can use command line option 'fno_register_global_dtors_with_atexit' to
disable it.
rdar://problem/33887655
Differential Revision: https://reviews.llvm.org/D45578
llvm-svn: 330199
Found via codespell -q 3 -I ../clang-whitelist.txt
Where whitelist consists of:
archtype
cas
classs
checkk
compres
definit
frome
iff
inteval
ith
lod
methode
nd
optin
ot
pres
statics
te
thru
Patch by luzpaz! (This is a subset of D44188 that applies cleanly with a few
files that have dubious fixes reverted.)
Differential revision: https://reviews.llvm.org/D44188
llvm-svn: 329399
the tail padding is not reused.
We track on the AggValueSlot (and through a couple of other
initialization actions) whether we're dealing with an object that might
share its tail padding with some other object, so that we can avoid
emitting stores into the tail padding if that's the case. We still
widen stores into tail padding when we can do so.
Differential Revision: https://reviews.llvm.org/D45306
llvm-svn: 329342
Summary:
The following class hierarchy requires that we be able to emit a
this-adjusting thunk for B::foo in C's vftable:
struct Incomplete;
struct A {
virtual A* foo(Incomplete p) = 0;
};
struct B : virtual A {
void foo(Incomplete p) override;
};
struct C : B { int c; };
This TU is valid, but lacks a definition of 'Incomplete', which makes it
hard to build a thunk for the final overrider, B::foo.
Before this change, Clang gives up attempting to emit the thunk, because
it assumes that if the parameter types are incomplete, it must be
emitting the thunk for optimization purposes. This is untrue for the MS
ABI, where the implementation of B::foo has no idea what thunks C's
vftable may require. Clang needs to emit the thunk without necessarily
having access to the complete prototype of foo.
This change makes Clang emit a musttail variadic call when it needs such
a thunk. I call these "unprototyped" thunks, because they only prototype
the "this" parameter, which must always come first in the MS C++ ABI.
These thunks work, but they create ugly LLVM IR. If the call to the
thunk is devirtualized, it will be a call to a bitcast of a function
pointer. Today, LLVM cannot inline through such a call, but I want to
address that soon, because we also use this pattern for virtual member
pointer thunks.
This change also implements an old FIXME in the code about reusing the
thunk's computed CGFunctionInfo as much as possible. Now we don't end up
computing the thunk's mangled name and arranging it's prototype up to
around three times.
Fixes PR25641
Reviewers: rjmccall, rsmith, hans
Subscribers: Prazek, cfe-commits
Differential Revision: https://reviews.llvm.org/D45112
llvm-svn: 329009
Summary:
Libc++'s default allocator uses `__builtin_operator_new` and `__builtin_operator_delete` in order to allow the calls to new/delete to be ellided. However, libc++ now needs to support over-aligned types in the default allocator. In order to support this without disabling the existing optimization Clang needs to support calling the aligned new overloads from the builtins.
See llvm.org/PR22634 for more information about the libc++ bug.
This patch changes `__builtin_operator_new`/`__builtin_operator_delete` to call any usual `operator new`/`operator delete` function. It does this by performing overload resolution with the arguments passed to the builtin to determine which allocation function to call. If the selected function is not a usual allocation function a diagnostic is issued.
One open issue is if the `align_val_t` overloads should be considered "usual" when `LangOpts::AlignedAllocation` is disabled.
In order to allow libc++ to detect this new behavior the value for `__has_builtin(__builtin_operator_new)` has been updated to `201802`.
Reviewers: rsmith, majnemer, aaron.ballman, erik.pilkington, bogner, ahatanak
Reviewed By: rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43047
llvm-svn: 328134
source expressions when iterating over a PseudoObjectExpr's semantic
subexpression list.
Previously the loop in emitPseudoObjectExpr would emit the IR for each
OpaqueValueExpr that was in a PseudoObjectExpr's semantic-form
expression list and use the result when the OpaqueValueExpr later
appeared in other expressions. This caused an assertion failure when
AggExprEmitter tried to copy the result of an OpaqueValueExpr and the
copied type didn't have trivial copy/move constructors or assignment
operators.
This patch adds flag IsUnique to OpaqueValueExpr which indicates it is a
unique reference to its source expression (it is not used in multiple
places). The loop in emitPseudoObjectExpr ignores OpaqueValueExprs that
are unique and CodeGen visitors simply traverse the source expressions
of such OpaqueValueExprs.
rdar://problem/34363596
Differential Revision: https://reviews.llvm.org/D39562
llvm-svn: 327939
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
This recommits r327206, which was reverted because it caused
module-enabled builders to fail. I discovered that the
CXXRecordDecl::CanPassInRegisters flag wasn't being set correctly in
some cases after I moved it to RecordDecl.
Thanks to Eric Liu for helping me investigate the bug.
rdar://problem/33599681
https://reviews.llvm.org/D44095
llvm-svn: 327870
This patch uses the infrastructure added in r326307 for enabling
non-trivial fields to be declared in C structs to allow __weak fields in
C structs in ARC.
rdar://problem/33599681
Differential Revision: https://reviews.llvm.org/D44095
llvm-svn: 327206
The indirect function argument is in alloca address space in LLVM IR. However,
during Clang codegen for C++, the address space of indirect function argument
should match its address space in the source code, i.e., default addr space, even
for indirect argument. This is because destructor of the indirect argument may
be called in the caller function, and address of the indirect argument may be
taken, in either case the indirect function argument is expected to be in default
addr space, not the alloca address space.
Therefore, the indirect function argument should be mapped to the temp var
casted to default address space. The caller will cast it to alloca addr space
when passing it to the callee. In the callee, the argument is also casted to the
default address space and used.
CallArg is refactored to facilitate this fix.
Differential Revision: https://reviews.llvm.org/D34367
llvm-svn: 326946
We may emit incorrect lifetime info during codegen for loop counters in
OpenMP constructs because of automatic scope cleanup when we needed
temporarily locations for private loop counters.
llvm-svn: 326922
ARC mode.
Declaring __strong pointer fields in structs was not allowed in
Objective-C ARC until now because that would make the struct non-trivial
to default-initialize, copy/move, and destroy, which is not something C
was designed to do. This patch lifts that restriction.
Special functions for non-trivial C structs are synthesized that are
needed to default-initialize, copy/move, and destroy the structs and
manage the ownership of the objects the __strong pointer fields point
to. Non-trivial structs passed to functions are destructed in the callee
function.
rdar://problem/33599681
Differential Revision: https://reviews.llvm.org/D41228
llvm-svn: 326307
The following test case causes issue with codegen of __enqueue_block
void (^block)(void) = ^{ callee(id, out); };
enqueue_kernel(queue, 0, ndrange, block);
Clang first does codegen for block expression in the first line and deletes its block info.
Clang then tries to do codegen for the same block expression again for the second line,
and fails because the block info is gone.
The fix is to do normal codegen for both lines. Introduce an API to OpenCL runtime to
record llvm block invoke function and llvm block literal emitted for each AST block
expression, and use the recorded information for generating the wrapper kernel.
The EmitBlockLiteral APIs are cleaned up to minimize changes to the normal codegen
of blocks.
Another minor issue is that some clean up AST expression is generated for block
with captures, which can be stripped by IgnoreImplicit.
Differential Revision: https://reviews.llvm.org/D43240
llvm-svn: 325264
Summary:
Fixes PR36247, which is where WinEHPrepare replaces inline asm in
funclets with unreachable.
Make getBundlesForFunclet return by value to simplify some call sites.
Reviewers: smeenai, majnemer
Subscribers: eraman, cfe-commits
Differential Revision: https://reviews.llvm.org/D43033
llvm-svn: 324689
Summary:
Previously, Clang only emitted label names in assert builds.
However there is a CC1 option -discard-value-names that should have been used to control emission instead.
This patch removes the NDEBUG preprocessor block and instead allows LLVM to handle removing the names in accordance with the option.
Reviewers: erichkeane, aaron.ballman, majnemer
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D42829
llvm-svn: 324127
Summary:
This patch enables debugging of C99 VLA types by generating more precise
LLVM Debug metadata, using the extended DISubrange 'count' field that
takes a DIVariable.
This should implement:
Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's)
https://bugs.llvm.org/show_bug.cgi?id=30553
Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie
Reviewed By: aprantl
Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D41698
llvm-svn: 323952
simd`.
Added host codegen + codegen for devices with default codegen for
`#pragma omp target teams distribute parallel for simd` directive.
llvm-svn: 322515
This alignment can be less than 4 on certain embedded targets, which may
not even be able to deal with 4-byte alignment on the stack.
Patch by Jacob Young!
llvm-svn: 322406
getAssociatedStmt() returns the outermost captured statement for the
OpenMP directive. It may return incorrect region in case of combined
constructs. Reworked the code to reduce the number of calls of
getAssociatedStmt() and used getInnermostCapturedStmt() and
getCapturedStmt() functions instead.
In case of firstprivate variables it may lead to an extra allocas
generation for private copies even if the variable is passed by value
into outlined function and could be used directly as private copy.
llvm-svn: 322393
GCC's attribute 'target', in addition to being an optimization hint,
also allows function multiversioning. We currently have the former
implemented, this is the latter's implementation.
This works by enabling functions with the same name/signature to coexist,
so that they can all be emitted. Multiversion state is stored in the
FunctionDecl itself, and SemaDecl manages the definitions.
Note that it ends up having to permit redefinition of functions so
that they can all be emitted. Additionally, all versions of the function
must be emitted, so this also manages that.
Note that this includes some additional rules that GCC does not, since
defining something as a MultiVersion function after a usage has been made illegal.
The only 'history rewriting' that happens is if a function is emitted before
it has been converted to a multiversion'ed function, at which point its name
needs to be changed.
Function templates and virtual functions are NOT yet supported (not supported
in GCC either).
Additionally, constructors/destructors are disallowed, but the former is
planned.
llvm-svn: 322028
only.
Added support for -fopenmp-simd option that allows compilation of
simd-based constructs without emission of OpenMP runtime calls.
llvm-svn: 321560
...when such an operation is done on an object during con-/destruction.
This is the cfe part of a patch covering both cfe and compiler-rt.
Differential Revision: https://reviews.llvm.org/D40295
llvm-svn: 321519
Diagnose 'unreachable' UB when a noreturn function returns.
1. Insert a check at the end of functions marked noreturn.
2. A decl may be marked noreturn in the caller TU, but not marked in
the TU where it's defined. To diagnose this scenario, strip away the
noreturn attribute on the callee and insert check after calls to it.
Testing: check-clang, check-ubsan, check-ubsan-minimal, D40700
rdar://33660464
Differential Revision: https://reviews.llvm.org/D40698
llvm-svn: 321231
Summary:
The -fxray-always-emit-customevents flag instructs clang to always emit
the LLVM IR for calls to the `__xray_customevent(...)` built-in
function. The default behaviour currently respects whether the function
has an `[[clang::xray_never_instrument]]` attribute, and thus not lower
the appropriate IR code for the custom event built-in.
This change allows users calling through to the
`__xray_customevent(...)` built-in to always see those calls lowered to
the corresponding LLVM IR to lay down instrumentation points for these
custom event calls.
Using this flag enables us to emit even just the user-provided custom
events even while never instrumenting the start/end of the function
where they appear. This is useful in cases where "phase markers" using
__xray_customevent(...) can have very few instructions, must never be
instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
llvm-svn: 319388
This updates -mcount to use the new attribute names (LLVM r318195), and
switches over -finstrument-functions to also use these attributes rather
than inserting instrumentation in the frontend.
It also adds a new flag, -finstrument-functions-after-inlining, which
makes the cygprofile instrumentation get inserted after inlining rather
than before.
Differential Revision: https://reviews.llvm.org/D39331
llvm-svn: 318199
Summary:
We don't want to store cleanup dest slot saved into the coroutine frame (as some of the cleanup code may
access them after coroutine frame destroyed).
This is an alternative to https://reviews.llvm.org/D37093
It is possible to do this for all functions, but, cursory check showed that in -O0, we get slightly longer function (by 1-3 instructions), thus, we are only limiting cleanup.dest.slot elimination to coroutines.
Reviewers: rjmccall, hfinkel, eric_niebler
Reviewed By: eric_niebler
Subscribers: EricWF, cfe-commits
Differential Revision: https://reviews.llvm.org/D39768
llvm-svn: 317981
This patch fixes various places in clang to propagate may-alias
TBAA access descriptors during construction of lvalues, thus
eliminating the need for the LValueBaseInfo::MayAlias flag.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D39008
llvm-svn: 316988
This patch addresses the rest of the cases where we pass lvalue
base info, but do not provide corresponding TBAA info.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to make
reviewing easier.
Differential Revision: https://reviews.llvm.org/D38945
llvm-svn: 315986
In OpenCL the kernel function and non-kernel function has different calling conventions.
For certain targets they have different argument ABIs. Also kernels have special function
attributes and metadata for runtime to launch them.
The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such,
the block invoke function should be emitted as kernel with proper calling convention and
argument ABI.
This patch emits enqueued block as kernel. If a block is both called directly and passed
to enqueue_kernel, separate functions will be generated.
Differential Revision: https://reviews.llvm.org/D38134
llvm-svn: 315804
This patch enables explicit generation of TBAA information in all
cases where LValue base info is propagated or constructed in
non-trivial ways. Eventually, we will consider each of these
cases to make sure the TBAA information is correct and not too
conservative. For now, we just fall back to generating TBAA info
from the access type.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38733
llvm-svn: 315575
Besides obvious code simplification, avoiding explicit creation
of LValueBaseInfo objects makes it easier to make TBAA
information to be part of such objects.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38695
llvm-svn: 315289
The Cpu Init functionality is required for the target
attribute, so this patch simply splits it out into its own
function, exactly like CpuIs and CpuSupports.
llvm-svn: 315075
This patch is an attempt to clarify and simplify generation and
propagation of TBAA information. The idea is to pack all values
that describe a memory access, namely, base type, access type and
offset, into a single structure. This is supposed to make further
changes, such as adding support for unions and array members,
easier to prepare and review.
DecorateInstructionWithTBAA() is no more responsible for
converting types to tags. These implicit conversions not only
complicate reading the code, but also suggest assigning scalar
access tags while we generally prefer full-size struct-path tags.
TBAAPathTag is replaced with TBAAAccessInfo; the latter is now
the type of the keys of the cache map that translates access
descriptors to metadata nodes.
Fixed a bug with writing to a wrong map in
getTBAABaseTypeMetadata() (former getTBAAStructTypeInfo()).
We now check for valid base access types every time we
dereference a field. The original code only checks the top-level
base type. See isValidBaseType() / isTBAAPathStruct() calls.
Some entities have been renamed to sound more adequate and less
confusing/misleading in presence of path-aware TBAA information.
Now we do not lookup twice for the same cache entry in
getAccessTagInfo().
Refined relevant comments and descriptions.
Differential Revision: https://reviews.llvm.org/D37826
llvm-svn: 315048
code size.
Currently clang expands a call to __builtin_os_log_format into a long
sequence of instructions at the call site, causing code size to
increase in some cases.
This commit attempts to reduce code size by emitting a helper function
that can be shared by calls to __builtin_os_log_format with similar
formats and arguments. The helper function has linkonce_odr linkage to
enable the linker to merge identical functions across translation units.
Attribute 'noinline' is attached to the helper function at -Oz so that
the inliner doesn't inline functions that can potentially be merged.
This commit also fixes a bug where the generated IR writes past the end
of the buffer when "%m" is the last specifier appearing in the format
string passed to __builtin_os_log_format.
Original patch by Duncan Exon Smith.
rdar://problem/34065973
rdar://problem/34196543
Differential Revision: https://reviews.llvm.org/D38606
llvm-svn: 315045
This patch makes it possible to produce access tags in a uniform
manner regardless whether the resulting tag will be a scalar or a
struct-path one. getAccessTagInfo() now takes care of the actual
translation of access descriptors to tags and can handle all
kinds of accesses. Facilities that specific to scalar accesses
are eliminated.
Some more details:
* DecorateInstructionWithTBAA() is not responsible for conversion
of types to access tags anymore. Instead, it takes an access
descriptor (TBAAAccessInfo) and generates corresponding access
tag from it.
* getTBAAInfoForVTablePtr() reworked to
getTBAAVTablePtrAccessInfo() that now returns the
virtual-pointer access descriptor and not the virtual-point
type metadata.
* Added function getTBAAMayAliasAccessInfo() that returns the
descriptor for may-alias accesses.
* getTBAAStructTagInfo() renamed to getTBAAAccessTagInfo() as now
it is the only way to generate access tag by a given access
descriptor. It is capable of producing both scalar and
struct-path tags, depending on options and availability of the
base access type. getTBAAScalarTagInfo() and its cache
ScalarTagMetadataCache are eliminated.
* Now that we do not need to care about whether the resulting
access tag should be a scalar or struct-path one,
getTBAAStructTypeInfo() is renamed to getBaseTypeInfo().
* Added function getTBAAAccessInfo() that constructs access
descriptor by a given QualType access type.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38503
llvm-svn: 314977
With this patch we implement a concept of TBAA access descriptors
that are capable of representing both scalar and struct-path
accesses in a generic way.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38456
llvm-svn: 314780
This patch fixes misleading names of entities related to getting,
setting and generation of TBAA access type descriptors.
This is effectively an attempt to provide a review for D37826 by
breaking it into smaller pieces.
Differential Revision: https://reviews.llvm.org/D38404
llvm-svn: 314657
directives.
If the variable is used in the target-based region but is not found in
any private|mapping clause, then generate implicit firstprivate|map
clauses for these implicitly mapped variables.
llvm-svn: 314205
body of global block invoke functions.
This commit fixes an infinite loop in IRGen that occurs when compiling
the following code:
void FUNC2() {
static void (^const block1)(int) = ^(int a){
if (a--)
block1(a);
};
}
This is how IRGen gets stuck in the infinite loop:
1. GenerateBlockFunction is called to emit the body of "block1".
2. GetAddrOfGlobalBlock is called to get the address of "block1". The
function calls getAddrOfGlobalBlockIfEmitted to check whether the
global block has been emitted. If it hasn't been emitted, it then
tries to emit the body of the block function by calling
GenerateBlockFunction, which goes back to step 1.
This commit prevents the inifinite loop by building the global block in
GenerateBlockFunction before emitting the body of the block function.
rdar://problem/34541684
Differential Revision: https://reviews.llvm.org/D38118
llvm-svn: 314029
If the captured variable has re-declaration we may end up with the
situation where the captured variable is the re-declaration while the
referenced variable is the canonical declaration (or vice versa). In
this case we may generate wrong code. Patch fixes this situation.
llvm-svn: 313995
This change will make it possible to use -fsanitize=function on Darwin and
possibly on other platforms. It fixes an issue with the way RTTI is stored into
function prologue data.
On Darwin, addresses stored in prologue data can't require run-time fixups and
must be PC-relative. Run-time fixups are undesirable because they necessitate
writable text segments, which can lead to security issues. And absolute
addresses are undesirable because they break PIE mode.
The fix is to create a private global which points to the RTTI, and then to
encode a PC-relative reference to the global into prologue data.
Differential Revision: https://reviews.llvm.org/D37597
llvm-svn: 313096
Summary:
1.Updated annotations for include/clang/StaticAnalyzer/Core/PathSensitive/Store.h, which belong to the old version of clang.
2.Delete annotations for CodeGenFunction::getEvaluationKind() in clang/lib/CodeGen/CodeGenFunction.h, which belong to the old version of clang.
Reviewers: bkramer, krasimir, klimek
Reviewed By: bkramer
Subscribers: MTC
Differential Revision: https://reviews.llvm.org/D36330
Contributed by @MTC!
llvm-svn: 312790
Summary:
As the attributed statements are considered simple statements no
stoppoint was generated before emitting attributed do/while/for/range-
statement. This lead to faulty debug locations.
Reviewers: echristo, aaron.ballman, dblaikie
Reviewed By: dblaikie
Subscribers: bjope, aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D37428
llvm-svn: 312623
"target" implementation
A small set of refactors that'll make it easier for me to implement 'target'
support.
First, extract the CPUSupports functionality into its own function.
THis has the advantage of not wasting time in this builtin to deal with
arguments.
Second, pulls both CPUSupports and CPUIs implementation into a member-function,
so that it can be called from the resolver generation that I'm working on.
Third, creates an overload that takes simply the feature/cpu name (rather than
extracting it from a callexpr), since that info isn't available later.
Note that despite how the 'diff' looks, the EmitX86CPUSupports function simply
takes the implementation out of the 'switch'.
llvm-svn: 312355
expressions
C++ allows us to reference static variables through member expressions. Prior to
this commit, non-integer static variables that were referenced using a member
expression were always emitted using lvalue loads. The old behaviour introduced
an inconsistency between regular uses of static variables and member expressions
uses. For example, the following program compiled and linked successfully:
struct Foo {
constexpr static const char *name = "foo";
};
int main() {
return Foo::name[0] == 'f';
}
but this program failed to link because "Foo::name" wasn't found:
struct Foo {
constexpr static const char *name = "foo";
};
int main() {
Foo f;
return f.name[0] == 'f';
}
This commit ensures that constant static variables referenced through member
expressions are emitted in the same way as ordinary static variable references.
rdar://33942261
Differential Revision: https://reviews.llvm.org/D36876
llvm-svn: 311772
of class fails to map class static variable.
If the global variable is captured and it has several redeclarations,
sometimes it may lead to a compiler crash. Patch fixes this by working
only with canonical declarations.
llvm-svn: 311479
If worksharing construct has at least one linear item, an implicit
synchronization point must be emitted to avoid possible conflict with
the loading/storing values to the original variables. Added implicit
barrier if the linear item is found before actual start of the
worksharing construct.
llvm-svn: 311013
We don't need special handling in CodeGenFunction::GenerateCode for
lambda block pointer conversion operators anymore. The conversion
operator emission code immediately calls back to the generic
EmitFunctionBody.
Rename EmitLambdaStaticInvokeFunction to EmitLambdaStaticInvokeBody for
better consistency with the other Emit*Body methods.
I'm preparing to do something about PR28299, which touches this code.
llvm-svn: 310145
On some targets, passing zero to the clz() or ctz() builtins has undefined
behavior. I ran into this issue while debugging UB in __hash_table from libcxx:
the bug I was seeing manifested itself differently under -O0 vs -Os, due to a
UB call to clz() (see: libcxx/r304617).
This patch introduces a check which can detect UB calls to builtins.
llvm.org/PR26979
Differential Revision: https://reviews.llvm.org/D34590
llvm-svn: 309459
When an omp for loop is canceled the constructed objects are being destructed
twice.
It looks like the desired code is:
{
Obj o;
If (cancelled) branch-through-cleanups to cancel.exit.
}
[cleanups]
cancel.exit:
__kmpc_for_static_fini
br cancel.cont (*)
cancel.cont:
__kmpc_barrier
return
The problem seems to be the branch to cancel.cont is currently also going
through the cleanups calling them again. This change just does a direct branch
instead.
Patch By: michael.p.rice@intel.com
Differential Revision: https://reviews.llvm.org/D35854
llvm-svn: 309288
The initializer for a static local variable cannot be hot, because it runs at
most once per program. That's not quite the same thing as having a low branch
probability, but under the assumption that the function is invoked many times,
modeling this as a branch probability seems reasonable.
For TLS variables, the situation is less clear, since the initialization side
of the branch can run multiple times in a program execution, but we still
expect initialization to be rare relative to non-initialization uses. It would
seem worthwhile to add a PGO counter along this path to make this estimation
more accurate in future.
For globals with guarded initialization, we don't yet apply any branch weights.
Due to our use of COMDATs, the guard will be reached exactly once per DSO, but
we have no idea how many DSOs will define the variable.
llvm-svn: 309195
The pointer overflow check gives false negatives when dealing with
expressions in which an unsigned value is subtracted from a pointer.
This is summarized in PR33430 [1]: ubsan permits the result of the
subtraction to be greater than "p", but it should not.
To fix the issue, we should track whether or not the pointer expression
is a subtraction. If it is, and the indices are unsigned, we know to
expect "p - <unsigned> <= p".
I've tested this by running check-{llvm,clang} with a stage2
ubsan-enabled build. I've also added some tests to compiler-rt, which
are in D34122.
[1] https://bugs.llvm.org/show_bug.cgi?id=33430
Differential Revision: https://reviews.llvm.org/D34121
llvm-svn: 307955
devirtualized.
The code to detect devirtualized calls is already in IRGen, so move the
code to lib/AST and make it a shared utility between Sema and IRGen.
This commit fixes a linkage error I was seeing when compiling the
following code:
$ cat test1.cpp
struct Base {
virtual void operator()() {}
};
template<class T>
struct Derived final : Base {
void operator()() override {}
};
Derived<int> *d;
int main() {
if (d)
(*d)();
return 0;
}
rdar://problem/33195657
Differential Revision: https://reviews.llvm.org/D34301
llvm-svn: 307883
Certain targets (e.g. amdgcn) require global variable to stay in global or constant address
space. In C or C++ global variables are emitted in the default (generic) address space.
This patch introduces virtual functions TargetCodeGenInfo::getGlobalVarAddressSpace
and TargetInfo::getConstantAddressSpace to handle this in a general approach.
It only affects IR generated for amdgcn target.
Differential Revision: https://reviews.llvm.org/D33842
llvm-svn: 307470
This patch makes ubsan's nonnull return value diagnostics more precise,
which makes the diagnostics more useful when there are multiple return
statements in a function. Example:
1 |__attribute__((returns_nonnull)) char *foo() {
2 | if (...) {
3 | return expr_which_might_evaluate_to_null();
4 | } else {
5 | return another_expr_which_might_evaluate_to_null();
6 | }
7 |} // <- The current diagnostic always points here!
runtime error: Null returned from Line 7, Column 2!
With this patch, the diagnostic would point to either Line 3, Column 5
or Line 5, Column 5.
This is done by emitting source location metadata for each return
statement in a sanitized function. The runtime is passed a pointer to
the appropriate metadata so that it can prepare and deduplicate reports.
Compiler-rt patch (with more tests): https://reviews.llvm.org/D34298
Differential Revision: https://reviews.llvm.org/D34299
llvm-svn: 306163
In C++ all variables are in default address space. Previously change has been
made to cast automatic variables to default address space. However that is
not sufficient since all temporary variables need to be casted to default
address space.
This patch casts all temporary variables to default address space except those
for passing indirect arguments since they are only used for load/store.
This patch only affects target having non-zero alloca address space.
Differential Revision: https://reviews.llvm.org/D33706
llvm-svn: 305711
Summary:
The title says it all.
Reviewers: GorNishanov, rsmith
Reviewed By: GorNishanov
Subscribers: rjmccall, cfe-commits
Differential Revision: https://reviews.llvm.org/D34194
llvm-svn: 305496
Adding an unsigned offset to a base pointer has undefined behavior if
the result of the expression would precede the base. An example from
@regehr:
int foo(char *p, unsigned offset) {
return p + offset >= p; // This may be optimized to '1'.
}
foo(p, -1); // UB.
This patch extends the pointer overflow check in ubsan to detect invalid
unsigned pointer index expressions. It changes the instrumentation to
only permit non-negative offsets in pointer index expressions when all
of the GEP indices are unsigned.
Testing: check-llvm, check-clang run on a stage2, ubsan-instrumented
build.
Differential Revision: https://reviews.llvm.org/D33910
llvm-svn: 305216
Check pointer arithmetic for overflow.
For some more background on this check, see:
https://wdtz.org/catching-pointer-overflow-bugs.htmlhttps://reviews.llvm.org/D20322
Patch by Will Dietz and John Regehr!
This version of the patch is different from the original in a few ways:
- It introduces the EmitCheckedInBoundsGEP utility which inserts
checks when the pointer overflow check is enabled.
- It does some constant-folding to reduce instrumentation overhead.
- It does not check some GEPs in CGExprCXX. I'm not sure that
inserting checks here, or in CGClass, would catch many bugs.
Possible future directions for this check:
- Introduce CGF.EmitCheckedStructGEP, to detect overflows when
accessing structures.
Testing: Apart from the added lit test, I ran check-llvm and check-clang
with a stage2, ubsan-instrumented clang. Will and John have also done
extensive testing on numerous open source projects.
Differential Revision: https://reviews.llvm.org/D33305
llvm-svn: 304459
The functions creating LValues propagated information about alignment
source. Extend the propagated data to also include information about
possible unrestricted aliasing. A new class LValueBaseInfo will
contain both AlignmentSource and MayAlias info.
This patch should not introduce any functional changes.
Differential Revision: https://reviews.llvm.org/D33284
llvm-svn: 303358
creation that are const-qualified.
When a block captures an ObjC object pointer, clang retains the pointer
to prevent prematurely destroying the object the pointer points to
before the block is called or copied.
When the captured object pointer is const-qualified, we can avoid
emitting the retain/release pair since the pointer variable cannot be
modified in the scope in which the block literal is introduced.
For example:
void test(const id x) {
callee(^{ (void)x; });
}
This patch implements that optimization.
rdar://problem/28894510
Differential Revision: https://reviews.llvm.org/D32601
llvm-svn: 301667
[OpenMP] Initial implementation of code generation for pragma 'distribute parallel for' on host
https://reviews.llvm.org/D29508
This patch makes the following additions:
It abstracts away loop bound generation code from procedures associated with pragma 'for' and loops in general, in such a way that the same procedures can be used for 'distribute parallel for' without the need for a full re-implementation.
It implements code generation for 'distribute parallel for' and adds regression tests. It includes tests for clauses.
It is important to notice that most of the clauses are implemented as part of existing procedures. For instance, firstprivate is already implemented for 'distribute' and 'for' as separate pragmas. As the implementation of 'distribute parallel for' is based on the same procedures, then we automatically obtain implementation for such clauses without the need to add new code. However, this requires regression tests that verify correctness of produced code.
llvm-svn: 301340
https://reviews.llvm.org/D29508
This patch makes the following additions:
1. It abstracts away loop bound generation code from procedures associated with pragma 'for' and loops in general, in such a way that the same procedures can be used for 'distribute parallel for' without the need for a full re-implementation.
2. It implements code generation for 'distribute parallel for' and adds regression tests. It includes tests for clauses.
It is important to notice that most of the clauses are implemented as part of existing procedures. For instance, firstprivate is already implemented for 'distribute' and 'for' as separate pragmas. As the implementation of 'distribute parallel for' is based on the same procedures, then we automatically obtain implementation for such clauses without the need to add new code. However, this requires regression tests that verify correctness of produced code.
Looking forward to comments.
llvm-svn: 301223
This patch teaches ubsan to insert an alignment check for the 'this'
pointer at the start of each method/lambda. This allows clang to emit
significantly fewer alignment checks overall, because if 'this' is
aligned, so are its fields.
This is essentially the same thing r295515 does, but for the alignment
check instead of the null check. One difference is that we keep the
alignment checks on member expressions where the base is a DeclRefExpr.
There's an opportunity to diagnose unaligned accesses in this situation
(as pointed out by Eli, see PR32630).
Testing: check-clang, check-ubsan, and a stage2 ubsan build.
Along with the patch from D30285, this roughly halves the amount of
alignment checks we emit when compiling X86FastISel.cpp. Here are the
numbers from patched/unpatched clangs based on r298160.
------------------------------------------
| Setup | # of alignment checks |
------------------------------------------
| unpatched, -O0 | 24326 |
| patched, -O0 | 12717 | (-47.7%)
------------------------------------------
Differential Revision: https://reviews.llvm.org/D30283
llvm-svn: 300370
Previously __cfi_check was created in LTO optimization pipeline, which
means LLD has no way of knowing about the existence of this symbol
without rescanning the LTO output object. As a result, LLD fails to
export __cfi_check, even when given --export-dynamic-symbol flag.
llvm-svn: 299806
GCC has the alloc_align attribute, which is similar to assume_aligned, except the attribute's parameter is the index of the integer parameter that needs aligning to.
Differential Revision: https://reviews.llvm.org/D29599
llvm-svn: 299117
Details:
Emit suspend expression which roughly looks like:
auto && x = CommonExpr();
if (!x.await_ready()) {
llvm_coro_save();
x.await_suspend(...); (*)
llvm_coro_suspend(); (**)
}
x.await_resume();
where the result of the entire expression is the result of x.await_resume()
(*) If x.await_suspend return type is bool, it allows to veto a suspend:
if (x.await_suspend(...))
llvm_coro_suspend();
(**) llvm_coro_suspend() encodes three possible continuations as a switch instruction:
%where-to = call i8 @llvm.coro.suspend(...)
switch i8 %where-to, label %coro.ret [ ; jump to epilogue to suspend
i8 0, label %yield.ready ; go here when resumed
i8 1, label %yield.cleanup ; go here when destroyed
]
llvm-svn: 298784
This is a follow-up to r297700 (Add a nullability sanitizer).
It addresses some FIXME's re: using nullability-specific diagnostic
handlers from compiler-rt, now that the necessary handlers exist.
check-ubsan test updates to follow.
llvm-svn: 297750
Teach UBSan to detect when a value with the _Nonnull type annotation
assumes a null value. Call expressions, initializers, assignments, and
return statements are all checked.
Because _Nonnull does not affect IRGen, the new checks are disabled by
default. The new driver flags are:
-fsanitize=nullability-arg (_Nonnull violation in call)
-fsanitize=nullability-assign (_Nonnull violation in assignment)
-fsanitize=nullability-return (_Nonnull violation in return stmt)
-fsanitize=nullability (all of the above)
This patch builds on top of UBSan's existing support for detecting
violations of the nonnull attributes ('nonnull' and 'returns_nonnull'),
and relies on the compiler-rt support for those checks. Eventually we
will need to update the diagnostic messages in compiler-rt (there are
FIXME's for this, which will be addressed in a follow-up).
One point of note is that the nullability-return check is only allowed
to kick in if all arguments to the function satisfy their nullability
preconditions. This makes it necessary to emit some null checks in the
function body itself.
Testing: check-clang and check-ubsan. I also built some Apple ObjC
frameworks with an asserts-enabled compiler, and verified that we get
valid reports.
Differential Revision: https://reviews.llvm.org/D30762
llvm-svn: 297700
It's possible to load out-of-range values from bitfields backed by a
boolean or an enum. Check for UB loads from bitfields.
This is the motivating example:
struct S {
BOOL b : 1; // Signed ObjC BOOL.
};
S s;
s.b = 1; // This is actually stored as -1.
if (s.b == 1) // Evaluates to false, -1 != 1.
...
Changes since the original commit:
- Single-bit bools are a special case (see CGF::EmitFromMemory), and we
can't avoid dealing with them when loading from a bitfield. Don't try to
insert a check in this case.
Differential Revision: https://reviews.llvm.org/D30423
llvm-svn: 297389
It's possible to load out-of-range values from bitfields backed by a
boolean or an enum. Check for UB loads from bitfields.
This is the motivating example:
struct S {
BOOL b : 1; // Signed ObjC BOOL.
};
S s;
s.b = 1; // This is actually stored as -1.
if (s.b == 1) // Evaluates to false, -1 != 1.
...
Differential Revision: https://reviews.llvm.org/D30423
llvm-svn: 297298
Summary:
Because of the existence branches out of GNU statement expressions, it
is possible that emitting cleanups for a full expression may cause the
new insertion point to not be dominated by the result of the inner
expression. Consider this example:
struct Foo { Foo(); ~Foo(); int x; };
int g(Foo, int);
int f(bool cond) {
int n = g(Foo(), ({ if (cond) return 0; 42; }));
return n;
}
Before this change, result of the call to 'g' did not dominate its use
in the store to 'n'. The early return exit from the statement expression
branches to a shared cleanup block, which ends in a switch between the
fallthrough destination (the assignment to 'n') or the function exit
block.
This change solves the problem by spilling and reloading expression
evaluation results when any of the active cleanups have branches.
I audited the other call sites of enterFullExpression, and they don't
appear to keep and Values live across the site of the cleanup, except in
ARC code. I wasn't able to create a test case for ARC that exhibits this
problem, though.
Reviewers: rjmccall, rsmith
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D30590
llvm-svn: 297084
Summary:
Added co_return statement emission.
Tweaked coro-alloc.cpp test to use co_return to trigger coroutine processing instead of co_await, since this change starts emitting the body of the coroutine and await expression handling has not been upstreamed yet.
Reviewers: rsmith, majnemer, EricWF, aaron.ballman
Reviewed By: rsmith
Subscribers: majnemer, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D29979
llvm-svn: 297076
UBSan's nonnull argument check applies when a parameter has the
"nonnull" attribute. The check currently works for FunctionDecls, but
not for ObjCMethodDecls. This patch extends the check to work for ObjC.
Differential Revision: https://reviews.llvm.org/D30599
llvm-svn: 296996
2nd attempt: the first was in r296231, but it had a use after lifetime
bug.
Clang has logic to lower certain conditional expressions directly into llvm
select instructions. However, it does not emit the correct profile counter
increment as it does this: it emits an unconditional increment of the counter
for the 'then branch', even if the value selected is from the 'else branch'
(this is PR32019).
That means, given the following snippet, we would report that "0" is selected
twice, and that "1" is never selected:
int f1(int x) {
return x ? 0 : 1;
^2 ^0
}
f1(0);
f1(1);
Fix the problem by using the instrprof_increment_step intrinsic to do the
proper increment.
llvm-svn: 296245
Clang has logic to lower certain conditional expressions directly into
llvm select instructions. However, it does not emit the correct profile
counter increment as it does this: it emits an unconditional increment
of the counter for the 'then branch', even if the value selected is from
the 'else branch' (this is PR32019).
That means, given the following snippet, we would report that "0" is
selected twice, and that "1" is never selected:
int f1(int x) {
return x ? 0 : 1;
^2 ^0
}
f1(0);
f1(1);
Fix the problem by using the instrprof_increment_step intrinsic to do
the proper increment.
llvm-svn: 296231
Fix the fact that we don't assign profile counters to constructors in
classes with virtual bases, or constructors with variadic parameters.
Differential Revision: https://reviews.llvm.org/D30131
llvm-svn: 296062
This fixes an assertion failure in cases where we had expression
statements that declared variables nested inside of pass_object_size
args. Since we were emitting the same ExprStmt twice (once for the arg,
once for the @llvm.objectsize call), we were getting issues with
redefining locals.
This also means that we can be more lax about when we emit
@llvm.objectsize for pass_object_size args: since we're reusing the
arg's value itself, we don't have to care so much about side-effects.
llvm-svn: 295935
This patch teaches ubsan to insert exactly one null check for the 'this'
pointer per method/lambda.
Previously, given a load of a member variable from an instance method
('this->x'), ubsan would insert a null check for 'this', and another
null check for '&this->x', before allowing the load to occur.
Similarly, given a call to a method from another method bound to the
same instance ('this->foo()'), ubsan would a redundant null check for
'this'. There is also a redundant null check in the case where the
object pointer is a reference ('Ref.foo()').
This patch teaches ubsan to remove the redundant null checks identified
above.
Testing: check-clang, check-ubsan, and a stage2 ubsan build.
I also compiled X86FastISel.cpp with -fsanitize=null using
patched/unpatched clangs based on r293572. Here are the number of null
checks emitted:
-------------------------------------
| Setup | # of null checks |
-------------------------------------
| unpatched, -O0 | 21767 |
| patched, -O0 | 10758 |
-------------------------------------
Changes since the initial commit:
- Don't introduce any unintentional object-size or alignment checks.
- Don't rely on IRGen of C labels in the test.
Differential Revision: https://reviews.llvm.org/D29530
llvm-svn: 295515
CodeGenFunction::EmitTypeCheck accepts a bool flag which controls
whether or not null checks are emitted. Make this a bit more flexible by
changing the bool to a SanitizerSet.
Needed for an upcoming change which deals with a scenario in which we
only want to emit null checks.
llvm-svn: 295514
This reverts commit r295401. It breaks the ubsan self-host. It inserts
object size checks once per C++ method which fire when the structure is
empty.
llvm-svn: 295494
This patch teaches ubsan to insert exactly one null check for the 'this'
pointer per method/lambda.
Previously, given a load of a member variable from an instance method
('this->x'), ubsan would insert a null check for 'this', and another
null check for '&this->x', before allowing the load to occur.
Similarly, given a call to a method from another method bound to the
same instance ('this->foo()'), ubsan would a redundant null check for
'this'. There is also a redundant null check in the case where the
object pointer is a reference ('Ref.foo()').
This patch teaches ubsan to remove the redundant null checks identified
above.
Testing: check-clang and check-ubsan. I also compiled X86FastISel.cpp
with -fsanitize=null using patched/unpatched clangs based on r293572.
Here are the number of null checks emitted:
-------------------------------------
| Setup | # of null checks |
-------------------------------------
| unpatched, -O0 | 21767 |
| patched, -O0 | 10758 |
-------------------------------------
Changes since the initial commit: don't rely on IRGen of C labels in the
test.
Differential Revision: https://reviews.llvm.org/D29530
llvm-svn: 295401
This patch teaches ubsan to insert exactly one null check for the 'this'
pointer per method/lambda.
Previously, given a load of a member variable from an instance method
('this->x'), ubsan would insert a null check for 'this', and another
null check for '&this->x', before allowing the load to occur.
Similarly, given a call to a method from another method bound to the
same instance ('this->foo()'), ubsan would a redundant null check for
'this'. There is also a redundant null check in the case where the
object pointer is a reference ('Ref.foo()').
This patch teaches ubsan to remove the redundant null checks identified
above.
Testing: check-clang and check-ubsan. I also compiled X86FastISel.cpp
with -fsanitize=null using patched/unpatched clangs based on r293572.
Here are the number of null checks emitted:
-------------------------------------
| Setup | # of null checks |
-------------------------------------
| unpatched, -O0 | 21767 |
| patched, -O0 | 10758 |
-------------------------------------
Differential Revision: https://reviews.llvm.org/D29530
llvm-svn: 295391
This patch implements codegen for the reduction clause on
any parallel construct for elementary data types. An efficient
implementation requires hierarchical reduction within a
warp and a threadblock. It is complicated by the fact that
variables declared in the stack of a CUDA thread cannot be
shared with other threads.
The patch creates a struct to hold reduction variables and
a number of helper functions. The OpenMP runtime on the GPU
implements reduction algorithms that uses these helper
functions to perform reductions within a team. Variables are
shared between CUDA threads using shuffle intrinsics.
An implementation of reductions on the NVPTX device is
substantially different to that of CPUs. However, this patch
is written so that there are minimal changes to the rest of
OpenMP codegen.
The implemented design allows the compiler and runtime to be
decoupled, i.e., the runtime does not need to know of the
reduction operation(s), the type of the reduction variable(s),
or the number of reductions. The design also allows reuse of
host codegen, with appropriate specialization for the NVPTX
device.
While the patch does introduce a number of abstractions, the
expected use case calls for inlining of the GPU OpenMP runtime.
After inlining and optimizations in LLVM, these abstractions
are unwound and performance of OpenMP reductions is comparable
to CUDA-canonical code.
Patch by Tian Jin in collaboration with Arpith Jacob
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29758
llvm-svn: 295333
This patch implements codegen for the reduction clause on
any parallel construct for elementary data types. An efficient
implementation requires hierarchical reduction within a
warp and a threadblock. It is complicated by the fact that
variables declared in the stack of a CUDA thread cannot be
shared with other threads.
The patch creates a struct to hold reduction variables and
a number of helper functions. The OpenMP runtime on the GPU
implements reduction algorithms that uses these helper
functions to perform reductions within a team. Variables are
shared between CUDA threads using shuffle intrinsics.
An implementation of reductions on the NVPTX device is
substantially different to that of CPUs. However, this patch
is written so that there are minimal changes to the rest of
OpenMP codegen.
The implemented design allows the compiler and runtime to be
decoupled, i.e., the runtime does not need to know of the
reduction operation(s), the type of the reduction variable(s),
or the number of reductions. The design also allows reuse of
host codegen, with appropriate specialization for the NVPTX
device.
While the patch does introduce a number of abstractions, the
expected use case calls for inlining of the GPU OpenMP runtime.
After inlining and optimizations in LLVM, these abstractions
are unwound and performance of OpenMP reductions is comparable
to CUDA-canonical code.
Patch by Tian Jin in collaboration with Arpith Jacob
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29758
llvm-svn: 295319
Support for CUDA printf is exploited to support printf for
an NVPTX OpenMP device.
To reflect the support of both programming models, the file
CGCUDABuiltin.cpp has been renamed to CGGPUBuiltin.cpp, and
the call EmitCUDADevicePrintfCallExpr has been renamed to
EmitGPUDevicePrintfCallExpr.
Reviewers: jlebar
Differential Revision: https://reviews.llvm.org/D17890
llvm-svn: 293444
in the current lexical scope.
clang currently emits the lifetime.start marker of a variable when the
variable comes into scope even though a variable's lifetime starts at
the entry of the block with which it is associated, according to the C
standard. This normally doesn't cause any problems, but in the rare case
where a goto jumps backwards past the variable declaration to an earlier
point in the block (see the test case added to lifetime2.c), it can
cause mis-compilation.
To prevent such mis-compiles, this commit conservatively disables
emitting lifetime variables when a label has been seen in the current
block.
This problem was discussed on cfe-dev here:
http://lists.llvm.org/pipermail/cfe-dev/2016-July/050066.html
rdar://problem/30153946
Differential Revision: https://reviews.llvm.org/D27680
llvm-svn: 293106
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.
This target teams region is offloaded using the __tgt_target_teams()
call. The patch sets the number of teams as an argument to
this call.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29084
llvm-svn: 293005
This patch adds support for codegen of 'target teams' on the host.
This combined directive has two captured statements, one for the
'teams' region, and the other for the 'parallel'.
This target teams region is offloaded using the __tgt_target_teams()
call. The patch sets the number of teams as an argument to
this call.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D29084
llvm-svn: 293001
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292419
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28753
llvm-svn: 292374
This patch refactors code that calls codegen for target regions. Currently
the codebase only supports the 'target' directive. The patch pulls out
common target processing code into a static function that can be called
by codegen for any target directive.
Reviewers: ABataev
Differential Revision: https://reviews.llvm.org/D28752
llvm-svn: 292134
This patch is to implement sema and parsing for 'target teams distribute simd’ pragma.
Differential Revision: https://reviews.llvm.org/D28252
llvm-svn: 291579
Summary:
This patch makes the type_mismatch static data 7 bytes smaller (and it
ends up being 16 bytes smaller due to alignment restrictions, at least
on some x86-64 environments).
It revs up the type_mismatch handler version since we're breaking binary
compatibility. I will soon post a patch for the compiler-rt side.
Reviewers: rsmith, kcc, vitalybuka, pgousseau, gbedwell
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D28242
llvm-svn: 291236
The special case to widen the integer literal zero when passed to
variadic function calls should only apply to variadic functions, not
unprototyped functions. This is consistent with what MSVC does. In this
test case, MSVC uses a 4-byte store to pass the 5th argument to 'kr' and
an 8-byte store to pass the zero to 'v':
void v(int, ...);
void kr();
void f(void) {
v(1, 2, 3, 4, 0);
kr(1, 2, 3, 4, 0);
}
Aaron Ballman discovered this issue in https://reviews.llvm.org/D28166
llvm-svn: 290906
This patch is to implement sema and parsing for 'target teams distribute parallel for simd’ pragma.
Differential Revision: https://reviews.llvm.org/D28202
llvm-svn: 290862
This patch is to implement sema and parsing for 'target teams distribute parallel for’ pragma.
Differential Revision: https://reviews.llvm.org/D28160
llvm-svn: 290725
This patch is to implement sema and parsing for 'target teams distribute' pragma.
Differential Revision: https://reviews.llvm.org/D28015
llvm-svn: 290508
This is a recommit of r290149, which was reverted in r290169 due to msan
failures. msan was failing because we were calling
`isMostDerivedAnUnsizedArray` on an invalid designator, which caused us
to read uninitialized memory. To fix this, the logic of the caller of
said function was simplified, and we now have a `!Invalid` assert in
`isMostDerivedAnUnsizedArray`, so we can catch this particular bug more
easily in the future.
Fingers crossed that this patch sticks this time. :)
Original commit message:
This patch does three things:
- Gives us the alloc_size attribute in clang, which lets us infer the
number of bytes handed back to us by malloc/realloc/calloc/any user
functions that act in a similar manner.
- Teaches our constexpr evaluator that evaluating some `const` variables
is OK sometimes. This is why we have a change in
test/SemaCXX/constant-expression-cxx11.cpp and other seemingly
unrelated tests. Richard Smith okay'ed this idea some time ago in
person.
- Uniques some Blocks in CodeGen, which was reviewed separately at
D26410. Lack of uniquing only really shows up as a problem when
combined with our new eagerness in the face of const.
llvm-svn: 290297
This commit fails MSan when running test/CodeGen/object-size.c in
a confusing way. After some discussion with George, it isn't really
clear what is going on here. We can make the MSan failure go away by
testing for the invalid bit, but *why* things are invalid isn't clear.
And yet, other code in the surrounding area is doing precisely this and
testing for invalid.
George is going to take a closer look at this to better understand the
nature of the failure and recommit it, for now backing it out to clean
up MSan builds.
llvm-svn: 290169
This patch does three things:
- Gives us the alloc_size attribute in clang, which lets us infer the
number of bytes handed back to us by malloc/realloc/calloc/any user
functions that act in a similar manner.
- Teaches our constexpr evaluator that evaluating some `const` variables
is OK sometimes. This is why we have a change in
test/SemaCXX/constant-expression-cxx11.cpp and other seemingly
unrelated tests. Richard Smith okay'ed this idea some time ago in
person.
- Uniques some Blocks in CodeGen, which was reviewed separately at
D26410. Lack of uniquing only really shows up as a problem when
combined with our new eagerness in the face of const.
Differential Revision: https://reviews.llvm.org/D14274
llvm-svn: 290149
copy constructors of classes with array members, instead using
ArrayInitLoopExpr to represent the initialization loop.
This exposed a bug in the static analyzer where it was unable to differentiate
between zero-initialized and unknown array values, which has also been fixed
here.
llvm-svn: 289618
This adds a way for us to version any UBSan handler by itself.
The patch overrides D21289 for a better implementation (we're able to
rev up a single handler).
After this, then we can land a slight modification of D19667+D19668.
We probably don't want to keep all the versions in compiler-rt (maybe we
want to deprecate on one release and remove the old handler on the next
one?), but with this patch we will loudly fail to compile when mixing
incompatible handler calls, instead of silently compiling and then
providing bad error messages.
Reviewers: kcc, samsonov, rsmith, vsk
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D21695
llvm-svn: 289444
initialization of each array element:
* ArrayInitLoopExpr is a prvalue of array type with two subexpressions:
a common expression (an OpaqueValueExpr) that represents the up-front
computation of the source of the initialization, and a subexpression
representing a per-element initializer
* ArrayInitIndexExpr is a prvalue of type size_t representing the current
position in the loop
This will be used to replace the creation of explicit index variables in lambda
capture of arrays and copy/move construction of classes with array elements,
and also C++17 structured bindings of arrays by value (which inexplicably allow
copying an array by value, unlike all of C++'s other array declarations).
No uses of these nodes are introduced by this change, however.
llvm-svn: 289413
This patch is to implement sema and parsing for 'teams distribute parallel for' pragma.
Differential Revision: https://reviews.llvm.org/D27345
llvm-svn: 289179
This patch is to implement sema and parsing for 'teams distribute parallel for simd' pragma.
Differential Revision: https://reviews.llvm.org/D27084
llvm-svn: 288294
If 'omp cancel' construct is used in a worksharing construct it may
cause hanging of the software in case if reduction clause is used. Patch fixes this problem by avoiding extra reduction processing for branches that were canceled.
llvm-svn: 287227
Summary:
r286944 introduced bugs detected by ASAN as use-after-return.
r287025 have not fixed them completely.
This reverts commit r286944 and r287025.
Reviewers: ABataev
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D26720
llvm-svn: 287069
If 'omp cancel' construct is used in a worksharing construct it may cause
hanging of the software in case if reduction clause is used. Patch fixes
this problem by avoiding extra reduction processing for branches that
were canceled.
llvm-svn: 286944
can be used to improve the locations when generating remarks for loops.
Depends on the companion LLVM change r286227.
Patch by Florian Hahn.
Differential Revision: https://reviews.llvm.org/D25764
llvm-svn: 286456
abstract information about the callee. NFC.
The goal here is to make it easier to recognize indirect calls and
trigger additional logic in certain cases. That logic will come in
a later patch; in the meantime, I felt that this was a significant
improvement to the code.
llvm-svn: 285258
Summary:
Current generation of lifetime intrinsics does not handle cases like:
```
{
char x;
l1:
bar(&x, 1);
}
goto l1;
```
We will get code like this:
```
%x = alloca i8, align 1
call void @llvm.lifetime.start(i64 1, i8* nonnull %x)
br label %l1
l1:
%call = call i32 @bar(i8* nonnull %x, i32 1)
call void @llvm.lifetime.end(i64 1, i8* nonnull %x)
br label %l1
```
So the second time bar was called for x which is marked as dead.
Lifetime markers here are misleading so it's better to remove them at all.
This type of bypasses are rare, e.g. code detects just 8 functions building
clang (2329 targets).
PR28267
Reviewers: eugenis
Subscribers: beanz, mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D24693
llvm-svn: 285176
Summary: D24693 will need access to it from other places
Reviewers: eugenis
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D24695
llvm-svn: 285158
constexpr variable.
When compiling a constexpr NSString initialized with an objective-c
string literal, CodeGen emits objc_storeStrong on an uninitialized
alloca, which causes a crash.
This patch folds the code in EmitScalarInit into EmitStoreThroughLValue
and fixes the crash by calling objc_retain on the string instead of
using objc_storeStrong.
rdar://problem/28562009
Differential Revision: https://reviews.llvm.org/D25547
llvm-svn: 284516
Summary: _BitScan intrinsics (and some others, for example _Interlocked and _bittest) are supposed to work on both ARM and x86. This is an attempt to isolate them, avoiding repeating their code or writing separate function for each builtin.
Reviewers: hans, thakis, rnk, majnemer
Subscribers: RKSimon, cfe-commits, aemerson
Differential Revision: https://reviews.llvm.org/D25264
llvm-svn: 284060
Summary:
With this commit simple coroutines can be created in plain C using coroutine builtins.
Reviewers: rnk, EricWF, rsmith
Subscribers: modocache, mgorny, mehdi_amini, beanz, cfe-commits
Differential Revision: https://reviews.llvm.org/D24373
llvm-svn: 283155
Instead of ignoring the evaluation order rule, ignore the "destroy parameters
in reverse construction order" rule for the small number of problematic cases.
This only causes incorrect behavior in the rare case where both parameters to
an overloaded operator <<, >>, ->*, &&, ||, or comma are of class type with
non-trivial destructor, and the program is depending on those parameters being
destroyed in reverse construction order.
We could do a little better here by reversing the order of parameter
destruction for those functions (and reversing the argument evaluation order
for all direct calls, not just those with operator syntax), but that is not a
complete solution to the problem, as the same situation can be reached by an
indirect function call.
Approach reviewed off-line by rnk.
llvm-svn: 282777
function correctly when targeting MS ABIs (this appears to have never mattered
prior to this change).
Update test case to always cover both 32-bit and 64-bit Windows ABIs, since
they behave somewhat differently from each other here.
Update test case to also cover operators , && and ||, which it appears are also
affected by P0145R3 (they're not explicitly called out by the design document,
but this is the emergent behavior of the existing wording).
Original commit message:
P0145R3 (C++17 evaluation order tweaks): evaluate the right-hand side of
assignment and compound-assignment operators before the left-hand side. (Even
if it's an overloaded operator.)
This completes the implementation of P0145R3 + P0400R0 for all targets except
Windows, where the evaluation order guarantees for <<, >>, and ->* are
unimplementable as the ABI requires the function arguments are evaluated from
right to left (because parameter destructors are run from left to right in the
callee).
llvm-svn: 282619
assignment and compound-assignment operators before the left-hand side. (Even
if it's an overloaded operator.)
This completes the implementation of P0145R3 + P0400R0 for all targets except
Windows, where the evaluation order guarantees for <<, >>, and ->* are
unimplementable as the ABI requires the function arguments are evaluated from
right to left (because parameter destructors are run from left to right in the
callee).
llvm-svn: 282556