__hipRegisterFunction and __hipRegisterVar need to accept device side kernel and variable names
so that HIP runtime can associate kernel stub functions in host code with kernel symbols in fat binaries,
and associate shadow variables in host code with device variables in fat binaries.
Currently, clang assumes kernel functions and device variables have the same name as the kernel
stub functions and shadow variables. However, when host is compiled in windows with MSVC C++
ABI and device is compiled with Itanium C++ ABI (e.g. AMDGPU), kernels and device symbols in fat
binary are mangled differently than host.
This patch gets the device side kernel and variable name by mangling them in the mangle context
of aux target.
Differential Revision: https://reviews.llvm.org/D58163
llvm-svn: 354004
This is the second attempt to port ASan to new PM after D52739. This takes the
initialization requried by ASan from the Module by moving it into a separate
class with it's own analysis that the new PM ASan can use.
Changes:
- Split AddressSanitizer into 2 passes: 1 for the instrumentation on the
function, and 1 for the pass itself which creates an instance of the first
during it's run. The same is done for AddressSanitizerModule.
- Add new PM AddressSanitizer and AddressSanitizerModule.
- Add legacy and new PM analyses for reading data needed to initialize ASan with.
- Removed DominatorTree dependency from ASan since it was unused.
- Move GlobalsMetadata and ShadowMapping out of anonymous namespace since the
new PM analysis holds these 2 classes and will need to expose them.
Differential Revision: https://reviews.llvm.org/D56470
llvm-svn: 353985
Argument evaluation order is different between gcc and clang, so pull out
the Builder calls to make the generated IR independent of the host compiler's
argument evaluation order. Thanks to rnk for reminding me of this clang/gcc
difference.
llvm-svn: 353969
This allows the global visibility controls to be restrictive while still
populating the dynamic symbol table where required.
Differential Revision: https://reviews.llvm.org/D56871
llvm-svn: 353870
This attribute applies to declarations of C stdlib functions
(sprintf, memcpy...) that have known fortified variants
(__sprintf_chk, __memcpy_chk, ...). When applied, clang will emit
calls to the fortified variant functions instead of calls to the
defaults.
In GCC, this is done by adding gnu_inline-style wrapper functions,
but that doesn't work for us for variadic functions because we don't
support __builtin_va_arg_pack (and have no intention to).
This attribute takes two arguments, the first is 'type' argument
passed through to __builtin_object_size, and the second is a flag
argument that gets passed through to the variadic checking variants.
rdar://47905754
Differential revision: https://reviews.llvm.org/D57918
llvm-svn: 353765
We must only set the construction vtable visibility after we create the
vtable initializer, otherwise the global value will be treated as
declaration rather than definition and the visibility won't be set.
Differential Revision: https://reviews.llvm.org/D58010
llvm-svn: 353742
The various EltSize, Offset, DataLayout, and StructLayout arguments
are all computable from the Address's element type and the DataLayout
which the CGBuilder already has access to.
After having previously asserted that the computed values are the same
as those passed in, now remove the redundant arguments from
CGBuilder's Create*GEP functions.
Differential Revision: https://reviews.llvm.org/D57767
llvm-svn: 353629
When a module name is specified as -fmodule-name, that module gets a
clang::Module object, but it won't actually be built or imported; it
will be textual. CGDebugInfo wouldn't detect this and them emit a
DICompileUnit that had a hash but no name and that confused both
dsymutil, LLDB, and myself.
rdar://problem/47926508
Differential Revision: https://reviews.llvm.org/D57976
llvm-svn: 353578
When we are calling `__builtin_constant_p` with ObjC objects of
different classes, we hit the assertion
> Assertion failed: (isa<X>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file include/llvm/Support/Casting.h, line 254.
It happens because LLVM types for `ObjCInterfaceType` are opaque and
have no name (see `CodeGenTypes::ConvertType`). As the result, for
different ObjC classes we have different `is_constant` intrinsics with
the same name `llvm.is.constant.p0s_s`. When we try to reuse an
intrinsic with the same name, we fail because of type mismatch.
Fix by bitcasting `ObjCObjectPointerType` to `id` prior to passing as an
argument to `__builtin_constant_p`. This results in using intrinsic
`llvm.is.constant.p0i8` and correct types.
rdar://problem/47499250
Reviewers: rjmccall, ahatanak, void
Reviewed By: void, ahatanak
Subscribers: ddunbar, jkorous, hans, dexonsmith, cfe-commits
Differential Revision: https://reviews.llvm.org/D57427
llvm-svn: 353577
This allows substantially simplifying the expression evaluation code,
because we don't have to special-case lvalues which are actually string
literal initialization.
This currently throws away an optimization where we would avoid creating
an array APValue for string literal initialization. If we really want
to optimize this case, we should fix APValue so it can store simple
arrays more efficiently, like llvm::ConstantDataArray. This shouldn't
affect the memory usage for other string literals. (Not sure if this is
a blocker; I don't think string literal init is common enough for this
to be a serious issue, but I could be wrong.)
The change to test/CodeGenObjC/encode-test.m is a weird side-effect of
these changes: we currently don't constant-evaluate arrays in C, so the
strlen call shouldn't be folded, but lvalue string init managed to get
around that check. I this this is fine.
Fixes https://bugs.llvm.org/show_bug.cgi?id=40430 .
llvm-svn: 353569
Some of these functions take some extraneous arguments, e.g. EltSize,
Offset, which are computable from the Type and DataLayout.
Add some asserts to ensure that the computed values are consistent
with the passed-in values, in preparation for eliminating the
extraneous arguments. This also asserts that the Type is an Array for
the calls named "Array" and a Struct for the calls named "Struct".
Then, correct a couple of errors:
1. Using CreateStructGEP on an array type. (this causes the majority
of the test differences, as struct GEPs are created with i32
indices, while array GEPs are created with i64 indices)
2. Passing the wrong Offset to CreateStructGEP in TargetInfo.cpp on
x86-64 NACL (which uses 32-bit pointers).
Differential Revision: https://reviews.llvm.org/D57766
llvm-svn: 353529
Summary:
Automatic initialization [1] of __block variables was trampling over the block's
headers after they'd been initialized, which caused self-init usage to crash,
such as here:
typedef struct XYZ { void (^block)(); } *xyz_t;
__attribute__((noinline))
xyz_t create(void (^block)()) {
xyz_t myself = malloc(sizeof(struct XYZ));
myself->block = block;
return myself;
}
int main() {
__block xyz_t captured = create(^(){ (void)captured; });
}
This type of code shouldn't be broken by variable auto-init, even if it's
sketchy.
[1] With -ftrivial-auto-var-init=pattern
<rdar://problem/47798396>
Reviewers: rjmccall, pcc, kcc
Subscribers: jkorous, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57797
llvm-svn: 353495
The patch in r350643 incorrectly sets the COFF emission based on bits
instead of bytes. This patch converts the 32 via CharUnits to bits to
compare the correct values.
Change-Id: Icf38a16470ad5ae3531374969c033557ddb0d323
llvm-svn: 353411
CreateCall/Invoke.
Also, remove the getFunctionType() function from CGCallee, since it
accesses the pointee type of the value. The only use was in EmitCall,
so just inline it into the debug assertion.
This is the last of the changes for Call and Invoke in clang.
Differential Revision: https://reviews.llvm.org/D57804
llvm-svn: 353356
The assert added to EmitCall there was triggering in Windows Chromium
builds, due to a mismatch of the return type.
The MSVC constructor call extension (`this->Foo::Foo()`) was emitting
the constructor call from 'EmitCXXMemberOrOperatorMemberCallExpr' via
calling 'EmitCXXMemberOrOperatorCall', instead of
'EmitCXXConstructorCall'. On targets where HasThisReturn is true, that
was failing to set the proper return type in the call info.
Switching to calling EmitCXXConstructorCall also allowed removing some
code e.g. the trivial copy/move support, which is already handled in
EmitCXXConstructorCall.
Ref: https://bugs.chromium.org/p/chromium/issues/detail?id=928861
Differential Revision: https://reviews.llvm.org/D57794
llvm-svn: 353246
Summary:
Added ability to generate correct debug info data about the variable
address class. Currently, for all the locals and globals the default
values are used, ADDR_local_space(6) for locals and ADDR_global_space(5)
for globals. The values are taken from the table in
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf.
We need to emit correct data for address classes of, at least, shared
and constant globals. Currently, all these variables are treated by
the cuda-gdb debugger as the variables in the global address space
and, thus, it require manual data type casting.
Reviewers: echristo, probinson
Subscribers: jholewinski, aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D57162
llvm-svn: 353204
Emit{Nounwind,}RuntimeCall{,OrInvoke} have been modified to take a
FunctionCallee as an argument, and CreateRuntimeFunction has been
modified to return a FunctionCallee. All callers have been updated.
Additionally, CreateBuiltinFunction is removed, as it was redundant
with CreateRuntimeFunction after some previous changes.
Differential Revision: https://reviews.llvm.org/D57668
llvm-svn: 353184
edge cases.
Currently, EmitCall emits a call instruction with a function type
derived from the pointee-type of the callee. This *should* be the same
as the type created from the CallInfo parameter, but in some cases an
incorrect CallInfo was being passed.
All of these fixes were discovered by the addition of the assert in
EmitCall which verifies that the passed-in CallInfo matches the
Callee's function type.
As far as I know, these issues caused no bugs at the moment, as the
correct types were ultimately being emitted. But, some would become
problematic when pointee types are removed.
List of fixes:
* arrangeCXXConstructorCall was passing an incorrect value for the
number of Required args, when calling an inheriting constructor
where the inherited constructor is variadic. (The inheriting
constructor doesn't actually get passed any of the user's args, but
the code was calculating it as if it did).
* arrangeFreeFunctionLikeCall was not including the count of the
pass_object_size arguments in the count of required args.
* OpenCL uses other address spaces for the "this" pointer. However,
commonEmitCXXMemberOrOperatorCall was not annotating the address
space on the "this" argument of the call.
* Destructor calls were being created with EmitCXXMemberOrOperatorCall
instead of EmitCXXDestructorCall in a few places. This was a problem
because the calling convention sometimes has destructors returning
"this" rather than void, and the latter function knows about that,
and sets up the types properly (through calling
arrangeCXXStructorDeclaration), while the former does not.
* generateObjCGetterBody: the 'objc_getProperty' function returns type
'id', but was being called as if it returned the particular
property's type. (That is of course the *dynamic* return type, and
there's a downcast immediately after.)
* OpenMP user-defined reduction functions (#pragma omp declare
reduction) can be called with a subclass of the declared type. In
such case, the call was being setup as if the function had been
actually declared to take the subtype, rather than the base type.
Differential Revision: https://reviews.llvm.org/D57664
llvm-svn: 353181
Summary:
This is a follow up for https://reviews.llvm.org/D57278. The previous
revision should have also included Kernel ASan.
rdar://problem/40723397
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57711
llvm-svn: 353120
A non-lazy class will be initialized eagerly when the Objective-C runtime is
loaded. This is required for certain system classes which have instances allocated in
non-standard ways, such as the classes for blocks and constant strings.
Adding this attribute is essentially equivalent to providing a trivial
+load method but avoids the (fairly small) load-time overheads associated
with defining and calling such a method.
Differential Revision: https://reviews.llvm.org/D56555
llvm-svn: 353116
Summary: this commit adds support to a new dependence type introduced in OpenMP
5.0. The LLVM OpenMP RTL already supports this feature, so we only need to
modify CLANG to take advantage of them.
Differential Revision: https://reviews.llvm.org/D57576
llvm-svn: 353018
Summary:
This adds support for new-PM plugin loading to clang. The option
`-fpass-plugin=` may be used to specify a dynamic shared object file
that adheres to the PassPlugin API.
Tested: created simple plugin that registers an EP callback; with optimization level > 0, the pass is run as expected.
Committed on behalf of Marco Elver
Differential Revision: https://reviews.llvm.org/D56935
llvm-svn: 352972
Summary:
Currently, ASan inserts a call to `__asan_handle_no_return` before every
`noreturn` function call/invoke. This is unnecessary for calls to other
runtime funtions. This patch changes ASan to skip instrumentation for
functions calls marked with `!nosanitize` metadata.
Reviewers: TODO
Differential Revision: https://reviews.llvm.org/D57489
llvm-svn: 352948
This argument was added in r254554 in order to support the
pass_object_size attribute. However, in r296076, the attribute's
presence is now also represented in FunctionProtoType's
ExtParameterInfo, and thus it's unnecessary to pass along a separate
FunctionDecl.
The functions modified are:
RequiredArgs::forPrototype{,Plus}, and
CodeGenTypes::ConvertFunctionType.
After this, it's also (again) unnecessary to have a separate
ConvertFunctionType function ConvertType, so convert callers back to
the latter, leaving the former as an internal helper function.
llvm-svn: 352946
This is similar to import_module, but sets the import field name
instead.
By default, the import field name is the same as the C/asm/.o symbol
name. However, there are situations where it's useful to have it be
different. For example, suppose I have a wasm API with a module named
"pwsix" and a field named "read". There's no risk of namespace
collisions with user code at the wasm level because the generic name
"read" is qualified by the module name "pwsix". However in the C/asm/.o
namespaces, the module name is not used, so if I have a global function
named "read", it is intruding on the user's namespace.
With the import_field module, I can declare my function (in libc) to be
"__read", and then set the wasm import module to be "pwsix" and the wasm
import field to be "read". So at the C/asm/.o levels, my symbol is
outside the user namespace.
Differential Revision: https://reviews.llvm.org/D57602
llvm-svn: 352930
This patch implements parsing and sema for "omp declare mapper"
directive. User defined mapper, i.e., declare mapper directive, is a new
feature in OpenMP 5.0. It is introduced to extend existing map clauses
for the purpose of simplifying the copy of complex data structures
between host and device (i.e., deep copy). An example is shown below:
struct S { int len; int *d; };
#pragma omp declare mapper(struct S s) map(s, s.d[0:s.len]) // Memory region that d points to is also mapped using this mapper.
Contributed-by: Lingda Li <lildmh@gmail.com>
Differential Revision: https://reviews.llvm.org/D56326
llvm-svn: 352906
Summary:
UBSan wants to detect when unreachable code is actually reached, so it
adds instrumentation before every unreachable instruction. However, the
optimizer will remove code after calls to functions marked with
noreturn. To avoid this UBSan removes noreturn from both the call
instruction as well as from the function itself. Unfortunately, ASan
relies on this annotation to unpoison the stack by inserting calls to
_asan_handle_no_return before noreturn functions. This is important for
functions that do not return but access the the stack memory, e.g.,
unwinder functions *like* longjmp (longjmp itself is actually
"double-proofed" via its interceptor). The result is that when ASan and
UBSan are combined, the noreturn attributes are missing and ASan cannot
unpoison the stack, so it has false positives when stack unwinding is
used.
Changes:
Clang-CodeGen now directly insert calls to `__asan_handle_no_return`
when a call to a noreturn function is encountered and both
UBsan-unreachable and ASan are enabled. This allows UBSan to continue
removing the noreturn attribute from functions without any changes to
the ASan pass.
Previously generated code:
```
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable
```
Generated code (for now):
```
call void @__asan_handle_no_return
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable
```
rdar://problem/40723397
Reviewers: delcypher, eugenis, vsk
Differential Revision: https://reviews.llvm.org/D57278
> llvm-svn: 352690
llvm-svn: 352829
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.
Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352827
This reverts commit f47d6b38c7 (r352791).
Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.
llvm-svn: 352800
Instead of calling CUDA runtime to arrange function arguments,
the new API constructs arguments in a local array and the kernels
are launched with __cudaLaunchKernel().
The old API has been deprecated and is expected to go away
in the next CUDA release.
Differential Revision: https://reviews.llvm.org/D57488
llvm-svn: 352799
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352791
Summary:
UBSan wants to detect when unreachable code is actually reached, so it
adds instrumentation before every unreachable instruction. However, the
optimizer will remove code after calls to functions marked with
noreturn. To avoid this UBSan removes noreturn from both the call
instruction as well as from the function itself. Unfortunately, ASan
relies on this annotation to unpoison the stack by inserting calls to
_asan_handle_no_return before noreturn functions. This is important for
functions that do not return but access the the stack memory, e.g.,
unwinder functions *like* longjmp (longjmp itself is actually
"double-proofed" via its interceptor). The result is that when ASan and
UBSan are combined, the noreturn attributes are missing and ASan cannot
unpoison the stack, so it has false positives when stack unwinding is
used.
Changes:
Clang-CodeGen now directly insert calls to `__asan_handle_no_return`
when a call to a noreturn function is encountered and both
UBsan-unreachable and ASan are enabled. This allows UBSan to continue
removing the noreturn attribute from functions without any changes to
the ASan pass.
Previously generated code:
```
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable
```
Generated code (for now):
```
call void @__asan_handle_no_return
call void @longjmp
call void @__asan_handle_no_return
call void @__ubsan_handle_builtin_unreachable
```
rdar://problem/40723397
Reviewers: delcypher, eugenis, vsk
Differential Revision: https://reviews.llvm.org/D57278
llvm-svn: 352690
objc_alloc and objc_allocWithZone may throw exceptions if the
underlying method does. If we're in a @try block, then make sure we
emit an invoke instead of a call.
rdar://47610407
Differential revision: https://reviews.llvm.org/D57476
llvm-svn: 352687
required.
Function __kmpc_push_target_tripcount should be emitted only if the
offloading entry is going to be emitted (for use in tgt_target...
functions). Otherwise, it should not be emitted.
llvm-svn: 352669