The cloning happens before all metadata nodes are resolved. Prevent the value
mapper from running into unresolved or temporary MD nodes.
Differential Revision: https://reviews.llvm.org/D39396
llvm-svn: 317047
Summary:
This change allows generalizing pointers in type signatures used for
cfi-icall by enabling the -fsanitize-cfi-icall-generalize-pointers flag.
This works by 1) emitting an additional generalized type signature
metadata node for functions and 2) llvm.type.test()ing for the
generalized type for translation units with the flag specified.
This flag is incompatible with -fsanitize-cfi-cross-dso because it would
require emitting twice as many type hashes which would increase artifact
size.
Reviewers: pcc, eugenis
Reviewed By: pcc
Subscribers: kcc
Differential Revision: https://reviews.llvm.org/D39358
llvm-svn: 317044
The LLVM sqrt intrinsic definition changed with:
D28797
...so we don't have to use any relaxed FP settings other than errno handling.
This patch sidesteps a question raised in PR27435:
https://bugs.llvm.org/show_bug.cgi?id=27435
Is a programmer using __builtin_sqrt() invoking the compiler's intrinsic definition of sqrt or the mathlib definition of sqrt?
But we have an answer now: the builtin should match the behavior of the libm function including errno handling.
Differential Revision: https://reviews.llvm.org/D39204
llvm-svn: 317031
This patch fixes various places in clang to propagate may-alias
TBAA access descriptors during construction of lvalues, thus
eliminating the need for the LValueBaseInfo::MayAlias flag.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D39008
llvm-svn: 316988
For non-zero alloca addr space, alloca is usually casted to default addr
space immediately.
For non-vla, alloca is inserted at AllocaInsertPt, therefore the addr
space cast should also be insterted at AllocaInsertPt. However,
for vla, alloca is inserted at the current insertion point of IRBuilder,
therefore the addr space cast should also inserted at the current
insertion point of IRBuilder.
Currently clang always insert addr space cast at AllocaInsertPt, which
causes invalid IR.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D39374
llvm-svn: 316909
Craig noticed that CodeGen wasn't properly ignoring the
values sent to the target attribute. This patch ignores
them.
This patch also sets the 'default' for this checking to
'supported', since only X86 has implemented the support
for checking valid CPU names and Feature Names.
One test was changed to i686, since it uses a lakemont,
which would otherwise be prohibited in x86_64.
Differential Revision: https://reviews.llvm.org/D39357
llvm-svn: 316783
Fixes an assertion failure when ivar is a struct containing incomplete
array. Also completes support for direct flexible array members.
rdar://problem/21054495
Reviewers: rjmccall, theraven
Reviewed By: rjmccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D38774
llvm-svn: 316723
Instead of only setting a non-zero debug location on the return
instruction in *_helper_block functions, set a proper location on all
instructions within these functions. Pick the start location of the
block literal expr for maximum clarity.
The debugger does not step into *_helper_block functions during normal
single-stepping because we mark their parameters as artificial. This is
what we want (the functions are implicitly generated and uninteresting
to most users). The stepping behavior is unchanged by this patch.
rdar://32907581
Differential Revision: https://reviews.llvm.org/D39310
llvm-svn: 316704
The exisiting code goes out of its way to put block parameters into an
alloca only at -O0, and then describes the funciton argument with a
dbg.declare, which is undocumented in the LLVM-CFE contract and does
not actually behave as intended after LLVM r642022.
This patch just generates the alloca unconditionally, the mem2reg pass
will eliminate it at -O1 and up anyway and points the dbg.declare to
the alloca as intended (which mem2reg will then correctly rewrite into
a dbg.value).
This reapplies r316684 with some dead code removed.
rdar://problem/35043980
Differential Revision: https://reviews.llvm.org/D39305
llvm-svn: 316689
The exisiting code goes out of its way to put block parameters into an
alloca only at -O0, and then describes the funciton argument with a
dbg.declare, which is undocumented in the LLVM-CFE contract and does
not actually behave as intended after LLVM r642022.
This patch just generates the alloca unconditionally, the mem2reg pass
will eliminate it at -O1 and up anyway and points the dbg.declare to
the alloca as intended (which mem2reg will then correctly rewrite into
a dbg.value).
rdar://problem/35043980
Differential Revision: https://reviews.llvm.org/D39305
llvm-svn: 316684
Darwin uses char * for the variadic list type (va_list). We use the PPC
SVR4 ABI for PPC, which uses a structure type for the va_list. When
constructing the GEP, we would fail due to the incorrect handling for
the va_list. Correct this to use the right type.
llvm-svn: 316599
Ensure that we check the ivar containing decl for the DLL storage
attribute rather than the ivar itself as the dll storage is associated
to the interface decl not the ivar decl.
llvm-svn: 316545
Builder save/restores insertion pointer when emitting addr space cast
for alloca, but does not save/restore debug loc, which causes verifier
failure for certain call instructions.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D39069
llvm-svn: 316484
In some cases the compiler can deduce the length of an array section
as constants. With this information, VLAs can be avoided in place of
a constant sized array or even a scalar value if the length is 1.
Example:
int a[4], b[2];
pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }
For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
pragma omp parallel reduction(+: c[1:1][1:2])
{ }
This relands commit r316229 that I reverted in r316235 because it
failed on some bots. During investigation I found that this was because
Clang and GCC evaluate the two arguments to emplace_back() in
ReductionCodeGen::emitSharedLValue() in a different order, hence
leading to a different order of generated instructions in the final
LLVM IR. Fix this by passing in the arguments from temporary variables
that are evaluated in a defined order.
Differential Revision: https://reviews.llvm.org/D39136
llvm-svn: 316362
In some cases the compiler can deduce the length of an array section
as constants. With this information, VLAs can be avoided in place of
a constant sized array or even a scalar value if the length is 1.
Example:
int a[4], b[2];
pragma omp parallel reduction(+: a[1:2], b[1:1])
{ }
For chained array sections, this optimization is restricted to cases
where all array sections except the last have a constant length 1.
This trivially guarantees that there are no holes in the memory region
that needs to be privatized.
Example:
int c[3][4];
pragma omp parallel reduction(+: c[1:1][1:2])
{ }
Differential Revision: https://reviews.llvm.org/D39136
llvm-svn: 316229
In function GetIntrinsic, not all types are covered. Types double and long long are missed, type long is wrongly treated same as int, it should be same as long long. These problems cause compiler crashes when compiling code in PR31161. This patch fixed the problem.
Differential Revision: https://reviews.llvm.org/D38820
llvm-svn: 316179
If the variables is boolean and we generating inner function with real
types, the codegen may crash because of not loading boolean value from
memory.
llvm-svn: 316011
Currently clang assumes the temporary variables emitted during
codegen of atomic builtins have address space 0, which
is not true for target triple amdgcn---amdgiz and causes invalid
bitcasts.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D38966
llvm-svn: 316000
The main change is that now we generate TBAA info before
constructing the resulting lvalue instead of constructing lvalue
with some default TBAA info and fixing it as necessary
afterwards. We also keep the TBAA info close to lvalue base info,
which is supposed to simplify their future merging.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38947
llvm-svn: 315989
This patch addresses the rest of the cases where we pass lvalue
base info, but do not provide corresponding TBAA info.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to make
reviewing easier.
Differential Revision: https://reviews.llvm.org/D38945
llvm-svn: 315986
A trailing deferred region isn't necessary in a function that ends with
this pattern:
...
else {
...
return;
}
Special-case this pattern so that the closing curly brace of the
function isn't marked as uncovered. This issue came up in PR34962.
llvm-svn: 315982
This makes it possible to view sub-line region counts for the l.h.s of
&& and || expressions in coverage reports.
It also fixes PR33465, which shows an example of incorrect coverage
output for an assignment statement containing '||'.
llvm-svn: 315979
Currently all the consecutive bitfields are wrapped as a large integer unless there is unamed zero sized bitfield in between. The patch provides an alternative manner which makes the bitfield to be accessed as separate memory location if it has legal integer width and is naturally aligned. Such separate bitfield may split the original consecutive bitfields into subgroups of consecutive bitfields, and each subgroup will be wrapped as an integer. Now This is all controlled by an option -ffine-grained-bitfield-accesses. The alternative of bitfield access manner can improve the access efficiency of those bitfields with legal width and being aligned, but may reduce the chance of load/store combining of other bitfields, so it depends on how the bitfields are defined and actually accessed to choose when to use the option. For now the option is off by default.
Differential revision: https://reviews.llvm.org/D36562
llvm-svn: 315915
Summary:
Convert clang::LangAS to a strongly typed enum
Currently both clang AST address spaces and target specific address spaces
are represented as unsigned which can lead to subtle errors if the wrong
type is passed. It is especially confusing in the CodeGen files as it is
not possible to see what kind of address space should be passed to a
function without looking at the implementation.
I originally made this change for our LLVM fork for the CHERI architecture
where we make extensive use of address spaces to differentiate between
capabilities and pointers. When merging the upstream changes I usually
run into some test failures or runtime crashes because the wrong kind of
address space is passed to a function. By converting the LangAS enum to a
C++11 we can catch these errors at compile time. Additionally, it is now
obvious from the function signature which kind of address space it expects.
I found the following errors while writing this patch:
- ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address
space to TargetInfo::getPointer{Width,Align}()
- TypePrinter::printAttributedAfter() prints the numeric value of the
clang AST address space instead of the target address space.
However, this code is not used so I kept the current behaviour
- initializeForBlockHeader() in CGBlocks.cpp was passing
LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}()
- CodeGenFunction::EmitBlockLiteral() was passing a AST address space to
TargetInfo::getPointerWidth()
- CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space
to Qualifiers::addAddressSpace()
- CGOpenMPRuntimeNVPTX::getParameterAddress() was using
llvm::Type::getPointerTo() with a AST address space
- clang_getAddressSpace() returns either a LangAS or a target address
space. As this is exposed to C I have kept the current behaviour and
added a comment stating that it is probably not correct.
Other than this the patch should not cause any functional changes.
Reviewers: yaxunl, pcc, bader
Reviewed By: yaxunl, bader
Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D38816
llvm-svn: 315871
In OpenCL the kernel function and non-kernel function has different calling conventions.
For certain targets they have different argument ABIs. Also kernels have special function
attributes and metadata for runtime to launch them.
The blocks passed to enqueue_kernel is supposed to be executed as kernels. As such,
the block invoke function should be emitted as kernel with proper calling convention and
argument ABI.
This patch emits enqueued block as kernel. If a block is both called directly and passed
to enqueue_kernel, separate functions will be generated.
Differential Revision: https://reviews.llvm.org/D38134
llvm-svn: 315804
The function sanitizer only checks indirect calls through function
pointers. This excludes all non-static member functions (constructor
calls, calls through thunks, etc. all use a separate code path). Don't
emit function signatures for functions that won't be checked.
Apart from cutting down on code size, this should fix a regression on
Linux caused by r313096. For context, see the mailing list discussion:
r313096 - [ubsan] Function Sanitizer: Don't require writable text segments
Testing: check-clang, check-ubsan
Differential Revision: https://reviews.llvm.org/D38913
llvm-svn: 315786
Currently Clang uses default address space (0) to represent private address space for OpenCL
in AST. There are two issues with this:
Multiple address spaces including private address space cannot be diagnosed.
There is no mangling for default address space. For example, if private int* is emitted as
i32 addrspace(5)* in IR. It is supposed to be mangled as PUAS5i but it is mangled as
Pi instead.
This patch attempts to represent OpenCL private address space explicitly in AST. It adds
a new enum LangAS::opencl_private and adds it to the variable types which are implicitly
private:
automatic variables without address space qualifier
function parameter
pointee type without address space qualifier (OpenCL 1.2 and below)
Differential Revision: https://reviews.llvm.org/D35082
llvm-svn: 315668
This feature is not (yet) approved by the C++ committee, so this is liable to
be reverted or significantly modified based on committee feedback.
No functionality change intended for existing code (a new type must be defined
in namespace std to take advantage of this feature).
llvm-svn: 315662
Fix PR32990 by effectively reverting r283063 and solving it a different
way.
We want to limit the hack to not replace equivalent available_externally
dtors specifically to libc++, which uses always_inline. It seems certain
versions of libc++ do not provide all the symbols that an explicit
template instantiation is expected to provide.
If we get to the code that forms a real alias, only *then* check if this
is available_externally, and do that by asking a better question, which
is "is this a declaration for the linker?", because *that's* what means
we can't form an alias to it.
As a follow-on simplification, remove the InEveryTU parameter. Its last
use guarded this code for forming aliases, but we should never form
aliases to declarations, regardless of what we know about every TU.
llvm-svn: 315656
reduction.
If the reduction is an array or an array section and reduction operation
is declare reduction without initializer, it may lead to crash.
llvm-svn: 315611
in C.
If we try to get the lvalue for thread_id variables in inlined regions,
we did not use the correct version of function. Fixed this bug by adding
overrided version of the function getThreadIDVariableLValue for inlined
regions.
llvm-svn: 315578
This patch enables explicit generation of TBAA information in all
cases where LValue base info is propagated or constructed in
non-trivial ways. Eventually, we will consider each of these
cases to make sure the TBAA information is correct and not too
conservative. For now, we just fall back to generating TBAA info
from the access type.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38733
llvm-svn: 315575
This reverts commit 4e4ee1c507e2707bb3c208e1e1b6551c3015cbf5.
This is failing due to some code that isn't built on MSVC
so I didn't catch. Not immediately obvious how to fix this
at first glance, so I'm reverting for now.
llvm-svn: 315536
There's a lot of misuse of Twine scattered around LLVM. This
ranges in severity from benign (returning a Twine from a function
by value that is just a string literal) to pretty sketchy (storing
a Twine by value in a class). While there are some uses for
copying Twines, most of the very compelling ones are confined
to the Twine class implementation itself, and other uses are
either dubious or easily worked around.
This patch makes Twine's copy constructor private, and fixes up
all callsites.
Differential Revision: https://reviews.llvm.org/D38767
llvm-svn: 315530
If both taskloop and task directives are used at the same time in one
program, we may ran into the situation when the particular type for task
directive is reused for taskloop directives. Patch fixes this problem.
llvm-svn: 315464
This change adds a new function, CodeGen::getFieldNumber, that
enables a user of clang's code generation to get the field number
in a generated LLVM IR struct that corresponds to a particular field
in a C struct.
It is important to expose this information in Clang's code generation
interface because there is no reasonable way for users of Clang's code
generation to get this information. In particular:
LLVM struct types do not include field names.
Clang adds a non-trivial amount of logic to the code generation of LLVM IR types for structs, in particular to handle padding and bit fields.
Patch by Michael Ferguson!
Differential Revision: https://reviews.llvm.org/D38473
llvm-svn: 315392
Usually compare expression should return i1 type, so EmitScalarConversion is called before return
return EmitScalarConversion(Result, CGF.getContext().BoolTy, E->getType(), E->getExprLoc());
But when ppc intrinsic is called to compare vectors, the ppc intrinsic can return i32 even E->getType() is BoolTy, in this case EmitScalarConversion does nothing, an i32 type result is returned and causes crash later.
This patch detects this case and truncates the result before return.
Differential Revision: https://reviews.llvm.org/D38656
llvm-svn: 315358
Besides obvious code simplification, avoiding explicit creation
of LValueBaseInfo objects makes it easier to make TBAA
information to be part of such objects.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38695
llvm-svn: 315289
This was done for CUDA functions in r261779, and for the same
reason this also needs to be done for OpenCL. An arbitrary
function could have a barrier() call in it, which in turn
requires the calling function to be convergent.
llvm-svn: 315094
The Cpu Init functionality is required for the target
attribute, so this patch simply splits it out into its own
function, exactly like CpuIs and CpuSupports.
llvm-svn: 315075
In C++11 variable to global variables are considered as constant
expressions and these variables are not captured in the outlined
regions. Patch allows capturing of such variables in the OpenMP regions.
llvm-svn: 315074
This patch is an attempt to clarify and simplify generation and
propagation of TBAA information. The idea is to pack all values
that describe a memory access, namely, base type, access type and
offset, into a single structure. This is supposed to make further
changes, such as adding support for unions and array members,
easier to prepare and review.
DecorateInstructionWithTBAA() is no more responsible for
converting types to tags. These implicit conversions not only
complicate reading the code, but also suggest assigning scalar
access tags while we generally prefer full-size struct-path tags.
TBAAPathTag is replaced with TBAAAccessInfo; the latter is now
the type of the keys of the cache map that translates access
descriptors to metadata nodes.
Fixed a bug with writing to a wrong map in
getTBAABaseTypeMetadata() (former getTBAAStructTypeInfo()).
We now check for valid base access types every time we
dereference a field. The original code only checks the top-level
base type. See isValidBaseType() / isTBAAPathStruct() calls.
Some entities have been renamed to sound more adequate and less
confusing/misleading in presence of path-aware TBAA information.
Now we do not lookup twice for the same cache entry in
getAccessTagInfo().
Refined relevant comments and descriptions.
Differential Revision: https://reviews.llvm.org/D37826
llvm-svn: 315048
code size.
Currently clang expands a call to __builtin_os_log_format into a long
sequence of instructions at the call site, causing code size to
increase in some cases.
This commit attempts to reduce code size by emitting a helper function
that can be shared by calls to __builtin_os_log_format with similar
formats and arguments. The helper function has linkonce_odr linkage to
enable the linker to merge identical functions across translation units.
Attribute 'noinline' is attached to the helper function at -Oz so that
the inliner doesn't inline functions that can potentially be merged.
This commit also fixes a bug where the generated IR writes past the end
of the buffer when "%m" is the last specifier appearing in the format
string passed to __builtin_os_log_format.
Original patch by Duncan Exon Smith.
rdar://problem/34065973
rdar://problem/34196543
Differential Revision: https://reviews.llvm.org/D38606
llvm-svn: 315045
This patch makes it possible to produce access tags in a uniform
manner regardless whether the resulting tag will be a scalar or a
struct-path one. getAccessTagInfo() now takes care of the actual
translation of access descriptors to tags and can handle all
kinds of accesses. Facilities that specific to scalar accesses
are eliminated.
Some more details:
* DecorateInstructionWithTBAA() is not responsible for conversion
of types to access tags anymore. Instead, it takes an access
descriptor (TBAAAccessInfo) and generates corresponding access
tag from it.
* getTBAAInfoForVTablePtr() reworked to
getTBAAVTablePtrAccessInfo() that now returns the
virtual-pointer access descriptor and not the virtual-point
type metadata.
* Added function getTBAAMayAliasAccessInfo() that returns the
descriptor for may-alias accesses.
* getTBAAStructTagInfo() renamed to getTBAAAccessTagInfo() as now
it is the only way to generate access tag by a given access
descriptor. It is capable of producing both scalar and
struct-path tags, depending on options and availability of the
base access type. getTBAAScalarTagInfo() and its cache
ScalarTagMetadataCache are eliminated.
* Now that we do not need to care about whether the resulting
access tag should be a scalar or struct-path one,
getTBAAStructTypeInfo() is renamed to getBaseTypeInfo().
* Added function getTBAAAccessInfo() that constructs access
descriptor by a given QualType access type.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38503
llvm-svn: 314979
This patch makes it possible to produce access tags in a uniform
manner regardless whether the resulting tag will be a scalar or a
struct-path one. getAccessTagInfo() now takes care of the actual
translation of access descriptors to tags and can handle all
kinds of accesses. Facilities that specific to scalar accesses
are eliminated.
Some more details:
* DecorateInstructionWithTBAA() is not responsible for conversion
of types to access tags anymore. Instead, it takes an access
descriptor (TBAAAccessInfo) and generates corresponding access
tag from it.
* getTBAAInfoForVTablePtr() reworked to
getTBAAVTablePtrAccessInfo() that now returns the
virtual-pointer access descriptor and not the virtual-point
type metadata.
* Added function getTBAAMayAliasAccessInfo() that returns the
descriptor for may-alias accesses.
* getTBAAStructTagInfo() renamed to getTBAAAccessTagInfo() as now
it is the only way to generate access tag by a given access
descriptor. It is capable of producing both scalar and
struct-path tags, depending on options and availability of the
base access type. getTBAAScalarTagInfo() and its cache
ScalarTagMetadataCache are eliminated.
* Now that we do not need to care about whether the resulting
access tag should be a scalar or struct-path one,
getTBAAStructTypeInfo() is renamed to getBaseTypeInfo().
* Added function getTBAAAccessInfo() that constructs access
descriptor by a given QualType access type.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38503
llvm-svn: 314977
Currently block is translated to a structure equivalent to
struct Block {
void *isa;
int flags;
int reserved;
void *invoke;
void *descriptor;
};
Except invoke, which is the pointer to the block invoke function,
all other fields are useless for OpenCL, which clutter the IR and
also waste memory since the block struct is passed to the block
invoke function as argument.
On the other hand, the size and alignment of the block struct is
not stored in the struct, which causes difficulty to implement
__enqueue_kernel as library function, since the library function
needs to know the size and alignment of the argument which needs
to be passed to the kernel.
This patch removes the useless fields from the block struct and adds
size and align fields. The equivalent block struct will become
struct Block {
int size;
int align;
generic void *invoke;
/* custom fields */
};
It also changes the pointer to the invoke function to be
a generic pointer since the address space of a function
may not be private on certain targets.
Differential Revision: https://reviews.llvm.org/D37822
llvm-svn: 314932
https://reviews.llvm.org/D38371
This patch implements codegen for the combined 'teams distribute" OpenMP pragma and adds regression tests for all its clauses.
llvm-svn: 314905
This patch fixes clang to propagate complete TBAA information for
atomic accesses and not just the final access types. Prepared
against D38456 and requires it to be committed first.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38460
llvm-svn: 314784
With this patch we implement a concept of TBAA access descriptors
that are capable of representing both scalar and struct-path
accesses in a generic way.
This is part of D37826 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38456
llvm-svn: 314780
Don't emit alignment checks which the IR constant folder throws away.
I've tested this out on X86FastISel.cpp. While this doesn't decrease
end-to-end compile-time significantly, it results in 122 fewer type
checks (1% reduction) overall, without adding any real complexity.
Differential Revision: https://reviews.llvm.org/D37544
llvm-svn: 314752
directives.
The argument of the `device` clause in target-based executable
directives must be captured to support codegen for the `target`
directives with the `depend` clauses.
llvm-svn: 314686
Simplified and generalized codegen for non-offloading part that works if
offloading is failed or condition of the `if` clause is `false`.
llvm-svn: 314670
This patch fixes misleading names of entities related to getting,
setting and generation of TBAA access type descriptors.
This is effectively an attempt to provide a review for D37826 by
breaking it into smaller pieces.
Differential Revision: https://reviews.llvm.org/D38404
llvm-svn: 314657
to have child entries describing the template parameters. This will
be on by default for SCE tuning.
Differential Revision: https://reviews.llvm.org/D14358
llvm-svn: 314444
Added missing addrspacecast case in alignment computation
logic of pointer type emission in IR generation.
Differential Revision: https://reviews.llvm.org/D37804
llvm-svn: 314304
Currently, if _attribute_((section())) is used for extern variables,
section information is not emitted in generated IR when the variables are used.
This is expected since sections are not generated for external linkage objects.
However NiosII requires this information as it uses special GP-relative accesses
for any objects that use attribute section (.sdata). GCC keeps this attribute in
middle-end.
This change emits the section information for all targets.
Patch By: Elizabeth Andrews
Differential Revision:https://reviews.llvm.org/D36487
llvm-svn: 314262
This patch fixes clang to decorate reference accesses as pointers
and not as "omnipotent chars".
Differential Revision: https://reviews.llvm.org/D38074
llvm-svn: 314209
directives.
If the variable is used in the target-based region but is not found in
any private|mapping clause, then generate implicit firstprivate|map
clauses for these implicitly mapped variables.
llvm-svn: 314205
Summary:
This is the follow-up patch to D37924.
This change refactors clang to use the the newly added section headers
in SpecialCaseList to specify which sanitizers blacklists entries
should apply to, like so:
[cfi-vcall]
fun:*bad_vcall*
[cfi-derived-cast|cfi-unrelated-cast]
fun:*bad_cast*
The SanitizerSpecialCaseList class has been added to allow querying by
SanitizerMask, and SanitizerBlacklist and its downstream users have been
updated to provide that information. Old blacklists not using sections
will continue to function identically since the blacklist entries will
be placed into a '[*]' section by default matching against all
sanitizers.
Reviewers: pcc, kcc, eugenis, vsk
Reviewed By: eugenis
Subscribers: dberris, cfe-commits, mgorny
Differential Revision: https://reviews.llvm.org/D37925
llvm-svn: 314171
This is to fix PR34347. EmitAtomicExpr now only uses alignment information from
Type, instead of Decl, so when the declaration of an atomic variable is marked
to have the alignment equal as its size, EmitAtomicExpr doesn't know about it and
will generate libcall instead of atomic op. The patch uses EmitPointerWithAlignment
to get the precise alignment information.
Differential Revision: https://reviews.llvm.org/D37310
llvm-svn: 314145
This commit fixes a bug in the handling of storage-only __fp16 vectors
where clang didn't promote __fp16 vector operands to float vectors.
Conceptually, it performs the following transformation on the AST in
CreateBuiltinBinOp and CreateBuiltinUnaryOp:
(Before)
typedef __fp16 half4 __attribute__ ((vector_size (8)));
typedef float float4 __attribute__ ((vector_size (16)));
half4 hv0, hv1, hv2, hv3;
hv0 = hv1 + hv2 + hv3;
(After)
float4 t0 = (float4)hv1 + (float4)hv2;
float4 t1 = t0 + (float4)hv3;
hv0 = (half4)t1;
Note that this commit fixes the bug for targets that set
HalfArgsAndReturns to true (ARM and ARM64). Targets using intrinsics
such as llvm.convert.to.fp16 to handle __fp16 are still broken.
rdar://problem/20625184
Differential Revision: https://reviews.llvm.org/D32520
llvm-svn: 314056
body of global block invoke functions.
This commit fixes an infinite loop in IRGen that occurs when compiling
the following code:
void FUNC2() {
static void (^const block1)(int) = ^(int a){
if (a--)
block1(a);
};
}
This is how IRGen gets stuck in the infinite loop:
1. GenerateBlockFunction is called to emit the body of "block1".
2. GetAddrOfGlobalBlock is called to get the address of "block1". The
function calls getAddrOfGlobalBlockIfEmitted to check whether the
global block has been emitted. If it hasn't been emitted, it then
tries to emit the body of the block function by calling
GenerateBlockFunction, which goes back to step 1.
This commit prevents the inifinite loop by building the global block in
GenerateBlockFunction before emitting the body of the block function.
rdar://problem/34541684
Differential Revision: https://reviews.llvm.org/D38118
llvm-svn: 314029
Add an option to emit limited coverage info for unused decls. It's just a
cl::opt for now to allow us to experiment quickly.
When building llc, this results in an 84% size reduction in the llvm_covmap
section, and a similar size reduction in the llvm_prf_names section. In
practice I expect the size reduction to be roughly quadratic with the size of
the program.
The downside is that coverage for headers will no longer be complete. This will
make the line/function/region coverage metrics incorrect, since they will be
artificially high. One mitigation would be to somehow disable those metrics
when using limited-coverage=true.
This is related to: llvm.org/PR34533 (make SourceBasedCodeCoverage scale)
Differential Revision: https://reviews.llvm.org/D38107
llvm-svn: 314002
If the captured variable has re-declaration we may end up with the
situation where the captured variable is the re-declaration while the
referenced variable is the canonical declaration (or vice versa). In
this case we may generate wrong code. Patch fixes this situation.
llvm-svn: 313995
The attribute informs the compiler that the annotated pointer parameter
of a function cannot escape and enables IRGen to attach attribute
'nocapture' to parameters that are annotated with the attribute. That is
the only optimization that currently takes advantage of 'noescape', but
there are other optimizations that will be added later that improves
IRGen for ObjC blocks.
This recommits r313722, which was reverted in r313725 because clang
couldn't build compiler-rt. It failed to build because there were
function declarations that were missing 'noescape'. That has been fixed
in r313929.
rdar://problem/19886775
Differential Revision: https://reviews.llvm.org/D32210
llvm-svn: 313945
This reverts commit r313722.
It looks like compiler-rt/lib/tsan/rtl/tsan_libdispatch_mac.cc cannot be
compiled because some of the functions declared in the file do not match
the ones in the SDK headers (which are annotated with 'noescape').
llvm-svn: 313725
The attribute informs the compiler that the annotated pointer parameter
of a function cannot escape and enables IRGen to attach attribute
'nocapture' to parameters that are annotated with the attribute. That is
the only optimization that currently takes advantage of 'noescape', but
there are other optimizations that will be added later that improves
IRGen for ObjC blocks.
rdar://problem/19886775
Differential Revision: https://reviews.llvm.org/D32210
llvm-svn: 313722
The attribute informs the compiler that the annotated pointer parameter
of a function cannot escape and enables IRGen to attach attribute
'nocapture' to parameters that are annotated with the attribute. That is
the only optimization that currently takes advantage of 'noescape', but
there are other optimizations that will be added later that improves
IRGen for ObjC blocks.
rdar://problem/19886775
Differential Revision: https://reviews.llvm.org/D32520
llvm-svn: 313720
As a special case, throw away deferred regions for trailing returns.
This allows the closing curly brace to have a count, and is less
distracting.
llvm-svn: 313603
Summary:
Restore the `__builtin_wasm_rethrow` builtin deleted in D37931. On second
thought, it appears it can be used to implement `__cxa_rethrow`.
Reviewers: dschuff, sunfish
Reviewed By: dschuff
Subscribers: jfb, sbc100, jgravelle-google
Differential Revision: https://reviews.llvm.org/D37942
llvm-svn: 313430
This patch replaces the perm2f128 intrinsics with native shuffle vectors.
This uses a pretty simple approach to allocate source 0 to the lower half input and source 1 to the upper half input. Then its just a matter of filling in the indices to use either the lower or upper half of that specific source. This can result in the same source being used by both operands. InstCombine or SelectionDAGBuilder should be able to clean that up.
Differential Revision: https://reviews.llvm.org/D37892
llvm-svn: 313418
Summary:
Remove `__builtin_wasm_rethrow` builtin. I thought it was required to implement
`__cxa_rethrow` function in libcxxabi, but it turned out it will be using
`__builtin_wasm_throw` instead.
Reviewers: dschuff, jgravelle-google
Reviewed By: jgravelle-google
Subscribers: jfb, sbc100, jgravelle-google
Differential Revision: https://reviews.llvm.org/D37931
llvm-svn: 313402
Summary:
To improve CodeView quality for static member functions, we need to make the
static explicit. In addition to a small change in LLVM's CodeViewDebug to
return the appropriate MethodKind, this requires a small change in Clang to
note the staticness in the debug info metadata.
Subscribers: aprantl, hiraditya
Differential Revision: https://reviews.llvm.org/D37715
llvm-svn: 313192
This change will make it possible to use -fsanitize=function on Darwin and
possibly on other platforms. It fixes an issue with the way RTTI is stored into
function prologue data.
On Darwin, addresses stored in prologue data can't require run-time fixups and
must be PC-relative. Run-time fixups are undesirable because they necessitate
writable text segments, which can lead to security issues. And absolute
addresses are undesirable because they break PIE mode.
The fix is to create a private global which points to the RTTI, and then to
encode a PC-relative reference to the global into prologue data.
Differential Revision: https://reviews.llvm.org/D37597
llvm-svn: 313096
Summary:
Microsoft Visual Studio expects debug locations to correspond to
statements. We used to emit locations for expressions nested inside statements.
This would confuse the debugger, causing it to stop multiple times on the
same line and breaking the "step into specific" feature. This change inhibits
the emission of debug locations for nested expressions when emitting CodeView
debug information, unless column information is enabled.
Fixes PR34312.
Reviewers: rnk, zturner
Reviewed By: rnk
Subscribers: majnemer, echristo, aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D37529
llvm-svn: 312965
This patch teaches the preprocessor to report more precise source ranges for
code that is skipped due to conditional directives.
The new behavior includes the '#' from the opening directive and the full text
of the line containing the closing directive in the skipped area. This matches
up clang's behavior (we don't IRGen the code between the closing "endif" and
the end of a line).
This also affects the code coverage implementation. See llvm.org/PR34166 (this
also happens to be rdar://problem/23224058).
The old behavior (report the end of the skipped range as the end
location of the 'endif' token) is preserved for indexing clients.
Differential Revision: https://reviews.llvm.org/D36642
llvm-svn: 312947
When performing a NSFastEnumeration, the compiler synthesizes a call to
`countByEnumeratingWithState:objects:count:` where the `count` parameter
is of type `NSUInteger` and the return type is a `NSUInteger`. We would
previously always use a `UnsignedLongTy` for the `NSUInteger` type. On
32-bit targets, `long` is 32-bits which is the same as `unsigned int`.
Most 64-bit targets are LP64, where `long` is 64-bits. However, on
LLP64 targets, such as Windows, `long` is 32-bits. Introduce new
`getNSUIntegerType` and `getNSIntegerType` helpers to allow us to
determine the correct type for the `NSUInteger` type. Wire those
through into the generation of the message dispatch to the selector.
llvm-svn: 312835
This is to fix PR34347. EmitAtomicExpr now only uses alignment information from
Type, instead of Decl, so when the declaration of an atomic variable is marked
to have the alignment equal as its size, EmitAtomicExpr doesn't know about it and
will generate libcall instead of atomic op. The patch uses EmitPointerWithAlignment
to get the precise alignment information.
Differential Revision: https://reviews.llvm.org/D37310
llvm-svn: 312830
The current coverage implementation doesn't handle region termination
very precisely. Take for example an `if' statement with a `return':
void f() {
if (true) {
return; // The `if' body's region is terminated here.
}
// This line gets the same coverage as the `if' condition.
}
If the function `f' is called, the line containing the comment will be
marked as having executed once, which is not correct.
The solution here is to create a deferred region after terminating a
region. The deferred region is completed once the start location of the
next statement is known, and is then pushed onto the region stack.
In the cases where it's not possible to complete a deferred region, it
can safely be dropped.
Testing: lit test updates, a stage2 coverage-enabled build of clang
This is a reapplication but there are no changes from the original commit.
With D36813, the segment builder in llvm will be able to handle deferred
regions correctly.
llvm-svn: 312818
This is to fix PR34347. EmitAtomicExpr now only uses alignment information from
Type, instead of Decl, so when the declaration of an atomic variable is marked
to have the alignment equal as its size, EmitAtomicExpr doesn't know about it and
will generate libcall instead of atomic op. The patch uses EmitPointerWithAlignment
to get the precise alignment information.
Differential Revision: https://reviews.llvm.org/D37310
llvm-svn: 312801
This is a recommit of r312781; in some build configurations
variable names are omitted, so changed the new regression
test accordingly.
llvm-svn: 312794
Summary:
1.Updated annotations for include/clang/StaticAnalyzer/Core/PathSensitive/Store.h, which belong to the old version of clang.
2.Delete annotations for CodeGenFunction::getEvaluationKind() in clang/lib/CodeGen/CodeGenFunction.h, which belong to the old version of clang.
Reviewers: bkramer, krasimir, klimek
Reviewed By: bkramer
Subscribers: MTC
Differential Revision: https://reviews.llvm.org/D36330
Contributed by @MTC!
llvm-svn: 312790
This adds _Float16 as a source language type, which is a 16-bit floating point
type defined in C11 extension ISO/IEC TS 18661-3.
In follow up patches documentation and more tests will be added.
Differential Revision: https://reviews.llvm.org/D33719
llvm-svn: 312781
__kmpc_for_static_fini().
Added special flags for calls of __kmpc_for_static_fini(), like previous
ly for __kmpc_for_static_init(). Added flag OMP_IDENT_WORK_DISTRIBUTE
for distribute cnstruct, OMP_IDENT_WORK_SECTIONS for sections-based
constructs and OMP_IDENT_WORK_LOOP for loop-based constructs in
location flags.
llvm-svn: 312642
move constructor.
Previously user-defined reduction initializer was considered as an
assignment expression, not as initializer. Fixed this by treating the
initializer expression as an initializer.
llvm-svn: 312638
Summary:
As the attributed statements are considered simple statements no
stoppoint was generated before emitting attributed do/while/for/range-
statement. This lead to faulty debug locations.
Reviewers: echristo, aaron.ballman, dblaikie
Reviewed By: dblaikie
Subscribers: bjope, aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D37428
llvm-svn: 312623
By exposing the constant initializer, the optimizer can fold many
of these constructs.
This is a recommit of r311857 that was reverted in r311898 because
an assert was hit when building Chromium.
We have to take into account that the GlobalVariable may be first
created with a different type than the initializer. This can
happen for example when the variable is a struct with tail padding
while the initializer does not have padding. In such case, the
variable needs to be destroyed an replaced with a new one with the
type of the initializer.
Differential Revision: https://reviews.llvm.org/D34992
llvm-svn: 312512
Because it is common to treat vector types as an array of their elements, or
even some other type that's not the element type, and thus index into them, we
can't use struct-path TBAA for these accesses. Even though we already treat all
vector types as equivalent to 'char', we were using field-offset information
for them with TBAA, and this renders undefined the intra-value indexing we
intend to allow. Note that, although 'char' is universally aliasing, with path
TBAA, we can still differentiate between access to s.a and s.b in
struct { char a, b; } s;. We can't use this capability as-is for vector types.
Fixes PR33967.
llvm-svn: 312447
Not all targets support vararg (e.g. amdgpu). Instead of using vararg in the emitted functions for enqueue_kernel,
this patch creates a temporary array of size_t, stores the size arguments in the temporary array
and passes it to the emitted functions for enqueue_kernel.
Differential Revision: https://reviews.llvm.org/D36678
llvm-svn: 312441
"target" implementation
A small set of refactors that'll make it easier for me to implement 'target'
support.
First, extract the CPUSupports functionality into its own function.
THis has the advantage of not wasting time in this builtin to deal with
arguments.
Second, pulls both CPUSupports and CPUIs implementation into a member-function,
so that it can be called from the resolver generation that I'm working on.
Third, creates an overload that takes simply the feature/cpu name (rather than
extracting it from a callexpr), since that info isn't available later.
Note that despite how the 'diff' looks, the EmitX86CPUSupports function simply
takes the implementation out of the 'switch'.
llvm-svn: 312355
This fixes cases where dynamic classes produced RTTI data with
external linkage, producing linker errors about duplicate symbols.
This touches code close to what was changed in SVN r244266, but
this change doesn't break the tests added in that revision.
The previous version had missed to update CodeGenCXX/virt-dtor-key.cpp,
which had a behaviour change only when running the testsuite on windows.
Differential revision: https://reviews.llvm.org/D37327
llvm-svn: 312306
This fixes cases where dynamic classes produced RTTI data with
external linkage, producing linker errors about duplicate symbols.
This touches code close to what was changed in SVN r244266, but
this change doesn't break the tests added in that revision.
Differential revision: https://reviews.llvm.org/D37206
llvm-svn: 312224
Summary:
An implementation of ubsan runtime library suitable for use in production.
Minimal attack surface.
* No stack traces.
* Definitely no C++ demangling.
* No UBSAN_OPTIONS=log_file=/path (very suid-unfriendly). And no UBSAN_OPTIONS in general.
* as simple as possible
Minimal CPU and RAM overhead.
* Source locations unnecessary in the presence of (split) debug info.
* Values and types (as in A+B overflows T) can be reconstructed from register/stack dumps, once you know what type of error you are looking at.
* above two items save 3% binary size.
When UBSan is used with -ftrap-function=abort, sometimes it is hard to reason about failures. This library replaces abort with a slightly more informative message without much extra overhead. Since ubsan interface in not stable, this code must reside in compiler-rt.
Reviewers: pcc, kcc
Subscribers: srhines, mgorny, aprantl, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D36810
llvm-svn: 312029
It caused PR759744.
> Emit static constexpr member as available_externally definition
>
> By exposing the constant initializer, the optimizer can fold many
> of these constructs.
>
> Differential Revision: https://reviews.llvm.org/D34992
llvm-svn: 311898
This adds builtin_cpu_init which will emit a call to cpu_indicator_init in libgcc or compiler-rt.
This is needed to support builtin_cpu_supports/builtin_cpu_is in an ifunc resolver.
Differential Revision: https://reviews.llvm.org/D36336
llvm-svn: 311874
By exposing the constant initializer, the optimizer can fold many
of these constructs.
Differential Revision: https://reviews.llvm.org/D34992
llvm-svn: 311857
This allows multi-module / incremental compilation environments to have unique
initializer symbols.
Patch by Axel Naumann with minor modifications by me!
llvm-svn: 311844
When isIncrementalProcessingEnabled is on we might want to produce multiple
llvm::Modules. This patch allows the clients to start a new llvm::Module,
allowing CodeGen to continue working after a HandleEndOfTranslationUnit call.
This should give the necessary facilities to write a unittest for D34059.
As discussed in the review this is meant to give us a way to proceed forward
in our efforts to upstream our interpreter-related patches. The design of this
will likely change soon.
llvm-svn: 311843
This patch adds a flag -fclang-abi-compat that can be used to request that
Clang attempts to be ABI-compatible with some older version of itself.
This is provided on a best-effort basis; right now, this can be used to undo
the ABI change in r310401, reverting Clang to its prior C++ ABI for pass/return
by value of class types affected by that change, and to undo the ABI change in
r262688, reverting Clang to using integer registers rather than SSE registers
for passing <1 x long long> vectors. The intent is that we will maintain this
backwards compatibility path as we make ABI-breaking fixes in future.
The reversion to the old behavior for r310401 is also applied to the PS4 target
since that change is not part of its platform ABI (which is essentially to do
whatever Clang 3.2 did).
llvm-svn: 311823
expressions
C++ allows us to reference static variables through member expressions. Prior to
this commit, non-integer static variables that were referenced using a member
expression were always emitted using lvalue loads. The old behaviour introduced
an inconsistency between regular uses of static variables and member expressions
uses. For example, the following program compiled and linked successfully:
struct Foo {
constexpr static const char *name = "foo";
};
int main() {
return Foo::name[0] == 'f';
}
but this program failed to link because "Foo::name" wasn't found:
struct Foo {
constexpr static const char *name = "foo";
};
int main() {
Foo f;
return f.name[0] == 'f';
}
This commit ensures that constant static variables referenced through member
expressions are emitted in the same way as ordinary static variable references.
rdar://33942261
Differential Revision: https://reviews.llvm.org/D36876
llvm-svn: 311772
Summary:
If await_suspend returns a coroutine_handle, as in the example below:
```
coroutine_handle<> await_suspend(coroutine_handle<> h) {
coro.promise().waiter = h;
return coro;
}
```
suspensionExpression processing will resume the coroutine pointed at by that handle.
Related LLVM change rL311751 makes resume calls of this kind `musttail` at any optimization level.
This enables unlimited symmetric control transfer from coroutine to coroutine without blowing up the stack.
Reviewers: GorNishanov
Reviewed By: GorNishanov
Subscribers: rsmith, EricWF, cfe-commits
Differential Revision: https://reviews.llvm.org/D37131
llvm-svn: 311762
Summary: With accurate sample profile, we can do more aggressive size optimization. For some size-critical application, this can reduce the text size by 20%
Reviewers: davidxl, rsmith
Reviewed By: davidxl, rsmith
Subscribers: mehdi_amini, eraman, sanjoy, cfe-commits
Differential Revision: https://reviews.llvm.org/D37091
llvm-svn: 311707
Do not sanitize the 'this' pointer of a member call operator for a lambda with
no capture-default, since that call operator can legitimately be called with a
null this pointer from the static invoker function. Any actual call with a null
this pointer should still be caught in the caller (if it is being sanitized).
This reinstates r311589 (reverted in r311680) with the above fix.
llvm-svn: 311695
This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}.
Supersedes D35205
llvm counterpart: D36369
Differential Revision: https://reviews.llvm.org/D36371
llvm-svn: 311643
of class fails to map class static variable.
If the global variable is captured and it has several redeclarations,
sometimes it may lead to a compiler crash. Patch fixes this by working
only with canonical declarations.
llvm-svn: 311479
The comment markers accepted by the assembler vary between different targets,
but '//' is always accepted, so we should use that for consistency.
Differential revision: https://reviews.llvm.org/D36666
llvm-svn: 311325
Summary:
Augment SanitizerCoverage to insert maximum stack depth tracing for
use by libFuzzer. The new instrumentation is enabled by the flag
-fsanitize-coverage=stack-depth and is compatible with the existing
trace-pc-guard coverage. The user must also declare the following
global variable in their code:
thread_local uintptr_t __sancov_lowest_stack
https://bugs.llvm.org/show_bug.cgi?id=33857
Reviewers: vitalybuka, kcc
Reviewed By: vitalybuka
Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D36839
llvm-svn: 311186
Summary:
Even in the case of the input file is a preprocessed source, clang uses the file name of the preprocesses source for debug info (DW_AT_name attribute for DW_TAG_compile_unit). However, gcc uses the file name specified in the first linemarker instead. This makes more sense because the one specified in the linemarker represents the "actual" source file name.
Clang already uses the file name specified in the first linemarker for Module name (https://github.com/llvm-mirror/clang/blob/master/lib/Frontend/FrontendAction.cpp#L779) if the input is preprocessed. This patch makes clang to use the same value for debug info as well.
Reviewers: compnerd, rnk, dblaikie, rsmith
Reviewed By: rnk
Subscribers: aprantl, cfe-commits
Differential Revision: https://reviews.llvm.org/D36474
llvm-svn: 311037
If worksharing construct has at least one linear item, an implicit
synchronization point must be emitted to avoid possible conflict with
the loading/storing values to the original variables. Added implicit
barrier if the linear item is found before actual start of the
worksharing construct.
llvm-svn: 311013
If exceptions are enabled, there may be a problem with the codegen of
the finalization functions from OpenMP runtime. It happens because of
the problem with the getting of thread identifier value. Patch tries to
fix it by using the result of the call of function
__kmpc_global_thread_num() rather than loading of value of outlined
function parameter.
llvm-svn: 311007
constructors when deciding whether classes should be passed indirectly.
This fixes ABI differences between Clang and GCC:
* Previously, Clang ignored the move constructor when making this
determination. It now takes the move constructor into account, per
https://github.com/itanium-cxx-abi/cxx-abi/pull/17 (this change may
seem recent, but the ABI change was agreed on the Itanium C++ ABI
list a long time ago).
* Previously, Clang's behavior when the copy constructor was deleted
was unstable -- depending on whether the lazy declaration of the
copy constructor had been triggered, you might get different behavior.
We now eagerly declare the copy constructor whenever its deletedness
is unclear, and ignore deleted copy/move constructors when looking for
a trivial such constructor.
This also fixes an ABI difference between Clang and MSVC:
* If the copy constructor would be implicitly deleted (but has not been
lazily declared yet), for instance because the class has an rvalue
reference member, we would pass it directly. We now pass such a class
indirectly, matching MSVC.
Based on a patch by Vassil Vassilev, which was based on a patch by Bernd
Schmidt, which was based on a patch by Reid Kleckner!
This is a re-commit of r310401, which was reverted in r310464 due to ARM
failures (which should now be fixed).
llvm-svn: 310983
the interface.
The ultimate goal here is to make it easier to do some more interesting
things in constant emission, like emit constant initializers that have
ignorable side-effects, or doing the majority of an initialization
in-place and then patching up the last few things with calls. But for
now this is mostly just a refactoring.
llvm-svn: 310964
When translating arguments for NVPTX target it is not taken into account
that function may have variable number of arguments. Patch fixes this
problem.
llvm-svn: 310920
Generalize getOpenCLImageAddrSpace into getOpenCLTypeAddrSpace, such
that targets can select the address space per type.
No functional changes intended.
Initial patch by Simon Perretta.
Differential Revision: https://reviews.llvm.org/D33989
llvm-svn: 310911
__kmpc_for_static_init().
OpenMP 5.0 will include OpenMP Tools interface that requires distinguishing different worksharing constructs.
Since the same entry point (__kmp_for_static_init(ident_t *loc,
kmp_int32 global_tid,........)) is called in case static
loop/sections/distribute it is suggested using 'flags' field of the
ident_t structure to pass the type of the construct.
llvm-svn: 310865
This is causing failures when compiling clang with -O3
as one of the structures used by clang is passed by
value and uses the fastcc calling convention.
Faliures manifest for stage2 mips build.
llvm-svn: 310704
This patch adds support for __builtin_cpu_is. I've tried to match the strings supported to the latest version of gcc.
Differential Revision: https://reviews.llvm.org/D35449
llvm-svn: 310657
This is an improvement over always using byval for
structs.
This will use registers until ~16 are used, and then
switch back to byval. This needs more work, since I'm
not sure it ever really makes sense to use byval. If
the register limit is exceeded, the arguments still
end up passed on the stack, but with a different ABI.
It also may make sense to base this on number of
registers used for non-struct arguments, rather than
just arguments that appear first in the argument list.
llvm-svn: 310527
name.
If the host code is compiled with the debug info, while the target
without, there is a problem that the compiler is unable to find the
debug wrapper. Patch fixes this problem by emitting special name for the
debug version of the code.
llvm-svn: 310511
Previously we limited ourselves to only emitting nested classes, but we
need other kinds of types as well.
This fixes the Visual Studio STL visualizers, so that users can
visualize std::string and other objects.
llvm-svn: 310410
The code after a noreturn call doesn't execute.
The pattern in the testcase is pretty common in LLVM (a switch with
a default case that calls llvm_unreachable).
The original version of this patch was reverted in r309995 due to a
crash. This version includes a fix for that crash (testcase in
test/CoverageMapping/md.cpp).
Differential Revision: https://reviews.llvm.org/D36250
llvm-svn: 310406
constructors when deciding whether classes should be passed indirectly.
This fixes ABI differences between Clang and GCC:
* Previously, Clang ignored the move constructor when making this
determination. It now takes the move constructor into account, per
https://github.com/itanium-cxx-abi/cxx-abi/pull/17 (this change may
seem recent, but the ABI change was agreed on the Itanium C++ ABI
list a long time ago).
* Previously, Clang's behavior when the copy constructor was deleted
was unstable -- depending on whether the lazy declaration of the
copy constructor had been triggered, you might get different behavior.
We now eagerly declare the copy constructor whenever its deletedness
is unclear, and ignore deleted copy/move constructors when looking for
a trivial such constructor.
This also fixes an ABI difference between Clang and MSVC:
* If the copy constructor would be implicitly deleted (but has not been
lazily declared yet), for instance because the class has an rvalue
reference member, we would pass it directly. We now pass such a class
indirectly, matching MSVC.
llvm-svn: 310401
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310387
They still need to be implemented in the intrinsics, the command line, and the backend. But this change isn't dependent on any of that and resolves a TODO.
llvm-svn: 310386
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310377
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310360
This reverts commit r310010. I don't think there's anything wrong with
this commit, but it's causing clang to generate output that llvm-cov
doesn't do a good job with and the fix isn't immediately clear.
See Eli's comment in D36250 for more context.
I'm reverting the clang change so the coverage bot can revert back to
producing sensible output, and to give myself some time to investigate
what went wrong in llvm.
llvm-svn: 310154
We don't need special handling in CodeGenFunction::GenerateCode for
lambda block pointer conversion operators anymore. The conversion
operator emission code immediately calls back to the generic
EmitFunctionBody.
Rename EmitLambdaStaticInvokeFunction to EmitLambdaStaticInvokeBody for
better consistency with the other Emit*Body methods.
I'm preparing to do something about PR28299, which touches this code.
llvm-svn: 310145
Arguments, passed to the outlined function, must have correct address
space info for proper Debug info support. Patch sets global address
space for arguments that are mapped and passed by reference.
Also, cuda-gdb does not handle reference types correctly, so reference
arguments are represented as pointers.
llvm-svn: 310104
Summary:
Previously, STL allocators were blacklisted in compiler_rt's
cfi_blacklist.txt because they mandated a cast from void* to T* before
object initialization completed. This change moves that logic into the
front end because C++ name mangling supports a substitution compression
mechanism for symbols that makes it difficult to blacklist the mangled
symbol for allocate() using a regular expression.
Motivated by crbug.com/751385.
Reviewers: pcc, kcc
Reviewed By: pcc
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36294
llvm-svn: 310097
OpenCL 2.0 atomic builtin functions have a scope argument which is ideally
represented as synchronization scope argument in LLVM atomic instructions.
Clang supports translating Clang atomic builtin functions to LLVM atomic
instructions. However it currently does not support synchronization scope
of LLVM atomic instructions. Without this, users have to use LLVM assembly
code to implement OpenCL atomic builtin functions.
This patch adds OpenCL 2.0 atomic builtin functions as Clang builtin
functions, which supports generating LLVM atomic instructions with
synchronization scope operand.
Currently only constant memory scope argument is supported. Support of
non-constant memory scope argument will be added later.
Differential Revision: https://reviews.llvm.org/D28691
llvm-svn: 310082
The current coverage implementation doesn't handle region termination
very precisely. Take for example an `if' statement with a `return':
void f() {
if (true) {
return; // The `if' body's region is terminated here.
}
// This line gets the same coverage as the `if' condition.
}
If the function `f' is called, the line containing the comment will be
marked as having executed once, which is not correct.
The solution here is to create a deferred region after terminating a
region. The deferred region is completed once the start location of the
next statement is known, and is then pushed onto the region stack.
In the cases where it's not possible to complete a deferred region, it
can safely be dropped.
Testing: lit test updates, a stage2 coverage-enabled build of clang
llvm-svn: 310010
The code after a noreturn call doesn't execute.
The pattern in the testcase is pretty common in LLVM (a switch with
a default case that calls llvm_unreachable).
Differential Revision: https://reviews.llvm.org/D36250
llvm-svn: 309995
This option when combined with -mgpopt and -membedded-data places all
uninitialized constant variables in the read-only section.
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D35917
llvm-svn: 309940
We never overwrite the end location of a region, so we would end up with
an overly large region when we reused the switch's region.
It's possible this code will be substantially rewritten in the near
future to deal with fallthrough more accurately, but this seems like
an improvement on its own for now.
Differential Revision: https://reviews.llvm.org/D34801
llvm-svn: 309901
In r309007, I made -fsanitize=null a hard prerequisite for -fsanitize=vptr. I
did not see the need for the two checks to have separate null checking logic
for the same pointer. I expected the two checks to either always be enabled
together, or to be mutually compatible.
In the mailing list discussion re: r309007 it became clear that that isn't the
case. If a codebase is -fsanitize=vptr clean but not -fsanitize=null clean,
it's useful to have -fsanitize=vptr emit its own null check. That's what this
patch does: with it, -fsanitize=vptr can be used without -fsanitize=null.
Differential Revision: https://reviews.llvm.org/D36112
llvm-svn: 309846
In a future commit AMDGPU will start passing
aggregates directly to more functions, triggering
asserts in test/CodeGenOpenCL/addr-space-struct-arg.cl
llvm-svn: 309741
CodeGenFunction::EmitTypeMetadataCodeForVCall() could output an
llvm.assume(llvm.type.test())when CFI was enabled, optimizing out the
vcall check. This case was only reached when: 1) CFI-vcall was enabled,
2) -fwhole-program-tables was specified, and 3)
-fno-sanitize-trap=cfi-vcall was specified.
Patch by Vlad Tsyrklevich!
Differential Revision: https://reviews.llvm.org/D36013
llvm-svn: 309622
Summary:
Previously Clang incorrectly ignored the expression of a void `co_return`. This patch addresses that bug.
I'm not quite sure if I got the code-gen right, but this patch is at least a start.
Reviewers: rsmith, GorNishanov
Reviewed By: rsmith, GorNishanov
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D36070
llvm-svn: 309545
On some targets, passing zero to the clz() or ctz() builtins has undefined
behavior. I ran into this issue while debugging UB in __hash_table from libcxx:
the bug I was seeing manifested itself differently under -O0 vs -Os, due to a
UB call to clz() (see: libcxx/r304617).
This patch introduces a check which can detect UB calls to builtins.
llvm.org/PR26979
Differential Revision: https://reviews.llvm.org/D34590
llvm-svn: 309459
r303175 made changes to have __cxa_allocate_exception return a 16-byte
aligned pointer, so it's no longer necessary to specify a lower
alignment (8-bytes) for exception objects on Darwin.
rdar://problem/32363695
llvm-svn: 309308
When an omp for loop is canceled the constructed objects are being destructed
twice.
It looks like the desired code is:
{
Obj o;
If (cancelled) branch-through-cleanups to cancel.exit.
}
[cleanups]
cancel.exit:
__kmpc_for_static_fini
br cancel.cont (*)
cancel.cont:
__kmpc_barrier
return
The problem seems to be the branch to cancel.cont is currently also going
through the cleanups calling them again. This change just does a direct branch
instead.
Patch By: michael.p.rice@intel.com
Differential Revision: https://reviews.llvm.org/D35854
llvm-svn: 309288
Summary: The new PM needs to invoke add-discriminator pass when building with -fdebug-info-for-profiling.
Reviewers: chandlerc, davidxl
Reviewed By: chandlerc
Subscribers: sanjoy, cfe-commits
Differential Revision: https://reviews.llvm.org/D35746
llvm-svn: 309282
The ARM Runtime ABI document (IHI0043) defines the AEABI floating point
helper functions in 4.1.2 The floating-point helper functions. These
functions always use the base PCS (soft-fp). However helper functions
defined outside of this document such as the complex-number multiply and
divide helpers are not covered by this requirement and should use
hard-float PCS if the target is hard-float as both compiler-rt and libgcc
for a hard-float sysroot implement these functions with a hard-float PCS.
All of the floating point helper functions that are explicitly soft float
are expanded in the llvm ARM backend. This change makes clang not force the
BuiltinCC to AAPCS for AAPCS_VFP. With this change the ARM compiler-rt
tests involving _Complex pass with both hard-fp and soft-fp targets.
Differential Revision: https://reviews.llvm.org/D35538
llvm-svn: 309257
The initializer for a static local variable cannot be hot, because it runs at
most once per program. That's not quite the same thing as having a low branch
probability, but under the assumption that the function is invoked many times,
modeling this as a branch probability seems reasonable.
For TLS variables, the situation is less clear, since the initialization side
of the branch can run multiple times in a program execution, but we still
expect initialization to be rare relative to non-initialization uses. It would
seem worthwhile to add a PGO counter along this path to make this estimation
more accurate in future.
For globals with guarded initialization, we don't yet apply any branch weights.
Due to our use of COMDATs, the guard will be reached exactly once per DSO, but
we have no idea how many DSOs will define the variable.
llvm-svn: 309195
std::byte, when defined as an enum, needs to be given special treatment
with regards to its aliasing properties. An array of std::byte is
allowed to be used as storage for other types.
This fixes PR33916.
Differential Revision: https://reviews.llvm.org/D35824
llvm-svn: 309058
The instrumentation generated by -fsanitize=vptr does not null check a
user pointer before loading from it. This causes crashes in the face of
UB member calls (this=nullptr), i.e it's causing user programs to crash
only after UBSan is turned on.
The fix is to make run-time null checking a prerequisite for enabling
-fsanitize=vptr, and to then teach UBSan to reuse these run-time null
checks to make -fsanitize=vptr safe.
Testing: check-clang, check-ubsan, a stage2 ubsan-enabled build
Differential Revision: https://reviews.llvm.org/D35735https://bugs.llvm.org/show_bug.cgi?id=33881
llvm-svn: 309007