Summary:
Some lines have a hit counter where they should not have one.
Cleanup stuff is located to the last line of the body which is most of the time a '}'.
And Exception stuff is added at the beginning of a function and at the end (represented by '{' and '}').
So in such cases, the DebugLoc used in GCOVProfiling.cpp must be marked as not covered.
This patch is a followup of https://reviews.llvm.org/D49915.
Tests in projects/compiler_rt are fixed by: https://reviews.llvm.org/D49917
Reviewers: marco-c, davidxl
Reviewed By: marco-c
Subscribers: dblaikie, cfe-commits, sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D49916
llvm-svn: 342717
Summary:
Before this change, we only emit the XRay attributes in LLVM IR when the
-fxray-instrument flag is provided. This may cause issues with thinlto
when the final binary is being built/linked with -fxray-instrument, and
the constitutent LLVM IR gets re-lowered with xray instrumentation.
With this change, we can honour the "never-instrument "attributes
provided in the source code and preserve those in the IR. This way, even
in thinlto builds, we retain the attributes which say whether functions
should never be XRay instrumented.
This change addresses llvm.org/PR38922.
Reviewers: mboerger, eizan
Subscribers: mehdi_amini, dexonsmith, cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D52015
llvm-svn: 342200
Previously, both types (plus the future target-clones) of
multiversioning had a separate ResolverOption structure and emission
function. This patch combines the two, at the expense of a slightly
more expensive sorting function.
llvm-svn: 342152
Boilerplate code for using KMSAN instrumentation in Clang.
We add a new command line flag, -fsanitize=kernel-memory, with a
corresponding SanitizerKind::KernelMemory, which, along with
SanitizerKind::Memory, maps to the memory_sanitizer feature.
KMSAN is only supported on x86_64 Linux.
It's incompatible with other sanitizers, but supports code coverage
instrumentation.
llvm-svn: 341641
If using a custom stack alignment, one is expected to make sure
that all callers provide such alignment, or realign the stack in
all entry points (and callbacks).
Despite this, the compiler can assume that the main function will
need realignment in these cases, since the startup routines calling
the main function most probably won't provide the custom alignment.
This matches what GCC does in similar cases; if compiling with
-mincoming-stack-boundary=X -mpreferred-stack-boundary=X, GCC normally
assumes such alignment on entry to a function, but specifically for
the main function still does realignment.
Differential Revision: https://reviews.llvm.org/D51026
llvm-svn: 340334
Clang generates copy and dispose helper functions for each block literal
on the stack. Often these functions are equivalent for different blocks.
This commit makes changes to merge equivalent copy and dispose helper
functions and reduce code size.
To enable merging equivalent copy/dispose functions, the captured object
infomation is encoded into the helper function name. This allows IRGen
to check whether an equivalent helper function has already been emitted
and reuse the function instead of generating a new helper function
whenever a block is defined. In addition, the helper functions are
marked as linkonce_odr to enable merging helper functions that have the
same name across translation units and marked as unnamed_addr to enable
the linker's deduplication pass to merge functions that have different
names but the same content.
rdar://problem/42640608
Differential Revision: https://reviews.llvm.org/D50152
llvm-svn: 339438
As documented here: https://software.intel.com/en-us/node/682969 and
https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning
is an ICC feature that provides for function multiversioning.
This feature is implemented with two attributes: First, cpu_specific,
which specifies the individual function versions. Second, cpu_dispatch,
which specifies the location of the resolver function and the list of
resolvable functions.
This is valuable since it provides a mechanism where the resolver's TU
can be specified in one location, and the individual implementions
each in their own translation units.
The goal of this patch is to be source-compatible with ICC, so this
implementation diverges from the ICC implementation in a few ways:
1- Linux x86/64 only: This implementation uses ifuncs in order to
properly dispatch functions. This is is a valuable performance benefit
over the ICC implementation. A future patch will be provided to enable
this feature on Windows, but it will obviously more closely fit ICC's
implementation.
2- CPU Identification functions: ICC uses a set of custom functions to identify
the feature list of the host processor. This patch uses the cpu_supports
functionality in order to better align with 'target' multiversioning.
1- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function
marked cpu_dispatch be an empty definition. This patch supports that as well,
however declarations are also permitted, since the linker will solve the
issue of multiple emissions.
Differential Revision: https://reviews.llvm.org/D47474
llvm-svn: 337552
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.
Note: This is what was intended to be committed in r336726
llvm-svn: 336729
The member init list for the sole constructor for CodeGenFunction
has gotten out of hand, so this patch moves the non-parameter-dependent
initializations into the member value inits.
llvm-svn: 336726
This is part of an ongoing attempt at making 512 bit vectors illegal in the X86 backend type legalizer due to CPU frequency penalties associated with wide vectors on Skylake Server CPUs. We want the loop vectorizer to be able to emit IR containing wide vectors as intermediate operations in vectorized code and allow these wide vectors to be legalized to 256 bits by the X86 backend even though we are targetting a CPU that supports 512 bit vectors. This is similar to what happens with an AVX2 CPU, the vectorizer can emit wide vectors and the backend will split them. We want this splitting behavior, but still be able to use new Skylake instructions that work on 256-bit vectors and support things like masking and gather/scatter.
Of course if the user uses explicit vector code in their source code we need to not split those operations. Especially if they have used any of the 512-bit vector intrinsics from immintrin.h. And we need to make it so that merely using the intrinsics produces the expected code in order to be backwards compatible.
To support this goal, this patch adds a new IR function attribute "min-legal-vector-width" that can indicate the need for a minimum vector width to be legal in the backend. We need to ensure this attribute is set to the largest vector width needed by any intrinsics from immintrin.h that the function uses. The inliner will be reponsible for merging this attribute when a function is inlined. We may also need a way to limit inlining in the future as well, but we can discuss that in the future.
To make things more complicated, there are two different ways intrinsics are implemented in immintrin.h. Either as an always_inline function containing calls to builtins(can be target specific or target independent) or vector extension code. Or as a macro wrapper around a taget specific builtin. I believe I've removed all cases where the macro was around a target independent builtin.
To support the always_inline function case this patch adds attribute((min_vector_width(128))) that can be used to tag these functions with their vector width. All x86 intrinsic functions that operate on vectors have been tagged with this attribute.
To support the macro case, all x86 specific builtins have also been tagged with the vector width that they require. Use of any builtin with this property will implicitly increase the min_vector_width of the function that calls it. I've done this as a new property in the attribute string for the builtin rather than basing it on the type string so that we can opt into it on a per builtin basis and avoid any impact to target independent builtins.
There will be future work to support vectors passed as function arguments and supporting inline assembly. And whatever else we can find that isn't covered by this patch.
Special thanks to Chandler who suggested this direction and reviewed a preview version of this patch. And thanks to Eric Christopher who has had many conversations with me about this issue.
Differential Revision: https://reviews.llvm.org/D48617
llvm-svn: 336583
Summary:
When requirement imposed by __target__ attributes on functions
are not satisfied, prefer printing those requirements, which
are explicitly mentioned in the attributes.
This makes such messages more useful, e.g. printing avx512f instead of avx2
in the following scenario:
```
$ cat foo.c
static inline void __attribute__((__always_inline__, __target__("avx512f")))
x(void)
{
}
int main(void)
{
x();
}
$ clang foo.c
foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2'
x();
^
1 error generated.
```
bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338
Reviewers: craig.topper, echristo, dblaikie
Reviewed By: craig.topper, echristo
Differential Revision: https://reviews.llvm.org/D46541
llvm-svn: 334174
Summary:
A clang builtin for xray typed events. Differs from
__xray_customevent(...) by the presence of a type tag that is vended by
compiler-rt in typical usage. This allows xray handlers to expand logged
events with their type description and plugins to process traced events
based on type.
This change depends on D45633 for the intrinsic definition.
Reviewers: dberris, pelikan, rnk, eizan
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D45716
llvm-svn: 330220
When emitting CodeView debug information, compiler-generated thunk routines
should be emitted using S_THUNK32 symbols instead of S_GPROC32_ID symbols so
Visual Studio can properly step into the user code. This initial support only
handles standard thunk ordinals.
Differential Revision: https://reviews.llvm.org/D43838
llvm-svn: 330132
Summary:
This change addresses http://llvm.org/PR36926 by allowing users to pick
which instrumentation bundles to use, when instrumenting with XRay. In
particular, the flag `-fxray-instrumentation-bundle=` has four valid
values:
- `all`: the default, emits all instrumentation kinds
- `none`: equivalent to -fnoxray-instrument
- `function`: emits the entry/exit instrumentation
- `custom`: emits the custom event instrumentation
These can be combined either as comma-separated values, or as
repeated flag values.
Reviewers: echristo, kpw, eizan, pelikan
Reviewed By: pelikan
Subscribers: mgorny, cfe-commits
Differential Revision: https://reviews.llvm.org/D44970
llvm-svn: 329985
Summary:
Right now to disable -fsanitize=kernel-address instrumentation, one needs to use no_sanitize("kernel-address"). Make either no_sanitize("address") or no_sanitize("kernel-address") disable both ASan and KASan instrumentation. Also remove redundant test.
Patch by Andrey Konovalov
Reviewers: eugenis, kcc, glider, dvyukov, vitalybuka
Reviewed By: eugenis, vitalybuka
Differential Revision: https://reviews.llvm.org/D44981
llvm-svn: 329612
Summary:
Add support for the -fsanitize=shadow-call-stack flag which causes clang
to add ShadowCallStack attribute to functions compiled with that flag
enabled.
Reviewers: pcc, kcc
Reviewed By: pcc, kcc
Subscribers: cryptoad, cfe-commits, kcc
Differential Revision: https://reviews.llvm.org/D44801
llvm-svn: 329122
Summary:
Disables certain CMP optimizations to improve fuzzing signal under -O1
and -O2.
Switches all fuzzer tests to -O2 except for a few leak tests where the
leak is optimized out under -O2.
Reviewers: kcc, vitalybuka
Reviewed By: vitalybuka
Subscribers: cfe-commits, llvm-commits
Differential Revision: https://reviews.llvm.org/D44798
llvm-svn: 328384
Summary:
Currently only calls to mcount were suppressed with
no_instrument_function attribute.
Linux kernel requires that calls to fentry should also not be
generated.
This is an extended fix for PR PR33515.
Reviewers: hfinkel, rengolin, srhines, rnk, rsmith, rjmccall, hans
Reviewed By: rjmccall
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D43995
llvm-svn: 326639
So I wrote a clang-tidy check to lint out redundant `isa`, `cast`, and
`dyn_cast`s for fun. This is a portion of what it found for clang; I
plan to do similar cleanups in LLVM and other subprojects when I find
time.
Because of the volume of changes, I explicitly avoided making any change
that wasn't highly local and obviously correct to me (e.g. we still have
a number of foo(cast<Bar>(baz)) that I didn't touch, since overloading
is a thing and the cast<Bar> did actually change the type -- just up the
class hierarchy).
I also tried to leave the types we were cast<>ing to somewhere nearby,
in cases where it wasn't locally obvious what we were dealing with
before.
llvm-svn: 326416
Summary:
This patch enables debugging of C99 VLA types by generating more precise
LLVM Debug metadata, using the extended DISubrange 'count' field that
takes a DIVariable.
This should implement:
Bug 30553: Debug info generated for arrays is not what GDB expects (not as good as GCC's)
https://bugs.llvm.org/show_bug.cgi?id=30553
Reviewers: echristo, aprantl, dexonsmith, clayborg, pcc, kristof.beyls, dblaikie
Reviewed By: aprantl
Subscribers: jholewinski, schweitz, davide, fhahn, JDevlieghere, cfe-commits
Differential Revision: https://reviews.llvm.org/D41698
llvm-svn: 323952
This alignment can be less than 4 on certain embedded targets, which may
not even be able to deal with 4-byte alignment on the stack.
Patch by Jacob Young!
llvm-svn: 322406
Cf-protection is a target independent flag that instructs the back-end to instrument control flow mechanisms like: Branch, Return, etc.
For example in X86 this flag will be used to instrument Indirect Branch Tracking instructions.
Differential Revision: https://reviews.llvm.org/D40478
Change-Id: I5126e766c0e6b84118cae0ee8a20fe78cc373dea
llvm-svn: 322063
GCC's attribute 'target', in addition to being an optimization hint,
also allows function multiversioning. We currently have the former
implemented, this is the latter's implementation.
This works by enabling functions with the same name/signature to coexist,
so that they can all be emitted. Multiversion state is stored in the
FunctionDecl itself, and SemaDecl manages the definitions.
Note that it ends up having to permit redefinition of functions so
that they can all be emitted. Additionally, all versions of the function
must be emitted, so this also manages that.
Note that this includes some additional rules that GCC does not, since
defining something as a MultiVersion function after a usage has been made illegal.
The only 'history rewriting' that happens is if a function is emitted before
it has been converted to a multiversion'ed function, at which point its name
needs to be changed.
Function templates and virtual functions are NOT yet supported (not supported
in GCC either).
Additionally, constructors/destructors are disallowed, but the former is
planned.
llvm-svn: 322028
As discussed in the mail thread <https://groups.google.com/a/isocpp.org/forum/
#!topic/std-discussion/T64_dW3WKUk> "Calling noexcept function throug non-
noexcept pointer is undefined behavior?", such a call should not be UB.
However, Clang currently warns about it.
This change removes exception specifications from the function types recorded
for -fsanitize=function, both in the functions themselves and at the call sites.
That means that calling a non-noexcept function through a noexcept pointer will
also not be flagged as UB. In the review of this change, that was deemed
acceptable, at least for now. (See the "TODO" in compiler-rt
test/ubsan/TestCases/TypeCheck/Function/function.cpp.)
To remove exception specifications from types, the existing internal
ASTContext::getFunctionTypeWithExceptionSpec was made public, and some places
otherwise unrelated to this change have been adapted to call it, too.
This is the cfe part of a patch covering both cfe and compiler-rt.
Differential Revision: https://reviews.llvm.org/D40720
llvm-svn: 321859
There are 2 parts to getting the -fassociative-math command-line flag translated to LLVM FMF:
1. In the driver/frontend, we accept the flag and its 'no' inverse and deal with the
interactions with other flags like -ffast-math -fno-signed-zeros -fno-trapping-math.
This was mostly already done - we just need to translate the flag as a codegen option.
The test file is complicated because there are many potential combinations of flags here.
Note that we are matching gcc's behavior that requires 'nsz' and no-trapping-math.
2. In codegen, we map the codegen option to FMF in the IR builder. This is simple code and
corresponding test.
For the motivating example from PR27372:
float foo(float a, float x) { return ((a + x) - x); }
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math -emit-llvm | egrep 'fadd|fsub'
%add = fadd nnan ninf nsz arcp contract float %0, %1
%sub = fsub nnan ninf nsz arcp contract float %add, %2
So 'reassoc' is off as expected (and so is the new 'afn' but that's a different patch).
This case now works as expected end-to-end although the underlying logic is still wrong:
$ ./clang -O2 27372.c -S -o - -ffast-math -fno-associative-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
We're not done because the case where 'reassoc' is set is ignored by optimizer passes. Example:
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math -emit-llvm | grep fadd
%add = fadd reassoc float %0, %1
$ ./clang -O2 27372.c -S -o - -fassociative-math -fno-signed-zeros -fno-trapping-math | grep xmm
addss %xmm1, %xmm0
subss %xmm1, %xmm0
Differential Revision: https://reviews.llvm.org/D39812
llvm-svn: 320920
Previously the attributes were emitted only for function definitions.
Patch adds emission of the attributes for function declarations.
llvm-svn: 320826
Summary:
The -fxray-always-emit-customevents flag instructs clang to always emit
the LLVM IR for calls to the `__xray_customevent(...)` built-in
function. The default behaviour currently respects whether the function
has an `[[clang::xray_never_instrument]]` attribute, and thus not lower
the appropriate IR code for the custom event built-in.
This change allows users calling through to the
`__xray_customevent(...)` built-in to always see those calls lowered to
the corresponding LLVM IR to lay down instrumentation points for these
custom event calls.
Using this flag enables us to emit even just the user-provided custom
events even while never instrumenting the start/end of the function
where they appear. This is useful in cases where "phase markers" using
__xray_customevent(...) can have very few instructions, must never be
instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
llvm-svn: 319388
This is an instrumentation flag that's similar to
-finstrument-functions, but it only inserts calls on function entry, the
calls are inserted post-inlining, and they don't take any arugments.
This is intended for users who want to instrument function entry with
minimal overhead.
(-pg would be another alternative, but forces frame pointer emission and
affects link flags, so is probably best left alone to be used for
generating gcov data.)
Differential revision: https://reviews.llvm.org/D40276
llvm-svn: 318785
This updates -mcount to use the new attribute names (LLVM r318195), and
switches over -finstrument-functions to also use these attributes rather
than inserting instrumentation in the frontend.
It also adds a new flag, -finstrument-functions-after-inlining, which
makes the cygprofile instrumentation get inserted after inlining rather
than before.
Differential Revision: https://reviews.llvm.org/D39331
llvm-svn: 318199
Summary:
We don't want to store cleanup dest slot saved into the coroutine frame (as some of the cleanup code may
access them after coroutine frame destroyed).
This is an alternative to https://reviews.llvm.org/D37093
It is possible to do this for all functions, but, cursory check showed that in -O0, we get slightly longer function (by 1-3 instructions), thus, we are only limiting cleanup.dest.slot elimination to coroutines.
Reviewers: rjmccall, hfinkel, eric_niebler
Reviewed By: eric_niebler
Subscribers: EricWF, cfe-commits
Differential Revision: https://reviews.llvm.org/D39768
llvm-svn: 317981
This patch fixes various places in clang to propagate may-alias
TBAA access descriptors during construction of lvalues, thus
eliminating the need for the LValueBaseInfo::MayAlias flag.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D39008
llvm-svn: 316988
Summary:
Convert clang::LangAS to a strongly typed enum
Currently both clang AST address spaces and target specific address spaces
are represented as unsigned which can lead to subtle errors if the wrong
type is passed. It is especially confusing in the CodeGen files as it is
not possible to see what kind of address space should be passed to a
function without looking at the implementation.
I originally made this change for our LLVM fork for the CHERI architecture
where we make extensive use of address spaces to differentiate between
capabilities and pointers. When merging the upstream changes I usually
run into some test failures or runtime crashes because the wrong kind of
address space is passed to a function. By converting the LangAS enum to a
C++11 we can catch these errors at compile time. Additionally, it is now
obvious from the function signature which kind of address space it expects.
I found the following errors while writing this patch:
- ItaniumRecordLayoutBuilder::LayoutField was passing a clang AST address
space to TargetInfo::getPointer{Width,Align}()
- TypePrinter::printAttributedAfter() prints the numeric value of the
clang AST address space instead of the target address space.
However, this code is not used so I kept the current behaviour
- initializeForBlockHeader() in CGBlocks.cpp was passing
LangAS::opencl_generic to TargetInfo::getPointer{Width,Align}()
- CodeGenFunction::EmitBlockLiteral() was passing a AST address space to
TargetInfo::getPointerWidth()
- CGOpenMPRuntimeNVPTX::translateParameter() passed a target address space
to Qualifiers::addAddressSpace()
- CGOpenMPRuntimeNVPTX::getParameterAddress() was using
llvm::Type::getPointerTo() with a AST address space
- clang_getAddressSpace() returns either a LangAS or a target address
space. As this is exposed to C I have kept the current behaviour and
added a comment stating that it is probably not correct.
Other than this the patch should not cause any functional changes.
Reviewers: yaxunl, pcc, bader
Reviewed By: yaxunl, bader
Subscribers: jlebar, jholewinski, nhaehnle, Anastasia, cfe-commits
Differential Revision: https://reviews.llvm.org/D38816
llvm-svn: 315871
The function sanitizer only checks indirect calls through function
pointers. This excludes all non-static member functions (constructor
calls, calls through thunks, etc. all use a separate code path). Don't
emit function signatures for functions that won't be checked.
Apart from cutting down on code size, this should fix a regression on
Linux caused by r313096. For context, see the mailing list discussion:
r313096 - [ubsan] Function Sanitizer: Don't require writable text segments
Testing: check-clang, check-ubsan
Differential Revision: https://reviews.llvm.org/D38913
llvm-svn: 315786
This patch enables explicit generation of TBAA information in all
cases where LValue base info is propagated or constructed in
non-trivial ways. Eventually, we will consider each of these
cases to make sure the TBAA information is correct and not too
conservative. For now, we just fall back to generating TBAA info
from the access type.
This patch should not bring in any functional changes.
This is part of D38126 reworked to be a separate patch to
simplify review.
Differential Revision: https://reviews.llvm.org/D38733
llvm-svn: 315575