CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.
Reviewed By: davidxl
Differential Revision: https://reviews.llvm.org/D88609
Old GCC used to aggressively fold VLAs to constant-bound arrays at block
scope in GNU mode. That's non-conforming, and more modern versions of
GCC only do this at file scope. Update Clang to do the same.
Also promote the warning for this from off-by-default to on-by-default
in all cases; more recent versions of GCC likewise warn on this by
default.
This is still slightly more permissive than GCC, as pointed out in
PR44406, as we still fold VLAs to constant arrays in structs, but that
seems justifiable given that we don't support VLA-in-struct (and don't
intend to ever support it), but GCC does.
Differential Revision: https://reviews.llvm.org/D89523
This patch includes the supporting code that enables always
instrumenting the function entry block by default.
This patch will NOT the default behavior.
It adds a variant bit in the profile version, adds new directives in
text profile format, and changes llvm-profdata tool accordingly.
This patch is a split of D83024 (https://reviews.llvm.org/D83024)
Many test changes from D83024 are also included.
Differential Revision: https://reviews.llvm.org/D84261
And bump its version number accordingly.
This is a patched recommit of 7c298c104b
Previous hash implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.
Added a new test case that trivially tests that a small function change is
reflected in the hash value.
Not that as this patch fixes the hash computation, it would invalidate all hashes
computed before that patch applies, this is why we bumped the version number.
Update profile data hash entries due to hash function update, except for binary
version, in which case we keep the buggy behavior for backward compatibility.
Differential Revision: https://reviews.llvm.org/D79961
Previous implementation was incorrectly passing an uint64_t, that got converted
to an uint8_t, to finalize the hash computation. This led to different functions
having the same hash if they only differ by the remaining statements, which is
incorrect.
Added a new test case that trivially tests that a small function change is
reflected in the hash value.
Not that as this patch fixes the hash computation, it invalidates all hashes
computed before that patch applies, which could be an issue for large build
system that pre-compute the profile data and let client download them as part of
the build process.
Differential Revision: https://reviews.llvm.org/D79961
Try again with an up-to-date version of D69471 (99317124 was a stale
revision).
---
Revise the coverage mapping format to reduce binary size by:
1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.
This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).
Rationale for changes to the format:
- With the current format, most coverage function records are discarded.
E.g., more than 97% of the records in llc are *duplicate* placeholders
for functions visible-but-not-used in TUs. Placeholders *are* used to
show under-covered functions, but duplicate placeholders waste space.
- We reached general consensus about giving (1) a try at the 2017 code
coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
duplicates is simpler than alternatives like teaching build systems
about a coverage-aware database/module/etc on the side.
- Revising the format is expensive due to the backwards compatibility
requirement, so we might as well compress filenames while we're at it.
This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).
See CoverageMappingFormat.rst for the details on what exactly has
changed.
Fixes PR34533 [2], hopefully.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533
Differential Revision: https://reviews.llvm.org/D69471
Revise the coverage mapping format to reduce binary size by:
1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.
This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).
Rationale for changes to the format:
- With the current format, most coverage function records are discarded.
E.g., more than 97% of the records in llc are *duplicate* placeholders
for functions visible-but-not-used in TUs. Placeholders *are* used to
show under-covered functions, but duplicate placeholders waste space.
- We reached general consensus about giving (1) a try at the 2017 code
coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
duplicates is simpler than alternatives like teaching build systems
about a coverage-aware database/module/etc on the side.
- Revising the format is expensive due to the backwards compatibility
requirement, so we might as well compress filenames while we're at it.
This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).
See CoverageMappingFormat.rst for the details on what exactly has
changed.
Fixes PR34533 [2], hopefully.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533
Differential Revision: https://reviews.llvm.org/D69471
Revise the coverage mapping format to reduce binary size by:
1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.
This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).
Rationale for changes to the format:
- With the current format, most coverage function records are discarded.
E.g., more than 97% of the records in llc are *duplicate* placeholders
for functions visible-but-not-used in TUs. Placeholders *are* used to
show under-covered functions, but duplicate placeholders waste space.
- We reached general consensus about giving (1) a try at the 2017 code
coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
duplicates is simpler than alternatives like teaching build systems
about a coverage-aware database/module/etc on the side.
- Revising the format is expensive due to the backwards compatibility
requirement, so we might as well compress filenames while we're at it.
This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).
See CoverageMappingFormat.rst for the details on what exactly has
changed.
Fixes PR34533 [2], hopefully.
[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533
Differential Revision: https://reviews.llvm.org/D69471
This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300
We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.
We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.
A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.
In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect
Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324
llvm-svn: 371635
This reverts commit r371584. It introduced a dependency from compiler-rt
to llvm/include/ADT, which is problematic for multiple reasons.
One is that it is a novel dependency edge, which needs cross-compliation
machinery for llvm/include/ADT (yes, it is true that right now
compiler-rt included only header-only libraries, however, if we allow
compiler-rt to depend on anything from ADT, other libraries will
eventually get used).
Secondly, depending on ADT from compiler-rt exposes ADT symbols from
compiler-rt, which would cause ODR violations when Clang is built with
the profile library.
llvm-svn: 371598
This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300
We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.
We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.
A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.
In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect
Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324
llvm-svn: 371584
This patch contains the basic functionality for reporting potentially
incorrect usage of __builtin_expect() by comparing the developer's
annotation against a collected PGO profile. A more detailed proposal and
discussion appears on the CFE-dev mailing list
(http://lists.llvm.org/pipermail/cfe-dev/2019-July/062971.html) and a
prototype of the initial frontend changes appear here in D65300
We revised the work in D65300 by moving the misexpect check into the
LLVM backend, and adding support for IR and sampling based profiles, in
addition to frontend instrumentation.
We add new misexpect metadata tags to those instructions directly
influenced by the llvm.expect intrinsic (branch, switch, and select)
when lowering the intrinsics. The misexpect metadata contains
information about the expected target of the intrinsic so that we can
check against the correct PGO counter when emitting diagnostics, and the
compiler's values for the LikelyBranchWeight and UnlikelyBranchWeight.
We use these branch weight values to determine when to emit the
diagnostic to the user.
A future patch should address the comment at the top of
LowerExpectIntrisic.cpp to hoist the LikelyBranchWeight and
UnlikelyBranchWeight values into a shared space that can be accessed
outside of the LowerExpectIntrinsic pass. Once that is done, the
misexpect metadata can be updated to be smaller.
In the long term, it is possible to reconstruct portions of the
misexpect metadata from the existing profile data. However, we have
avoided this to keep the code simple, and because some kind of metadata
tag will be required to identify which branch/switch/select instructions
are influenced by the use of llvm.expect
Patch By: paulkirth
Differential Revision: https://reviews.llvm.org/D66324
llvm-svn: 371484
Add PGO support at -O0 in the experimental new pass manager to sync the
behavior of the legacy pass manager.
Also change the test of gcc-flag-compatibility.c for more complete test:
(1) change the match string to "profc" and "profd" to ensure the
instrumentation is happening.
(2) add IR format proftext so that PGO use compilation is tested.
Differential Revision: https://reviews.llvm.org/D64029
llvm-svn: 367628
This contains the part of D62225 which fixes Profile/gcc-flag-compatibility.c
by adding the pass that allows default profile generation to work under the new
PM. It seems that ./default.profraw was not being generated with new PM enabled.
Differential Revision: https://reviews.llvm.org/D63155
llvm-svn: 363278
Currently InstrProf lowering is not enabled for Clang PGO instrumentation in
the new pass manager. The following command
"-fprofile-instr-generate -fexperimental-new-pass-manager ..." is broken.
This CL enables InstrProf lowering pass for Clang PGO instrumentation in the
new pass manager.
Differential Revision: https://reviews.llvm.org/D61138
llvm-svn: 359215
Summary:
I hadn't realized that instrumentation runs before inlining, so we can't
use the function as the comdat group. Doing so can create relocations
against discarded sections when references to discarded __profc_
variables are inlined into functions outside the function's comdat
group.
In the future, perhaps we should consider standardizing the comdat group
names that ELF and COFF use. It will save object file size, since
__profv_$sym won't appear in the symbol table again.
Reviewers: xur, vsk
Subscribers: eraman, hiraditya, cfe-commits, #sanitizers, llvm-commits
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D58737
llvm-svn: 355044
Summary:
The MS C++ ABI has no constructor variants, but it has destructor
variants, so we should move the deleting destructor variant check
outside the check for "does the ABI have constructor variants".
Fixes PR37561, so basic code coverage works on Windows with C++.
Reviewers: vsk
Subscribers: jdoerfert, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D58691
llvm-svn: 354924
Lifting from Bob Wilson's notes: The hash value that we compute and
store in PGO profile data to detect out-of-date profiles does not
include enough information. This means that many significant changes to
the source will not cause compiler warnings about the profile being out
of date, and worse, we may continue to use the outdated profile data to
make bad optimization decisions. There is some tension here because
some source changes won't affect PGO and we don't want to invalidate the
profile unnecessarily.
This patch adds a new hashing scheme which is more sensitive to loop
nesting, conditions, and out-of-order control flow. Here are examples
which show snippets which get the same hash under the current scheme,
and different hashes under the new scheme:
Loop Nesting Example
--------------------
// Snippet 1
while (foo()) {
while (bar()) {}
}
// Snippet 2
while (foo()) {}
while (bar()) {}
Condition Example
-----------------
// Snippet 1
if (foo())
bar();
baz();
// Snippet 2
if (foo())
bar();
else
baz();
Out-of-order Control Flow Example
---------------------------------
// Snippet 1
while (foo()) {
if (bar()) {}
baz();
}
// Snippet 2
while (foo()) {
if (bar())
continue;
baz();
}
In each of these cases, it's useful to differentiate between the
snippets because swapping their profiles gives bad optimization hints.
The new hashing scheme considers some logical operators in an effort to
detect more changes in conditions. This isn't a perfect scheme. E.g, it
does not produce the same hash for these equivalent snippets:
// Snippet 1
bool c = !a || b;
if (d && e) {}
// Snippet 2
bool f = d && e;
bool c = !a || b;
if (f) {}
This would require an expensive data flow analysis. Short of that, the
new hashing scheme looks reasonably complete, based on a scan over the
statements we place counters on.
Profiles which use the old version of the PGO hash remain valid and can
be used without issue (there are tests in tree which check this).
rdar://17068282
Differential Revision: https://reviews.llvm.org/D39446
llvm-svn: 318229
The root cause of the issues reported in D32406 and D34680 is that clang
instruments functions without bodies. Make it stop doing that, and also
teach it how to use old (incorrectly generated) profiles without
crashing.
llvm-svn: 306883
Clang warns that a profile is out-of-date if it can't find a profile
record for any function in a TU. This warning became noisy after llvm
started allowing dead-stripping of instrumented functions.
To fix this, this patch changes the existing profile out-of-date warning
(-Wprofile-instr-out-of-date) so that it only complains about mismatched
data. Further, it introduces a new, off-by-default warning about missing
function data (-Wprofile-instr-missing).
Differential Revision: https://reviews.llvm.org/D28867
llvm-svn: 301570
2nd attempt: the first was in r296231, but it had a use after lifetime
bug.
Clang has logic to lower certain conditional expressions directly into llvm
select instructions. However, it does not emit the correct profile counter
increment as it does this: it emits an unconditional increment of the counter
for the 'then branch', even if the value selected is from the 'else branch'
(this is PR32019).
That means, given the following snippet, we would report that "0" is selected
twice, and that "1" is never selected:
int f1(int x) {
return x ? 0 : 1;
^2 ^0
}
f1(0);
f1(1);
Fix the problem by using the instrprof_increment_step intrinsic to do the
proper increment.
llvm-svn: 296245
Clang has logic to lower certain conditional expressions directly into
llvm select instructions. However, it does not emit the correct profile
counter increment as it does this: it emits an unconditional increment
of the counter for the 'then branch', even if the value selected is from
the 'else branch' (this is PR32019).
That means, given the following snippet, we would report that "0" is
selected twice, and that "1" is never selected:
int f1(int x) {
return x ? 0 : 1;
^2 ^0
}
f1(0);
f1(1);
Fix the problem by using the instrprof_increment_step intrinsic to do
the proper increment.
llvm-svn: 296231
Fix the fact that we don't assign profile counters to constructors in
classes with virtual bases, or constructors with variadic parameters.
Differential Revision: https://reviews.llvm.org/D30131
llvm-svn: 296062
The cxx-structors.cpp test checks that some instrumentation doesn't
appear, but it should be more explicit about which instrumentation it
actually expects to appear.
llvm-svn: 295532
The frontend can't see "__profn" profile name variables after IRGen
because llvm throws these away now. Tighten up some test cases which
checked for the non-existence of those variables.
llvm-svn: 295528
This is a re-try of r295085: fix up some test cases that assume that
profile name variables are preserved by the instrprof pass.
This catches one additional case in test/CoverageMapping/unused_names.c.
llvm-svn: 295101
Much to my surprise, '-disable-llvm-optzns' which I thought was the
magical flag I wanted to get at the raw LLVM IR coming out of Clang
deosn't do that. It still runs some passes over the IR. I don't want
that, I really want the *raw* IR coming out of Clang and I strongly
suspect everyone else using it is in the same camp.
There is actually a flag that does what I want that I didn't know about
called '-disable-llvm-passes'. I suspect many others don't know about it
either. It both does what I want and is much simpler.
This removes the confusing version and makes that spelling of the flag
an alias for '-disable-llvm-passes'. I've also moved everything in Clang
to use the 'passes' spelling as it seems both more accurate (*all* LLVM
passes are disabled, not just optimizations) and much easier to remember
and spell correctly.
This is part of simplifying how Clang drives LLVM to make it cleaner to
wire up to the new pass manager.
Differential Revision: https://reviews.llvm.org/D28047
llvm-svn: 290392
Value profiling should not profile constants and/or constant
expressions when they appear as callees in call instructions.
Constant expressions form when a direct callee has bitcasts or
inttoptr(ptrtint (callee)) nests surrounding it. Value profiling
should avoid instrumenting such cases. Mostly NFC.
llvm-svn: 265037
For terminator instructions, the value profiling instrumentation
happens in a basic block other than where the value site resides.
This CR moves the instrumentation point prior to the value site.
Mostly NFC.
llvm-svn: 264783
This patch changes cc1 option for PGO profile use from
-fprofile-instr-use=<path> to -fprofile-instrument-use-path=<path>.
-fprofile-instr-use=<path> is now a driver only option.
In addition to decouple the cc1 option from the driver level option, this patch
also enables IR level profile use. cc1 option handling now reads the profile
header and sets CodeGenOpt ProfileUse (valid values are {None, Clang, LLVM}
-- this is a common enum for -fprofile-instrument={}, for the profile
instrumentation), and invoke the pipeline to enable the respective PGO use pass.
Reviewers: silvas, davidxl
Differential Revision: http://reviews.llvm.org/D17737
llvm-svn: 262515
This patch changes cc1 option -fprofile-instr-generate to an enum option
-fprofile-instrument={clang|none}. It also changes cc1 options
-fprofile-instr-generate= to -fprofile-instrument-path=.
The driver level option -fprofile-instr-generate and -fprofile-instr-generate=
remain intact. This change will pave the way to integrate new PGO
instrumentation in IR level.
Review: http://reviews.llvm.org/D16730
llvm-svn: 259811
1. Make test case more focused and robust by focusing on what to be tested (linkage, icall) -- make it easier to validate
2. Testing linkages of data and counter variables instead of names. Counters and data are more relavant to be tested.
llvm-svn: 259067
NFC. These hints are only used for inlining and the inliner now uses
the same criteria to identify hot and cold callees and set appropriate
thresholds without relying on these hints. Hence this removed code is
superfluous.
Differential Revision: http://reviews.llvm.org/D15726
llvm-svn: 256793
This sets the maximum entry count among all functions in the program to the module using module flags. This allows the optimizer to use this information.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255918
(test case update)
Profile symbols have long prefixes which waste space and creating pressure for linker.
This patch shortens the prefixes to minimal length without losing verbosity.
Differential Revision: http://reviews.llvm.org/D15503
llvm-svn: 255576
(This is part-2 of the patch -- fixing test cases)
Before the patch, -fprofile-instr-generate compile will fail
if no integrated-as is specified when the file contains
any static functions (the -S output is also invalid).
This patch fixed the issue. With the change, the index format
version will be bumped up by 1. Backward compatibility is
preserved with this change.
Differential Revision: http://reviews.llvm.org/D15243
llvm-svn: 255366
Constructors and destructors may be represented by several functions
in IR. Only base structors correspond to source code, others are
small pieces of code and eventually call the base variant. In this
case instrumentation of non-base structors has little sense, this
fix remove it. Now profile data of a declaration corresponds to
exactly one function in IR, it agrees with the current logic of the
profile data loading.
This change fixes PR24996.
Differential Revision: http://reviews.llvm.org/D15158
llvm-svn: 254876
cannot assume that the current working directory is writable in all test
environments. I don't know a better way to write this test of hand, lets
discuss. Possibly, a better option would be to put these together with
other test testing the driver directly.
llvm-svn: 241885
This patch adds support for specifying where the profile is emitted in a
way similar to GCC. These flags are used to specify directories instead
of filenames. When -fprofile-generate=DIR is used, the compiler will
generate code to write to <DIR>/default.profraw.
The patch also adds a couple of extensions: LLVM_PROFILE_FILE can still be
used to override the directory and file name to use and -fprofile-use
accepts both directories and filenames.
To simplify the set of flags used in the backend, all the flags get
canonicalized to -fprofile-instr-{generate,use} when passed to the
backend. The decision to use a default name for the profile is done
in the driver.
llvm-svn: 241825
Several tests wouldn't pass when executed on an armv7a_pc_linux triple
due to the non-default arm_aapcs calling convention produced on the
function definitions in the IR output. Account for this with the
application of a little regex.
Patch by Ying Yi.
llvm-svn: 240971
When a profile file cannot be opened, we used to display just the error
message but not the name of the profile the compiler was trying to open.
This will become useful in the next set of patches that introduce
GCC-compatible flags to specify profiles.
llvm-svn: 240715
-fprofile-instr-generate does not emit counter increment intrinsics
for Dtor_Deleting and Dtor_Complete destructors with assigned
counters. This causes unnecessary [-Wprofile-instr-out-of-date]
warnings during profile-use runs even if the source has never been
modified since profile collection.
Patch by Betul Buyukkurt. Thanks!
llvm-svn: 237804