Summary:
This change allows us to patch/unpatch specific functions using the
function ID. This is useful in cases where implementations might want to
do coverage-style, or more fine-grained control of which functions to
patch or un-patch at runtime.
Depends on D32693.
Reviewers: dblaikie, echristo, kpw
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32695
llvm-svn: 302112
This makes it simpler for the runtime to consistently handle the entries
in the function sled index in both 32 and 64 bit platforms where the
XRay runtime works.
Follow-up on D32693.
llvm-svn: 302111
Summary:
This change adds a new section to the xray-instrumented binary that
stores an index into ranges of the instrumentation map, where sleds
associated with the same function can be accessed as an array. At
runtime, we can get access to this index by function ID offset allowing
for selective patching and unpatching by function ID.
Each entry in this new section (xray_fn_idx) will include two pointers
indicating the start and one past the end of the sleds associated with
the same function. These entries will be 16 bytes long on x86 and
aarch64. On arm, we align to 16 bytes anyway so the runtime has to take
that into consideration.
__{start,stop}_xray_fn_idx will be the symbols that the runtime will
look for when we implement the selective patching/unpatching by function
id APIs. Because XRay synthesizes the function id's in a monotonically
increasing manner at runtime now, implementations (and users) can use
this table to look up the sleds associated with a specific function.
This is useful in implementations that want to do things like:
- Implement coverage mode for functions by patching everything
pre-main, then as functions are encountered, the installed handler
can unpatch the function that's been encountered after recording
that it's been called.
- Do "learning mode", so that the implementation can figure out some
statistical information about function calls by function id for a
time being, and then determine which functions are worth
uninstrumenting at runtime.
- Do "selective instrumentation" where an implementation can
specifically instrument only certain function id's at runtime
(either based on some external data, or through some other
heuristics) instead of patching all the instrumented functions at
runtime.
Reviewers: dblaikie, echristo, chandlerc, javed.absar
Subscribers: pelikan, aemerson, kpw, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D32693
llvm-svn: 302109
When profiling a no-op incremental link of Chromium I found that the functions
computeImportForFunction and computeDeadSymbols were consuming roughly 10% of
the profile. The goal of this change is to improve the performance of those
functions by changing the map lookups that they were previously doing into
pointer dereferences.
This is achieved by changing the ValueInfo data structure to be a pointer to
an element of the global value map owned by ModuleSummaryIndex, and changing
reference lists in the GlobalValueSummary to hold ValueInfos instead of GUIDs.
This means that a ValueInfo will take a client directly to the summary list
for a given GUID.
Differential Revision: https://reviews.llvm.org/D32471
llvm-svn: 302108
We were correctly computing the size contribution of a .tbss input
section (it is none), but we were incorrectly considering the
alignment of the output section: it was advancing Dot instead of
ThreadBssOffset.
As far as I can tell this was always wrong in our linkerscript
implementation, but that became more visible now that the code is
shared with the non linker script case.
llvm-svn: 302107
_HAS_CXX17 indicates whether MSVC's STL is in C++17 mode.
In MSVC there's a distinction between CRT headers like stdlib.h and STL headers
like cstdlib. Only the STL headers drag in yvals.h, our internal STL-wide header
that defines internal macros like _HAS_CXX17.
_HAS_CXX17 is an MSVC STL library macro, unconditionally defined. We centralize
everything on this, because we have to ask different questions to determine
whether C1XX, EDG, or Clang is in 14 or 17 mode, and we additionally permit
users to override the detection in one way (it's okay to ask for 17 from the
compiler, but only 14 from the libs, at least for the moment; only noexcept
in the type system will give us a headache).
As this header is for testing MSVC's STL, we can assume _HAS_CXX17 is defined.
Fixes D32726.
llvm-svn: 302104
Summary:
This is an implementation of the loop detection logic that XRay needs to
determine whether a function might take time at runtime. Without this
heuristic, XRay will tend to not instrument short functions that have
loops that might have runtime dependent on inputs or external values.
While this implementation doesn't do any further analysis than just
figuring out whether there is a loop in the MachineFunction being
code-gen'ed, we're paving the way for being able to perform more
sophisticated analysis of the function in the future (for example to
determine whether the trip count for the loop might be constant, and
make a decision on that instead). This enables us to cover more
functions with the default heuristics, and potentially identify ones
that have variable runtime latency just by looking for the presence of
loops.
Reviewers: chandlerc, rnk, pelikan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32274
llvm-svn: 302103
These pragmas are intended to simulate the effect of entering or leaving a file
with an associated module. This is not completely implemented yet: declarations
between the pragmas will not be attributed to the correct module, but macro
visibility is already functional.
Modules named by #pragma clang module begin must already be known to clang (in
some module map that's either loaded or on the search path).
llvm-svn: 302098
Summary:
The existing implementation creates a symbolic SCEV expression every
time we analyze a phi node and then has to remove it, when the analysis
is finished. This is very expensive, and in most of the cases it's also
unnecessary. According to the data I collected, ~60-70% of analyzed phi
nodes (measured on SPEC) have the following form:
PN = phi(Start, OP(Self, Constant))
Handling such cases separately significantly speeds this up.
Reviewers: sanjoy, pete
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32663
llvm-svn: 302096
r296685 started adding the test/ subdirectory even when
LIBCXX_INCLUDE_TESTS=OFF. This is great for testing libcxx standalone,
but it also breaks the build when the test/ subdirectory is removed
(and our submission system strips all test/ directories).
This patch updates the logic to check for test/ before adding it.
rdar://problem/31931366
llvm-svn: 302095
Change checkRippleForAdd from a heuristic to a full check -
if it is provable that the add does not overflow return true, otherwise false.
Patch by Yoav Ben-Shalom
Differential Revision: https://reviews.llvm.org/D32686
llvm-svn: 302093
This patch adds isConstant and getConstant for determining if KnownBits represents a constant value and to retrieve the value. Use them to simplify code.
Differential Revision: https://reviews.llvm.org/D32785
llvm-svn: 302091
I don't believe its possible to have non-zero values here since DataLayout became required. The APInt constructor inside of the KnownBits object will assert if this ever happens.
llvm-svn: 302089
This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible.
Differential Revision: https://reviews.llvm.org/D32784
llvm-svn: 302088
This is the DAG equivalent of https://reviews.llvm.org/D32255 ,
which will hopefully be committed again. The functionality
(preferring a 'not' op) is already here in the DAG, so this is
just intended to be a clean-up and performance improvement.
llvm-svn: 302087
Compiler emitted synthetic types may not have an associated DIFile
(translation unit). In such a case, when generating CodeView debug type
information, we would attempt to compute an absolute filepath which
would result in a segfault due to a NULL DIFile*. If there is no source
file associated with the type, elide the type index entry for the type
and record the type information. This actually results in higher
fidelity debug information than clang/C2 as of this writing.
Resolves PR32668!
llvm-svn: 302085
It seems virtually everyone who tries to do LTO build with Clang and
LLD was hit by a mistake to forget using llvm-ar command to create
archive files. I wasn't an exception. Since this is an annoying common
issue, it is probably better to handle that gracefully rather than
reporting an error and tell the user to redo build with different
configuration.
Differential Revision: https://reviews.llvm.org/D32721
llvm-svn: 302083
Otherwise, each CPU has to manually specify the extensions it supports,
even though they have to be a superset of the base arch extensions.
And when there's redundant data there's stale data, so most of the CPUs
lie about the features they support (almost none lists AEK_FP).
Instead, do the saner thing: add the optional extensions on top of the
base extensions provided by the architecture.
The ARM TargetParser has the same behavior.
Differential Revision: https://reviews.llvm.org/D32780
llvm-svn: 302078
That's only a required extension as of v8.1a.
Remove it from the "generic" CPU as well: it should only support the
base ISA (and binutils agrees).
Also unify the MC tests into crc.s and arm64-crc32.s
llvm-svn: 302077
Summary:
It's not safe to assume that atexit handlers can be run once the app crashed.
Patch by Jochen Eisinger.
Reviewers: kcc, vitalybuka
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32640
llvm-svn: 302076
LLVM-IR names are commonly available in debug builds, but often not in release
builds. Hence, using LLVM-IR names to identify statements or memory reference
results makes the behavior of Polly depend on the compile mode. This is
undesirable. Hence, we now just number the statements instead of using LLVM-IR
names to identify them (this issue has previously been brought up by Zino
Benaissa).
However, as LLVM-IR names help in making test cases more readable, we add an
option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by
default set in the polly tests to make test cases more readable.
This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the
following test case provided by Eli Friedman <efriedma@codeaurora.org> (already
used in one of the previous commits):
struct X { int x; };
void a();
#define SIG (int x, X **y, X **z)
typedef void (*fn)SIG;
#define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); }
#define FN5 FN FN FN FN FN
#define FN25 FN5 FN5 FN5 FN5
#define FN125 FN25 FN25 FN25 FN25 FN25
#define FN250 FN125 FN125
#define FN1250 FN250 FN250 FN250 FN250 FN250
void x SIG { FN1250 }
For a larger benchmark I have on-hand (10000 loops), this reduces the time for
running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%.
The reason for this large speedup is that our previous use of printAsOperand
had a quadratic cost, as for each printed and unnamed operand the full function
was scanned to find the instruction number that identifies the operand.
We do not need to adjust the way memory reference ids are constructured, as
they do not use LLVM values.
Reviewed by: efriedma
Tags: #polly
Differential Revision: https://reviews.llvm.org/D32789
llvm-svn: 302072
It doesn't matter what binding we store in a non-UsedInRegularObj undefined
symbol because we should reset it when we see a real undefined symbol in
a combined LTO object. The fact that we weren't doing so before is a bug
(PR32899) which is now fixed.
llvm-svn: 302067
A bot had "-LTO" in its working directory, which matched the regex used in this
test. Since the arg is quoted, we can exploit that instead. Still broken if
there's a path with a quote in, but I think that's pretty niche.
llvm-svn: 302066