This patch adds support for reading sample profiles with inline stacks.
Inline stacks in a profile are generated when the sampled binary has
samples in inlined functions.
For instance, if main() calls foo() and foo() calls bar(), and bar() is
inlined into foo() and foo() inlined into main(), the profile may look
something like:
main total:364084 head:0
[ ... ]
2.3: _Z3fool total:243786
1: 60149
1.2: 38568
1.4: 46511
1.7: _Z3bari total:98558
1.1: 52672
1.2: 45886
At line 2, discriminator 3, main() calls foo(). In turn, foo() calls
bar() at line 1, discriminator 7.
In the textual format, this stacking of inline calls is represented
with indentation.
With this change, LLVM can now read sample profile files generated by
the create_gcov tool from https://github.com/google/autofdo.
llvm-svn: 249644
In r224059, we started verifying after addPass, but missed doing so on
insertPass. There isn't a good reason for the discrepancy, and
skipping the verifier in these cases causes bugs.
This also exposes a verifier error that was introduced in r249087, but
the verifier doesn't run until after the register coalescer, when the
issue happens to have been resolved. I've skipped the verifier after
SIFixSGPRLiveRangesID to avoid the failures for now and will follow up
with Matt for a proper fix.
llvm-svn: 249643
the "" and the suffix; that breaks names such as 'operator""if'. For symmetry,
also remove the space between the 'operator' and the '""'.
llvm-svn: 249641
The backend restores the stack pointer after recovering from an
exception. This is similar to r245879, but it doesn't try to use the
normal cleanup mechanism, so hopefully it won't cause the same breakage.
llvm-svn: 249640
Right now clang_Cursor_getMangling will attempt to mangle any
declaration, even if the declaration isn't mangled (extern C). This
results in a partially mangled name which isn't useful for much. This
patch makes clang_Cursor_getMangling return an empty string if the
declaration isn't mangled.
Patch by Michael Wu <mwu@mozilla.com>.
llvm-svn: 249639
In particular, passing non-trivially copyable objects by value on win32
uses a dynamic alloca (inalloca). We would clobber ESP in the epilogue
and end up returning to outer space.
llvm-svn: 249637
Summary:
Since r223145 we don't include sanitizer_allocator_internal.h into
sanitizer_symbolizer.h, so we can have undefined reference to
Internal{Alloc, Free} stuff into sanitizer_symbolizer_libbacktrace.cc under
SANITIZER_CP_DEMANGLE macro.
This patch simply includes appropriate header into
sanitizer_symbolizer_libbacktrace.h to resolve the issue.
Patch by Maxim Ostapenko!
Reviewers: kcc, eugenis, samsonov
Subscribers: llvm-commits, ygribov
Differential Revision: http://reviews.llvm.org/D13429
llvm-svn: 249633
llvm-nm only needs the target to parse module level assembly in bitcode. It doesn't need a disassembler or codegen.
llvm-objdump needs to be able to disassemble a file, but doesn't need asm parsers or codegen.
This reduces the sizes of these tools by a few MB each, depending on how many backends are linked in.
llvm-svn: 249632
`not` command on Windows is not able to find an executable from PATH
if a given command already has an extension even if the extension is
not ".exe".
llvm-svn: 249630
Summary: This change fixes pr24916. As associated test has been added.
Reviewers: clayborg
Subscribers: zturner, lldb-commits
Differential Revision: http://reviews.llvm.org/D13224
llvm-svn: 249629
When the target settings are consulted to decide the expression language
is decided in CommandObjectExpression, this doesn't help if you're running
SBFrame::EvaluateExpression(). Moving the logic into UserExpression fixes
this.
Based on patch from scallanan@apple.com
Reviewed by: dawn
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D13267
llvm-svn: 249624
Previously the CompileOnDemand layer always created single-function partitions.
In theory this new API allows for more interesting partitions, though this has
not been well tested yet.
llvm-svn: 249623
The relocation for the filter funclet will be against a symbol table
entry for a function instead of the section, making it easier to
understand what is going on.
llvm-svn: 249621
The __CxxFrameHandler3 tables for 32-bit are supposed to hold stack
offsets relative to EBP, not ESP. I blindly updated the win-catchpad.ll
test case, and immediately noticed that 32-bit catching stopped working.
While I'm at it, move the frame index to frame offset WinEH table logic
out of PEI. PEI shouldn't have to know about WinEHFuncInfo. I realized
we can calculate frame index offsets just fine from the table printer.
llvm-svn: 249618
We remove unreachable blocks because it is pointless to consider them
for coloring. However, we still had stale pointers to these blocks in
some data structures after we removed them from the function.
Instead, remove the unreachable blocks before attempting to do anything
with the function.
This fixes PR25099.
llvm-svn: 249617
We don't have a good place to put them. Our previous spot was causing us
to optimize loads from the exception object to undef, because it was
after the catchpad instruction that models the write to the catch
object.
llvm-svn: 249616
Recycler just needs a singly-linked list, and it takes less (and
simpler) code to hand-roll one of those than to build up the equivalent
`iplist_traits`. In theory, this should speed things up a bit too, but
this is really just a drive-by cleanup so I haven't measured.
llvm-svn: 249615
ScopDetection users are interested in the detection context and access
these via different get-methods. However, not all information was
exposed though the number of maps to hold it was increasing steadily.
With this change only the detection contexts the rejection log and the
ValidRegions set are mapped. The former is needed, the second could be
integrated in the first and the ValidRegions set is only needed for the
deterministic order of the regions.
llvm-svn: 249614
This replaces the support for user defined error functions by a
heuristic that tries to determine if a call to a non-pure function
should be considered "an error". If so the block is assumed not to be
executed at runtime. While treating all non-pure function calls as
errors will allow a lot more regions to be analyzed, it will also
cause us to dismiss a lot again due to an infeasible runtime context.
This patch tries to limit that effect. A non-pure function call is
considered an error if it is executed only in conditionally with
regards to a cheap but simple heuristic.
llvm-svn: 249611
This patch allows invariant loads to be used in the SCoP description,
e.g., as loop bounds, conditions or in memory access functions.
First we collect "required invariant loads" during SCoP detection that
would otherwise make an expression we care about non-affine. To this
end a new level of abstraction was introduced before
SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
if we want a load inside the region to be optimistically assumed
invariant or not. If we do, it will be marked as required and in the
SCoP generation we bail if it is actually not invariant. If we don't
it will be a non-affine expression as before. At the moment we
optimistically assume all "hoistable" (namely non-loop-carried) loads
to be invariant. This causes us to expand some SCoPs and dismiss them
later but it also allows us to detect a lot we would dismiss directly
if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
allow potential aliases between optimistically assumed invariant loads
and other pointers as our runtime alias checks are sound in case the
loads are actually invariant. Together with the invariant checks this
combination allows to handle a lot more than LICM can.
The code generation of the invariant loads had to be extended as we
can now have dependences between parameters and invariant (hoisted)
loads as well as the other way around, e.g.,
test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
First, it is important to note that we cannot have real cycles but
only dependences from a hoisted load to a parameter and from another
parameter to that hoisted load (and so on). To handle such cases we
materialize llvm::Values for parameters that are referred by a hoisted
load on demand and then materialize the remaining parameters. Second,
there are new kinds of dependences between hoisted loads caused by the
constraints on their execution. If a hoisted load is conditionally
executed it might depend on the value of another hoisted load. To deal
with such situations we sort them already in the ScopInfo such that
they can be generated in the order they are listed in the
Scop::InvariantAccesses list (see compareInvariantAccesses). The
dependences between hoisted loads caused by indirect accesses are
handled the same way as before.
llvm-svn: 249607
Value maps are created and used in many places and it is not always
possible to include CodeGen/Blockgenerators.h. To this end, ValueMapT
now lives in the ScopHelper.h which does not have any dependences itself.
This patch also replaces uses of different other value map types with
the ValueMapT.
llvm-svn: 249606
Create `SymbolTableList`, a wrapper around `iplist` for lists that
automatically manage a symbol table. This commit reduces a ton of code
duplication between the six traits classes that were used previously.
As a drive by, reduce the number of template parameters from 2 to 1 by
using a SymbolTableListParentType metafunction (I originally had this as
a separate commit, but it touched most of the same lines so I squashed
them).
I'm in the process of trying to remove the UB in `createSentinel()` (see
the FIXMEs I added for `ilist_embedded_sentinel_traits` and
`ilist_half_embedded_sentinel_traits`). My eventual goal is to separate
the list logic into a base class layer that knows nothing about (and
isn't templated on) the downcasted nodes -- removing the need to invoke
UB -- but for now I'm just trying to get a handle on all the current use
cases (and cleaning things up as I see them).
Besides these six SymbolTable lists, there are two others that use the
addNode/removeNode/transferNodes() hooks: the `MachineInstruction` and
`MachineBasicBlock` lists. Ideally there'll be a way to factor these
hooks out of the low-level API entirely, but I'm not quite there yet.
llvm-svn: 249602