Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.
Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.
Motivation
----------
Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.
We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.
Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.
http://reviews.llvm.org/D19034
<rdar://problem/25256815>
llvm-svn: 266446
This is almost identical to:
http://reviews.llvm.org/rL264527
This doesn't solve PR27344; it just allows the profile weights to survive.
To solve the bug, we need to use the profile weights in the backend.
llvm-svn: 266442
Summary:
Without MMOs, the callee-save load/store instructions were treated as
volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer.
Reviewers: t.p.northover, mcrosier
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D17661
llvm-svn: 266439
[PPC] Previously when casting generic loads to LXV2DX/ST instructions we
would leave the original load return type in place allowing for an
assertion failure when we merge two equivalent LXV2DX nodes with
different types.
This fixes PR27350.
Reviewers: nemanjai
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D19133
llvm-svn: 266438
Perform store clustering just like load clustering. This change add
StoreClusterMutation in machine-scheduler. To control StoreClusterMutation,
added enableClusterStores() in TargetInstrInfo.h. This is enabled only on
AArch64 for now.
This change also add support for unscaled stores which were not handled in
getMemOpBaseRegImmOfs().
llvm-svn: 266437
The root of the problem was that findMainViewFileID(File, Function)
could return some ID for any given file, even though that file
was not the main file for that function.
This patch ensures that the result of this function is conformed
with the result of findMainViewFileID(Function).
Differential Revision: http://reviews.llvm.org/D18787
llvm-svn: 266436
Summary:
In the added test-case, the atomic instruction feeds into a non-machine
CopyToReg node which hasn't been selected yet, so guard against
non-machine opcodes here.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19043
llvm-svn: 266433
Summary:
Doing a pthread_detach while the thread is exiting can cause crashes or other mischief, so we
make sure the thread stays around long enough. The performance impact of the added
synchronization should be minimal, as the parent thread is already holding a mutex, so I am just
making sure it holds it for a little while longer. It's possible the new thread will block on
this mutex immediately after startup, but it should be unblocked really quickly and some
blocking is unavoidable if we actually want to have this synchronization.
Reviewers: tberghammer
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D19153
llvm-svn: 266423
Recommit modified version of r266311 including build bot regression fix.
This differs from the original r266311 by:
- Fixing Scalar::Promote to correctly zero- or sign-extend value depending
on signedness of the *source* type, not the target type.
- Omitting a few stand-alone fixes that were already committed separately.
llvm-svn: 266422
This routine contained a stray "return false;" making part of the code
never executed. Also, the stack offset where to find on-stack arguments
was incorrect.
llvm-svn: 266417
Summary:
The original breakpoint location test was failing for linux, because the compilers here tend to
merge the full-object and subobject destructors even at -O0 (as a result, we are getting only 2
breakpoint locations, and not 4 as the test expected. The fixup in r266164 substantially weakened
the test, as it now did not check whether both kinds of destructors were being found.
Because of these contraints, I have altered the logic of the test. It sets the
breakpoint by name, and then independently verifies that the breakpoint is set on the correct
demangled symbol name (which is not very meaningful since both kinds of destructors demangle to
the same name) *and* the correct symbol address (which is obtained by looking up the mangled
symbol name).
Reviewers: clayborg
Subscribers: ovyalov, zturner, lldb-commits
Differential Revision: http://reviews.llvm.org/D19052
llvm-svn: 266416
This patch implements __unaligned as a type qualifier; before that, it was
modeled as an attribute. Proper mangling of __unaligned is implemented as well.
Some OpenCL code/tests are tangenially affected, as they relied on existing
number and sizes of type qualifiers.
Differential Revision: http://reviews.llvm.org/D18596
llvm-svn: 266415
Summary:
A default uses-allocator constructor has been added since that overload was previously provided by the extended constructor.
Since Clang does implicit conversion checking after substitution this constructor has to deduce the allocator_arg_t parameter so that it can prevent the evaluation of "is_default_constructible" if the first argument doesn't match. See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1391 for more information.
This patch fixes PR24779 (https://llvm.org/bugs/show_bug.cgi?id=24779)
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D19006
llvm-svn: 266409
Summary:
Calls on NVPTX are unusually expensive (for one thing, lots of state
needs to be saved to memory, which is slow), so make the inlininer much
more aggressive.
Reviewers: chandlerc
Subscribers: jholewinski, llvm-commits, tra
Differential Revision: http://reviews.llvm.org/D18561
llvm-svn: 266406
Summary:
InlineCost's threshold is multiplied by this value. This lets us adjust
the inlining threshold up or down on a per-target basis. For example,
we might want to increase the threshold on targets where calls are
unusually expensive.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D18560
llvm-svn: 266405
It's unsafe to duplicate blocks that contain convergent instructions
during ifcnv. See the patch for details.
Reviewers: hfinkel
Differential Revision: http://reviews.llvm.org/D17518
llvm-svn: 266404
Summary:
This IR pass is helpful for GPUs, and other targets with divergent
branches. It's a nop on targets without divergent branches.
Reviewers: chandlerc
Subscribers: llvm-commits, jingyue, rnk, joker.eph, tra
Differential Revision: http://reviews.llvm.org/D18626
llvm-svn: 266399