This adds a detailed profile summary in llvm-profdata. The summary is in the
form of one or more triples of the form (P, N, M) which is interpreted as if
we look at the Top-N counts in the profile, their sum accounts for P percentage
of the sum of all counts in the program and the minimum count in the Top-N is M.
Differential Revision: http://reviews.llvm.org/D16005
llvm-svn: 257680
This rewrites and expands the existing codeview dumping functionality in
llvm-readobj using techniques similar to those in lib/Object. This defines a
number of new records and enums useful for reading memory mapped codeview
sections in COFF objects.
The dumper is intended as a testing tool for LLVM as it grows more codeview
output capabilities.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D16104
llvm-svn: 257658
The LinkAllPasses.h file is included in several main programs, to force
a large number of passes to be linked in. However, the ForcePassLinking
constructor uses undefined behavior, since it calls member functions on
`nullptr`, e.g.:
((llvm::Function*)nullptr)->viewCFGOnly();
llvm::RGPassManager RGM;
((llvm::RegionPass*)nullptr)->runOnRegion((llvm::Region*)nullptr, RGM);
When the optimization level is -O2 or higher, the code below the first
nullptr dereference is optimized away, and replaced by `ud2` (on x86).
Therefore, the calls after that first dereference are never emitted. In
my case, I noticed there was no call to `llvm::sys::RunningOnValgrind()`!
Replace instances of dereferencing `nullptr` with either objects on the
stack, or regular function calls.
Differential Revision: http://reviews.llvm.org/D15996
llvm-svn: 257645
Summary:
v2: Make ReturnsVoid private, so that I can another 8 lines of code and
look more productive.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16034
llvm-svn: 257622
Summary:
Return values can be stored in SGPRs (i32) and VGPRs (f32).
This will be used by functions which expect some bytecode or other binary to
be appended at the end. It allows defining in which registers the return
values will be stored.
v2: don't do this for compute shaders
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16033
llvm-svn: 257621
is < ``2.0``.
Older versions of psutil (e.g. ``1.2.1`` which is the version shipped with
Ubuntu 14.04) use a different API for retrieving the child processes.
To handle this try the new API first and if that fails try the old API.
llvm-svn: 257616
Summary:
It is off by default, but can be used
with --misched=si
Patch by: Axel Davy
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D11885
llvm-svn: 257609
The global entry point prologue currently assumes that the TOC
associated with a function is less than 2GB away from the function
entry point. This is always true when using the medium or small
code model, but may not be the case when using the large code model.
This patch adds a new variant of the ELFv2 global entry point prologue
that lifts the 2GB restriction when building with -mcmodel=large.
This works by emitting a quadword containing the distance from the
function entry point to its associated TOC immediately before the
entry point, and then using a prologue like:
ld r2,-8(r12)
add r2,r2,r12
Since creation of the entry point prologue is now split across two
separate routines (PPCLinuxAsmPrinter::EmitFunctionEntryLabel emits
the data word, PPCLinuxAsmPrinter::EmitFunctionBodyStart the prolog
code), I've switched to using named labels instead of just temporaries
to indicate the locations of the global and local entry points and the
new TOC offset data word.
These names are provided by new routines in PPCFunctionInfo modeled
after the existing PPCFunctionInfo::getPICOffsetSymbol.
Note that a corresponding change was committed to GCC here:
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00355.html
Reviewers: hfinkel
Differential Revision: http://reviews.llvm.org/D15500
llvm-svn: 257597
Summary:
With the ability to concatenate shader binaries, the limit of 15 no longer
applies.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16031
llvm-svn: 257592
Summary:
This allows Mesa to pass initial SPI_PS_INPUT_ADDR to LLVM.
The register assigns VGPR locations to PS inputs, while the ENA register
determines whether or not they are loaded.
Mesa needs to set some inputs as not-movable, so that a pixel shader prolog
binary appended at the beginning can assume where some inputs are.
v2: Make PSInputAddr private, because there is never enough silly getters
and setters for people to read.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16030
llvm-svn: 257591
Summary: ret.ll will contain a test for this
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16029
llvm-svn: 257590
Make x86 OptimizeLEAs pass remove LEA instruction if there is another LEA
(in the same basic block) which calculates address differing only be a
displacement. Works only for -Oz.
Differential Revision: http://reviews.llvm.org/D13295
llvm-svn: 257589