The BinaryHolder has two caches for object and archive entries. These
are implemented as StringMaps of ObjectEntry and ArchiveEntry
respectively. Storing them by value is problematic because the
BinaryHolder hands out references that become invalid when the data
structure grows, which resulted in transient failures. This patch wraps
those object instances in unique pointers and changes the interface to
hand out pointers.
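A minimal sketch of the idea, not the actual BinaryHolder code: std::vector
stands in for a cache whose storage can move when it grows, and `Entry`,
`PtrCache` and `getOrCreate` are hypothetical names. Owning each entry through
a unique pointer keeps the pointee in place while the container reallocates.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a cached object or archive entry.
struct Entry {
  std::string Name;
};

// Cache that owns each entry through a unique_ptr. Growing the vector moves
// the unique_ptrs, but the Entry objects they point to stay where they are.
class PtrCache {
  std::vector<std::unique_ptr<Entry>> Entries;

public:
  // Returns a pointer that stays valid for the lifetime of the cache.
  Entry *getOrCreate(const std::string &Name) {
    assert(!Name.empty() && "entries are keyed by a non-empty name");
    for (const auto &E : Entries)
      if (E->Name == Name)
        return E.get();
    Entries.push_back(std::make_unique<Entry>(Entry{Name}));
    return Entries.back().get();
  }
};
```

With a by-value `std::vector<Entry>`, a reference taken before a later
`push_back` could dangle; here a pointer returned by `getOrCreate` can be held
across later insertions.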
rdar://90412671
Differential revision: https://reviews.llvm.org/D124567
There were two problems with directly copying the MMOs from the old
function. The MMOs are owned by the function's Allocator, so need to
be reallocated anyways (surprisingly I didn't notice breakage on
this). Second, the PseudoSourceValues are also allocated per function
and need to be reallocated.
The current testcase I'm trying to reduce only reproduces with IPRA
enabled and requires handling multiple functions.
The only real difference vs. the IR is the extra indirection to look for
the underlying MachineFunction, so treat the ReduceWorkItem as the
module instead of the function.
The ugliest piece of this is really MachineModuleInfo. It not only
tracks actual module state, but has a number of transient fields used
for isel and/or the asm printer. These shouldn't do any harm for the
use here, though they should be separated out.
Right now, if we want to dump a symbol at a specified offset, we need to use
`grep`, and it can only show surrounding symbols in layout order (not in the
lexical scope sense).
This adds options to the `dump` command, similar to those of `llvm-dwarfdump`,
that allow users to dump the symbol record at a specified offset and its
parents or children up to a specified depth.
`--symbol-offset=` must be used with `--modi` to dump only the symbol at the
given offset.
`--show-parents`/`--show-children` must be used with `--symbol-offset` to
dump all symbols that are parents/children of the symbol at the given offset.
`--parent-recurse-depth`/`--children-recurse-depth` must be used with
`--show-parents`/`--show-children` to specify the max up/down depth.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D124317
Default behavior for .file directory was changed in D105856, but
ptxas (CUDA 11.5 release) refuses to parse it:
    $ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll
    $ ptxas debug-file-loc.s
    ptxas debug-file-loc.s, line 42; fatal : Parsing error near
    '"foo.h"': syntax error
Added a new field to MCAsmInfo to control the default value of
UseDwarfDirectory. This value is used if the -dwarf-directory command
line option is not specified.
Differential Revision: https://reviews.llvm.org/D121299
Just clone all the virtual registers instead of looking for def
operands. This preserves the register values used, simplifying the
rest of the code. This avoids needing to expose the register map to
target code.
Previously the specific values used for fixed frame indexes were in
reverse order in the cloned function relative to the original, and a map was
used to adjust all frame indexes to the potentially new values. Insert
the fixed objects in reverse to avoid this. This simplifies other
code, since now we don't need to track down all frame indexes. This
will allow targets that store frame indexes in MachineFunctionInfo to
simply copy the values.
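The ordering issue can be sketched with a toy model. This is an illustration
under the assumption that fixed objects get negative indexes and each new one
is inserted at the front of the table; `ToyFrameInfo` and `cloneFixedObjects`
are hypothetical names, not the real frame info API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a frame-info table holding only fixed objects. The first fixed
// object created gets index -1, the second -2, and so on.
class ToyFrameInfo {
  std::vector<int64_t> Offsets;

public:
  int createFixedObject(int64_t Offset) {
    Offsets.insert(Offsets.begin(), Offset); // new objects go to the front
    return -static_cast<int>(Offsets.size());
  }

  int getNumFixedObjects() const { return static_cast<int>(Offsets.size()); }

  int64_t getObjectOffset(int Idx) const {
    assert(Idx < 0 && Idx >= -getNumFixedObjects() && "not a fixed index");
    return Offsets[Idx + getNumFixedObjects()];
  }
};

// Clone by walking the source from index -1 down to -N, so each object is
// re-created in its original creation order and keeps its index.
ToyFrameInfo cloneFixedObjects(const ToyFrameInfo &Src) {
  ToyFrameInfo Dst;
  for (int Idx = -1; Idx >= -Src.getNumFixedObjects(); --Idx) {
    int NewIdx = Dst.createFixedObject(Src.getObjectOffset(Idx));
    assert(NewIdx == Idx && "index changed during cloning");
    (void)NewIdx; // silence unused-variable warnings in release builds
  }
  return Dst;
}
```

Walking the source in ascending index order (-N up to -1) would instead hand
the object at -N the new index -1, which is the reversal that previously
required a remapping table.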
Note this isn't directly observable in the test since the resulting
MIR print/parse can shuffle the IDs around (in particular the final
serialization implicitly strips out dead objects).
Removing these is extremely unhelpful and just adds extra hassle. This
is really finding out whether your test script uses -mtriple or
not. You can't meaningfully delete these fields, and the resulting
module defaults to the host.
Functionality of restoreStatOnFile may be reused. Move it into
FileUtilities.cpp. Create helper class FilePermissionsApplier
to store and apply permissions.
Differential Revision: https://reviews.llvm.org/D123821
When using the L option to quick append a full archive to a thin
archive, the thin archive was being wrongly converted to a full archive.
I've fixed the issue and added a check for it in
thin-to-full-archive.test and expanded some tests.
Differential Revision: https://reviews.llvm.org/D123142
The legacy passes are now deprecated and will be removed in the near
future. This patch removes the legacy passes in coroutines.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D123918
getSuccProbability was private for some reason, saying to go through
MachineBranchProbabilityInfo. There doesn't seem to be much point to
that, as that wrapper directly calls this.
Like other areas, some of these fields aren't handled by the MIR
printer/parser so aren't tested.
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove the (Thin)LTO pipelines.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D123882
For profile generation, we need to filter raw perf samples for the binary of interest. Sometimes the binary name alone isn't enough, as several binaries with the same name can be running in the system. This change adds a process id filter to allow users to further disambiguate the input raw samples.
Differential Revision: https://reviews.llvm.org/D123869
This patch makes printing of FailedToMaterialize errors in llvm-jitlink
conditional on the -show-err-failed-to-materialize option, which defaults to
false.
FailedToMaterialize errors are not root-cause errors: they're generated when a
symbol is requested but cannot be provided because of a failure that was
reported on some other error path. They typically don't convey actionable
information, and tend to flood error logs making root cause errors harder to
spot. Hiding FailedToMaterialize errors by default addresses these issues.
This change is a big blob of code that isn't easy to break up. It
either comes in all together as a blob, works and has tests, or it
doesn't do anything.
Logically you can think of this patch as three things:
(1) Adding virtual interfaces so the bitcode writer can be overridden
(2) Adding a new bitcode writer implementation for DXIL
(3) Adding some (optional) crazy CMake goop to build the
DirectXShaderCompiler's llvm-dis as dxil-dis for testing
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D122082
This didn't work at all before, and would assert on any frame
index. Also copy the other fields, which I believe should cover
everything. There are a few that are untested since MIR serialization
is apparently still missing them (isStatepointSpillSlot,
ObjectSSPLayout, and ObjectSExt/ObjectZExt).
The rust demangler has some odd buffer handling code, which will copy
the demangled string into the provided buffer, if it will fit.
Otherwise it uses the allocated buffer it made. But the length of the
incoming buffer will have come from a previous call, which was the
length of the demangled string -- not the buffer size. And of course,
we're unconditionally allocating a temporary buffer in the first
place. So we don't actually get buffer reuse, and we get a memcpy in
some cases.
However, nothing in LLVM ever passes in a non-null pointer. Neither
does anything pass in a status pointer that is then made use of. The
only exercise these have is in the test suite.
So let's just make the rust demangler have the same API as the dlang
demangler.
Reviewed By: tmiasko
Differential Revision: https://reviews.llvm.org/D123420
The change described by:
https://reviews.llvm.org/D122226
Moved some llvm-pdbutil functionality to the debug PDB library.
This patch addresses broken '-modi' argument handling, which
causes an assertion if its value is other than '0' or '1'.
In addition, it moves the assertion for the number of occurrences
of the '-modi' argument from the PDB library into the llvm-pdbutil
driver.
Reviewed By: zequanwu
Differential Revision: https://reviews.llvm.org/D123483
This removes support for the legacy pass manager in llvm-lto and
llvm-lto2. In this case I've dropped the use-new-pm option entirely,
as I don't think this is considered part of the public interface.
This also makes -debug-pass-manager work with llvm-lto, because
that was needed to migrate some tests to NewPM.
Differential Revision: https://reviews.llvm.org/D123376
The current implementation of memprof information in the indexed profile
format stores the representation of each calling context frame inline.
This patch uses an interned representation where the frame contents are
stored in a separate on-disk hash table. The table is indexed via a hash
of the contents of the frame. With this patch, the compressed size of a
large memprof profile reduces by ~22%.
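An in-memory sketch of the interning idea, for illustration only. The `Frame`
fields, `FrameTable`, `hashFrame` and `internStack` are hypothetical names; the
real format uses an on-disk hash table and its own hash, and this sketch
ignores hash collisions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified frame.
struct Frame {
  std::string Function;
  uint32_t Line;
};

using FrameId = uint64_t;

// Hash of the frame contents, used as the table key. std::hash is only
// suitable for an in-memory sketch; a file format needs a stable hash.
FrameId hashFrame(const Frame &F) {
  return std::hash<std::string>{}(F.Function + ":" + std::to_string(F.Line));
}

class FrameTable {
  std::unordered_map<FrameId, Frame> Frames;

public:
  // Stores the frame once and returns its id; repeated frames share an entry.
  FrameId intern(const Frame &F) {
    FrameId Id = hashFrame(F);
    Frames.emplace(Id, F); // no-op if the id is already present
    return Id;
  }

  const Frame &lookup(FrameId Id) const {
    auto It = Frames.find(Id);
    assert(It != Frames.end() && "unknown frame id");
    return It->second;
  }

  std::size_t size() const { return Frames.size(); }
};

// A calling context stores frame ids instead of inline frames.
std::vector<FrameId> internStack(FrameTable &Table,
                                 const std::vector<Frame> &Stack) {
  std::vector<FrameId> Ids;
  Ids.reserve(Stack.size());
  for (const Frame &F : Stack)
    Ids.push_back(Table.intern(F));
  return Ids;
}
```

Frames shared by many calling contexts are stored once in the table, which is
where the size reduction would come from.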
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D123094
The changes described by:
https://reviews.llvm.org/D121801
https://reviews.llvm.org/D122226
Moved some llvm-pdbutil functionality to the debug PDB library.
This patch addresses one outstanding issue concerning the global
state (Filters) created in the PDB library.
- Move 'Filters' inside the 'LinePrinter' class.
- Omit 'Optional' and just pass 'PrintScope &HeaderScope' everywhere.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D122887
This removes support for performing LTO using the legacy pass
manager in LLVMgold.so. Explicitly enabling the new pass manager
is retained as a no-op.
Differential Revision: https://reviews.llvm.org/D123294
The profiler can sometimes give us an LBR trace that implicates bogus code ranges. For example:
0xc5acb56/0xc66c6c0 0xc628195/0xf31fbb0 0xc611261/0xc628130 0xc5c1a21/0xc6111c0 0x1f7edfd3/0xc5c3a50 0xc5c154f/0x1f7edec0 0xe8eed07/0xc5c11e0
Note that the first two pairs are supposed to form a linear execution range. In this case it is [0xf31fbb0, 0xc5acb56], which doesn't make sense because the start is above the end.
Such bogus ranges should be ruled out to avoid generating a bad profile. I'm fixing this for both CS and non-CS cases.
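A sketch of the ordering check implied by the example, assuming each entry is
a source/target pair and entries are listed newest first. `LBREntry` and
`linearRanges` are hypothetical names, and the actual patch may apply further
checks.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One LBR entry: a taken branch from Source to Target.
struct LBREntry {
  uint64_t Source;
  uint64_t Target;
};

using Range = std::pair<uint64_t, uint64_t>; // [Begin, End]

// Entries are ordered newest first. The code between two consecutive branches
// runs linearly from the older branch's target to the newer branch's source.
// Ranges whose begin lies above their end are dropped as bogus.
std::vector<Range> linearRanges(const std::vector<LBREntry> &LBR) {
  std::vector<Range> Ranges;
  for (std::size_t I = 0; I + 1 < LBR.size(); ++I) {
    uint64_t Begin = LBR[I + 1].Target;
    uint64_t End = LBR[I].Source;
    if (Begin > End)
      continue; // not a forward linear range, skip it
    Ranges.emplace_back(Begin, End);
  }
  return Ranges;
}
```

For the first three pairs of the trace, the range formed by the first two
pairs is dropped and [0xc628130, 0xc628195] is kept.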
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D123271
Alias.getAliaseeObject() shouldn't be null, so use dyn_cast instead of dyn_cast_or_null.
Also, remove the redundant `else if (!F)` test, which is always true at that point in the if-else chain.
This flag is present in MSVC's ml.exe to suppress copyright info output.
LLVM doesn't output copyright info, so this flag does nothing in
llvm-ml. We still add this flag though so that when llvm-ml is used as a
drop-in replacement for MSVC ml.exe, we don't get any extra warnings.
Furthermore, this behavior is also consistent with other llvm binaries
for Windows (e.g. clang-cl, llvm-mt, lld-link, etc.)
Differential revision: https://reviews.llvm.org/D123068
I saw the TODOs while reading this file and figured I'd do them.
I haven't seen these happen in practice.
No expected behavior change.
Differential Revision: https://reviews.llvm.org/D123215
STABS information consists of a list of records in the linked binary
that look like this:
    OSO: path/to/some.o
    SO: path/to/some.c
    FUN: sym1
    FUN: sym2
    ...
The linked binary has one such set of records for every .o file linked
into it.
When dsymutil processes the binary's STABS information, it:
1. Reads the .o file mentioned in the OSO line
2. For each FUN entry after it in the main executable's STABS info:
   a) it looks up that symbol in the symbol table of that .o file
   b) if it doesn't find it there, it goes through all symbols in the
      main binary at the same address and sees if any of those match
With ICF, ld64.lld's STABS output claims that all identical functions
that were folded are in the .o file of the one that's deemed the
canonical one. Many small functions might be folded into a single
function, so there are .o OSO entries that end up with many FUN lines,
but almost none of them exist in the .o file's symbol table.
Previously, dsymutil would do a full scan of all symbols in the main
executable _for every one of these entries_.
This patch instead scans all aliases once and remembers them per name.
This reduces the alias resolution complexity from
O(number_of_aliases_in_o_file * number_of_symbols_in_main_executable) to
O(number_of_aliases_in_o_file * log(number_of_aliases_in_o_file)).
In practice, it reduces the time spent to run dsymutil on
Chromium Framework from 26 min (after https://reviews.llvm.org/D89444)
or 12 min (before https://reviews.llvm.org/D89444) to ~8m30s.
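The shape of the change can be sketched as follows. This is a simplified
illustration, not the dsymutil code: `Symbol`, `buildNameIndex` and
`lookupAddress` are hypothetical names, and the sketch indexes every symbol it
is given by name rather than reproducing the exact alias collection.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

struct Symbol {
  std::string Name;
  uint64_t Address;
};

// Built once with a single pass over the symbols, instead of rescanning the
// whole symbol list for every FUN entry that is missing from the .o file.
std::map<std::string, uint64_t>
buildNameIndex(const std::vector<Symbol> &Symbols) {
  std::map<std::string, uint64_t> Index;
  for (const Symbol &S : Symbols)
    Index.emplace(S.Name, S.Address); // keeps the first address per name
  return Index;
}

// Each later query is a logarithmic map lookup.
std::optional<uint64_t>
lookupAddress(const std::map<std::string, uint64_t> &Index,
              const std::string &Name) {
  auto It = Index.find(Name);
  if (It == Index.end())
    return std::nullopt;
  return It->second;
}
```

Building the index costs one pass plus a logarithmic insert per symbol, after
which no lookup needs a linear scan.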
We probably want to change how ld64.lld writes STABS entries when ICF
is enabled, but making dsymutil not have pathological performance for
this input seems like a good change as well.
No expected behavior change (other than it's faster). I verified that
for Chromium Framework, the generated .dSYM is identical with and
without this patch.
Differential Revision: https://reviews.llvm.org/D123218