In practice, this only gives modest savings (for a 6.5 MB DLL with
230 KB xdata, the xdata sections shrinks by around 2.5 KB); to
gain more, the frame lowering would need to be tweaked to more often
generate frame layouts that match the canonical layouts that can
be written in packed form.
Differential Revision: https://reviews.llvm.org/D87371
implements glibc-like wrappers over Linux syscalls.
[3/11] patch series to port ASAN for riscv64
Depends On D87998
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D87572
Some of the predicates can't always be decided - for example when a type
definition isn't available. At the same time it's necessary to let
client code decide what to do about such cases - specifically we can't
just use true or false values as there are callees with
conflicting strategies how to handle this.
This is a speculative fix for PR47276.
Differential Revision: https://reviews.llvm.org/D88133
Commit https://reviews.llvm.org/rG144e57fc9535 added this test
case that creates message queues but does not remove them. The
message queues subsequently build up on the machine until the
system wide limit is reached. This has caused failures for a
number of bots running on a couple of big PPC machines.
This patch just adds the missing cleanup.
When performing ThinLTO importing, the metadata loader attempts to lazy
load, by building an index. However, module level global decl attachment
metadata was being parsed early while building the index, since the
associated (module level) global values aren't materialized on demand.
This results in the creation of forward reference temporary metadatas,
which are expensive.
Normally, these module level global values don't have much attached
metadata. However, in the case of -fwhole-program-vtables (e.g. for
whole program devirtualization), the vtables may have many attached type
metadatas. This was resulting in very slow performance when performing
ThinLTO importing with the default lazy loading.
This patch restructures the handling of these global decl attachment
records, delaying their parsing until after the lazy loading index has
been built. Then the parser can use the interface that loads from the
index, which resolves forward references immediately instead of creating
expensive temporaries.
For one ThinLTO backend that imports from modules containing huge
numbers of vtables and associated types, I measured the following
compile times for the metadata materialization during function
importing, rounded to nearest second:
No -fwhole-program-vtables:
Lazy loading on (head): 1s
Lazy loading off (head): 3s
Lazy loading on (patch): 1s
With -fwhole-program-vtables:
Lazy loading on (head): 440s
Lazy loading off (head): 4s
Lazy loading on (patch): 2s
Differential Revision: https://reviews.llvm.org/D87970
The word "target" is overloaded, so lighten its load by using another word to denote the symbol or section to which a reloc points. While more stilted than "target", "referent" is rather less pompous than "designatum" or "denotatum". :P
Along the way, make a few neighboring variable names more descriptive.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D87584
-debug-pass is a legacy PM only option.
Some tests checks that the pass returned that it made a change,
which is not relevant to the NPM, since passes return PreservedAnalyses.
Some tests check that passes are freed at the proper time, which is also
not relevant to the NPM.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D87945
The IRSimilarityCandidate is a container to hold a region of
IRInstructions and offer interfaces for the starting instruction, ending
instruction, parent function, length. It also assigns a global value
number for each unique instance of a value in the region.
It also contains an interface to compare two IRSimilarity as to whether
they have the same sequence of similar instructions.
Tests for whether the instructions are similar are found in
unittests/Analysis/IRSimilarityIdentifierTest.cpp.
Differential Revision: https://reviews.llvm.org/D86970
The rendered html was (no hyperlink was generated):
(see Getting Started <GettingStarted.html#git-pre-push-hook>)
Now, it is (with proper hyperlink):
(see Git pre-push hook)
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D88116
A conversion from `pow` to `sqrt` shall not call an `errno`-setting
`sqrt` with -//infinity//: the `sqrt` will set `EDOM` where the `pow`
call need not.
This patch avoids the erroneous (pun not intended) transformation by
applying the restrictions discussed in the thread for
https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html.
The existing tests are updated (depending on emphasis in the checks for
library calls, avoidance of overlap, and overall coverage):
- to add `ninf`, retaining the intended library call,
- to use the intrinsic, retaining the use of `select`, or
- to expect the replacement to not occur.
The following is tested:
- The pow intrinsic folds to a `select` instruction to
handle -//infinity//.
- The pow library call folds, with `ninf`, to `sqrt` without the
`select` instruction associated with handling -//infinity//.
- The pow library call does not fold to `sqrt` without `ninf`.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D87877
The motivation here is that MachineBlockPlacement relies on analyzeBranch to remove branches to fallthrough blocks when the branch is not fully analyzeable. With the introduction of the FAULTING_OP psuedo for implicit null checking (see D87861), this case becomes important. Note that it's hard to otherwise exercise this path as BranchFolding handle's any fully analyzeable branch sequence without using this interface.
p.s. For anyone who saw my comment in the original review, what I thought was an issue in BranchFolding originally turned out to simply be a bug in my patch. (Now fixed.)
Differential Revision: https://reviews.llvm.org/D88035
Libc++ had an issue where nonsensical code like
decltype(std::stringstream{} << std::vector<int>{});
would compile, as long as you kept the expression inside decltype in
an unevaluated operand. This turned out to be that we didn't implement
LWG1203, which clarifies what we should do in that case.
rdar://58769296
When pairing ldr instructions to an ldp instruction, we cannot pair two ldr
destination registers where one is a sub or super register of the other.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D86906
This completes the circle, complementing -lto-embed-bitcode
(specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips
function importing. The index file is still needed for the other data it
contains.
Differential Revision: https://reviews.llvm.org/D87949
This patch enables support for building compiler-rt builtins for 32-bit
Power arch on AIX. For now, we leave out the specialized ppc builtin
implementations for 128-bit long double and friends since those will
need some special handling for AIX.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D87383