Currently all images are lowered to have a single
image PseudoSourceValue. Image stores happen to have
overly strict mayLoad/mayStore/hasSideEffects flags
set on them, so this happens to work. When these
are fixed to be correct, the scheduler breaks
this because the identical PSVs are assumed to
be the same address. These need to be unique
to the image resource value.
llvm-svn: 321555
Summary:
New pass manager driver passes DebugPM (-debug-pass-manager) flag into
individual PassManager constructors in order to enable debug logging.
FunctionToLoopPassAdaptor has its own internal LoopCanonicalizationPM
which never gets its debug logging enabled and that means canonicalization
passes like LoopSimplify are never present in -debug-pass-manager output.
Extending FunctionToLoopPassAdaptor's constructor and
createFunctionToLoopPassAdaptor wrapper with an optional
boolean DebugLogging argument.
Passing debug-logging flags there as appropriate.
Reviewers: chandlerc, davide
Reviewed By: davide
Subscribers: mehdi_amini, eraman, llvm-commits, JDevlieghere
Differential Revision: https://reviews.llvm.org/D41586
llvm-svn: 321548
Many of the callers don't guarantee there is no conflict before calling these and instead check for conflicts later.
The makeNegative/makeNonNegative methods replaced Known.One.setSignBit() and Known.Zero.setSignBit() calls that didn't have asserts originally. So removing the asserts is no worse than the original code.
Fixes PR35769
llvm-svn: 321539
This reverts r321138. It seems there are still underlying issues with
memdep. PR35519 seems to still be present if debug info is enabled. We
end up losing a memcpy. Somehow during store to memset merging, we
insert the memset after the memcpy or fail to update the memdep analysis
to account for the newly inserted memset of a pair.
Reduced test case:
#include <assert.h>
#include <stdio.h>
#include <string>
#include <utility>
#include <vector>
void do_push_back(
std::vector<std::pair<std::string, std::vector<std::string>>>* crls) {
crls->push_back(std::make_pair(std::string(), std::vector<std::string>()));
}
int __attribute__((optnone)) main() {
// Put some data in the vector and then remove it so we take the push_back
// fast path.
std::vector<std::pair<std::string, std::vector<std::string>>> crl_set;
crl_set.push_back({"asdf", {}});
crl_set.pop_back();
printf("first word in vector storage: %p\n", *(void**)crl_set.data());
// Do the push_back which may fail to initialize the data.
do_push_back(&crl_set);
auto* first = &crl_set.back().first;
printf("first word in vector storage (should be zero): %p\n",
*(void**)crl_set.data());
assert(first->empty());
puts("ok");
}
Compile with libc++, enable optimizations, and enable debug info:
$ clang++ -stdlib=libc++ -g -O2 t.cpp -o t.exe -Wl,-rpath=llvm/build/lib
This program will assert with this change.
llvm-svn: 321510
If after if-conversion, most of the instructions in this new BB construct a long and slow dependence chain, it may be slower than cmp/branch, even if the branch has a high miss rate, because the control dependence is transformed into data dependence, and control dependence can be speculated, and thus, the second part can execute in parallel with the first part on modern OOO processor.
This patch checks for the long dependence chain, and give up if-conversion if find one.
Differential Revision: https://reviews.llvm.org/D39352
llvm-svn: 321377
In https://reviews.llvm.org/rL321077 and https://reviews.llvm.org/D41231 I fixed a regression in the c-api which prevented the pruning from being *effectively* disabled.
However this approach, helpfully recommended by @labath, is cleaner.
It is also nice to remove the weasel words about effectively disabling from the api comments.
Differential Revision: https://reviews.llvm.org/D41497
llvm-svn: 321376
Re-land r321234. It had to be reverted because it broke the shared
library build. The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).
Original commit message:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
llvm-svn: 321375
inlined subroutines for a given address.
This is essentially the hot path of llvm-symbolizer when extracting
inlined frames during symbolization. Previously, we would read every
subprogram and every inlined subroutine, building a std::map across the
entire PC space to the best DIE, and then do only a handful of queries
as we symbolized a backtrace. A huge fraction of the time was spent
building the map itself.
This patch changes it two a two-level system. First, we just build a map
from PC-interval to DWARF subprograms. These are required to be disjoint
and so constructing this is pretty easy. Second, we build a map *just*
for the inlined subroutines within the subprogram containing the query
address. This allows us to look at far fewer DIEs and build a *much*
smaller set of cached maps in the llvm-symbolizer case where only a few
address get symbolized during the entire run.
It also builds both interval maps in a very different way. It constructs
a single flat vector of pairs that maps from offset -> index. The
indices point into collections of DIE objects, but can also be
"tombstones" (-1) to mark gaps. In the case of subprograms, this mostly
just simplifies the data structure a bit. For inlined subroutines,
because we carefully split them as we build the map, we end up in many
cases having no holes and not having to store both start and stop
offsets.
Finally, the PC ranges for the inlined subroutines are compressed into
32-bits by making them relative to the base PC of the outer subprogram.
This means that if you have a single function body with over 2gb of
executable code in it, we will stop mapping address past the first 2gb
of that function into inlined subroutines and just give you the
subprogram. This doesn't seem like a problem. ;]
All of this combines to make llvm-symbolizer *well* over 2x faster for
symbolizing backtraces out of LLVM's unittests. Death-test heavy unit
tests are running >2x faster. I'm still going to look at completely
disabling symbolization there, but figured while I had a good benchmark
we should make symbolization a bit better.
Sadly, the logic to build the flat interval map for the inlined
subroutines is fairly complex. I'm not super happy about this and
welcome any simplifying suggestions.
Huge thanks to Dave Blaikie who helped walk me through what the various
things I needed to do in DWARF to make this work.
Differential Revision: https://reviews.llvm.org/D40987
llvm-svn: 321345
The penalty is currently getting applied in a bunch of places where it
doesn't make sense, like bitcasts (which are free) and calls (which
were getting the call penalty applied twice). Instead, just apply the
penalty to binary operators and floating-point casts.
While I'm here, also fix getFPOpCost() to do the right thing in more
cases, so we don't have to dig into function attributes.
Differential Revision: https://reviews.llvm.org/D41522
llvm-svn: 321332
Summary:
This replaces calls to getEntryCount().hasValue() with hasProfileData
that does the same thing. This refactoring is useful to do before adding
synthetic function entry counts but also a useful cleanup IMO even
otherwise. I have used hasProfileData instead of hasRealProfileData as
David had earlier suggested since I think profile implies "real" and I
use the phrase "synthetic entry count" and not "synthetic profile count"
but I am fine calling it hasRealProfileData if you prefer.
Reviewers: davidxl, silvas
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41461
llvm-svn: 321331
Summary:
Add an additional bit to ModRefInfo, ModRefInfo::Must, to be cleared for known must aliases.
Shift existing Mod/Ref/ModRef values to include an additional most
significant bit. Update wrappers that modify ModRefInfo values to
reflect the change.
Notes:
* ModRefInfo::Must is almost entirely cleared in the AAResults methods, the remaining changes are trying to preserve it.
* Only some small changes to make custom AA passes set ModRefInfo::Must (BasicAA).
* GlobalsModRef already declares a bit, who's meaning overlaps with the most significant bit in ModRefInfo (MayReadAnyGlobal). No changes to shift the value of MayReadAnyGlobal (see AlignedMap). FunctionInfo.getModRef() ajusts most significant bit so correctness is preserved, but the Must info is lost.
* There are cases where the ModRefInfo::Must is not set, e.g. 2 calls that only read will return ModRefInfo::NoModRef, though they may read from exactly the same location.
Reviewers: dberlin, hfinkel, george.burgess.iv
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38862
llvm-svn: 321309
Reorganizes the DWARF consumer to derive the string offsets table
contribution's format from the contribution header instead of
(incorrectly) from the unit's format.
Reviewers: JDevliegehere, aprantl
Differential Revision: https://reviews.llvm.org/D41146
llvm-svn: 321295
There is nothing useful that can be done with a read-only uninitialized
buffer without const_casting its contents to initialize it. A better
solution is to obtain a writable buffer
(WritableMemoryBuffer::getNewUninitMemBuffer), and then convert it to a
read-only buffer after initialization. All callers of this function have
already been updated to do this, so this function is now unused.
llvm-svn: 321257
This is the only wasm def (and likely likely will be
for the foreseeable) file so no need for a sub-directory
Differential Revision: https://reviews.llvm.org/D41476
llvm-svn: 321246
When weak aliases are used with in same translation
unit we need to be able to directly reference to alias
and not just the thing it is aliases. We do this by
defining both a wasm import and a wasm export in this
case that result in a single Symbol. This change is
a partial revert of rL314245. A corresponding lld
change address the previous issues we had with this.
See: https://github.com/WebAssembly/tool-conventions/issues/34
Differential Revision: https://reviews.llvm.org/D41472
llvm-svn: 321242
Summary:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
llvm-svn: 321234
Allows preserving MachineMemOperands on intrinsics
through selection. For reasons I don't understand, this
is a static property of the pattern and the selector
deliberately goes out of its way to drop if not present.
Intrinsics already inherit from SDPatternOperator allowing
them to be used directly in instruction patterns. SDPatternOperator
has a list of SDNodeProperty, but you currently can't set them on
the intrinsic. Without SDNPMemOperand, when the node is selected
any memory operands are always dropped. Allowing setting this
on the intrinsics avoids needing to introduce another equivalent
target node just to have SDNPMemOperand set.
llvm-svn: 321212
This patch modifies the indirect call promotion utilities by exposing and using
an unconditional call promotion interface. The unconditional promotion
interface (i.e., call promotion without creating an if-then-else) can be used
if it's known that an indirect call has only one possible callee. The existing
conditional promotion interface uses this unconditional interface to promote an
indirect call after it has been versioned and placed within the "then" block.
A consequence of unconditional promotion is that the fix-up operations for phi
nodes in the normal destination of invoke instructions are changed. This is
necessary because the existing implementation assumed that an invoke had been
versioned, creating a "merge" block where a return value bitcast could be
placed. In the new implementation, the edge between a promoted invoke's parent
block and its normal destination is split if needed to add a bitcast for the
return value. If the invoke is also versioned, the phi node merging the return
value of the promoted and original invoke instructions is placed in the "merge"
block.
Differential Revision: https://reviews.llvm.org/D40751
llvm-svn: 321210
Summary: Very similar to AddressSanitizer, with the exception of the error type encoding.
Reviewers: kcc, alekseyshl
Subscribers: cfe-commits, kubamracek, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D41417
llvm-svn: 321203
Summary:
The function section prefix for PGO based layout (e.g. hot/unlikely)
should look at the hotness of all blocks not just the entry BB.
A function with a cold entry but a very hot loop should be placed in the
hot section, for example, so that it is located close to other hot
functions it may call. For SamplePGO it was already looking at the
branch weights on calls, and I made that code conditional on whether
this is SamplePGO since it was essentially a noop for instrumentation
PGO anyway.
Reviewers: davidxl
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D41395
llvm-svn: 321197
This commits the non-controversial part of https://reviews.llvm.org/D41029
(making the queries virtual). The PPC-specific portion of this will be
committed in a subsequent patch once some of the finer points are ironed out.
llvm-svn: 321182
These functions simply call their counterparts in the associated SDNode,
which do take an optional SelectionDAG. This change makes the legalization
debug trace a little easier to read, since target-specific nodes will
now have their names shown instead of "Unknown node #123".
llvm-svn: 321180
Summary:
Initial changes in interfaces of MCAsmStreamer/MCTargetStreamer for
correct debug info emission for Cuda.
1. PTX foramt does not support `.ascii` directives. Added the ability to
nullify it.
2. The initial function label must follow the first debug `.loc`
directive, not be followed by.
3. DWARF sections must be enclosed in braces.
Reviewers: hfinkel, probinson, jlebar, rafael, echristo
Subscribers: sdardis, nemanjai, llvm-commits, aprantl
Differential Revision: https://reviews.llvm.org/D40033
llvm-svn: 321178
NFC for currently supported targets. This resolves a problem encountered by
targets such as RISCV that reference `Subtarget` in ImmLeaf predicates.
llvm-svn: 321176
This teaches memcpyopt to make a non-local memdep query when a local query
indicates that the dependency is non-local. This notably allows it to
eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.
This is r319482 and r319483, along with fixes for PR35519: fix the
optimization that merges stores into memsets to preserve cached memdep
info, and fix memdep's non-local caching strategy to not assume that larger
queries are always more conservative than smaller ones.
Fixes PR28958 and PR35519.
Differential Revision: https://reviews.llvm.org/D40802
llvm-svn: 321138
Work towards the unification of MIR and debug output by refactoring the
interfaces.
Before this patch we printed "<call frame instruction>" in the debug
output.
llvm-svn: 321084
Negative values never disabled the pruning - they simply set high values for the pruning interval.
The behaviour now is that negative values set the maximum pruning interval (which appears to have been the intention from the start) see https://reviews.llvm.org/D41231.
I have adjusted the comments to reflect this, removed any inaccurate statements, and corrected any typos I spotted in the English.
Differential Revision: https://reviews.llvm.org/D41279
llvm-svn: 321078
borked by: rL284966 (see: https://reviews.llvm.org/D25730).
Previously, Interval was unsigned (see: CachePruning.h), replacing the type with std::chrono::seconds (which is signed) causes a regression in behaviour because the c-api intends negative values to translate to large positive intervals to *effectively* disable the pruning (see comments on: setCachePruningInterval()).
Differential Revision: https://reviews.llvm.org/D41231
llvm-svn: 321077
Summary:
The motivation here is LLDB, where we need to fixup relocations in
mmapped files before their contents can be read correctly. The
MemoryBuffer class does exactly what we need, *except* that it maps the
file in read-only mode.
WritableMemoryBuffer reuses the existing machinery for opening and
mmapping a file. The only difference is in the argument to the
mapped_file_region constructor -- we create a private copy-on-write
mapping, so that we can make changes to the mapped data, but the changes
aren't carried over to the underlying file.
This patch is based on an initial version by Zachary Turner.
Reviewers: mehdi_amini, rnk, rafael, dblaikie, zturner
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40291
llvm-svn: 321071
Note:
- X86ISelLowering: setLibcallName(SINCOS) was superfluous as
InitLibcalls() already does it.
- ARMISelLowering: Setting libcallnames for sincos/sincosf seemed
superfluous as in the darwin case it wouldn't be used while for all
other cases InitLibcalls already does it.
llvm-svn: 321036
Adds missing support for DW_FORM_data16.
Update of r320852/r320886, fixing the unittest again, this time use a
raw char string for the test data.
Differential Revision: https://reviews.llvm.org/D41090
llvm-svn: 321011