Summary:
This is a companion patch for http://reviews.llvm.org/D16124.
Internalized symbols increase the size of strongly-connected components in
SCC-based module splitting and thus reduce the amount of parallelism. This
patch records the original linkage of non-local symbols prior to
internalization and then restores it just before splitting/CodeGen. This is
also useful for cases where the linker requires symbols to remain external, for
instance, so they can be placed according to linker script rules.
It's currently under its own flag (-restore-globals) but should eventually
share a common flag with D16124.
Reviewers: joker.eph, pcc
Subscribers: slarin, llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D16229
llvm-svn: 258100
Although glibc defines it, this is currently of no use for my primary
use-case (dumping DT_* keys correctly). Its semantic is not described
anywhere I can find, so better leave it out for now.
Thanks to Rafael for pointing out in his post-commit review!
llvm-svn: 258089
Summary:
Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
Subscribers: llvm-commits, joker.eph
Differential Revision: http://reviews.llvm.org/D16124
llvm-svn: 258083
Summary:
When SimplifySetCC sees a setcc node that compares the result of a
value extension operation with a constant, it tries to simplify the
setcc node by eliminating the extension and shrinking the constant.
If shrinking the inputs to setcc is deemed not desirable by the target
(e.g. the target does not want a setcc comparing i1 values), then it
is still possible to optimize this sequence in some cases.
This patch adds the following combines to SimplifySetCC when shrinking setcc
inputs is not desirable:
(setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
(setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc))
There are no tests for this yet, but once AMDGPU correctly implements
TargetLowering::isTypeDesirableForOp(), this new combine will be
exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.
Reviewers: resistor, arsenm
Subscribers: jroelofs, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15034
llvm-svn: 258067
The symtab is logically referenced beyond the call to the create
method. This changes makes sure its lifetime matches that of
the reader.
llvm-svn: 258036
Entry block count was not counted and is corrected. Also
introduce a new metric that is MaxInternalBlockCount which
show command shows (as before).
llvm-svn: 257987
This is part of a new statistics gathering feature for the sanitizers.
See clang/docs/SanitizerStats.rst for further info and docs.
Differential Revision: http://reviews.llvm.org/D16174
llvm-svn: 257970
WebAssembly's stack will never be executable by default, so it isn't
necessary to declare .note.GNU-stack sections to request a non-executable
stack.
Differential Revision: http://reviews.llvm.org/D15969
llvm-svn: 257962
In the optimizer (GVN etc.) when eliminating redundant nodes with different
flags, the flags are ignored for the purposes of testing for congruence, and
then intersected for the purposes of producing a result that supports the union
of all the uses. This commit makes SelectionDAG's CSE do the same thing,
allowing it to CSE nodes in more cases. This fixes PR26063.
Differential Revision: http://reviews.llvm.org/D15957
llvm-svn: 257940
Summary:
Rename to getCatchSwitchParentPad, to make it more clear which ancestor
the "parent" in question is. Add a comment pointing out the key feature
that the returned pad indicates which funclet contains the successor
block.
Reviewers: rnk, andrew.w.kaylor, majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16222
llvm-svn: 257933
These casts were from function pointer to data pointer type, which some
compilers (including GCC) may warn about. In all cases where these casts were
used the original value was still available as a TargetAddress (uint64_t), so
we can just print a formatted version of that instead.
llvm-svn: 257932
The file and buffer writer code are mostly shared except for the
stream back-patching. This is because raw_string_ostream does not
support seek like interface. The result is that the data patching
code needs to be pushed to the caller which is not quite readable
(passing around offset, value etc). This also makes future enhancement
(which needs more patching) more difficult (and can make impl messy).
In this patch, two types of streams needed by the writer are now
unified with same set of interfaces under ProfOStream class. The patch
method is added so that common implementation becomes cleaner. It
also enables future enhancement. Should be NFC.
llvm-svn: 257921
This reverts commit r257751, bringing back r256105.
The problem the assert found was fixed in r257915.
Original commit message:
Assert that we have all use/users in the getters.
An error that is pretty easy to make is to use the lazy bitcode reader
and then do something like
if (V.use_empty())
The problem is that uses in unmaterialized functions are not accounted
for.
This patch adds asserts that all uses are known.
llvm-svn: 257920
# The first commit's message is:
Revert "[ARM] Add DSP build attribute and extension targeting"
This reverts commit b11cc50c0b4a7c8cdb628abc50b7dc226ff583dc.
# This is the 2nd commit message:
Revert "[ARM] Add new system registers to ARMv8-M Baseline/Mainline"
This reverts commit 837d08454e3e5beb8581951ac26b22fa07df3cd5.
llvm-svn: 257916
Added 2 constants:
DT_TLSDESC_PLT = 0x6FFFFEF6, Location of PLT entry for TLS descriptor resolver calls.
DT_TLSDESC_GOT = 0x6FFFFEF7, Location of GOT entry used by TLS descriptor resolver PLT entry.
Constants were taken from "Thread-Local Storage Descriptors for IA32 and AMD64/EM64T Version 0.9.5" http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt
Differential revision: http://reviews.llvm.org/D16185
llvm-svn: 257911
platforms.
With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.
This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311
(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)
Differential Revision: http://reviews.llvm.org/D16145
llvm-svn: 257902
There are several requirements that ended up with this design;
1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
2. Bitreversals and byteswaps are very related in their matching logic.
3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
4. Bswaps are best matched early in InstCombine.
The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.
We can then extend the matching logic in one place only.
llvm-svn: 257875
This method has no callers.
Also remove X86ELFRelocationInfo.cpp and X86MachORelocationInfo.cpp
which only existed to provide an implementation of that method.
Ok'd by Rafael and Jim.
llvm-svn: 257859
classes.
OrcRemoteTargetClient::RCMemoryManager will now register EH frames with the
server automatically. This allows remote-execution of code that uses exceptions.
llvm-svn: 257816
Rounding up an integer m to a nearest multiple of n where n is a power
of 2 is used very often if you are writing code to emit binary files.
RoundUpToAlignment is a small function to do that. But we found that the
function has a small but annoying issue; the name is a bit too long.
Because it is used quite often, that hurts readability.
This patch is to rename the function. The original name is kept as a
forwarder, so that submitting this patch won't immediately break Clang
and other LLVM projects. Once I update all occurrences of RoundUpToAlignment,
I'll remove the old name entirely.
http://reviews.llvm.org/D16162
llvm-svn: 257799