Passing a Value * to CreateCall has to call getPointerElementType
to find the type of the pointer.
In this case we can rely on the fact that Intrinsic::getDeclaration
returns a Function * and use that version of CreateCall.
Attributor.cpp became quite big and we need to start provide structure.
The Attributor code is now in Attributor.cpp and the classes derived
from AbstractAttribute are in AttributorAttributes.cpp. Minor changes
were required but no intended functional changes.
We also minimized includes as part of this.
Reviewed By: baziotis
Differential Revision: https://reviews.llvm.org/D76873
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: espindola, efriedma, sdesmalen
Reviewed By: efriedma
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77275
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: hfinkel, efriedma, sdesmalen
Reviewed By: efriedma
Subscribers: wuzish, nemanjai, hiraditya, kbarton, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77266
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: sdesmalen, rriddle, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77263
Summary: Pass options are a better choice for various reasons and avoid the need for static constructors.
Differential Revision: https://reviews.llvm.org/D77707
This patch introduces the heat coloring of the Control Flow Graph which is based
on the relative "hotness" of each BB. The patch is a part of sequence of three
patches, related to graphs Heat Coloring.
Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu
Differential Revision: https://reviews.llvm.org/D77161
Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11.
None of the new features of CUDA-10.2+ have been implemented yet, so using these
versions will still produce a warning.
Differential Revision: https://reviews.llvm.org/D77670
Summary:
Preserve call site info for duplicated instructions. We copy over the
call site info in CloneMachineInstrBundle to avoid repeated calls to
copyCallSiteInfo in CloneMachineInstr.
(Alternatively, we could copy call site info higher up the stack, e.g.
into TargetInstrInfo::duplicate, or even into individual backend passes.
However, I don't see how that would be safer or more general than the
current approach.)
Reviewers: aprantl, djtodoro, dstenb
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77685
fuchsia-x86_64-linux builder fails with:
/b/fuchsia-x86_64-linux/llvm.src/llvm/include/llvm/ADT/TinyPtrVector.h:85:15:
error: no matching conversion for C-style cast from 'nullptr_t' to 'llvm::ReachingDef'
RHS.Val = (EltTy)nullptr;
Let's see whether adding an explicit nullptr_t constructor helps.
RDA currently uses SmallVector<int, 1> to store reaching definitions.
A SmallVector<int, 1> is 24 bytes large, and X86 currently has
164 register units, which means we need 3936 bytes per block.
If you have a large function with 1000 blocks, that's already 4MB.
A large fraction of these reg units will not have any reaching defs
(say, those corresponding to zmm registers), and many will have just
one. A TinyPtrVector serves this use-case much better, as it only
needs 8 bytes per register if it has 0 or 1 reaching defs.
As the name implies, TinyPtrVector is designed to work with pointers,
so we need to add some boilerplate to treat our reaching def integers
as pointers, using an appropriate encoding. We need to keep the low
bit free for tagging, and make sure at least one bit is set to
distinguish the null pointer.
Differential Revision: https://reviews.llvm.org/D77513
Any or all the argument registers can be used to pass a byval formal
argument, with the limitation that the argument must fit in the
available registers (ie: is not split between registers and stack).
Differential Revision: https://reviews.llvm.org/D76902
callCapturesBefore always returns ModRef , if UseInst isn't a call. As
we only call it if we already know Mod is set, this only destroys the
Must bit for non-calls.
A perf helper is always only ever cretaed to be checked for validity
then passed as Counter ctor argument, never to be touched again.
Its lifetime should outlive that of the counter, and there is never any
reason to have two different counters of top of the perf helper.
Make sure these assumptions always hold by making the Counter consume the
PerfHelper.
On PowerPC most functions require a valid TOC pointer.
This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.
This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.
Differential Revision: https://reviews.llvm.org/D73664
D72467 updated the shufflevector instruction to include a constant mask
rather than a mask operand. The LangRef text was vague enough to still
make sense, but it is better to update here too, so there's no confusion
about valid mask values. The text here is adapted from the documentation
code comments for "class ShuffleVectorInst".
Differential Revision: https://reviews.llvm.org/D77396
Based on the post-commit comments for rG0f56bbc, there might
be a problem with this transform:
(bitcast (fpext/fptrunc X)) to iX) < 0 --> (bitcast X to iY) < 0
...and the ppc_fp128 data type, so conservatively bypass if we
are bitcasting a ppc_fp128.
We might be able to account for endian or other differences to
enable this for PowerPC again if that is useful.
Differential Revision: https://reviews.llvm.org/D77642