- In certain cases, a generic pointer could be assumed as a pointer to
the global memory space or other spaces. With a dedicated target hook
to query that address space from a given value, infer-address-space
pass could infer and propagate that to all its users.
Differential Revision: https://reviews.llvm.org/D91121
When processing conditional branches, if the condition is an OR of 2 compares
and the false successor only has the current block as predecessor, queue both
negated conditions for the false successor
In the existing logic, for a given alloca, as long as its pointer value is stored into another location, it's considered as escaped.
This is a bit too conservative. Specifically, in non-optimized build mode, it's often to have patterns of code that first store an alloca somewhere and then load it right away.
These used should be handled without conservatively marking them escaped.
This patch tracks how the memory location where an alloca pointer is stored into is being used. As long as we only try to load from that location and nothing else, we can still
consider the original alloca not escaping and keep it on the stack instead of putting it on the frame.
Differential Revision: https://reviews.llvm.org/D91305
This patch adds a new pass to add !annotation metadata for entries in
@llvm.global.anotations, which is generated using
__attribute__((annotate("_name"))) on functions in Clang.
This has been discussed on llvm-dev as part of
RFC: Combining Annotation Metadata and Remarks
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D91195
Widen the IV to the widest available and legal integer type, which makes this
transformations always safe so that we can skip overflow checks.
Motivation is to let this pass trigger on 64-bit targets too, and this is the
last patch in a serie to achieve this: D90402 moves pass LoopFlatten to just
before IndVarSimplify so that IVs are not already widened, D90421 factors out
widening from IndVarSimplify into Utils/SimplifyIndVar so that we can also use
it in LoopFlatten.
Differential Revision: https://reviews.llvm.org/D90640
This patch teaches the jump threading pass to call BPI->eraseBlock
when it folds a conditional branch.
Without this patch, BranchProbabilityInfo could end up with stale edge
probabilities for the basic block containing the conditional branch --
one edge probability with less than 1.0 and the other for a removed
edge.
This patch is one of the steps before we can safely re-apply D91017.
Differential Revision: https://reviews.llvm.org/D91511
In the last change to IRCE the BPI is ignored if BFI is present, however
BFI and BPI have a different thresholds. Specifically BPI approach checks only
latch exit probability so it is expected if the loop has only one exit block (latch)
the behavior with BFI and BPI should be the same,
BPI approach by default uses threshold 10, so it considers the loop with estimated
number of iterations less then 10 should not be considered for IRCE optimization.
BFI approach uses the default value 3 and this is inconsistent.
The CL modifies the code to use the same threshold for both approaches..
The test is updated due to it has two side-exits (except latch) and each of them has a
probability 1/16, so BFI estimates the number of runtime iteration is about to 7
(1/16 + 1/16 + some for latch) and test fails.
Reviewers: mkazantsev, ebrevnov
Reviewed By: mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D91230
There are 1-2 potential follow-up NFC commits to reduce
this further on the way to generalizing this for vectors.
The operand replacing path should be dead code because demanded
bits handles that more generally (D91415).
I noticed an add example like the one from D91343, so here's a similar patch.
The logic is based on existing code for the single-use demanded bits fold.
But I only matched a constant instead of using compute known bits on the
operands because that was the motivating patterni that I noticed.
I think this will allow removing a special-case (but incomplete) dedicated
fold within visitAnd(), but I need to untangle the existing code to be sure.
https://rise4fun.com/Alive/V6fP
Name: add with low mask
Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0
%a = add i8 %x, C1
%r = and i8 %a, C2
=>
%r = and i8 %x, C2
Differential Revision: https://reviews.llvm.org/D91415
This patch turns VPWidenGEPRecipe into a VPValue and uses it
during VPlan construction and codegeneration instead of the plain IR
reference where possible.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D84683
This is used to test RemoveRedundantDbgInstrs(), which is used by other
passes.
Reviewed By: ychen
Differential Revision: https://reviews.llvm.org/D91477
See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485
the implementation is known-broken for certain inputs,
the bugreport was up for a significant amount of timer,
and there has been no activity to address it.
Therefore, just completely rip out all of misexpect handling.
I suspect, fixing it requires redesigning the internals of MD_misexpect.
Should anyone commit to fixing the implementation problem,
starting from clean slate may be better anyways.
This reverts commit 7bdad08429,
and some of it's follow-ups, that don't stand on their own.
dependency. NFC
Use findSingleDependency in place of FindDependencies and stop passing a
set of Instructions around. Modify FindDependencies to return a boolean
flag which indicates whether the dependencies it has found are all
valid.
Like inlineCallIfPossible and InlinerPass, after inlining mergeAttributesForInlining
should be called to merge callee's attributes to caller. But it is not called in
AlwaysInliner, causes caller's attributes inconsistent with inlined code.
Attached test case demonstrates that attribute "min-legal-vector-width"="512" is
not merged into caller without this patch, and it causes failure in SelectionDAG
when lowering the inlined AVX512 intrinsic.
Differential Revision: https://reviews.llvm.org/D91446
Handle the emission of the add in a single place, instead of three
different ones.
Don't emit an unnecessary add with zero to start with. It will get
dropped by InstCombine, but we may as well not create it in the
first place. This also means that InstCombine does not need to
specially handle this extra add.
This is conceptually NFC, but can affect worklist order etc.
Use exact component name in add_ocaml_library.
Make expand_topologically compatible with new architecture.
Fix quoting in is_llvm_target_library.
Fix LLVMipo component name.
Write release note.
This patch adds a new !annotation metadata kind which can be used to
attach annotation strings to instructions.
It also adds a new pass that emits summary remarks per function with the
counts for each annotation kind.
The intended uses cases for this new metadata is annotating
'interesting' instructions and the remarks should provide additional
insight into transformations applied to a program.
To motivate this, consider these specific questions we would like to get answered:
* How many stores added for automatic variable initialization remain after optimizations? Where are they?
* How many runtime checks inserted by a frontend could be eliminated? Where are the ones that did not get eliminated?
Discussed on llvm-dev as part of 'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)
Reviewed By: thegameg, jdoerfert
Differential Revision: https://reviews.llvm.org/D91188
No longer rely on an external tool to build the llvm component layout.
Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.
These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.
Differential Revision: https://reviews.llvm.org/D90848
If we cannot prove that the check is trivially true, but can prove that it either
fails on the 1st iteration or never fails, we can replace it with first iteration check.
Differential Revision: https://reviews.llvm.org/D88527
Reviewed By: skatkov
There might be some demanded/known bits way to generalize this,
but I'm not seeing it right now.
This came up as a regression when I was looking at a different
demanded bits improvement.
https://rise4fun.com/Alive/5fl
Name: general
Pre: ((-1 << countTrailingZeros(C1)) & C2) == 0
%a1 = add i8 %x, C1
%a2 = and i8 %x, C2
%r = sub i8 %a1, %a2
=>
%r = and i8 %a1, ~C2
Name: test 1
%a1 = add i8 %x, 192
%a2 = and i8 %x, 10
%r = sub i8 %a1, %a2
=>
%r = and i8 %a1, -11
Name: test 2
%a1 = add i8 %x, -108
%a2 = and i8 %x, 3
%r = sub i8 %a1, %a2
=>
%r = and i8 %a1, -4
Summary:
Refactor SinkAdHoistLICMFlags from a struct to a class with accessors and constructors to allow other
classes to construct flags with meaningful defaults while not exposing LICM internal details.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: asbirlea (Alina Sbirlea)
Differential Revision: https://reviews.llvm.org/D90482
Sometimes the an instruction we are trying to widen is used by the IV
(which means the instruction is the IV increment). Currently this may
prevent its widening. We should ignore such user because it will be
dead once the transform is done anyways.
Differential Revision: https://reviews.llvm.org/D90920
Reviewed By: fhahn
In the existing logic, for a given alloca, as long as its pointer value is stored into another location, it's considered as escaped.
This is a bit too conservative. Specifically, in non-optimized build mode, it's often to have patterns of code that first store an alloca somewhere and then load it right away.
These used should be handled without conservatively marking them escaped.
This patch tracks how the memory location where an alloca pointer is stored into is being used. As long as we only try to load from that location and nothing else, we can still
consider the original alloca not escaping and keep it on the stack instead of putting it on the frame.
Differential Revision: https://reviews.llvm.org/D91305
InstCombine canonicalizes 'sub nuw' instructions to 'add' without the
`nuw` flag. The typical case where we see it is decrementing induction
variables. For them, IndVars fails to prove that it's legal to widen them,
and inserts unprofitable `zext`'s.
This patch adds recognition of such pattern using SCEV.
Differential Revision: https://reviews.llvm.org/D89550
Reviewed By: fhahn, skatkov