Some use cases are appearing where salvaging is needed that does not
correspond to an instruction being deleted -- for example an instruction
being sunk, or a Value not being available in a block being isel'd.
Enable more fine-grained control over how salvaging occurs by splitting
the logic into helper functions, separating things that are specific to
working on DbgVariableIntrinsics from those specific to interpreting IR
and building DIExpressions.
Differential Revision: https://reviews.llvm.org/D57696
llvm-svn: 353156
When LSR first adds SCEVs to BaseRegs, it only does it if `isZero()` has
returned false. Finally, in the invocation of `InsertFormula`, it asserts that
all values there are still not zero constants. However, between these two
points, it makes some transformations, in particular extending them to a wider
type.
SCEV does not guarantee that if `S` is not a constant zero, then
`sext(S)` is also not a constant zero. It might have missed some optimizing
transforms while calculating `S` and then applied them when it took `sext`.
For example, this may happen if the earlier optimizing transforms were
limited by depth or in some other way.
This patch adds a bailout when we may end up with a zero SCEV after extension.
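A minimal sketch of the bailout, assuming a ScalarEvolution `SE`, a SCEV `S`
that was non-zero before widening, and a wider type `WideTy` (the surrounding
LSR plumbing is omitted):
```
// After widening, the extended SCEV may fold to a constant zero even though
// the original S did not; bail out rather than trip the InsertFormula assert.
const SCEV *Ext = SE.getSignExtendExpr(S, WideTy);
if (Ext->isZero())
  return; // hypothetical: skip this candidate instead of asserting later
```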
Differential Revision: https://reviews.llvm.org/D57565
Reviewed By: samparker
llvm-svn: 353136
Summary:
When attaching prof metadata to promoted direct calls in SamplePGO
mode, there is no need to construct a SmallVector just to pass a single count
to the ArrayRef parameter; we can simply use a brace-enclosed init list.
This made a small but consistent improvement for a ThinLTO backend
compile I was measuring.
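A hedged sketch of the shape of the change; `updateProfMetadata` and `Count`
are illustrative names, not the actual code:
```
// Before: materializing a one-element SmallVector just to form an ArrayRef.
//   SmallVector<uint64_t, 1> Weights;
//   Weights.push_back(Count);
//   updateProfMetadata(CI, Weights);
// After: a brace-enclosed init list converts directly to ArrayRef<uint64_t>.
updateProfMetadata(CI, {Count});
```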
Reviewers: wmi
Subscribers: mehdi_amini, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57706
llvm-svn: 353123
Summary:
If the user declares or defines `__sancov_lowest_stack` with an
unexpected type, then `getOrInsertGlobal` inserts a bitcast and the
following cast fails:
```
Constant *SanCovLowestStackConstant =
    M.getOrInsertGlobal(SanCovLowestStackName, IntptrTy);
SanCovLowestStack = cast<GlobalVariable>(SanCovLowestStackConstant);
```
This variable is a SanitizerCoverage implementation detail and the user
should generally never need to access it, so we now emit an error in
that case.
rdar://problem/44143130
Reviewers: morehouse
Differential Revision: https://reviews.llvm.org/D57633
llvm-svn: 353100
Summary:
The fix added in r352904 is not quite correct, or rather misleading:
1. When the texfailctrl (TFC) argument was non-constant, the fix assumed
non-TFE/LWE, which is incorrect.
2. Regardless, this code path cannot even be hit for correct
TFE/LWE-enabled calls, because those return a struct. Added
a test case for those for completeness.
Change-Id: I92d314dbc67a2670f6d7adaab765ef45f56a49cf
Reviewers: hliao, dstuttard, arsenm
Subscribers: kzhuravl, jvesely, wdng, yaxunl, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57681
llvm-svn: 353097
LoopVectorize adds llvm.loop.isvectorized but leaves
llvm.loop.vectorize.enable in place. Do not consider such a loop for
user-forced vectorization, since vectorization already happened -- by
prioritizing llvm.loop.isvectorized except for TM_SuppressedByUser.
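A minimal sketch of the new priority, assuming the `getBooleanLoopAttribute`
helper and `TransformationMode` values from LoopUtils (the exact surrounding
logic is illustrative):
```
// A loop already marked isvectorized is reported as not-to-be-vectorized,
// unless the user explicitly suppressed vectorization, in which case
// TM_SuppressedByUser takes priority over this check.
if (getBooleanLoopAttribute(L, "llvm.loop.isvectorized"))
  return TM_Disable;
```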
Fixes http://llvm.org/PR40546
Differential Revision: https://reviews.llvm.org/D57542
llvm-svn: 353082
This assertion makes sure all sub-loops are in LCSSA form before
bringing their parent loop into LCSSA form. This precondition was added to
formLCSSA in D56848.
Reviewers: davide, efriedma, mzolotukhin
Reviewed By: davide
Differential Revision: https://reviews.llvm.org/D56921
llvm-svn: 352958
Summary:
Currently, ASan inserts a call to `__asan_handle_no_return` before every
`noreturn` function call/invoke. This is unnecessary for calls to other
runtime functions. This patch changes ASan to skip instrumentation for
function calls marked with `!nosanitize` metadata.
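A minimal sketch of the check, assuming the instrumentation loop has a
`CallInst &CI` in hand; the helper name is illustrative:
```
// Runtime calls emitted by the sanitizers are tagged with !nosanitize, so
// skipping calls that carry this metadata also skips runtime calls.
static bool hasNoSanitizeMetadata(const CallInst &CI) {
  return CI.getMetadata("nosanitize") != nullptr;
}
```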
Reviewers: TODO
Differential Revision: https://reviews.llvm.org/D57489
llvm-svn: 352948
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.
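A hedged sketch of the API shape, assuming an IRBuilder `Builder`, an element
type `Int8Ty`, and operands `Ptr`/`Idx`:
```
// Previously the element type was derived from Ptr's pointee type:
//   Value *GEP = Builder.CreateGEP(Ptr, Idx);
// After this change, the value type is passed explicitly:
Value *GEP = Builder.CreateGEP(Int8Ty, Ptr, Idx);
```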
Differential Revision: https://reviews.llvm.org/D57173
llvm-svn: 352913
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57172
llvm-svn: 352911
This cleans up all InvokeInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.
Differential Revision: https://reviews.llvm.org/D57171
llvm-svn: 352910
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.
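A similar hedged sketch for calls, assuming a `FunctionType *FnTy`, a callee
`Callee`, and an argument list `Args`:
```
// Previously the function type was derived from Callee's pointer type:
//   CallInst *CI = Builder.CreateCall(Callee, Args);
// After this change, it is passed explicitly:
CallInst *CI = Builder.CreateCall(FnTy, Callee, Args);
```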
Differential Revision: https://reviews.llvm.org/D57170
llvm-svn: 352909
An unused variable problem was introduced with rL352870
and stubbed out with rL352871, but we can make a better
fix by actually using the local variable in code rather
than only in the assert.
llvm-svn: 352873
If we can reduce the x86-specific intrinsic to the generic op, it allows existing
simplifications and value tracking folds. AFAICT, this always results in identical
x86 codegen in the non-reduced case...which should be true because we semi-generically
(too aggressively IMO) convert to llvm.uadd.with.overflow in CGP, so the DAG/isel must
already combine/lower this intrinsic as expected.
This isn't quite what was requested in:
https://bugs.llvm.org/show_bug.cgi?id=40486
...but we want to have these kinds of folds early for efficiency and to enable greater
simplifications. For the case in the bug report where we have:
_addcarry_u64(0, ahi, 0, &ahi)
...this gets completely simplified away in IR.
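A hedged sketch of the reduction, assuming InstCombine context with add
operands `X`/`Y` and carry-in `CarryIn`; the repackaging of the result is
omitted:
```
// With a zero carry-in, x86 addcarry is just an unsigned add-with-overflow,
// so rewrite it to the generic intrinsic and let existing folds fire.
if (match(CarryIn, m_ZeroInt())) {
  Value *UAdd = Builder.CreateIntrinsic(Intrinsic::uadd_with_overflow,
                                        {X->getType()}, {X, Y});
  // ...extract and repackage the {result, carry-out} pair as needed.
}
```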
Differential Revision: https://reviews.llvm.org/D57453
llvm-svn: 352870
InlineCost's isInlineViable() is changed to return InlineResult
instead of bool. This provides messages for failure reasons and
makes it possible to get more specific messages for cases where
callsites are not viable for inlining.
Reviewed By: xbolva00, anemet
Differential Revision: https://reviews.llvm.org/D57089
llvm-svn: 352849
Indices are checked as they are generated. No need to fill the whole array of indices.
Differential Revision: https://reviews.llvm.org/D57144
llvm-svn: 352839
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.
Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaration with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
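A hedged sketch of the resulting usage, assuming a module `M`, a
`FunctionType *FnTy`, and arguments `Args`; the function name is hypothetical:
```
// getOrInsertFunction now returns a {FunctionType*, Value*} pair, so the
// call site no longer derives the type from the callee pointer.
FunctionCallee Fn = M.getOrInsertFunction("my_runtime_hook", FnTy);
CallInst *Call = Builder.CreateCall(Fn, Args);
```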
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352827
This reverts commit f47d6b38c7 (r352791).
Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.
llvm-svn: 352800
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.
Then:
- update the CallInst/InvokeInst instruction creation functions to
take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.
One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaration with
a mismatching signature.
However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)
Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.
Differential Revision: https://reviews.llvm.org/D57315
llvm-svn: 352791
Summary:
COFF requires that the COMDAT name match that of the leader. When we
promote and rename an internal leader in ThinLTO due to an import, ensure
we subsequently rename the associated COMDAT. This is similar to D31963,
which did this during ThinLTO module splitting.
Fixes PR40414.
Reviewers: pcc, inglorion
Subscribers: mehdi_amini, dexonsmith, dmajor, llvm-commits
Differential Revision: https://reviews.llvm.org/D57395
llvm-svn: 352763
Introduces a pass that provides a default lowering strategy for the
`experimental.widenable.condition` intrinsic, replacing all its uses with
`i1 true`.
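The core of such a lowering is tiny; a minimal sketch, assuming `CI` is a
call to the intrinsic:
```
// Default lowering: the widenable condition collapses to the constant true.
CI->replaceAllUsesWith(ConstantInt::getTrue(CI->getContext()));
CI->eraseFromParent();
```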
Differential Revision: https://reviews.llvm.org/D56096
Reviewed By: reames
llvm-svn: 352739
This is meant to be used with clang's __builtin_dynamic_object_size.
When 'true' is passed for the new 'dynamic' parameter of the objectsize
intrinsic, the intrinsic has the potential to be folded into instructions
that will be evaluated at run time. When 'false', the objectsize
intrinsic's behaviour is unchanged.
rdar://32212419
Differential revision: https://reviews.llvm.org/D56761
llvm-svn: 352664
The point is that this simplifies integration of new intrinsics into
SimplifyDemandedVectorElts and ensures we don't miss any existing ones.
This is intended to be NFC-ish but, as seen from the diffs, can produce
slightly different output. This is due to the order of transforms within
instcombine resulting in two slightly different fixed points. That's
something we should fix, but it isn't a problem with this patch per se.
Differential Revision: https://reviews.llvm.org/D57398
llvm-svn: 352653
Summary:
Check the bool value of the attribute in getOptionalBoolLoopAttribute,
not just its existence.
This eliminates the warning noise generated when vectorization is
explicitly disabled.
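A hedged sketch of the distinction, assuming `MD` is the metadata node for
the attribute (the surrounding lookup is omitted):
```
// Presence alone is not enough: !{!"llvm.loop.vectorize.enable", i1 0}
// means "explicitly disabled", so read the boolean operand too.
if (MD->getNumOperands() == 2)
  if (auto *IV = mdconst::dyn_extract<ConstantInt>(MD->getOperand(1)))
    return IV->getZExtValue() != 0; // the value, not mere existence
```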
Reviewers: Meinersbur, hfinkel, dmgreen
Subscribers: jlebar, sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D57260
llvm-svn: 352555
I'm circling back around to a loose end from D51929.
The backend (either CGP or the DAG) doesn't recognize this pattern, so we
end up with different asm for these IR variants.
Regardless of any future changes to canonicalize to saturation/overflow
intrinsics, we want to reduce the raw IR variations to a minimal number of
forms. If/when we can canonicalize to intrinsics, that will make that step
easier.
```
Pre: C2 == ~C1
  %a = add i32 %x, C1
  %c = icmp ugt i32 %x, C2
  %r = select i1 %c, i32 -1, i32 %a
=>
  %a = add i32 %x, C1
  %c2 = icmp ult i32 %x, C2
  %r = select i1 %c2, i32 %a, i32 -1
```
https://rise4fun.com/Alive/pkH
Differential Revision: https://reviews.llvm.org/D57352
llvm-svn: 352536
Summary:
This patch avoids an assert in IPConstantPropagation when
there is an argument count/type mismatch between the caller and
the callee.
While this is actually UB at the C level (clang emits a warning),
the IR verifier seems to accept it. I'm not sure what other
frontends/languages might think about this, so simply bailing out
to avoid hitting an assert (in CallSiteBase<>::getArgOperand or
Value::doRAUW) seems like a simple solution.
The problem is exposed by the fact that AbstractCallSites will look
through a bitcast at the callee position of a call/invoke.
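A hypothetical C-level illustration of the mismatch; it compiles with a
warning, and at the IR level the call reaches the callee through a bitcast:
```
// The callee takes two arguments, but the call site passes only one.
void callee(int a, int b);
void (*fp)(int) = (void (*)(int))callee; // UB, yet the IR verifier accepts it
// Calling fp(42) reaches callee with a mismatched argument count.
```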
Reviewers: jdoerfert, reames, efriedma
Reviewed By: jdoerfert, efriedma
Subscribers: eli.friedman, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D57052
llvm-svn: 352469
GEPs can produce either scalar or vector results. If we're extracting only a subset of the vector lanes, simplifying the operands is helpful in eliminating redundant computation, and (eventually) allowing further optimizations.
Differential Revision: https://reviews.llvm.org/D57177
llvm-svn: 352440
Summary:
A recent fix to the ThinLTO whole program dead code elimination (D56117)
increased the thin link time on a large MSAN'ed binary by 2x.
It's likely that the time increased elsewhere, but was more noticeable
here since it was already large and ended up timing out.
That change made it so we would repeatedly scan all copies of linkonce
symbols for liveness every time they were encountered during the graph
traversal. This was needed since we only mark one copy of an aliasee as
live when we encounter a live alias. This patch fixes the issue in a
more efficient manner by simply proactively visiting the aliasee (thus
marking all copies live) when we encounter a live alias.
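A hedged sketch of the shape of the fix during the liveness traversal; the
`visit` call and accessor names are illustrative:
```
// When a live alias is encountered, proactively mark the aliasee live too,
// so no later rescan of every linkonce copy of the aliasee is needed.
if (auto *AS = dyn_cast<AliasSummary>(Summary)) {
  ValueInfo AliaseeVI = Index.getValueInfo(AS->getAliaseeGUID());
  visit(AliaseeVI); // hypothetical recursive step of the mark-live walk
}
```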
Two notes: First, this requires a hash table lookup (finding the aliasee
summary in the index based on aliasee GUID). However, the impact of this
seems to be small compared to the original pre-D56117 thin link time. It
could be addressed if we keep the aliasee ValueInfo in the alias summary
instead of the aliasee GUID, which I am exploring in a separate patch.
Second, we only populate the aliasee GUID field when reading summaries
from bitcode (whether we are reading individual summaries and merging on
the fly to form the compiled index, or reading in a serialized combined
index). Thankfully, that's currently the only way we can get to this
code as we don't yet support reading summaries from LLVM assembly
directly into a tool that performs the thin link (they must be converted
to bitcode first). I added a FIXME, however I have the fix under test
already. The easiest fix is to simply populate this field always, which
isn't hard, but more likely the change I am exploring to store the
ValueInfo instead as described above will subsume this. I don't want to
hold up the regression fix for this though.
Reviewers: trentxintong
Subscribers: mehdi_amini, inglorion, dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D57203
llvm-svn: 352438
Summary:
If MemorySSA is avaiable, we can skip checking all instructions if block has any Defs.
(volatile loads are also Defs).
We still need to check all instructions for "canThrow", even if no Defs are found.
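A minimal sketch of the short-circuit, assuming a `MemorySSA *MSSA` and a
basic block `BB` (the surrounding hoisting logic is omitted):
```
// MemorySSA keeps a per-block list of defining accesses (stores, volatile
// loads, ...), so an empty list lets us skip the per-instruction scan for
// memory writes.
bool HasMemoryDefs = false;
if (const auto *Defs = MSSA->getBlockDefs(BB))
  HasMemoryDefs = !Defs->empty();
// Throwing instructions still require scanning the block separately.
```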
Reviewers: chandlerc
Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D57129
llvm-svn: 352393