representing dependence from producer result to consumer.
With Linalg on tensors the dependence between operations can be from
the result of the producer to the consumer. This change just does a
NFC refactoring of the LinalgDependenceGraphElem to allow representing
both OpResult and OpOperand*.
Differential Revision: https://reviews.llvm.org/D95208
* Remove an unimplemented and unused member function declaration
* Remove a misleading comment about an unrelated constraint number
* Fix a comment
* Add f18 crash message to "flang" driver script
Differential Revision: https://reviews.llvm.org/D95180
Correct the analysis of references to transformational intrinsic
functions that have different semantics based on the presence or
absence of a DIM= argument; add shape analysis for UNPACK().
Differential Revision: https://reviews.llvm.org/D94716
In the motivating cases from https://llvm.org/PR48816 ,
we have a trailing trunc. But that is not required to
reduce the abs width:
https://alive2.llvm.org/ce/z/ECaz-p
...as long as we clear the int-min-is-poison bit (nsw).
We have some existing tests that are affected, and I'm
not sure what the overall implications are, but in general
we favor narrowing operations over preserving nsw/nuw.
If that causes problems, we could restrict this transform
based on type (shouldChangeType() and/or vector vs. scalar).
Differential Revision: https://reviews.llvm.org/D95235
spv.Ordered/spv.Unordered are meant for OpenCL Kernel capability.
For Vulkan Shader capability, we should use spv.IsNan to check
whether a number is NaN.
Add a new pattern for converting `std.cmpf ord|uno` to spv.IsNan
and bumped the pattern converting to spv.Ordered/spv.Unordered
to a higher benefit. The SPIR-V target environment will properly
select between these two patterns.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D95237
Expressions emitted to module files and error messages
sometimes contain conversions of integer results of inquiry
intrinsics; these are usually not needed, and can conflict
with "int" in the user's namespace. Improve folding so that
these conversions don't appear, and do some other clean-up
in adjacent code.
Differential Revision: https://reviews.llvm.org/D95172
Previously we only autogen the availability for ops that are
direct instantiating `SPV_Op` and expected other subclasses of
`SPV_Op` to define aggregated availability for all ops. This is
quite error prone and we can miss capabilities for certain ops.
Also it's arguable to have multiple levels of subclasses and try
to deduplicate too much: having the availability directly in the
op can be quite explicit and clear. A few extra lines of
declarative code is fine.
Reviewed By: mravishankar
Differential Revision: https://reviews.llvm.org/D95236
ObjCBOOLSummaryProvider was incorrectly treating BOOL as unsigned and this is now fixed.
Also adding tests for one bit bit-fields of BOOL and unsigned char.
This PR only has coro intrinsics needed for the Async to LLVM lowering. Will add other intrinsics as needed in the followup PRs.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D95143
The buckets are initialized in __kmp_dephash_create but when they are extended
the memory is allocated but not NULL'd, potentially leaving some buckets
uninitialized after all entries have been copied into the new allocation.
This commit makes sure the buckets are properly initialized with NULL before
copying the entries.
Differential Revision: https://reviews.llvm.org/D95167
If foo is referenced in any object file, bitcode file or shared object,
`__wrap_foo` should be retained as the redirection target of sym
(f96ff3c0f8).
If the object file defining foo has foo references, we cannot easily distinguish
the case from cases where foo is not referenced (we haven't scanned
relocations). Retain `__wrap_foo` because we choose to wrap sym references
regardless of whether sym is defined to keep non-LTO/LTO/relocatable links' behaviors similar
https://sourceware.org/bugzilla/show_bug.cgi?id=26358 .
If foo is defined in a shared object, `__wrap_foo` can still be omitted
(`wrap-dynamic-undef.s`).
Reviewed By: andrewng
Differential Revision: https://reviews.llvm.org/D95152
- Fix arguments name for subview and subtensor.
- Fix a typo in a comment of subtensor's method.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D95211
With this, we have complete support for finding integer sample points in FlatAffineConstraints.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D95047
This patch implements codegen for __managed__ variable attribute for HIP.
Diagnostics will be added later.
Differential Revision: https://reviews.llvm.org/D94814
On z/OS, the following error message is not matched correctly in lit tests. This patch updates the CHECK expression to match the end period successfully.
```
EDC5129I No such file or directory.
```
Differential Revision: https://reviews.llvm.org/D94239
Simplify vperm2x128(concat(X,Y),concat(Z,W)) folding.
Use collectConcatOps / ISD::INSERT_SUBVECTOR to find the source subvectors instead of hardcoded immediate matching.
The existing code did not deal with atomic loads correctly. Such loads
are represented as MemoryDefs. Bail out on any MemoryAccess that is not
a MemoryUse.
Because we were not looking for the llvm.coro.id.async intrinsic in the
early coro pass which triggers follow-up passes we relied on the
llvm.coro.end intrinsic being present. This might not be the case in
functions that end in unreachable code.
Differential Revision: https://reviews.llvm.org/D95144
[libomptarget][devicertl] Drop templated atomic functions
The five __kmpc_atomic templates are instantiated a total of seven times.
This change replaces the template with explictly typed functions, which
have the same prototype for amdgcn and nvptx, and implements them with
the same code presently in use.
Rolls in the accepted but not yet landed D95085.
The unsigned long long type can be replaced with uint64_t when replacing
the cuda function. Until then, clang warns on casting a pointer to one to
a pointer to the other.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D95093
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.
Note that we could position this transformation as just hoisting
of the `not` (still, iff y is freely negatible), but the test changes
show a number of regressions, so let's not do that.
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.
I'm intentionally structuring it this way, so that the actual fold only
does the fold, and no legality/correctness checks, all of which must be
done by the caller. This allows for the fold code to be more compact
and more easily grokable.
Hoist the successor updating out of the code that deals with branch
weight updating, and hoist the 'has weights' check from the latter,
making code more consistent and easier to follow.
While we already ignore uncond branches, we could still potentially
end up with a conditional branches with identical destinations
due to the visitation order, or because we were called as an utility.
But if we have such a disguised uncond branch,
we still probably shouldn't deal with it here.
The case where BB ends with an unconditional branch,
and has a single predecessor w/ conditional branch
to BB and a single successor of BB is exactly the pattern
SpeculativelyExecuteBB() transform deals with.
(and in this case they both allow speculating only a single instruction)
Well, or FoldTwoEntryPHINode(), if the final block
has only those two predecessors.
Here, in FoldBranchToCommonDest(), only a weird subset of that
transform is supported, and it's glued on the side in a weird way.
In particular, it took me a bit to understand that the Cond
isn't actually a branch condition in that case, but just the value
we allow to speculate (otherwise it reads as a miscompile to me).
Additionally, this only supports for the speculated instruction
to be an ICmp.
So let's just unclutter FoldBranchToCommonDest(), and leave
this transform up to SpeculativelyExecuteBB(). As far as i can tell,
this shouldn't really impact optimization potential, but if it does,
improving SpeculativelyExecuteBB() will be more beneficial anyways.
Notably, this only affects a single test,
but EarlyCSE should have run beforehand in the pipeline,
and then FoldTwoEntryPHINode() would have caught it.
This reverts commit rL158392 / commit d33f4efbfd.
I may have given bad advice, and skipping sext_inreg when matching SSAT
patterns is not valid on it's own. It at least needs to sext_inreg the
input again, but as far as I can tell is still only valid based on
demanded bits. For the moment disable that part of the combine,
hopefully reimplementing it in the future more correctly.
Instead of using the type llvm::StringMapEntry<{stringified_value_type}>
use only the base class llvm::StringMapEntryBase and calculate the
offsets of the member variables manually. The approach with stringifying
the name of the value type is pretty fragile as it can easily break with
local and dependent types.
Differential Revision: https://reviews.llvm.org/D94431
lto::Config has a field to control whether the build is "freestanding"
(no builtins) or not, but it is not hooked up to the code actually
running the passes.
This patch adds support for the flag to both the code that runs
optimization with the new and old pass managers, by explicitly adding a
TargetLibraryInfo instance. If Freestanding is true, all library functions
are disabled.
Reviewed By: steven_wu
Differential Revision: https://reviews.llvm.org/D94630
This is a step towards allowing CDB behavior to being configurable.
Previously ClangdServer itself created the configs and installed them into
contexts. This was natural as it knows how to deal with resulting diagnostics.
However this prevents config being used in CDB, which must be created before
ClangdServer. So we extract the context provider (config loader) as a separate
object, which publishes diagnostics to a ClangdServer::Callbacks itself.
Now initialization looks like:
- First create the config::Provider
- Then create the ClangdLSPServer, passing config provider
- Next, create the context provider, passing config provider + diagnostic callbacks
- now create the CDB, passing context provider
- finally create ClangdServer, passing CDB, context provider, and diagnostic callbacks
Differential Revision: https://reviews.llvm.org/D95087
Without this patch the old index could be freed, but there still could be tries to access it via the function returned by `SwapIndex::indexedFiles()` call.
This leads to hard to reproduce clangd crashes at code completion.
This patch keeps the old index alive until the function returned by `SwapIndex::indexedFiles()` call is alive.
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D95206
combineX86ShufflesConstants/canonicalizeShuffleMaskWithHorizOp can both handle/earlyout shuffles with inputs of different widths, so delay widening as late as possible to make it easier to match constant folds etc.
The plan is to eventually move the widening inside combineX86ShuffleChain so that we don't create any new nodes unless we successfully combine the shuffles.
Walking the use list of a Constant (particularly, ConstantData)
is not scalable, since a given constant may be used by many
instructinos in many functions in many modules.
Differential Revision: https://reviews.llvm.org/D94713