Implements the RegionBranchOpInterface method getNumRegionInvocations to `scf::IfOp` so that, when the condition is constant, the number of region executions can be analyzed by `NumberOfExecutions`.
Reviewed By: jpienaar, ftynse
Differential Revision: https://reviews.llvm.org/D115087
Instead of bailing and using the default expansion, we can more efficiently use the shl(unpack(x,x),unpack(amt,zero)) pattern for vXi8 rotl, as we'll then use vXi16 fast PMULLW (or PSLLVW).
This required some minor changes to improve constant folding during unpack shuffle creation and convertShiftLeftToScale to support constants that have already been lowered to constant pools.
No need to represent splats as a node with the reused scalars, it may
increase the cost (currently pass just ignores extra shuffle cost and it
is still not correct).
Differential Revision: https://reviews.llvm.org/D115800
* Call `replaceOp` instead of `mapBuffer`.
* Remove bvm and all helper functions around bvm.
* Simplify FuncOp bufferization and rely on existing functionality to generate ToMemrefOps for function BlockArguments.
Differential Revision: https://reviews.llvm.org/D115515
Prior to FreeBSD 14, FreeBSD provided special _p.a libraries for use
with -pg. They are no longer used or provided. If the target does
not specify a major version (e.g. amd64-unknown-freebsd, rather than
amd64-unknown-freebsd12) default to the new behaviour.
Differential Revision: https://reviews.llvm.org/D114396
Attempt to lower a shuffle as a permute instruction(rev/revb/revh/revw) for fixed length SVE.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D114960
Teach convertToThreeAddress to use the V_FMA_F16_gfx9 pseudo (i.e. the
standard instruction in GFX9 onwards) instead of V_FMA_F16 (the legacy
pseudo for GFX8 compatibility, which is no longer supported in GFX10).
This follows the example of macToMad in SIFoldOperands.
Differential Revision: https://reviews.llvm.org/D115731
Pseudos like V_MAD_U16 and V_FMA_F16 map down to what GFX9 calls
v_mad_legacy_u16 and v_fma_legacy_f16, which are documented to have the
same zeroing behaviour as on GFX8.
Differential Revision: https://reviews.llvm.org/D115729
This helps user to know what level of support there
is (roughly) for coroutine feature.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D115778
Ops for the signed counterparts "llvm.smin" and "llvm.smax" already exist. This patch adds the unsigned versions as well.
Differential Revision: https://reviews.llvm.org/D115796
After removing the range type, Linalg does not define any type. The revision thus consolidates the LinalgOps.h and LinalgTypes.h into a single Linalg.h header. Additionally, LinalgTypes.cpp is renamed to LinalgDialect.cpp to follow the convention adopted by other dialects such as the tensor dialect.
Depends On D115727
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115728
For some reason, the user-defined implicit conversion from StringRef to
std::string is not invoked by std::map::emplace in libstdc++ 5.4.0, even
though it works fine on modern systems.
We currently rely on generic promotion to vXi16/vXi32 types for rotation lowering on various AVX512 targets.
We can more efficiently perform this by making use of the shl(unpack(x,x),amt) style pattern that we already use for vXi8 rotation by splat amounts, either by widening to a larger vector type or unpacking lo/hi halves of the subvectors so we can access whatever vXi16/vXi32 per-element shifts are supported.
This uncovered an issue in the supportedVectorShiftWithImm/supportedVectorVarShift legality checkers which was using hasAVX512() instead of useAVX512Regs() to detect support for 512-bit vector shifts.
NOTE: I'm actually hoping to eventually reuse this code for shl(unpack(y,x),amt) funnel shift lowering (vXi8 and wider), but initially I just want to ensure we have efficient ISD::ROTL lowering for all targets.
Differential Revision: https://reviews.llvm.org/D115180
CreateElementBitCast() can preserve the pointer element type in
the presence of opaque pointers, so use it in place of CreateBitCast()
in some places. This also sometimes simplifies the code a bit.
This expands checking for more expressions. This will check underflow
and loss of precision when using call expressions like:
void foo(unsigned);
int i = -1;
foo(i);
This also includes other expressions as well, so it can catch negative
indices to std::vector since it uses unsigned integers for [] and .at()
function.
Patch by: @pfultz2
Differential Revision: https://reviews.llvm.org/D46081
This patch adds lowering from omp.sections and omp.section (simple lowering along with the nowait clause) to LLVM IR.
Tests for the same are also added.
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D115030
Instead of modifying the existing linalg.tiled_loop op, create a new op with memref input/outputs and delete the old op.
Differential Revision: https://reviews.llvm.org/D115493
Change all uses of the deprecated constructor to pass the
element type explicitly and drop it.
For cases where the correct element type was not immediately
obvious to me or would require a slightly larger change I'm
falling back to explicitly calling getPointerElementType() for now.
Instead of modifying the existing scf.if op, create a new op with memref OpOperands/OpResults and delete the old op.
New allocations / other memrefs can now be yielded from the op. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.
Differential Revision: https://reviews.llvm.org/D115491
With VectorType supporting scalable dimensions, we don't need many of
the operations currently present in ArmSVE, like mask generation and
basic arithmetic instructions. Therefore, this patch also gets
rid of those.
Having built-in scalable vector support also simplifies the lowering of
scalable vector dialects down to LLVMIR.
Scalable dimensions are indicated with the scalable dimensions
between square brackets:
vector<[4]xf32>
Is a scalable vector of 4 single precission floating point elements.
More generally, a VectorType can have a set of fixed-length dimensions
followed by a set of scalable dimensions:
vector<2x[4x4]xf32>
Is a vector with 2 scalable 4x4 vectors of single precission floating
point elements.
The scale of the scalable dimensions can be obtained with the Vector
operation:
%vs = vector.vscale
This change is being discussed in the discourse RFC:
https://llvm.discourse.group/t/rfc-add-built-in-support-for-scalable-vector-types/4484
Differential Revision: https://reviews.llvm.org/D111819
Instead of modifying the existing scf.for op, create a new op with memref OpOperands/OpResults and delete the old op.
New allocations / other memrefs can now be yielded from the loop. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.
This change also introduces `replaceOp`, which will be utilized by all other `bufferize` implementations in future commits. Bufferization will then no longer rely on old (pre-bufferize) ops to DCE away. Instead old ops are deleted on the spot. This improves debuggability because there won't be any duplicate ops anymore (bufferized + not-yet-bufferized) when dumping IR during bufferization. It is also less fragile because unbufferized IR can no longer silently "hang around" due to an implementation bug.
Differential Revision: https://reviews.llvm.org/D114926
Added documentation to clearify the purpose of the bufferization to memref pass
and added some remarks.
Differential Revision: https://reviews.llvm.org/D115326
Sorting the prefixes by decreasing frequency can improve performance.
.gcc_except_table is relatively frequent, so move it ahead.
.ctors and .dtors mostly disappear and should be the last.
Explicitly track the pointer element type in Address, rather than
deriving it from the pointer type, which will no longer be possible
with opaque pointers. This just adds the basic facility, for now
everything is still going through the deprecated constructors.
I had to adjust one place in the LValue implementation to satisfy
the new assertions: Global registers are represented as a
MetadataAsValue, which does not have a pointer type. We should
avoid using Address in this case.
This implements a part of D103465.
Differential Revision: https://reviews.llvm.org/D115725
Use `_LIBCPP_DEBUG_ASSERT` instead of `_LIBCPP_ASSERT` and guarding it with `LIBCPP_DEBUG_LEVEL == 2`
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D115765
Summary:
This patch emits the DW_AT_accessibility attribute for
class/struct/union types in the LLVM part.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D115606
Remove the RangeOp and the RangeType that are not actively used anymore. After removing RangeType, the LinalgTypes header only includes the generated dialect header.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115727