After removing the range type, Linalg no longer defines any types. This revision thus consolidates the LinalgOps.h and LinalgTypes.h headers into a single Linalg.h header. Additionally, LinalgTypes.cpp is renamed to LinalgDialect.cpp to follow the convention adopted by other dialects, such as the tensor dialect.
Depends On D115727
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115728
For some reason, the user-defined implicit conversion from StringRef to
std::string is not invoked by std::map::emplace in libstdc++ 5.4.0, even
though it works fine on modern systems.
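A minimal sketch of the pattern in question (map and key names illustrative); the portable fix is to convert explicitly:
  #include <map>
  #include <string>
  #include "llvm/ADT/StringRef.h"
  void addEntry(std::map<std::string, int> &Map, llvm::StringRef Key) {
    // Map.emplace(Key, 0) relies on the implicit StringRef -> std::string
    // conversion, which libstdc++ 5.4.0's emplace fails to find.
    Map.emplace(Key.str(), 0); // explicit conversion works everywhere
  }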
We currently rely on generic promotion to vXi16/vXi32 types for rotation lowering on various AVX512 targets.
We can perform this more efficiently by making use of the shl(unpack(x,x),amt) style pattern that we already use for vXi8 rotation by splat amounts: either widen to a larger vector type, or unpack the lo/hi halves of the subvectors, so we can access whatever vXi16/vXi32 per-element shifts are supported.
This uncovered an issue in the supportedVectorShiftWithImm/supportedVectorVarShift legality checkers, which were using hasAVX512() instead of useAVX512Regs() to detect support for 512-bit vector shifts.
NOTE: I'm actually hoping to eventually reuse this code for shl(unpack(y,x),amt) funnel shift lowering (vXi8 and wider), but initially I just want to ensure we have efficient ISD::ROTL lowering for all targets.
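For reference, a minimal sketch of the unpack idea for the existing vXi8 splat-amount case, written as SSE2 intrinsics (illustrative only; the patch implements this as DAG lowering, not via intrinsics):
  #include <emmintrin.h>
  __m128i rotl_v16i8_splat(__m128i X, int Amt) { // Amt in [0,7]
    __m128i Lo = _mm_unpacklo_epi8(X, X); // i16 lanes hold (x << 8) | x
    __m128i Hi = _mm_unpackhi_epi8(X, X);
    Lo = _mm_slli_epi16(Lo, Amt);         // shift the doubled byte...
    Hi = _mm_slli_epi16(Hi, Amt);
    Lo = _mm_srli_epi16(Lo, 8);           // ...then take the high byte,
    Hi = _mm_srli_epi16(Hi, 8);           // which now holds rotl8(x, Amt)
    return _mm_packus_epi16(Lo, Hi);
  }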
Differential Revision: https://reviews.llvm.org/D115180
CreateElementBitCast() can preserve the pointer element type in
the presence of opaque pointers, so use it in place of CreateBitCast()
in some places. This also sometimes simplifies the code a bit.
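A hedged before/after sketch, assuming clang CodeGen's CGBuilderTy and Address types (helper name illustrative):
  static Address castToElemTy(CGBuilderTy &Builder, Address Addr,
                              llvm::Type *DestTy) {
    // Was: Builder.CreateBitCast(Addr, DestTy->getPointerTo());
    // CreateElementBitCast names only the new element type, so it keeps
    // working once pointer types become opaque.
    return Builder.CreateElementBitCast(Addr, DestTy);
  }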
This expands the check to cover more expressions. It now checks for underflow
and loss of precision when using call expressions like:
  void foo(unsigned);
  int i = -1;
  foo(i);
It also covers other expressions, so it can catch negative indices passed to
std::vector, which uses unsigned integers for operator[] and .at().
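For instance, a hypothetical snippet the expanded check can now flag:
  #include <vector>
  int bad(std::vector<int> &V) {
    int Idx = -1;
    V[Idx] = 0;       // Idx is converted to the unsigned size_type,
    return V.at(Idx); // so -1 wraps around to a huge index
  }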
Patch by: @pfultz2
Differential Revision: https://reviews.llvm.org/D46081
This patch adds lowering from omp.sections and omp.section (simple lowering along with the nowait clause) to LLVM IR.
Tests for this lowering are also added.
Reviewed By: ftynse, kiranchandramohan
Differential Revision: https://reviews.llvm.org/D115030
Instead of modifying the existing linalg.tiled_loop op, create a new op with memref input/outputs and delete the old op.
Differential Revision: https://reviews.llvm.org/D115493
Change all uses of the deprecated constructor to pass the
element type explicitly, and drop the deprecated constructor.
For cases where the correct element type was not immediately
obvious to me, or where it would require a slightly larger change, I'm
falling back to explicitly calling getPointerElementType() for now.
Instead of modifying the existing scf.if op, create a new op with memref OpOperands/OpResults and delete the old op.
New allocations / other memrefs can now be yielded from the op. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.
Differential Revision: https://reviews.llvm.org/D115491
With VectorType supporting scalable dimensions, we don't need many of
the operations currently present in ArmSVE, like mask generation and
basic arithmetic instructions. Therefore, this patch also removes them.
Having built-in scalable vector support also simplifies the lowering of
scalable vector dialects down to LLVMIR.
Scalable dimensions are denoted by wrapping them in square brackets:
  vector<[4]xf32>
is a scalable vector of 4 single-precision floating-point elements.
More generally, a VectorType can have a set of fixed-length dimensions
followed by a set of scalable dimensions:
  vector<2x[4x4]xf32>
is a vector with 2 scalable 4x4 vectors of single-precision floating-point
elements.
The scale of the scalable dimensions can be obtained with the
`vector.vscale` operation:
  %vs = vector.vscale
This change is being discussed in the Discourse RFC:
https://llvm.discourse.group/t/rfc-add-built-in-support-for-scalable-vector-types/4484
Differential Revision: https://reviews.llvm.org/D111819
Instead of modifying the existing scf.for op, create a new op with memref OpOperands/OpResults and delete the old op.
New allocations / other memrefs can now be yielded from the loop. This functionality is deactivated by default and guarded against by AssertDestinationPassingStyle.
This change also introduces `replaceOp`, which will be utilized by all other `bufferize` implementations in future commits. Bufferization will then no longer rely on old (pre-bufferization) ops being DCE'd away; instead, old ops are deleted on the spot. This improves debuggability because there are no longer duplicate ops (bufferized + not-yet-bufferized) when dumping IR during bufferization. It is also less fragile, because unbufferized IR can no longer silently "hang around" due to an implementation bug.
Differential Revision: https://reviews.llvm.org/D114926
Added documentation to clarify the purpose of the bufferization-to-memref pass
and added some remarks.
Differential Revision: https://reviews.llvm.org/D115326
Sorting the prefixes by decreasing frequency can improve performance.
.gcc_except_table is relatively frequent, so move it ahead.
.ctors and .dtors mostly disappear and should come last.
Explicitly track the pointer element type in Address, rather than
deriving it from the pointer type, which will no longer be possible
with opaque pointers. This just adds the basic facility; for now,
everything still goes through the deprecated constructors.
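A simplified sketch of the resulting shape (member names abbreviated; not the full class):
  class Address {
    llvm::Value *Pointer;
    llvm::Type *ElementType; // now stored explicitly, not derived
    CharUnits Alignment;
  public:
    Address(llvm::Value *Ptr, llvm::Type *ElemTy, CharUnits Align)
        : Pointer(Ptr), ElementType(ElemTy), Alignment(Align) {}
    llvm::Type *getElementType() const { return ElementType; }
  };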
I had to adjust one place in the LValue implementation to satisfy
the new assertions: Global registers are represented as a
MetadataAsValue, which does not have a pointer type. We should
avoid using Address in this case.
This implements a part of D103465.
Differential Revision: https://reviews.llvm.org/D115725
Use `_LIBCPP_DEBUG_ASSERT` instead of `_LIBCPP_ASSERT` guarded by `LIBCPP_DEBUG_LEVEL == 2`.
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D115765
This patch emits the DW_AT_accessibility attribute for
class/struct/union types in the LLVM part.
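As an illustration, assuming the attribute is attached to types nested in a class with non-default access (example is hypothetical):
  class Outer {
    // Inner defaults to private access here, so its type DIE would
    // carry DW_AT_accessibility with DW_ACCESS_private.
    struct Inner {};
    Inner I;
  };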
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D115606
Remove the RangeOp and the RangeType, which are not actively used anymore. After removing RangeType, the LinalgTypes header only includes the generated dialect header.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D115727
SHT_GNU_verdef is typically small, so it's unnecessary to reserve the vector.
While here, fix a hypothetical issue when SHT_GNU_verdef has non-increasing
version indexes, which doesn't happen with the output of GNU ld, gold, or ld.lld.
My x86-64 lld executable is 256 bytes smaller.
sizeof(ObjFile<ELF64LE>) is decreased from 344 to 272 on an ELF64 system.
In a large link with 30000 ObjFiles, this may be a 2+ MiB saving.
Change std::vector members to SmallVector, and std::string members to
SmallString<0> (these members typically don't benefit from small string optimization).
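Illustrative shape of the change (field names hypothetical):
  SmallVector<Symbol *, 0> Symbols; // was: std::vector<Symbol *>
  // SmallString<0> drops the inline buffer entirely, matching the note
  // above that these strings rarely benefit from SSO.
  SmallString<0> SourceFile;        // was: std::string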
On Linux x86-64 the lld executable is ~6k smaller.
Adds -L<search-path> and -l<library> options that are analogous to ld's
versions.
Each instance of -L<search-path> or -l<library> will apply to the most recent
-jd option on the command line (-jd <name> creates a JITDylib with the given
name). Library names will match against JITDylibs first, then llvm-jitlink will
look through the search paths for files named <search-path>/lib<library>.dylib
or <search-path>/lib<library>.a.
The default "main" JITDylib will link against all JITDylibs created by -jd
options, and all JITDylibs will link against the process symbols (unless
-no-process-symbols is specified).
The -dlopen option is renamed -preload, and will load dylibs into the JITDylib
for the ORC runtime only.
The effect of these changes is to make it easier to describe a non-trivial
program layout to llvm-jitlink for testing purposes. E.g. the following
invocation describes a program consisting of three JITDylibs: "main" (created
implicitly), containing main.o; "Foo", containing foo1.o and foo2.o and linking
against library "bar" (not a JITDylib, so it must be a .dylib or .a on disk)
and against "Baz" (which is a JITDylib); and "Baz", containing baz.o.
  llvm-jitlink \
    main.o \
    -jd Foo foo1.o foo2.o -L${HOME}/lib -lbar -lBaz \
    -jd Baz baz.o
This is present in our assembly files. It should fix decorate_proc_maps.cpp failures because of shadow memory being allocated as executable.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D115552
The preinliner has been tuned on large server workloads and is now ready to be turned on by default. This change also updates the thresholds based on tuning.
Differential Revision: https://reviews.llvm.org/D115770
In 32-bit mode, attaching TBAA metadata to the store following the call
to inline assembler results in describing the wrong type by making a
fake lvalue (i.e., whatever the inline assembler happens to leave in
EAX:EDX). Even if the inline assembler somehow describes the correct type,
setting TBAA information on the return type of a call to inline assembler
is likely not correct, since TBAA rules need not apply to inline assembler.
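A hypothetical reproducer sketch for the 32-bit case (function and constraint are illustrative):
  // On i386 the "=A" constraint returns a 64-bit value in the EAX:EDX
  // pair; the store of V that follows must not carry TBAA for the user's
  // type, since the asm is free to leave anything there.
  long long readCounter(void) {
    long long V;
    __asm__ volatile("rdtsc" : "=A"(V));
    return V;
  }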
Differential Revision: https://reviews.llvm.org/D115320
Also moves the object interface building functions out of Mangling.h and into
the new ObjectFileInterfaces.h header, and updates the llvm-jitlink tool to use
custom object interfaces rather than a custom link layer.
ObjectLayer::add overloads are added to match the old signatures (which
do not take a MaterializationUnit::Interface). These overloads use the
standard getObjectFileInterface function to build an interface.
Passing a MaterializationUnit::Interface explicitly makes it easier to alter
the effective interface of the object file being added, e.g. by changing symbol
visibility/linkage, or renaming symbols (in both cases the changes will need to
be mirrored by a JITLink pass at link time to update the LinkGraph to match the
explicit interface). Altering interfaces in this way can be useful when lazily
compiling (e.g. for renaming function bodies) or emulating linker options (e.g.
demoting all symbols to hidden visibility to emulate -load_hidden).
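A hedged sketch of the explicit-interface flow (variable names illustrative; error handling abbreviated):
  llvm::Error addWithInterface(llvm::orc::ObjectLayer &ObjLayer,
                               llvm::orc::JITDylib &JD,
                               llvm::orc::ExecutionSession &ES,
                               std::unique_ptr<llvm::MemoryBuffer> Obj) {
    auto I = llvm::orc::getObjectFileInterface(ES, Obj->getMemBufferRef());
    if (!I)
      return I.takeError();
    // Adjust the effective interface here, e.g. rewrite I->SymbolFlags to
    // demote symbols to hidden; a JITLink pass must make the matching
    // change to the LinkGraph at link time.
    return ObjLayer.add(JD, std::move(Obj), std::move(*I));
  }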