tensor.empty/linalg.init_tensor produces an uninititalized tensor that can be used as a destination operand for destination-style ops (ops that implement `DestinationStyleOpInterface`).
This change makes it possible to implement `TilingInterface` for non-destination-style ops without depending on the Linalg dialect.
RFC: https://discourse.llvm.org/t/rfc-add-tensor-from-shape-operation/65101
Differential Revision: https://reviews.llvm.org/D135129
foldOpIntoPhi() currently only folds operations into the phi if all
but one operands constant-fold. The two exceptions to this are freeze
and select, where we allow more general simplification.
This patch makes foldOpIntoPhi() generally simplification based and
removes all the instruction-specific logic. We just try to simplify
the instruction for each operand, and for the (potentially) one
non-simplified operand, we move it into the new block with adjusted
operands.
This fixes https://github.com/llvm/llvm-project/issues/57448, which
was my original motivation for the change.
This patch updates lowering to produce the correct fir.class types for
various polymorphic and unlimited polymoprhic entities cases. This is only the
lowering. Some TODOs have been added to the CodeGen part to avoid errors since
this part still need to be updated as well.
The fir.class<*> representation for unlimited polymorphic entities mentioned in
the document has been updated to fir.class<none> to avoid useless work in pretty
parse/printer.
This patch is part of the implementation of the poltymorphic
entities.
https://github.com/llvm/llvm-project/blob/main/flang/docs/PolymorphicEntities.md
Depends on D134957
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D134959
Simplify LoopAccessLegacyAnalysis by using LoopAccessInfoManager from
D134606. As a side-effect this also removes printing support from
LoopAccessLegacyAnalysis.
Depends on D134606.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D134608
Add a check to detect usages of `realloc` where the result is assigned
to the same variable (or field) as passed to the first argument.
Reviewed By: steakhal, martong
Differential Revision: https://reviews.llvm.org/D133119
The -slab-page-size option is used to set a simulated page size in -no-exec
tests. In order for this to work we need to use read/write permissions only
on all simulated pages in order to ensure that no simulated page is made
read-only by a permission change to the underlying real page.
The aim of this patch is to make it safe to enable ExecutionEngine regression
tests on arm64. Those tests will be enabled in a follow-up patch.
One of the sources is the same size as the destination so that source
doesn't have an overlap with the destination register. By using the _TIED
form we avoid an early clobber contraint for that source.
This matches what was already done for instrinsics. ConvertToThreeAddress
will fix it if it can't stay tied.
We want to emit a masked setcc that preserves zeros in all of the bits
where the original mask is zero. To do this we need to pass the original
mask as the passthru operand as well. Otherwise, we'll use the mask agnostic
policy and replace the zeros with 1s on some CPUs.
Differential Revision: https://reviews.llvm.org/D135122
This addresses the long-standing FIXME in the test. I would like to
update the test, and objdump's output is a lot more readable / editable
than readobj's.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D134690
When debuging a crash or conversion failure in a deep pass pipeline,
there are often many interleaved frames with `failed` and `succeeded`.
`LogicalResult` is used through the pass infrastructure, so by not implementing
failure in terms of a call to succeess, this patch noticeably reduces the total
total call stack depth and improves the debugging experience.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D135116
The synchronous callbacks are not intended to start the target running
during the callback, and doing so is flakey. This patch converts them
to being regular async callbacks, and adds some testing for sequential
reports that have caused problems in the field.
Differential Revision: https://reviews.llvm.org/D134927
There are a few issues with the code we generate for atomic operations and the way we generate it:
- Hard coded CR0 for compares
- Order of operands for compares not conducive to
emitting compare-immediate or for CSE of compares
- Missing MachineMemOperand for st[bhwd]cx intrinsics
- Missing intrinsic properties for the same
- Unnecessary blocks with store conditional
instructions to clear reservation (which ends
up hindering performance)
- Move from CR instructions just to compare the
result of a store conditional with zero (even
though it is a record-form)
This patch aims to resolve all of those issues.
Differential revision: https://reviews.llvm.org/D134783
Prior to this change, `CurrentGSYMPath` was never updated. As a consequence, the GSYM file was reopened for every frame, even if all frames were relative to the same GSYM file.
This change brings a 13x speedup on a test I'm doing (symbolizing ~25K frames from libxul)
(This is my first-ever LLVM change - sorry if I missed something in the process!)
Reviewed By: simon.giesecke, clayborg
Differential Revision: https://reviews.llvm.org/D132912
- Add support -Bdynamic/-Bstatic and their aliases
- Add support for `--library` and `--library-path` long form args
- Add test based on test/ELF/libsearch.s
- In `-Bdynamic` mode search for `.so` files in preference to `.a`.
- Unlike ELF continue to default to static mode until `-pie` or
`-shared` are used.
Differential Revision: https://reviews.llvm.org/D135087
D128745 handled DR1432 for the partial ordering of partial specializations, but
missed the handling for the partial ordering of function templates. This patch
implements the latter. While at it, also simplifies the previous implementation to
be more close to the wording without functional changes.
Fixes https://github.com/llvm/llvm-project/issues/56090
Reviewed By: erichkeane, #clang-language-wg, mizvekov
Differential Revision: https://reviews.llvm.org/D133683
Previously, it was possible to load dynamic libraries which would be unloaded on llvm_shutdown(), but recently ManagedStatic removal changed this so that loaded libraries really can't ever be unloaded. This functionality was very useful, and so to add it back in a more explicit way, I've added new getLibrary() and closeLibrary() methods to allow callers to use the very convenient platform independent abstraction that LLVM has for dynamic libraries.
As a specific use case, the onnx-mlir project was using this functionality with an API that allows instancing LLVM so you can compile a shared library, and then load that library, and eventually close the instance (and library) and compile something else. This change to llvm_shutdown causes libraries to leak and also locks the libraries for the entire duration of the program which prevents reusing library names.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D134763
* Remove -no-canonical-prefixes. See 980679981f
* Test a.out and %t.out in few tests and remove excess `-o %t.o`
* Avoid wrapping lines too aggresstively
* Replace `"{{[^"]*}}ld{{(\.(lld|bfd|gold))?}}{{(\.exe)?}}"` with `ld{{(.exe)?}}"`. The new pattern supports more CLANG_DEFAULT_LINKER.
This patch enables libc++ build as shared library in all combinations of ASCII/EBCDIC and 32-bit/64-bit variants. In particular it introduces:
# ASCII version of libc++ named as libc++_a.so
# Script to rename DLL name inside the generated side deck
# Various names for dataset members where DLL libraries and their side decks will reside
# Add the following options:
- LIBCXX_SHARED_OUTPUT_NAME
- LIBCXX_ADDITIONAL_COMPILE_FLAGS
- LIBCXX_ADDITIONAL_LIBRARIES
- LIBCXXABI_ADDITIONAL_COMPILE_FLAGS
- LIBCXXABI_ADDITIONAL_LIBRARIES
**Background and rational of this patch**
The linker on z/OS creates a list of exported symbols in a file called side deck. The list contains the symbol name as well as the name of the DLL which implements the symbol. The name of the DLL depends on what is specified in the -o command line option. If it points to a USS file, than the DLL name in the side deck will be the USS file name. If it points to a member of a dataset then the DLL name in the side deck is the member name.
If CMake could deal with z/OS datasets we could use -o that points to a dataset member name, but this does not seem to work so we have to produce a USS file as the DLL and then copy the content of the produced side deck to a dataset as well as rename the USS file name in the side deck to a dataset member name that corresponds to that DLL.
Reviewed By: muiez, SeanP, ldionne, #libc, #libc_abi
Differential Revision: https://reviews.llvm.org/D118503
Noticed this falling back on CTMark at -Os (bullet).
Seems like we have no 1:1 matching for it, so match SDAG and just lower.
Add testcases for common legal cases as well.
Differential Revision: https://reviews.llvm.org/D135111
This reverts commit 11adae5089.
There are multiple aspects this change is not appealing.
* They conflict with JoinedOrSeparate `-o`. The old exception `-objc-*`
should not be used an excuse.
* We generally want new options to be more rigid and avoid multiple
spellings.
* If users get used to `-offload-*`, a misspelled `-offload-*` option still
gets passed as `-o ffloat-*` without being detected.
Before this patch the code in printScalarConstant was unable to handle
nested constant expressions like (gep (addrspacecast ptr)) and crashed
with:
LLVM ERROR: Unsupported expression in static initializer:
addrspacecast ([4 x i8] addrspace(1)* @ga to [4 x i8]*)
We can use lowerConstantForGV instead which is a customized version of
lowerConstant that supports generic() and nested expressions.
Differential Revision: https://reviews.llvm.org/D127878
1) Fixed a typo in PTXAS_EXECUTABLE CMake variable (PXTAS -> PTXAS).
2) Version check was implemented incorrectly,
now version (major, minor) is converted to int for comparison.
3) ptxas -arch argument was incorrect (or missing) in 3 tests.
Differential Revision: https://reviews.llvm.org/D127866
This option forces constructors and non-deleting destructors to return
`this` pointer in C++ ABI (except for Microsoft ABI, on which this flag
has no effect).
This is similar to ARM32, Apple ARM64, or Fuchsia C++ ABI, but can be
applied to any target triple.
Differential Revision: https://reviews.llvm.org/D119209
This change enables a declaration to be conveniently displayed within
a debugger when only a pointer to its DeclContext is available. For example,
in gdb:
(gdb) p Ctx
$1 = (const clang::DeclContext *) 0x14c1a580
(gdb) p Ctx->dumpAsDecl()
ClassTemplateSpecializationDecl 0x14c1a540 <t.cpp:1:1, line:7:1> line:2:8 struct ct
`-TemplateArgument type 'int'
`-BuiltinType 0x14bac420 'int'
$2 = void
In the event that the pointed to DeclContext is invalid (that it has an
invalid DeclKind as a result of a dangling pointer, memory corruption, etc...)
it is not possible to dump its associated declaration. In this case, the
DeclContext will be reported as invalid. For example, in gdb:
(gdb) p Ctx->dumpAsDecl()
DeclContext 0x14c1a580 <unrecognized Decl kind 127>
$3 = void
With this patch, TypedefTypes and UsingTypes can have an
underlying type which diverges from their corresponding
declarations.
For the TypedefType case, this can be seen when getting
the common sugared type between two redeclarations with
different sugar.
For both cases, this will become important as resugaring
is implemented, as this will allow us to resugar these
when they were dependent before instantiation.
Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>
Differential Revision: https://reviews.llvm.org/D133468