The program point created by the checker, even if it is an error node,
might not be the same as the name under which the report is emitted.
Make sure we're checking the name of the checker, because thats what
we're silencing after all.
Differential Revision: https://reviews.llvm.org/D102683
The current implementation assumes the destination type of shuffle is the same as the decomposed ones. Add the check to avoid crush when the condition is not satisfied.
This fixes PR37616.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D102751
There doesn't seem to be a need to support recursive locking,
and a recursive mutex is unnecessarily inefficient.
Differential Revision: https://reviews.llvm.org/D102486
Do the single hash calculation before acquiring the lock, to reduce
lock contention. If Copy is true, and the string was not yet contained
in the StringStorage, use the new address from StringStorage, but
reuse the hash we already calculated.
Differential Revision: https://reviews.llvm.org/D102484
This patch adds a new option to the LoopVectorizer to control how
scalable vectors can be used.
Initially, this suggests three levels to control scalable
vectorization, although other more aggressive options can be added in
the future.
The possible options are:
- Disabled: Disables vectorization with scalable vectors.
- Enabled: Vectorize loops using scalable vectors or fixed-width
vectors, but favors fixed-width vectors when the cost
is a tie.
- Preferred: Like 'Enabled', but favoring scalable vectors when the
cost-model is inconclusive.
Reviewed By: paulwalker-arm, vkmr
Differential Revision: https://reviews.llvm.org/D101945
This will allow to use llvm-objcopy with file names that begin with dashes.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D102665
Like the element extraction of these vectors, we choose to promote up to
an i8 vector type and perform the insertion there.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D102697
In this case, it does the same thing as the original pattern does.
SimplifyCFG has a few lurking miscompilations about deleting blocks that
have their address taken, and consistently using DeleteDeadBlocks() instead
of a hand-rolled pattern will allow to weed those cases out easierly.
VectorTransferPermutationMapLoweringPatterns can be enabled via a pass option. These additional patterns lower permutation maps to minor identity maps with broadcasting, if possible, allowing for more efficient vector load/stores. The option is deactivated by default.
Differential Revision: https://reviews.llvm.org/D102593
LinalgOps that are all parallel do not use the value of `outs`
tensor. The semantics is that the `outs` tensor is fully
overwritten. Using anything other than `init_tensor` can add false
dependencies between operations, when the use is just for the shape of
the tensor. Adding a canonicalization to always use `init_tensor` in
such cases, breaks this dependence.
Differential Revision: https://reviews.llvm.org/D102561
To bring D99599's implementation in line with the existing
PrintPassInstrumentation, and to fix a FIXME, add more customizability
to PrintPassInstrumentation.
Introduce three new options. The first takes over the existing
"-debug-pass-manager-verbose" cl::opt.
The second and third option are specific to -fdebug-pass-structure. They
allow indentation, and also don't print analysis queries.
To avoid more golden file tests than necessary, prune down the
-fdebug-pass-structure tests.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D102196
Parametrize the cache file with TARGET_TRIPLE parameter. Normalize
the target triple to follow the runtime library installation directory.
Explicity enable LLVM_ENABLE_PER_TARGET_RUNTIME_DIR option.
Original interfaces are not safe to be called during dialect conversion.
This is because some ops (e.g. `dynamic_reshape(input, target_shape)`)
depend on the values of their operands to calculate the output shape.
However the operands may be out of reach during dialect conversion (e.g.
converting from tensor world to buffer world). This patch provides a new
kind of interface which accpets user-provided operands to solve this
problem.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D102317
"[mlir] Speed up Lexer::getEncodedSourceLocation"
This reverts commit 3043be9d2d and commit
861d69a525.
This change resulted in printing textual MLIR that can't be parsed; see
review thread https://reviews.llvm.org/D102567 for details.
Summary:
The OpenMP runtime functions don't always provide unique thread ID's to
determine if a basic block is truly single-threaded. Change the implementation
to only check NVPTX intrinsics for now.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102700
This patch transforms the sequence
lea (reg1, reg2), reg3
sub reg3, reg4
to two sub instructions
sub reg1, reg4
sub reg2, reg4
Similar optimization can also be applied to LEA/ADD sequence.
The modifications to TwoAddressInstructionPass is to ensure the operands of ADD
instruction has expected order (the dest register of LEA should be src register
of ADD).
Differential Revision: https://reviews.llvm.org/D101970
At present, a lot of code contains main function bodies like "return failed(mlir::MlirOptMain(...);". This is unfortunate for two reasons: a) it uses ADL, which is maybe not what the free "failed" function was designed for; and b) it is a bit awkward to read, requring the reader to both understand the boolean nature of the value and the semantics of main's return value. (And it's also not portable, since 1 is not a portable success value.)
The replacement code, `return mlir::AsMainReturnCode(mlir::MlirOptMain(...))` is a bit more self-explanatory.
The change applies the new function to a few internal uses of MlirOptMain, too.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D102641
We currently use SourceMgr::getLineAndColumn to get the line and column for an SMLoc, but this includes a call to StringRef::find_last_of that ends up dominating compile time. In D102567, we start creating locations from the input file for block arguments which resulted in an extreme performance regression for modules with very large amounts of block arguments. This revision switches to just using a pointer offset from the beginning of the line to calculate the column(all MLIR files are simple ascii), resulting in a compile time reduction from 4700 seconds (1 hour and 18 minutes) to 8 seconds.
Differential Revision: https://reviews.llvm.org/D102734
Add this attribute to some types to ensure that they have
debug info.
The debug info for these classes are required for debuggers to display
some STL types. With constructor homing (a new debug info optimization)
their debug info isn't emitted because their constructors are never
called.
The list of types with the attribute added are __hash_value_type,
__value_type, __tree_node_base, __tree_node, __hash_node, __list_node,
and __forward_list_node.
Differential Revision: https://reviews.llvm.org/D98750
Fix __bitop_unsigned_integer and rename to __libcpp_is_unsigned_integer.
There are only five unsigned integer types, so we should just list them out.
Also provide `__libcpp_is_signed_integer`, even though the Standard doesn't
consume that trait anywhere yet.
Notice that `concept uniform_random_bit_generator` is specifically specified
to rely on `concept unsigned_integral` and *not* `__is_unsigned_integer`.
Instantiating `std::ranges::sample` with a type `U` satisfying
`uniform_random_bit_generator` where `unsigned_integral<U::result_type>`
and not `__is_unsigned_integer<U::result_type>` is simply IFNDR.
Orthogonally, fix an undefined behavior in std::countr_zero(__uint128_t).
Orthogonally, improve tests for the <bit> manipulation functions.
It was these new tests that detected the bug in countr_zero.
Differential Revision: https://reviews.llvm.org/D102328
This patch implements first part of Flow Sensitive SampleFDO (FSAFDO).
It has the following changes:
(1) disable current discriminator encoding scheme,
(2) new hierarchical discriminator for FSAFDO.
For this patch, option "-enable-fs-discriminator=true" turns on the new
functionality. Option "-enable-fs-discriminator=false" (the default)
keeps the current SampleFDO behavior. When the fs-discriminator is
enabled, we insert a flag variable, namely, llvm_fs_discriminator, to
the object. This symbol will checked by create_llvm_prof tool, and used
to generate a profile with FS-AFDO discriminators enabled. If this
happens, for an extbinary format profile, create_llvm_prof tool
will add a flag to profile summary section.
Differential Revision: https://reviews.llvm.org/D102246
Revert recent commit to require x86-registered-target (e4b790c5e3).
Remove -O1 from the run lines so they are less dependent on backend passes.
Update the CHECK6 and CHECK10 lines with script.
Differential Revision: https://reviews.llvm.org/D102720
In many cases it is helpful to know at what address the resolved function starts.
This patch adds a new StartAddress member to the DILineInfo structure.
Reviewed By: jhenderson, dblaikie
Differential Revision: https://reviews.llvm.org/D102316
X86 NaCl generally requires the stack to be aligned to 16 bytes.
This change was already implemented in two downstream NaCl compilers
based on llvm.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D102610
While developing a change to the allocator I ended up breaking
realloc on secondary allocations with increasing sizes. That didn't
cause any of the unit tests to fail, which indicated that we're
missing some test coverage here. Add a unit test for that case.
Differential Revision: https://reviews.llvm.org/D102716
This is a hook that allows for providing custom initialization of the pattern, e.g. if it has bounded recursion, setting the debug name, etc., without needing to define a custom constructor. A non-virtual hook was chosen to avoid polluting the vtable with code that we really just want to be inlined when constructing the pattern. The alternative to this would be to just define a constructor for each pattern, this unfortunately creates a lot of otherwise unnecessary boiler plate for a lot of patterns and a hook provides a much simpler/cleaner interface for the very common case.
Differential Revision: https://reviews.llvm.org/D102440
We currently do not document how the pattern rewriter infra treats recursion when it gets detected. This revision adds a blurb on recursion in patterns, and how patterns can signal that they are equipped to handle it.
Differential Revision: https://reviews.llvm.org/D102439