This adds some basic costs for MVE reductions - currently just costing
the simple legal add vectors as a single MVE instruction. More complex
costing can be added in the future when the framework more readily
allows it.
Differential Revision: https://reviews.llvm.org/D88980
This adds a very basic cost for active_lane_mask under MVE - making the
assumption that they will be free and then apologizing for that in a
comment.
In reality they may either be free (by being nicely folded into a tail
predicated loop), cost the same as a VCTP or be expanded into vdup's,
adds and cmp's. It is difficult to detect the difference from a single
getIntrinsicInstrCost call, so makes the assumption that the vectorizer
is adding them, and only added them where it makes sense.
We may need to change this in the future to better model predicate costs
in the vectorizer, especially at -Os or non-tail predicated loops. The
vectorizer currently does not query the cost of these instructions but
that will change in the future and a zero cost there probably makes the
most sense at the moment.
Differential Revision: https://reviews.llvm.org/D88989
In lldb, explicitly set the "option() honors normal variables" CMake policy. This applies for
standalone lldb builds and matches what llvm, clang, etc do. This prevents potentially unwanted
clearing of variables like `LLVM_ENABLE_WARNINGS`, and also prevents unnecessary build warnings.
See: https://cmake.org/cmake/help/latest/policy/CMP0077.html
Differential Revision: https://reviews.llvm.org/D89614
This patch adds metadata !noundef and makes load instructions can optionally have it.
A load with !noundef always return a well-defined value (has no undef bit or isn't poison).
If the loaded value isn't well defined, the behavior is undefined.
This metadata can be used to encode the assumption from C/C++ that certain reads of variables should have well-defined values.
It is helpful for optimizing freeze instructions away, because freeze can be removed when its operand has well-defined value, and showing that a load from arbitrary location is well-defined is usually hard otherwise.
The same information can be encoded with llvm.assume with operand bundle; using metadata is chosen because I wasn't sure whether code motion can be freely done when llvm.assume is inserted from clang instead.
The existing codebase already is stripping unknown metadata when doing code motion, so using metadata is UB-safe as well.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D89050
The test reorders the basic blocks to be dis-contiguous in the address space and checks if the back trace contains the right symbol.
Reviewed By: labath
Differential Revision: https://reviews.llvm.org/D89179
LLVM rejects DWARF operator DW_OP_over. This DWARF operator is needed
for Flang to support assumed rank array.
Summary:
Currently LLVM rejects DWARF operator DW_OP_over. Below error is
produced when llvm finds this operator.
[..]
invalid expression
!DIExpression(151, 20, 16, 48, 30, 35, 80, 34, 6)
warning: ignoring invalid debug info in over.ll
[..]
There were some parts missing in support of this operator, which are
now completed.
Testing
-added a unit testcase
-check-debuginfo
-check-llvm
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D89208
As requested in D89346. This allows us to add some early outs.
I reordered some checks a little bit to make the more common bail outs happen earlier. Like checking opcode before checking hasOneUse. And I moved the bit width check to make sure it was safe to look through a truncate to the spot where we look through truncates instead of after.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D89494
initialization a little smarter.
Look through casts that preserve zero-ness when determining if an
initializer is zero, so that we can handle cases like an {0} initializer
whose corresponding field is a type other than 'int'.
This patch fixes a problem whereby the pointee object of a PTR_AND_OBJ entry with a `map(to)` motion clause can be overwritten on the device even if its reference counter is >=1.
Currently, we check the reference counter of the parent struct in order to determine whether the motion clause should be respected, but since the pointee object is not part of the struct, it's got its own reference counter which should be used to enqueue the copy or discard it.
The same behavior has already been implemented in targetDataEnd (omptarget.cpp:539-540), but we somehow missed doing the same in targetDataBegin.
Differential Revision: https://reviews.llvm.org/D89597
Old GCC used to aggressively fold VLAs to constant-bound arrays at block
scope in GNU mode. That's non-conforming, and more modern versions of
GCC only do this at file scope. Update Clang to do the same.
Also promote the warning for this from off-by-default to on-by-default
in all cases; more recent versions of GCC likewise warn on this by
default.
This is still slightly more permissive than GCC, as pointed out in
PR44406, as we still fold VLAs to constant arrays in structs, but that
seems justifiable given that we don't support VLA-in-struct (and don't
intend to ever support it), but GCC does.
Differential Revision: https://reviews.llvm.org/D89523
Before formating ARM64_RELOC_ADDEND relocation target name as a hex
number, the architecture need to be checked since other architectures
can define a different relocation type with the same integer as
ARM64_RELOC_ADDEND.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D89094
ClangFormat does not correctly handle an Objective-C interface declaration
with both lightweight generics and a protocol conformance.
This simple example:
```
@interface Foo : Bar <Baz> <Blech>
@end
```
means `Foo` extends `Bar` (a lightweight generic class whose type
parameter is `Baz`) and also conforms to the protocol `Blech`.
ClangFormat should not apply any changes to the above example, but
instead it currently formats it quite poorly:
```
@interface Foo : Bar <Baz>
<Blech>
@end
```
The bug is that `UnwrappedLineParser` assumes an open-angle bracket
after a base class name is a protocol list, but it can also be a
lightweight generic specification.
This diff fixes the bug by factoring out the logic to parse
lightweight generics so it can apply both to the declared class
as well as the base class.
Test Plan: New tests added. Ran tests with:
% ninja FormatTests && ./tools/clang/unittests/Format/FormatTests
Confirmed tests failed before diff and passed after diff.
Reviewed By: sammccall, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D89496
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.
Differential Revision: https://reviews.llvm.org/D74158
This addresses a regression where pretty much all C++ compilations using
-frounding-math now fail, due to rounding being performed in constexpr
function definitions in the standard library.
This follows the "manifestly constant evaluated" approach described in
https://reviews.llvm.org/D87528#2270676 -- evaluations that are required
to succeed at compile time are permitted even in regions with dynamic
rounding modes, as are (unfortunately) the evaluation of the
initializers of local variables of const integral types.
Differential Revision: https://reviews.llvm.org/D89360
We can not bitcast pointers across different address spaces, and VectorCombine
should be careful when it attempts to find the original source of the loaded
data.
Differential Revision: https://reviews.llvm.org/D89577
Aborts if we hit the max devirtualization iteration.
Will be useful for testing that changes to devirtualization don't cause
devirtualization to repeat passes more times than necessary.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D89519
If instructions were removed in peephole passes after the hazard recognizer was
run it is possible that new hazards could be introduced.
Fixes: SWDEV-253090
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D89077
This transforms the symbol lookups to O(1) from O(NM), greatly speeding up both passes. For a large MLIR module this shaved seconds off of the compilation time.
Differential Revision: https://reviews.llvm.org/D89522
The initial goal of this interface is to fix the current problems with verifying symbol user operations, but can extend beyond that in the future. The current problems with the verification of symbol uses are:
* Extremely inefficient:
Most current symbol users perform the symbol lookup using the slow O(N) string compare methods, which can lead to extremely long verification times in large modules.
* Invalid/break the constraints of verification pass
If the symbol reference is not-flat(and even if it is flat in some cases) a verifier for an operation is not permitted to touch the referenced operation because it may be in the process of being mutated by a different thread within the pass manager.
The new SymbolUserOpInterface exposes a method `verifySymbolUses` that will be invoked from the parent symbol table to allow for verifying the constraints of any referenced symbols. This method is passed a `SymbolTableCollection` to allow for O(1) lookups of any necessary symbol operation.
Differential Revision: https://reviews.llvm.org/D89512
This revision contains two optimizations related to symbol checking:
* Optimize SymbolOpInterface to only check for a name attribute if the operation is an optional symbol.
This removes an otherwise unnecessary attribute lookup from a majority of symbols.
* Add a new SymbolTableCollection class to represent a collection of SymbolTables.
This allows for perfoming non-flat symbol lookups in O(1) time by caching SymbolTables for symbol table operations. This class is very useful for algorithms that operate on multiple symbol tables, either recursively or not.
Differential Revision: https://reviews.llvm.org/D89505
(Note: This is a reland of D82597)
This class allows for defining thread local objects that have a set non-static lifetime. This internals of the cache use a static thread_local map between the various different non-static objects and the desired value type. When a non-static object destructs, it simply nulls out the entry in the static map. This will leave an entry in the map, but erase any of the data for the associated value. The current use cases for this are in the MLIRContext, meaning that the number of items in the static map is ~1-2 which aren't particularly costly enough to warrant the complexity of pruning. If a use case arises that requires pruning of the map, the functionality can be added.
This is especially useful in the context of MLIR for implementing thread-local caching of context level objects that would otherwise have very high lock contention. This revision adds a thread local cache in the MLIRContext for attributes, identifiers, and types to reduce some of the locking burden. This led to a speedup of several seconds when compiling a somewhat large mlir module.
Differential Revision: https://reviews.llvm.org/D89504
This patch also avoids hardcoding the clang options, which makes it
less likely for them to become out-of-date.
rdar://problem/63791367+66927829
Differential Revision: https://reviews.llvm.org/D89428