During pop() we convert nodes into spans of expanded syntax::Tokens.
If we precompute a range of plausible (expanded) tokens, then we can do an
extremely cheap approximate hit-test against it, because syntax::Tokens are
ordered by pointer.
This would seem not to buy anything (we don't enter nodes unless they overlap
the selection), but in fact the spans we have are for *newly* claimed ranges
(i.e. those unclaimed by any child node).
So if you have:
{ { [[2+2]]; } }
then all of the CompoundStmts pass the hit test and are pushed, but we skip
full hit-testing of the brackets during pop() as they lie outside the range.
This is ~10x average speedup for selectiontree on a bad case I've seen
(large gtest file).
Differential Revision: https://reviews.llvm.org/D117107
Not sure it's OK to suppress this in clang itself - if we're building a PCH
or module, maybe it matters?
Differential Revision: https://reviews.llvm.org/D116925
Implements part of the legacy "DEC structures" feature from
VMS Fortran. STRUCTUREs are processed as if they were derived
types with SEQUENCE. DATA-like object entity initialization
is supported as well (e.g., INTEGER FOO/666/) since it was used
for default component initialization in structures. Anonymous
components (named %FILL) are also supported.
These features, and UNION/MAP, were already being parsed.
An omission in the collection of structure field names in the
case of nested structures with entity declarations was fixed
in the parser.
Structures are supported in modules, but this is mostly for
testing purposes. The names of fields in structures accessed
via USE association cannot appear with dot notation in client
code (at least not yet). DEC structures antedate Fortran 90,
so their actual use in applications should not involve modules.
This patch does not implement UNION/MAP, since that feature
would impose difficulties later in lowering them to MLIR types.
In the meantime, if they appear, semantics will issue a
"not yet implemented" error message.
Differential Revision: https://reviews.llvm.org/D117151
Let's consider sequential min/max expression family
to be more complex than their non-sequential counterparts,
preserving internal ordering within them.
I strongly believe we need some variant of this.
The main problem is e.g. that the glibc's assert has 4 parameters,
but the profitability check is only okay with one extra phi node,
so D116692 doesn't even trigger on most of the expected cases.
While that restriction probably makes sense in normal code, if we
are about to run off of a cliff (into an `unreachable`), this
successor block is unlikely so the cost to setup these PHI nodes
should not be on the hotpath, and shouldn't matter performance-wise.
Likewise, we don't sink if there are unconditional predecessors
UNLESS we'd sink at least one non-speculatable instruction,
which is a performance workaround, but if we are about to run into
`unreachable`, it shouldn't matter.
Note that we only allow the case where there are at
most unconditiona branches on the way to the unreachable block.
Differential Revision: https://reviews.llvm.org/D117045
See `gcc -dumpspecs` that -r essentially implies -nostdlib and suppresses
default -l* and crt*.o. The behavior makes sense because otherwise there will be
assuredly conflicting definitions when the relocatable output is linked into the
final executable/shared object.
Reviewed By: thesamesam, phosek
Differential Revision: https://reviews.llvm.org/D116843
Convert the `crashlog` command to be implemented as a class. The `Symbolicate`
function is switched to a class, to implement `get_long_help`. The text for the
long help comes from the help output generated by `OptionParser`. That is, the
output of `help crashlog` is the same as `crashlog --help`.
Differential Revision: https://reviews.llvm.org/D117165
This makes all the tests consistent and improves code coverage. This also
uncovers a bug with negative indices in advance() (which also impacts
prev()) -- I'll fix that in a subsequent patch.
I chose to only count operations in the tests for ranges::advance because
doing so in prev() and next() too was reaching diminishing returns, and
didn't meaningfully improve our test coverage.
The names of the generated attribute getters for ops changed some time ago. The method created from the attribute name returns the return type and an additional method of the same name with Attr as suffix is generated which returns the actual attribute as its storage type.
The code generating effects however was using the methods without the Attr suffix, which is a problem in the case of FlatSymbolRefAttr as it has a return type of llvm::StringRef. This would lead to compilation errors as the constructor of SideEffects::EffectInstance expects a SymbolRefAttr in this case.
This patch simply fixes the generated effects code to use the Attr suffixed getter to get the actual storage type of the attribute.
Differential Revision: https://reviews.llvm.org/D117194
Stop allowing use of `SmallVectorBase::set_size()` outside of the
SmallVector implementation, which sets the size without calling
constructors or destructors.
Most callers should probably just use `resize()`. Or, if the new size is
guaranteed to be `<= size()`, then the new-ish `truncate()` works too
(and optimizes better).
Some callers want to avoid initializing memory before overwriting, but
need a pointer to the memory and so cannot use `push_back()`,
`emplace_back()`, or `append()`. Before this commit, this depended on
`reserve()` and `set_size()`:
```
V.reserve(V.size() + NumNew); // Reserve expected size.
NumNew = initialize(V.end(), ...); // Get number added.
V.set_size(V.size() + NumNew); // Set size to match.
```
Such code should be updated to use `resize_for_overwrite()` and
`truncate()`:
```
auto Size = V.size(); // Save initial size.
V.resize_for_overwrite(Size + NumNew); // Resize to expected size.
NumNew = initialize(V.begin() + Size, ...)); // Get number added.
V.truncate(Size + NumNew); // Truncate to match.
```
The new pattern is safe even for non-trivial types, since
`resize_for_overwrite()` calls constructors and `truncate()` calls
destructors. For trivial types, it should optimize the same way as the
old pattern.
Downstream code adapt to the disappearance of `set_size()` using this
new pattern should carefully audit uses of `V` between the resize and
the truncate:
- Change `V.size()` => `Size`.
- Change `V.capacity()` => `V.size()` (mostly).
- Change `V.end()` => `V.begin() + Size`.
- If `V` is an out-parameter, early returns need a `V.truncate()` or
`V.clear()`. A scope exit is recommended.
Differential Revision: https://reviews.llvm.org/D115380
A bogus error message is appearing for structure constructors containing
values that correspond to unlimited polymorphic allocatable components.
A value of any type can actually be used.
Differential Revision: https://reviews.llvm.org/D117154
Add an extra argument for rounding mode to EXPECT_MPFR_MATCH and ASSERT_MPFR_MATCH macros.
Reviewed By: sivachandra, michaelrj
Differential Revision: https://reviews.llvm.org/D116777
Adding the optional decompositions have been verified to improve memory
usage on common models. Added the decomposition to the default tosa to linalg
passes.
Reviewed By: NatashaKnk
Differential Revision: https://reviews.llvm.org/D117175
Spotted this in a final grep of projects I don't usually build before
pushing https://reviews.llvm.org/D115380, which makes
`SmallVector::set_size()` private.
Update to `truncate()`, a new-ish variant of `resize()` that asserts the
new size is not bigger and that avoids pulling in the allocation and
initialization code for growing. Doesn't really look like the perf
impact of that would matter here, but since `dirLength` is known to be a
smaller size then we might as well.
Differential Revision: https://reviews.llvm.org/D117073
Enable ReassociatingReshapeOpConversion with "non-identity" layouts.
This removes an early-return in this function, which seems unnecessary and is
preventing some memref.collapse_shape from converting to LLVM (see included lit test).
It seems unnecessary because the return message says "only empty layout map is supported"
but there actually is code in this function to deal with non-empty layout maps. Maybe
it refers to an earlier state of implementation and is just out of date?
Though, there is another concern about this early return: the condition that it actually
checks, `{src,dst}MemrefType.getLayout().isIdentity()`, is not quite the same as what the
return message says, "only empty layout map is supported". Stepping through this
`getLayout().isIdentity()` code in GDB, I found that it evaluates to `.getAffineMap().isIdentity()`
which does (AffineMap.cpp:271):
```
if (getNumDims() != getNumResults())
return false;
```
This seems that it would always return false for memrefs of rank greater than 1 ?
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114808
Allows shuffle combining through per-element shift nodes
This exposed a number of issues with shuffle combining with target intrinsics that are lowered to nodes later during legalization - in particular shuffle combining and SimplifyDemandedVectorElts were being called after canonicalizeShuffleWithBinOps, meaning that shuffles didn't have a chance to be combined away before the shuffle(binop(x,y)) -> binop(shuffle(x),shuffle(y)) fold.
I believe this is unexploitable because in either case the result
will be 'couldn't allocate register for constraint' error message,
but error code checking is clearly wrong.
Differential Revision: https://reviews.llvm.org/D117189
The PlatformKind/PlatformType enums contain the same information, which requires
them to be kept in-sync. This commit changes over to PlatformType as the sole
source of truth, which allows the removal of the redundant PlatformKind.
The majority of the changes were in LLD and TextAPI.
Reviewed By: cishida
Differential Revision: https://reviews.llvm.org/D117163
Shockingly we weren't doing this already. We should probably have this
be done earlier in the IR too, but it's still helpful to have the
lowering guarantee it so that we can modify the ABI implicit inputs
based on it.
If we know we we aren't using a component from the kernel, we can save
a few bit packing instructions.
We're still enabling the VGPR input to the kernel though.
As per https://github.com/flang-compiler/f18-llvm-project/issues/1344,
the `flang` bash script works fine with 4.4.19 and requiring
4.4.23 is too restrictive. Rather than keep updating the patch level,
this patch removes this particular check (so that it will only check the
major and minor versions instead).
As this is both rather straightforward and urgent, I'm merging this
without a review.
This patch adds the `weak` identifier to the openmp device environment
variable. The changes introduced in https://reviews.llvm.org/D117211
result in multiply defined symbols. Because the symbol is potentially
included multiple times for each offloading file we will get symbol
colisions, and because it needs to have external visiblity it should be
weak.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D117231
This completes the implementation of
WG14 N2412 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf),
which standardizes C on a twos complement representation for integer
types. The only work that remained there was to define the correct
macros in the standard headers, which this patch does.
Functions pointers should be created with program address space. This
patch introduces program address space in TargetInfo. Targets with
non-default (default is 0) address space for functions should explicitly
set this value. This patch fixes a crash on lvalue reference to function
pointer (in device code) when using oneAPI DPC++ compiler.
Differential Revision: https://reviews.llvm.org/D111566