This patch slightly generalizes the code to emit loads and stores of a
matrix and adds helpers to load/store a tile of a larger matrix.
This will be used in a follow-up patch introducing initial tiling.
Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D75564
Builder::get{I32,I64}VectorAttr are actually of limited applicability since
vector types can't have zero elements, whereas many uses of this kind of
attribute (such as dimension lists for "transpose"-like and other tensor
ops) often can result in empty lists.
Differential Revision: https://reviews.llvm.org/D76403
If we know the SSE shift amount is out of range, we can simplify to a zero value (logical) or to a 'signsplat' shift by bitwidth-1 (arithmetic). This allows us to remove the equivalent ConstantInt constant folding path from simplifyX86immShift.
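A minimal scalar sketch of the out-of-range rule described above, modelling a single 32-bit lane. The function names are hypothetical and this is not the InstCombine code; it only illustrates why a logical shift folds to zero and an arithmetic shift folds to a shift by bitwidth-1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one 32-bit lane; not LLVM code.

// Logical right shift: an out-of-range amount (>= 32) yields zero.
inline uint32_t logicalShiftRightLane(uint32_t Lane, uint64_t Amount) {
  return Amount >= 32 ? 0u : (Lane >> Amount);
}

// Arithmetic right shift: an out-of-range amount behaves like a shift by
// bitwidth-1 (31), which replicates the sign bit across the whole lane.
inline int32_t arithmeticShiftRightLane(int32_t Lane, uint64_t Amount) {
  const unsigned Clamped = Amount >= 32 ? 31u : static_cast<unsigned>(Amount);
  assert(Clamped <= 31 && "shift amount must be in range after clamping");
  // Right-shifting a negative value is arithmetic on mainstream compilers
  // (and guaranteed to be so since C++20).
  return Lane >> Clamped;
}
```

With these helpers, a negative lane shifted by 40 becomes -1 (all sign bits), while any lane logically shifted by 40 becomes 0.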
Support prefixing destructive operations, with the MOVPRFX instruction, to build constructive operations.
Differential Revision: https://reviews.llvm.org/D75064
Along the same lines as eb918d8daf1: This code also had to acquire the session
mutex, and this could cause a deadlock under the wrong circumstances. This
patch updates GenericLLVMIRPlatformSupport to just use the session lock for
everything.
In MachOPlatform, obtaining the link-order for a JITDylib requires locking the
session, but also needs to be part of a larger atomic operation that collates
initializer symbols tracked by the platform. Trying to do this under a separate
platform mutex leads to potential locking order issues, e.g.
T1 locks session then tries to lock platform to register a new init symbol
meanwhile
T2 locks platform then tries to lock session to obtain link order.
Removing the platform lock and performing all these operations under the session
lock eliminates this possibility.
At the same time we also need to collate init pointers from the
MachOPlatform::InitScraperPlugin, and we don't need or want to lock the session
for that. The new InitSeqMutex has been added to guard these init pointers, and
the session mutex is never obtained while the InitSeqMutex is held.
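The single-lock approach can be sketched roughly as follows. `ToySession` and all of its members are hypothetical stand-ins rather than the ORC API, and the separate InitSeqMutex is not modelled. Because every operation takes the same mutex, the "T1 locks A then B, T2 locks B then A" ordering problem cannot arise.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a session that owns the only lock.
class ToySession {
public:
  void addToLinkOrder(const std::string &Dylib) {
    std::lock_guard<std::mutex> Lock(SessionMutex);
    LinkOrder.push_back(Dylib);
  }

  void registerInitSymbol(const std::string &Name) {
    assert(!Name.empty() && "init symbol needs a name");
    std::lock_guard<std::mutex> Lock(SessionMutex);
    InitSymbols.push_back(Name);
  }

  // Reads the link order and the init symbols as one atomic step under the
  // same lock that guards registration, so no second lock is ever needed.
  std::vector<std::string> collateInitializers() {
    std::lock_guard<std::mutex> Lock(SessionMutex);
    std::vector<std::string> Result = LinkOrder;
    Result.insert(Result.end(), InitSymbols.begin(), InitSymbols.end());
    return Result;
  }

private:
  std::mutex SessionMutex;
  std::vector<std::string> LinkOrder;
  std::vector<std::string> InitSymbols;
};
```

The trade-off is coarser locking in exchange for having no lock-order invariant to maintain between a session mutex and a platform mutex.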
The MU may define no symbols, but still contain a non-trivial destructor (e.g.
an LLVM IR module that has been stripped of all externally visible
definitions, but which still needs to lock its context to be destroyed).
Bailing out early ensures that we destroy the unit outside the session lock,
rather than under it which may cause deadlocks.
Also adds some extra sanity-checking assertions.
(This is D68010 but I also set the new parameter in LibStdcpp.cpp to fix
the Debian tests).
Summary:
Printing a summary for an empty NSPathStore2 string currently prints random bytes behind the empty string pointer from memory (rdar://55575888).
It seems the reason for this is that the SourceSize parameter in the `ReadStringAndDumpToStreamOptions` - which is supposed to contain the string
length - actually uses the length 0 as a magic value for saying "read as much as possible from the buffer" which is clearly wrong for empty strings.
This patch adds another flag that indicates whether we know the string length and makes this behaviour dependent on that flag (which seemingly
was the original purpose of this magic value).
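The difference between a magic zero length and an explicit flag can be illustrated with a small sketch. `ToyReadOptions`, `HasSourceSize` and `readString` are hypothetical names chosen for this example and are not the LLDB types or the flag name used by the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical options struct; the flag says whether SourceSize is meaningful.
struct ToyReadOptions {
  std::size_t SourceSize = 0;
  bool HasSourceSize = false;
};

// Reads exactly SourceSize bytes when the length is known (even if it is 0),
// and falls back to "as much as the buffer holds" only when it is unknown.
inline std::string readString(const char *Buffer, std::size_t BufferSize,
                              const ToyReadOptions &Opts) {
  assert((Buffer != nullptr || BufferSize == 0) && "null buffer with data");
  const std::size_t Length =
      Opts.HasSourceSize ? std::min(Opts.SourceSize, BufferSize) : BufferSize;
  return Length == 0 ? std::string() : std::string(Buffer, Length);
}
```

With the flag set and a size of 0, the result is an empty string rather than whatever bytes happen to follow the pointer.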
Reviewers: aprantl, JDevlieghere, shafik
Reviewed By: aprantl
Subscribers: christof, abidh, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D68010
In C++03 mode, nullptr is defined by libc++, not the compiler, so we can't use __is_fundamental (because it will return false for nullptr).
Fixes: 5ade17e0ca
Currently when an expression fails to parse and we have a FixIt, we keep
the failed UserExpression around while trying to parse the expression with
applied fixits. This means that we have this rather confusing control flow:
1. Original expression created and parsing attempted.
2. Expression with applied FixIts is created and parsing attempted.
3. Original expression is destroyed and parser deconstructed.
4. Expression with applied FixIts is destroyed and parser deconstructed.
This patch just deletes the original expression so that steps 2 and 3 are
swapped and the whole process looks more like just sequentially parsing two
expressions (which is what we actually do here).
Doesn't fix anything, just makes the code less fragile.
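The resulting sequential lifetime can be sketched with a toy type that logs its construction and destruction. `ToyExpression` and `parseWithFixIt` are hypothetical and only mirror the ordering described above, not the UserExpression code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical expression object that records when it is parsed and destroyed.
struct ToyExpression {
  ToyExpression(std::string N, std::vector<std::string> &L)
      : Name(std::move(N)), Log(L) {
    Log.push_back("parse " + Name);
  }
  ~ToyExpression() { Log.push_back("destroy " + Name); }

  std::string Name;
  std::vector<std::string> &Log;
};

// Destroys the failed original before the fixed expression is created, so the
// two lifetimes no longer overlap.
inline std::vector<std::string> parseWithFixIt() {
  std::vector<std::string> Log;
  {
    auto Original = std::make_unique<ToyExpression>("original", Log);
    Original.reset(); // drop the failed expression before retrying
    assert(!Original && "original must be gone before the retry");
    auto Fixed = std::make_unique<ToyExpression>("fixed", Log);
  } // Fixed is destroyed here, while Log is still alive
  return Log;
}
```

The log then reads: parse original, destroy original, parse fixed, destroy fixed.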
This patch updates <type_traits> to use builtin type traits whenever
possible to improve compile times.
Differential Revision: https://reviews.llvm.org/D67900
Summary:
Rollforward of
https://reviews.llvm.org/rGdd12826808f9079e164b82e64b0697a077379241 after
temporarily adding -fno-delayed-template-parsing to the TreeTest.
Original summary:
> Copy of https://reviews.llvm.org/D72334, submitting with Ilya's permission.
>
> Handles template declaration of all kinds.
>
> Also builds template declaration nodes for specializations and explicit
> instantiations of classes.
>
> Some missing things will be addressed in the follow-up patches:
>
> * specializations of functions and variables,
> * template parameters.
Reviewers: gribozavr2
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76418
Previously we multiplied the cost for the table entries by the number of splits needed. But that implies that each split goes through a reduction to scalar independently. I think what really happens is that we AND/OR the split pieces until we're down to a single value with a legal type and then do a special reduction sequence on that.
So to model that, this patch takes the number of splits minus one, multiplies it by the cost of an AND/OR at the legal element count, and adds that on top of the table lookup.
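As a worked sketch of the arithmetic, with hypothetical function and parameter names (the real code lives in the X86 cost model and is not reproduced here):

```cpp
#include <cassert>

// Old model: every split pays for the full table-driven reduction.
inline unsigned oldReductionCost(unsigned NumSplits, unsigned TableCost) {
  return NumSplits * TableCost;
}

// New model: combine the pieces with (NumSplits - 1) AND/OR operations at the
// legal type, then pay for one table-driven reduction of the remaining value.
inline unsigned newReductionCost(unsigned NumSplits, unsigned AndOrCost,
                                 unsigned TableCost) {
  assert(NumSplits >= 1 && "at least one legal-typed piece is required");
  return TableCost + (NumSplits - 1) * AndOrCost;
}
```

For example, with 4 splits, a table cost of 10 and an AND/OR cost of 1 (numbers chosen purely for illustration), the old model gives 40 and the new model gives 13.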
Differential Revision: https://reviews.llvm.org/D76400
A number of X86 tests were accidentally disabled in
https://reviews.llvm.org/D73568. This commit re-enables those tests.
```
$ for x86_test in $(gg 'REQUIRES: x86$' llvm/test | fst); do sed -i "" '/REQUIRES: x86/d' $x86_test; done
```
(Note that 'x86' is not an available feature, that's what caused the
tests to be disabled.)
The slli/srli/srai 'immediate' vector shifts (although the amount is no longer an immediate, to match gcc) can be replaced with generic shifts if the shift amount is known to be in range.
Summary: This patch adds tests for lowering multiple `gpu.all_reduce` operations in the same kernel. This was previously failing.
Differential Revision: https://reviews.llvm.org/D75930
Summary:
TestInlineStepping tests LLDB's ability to step in the presence of
inline frames. The testcase source has a number of functions and some
of them are marked `always_inline`.
The test is built around the assumption that the inline function will
be fully represented once inlined, but this is not true with the
current arm64 code generation. For example:
    void caller() {
      always_inline_function(); // Step here
    }
When stepping into `caller()` above, you might immediately end up in
the inlined frame for `always_inline_function()`, because there might
literally be no code associated with `caller()` itself.
This patch hacks around the issue by adding an `asm volatile("nop")`
on some lines with inlined calls where we expect to be able to
step. Like so:
    void caller() {
      asm volatile("nop"); always_inline_function(); // Step here
    }
This guarantees there is always going to be one instruction for this
line in the caller.
Reviewers: labath, jingham
Subscribers: kristof.beyls, danielkiss, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D76406
Summary:
TestBuiltinTrap fails on darwin embedded because the `__builtin_trap`
builtin doesn't get any line info attached to it by clang when
building for arm64.
The test was already XFailed for linux arm(64), I presume for the same
reasons. This patch just XFails it independently of the platform.
Reviewers: labath
Subscribers: kristof.beyls, danielkiss, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D76408
Suppress those diagnostics if lhs of a member expression contains
errors. Typo correction produces dependent expressions even in
non-template code, which previously led to spurious diagnostics.
previous:
/tmp/t.cpp:6:17: error: use 'template' keyword to treat 'f' as a dependent template name
auto a = bilder.f<int>();
^
template
/tmp/t.cpp:6:10: error: use of undeclared identifier 'bilder'; did you mean 'builder'?
auto a = bilder.f<int>();
^~~~~~
builder
vs now:
/tmp/t.cpp:6:10: error: use of undeclared identifier 'bilder'; did you mean 'builder'?
auto a = bilder.f<int>();
^~~~~~
builder
Original patch from Ilya.
Reviewers: sammccall
Reviewed By: sammccall
Tags: #clang
Differential Revision: https://reviews.llvm.org/D65592
`CheckerRegistry` registers a checker either if it is explicitly
enabled or if it is a dependency of an explicitly enabled checker and is
not explicitly disabled. In both cases it is also important that the
checker should be registered (`shouldRegister`//XXX//`()` returns true).
Currently there is a bug here: if the dependency checker is not
explicitly disabled, it is registered regardless of whether it should
be registered. This patch fixes this bug.
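The intended rule can be written as a small predicate. `ToyChecker` and `isRegistered` are hypothetical names for this sketch and do not reflect the actual `CheckerRegistry` data structures.

```cpp
#include <cassert>

// Hypothetical flattened view of one checker's state.
struct ToyChecker {
  bool ExplicitlyEnabled;
  bool ExplicitlyDisabled;
  bool IsDependencyOfEnabled; // a dependency of an explicitly enabled checker
  bool ShouldRegister;        // result of the checker's shouldRegister function
};

// A checker is registered only if it was requested (directly or as a
// dependency that is not disabled) AND its shouldRegister function agrees.
inline bool isRegistered(const ToyChecker &C) {
  // Assumption of this toy model: a checker is never both enabled and disabled.
  assert(!(C.ExplicitlyEnabled && C.ExplicitlyDisabled));
  const bool Requested = C.ExplicitlyEnabled ||
                         (C.IsDependencyOfEnabled && !C.ExplicitlyDisabled);
  return Requested && C.ShouldRegister;
}
```

The bug described above corresponds to dropping the `&& C.ShouldRegister` part for the dependency case.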
Differential Revision: https://reviews.llvm.org/D75842
Currently obj2yaml always emits the `EntSize` property when `sh_entsize != 0`.
This is not correct. For example, for a `SHT_DYNAMIC` section, `EntSize == 0`
is abnormal, while `sizeof(ELFT::Dyn)` is the expected default.
To reduce the output produced, we should not dump default values.
The yaml2obj tests that show the `sh_entsize` values produced are:
1) For `SHT_REL*` sections: `yaml2obj\ELF\reloc-sec-entry-size.yaml`
2) For `SHT_DYNAMIC`: `yaml2obj\ELF\dynamic-section.yaml`
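A minimal sketch of the "dump only non-default values" rule, assuming the defaults are the ELF entry sizes for the section types mentioned above and 0 for everything else. That last part is an assumption of this sketch, and the names `defaultEntSize` and `shouldDumpEntSize` are hypothetical rather than taken from obj2yaml.

```cpp
#include <cassert>
#include <cstdint>

// Section type values from the ELF specification.
constexpr uint32_t ShtRela = 4;
constexpr uint32_t ShtDynamic = 6;
constexpr uint32_t ShtRel = 9;

// Expected sh_entsize per section type; ElfBits selects ELF32 or ELF64.
inline uint64_t defaultEntSize(uint32_t Type, unsigned ElfBits) {
  assert((ElfBits == 32 || ElfBits == 64) && "unknown ELF class");
  const bool Is64 = ElfBits == 64;
  switch (Type) {
  case ShtDynamic:
    return Is64 ? 16 : 8; // sizeof(Elf64_Dyn) / sizeof(Elf32_Dyn)
  case ShtRel:
    return Is64 ? 16 : 8; // sizeof(Elf64_Rel) / sizeof(Elf32_Rel)
  case ShtRela:
    return Is64 ? 24 : 12; // sizeof(Elf64_Rela) / sizeof(Elf32_Rela)
  default:
    return 0; // assumption of this sketch
  }
}

// EntSize is emitted only when it differs from the expected default.
inline bool shouldDumpEntSize(uint32_t Type, uint64_t ShEntSize,
                              unsigned ElfBits) {
  return ShEntSize != defaultEntSize(Type, ElfBits);
}
```

Under this rule a 64-bit `SHT_DYNAMIC` section with `sh_entsize == 16` produces no `EntSize` key, while the abnormal value 0 is dumped.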
Differential revision: https://reviews.llvm.org/D76227
We do not have tests that show the current behavior.
Such tests are needed for D76227, which changes the logic of dumping `EntSize` fields.
Differential revision: https://reviews.llvm.org/D76282
Check the path length limit against the length of the UTF-16 version of
the input rather than the UTF-8 equivalent, as the UTF-16 length may be
shorter. Move widenPath from the llvm::sys::path namespace in Path.h to
the llvm::sys::windows namespace in WindowsSupport.h. Only use the
reduced path length limit for create directory. Canonicalize using
sys::path::remove_dots().
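To illustrate why the UTF-16 length can be shorter than the UTF-8 length, here is a small sketch that counts UTF-16 code units directly from UTF-8 bytes. It assumes well-formed UTF-8 input. `utf16Length` and `fitsPathLimit` are hypothetical helpers, not the functions in WindowsSupport.h, and the limit is passed in rather than hard-coded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the UTF-16 code units needed for a well-formed UTF-8 string.
inline std::size_t utf16Length(const std::string &Utf8) {
  assert((Utf8.empty() ||
          (static_cast<unsigned char>(Utf8[0]) & 0xC0) != 0x80) &&
         "input must not start with a continuation byte");
  std::size_t Units = 0;
  for (unsigned char Byte : Utf8) {
    if ((Byte & 0xC0) == 0x80)
      continue; // continuation byte, already counted with its lead byte
    // 4-byte sequences (lead byte >= 0xF0) need a surrogate pair in UTF-16;
    // every shorter sequence fits in a single code unit.
    Units += Byte >= 0xF0 ? 2 : 1;
  }
  return Units;
}

// Compares the UTF-16 length, not the UTF-8 byte count, against the limit.
inline bool fitsPathLimit(const std::string &Utf8Path,
                          std::size_t MaxUtf16Units) {
  return utf16Length(Utf8Path) <= MaxUtf16Units;
}
```

A path made of two-byte UTF-8 characters, for instance, is twice as long in bytes as it is in UTF-16 code units, so checking the byte count would reject paths that are actually within the limit.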
Differential Revision: https://reviews.llvm.org/D75372
Summary:
Treat each C# generic type constraint, `where T: ...`, as a line.
Add C# keyword: where
Add Token Types: CSharpGenericTypeConstraint, CSharpGenericTypeConstraintColon, CSharpGenericTypeConstraintComma.
This patch does not wrap generic type constraints well; that will be addressed in a follow-up patch.
Reviewers: krasimir
Reviewed By: krasimir
Subscribers: cfe-commits, MyDeveloperDay
Tags: #clang-format, #clang
Differential Revision: https://reviews.llvm.org/D76367
Summary:
In order to keep the names consistent with other SVE gather loads, the
intrinsics for gather prefetch are renamed as follows:
* @llvm.aarch64.sve.gather.prfb -> @llvm.aarch64.sve.prfb.gather
Reviewed by: fpetrogalli
Differential Revision: https://reviews.llvm.org/D76421
For PHIs with multiple incoming values, we can improve precision by
using constant ranges for integers. We can over-approximate phis
by merging the incoming values.
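The over-approximation can be sketched with a simplified inclusive integer range. `ToyRange` and `mergeRanges` are hypothetical; LLVM's own `ConstantRange` has different semantics (half-open, wrapping) and is not modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical inclusive range [Lo, Hi].
struct ToyRange {
  int64_t Lo;
  int64_t Hi;
};

// Merges two incoming ranges into the smallest single range covering both.
// This may include values that neither input contains, which is what makes
// it an over-approximation.
inline ToyRange mergeRanges(const ToyRange &A, const ToyRange &B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi && "ranges must be non-empty");
  return ToyRange{std::min(A.Lo, B.Lo), std::max(A.Hi, B.Hi)};
}

inline bool rangeContains(const ToyRange &R, int64_t Value) {
  return R.Lo <= Value && Value <= R.Hi;
}
```

Merging [0, 3] and [10, 12] gives [0, 12]: sound for the PHI, since every incoming value is covered, but it also admits 5, which no incoming value can take.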
Reviewers: davide, efriedma, mssimpso
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71933
We usually start error messages with lowercase letters and most of them
in llvm-dwp follow that rule. This patch fixes a few messages that
started with capital letters.
Differential revision: https://reviews.llvm.org/D76277