The encoding of COMPUTE_TMPRING_SIZE.WAVESIZE and
SPI_TMPRING_SIZE.WAVESIZE has changed in GFX11: it is now in units
of 64 dwords instead of 256 dwords, and the field has been widened
from 13 bits to 15 bits.
Depends on D126989
Reviewed By: rampitec, arsenm, #amdgpu
Differential Revision: https://reviews.llvm.org/D127248
Add pattern to hoist scalar code outside of warp distribute region as
those cannot be distributed and we would want to execute them on all
the lanes.
Add patterns to distribute transfer_write ops. Those operations can be
distributed in different ways and it is control by user.
Differential Revision: https://reviews.llvm.org/D127152
The DWARF version was raised to 5 for all platforms which do not opt
out. Default to DWARF version to 4 for z/OS again.
Reviewed By: abhina.sreeskantharajan, uweigand
Differential Revision: https://reviews.llvm.org/D127498
Building on D126796, this commit adds the infrastructure for being able
to print out descriptions of what each warning does.
Differential Revision: https://reviews.llvm.org/D126832
Previously all FILE objects were fully buffered, this patch adds line
buffering and unbuffered output, as well as applying them to stdout and
stderr.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D126829
The 1st try ( afa192cfb6 ) was reverted because it could
cause an infinite loop with constant expressions.
A test for that and an extra condition to enable the transform
are added now. I also added code comments to better describe
the transform and the existing, related transform.
Original commit message:
https://alive2.llvm.org/ce/z/hRy3rE
As shown in D123408, we can produce this pattern when moving
casts around, and we already have a related fold for a binop
with a constant operand.
This is reduced from the post-commit C example in:
afa192cfb6
That was reverted with:
6fedc6a2b4
This will infinite loop unless that transform or the ones
with casts that can "EvaluateInDifferentType" are limited
to ignore constant expressions.
Apparently, update_analyze_test_checks.sh does *not* warn on conflicting CHECKs, it just silently drops those lines from the generated test. That is.. less than helpful.
This prevents them from being assumed legal by the cost model.
This matches what is done for AArch64 SVE.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D123799
Tested by building the docs-lldb-html target and
confirming the code-block renders properly with
the fix.
Patch by Michael Buch!
Differential Revision: https://reviews.llvm.org/D127437
The only interesting test change is in @PR31262, where the following
fold is now performed, while it previously was not:
https://alive2.llvm.org/ce/z/a5Qmr6
llvm/test/Transforms/InstSimplify/ConstProp/gep.ll has not been
updated, because there is a tradeoff between folding and inrange
preservation there that we may want to discuss.
Updates have been performed using:
https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34
Currently the unchecked-optional-access model fails on this example:
#include <memory>
#include <optional>
void foo() {
std::unique_ptr<std::optional<float>> x;
*x = std::nullopt;
}
You can verify the failure by saving the file as `foo.cpp` and running this command:
clang-tidy -checks='-*,bugprone-unchecked-optional-access' foo.cpp -- -std=c++17
The failing `assert` is in the `transferAssignment` function of the `UncheckedOptionalAccessModel.cpp` file:
assert(OptionalLoc != nullptr);
The symptom can be treated by replacing that `assert` with an early `return`:
if (OptionalLoc == nullptr)
return;
That would be better anyway since we cannot expect to always cover all possible LHS expressions, but it is out of scope for this patch and left as a followup.
Note that the failure did not occur on this very similar example:
#include <optional>
template <typename T>
struct smart_ptr {
T& operator*() &;
T* operator->();
};
void foo() {
smart_ptr<std::optional<float>> x;
*x = std::nullopt;
}
The difference is caused by the `isCallReturningOptional` matcher, which was previously checking the `functionDecl` of the `callee`. This patch changes it to instead use `hasType` directly on the call expression, fixing the failure for the `std::unique_ptr` example above.
Reviewed By: sgatev
Differential Revision: https://reviews.llvm.org/D127434
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.
Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.
@beanz added it in d0e1c2a550 to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.
Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.
To avoid the problems I face, and restore symmetry, I deleted the
exported and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.
For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would good to confirm I did that correctly.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D117977
sources to SALU and VALU instructions.
Contributors:
Baptiste Saleil <baptiste.saleil@amd.com>
Patch 20/N for upstreaming of AMDGPU gfx11 architecture
Depends on D126989
Reviewed By: rampitec, foad, #amdgpu
Differential Revision: https://reviews.llvm.org/D127143
Add a basic implementation of isExtractSubvectorCheap that only
considers extracts at offset 0.
Differential Revision: https://reviews.llvm.org/D127385
In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`.
The idea is to get support from the compiler to easily write efficient memory function implementations.
This patch could be split in two:
- one for the LLVM part adding the `llvm.memset.inline.*` intrinsics.
- and another one for the Clang part providing the instrinsic as a builtin.
Differential Revision: https://reviews.llvm.org/D126903
It seems like I should have ran the `check-clang` target after introducing a new warning diagnostic entry.
My bad. I'll run it next time; `check-clang-analysis` was not enough.
Here is a link to the broken bot: http://45.33.8.238/linux/78236/step_7.txt
This commit should fix this.
Differential Revision: https://reviews.llvm.org/D126067
Fixes a regression on D127115 - splitting was creating extract_subvector(bitcast(build_vector())) patterns which prevented the build vectors being split before being bitcast to vXi16 types, resulting in various issues with further folding of the (now legal) build vectors