Some of the compiler-rt runtimes use custom instrumented libc++ build.
Use the runtimes build for building this custom libc++.
Differential Revision: https://reviews.llvm.org/D114922
adds the .yaml files clang-tidy generates as byproducts, which means
that they will be updated properly and cleaned by `ninja -t clean`
Reviewed By: lntue
Differential Revision: https://reviews.llvm.org/D115290
Both these preference helper functions have initial support with
this change. The loop unrolling preferences are set with initial
settings to control thresholds, size and attributes of loops to
unroll with some tuning done. The peeling preferences may need
some tuning as well as the initial support looks much like what
other architectures utilize.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D113798
Prepare amdgpu plugin for asynchronous implementation. This patch switches to using HSA API for asynchronous memory copy.
Moving away from hsa_memory_copy means that plugin is responsible for locking/unlocking host memory pointers.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D115279
This patch is part of the upstreaming effort from the fir-dev branch.
Address review comments
- move CHECK blocks to after the mlir code in the test file
- fix style with respect to anonymous namespaces: only include class definitions in the namespace and make functions static and outside the namespace
- fix a few nits
- remove TODO in favor of notifyMatchFailure
- removed unnecessary CHECK line from convert-to-llvm.fir
- rebase on main - add TODO back in
- get successfull test of TODO in AllocMemOp converion of derived type with LEN params
- clearer comments and reduced use of auto
- move defintion of computeDerivedTypeSize to fix build error
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
Reviewed By: awarzynski, clementval, kiranchandramohan, schweitz
Differential Revision: https://reviews.llvm.org/D114104
This does mostly the same as D112126, but for the runtimes cmake files.
Most of that is straightforward, but the interdependency between
libcxx and libunwind is tricky:
Libunwind is built at the same time as libcxx, but libunwind is not
installed yet. LIBCXXABI_USE_LLVM_UNWINDER makes libcxx link directly
against the just-built libunwind, but the compiler implicit -lunwind
isn't found. This patch avoids that by adding --unwindlib=none if
supported, if we are going to link explicitly against a newly built
unwinder anyway.
Reapplying this after
db32c4f456, which should fix the issues
that were reported last time this was applied.
Differential Revision: https://reviews.llvm.org/D113253
The XRebox Op is formed by the codegen rewrite which makes it easier to
convert the operation to LLVM. The XRebox op includes the information
from the rebox op and the associated slice, shift, and shape ops.
During the conversion process a new descriptor is created for reboxing.
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Val Donaldson <vdonaldson@nvidia.com>
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D114709
clang has `= default` as an extension in c++03, so just use it.
Reviewed By: ldionne, Quuxplusone, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D115275
At present, amdgpu plugin merges both asynchronous and synchronous kernel launch implementations into a single synchronous version.
This patch prepares the plugin for asynchronous implementation by:
- Privatizing actual kernel launch code (valid in both cases) into an anonymous namespace base function
Actual separation of kernel launch code (async vs sync) is a following patch.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D115267
We avoid this fold in the more general cases where we use `FoldOpIntoSelect`.
That's because -- unlike most binary opcodes -- 'rem' can't usually be
speculated with a variable divisor since it can have immediate UB. But in
the case where both arms of the select are constants, we can safely evaluate
both sides and eliminate 'rem' completely.
This should fix:
https://llvm.org/PR52102
The same optimization for 'div' is planned as a follow-up patch.
Differential Revision: https://reviews.llvm.org/D115173
In the "runtimes" setup, the runtime (e.g. OpenMP) can be built for
a target entirely different from the current host build (where LLVM
and Clang are built). If profiling is enabled, libomptarget links
against LLVMSupport (which only has been built for the host).
Thus, don't enable profiling by default in this setup.
This should allow relanding D113253.
Differential Revision: https://reviews.llvm.org/D114083
This patch adds the runtime function to allocate and
deallocate ragged arrays.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D114534
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
A series of unary operators and casts may obscure the variable we're
trying to analyze. Ignore them for the uninitialized value analysis.
Other checks determine if the unary operators result in a valid l-value.
Link: https://github.com/ClangBuiltLinux/linux/issues/1521
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D114848
Before this patch, the new test's `CountedInvocable<int*, int*>`
would hard-error instead of SFINAEing and cleanly returning false.
Notice that views::counted specifically does NOT work with pipes;
`counted(42)` is ill-formed. This is because `counted`'s first argument
is supposed to be an iterator, not a range.
Also, mark `views::counted(it, n)` as [[nodiscard]], and test that.
(We have a general policy now that range adaptors are consistently
marked [[nodiscard]], so that people don't accidentally think that
they have side effects. This matters mostly for `reverse` and
`transform`, arguably `drop`, and just generally let's be consistent.)
Differential Revision: https://reviews.llvm.org/D115177
This revision implements sparse outputs (from scratch) in all cases where
the loops can be reordered with all but one parallel loops outer. If the
inner parallel loop appears inside one or more reductions loops, then an
access pattern expansion is required (aka. workspaces in TACO speak).
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D115091
This patch applies the lint rules described in the previous patch. There
was also a significant amount of effort put into manually fixing things,
since all of the templated functions, or structs defined in /spec, were
not updated and had to be handled manually.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D114302
This commit changes the clang-tidy rules for LLVM-libc to follow the new
format. The next commit applies these rules to the codebase.
The rules are as follows:
CamelCase for classes
lower_case for variables
lower_case for functions
UPPER_CASE for constexpr variables
There are also some exceptions, but the most important one is that any
function or variable that starts with an underscore is exempt from the
formatting.
Reviewed By: sivachandra, lntue
Differential Revision: https://reviews.llvm.org/D114301
This is a split of D113724. Calling `TypeSystemClang::AddMethodToCXXRecordType`
to create function decls for class methods.
Differential Revision: https://reviews.llvm.org/D113930
It is required for the [Leak Sanitizer port to Windows](https://reviews.llvm.org/D115103).
The currently used `unsigned long` type is 64 bits wide on UNIX like systems but only 32 bits wide on Windows.
Because of that, the literal `8UL << 30` causes an integer overflow on Windows.
By changing the type of the literals to `unsigned long long`, we have consistent behavior and no overflows on all Platforms.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D115186