The paciasp and autiasp instructions are only accepted by recent
compilers, but have the same encoding as hint instructions, so we can
use the hint menmonic to support older compilers.
If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt.
The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner.
Depends On D98279
Reviewed By: herhut, rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D98203
For locally scoped lambdas like this there's no particular benefit to
explicitly listing captures - or avoiding capturing this. Switch to [&]
and make it all easier to maintain.
(& driveby change std::function to llvm::function_ref)
Commit 1b04bdc2f3 added support for
capturing the 'this' pointer in a SEH context (__finally or __except),
But the case in which the 'this' pointer is part of a lambda capture
was not handled properly
Differential Revision: https://reviews.llvm.org/D97687
This avoids the `__NR_gettimeofday` syscall number, which does not exist on 32-bit musl (it has `__NR_gettimeofday_time32`).
This switched Android to `clock_gettime` as well, which should work according to the old code before D96925.
Tested on Alpine Linux x86-64 (musl) and FreeBSD x86-64.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D98121
This is no longer needed, we can add __llvm_profile_runtime directly
to llvm.compiler.used or llvm.used to achieve the same effect.
Differential Revision: https://reviews.llvm.org/D98325
All check-tsan tests fail on aarch64-*-linux because HeapMemEnd() > ShadowBeg()
for the following code path:
```
#if defined(__aarch64__) && !HAS_48_BIT_ADDRESS_SPACE
ProtectRange(HeapMemEnd(), ShadowBeg());
```
Restore the behavior before D86377 for aarch64-*-linux.
This errors, but doesn't give source location. We'd need to pass
the Record through several layers to get to the location.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D98379
Demonstrate how to generate vadd/vfadd intrinsic functions
1. add -gen-riscv-vector-builtins for clang builtins.
2. add -gen-riscv-vector-builtin-codegen for clang codegen.
3. add -gen-riscv-vector-header for riscv_vector.h. It also generates
ifdef directives with extension checking, base on D94403.
4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.
Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface
5. update tblgen doc for riscv related options.
riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need
them for the whole intrinsic functions implementation in the future.
Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>
Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos
Differential Revision: https://reviews.llvm.org/D95016
As we may overwrite inactive lanes of a caller-save-vgpr, we should
always save/restore the reserved vgpr for sgpr spill.
Reviewed by: arsenm
Differential Revision: https://reviews.llvm.org/D98319
The current implementation has some inefficiencies that become noticeable when running on large modules. This revision optimizes the code, and updates some out-dated idioms with newer utilities. The main components of this optimization include:
* Add an overload of Block::eraseArguments that allows for O(N) erasure of disjoint arguments.
* Don't process entry block arguments given that we don't erase them at this point.
* Don't track individual operation results, given that we don't erase them. We can just track the parent operation.
Differential Revision: https://reviews.llvm.org/D98309
Initially, this flag was meant to only be used through cc1 and not directly
through the clang driver. However, we accidentally ended up using this flag
as a driver flag already for selecting multilibs within the fuchsia toolchain.
We're currently in an awkward state where it's only accepted as a driver flag
when targeting Fuchsia, and all other instances it can only be added via
-Xclang. Since we're ready to use this in Fuchsia, we can just expose this to
the driver for simplicity.
Differential Revision: https://reviews.llvm.org/D98375
This reverts commit bacf9cf2c5 and
reinstates commit 1a9bd5b813.
Reverting this commit did not appear to make the problem go away, so we
can go ahead and reland it.
Updates __is_unsigned to have the same behavior as the standard
specifies. This is in line with 511dbd8, which applied the same change
to __is_signed.
Refs D67897.
Differential Revision: https://reviews.llvm.org/D98104
Generate a json file containing descriptions of AST classes and their
public accessors which return SourceLocation or SourceRange.
Use the JSON file to generate a C++ API and implementation for accessing
the source locations and method names for accessing them for a given AST
node.
This new API can be used to implement 'srcloc' output in clang-query:
http://ce.steveire.com/z/m_kTIo
In this first version of this feature, only the accessors for Stmt
classes are generated, not Decls, TypeLocs etc. Those can be added
after this change is reviewed, as this change is mostly about
infrastructure of these code generators.
Differential Revision: https://reviews.llvm.org/D93164
Right now, the createTargetMachine function in LTOBackend.cpp (used by llvm-lto, and other components) selects the default Relocation Model when none is specified in the module.
Other components (such as opt and llc) that construct a TargetMachine delegate the decision on the default value to the polymorphic TargetMachine's constructor.
This commit aligns llvm-lto with other components.
Reviewed By: daltenty, fhahn
Differential Revision: https://reviews.llvm.org/D97507
We previously have lowering for:
vecreduce.add(zext(X)) to vecreduce.add(UDOT(zero, X, one))
This extends that to also handle:
vecreduce.add(mul(zext(X), zext(Y)) to vecreduce.add(UDOT(zero, X, Y))
It extends the existing code to optionally handle a mul with equal
extends.
Differential Revision: https://reviews.llvm.org/D97280
This patch makes uses of the context bridges introduced in D83299 to make
AAValueConstantRange call site specific.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D83744
Ignore `-Wreturn-type-c-linkage` diagnostics for `LLDBSwigPythonBreakpointCallbackFunction`.
The function is defined in `python-wrapper.swig` which uses `extern "C" { ... }` blocks.
The declaration of this function in `ScriptInterpreterPython.cpp` already uses these
same pragmas to silence the warning there.
This prevents `-Werror` builds from failing.
Differential Revision: https://reviews.llvm.org/D98368
GetXcodeSDK() consistently takes over 1 second to complete if the
queried SDK is missing, because `xcrun` doesn't cache negative lookups.
Because there are multiple simulator platforms, this can add 4+ seconds
to `lldb -b some_object_file.o`.
To work around this, skip the call to GetXcodeSDK() when setting up
simulator platforms if the specified arch doesn't have what looks like a
simulator triple.
Some other ways to fix this:
- Fix caching in xcrun (rdar://74882205)
- Test for arch compat before calling SomePlatform::CreateInstance() (much
larger change)
Differential Revision: https://reviews.llvm.org/D98272
Add support to widen select instructions in VPlan native path by using a correct recipe when such instructions are encountered. This is already used by inner loop vectorizer.
Previously select instructions get handled by the wrong recipe and resulted in unreachable instruction errors like this one: https://bugs.llvm.org/show_bug.cgi?id=48139.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D97136
The patch adds an argument to update_cc_test_checks for replacing a function name matching a regex. This functionality is needed to match generated function signatures that include file hashes. Example:
The function signature for the following function:
`__omp_offloading_50_b84c41e__Z9ftemplateIiET_i_l30_worker`
with `--replace-function-regex "__omp_offloading_[0-9]+_[a-z0-9]+_(.*)"` will become:
`CHECK-LABEL: @{{__omp_offloading_[0-9]+_[a-z0-9]+__Z9ftemplateIiET_i_l30_worker}}(`
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D97107