I've changed the definition of the experimental.vector.splice
instrinsic to reject indices that are known to be or possibly
out-of-bounds. In practice, this means changing the definition so that
the index is now only valid in the range [-VL, VL-1] where VL is the
known minimum vector length. We use the vscale_range attribute to
take the minimum vscale value into account so that we can permit
more indices when the attribute is present.
The splice intrinsic is currently only ever generated by the vectoriser,
which will never attempt to splice vectors with out-of-bounds values.
Changing the definition also makes things simpler for codegen since we
can always assume that the index is valid.
This patch was created in response to review comments on D115863
Differential Revision: https://reviews.llvm.org/D115933
These tests were recently added in
50ec1306d0. The clang-cl invocations
interpret the source path, %s, which begins with /Users, as a cl
option /U. Normally this is worked around by passing -- before
the arguments that must be interpreted as files, not options, but
in the current test, the -link option must be passed last, after any
file names.
D115722 added a DL spec to GPU modules. It happens that the DL
default interface implementation is sensitive to the name of the
DL spec attribute. This patch is fixing the name of the attribute
to be the expected one.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D116956
When calling emitArrayDestroy(), the pointer will usually have
ConvertTypeForMem(EltType) as the element type, as one would expect.
However, globals with initializers sometimes don't use the same
types as values normally would, e.g. here the global uses
{ double, i32 } rather than %struct.T as element type.
Add an early cast to the global destruction path to avoid this
special case. The cast would happen lateron anyway, it only gets
moved to an earlier point.
Differential Revision: https://reviews.llvm.org/D116219
Loading constants inline is expensive on CSKY and it's in general better
to place the constant nearby in code space and then it can be loaded with a
simple 16/32 bit load instruction like lrw.
It needs lift or duplicates constant pool entry to make constant nearby so that lrw instruction can reach.
When `Zbt` is enabled, we can generate SELECT for division by power
of 2, so that there is no data dependency.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D114856
When passing a set of flags to configure defaults for a specific
target (similar to the cmake settings `CLANG_DEFAULT_RTLIB`,
`CLANG_DEFAULT_UNWINDLIB`, `CLANG_DEFAULT_CXX_STDLIB` and
`CLANG_DEFAULT_LINKER`, but without hardcoding them in the binary),
some of the flags may cause warnings (e.g. `-stdlib=` when compiling C
code). Allow requesting selectively ignoring unused arguments among
some of the arguments on the command line, without needing to resort
to `-Qunused-arguments` or `-Wno-unused-command-line-argument`.
Fix up the existing diagnostics.c testcase. It was added in
response to PR12181 to fix handling of
`-Werror=unused-command-line-argument`, but the command line option
in the test (`-fzyzzybalubah`) now triggers "error: unknown argument"
instead of the intended warning. Change it into a linker input
(`-lfoo`) which triggers the intended diagnostic. Extend the
existing test case to check more cases and make sure that it keeps
testing the intended case.
Add testing of the new option to this existing test.
Differential Revision: https://reviews.llvm.org/D116503
Summary: Since yaml2obj now supports converting AuxSymbols, we can use YAMLs as input for symbol-related tests of llvm-readobj instead of binaries.
Reviewed By: jhenderson, shchenz
Differential Revision: https://reviews.llvm.org/D114434
When `file->fetch(sym)` is replaced with a no-op, no test fails.
The new test catches the case.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D116916
For vmsgeu.vi with 0, we know this is always true. So we can replace
it with vmset.m (unmasked) or vmset.m+vmand.mm (masked).
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116584
Now the backend will select ~0 vl to a register and load instruction, we could use X0 to replace it.
Differential Revision: https://reviews.llvm.org/D116798
Now AND is used for zero extension when both Zbb and Zbp are not enabled.
It may be better to use shift operation if the trailing ones mask exceeds simm12.
This patch optimzes LUI+ADDI+AND to SLLI+SRLI.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116720
When we want to create an splat vector that only the first element is initialized, we could use vmv.s.x or vfmv.s.f to build it.
Differential Revision: https://reviews.llvm.org/D116277
Two crashes have been reported. This change disables the new logic while leaving the new node in tree. Hopefully, that's enough to allow investigation without breakage while avoiding massive churn.
The compiler would crash if we lookup for name in transparent decl
context. See the tests attached for example.
I think this should make sense since the member declared in transparent
DeclContext are semantically defined in the enclosing (non-transparent)
DeclContext, this is the definition for transparent DeclContext.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D116792
As noted in post-commit review, the value of the bits between
the saturating type and result type were not defined defined. Define
them to be sign extended for FP_TO_SINT_SAT and zero extended for
FP_TO_UINT_SAT.
This mimics the style of 90010c2e1 (Don't consider 'LinkageSpec' when
calculating DeclContext 'Encloses'). Since ExportDecl and LinkageSpec
are transparent DeclContext, they share some similarity.
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D116911
This patch introduces a new directive that allow to parse/print attributes and types fully
qualified.
This is a follow-up to ee0908703d which introduces the eliding of the `!dialect.mnemonic` by default and allows to force to fully qualify each type/attribute
individually.
Differential Revision: https://reviews.llvm.org/D116905
"driver <flags> -- <input>" is a particularly convenient form of the
compile command to manipulate, with fewer special cases to handle.
Guaranteeing that the output command is of that form is cheap and makes
it easier to consume the result in some cases.
Differential Revision: https://reviews.llvm.org/D116721
When a -W<diag> option is given on the command line, and the corresponding diagnostic has
the NoWarnOnError flag set, prevent the flag from being dropped when the severity is reevaluated.
This fixes PR51837.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D109981
... even when `!defined(_LIBCPP_VERSION)`. (Note that the previous definition for this case - `((void)0);` - is ill-formed at namespace scope.) Ditto for `LIBCPP_ASSERT`, `LIBCPP_ASSERT_NOEXCEPT`, `LIBCPP_ASSERT_NOT_NOEXCEPT`, and `LIBCPP_ONLY`.
Differential Revision: https://reviews.llvm.org/D116880
Because of commit: https://reviews.llvm.org/D104291 the -gmodules .pcm
files do not have the same DW_AT_language dialect as the .o file. This
was a simple matter of passing the DebugStrictDwarf flag to the
PCHContainerGenerator object's CodeGenOpts from the CompilerInstance
passed in to it.
Before this change if you ran dwarfdump on the gmodule cache folder
you would get DW_AT_language (DW_LANG_C_plus_plus) even when using
-std=c++14 with clang
Patch by Shubham Rastogi!
Differential Revision: https://reviews.llvm.org/D116790
Add mmap and munmap to the linux aarch64 entrypoint list as the first
user of these syscalls.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D116949
Not all allocation functions are removable if unused. An example of a non-removable allocation would be a direct call to the replaceable global allocation function in C++. An example of a removable one - at least according to historical practice - would be malloc.
getNumberOfRegisters takes a ClassID as it's argument. It shouldn't be passed a bool. Assuming the bool meant vector or not, we should call getRegisterClassForType first.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D116903
Break up the huge function by extracting a class, storing intermediate
state as class members and breaking up the big function into a group
of class methods all at the same level of abstraction.
Differential Revision: https://reviews.llvm.org/D56343
Currently when -fgpu-rdc is specified, HIP toolchain always does host linking even
if --cuda-device-only is specified.
This patch fixes that. Only device linking is performed when --cuda-device-only
is specified.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D116840
When the unroll factor is 1, we should only fail "unrolling" when the trip count also is determined to be 1 and it is unable to be promoted.
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D115365