Replace custom constant scalar/splat folding with FoldConstantArithmetic call and canonicalize commutative constant ops to the RHS before the SimplifyVBinOp call
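Roughly, the new shape is the following (a hedged sketch, not the actual DAGCombiner code; it assumes the usual combine context with `N0`, `N1`, `Opcode`, `DL`, `VT`, `DAG`, and `TLI` in scope):
```
// First try the generic constant folder, then move a lone constant operand of
// a commutative op to the RHS so later folds only need to check one side.
if (SDValue C = DAG.FoldConstantArithmetic(Opcode, DL, VT, {N0, N1}))
  return C;
if (TLI.isCommutativeBinOp(Opcode) &&
    DAG.isConstantIntBuildVectorOrConstantInt(N0) &&
    !DAG.isConstantIntBuildVectorOrConstantInt(N1))
  return DAG.getNode(Opcode, DL, VT, N1, N0);
```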
The inline keyword is required on those functions because they are defined
in the headers; without it, every translation unit that includes the headers
would emit its own definition and we'd get ODR violations.
While we're at it, slap _LIBCPP_HIDE_FROM_ABI on them because they are
implementation details and we don't want them to be part of our ABI under
any circumstances.
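For illustration, a minimal sketch of the pattern (the helper name here is made up, not one of the actual functions):
```
#include <__config>

_LIBCPP_BEGIN_NAMESPACE_STD

// Hypothetical header-defined helper: `inline` allows the definition to appear
// in every translation unit that includes the header without violating the
// ODR, and _LIBCPP_HIDE_FROM_ABI keeps it out of the library's ABI.
_LIBCPP_HIDE_FROM_ABI inline bool __is_even(int __x) { return __x % 2 == 0; }

_LIBCPP_END_NAMESPACE_STD
```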
Differential Revision: https://reviews.llvm.org/D115906
We were being wildly inconsistent about what memory access was implied by an indirect function call. Depending on the call site attributes, you could get anything from a read, to unknown, to none at all. (The last was a miscompile.)
We were also always traversing the uses of a readonly indirect call. This is entirely unneeded as the indirect call does not capture. The callee might capture itself internally, but that has no implications for this caller. (See the nice explanation in the CaptureTracking comments if that case is confusing.)
Note that elsewhere in the same file, we were correctly computing the nocapture attribute for indirect calls. The changed case only resulted in conservatism when computing memory attributes if, say, the return value was written to.
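As a hypothetical C++ illustration of why the call-site attributes matter here (not code from the patch):
```
// Without attributes on the call site, the indirect call may read and write
// memory, so the inference must assume both; a `readonly` call site still
// justifies a read, never "no access at all".
int call_through(int (*FnPtr)(int *), int *P) {
  return FnPtr(P); // what this implies for *P depends on call-site attributes
}
```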
Differential Revision: https://reviews.llvm.org/D115916
This patch partially resolves an issue for VLS code generation
where a mask is generated from a smaller width integer comparison
than the instruction using the mask requires.
Instead of sign extending a p register by converting it to a z
register, extending that, and converting back, we now simply
unpack the p register.
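A hypothetical source-level pattern of the kind affected (illustrative only, not a test from the patch):
```
// The mask comes from a comparison on 8-bit elements but selects between
// 32-bit elements, so the predicate has to be widened before use.
void select_wide(int *Dst, const signed char *Cond, const int *A, const int *B,
                 int N) {
  for (int I = 0; I < N; ++I)
    Dst[I] = Cond[I] != 0 ? A[I] : B[I];
}
```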
A separate issue causes the code generation to still be poor when
the mask generation would fit in a neon register, as we then use
a neon comparison operation and have to convert that to a p register.
This will be resolved in a separate patch.
Reviewed By: peterwaller-arm
Differential Revision: https://reviews.llvm.org/D111221
There are a number of places that specially handle loads from a
uniform value where all the bits are the same (zero, one, undef,
poison), because a) we don't care about the load offset in that
case and b) this bypasses casts that might not be legal generally
but do work with uniform values.
We had multiple implementations of this, with a different set of
supported values each time, as well as incomplete type checks in
some cases. In particular, this fixes the assertion reported in
https://reviews.llvm.org/D114889#3198921, as well as a similar
assertion that could be triggered via constant folding.
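A minimal sketch of the shared idea (this is not the actual helper; the name and the exact set of supported types are assumptions):
```
#include "llvm/IR/Constants.h"
#include "llvm/IR/Type.h"

using namespace llvm;

// If every bit of C is identical (zero, all-ones, undef, poison), a load of
// any type from anywhere inside C can be folded to that "uniform" value
// re-expressed in the load's type, regardless of offset or casts.
static Constant *foldLoadFromUniformValue(Constant *C, Type *Ty) {
  if (isa<PoisonValue>(C))
    return PoisonValue::get(Ty);
  if (isa<UndefValue>(C))
    return UndefValue::get(Ty);
  if (C->isNullValue())
    return Constant::getNullValue(Ty);
  if (C->isAllOnesValue() &&
      (Ty->isIntOrIntVectorTy() || Ty->isFPOrFPVectorTy()))
    return Constant::getAllOnesValue(Ty);
  return nullptr;
}
```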
Differential Revision: https://reviews.llvm.org/D115924
I missed the async info parameter in the first version of this API.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D115887
Preserve the invariant that the offset reported in the case of a
`PartialAlias` between `Loc1` and `Loc2` is such that
`Loc1 + Offset = Loc2`, where `Loc1` and `Loc2` are the first and
the second argument, respectively, in alias queries.
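For example (a hypothetical illustration, not API usage from the patch): if `Loc1` covers bytes [0, 8) of a buffer and `Loc2` covers bytes [4, 12) of the same buffer, the two locations partially alias and the reported offset is 4, since `Loc1 + 4 == Loc2`.
```
#include <cstdint>
#include <cstring>

// Two overlapping 8-byte accesses into the same buffer: the first starts at
// Buf + 0 (Loc1), the second at Buf + 4 (Loc2), so a PartialAlias result
// should report Offset = 4.
void overlap(const char *Buf, uint64_t &A, uint64_t &B) {
  std::memcpy(&A, Buf, 8);      // Loc1
  std::memcpy(&B, Buf + 4, 8);  // Loc2
}
```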
Differential Revision: https://reviews.llvm.org/D115927
Fix a mistake in 9bf917394eba3ba4df77cc17690c6d04f4e9d57f: sret
arguments use ConvertType, not ConvertTypeForMem, see the handling
in CodeGenTypes::GetFunctionType().
This fixes fp-matrix-pragma.c on s390x.
This extends the custom lowering for truncating stores on
fixed length vectors in SVE to support masked truncating stores.
It also adds a DAG combine for truncates followed by masked
stores.
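A hypothetical C++ loop of the shape that can now use this path when vectorized for SVE fixed-length vectors (illustrative only, not a test from the patch):
```
// Under the mask Cond[I] != 0, each 32-bit element is truncated to 16 bits
// and stored: a masked truncating store after vectorization.
void masked_trunc_store(short *Dst, const int *Src, const int *Cond, int N) {
  for (int I = 0; I < N; ++I)
    if (Cond[I] != 0)
      Dst[I] = static_cast<short>(Src[I]);
}
```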
Reviewed By: peterwaller-arm, paulwalker-arm
Differential Revision: https://reviews.llvm.org/D108115
Matches llvm's and clang's /test/CMakeLists.txt, makes it easier to
see in diffs which deps get added, and makes it easier to see if
a given dependency is present or not.
No behavior change.
AARCH64_GET_REG() is used to initialize uptrs, and after D79132
the ptrauth branch of its implementation explicitly casts to uptr.
The non-ptrauth branch returns ucontext->uc_mcontext->__ss.__fp (etc),
which has either type void* or __uint64_t (ref usr/include/mach/arm/_structs.h)
where __uint64_t is an unsigned long long (ref usr/include/arm/_types.h).
uptr is an unsigned long (ref
compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h). So explicitly
cast to uptr in this branch as well, so that AARCH64_GET_REG() has a
well-defined type.
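The shape of the change on the non-ptrauth branch, sketched from the description above (not a verbatim copy of the compiler-rt macro):
```
// Non-ptrauth branch: __ss.__fp / __ss.__lr / ... come back as void* or
// __uint64_t, so cast to uptr explicitly to give the macro one result type.
#define AARCH64_GET_REG(r) ((uptr)(ucontext->uc_mcontext->__ss.__##r))
```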
Then change DUMPREGA64() to use %lx instead of %llx since that's the right type
for uptr. (Most other places in compiler-rt print uptrs as %p and cast the arg
to (void*), but there are explicit 0x%016 format strings in the surroundings,
so be locally consistent with that.)
No behavior change; in the end it's all just 64-bit unsigned values under
slightly different names.
As requested in the review, this implements unary +, -, ~, and ! for
vector types.
All of our boolean operations on vector types should be using something
like vcmpeqd, which results in a mask of '-1' for the 'truth' type. We are
currently instead using '1', which results in some incorrect
calculations when used later (note that it does NOT result in a boolean
vector, as that is not really a thing).
This patch corrects that 1 to be a -1, and updates the affected tests.
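As a quick illustration using the vector extensions (not a test from the patch):
```
typedef int v4si __attribute__((vector_size(16)));

// Element-wise comparisons on vectors yield 0 or -1 (all bits set) per lane,
// so the "true" value for vector boolean operations must be -1, not 1.
v4si mask_example(v4si a, v4si b) {
  return a == b;  // e.g. {1,2,3,4} == {1,0,3,0} gives {-1, 0, -1, 0}
}
```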
Differential Revision: https://reviews.llvm.org/D115670
D111441 included trunc isel patterns for sve_int_pred_pattern_a
but no accompanying tests. This patch adds the missing tests and
also simplifies the isel patterns that use sve_cnt_shl_imm.
Differential Revision: https://reviews.llvm.org/D115512
This setting is for variables we want to pass to the emulator only --
they will be automatically removed from the target environment by our
environment diffing code. This variable can be used to pass various
QEMU_*** variables (although most of these can be passed through
emulator-args as well), as well as any other variables that can affect
the operation of the emulator (e.g. LD_LIBRARY_PATH).
SimplifyVBinOp still has a FoldConstantArithmetic call, which, now that it isn't vector specific, we should be able to remove (once fp binops are tidied up); but we can at least clean up the integer opcodes to perform the basic constant/undef handling in common code first.
78d15a112c adds llvm/test/DebugInfo/Generic/type-units-maybe-unused-types.ll
Move the test into llvm/test/DebugInfo/X86 and add -mtriple=x86_64-linux-gnu
because not all platforms support type units.
Example of failing bot: type-units-maybe-unused-types.ll
Original review: https://reviews.llvm.org/D115325
Summary: Refactor the return value of `StoreManager::attemptDownCast` by removing the last parameter `bool &Failed` and replacing the return value `SVal` with `Optional<SVal>`. Make the function consistent with the `evalDerivedToBase` family by renaming it to `evalBaseToDerived`. Align the code at the call sites with these changes.
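A hedged sketch of the interface change (the parameter lists are illustrative guesses; only the renaming and the Optional return type come from the change description):
```
#include "clang/AST/Type.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
#include "llvm/ADT/Optional.h"

using namespace clang;
using namespace clang::ento;

struct StoreManagerSketch {
  // Old shape: SVal return plus a `bool &Failed` out-parameter.
  SVal attemptDownCast(SVal Base, QualType DerivedType, bool &Failed);
  // New shape: failure is signalled by the empty Optional instead of a flag.
  llvm::Optional<SVal> evalBaseToDerived(SVal Base, QualType DerivedType);
};
```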
Differential Revision: https://reviews.llvm.org/
The LOD bias operand is of type 'half' when the A16 bit is ON for MIMG instructions.
'bias' is only 16 bits but occupies 32 bits, with the upper 16 bits containing junk.
The patch fixes both paths (ISelDAG and GlobalISel) to properly encode the LOD bias operand.
Differential Revision: https://reviews.llvm.org/D111754
This patch extends the `FIRToLLVMLowering` pass in Flang by extending
the hook to transform `fir.coordinate_of` into a sequence of LLVM MLIR
instructions (i.e. `CoordinateOpConversion::doRewrite`). The following
case is added:
3.1 the input object is inside `!fir.ref` (e.g. `!fir.ref<!fir.array>` or
`!fir.ref<!fir.type>`).
3.2 the input object is inside `!fir.ptr` (e.g. `!fir.ptr<!fir.array>` or
`!fir.ptr<!fir.type>`).
From the point of view of the conversion, 3.1 and 3.2 are currently identical.
This is part of the upstreaming effort from the `fir-dev` branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project
Originally written by:
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Depends on: D114159
Differential Revision: https://reviews.llvm.org/D115333
Pass in the vector header instead of relying on ILV::LoopVectorBody.
This reduces the dependence on state from ILV. Where VPTransformState is
available, State.CFG.PrevBB can be used.
Fixes https://llvm.org/PR51087: Extraneous enum record in DWARF with type units.
As explained in PR51087 we sometimes get skeleton DIEs for enums in a Dwarf
Compile Unit (CU) that are not referenced from any CU and are already described
by a type unit.
Types for enums are emitted whether used or not, all together before most types
in the CU. Mechanically, the extraneous CU records are generated because the
enum types are generated with a call to CU->getOrCreateTypeDIE. This function
will recursively get-or-create the parent DIE (in the CU) and the type unit for
each. We don't need the CU-side DIEs if the type units are successfully
emitted. Fix by only emitting the type units for enums if possible, falling back
to a call to getOrCreateTypeDIE if not. Do the same for retained types.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D115325
With this change, the following invocations will be treated as errors
(multiple actions are specified):
```
$ flang-new -fc1 -E -fsyntax-only file.95
$ flang-new -fc1 -fsyntax-only -fdebug-dump-symbols file.95
```
In the examples above it is not clear whether it is `-fsyntax-only` or
the other action that is run (i.e. `-E` or `-fdebug-dump-symbols`). It
makes sense to disallow such usage. This should also lead to cleaner and
clearer tests (the `RUN` lines using `%flang_fc1` will only allow one
action).
This change means that `flang-new -fc1` and `clang -cc1` will behave
differently when multiple action options are specified. As frontend
drivers are mostly used by compiler developers, this shouldn't affect or
confuse the compiler end-users. Also, `flang-new` and `clang` remain
consistent.
Tests are updated accordingly. More specifically, I've made sure that
every test specifies only one action. I've also taken the opportunity to
simplify "multiple-input-files.f90" a bit.
Differential Revision: https://reviews.llvm.org/D111781
This is a follow-up to D115458 and truncates the worklist of actual arguments
that can be specialised to 'MaxConstantsThreshold' candidates if
MaxConstantsThreshold was exceeded. Thus, this changes the behaviour of option
-func-specialization-max-constants. Before, it didn't specialise at all when
this threshold was exceeded, but now it specialises up to MaxConstantsThreshold
candidates from the sorted worklist.
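Conceptually, the new behaviour is just a truncation of the ranked worklist (a sketch, not the actual FunctionSpecialization code; `Worklist` as a SmallVector and the ranking predicate are assumptions):
```
// Keep the best MaxConstantsThreshold candidates instead of giving up
// entirely when the threshold is exceeded.
if (Worklist.size() > MaxConstantsThreshold) {
  llvm::stable_sort(Worklist, IsMoreProfitable);  // assumed ranking predicate
  Worklist.resize(MaxConstantsThreshold);
}
```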
Differential Revision: https://reviews.llvm.org/D115509
Some MVE instructions have qr variants that take a Q and R register,
splatting the R register for each lane. This is usually handled fine for
standard splats as we sink the splat into the loop and combine the
resulting dup into the qr instruction. It does not work for constant
splats though, as we generate a vmovimm or constant pool load instead.
This intercepts that, generating a vdup of the constant instead where we
can turn the result into a qr instruction variant.
Differential Revision: https://reviews.llvm.org/D115242
They were being applied too narrowly (they didn't cover signed char *,
for instance), and too broadly (they covered SomeTemplate<char[6]>) at
the same time.
Differential Revision: https://reviews.llvm.org/D112709