fix an assertion due to mismatch type for Numerator and CacheLineSize in loop cache analysis pass.
Reviewed By: bmahjour
Differential Revision: https://reviews.llvm.org/D107618
- Loads from the constant memory (either explicit one or as the source
of memory transfer intrinsics) won't alias any stores.
Reviewed By: asbirlea, efriedma
Differential Revision: https://reviews.llvm.org/D107605
This isn't optimal, but prevents crashing when the libcall isn't
available. It just calculates the full product and makes sure the high bits
match the sign of the low half. Each of the pieces should go through their own
type legalization.
This can make D107420 unnecessary.
Needs tests, but I wanted to start discussion about D107420.
Reviewed By: FreddyYe
Differential Revision: https://reviews.llvm.org/D107581
Some of the Arm complex pattern functions call canExtractShiftFromMul,
which can modify the DAG in-place. For this to be valid and handled
successfully we need to define ComplexPatternFuncMutatesDAG.
Differential Revision: https://reviews.llvm.org/D107476
See: http://45.33.8.238/macm1/15677/step_10.txt
This is a test that has `REQUIRES: x86` which means it never ran
before; I don't have a MachO environment but based on the FileCheck
output it looks like it should be sufficient to remove one CHECK line.
Fix an edge case missed by https://reviews.llvm.org/D78921. For e.g.,
the Repro debug entry (generated with the /Brepro linker flag) does not
have a debug-directory payload. Do not attempt to patch Debug entries
without a payload.
Differential Revision: https://reviews.llvm.org/D107324
%%%
This patch defines __HOS_AIX__ macro for AIX in case of a cross compiler implementation.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107242
Currently the UNSUPPORTED and XFAIL clauses support specifying
substrings of the target triple; but REQUIRES does not, which can trip
people up or lead to hacking config files to insert substitute feature
names. Consistency across all three lit clauses seems preferable.
Differential Revision: https://reviews.llvm.org/D107162
%%%
This patch defines the macro __THW_PPC__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107243
%%%
This patch defines the macro __THW_BIG_ENDIAN__ for AIX.
%%%
Tested with SPEC.
Reviewed By: cebowleratibm
Differential Revision: https://reviews.llvm.org/D107241
We already strip all the inputs provided without `--`, this patch also
handles the cases with `--`.
Differential Revision: https://reviews.llvm.org/D107637
This patch strips all the arch options in case of multiple ones. As it
results in multiple compiler jobs, which clangd cannot handle.
It doesn't pick any over the others as it is unclear which one the user wants
and defaulting to host architecture seems less surprising. Users also have the
ability to explicitly specify the architecture they want via clangd config
files.
Fixes https://github.com/clangd/clangd/issues/827.
Differential Revision: https://reviews.llvm.org/D107634
D31709 added an assertion was added to `FullSourceLoc::hasManager()` that ensured a valid `SourceLocation` is always paired with a `SourceManager`, and missing `SourceManager` is always paired with an invalid `SourceLocation`.
This appears to be incorrect, since clients never cared about constructing `FullSourceLoc` to uphold that invariant, or always checking `isValid()` before calling `hasManager()`.
The assertion started failing when serializing diagnostics pointing into an explicit module. Explicit modules don't have valid `SourceLocation` for the `import` statement, since they are "imported" from the command-line argument `-fmodule-name=x.pcm`.
This patch removes the assertion, since `FullSourceLoc` was never intended to uphold any kind of invariants between the validity of `SourceLocation` and presence of `SourceManager`.
Reviewed By: arphaman
Differential Revision: https://reviews.llvm.org/D106862
The Solaris buildbots have been broken for some time by the unconditional
use of `NT_GNU_BUILD_ID`, e.g. Solaris/sparcv9
<https://lab.llvm.org/staging/#/builders/50/builds/4910> and Solaris/amd64
<https://lab.llvm.org/staging/#/builders/101/builds/3751>. Being a GNU
extension, it is not defined in `<sys/elf.h>`. However, providing a
fallback definition doesn't help because the code also relies on
`__ehdr_start`, another unportable GNU extension that most likely never
will be implemented in Solaris `ld`. Besides, there's reallly no point in
supporting build ids since they aren't used on Solaris at all.
This patch fixes this by making the relevant code conditional on the
definition of `NT_GNU_BUILD_ID`.
Tested on `amd64-pc-solaris2.11` and `sparcv9-sun-solaris2.11`.
Differential Revision: https://reviews.llvm.org/D107556
1) add some self-diagnosis (when asserts are enabled) to check that all
features have the same nr of entries
2) avoid storing pointers to mutable fields because the proto API
contract doesn't actually guarantee those stay fixed even if no further
mutation of the object occurs.
Differential Revision: https://reviews.llvm.org/D107594
This introduces a new flag ignored-reference-qualifiers for the
existing "'A' qualifier on reference type B has no effect" diagnostic,
as a child of ignored-qualifiers.
Rationale:
This particular diagnostic is enabled by default, but other parts of
ignored-qualifiers are not. Anecdotally, a user may encounter this
diagnostic in the wild, and, seeing it to be valuable, might try to
raise it to error with -Werror=ignored-qualifiers, whereupon the other
diagnostics the flag covers will also be raised, to the user's surprise
and confusion. By splitting this diagnostic out into a separate flag,
and marking it as a child of ignored-qualifiers, we allow the user more
granular control of the diagnostics they care about, while maintaining
backwards compatibility with existing build scripts.
This patch introduces a new code object metadata field, ".kind"
which is used to add support for init and fini kernels.
HSAStreamer will use function attributes, "device-init" and
"device-fini" to distinguish between init and fini kernels from
the regular kernels and will emit metadata with ".kind" set to
"init" and "fini" respectively.
To reduce the number of init and fini kernels, the ctors and
dtors present in the llvm's global.ctors and global.dtors lists
are called from a single init and fini kernel respectively.
Reviewed by: yaxunl
Differential Revision: https://reviews.llvm.org/D105682
D107068 fixed the same problem on aarch64 but the arm variant wasn't exposed in existing test coverage.
I've copied the arm64-neon-copy tests (and stripped the intrinsic test from it) for testing on arm neon builds as well.
As reported on PR51281, an internal fuzz test encountered an issue when extracting constant bits from a SUBV_BROADCAST node from a constant pool source larger than the broadcasted subvector width.
The getTargetConstantBitsFromNode was assuming that the Constant would the same size as the subvector, resulting in the incorrect packing of the per-element bits data.
This patch attempts to solve this by using the SUBV_BROADCAST node to determine the subvector width, and then ensuring we extract only the lowest bits from Constant of that subvector bitsize.
Differential Revision: https://reviews.llvm.org/D107158
This change provides a way to conveniently declare types that have
address space qualifiers removed.
Since OpenCL adds address spaces implicitly even when they are not
specified in source, it is useful to allow deriving address space
unqualified types.
Fixes llvm.org/PR45326
Differential Revision: https://reviews.llvm.org/D106785
As we are trying to reach parity between opencl-c.h and
-fdeclare-opencl-builtins, ensure the documentation mentions that new
builtins should be added to both.
Reviewed by: Anastasia Stulova
This patch adds more instructions to the Uniforms list, for example certain
intrinsics that are uniform by definition or whose operands are loop invariant.
This list includes:
1. The intrinsics 'experimental.noalias.scope.decl' and 'sideeffect', which
are always uniform by definition.
2. If intrinsics 'lifetime.start', 'lifetime.end' and 'assume' have
loop invariant input operands then these are also uniform too.
Also, in VPRecipeBuilder::handleReplication we check if an instruction is
uniform based purely on whether or not the instruction lives in the Uniforms
list. However, there are certain cases where calls to some intrinsics can
be effectively treated as uniform too. Therefore, we now also treat the
following cases as uniform for scalable vectors:
1. If the 'assume' intrinsic's operand is not loop invariant, then we
are free to treat this as uniform anyway since it's only a performance
hint. We will get the benefit for the first lane.
2. When the input pointers for 'lifetime.start' and 'lifetime.end' are loop
variant then for scalable vectors we assume these still ultimately come
from the broadcast of an alloca. We do not support scalable vectorisation
of loops containing alloca instructions, hence the alloca itself would
be invariant. If the pointer does not come from an alloca then the
intrinsic itself has no effect.
I have updated the assume test for fixed width, since we now treat it
as uniform:
Transforms/LoopVectorize/assume.ll
I've also added new scalable vectorisation tests for other intriniscs:
Transforms/LoopVectorize/scalable-assume.ll
Transforms/LoopVectorize/scalable-lifetime.ll
Transforms/LoopVectorize/scalable-noalias-scope-decl.ll
Differential Revision: https://reviews.llvm.org/D107284
Use new return type for `OpAsmDialectInterface::getAlias`:
* `AliasResult::NoAlias` if an alias was not provided.
* `AliasResult::OverridableAlias` if an alias was provided, but it might be overriden by other hook.
* `AliasResult::FinalAlias` if an alias was provided and it should be used (no other hooks will be checked).
In that case `AsmPrinter` will use either the first alias with `FinalAlias` result or
the last alias with `OverridableAlias` result (it depends on dialect array order).
Used `OverridableAlias` result for `BuiltinOpAsmDialectInterface`.
Use case: provide more informative alias for built-in attributes like `AffineMapAttr`
instead of generic "map<N>".
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D107437
The may get changed before specialization by RunSCCPSolver. In other
words, the pass may change the function without specialization happens.
Add test and comment to reveal this.
And it may return No Changed if the function get changed by
RunSCCPSolver before the specialization. It looks like a potential bug.
Test Plan: check-all
Reviewed By: https://reviews.llvm.org/D107622
Differential Revision: https://reviews.llvm.org/D107622
We can improve on the generic splitting by using ffbh/ffbl, which have a
defined result when the input is zero.
Differential Revision: https://reviews.llvm.org/D107442
This is the counterpart to G_AMDGPU_FFBH_U32 which already exists. These
instructions have a defined result of -1 when the input is zero.
Differential Revision: https://reviews.llvm.org/D107441
Noticed that the computation for function specialization cost of a
function wouldn't change during the traversal of the arguments for the
function. We could hoist the computation out of the traversal. I
observed about ~1% improvement on compile time for spec2017. But I guess
it may not be precise. This should be NFC and fine.
Reviewed By: Sjoerd Meijer
Differential Revision: https://reviews.llvm.org/D107621