Let the sext_inreg be selected to sext.w. Remove unneeded sext.w
during PostProcessISelDAG.
This gives other isel patterns a chance to match, such as ADDIPair
or the patterns that turn a mul by an immediate into shXadd.
This becomes possible after D107658 started selecting W instructions
based on users. The sext.w will be considered a W user so isel
will often select a W instruction for the sext.w input and we can
just remove the sext.w. Otherwise we can combine the sext.w with
an ADD/SUB/MUL/SLLI to create a new W instruction in parallel
to the original instruction.
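As a rough illustration of the shXadd matching mentioned above: the Zba
instructions sh1add, sh2add and sh3add compute (rs1 << N) + rs2, so a
multiply by 3, 5 or 9 can be written as one such instruction with both
operands equal. The helper names below are hypothetical and only model the
arithmetic; this is a sketch, not the isel patterns themselves.

```cpp
#include <cstdint>

// Models the Zba shNadd instructions: (rs1 << N) + rs2, with wraparound.
constexpr uint64_t shNadd(unsigned N, uint64_t Rs1, uint64_t Rs2) {
  return (Rs1 << N) + Rs2;
}

// Hypothetical helpers: x * 3, x * 5 and x * 9 as a single shNadd each.
constexpr uint64_t mulBy3(uint64_t X) { return shNadd(1, X, X); }
constexpr uint64_t mulBy5(uint64_t X) { return shNadd(2, X, X); }
constexpr uint64_t mulBy9(uint64_t X) { return shNadd(3, X, X); }

// Compile-time checks that the identities hold for a sample value.
static_assert(mulBy3(7) == 21, "sh1add x, x computes 3 * x");
static_assert(mulBy5(7) == 35, "sh2add x, x computes 5 * x");
static_assert(mulBy9(7) == 63, "sh3add x, x computes 9 * x");
```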
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D107708
When dealing with memmove, we also add the load instruction to the ignored
instructions list passed to `mayLoopAccessLocation`. Also rename "Stores" to
"IgnoredInsts" to be more precise.
Differential Revision: https://reviews.llvm.org/D108275
Translate the `@llvm.isnan` intrinsic to G_ISNAN when we see it.
This is pretty much the same as the associated SelectionDAGBuilder code. The
main difference is that we don't expand it here. It makes more sense to do that
during legalization in GlobalISel. GlobalISel will just legalize the generated
illegal types.
Differential Revision: https://reviews.llvm.org/D108226
D104556 changed CountersPtr to be relative; however, it did not
update the pointer initialization in __llvm_profile_register_function,
so platforms (e.g. AIX) that use __llvm_profile_register_function are now
totally broken: any PGO code will SEGV.
This patch updates the code to reflect that Data->CountersPtr is now
relative.
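A minimal sketch of the relative-pointer idea, assuming the offset is taken
relative to the data record's own address. The struct and helper names are
hypothetical stand-ins; the real profile data layout has more fields.

```cpp
#include <cstdint>

// Hypothetical stand-in for a profile data record; the real layout differs.
struct ProfData {
  uintptr_t CountersPtr; // offset of the counters relative to this record
};

// Stores the counters location as an offset from the record's own address.
// Unsigned arithmetic wraps, so a counters array below the record also works.
inline void setRelativeCounters(ProfData *Data, uint64_t *Counters) {
  Data->CountersPtr = reinterpret_cast<uintptr_t>(Counters) -
                      reinterpret_cast<uintptr_t>(Data);
}

// Recovers the absolute counters address from the relative value.
inline uint64_t *getCounters(ProfData *Data) {
  return reinterpret_cast<uint64_t *>(reinterpret_cast<uintptr_t>(Data) +
                                      Data->CountersPtr);
}
```

Code that treats the stored value as an absolute address would dereference
the raw offset, which is consistent with the SEGV described above.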
Reviewed By: MaskRay, davidxl
Differential Revision: https://reviews.llvm.org/D108304
We already do this for a non-constant RHS. This just removes the
special case. I believe the special case may have been needed
because the ANY_EXTEND of a constant used to create zero extended
constants, but we recently changed that to produce sign extended
constants.
D107658 is needed to prevent some regressions.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D107697
Add a generic opcode equivalent to the `llvm.isnan` intrinsic +
MachineVerifier support for it.
We need an opcode here because we may want target-specific lowering later on.
Differential Revision: https://reviews.llvm.org/D108222
DAGCombiner::visitStore can clear the upper bits of constants
used by stores. This prevents them from being recognized as
sign extended negative values making them more expensive to
materialize.
This patch uses the hasAllNBitUsers method from D107658 to make
a negative constant if none of the users care about the upper bits.
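To illustrate why this is safe, the sketch below models a 32-bit store and
shows that a constant with cleared upper bits and its sign-extended form are
indistinguishable to such a user. The function names are hypothetical and
this is not the code from the patch.

```cpp
#include <cstdint>

// Sign-extends the low 32 bits of Imm to 64 bits, giving the equivalent
// constant for any user that only reads the low 32 bits.
constexpr int64_t signExtend32(uint64_t Imm) {
  return static_cast<int64_t>((Imm & 0xffffffffull) ^ 0x80000000ull) -
         0x80000000ll;
}

// Models a 32-bit store: only the low 32 bits of the register are written.
constexpr uint32_t storedWord(uint64_t Reg) {
  return static_cast<uint32_t>(Reg);
}

// 0x00000000FFFFFFFF and -1 store the same 32-bit word.
static_assert(signExtend32(0x00000000ffffffffull) == -1, "becomes -1");
static_assert(storedWord(0x00000000ffffffffull) ==
                  storedWord(static_cast<uint64_t>(-1)),
              "a 32-bit store cannot tell the two constants apart");
```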
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D108052
We normally select these when the root node is a sext_inreg, but
SimplifyDemandedBits can sometimes bypass the sext_inreg for some
users. This can create a situation where sext_inreg+add/sub/mul/shl
is selected to a W instruction, and then the add/sub/mul/shl is
separately selected to a non-W instruction with the same inputs.
This patch tries to detect when it would still be ok to use a W
instruction without the sext_inreg by checking the direct users.
This can allow the W instruction to CSE with one created for a
sext_inreg+add/sub/mul/shl. To minimize complexity and cost of
checking, we make no attempt to determine if the CSE will happen
and just always use a W instruction when we can.
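The reasoning can be sketched with a small model of ADD and ADDW on RV64:
the two results differ only in the upper 32 bits, so a user that reads just
the low 32 bits can take either one. The model functions below are
hypothetical names for illustration, not LLVM code.

```cpp
#include <cstdint>

// Models RV64 ADD: full 64-bit wraparound addition.
constexpr uint64_t modelAdd(uint64_t A, uint64_t B) { return A + B; }

// Models RV64 ADDW: add, keep the low 32 bits, then sign-extend them.
constexpr uint64_t modelAddw(uint64_t A, uint64_t B) {
  return static_cast<uint64_t>(
      static_cast<int64_t>(((A + B) & 0xffffffffull) ^ 0x80000000ull) -
      0x80000000ll);
}

// True when two values agree in the bits a 32-bit user would read.
constexpr bool sameLow32(uint64_t X, uint64_t Y) {
  return static_cast<uint32_t>(X) == static_cast<uint32_t>(Y);
}

// The full results differ, but the low 32 bits match.
static_assert(modelAdd(0x7fffffffull, 1) != modelAddw(0x7fffffffull, 1),
              "upper bits differ");
static_assert(sameLow32(modelAdd(0x7fffffffull, 1),
                        modelAddw(0x7fffffffull, 1)),
              "low 32 bits agree");
```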
Differential Revision: https://reviews.llvm.org/D107658
If we have these instructions, we don't need to hoist the immediate
for an AND that would match them.
Reviewed By: luismarques
Differential Revision: https://reviews.llvm.org/D107783
This very occasionally causes an assertion failure in the compiler.
Turn it off until we can get to the bottom of this.
Reviewed By: hctim
Differential Revision: https://reviews.llvm.org/D108282
For tight loops like this:
  float r = 0;
  for (int i = 0; i < n; i++) {
    r += a[i];
  }
it's better not to vectorise at -O3 using fixed-width ordered reductions
on AArch64 targets. Although the resulting number of instructions in the
generated code ends up being comparable to not vectorising at all, there
may be additional costs on some CPUs; for example, the scheduling may be
worse. It makes sense to deter vectorisation in tight loops.
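For context on why the reduction has to stay ordered at all: floating-point
addition is not associative, so summing in a different order can change the
result. The sketch below contrasts the strict in-order loop with a
reassociated two-lane sum. The function names are hypothetical, and the
two-lane shape is only an assumption about what an unordered reduction might
look like.

```cpp
#include <cstddef>

// Strict in-order reduction, matching the scalar loop above.
inline float orderedSum(const float *A, std::size_t N) {
  float R = 0;
  for (std::size_t I = 0; I < N; ++I)
    R += A[I];
  return R;
}

// Reassociated reduction over two interleaved partial sums. This changes the
// order of the additions, so it is only valid when reassociation is allowed.
inline float twoLaneSum(const float *A, std::size_t N) {
  float Lane0 = 0, Lane1 = 0;
  std::size_t I = 0;
  for (; I + 1 < N; I += 2) {
    Lane0 += A[I];
    Lane1 += A[I + 1];
  }
  if (I < N) // odd element count: fold the tail into the first lane
    Lane0 += A[I];
  return Lane0 + Lane1;
}
```

With IEEE-754 binary32 arithmetic and the input {1e8f, 1.0f, -1e8f, 1.0f},
the ordered sum is 1.0f because the first 1.0f is lost to rounding, while the
two-lane sum is 2.0f.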
Differential Revision: https://reviews.llvm.org/D108292
In the future, we'll want to rely exclusively on using_if_exists for this
job, but for now, only rely on it when the compiler supports that attribute.
That removes the possibility of getting the logic wrong.
Differential Revision: https://reviews.llvm.org/D108297
This allows testing the rest of those headers on most platforms, instead
of XFAILing the whole test just because of a few functions.
As a fly-by fix, remove std/utilities/time/date.time/ctime.pass.cpp,
which was a duplicate of std/language.support/support.runtime/ctime.pass.cpp.
Differential Revision: https://reviews.llvm.org/D108295
Currently, AAKernelInfo will fail on an assertion if we attempt to run
it on a kernel without the init / deinit runtime calls. However, this
occurs for global constructors on the device. This will cause OpenMPOpt
to crash whenever global constructors are present. This patch removes
this assertion and just gives up instead.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108258
Currently, the runtime returns an error when the `exec_mode` global is
not present. The expected behaviour is that the region will default to
Generic. This prevents global constructors from being called because
they do not contain execution mode globals.
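The intended fallback can be sketched as follows. The enum, its enumerators
and the function are hypothetical and do not reflect the runtime's actual
types or values; only the "default to Generic when the global is missing"
behaviour comes from the description above.

```cpp
#include <cstdint>

// Hypothetical execution modes; the real runtime's enumerators may differ.
enum class ExecMode : int8_t { SPMD, Generic };

// ExecModeGlobal points at the kernel's exec_mode value, or is null when the
// image has no such global (as for device global constructors).
inline ExecMode resolveExecMode(const int8_t *ExecModeGlobal) {
  if (!ExecModeGlobal)
    return ExecMode::Generic; // default instead of reporting an error
  return static_cast<ExecMode>(*ExecModeGlobal);
}
```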
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108255
This change-set puts the functionality of 93d08acaac
under the -add-omp-offload-notes switch, which is OFF by default.
The CUDA toolchain is not able to handle ELF images with LLVMOMPOFFLOAD
notes for an unknown reason (see https://reviews.llvm.org/D99551#2950272).
This disables the ELF notes embedding until the CUDA issue is triaged and resolved.
Differential Revision: https://reviews.llvm.org/D108246
LLVM considers a global variable marked as external to be defined within the module if it is initialized (including to undef). Other external globals are considered to be defined externally and imported into the current translation unit. Lowering of MLIR Global Ops does not properly propagate undefined initializers, so a global that is expected to be defined within the current TU ends up not being defined.
Differential Revision: https://reviews.llvm.org/D108252
The `get` half of this machinery was already implemented, but the `tuple_size`
and `tuple_element` parts were hiding in [ranges.syn] and therefore missed.
Differential Revision: https://reviews.llvm.org/D108054
The interesting bit about that triple isn't the architecture, it's the
fact that ps4 implies C99 as the standard rather than a newer C mode.
Specify the language standard rather than the triple so the test is a
bit more general.
The old entry mapped the email address `<compnerd@compnerd.org>` to user name
`Saleem Abdulrasool` and email address `<compnerd@compnerd.org>`. Since the two
addresses are identical, that's a needless detail.
The new entry just maps email address `<compnerd@compnerd.org>` to user name
`Saleem Abdulrasool`.
No behavior change.
Differential Revision: https://reviews.llvm.org/D108079
According to comments at https://reviews.llvm.org/D107911,
Trace.MemoryAccessSize fails on Mac buildbots.
Because this test is newly introduced, and is the only user of the code
added in that patch, disable the test on Mac till the problem is
resolved.
Differential Revision: https://reviews.llvm.org/D108294
All supported compilers have supported deduction guides in C++17 for a
while, so this isn't necessary anymore.
Differential Revision: https://reviews.llvm.org/D108213
The test precision_type.pass.cpp was a duplicate of precision.pass.cpp,
so it is removed. atomic_flag_test.pass.cpp was a duplicate of
atomic_flag_test_and_set.pass.cpp, so instead I wrote a proper
test for it. Those duplicate tests were detected with:
  find libcxx ! -empty -type f -exec md5sum {} + | sort | uniq -w32 -dD
Instead of trying to sniff out what features are supported by the
library being tested, the way we normally handle these things is with
Lit annotations. This should not be treated differently.
Differential Revision: https://reviews.llvm.org/D108209
This adds the Unicode 13 data for XID_Start and XID_Continue.
The definition of a valid identifier is changed in all C++ modes
as P1949 (https://wg21.link/p1949) was accepted by WG21 as a defect
report.
Target is only ever non-null when we find an existing type, so move its declaration inside that case, and remove the dead code where Target was always null.