In some binaries, built with clang/lld, libunwind crashes
with "unsupported x86_64 register" for regNum == 16:
Differential Revision: https://reviews.llvm.org/D107919
A DiffConsumer object may be reused, but we'd like to reset it before
the next use.
No functionality change intended.
Differential Revision: https://reviews.llvm.org/D107985
Using the python API to easily set up sparse kernels, this test
exhaustively builds, compiles, and runs SpMM for all annotations
on a sparse tensor, making sure every version generates the correct
result. This test also illustrates using the python API to set up
a sparse kernel and sparse compilation.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D107943
Flang uses positional arguments such as "%1$s" in `messages::say()`. Microsoft compilers support these only in the `_*printf_p` forms of the functions. This patch uses a conditional macro to replace the existing `vsnprintf` call with the variant needed when building with those compilers.
7 tests in D107575 rely on this change.
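A minimal sketch of such a conditional macro follows. The macro name `POSITIONAL_VSNPRINTF` and the helper `FormatPositional` are hypothetical and are not taken from the patch; only `vsnprintf` and the Microsoft CRT function `_vsprintf_p` are real library calls.

```cpp
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string>

// Hypothetical macro name: select the formatter that understands
// positional arguments such as "%1$s" on the current toolchain.
#ifdef _MSC_VER
#define POSITIONAL_VSNPRINTF _vsprintf_p
#else
#define POSITIONAL_VSNPRINTF vsnprintf
#endif

// Hypothetical helper: format into a fixed buffer and return the result.
// Output longer than the buffer is truncated; errors yield an empty string.
std::string FormatPositional(const char *format, ...) {
  char buffer[256];
  va_list args;
  va_start(args, format);
  int length = POSITIONAL_VSNPRINTF(buffer, sizeof buffer, format, args);
  va_end(args);
  if (length < 0)
    return std::string();
  size_t count = static_cast<size_t>(length) < sizeof buffer
                     ? static_cast<size_t>(length)
                     : sizeof buffer - 1;
  return std::string(buffer, count);
}
```

With this sketch, a call such as `FormatPositional("%2$s %1$s", "world", "hello")` goes through the same code path on both kinds of toolchain.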
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D107654
This reverts the revert 28c04794df.
The failing MLIR test that caused the revert should be fixed in this
version.
Also includes a PPC test fix previously in 1f87c7c478.
Since we officially don't support several older compilers now, we can
drop a lot of the markup in the test suite. This helps keep the test
suite simple and makes sure that UNSUPPORTED annotations don't rot.
This is the first patch of a series that will remove annotations for
compilers that are now unsupported.
Differential Revision: https://reviews.llvm.org/D107787
std::clock_t can be an unsigned type on some platforms like macOS and
therefore needs a cast when initializing a std::clock_t value with -1.
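As an illustration only (the helper names below are hypothetical and not from the patch), the cast can be written so that it is valid whether std::clock_t is signed or unsigned. Without it, a brace-initialization with -1 would be a narrowing conversion on platforms where the type is unsigned.

```cpp
#include <ctime>

// Hypothetical helper: the "no value" marker, valid for signed and
// unsigned std::clock_t alike because of the explicit cast.
std::clock_t FailedClockValue() { return static_cast<std::clock_t>(-1); }

// Hypothetical helper: compare against the marker using the same cast,
// which avoids a signed/unsigned comparison.
bool ClockFailed(std::clock_t value) {
  return value == static_cast<std::clock_t>(-1);
}
```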
Reviewed By: klausler
Differential Revision: https://reviews.llvm.org/D107972
DAGCombiner::visitStore can call GetDemandedBits which will remove
upper bits from immediates. The upper bits are important for good
materialization of negative constants on RISCV. GetDemandedBits is a
different mechanism than SimplifyDemandedBits so
TargetShrinkDemandedConstant can't block it.
As far as I know this behavior is unique to stores.
I think we can fix this in isel using a concept similar to D107658.
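To illustrate why the upper bits matter, here is a small model, not code from the patch. It assumes a 16-bit store of -1: once the bits above the store width are dropped, the constant becomes 0xFFFF, which no longer fits the 12-bit signed immediate of a single ADDI. Both function names are hypothetical.

```cpp
#include <cstdint>

// True if `value` fits the 12-bit signed immediate of a RISC-V ADDI.
constexpr bool FitsSImm12(int64_t value) {
  return value >= -2048 && value <= 2047;
}

// Hypothetical model of dropping the bits above the store width.
// Assumes 0 < storeBits < 64.
constexpr int64_t KeepLowBits(int64_t value, unsigned storeBits) {
  return static_cast<int64_t>(static_cast<uint64_t>(value) &
                              ((uint64_t{1} << storeBits) - 1));
}
```

Here `FitsSImm12(-1)` holds, while `FitsSImm12(KeepLowBits(-1, 16))` does not, so the stripped constant needs a longer materialization sequence.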
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D107860
For unit-stride and strided load/stores we set the SEW operand of
the pseudo instruction equal to the EEW in the opcode. The LMUL
of the pseudo instruction is the LMUL we want.
These instructions calculate EMUL=(EEW/SEW) * LMUL. We can use
this to avoid changing vtype if the SEW/LMUL of the previous
vtype matches the EEW/EMUL ratio we need for the instruction.
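The ratio check described above can be sketched as follows. This is an illustrative model under stated assumptions, not the pass's implementation: `Fraction`, `ComputeEMUL`, and `SameRatio` are hypothetical names, and LMUL/EMUL are modelled as unreduced fractions so that fractional values such as 1/2 can be represented.

```cpp
#include <cstdint>

// LMUL or EMUL as a fraction num/den (e.g. {1, 2} for a multiplier of 1/2).
struct Fraction {
  unsigned num;
  unsigned den;
};

// EMUL = (EEW / SEW) * LMUL, returned unreduced.
constexpr Fraction ComputeEMUL(unsigned eew, unsigned sew, Fraction lmul) {
  return {eew * lmul.num, sew * lmul.den};
}

// SEW / LMUL == EEW / EMUL, cross-multiplied to stay in integers:
//   sew * lmul.den * emul.num == eew * emul.den * lmul.num
constexpr bool SameRatio(unsigned sew, Fraction lmul, unsigned eew,
                         Fraction emul) {
  return sew * lmul.den * emul.num == eew * emul.den * lmul.num;
}
```

For example, with a previous vtype of SEW=32 and LMUL=2, an instruction with EEW=16 that wants EMUL=1 has the same ratio (16), and `ComputeEMUL(16, 32, {2, 1})` evaluates to a fraction equal to 1, so in this model the vtype would not need to change.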
Due to how the global analysis works, we can only do this
optimization when the previous vsetvli was produced in the block
containing the store. We need to know in the first phase if the
vsetvli will be inserted so we can propagate information to
the successors in the second phase correctly. This means we can't
depend on predecessors.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D106601
Instead of using scalar size divided by 8 for segment loads, get
the alignment from clang's type system.
Make vleff match for consistency.
Also replace uses of getPointerElementType() which will be removed as part of the OpaquePtr changes.
Reviewed By: HsiangKai
Differential Revision: https://reviews.llvm.org/D106738
We really shouldn't deal with a conditional branch that can be trivially
constant-folded into an unconditional branch.
Indeed, barring failure to trigger BB reprocessing, that should be true,
so let's assert as much, and hope the assertion never fires.
If it does, we have a bug to fix.
Mainly, I want to add an assertion that `SimplifyCFGOpt::simplifyCondBranch()`
doesn't get asked to deal with non-unconditional branches,
and if I do that, then said assertion fires on existing tests,
and this is what prevents it from firing.
This is a direct translation of the select folds added with
D53033 / D53036 and another step towards canonicalization
using the intrinsics (see D98152).
Recent work in runtime assignments failed an assertion in fir-dev
while running tests (flang/test/Semantics/defined-ops.f90). This
test didn't fail in llvm-project/main because only the "new" Arm
driver is used now, and that only builds runtime derived type information
tables when some debug dumping options are enabled.
So add a reproducing test case to another test that is run with
-fdebug-dump-symbols, and fix the crash by emitting special procedure
binding information only for type-bound generic ASSIGNMENT(=) bindings
that are relevant to the runtime support library for use in intrinsic
assignment of derived types.
Differential Revision: https://reviews.llvm.org/D107918
New ports in glibc typically don't define ELF_INITFINI, so
DT_INIT/DT_FINI support is disabled.
(rhel ppc64le likely patches their glibc this way as well.)
musl can disable DT_INIT/DT_FINI via -DNO_LEGACY_INITFINI.
So we cannot guarantee ctor()/dtor() will be printed.
Fixes miscompile of calls into ocml. Bug 51445.
The stack variable `double __tmp` is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.
This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D107971
Currently, the LNICM pass does not support sinking instructions out of a loop nest.
This patch enables LNICM to sink as many instructions as possible down to the exit block of the outermost loop.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D107219
Previously, debug-info-for-profiling and pseudo-probe-for-profiling were mutually exclusive because they compete for the DWARF discriminator of callsites in the IR. This change allows the two switches to be used together. The side effect is that callsite discriminators will be taken by pseudo probes, while discriminators for other instructions are still available for AutoFDO use. This is less than ideal; however, it still gives us a chance to transition smoothly from AutoFDO to CSSPGO by collecting both profiles from a CSSPGO binary.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D107876
Given a constant operand, the MVE and DAGCombine combines could fight,
each redistributing in the opposite order. Add a guard to the MVE
vecreduce distribution to prevent that.
Add a check for enforcing minimum length for variable names. A default
minimum length of three characters is applied to regular variables
(including function parameters). Loop counters and exception variables
have a minimum of two characters. Additionally, 'i', 'j' and 'k'
are accepted as legacy loop counter names.
All three sizes, as well as the list of accepted legacy loop counter
names are configurable.
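The snippet below sketches how the defaults described above would apply to ordinary code. The comments reflect a reading of the description, not actual output from the check, and the function names are purely illustrative.

```cpp
#include <exception>
#include <string>

// 'text' has at least three characters, so it meets the default minimum
// for regular variables and function parameters.
int ParseOrZero(const std::string &text) {
  try {
    return std::stoi(text);
  } catch (const std::exception &ex) { // two characters: meets the
    (void)ex;                          // exception-variable minimum
    return 0;
  }
}

int SumUpTo(int limit) {
  int total = 0;                     // regular variable, length >= 3
  for (int i = 0; i <= limit; ++i)   // 'i' is an accepted legacy loop counter
    total += i;
  return total;
}
```

Under the same defaults, a parameter named `n` or a local named `ok` would fall below the three-character minimum.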
When depth > 0, the callee's frame address is used to compute the return address
of the callee, which produces an incorrect return address. This patch fixes that
by using the caller's frame address to compute the return address of the callee.
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D107646
The compilation of the file
526.blender_r/src/blender/source/blender/editors/space_logic/logic_ops.c
from the SPEC CPU 2017 benchmarks took excessive time to compute
InvalidDomain.gist_params(Ctx)
Simplifying beforehand, specifically using isl_set_detect_equalities,
reduces the computation time to a negligible level again.
The default compiler-generated MutexSet::Desc::operator=()
now contains a memcpy() call since Desc became bigger.
This fails in debug mode since we would call an interceptor from within the runtime.
Define our own operator=() using internal_memcpy().
This also makes a copy ctor necessary, otherwise:
tsan_mutexset.h:33:11: warning: definition of implicit copy constructor for
'Desc' is deprecated because it has a user-declared copy assignment operator
And if we add a copy ctor, we also need the default ctor
since it's called by the MutexSet ctor.
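The resulting shape of the type can be sketched as below. This is an illustration under assumptions, not the runtime's code: the field layout of `Desc` is hypothetical, and `InternalMemcpy` is a stand-in for the runtime's internal_memcpy() so that the example is self-contained.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the runtime's internal_memcpy(): a plain byte loop that
// never goes through an intercepted libc memcpy().
static void *InternalMemcpy(void *dst, const void *src, std::size_t size) {
  unsigned char *out = static_cast<unsigned char *>(dst);
  const unsigned char *in = static_cast<const unsigned char *>(src);
  for (std::size_t idx = 0; idx < size; ++idx)
    out[idx] = in[idx];
  return dst;
}

// Hypothetical field layout; only the set of special members matters here.
struct Desc {
  std::uintptr_t addr;
  std::uint32_t stack_id;
  std::uint64_t id;

  // Needed once a copy ctor is declared, since containers of Desc
  // default-construct their elements.
  Desc() : addr(0), stack_id(0), id(0) {}

  // Declared explicitly to avoid the deprecated implicit copy ctor.
  Desc(const Desc &other) { *this = other; }

  // User-defined so that the copy does not call memcpy().
  Desc &operator=(const Desc &other) {
    InternalMemcpy(this, &other, sizeof(*this));
    return *this;
  }
};
```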
Depends on D107911.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D107959
We currently memorize u64 id + epoch for each mutex.
The new tsan runtime will memorize address + stack_id instead.
But switching to address + stack_id requires new trace,
which in turn requires new MutexSet and some other changes.
Extend MutexSet to support both new and old info to break
the dependency cycles. The plan is to remove the old
info/methods after switching to the new runtime.
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D107910
Introducing a plugin API and a simple HelloWorld Plugin example.
This patch adds the `-load` and `-plugin` flags to the frontend driver and
the code around using custom frontend actions from within a plugin
shared library object.
It also adds to the Driver-help test to check the help option with the
updated driver flags.
Additionally, the patch creates a plugin-example test to check the
HelloWorld plugin example runs correctly. As part of this, a new CMake
flag (`FLANG_BUILD_EXAMPLES`) is added to allow the example to be built
and for the test to run.
This Plugin API has only been tested on Linux.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D106137