Summary:
These instructions compute multiply+add in integers, with one of the
operands being a splat of a scalar. (VMLA and VMLAS differ in whether
the splat operand is a multiplier or the addend.)
I've represented these in IR using existing standard IR operations for
the unpredicated forms. The predicated forms are done with target-
specific intrinsics, as usual.
When operating on n-bit vector lanes, only the bottom n bits of the
i32 scalar operand are used. So we have to tell that to isel lowering,
to allow it to remove a pointless sign- or zero-extension instruction
on that input register. That's done in `PerformIntrinsicCombine`, but
first I had to enable `PerformIntrinsicCombine` for MVE targets
(previously all the intrinsics it handled were for NEON), and make it
a method of `ARMTargetLowering` so that it can get at
`SimplifyDemandedBits`.
Reviewers: dmgreen, MarkMurrayARM, miyuki, ostannard
Reviewed By: dmgreen
Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76122
For selects with an unknown condition, we can approximate the result by
merging the state of both options. This automatically takes care of
the case where on operand is undef.
Reviewers: davide, efriedma, mssimpso
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71935
If it is a*b-c*d, it could be also folded into fma(a, b, -c*d) or fma(-c, d, a*b).
This patch is trying to respect the uses of a*b and c*d to make the best choice.
Differential Revision: https://reviews.llvm.org/D75982
Summary:
This adds support in RewriterGen for calling into the new `PatternRewriter::notifyMatchFailure` hook. This lets derived pattern rewriters display this information to users, an example from DialectConversion is shown below:
```
Legalizing operation : 'std.and'(0x60e0000066a0) {
* Fold {
} -> FAILURE : unable to fold
* Pattern : 'std.and -> (spv.BitwiseAnd)' {
** Failure : operand 0 of op 'std.and' failed to satisfy constraint: '8/16/32/64-bit integer or vector of 8/16/32/64-bit integer values of length 2/3/4'
} -> FAILURE : pattern failed to match
* Pattern : 'std.and -> (spv.LogicalAnd)' {
** Failure : operand 0 of op 'std.and' failed to satisfy constraint: 'bool or vector of bool values of length 2/3/4'
} -> FAILURE : pattern failed to match
} -> FAILURE : no matched legalization pattern
```
Differential Revision: https://reviews.llvm.org/D76335
Summary:
This revision restructures the calling of vector transforms to make it more flexible to ask for lowering through LLVM matrix intrinsics.
This also makes sure we bail out in degenerate cases (i.e. 1) in which LLVM complains about not being able to scalarize.
Differential Revision: https://reviews.llvm.org/D76266
LLVM has a documented mechanism for passing configuration information
to an out of tree project using cmake. See
https://llvm.org/docs/CMake.html#embedding-llvm-in-your-project. This
patch adds similar support for MLIR.
Using this requires something like:
cmake_minimum_required(VERSION 3.4.3)
project(SimpleProject)
find_package(MLIR REQUIRED CONFIG)
include_directories(${LLVM_INCLUDE_DIRS})
include_directories(${MLIR_INCLUDE_DIRS})
link_directories(${LLVM_BUILD_LIBRARY_DIR})
add_definitions(${LLVM_DEFINITIONS})
set(CMAKE_MODULE_PATH
${LLVM_CMAKE_DIR}
${MLIR_CMAKE_DIR}
)
include(AddLLVM)
include(TableGen)
include(AddMLIR)
add_executable(test-opt test-opt.cpp)
llvm_update_compile_flags(test-opt)
get_property(dialect_libs GLOBAL PROPERTY MLIR_DIALECT_LIBS)
get_property(conversion_libs GLOBAL PROPERTY MLIR_CONVERSION_LIBS)
message(dialects=${dialect_libs})
set(LIBS
${dialect_libs}
${conversion_libs}
MLIRLoopOpsTransforms
MLIRLoopAnalysis
MLIRAnalysis
MLIRDialect
MLIREDSC
MLIROptLib
MLIRParser
MLIRPass
MLIRQuantizerFxpMathConfig
MLIRQuantizerSupport
MLIRQuantizerTransforms
MLIRSPIRV
MLIRSPIRVTestPasses
MLIRSPIRVTransforms
MLIRTransforms
MLIRTransformUtils
MLIRTestDialect
MLIRTestIR
MLIRTestPass
MLIRTestTransforms
MLIRSupport
MLIRIR
MLIROptLib
LLVMSupport
LLVMCore
LLVMAsmParser
)
target_link_libraries(test-opt ${LIBS})
Differential Revision: https://reviews.llvm.org/D76047
Summary: The following change is to allow the machine outlining can be applied for Nth times, where N is specified by the compiler option. By default the value of N is 1. The motivation is that the repeated machine outlining can further reduce code size. Please refer to the presentation "Improving Swift Binary Size via Link Time Optimization" in LLVM Developers' Meeting in 2019.
Reviewers: aschwaighofer, tellenbach, paquette
Reviewed By: paquette
Subscribers: tellenbach, hiraditya, llvm-commits, jinlin
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71027
Summary: These dependencies are needed for testing on the buildbots until we migrate AORs into libc.
Reviewers: sivachandra
Reviewed By: sivachandra
Subscribers: MaskRay, tschuett, libc-commits
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D76330
Make sure that `process` is not None before calling is_alive. Otherwise
this might result in an AttributeError: 'NoneType' object has no
attribute 'is_alive'.
Although lldb.process and friends could already be None in the past, for
example after leaving an interactive scripting session, the issue became
more prevalent after `fc1fd6bf9fcfac412b10b4193805ec5de0e8df57`.
I audited the other interface files for usages of target, process,
thread and frame, but this seems the only place where a global is used
from an SB class.
Summary:
Explanation is in a comment in the diff, but essentially printing a
physical register name here is ambiguous. Until we can implement
printing a DWARF register name here just use the encoding directly.
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76253
Summary:
Renamed QuantOps to Quant to avoid the Ops suffix. All dialects will contain
ops, so the Ops suffix is redundant.
Differential Revision: https://reviews.llvm.org/D76318
- After https://reviews.llvm.org/D68578, the implicit conversion from
`FunctionDecl` to `GlobalDecl` needs replacing with `getGlobalDecl`;
otherwise, assertion is triggered.
Summary:
The current relaxation implementation is not correctly adjusting the
size and offsets of fragements in one section based on changes in size
of another if the layout order of the two happened to be such that the
former was visited before the later. Therefore, we need to invalidate
the fragments in all sections after each iteration of relaxation, and
possibly further relax some of them in the next ieration. This fixes
PR#45190.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D76114
The current implementation isn't very resilient when it comes to the
output of xcrun. Currently it cannot deal with:
- Trailing newlines.
- Leading newlines and errors/warnings before the Xcode path.
- Xcode not being named Xcode.app.
This extract the logic into a helper in PlatformDarwin and fixes those
issues. It's also the first step towards removing code duplication
between the different platforms and downstream Swift.
Differential revision: https://reviews.llvm.org/D76261
ISD::ROTL/ROTR rotation values are guaranteed to act as a modulo amount, so for power-of-2 bitwidths we only need the lowest bits.
Differential Revision: https://reviews.llvm.org/D76201
Summary:
[Clang] Attribute to allow defining undef global variables
Initializing global variables is very cheap on hosted implementations. The
C semantics of zero initializing globals work very well there. It is not
necessarily cheap on freestanding implementations. Where there is no loader
available, code must be emitted near the start point to write the appropriate
values into memory.
At present, external variables can be declared in C++ and definitions provided
in assembly (or IR) to achive this effect. This patch provides an attribute in
order to remove this reason for writing assembly for performance sensitive
freestanding implementations.
A close analogue in tree is LDS memory for amdgcn, where the kernel is
responsible for initializing the memory after it starts executing on the gpu.
Uninitalized variables in LDS are observably cheaper than zero initialized.
Patch is loosely based on the cuda __shared__ and opencl __local variable
implementation which also produces undef global variables.
Reviewers: kcc, rjmccall, rsmith, glider, vitalybuka, pcc, eugenis, vlad.tsyrklevich, jdoerfert, gregrodgers, jfb, aaron.ballman
Reviewed By: rjmccall, aaron.ballman
Subscribers: Anastasia, aaron.ballman, davidb, Quuxplusone, dexonsmith, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D74361
It fixes an ambiguity issue in case of a user has a custom policy and
calls a version of exclusive_scan with binary operation.
Differential Revision: https://reviews.llvm.org/D62719
Functions include their arguments in the use-list. Changed function
values mean that the result of the function changed. We only need
to update the call sites with the new function result and do not
have to propagate the call arguments.
To do so, this patch splits up the visitCallSite into handleCallResult
and handleCallArguments and updates markUsersAsChanged to only update
call results for functions.
Reviewers: efriedma, davide
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D75846
Summary:
There seems to be a race condition between the pipe closing and the child process death. Likely these two events are not atomic on some versions of linux.
With the removal of `WNOHANG` we eliminate the race condition, however if the child closes the pipe intentionally then it could result in the test runner hanging. I find this situation less likely, where as I experience failures locally with this race condition rather consistently.
Reviewers: sivachandra, abrachet
Reviewed By: sivachandra, abrachet
Subscribers: MaskRay, jfb, tschuett, libc-commits
Tags: #libc-project
Differential Revision: https://reviews.llvm.org/D76267
The current implementation of binomial_distribution is not guaranteed to
converge for certain extreme configurations of the engine and distribution.
This is due to a mistake in the implementation of the algorithm from the
given reference paper. The algorithm in the paper is guaranteed to
terminate but has redundant statements. The current implementation
simplified away the redundancy into a while loop, but it excludes the
return condition of the case where a good sample cannot be returned for
the particular sample being used from the uniform distribution, which is
what causes the infinite loop. This change guarantees termination by
recognizing that a good sample cannot be returned and returning 0 after
breaking the loop. This is also in contrast to the paper because the
return value as specified in the paper violates basic checks in at least
a subset of the extreme cases where the current implementation fails to
terminate. This default return value of 0 is satisfactory for the
extreme case known so far.
Since this is only meant to affect extreme cases where the algorithm
does not terminate anyways, the behavior is expected to remain exactly
the same for all non-extreme cases that have been terminating so far.
Fixes https://llvm.org/PR44847
Differential Revision: https://reviews.llvm.org/D74997
When compiling
```
struct S {
float w;
};
void f(long w, long b);
void g(struct S s) {
int w = s.w;
f(w, w*4);
}
```
I get Assertion failed: ((!CombinedExpr || CombinedExpr->isValid()) && "Combined debug expression is invalid").
That's because we combine two epxressions that both end in DW_OP_stack_value:
```
(lldb) p Expr->dump()
!DIExpression(DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value)
(lldb) p Param.Expr->dump()
!DIExpression(DW_OP_constu, 4, DW_OP_mul, DW_OP_LLVM_convert, 32, DW_ATE_signed, DW_OP_LLVM_convert, 64, DW_ATE_signed, DW_OP_stack_value)
(lldb) p CombinedExpr->isValid()
(bool) $0 = false
(lldb) p CombinedExpr->dump()
!DIExpression(4097, 32, 5, 4097, 64, 5, 16, 4, 30, 4097, 32, 5, 4097, 64, 5, 159, 159)
```
I believe that in this particular case combining two stack values is
safe, but I didn't want to sink the special handling into
DIExpression::append() because I do want everyone to think about what
they are doing.
Patch by Adrian Prantl.
Fixes PR45181.
rdar://problem/60383095
Differential Revision: https://reviews.llvm.org/D76164
RDF is designed to be target agnostic. Therefore it would be useful to have it available for other targets, such as X86.
Based on a previous patch by Krzysztof Parzyszek
Differential Revision: https://reviews.llvm.org/D75932