If we don't know anything about the alignment of a pointer, Align(1) is
still correct: all pointers are at least 1-byte aligned.
Included in this patch is a bugfix for an issue discovered during this
cleanup: pointers with "dereferenceable" attributes/metadata were
assumed to be aligned according to the type of the pointer. This
wasn't intentional, as far as I can tell, so Loads.cpp was fixed to
stop making this assumption. Frontends may need to be updated. I
updated clang's handling of C++ references, and added a release note for
this.
Differential Revision: https://reviews.llvm.org/D80072
Summary:
This is a pre-requisite to parallelizing PDB symbol and type merging.
Currently this timer usage would not be thread safe.
Reviewers: aganea, MaskRay
Subscribers: jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80298
With the new SVE stack layout, we now need to provide a Darwin variant
for all the calling conventions based on the main AAPCS CSR save order.
This also changes APCS_SwiftError to have a Darwin and a non-Darwin
version, assuming it could be used on other platforms these days, and
restricts the AArch64_CXX_TLS calling convention to Darwin.
Differential Revision: https://reviews.llvm.org/D73805
Even though series of cmd/cndmask can produce quite a lot of
code that is still better than a loop. In case of doubles we
would even produce two loops.
Differential Revision: https://reviews.llvm.org/D80032
Previously this code just used a default constructed
MachinePointerInfo. But we know the accesses are to a fixed stack
object or at least somewhere on the stack.
While there fix the alignment passed to the full vector load/stores.
I don't think this function is currently exercised in tree so I
don't know how to test it. I just noticed it when I removed
non-constant index support in this function.
Differential Revision: https://reviews.llvm.org/D80058
Due to similar APIs between CUDA and ROCm (HIP),
ConvertGpuLaunchFuncToCudaCalls pass could be used on both platforms with some
refactoring.
In this commit:
- Migrate ConvertLaunchFuncToCudaCalls from GPUToCUDA to GPUCommon, and rename.
- Rename runtime wrapper APIs be platform-neutral.
- Let GPU binary annotation attribute be specifiable as a PassOption.
- Naming changes within the implementation and tests.
Subsequent patches would introduce ROCm-specific tests and runtime wrapper
APIs.
Differential Revision: https://reviews.llvm.org/D80167
lit runs a gtest executable multiple times. First it runs it to
discover tests, then later it runs the executable again for each test.
However, if the discovery fails (perhaps because of a broken
executable), then no tests were previously run and no failures were
reported. This patch creates a dummy test if discovery fails, which
will later fail when test are run and be reported as a failure.
Differential Revision: https://reviews.llvm.org/D80096
Demangling Itanium symbols either consumes the whole input or fails,
but Microsoft symbols can be successfully demangled with just some
of the input.
Add an outparam that enables clients to know how much of the input was
consumed, and use this flag to give llvm-undname an opt-in warning
on partially consumed symbols.
Differential Revision: https://reviews.llvm.org/D80173
Summary:
This revision is to complement {D75791} so we can be sure that we don't change any default behavior.
For now just add rules to cover AfterExternBlock, but in the future we should add cases to cover the other BraceWrapping rules for each style. This will help guard us when we change code inside of the various getXXXStyle() functions to ensure we are not breaking everyone.
Reviewed By: MarcusJohnson91
Subscribers: cfe-commits
Tags: #clang, #clang-format
Differential Revision: https:
This class should've been instrumented when it landed. Whether the class
is "highly mutable" or not doesn't affect that.
With this patch TestSBEnvironment.py now passes when replayed.
Print a little snippet before exiting when passed unrecognized
arguments. The goal is twofold:
- Point users to lldb --help.
- Make it clear that we exited the debugger.
Summary:
Long long ago system_libs was appended to LLDB_SYSTEM_LIBS in
cmake/LLDBDependencies.cmake. After that file was removed, system_libs
is orphaned.
Currently the only user is source/Utility. Move the logic there and
remove system_libs.
Subscribers: mgorny, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D80253
Summary:
This changes allows to disable or use customized libxml2 for lldb.
1. Removes redundant include_directories. The one in LLDBConfig.cmake should be enough.
2. Link to ${LIBXML2_LIBRARIES} if xml2 is enabled.
Subscribers: mgorny, lldb-commits
Tags: #lldb
Differential Revision: https://reviews.llvm.org/D80257
Summary:
No need to generate inlined OpenMP region for variables captured in
lambdas or block decls, only for implicitly captured variables in the
OpenMP region.
Reviewers: jdoerfert
Subscribers: yaxunl, guansong, cfe-commits, caomhin
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79966
The subview semantics changes recently to allow for more natural
representation of constant offsets and strides. The legalization of
subview op for lowering to SPIR-V needs to account for this.
Also change the linearization to use the strides from the affine map
of a memref.
Differential Revision: https://reviews.llvm.org/D80270
Tests for `std::system_error` constructor marked as slightly non-portable.
One (but not the only one) reason for such non-portability is that these
tests assume the default locale to be set to "C" (or "POSIX").
However, the default locale for the process depends on OS and
environment. This patch adds explicit setting of the correct
locale expected by the tests.
Thanks to Andrey Maksimov for the patch.
Differential Revision: https://reviews.llvm.org/D72456
This change removes both the member function swap and the free function
overload of swap for std::span. While swap is a member and overloaded
for every other container in the standard library [1], it is neither a
member function nor a free function overload for std::span [2].
Thus the corresponding implementation should be removed.
[1] https://eel.is/c++draft/libraryindex#:swap
[2] https://eel.is/c++draft/span.overview
Differential Revision: https://reviews.llvm.org/D69827
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.
In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.
This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.
The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.
The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.
Force any function containing a preallocated call to use the frame
pointer.
Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.
Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).
Aside from the tests added here, I checked that this codegen produces
correct code for something like
```
struct A {
A();
A(A&&);
~A();
};
void bar() {
foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```
by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.
Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77689
This makes it possible to instrument the call for the reproducers. This
fixes TestStructuredDataAPI.py with reproducer replay.
Differential revision: https://reviews.llvm.org/D80312
Summary:
Asm goto is not supported by SLH. Warn if an instance of asm goto is detected
while SLH is enabled.
Test included.
Reviewed By: jyu2
Differential Revision: https://reviews.llvm.org/D79743
Summary:
Rename 'i' to 'I'.
Factor out the optional field handling to getOptionalVal().
Split out of D79951.
Reviewers: davidxl
Subscribers: eraman, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80230
There appears to be consensus in D80165 that this is the desired
behavior and I personally agree.
Differential revision: https://reviews.llvm.org/D80226
See https://reviews.llvm.org/D74651 for the preallocated IR constructs
and LangRef changes.
In X86TargetLowering::LowerCall(), if a call is preallocated, record
each argument's offset from the stack pointer and the total stack
adjustment. Associate the call Value with an integer index. Store the
info in X86MachineFunctionInfo with the integer index as the key.
This adds two new target independent ISDOpcodes and two new target
dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}.
The setup ISelDAG node takes in a chain and outputs a chain and a
SrcValue of the preallocated call Value. It is lowered to a target
dependent node with the SrcValue replaced with the integer index key by
looking in X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an
%esp adjustment, the exact amount determined by looking in
X86MachineFunctionInfo with the integer index key.
The arg ISelDAG node takes in a chain, a SrcValue of the preallocated
call Value, and the arg index int constant. It produces a chain and the
pointer fo the arg. It is lowered to a target dependent node with the
SrcValue replaced with the integer index key by looking in
X86MachineFunctionInfo. In
X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a
lea of the stack pointer plus an offset determined by looking in
X86MachineFunctionInfo with the integer index key.
Force any function containing a preallocated call to use the frame
pointer.
Does not yet handle a setup without a call, or a conditional call.
Does not yet handle musttail. That requires a LangRef change first.
Tried to look at all references to inalloca and see if they apply to
preallocated. I've made preallocated versions of tests testing inalloca
whenever possible and when they make sense (e.g. not alloca related,
inalloca edge cases).
Aside from the tests added here, I checked that this codegen produces
correct code for something like
```
struct A {
A();
A(A&&);
~A();
};
void bar() {
foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8);
}
```
by replacing the inalloca version of the .ll file with the appropriate
preallocated code. Running the executable produces the same results as
using the current inalloca implementation.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77689
Add reproducer support to PlatformRemoteGDBServer. The logic is
essentially the same as for ProcessGDBRemote. During capture we record
the GDB packets and during replay we connect to a replay server.
This fixes TestPlatformClient.py when run form a reproducer.
Differential Revision: https://reviews.llvm.org/D80224
Summary:
Previously, the only support partial lowering from vector transfers to SCF was
going through loops. This requires a dedicated allocation and extra memory
roundtrips because LLVM aggregates cannot be indexed dynamically (for more
details see the [deep-dive](https://mlir.llvm.org/docs/Dialects/Vector/#deeperdive)).
This revision allows specifying full unrolling which removes this additional roundtrip.
This should be used carefully though because full unrolling will spill, negating the
benefits of removing the interim alloc in the first place.
Proper heuristics are left for a later time.
Differential Revision: https://reviews.llvm.org/D80100
The SingleBlockImplicitTerminator op trait provides a function
`ensureRegionTerminator` that injects an appropriate terminator into the block
if necessary, which is used during operation constructing and parsing.
Currently, this function directly modifies the IR using low-level APIs on
Operation and Block. If this function is called from a conversion pattern,
these manipulations are not reflected in the ConversionPatternRewriter and thus
cannot be undone or, worse, lead to tricky memory errors and malformed IR.
Change `ensureRegionTerminator` to take an instance of `OpBuilder` instead of
`Builder`, and use it to construct the block and the terminator when required.
Maintain overloads taking an instance of `Builder` and creating a simple
`OpBuilder` to use in parsers, which don't have an `OpBuilder` and cannot
interact with the dialect conversion mechanism. This change was one of the
reasons to make `<OpTy>::build` accept an `OpBuilder`.
Differential Revision: https://reviews.llvm.org/D80138
Originally, the SCFToStandard conversion only declared Ops from the Standard
dialect as legal after conversion. This is undesirable as it would fail the
conversion if the SCF ops contained ops from any other dialect. Furthermore,
this would be problematic for progressive lowering of `scf.parallel` to
`scf.for` after `ensureRegionTerminator` is made aware of the pattern rewriting
infrastructure because it creates temporary `scf.yield` operations declared
illegal. Change the legalization target to declare any op other than `scf.for`,
`scf.if` and `scf.parallel` legal.
Differential Revision: https://reviews.llvm.org/D80137