https://discourse.llvm.org/t/parallel-input-file-parsing/60164
To decouple symbol initialization and section initialization, `Defined::section`
assignment should be postponed after input file parsing. To avoid spurious
duplicate definition error due to two definitions in COMDAT groups of the same
signature, we should postpone the duplicate symbol check.
The function is called postScan instead of a more specific name like
checkDuplicateSymbols, because we may merge Symbol::mergeProperties into
postScan. It is placed after compileBitcodeFiles to apply to ET_REL files
produced by LTO. This causes minor diagnostic regression
for skipLinkedOutput configurations: ld.lld --thinlto-index-only a.bc b.o
(bitcode definition prevails) won't detect duplicate symbol error. I think this
is an acceptable compromise. The important cases where (a) both files are
bitcode or (b) --thinlto-index-only is unused are still detected.
Reviewed By: ikudrin
Differential Revision: https://reviews.llvm.org/D119908
This adds very basic support for hashing MachineBasicBlock
and MachineFunction, for use in MachineFunctionPass to
detect passes that modify the MachineFunction wrongly.
Differential Revision: https://reviews.llvm.org/D120122
D118623 added code to fold not-of-compare into a compare
with the inverted predicate, if the compare had no other
uses. This relies on accurate use lists in the IR but it
was run before setPhiValues, when some phi inputs are still
stored in a data structure on the side, instead of being
real uses in the IR. The effect was that a phi that should
be using the original compare result would now get an
inverted result instead.
Fix this by moving simplifyConditions after setPhiValues.
Differential Revision: https://reviews.llvm.org/D120312
This adds handling of the _SADDR forms to the GLOBAL_LOAD combining.
TODO: merge global stores.
TODO: merge flat load/stores.
TODO: merge flat with global promoting to flat.
Differential Revision: https://reviews.llvm.org/D120285
There can be situations where global and flat loads and stores are not
combined by the vectorizer, in particular if their address space
differ in the IR but they end up the same class instructions after
selection. For example a divergent load from constant address space
ends up being the same global_load as a load from global address space.
TODO: merge global stores.
TODO: handle SADDR forms.
TODO: merge flat load/stores.
TODO: merge flat with global promoting to flat.
Differential Revision: https://reviews.llvm.org/D120279
This is the next step towards supporting bitcode auto upgrade with
opaque pointers. The ValueList now stores the Value* together with
its associated type ID, which allows inspecting the original pointer
element type of arbitrary values.
This is a largely mechanical change threading the type ID through
various places. I've left TODOTypeID placeholders in a number of
places where determining the type ID is either non-trivial or
requires allocating a new type ID not present in the original
bitcode. For this reason, the new type IDs are also not used for
anything yet (apart from propagation). They will get used once the
TODOs are resolved.
Differential Revision: https://reviews.llvm.org/D119821
This removes two feature flags from `AArch64TargetInfo` class:
- `HasHBC`: this feature does not involve generating any IR intrinsics,
so clang does not need to know about whether it is set
- `HasCrypto`: this feature is deprecated in favor of finer grained
features such as AES, SHA2, SHA3 and SM4. The associated ACLE macro
__ARM_FEATURE_CRYPTO is thus no longer used.
Differential Revision: https://reviews.llvm.org/D118757
The Linux kernel build uses absolute expressions suffixed with @lo/@ha
relocations. This currently doesn't work for DS/DQ form instructions and
there is no reason for it not to. It also works with GAS.
This patch allows this as long as the value is a multiple of 4/16
for DS/DQ form.
Differential revision: https://reviews.llvm.org/D115419
Otherwise callers of these functions have to check both the return value
for and the contents of the returned llvm::Optional.
Fixes#53742
Differential Revision: https://reviews.llvm.org/D119525
In 529aa4b011
by setting the identifier info to nullptr, we started to subtly
interfere with the parts in the beginning of the function,
529aa4b011/clang/lib/Format/UnwrappedLineParser.cpp (L991)
causing the preprocessor nesting to change in some cases. E.g., for the
added regression test, clang-format started incorrectly guessing the
language as C++.
This tries to address this by introducing an internal identifier info
element to use instead.
Reviewed By: curdeius, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D120315
'streaming-sve' is not a feature that users should be able to set,
hence why it shouldn't show up in user-diagnostics. The only
flag that end-users should be able to set is '+sme'.
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D120256
This patch is the first in a series of patches to upstream the support for Apple's DriverKit. Once complete, it will allow targeting DriverKit platform with Clang similarly to AppleClang.
This code was originally authored by JF Bastien.
Differential Revision: https://reviews.llvm.org/D118046
Split v512.32 binary ops into two v256.32 ops using packing support
opcodes (vec_unpack_lo|hi, vec_pack).
Depends on D120053 for packing opcodes.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D120146
With only a load-fold the diffs look neutral. If there's a load and store (rmw)
fold opportunity as shown in the test based on #53862, then we end up with an
extra instruction.
Fixes#53862
Differential Revision: https://reviews.llvm.org/D120281
This patch brings in some initial changes for lowering Fortran
intrinsics. Intrinsics are generally lowered to a mix of FIR and
MLIR operations, runtime calls or LLVM intrinsics. This patch
particularly brings in the lowering of the Fortran `andi` intrinsic
to `arith.andi` in MLIR.
The significant changes are in ConvertExpr.cpp and IntrinsicCall.cpp.
Intrinsic functions occur as part of expressions. Lowering deals with this
in ConvertExpr.cpp in `genval(const Fortran::evaluate::FunctionRef<A> &funcRef)`.
The code in the above mentioned function kicks of a sequence of calls
that ultimately results in a call to the `genIand ` function in
IntrinsicCall.cpp which creates the MLIR `arith.andi` operation.
A few tests are also included.
Note: Generally intrinsics like `iand` can occur in array (elemental)
context, but since that part is not fully supported in lowering, tests
are only added for the scalar context.
This patch is part of upstreaming from the fir-dev branch of
https://github.com/flang-compiler/f18-llvm-project.
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D119990
Co-authored-by: Jean Perier <jperier@nvidia.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: zacharyselk <zrselk@gmail.com>
Co-authored-by: V Donaldson <vdonaldson@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Extends getReductionOpChain to look through Phis which may be part of
the reduction chain. adjustRecipesForReductions will now also create a
CondOp for VPReductionRecipe if the block is predicated and not only if
foldTailByMasking is true.
Changes were required in tryToBlend to ensure that we don't attempt
to convert the reduction Phi into a select by returning a VPBlendRecipe.
The VPReductionRecipe will create a select between the Phi and the reduction.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D117580
The race is between these two pieces of code that are executed in two separate
lldb-vscode threads (the first is in the main thread and another is in the
event-handling thread):
```
// lldb-vscode.cpp
g_vsc.debugger.SetAsync(false);
g_vsc.target.Launch(launch_info, error);
g_vsc.debugger.SetAsync(true);
```
```
// Target.cpp
bool old_async = debugger.GetAsyncExecution();
debugger.SetAsyncExecution(true);
debugger.GetCommandInterpreter().HandleCommands(GetCommands(), exc_ctx,
options, result);
debugger.SetAsyncExecution(old_async);
```
The sequence that leads to the bug is this one:
1. Main thread enables synchronous mode and launches the process.
2. When the process is launched, it generates the first stop event.
3. This stop event is catched by the event-handling thread and DoOnRemoval
is invoked.
4. Inside DoOnRemoval, this thread runs stop hooks. And before running stop
hooks, the current synchronization mode is stored into old_async (and
right now it is equal to "false").
5. The main thread finishes the launch and returns to lldb-vscode, the
synchronization mode is restored to asynchronous by lldb-vscode.
6. Event-handling thread finishes stop hooks processing and restores the
synchronization mode according to old_async (i.e. makes the mode synchronous)
7. And now the mode is synchronous while lldb-vscode expects it to be
asynchronous. Synchronous mode forbids the process to broadcast public stop
events, so, VS Code just hangs because lldb-vscode doesn't notify it about
stops.
So, this diff makes the target intercept the first stop event if the process is
launched in the synchronous mode, thus preventing stop hooks execution.
The bug is only present on Windows because other platforms already
intercept this event using their own hijacking listeners.
So, this diff also fixes some problems with lldb-vscode tests on Windows to make
it possible to run the related test. Other tests still can't be enabled because
the debugged program prints something into stdout and LLDB can't intercept this
output and redirect it to lldb-vscode properly.
Reviewed By: jingham
Differential Revision: https://reviews.llvm.org/D119548
Header class in SPIR-V HTML spec has changed. Update script to reflect that.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D120179
This patch fixes a logical error in how we work with `LoopUsers` map.
It maps a loop onto a set of AddRecs that depend on it. The Addrecs
are added to this map only once when they are created and put to
the UniqueSCEVs` map.
The only purpose of this map is to make sure that, whenever we forget
a loop, all (directly or indirectly) dependent SCEVs get forgotten too.
Current code erases SCEVs from dependent set of a given loop whenever
we forget this loop. This is not a correct behavior due to the following scenario:
1. We have a loop `L` and an AddRec `AR` that depends on it;
2. We modify something in the loop, but don't destroy it. We still call forgetLoop on it;
3. `AR` is no longer dependent on `L` according to `LoopUsers`. It is erased from
ValueExprMap` and `ExprValue map, but still exists in UniqueSCEVs;
4. We can later request the very same AddRec for the very same loop again, and get existing
SCEV `AR`.
5. Now, `AR` exists and is used again, but its notion that it depends on `L` is lost;
6. Then we decide to delete `L`. `AR` will not be forgotten because we have lost it;
7. Just you wait when you run into a dangling pointer problem, or any other kind of problem
because an active SCEV is now referecing a non-existent loop.
The solution to this is to stop erasing values from `LoopUsers`. Yes, we will maybe forget something
that is already not used, but it's cheap.
This fixes a functional bug and potentially may have negative compile time impact on methods with
huge or numerous loops.
Differential Revision: https://reviews.llvm.org/D120303
Reviewed By: nikic
Most places already seem to use the short spelling instead of
'unsigned int/long', so perform the following substitutions:
s/unsigned int /uint /g
s/unsigned long /ulong /g
This simplifies completeness comparisons against OpenCLBuiltins.td.
Differential Revision: https://reviews.llvm.org/D120032
This is an initial enabling patch for module partition support.
We add enumerations for partition interfaces/implementations.
This means that the module kind enumeration now occupies three
bits, so the AST streamer is adjusted for this. Adding one bit there
seems preferable to trying to overload the meanings of existing
kinds (and we will also want to add a C++20 header unit case later).
Differential Revision: https://reviews.llvm.org/D114714
As suggested by the cmake warning:
CMake Warning at <...>/llvm-project/libcxx-ci/libcxx/CMakeLists.txt:289 (message):
LIBCXX_TARGET_TRIPLE is deprecated, please use CMAKE_CXX_COMPILER_TARGET instead
Depends on D119948
Differential Revision: https://reviews.llvm.org/D120038
This patch fixes an invalid TypeSize->uint64_t implicit conversion in
FoldReinterpretLoadFromConst. If the size of the constant is scalable
we bail out of the optimisation for now.
Tests added here:
Transforms/InstCombine/load-store-forward.ll
Differential Revision: https://reviews.llvm.org/D120240
Summary:
This patch adds checks that were missing in clang for Armv8.5/6/7-A. These include:
* ACLE macro defines for AArch32.
* Handling of crypto and SM4, SHA and AES feature flags on clang's driver.
Reviewers: dmgreen, SjoerdMeijer, tmatheson
Differential Revision: https://reviews.llvm.org/D116153
Constants cannot be cyclic, but they can be tree-like. Keep a
visited set to ensure we do not degenerate to exponential run-time.
This fixes the problem reported in https://reviews.llvm.org/D117223#3335482,
though I haven't been able to construct a concise test case for
the issue. This requires a combination of dead constants and the
kind of constant expression tree that textual IR cannot represent
(because the textual representation, unlike the in-memory
representation, is also exponential in size).
The related functionality is moved over to the bufferization dialect. Test cases are cleaned up a bit.
Differential Revision: https://reviews.llvm.org/D120191