As test in PR56672 shows, LAA produces different results which lead to either
positive or negative vectorization decisions depending on the order of blocks
in loop. The exact reason of this is not clear to me, however this makes investigation
of related bugs extremely complex.
Current order of blocks in the loop is arbitrary. It may change, for example, if loop
info analysis is dropped and recomputed. Seems that it interferes with LAA's logic.
This patch chooses fixed traversal order of blocks in loops, making it RPOT.
Note: this is *not* a fix for bug with incorrect analysis result. It just makes
the answer more robust to make the investigation easier.
Differential Revision: https://reviews.llvm.org/D130482
Reviewed By: aeubanks, fhahn
We will insert a new operand which is identical to the Dest for complex
FMUL with a mask. https://godbolt.org/z/eTEdnYv3q
Complex FMA and FMUL with maskz don't have this problem.
Reviewed By: LuoYuanke, skan
Differential Revision: https://reviews.llvm.org/D130638
By not clustering loads and adjusting heuristics to more aggressively reduce
register pressure we may be able to increase occupancy for the function if it
was dropped in a first pass scheduling.
Similarly, try to reduce spilling if register usage exceeds lower bound
occupancy.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D130329
In diff and diffstat modes, the return code is != 0 even when there are no
changes between commits. This issue can be fixed by passing --exit-code to
git-diff command that returns 0 when there are no changes and using that as
the return code for git-clang-format.
Fixes#56736.
Differential Revision: https://reviews.llvm.org/D129311
The checkout action will hard-code the default github actions token in
the git config so that all pushes use it. We need to set
persist-credentials=false so we can use a token that has permission
to push to the llvm-project-release-prs repo.
Clear all kill flags on source register when folding a COPY.
This is necessary because the kills may now be out of order with the uses.
Reviewed By: foad
Differential Revision: https://reviews.llvm.org/D130622
* https://discourse.llvm.org/t/rfc-removing-the-quant-dialect/3643/8
* Removes most ops. Leaves casts given final comment (can remove more in a followup).
* There are a few uses in Tosa keeping some of the utilities alive. In a followup, I will probably elect to just move simplified versions of them into Tosa itself vs having this quasi-library dependency.
Differential Revision: https://reviews.llvm.org/D120204
InstCombine and DAGCombine prefer to keep shl before binops.
This patch teaches isel to convert to (shl (and/or/xor X, C1 >> C2), C2)
if (C1 >> C2) is a simm12. The idea was taken from X86's isel code.
There's a special case implemented for a sext_inreg between the
shift and the binop.
Differential Revision: https://reviews.llvm.org/D130610
Vtables will be emitted in fewer places than ctors (every ctor
references the vtable, so at worst it's the same places - but at best
the type has a non-inline key function and the vtable is emitted in one
place)
Pulling this fix out of 517bbc64db which
was reverted in 4821508d4d
This commit extends UnifyAliasedResourcePass to handle the case
where aliased resources have different vector sizes. (It still
requires all scalar types to be of the same bitwidth.) This is
effectively reusing the code for handling different-bitwidth
scalar types.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D130671
This commit fixes spv.CompositeConstruct to assembly to list
operand types to enable vector construction out of smaller vectors.
Validation is also fixed to properly check the cases for vector
construction.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D130669
tryLastChanceRecoloring iterates over the set of LiveInterval pointers
and used that to seed the recoloring stack, which was
nondeterministic. Fixes a future test failing about 20% of the time.
This just takes the order the interfering vreg was encountered. Not
sure if we should try to order this more intelligently.
Xcode 14 no longer puts the Rosetta expanded shared cache in a directory
named "16.0". Instead, it includes the real version number (e.g. 13.0),
the build string and the architecture, similar to the device support
directory names for iOS, tvOS and watchOS.
Currently, when there are multiple directories, we might end up picking
the wrong one in GetSDKDirectoryForCurrentOSVersion. The problem is that
without the build string we have no way to differentiate between
multiple directories with the same version number. This patch fixes the
problem by using GetOSBuildString which, as the name implies, returns
the build string if known.
This also adds a test for Rosetta debugging on Apple Silicon. Depending
on whether the Rosetta expanded shared cache is present, the test
ensures that there is or isn't a diagnostic about reading out of memory.
rdar://97576121
Differential revision: https://reviews.llvm.org/D130540
I noticed that the test TestSetWatchpoint.py was failing every so often
on macOS. The failure was in the last assert, that after destroying the
SBTarget containing it, the SBWatchpoint was still saying it was valid.
IsValid in this case just meant the watchpoint weak pointer could be turned
into a shared pointer. The watchpoint shared pointers have two strong references
in general, one to the "Target::m_last_created_watchpoint", and one in the
Target::m_watchpoint_list. Target::Destroy reset the last created watchpoint
but neglected to call RemoveAll on the watchpoint list (it does the analogous
work for the internal & external breakpoint lists...) This patch does the
equivalent cleanup for the watchpoint list.
DR2338 clarified that it was undefined behavior to set the value outside the
range of the enumerations values for an enum without a fixed underlying type.
We should diagnose this with a constant expression context.
Differential Revision: https://reviews.llvm.org/D130058
In this patch we replace common code patterns with the use of utility
functions for dealing with profiling metadata. There should be no change
in functionality, as the existing checks should be preserved in all
cases.
Reviewed By: bogner, davidxl
Differential Revision: https://reviews.llvm.org/D128860
Currently, there is significant code duplication for dealing with
MD_prof metadata throughout the compiler. These utility functions can
improve code reuse and simplify boilerplate code when dealing with
profiling metadata, such as branch weights. The inent is to provide a
uniform set of APIs that allow common tasks, such as identifying
specific types of MD_prof metadata and extracting branch weights.
Future patches can build on this initial implementation and clean up the
different implementations across the compiler.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D128858
Currently, the IR to MIR translator can only handle two kinds of constant
inputs to dbg.values intrinsics: constant integers and constant floats. In
particular, it cannot handle pointers created from IntToPtr ConstantExpression
objects.
This patch addresses the limitation above by replacing the IntToPtr with
its input integer prior to converting the dbg.value input.
Patch by Felipe Piovezan!
Differential Revision: https://reviews.llvm.org/D130642
Summary:
Linkers use `--verbose` to let users investigate search libraries among
other things. The linker wrapper was incorrectly not forwarding this to
the linker job. This patch simply renames this so users can still see
verbose messages from the linker if it was passed.
This patch adds support for AsmPrinter `-mmlir` options to the Flang driver.
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D130598
Since the calendar is added in C++20 the existing operators are removed.
Implements part of:
- P1614R2 The Mothership has Landed
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D129887
This change enables vectorization (using scalable vectorization only, fixed vectors are not yet enabled) for RISCV when vector instructions are available for the target configuration.
At this point, the resulting configuration should be both stable (e.g. no crashes), and profitable (i.e. few cases where scalar loops beat vector ones), but is not going to be particularly well tuned (i.e. we emit the best possible vector loop). The goal of this change is to align testing across organizations and ensure the default configuration matches what downstreams are using as closely as possible.
This exposes a large amount of code which hasn't otherwise been on by default, and thus may not have been fully exercised. Given that, having issues fall out is not unexpected. If you find issues, please make sure to include as much information as you can when reverting this change.
Differential Revision: https://reviews.llvm.org/D129013
Lambdas with trailing return type 'auto' are annotated incorrectly. It causes a misformatting. The simpliest code to reproduce is:
```
auto list = {[]() -> auto { return 0; }};
```
Fixes https://github.com/llvm/llvm-project/issues/54798
Reviewed By: HazardyKnusperkeks, owenpan, curdeius
Differential Revision: https://reviews.llvm.org/D130299
Lifting the core functionalities of the clang-offload-bundler into a
user-facing library/API. This will allow online and JIT compilers to
bundle and unbundle files without spawning a new process.
This patch lifts the classes and functions used to implement
the clang-offload-bundler into a separate OffloadBundler.cpp,
and defines three top-level API functions in OfflaodBundler.h.
BundleFiles()
UnbundleFiles()
UnbundleArchives()
This patch also introduces a Config class that locally stores the
previously global cl::opt options and arrays to allow users to call
the APIs in a multi-threaded context, and introduces an
OffloadBundler class to encapsulate the top-level API functions.
We also lift the BundlerExecutable variable, which is specific
to the clang-offload-bundler tool, from the API, and replace
its use with an ObjcopyPath variable. This variable must be set
in order to internally call llvm-objcopy.
Finally, we move the API files from
clang/tools/clang-offload-bundler into clang/lib/Driver and
clang/include/clang/Driver.
Differential Revision: https://reviews.llvm.org/D129873