There are cases, like with -fzero-call-used-regs, where we need to know
which registers can be used by a certain calling convention. This change
generates a list of registers used by each calling convention defined in
*CallingConv.td.
Calling conventions that use registers conditioned on Swift have those
registers placed in a separate list. This allows us to be flexible about
whether to use the Swift registers or not.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D125421
By closing over the `rank` itself rather than `this`, we save a method call on each iteration. A minor optimization, but one that adds up.
Depends On D126016
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D126019
There are cases, like with -fzero-call-used-regs, where we need to know
which registers can be used by a certain calling convention. This change
generates a list of registers used by each calling convention defined in
*CallingConv.td.
Calling conventions that use registers conditioned on Swift have those
registers placed in a separate list. This allows us to be flexible about
whether to use the Swift registers or not.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D125421
Only remove dx.valver from module flags when cleanup module flags in DXILTranslateMetadataPass.
Reviewed By: beanz
Differential Revision: https://reviews.llvm.org/D125842
This patch should fix a bug in PExpect.launch that happened when color
support is not enabled.
In that case, we need to add the `--no-use-colors` flag to lldb's launch
argument list. However, previously, each character to the string was
appended separately to the `args` list. This patch solves that by adding
the whole string to the list.
This should fix the TestIOHandlerResize failure on GreenDragon.
Differential Revision: https://reviews.llvm.org/D126021
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This reverts commit 86f7d7074a.
The test cases added for this exposed an pre-existing bug that is failing
the expensive checks bot. Reverting so I can revert that patch.
This diff adjusts binaryOr to take advantage of the analysis
based on KnownBits.
Differential revision: https://reviews.llvm.org/D125933
Test plan:
1/ ninja check-llvm
2/ ninja check-llvm-unit
When parallel is used in a combined construct, then use a separate
function to create the parallel operation. It handles the parallel
specific clauses and leaves the rest for handling at the inner
operations.
Reviewed By: peixin, shraiysh
Differential Revision: https://reviews.llvm.org/D125465
Co-authored-by: Sourabh Singh Tomar <SourabhSingh.Tomar@amd.com>
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Nimish Mishra <neelam.nimish@gmail.com>
It is already marked as having side effects, at least in MIR. It does
not interact with anything else that is modelled as a memory access
either in IR or MachineIR.
Differential Revision: https://reviews.llvm.org/D125985
s_getreg does not interact with anything else that is modelled as a
memory access either in IR or MachineIR.
Differential Revision: https://reviews.llvm.org/D125968
Since these are unused, I've removed them from the configuration, so that it can be easier to read and follow.
Reviewed By: stellaraccident
Differential Revision: https://reviews.llvm.org/D125132
Update clearReductionWrapFlags to use the VPlan def-use chain from the
reduction phi recipe to drop reduction wrap flags.
This addresses an existing FIXME and fixes a crash when instructions in
the reduction chain are not used and have been removed before VPlan
codegeneration.
Fixes#55540.
This became empty when we removed the legacy macho lld. This results in
a warning when running `check-lld`. We can revert this in the future if
we want unit tests.
Differential Revision: https://reviews.llvm.org/D125436
Similar to D124357, this adds some cost modelling for fptoi_sat for Arm
targets. Where VFP2 is available (and FP64/FP16 for the relevant types),
the operations are legal as the Arm instructions naturally saturate.
Otherwise they will need an extra smin/smax clamp, similar to AArch64.
Differential Revision: https://reviews.llvm.org/D125665
Splitted the merge constant-indexed GEP optimization into two smaller transformations: 1. Merging GEP of GEP if both are constant-indexed. 2. Swapping constant indexed GEP in a chain of (non-constant) GEP to the end, so that 1 can be applied repeatedly.
There is existing code to partially handle transformation 1, but it only deals with limited cases
Unit tests are breaking down into two parts for the 2 transformations.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D125438
Add Value Tracking support to deduce induction variable being a power of 2, allowing urem optimizations
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D125332
Summary:
This patch adds the `leaf` attribute to the `vprintf` declaration in the
OpenMP runtime. This attribute allows us to determine that the `vprintf`
function will not call any functions within the translation unit,
allowing us to deduce `norecurse` attributes on the caller.
FixIrreducibleControlFlow pass adds dispatch blocks with a `br_table`
that has multiple predecessors and successors, because it serves as
something like a traffic hub for BBs. As a result of this, there can be
register uses that are not dominated by a def in every path from the
entry block. For example, suppose register %a is defined in BB1 and used
in BB2, and there is a single path from BB1 and BB2:
```
BB1 -> ... -> BB2
```
After FixIrreducibleControlFlow runs, there can be a dispatch block
between these two BBs:
```
BB1 -> ... -> Dispatch -> ... -> BB2
```
And this dispatch block has multiple predecessors, now
there is a path to BB2 that does not first visit BB1, and in that path
%a is not dominated by a def anymore.
To fix this problem, we have been adding `IMPLICIT_DEF`s to all
registers in PrepareForLiveInternals pass, and then remove unnecessary
ones in OptimizeLiveIntervals pass after computing `LiveIntervals`. But
FixIrreducibleControlFlow pass itself ends up violating register use-def
relationship, resulting in invalid code. This was OK so far because
MIR verifier apparently didn't check this in validation. But @arsenm
fixed this and it caught this bug in validation
(https://github.com/llvm/llvm-project/issues/55249).
This CL moves the `IMPLICIT_DEF` adding routine from
PrepareForLiveInternals to FixIrreducibleControlFlow. We only run it
when FixIrreducibleControlFlow changes the code. And then
PrepareForLiveInternals doesn't do anything other than setting
`TracksLiveness` property, which is a prerequisite for running
`LiveIntervals` analysis, which is required by the next pass
OptimizeLiveIntervals.
But in our backend we don't seem to do anything that invalidates this up
until OptimizeLiveIntervals, and I'm not sure why we are calling
`invalidateLiveness` in ReplacePhysRegs pass, because what that pass
does is to replace physical registers with virtual ones 1-to-1. I
deleted the `invalidateLiveness` call there and we don't need to set
that flag explicitly, which obviates all the need for
PrepareForLiveInternals.
(By the way, This 'Liveness' here is different from `LiveIntervals`
analysis. Setting this only means BBs' live-in info is correct, all uses
are dominated by defs, `kill` flag is conservatively correct, which
means if there is a `kill` flag set it should be the last use. See
2a0837aab1/llvm/include/llvm/CodeGen/MachineFunction.h (L125-L134)
for details.)
So this CL removes PrepareForLiveInternals pass altogether. Something
similar to this was attempted by D56091 long ago but that came short of
actually removing the pass, and I couldn't land it because
FixIrreducibleControlFlow violated use-def relationship, which this CL
fixes.
This doesn't change output in any meaningful way. All test changes
except `irreducible-cfg.mir` are register numbering.
Also this will likely to reduce compilation time, because we have been
adding `IMPLICIT_DEF` for all registers every time `-O2` is given, but
now we do that only when there is irreducible control flow, which is
rare.
Fixes https://github.com/llvm/llvm-project/issues/55249.
Reviewed By: dschuff, kripken
Differential Revision: https://reviews.llvm.org/D125515
Previously the error message didn't include the failing path, which made
it hard to tell what went wrong.
Differential Revision: https://reviews.llvm.org/D121665
These no longer exist. A few have been added since but I'm not enough of
an expert to provide a useful blurb on them outside of what you see with
`--help`.
Differential Revision: https://reviews.llvm.org/D122361
When creating an archive, llvm-ar looks at the host to determine the
archive format to use, on Apple platforms this means it uses the
K_DARWIN format. K_DARWIN is _virtually_ equivalent to K_BSD, expect for
some very slight differences around padding, timestamps in deterministic
mode, and 64 bit formats. When updating an archive using llvm-ar, or
llvm-objcopy, Archive would try to determine the kind, but it was not
possible to get K_DARWIN in the initialization of the archive, because
they're virtually inciting usable from K_BSD, especially since the
slight differences only apply in very specific cases. This leads to
linker failures when the alignment workaround is not applied to an
archive copied with llvm-objcopy. This change teaches Archive to infer
the K_DARWIN type in the cases where it's possible and the first object
in the archive is a macho object. This avoids using the host triple to
determine this to not affect cross compiling.
Ideally we would eliminate the separate K_DARWIN type entirely since
it's not a truly separate archive type, but then we'd have to force the
macho workarounds on the BSD format generally. This might be acceptable
but then it would be unclear how to handle this case without forcing the
K_DARWIN64 format on all BSD users:
```
if (LastOffset >= Sym64Threshold) {
if (Kind == object::Archive::K_DARWIN)
Kind = object::Archive::K_DARWIN64;
else
Kind = object::Archive::K_GNU64;
}
```
The logic used to determine if the object is macho is derived from the
logic llvm-ar uses.
Previous context:
- 111cd669e9
- 23a76be5ad
Differential Revision: https://reviews.llvm.org/D124895
WG14 DR423 (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2148.htm#dr_423),
resolved during the C11 time frame, changed the way qualifiers are
handled on function return types and in cast expressions after it was
noted that these types are now directly observable via generic
selection expressions. In C, the function declarator is adjusted to
ignore all qualifiers (including _Atomic qualifiers).
Clang already handles the cast expression case correctly (by performing
the lvalue conversion, which drops the qualifiers as well), but with
these changes it will now also handle function declarations
appropriately.
Fixes#39595
Differential Revision: https://reviews.llvm.org/D125919
It doesn't matter which value we use for dead args, so let's switch
to poison, so we can eventually kill undef.
Reviewed By: aeubanks, fhahn
Differential Revision: https://reviews.llvm.org/D125983
This is the first commit for the cmov-vs-branch optimization pass.
The goal is to develop a new profile-guided and target-independent cost/benefit analysis
for selecting conditional moves over branches when optimizing for performance.
Initially, this new pass is expected to be enabled only for instrumentation-based PGO.
RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D120230
This patch adds the `nocallback` attribute to the NVVM intrinsics that
did not use the `DefaultAttrsIntrinsic` method that includes it already.
The `nocallback` attribute states that the intrinsic function cannot
enter back into the caller's translation-unit. This allows as to
determine that a function calling a `nocallback` function can have the
`norecurse` attribute. This should be safe for all the NVVM intrinsics
because they do not call other functions within the translation unit.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125937
This patch implements the following floating point negative absolute value
builtins that required for compatibility with the XL compiler:
```
double __fnabs(double);
float __fnabss(float);
```
These builtins will emit :
- fnabs on PWR6 and below, or if VSX is disabled.
- xsnabsdp on PWR7 and above, if VSX is enabled.
Differential Revision: https://reviews.llvm.org/D125506