Add the class MultiAffineFunction which represents functions whose domain is an
IntegerPolyhedron and which produce an output given by a tuple of affine
expressions in the IntegerPolyhedron's ids.
Also add support for piece-wise MultiAffineFunctions, which are defined on a
union of IntegerPolyhedrons, and may have different output affine expressions
on each IntegerPolyhedron. Thus the function is affine on each individual
IntegerPolyhedron piece in the domain.
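As a rough illustration of the idea (not the classes added by this patch), the sketch below models a piece-wise function whose pieces each pair a domain with a tuple of affine output expressions. All type and function names here (`AffineExpr`, `Polyhedron`, `AffinePiece`, `PiecewiseAffineFunction`, `makeExample`) are hypothetical, and the domain is simplified to a list of inequalities of the form `expr >= 0`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins; these are not the MLIR Presburger classes.

// An affine expression c0*x0 + ... + c(n-1)*x(n-1) + constant.
struct AffineExpr {
  std::vector<int64_t> coeffs;
  int64_t constant;

  int64_t eval(const std::vector<int64_t> &point) const {
    assert(point.size() == coeffs.size() && "dimension mismatch");
    int64_t result = constant;
    for (std::size_t i = 0; i < coeffs.size(); ++i)
      result += coeffs[i] * point[i];
    return result;
  }
};

// A polyhedron described only by inequalities of the form expr >= 0.
struct Polyhedron {
  std::vector<AffineExpr> inequalities;

  bool contains(const std::vector<int64_t> &point) const {
    for (const AffineExpr &ineq : inequalities)
      if (ineq.eval(point) < 0)
        return false;
    return true;
  }
};

// One piece: a domain plus a tuple of affine output expressions.
struct AffinePiece {
  Polyhedron domain;
  std::vector<AffineExpr> outputs;
};

// A function defined on a union of pieces; it is affine on each piece.
struct PiecewiseAffineFunction {
  std::vector<AffinePiece> pieces;

  // Returns the output tuple, or nullopt if the point is outside every piece.
  std::optional<std::vector<int64_t>>
  valueAt(const std::vector<int64_t> &point) const {
    for (const AffinePiece &piece : pieces) {
      if (!piece.domain.contains(point))
        continue;
      std::vector<int64_t> result;
      for (const AffineExpr &out : piece.outputs)
        result.push_back(out.eval(point));
      return result;
    }
    return std::nullopt;
  }
};

// Example: f(x) = (x, 2x + 1) for x >= 0 and f(x) = (-x, 0) for x <= -1.
PiecewiseAffineFunction makeExample() {
  AffinePiece nonNegative;
  nonNegative.domain.inequalities = {AffineExpr{{1}, 0}}; // x >= 0
  nonNegative.outputs = {AffineExpr{{1}, 0}, AffineExpr{{2}, 1}};

  AffinePiece negative;
  negative.domain.inequalities = {AffineExpr{{-1}, -1}}; // -x - 1 >= 0
  negative.outputs = {AffineExpr{{-1}, 0}, AffineExpr{{0}, 0}};

  PiecewiseAffineFunction f;
  f.pieces = {nonNegative, negative};
  return f;
}
```

The sketch evaluates at a single point only; it does not attempt any of the set operations a real polyhedron class would provide.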
This is part of a series of patches leading up to parametric integer programming.
Depends on D118778.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D118779
Debug position data is cleared after ScheduleDAGMILive::schedule() because schedule() also calls placeDebugValues(). Do not clear the data after the initial call to placeDebugValues(), since we call it again after reverting a schedule.
Secondly, since we skip debug instructions when reverting the schedule on AMDGPU, all debug instructions are now moved to the end of the scheduling region. RegionEnd points to the beginning of this chunk of debug instructions, since it was not incremented when a debug instruction was skipped. RegionBegin may also point to the same debug instruction if Unsched.front() is a debug instruction, which shrinks the region to 1. Fix RegionBegin and RegionEnd so that they point to the current beginning and end before calling placeDebugValues(), since both variables are used as reference points when moving debug instructions back.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D119022
Currently GCC produces lots of warnings. Most of them are `-Wattributes`, but these warnings are completely ignored by everybody. So let's disable `-Wattributes` and make the output cleaner.
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D119140
to decrease sizeof(SymbolUnion) from 72 to 64 on ELF64 platforms.
Use a dummy `Undefined` to prevent a null pointer dereference of `*rel.sym`
(though the result is unused) in InputSectionBase::relocateAlloc.
The relocation order may shuffle a bit, but otherwise there is no behavior
difference.
* Implement `FlatAffineConstraints::getConstantBound(EQ)`.
* Inject a simpler constraint for loops that have at most 1 iteration.
* Take into account constant EQ bounds of FlatAffineConstraints dims/symbols during canonicalization of the resulting affine map in `canonicalizeMinMaxOp`.
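The reasoning behind the single-iteration case can be sketched as follows, assuming a loop of the form `for (iv = lb; iv < ub; iv += step)` with constant bounds. When the trip count is at most 1, the induction variable can only take the value `lb` inside the body, so the pair of bounds can be replaced by one equality. The names `LoopBounds`, `tripCount` and `constantInductionValue` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of `for (iv = lb; iv < ub; iv += step)`.
struct LoopBounds {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Number of iterations: ceil((ub - lb) / step), or 0 for an empty range.
int64_t tripCount(const LoopBounds &loop) {
  assert(loop.step > 0 && "step must be positive");
  if (loop.ub <= loop.lb)
    return 0;
  return (loop.ub - loop.lb + loop.step - 1) / loop.step;
}

// If the loop runs at most once, the only value iv can take inside the body
// is lb, so `iv == lb` can stand in for the lower/upper bound pair. For a
// zero-trip loop the body never runs, so the equality is vacuously fine.
std::optional<int64_t> constantInductionValue(const LoopBounds &loop) {
  if (tripCount(loop) <= 1)
    return loop.lb;
  return std::nullopt;
}
```

For example, a loop from 0 to 4 with step 8 yields the constant value 0, while a loop from 0 to 16 with step 4 has four iterations and yields no constant.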
Differential Revision: https://reviews.llvm.org/D119153
Add SPARC to the list of platforms for which we provide a full
unwind implementation, which leads to _Unwind_Backtrace being defined within
libunwind.so.
Likewise for PPC (see D118320 for background).
Reviewed By: #libunwind, MaskRay, Arfrever
Differential Revision: https://reviews.llvm.org/D119068
This fixes TestGdbRemoteSingleStep.py and TestGdbRemote_vCont.py. This
patch updates the test to account for the possibility that the constants
are already materialized. This appears to behave differently between
embedded arm64 devices and Apple Silicon.
If we call CGOpenCLRuntime::convertOpenCLSpecificType() multiple times
we should get the same type back.
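One common way to obtain this property is to memoize the conversion, so that repeated requests return the same object. The sketch below shows that pattern only; it is an assumption, not a description of how the patch achieves it, and `ConvertedType` and `TypeConverter` are hypothetical names.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical converted type; identity is the address of the object.
struct ConvertedType {
  std::string name;
};

// Hypothetical converter that caches results by source type name.
class TypeConverter {
  std::map<std::string, std::unique_ptr<ConvertedType>> cache;

public:
  // Returns the same pointer for the same source name on every call.
  const ConvertedType *convert(const std::string &sourceName) {
    auto it = cache.find(sourceName);
    if (it == cache.end()) {
      auto created = std::make_unique<ConvertedType>(
          ConvertedType{"converted." + sourceName});
      it = cache.emplace(sourceName, std::move(created)).first;
    }
    return it->second.get();
  }
};
```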
Reviewed By: svenvh
Differential Revision: https://reviews.llvm.org/D119011
lld/ELF/OutputSections.cpp includes llvm/Config/config.h for
LLVM_ENABLE_ZLIB definition, but llvm/Config/config.h doesn't exist in
standalone build.
To fix this, this patch moves LLVM_ENABLE_ZLIB from config.h to
llvm-config.h and updates OutputSections.cpp to include llvm-config.h
instead of config.h.
Reviewed By: MaskRay, mgorny
Differential Revision: https://reviews.llvm.org/D119058
This is a follow-up suggested in D119060.
Instead of checking each of the bottom 2 bits individually,
we can check them together and handle the possibility that
we demand both together.
https://alive2.llvm.org/ce/z/C2ihC2
Differential Revision: https://reviews.llvm.org/D119139
The parsing of nested names is a little lax. This corrects that.
1) The 'L' local name prefix cannot appear before a NestedName -- only
within it. Let's remove that check from parseName, and then adjust
parseUnscopedName to allow it with or without the 'St' prefix.
2) In a nested name, a <template-param>, <decltype> or <substitution>
can only appear as the first element. Let's enforce that. Note I do
not remove these from the loop, to make the change easier to follow
(such a change will come later).
3) Given that, there's no need to special case 'St' outside of the
loop, handle it with the other 'S' elements.
4) There's no need to reset 'EndsWithTemplateArgs' after each
non-template-arg component. Rather, always clear it and then set it
in the template-args case.
5) A template-args cannot immediately follow a template-args.
6) The parsing of a CDtor name with ABITags would attach the tags to
the NestedName node, rather than the CDTor node. This is different to
how ABITags are attached to an unscopedName. Make it consistent.
7) We remain with only CDTor and UnscopedName requiring construction
of a NestedName, so let's drop the PushComponent lambda.
8) Add some tests to catch the new rejected manglings.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D118132
After aed965d55d we no longer demangle and store the full name. The
test was updated accordingly but the comment still specified that we
should be able to find the symbol by its full demangled name.
We were dropping the [gs] modifier by parsing it in parseExpr, but not
forwarding it on to parseUnresolvedName. This is the straightforward
fix to forward that flag -- parseExpr must see past it.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D118504
The StdQualifiedName node class is used for names exactly in the std
namespace. It is not used for nested names that descend further --
those use a NestedName with NameType("std") as the scope.
Representing the compression scheme in the node graph is layer
breaking. We can use the same structure for those exactly in std too,
and reduce code size a bit.
Reviewed By: ChuanqiXu
Differential Revision: https://reviews.llvm.org/D118249
AArch32/Armv8A introduced the performance deprecation of certain patterns
of IT instructions. After some debate internal to ARM, this is now being
reverted; i.e. no IT instruction patterns are performance deprecated
anymore, as the performance degradation is not significant enough.
This reverts the following:
"ARMv8-A deprecates some uses of the T32 IT instruction. All uses of
IT that apply to instructions other than a single subsequent 16-bit
instruction from a restricted set are deprecated, as are explicit
references to the PC within that single 16-bit instruction. This permits
the non-deprecated forms of IT and subsequent instructions to be treated
as a single 32-bit conditional instruction."
The deprecation no longer applies, but the behaviour may be controlled
by the -arm-restrict-it and -arm-no-restrict-it command-line options,
with the latter being the default. No warnings about complex IT blocks
will be generated.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D118044
The LLVM Embedded Toolchains working group's regular sync-up calls start in
early March. Add the details to the table of sync-ups for general reference.
Differential Revision: https://reviews.llvm.org/D118884
This is effectively inverting the transform added with D116804
because the downside of the false dependency of something like
"sbb %eax, %eax" is much greater than the upside of eliminating
a zeroing instruction on (all?) Intel CPUs.
Differential Revision: https://reviews.llvm.org/D118843
If we had a large offset which required materializing in a register,
we would emit an s_add_i32, clobbering SCC. Start checking if SCC is
live, and instead use a VGPR offset. For MUBUF, we switch to using
offen. We would do this anyway in a normal load/store with a frame
index, but not for spills.
The same problem still exists in other contexts where we expand frame
indices.
The nasty edge case is when SGPRs are spilled to memory at a large
frame offset where SCC is also clobbered. This requires a second
scavenging index, and also required several patches in the scavenger
to correctly handle multiple recursive scavenge indexes.
An even nastier edge case we still don't support is if we don't have
any free SGPRs. If SCC is live and we don't have any free SGPRs to
save exec, we have no way of flipping exec back and forth without also
clobbering SCC.
Fixes: SWDEV-309419
This patch modifies the FCOPYSIGN lowering to go through the BSP
pseudo-instruction. This allows the same lowering code for NEON,
SVE and SVE2.
As part of this, lowering for BSP for SVE and SVE2 is also added.
For SVE and NEON this patch is NFC.
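For reference, the underlying idea can be shown on scalar values: copysign is a bitwise select in which the mask picks the sign bit from one operand and all other bits from the other. This is a minimal sketch on 32-bit floats; `bitSelect` and `copySignViaBitSelect` are hypothetical helpers, and the operand order of `bitSelect` is an assumption made for illustration, not the definition of the BSP pseudo-instruction.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: for each bit, take `a` where the mask bit is 1 and
// `b` where it is 0.
uint32_t bitSelect(uint32_t mask, uint32_t a, uint32_t b) {
  return (a & mask) | (b & ~mask);
}

// copysign(magnitude, sign) expressed as a bitwise select on the sign bit.
float copySignViaBitSelect(float magnitude, float sign) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  uint32_t magBits;
  uint32_t signBits;
  std::memcpy(&magBits, &magnitude, sizeof(magBits));
  std::memcpy(&signBits, &sign, sizeof(signBits));

  // Only the top bit comes from `sign`; everything else from `magnitude`.
  const uint32_t signMask = 0x80000000u;
  uint32_t resultBits = bitSelect(signMask, signBits, magBits);

  float result;
  std::memcpy(&result, &resultBits, sizeof(result));
  return result;
}
```

Because the operation is a pure bit pattern select, the same shape applies to any vector unit that offers a bitwise select.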
Differential Revision: https://reviews.llvm.org/D118394
A significant number of our tests in C accidentally use functions
without prototypes. This patch converts the function signatures to have
a prototype for the situations where the test is not specific to K&R C
declarations. e.g.,
void func();
becomes
void func(void);
This is the third batch of tests being updated (there are a significant
number of other tests left to be updated).
- Add or remove empty lines surrounding union blocks.
- Fixes https://github.com/llvm/llvm-project/issues/53229, in which
keywords like class and struct, in a line that ends with a left brace or
is followed by a line containing only a left brace, are falsely recognized
as a definition line, causing extra empty lines to be inserted around
blocks that do not need to be formatted.
Reviewed By: MyDeveloperDay, curdeius, HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D119067
D43208 extracted `useEmulatedMaskMemRefHack()` from legality into cost model.
What it essentially does is prevent scalarized vectorization of masked memory operations:
```
// TODO: Cost model for emulated masked load/store is completely
// broken. This hack guides the cost model to use an artificially
// high enough value to practically disable vectorization with such
// operations, except where previously deployed legality hack allowed
// using very low cost values. This is to avoid regressions coming simply
// from moving "masked load/store" check from legality to cost model.
// Masked Load/Gather emulation was previously never allowed.
// Limited number of Masked Store/Scatter emulation was allowed.
```
While I don't really understand what specifically `is completely broken`
was referring to, I believe that at least on X86 with AVX2-or-later,
this is no longer true (or at least, I would like to know what is still broken).
So I would like to follow suit after D111460, and likewise disable that hack for AVX2+.
But since this was added for X86 specifically, let's just completely remove this hack instead.
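For readers unfamiliar with the term, a scalarized (emulated) masked load behaves roughly like the sketch below: each lane tests its own mask bit and only touches memory when the bit is set. This is an illustration of the concept only, with a hypothetical `emulatedMaskedLoad` helper and a fixed lane count; it is not code from the vectorizer.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 4;

// Scalarized emulation of a masked load: enabled lanes read from memory,
// disabled lanes keep the pass-through value and never access memory.
std::array<int, kLanes>
emulatedMaskedLoad(const int *base, const std::array<bool, kLanes> &mask,
                   const std::array<int, kLanes> &passThrough) {
  std::array<int, kLanes> result = passThrough;
  for (std::size_t lane = 0; lane < kLanes; ++lane)
    if (mask[lane])
      result[lane] = base[lane];
  return result;
}
```

The per-lane branch is what makes the emulated form expensive compared with a native masked load, which is why its cost modelling matters.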
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D114779