This requires us to override the isTargetCanonicalConstantNode callback introduced in D128144, so we can recognise the various cases where a VBROADCAST_LOAD constant is being reused at different vector widths to prevent infinite loops.
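A minimal sketch of the shape such an override can take, assuming the hook signature introduced in D128144 (the set of opcodes handled here is illustrative, not the actual patch):

bool X86TargetLowering::isTargetCanonicalConstantNode(SDValue Op) const {
  // Treat broadcast-loaded constants as already canonical so DAG combines
  // do not keep rebuilding them at other vector widths.
  return Op.getOpcode() == X86ISD::VBROADCAST_LOAD ||
         TargetLowering::isTargetCanonicalConstantNode(Op);
}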
This reverts commit 9ffeaaa0ea.
This fixes debugging large executables with lldb and gdb.
When StringTableBuilder is used, the offset for any string can point
anywhere in the string table, while previously all strings were
inserted in order (without deduplication or tail merging).
For symbols, there are no complications in encoding the string offset;
the offset is encoded as a raw 32 bit binary number in half of the
symbol name field.
For sections, the string table offset is written as
"/<decimaloffset>", but if the decimal offset would be larger than
7 digits, it's instead written as "//<base64offset>". Tools that
operate on object files can handle the base64 offset format, but
apparently neither lldb nor gdb expects that syntax when locating the
debug information section. Prior to the reverted commit, all long
section names were located at the start of the string table, so
their offset never exceeded the range for the decimal syntax.
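For illustration, a self-contained sketch of the two encodings (a hypothetical helper, not lld's actual code):

#include <cstdint>
#include <string>

// Encode a string table offset into the 8-byte COFF section name field.
// Sketch only: offsets beyond 64^6-1 are not representable.
std::string encodeLongSectionName(uint64_t Offset) {
  if (Offset < 10000000) // '/' followed by at most 7 decimal digits
    return "/" + std::to_string(Offset);
  // GNU-style extension: "//" followed by 6 base64 digits,
  // most significant digit first.
  static const char Base64[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string Name = "//......";
  for (int I = 7; I >= 2; --I) {
    Name[I] = Base64[Offset % 64];
    Offset /= 64;
  }
  return Name;
}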
Just reverting this change for now, as the actual benefit from it
was fairly modest.
Longer term, lld could write all long section names unoptimized
at the start of the string table, followed by all the strings for
symbol names, with deduplication and tail merging. And lldb and
gdb could be fixed to handle sections with the base64 offset syntax.
This fixes https://github.com/mstorsjo/llvm-mingw/issues/289.
There was a copy-paste mistake in the embedded link:
`clang-tidy/checks/cppcoreguidelines-virtual-class-destructor`
->
`clang-tidy/checks/cppcoreguidelines/virtual-class-destructor`
Sphinx error:
/home/zbebnal/git/llvm-project/clang-tools-extra/docs/ReleaseNotes.rst:168:unknown document: clang-tidy/checks/cppcoreguidelines-virtual-class-destructor
Build bot: https://lab.llvm.org/buildbot#builders/115/builds/29805
Differential Revision: https://reviews.llvm.org/D126891
The `cppcoreguidelines-virtual-class-destructor` check is supposed to enforce
http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c35-a-base-class-destructor-should-be-either-public-and-virtual-or-protected-and-non-virtual
Quote:
> A **base** class destructor should be either public and virtual, or
> protected and non-virtual
[emphasis mine]
However, the check still triggers in the following case:
class MostDerived final : public Base {
public:
MostDerived() = default;
~MostDerived() = default;
void func() final;
};
The `MostDerived` class is marked `final`, so it should not be
considered a **base** class. Consequently, the rule is satisfied, yet
the check still flags this code.
In this patch, I'm proposing to ignore `final` classes since they cannot
be **base** classes.
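For reference, deriving from a `final` class is ill-formed, so such a class can never act as a base:

class EvenMoreDerived : public MostDerived {}; // error: 'MostDerived' is marked 'final'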
Reviewed By: whisperity
Differential Revision: https://reviews.llvm.org/D126891
As a followup to D128144, this adds extract(DUP(C)) as a canonical
constant to prevent it being transformed back into a BUILD_VECTOR,
leading to an infinite loop.
This revision adds the necessary plumbing for canonicalizing scf::ForeachThread with the
`AffineOpSCFCanonicalizationPattern`.
In the process, the `loopMatcher` helper is updated to take OpFoldResult instead of just values.
This allows composing various scenarios without the need for an artificial builder.
Differential Revision: https://reviews.llvm.org/D128244
This updates StdLibraryFunctionsChecker to set the state of 'errno'
by using the new errno_modeling functionality.
The errno value is set in the PostCall callback. Setting it in call::Eval
did not work for some reason, and it would have required every function
to be EvalCallAsPure, which may be bad to do. Now the errno value and
state are not allowed to be checked in any PostCall checker callback,
because it is unspecified whether errno was set already or will be set
later by this checker.
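For context, this is the kind of standard C library usage the errno modeling concerns (plain user code, not checker code):

#include <cerrno>
#include <cstdlib>

long parseDecimal(const char *Str) {
  errno = 0;
  long Value = std::strtol(Str, nullptr, 10);
  if (errno == ERANGE) // the checker models whether errno is meaningful here
    return 0;
  return Value;
}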
Reviewed By: martong, steakhal
Differential Revision: https://reviews.llvm.org/D125400
This patch adds the omp.taskgroup operation according to OpenMP 5.0
section 2.17.6, along with tests for it.
Reviewed By: kiranchandramohan, peixin
Differential Revision: https://reviews.llvm.org/D127250
To accommodate the macOS universal configuration, include the assembly
files and `blake3_neon.c` without a CMake check, and instead guard their
source with architecture "#ifdef" checks.
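A minimal sketch of that guard pattern (the function name here is hypothetical):

#if defined(__aarch64__) || defined(_M_ARM64)
// Compiled only into the arm64 slice of a universal binary; for other
// architectures this translation unit is effectively empty, so the file
// can be listed in CMake unconditionally.
void blake3_neon_example(void) {}
#endif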
Differential Revision: https://reviews.llvm.org/D128132
The failure that caused the previous revert has been fixed
by https://reviews.llvm.org/D126048
Original commit message:
RVV makes heavy use of subregisters due to LMUL>1 and segment
load/store tuples. Enabling subregister liveness tracking improves the quality
of the register allocation.
I've added a command-line option that can be used to turn it off if it
causes compile-time or functional issues. I used the option to keep the
old behavior for one interesting test case that was testing register
allocation.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D128016
There is no instruction to fold NZCV, so just do not do it. Without
the fix, the added test case crashes with the assert
"Mismatched register size in non subreg COPY".
Reviewed By: danilaml
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D127294
This patch implements a new way to generate CTR loops. The intrinsics
inserted by the hardware loop pass are now mapped to pseudo
instructions, and these pseudo instructions are expanded to a CTR
loop or a normal compare+branch loop in this post-ISEL pass.
Reviewed By: lkail
Differential Revision: https://reviews.llvm.org/D122125
Use it in place of VSELECT_VL+VRGATHER*_VL.
This simplifies the isel patterns.
Overall, I think trying to match select+op to create masked instructions
in isel doesn't scale. We either need to do it in DAG combine, pre-isel
peephole, or post-isel peephole. I don't yet know which is the right
answer, but for this case it seemed best to be able to request the
masked form directly from lowering.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D128023
For the case below, the virtual register is defined twice in the self loop. We
don't need to spill %0 after the third instruction `%0 = def (tied %0)`,
because it is defined in the second instruction `%0 = def`.
1 bb.1
2 %0 = def
3 %0 = def (tied %0)
4 ...
5 jmp bb.1
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D125079
This fixes a useless FileCheck pattern and a wrong comment in
always-inline.ll. Testing was done using ninja check-llvm and
llvm-lit always-inline.ll --show-all.
Reviewed By: modimo, hoy
Differential Revision: https://reviews.llvm.org/D127815
Microsoft does not seem to document the flag. Ignoring it for now is probably
better than getting an unknown flag error.
Reviewed By: thakis
Differential Revision: https://reviews.llvm.org/D128231
This patch implements the VSCode DAP logpoints feature (also called
tracepoints in other debuggers, e.g. the Visual Studio debugger).
This provides a convenient way for users to do printf-style logging
debugging without pausing the debuggee.
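In DAP terms, a logpoint is an ordinary source breakpoint carrying a
logMessage, with expressions in {} interpolated. A sketch of the relevant
part of a setBreakpoints request (paths and names are illustrative):

{
  "source": { "path": "/path/to/main.cpp" },
  "breakpoints": [
    { "line": 42, "logMessage": "counter = {counter}" }
  ]
}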
Differential Revision: https://reviews.llvm.org/D127702
The error used to look like this:
ld64.lld: error: undefined symbol: _foo
>>> referenced by /path/to/bar.o:(symbol _baz+0x4)
If DWARF line information is available, we now show where in the source
the references are coming from:
ld64.lld: error: undefined symbol: _foo
>>> referenced by: bar.cpp:42 (/path/to/bar.cpp:42)
>>> /path/to/bar.o:(symbol _baz+0x4)
Differential Revision: https://reviews.llvm.org/D128184
Add missing intrinsics and tests for them. Macros expanding
_vel_pack_f32p to __builtin_ve_vl_pack_f32p, and others like it, are
already defined in clang/lib/Headers/velintrin.h.
Reviewed By: efocht
Differential Revision: https://reviews.llvm.org/D128120
The instructions that generate the source of a dual-source blend export
should run in strict-wqm. That is, if any lane in a quad is active,
we need to enable all four lanes of that quad so that the shuffling
operation performed before exporting to the dual-source blend target
works correctly.
Differential Revision: https://reviews.llvm.org/D127981
LDS_PARAM_LOAD and LDS_DIRECT_LOAD use EXEC per quad
(if any pixel is enabled in the quad, data is written
to all 4 pixels/threads in the quad).
Tag LDS_PARAM_LOAD and LDS_DIRECT_LOAD as using strict_wqm
to enforce this and avoid lane clobbering issues.
Note that only the instruction itself is tagged. The implicit uses of
these instructions do not need to be in WQM. This reduces unnecessary
WQM calculation of M0.
Differential Revision: https://reviews.llvm.org/D127977