This change implements unified text stub format and command line
interface proposed in the elfabi/ifs merge plan.
Differential Revision: https://reviews.llvm.org/D99399
Added shape inference handles cases for convolution operations. This includes
conv2d, conv3d, depthwise_conv2d, and transpose_conv2d. With transpose conv
we use the specified output shape when possible however will shape propagate
if the output shape attribute has dynamic values.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D105645
Although this combine checks that there's no load folding barriers between
the loads that it's trying to merge, it was inserting the load at the
MIRBuilder's default insertion point, which is the G_OR use inst.
This was causing a miscompile in the test suite's
SingleSource/Regression/C/gcc-c-torture/execute/GCC-C-execute-bswap-2
Differential Revision: https://reviews.llvm.org/D106251
RISCV would prefer a sign extended constant since that works better
with our constant materialization. We have an existing TLI hook we
use to control sign extension of setcc operands in type legalization.
That hook happens to do the right check we need here, but might be
straying from its original purpose. With only RISCV defining this
hook in tree, I wasn't sure if it was worth adding another hook
with identical behavior.
This is an alternative to D105785 where I tried to handle this in
the RISCV backend by not creating ANY_EXTENDs in some places.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D105918
This deletes all the pooling ops in LinalgNamedStructuredOpsSpec.tc. All the
uses are replaced with the yaml pooling ops.
Reviewed By: gysit, rsuderman
Differential Revision: https://reviews.llvm.org/D106181
Some template functions were missing '&&' in function arguments,
therefore these were always taken by value after template instantiation.
This patch adds the double ampersand to introduce proper perfect
forwarding.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D106148
This patch adds the `-faltivec-src-compat=mixed` option to the
`builtins-ppc-altivec.c` test.
Currently, the default for `-faltivec-src-compat` is `mixed`. The reason we
explicitly specify `mixed` to the RUN lines of this test is because eventually,
the default will set to `xl`.
Having the default as `xl` changes the CHECKs of this test slightly, as it
reorders some of the `vector bool` and `vector pixel` CHECKs (since under the
`xl` option, `vector bool` and `vector pixel` are treated in the same way as
other vector scalars). Explicitly specifying `mixed` ensures that we are testing
pre-existing Clang behaviour.
Differential Revision: https://reviews.llvm.org/D106282
As encountered on D106053, we need to be very explicit that the Assertion nodes don't hold true for a poison value (or for specific poisoned vector elements).
Differential Revision: https://reviews.llvm.org/D106257
Add pattern to fold a TensorCast into a PadTensorOp if the cast removes static size information.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D106278
This removes the promotion of NEON AND, OR and XOR nodes to v2i32/v4i32,
treating them the same as the AArch64 and MVE backends where we just add
the relevant patterns for each legal type. This prevents a lot of
bitcasts from being added to the DAG, which have the potential to make
optimizations more difficult. It does mean adding extra patterns, and
some codegen can change due to the types now being legal, not promoted.
Differential Revision: https://reviews.llvm.org/D105588
This patch adds a new pass called LNICM which is a LoopNest version of LICM and a test case to show how LNICM works.
Basically, LNICM only hoists invariants out of loop nest (not a loop) to keep/make perfect loop nest. This enables later optimizations that require perfect loop nest.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D104180
This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noxceptions.h` and the official isl C++ interface.
Changes made:
- Stop using `isl::set::lex_le_set`. The official way to do this is to use `isl::map::lex_le_at`
- Removed `isl::set::lex_le_set` from `isl-noexceptions.h`
- `isl-noexceptions.h` has been generated by this 266fea1d3d
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D106269
For some reason we have 2 switches on arch and add
half of arch flags in one place and half in another.
Merge these 2 switches.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D106274
Use _Float16 as the half-precision floating point type. Define a new
type specifier 'x' for the _Float16 type.
Differential Revision: https://reviews.llvm.org/D105001
This patch implements the initialization of vectors under the
-faltivec-src-compat=xl option introduced in https://reviews.llvm.org/D103615.
Under this option, the initialization of scalar vectors, vector bool, and vector
pixel are treated the same, where the initialization value is splatted across
the whole vector.
This patch does not change the behaviour of the -faltivec-src-compat=mixed option,
which is the current default for Clang.
Differential Revision: https://reviews.llvm.org/D106120
Avoid a crash when using instruction referencing if x87 floating point
instructions are used. These instructions are significantly mutated when
they're rewritten from referring to registers, to referring to
floating-point-stack positions. As a result, their operands are re-ordered,
and (InstrRef) LiveDebugValues asserts when it sees a DBG_INSTR_REF
referring to a non-reg non-def register operand.
To fix this, drop the instruction numbers, and thus variable locations.
This patch adds a helper utility do do that.
Dropping the variable locations is sub-optimal, but applying DBG_VALUEs to
the $fp0 and similar registers is dropped on emission too. It seems we've
never done well at describing variables that live in x87 registers, at all.
Differential Revision: https://reviews.llvm.org/D105657
Summary:
The AIX linker will produce errors on unresolved weak symbols. Change the
generated code to not check for the initialization function but just call
it and ensure that it always exists. Also, the AIX atexit routine has a
different name (and signature) so call it correctly. Update the lit tests
to test on AIX appropriately.
Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: hubert.reinterpretcast (Hubert Tong)
Differential Revision: https://reviews.llvm.org/D104420
This patch removes a duplicate checks in the top-level comments in `clang-tools-extra/clangd/ParsedAST.h`
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D106227
This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.
If clang is built for a 32-bit platform and SourceLocation is 64 bits
wide, then a SourceLocation will be larger than a pointer, so it won't
be possible to keep them in a SmallPtrSet any more. Switch to
SmallDenseSet instead.
Patch originally by Mikhail Maltsev.
Differential Revision: https://reviews.llvm.org/D105493
If a clang-tidy child process exits with a signal then run-clang-tidy will exit
with an error but there is no hint why in the output, since the clang-tidy
doesn't log anything and may not even have had the opportunity to do so
depending on the signal used.
`subprocess.CompletedProcess.returncode` is the negative signal number in this
case.
I hit this in a CI system where the parallelism used exceeded the RAM assigned
to the container causing the OOM killer to SIGKILL clang-tidy processes.
Reviewed By: sylvestre.ledru
Differential Revision: https://reviews.llvm.org/D99081
The incoming values for PHI nodes may come from unreachable BasicBlocks,
need to handle this case.
Differential Revision: https://reviews.llvm.org/D106264
This fixes the lower and upper bound calculation of a
RuntimeCheckingPtrGroup when it has more than one loop
invariant pointers. Resolves PR50686.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D104148
We obtain the current PC is all interceptors and collectively
common interceptor code contributes to overall slowdown
(in particular cheaper str/mem* functions).
The current way to obtain the current PC involves:
4493e1: e8 3a f3 fe ff callq 438720 <_ZN11__sanitizer10StackTrace12GetCurrentPcEv>
4493e9: 48 89 c6 mov %rax,%rsi
and the called function is:
uptr StackTrace::GetCurrentPc() {
438720: 48 8b 04 24 mov (%rsp),%rax
438724: c3 retq
The new way uses address of a local label and involves just:
44a888: 48 8d 35 fa ff ff ff lea -0x6(%rip),%rsi
I am not switching all uses of StackTrace::GetCurrentPc to GET_CURRENT_PC
because it may lead some differences in produced reports and break tests.
The difference comes from the fact that currently we have PC pointing
to the CALL instruction, but the new way does not yield any code on its own
so the PC points to a random instruction in the function and symbolizing
that instruction can produce additional inlined frames (if the random
instruction happen to relate to some inlined function).
Reviewed By: vitalybuka, melver
Differential Revision: https://reviews.llvm.org/D106046
This member is now only used when storage is heap-allocated so it does not
need to be const. Dropping 'const' eliminates cast warnings on many builders.