Lowering of bitwise_not to linalg dialect using an xor operation with a constant
of all-bits-one.
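The identity behind this lowering can be checked on plain integers. A minimal sketch in C++, assuming 32-bit values; the names `kAllOnes` and `bitwiseNotViaXor` are hypothetical and not taken from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant with every bit set, playing the role of the
// all-bits-one constant mentioned above.
constexpr std::uint32_t kAllOnes = 0xFFFFFFFFu;

// Computes bitwise NOT as XOR with the all-ones constant: each bit of x is
// flipped because b ^ 1 == !b for a single bit b.
constexpr std::uint32_t bitwiseNotViaXor(std::uint32_t x) {
  return x ^ kAllOnes;
}
```

For any 32-bit `x`, `bitwiseNotViaXor(x)` is expected to agree with `~x`.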
Differential Revision: https://reviews.llvm.org/D99221
This implements a subset of the initial set of inference rules proposed in the llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree". The noalias one got moved to a separate review as there were some concerns raised which require further discussion.
Differential Revision: https://reviews.llvm.org/D99135
With cost-benefit analysis for inlining, we bypass the cost-threshold by returning inline result from call analyzer early.
However, the cost and threshold are still available from call analyzer, and when cost is actually higher than threshold, we incorrectly set the reason.
The change makes the decision from cost-benefit analysis explicit. It's mostly NFC, except that it allows the priority-based sample loader inliner used by CSSPGO to use cost-benefit heuristic.
Differential Revision: https://reviews.llvm.org/D99302
```
Warn when a function pointer is cast to an incompatible function
pointer. In a cast involving function types with a variable argument
list only the types of initial arguments that are provided are
considered. Any parameter of pointer-type matches any other
pointer-type. Any benign differences in integral types are ignored, like
int vs. long on ILP32 targets. Likewise type qualifiers are ignored. The
function type void (*) (void) is special and matches everything, which
can be used to suppress this warning. In a cast involving pointer to
member types this warning warns whenever the type cast is changing the
pointer to member type. This warning is enabled by -Wextra.
```
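A minimal sketch of the rules above, written in C++. The names `IntFn`, `GenericFn`, `twice`, `eraseType` and `restoreType` are hypothetical, and the commented-out cast is only an example of the kind of conversion the description says is diagnosed:

```cpp
#include <cassert>

using IntFn = int (*)(int);
using GenericFn = void (*)(void);

int twice(int x) { return 2 * x; }

// A cast to an unrelated function pointer type, e.g.
//   auto bad = reinterpret_cast<float (*)(float)>(&twice);
// changes both the parameter and the return type, which is the situation
// the warning is described as flagging.

// void (*)(void) matches everything, so routing the conversion through it
// is the documented way to mark the cast as intentional.
inline GenericFn eraseType(IntFn f) { return reinterpret_cast<GenericFn>(f); }

// The pointer must be cast back to its real type before it is called.
inline IntFn restoreType(GenericFn f) { return reinterpret_cast<IntFn>(f); }
```

Calling `restoreType(eraseType(&twice))(21)` goes through the original function type again, so the call itself is well defined.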
Reviewed By: rsmith
Differential Revision: https://reviews.llvm.org/D97831
This patch enables the cost-benefit-analysis-based inliner by default
if we have instrumentation profile.
- SPEC CPU 2017 shows a 0.4% improvement.
- An internal large benchmark shows a 0.9% reduction in the cycle
count along with 14.6% reduction in the number of call instructions
executed.
Differential Revision: https://reviews.llvm.org/D98213
This reverts commit aae84b8e39.
The chromium goma folks want to use a Debian sysroot without
lib/x86_64-linux-gnu to perform `clang -c` but no link action. The previous
commit has removed D.getVFS().exists check to make such usage work.
Not only can this save unneeded filesystem stats, it can make `clang
--sysroot=/path/to/debian-sysroot -c a.cc` work (get `-internal-isystem
$sysroot/usr/include/x86_64-linux-gnu`) even without `lib/x86_64-linux-gnu/`.
This should make thakis happy.
For such op chains, we can create new linalg.fill ops
with the result type of the linalg.tensor_reshape op.
Differential Revision: https://reviews.llvm.org/D99116
Init tensor operands also have indexing maps and generally follow
the same constraints we expect for non-init-tensor operands.
Differential Revision: https://reviews.llvm.org/D99115
This commit exposes an option to the pattern
FoldWithProducerReshapeOpByExpansion to allow
folding unit dim reshapes. This gives callers
more fine-grained controls.
Differential Revision: https://reviews.llvm.org/D99114
This identifies a pattern where the producer affine min/max op
is bound to a dimension/symbol that is used as a standalone
expression in the consumer affine op's map. In that case the
producer affine min/max op can be merged into its consumer.
For example, a pattern like the following:
```
%0 = affine.min affine_map<()[s0] -> (s0 + 16, s0 * 8)> ()[%sym1]
%1 = affine.min affine_map<(d0)[s0] -> (s0 + 4, d0)> (%0)[%sym2]
```
Can be turned into:
```
%1 = affine.min affine_map<
()[s0, s1] -> (s0 + 4, s1 + 16, s1 * 8)> ()[%sym2, %sym1]
```
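The rewrite relies on min being associative: min(a, min(b, c)) equals min(a, b, c). A small C++ sketch of the example above, where the function names `nestedMin` and `mergedMin` are hypothetical and `sym1`/`sym2` stand for `%sym1`/`%sym2`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Before the fold: %0 = min(s0 + 16, s0 * 8) with s0 = sym1, then
//                  %1 = min(s0 + 4, d0)       with s0 = sym2 and d0 = %0.
inline std::int64_t nestedMin(std::int64_t sym1, std::int64_t sym2) {
  std::int64_t producer = std::min(sym1 + 16, sym1 * 8);
  return std::min(sym2 + 4, producer);
}

// After the fold: a single min over (s0 + 4, s1 + 16, s1 * 8) with
// s0 = sym2 and s1 = sym1.
inline std::int64_t mergedMin(std::int64_t sym1, std::int64_t sym2) {
  return std::min({sym2 + 4, sym1 + 16, sym1 * 8});
}
```

Both functions are expected to return the same value for any inputs that do not overflow.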
Differential Revision: https://reviews.llvm.org/D99016
If there are multiple identical expressions in an affine
min/max op's map, we can just keep one.
Differential Revision: https://reviews.llvm.org/D99015
Until now Linalg fusion only allowed fusing producers whose operands
are all permutation indexing maps. It's easier to deduce the
subtensor/subview but it is an unnecessary constraint, as in tiling
we have more advanced logic to deduce the subranges even when the
operand is not of permutation indexing maps, e.g., the input operand
for convolution ops.
This patch uses the logic on tiling side to deduce subranges for
fusion. This enables fusing convolution with its consumer ops
when possible.
Along the way, we are now generating proper affine.min ops to guard
against size boundaries, if we cannot be certain they won't be
out of bounds.
Differential Revision: https://reviews.llvm.org/D99014
This is a preparation step to reuse makeTiledShapes in tensor
fusion. Along the way, did some lightweight cleanups.
Differential Revision: https://reviews.llvm.org/D99013
This patch unblocks the build of the libc++ library on AIX:
1. Add _AIX guard for _LIBCPP_HAS_THREAD_API_PTHREAD
2. Use uselocale to actually take the locale setting
into account.
3. Modify extract_mtime and extract_atime for AIX. The stat
structure on AIX uses an internal structure, st_timespec, to store
time for binary compatibility reasons, so we need to convert it
back to timespec here.
4. Do not build cxa_thread_atexit.cpp for libcxxabi on AIX.
Differential Revision: https://reviews.llvm.org/D97558
This is similar to the select logic just ahead of the new code.
Min/max choose exactly one value from the inputs, so if both of
those are a power-of-2, then the result must be a power-of-2.
This might help with D98152, but we likely still need other
pieces of the puzzle to avoid regressions.
The change in PatternMatch.h is needed to build with clang.
It's possible there is a better way to deal with the 'const'
incompatibilities.
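The min/max argument above can be illustrated on concrete values. A minimal C++ sketch, assuming unsigned 32-bit inputs; `isPowerOfTwo` and `minMaxPreservePowerOfTwo` are hypothetical helpers, not the code in the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// True when exactly one bit of v is set.
inline bool isPowerOfTwo(std::uint32_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// min and max each return one of their two inputs unchanged, so when both
// inputs are powers of two the result is a power of two as well.
inline bool minMaxPreservePowerOfTwo(std::uint32_t a, std::uint32_t b) {
  return isPowerOfTwo(std::min(a, b)) && isPowerOfTwo(std::max(a, b));
}
```

The converse direction is not claimed: if only one input is a power of two, the result may or may not be.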
Differential Revision: https://reviews.llvm.org/D99276
This avoided some conversion overhead on a model in TypeUniquer when
converting from ArrayRef -> TypeRange.
Differential Revision: https://reviews.llvm.org/D99300
Including xlocinfo.h is a bit of a layering violation; locale.h is
the C library header we should use, while xlocinfo.h is essentially
part of the MS C++ library. Including xlocinfo.h brings in yvals.h,
which brings in yvals_core.h, which defines the MS STL's version
support macros, overriding what libc++'s <version> had defined.
Instead just include locale.h, and provide the few defines we need
for locale categories manually.
Differential Revision: https://reviews.llvm.org/D99213
SCEV currently tries to prove implications of x pred y by also
trying to imply ~y pred ~x. This is expensive in terms of
compile-time (in fact, the majority of isImpliedCond compile-time
is spent here) and generally not fruitful. The issue is that this
also swaps the operands and thus breaks canonical ordering. If
originally we were trying to prove an implication like
X > C1 -> Y > C2, then we'll now try to prove X > C1 -> C3 > ~Y,
which will not work.
The only real case where we can get some use out of this transform
is if the original conditions were in the form X > C1 -> Y < C2, were
then swapped to X > C1 -> C2 > Y and are then swapped again here to
X > C1 -> ~Y > C3.
As such, handle this at a higher level, where we are doing the
swapping in the first place. There are four different ways that we
can line up a predicate and a swapped predicate, so we use some
heuristics to pick some profitable way.
Because we now try this transform at a higher level
(isImpliedCondOperands rather than isImpliedCondOperandsHelper),
we can also prove additional facts. Of the added tests, one was
proven previously while the other wasn't.
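The equivalence between x pred y and ~y pred ~x can be seen on fixed-width integers: ~v equals -1 - v for signed values and UINT32_MAX - v for unsigned ones, so inverting both operands reverses the ordering. In the same way, Y < C2 can be restated as ~Y > C3 with C3 = ~C2. A minimal C++ sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Direct signed comparison: x < y.
constexpr bool signedLess(std::int32_t x, std::int32_t y) { return x < y; }

// Same predicate with both operands inverted and swapped: ~y < ~x.
constexpr bool signedLessViaNot(std::int32_t x, std::int32_t y) {
  return ~y < ~x;
}

// Direct unsigned comparison: x < y.
constexpr bool unsignedLess(std::uint32_t x, std::uint32_t y) { return x < y; }

// Unsigned variant of the inverted and swapped comparison.
constexpr bool unsignedLessViaNot(std::uint32_t x, std::uint32_t y) {
  return static_cast<std::uint32_t>(~y) < static_cast<std::uint32_t>(~x);
}
```

The two forms agree on every input, which is why the choice between them is only a matter of which one lines up with the canonical operand order.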
Differential Revision: https://reviews.llvm.org/D90926
A recent FileCheck change resulted in better reporting of invalid variables and this test had a couple. This is the second occurrence that the first fix missed.
As of CMake commit https://gitlab.kitware.com/cmake/cmake/-/commit/d993ebd4,
which first appeared in CMake 3.19.x series, in the compile commands for
clang-cl, CMake puts `--` before the input file. When operating on such a
database, the `InterpolatingCompilationDatabase` - specifically, the
`TransferableCommand` constructor - does not recognize that pattern and so does
not strip the input or the double dash when 'transferring' the compile command.
This results in an incorrect compile command - with the double dash and old input
file left in, and the language options and new input file appended after them,
where they're all treated as inputs, including the language version option.
Test files for some tests have names similar enough to be matched to commands
from the database, e.g.:
`.../path-mappings.test.tmp/server/bar.cpp`
can be matched to:
`.../Driver/ToolChains/BareMetal.cpp`
etc. When that happens, the tool being tested tries to use the matched, and
incorrectly 'transferred' compile command, and fails, reporting errors similar
to:
`error: no such file or directory: '/std:c++14'; did you mean '/std:c++14'? [clang-diagnostic-error]`
This happens in at least 4 tests:
Clang Tools :: clang-tidy/checkers/performance-trivially-destructible.cpp
Clangd :: check-fail.test
Clangd :: check.test
Clangd :: path-mappings.test
The fix for `TransferableCommand` removes the `--` and everything after it when
determining the arguments that apply to the new file. `--` is inserted in the
'transferred' command if the new file name starts with `-` and when operating in
clang-cl mode, also `/`. Additionally, other places in the code known to do
argument adjustment without accounting for the `--` and causing the tests to
fail are fixed as well.
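The handling of `--` described above can be sketched roughly as follows. This is a simplified illustration, not the actual `TransferableCommand` code: the function `transferCommand` and its parameters are hypothetical, and it assumes the old input file only appears after the `--`:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: drops `--` and everything after it from the old
// command, then appends the new file. `--` is re-inserted when the new file
// name could be mistaken for an option: a leading `-`, or a leading `/` in
// clang-cl mode.
inline std::vector<std::string> transferCommand(std::vector<std::string> args,
                                                const std::string &newFile,
                                                bool clangCLMode) {
  auto dashDash = std::find(args.begin(), args.end(), std::string("--"));
  args.erase(dashDash, args.end());

  bool looksLikeOption =
      !newFile.empty() &&
      (newFile[0] == '-' || (clangCLMode && newFile[0] == '/'));
  if (looksLikeOption)
    args.push_back("--");
  args.push_back(newFile);
  return args;
}
```

With this shape, options such as `/std:c++14` stay ahead of any `--`, so they are no longer treated as input files.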
Differential Revision: https://reviews.llvm.org/D98824
This patch exploits the xxsplti32dx instruction available on Power10
in place of constant pool loads where xxspltidp would not be able to,
usually because the immediate cannot fit into 32 bits.
Differential Revision: https://reviews.llvm.org/D95458
LLDB can often appear deadlocked to users that use IDEs when it is indexing DWARF, or parsing symbol tables. These long running operations can make a debug session appear to be doing nothing even though a lot of work is going on inside LLDB. This patch adds a public API to allow clients to listen to debugger events that report progress and will allow UI to create an activity window or display that can show users what is going on and keep them informed of expensive operations that are going on inside LLDB.
Differential Revision: https://reviews.llvm.org/D97739
This is yet another attempt to fix tightlyNested().
Add checks in tightlyNested() for the inner loop exit block,
such that 1) if there is control-flow divergence in between the inner
loop exit block and the outer loop latch, or 2) if the inner loop exit
block contains unsafe instructions, tightlyNested() returns false.
The reasoning behind this is that after interchange, the original inner loop
exit block, which was part of the outer loop, would be put into the new
inner loop and would be executed a different number of times before and
after interchange. Thus it should be dealt with appropriately.
Reviewed By: Whitney
Differential Revision: https://reviews.llvm.org/D98263
Make TSan runtime initialization and finalization hooks work
even if these hooks are not built in the main executable. When these
hooks are defined in another library that is not directly linked against
the TSan runtime (e.g., Swift runtime) we cannot rely on the "strong-def
overriding weak-def" mechanics and have to look them up via `dlsym()`.
Let's also define hooks that are easier to use from C-only code:
```
extern "C" void __tsan_on_initialize();
extern "C" int __tsan_on_finalize(int failed);
```
For now, these will call through to the old hooks. Eventually, we want
to adopt the new hooks downstream and remove the old ones.
This is part of the effort to support Swift Tasks (async/await and
actors) in TSan.
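As an illustration, a library could provide the new hooks like this. This is only a sketch: the flag `tsanInitialized` is hypothetical, and returning `failed` unchanged is an assumption about the intended pass-through behavior rather than something stated here:

```cpp
#include <cassert>

// Hypothetical state a library might want to set up once TSan is ready.
static bool tsanInitialized = false;

// Called by the TSan runtime during initialization (per the description
// above); here it only records that the hook ran.
extern "C" void __tsan_on_initialize() { tsanInitialized = true; }

// Called during finalization. Assumption: returning the incoming value
// leaves the failure status as the runtime computed it.
extern "C" int __tsan_on_finalize(int failed) { return failed; }
```

Because both functions use C linkage and plain C types, the same definitions could live in a C-only translation unit without the `extern "C"` wrapper.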
rdar://74256720
Reviewed By: vitalybuka, delcypher
Differential Revision: https://reviews.llvm.org/D98810
Functions specified in `-emscripten-cxx-exceptions-allowed`, which is
set by Emscripten's `EXCEPTION_CATCHING_ALLOWED` setting, can be inlined
in LLVM middle ends before we reach WebAssemblyLowerEmscriptenEHSjLj
pass in the wasm backend and thus don't get transformed for exception
catching.
This fixes the issue by adding `--force-attribute=FUNC_NAME:noinline`
for each function name in `-emscripten-cxx-exceptions-allowed`, which
adds `noinline` attribute to the specified function and thus excludes
the function from inlining candidates in optimization passes.
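The argument construction amounts to one `--force-attribute` flag per allowed function. A minimal C++ sketch, where the helper `noinlineArgsFor` is hypothetical and not the code added by this change:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds one `--force-attribute=NAME:noinline` argument
// for each function name in the allowed list.
inline std::vector<std::string>
noinlineArgsFor(const std::vector<std::string> &allowedFunctions) {
  std::vector<std::string> args;
  for (const std::string &name : allowedFunctions)
    args.push_back("--force-attribute=" + name + ":noinline");
  return args;
}
```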
Fixes the remaining half of
https://github.com/emscripten-core/emscripten/issues/10721.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D99259
Create fix-it hints to fix the order of constructors.
To make this a lot simpler, I've grouped all the warnings for each out-of-order initializer into one.
This is necessary as fixing one initializer would often interfere with other initializers.
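For context, a small C++ example of the kind of reordering such a fix-it would perform. The struct `Rect` is hypothetical, and the "before" form is kept in a comment so the snippet compiles without warnings:

```cpp
#include <cassert>

struct Rect {
  int width;
  int area;

  // Before: initializers listed out of declaration order, which triggers the
  // reorder warning even though members are still initialized as declared:
  //   explicit Rect(int w) : area(w * w), width(w) {}

  // After: the initializer list follows the declaration order.
  explicit Rect(int w) : width(w), area(w * w) {}
};
```

With more than two members, moving one initializer shifts the positions of the others, which is why a single combined fix is easier to apply than one per initializer.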
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D98745
The commit passes the tests on Darwin. The failure on linux shows
that this change was not sufficient to get this setting to work on linux,
but the behavior is the same as before the patch & test, and it caused
no new failures.
So marking the tests as Darwin only till someone can debug the Linux
issue.
FileCheck string substitution block parsing code only reports an invalid
variable name in a string variable use if it starts with a forbidden
character. It does not report anything if there are unparsed characters
after the variable name, i.e. [[X-Y]] is parsed as [[X]] and no error is
returned. This commit fixes that.
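The stricter check can be sketched as validating the whole text between `[[` and `]]` rather than just its first character. This is a simplified illustration, not FileCheck's parser: `isValidVariableUse` is hypothetical and assumes names consist only of letters, digits and underscores:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical validator: accepts the text of a variable use only if all of
// it forms a valid name under the simplified naming rule assumed here.
inline bool isValidVariableUse(const std::string &text) {
  if (text.empty())
    return false;
  unsigned char first = static_cast<unsigned char>(text[0]);
  if (!(std::isalpha(first) || first == '_'))
    return false; // Forbidden first character (the case already reported).
  for (unsigned char c : text)
    if (!(std::isalnum(c) || c == '_'))
      return false; // Leftover characters after the name, e.g. "-Y" in "X-Y".
  return true;
}
```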
Reviewed By: jdenny, jhenderson
Differential Revision: https://reviews.llvm.org/D98691
Userspace page aliasing allows us to use middle pointer bits for tags
without untagging them before syscalls or accesses. This should enable
easier experimentation with HWASan on x86_64 platforms.
Currently stack, global, and secondary heap tagging are unsupported.
Only primary heap allocations get tagged.
Note that aliasing mode will not work properly in the presence of
fork(), since heap memory will be shared between the parent and child
processes. This mode is non-ideal; we expect Intel LAM to enable full
HWASan support on x86_64 in the future.
Reviewed By: vitalybuka, eugenis
Differential Revision: https://reviews.llvm.org/D98875
The I/O runtime library code was failing to retain data in a buffer
from the current output record when flushing the buffer; this is
fatally wrong when the corresponding file cannot be repositioned,
as in the case of standard output to the console. So refine the
Flush() member function to retain a specified number of bytes,
rearrange the data as necessary (using existing code for read frame
management after moving it into a new member function), and add
a big comment to the head of the file to clarify the roles of the
various data members in the management of contiguous frames in
circular buffers.
Update: added a unit test.
Differential Revision: https://reviews.llvm.org/D99198
Binding labels start as expressions but they have to evaluate to
constant character of default kind, so they can be represented as an
std::string. Leading and trailing blanks have to be removed, so the
folded expression isn't exactly right anyway.
So all BIND(C) symbols now have a string binding label, either the
default or user-supplied one. This is recorded in the .mod file.
Add WithBindName mix-in for details classes that can have a binding
label so that they are all consistent. Add GetBindName() and
SetBindName() member functions to Symbol.
Add tests that verify that leading and trailing blanks are ignored
in binding labels and that the default label is folded to lower case.
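The two normalizations being tested can be sketched in C++ as follows. The helpers `trimBlanks` and `defaultBindName` are hypothetical and are not the functions added to Symbol; deriving the default label from the symbol name is an assumption made for this illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Removes leading and trailing blanks from a user-supplied binding label.
inline std::string trimBlanks(const std::string &label) {
  std::string::size_type first = label.find_first_not_of(' ');
  if (first == std::string::npos)
    return "";
  std::string::size_type last = label.find_last_not_of(' ');
  return label.substr(first, last - first + 1);
}

// Produces a default label by folding the given name to lower case.
inline std::string defaultBindName(std::string symbolName) {
  std::transform(symbolName.begin(), symbolName.end(), symbolName.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  return symbolName;
}
```

Note that `trimBlanks` leaves the case of a user-supplied label untouched; only the default label is folded.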
Differential Revision: https://reviews.llvm.org/D99208