Commit Graph

401151 Commits

Author SHA1 Message Date
Sanjay Patel 668beb8ae8 [InstCombine] refactor folds of 'not' instructions; NFC
This removes repeated calls to m_Not, so hopefully a little
more efficient.

Also, we may need to enhance some of these blocks to allow
logical and/or (select of bools).
2021-10-05 16:36:57 -04:00
Sam Clegg 8fe128476e [lld][WebAssembly] Create optional internal symbols only after LTO object as been added
This is important for the cases where new symbols can be introduced
during LTO.   Specifically this happens for during TLS-lowering where
references to `__tls_base` can be introduced.

Fixes: https://github.com/emscripten-core/emscripten/issues/12489

Differential Revision: https://reviews.llvm.org/D111171
2021-10-05 13:31:09 -07:00
Nikita Popov 0be9940ef2 [SCEV] Don't check if propagation safe if there are no flags (NFC)
If there are no nowrap flags, then we don't need to determine
whether propagating flags is safe -- it will make no difference.
2021-10-05 22:25:41 +02:00
Philip Reames 94c1c56cc5 [tests] Cover cases we could infer SCEV flags, but don't 2021-10-05 13:16:16 -07:00
Vitaly Buka 84afd02525 [sanitizer] Fix Android bot
We don't need to check for equality, we need to check
that storage is large enough.
2021-10-05 13:08:16 -07:00
Vitaly Buka 6fab808f6f [NFC][sanitizer] Combine MSAN data in single field
Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D111118
2021-10-05 12:34:02 -07:00
Jonas Devlieghere 730fca46fc [lldb] Improve meta data stripping from JSON crashlogs
JSON crashlogs normally start with a single line of meta data that we
strip unconditionally. Some producers started omitting the meta data
which tripped up crashlog. Be more resilient by only removing the first
line when we know it really is meta data.

rdar://82641662
2021-10-05 12:15:54 -07:00
Roman Lebedev f92961d238
[NFC] Fixup newly-added costmodel tests to actually test what they should 2021-10-05 21:35:47 +03:00
Valentin Clement fc66dbba1f
[fir] Add external name interop pass
Add the external name conversion pass needed for compiler
interoperability. This pass convert the Flang internal symbol name to
the common gfortran convention.

Clean up old passes without implementation in the Passes.ts file so
the project and fir-opt can build correctly.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: schweitz

Differential Revision: https://reviews.llvm.org/D111057
2021-10-05 20:33:41 +02:00
Louis Dionne d9346f5255 [libc++abi] Mark __cxa_new_handler with _LIBCPP_SAFE_STATIC
For consistency with the other handlers, and because requiring constant
initialization whenever we can is a good thing.

Differential Revision: https://reviews.llvm.org/D110866
2021-10-05 14:29:32 -04:00
Lei Zhang 7a89444cd9 [mlir][spirv] Add ops and patterns for lowering standard max/min ops
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D111143
2021-10-05 14:27:32 -04:00
peter klausler cc1d13f997 [flang] Fold MAXLOC and MINLOC
Generalize the code that folds FINDLOC to also handle
folding for MAXLOC and MINLOC.

Differential Revision: https://reviews.llvm.org/D110951
2021-10-05 11:22:02 -07:00
Philip Reames c608b49d67 [SCEV] Tweak the algorithm for figuring out if flags must apply to a SCEV [mostly-NFC]
Behavior wise, this patch should be mostly NFC.  The only behavior difference known is that on the isSCEVExprNeverPoison path we'll consider a bound imposed by the SCEVable operands (if any).

Algorithmically, it's an invert of the existing code.  Previously, we checked for each operand if we could find a bound, then checked for must-execute given that bound.  With the patch, we use dominance to refine the innermost bound, then check must execute once.  The interesting case is when we have multiple unknowns within a single basic block.  While both dominance and must-execute are worst-case linear walks within the block, only dominance is cached.  As such, refining based on dominance should be more efficient.
2021-10-05 11:20:48 -07:00
River Riddle b8ffcb12e2 [mlir:Pass] Generate a reproducer as early as possible
This avoids keeping references to passes that may be freed by
the time that the pass manager has finished executing (in the
non-crash case).

Fixes PR#52069

Differential Revision: https://reviews.llvm.org/D111106
2021-10-05 18:11:26 +00:00
Joe Loser 8cf5319aff
[libc++][test] Use = delete over DELETE_FUNCTION. NFC.
Some tests repeat the definition of `DELETE_FUNCTION` macro locally.
However, it's not even requred to guard against in the C++03 case since
Clang supports `= delete;` in C++03 mode. A warning is issued but
`libc++` tests run with `-Wno-c++11-extensions`, so this isn't an issue.
Since we don't support other compilers in C++03 mode, `= delete;` is
always available for use. As such, inline all calls of `DELETE_FUNCTION`
to use `= delete;`.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D111148
2021-10-05 14:08:48 -04:00
Rob Suderman d5a4c86d14 [mlir][tosa] tosa.cast support for unsigned integers
Unsigned integers need to be handled for cast to floating point.

Reviewed By: NatashaKnk

Differential Revision: https://reviews.llvm.org/D111102
2021-10-05 10:57:16 -07:00
Amy Huang c7104e5066 [Sema] Allow comparisons between different ms ptr size address space types.
We're currently using address spaces to implement __ptr32/__ptr64 attributes;
this patch fixes a bug where clang doesn't allow types with different pointer
size attributes to be compared.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51889

Differential Revision: https://reviews.llvm.org/D110670
2021-10-05 10:56:29 -07:00
Simon Pilgrim 2e5daac217 [llvm] Update report_fatal_error calls from raw_string_ostream to use Twine(OS.str())
As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.

We can use the raw_string_ostream::str() method to perform the implicit flush() and return a reference to the std::string container that we can then wrap inside Twine().
2021-10-05 18:42:12 +01:00
David Zarzycki 5bc32ad08d [lldb testing] NFC: run through clang-format 2021-10-05 13:40:27 -04:00
Keith Smiley 0f3254b29f [lldb] Improve help for platform put-file
Previously it was not clear what arguments this required, or what it would do if you didn't pass the destination argument.

Differential Revision: https://reviews.llvm.org/D110981
2021-10-05 10:29:37 -07:00
Petr Hosek 24c615fa6b [InstrProfData] Bump the raw profile version to 8
This is to account for the change that made CountersPtr in __profd_
relative which landed in a1532ed275.
That change hasn't updated the raw profile version, and while the
profile layout stayed the same, profiles generated by tip-of-tree
LLVM are incompatible with 13.x tooling.

Differential Revision: https://reviews.llvm.org/D111123
2021-10-05 09:57:56 -07:00
Kirill Bobyrev 0c14e279c7
[clangd] Revert unwanted change from D108194 2021-10-05 18:44:43 +02:00
Geoffrey Martin-Noble b983783d2e [MLIR][linalg] Preserve location during elementwise fusion
This otherwise loses a lot of debugging info and results in a painful
debugging experience.

Reviewed By: mravishankar, stellaraccident

Differential Revision: https://reviews.llvm.org/D111107
2021-10-05 09:43:53 -07:00
Alexey Bataev bebe702dbe [SLP]Detect reused scalars in all possible gathers for better vectorization cost.
Some initially gathered nodes missed the check for the reused scalars,
which leads to high gather cost. Such nodes still can be represented as
m gathers + shuffle instead of n gathers, where m < n.

Differential Revision: https://reviews.llvm.org/D111153
2021-10-05 09:43:03 -07:00
Roman Lebedev 200edc152b
[NFC][X86][LV] Add basic costmodel test coverage for not-fully-interleaved i32 loads
The coverage could have cumulative explosion here,
so i'm adding only the most basic cases,
and hoping it's enough, though more can be added if needed.
2021-10-05 19:39:50 +03:00
Aart Bik 16b8f4ddae [mlir][sparse] add a "release" operation to sparse tensor dialect
We have several ways to materialize sparse tensors (new and convert) but no explicit operation to release the underlying sparse storage scheme at runtime (other than making an explicit delSparseTensor() library call). To simplify memory management, a sparse_tensor.release operation has been introduced that lowers to the runtime library call while keeping tensors, opague pointers, and memrefs transparent in the initial IR.

*Note* There is obviously some tension between the concept of immutable tensors and memory management methods. This tension is addressed by simply stating that after the "release" call, no further memref related operations are allowed on the tensor value. We expect the design to evolve over time, however, and arrive at a more satisfactory view of tensors and buffers eventually.

Bug:
http://llvm.org/pr52046

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D111099
2021-10-05 09:35:59 -07:00
Jonas Paulsson 7a4e9a0c73 [SystemZ] Implement memcmp of variable length with CLC.
Following the same pattern of memset/memcpy, this patch implements a variable
length memcmp with a CLC loop followed by an EXRL instruction.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D107380
2021-10-05 18:20:36 +02:00
Nikita Popov 64eaffb613 [APInt] Fix type limits warning (NFC)
Unsigned number is always >= 0.
2021-10-05 18:10:12 +02:00
Matt Beardsley 32ab79ebc4 [clang-tidy] Fix add_new_check.py to generate correct list.rst autofix column from relative path
Previously, the code in add_new_check.py that looks for fixit keywords in check source files when generating list.rst assumed that the script would only be called from its own path. That means it doesn't find any source files for the checks it's attempting to scan for, and it defaults to writing out nothing in the "Offers fixes" column for all checks. Other parts of add_new_check.py work from other paths, just not this part.

After this fix, add_new_check.py's "offers fixes" column generation for list.rst will be consistent regardless of what path it's called from by using the caller path that's deduced elsewhere already from sys.argv[0].

Reviewed By: kbobyrev

Differential Revision: https://reviews.llvm.org/D110600
2021-10-05 18:09:53 +02:00
Joe Nash 8f55fdf26c [MacroFusion] Expose useful static methods. NFC.
hasLessThanNumFused and fuseInstructionPair are useful for
DAG mutations similar to MacroFusion, but which cannot use
MacroFusion as a whole (such as fusing non-dependent instruction).

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D111070

Change-Id: I3a5d56aba0471d45ef64cebb9b724030e2eae2f3
2021-10-05 11:51:48 -04:00
Kirill Bobyrev ebfcd06d42 [clangd] IncludeCleaner: Mark used headers
Follow-up on D105426.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D108194
2021-10-05 18:08:24 +02:00
Nikita Popov c117d77e93 [ConstantFold] Refactor load folding
This refactors load folding to happen in two cleanly separated
steps: ConstantFoldLoadFromConstPtr() takes a pointer to load from
and decomposes it into a constant initializer base and an offset.
Then ConstantFoldLoadFromConst() loads from that initializer at
the given offset. This makes the core logic independent of having
actual GEP expressions (and those GEP expressions having certain
structure) and will allow exposing ConstantFoldLoadFromConst() as
an independent API in the future.

This is mostly only a refactoring, but it does make the folding
logic slightly more powerful.

Differential Revision: https://reviews.llvm.org/D111023
2021-10-05 18:07:57 +02:00
Simon Pilgrim d67935ed8e [Support] Update SmallVector report_fatal_error calls to use Twine and add missing implicit header dependency. 2021-10-05 17:03:19 +01:00
Simon Pilgrim 3ca232feb3 [TableGen] CodeEmitterGen - emit report_fatal_error(const char*) instead of report_fatal_error(std::string&)
As described on D111049, we're trying to remove the <string> dependency from error handling. In most cases the plan is to use the Twine() variant directly but to reduce introducing additional headers for the generated files, I'm using the const char* variant here instead.
2021-10-05 17:03:18 +01:00
Simon Pilgrim 9503ad3b53 [clang] FatalErrorHandler.cpp - add explicit <stdio.h> include
Required for fprintf/stderr usage in the error handler, noticed while trying to remove the <string> dependency described in D111049
2021-10-05 17:03:17 +01:00
Chris Lattner cc697fc292 [APInt] Make insertBits and concat work with zero width APInts.
These should both clearly work with our current model for zero width
integers, but don't until now!

Differential Revision: https://reviews.llvm.org/D111113
2021-10-05 08:41:53 -07:00
Utkarsh Saxena 6831c1d868 [clangd] Include refs of base method in refs for derived method.
Addresses https://github.com/clangd/clangd/issues/881

Includes refs of base class method in refs of derived class method.
Previously we reported base class method's refs only for decl of derived
class method. Ideally this should work for all usages of derived class method.

Related patch:
fbeff2ec2b.

Differential Revision: https://reviews.llvm.org/D111039
2021-10-05 17:39:49 +02:00
Kazu Hirata 3081de8c72 [llvm] Migrate from getNumArgOperands to arg_size (NFC)
Note that getNumArgOperands is considered a legacy name.  See
llvm/include/llvm/IR/InstrTypes.h for details.
2021-10-05 08:29:19 -07:00
Amara Emerson de5b16d8ca Revert "Revert "Revert "[GlobalISel][IRTranslator] Emit trap intrinsic for "unreachable""""
This reverts commit c93bc508ee.

Seems to break a different thing now.
2021-10-05 08:25:13 -07:00
Jonas Paulsson c6c13c58ee [SystemZ] Implement memcpy of variable length with MVC.
Instead of making a memcpy libcall, emit an MVC loop and an EXRL instruction
the same way as is already done for memset 0.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D106874
2021-10-05 17:14:41 +02:00
Peter Waller be26e6ff73 [AArch64][SVE] Remove redundant PTEST following PNEXT/PFIRST
PNEXT and PFIRST set the NZCV flags, so the subsequent PTEST can be
optimized away in AArch64InstrInfo::optimizePTestInstr.

See-also: https://reviews.llvm.org/D93292

Differential Revision: https://reviews.llvm.org/D110177
2021-10-05 15:10:48 +00:00
Matthew Devereau 2ac1999937 [AArch64][SVE] Propagate math flags from intrinsics to instructions
Retain floating-point math flags inside instCombineSVEVectorBinOp
2021-10-05 15:39:13 +01:00
David Zarzycki 79bf032fe1 [lldb testing] Avoid subtle terminfo behavioral differences
The original "arbitrary" changes were causing EINVAL on a Fedora 34 box.
2021-10-05 10:28:02 -04:00
Joe Loser 0ad9013fcd
[libc++][test] Remove unused macro in is_constructible.pass.cpp. NFC.
Test file defines `LIBCPP11_STATIC_ASSERT` but it never uses it now. It
always uses `static_assert` unconditionally. So, remove the unused
macro.
2021-10-05 10:15:24 -04:00
TN Khanh fe2b2cb58e Add .cmt and .cmti files for OCaml bindings
We can build .cmt and .cmti files for easier
code navigation for OCaml bindings
2021-10-05 19:36:12 +05:30
Roman Lebedev 3f9b235482
[X86][Costmodel] Load/store i64/f64 Stride=6 VF=8 interleaving costs
The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/1jfGddcre - for intels `Block RThroughput: =36.0`; for ryzens, `Block RThroughput: =12.0`
So could pick cost of `36`

For store we have:
https://godbolt.org/z/ao9srMT8r - for intels `Block RThroughput: =30.0`; for ryzens, `Block RThroughput: =12.0`
So we could pick cost of `30`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111094
2021-10-05 16:58:58 +03:00
Roman Lebedev e2784c5d8c
[X86][Costmodel] Load/store i64/f64 Stride=6 VF=4 interleaving costs
The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/rc8jYxW6M - for intels `Block RThroughput: =18.0`; for ryzens, `Block RThroughput: =6.0`
So could pick cost of `18`.

For store we have:
https://godbolt.org/z/9PhPEr65G - for intels `Block RThroughput: =15.0`; for ryzens, `Block RThroughput: =6.0`
So we could pick cost of `15`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111093
2021-10-05 16:58:58 +03:00
Roman Lebedev 3960693048
[X86][Costmodel] Load/store i64/f64 Stride=6 VF=2 interleaving costs
The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/onese7rec - for intels `Block RThroughput: =6.0`; for ryzens, `Block RThroughput: =3.0`
So could pick cost of `6`.

For store we have:
https://godbolt.org/z/bMd7dddnT - for intels `Block RThroughput: =8.0`; for ryzens, `Block RThroughput: <=6.0`
So we could pick cost of `8`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111092
2021-10-05 16:58:58 +03:00
Roman Lebedev 79d6d12d95
[X86][Costmodel] Load/store i32/f32 Stride=6 VF=16 interleaving costs
This one required quite a bit of an assembly surgery, but i think it's in the right ballpark..

The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/na97Kb96o - for intels `Block RThroughput: <=64.0`; for ryzens, `Block RThroughput: <=32.0`
So could pick cost of `64`.

For store we have:
https://godbolt.org/z/GG1WeoKar - for intels `Block RThroughput: =66.0`; for ryzens, `Block RThroughput: <=27.5`
So we could pick cost of `66`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111091
2021-10-05 16:58:58 +03:00
Roman Lebedev 2996a2b50f
[X86][Costmodel] Load/store i32/f32 Stride=6 VF=8 interleaving costs
The only sched models that for cpu's that support avx2
but not avx512 are: haswell, broadwell, skylake, zen1-3

For load we have:
https://godbolt.org/z/jK85GWKaK - for intels `Block RThroughput: =31.0`; for ryzens, `Block RThroughput: <=17.0`
So could pick cost of `31`.

For store we have:
https://godbolt.org/z/hPWWhEEf9 - for intels `Block RThroughput: =33.0`; for ryzens, `Block RThroughput: <=13.8`
So we could pick cost of `33`.

I'm directly using the shuffling asm the llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D111089
2021-10-05 16:58:57 +03:00