Commit Graph

405686 Commits

Author SHA1 Message Date
Tobias Gysi e3d386ea27 [mlir][linalg] Add a tile and fuse on tensors pattern.
Add a pattern to apply the new tile and fuse on tensors method. Integrate the pattern into the CodegenStrategy and use the CodegenStrategy to implement the tests.

Depends On D114012

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114067
2021-11-22 11:13:21 +00:00
Diego Caballero 4348cd42c3 [LV] Drop integer poison-generating flags from instructions that need predication
This patch fixes PR52111. The problem is that LV propagates poison-generating flags (`nuw`/`nsw`, `exact`
and `inbounds`) in instructions that contribute to the address computation of widen loads/stores that are
guarded by a condition. It may happen that when the code is vectorized and the control flow within the loop
is linearized, these flags may lead to generating a poison value that is effectively used as the base address
of the widen load/store. The fix drops all the integer poison-generating flags from instructions that
contribute to the address computation of a widen load/store whose original instruction was in a basic block
that needed predication and is not predicated after vectorization.

Reviewed By: fhahn, spatel, nlopes

Differential Revision: https://reviews.llvm.org/D111846
2021-11-22 10:57:29 +00:00
Nicolas Vasilache 789c88e80e [mlir] Fix unintentional mutation by VectorType/RankedTensorType::Builder dropDim
Differential Revision: https://reviews.llvm.org/D113933
2021-11-22 10:51:50 +00:00
Tobias Gysi 0ccc44cec0 [mlir][linalg] Fix tile and fuse for outermost reduction.
Tile and fuse failed if the outermost tile loop is a reduction dimension. Add the necessary check to handle outermost reductions and introduce a test case to verify the change.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D114012
2021-11-22 10:44:15 +00:00
Nicolas Vasilache a9e236bed8 [mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)
This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between and intrinsics and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Reviewed By: ftynse, dcaballe

Differential Revision: https://reviews.llvm.org/D114335
2021-11-22 10:32:34 +00:00
Sjoerd Meijer 4d21b64464 [BPI] Look-up tables for non-loop branches. NFC.
This adds and uses look-up tables for non-loop branch probabilities, which have
have probabilities directly encoded into the tables for the different condition
codes. Compared to having this logic inlined in different functions, as it used
to be the case, I think this is compacter and thus also easier to check/cross
reference. This also adds a test for pointer heuristics that was missing.

Differential Revision: https://reviews.llvm.org/D114009
2021-11-22 10:30:42 +00:00
Arjun P d92aabc336 [MLIR][NFC] Simplex: remove repeated words in comment 2021-11-22 15:50:03 +05:30
Diego Caballero a7027bb799 [LV] Pre-commit test for D111846
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D112054
2021-11-22 10:13:56 +00:00
Guillaume Chatelet 2f1c037bbd [libc] Remove unused variable 2021-11-22 10:12:46 +00:00
Manuel Klimek 84bf5e3286 Fix various problems found by fuzzing.
1. IndexTokenSource::getNextToken cannot return nullptr; some code was
still written assuming it can; make getNextToken more resilient against
incorrect input and fix its call-sites.

2. Change various asserts that can happen due to user provided input to
conditionals in the code.
2021-11-22 11:08:38 +01:00
Salman Javed a82942dd07 Add missing clang-tidy args in index.rst (NFC)
The RST docs have gone out of sync with the command-line args that the
clang-tidy program actually supports.
2021-11-22 22:50:05 +13:00
Kirill Bobyrev b5f20372a8
[clangd] IncludeCleaner: Mark possible expr resolutions as used
Fixes: https://github.com/clangd/clangd/issues/934

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D114287
2021-11-22 10:44:24 +01:00
David Green 760d4d03d5 [AArch64] Sink splat shuffles to lane index intrinsics
This teaches AArch64TargetLowering::shouldSinkOperands to sink splat
shuffles to certain neon intrinsics, so that they can make use of the
lane variants of the instructions that are available.

Differential Revision: https://reviews.llvm.org/D112994
2021-11-22 08:11:35 +00:00
Salman Javed 83484f8472 Fix nits in clang-tidy's documentation (NFC)
Add commas, articles, and conjunctions where missing.
2021-11-22 21:10:24 +13:00
Chuanqi Xu 2ac339ef5f [C++20] [Coroutines] Warn for deprecated form 'for co_await'
The form 'for co_await' is part of CoroutineTS instead of C++20.
So if we detected the use of 'for co_await' in C++20, we should emit
a warning at least.
2021-11-22 15:57:57 +08:00
Dmitry Vyukov 6a3958247a tsan: add another fork test
Add a fork test that models what happens on Mac
where fork calls malloc/free inside of our atfork
callbacks.

Reviewed By: vitalybuka, yln

Differential Revision: https://reviews.llvm.org/D114250
2021-11-22 08:36:51 +01:00
Igor Kudrin a05b694b1e [ELF][NFC] Do not pass region name to expandMemoryRegion()
The name can be easily got on-site.

Differential Revision: https://reviews.llvm.org/D114228
2021-11-22 14:19:07 +07:00
wangpc af0ecfccae [RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly mush more readable.
For existed tests, users can update them by running script
`llvm/utils/update_llc_test_checks.py`.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112692
2021-11-22 14:01:37 +08:00
Kazu Hirata 49e3838145 [llvm] Use make_early_inc_range (NFC) 2021-11-21 19:24:17 -08:00
Kazu Hirata ea5421bd0d [llvm] Use range-based for loops (NFC) 2021-11-21 19:24:15 -08:00
Roland McGrath b72b56016a NFC: clang-format lib/Transforms/Instrumentation/InstrProfiling.cpp
Differential Revision: https://reviews.llvm.org/D114343
2021-11-21 18:16:02 -08:00
Joe Loser a60b63940a
[libc++][NFC] Sort includes in __ranges/concepts.h
Differential Revision: https://reviews.llvm.org/D114328
2021-11-21 19:34:02 -05:00
LLVM GN Syncbot 0a413506a2 [gn build] Port 1dc62f2653 2021-11-22 00:29:19 +00:00
Nikolas Klauser 1dc62f2653 [libc++] Implement P1272R4 (std::byteswap)
Implement P1274R4

Reviewed By: Quuxplusone, Mordante, #libc

Spies: jloser, lebedev.ri, mgorny, libcxx-commits, arichardson

Differential Revision: https://reviews.llvm.org/D114074
2021-11-22 01:28:18 +01:00
Jacques Pienaar e5a4d0f149 [mlir] Fix unused function warning (NFC)
Delete function no longer needed as all derived classes override
printer.
2021-11-21 15:06:08 -08:00
Jacques Pienaar 6f9cceb775 [mlir] Move trait to InferTypeOpInterface
Step towards removing the hard coded behavior for this trait and to instead use common interface.

Differential Revision: https://reviews.llvm.org/D114208
2021-11-21 14:41:12 -08:00
Kazu Hirata c133fb321f [CodeGen] Use llvm::is_contained (NFC) 2021-11-21 10:36:20 -08:00
Kazu Hirata fc981cedea [llvm] Use range-based for loops (NFC) 2021-11-21 10:36:18 -08:00
Simon Pilgrim 4a5e1ffcf9 [ARM] Regenerate sxt_rot.ll tests 2021-11-21 18:33:29 +00:00
Simon Pilgrim eced44637c [Thumb2] Regenerate ext + rot tests 2021-11-21 18:33:28 +00:00
Simon Pilgrim 357d636289 [PowerPC] Regenerate rlwinm2.ll test 2021-11-21 18:33:28 +00:00
Philip Reames 73d52ee785 Add a best practice section on how to configure a fast builder
This is based on conversations with a couple of folks currently running buildbots. There's a couple pieces which didn't make it in, but this tries to cover the common themes.

Differential Revision: https://reviews.llvm.org/D114325
2021-11-21 08:01:29 -08:00
Arjun P ad48ef1e31 [MLIR][NFC] Simplex::restoreRow: improve documentation 2021-11-21 19:23:55 +05:30
Simon Pilgrim 3234f2d9c1 [ARM][ParallelDSP] Regenerate complex_dot_prod.ll test 2021-11-21 12:01:44 +00:00
David Green 2b9c41189e [AArch64] Extra testing for sinking splats to various instructions. NFC 2021-11-21 11:46:34 +00:00
Fangrui Song 648157b05a [ELF] Move getOutputSectionName from Writer.cpp to LinkerScript.cpp. NFC
and internalize it.
2021-11-20 22:18:09 -08:00
Kazu Hirata f6bce30cf9 [llvm] Use range-based for loops (NFC) 2021-11-20 18:42:10 -08:00
Phoebe Wang 6cc820a3e2 [X86][FP16] Relax the pattern condition for VZEXT_MOVL to match more cases
Fixes pr52560

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D114313
2021-11-21 09:14:11 +08:00
Joe Loser dca681fee9
[libc++][NFC] Fix typo in ranges::iterator_t synopsis
The `iterator_t` alias template is on `T` not a `R` like the other
neighboring alias templates. Fix the typo.
2021-11-20 19:15:00 -05:00
Arthur O'Dwyer e74114add3 [libc++] [doc] Mark some spaceship-related LWG issues as "Complete."
LWG3330 has been "Completed" since D99309, which was in the 13.x timeframe.
Reviewed as part of D110738.
2021-11-20 18:16:22 -05:00
Roman Lebedev df70cf5e14
[NFC][X86][Costmodel] Actually test +prefer-256-bit in replication-shuffle-related tests :(
While -prefer-256-bit indeed becomes complete with D114314,
the real-world (the one with +prefer-256-bit) coverage is lacking.

Hilarious.
2021-11-21 01:25:49 +03:00
Nikita Popov aeba28bc62 [DSE] Drop hasAnalyzableMemoryWrite() (NFCI)
The functionality of hasAnalyzableMemoryWrite() is effectively
subsumed by getLocForWriteEx(), which will return None if the
instruction is not analyzable. The implementations don't match
exactly (e.g. getLocForWriteEx() does not limit non-calls to
stores), but in conjunction with the isRemovable() check, it ends
up being the same.
2021-11-20 23:20:12 +01:00
Felix Berger fefe20b993 [clang-tidy] performance-unnecessary-copy-initialization: Correctly match the type name of the thisPointertype.
The matching did not work correctly for pointer and reference types.

Differential Revision: https://reviews.llvm.org/D114212

Reviewed-by: courbet
2021-11-20 15:13:41 -05:00
Nikita Popov 0a2bde94a0 [LVI] Drop requirement that modulus is constant
If we're looking only at the lower bound, the actual modulus
doesn't matter. This is a leftover from when I wanted to consider
the upper bound as well, where the modulus does matter.
2021-11-20 21:06:08 +01:00
Nikita Popov cd84cab6b3 [LVI] Support urem in implied conditions
If (X urem M) >= C we know that X >= C. Make use of this fact
when computing the implied condition range.

In some cases we could also establish an upper bound, but that's
both tricker and not interesting in practice.

Alive: https://alive2.llvm.org/ce/z/R5ZGSW
2021-11-20 21:01:26 +01:00
Nikita Popov 25a9ee52f1 [CVP] Add tests for implied conditions using urem (NFC) 2021-11-20 20:49:29 +01:00
Florian Hahn cf8efbd30e
[VPlan] Wrap vector loop blocks in region.
A first step towards modeling preheader and exit blocks in VPlan as well.
Keeping the vector loop in a region allows for changing the VF as we
traverse region boundaries.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D113182
2021-11-20 17:59:48 +00:00
Sanjay Patel 337948ac6e [InstCombine] add folds for binop with sexted bool and constant operands
This is a generalization/extension of the existing and/or
folds noted with TODO comments. Those have a one-use
constraint that is not necessary.

Potential follow-ups are noted by the TODO comments in
the new function. We can also call this function from
other binop visit* functions, but we need to add tests
first.

This solves:
https://llvm.org/PR52543

https://alive2.llvm.org/ce/z/NWuCR5
2021-11-20 12:33:00 -05:00
Sanjay Patel 1d007d0e5a [InstCombine] add tests for bitwise logic with bool op; NFC 2021-11-20 12:32:55 -05:00
Arthur O'Dwyer 401b76fdf2 [libc++] [test] Eliminate libcpp-no-noexcept-function-type and libcpp-no-structured-bindings.
At this point, every supported compiler that claims a -std=c++17 mode
should also support these features.

Differential Revision: https://reviews.llvm.org/D113436
2021-11-20 11:44:57 -05:00