Commit Graph

403785 Commits

Author SHA1 Message Date
Tamir Duberstein 33d9b7b4b2 [sanitizer] Mark before deref in PosixSpawnImpl
Read each pointer in the argv and envp arrays before dereferencing
it; this correctly marks an error when these pointers point into
memory that has been freed.

Differential Revision: https://reviews.llvm.org/D113046
2021-11-03 10:18:06 -07:00
Keith Smiley f79e65e61f [lld-macho] Cache library paths from findLibrary
On top of https://reviews.llvm.org/D113063 this took another 10 seconds
off our overall link time.

Reviewed By: #lld-macho, int3

Differential Revision: https://reviews.llvm.org/D113073
2021-11-03 10:02:23 -07:00
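The caching idea in the [lld-macho] commit above can be sketched roughly as follows. This is a minimal illustration, not lld's code: `find_library`, `_lookup_cache`, `probe_count`, and the `lib<name>.a` naming are all assumptions made up for the example.

```python
import os
import tempfile

# Hypothetical stand-in for a linker's library search; not lld's implementation.
_lookup_cache = {}
probe_count = 0  # number of file system probes actually performed


def find_library(name, search_dirs):
    """Return the first existing path for lib<name>.a, caching hits and misses."""
    global probe_count
    key = (name, tuple(search_dirs))
    if key in _lookup_cache:
        return _lookup_cache[key]
    result = None
    for directory in search_dirs:
        candidate = os.path.join(directory, f"lib{name}.a")
        probe_count += 1
        if os.path.exists(candidate):
            result = candidate
            break
    _lookup_cache[key] = result  # misses (None) are cached as well
    return result


with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "libfoo.a"), "w").close()
    first = find_library("foo", [root])   # probes the file system
    second = find_library("foo", [root])  # served from the cache
    print(first == second, probe_count)
```

The second call returns without touching the file system, which is where a saving of the kind described would come from when the same library is requested many times.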
Louis Dionne 9904bcf2a4 [libc++] Fix GDB pretty printer tests for older Clangs and GCC
This was missed by https://llvm.org/D111477, which broke the CI.

Differential Revision: https://reviews.llvm.org/D113112
2021-11-03 13:02:04 -04:00
Shivam Gupta 2a7c3f8b02 [Docs] Document scripts that are used to generate assertions in test cases
This patch documents the llvm/utils/update_* Python scripts that are used to generate
assertions in many of the LLVM regression test cases.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D112936
2021-11-03 22:24:10 +05:30
Harald van Dijk 889c2b97bd [X86] Fix X32 indirect call generation
The check for whether a zero extension was needed was subtly wrong and
saw a value that was already 64 bits, so did not extend.

Fixes PR52357.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D112860
2021-11-03 16:43:44 +00:00
Sanjay Patel c85df3c7d5 [InstCombine] refactor fold for icmp with trunc op; NFC
There are at least 3 related folds we can add here - see D112634.
2021-11-03 12:43:15 -04:00
Sanjay Patel d18b7ea621 [InstCombine] add tests for icmp with trunc op; NFC 2021-11-03 12:43:15 -04:00
Roman Lebedev 34b903d8b0 [NFC] Add forgotten `REQUIRES: asserts` into the new costmodel test 2021-11-03 19:40:23 +03:00
Roman Lebedev 9c2469c1dd [PassManager] `buildModuleOptimizationPipeline()`: schedule `LoopDeletion` pass run before vectorization passes
Test thanks to Michael Kuklinski from `#llvm`: https://godbolt.org/z/bdrah5Goo
originally inspired by Daniel Lemire's https://lemire.me/blog/2021/10/26/in-c-is-empty-faster-than-comparing-the-size-with-zero/

We manage to deduce that the answer does not require looping,
but we do that after the last `LoopDeletion` pass run,
so we end up being stuck with a dead loop.

Now, as with all things SCEV, this has
a very expected ~`+0.12%` compile time performance regression:
https://llvm-compile-time-tracker.com/compare.php?from=0ae7bf124a9bca76dd9a91b2f7379168ff13f562&to=c2ae57c9b961aeb4a28c747266949340613a6d84&stat=instructions
(for comparison, doing that in function simplification pipeline
would have been ~`+0.5%` compile time performance regression, D112840)

Looking at the transformation stats over vanilla test-suite, i think it's rather expected:
```
| statistic name                                   |  baseline |  proposed |     Δ |      % |  abs % |
|--------------------------------------------------|----------:|----------:|------:|-------:|-------:|
| scalar-evolution.NumBruteForceTripCountsComputed |       789 |       888 |    99 | 12.55% | 12.55% |
| scalar-evolution.NumTripCountsNotComputed        |    105592 |    117900 | 12308 | 11.66% | 11.66% |
| loop-delete.NumBackedgesBroken                   |       542 |       559 |    17 |  3.14% |  3.14% |
| regalloc.numExtends                              |        81 |        79 |    -2 | -2.47% |  2.47% |
| indvars.NumFoldedUser                            |       408 |       400 |    -8 | -1.96% |  1.96% |
| indvars.NumElimCmp                               |      3831 |      3758 |   -73 | -1.91% |  1.91% |
| scalar-evolution.NumTripCountsComputed           |    299759 |    304278 |  4519 |  1.51% |  1.51% |
| loop-delete.NumDeleted                           |      8055 |      8128 |    73 |  0.91% |  0.91% |
| machine-cse.NumCommutes                          |       111 |       110 |    -1 | -0.90% |  0.90% |
| globaldce.NumFunctions                           |      1187 |      1192 |     5 |  0.42% |  0.42% |
| codegenprepare.NumSelectsExpanded                |       277 |       278 |     1 |  0.36% |  0.36% |
| loop-unroll.NumRuntimeUnrolled                   |     13841 |     13791 |   -50 | -0.36% |  0.36% |
| machinelicm.NumPostRAHoisted                     |      1168 |      1172 |     4 |  0.34% |  0.34% |
| phi-node-elimination.NumCriticalEdgesSplit       |     83054 |     82879 |  -175 | -0.21% |  0.21% |
| machine-cse.NumPREs                              |      3085 |      3079 |    -6 | -0.19% |  0.19% |
| branch-folder.NumBranchOpts                      |    108122 |    107942 |  -180 | -0.17% |  0.17% |
| loop-unroll.NumUnrolled                          |     40136 |     40067 |   -69 | -0.17% |  0.17% |
| branch-folder.NumDeadBlocks                      |    130818 |    130607 |  -211 | -0.16% |  0.16% |
| codegenprepare.NumBlocksElim                     |     92856 |     92714 |  -142 | -0.15% |  0.15% |
| instsimplify.NumSimplified                       |    103263 |    103129 |  -134 | -0.13% |  0.13% |
| instcombine.NumConstProp                         |     26070 |     26102 |    32 |  0.12% |  0.12% |
| instsimplify.NumExpand                           |      1716 |      1718 |     2 |  0.12% |  0.12% |
| loop-unroll.NumCompletelyUnrolled                |      9236 |      9225 |   -11 | -0.12% |  0.12% |
| branch-folder.NumHoist                           |      2773 |      2770 |    -3 | -0.11% |  0.11% |
| regalloc.NumReloadsRemoved                       |     10822 |     10834 |    12 |  0.11% |  0.11% |
| regalloc.NumSnippets                             |     11394 |     11406 |    12 |  0.11% |  0.11% |
| machine-cse.NumCrossBBCSEs                       |      1052 |      1053 |     1 |  0.10% |  0.10% |
| machinelicm.NumCSEed                             |     99887 |     99784 |  -103 | -0.10% |  0.10% |
| branch-folder.NumTailMerge                       |     72501 |     72435 |   -66 | -0.09% |  0.09% |
| codegenprepare.NumExtUses                        |     22007 |     21987 |   -20 | -0.09% |  0.09% |
| local.NumRemoved                                 |     68232 |     68294 |    62 |  0.09% |  0.09% |
| loop-vectorize.LoopsAnalyzed                     |     75483 |     75413 |   -70 | -0.09% |  0.09% |
```
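
For reference, the delta and percentage columns in the table above follow from the baseline and proposed counts. A minimal sketch of that arithmetic, using three rows copied from the table (the helper name `relative_change` is an assumption for illustration):

```python
def relative_change(baseline, proposed):
    """Return (delta, percent change relative to the baseline)."""
    delta = proposed - baseline
    return delta, 100.0 * delta / baseline


# (baseline, proposed) pairs taken from the table above.
rows = {
    "scalar-evolution.NumBruteForceTripCountsComputed": (789, 888),
    "loop-delete.NumDeleted": (8055, 8128),
    "regalloc.numExtends": (81, 79),
}

for name, (baseline, proposed) in rows.items():
    delta, pct = relative_change(baseline, proposed)
    # The last table column is the absolute value of the percentage,
    # which is what the rows are sorted by.
    print(f"{name}: {delta:+d} ({pct:+.2f}%, absolute {abs(pct):.2f}%)")
```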

Note that i'm only changing current PM, and not touching obsolete PM.

This is an alternative to the function simplification pipeline variant
of the same change, D112840. It has both less compile time impact
(since the additional number of SCEV trip count calculations
is way less than with D112840), and it is
much more powerful/impactful (almost 2x more loops deleted).

I have checked, and doing this after loop rotation
is favorable (more loops deleted).

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D112851
2021-11-03 19:24:49 +03:00
Kazu Hirata 4bef0304e1 [AArch64, AMDGPU] Use make_early_inc_range (NFC) 2021-11-03 09:22:51 -07:00
Roman Lebedev c65e2ac405 [NFC] Rewrite runlines in interleaved-store-accesses-with-gaps.ll once again
https://lab.llvm.org/buildbot/#/builders/98/builds/8198 is still failing,
and i really don't understand how runlines in this test differ
from the ones in other nearby tests...
2021-11-03 19:15:33 +03:00
Hans Wennborg a2a58d91e8 Revert "X86InstrInfo: Support immediates that are +1/-1 different in optimizeCompareInstr"
This caused miscompiles of switches, see comments on the code review.

> This extends `optimizeCompareInstr` to re-use previous comparison
> results if the previous comparison was with an immediate that was 1
> bigger or smaller. Example:
>
>     CMP x, 13
>     ...
>     CMP x, 12   ; can be removed if we change the SETg
>     SETg ...    ; x > 12  changed to `SETge` (x >= 13) removing CMP
>
> Motivation: This often happens because SelectionDAG canonicalization
> tends to add/subtract 1 often when optimizing for fallthrough blocks.
> Example for `x > C` the fallthrough optimization switches true/false
> blocks with `!(x > C)` --> `x <= C` and canonicalization turns this into
> `x < C + 1`.
>
> Differential Revision: https://reviews.llvm.org/D110867

This reverts commit e2c7ee0743.
2021-11-03 17:01:36 +01:00
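The integer identity that the reverted change above relied on (`x > 12` is the same as `x >= 13`, and `!(x > C)` is `x < C + 1`) can be checked with a small sketch. It models only the arithmetic, not the X86 flags or the miscompile mentioned in the revert; the 8-bit wraparound at the end is a general caveat of fixed-width arithmetic, and `wrap_i8` is a hypothetical helper.

```python
# x > 12 and x >= 13 agree for every integer x.
gt_ge_agree = all((x > 12) == (x >= 13) for x in range(-100, 101))

# Fallthrough form: !(x > C) == (x <= C) == (x < C + 1).
fallthrough_agree = all(
    (not (x > c)) == (x <= c) == (x < c + 1)
    for x in range(-20, 21)
    for c in range(-20, 21)
)


def wrap_i8(value):
    """Wrap an integer into the signed 8-bit range, as a fixed-width register would."""
    return (value + 128) % 256 - 128


# Caveat: with a fixed width, C + 1 can wrap. For C = 127, "x > 127" is never
# true for an 8-bit x, while "x >= wrap_i8(128)" (that is, x >= -128) always is.
wrapped = wrap_i8(127 + 1)
breaks_at_max = all((x > 127) != (x >= wrapped) for x in range(-128, 128))

print(gt_ge_agree, fallthrough_agree, wrapped, breaks_at_max)
```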
Roman Lebedev df93c8a919 [X86] `X86TTIImpl::getInterleavedMemoryOpCostAVX512()`: fallback to scalarization cost computation for mask
I don't really buy that masked interleaved memory loads/stores are supported on X86.
There is zero costmodel test coverage, no actual cost modelling for the generation
of the mask repetition, and basically only two LV tests.
Additionally, i'm not very interested in AVX512.

I don't know if this really helps "soft" block over at
https://reviews.llvm.org/D111460#inline-1075467,
but i think it can't make things worse at least.

When we are being told that there is a masking, instead of
completely giving up and falling back to
fully scalarizing `BasicTTIImplBase::getInterleavedMemoryOpCost()`,
let's correctly query the cost of masked memory ops,
keep all the pretty shuffle cost modelling,
but scalarize the cost computation for the mask replication.

I think, not scalarizing the shuffles themselves
may adjust the computed costs a bit,
and maybe hopefully just enough to hide the "regressions"
at https://reviews.llvm.org/D111460#inline-1075467
I do mean hide, because the test coverage is non-existent.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D112873
2021-11-03 18:14:35 +03:00
Roman Lebedev f3d1ddfe71 [NFC] Use single-dash-prefixed options in newly-added test
https://lab.llvm.org/buildbot/#/builders/98/builds/8195 complains,
and this is the only guess i have.
2021-11-03 18:12:40 +03:00
Clement Courbet 45b84a547e [Sema][NFC] Improve test coverage for builtin binary operators.
In preparation for D112453.
2021-11-03 15:51:35 +01:00
Erich Keane b2cbdf6c13 Update ast-dump-decl.mm test to work on 32 bit windows
Windows member functions have __attribute__((thiscall)) on their type,
so this test fails on any 32-bit Windows machine. Add a wildcard, plus
an additional run line to explain why.
2021-11-03 07:42:06 -07:00
Roman Lebedev a4b64f7727 [BasicTTI] getInterleavedMemoryOpCost(): discount unused members of mask if mask for gap will be used
As it can be seen in `InnerLoopVectorizer::vectorizeInterleaveGroup()`,
in some cases (reported by `UseMaskForGaps`), the gaps in the interleaved load/store group
will be masked away by another constant mask, so there is no need to
account for the cost of replication of the mask for these.

Differential Revision: https://reviews.llvm.org/D112877
2021-11-03 17:33:28 +03:00
Roman Lebedev c6b3da1d66 [NFC][X86] Duplicate LV test into a costmodel test
Copied from llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
As discussed in D111460 / D112877 / D112873 we have basically no test coverage
for this part of cost model.
2021-11-03 17:25:18 +03:00
Erich Keane 09233412ed Revert part of D112349 to allow ifunc resolvers be declarations.
The patch in D112349 added a previously nonexistent restriction on ifunc
resolvers that they MUST be definitions.  However, the function
multiversioning depends on being able to resolve these resolvers at
link-time, so this additional restriction was breaking.
2021-11-03 07:15:16 -07:00
David Sherwood c0f2774973 [NFC][LoopVectorize] Simple tidy-up in InnerLoopVectorizer::createVectorIntOrFpInductionPHI
Use getSignedIntOrFpConstant instead of creating int or FP constants
manually.
2021-11-03 14:05:21 +00:00
David Spickett fac3f20de5 Reland "[lldb] Remove non address bits when looking up memory regions"
This reverts commit 5fbcf67734.

ProcessDebugger is used in ProcessWindows and NativeProcessWindows.
I thought I was simplifying things by renaming to DoGetMemoryRegionInfo
in ProcessDebugger but the Native process side expects "GetMemoryRegionInfo".

Follow the pattern that WriteMemory uses. So:
* ProcessWindows::DoGetMemoryRegionInfo calls ProcessDebugger::GetMemoryRegionInfo
* NativeProcessWindows::GetMemoryRegionInfo does the same
2021-11-03 13:56:51 +00:00
Peter Waller 7a34145f40 Reland "[AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store"
This reverts commit 753eba6421.

Contiguous gather => masked load:

  (sve.ld1.gather.index Mask BasePtr (sve.index IndexBase 1))
  => (masked.load (gep BasePtr IndexBase) Align Mask undef)

Contiguous scatter => masked store:

  (sve.st1.scatter.index Value Mask BasePtr (sve.index IndexBase 1))
  => (masked.store Value (gep BasePtr IndexBase) Align Mask)

Tests with <vscale x 2 x double>:

[Gather, Scatter] for each [Positive test (index=1), Negative test
(index=2), Alignment propagation].

Differential Revision: https://reviews.llvm.org/D112076
2021-11-03 13:42:14 +00:00
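Why a gather with a step-1 index vector can become a masked load, as in the commit above, can be seen in a toy model. This is a sketch over Python lists, not the SVE intrinsics: `sve_index`, `gather`, and `masked_load` are hypothetical helpers, addresses are element indices, and inactive lanes are modelled as `None`.

```python
def sve_index(base, step, lanes):
    """Model of an index vector: base, base + step, base + 2 * step, ..."""
    return [base + step * lane for lane in range(lanes)]


def gather(memory, base_ptr, indices, mask):
    """Load memory[base_ptr + index] for each active lane."""
    return [memory[base_ptr + idx] if active else None
            for idx, active in zip(indices, mask)]


def masked_load(memory, ptr, mask):
    """Load consecutive elements starting at ptr for each active lane."""
    return [memory[ptr + lane] if active else None
            for lane, active in enumerate(mask)]


memory = [float(v) for v in range(100, 132)]
mask = [True, False, True, True]
base_ptr, index_base = 4, 3

# Step 1: lanes touch consecutive elements, so one contiguous masked load
# from (base_ptr + index_base) reads exactly the same locations.
contiguous = gather(memory, base_ptr, sve_index(index_base, 1, 4), mask)
folded = masked_load(memory, base_ptr + index_base, mask)

# Step 2 (the negative test case): lanes are strided, so the fold does not apply.
strided = gather(memory, base_ptr, sve_index(index_base, 2, 4), mask)

print(contiguous == folded, strided == folded)
```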
Peter Waller 753eba6421 Revert "[AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store"
This reverts commit 1febf42f03, which has
a use-of-uninitialized-memory bug.

See: https://reviews.llvm.org/D112076
2021-11-03 13:39:38 +00:00
David Spickett 5fbcf67734 Revert "[lldb] Remove non address bits when looking up memory regions"
This reverts commit 6f5ce43b43 due to
build failure on Windows.
2021-11-03 13:27:41 +00:00
Florian Hahn 64bc31ee93 [LV] Drop unneeded use of getVPSingleValue (NFC).
VPReductionPHIRecipe inherits from VPValue, so there's no need to call
getVPSingleValue.
2021-11-03 14:26:15 +01:00
Konstantin Boyarinov d7ac595fc5 [libcxx][test][NFC] More tests for containers comparisons
Add more missing tests for comparisons to improve code coverage (follow-up for D111738)

Reviewed By: ldionne, rarutyun, #libc

Differential Revision: https://reviews.llvm.org/D112424
2021-11-03 16:15:10 +03:00
Sanjay Patel ff30394de8 [PhaseOrdering] add tests for x86 abs/max using SSE intrinsics (PR34047); NFC
D113035
2021-11-03 09:13:23 -04:00
Florian Hahn 8e44bdd12a [VPlan] Make VPWidenCanonicalIVRecipe a VPValue (NFC).
The recipe produces exactly one VPValue and can inherit directly from
it. This is in line with other recipes and avoids having to use
getVPSingleValue.
2021-11-03 14:11:01 +01:00
Andrew Savonichev 123ad720f1 [NVPTX] Mark special registers as reserved
A reserved register:
 - is not allocatable
 - is considered always live
 - is ignored by liveness tracking

NVPTX special registers match the criteria, and marking them as
reserved helps to avoid machine verifier error:

    *** Bad machine code: Using an undefined physical register ***
    - function:    foo
    - basic block: %bb.0  (0x557bb178b708)
    - instruction: %0:int32regs = MOV_SPECIAL $envreg0
    - operand 1:   $envreg0

Differential Revision: https://reviews.llvm.org/D113008
2021-11-03 15:48:04 +03:00
Clement Courbet 1427742750 [Sema][NFC] Improve test coverage for builtin operators.
In preparation for D112453.
2021-11-03 13:32:48 +01:00
Pavel Labath 30f922741a [lldb] Remove ConstString from plugin names in PluginManager innards
This completes de-constification of plugin names.
2021-11-03 13:14:21 +01:00
Cullen Rhodes d968b173d3 [TableGen] Emit a warning for unused template args
Add a warning to TableGen for unused template arguments in classes and
multiclasses, for example:

  multiclass Foo<int x> {
    def bar;
  }

  $ llvm-tblgen foo.td

  foo.td:1:20: warning: unused template argument: Foo::x
  multiclass Foo<int x> {
                     ^
A flag '--no-warn-on-unused-template-args' is added to disable the
warning. The warning is disabled for LLVM and sub-projects if
'LLVM_ENABLE_WARNINGS=OFF'.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D109359
2021-11-03 11:55:07 +00:00
Cullen Rhodes 6c5a897c44 [mlir][nvvm] NFC: Fix unused template arg tablegen warning
Identified in D109359.
2021-11-03 11:55:06 +00:00
Butygin 1cb13fddb9 [mlir] spirv: Add some atomic ops
Differential Revision: https://reviews.llvm.org/D112812
2021-11-03 14:47:12 +03:00
Andrew Savonichev 0e70785538 [NVPTX] Add MoveParam instruction for TargetExternalSymbol operand
TargetExternalSymbol is considered to be an immediate and not a
register, so machine verifier emits an error:

    *** Bad machine code: Expected a register operand. ***
    - function:    static_offset
    - basic block: %bb.0 bb (0x560e9b306028)
    - instruction: %3:int64regs = MoveParamI64 &static_offset_param_1
    - operand 1:   &static_offset_param_1

The patch adds variants of this instruction with an immediate operand
for byval arguments on 64-bit and 32-bit targets.

Differential Revision: https://reviews.llvm.org/D113006
2021-11-03 14:43:41 +03:00
David Green 3bc586b9aa [ARM] Treat MVE gather add-like-or's like adds
LLVM has the habit of turning adds with no common bits set into ors,
which means we need to detect them and treat them like adds again in the
MVE gather/scatter lowering pass.

Differential Revision: https://reviews.llvm.org/D112922
2021-11-03 11:41:06 +00:00
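The "add-like or" in the commit above rests on a simple bit-level fact: when two values share no set bits, there are no carries, so `a | b` equals `a + b`. A minimal sketch (the helper name `is_add_like_or` is an assumption, not the name used in the pass):

```python
def is_add_like_or(a, b):
    """True when a and b have no common bits set, so a | b == a + b."""
    return (a & b) == 0


# Disjoint bits: an aligned base plus a small offset.
base, offset = 0b1010_0000, 0b0000_0101
disjoint_ok = is_add_like_or(base, offset) and (base | offset) == (base + offset)

# Overlapping bits: the or and the add differ, so the or must stay an or.
x, y = 0b0110, 0b0100
overlap_differs = (not is_add_like_or(x, y)) and (x | y) != (x + y)

print(disjoint_ok, overlap_differs)
```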
David Spickett 6f5ce43b43 [lldb] Remove non address bits when looking up memory regions
On AArch64 we have various things using the non address bits
of pointers. This means when you look up their containing region
you won't find it if you don't remove them.

This changes Process GetMemoryRegionInfo to a non virtual method
that uses the current ABI plugin to remove those bits. Then it
calls DoGetMemoryRegionInfo.

That function does the actual work and is virtual to be overridden
by Process implementations.

A test case is added that runs on AArch64 Linux using the top
byte ignore feature.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D102757
2021-11-03 11:10:42 +00:00
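The structure described in the commit above (a non-virtual entry point that strips non-address bits, then calls an overridable worker) can be sketched as below. This is an illustration only, not lldb's API: the class and method names are hypothetical, and the mask assumes that only the top byte (bits 56 to 63) is ignored, whereas a real ABI plugin may remove other bits.

```python
TOP_BYTE_MASK = (1 << 56) - 1  # assumption: only the top byte is a non-address byte


class Process:
    def fix_address(self, addr):
        """Stand-in for the ABI plugin step that removes non-address bits."""
        return addr & TOP_BYTE_MASK

    def get_memory_region_info(self, addr):
        """Entry point that subclasses do not override: clean the address, then delegate."""
        return self._do_get_memory_region_info(self.fix_address(addr))

    def _do_get_memory_region_info(self, addr):
        """Overridden by concrete process implementations."""
        raise NotImplementedError


class ToyProcess(Process):
    def __init__(self, regions):
        self.regions = regions  # list of (start, end, name)

    def _do_get_memory_region_info(self, addr):
        for start, end, name in self.regions:
            if start <= addr < end:
                return name
        return None


process = ToyProcess([(0x1000, 0x2000, "heap")])
tagged = (0x5A << 56) | 0x1800  # pointer carrying a tag in its top byte

print(process.get_memory_region_info(tagged))       # tag removed before the lookup
print(process._do_get_memory_region_info(tagged))   # raw lookup misses the region
```

Putting the bit removal in the shared entry point means each process implementation only has to handle clean addresses.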
Peter Waller 1febf42f03 [AArch64][SVE][InstCombine] Combine contiguous gather/scatter to load/store
Contiguous gather => masked load:

  (sve.ld1.gather.index Mask BasePtr (sve.index IndexBase 1))
  => (masked.load (gep BasePtr IndexBase) Align Mask undef)

Contiguous scatter => masked store:

  (sve.st1.scatter.index Value Mask BasePtr (sve.index IndexBase 1))
  => (masked.store Value (gep BasePtr IndexBase) Align Mask)

Tests with <vscale x 2 x double>:

[Gather, Scatter] for each [Positive test (index=1), Negative test (index=2), Alignment propagation].

Differential Revision: https://reviews.llvm.org/D112076
2021-11-03 11:02:44 +00:00
David Green d36dd1f842 [ARM] Push gather/scatter shl index updates out of loops
This teaches the MVE gather scatter lowering pass that SHL is
essentially the same as Mul, where we are able to optimize the
induction of a gather/scatter address by pushing them out of loops.
https://alive2.llvm.org/ce/z/wG4VyT

Differential Revision: https://reviews.llvm.org/D112920
2021-11-03 11:00:05 +00:00
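The claim in the commit above that SHL behaves like Mul can be illustrated with 32-bit wraparound arithmetic. The sketch below checks two facts: a left shift by `k` is a multiply by `2**k`, and the shift distributes over an induction step, which is a plausible reading of why the update can be moved out of the loop. The helper names and the 32-bit width are assumptions for illustration.

```python
MASK32 = (1 << 32) - 1


def shl32(x, k):
    """32-bit left shift with wraparound."""
    return (x << k) & MASK32


def mul32(x, y):
    """32-bit multiply with wraparound."""
    return (x * y) & MASK32


samples = [0, 1, 7, 12345, 0x7FFFFFFF, 0xFFFFFFFF]
shifts = range(0, 8)

# x << k is the same as x * 2**k.
shl_is_mul = all(shl32(x, k) == mul32(x, 1 << k) for x in samples for k in shifts)

# (i + step) << k == (i << k) + (step << k), so a shifted induction variable
# can itself be advanced by a constant instead of being recomputed each iteration.
step = 4
distributes = all(
    shl32((i + step) & MASK32, k) == (shl32(i, k) + shl32(step, k)) & MASK32
    for i in samples
    for k in shifts
)

print(shl_is_mul, distributes)
```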
David Spickett 52615df0f2 [libcxx][utils] Note read only mount and ptrace permission in container script
Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D110938
2021-11-03 10:09:15 +00:00
Qiu Chaofan 741aeda97d [PowerPC] Implement longdouble pack/unpack builtins
Implement two builtins to pack/unpack IBM extended long double float,
according to GCC 'Basic PowerPC Builtin Functions Available ISA 2.05'.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D112055
2021-11-03 17:57:25 +08:00
David Sherwood 9da8dde7fd [NFC][LoopVectorize] Add test for tail-folding loop with conditional uniform load
I've added a test for a loop containing a conditional uniform load for
a target that supports masked loads. The test just ensures that we
correctly use gather instructions and have the correct mask.

Differential Revision: https://reviews.llvm.org/D112619
2021-11-03 09:51:11 +00:00
Alex Zinenko 34f72d9125 [mlir][python] expose the shape property of shaped types
This has been missing in the original definition of shaped types.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113025
2021-11-03 10:49:12 +01:00
Alex Zinenko fc7594cc4a [mlir][python] improve usability of Python affine construct bindings
- Provide the operator overloads for constructing (semi-)affine expressions in
  Python by combining existing expressions with constants.
- Make AffineExpr, AffineMap and IntegerSet hashable in Python.
- Expose the AffineExpr composition functionality.

Reviewed By: gysit, aoyal

Differential Revision: https://reviews.llvm.org/D113010
2021-11-03 10:48:01 +01:00
rkayaith f78fe0b7b8 [mlir][python] Make Operation and Value hashable
This allows operations and values to be used as dict keys

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D112669
2021-11-03 10:40:03 +01:00
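What "hashable" buys in the commit above can be shown with a plain Python class. This is a sketch of the general mechanism, not the MLIR bindings: `Handle` is a hypothetical wrapper whose equality and hash are both derived from one underlying identity.

```python
class Handle:
    """Hypothetical wrapper around an underlying object identity (an int here)."""

    def __init__(self, ident):
        self._ident = ident

    def __eq__(self, other):
        return isinstance(other, Handle) and self._ident == other._ident

    def __hash__(self):
        # A class that defines __eq__ without __hash__ is unhashable in Python,
        # so both are defined, and both use the same identity.
        return hash(self._ident)


use_counts = {}
a, a_again, b = Handle(1), Handle(1), Handle(2)
for handle in (a, a_again, b, a):
    use_counts[handle] = use_counts.get(handle, 0) + 1

print(len(use_counts), use_counts[Handle(1)], use_counts[Handle(2)])
```

Two wrappers for the same underlying object land on the same dict entry, which is the behaviour one would want when operations or values are used as keys.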
Andrew Savonichev 30a3a17df8 [NVPTX] Copy machine operand flags in TII::insertBranch
Before this patch, flags such as undef were dropped by TII::insertBranch
(used by BranchFolding pass), resulting in the following error from
machine verifier:

    *** Bad machine code: Reading virtual register without a def ***
    - function:    hoge
    - basic block: %bb.0 bb (0x562e9c240e68)
    - instruction: CBranch %2:int1regs, %bb.3
    - operand 0:   %2:int1regs

Differential Revision: https://reviews.llvm.org/D113001
2021-11-03 12:38:27 +03:00
Yi Kong 803d4f8a35 [ARM][AsmParser] Don't emit "deprecated instruction in IT block" warning if requested
Also fixed formatting in AsmMatcherEmitter because it was confusing.

Differential Revision: https://reviews.llvm.org/D112993
2021-11-03 17:18:04 +08:00
Valentin Clement 3c7ff45cbb [fir] Add substr information to fircg.ext_embox and fircg.ext_rebox operations
This patch adds the substring information to the fircg.ext_embox and
fircg.ext_rebox operations.

Substring is used for CHARACTER types.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D112807

Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
2021-11-03 10:15:10 +01:00
Andrew Savonichev a8083d42b1 [X86][clang] Disable long double type for -mno-x87 option
This patch attempts to fix a compiler crash that occurs when long
double type is used with -mno-x87 compiler option.

The option disables x87 target feature, which in turn disables x87
registers, so CG cannot select them for x86_fp80 LLVM IR type. Long
double is lowered as x86_fp80 for some targets, so it leads to a
crash.

The option seems to contradict the SystemV ABI, which requires long
double to be represented as an 80-bit floating point type, and it also
requires the use of x87 registers.

To avoid that, `long double` type is disabled when -mno-x87 option is
set. In addition to that, `float` and `double` also use x87 registers
for return values on 32-bit x86, so they are disabled as well.

Differential Revision: https://reviews.llvm.org/D98895
2021-11-03 12:08:39 +03:00
Kazushi (Jam) Marukawa 3d32218d1a [VE] Change to omitting the frame pointer on leaf functions
Change to omitting the frame pointer on leaf functions by default for VE.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D113087
2021-11-03 17:45:18 +09:00