Commit Graph

388224 Commits

Author SHA1 Message Date
Carl Ritson ad558a4ff7 [AMDGPU] Pre-commit tests for D102211 2021-05-11 12:16:58 +09:00
Hsiangkai Wang d8ec2b183e [RISCV] Fix the calculation of the offset of Zvlsseg spilling.
For Zvlsseg spilling, we need to convert the pseudo instructions
into multiple vector load/store instructions with appropriate offsets.
For example, for PseudoVSPILL3_M2, we need to convert it to

VS2R %v2, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v4, %base
ADDI %base, %base, (vlenb x 2)
VS2R %v6, %base

We need to keep the size of the offset in the pseudo spilling instructions.
In this case, it is (vlenb x 2).

The original implementation computed the offset as the size of the frame
object divided by the number of vectors in the Zvlsseg type. The frame object
is not necessarily the same size as the spilled data; it may be larger.
This patch changes the offset to (VLENB x LMUL), which is more direct and
easier to understand.
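
The offset rule described here can be sketched as follows. This is a minimal illustration, not the code from the patch: the function names and the concrete numbers (a `vlenb` of 16 bytes, a 128-byte frame object) are assumptions chosen for the example.

```python
from typing import List


def zvlsseg_spill_offsets(vlenb: int, lmul: int, nf: int) -> List[int]:
    """Byte offsets, relative to the base, of each register group in a segment spill.

    Each of the `nf` groups occupies (vlenb x lmul) bytes, so consecutive
    stores are that many bytes apart.
    """
    stride = vlenb * lmul
    return [i * stride for i in range(nf)]


def stride_from_frame_object(frame_object_size: int, nf: int) -> int:
    """The earlier approach: divide the frame object size by the number of vectors."""
    return frame_object_size // nf


# PseudoVSPILL3_M2: three groups (nf = 3) with LMUL = 2, assuming vlenb = 16.
offsets = zvlsseg_spill_offsets(vlenb=16, lmul=2, nf=3)

# Hypothetical frame object that is larger than the 96 bytes actually spilled.
padded_stride = stride_from_frame_object(frame_object_size=128, nf=3)
```

With these assumed values the offsets are 0, 32, and 64, while dividing the padded frame object gives a stride of 42, which no longer matches (vlenb x 2).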

Differential Revision: https://reviews.llvm.org/D101869
2021-05-11 10:13:18 +08:00
Renaud-K 1e11616a07 Enable export of FIR includes into the install tree
https://reviews.llvm.org/D102040
2021-05-10 18:05:12 -07:00
Vitaly Buka c057779d38 [NFC][LSAN] Fix flaky multithreaded test 2021-05-10 17:33:46 -07:00
LLVM GN Syncbot 842b162446 [gn build] Port e5d483f28a 2021-05-11 00:19:33 +00:00
Lang Hames 6d263b6f1c [ORC-RT] Add unit test infrastructure, extensible_rtti implementation, unit test
Add unit test infrastructure for the ORC runtime, plus a cut-down
extensible_rtti system and extensible_rtti unit test.

Removes the placeholder.cpp source file.

Differential Revision: https://reviews.llvm.org/D102080
2021-05-10 17:15:59 -07:00
zoecarver e5d483f28a [libcxx][ranges] Add ranges::empty CPO.
Depends on D101079. Refs D101189.

Differential Revision: https://reviews.llvm.org/D101193
2021-05-10 17:14:39 -07:00
Aart Bik bf812ea484 [mlir][linalg] remove the -now- obsolete sparse support in linalg
All glue and clutter in the linalg ops has been replaced by proper
sparse tensor type encoding. This code is no longer needed. Thanks
to ntv@ for giving us a temporary home in linalg.

So long, and thanks for all the fish.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102098
2021-05-10 16:49:33 -07:00
Stanislav Mekhanoshin 22d295f695 [AMDGPU] Constant fold Intrinsic::amdgcn_perm
Differential Revision: https://reviews.llvm.org/D102203
2021-05-10 16:23:11 -07:00
LLVM GN Syncbot 0077dce361 [gn build] Port 3b8d2be527 2021-05-10 23:06:37 +00:00
Sam Clegg 3b8d2be527 Reland: "[lld][WebAssembly] Initial support merging string data"
This change was originally landed in: 5000a1b4b9
It was reverted in: 061e071d8c

This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.

Unlike the ELF linker, this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)

Like the ELF linker, merging is only performed at `-O1` and above.

This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it does not currently support debug sections
because they are not represented by data segments (they are custom
sections).
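
Full tail merging, where "lo" shares storage with the end of "hello", can be illustrated with a naive sketch. It is not lld's implementation; `tail_merge` is a hypothetical helper that assumes single-byte, NUL-terminated strings with no embedded NULs.

```python
from typing import Dict, Iterable, Tuple


def tail_merge(strings: Iterable[str]) -> Tuple[bytes, Dict[str, int]]:
    """Pack strings into one blob; a string that is a tail of another reuses its bytes."""
    blob = bytearray()
    offsets: Dict[str, int] = {}
    # Longest strings first, so a string that is a tail of another
    # finds its container already emitted.
    for s in sorted(set(strings), key=lambda x: (-len(x), x)):
        data = s.encode("ascii") + b"\0"
        # `data` ends with NUL and contains no other NUL, so any match
        # must be the tail of a previously emitted string.
        pos = blob.find(data)
        if pos == -1:
            pos = len(blob)
            blob += data
        offsets[s] = pos
    return bytes(blob), offsets


blob, offsets = tail_merge(["hello", "lo", "world"])
```

Here "lo" is not stored separately: its offset points three bytes into "hello".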

Differential Revision: https://reviews.llvm.org/D97657
2021-05-10 16:03:38 -07:00
Benjamin Kramer 7b52aeadfa [mlir][Tensor] Add folding for tensor.from_elements
This trivially folds into a constant when all operands are constant.
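
The folding condition can be modeled in a few lines. This is a toy model, not MLIR code: `Dynamic` and `fold_from_elements` are hypothetical names, and plain Python numbers stand in for constant operands.

```python
from typing import List, Optional, Sequence, Union


class Dynamic:
    """Stand-in for an operand whose value is not known at compile time."""


Operand = Union[int, float, Dynamic]


def fold_from_elements(operands: Sequence[Operand]) -> Optional[List[Union[int, float]]]:
    """Return a constant element list if every operand is constant, else None."""
    if all(isinstance(op, (int, float)) for op in operands):
        return list(operands)
    # At least one operand is dynamic, so the op cannot be folded.
    return None
```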

Differential Revision: https://reviews.llvm.org/D102199
2021-05-11 00:42:45 +02:00
Jessica Paquette 79be9c59c6 [AArch64][GlobalISel] Add post-legalizer lowering for NEON vector fcmps
This is roughly equivalent to the floating point portion of
`AArch64TargetLowering::LowerVSETCC`. Main part that's missing is the v4s16 bit.

This also adds helpers equivalent to `EmitVectorComparison`, and
`changeVectorFPCCToAArch64CC`. This moves `changeFCMPPredToAArch64CC` out of
the selector into AArch64GlobalISelUtils for the sake of code reuse.

This is done in post-legalizer lowering with pseudos to simplify selection.
The imported patterns end up handling selection for us this way.

Differential Revision: https://reviews.llvm.org/D101782
2021-05-10 15:40:06 -07:00
Nico Weber 061e071d8c Revert "[lld][WebAssembly] Initial support merging string data"
This reverts commit 5000a1b4b9.
Breaks tests, see https://reviews.llvm.org/D97657#2749151

Easily repros locally with `ninja check-llvm-mc-webassembly`.
2021-05-10 18:28:28 -04:00
Jessica Paquette 6d8b070d96 [AArch64][GlobalISel] Enable memcpy family combines on minsize functions
The combines in `tryCombineMemCpyFamily` have heuristics (e.g.
`TLI.getMaxStoresPerMemset`) which consider size. So, theoretically, enabling
these combines on minsize functions shouldn't be harmful.

With this enabled we save 0.9% geomean on CTMark at -Oz, and 5.1% on Bullet.
There are no code size regressions.

Differential Revision: https://reviews.llvm.org/D102198
2021-05-10 15:25:23 -07:00
Guozhi Wei a0fed635fe Pre-commit test case for D101970
This is a test case for D101970, which shows the optimization opportunity for

    lea (reg1, reg2), reg3
    sub reg3, reg4

to

    sub reg1, reg4
    sub reg2, reg4
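
A quick check that both sequences leave the same value in reg4, sketched with 64-bit wraparound arithmetic. The function names are made up for this example, and processor flags are not modeled.

```python
MASK64 = (1 << 64) - 1


def lea_then_sub(reg1: int, reg2: int, reg4: int) -> int:
    """lea (reg1, reg2), reg3 ; sub reg3, reg4 (AT&T operand order)."""
    reg3 = (reg1 + reg2) & MASK64
    return (reg4 - reg3) & MASK64


def two_subs(reg1: int, reg2: int, reg4: int) -> int:
    """sub reg1, reg4 ; sub reg2, reg4."""
    reg4 = (reg4 - reg1) & MASK64
    return (reg4 - reg2) & MASK64


samples = [(1, 2, 10), (0, 0, 0), (MASK64, 1, 5), (123456789, MASK64 - 7, 42)]
results_match = all(lea_then_sub(*s) == two_subs(*s) for s in samples)
```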

Differential Revision: https://reviews.llvm.org/D102010
2021-05-10 14:47:54 -07:00
Krzysztof Parzyszek 8b9c15c281 [Hexagon] Handle loads and stores of scalar predicate vectors
Handle v2i1, v4i1, and v8i1.
2021-05-10 16:42:22 -05:00
David Blaikie 174606877d Clangd Matchers.h: Fix -Wdeprecated-copy by making the defaulted copy ctor and deleted copy assignment operators explicit 2021-05-10 14:31:11 -07:00
David Blaikie 6dc2a6a8c9 Remove some unnecessary explicit defaulted copy ctors to cleanup -Wdeprecated-copy
These types also wanted to be/were copy assignable, and using the
implicit copy ctor is deprecated in the presence of an explicit copy
ctor.

Removing the explicit copy ctor provides the desired behavior - both
ctor and assignment operator are available implicitly.

Also while I was nearby there were some missing std::moves on shared
pointer parameters.
2021-05-10 14:31:11 -07:00
Sanjay Patel 5577e86691 [InstCombine] fold extract subvector of bitcast insertelt
This is visible in the original example from:
https://llvm.org/PR50055
(but this change doesn't solve the bug)

https://alive2.llvm.org/ce/z/vM_Yq-
2021-05-10 17:20:10 -04:00
Sanjay Patel 8a74cc139d [InstCombine] add tests for extract-subvector of insert; NFC 2021-05-10 17:03:28 -04:00
Artem Dergachev 91ca3269a1 [clang-tidy] Aliasing: Add support for aggregates with references.
When a variable is used in an initializer of an aggregate
for its reference-type field this counts as aliasing.

Differential Revision: https://reviews.llvm.org/D101791
2021-05-10 14:00:31 -07:00
Artem Dergachev 9b292e0edc [clang-tidy] Aliasing: Add more support for captures.
D96215 takes care of the situation where the variable is captured into
a nearby lambda. This patch takes care of the situation where
the current function is the lambda and the variable is one of its captures
from an enclosing scope.

The analogous problem for ^{blocks} is already handled automagically
by D96215.

Differential Revision: https://reviews.llvm.org/D101787
2021-05-10 14:00:30 -07:00
Artem Dergachev 43f4331edf [clang-tidy] Aliasing: Add support for captures.
The utility function clang::tidy::utils::hasPtrOrReferenceInFunc() scans the
function for pointer/reference aliases to a given variable. It currently scans
for operator & over that variable and for declarations of references to that
variable.

This patch makes it also scan for C++ lambda captures by reference
and for Objective-C block captures.

Differential Revision: https://reviews.llvm.org/D96215
2021-05-10 14:00:30 -07:00
Roman Lebedev 6a64c462eb [X86] AMD Zen 3: same-reg AVX YMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Still not zero-cycle :)
2021-05-10 23:49:27 +03:00
Roman Lebedev 5864e7b86b [NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX YMM VPCMP 2021-05-10 23:49:27 +03:00
Roman Lebedev 2953245337 [X86] AMD Zen 3: same-reg AVX XMM VPCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Again, it's not zero-cycle.
2021-05-10 23:49:26 +03:00
Roman Lebedev f59db6c4f8 [NFC][X86][MCA] AMD Zen 3: add tests for same-reg AVX XMM VPCMP 2021-05-10 23:49:26 +03:00
Roman Lebedev 0f3bcb97ef [X86] AMD Zen 3: same-reg SSE XMM PCMP is dep breaking one-idiom
As measured by exegesis, and confirmed by ref docs.
Much like with MMX PCMP, it does actually have to execute, though.
2021-05-10 23:49:26 +03:00
Roman Lebedev 0e538f937a [NFC][X86][MCA] AMD Zen 3: add tests for same-reg XMM SSE PCMP 2021-05-10 23:49:26 +03:00
Roman Lebedev b24edfff4f [X86] AMD Zen 3: same-reg PCMPEQ is an MMX all-ones dep breaking idiom
They are, however, not zero-cycle, and do actually execute.

As measured by exegesis, and confirmed by ref docs.
2021-05-10 23:49:26 +03:00
Roman Lebedev ba225ce961 [NFC][X86][MCA] AMD Zen 3: add tests for same-reg MMX PCMPEQ 2021-05-10 23:49:25 +03:00
Christopher Di Bella 4ff2fe1df0 [libcxx] removes `weak_equality` and `strong_equality` from <compare>
`weak_equality` and `strong_equality` were removed from the C++20 draft
before being standardised, and need to be removed from libc++ too.

Also adjusts `common_comparison_category` since its test needed
adjusting due to the equality deletions.

Differential Revision: https://reviews.llvm.org/D100283
2021-05-10 20:45:04 +00:00
Arthur Eubanks edfa44b732 [test] Put aix-xcoff-huge-relocs.ll under expensive checks
It is an order of magnitude slower than the second slowest test
according to obj/llvm/test/.lit_test_times.txt.

The two slowest are:
 2.870437e+02 CodeGen/PowerPC/aix-xcoff-huge-relocs.ll
 2.850697e+01 tools/llvm-readobj/ELF/file-header-machine-types.test
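
The "order of magnitude" claim can be checked directly from the two lines quoted above. This sketch assumes each line is a time in seconds followed by the test path, as shown; `parse_lit_time` is a hypothetical helper.

```python
from typing import Tuple


def parse_lit_time(line: str) -> Tuple[float, str]:
    """Split a '<seconds> <test path>' line into its two fields."""
    seconds, test = line.split(maxsplit=1)
    return float(seconds), test


slowest = parse_lit_time("2.870437e+02 CodeGen/PowerPC/aix-xcoff-huge-relocs.ll")
second = parse_lit_time("2.850697e+01 tools/llvm-readobj/ELF/file-header-machine-types.test")

# How many times slower the slowest test is than the runner-up.
ratio = slowest[0] / second[0]
```

The ratio comes out at roughly 10.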

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D102190
2021-05-10 13:44:29 -07:00
Stella Laurenzo 295087644a [mlir] Fix windows build bot break due to use of `alloca` in a test.
Differential Revision: https://reviews.llvm.org/D102189
2021-05-10 20:39:16 +00:00
Stella Laurenzo a2c8aebd8f [mlir][Python] Finish adding RankedTensorType support for encoding.
Differential Revision: https://reviews.llvm.org/D102184
2021-05-10 20:39:16 +00:00
Nikita Popov 463ea28e96 [InstCombine] Fold comparison of integers by parts
Let's say you represent (i32, i32) as an i64 from which the parts
are extracted with lshr/trunc. Then, if you compare two tuples by
parts you get something like A[0] == B[0] && A[1] == B[1], just
that the part extraction happens by lshr/trunc and not a narrow
load or similar.

The fold implemented here reduces such equality comparisons by
converting them into a comparison on a larger part of the integer
(which might be the whole integer). It handles both the "and of eq"
and the conjugated "or of ne" case.

I'm being conservative with one-use for now, though this could be
relaxed if profitable (the base pattern converts 11 instructions
into 5 instructions, but there's quite a few variations on how it
can play out).
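
The equivalence behind the fold can be checked with a small model. This is a sketch of the arithmetic only, not of the InstCombine code; the function names are invented for the example, and an (i32, i32) pair is assumed to be packed into an i64.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def eq_by_parts(a: int, b: int) -> bool:
    """A[0] == B[0] && A[1] == B[1], with parts extracted by lshr/trunc."""
    lo_eq = (a & MASK32) == (b & MASK32)                  # trunc
    hi_eq = ((a >> 32) & MASK32) == ((b >> 32) & MASK32)  # lshr + trunc
    return lo_eq and hi_eq


def ne_by_parts(a: int, b: int) -> bool:
    """The conjugated form: A[0] != B[0] || A[1] != B[1]."""
    lo_ne = (a & MASK32) != (b & MASK32)
    hi_ne = ((a >> 32) & MASK32) != ((b >> 32) & MASK32)
    return lo_ne or hi_ne


def eq_whole(a: int, b: int) -> bool:
    """The folded form: one comparison on the whole 64-bit integer."""
    return (a & MASK64) == (b & MASK64)


pairs = [(0, 0), (1, 1), (1 << 32, 1), (0xDEADBEEF00000001, 0xDEADBEEF00000001),
         (0xDEADBEEF00000001, 0xDEADBEEF00000002)]
fold_is_sound = all(
    eq_by_parts(a, b) == eq_whole(a, b) and ne_by_parts(a, b) == (not eq_whole(a, b))
    for a, b in pairs
)
```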

Differential Revision: https://reviews.llvm.org/D101232
2021-05-10 22:22:39 +02:00
Florian Hahn 93a9a8a8d9 [VecLib] Add support for vector fns from Darwin's libsystem.
This patch adds support for Darwin's libsystem math vector functions to
TLI. Darwin's libsystem provides a range of vector functions for libm
functions.

This initial patch only adds the 2 x double and 4 x float versions,
which are available on both X86 and ARM64. On X86, wider vector versions
are supported as well.

Reviewed By: jroelofs

Differential Revision: https://reviews.llvm.org/D101856
2021-05-10 21:19:58 +01:00
Sam Clegg 5000a1b4b9 [lld][WebAssembly] Initial support merging string data
This change adds support for a new WASM_SEG_FLAG_STRINGS flag in
the object format which works in a similar fashion to SHF_STRINGS
in the ELF world.

Unlike the ELF linker, this support is currently limited:
- No support for SHF_MERGE (non-string merging)
- Always do full tail merging ("lo" can be merged with "hello")
- Only support single byte strings (p2align 0)

Like the ELF linker, merging is only performed at `-O1` and above.

This fixes part of https://bugs.llvm.org/show_bug.cgi?id=48828,
although crucially it does not currently support debug sections
because they are not represented by data segments (they are custom
sections).

Differential Revision: https://reviews.llvm.org/D97657
2021-05-10 13:15:12 -07:00
Arthur Eubanks 85af8a8c1b [NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().

A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.

The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.

Differential Revision: https://reviews.llvm.org/D101713
2021-05-10 13:05:15 -07:00
Lang Hames 9507bace6c [ORC] Use a unique_function rather than std::function for dispatchTask. 2021-05-10 13:04:33 -07:00
Nikita Popov aa9b02ac75 [Inliner] Fix noalias metadata handling for instructions simplified during cloning (PR50270)
Instead of using VMap, which may include instructions from the
caller as a result of simplification, iterate over the
(FirstNewBlock, Caller->end()) range, which will only include new
instructions.

Fixes https://bugs.llvm.org/show_bug.cgi?id=50270.

Differential Revision: https://reviews.llvm.org/D102110
2021-05-10 21:59:59 +02:00
Mitch Phillips e78b64df98 [Scudo] Use GWP-ASan's aligned allocations and fixup postalloc hooks.
This patch does a few cleanup things:
 1. The non-standalone scudo has a problem where GWP-ASan allocations
 may not meet alignment requirements where Scudo was requested to have
 alignment >= 16. Use the new GWP-ASan API to fix this.
 2. The standalone variant loses some debugging information inside of
 GWP-ASan because we ask GWP-ASan to allocate an aligned size in the
 frontend. This means reports end up with 'UaF on a 16-byte allocation'
 for a 1-byte allocation with 16-byte alignment. Also use the new API to
 fix this.
 3. Add post-alloc hooks for GWP-ASan intercepted allocations, and add
 stats tracking for GWP-ASan allocations.
 4. Add a small test that checks the alignment of the frontend
 allocator, so that it can be used under GWP-ASan torture mode.
 5. Add GWP-ASan torture mode as a testing configuration to catch these
 regressions.
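
Point 2 can be illustrated with a simplified model of the size rounding. This is not Scudo or GWP-ASan code; `round_up` is a generic helper written for the example, and it assumes power-of-two alignments.

```python
def round_up(size: int, alignment: int) -> int:
    """Round `size` up to the next multiple of a power-of-two `alignment`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (size + alignment - 1) & ~(alignment - 1)


requested_size = 1
requested_alignment = 16

# If the frontend rounds the size before handing it on, the allocator only
# ever sees the rounded value and has nothing else to put in its reports.
size_seen_by_allocator = round_up(requested_size, requested_alignment)
```

In this model a 1-byte request with 16-byte alignment reaches the allocator as a 16-byte request, which is why passing size and alignment separately preserves the original size for reports.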

Depends on D94830, D95889.

Reviewed By: cryptoad

Differential Revision: https://reviews.llvm.org/D95884
2021-05-10 12:56:18 -07:00
Aart Bik 96a23911f6 [mlir][sparse] complete migration to sparse tensor type
A very elaborate, but also very fun revision because all
puzzle pieces are finally "falling in place".

1. replaces linalg annotations + flags with proper sparse tensor types
2. adds rigorous verification on sparse tensor type and sparse primitives
3. removes glue and clutter on opaque pointers in favor of sparse tensor types
4. migrates all tests to use sparse tensor types

NOTE: next CL will remove *all* obsoleted sparse code in Linalg

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D102095
2021-05-10 12:55:22 -07:00
Jez Ng b1c3c2e4fc [lld-macho] Fix order file arch filtering
We had a hardcoded check and a stale TODO, written back when we only had
support for one architecture.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102154
2021-05-10 15:45:54 -04:00
Jez Ng 2516b0b526 [lld-macho] Treat undefined symbols uniformly
In particular, we should apply the `-undefined` behavior to all
such symbols, including those that are specified via the command line
(i.e. `-e`, `-u`, and `-exported_symbol`). ld64 supports this too.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D102143
2021-05-10 15:45:54 -04:00
Jez Ng 3d5e5066f1 [lld-macho][nfc] Clean up tests
* Remove unnecessary `rm -rf %t`s
* Have lc-linker-option.ll use the right comment marker
2021-05-10 15:45:54 -04:00
Stefan Pintilie 6215f49b8f [PowerPC] Spilling to registers does not require frame index scavenging
If spills are to registers instead of to the stack then a copy will be used
and frame index scavenging is not required.

This patch adds debug info to frame index scavenging and makes sure that
spilling to registers does not cause frame index scavenging.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D101360
2021-05-10 14:42:39 -05:00
Arthur Eubanks 16748bd2fb [TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806
2021-05-10 12:35:11 -07:00
Lei Zhang 7e71823f1d [mlir][linalg] Restrict distribution to parallel dims
According to the API contract, LinalgLoopDistributionOptions
expects to work on parallel iterators. When getting processor
information, only loop ranges for parallel dimensions should
be fed in. But right now after generating scf.for loop nests,
we feed in *all* loops, including the ones materialized for
reduction iterators. This can cause unexpected distribution
of reduction dimensions. This commit fixes it.
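
The restriction can be sketched as a simple filter. This is an illustration of the idea, not the MLIR implementation: `parallel_loop_ranges`, the string iterator kinds, and the `(lower, upper, step)` tuples are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

LoopRange = Tuple[int, int, int]  # (lower bound, upper bound, step)


def parallel_loop_ranges(iterator_types: Sequence[str],
                         loop_ranges: Sequence[LoopRange]) -> List[LoopRange]:
    """Keep only the loop ranges that belong to parallel iterators."""
    if len(iterator_types) != len(loop_ranges):
        raise ValueError("expected one iterator type per loop range")
    return [r for kind, r in zip(iterator_types, loop_ranges) if kind == "parallel"]


# A matmul-like nest: two parallel dimensions and one reduction dimension.
kinds = ["parallel", "parallel", "reduction"]
ranges = [(0, 128, 1), (0, 64, 1), (0, 32, 1)]
to_distribute = parallel_loop_ranges(kinds, ranges)
```

Only the first two ranges would be handed to the distribution callback in this model; the reduction loop is left undistributed.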

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D102079
2021-05-10 15:23:00 -04:00