Commit Graph

425110 Commits

Author SHA1 Message Date
Mehdi Amini 2e2a8a2d90 Revert "[VP] vp intrinsics are not speculatable"
This reverts commit 78a18d2b54.

Breaks the MLIR bot: https://lab.llvm.org/buildbot/#/builders/61/builds/27127
2022-05-30 12:26:16 +00:00
Mehdi Amini eacfd04744 Apply clang-tidy fixes for llvm-else-after-return in OpPythonBindingGen.cpp (NFC) 2022-05-30 12:25:58 +00:00
Mehdi Amini 0f68c959d2 Apply clang-tidy fixes for modernize-use-override in SparseTensorUtils.cpp (NFC) 2022-05-30 12:25:55 +00:00
Hans Wennborg bac4934c84 Revert "build_llvm_package.bat: Produce zip files in addition to the installers"
The zip files were too large to be practical, so they were never
shipped. Reverting to reduce build time and complexity of the script.

This reverts commit 4486aa03c5.
2022-05-30 13:55:54 +02:00
Mats Petersson 820146abe9 [OpenMP] Pass chunk-size to MLIR while lowering from parse-tree
Test that chunk size is passed to the static init function.
Using three different variations:
1. Single constant.
2. Expression with constants.
3. Variable value.

Reviewed By: peixin, shraiysh

Differential Revision: https://reviews.llvm.org/D126383
2022-05-30 12:14:31 +01:00
Max Kazantsev 7e5a730473 [MemDep][NFC] Remove duplicating check in `if` and `else` branch
The same check is done whether the condition is true or false. Just hoist
it out of the conditional.
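
A minimal sketch of this kind of refactoring, not the actual MemDep code. The names `isCandidate`, `before`, and `after` are hypothetical and exist only for illustration:

```cpp
#include <cassert>

// Hypothetical stand-in for the check that was duplicated in both branches.
bool isCandidate(int v) { return v % 2 == 0; }

// Before: the same check is repeated in the `if` and the `else` branch.
int before(bool isLoad, int v) {
  if (isLoad) {
    if (!isCandidate(v))
      return -1;
    return v + 1;
  } else {
    if (!isCandidate(v))
      return -1;
    return v + 2;
  }
}

// After: the check is hoisted out of the conditional; behavior is unchanged.
int after(bool isLoad, int v) {
  if (!isCandidate(v))
    return -1;
  return isLoad ? v + 1 : v + 2;
}
```

Both functions return the same value for every input, which is what makes such a change NFC.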
2022-05-30 17:43:00 +07:00
Simon Moll 78a18d2b54 [VP] vp intrinsics are not speculatable
VP intrinsics show UB if the %evl parameter is out of bounds - they must
not carry the speculatable attribute.  The out-of-bounds UB disappears
when the %evl parameter is expanded into the mask or expansion replaces
the entire VP intrinsic with non-VP code.

This patch
- Removes the speculatable attribute on all VP intrinsics.
- Generalizes the isSafeToSpeculativelyExecute function to let VP
  expansion know whether the VP intrinsic replacement will be
  speculatable.  VP expansion may only discard %evl where this is the
  case.
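
As a rough scalar model of what "expanding %evl into the mask" means (an illustrative sketch, not LLVM code), a lane stays active only if the mask enables it and its index is below %evl. The lane count and the names `foldEvlIntoMask` and `maskedAdd` are assumptions made for this example:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4; // assumed vector width for the model
using Vec = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Fold the explicit vector length into the mask: lane i stays active only if
// the original mask enables it and i < evl.
Mask foldEvlIntoMask(const Mask &mask, std::size_t evl) {
  Mask folded{};
  for (std::size_t i = 0; i < kLanes; ++i)
    folded[i] = mask[i] && i < evl;
  return folded;
}

// Masked add that no longer takes an evl parameter. Inactive lanes are set to
// 0 here only to give the model a definite value.
Vec maskedAdd(const Vec &a, const Vec &b, const Mask &mask) {
  Vec result{};
  for (std::size_t i = 0; i < kLanes; ++i)
    result[i] = mask[i] ? a[i] + b[i] : 0;
  return result;
}
```

In this model, an evl larger than the lane count simply leaves the mask unchanged, so the masked operation has no remaining dependence on evl.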

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125296
2022-05-30 12:20:05 +02:00
Max Kazantsev 180d3f251d [MemDep][NFCI] Remove redundant dyn_cast, replace with cast
When `IsLoad` is `true`, we don't need to check if the instruction
is actually a load with dyn_cast. Saves some petty amount of CT.
2022-05-30 17:17:55 +07:00
Hans Wennborg 10d2195305 Update the Windows packaging script
Check in updates based on how the latest release was built [0] and add
the bug fix from [1] which allows LLDB to start.

Other changes which had accumulated in the local release script:
- Don't build the clang format plugin (VS has the functionality built
  in now)
- Disable tests that have been failing (I'll try to follow up and
  re-enable them)
- Switch to Python 3.10
- Jump through more hoops to make LLDB pick the right Python.

0. https://discourse.llvm.org/t/14-0-4-final-has-been-tagged/62750/3
1. https://github.com/llvm/llvm-project/issues/54589
2022-05-30 11:58:13 +02:00
Edd Barrett d245974e1a Test stackmap support for floating point types.
It appears that float support is complete, or at least, the stackmap records
emitted are not inconceivable (I must admit that I don't know about many of the
architectures under test here).

One curiosity, the SystemZ tests highlight an undocumented (or maybe incorrect)
quirk of the stackmap format: in the case of a Register record, the Offset or
SmallConstant field can encode a sub-register index! I've only ever seen this
field zero for Register entries up until now.
2022-05-30 10:49:32 +01:00
Sven van Haastregt a5cf17f8ae [OpenCL] Expose wg collective functions for CL3 SPIR targets
Since the SPIR/SPIR-V targets enable all known features, we must
ensure the Work-group Collective Functions feature macro is set for
OpenCL 3.0.

Fixes https://github.com/llvm/llvm-project/issues/55770
2022-05-30 10:48:49 +01:00
David Green 99b0078064 [AArch64] Tests for showing MachineCombiner COPY patterns. NFC 2022-05-30 10:47:44 +01:00
Alex Zinenko 5cde5a5739 [mlir] add interchange, pad and scalarize to structured transform dialect
Add ops to the structured transform extension of the transform dialect that
perform interchange, padding and scalarization on structured ops. Along with
tiling that is already defined, this provides a minimal set of transformations
necessary to build vectorizable code for a single structured op.

Define two helper traits: one that implements TransformOpInterface by applying
a function to each payload op independently and another that provides a simple
"functional-style" producer/consumer list of memory effects for the transform
ops.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D126374
2022-05-30 11:42:40 +02:00
Ivan Kosarev b4dbcba3b7 [AMDGPU][GFX9][NFC] Rename the base class for SMEM stores. 2022-05-30 10:31:59 +01:00
Ivan Kosarev 082822b381 [AMDGPU][GFX9] Support base+soffset+offset SMEM stores.
Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D126388
2022-05-30 10:27:57 +01:00
Simon Pilgrim 14cc4674bf [X86] Adjust vector fp test costs to match int test costs
znver1/2 models were missing the vtestps/pd overrides to match the vptest integer equivalents.

Noticed while investigating Issue #54889
2022-05-30 09:50:15 +01:00
Alexander Belyaev 402b837302 Revert "[mlir] Lower complex.sqrt and complex.atan2 to Arithmetic dialect."
This reverts commit f5fa633b09.

Integration test sparse_complex_ops.mlir breaks because of it.
2022-05-30 10:48:58 +02:00
Christian Sigg 544d6507ba [MLIR][NVVM] NFC: add labels to test functions.
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D126631
2022-05-30 10:39:44 +02:00
Simon Pilgrim f82967b786 [M68k] Remove unused variable to fix MSVC warning. NFC. 2022-05-30 08:59:49 +01:00
Simon Pilgrim 1956f28037 [X86] Adjust vector extend to ymm to match SoG (Issue #54889)
znver1 ymm variants of VPMOVSX**/VPMOVZX** instructions require double pumping.

Now matches AMD SoG, Agner and instlatx64 numbers.

Thanks to @fabian-r for the report
2022-05-30 08:58:56 +01:00
Nikita Popov 1721ff1dfd [GVN] Enable enable-split-backedge-in-load-pre option by default
This option was added in D89854. It prevents GVN from performing
load PRE in a loop, if doing so would require critical edge
splitting on the backedge. From the review:

> I know that GVN Load PRE negatively impacts peeling,
> loop predication, so the passes expecting that latch has
> a conditional branch.

In the PhaseOrdering test in this patch, splitting the backedge
negatively affects vectorization: After critical edge splitting,
the loop gets rotated, effectively peeling off the first loop
iteration. The effect is that the first element is handled
separately, then the bulk of the elements use a vectorized
reduction (but using unaligned, off-by-one memory accesses) and
then a tail of 15 elements is handled separately again.

It's probably worth noting that the loop load PRE from D99926 is
not affected by this change (as it does not need backedge
splitting). This is about normal load PRE that happens to occur
inside a loop.

Differential Revision: https://reviews.llvm.org/D126382
2022-05-30 09:55:58 +02:00
Alexander Belyaev f5fa633b09 [mlir] Lower complex.sqrt and complex.atan2 to Arithmetic dialect.
I don't see a point in the lit tests here since sqrt, mul and other ops
expand as well. I just added "smoke" tests to verify that the conversion works
and does not create any illegal ops.

I will create a patch that adds a simple integration test to
mlir/test/Integration/Dialect/ComplexOps/ that will compare the values.

Differential Revision: https://reviews.llvm.org/D126539
2022-05-30 09:44:36 +02:00
Christian Sigg bcf3d52486 [MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass.
Reviewed By: bondhugula, herhut

Differential Revision: https://reviews.llvm.org/D126199
2022-05-30 09:20:48 +02:00
Haojian Wu a5ddd4a238 [pseudo] Remove an unnecessary nullable check diagnostic in the bnf
grammar, NFC.

This diagnostic has been handled in eliminateOptional.
2022-05-30 09:04:47 +02:00
Chuanqi Xu 738c20e6df [NFC] Use %clang instead of %clang++ in tests
Previously the tests used %clang++ instead of %clang, which caused the
tests to fail on Windows.
2022-05-30 14:38:46 +08:00
LLVM GN Syncbot b16460bb48 [gn build] Port 751c7be5b2 2022-05-30 06:27:55 +00:00
Sheng 751c7be5b2 [TableGen] Remove code beads
Code beads are no longer needed since the only user, M68k, has moved on to
a new encoding/decoding infrastructure.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D126349
2022-05-30 14:27:37 +08:00
Chuanqi Xu a544710cd4 [Driver] Enable to use C++20 standalone by -fcxx-modules
This patch allows users to use C++20 modules via -fcxx-modules. Previously,
we could only use them under -std=c++20. Given that users can use C++20
coroutines standalone via -fcoroutines-ts, it makes sense to offer an
option to use C++20 modules without enabling C++20.

Reviewed By: iains, MaskRay

Differential Revision: https://reviews.llvm.org/D120540
2022-05-30 14:19:56 +08:00
Max Kazantsev 503d5771b6 [JumpThreading][NFCI] Reuse existing DT instead of recomputation
This whole part with recomputation of BPI and BFI looks redundant,
and we tried to get rid of it in D124439. Unfortunately, it causes
some hard-to-reproduce failures due to invalid state of analysis.
Until this is investigated and fixed, let's try to reuse at least
part of the available analyses.

DT is available at this point, and there is no need to recompute it.

Please revert if you see it causing *any* behavior changes.
2022-05-30 12:48:10 +07:00
Sockke 3f3a235aa2 [clang-apply-replacements] Added an option to ignore insert conflict.
If two different texts are inserted at the same offset, clang-apply-replacements prints the conflict error and discards all fixes. This patch adds support for adjusting the conflicting offset so that the tool keeps running and applies the fixes.

For https://godbolt.org/z/P938EGoxj, no fixes are applied when I run run-clang-tidy.py to generate a YAML file with clang-tidy and then apply it with clang-apply-replacements. The YAML file has two different header text insertions at the same offset; unlike clang-tidy with '-fix', clang-apply-replacements does not adjust for this conflict.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D123924
2022-05-30 13:02:25 +08:00
Ping Deng 88af539c0e [RISCV] Support VP_REDUCE_MUL mask operation
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126520
2022-05-30 03:05:39 +00:00
Ping Deng 083798e270 [LegalizeTypes][VP] Add integer promotion support for vp.fptosi/vp.fptoui
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125760
2022-05-30 03:05:39 +00:00
Chuanqi Xu 42c3c70a9e Revert "[Driver] Enable to use C++20 standalone by -fcxx-modules"
This reverts commit 99eca83538.

It causes clang-tools-extra to fail.
2022-05-30 10:43:13 +08:00
Chuanqi Xu 99eca83538 [Driver] Enable to use C++20 standalone by -fcxx-modules
This patch allows users to use C++20 modules via -fcxx-modules. Previously,
we could only use them under -std=c++20. Given that users can use C++20
coroutines standalone via -fcoroutines-ts, it makes sense to offer an
option to use C++20 modules without enabling C++20.

Reviewed By: iains, MaskRay

Differential Revision: https://reviews.llvm.org/D120540
2022-05-30 10:24:09 +08:00
Chenbing Zheng ef256ed58e [InstCombine] bitcast (extractelement <1 x elt>, dest) -> bitcast(<1 x elt>, dest)
Only handle the case where the dest type is a vector, to avoid the inverse transform in visitBitCast.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D125951
2022-05-30 10:16:32 +08:00
Sockke c98b3a8cd9 Fix `performance-unnecessary-value-param` for template specialization
The check did not consider the parameter type in the primary template of a template specialization, which could cause build breakages.

Reviewed By: aaron.ballman, flx

Differential Revision: https://reviews.llvm.org/D116593
2022-05-30 09:55:53 +08:00
Lian Wang 967ef4ad0a [NFC][VP] Fix llvm.vp.merge intrinsic Expansion in LangRef
Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D126457
2022-05-30 01:43:41 +00:00
Craig Topper 6a6cf2e28d [RISCV] isel (add (and X, 0x1FFFFFFFE), Y) as (SH1ADD (SRLI X, 1), Y)
This pattern is what we get after DAG combine for C code like this.

short *ptr1, *ptr2, *ptr3;
unsigned diff = ptr1 - ptr2;
return ptr3[diff];

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126588
2022-05-29 18:24:07 -07:00
Craig Topper e642d0ea21 [RISCV] Add test cases showing missed opportunity to use shXadd.uw. NFC
The tests here show the codegen for something like this C code.

unsigned diff = ptr1 - ptr2;
return ptr3[diff];

The pointer difference is truncated to 32-bits before being used
again as an index. In SelectionDAG this appears as an AND between
a SRL and a SHL. DAGCombiner will remove the shifts leaving only
an AND. The Mask now has 1, 2, or 3 trailing zeros and 31, 30, or 29
leading zeros. We end up falling back to constant materialization
to create this mask.

We could instead use srli followed by slli.uw. Or since
we have an add, we can use srli followed by shXadd.uw.
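
The equivalence can be checked with a small scalar model. This is a sketch, assuming the Zba semantics of shXadd.uw (zero-extend the low 32 bits of rs1, shift left by X, add rs2); the function names are illustrative only:

```cpp
#include <cassert>
#include <cstdint>

// Model of shXadd.uw: rd = rs2 + (zext(rs1[31:0]) << sh), with sh in {1, 2, 3}.
constexpr std::uint64_t shXaddUw(unsigned sh, std::uint64_t rs1, std::uint64_t rs2) {
  return rs2 + ((rs1 & 0xFFFFFFFFull) << sh);
}

// What is left after DAGCombiner: an AND with a mask that has `sh` trailing
// zeros and 32 - sh leading zeros, followed by the add.
constexpr std::uint64_t andThenAdd(unsigned sh, std::uint64_t x, std::uint64_t y) {
  return (x & (0xFFFFFFFFull << sh)) + y;
}

// The alternative sequence: srli by sh, then shXadd.uw.
constexpr std::uint64_t srliThenShXaddUw(unsigned sh, std::uint64_t x, std::uint64_t y) {
  return shXaddUw(sh, x >> sh, y);
}

// For sh == 1 the mask is 0x1FFFFFFFE.
static_assert((0xFFFFFFFFull << 1) == 0x1FFFFFFFEull, "mask for sh1add.uw");
static_assert(andThenAdd(1, ~0ull, 5) == srliThenShXaddUw(1, ~0ull, 5), "sh = 1");
static_assert(andThenAdd(2, ~0ull, 5) == srliThenShXaddUw(2, ~0ull, 5), "sh = 2");
static_assert(andThenAdd(3, ~0ull, 5) == srliThenShXaddUw(3, ~0ull, 5), "sh = 3");
```

The identity holds because shifting right by sh, keeping 32 bits, and shifting left by sh again selects exactly bits sh through 31 + sh of the input, which is what the mask does.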

Differential Revision: https://reviews.llvm.org/D126589
2022-05-29 18:22:55 -07:00
Florian Hahn 0776c48f9b Recommit "[LICM] Only create load in ph when promoting load or store doesn't exec."
This reverts the revert commit ad95255b92.

The updated version also creates a load when the store may not execute.
In those cases, we still need to introduce a load in a function where
there may not have been one before, so this doesn't completely resolve
issue #51248.

Original message:

    When only a store is sunk, there is no need to create a load in the
    pre-header, as the result of the load will never get used.

    The dead load can introduce UB, if the function is marked as
    writeonly.

    Reviewed By: nikic

    Differential Revision: https://reviews.llvm.org/D123473
2022-05-29 21:57:14 +01:00
David Green 9a3144d078 [AArch64] Reuse larger DUP if available
If both a v2i32 DUP(x) and a v4i32 DUP(x) node exists, we can re-use the
larger node using a vector extract to obtain the smaller. This comes up
in the smull/smlal code, but needs a small fixup to allow the smull2
code in tryExtendDUPToExtractHigh/performAddSubLongCombine to still
match smull2 extracts.
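
The underlying observation can be modeled in plain C++ (a sketch only, not SelectionDAG code; `dup2`, `dup4`, and `extractLow` are hypothetical helpers): the low half of a four-lane splat of x is the two-lane splat of x.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V2i32 = std::array<std::int32_t, 2>;
using V4i32 = std::array<std::int32_t, 4>;

// Model of a v2i32 DUP(x).
V2i32 dup2(std::int32_t x) { return V2i32{{x, x}}; }

// Model of a v4i32 DUP(x).
V4i32 dup4(std::int32_t x) { return V4i32{{x, x, x, x}}; }

// Model of extracting the low two lanes of a four-lane vector.
V2i32 extractLow(const V4i32 &v) { return V2i32{{v[0], v[1]}}; }
```

With these definitions, `extractLow(dup4(x))` equals `dup2(x)` for any x, so the smaller DUP does not need to be materialized separately.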

Differential Revision: https://reviews.llvm.org/D126449
2022-05-29 19:42:13 +01:00
Joe Loser 7f1e048041 [libc++][test] Remove Clang <= 3.7 workaround in is_default_constructible test
Clang 3.7 and below is not actively used or supported in the test suite now, so
remove the workaround in the test.

Differential Revision: https://reviews.llvm.org/D126603
2022-05-29 11:57:06 -06:00
chenglin.bi e091721fdc [InstCombine] Add baseline tests for shift+and+icmp transforms; NFC 2022-05-30 01:01:37 +08:00
Simon Pilgrim c99690462e [X86] Adjust vector shift costs to match SoG (Issue #54889)
znver1/2 models were incorrectly modelling the fpupipe (should be pipe2 for shift-by-scalar-amount and pipe1 for shift-by-element-amount) and znver1 ymm variants also require double pumping.

Now matches AMD SoG, Agner and instlatx64 numbers.

Thanks to @fabian-r for the report
2022-05-29 17:55:39 +01:00
chenglin.bi 9080e21906 [InstCombine] Add baseline tests for shift+and transforms; NFC 2022-05-30 00:30:56 +08:00
Mark de Wever 773c6e4358 [libc++][doc] Clarify wording on the status page.
Reviewed By: philnik, #libc

Differential Revision: https://reviews.llvm.org/D125630
2022-05-29 15:33:26 +02:00
Mark de Wever 4cb184ce1c [libc++] Adds __format_string as nasty macro.
Both D121530 and D125606 had issues with this macro.

Reviewed By: #libc, philnik

Differential Revision: https://reviews.llvm.org/D125629
2022-05-29 15:32:11 +02:00
Ayke van Laethem 0bd645d370 [libclang] Fix error message capitalization
This was a review suggestion from MaskRay that I forgot to incorporate
in the patch.

See: https://reviews.llvm.org/D124815
2022-05-29 13:42:22 +02:00
Ayke van Laethem 75d12e49c7 [libclang] Fall back to getMainExecutable when dladdr fails
musl-libc doesn't support dladdr in statically linked binaries:

> Are you using static or dynamic linking? If static, dladdr is just a
> stub that always fails. It could be implemented to work under some
> conditions, but it would be highly dependent on what options you
> compile the binary with, since by default static binaries do not
> contain the bloat that would be needed to perform introspection.

Source: https://www.openwall.com/lists/musl/2013/01/15/25 (in response
to a bug report).

Libclang unfortunately uses dladdr to find the ResourcesPath so will
fail if it is linked statically on Alpine Linux. This patch fixes this
issue by falling back to getMainExecutable if dladdr returns an error.
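
The shape of the fallback can be sketched as follows. This is a simplified model, not the libclang code: the two lookups are passed in as callables so the example stays self-contained, an empty string stands for failure, and the names `PathLookup` and `findLibraryPath` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>

// A lookup returns the resolved path, or an empty string on failure.
using PathLookup = std::function<std::string()>;

// Try the dladdr-style lookup first; if it fails (as it always does in a
// statically linked musl binary), fall back to the main-executable lookup.
std::string findLibraryPath(const PathLookup &viaDladdr,
                            const PathLookup &viaMainExecutable) {
  std::string path = viaDladdr();
  if (!path.empty())
    return path;
  return viaMainExecutable();
}
```

When the first lookup succeeds, the fallback is never called, so behavior on dynamically linked builds stays the same.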

Reference: https://github.com/llvm/llvm-project/issues/40641#issuecomment-981011427

Differential Revision: https://reviews.llvm.org/D124815
2022-05-29 13:40:43 +02:00
Nikolas Klauser 7e69bd9bf0 [libc++] Use __enable_if_t and is_integral in cstddef
Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D126469
2022-05-29 12:05:02 +02:00