Commit Graph

421707 Commits

Author SHA1 Message Date
Jannik Silvanus 607f8ced39 [AMDGPU]: Fix failing assertion in SIMachineScheduler
This fixes the assertion failure "Loop in the Block Graph!".

SIMachineScheduler groups instructions into blocks (also referred to
as coloring or groups) and then performs a two-level scheduling:
inter-block scheduling, and intra-block scheduling.

This approach requires that the dependency graph on the blocks which
is obtained by contracting the blocks in the original dependency graph
is acyclic. In other words: Whenever A and B end up in the same block,
all vertices on a path from A to B must be in the same block.
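
The acyclicity requirement can be illustrated with a minimal C++ sketch. This is
not SIMachineScheduler code: the function name isBlockGraphAcyclic, the edge-list
representation, and the integer block ids are assumptions made for illustration.
It contracts instruction-level edges into block-level edges and then runs Kahn's
algorithm to detect a cycle.

```cpp
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper: Edges are (from, to) dependencies between instruction
// indices, BlockOf[i] is the block id assigned to instruction i.
bool isBlockGraphAcyclic(const std::vector<std::pair<int, int>> &Edges,
                         const std::vector<int> &BlockOf, int NumBlocks) {
  // Contract the instruction graph: keep one edge per pair of distinct blocks.
  std::set<std::pair<int, int>> BlockEdges;
  for (const auto &E : Edges) {
    int From = BlockOf[E.first];
    int To = BlockOf[E.second];
    if (From != To)
      BlockEdges.insert({From, To});
  }

  std::vector<std::vector<int>> Succs(NumBlocks);
  std::vector<int> InDegree(NumBlocks, 0);
  for (const auto &E : BlockEdges) {
    Succs[E.first].push_back(E.second);
    ++InDegree[E.second];
  }

  // Kahn's algorithm: every block is visited iff the graph has no cycle.
  std::queue<int> Ready;
  for (int B = 0; B < NumBlocks; ++B)
    if (InDegree[B] == 0)
      Ready.push(B);

  int Visited = 0;
  while (!Ready.empty()) {
    int B = Ready.front();
    Ready.pop();
    ++Visited;
    for (int S : Succs[B])
      if (--InDegree[S] == 0)
        Ready.push(S);
  }
  return Visited == NumBlocks;
}
```

For a chain 0 -> 1 -> 2 where instruction 1 sits alone in block 1 and
instructions 0 and 2 share block 0, the contracted graph has edges in both
directions between the two blocks, so the check fails.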

When compiling an example consisting of an export followed by
a buffer store, we see a dependency between these two. This dependency
may be false, but that is a different issue.
This dependency was not correctly accounted for by SIMachineScheduler.

A new test case si-scheduler-exports.ll demonstrating this is
also added in this commit.

The problematic part of SIMachineScheduler was a post-optimization of
the block assignment that tried to group all export instructions into
a separate export block for better execution performance. This routine
correctly checked that any paths from exports to exports did not
contain any non-exports, but not vice-versa: In case of an export with
a non-export successor dependency, that single export was moved
to a separate block, which could then be both a successor and a
predecessor block of a non-export block.

As a fix, we now skip export grouping if there are exports with direct
non-export successor dependencies. This fixes the issue at hand,
but is slightly pessimistic:
We *could* group into a separate block all exports that have neither
direct nor indirect non-export successor dependencies.
We will review the potential performance impact and potentially
revisit with a more sophisticated implementation.

Note that just grouping all exports without direct non-export successor
dependencies could still lead to illegal blocks, since non-export A
could depend on export B that depends on export C. In that case,
export C has no non-export successor, but still may not be grouped
into an export block.
2022-04-21 14:52:29 +01:00
Luo, Yuanke fa4347261e [X86] Add test case for SetCCMOVMSK combine.
Create 2 users for MOVMSK to test if compiler would perform the combine
"MOVMSK(CONCAT(X,Y)) == 0 ->  MOVMSK(OR(X,Y))".
2022-04-21 21:47:40 +08:00
Nikita Popov 662f57ee21 [InstCombine] Add tests for memset with undef/poison value (NFC) 2022-04-21 15:45:54 +02:00
Nikita Popov 9001edc535 [InstCombine] Split up test for store with undef (NFC) 2022-04-21 15:41:38 +02:00
Markus Böck 850b2c6b3c [mlir] Fix `Region`s `takeBody` method if the region is not empty
The current implementation of takeBody first clears the Region, before then taking ownership of the blocks of the other regions. The issue here however, is that when clearing the region, it does not take into account references of operations to each other. In particular, blocks are deleted from front to back, and operations within a block are very likely to be deleted despite still having uses, causing an assertion to trigger [0].

This patch fixes that issue by simply calling dropAllReferences() before clearing the blocks.

[0] 9a8bb4bc63/mlir/lib/IR/Operation.cpp (L154)

Differential Revision: https://reviews.llvm.org/D123913
2022-04-21 15:32:59 +02:00
Fabian Wolff 95d77383f2 [clang-tidy] Fix behavior of `modernize-use-using` with nested structs/unions
Fixes https://github.com/llvm/llvm-project/issues/50334.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D113804
2022-04-21 15:18:31 +02:00
Andrew Savonichev 96e7487013 [NVPTX] Fix LIT tests with default nameTableKind
Default nameTableKind results in the following DWARF section:

    .section .debug_pubnames
    {
      .b32 LpubNames_end0-LpubNames_start0    // Length of Public Names Info
      LpubNames_start0:
      [...]
      LpubNames_end0:
    }

Without -mattr=+ptx75 ptxas complains about labels and label
expressions:

error   : Feature 'labels1 - labels2 expression in .section' requires
PTX ISA .version 7.5 or later
error   : Feature 'Defining labels in .section' requires PTX ISA
.version 7.0 or later

The patch modifies dbg-value-const-byref.ll to let it run without PTX
7.5 (available from CUDA 11.0), and adds a new test just for this
case.

Differential revision: https://reviews.llvm.org/D124108
2022-04-21 16:05:25 +03:00
Simon Pilgrim ac213375d9 [InstCombine] Add nonpow2 (negative) test for D123374 2022-04-21 13:58:43 +01:00
Aaron Ballman 408226f20a Fix Sphinx build 2022-04-21 08:52:29 -04:00
Nikita Popov 20cf4f8af8 [PhaseOrdering] Remove RUN lines for legacy PM (NFC) 2022-04-21 14:43:00 +02:00
wangpc b1620d40d0 Revert "[RISCV] Precommit test for D122634"
This reverts commit 360d44e86d.
2022-04-21 20:32:56 +08:00
Nikolas Klauser 29c8c070a1 [libc++] Use bit field for checking if string is in long or short mode
This makes the code a bit simpler and (I think) removes the undefined behaviour from the normal string layout.

Reviewed By: ldionne, Mordante, #libc

Spies: labath, dblaikie, JDevlieghere, krytarowski, jgorbe, jingham, saugustine, arichardson, libcxx-commits

Differential Revision: https://reviews.llvm.org/D123580
2022-04-21 14:20:21 +02:00
Pavel Labath 1056c56786 [lldb] Adjust libc++ string formatter for changes in D123580
The code needs more TLC, but for now I've tried making only the changes
that are necessary to get the tests passing -- postponing the more
invasive changes until after I create a more comprehensive test.

In a couple of places I have changed the index-based element accesses to
name-based ones (as these are less sensitive to code perturbations). I'm
not sure why the code was using indexes in the first place, but I've
(manually) tested the change with various libc++ versions, and found no
issues with this approach.

Differential Revision: https://reviews.llvm.org/D124113
2022-04-21 14:07:56 +02:00
Nikola Tesic c5600aef88 [Debugify] Limit number of processed functions for original mode
Debugify in OriginalDebugInfo mode performs a (DebugInfo) collect-before-pass and check-after-pass
for each instruction, which is pretty expensive. When used to analyze DebugInfo losses
in large projects (like LLVM), this raises the build time unacceptably.
This patch introduces a limit for the number of processed functions per compile unit.
By default, the limit is set to UINT_MAX (practically unlimited); the newly introduced
option -debugify-func-limit sets the limit to any positive integer.

Differential revision: https://reviews.llvm.org/D115714
2022-04-21 13:58:17 +02:00
Markus Böck a41aaf166f [mlir] Make `Region`s `cloneInto` multithread-readable
Prior to this patch, `cloneInto` would do a simple walk over the blocks and contained operations and clone and map them as it encounters them. As a finishing touch, it then remaps any successors and operands it has mapped during that process.

This is generally fine, but sadly leads to a lot of uses of both operations and blocks from the source region, in the cloned operations in the target region. Those uses lead to writes in the use-def list of the operations, making `cloneInto` never thread safe.

This patch reimplements `cloneInto` in three steps to avoid ever creating any extra uses on elements in the source region:
* It first creates the mapping of all blocks and block operands
* It then clones all operations to create the mapping of all operation results, but does not yet clone any regions or set the operands
* After all operation results have been mapped, it now sets the operations' operands and clones their regions.
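
The phased approach can be sketched in plain C++ on a toy operation graph. This
is a simplified illustration and not the MLIR implementation: the `Op` struct and
the `cloneOps` function are hypothetical, and blocks, regions, and results are
all collapsed into a single node type. The point it shows is that the source is
only ever read, and that clones are wired to other clones only after the full
mapping exists.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an operation: an id plus the ops it uses.
struct Op {
  int Id = 0;
  std::vector<const Op *> Operands;
};

std::vector<std::unique_ptr<Op>>
cloneOps(const std::vector<std::unique_ptr<Op>> &Source) {
  std::unordered_map<const Op *, Op *> Mapping;
  std::vector<std::unique_ptr<Op>> Clones;

  // Phase 1: create every clone without operands and record the mapping.
  for (const auto &Src : Source) {
    auto Clone = std::make_unique<Op>();
    Clone->Id = Src->Id;
    Mapping[Src.get()] = Clone.get();
    Clones.push_back(std::move(Clone));
  }

  // Phase 2: set operands using the complete mapping, so a clone never
  // temporarily refers to an op inside the source. Operands defined outside
  // the source are kept as-is.
  for (std::size_t I = 0; I < Source.size(); ++I) {
    for (const Op *Operand : Source[I]->Operands) {
      auto It = Mapping.find(Operand);
      Clones[I]->Operands.push_back(It != Mapping.end() ? It->second : Operand);
    }
  }
  return Clones;
}
```

Because the mapping is complete before any operand is set, forward references
(an op using one that appears later in the list) are handled without a separate
fix-up pass.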

That way it is now possible to call `cloneInto` from multiple threads if the Region or Operation is isolated-from-above. This allows creating copies of functions or using `mlir::inlineCall` with the same source region from multiple threads. In the general case, the method is thread-safe if through cloning, no new uses of `Value`s from outside the cloned Operation/Region are created. This can be ensured by mapping any outside operands via the `BlockAndValueMapping` to `Value`s owned by the caller thread.

While I was at it, I also reworked the `clone` method of `Operation` a little bit and added a proper options class to avoid having a `cloneWithoutRegionsAndOperands` method, and be more extensible in the future. `cloneWithoutRegions` is now also a simple wrapper that calls `clone` with the proper options set. That way all the operation cloning code is now contained solely within `clone`.

Differential Revision: https://reviews.llvm.org/D123917
2022-04-21 13:43:00 +02:00
Hui Xie 3d3103b733 [libcxx][ranges] add views::join adaptor object. added test coverage to join_view
- added views::join adaptor object
- added test for the adaptor object
- fixed some join_view's tests, e.g. the iter_swap test
- added some negative tests for join_view to test that operations do not exist when constraints aren't met
- added tests that lock down issues that were already addressed in a previous change
  - LWG3500 `join_view::iterator::operator->()` is bogus
  - LWG3313 `join_view::iterator::operator--` is incorrectly constrained
  - LWG3517 `join_view::iterator`'s `iter_swap` is underconstrained
  - P2328R1 join_view should join all views of ranges
- fixed some issues in join_view and added tests
  - LWG3535 `join_view::iterator::iterator_category` and `::iterator_concept` lie
  - LWG3474 Nesting `join_view`s is broken because of CTAD
- added tests for an LWG issue that isn't resolved in the standard yet, but for which the previous code has a workaround.
  - LWG3569 Inner iterator not default_initializable

Reviewed By: #libc, var-const

Spies: var-const, libcxx-commits

Differential Revision: https://reviews.llvm.org/D123466
2022-04-21 13:10:46 +02:00
Dmitry Preobrazhensky 81af32b9a3 [AMDGPU][MC][NFC][GFX940] Corrected an error position
Differential Revision: https://reviews.llvm.org/D124099
2022-04-21 14:04:46 +03:00
Uday Bondhugula f47a38f517 Add async dependencies support for gpu.launch op
Add async dependencies support for gpu.launch op: this allows specifying
a list of async tokens ("streams") as dependencies for the launch.

Update the GPU kernel outlining pass lowering to propagate async
dependencies from gpu.launch to gpu.launch_func op. Previously, a new
stream was being created and destroyed for a kernel launch. The async
deps support allows the kernel launch to be serialized on an existing
stream.

Differential Revision: https://reviews.llvm.org/D123499
2022-04-21 16:25:59 +05:30
Alexey Moksyakov 48e894a536 [BOLT] Add R_AARCH64_PREL16/32/64 relocations support
Reviewed By: yota9, rafauler

Differential Revision: https://reviews.llvm.org/D122294
2022-04-21 13:52:47 +03:00
Vladislav Khmelevsky 63686af1e1 [BOLT] Fix build with GCC 7.3.0
GCC 7.3.0 raises a "could not convert" error unless std::move is
used explicitly.

Differential Revision: https://reviews.llvm.org/D124009
2022-04-21 13:47:58 +03:00
Dmitry Preobrazhensky b4231ac4be [AMDGPU][GFX90A+] Disabled ds_ordered_count and exp
Differential Revision: https://reviews.llvm.org/D124087
2022-04-21 13:16:44 +03:00
Daniil Dudkin 488b9fd103 [flang] Do not ICE on recursive function definition in function result
The following code causes the compiler to ICE in several places due to
lack of support of recursive procedure definitions through the function
result.

  function foo() result(r)
    procedure(foo), pointer :: r
  end function foo
2022-04-21 19:04:17 +09:00
Daniil Dudkin 5e49008b58 [NFC] Test commit
An empty commit to test the access
2022-04-21 18:58:41 +09:00
Sven van Haastregt 87a258366e [OpenCL] Guard read_write images with TypeExtension
Ensure that any `read_write` image type carries the
`__opencl_c_read_write_images` upon construction of the `ImageType`.
2022-04-21 10:52:41 +01:00
Haojian Wu 82cddb173f [clangd] tweak title should start with a capital letter.
to be consistent with other tweaks.
2022-04-21 11:24:02 +02:00
Nikita Popov 3df86e799e [SimplifyCFG] Handle branch on same condition in pred more directly
Rather than creating a PHI node and then using the PHI threading
code, directly handle this case in
FoldCondBranchOnValueKnownInPredecessor().

This change is supposed to be NFC-ish, but may cause changes due
to different transform order.
2022-04-21 11:22:02 +02:00
Haojian Wu 1234b1c6d8 [AST] Support template declaration found through using-decl for QualifiedTemplateName.
This is a followup of https://reviews.llvm.org/D123127, adding support
for the QualifiedTemplateName.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D123775
2022-04-21 10:53:23 +02:00
Nikita Popov 8988254667 [SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension
This general threading transform can be performed whenever we know
a constant value for the condition in a predecessor, which would
currently just be the case of a phi node with constant arguments.
2022-04-21 10:49:49 +02:00
gpei-dev 3e6b904f0a Force insert zero-idiom and break false dependency of dest register for several instructions.
The related instructions are:

VPERMD/Q/PS/PD
VRANGEPD/PS/SD/SS
VGETMANTSS/SD/SH
VGETMANTPS/PD - mem version only
VPMULLQ
VFMULCSH/PH
VFCMULCSH/PH

Differential Revision: https://reviews.llvm.org/D116072
2022-04-21 16:47:13 +08:00
Nikita Popov 15fc293b11 Revert "[GVNSink] Regenerate test checks (NFC)"
This reverts commit 3b13230072.

It looks like GVNSink is currently non-deterministic, due to an
std::sort() on BasicBlock* pointers in ModelledPHI. This becomes
visible in the generated checks.
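
Why sorting pointers makes output non-deterministic can be shown with a small
C++ sketch. It is not GVNSink code and not a fix from this commit (which only
reverts the test update): the `Block` struct, its `Number` field, and the
function `orderByNumber` are hypothetical. Addresses can differ from run to run,
so an order derived from them can differ too; ordering by a stable key stored in
the object does not have that problem.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a basic block with a stable, run-independent key.
struct Block {
  int Number;
};

// Returns the block numbers in the order produced by sorting on the stable
// key. Sorting the raw pointers instead would order by address, which
// depends on where the allocator happened to place each block.
std::vector<int> orderByNumber(std::vector<const Block *> Blocks) {
  std::sort(Blocks.begin(), Blocks.end(),
            [](const Block *A, const Block *B) {
              return A->Number < B->Number;
            });
  std::vector<int> Order;
  for (const Block *B : Blocks)
    Order.push_back(B->Number);
  return Order;
}
```

The result depends only on the `Number` values, so it is the same regardless of
the order in which the blocks were allocated.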
2022-04-21 10:46:34 +02:00
wangpc ce83883691 Revert "[RISCV] Do not outline CFI instructions when they are needed in EH"
This reverts commit 0d40688925.
2022-04-21 16:23:10 +08:00
wangpc 0d40688925 [RISCV] Do not outline CFI instructions when they are needed in EH
We saw a failure caused by unwinding with incomplete CFIs, so we
can't outline CFI instructions when they are needed in EH.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D122634
2022-04-21 16:13:22 +08:00
wangpc 360d44e86d [RISCV] Precommit test for D122634
Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D123364
2022-04-21 16:08:40 +08:00
Nikita Popov 3b13230072 [GVNSink] Regenerate test checks (NFC) 2022-04-21 10:07:09 +02:00
Tobias Hieta 334522ca58 [CMake] Check for problematic MSVC + /arch:AVX configuration
Add a new CMake file to expand on for more problematic configurations
in the future.

Related to #54645

Reviewed By: beanz, phosek, smeenai

Differential Revision: https://reviews.llvm.org/D123777
2022-04-21 09:46:44 +02:00
Chuanqi Xu 7eaa84eac3 [NFC] Code cleanups for coroutine after we removed legacy passes 2022-04-21 15:32:46 +08:00
Nimish Mishra 00c511b351 Added lowering support for atomic read and write constructs
This patch adds lowering support for atomic read and write constructs.
Also added is pointer modelling code to allow FIR pointer like types to
be inferred and converted while lowering.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D122725

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
2022-04-21 12:19:13 +05:30
Xiang Li b02d88d5af [HLSL] Add shader attribute
The shader attribute lets a shader library identify its entry functions.
Here's an example:

[shader("pixel")]
float ps_main() : SV_Target {
  return 1;
}

When compiling this shader to a library target like -E lib_6_3, the compiler needs to know that ps_main is an entry function for a pixel shader. The shader attribute provides this information.

A new attribute HLSLShader is added to support the shader attribute. It has an EnumArgument which includes all possible shader stages.

Reviewed By: aaron.ballman, MaskRay

Differential Revision: https://reviews.llvm.org/D123907
2022-04-20 23:46:43 -07:00
Fraser Cormack 3e678cb772 [RISCV] Don't emit fractional VIDs with negative steps
We can't shift-right negative numbers to divide them, so avoid emitting
such sequences. Use negative numerators as a proxy for this situation, since
the indices are always non-negative.

An alternative strategy could be to add a compiler flag to emit division
instructions, which would at least allow us to test the VID sequence
matching itself.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123796
2022-04-21 07:00:34 +01:00
Fraser Cormack 627e21048a [RISCV] Add another test showing incorrect BUILD_VECTOR lowering
This test shows a (contrived) BUILD_VECTOR which is correctly identified
as a sequence of ((vid * -3) / 8) + 5. However, the issue is that using
shift-right for the divide is invalid as the step values are negative.
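
The mismatch can be reproduced with scalar arithmetic. The sketch below is only
an illustration of the arithmetic, not the RISC-V lowering code: all function
names are hypothetical, `floorDiv8` models what an arithmetic shift right by 3
computes, and the logical shift is done on the unsigned bit pattern.

```cpp
#include <cstdint>

// Reference semantics: signed division truncates toward zero.
int32_t elemWithDivide(int32_t Vid) { return (Vid * -3) / 8 + 5; }

// Models an arithmetic shift right by 3, which rounds toward negative
// infinity instead of toward zero.
int32_t floorDiv8(int32_t X) { return X >= 0 ? X / 8 : -((-X + 7) / 8); }

int32_t elemWithArithmeticShift(int32_t Vid) {
  return floorDiv8(Vid * -3) + 5;
}

// A logical shift right treats the negative product as a large unsigned
// value, so the result is not close to the quotient at all.
int32_t elemWithLogicalShift(int32_t Vid) {
  uint32_t Bits = static_cast<uint32_t>(Vid * -3);
  return static_cast<int32_t>(Bits >> 3) + 5;
}
```

For vid = 1 the product is -3: the truncating divide gives 0 and an element
value of 5, the arithmetic-shift model gives -1 and an element value of 4, and
the logical shift gives a large positive number. Only vid = 0 agrees across all
three.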

This patch just adds the test: the fix is added in D123796.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D123989
2022-04-21 06:55:13 +01:00
Fangrui Song f4a3569d0a [ELF] Fix spurious GOT/PLT assertion failure when .dynsym is discarded
Linux kernel arch/arm64/kernel/vmlinux.lds.S discards .dynsym. D123985 triggers
a spurious assertion failure. Detect the case with
`!mainPart->dynSymTab->getParent()`.
2022-04-20 22:49:49 -07:00
River Riddle 0fd3a1ce60 [mlir][NFC] Update remaining textual references of un-namespaced `func` operations
The special case parsing of operations in the `func` dialect is being removed, and
operations will require the dialect namespace prefix.
2022-04-20 22:17:31 -07:00
River Riddle cda6aa78f8 [mlir][NFC] Update textual references of `func` to `func.func` in Transform tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:30 -07:00
River Riddle a4936cb3e8 [mlir][NFC] Update textual references of `func` to `func.func` in Pass/Target tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:30 -07:00
River Riddle 63237cddc1 [mlir][NFC] Update textual references of `func` to `func.func` in tool/runner tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:30 -07:00
River Riddle 6a99d29022 [mlir][NFC] Update textual references of `func` to `func.func` in IR/Interface tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:30 -07:00
River Riddle 87db8e4439 [mlir][NFC] Update textual references of `func` to `func.func` in Integration tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:29 -07:00
River Riddle c48e3a13f3 [mlir][NFC] Update textual references of `func` to `func.func` in Tensor/Tosa/Vector tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:29 -07:00
River Riddle 2c7836ef15 [mlir][NFC] Update textual references of `func` to `func.func` in SPIRV tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:29 -07:00
River Riddle fb35cd3baf [mlir][NFC] Update textual references of `func` to `func.func` in SparseTensor tests
The special case parsing of `func` operations is being removed.
2022-04-20 22:17:29 -07:00