Commit Graph

399684 Commits

Author SHA1 Message Date
Kazu Hirata 54229cd9e4 [CodeGen] Remove redundant declaration getFileType (NFC) 2021-09-21 09:12:30 -07:00
Sanjay Patel 08ef71ca92 [InstCombine] move/add tests for trunc-of-lshr; NFC
Planning to reframe a proposed transform in terms of
demanded bits as suggested in D110170.
The new tests end with an 'or'.
2021-09-21 12:11:25 -04:00
Kostya Serebryany 11c533e1ea [sanitizer coverage] write the pc-table at the process exit
The current code writes the pc-table at the process startup,
which may happen before the common_flags() are initialized.
Move writing to the process end.
This is consistent with how we write the counters and avoids the problem with the uninitalized flags.
Add prints if verbosity>=1.

Reviewed By: kostik

Differential Revision: https://reviews.llvm.org/D110119
2021-09-21 09:09:25 -07:00
Florian Hahn 5131037ea9
[ValueTracking,VectorCombine] Allow passing DT to computeConstantRange.
isValidAssumeForContext can provide better results with access to the
dominator tree in some cases. This patch adjusts computeConstantRange to
allow passing through a dominator tree.

The use VectorCombine is updated to pass through the DT to enable
additional scalarization.

Note that similar APIs like computeKnownBits already accept optional dominator
tree arguments.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D110175
2021-09-21 16:54:47 +01:00
Michael Liao 5fb3ae525f [SelectionDAG] Re-calculate scoped AA metadata when merging stores.
Reviewed By: jeroen.dobbelaere

Differential Revision: https://reviews.llvm.org/D102821
2021-09-21 11:41:17 -04:00
Aleksandr Bezzubikov 624e4d087e [GlobalISel] Support ConstantAsMetadata in IRTranslator
When using instructions which have a MetadataAsValue argument
(e.g. some target-specific intrinsics) MD canonicalization strips
internal MDNodes with a single ConstantAsMetadata child. That
prevented IRTranslator from the proper translation of such a calls.
2021-09-21 11:24:56 -04:00
Tobias Gysi 8b5236def5 [mlir][linalg] Simplify slice dim computation for fusion on tensors (NFC).
Compute the tiled producer slice dimensions directly starting from the consumer not using the producer at all.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110147
2021-09-21 15:09:46 +00:00
Tobias Gysi 9072f1b5f8 [mlir][linalg] Add isPermutation helper (NFC).
Add a helper method to check if an index vector contains a permutation of its indices. Additionally, refactor applyPermutationToVector to take int64_t.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110135
2021-09-21 15:07:39 +00:00
Dmitry Preobrazhensky 3500e7d2b0 [AMDGPU][MC][GFX7][GFX10] Corrected image_atomic_fcmpswap
Differential Revision: https://reviews.llvm.org/D109616
2021-09-21 18:06:02 +03:00
Petar Avramovic f3366983f0 AMDGPU/GlobalISel: Restore run line erased in D109154 by mistake 2021-09-21 17:03:46 +02:00
Andy Wingo 9ae4275557 [clang][NFC] Fix needless double-parenthisation
Strip a layer of parentheses in TreeTransform::RebuildQualifiedType.

Differential Revision: https://reviews.llvm.org/D108359
2021-09-21 17:03:23 +02:00
David Green a502294b2d [AArch64] Regenerate test lines in and-mask-removal.ll 2021-09-21 15:37:00 +01:00
Nicolas Vasilache 101d017a64 [mlir][Linalg] Revisit heuristic ordering of tensor.insert_slice in comprehensive bufferize.
It was previously assumed that tensor.insert_slice should be bufferized first in a greedy fashion to avoid out-of-place bufferization of the large tensor. This heuristic does not hold upon further inspection.

This CL removes the special handling of such ops and adds a test that exhibits better behavior and appears in real use cases.

The only test adversely affected is an artificial test which results in a returned memref: this pattern is not allowed by comprehensive bufferization in real scenarios anyway and the offending test is deleted.

Differential Revision: https://reviews.llvm.org/D110072
2021-09-21 14:22:45 +00:00
Nicolas Vasilache 0d2c54e851 [mlir][Linalg] Revisit RAW dependence interference in comprehensive bufferize.
Previously, comprehensive bufferize would consider all aliasing reads and writes to
the result buffer and matching operand. This resulted in spurious dependences
being considered and resulted in too many unnecessary copies.

Instead, this revision revisits the gathering of read and write alias sets.
This results in fewer alloc and copies.
An exhaustive test cases is added that considers all possible permutations of
`matmul(extract_slice(fill), extract_slice(fill), ...)`.
2021-09-21 14:22:22 +00:00
Tobias Gysi c8eed8f9a7 [mlir][linalg] Assert tile loop nest invariants in fusion.
Assert the tile loop nest invariants are satisfied instead of failing silently.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D110137
2021-09-21 14:20:57 +00:00
Chris Bieneman 744ec74b30 [NFC] `goto fail` has failed us in the past...
This patch replaces reliance on `goto failure` pattern with
`llvm::scope_exit`.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D109865
2021-09-21 09:18:37 -05:00
Ben Shi b3052013b4 [RISCV] Optimize (add (mul x, c0), c1)
Optimize (add (mul x, c0), c1) -> (ADDI (MUL (ADDI, c1/c0), c0), c1%c0),
if c1/c0 and c1%c0 are simm12, while c1 is not.

Optimize (add (mul x, c0), c1) -> (MUL (ADDI, c1/c0), c0),
if c1%c0 is zero, and c1/c0 is simm12 while c1 is not.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D108607
2021-09-21 14:13:14 +00:00
Justas Janickas 32b994bca6 [OpenCL] Defines helper function for OpenCL default address space
Helper function `getDefaultOpenCLPointeeAddrSpace()` introduced to
`ASTContext` class. It returns default OpenCL address space
depending on language version and enabled features. If generic
address space is supported, the helper function returns value
`LangAS::opencl_generic`. Otherwise, value `LangAS::opencl_private`
is returned. Code refactoring changes performed in several suitable
places.

Differential Revision: https://reviews.llvm.org/D109874
2021-09-21 15:12:08 +01:00
Anna Thomas 69921f6f45 [InstCombine] Improve TryToSinkInstruction with multiple uses
This patch allows sinking an instruction which can have multiple uses in a
single user. We were previously over-restrictive by looking for exactly one use,
rather than one user.

Also added an API for retrieving a unique undroppable user.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D109700
2021-09-21 10:04:04 -04:00
Saiyedul Islam ee31ad0ab5 [clang-offload-bundler][docs][NFC] Add archive unbundling documentation
Add documentation of unbundling of heterogeneous device archives to
create device specific archives, as introduced by D93525. Also, add
documentation for supported text file formats.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D110083
2021-09-21 19:24:44 +05:30
OGINO Masanori 17a26f5851 [NFC] Update the list of subprojects in docs.
The updated list is based on the output of
cmake -G Ninja -S llvm -B build -DLLVM_ENABLE_PROJECTS='foo'.

Differential Revision: https://reviews.llvm.org/D110124
2021-09-21 17:27:13 +02:00
Sanjay Patel af1c5312d7 [InstCombine] add tests for mask-shift with trunc; NFC 2021-09-21 09:41:41 -04:00
Dmitry Preobrazhensky b8e7f53208 [AMDGPU][MC][GFX10] Enabled dlc for FLAT and GLOBAL atomics
Differential Revision: https://reviews.llvm.org/D109614
2021-09-21 16:23:20 +03:00
hyeongyu kim 043733d677 [IR] Add the constructor of ShuffleVector for one-input-vector.
One of the two inputs of the Shufflevector is often a placeholder.
Previously, there were cases where the placeholder was undef, and there were cases where it was poison.
I added these constructors to create a placeholder consistently.

Changing to use the newly added constructor will be written in a separate patch.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D110146
2021-09-21 22:06:07 +09:00
Nico Weber e9ea03c62c [llvm] Pass LLVM_CHECK_ENABLED_PROJECTS through in cross builds 2021-09-21 09:01:37 -04:00
Jonas Paulsson a48b43f981 [SystemZ] Emit EXRL target instructions before text section is ended.
SystemZ adds the EXRL target instructions in the end of each file. This must
be done before debug info emission since that may end the text section, and
therefore this is now done in emitConstantPools() (instead of in
emitEndOfAsmFile).

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D109513
2021-09-21 14:32:28 +02:00
Florian Hahn ea27dd7497
[VectorCombine] Add tests which require DT to use info from assumes. 2021-09-21 13:07:06 +01:00
Nicholas Guy 9e4d72675f [AArch64] Improve schedule modelling on the Cortex-A55
Enables the FuseAddress feature in the Cortex-A55 scheduling model

Differential Revision: https://reviews.llvm.org/D109323
2021-09-21 13:03:34 +01:00
Simon Pilgrim fc8f1e4419 [InstCombine] foldConstantInsEltIntoShuffle - bail if we fail to find constant element (PR51824)
If getAggregateElement() returns null for any element, early out as otherwise we will assert when creating a new constant vector

Fixes PR51824 + ; OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38057
2021-09-21 13:01:09 +01:00
Simon Pilgrim 20b58855e0 [CodeGen] SelectionDAGBuilder - Use const-ref iterator in for-range loops. NFCI.
Avoid unnecessary copies, reported by MSVC static analyzer.
2021-09-21 13:01:08 +01:00
Simon Pilgrim f5d23d36de RewriteStatepointsForGC - Use const-ref iterator in for-range loops. NFCI.
Avoid unnecessary copies, reported by MSVC static analyzer.
2021-09-21 13:01:08 +01:00
Simon Pilgrim 0f83456cf5 [CodeGen] SDDbgValue::getSDNodes() - use const-ref to avoid unnecessary copies. NFCI.
Reported by MSVC static analyzer.
2021-09-21 13:01:08 +01:00
Dmitry Vyukov 9d7b7350c9 tsan: simplify thread context setting
Currently we set thr->tctx after OnStarted callback
taking thread registry mutex again and searching for the context.
But OnStarted already runs under the thread registry mutex
and has access to the context, so set it in the OnStarted.
This makes code simpler and faster.

Depends on D110132.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D110133
2021-09-21 13:26:55 +02:00
Dmitry Vyukov 908256b0ea tsan: rearrange thread state callbacks (NFC)
Thread state functions are split into 2 parts:
tsan entry function (e.g. ThreadStart) and thread registry
state change callback (e.g. OnStart). Currently these
pairs of functions are located far from each other and
in reverse order. This makes it hard to read and follow the logic.
Reorder the code so that OnFoo directly follows ThreadFoo.
No other code changes.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D110132
2021-09-21 13:26:36 +02:00
Dmitry Vyukov 6fe35ef419 tsan: fix debug format strings
Some of the DPrintf's currently produce -Wformat warnings if enabled.
Fix these format strings.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D110131
2021-09-21 13:23:10 +02:00
Jay Foad 598bebeaa6 [AMDGPU] Prefer fmac over fma when selecting FMA_W_CHAIN
FMA_W_CHAIN is used when lowering fdiv f32. Prefer to select it to fmac
if there are no source modifiers, just like we do for other mad/mac and
fma/fmac cases.

Differential Revision: https://reviews.llvm.org/D110074
2021-09-21 11:57:45 +01:00
Jay Foad 86dcb59206 [AMDGPU] Prefer v_fmac over v_fma only when no source modifiers are used
v_fmac with source modifiers forces VOP3 encoding, but it is strictly
better to use the VOP3-only v_fma instead, because $dst and $src2 are
not tied so it gives the register allocator more freedom and avoids a
copy in some cases.

This is the same strategy we already use for v_mad vs v_mac and
v_fma_legacy vs v_fmac_legacy.

Differential Revision: https://reviews.llvm.org/D110070
2021-09-21 11:57:45 +01:00
David Green e83629280f [AArch64] Regenerate test lines in sve-implicit-zero-filling.ll 2021-09-21 11:44:41 +01:00
Max Kazantsev cd166fb2ef [SCEV] Use isAvailableAtLoopEntry in the asserts
This is what is supposed to be there.
2021-09-21 17:11:15 +07:00
Petar Avramovic 8bc7185668 GlobalISel/Utils: Refactor constant splat match functions
Add generic helper function that matches constant splat. It has option to
match constant splat with undef (some elements can be undef but not all).
Add util function and matcher for G_FCONSTANT splat.

Differential Revision: https://reviews.llvm.org/D104410
2021-09-21 12:09:35 +02:00
Max Kazantsev 4d5d725428 [SCEV] Add some asserts on availability of arguments of isLoopEntryGuardedByCond
The logic in howManyLessThans is fishy. It first checks invariance of
RHS, and then uses OrigRHS as argument for isLoopEntryGuardedByCond, which
is, strictly saying, a different thing. We are seeing a very rare intermittent
failure of availability checks, and it looks like this precondition is
sometimes broken. Before we can figure out what's going on, adding asserts
that all involved values that may possibly to to isLoopEntryGuardedByCond
are available at loop entry.

If either of these asserts fails (OrigRHS is the most likely suspect), it
means that the logic here is flawed.
2021-09-21 17:08:52 +07:00
David Stenberg 7b4cc09b14 [LowerConstantIntrinsics] Fix heap-use-after-free bug in worklist
This fixes PR51730, a heap-use-after-free bug in
replaceConditionalBranchesOnConstant().

With the attached reproducer we were left with a function looking
something like this after replaceAndRecursivelySimplify():

  [...]

  cont2.i:
    br i1 %.not1.i, label %handler.type_mismatch3.i, label %cont4.i

  handler.type_mismatch3.i:
    %3 = phi i1 [ %2, %cont2.thread.i ], [ false, %cont2.i ]
    unreachable

  cont4.i:
    unreachable

  [...]

with both the branch instruction and PHI node being in the worklist. As
a result of replacing the branch instruction with an unconditional
branch, the PHI node in %handler.type_mismatch3.i would be removed. This
then resulted in a heap-use-after-free bug due to accessing that removed
PHI node in the next worklist iteration.

This is solved by using a value handle worklist. I am a unsure if this
is the most idiomatic solution. Another solution could have been to
produce a worklist just containing the interesting branch instructions,
but I thought that it perhaps was a bit cleaner to keep all worklist
filtering in the loop that does the rewrites.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D109221
2021-09-21 11:33:07 +02:00
Justas Janickas 57b8b5c114 [OpenCL] Test case for C++ for OpenCL 2021 in OpenCL C header test
RUN line representing C++ for OpenCL 2021 added to the test. This
should have been done as part of earlier commit fb321c2ea2 but
was missed during rebasing.

Differential Revision: https://reviews.llvm.org/D109492
2021-09-21 10:27:46 +01:00
Uday Bondhugula 5c77ed0330 [MLIR] NFC. gpu.launch op argument const folder cleanup
NFC updates to gpu.launch op argument const folder.

Differential Revision: https://reviews.llvm.org/D110136
2021-09-21 14:30:03 +05:30
Andrzej Warzynski 7e7484a816 [flang][docs] Document plugin limitations
This was extracted from the discussion on
https://reviews.llvm.org/D108283.

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>

Differential Revision: https://reviews.llvm.org/D109871
2021-09-21 08:51:12 +00:00
Sylvestre Ledru eccd477ce3 Add CMAKE_BUILD_TYPE to the list of BOOTSTRAP_DEFAULT_PASSTHROUGH variables
When building clang in stage2, when -DCMAKE_BUILD_TYPE=RelWithDebInfo is set,
the developer can expect that the stage2 clang is built using the same mode.
Especially as the performances are much worst in debug mode.
(Principle of least astonishment)

Differential Revision: https://reviews.llvm.org/D53014
2021-09-21 10:44:08 +02:00
Cullen Rhodes b23d22f7d5 [PowerPC] NFC: Remove unused tblgen template args
Identified in D109359.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D109715
2021-09-21 08:24:16 +00:00
Morten Borup Petersen 032cb1650f [MLIR][SCF] Add for-to-while loop transformation pass
This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.

This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).

Differential Revision: https://reviews.llvm.org/D108454
2021-09-21 09:09:54 +01:00
Pavel Labath 791b6ebc86 [lldb] Speculative fix to TestGuiExpandThreadsTree
This test relies on being able to unwind from an arbitrary place inside
libc. While I am not sure this is the cause of the observed flakyness,
it is known that we are not able to unwind correctly from some places in
(linux) libc.

This patch adds additional synchronization to ensure that the inferior
is in the main function (instead of pthread guts) when lldb tries to
unwind it. At the very least, it should make the test runs more
predictable/repeatable.
2021-09-21 10:01:00 +02:00
Kunwar Shaanjeet Singh Grover 0d12c99191 [MLIR] Add mergeLocalIds and mergeSymbolIds
This patch adds mergeLocalIds andmergeSymbolIds as public functions
for FlatAffineConstraints and FlatAffineValueConstraints respectively.

mergeLocalIds is also required to support divisions in intersection,
subtraction, equality checks, and complement for PresburgerSet.

This patch is part of a series of patches aimed at generalizing affine
dependence analysis.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D110045
2021-09-21 13:02:23 +05:30