Commit Graph

417111 Commits

Author SHA1 Message Date
Mehdi Amini e6e36b9c20 Apply clang-tidy fixes for modernize-loop-convert to MLIR (NFC) 2022-03-07 10:41:44 +00:00
Mehdi Amini cfdf9747bf Apply clang-tidy fixes for llvm-qualified-auto to MLIR (NFC) 2022-03-07 10:41:44 +00:00
Mehdi Amini 393c6db7a1 Apply clang-tidy fixes for bugprone-macro-parentheses to MLIR (NFC) 2022-03-07 10:41:44 +00:00
Nikita Popov 1bd33691cb [CoroElide] Remove fallback for frame layout determination
Only determine the frame layout based on dereferenceable and align
attributes, and remove the type-based fallback, which is incompatible
with opaque pointers. The dereferenceable attribute is required,
while the align attribute uses default alignment of 1 (commonly,
align 1 attributes do not get placed, relying on default alignment).

The CoroSplit pass producing the resume function adds the necessary
attributes in 7daed35911/llvm/lib/Transforms/Coroutines/CoroSplit.cpp (L840),
and their presence is checked in coro-debug.ll at least.

Differential Revision: https://reviews.llvm.org/D120988
2022-03-07 11:23:02 +01:00
Jan Svoboda 2d26f163f6 [clang][modules] Fix failing test
This test started failing on Windows after b45888e959 due to path separators not matching up.
2022-03-07 11:21:21 +01:00
Simon Atanasyan 7daed35911 Remove Simon Atanasyan from the code owners list. MIPS Backend. 2022-03-07 13:17:08 +03:00
Nikita Popov 9bca4ea364 [Coroutines] Allow FramePtr to be an Argument
With opaque pointers, after splitRetconCoroutine() the FramePtr
may be an Argument rather than an Instruction. With typed pointers,
this currently doesn't happen because the FramePtr would be a
bitcast instruction.

Fix this by making FramePtr a Value and adding a helper for the
"after FramePtr" insertion point, which would be the start of the
function in the Argument case.

Differential Revision: https://reviews.llvm.org/D120994
2022-03-07 10:58:56 +01:00
Jan Svoboda b45888e959 [clang][modules] Report module maps affecting `no_undeclared_includes` modules
Since D106876, PCM files don't report module maps as input files unless they contributed to the compilation.

Reporting only module maps of (transitively) imported modules is not enough, though. For modules marked with `[no_undeclared_includes]`, other module maps affect the compilation by introducing anti-dependencies.

This patch makes sure such module maps are being reported as input files.

Depends on D120463.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120464
2022-03-07 10:47:46 +01:00
Jan Svoboda 242b24c184 [clang][modules] NFC: Simplify and clarify test
This patch simplifies a test that checks only used module map files are reported as input files in PCM files.

Instead of using opaque `diff`, this patch uses `clang -module-file-info` and `FileCheck` to verify this.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120463
2022-03-07 10:47:46 +01:00
David Green d9633d1490 [AArch64] Turn truncating buildvectors into truncates
When lowering large v16f32->v16i8 fp_to_si_sat, the fp_to_si_sat node is
split several times, creating an illegal v4i8 concat that gets expanded
into a BUILD_VECTOR. After some combining and other legalisation, it
ends up the a buildvector that extracts from 4 vectors, looking like
BUILDVECTOR(a0,a1,a2,a3,b0,b1,b2,b3,c0,c1,c2,c3,d0,d1,d2,d3). That is
really an v16i32->v16i8 truncate in disguise.

This adds a ReconstructTruncateFromBuildVector method to detect the
pattern, converting it back into the legal "concat(trunc(concat(trunc(a),
trunc(b))), trunc(concat(trunc(c), trunc(d))))" tree. The extracted
nodes could also be v4i16, in which case the truncates are not needed.
All those truncates and concats then become uzip1's, which is much
better than expanding by moving vector lanes around.

Differential Revision: https://reviews.llvm.org/D119469
2022-03-07 09:42:54 +00:00
Siva Chandra Reddy c74c344263 [libc] Fix alignment logic in TLS image size calculation. 2022-03-07 09:21:37 +00:00
LLVM GN Syncbot d7480d065d [gn build] Port 5f62156762 2022-03-07 09:08:13 +00:00
River Riddle ee1d447e5f [mlir][NFC] Move Translation.h to a Tools/mlir-translate directory
Translation.h is currently awkwardly shoved into the top-level mlir, even though it is
specific to the mlir-translate tool. This commit moves it to a new Tools/mlir-translate
directory, which is intended for libraries used to implement tools. It also splits the
translate registry from the main entry point, to more closely mirror what mlir-opt
does.

Differential Revision: https://reviews.llvm.org/D121026
2022-03-07 01:05:38 -08:00
River Riddle 6b7d211a1b [mlir][NFC] Move MlirOptMain to the Tools/ directory
MlirOptMain is currently awkwardly shoved into mlir/Support. This commit
moves it to the Tools/ directory, which is intended for libraries used to
implement tools.

Differential Revision: https://reviews.llvm.org/D121025
2022-03-07 01:05:38 -08:00
River Riddle 9eaff42360 [mlir][NFC] Move Parser.h to Parser/
There is no reason for this file to be at the top-level, and
its current placement predates the Parser/ folder's existence.

Differential Revision: https://reviews.llvm.org/D121024
2022-03-07 01:05:38 -08:00
Florian Hahn 542c335159
[ConstraintElimination] Remove dead variables when dropping constraints.
This patch extends ConstraintElimination to also remove dead variables
when removing a constraint. When a constraint is removed because it is
out of scope, all new variables added for this constraint can also be
removed.

This keeps the total size of the systems much smaller, because it
reduces the number of variables drastically.

It also fixes a bug where variables where removed incorrectly.

Fixes https://github.com/llvm/llvm-project/issues/54228
2022-03-07 09:04:07 +00:00
Florian Hahn 4ad1ed3a2e
[ConstraintElimination] Add test from PR54228.
Test for https://github.com/llvm/llvm-project/issues/54228
2022-03-07 09:04:07 +00:00
Luo, Yuanke be85f55b2d [X86] Update some of the AVX512 intrinsic tests to avoid adds.
As noticed in D119654, by adding the masked intrinsics results together
we can end up with the selects being canonicalized away from the
intrinsic - this isn't what we want to test here so replace with a
insertvalue chain into a aggregate instead to retain all the results.
2022-03-07 17:03:31 +08:00
Simon Moll 5f62156762 [VP] Introducing VectorBuilder, the VP intrinsic builder
VectorBuilder wraps around an IRBuilder and
VectorBuilder::createVectorInstructions emits VP intrinsics as if they
were regular instructions.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D105283
2022-03-07 10:02:07 +01:00
Nikita Popov a9b03d9e2e [Attributor] Remove function pointer restriction for AAAlign
This check is not compatible with opaque pointers. We can avoid
it by adjusting the getPointerAlignment() implementation to avoid
creating unnecessary ptrtoint expressions for bitcasted pointers.
The code already uses OnlyIfReduced to not create an expression
if it does not simplify, and this makes sure that folding a
bitcast and ptrtoint into a ptrtoint doesn't count as a
simplification.

Differential Revision: https://reviews.llvm.org/D120904
2022-03-07 10:02:45 +01:00
David Green 43b638241a [AArch64] Use NPM for cost model tests. NFC
As per the other tests, this switches the run lines back to using the
NPM via
-passes='print<cost-model>' -cost-kind=throughput 2>&1 -disable-output
2022-03-07 08:57:50 +00:00
Nikita Popov 81b43b23e4 [SCEV] Enable verification under EXPENSIVE_CHECKS
SCEV verification should no longer affect results of subsequent
queries, and our lit tests as well as llvm-test-suite pass with
SCEV verification enabled, so I think we can enable it by default
under EXPENSIVE_CHECKS now.

Differential Revision: https://reviews.llvm.org/D120708
2022-03-07 09:53:00 +01:00
Weining Lu c063f9da55 [LoongArch] Add EncoderMethods for transformed immediate operands
This is a split patch of D120476 and thanks to myhsu.

'Transformed' means the encoding of an immediate is not the same as
its binary representation. For example, the `bl` instruction
requires a signed 28-bits integer as its operand and the low 2 bits
must be 0. So only the upper 26 bits are needed to get encoded into
the instruction.

Based on the above reason this kind of immediate needs a customed
`EncoderMethod` to get the real value getting encoded into the
instruction.

Currently these immediate includes:
```
  uimm2_plus1
  simm14_lsl2
  simm16_lsl2
  simm21_lsl2
  simm26_lsl2
```

This patch adds those `EncoderMethod`s and revises related .mir test
in previous patch.

Reviewed By: xen0n, MaskRay

Differential Revision: https://reviews.llvm.org/D120545
2022-03-07 16:47:26 +08:00
Nikita Popov d1e880acaa [SCEV] Enable verification in LoopPM
Currently, we hardly ever actually run SCEV verification, even in
tests with -verify-scev. This is because the NewPM LPM does not
verify SCEV. The reason for this is that SCEV verification can
actually change the result of subsequent SCEV queries, which means
that you see different transformations depending on whether
verification is enabled or not.

To allow verification in the LPM, this limits verification to
BECounts that have actually been cached. It will not calculate
new BECounts.

BackedgeTakenInfo::getExact() is still not entirely readonly,
it still calls getUMinFromMismatchedTypes(). But I hope that this
is not problematic in the same way. (This could be avoided by
performing the umin in the other SCEV instance, but this would
require duplicating some of the code.)

Differential Revision: https://reviews.llvm.org/D120551
2022-03-07 09:46:20 +01:00
Adrian Kuegel ef193a7a7c [mlir] Use empty() instead of checking size() == 0 (NFC) 2022-03-07 09:41:43 +01:00
Nikita Popov 8133778d3c [SCEV] Fully invalidate SCEVUnknown on RAUW
When a SCEVUnknown gets RAUWd, we currently drop it from the folding
set, but don't forget memoized values. I believe we should be
treating RAUW the same way as deletion here and invalidate all
caches and dependent expressions.

I don't have any specific cases where this causes issues right now,
but it does address the FIXME in https://reviews.llvm.org/D119488.

Differential Revision: https://reviews.llvm.org/D120033
2022-03-07 09:28:28 +01:00
Timm Bäder 7b969b0bb5 [clang][parser] Stop dragging an EndLoc around when parsing attributes
It's almost always entirely unused and if it is used, the end of the
attribute range can be used instead.

Differential Revision: https://reviews.llvm.org/D120888
2022-03-07 08:16:39 +01:00
Christian Sigg 0dc66b76fe [MLIR] Change call sites from deprecated `parseSourceFile()` to `parseSourceFile<ModuleOp>()`.
Mark `parseSourceFile()` deprecated. The functions will be removed two weeks after landing this change.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D121075
2022-03-07 06:49:38 +01:00
Johannes Doerfert 5af11ec34b [Attributor] Determine potentially loaded values through memory
We already look through memory to determine where a value that is stored
might pop up again (potential copies). This patch introduces the other
direction with similar logic. If a value is loaded, we can follow all
the accesses to the pointer (or better object) and try to determine what
value might have been stored.
2022-03-06 23:26:37 -06:00
Johannes Doerfert eb73af4af4 [Attributor] Handle undef and null in AAAlignFloating
Both `undef` and `nullptr` are maximally aligned. This is especially
important as we often see `undef` until a proper value has been
identified during simplification.
2022-03-06 23:26:22 -06:00
Johannes Doerfert ad26e199ff [Attributor] Use CFG reasoning also for read accesses
With D106397 we used CFG reasoning to filter out writes that will not
interfere with a given load instruction. With this patch we use the
same logic (modulo the reversal in reachability check order) for store
instructions. As an example, we can now proof stores to shared memory
are dead if all the loads of the shared memory are not reachable from
them.
2022-03-06 23:26:22 -06:00
Johannes Doerfert acb3773491 [Attributor] Improve isValidAtPosition (mostly for old PM)
To minimize the test difference between old and new PM we perform some
local dominance check if no dominator tree is available.
2022-03-06 23:26:21 -06:00
Qiu Chaofan b2497e5435 [PowerPC] Add generic fnmsub intrinsic
Currently in Clang, we have two types of builtins for fnmsub operation:
one for float/double vector, they'll be transformed into IR operations;
one for float/double scalar, they'll generate corresponding intrinsics.

But for the vector version of builtin, the 3 op chain may be recognized
as expensive by some passes (like early cse). We need some way to keep
the fnmsub form until code generation.

This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub
intrinsics.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D116015
2022-03-07 13:00:06 +08:00
Johannes Doerfert ff758372bd [Attributor][NFCI] Introduce fine-grained anonymous namespaces 2022-03-06 21:28:38 -06:00
Johannes Doerfert 192a34ddb0 [Attributor][OpenMPOpt][FIX] Register simplification callbacks
Heap-2-stack and heap-2-shared can replace an allocation call with
something else. To avoid us deriving information from the allocator
implementation we register a simplification callback now that will
force us to stop at the call site. We probably should create the
replacement memory eagerly and return that instead though.
2022-03-06 21:28:38 -06:00
Johannes Doerfert 5859ae6a5d [Attributor][FIX] Use maximal access for dereferenceability deduction
While we can use range information when we derive dereferenceability we
must make sure to pick he right end of the range. Before we always went
with the minimal offset, which is not correct if we want to combine
the base dereferenceability with some offset. In that case it's the
maximum that gives the correct result.
2022-03-06 21:28:38 -06:00
Johannes Doerfert 1fcd4d0e3b [Attributor][FIX] Initialize stack variable 2022-03-06 21:28:38 -06:00
Johannes Doerfert 7ead7e90fc Revert "[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible"
This reverts commit ff50e81b50 as it broke
the buildbots, see https://reviews.llvm.org/D121060#3362737.
2022-03-06 21:27:41 -06:00
Zakk Chen 3be907621f [RISCV] Fix incorrect optimization for masked vmsgeu.vi with 0 immediate.
vmsgeu.vi with 0 is always true, but in the masked with mask undisturbed
policy, we still need to keep inactive elelemt which come from maskedoff.

We could return mask directly if it's mask agnostic policy in the future.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D121080
2022-03-06 19:22:35 -08:00
Johannes Doerfert ff50e81b50 [OpenMP][NFCI] Use RAII lock guards in libomptarget where possible
Differential Revision: https://reviews.llvm.org/D121060
2022-03-06 19:59:23 -06:00
Johannes Doerfert 6158f4a466 [Attributor][NFCI] No repeated manifest of AAValueSimplifyReturned (CGSCC) 2022-03-06 19:59:23 -06:00
Johannes Doerfert efedf70aa5 [Attributor][NFC] Expose helper with more generic interface
This simply makes the function argument of the
`Attributor::checkForAllInstructions` helper explicit so one can iterate
over instructions in other functions.
2022-03-06 19:59:23 -06:00
Johannes Doerfert 8fa839aa58 [Attributor][NFC] Improve debug messages 2022-03-06 19:59:22 -06:00
William S. Moses 87ec6f41bb [OpenMPIRBuilder] Allocate temporary at the correct block in a nested parallel
The OpenMPIRBuilder has a bug. Specifically, suppose you have two nested openmp parallel regions (writing with MLIR for ease)

```
omp.parallel {
  %a = ...
  omp.parallel {
    use(%a)
  }
}
```

As OpenMP only permits pointer-like inputs, the builder will wrap all of the inputs into a stack allocation, and then pass this
allocation to the inner parallel. For example, we would want to get something like the following:

```
omp.parallel {
  %a = ...
  %tmp = alloc
  store %tmp[] = %a
  kmpc_fork(outlined, %tmp)
}
```

However, in practice, this is not what currently occurs in the context of nested parallel regions. Specifically to the OpenMPIRBuilder,
the entirety of the function (at the LLVM level) is currently inlined with blocks marking the corresponding start and end of each
region.

```
entry:
  ...

parallel1:
  %a = ...
  ...

parallel2:
  use(%a)
  ...

endparallel2:
  ...

endparallel1:
  ...
```

When the allocation is inserted, it presently inserted into the parent of the entire function (e.g. entry) rather than the parent
allocation scope to the function being outlined. If we were outlining parallel2, the corresponding alloca location would be parallel1.

This causes a variety of bugs, including https://github.com/llvm/llvm-project/issues/54165 as one example.

This PR allows the stack allocation to be created at the correct allocation block, and thus remedies such issues.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D121061
2022-03-06 18:34:25 -05:00
Michael Kruse fb75afd730 [mlir][support] Fix msvc build.
Add typename keyword to help the C++ parser to disambiguate dependent
qualified name after D120852/1c941d. Fixes the msvc build.
2022-03-06 16:59:23 -06:00
mydeveloperday 4ab5d7608b [clang-format] NFC update LLVM overall clang-formatted status
A 1% increase in the number of clang-formatted files.

An additional 530 files have been added to LLVM, and an additional
450 files are now clang-format clean. Raising the overall % to 53%

There are now 8857 files clean out of 16432 (ignoring lit tests)
2022-03-06 20:03:27 +00:00
David Green 4388f4f776 [DAG] Don't convert undef to 0 when creating buildvector
When inserting undef into buildvectors created from shuffles of
buildvectors, we convert elements to the largest needed type. This had
the effect of converting undef into 0, which isn't needed as the
buildvector implicitly truncates and trunc(zext(undef)) == undef.

Differential Revision: https://reviews.llvm.org/D121002
2022-03-06 18:35:34 +00:00
Benjamin Kramer 924eac4942 [Hexagon] Move single-use global tables into their only user and turn them into StringSwitch
Delete the unused globals. NFCI.
2022-03-06 19:23:09 +01:00
Simon Pilgrim 830ba4cebe [X86] Update AVX512-BW mask intrinsic tests to avoid adds
As noticed in D119654, by adding the masked intrinsics results together we can end up with the selects being canonicalized away from the intrinsic - this isn't what we want to test here so replace with a insertvalue chain into a aggregate instead to retain all the results.
2022-03-06 17:23:51 +00:00
Simon Pilgrim 1bd836fa10 [X86] Update AVX512 rotate intrinsic tests to avoid adds
As noticed in D119654, by adding the masked intrinsics results together we can end up with the selects being canonicalized away from the intrinsic - this isn't what we want to test here so replace with a insertvalue chain into a aggregate instead to retain all the results.
2022-03-06 17:05:44 +00:00