Commit Graph

399639 Commits

Author SHA1 Message Date
MaheshRavishankar 0b33890f45 [mlir][Linalg] Add ConvolutionOpInterface.
Add an interface that allows grouping together all covolution and
pooling ops within Linalg named ops. The interface currently
- the indexing map used for input/image access is valid
- the filter and output are accessed using projected permutations
- that all loops are charecterizable as one iterating over
  - batch dimension,
  - output image dimensions,
  - filter convolved dimensions,
  - output channel dimensions,
  - input channel dimensions,
  - depth multiplier (for depthwise convolutions)

Differential Revision: https://reviews.llvm.org/D109793
2021-09-20 10:41:10 -07:00
Fangrui Song 6cd382bf28 Revert "[CMake] Add debuginfo-tests to LLVM_ALL_PROJECTS after D110016"
This reverts commit 4b80f0125a.

debuginfo-tests has been renamed to cross-project-tests.
2021-09-20 10:32:19 -07:00
Fangrui Song a07727199d Revert code change of D63497 & D74399 for riscv64-*-linux GCC detection
This partially reverts commits 1fc2a47f0b and 9816e726e7.

See D109727. Replacing config.guess in favor of {gcc,clang} -dumpmachine
can avoid the riscv64-{redhat,suse}-linux GCC detection.

Acked-by: Luís Marques <luismarques@lowrisc.org>
2021-09-20 10:28:32 -07:00
Arthur O'Dwyer d7d7060127 Eliminate _LIBCPP_EQUAL_DELETE in favor of `=delete`.
All supported compilers have supported `=delete` as an extension
in C++03 mode for many years at this point.

Differential Revision: https://reviews.llvm.org/D109942
2021-09-20 13:26:59 -04:00
Craig Topper 04ab6c85ef [RISCV] Teach RISCVTargetLowering::shouldSinkOperands to sink splats for FAdd/FSub/FMul/FDiv. 2021-09-20 10:25:46 -07:00
Craig Topper 890027b314 [RISCV] Add test cases showing failure to use .vf vector operations when splat is in another basic block. NFC
We should have CGP copy the splats into the same basic block as the
FP operation so that SelectionDAG can fold them.
2021-09-20 10:25:38 -07:00
Vedant Kumar e31b2d7d7b [lldb][crashlog] Avoid specifying arch for image when a UUID is present
When adding an image to a target for crashlog purposes, avoid specifying
the architecture of the image.

This has the effect of making SBTarget::AddModule infer the ArchSpec for
the image based on the SBTarget's architecture, which LLDB puts serious
effort into calculating correctly (in TargetList::CreateTargetInternal).

The status quo is that LLDB randomly guesses the ArchSpec for a module
if its architecture is specified, via:

```
  SBTarget::AddModule -> Platform::GetAugmentedArchSpec -> Platform::IsCompatibleArchitecture ->
GetSupportedArchitectureAtIndex -> {ARM,x86}GetSupportedArchitectureAtIndex
```

... which means that the same crashlog can fail to load on an Apple
Silicon Mac (due to the random guess of arm64e-apple-macosx for the
module's ArchSpec not being compatible with the SBTarget's (correct)
ArchSpec), while loading just fine on an Intel Mac.

I'm not sure how to add a test for this (it doesn't look like there's
test coverage of this path in-tree). It seems like it would be pretty
complicated to regression test: the host LLDB would need to be built for
arm64e, we'd need a hand-crafted arm64e iOS crashlog, and we'd need a
binary with an iOS deployment target. I'm open to other / simpler
options.

rdar://82679400

Differential Revision: https://reviews.llvm.org/D110013
2021-09-20 10:23:35 -07:00
cchen 3679d2001c [NCF][OpenMP] Fix metadirective test on SystemZ 2021-09-20 12:22:54 -05:00
Mehdi Amini 5edd79fc97 Revert "[MLIR][SCF] Add for-to-while loop transformation pass"
This reverts commit 644b55d57e.

The added test is failing the bots.
2021-09-20 17:21:59 +00:00
Mehdi Amini f18f1ab4fd Temporarily XFAIL MLIR test that fails the LLVM verifier after 8700f2bd3 2021-09-20 17:20:11 +00:00
Alexander Grund f4b5d597d8 Add use_default_shell_env = True to ctx.actions.run
When building a tool in a non-standard environment (e.g. custom
compiler path -> LD_LIBRARY_PATH set) then
`use_default_shell_env = True` is required to run that tool in the same
environment or otherwise the build will fail due to missing symbols.
See https://github.com/google/jax/issues/7842 for this issue and
https://github.com/tensorflow/tensorflow/pull/44549 for related fix in
TF.

Reviewed By: GMNGeoffrey

Differential Revision: https://reviews.llvm.org/D109873
2021-09-20 10:09:04 -07:00
Amy Huang 6e994a833e [lld] Remove timers.ll because inconsistent timers behavior causes the test to fail sometimes
See https://reviews.llvm.org/D109904
2021-09-20 09:57:18 -07:00
Fangrui Song a954bb18b1 [ELF] Add --why-extract= to query why archive members/lazy object files are extracted
Similar to D69607 but for archive member extraction unrelated to GC. This patch adds --why-extract=.

Prior art:

GNU ld -M prints
```
Archive member included to satisfy reference by file (symbol)

a.a(a.o)                      main.o (a)
b.a(b.o)                      (b())
```

-M is mainly for input section/symbol assignment <-> output section mapping
(often huge output) and the information may appear ad-hoc.

Apple ld64
```
__Z1bv forced load of b.a(b.o)
_a forced load of a.a(a.o)
```

It doesn't say the reference file.

Arm's proprietary linker
```
Selecting member vsnprintf.o(c_wfu.l) to define vsnprintf.
...
Loading member vsnprintf.o from c_wfu.l.
              definition:  vsnprintf
              reference :  _printf_a
```

---

--why-extract= gives the user the full data (which is much shorter than GNU ld
-Map). It is easy to track a chain of references to one archive member with a
one-liner, e.g.

```
% ld.lld main.o a_b.a b_c.a c.a -o /dev/null --why-extract=- | tee stdout
reference       extracted       symbol
main.o  a_b.a(a_b.o)    a
a_b.a(a_b.o)    b_c.a(b_c.o)    b()
b_c.a(b_c.o)    c.a(c.o)        c()

% ruby -ane 'BEGIN{p={}}; p[$F[1]]=[$F[0],$F[2]] if $.>1; END{x="c.a(c.o)"; while y=p[x]; puts "#{y[0]} extracts #{x} to resolve #{y[1]}"; x=y[0] end}' stdout
b_c.a(b_c.o) extracts c.a(c.o) to resolve c()
a_b.a(a_b.o) extracts b_c.a(b_c.o) to resolve b()
main.o extracts a_b.a(a_b.o) to resolve a
```

Archive member extraction happens before --gc-sections, so this may not be a live path
under --gc-sections, but I think it is a good approximation in practice.

* Specifying a file avoids output interleaving with --verbose.
* Required `=` prevents accidental overwrite of an input if the user forgets `=`. (Most of compiler drivers' long options accept `=` but not ` `)

Differential Revision: https://reviews.llvm.org/D109572
2021-09-20 09:52:30 -07:00
Nikita Popov ecd52a5be9 [Verifier] Try to fix MSVC build
Some buildbots fail with:

> C:\a\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\lib\IR\Verifier.cpp(4352): error C2678: binary '==': no operator found which takes a left-hand operand of type 'const llvm::MDOperand' (or there is no acceptable conversion)

Possibly the explicit MDOperand to Metadata* conversion will help?
2021-09-20 18:47:25 +02:00
Kazu Hirata f3cfec9c9e [MCA] Fix a warning
This patch fixes the warning

  InstructionTables.cpp:27:56: error: loop variable 'Resource' of type
  'const std::pair<const uint64_t, ResourceUsage> &' (aka 'const
  pair<const unsigned long, llvm::mca::ResourceUsage> &') binds to a
  temporary constructed from type 'const std::pair<unsigned long,
  llvm::mca::ResourceUsage> &' [-Werror,-Wrange-loop-construct]

Note that Resource is declared as:

   SmallVector<std::pair<uint64_t, ResourceUsage>, 4> Resources;

without "const" for uint64_t.
2021-09-20 09:46:38 -07:00
LLVM GN Syncbot 93604c9711 [gn build] Port d85e347a28 2021-09-20 16:39:59 +00:00
Craig Topper d85e347a28 [RISCV] Add a pass to recognize VLS strided loads/store from gather/scatter.
For strided accesses the loop vectorizer seems to prefer creating a
vector induction variable with a start value of the form
<i32 0, i32 1, i32 2, ...>. This value will be incremented each
loop iteration by a splat constant equal to the length of the vector.
Within the loop, arithmetic using splat values will be done on this
vector induction variable to produce indices for a vector GEP.

This pass attempts to dig through the arithmetic back to the phi
to create a new scalar induction variable and a stride. We push
all of the arithmetic out of the loop by folding it into the start,
step, and stride values. Then we create a scalar GEP to use as the
base pointer for a strided load or store using the computed stride.
Loop strength reduce will run after this pass and can do some
cleanups to the scalar GEP and induction variable.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D107790
2021-09-20 09:39:44 -07:00
Fangrui Song d001ab82e4 [ELF] Don't fall back to .text for e_entry
We have the rule to simulate
(https://sourceware.org/binutils/docs/ld/Entry-Point.html),
but the behavior is questionable
(https://sourceware.org/pipermail/binutils/2021-September/117929.html).

gold doesn't fall back to .text.
The behavior is unlikely relied by projects (there is even a warning for
executable links), so let's just delete this fallback path.

Reviewed By: jhenderson, peter.smith

Differential Revision: https://reviews.llvm.org/D110014
2021-09-20 09:35:12 -07:00
Nikita Popov 8700f2bd36 [Verifier] Verify scoped noalias metadata
Verify that !noalias, !alias.scope and llvm.experimental.noalias.scope
arguments have the format specified in
https://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata.
I've fixed up a lot of broken metadata used by tests in advance.
Especially using a scope instead of the expected scope list is a
commonly made mistake.

Differential Revision: https://reviews.llvm.org/D110026
2021-09-20 18:27:28 +02:00
Jonas Devlieghere a89bfc6120 [lldb] Extract adding symbols for UUID/File/Frame (NFC)
This moves the logic for adding symbols based on UUID, file and frame
into little helper functions. This is in preparation for D110011.

Differential revision: https://reviews.llvm.org/D110010
2021-09-20 09:08:04 -07:00
Jonas Devlieghere fe4b8467b5 [lldb] Fix whitespace in CommandObjectTarget (NFC) 2021-09-20 09:08:03 -07:00
Florian Hahn 963d3a22b3
[DSE] Add additional tests to cover review comments.
Adds additional tests following comments from D109844.

Also removes unusued in.ptr arguments and places in the call tests that
used loads instead of a getval call.
2021-09-20 17:06:04 +01:00
Tobias Gysi 7be28d82b4 [mlir][linalg] Add IndexOp support to fusion on tensors.
This revision depends on https://reviews.llvm.org/D109761 and https://reviews.llvm.org/D109766.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D109774
2021-09-20 15:59:35 +00:00
Morten Borup Petersen 644b55d57e [MLIR][SCF] Add for-to-while loop transformation pass
This pass transforms SCF.ForOp operations to SCF.WhileOp. The For loop condition is placed in the 'before' region of the while operation, and indctuion variable incrementation + the loop body in the 'after' region. The loop carried values of the while op are the induction variable (IV) of the for-loop + any iter_args specified for the for-loop.
Any 'yield' ops in the for-loop are rewritten to additionally yield the (incremented) induction variable.

This transformation is useful for passes where we want to consider structured control flow solely on the basis of a loop body and the computation of a loop condition. As an example, when doing high-level synthesis in CIRCT, the incrementation of an IV in a for-loop is "just another part" of a circuit datapath, and what we really care about is the distinction between our datapath and our control logic (the condition variable).

Differential Revision: https://reviews.llvm.org/D108454
2021-09-20 16:57:50 +01:00
Tobias Gysi 09100c75b5 [mlir][linalg] Fix typo (NFC). 2021-09-20 15:46:16 +00:00
Alexey Bataev bc69dd62c0 [SLP]Improve graph reordering.
Reworked reordering algorithm. Originally, the compiler just tried to
detect the most common order in the reordarable nodes (loads, stores,
extractelements,extractvalues) and then fully rebuilding the graph in
the best order. This was not effecient, since it required an extra
memory and time for building/rebuilding tree, double the use of the
scheduling budget, which could lead to missing vectorization due to
exausted scheduling resources.

Patch provide 2-way approach for graph reodering problem. At first, all
reordering is done in-place, it doe not required tree
deleting/rebuilding, it just rotates the scalars/orders/reuses masks in
the graph node.

The first step (top-to bottom) rotates the whole graph, similarly to the previous
implementation. Compiler counts the number of the most used orders of
the graph nodes with the same vectorization factor and then rotates the
subgraph with the given vectorization factor to the most used order, if
it is not empty. Then repeats the same procedure for the subgraphs with
the smaller vectorization factor. We can do this because we still need
to reshuffle smaller subgraph when buildiong operands for the graph
nodes with lasrger vectorization factor, we can rotate just subgraph,
not the whole graph.

The second step (bottom-to-top) scans through the leaves and tries to
detect the users of the leaves which can be reordered. If the leaves can
be reorder in the best fashion, they are reordered and their user too.
It allows to remove double shuffles to the same ordering of the operands in
many cases and just reorder the user operations instead. Plus, it moves
the final shuffles closer to the top of the graph and in many cases
allows to remove extra shuffle because the same procedure is repeated
again and we can again merge some reordering masks and reorder user nodes
instead of the operands.

Also, patch improves cost model for gathering of loads, which improves
x264 benchmark in some cases.

Gives about +2% on AVX512 + LTO (more expected for AVX/AVX2) for {625,525}x264,
+3% for 508.namd, improves most of other benchmarks.
The compile and link time are almost the same, though in some cases it
should be better (we're not doing an extra instruction scheduling
anymore) + we may vectorize more code for the large basic blocks again
because of saving scheduling budget.

Differential Revision: https://reviews.llvm.org/D105020
2021-09-20 08:42:19 -07:00
peter klausler 5661317f86 [flang] Put intrinsic function table back into order
Some intrinsic functions weren't findable because the table
wasn't strictly in order of names.

And complete a missing generalization of the extension DCONJG
to accept any kind of complex argument, like DREAL and DIMAG
were.

Differential Revision: https://reviews.llvm.org/D110002
2021-09-20 08:40:28 -07:00
Wang, Pengfei 227673398c [X86] Always check the size of SourceTy before getting the next type
D109607 results in a regression in llvm-test-suite.
The reason is we didn't check the size of SourceTy, so that we will
return wrong SSE type when SourceTy is overlapped.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D110037
2021-09-20 23:34:19 +08:00
Wang, Pengfei 5b47256fa5 [X86] Add test to show the effect caused by D109607. NFC 2021-09-20 23:34:18 +08:00
Justas Janickas 228dd20c3f [OpenCL] Supports atomics in C++ for OpenCL 2021
Atomics in C++ for OpenCL 2021 are now handled the same way as in
OpenCL C 3.0. This is a header-only change.

Differential Revision: https://reviews.llvm.org/D109424
2021-09-20 16:24:30 +01:00
Kadir Cetinkaya 444a5f304f
[clangd] Bail-out when an empty compile flag is encountered
Fixes https://github.com/clangd/clangd/issues/865

Differential Revision: https://reviews.llvm.org/D109894
2021-09-20 16:51:56 +02:00
Tobias Gysi 6db928b8f3 [mlir][linalg] Fusion on tensors.
Add a new version of fusion on tensors that supports the following scenarios:
- support input and output operand fusion
- fuse a producer result passed in via tile loop iteration arguments (update the tile loop iteration arguments)
- supports only linalg operations on tensors
- supports only scf::for
- cannot add an output to the tile loop nest

The LinalgTileAndFuseOnTensors pass tiles the root operation and fuses its producers.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D109766
2021-09-20 14:45:34 +00:00
Deep Majumder 5dee50111c [analyzer] Move docs of SmartPtr to correct subcategory
The docs of alpha.cplusplus.SmartPtr was incorrectly placed under
alpha.deadcode. Moved it to under alpha.cplusplus

Differential Revision: https://reviews.llvm.org/D110032
2021-09-20 20:13:04 +05:30
David Sherwood f988f68064 [Analysis] Add support for vscale in computeKnownBitsFromOperator
In ValueTracking.cpp we use a function called
computeKnownBitsFromOperator to determine the known bits of a value.
For the vscale intrinsic if the function contains the vscale_range
attribute we can use the maximum and minimum values of vscale to
determine some known zero and one bits. This should help to improve
code quality by allowing certain optimisations to take place.

Tests added here:

  Transforms/InstCombine/icmp-vscale.ll

Differential Revision: https://reviews.llvm.org/D109883
2021-09-20 15:01:59 +01:00
Jay Foad 680592b5d0 [AMDGPU] Regenerate checks 2021-09-20 14:48:23 +01:00
Stefan Gränitz e8d81d80f6 [JITLink] Adopt forEachRelocation() helper in ELF RISCV backend (NFC)
Following D109516, this patch re-uses the new helper function for ELF relocation traversal in the RISCV backend.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D109522
2021-09-20 15:46:41 +02:00
Stefan Gränitz 68914dc990 [JITLink] Adopt forEachRelocation() helper in ELF x86-64 backend (NFC)
Following D109516, this patch re-uses the new helper function for ELF relocation traversal in the x86-64 backend.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D109520
2021-09-20 15:46:41 +02:00
Aaron Puchert 6de19ea4b6 Thread safety analysis: Drop special block handling
Previous changes like D101202 and D104261 have eliminated the special
status that break and continue once had, since now we're making
decisions purely based on the structure of the CFG without regard for
the underlying source code constructs.

This means we don't gain anything from defering handling for these
blocks. Dropping it moves some diagnostics, though arguably into a
better place. We're working around a "quirk" in the CFG that perhaps
wasn't visible before: while loops have an empty "transition block"
where continue statements and the regular loop exit meet, before
continuing to the loop entry. To get a source location for that, we
slightly extend our handling for empty blocks. The source location for
the transition ends up to be the loop entry then, but formally this
isn't a back edge. We pretend it is anyway. (This is safe: we can always
treat edges as back edges, it just means we allow less and don't modify
the lock set. The other way around it wouldn't be safe.)

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D106715
2021-09-20 15:20:15 +02:00
Michał Górny ec50d351ff [lldb] [DynamicRegisterInfo] Unset value_regs/invalidate_regs before Finalize()
Set value_regs and invalidate_regs in RegisterInfo pushed onto m_regs
to nullptr, to ensure that the temporaries passed there are not
accidentally used.

Differential Revision: https://reviews.llvm.org/D109879
2021-09-20 15:02:20 +02:00
Michał Górny 4737dcbc83 [lldb] [test] Add unittest for DynamicRegisterInfo::Finalize()
Differential Revision: https://reviews.llvm.org/D109906
2021-09-20 15:02:20 +02:00
Brain Swift fae57a6a97 [Clang] [Fix] Clang build fails when build directory contains space character
Clang build fails when build directory contains space character.

Error messages:

[ 95%] Linking CXX executable ../../../../bin/clang
clang: error: no such file or directory: 'Space/Net/llvm/Build/tools/clang/tools/driver/Info.plist'
make[2]: *** [bin/clang-14] Error 1
make[1]: *** [tools/clang/tools/driver/CMakeFiles/clang.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

The path name is actually:
  'Dev Space/Net/llvm/Build/tools/clang/tools/driver/Info.plist'

Bugzilla issue - https://bugs.llvm.org/show_bug.cgi?id=51884
Reporter and patch author - Brain Swift <bsp2bsp-llvm@yahoo.com>

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D109979
2021-09-20 18:19:08 +05:30
David Green 3f90df22f1 [ARM] MVE reverse shuffles.
The vectorizer can sometimes make reverse shuffles from indices that
count down. In MVE, we don't have a 128bit rev instruction, but we can
select this to a VREV64 with some lane movs to swap the two halfs.

Ideally this would use VMOVD's, but only gets as far as VMOVS's at the
moment.

Differential Revision: https://reviews.llvm.org/D69510
2021-09-20 13:48:01 +01:00
Alex Richardson 817e23d481 [update_mir_test_checks.py] Use -NEXT FileCheck directories
Previously the script emitted output using plain CHECK directives. This
can result in a test passing even if there are some instructions between
CHECK directives that should have been removed. It also makes debugging
tests that have the output in a different order more difficult since
FileCheck can match with a later line and then complain about the "wrong"
directive not being found.

This will cause quite large diffs when updating existing tests, but I'm not sure we need an opt-in flag here.

Depends on D109765 (pre-commit tests)

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D109767
2021-09-20 12:55:56 +01:00
Alex Richardson 7b68c0725d pre-commit test for D109767
Differential Revision: https://reviews.llvm.org/D109765
2021-09-20 12:55:56 +01:00
Alex Richardson 6d7b3d6b3a Fix CLANG_ENABLE_STATIC_ANALYZER=OFF building all analyzer source
Since https://reviews.llvm.org/D87118, the StaticAnalyzer directory is
added unconditionally. In theory this should not cause the static analyzer
sources to be built unless they are referenced by another target. However,
the clang-cpp target (defined in clang/tools/clang-shlib) uses the
CLANG_STATIC_LIBS global property to determine which libraries need to
be included. To solve this issue, this patch avoids adding libraries to
that property if EXCLUDE_FROM_ALL is set.

In case something like this comes up again: `cmake --graphviz=targets.dot`
is quite useful to see why a target is included as part of `ninja all`.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D109611
2021-09-20 12:55:56 +01:00
Simon Pilgrim 7fc12b822c MachOObjectFile - checkOverlappingElement - use const-ref to avoid unnecessary copies. NFCI.
Reported by MSVC static analyzer.
2021-09-20 12:53:18 +01:00
Simon Pilgrim 4ab7c0d3fa [X86] X86TargetTransformInfo - remove unnecessary if-else after early exit. NFCI.
(style) Break the if-else chain as they all return.
2021-09-20 12:53:17 +01:00
Simon Pilgrim ea17b15f2d [MCA] InstructionTables::execute() - use const-ref iterator in for-range loop. NFCI.
Avoid unnecessary copies, reported by MSVC static analyzer.
2021-09-20 12:53:17 +01:00
Valentin Clement d6929aaa67
[mlir][openacc] Make use of the second counter extension in DataOp translation
Make use of runtime extension for the second reference counter used in
structured data region. This extension is implemented in D106510 and D106509.

Differential Revision: https://reviews.llvm.org/D106517
2021-09-20 13:43:50 +02:00
Michał Górny b1099120ff [lldb] [gdb-remote] Always send PID when detaching w/ multiprocess
Always send PID in the detach packet when multiprocess extensions are
enabled.  This is required by qemu's GDB server, as plain 'D' packet
results in an error and the emulated system is not resumed.

Differential Revision: https://reviews.llvm.org/D110033
2021-09-20 13:29:07 +02:00