401858 Commits

Author SHA1 Message Date
Andrew Savonichev dc8a41de34 [ARM] Simplify address calculation for NEON load/store
The patch attempts to optimize a sequence of SIMD loads from the same
base pointer:

    %0 = gep float*, float* base, i32 4
    %1 = bitcast float* %0 to <4 x float>*
    %2 = load <4 x float>, <4 x float>* %1
    ...
    %n1 = gep float*, float* base, i32 N
    %n2 = bitcast float* %n1 to <4 x float>*
    %n3 = load <4 x float>, <4 x float>* %n2

For AArch64 the compiler generates a sequence of LDR Qt, [Xn, #16].
However, 32-bit NEON VLD1/VST1 lack the [Wn, #imm] addressing mode, so
the address is computed before every ld/st instruction:

    add r2, r0, #32
    add r0, r0, #16
    vld1.32 {d18, d19}, [r2]
    vld1.32 {d22, d23}, [r0]

This can be improved by computing the address for the first load, and then
using a post-indexed form of VLD1/VST1 to load the rest:

    add r0, r0, #16
    vld1.32 {d18, d19}, [r0]!
    vld1.32 {d22, d23}, [r0]

In order to do that, the patch adds more patterns to DAGCombine:

  - (load (add ptr inc1)) and (add ptr inc2) are now folded if inc1
    and inc2 are constants.

  - (or ptr inc) is now recognized as a pointer increment if ptr is
    sufficiently aligned.

In addition to that, we now search for all possible base updates and
then pick the best one.
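
The second pattern relies on a simple bit-level fact: if the low bits of a pointer are known to be zero, OR-ing in a smaller constant gives the same result as adding it. A minimal sketch of that condition follows; it works on concrete values rather than the known-bits information a DAG combine would use, and the helper names `isAlignedTo` and `orActsAsAdd` are hypothetical, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// True if `ptr` is a multiple of `align`. `align` must be a power of two.
inline bool isAlignedTo(std::uint64_t ptr, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 && "power of two expected");
  return (ptr & (align - 1)) == 0;
}

// Hypothetical helper: (or ptr, inc) equals (add ptr, inc) when `inc` only
// touches bits that the alignment guarantees to be zero in `ptr`.
inline bool orActsAsAdd(std::uint64_t ptr, std::uint64_t align,
                        std::uint64_t inc) {
  return isAlignedTo(ptr, align) && inc < align;
}
```

For a 32-byte-aligned pointer such as 0x1000, `0x1000 | 16` and `0x1000 + 16` agree; for 0x1010 they do not, because bit 4 is already set.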

Differential Revision: https://reviews.llvm.org/D108988
2021-10-14 15:23:10 +03:00
Simon Pilgrim 88487662f7 [Codegen] TargetLowering::getCanonicalIndexType - early out scaled MVT::i8 indices. NFCI.
Avoids unused assignment scan-build warning.
2021-10-14 13:08:40 +01:00
Simon Pilgrim b577126d62 [clang][sema] instantiateOMPDeclareVariantAttr - merge repeated VariantFuncRef.get() calls. NFCI.
Fixes scan-build warning about dead initialization
2021-10-14 12:51:34 +01:00
Kirill Bobyrev 0ce3c7111e [clangd] IncludeCleaner: Handle macros coming from ScratchBuffer
Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D111698
2021-10-14 13:36:37 +02:00
Nicolas Vasilache 012c0cc7c3 [mlir] NFC - Avoid unused symbol in opt mode. 2021-10-14 11:26:33 +00:00
Simon Pilgrim 77dcdc2f50 [CostModel][X86] Pre-SSE41 targets can use PMADDWD for sext sub-i16 -> i32
Without SSE41 sext/zext instructions the extensions will be split, meaning that the MUL->PMADDWD fold will split the sext_i32(x) into zext_i32(sext_i16(x))
2021-10-14 12:17:40 +01:00
Simon Pilgrim 16729d0f62 [Orc] ELFNixPlatform::setupJITDylib - remove dead return. NFCI.
2 returns, one after the other - reported by coverity
2021-10-14 12:17:40 +01:00
Alex Zinenko 18fbd5fe34 [mlir][python] Better support for variadic regions in Python bindings
Improve support for variadic regions in ODS-generated operation view classes.
In particular, make generated constructors take an extra argument that
specifies the number of variadic regions if the operation has them. Previously,
there was no mechanism to specify a non-zero number of variadic regions. Also
generate named accessors to regions.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D111783
2021-10-14 13:15:13 +02:00
Alex Zinenko a04c0b7ed2 [mlir][python] Fix MemRefType IsAFunction in Python bindings
MemRefType was using a wrong `isa` function in the bindings code, which
could lead to invalid IR being constructed. Also run the verifier in
memref dialect tests.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111784
2021-10-14 13:12:37 +02:00
Jeremy Morse e3e1da20d4 Follow up to a3936a6c19, correctly select LiveDebugValues implementation
Some functions get opted out of instruction referencing if they're being
compiled with no optimisations, however the LiveDebugValues pass picks one
implementation and then sticks with it through the rest of compilation.
This leads to a segfault if we encounter a function that doesn't use
instr-ref (because it's optnone, for example), but we've already decided
to use InstrRefBasedLDV which expects to be passed a DomTree.

Solution: keep both implementations around in the pass, and pick whichever
one is appropriate to the current function.
2021-10-14 11:28:53 +01:00
Uday Bondhugula 05fb26062c [MLIR] Fix assert crash when an unregistered dialect op is encountered
Fix assert crash when an unregistered dialect op is encountered during
parsing and `-allow-unregistered-dialect` isn't on. Instead, emit an
error.

While on this, clean up "registered" vs "loaded" on `getDialect()` and
local clang-tidy warnings.

https://llvm.discourse.group/t/assert-behavior-on-unregistered-dialect-ops/4402

Differential Revision: https://reviews.llvm.org/D111628
2021-10-14 15:43:53 +05:30
Tobias Gysi a8f69be61f [mlir][linalg] Expose flag to control nofold attribute when padding.
Setting the nofold attribute enables packing an operand. At the moment, the attribute is set by default. The patch introduces a callback to control the flag.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111718
2021-10-14 10:07:07 +00:00
Josh Mottley 0b48b015b5 [Flang] flang-omp-report replace std::vector's with llvm::SmallVector
This patch replaces all uses of std::vector with llvm::SmallVector in the flang-omp-report plugin.
This is one of several patches focusing on switching containers from STL to LLVM's ADT library.

Reviewed By: Leporacanthicus

Differential Revision: https://reviews.llvm.org/D111709
2021-10-14 11:05:24 +01:00
Tobias Gysi eaa52750ce [mlir][linalg] Verify every LinalgOp has a body.
After removing the last LinalgOps that have no region attached we can verify there is a region. The patch performs the following changes:
- Move the SingleBlockImplicitTerminator trait further up to the structured op base class.
- Adapt the LinalgOp verification since the trait only checks if there is 0 or 1 block.
- Introduce a getBlock method on the LinalgOp interface.
- Access the LinalgOp body using either getBlock() or getBody() if the concrete operation type is known.

This patch is a follow up to https://reviews.llvm.org/D111233.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111393
2021-10-14 09:08:39 +00:00
Pavel Labath fa639eda65 [lldb] Fix TestStackCorefile.py for ca0ce99fc8 2021-10-14 10:38:48 +02:00
Jonas Paulsson a33e4c8ae9 [SystemZ] Reapply memcmp and memcpy patches.
This reverts 3562076 and includes some refactoring as well.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D111733
2021-10-14 10:37:33 +02:00
Jonas Paulsson 00baad35b2 [SystemZ] Bugfix and refactorization of mem-mem operations
This patch fixes the bug that consisted of treating variable / immediate
length mem operations (such as memcpy, memset, ...) differently. The variable
length case needs to have the length minus 1 passed due to the use of EXRL
target instructions. However, the DAGCombiner can convert a register length
argument into a constant one, and whenever that happened one byte too few
would end up being processed.

This is also a refactoring that reduces the number of opcodes and variants
involved. For any opcode (variable or constant length), only the length minus
one is passed on to the ISD node. The rest of the logic is now instead
handled during isel pseudo expansion.
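
The "length minus one" convention can be illustrated with a small model. This is only a sketch of the idea, not the SystemZ backend code: `MemOpNode`, `makeMemOp` and `bytesProcessed` are hypothetical names, and zero-length operations are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a mem-mem node that always stores "length minus one",
// whether the length came from a register or from a constant.
struct MemOpNode {
  std::uint64_t lengthMinusOne;
};

// Build the node from the real byte count (assumed non-zero in this sketch).
inline MemOpNode makeMemOp(std::uint64_t numBytes) {
  assert(numBytes > 0 && "zero-length operations are outside this sketch");
  return MemOpNode{numBytes - 1};
}

// Expansion recovers the real byte count in exactly one place.
inline std::uint64_t bytesProcessed(const MemOpNode &node) {
  return node.lengthMinusOne + 1;
}
```

Because the encoding is the same for both kinds of length, folding a register length into a constant no longer changes how many bytes are processed.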

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D111729
2021-10-14 10:37:33 +02:00
Martin Storsjö 7106f58856 [lldb] Make the thread_local g_global_boundary accessed from a single file
This makes the compiler generated code for accessing the thread local
variable much simpler (no need for wrapper functions and weak pointers
to potential init functions), and can avoid toolchain bugs regarding how
to access TLS variables.

In particular, this fixes LLDB when built with current GCC/binutils for
MinGW, see https://github.com/msys2/MINGW-packages/issues/8868.

Differential Revision: https://reviews.llvm.org/D111779
2021-10-14 11:17:20 +03:00
Pavel Labath ca0ce99fc8 [lldb] Print embedded nuls in char arrays (PR44649)
When we know the bounds of the array, print any embedded nuls instead of
treating them as terminators. An exception to this rule is made for the
nul character at the very end of the string. We don't print that, as
otherwise 99% of the strings would end in \0. This way the strings
usually come out the same as how the user typed them into the compiler
(char foo[] = "with\0nuls"). It also matches how they come out in gdb.
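
The rule described above can be sketched as a small standalone formatter. This is an illustration only, not LLDB's implementation; the function name `formatCharArray` and the exact escaping of the nul as `\0` are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical formatter: renders a char array of known size, keeping
// embedded NULs but dropping a single NUL at the very end.
inline std::string formatCharArray(const char *data, std::size_t size) {
  // Drop the terminating NUL at the very end, if present.
  if (size > 0 && data[size - 1] == '\0')
    --size;
  std::string out = "\"";
  for (std::size_t i = 0; i < size; ++i) {
    if (data[i] == '\0')
      out += "\\0"; // embedded NUL is shown escaped, not treated as the end
    else
      out += data[i];
  }
  out += "\"";
  return out;
}
```

For `char foo[] = "with\0nuls"`, calling `formatCharArray(foo, sizeof(foo))` keeps the embedded nul and omits only the final terminator.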

This resolves a FIXME left from D111399, and leaves another FIXME for dealing
with nul characters in "escape-non-printables=false" mode. In this mode the
characters cause the entire summary string to be terminated prematurely.

Differential Revision: https://reviews.llvm.org/D111634
2021-10-14 09:50:40 +02:00
Max Kazantsev 6e1308bc10 [SCEV][NFC] Simplify check with CI->isZero() exit condition
Replace check with
    if ((ExitIfTrue && CI->isZero()) || (!ExitIfTrue && CI->isOne()))
with equivalent and simpler version
    if (ExitIfTrue == CI->isZero())
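
The equivalence can be checked exhaustively. The sketch below assumes the constant is a one-bit value, so that `isOne` is simply the negation of `isZero`; that assumption is not stated in the commit message, and the function names are hypothetical.

```cpp
#include <cassert>

// Original form of the check, with isOne modelled as !isZero
// (assumption: the constant is a one-bit value).
inline bool originalCheck(bool exitIfTrue, bool isZero) {
  const bool isOne = !isZero;
  return (exitIfTrue && isZero) || (!exitIfTrue && isOne);
}

// Simplified form from the commit message.
inline bool simplifiedCheck(bool exitIfTrue, bool isZero) {
  return exitIfTrue == isZero;
}

// Compare the two forms over all four input combinations.
inline bool formsAgree() {
  for (int e = 0; e < 2; ++e)
    for (int z = 0; z < 2; ++z)
      if (originalCheck(e != 0, z != 0) != simplifiedCheck(e != 0, z != 0))
        return false;
  return true;
}
```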
2021-10-14 14:06:52 +07:00
Max Kazantsev 46a1dd47e6 [SCEV][NFC] Reorder checks to delay call of all_of
Check lightweight getter condition before calling all_of.
2021-10-14 13:30:51 +07:00
Arthur Eubanks 60605a2b8f Set LLVM_HAS_RVALUE_REFERENCE_THIS when __GNUC__ is defined
gcc does not support __has_feature(), so this was accidentally changed
in D111581 when compiling with gcc.
2021-10-13 23:13:55 -07:00
Valentin Clement 0fbd3aad75 [fir] Remove unused variable in FIRBuilder.h
Remove unused variable that breaks -Werror on some buildbots
2021-10-14 07:11:41 +02:00
Ben Shi 7e81526126 [RISCV] Optimize immediate materialisation with BSETI/BCLRI
Optimize immediate materialisation in the following way if profitable:
1. Use BCLRI for upper 32 bits if the lower 32 bits are negative int32.
2. Use BSETI for upper 32 bits if the lower 32 bits are positive int32.
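
The two cases can be modelled by comparing a 64-bit immediate against the sign extension of its low 32 bits: every differing bit in the upper half needs one single-bit fixup. This is a rough sketch under that assumption; the helper names are hypothetical and the patch's actual profitability threshold is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low 32 bits of `imm` to 64 bits, as a 32-bit
// materialisation on RV64 would leave them in a register.
inline std::uint64_t signExtendLow32(std::uint64_t imm) {
  const std::int32_t low = static_cast<std::int32_t>(
      static_cast<std::uint32_t>(imm));
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(low));
}

// Number of single-bit fixups (BSETI or BCLRI) needed in the upper half.
inline int upperBitFixups(std::uint64_t imm) {
  std::uint64_t diff = imm ^ signExtendLow32(imm);
  int count = 0;
  for (; diff != 0; diff &= diff - 1) // clear the lowest set bit each round
    ++count;
  return count;
}

// BCLRI applies when the low half is a negative int32 (upper bits start as
// all ones); otherwise the upper bits start as zeros and BSETI applies.
inline bool usesBclri(std::uint64_t imm) {
  return (imm & 0x80000000ull) != 0;
}
```

For example, 0xFFFFFFEF80000000 needs one BCLRI after materialising the low half, and 0x0000001000000123 needs one BSETI.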

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111508
2021-10-14 04:56:47 +00:00
Kazu Hirata e567f37dab [clang] Use llvm::is_contained (NFC) 2021-10-13 20:41:55 -07:00
Abinav Puthan Purayil b3c9d84e5a [AMDGPU] Fix 24-bit mul intrinsic generation for > 32-bit result.
The 24-bit mul intrinsics yield the low-order 32 bits. We should only
do the transformation if the operands are known to be not wider than 24
bits and the result is known to be not wider than 32 bits.
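
A small model shows why the result width matters. The sketch below uses concrete values instead of the known-bits analysis a compiler would rely on, covers only the unsigned case, and uses hypothetical function names.

```cpp
#include <cassert>
#include <cstdint>

// Model of an unsigned 24-bit multiply that keeps only the low 32 bits
// of the product, as the commit message describes.
inline std::uint32_t mulU24Low32(std::uint32_t a, std::uint32_t b) {
  const std::uint64_t a24 = a & 0xFFFFFFu;
  const std::uint64_t b24 = b & 0xFFFFFFu;
  return static_cast<std::uint32_t>(a24 * b24);
}

// Hypothetical legality check mirroring the two conditions in the message:
// operands no wider than 24 bits, result no wider than 32 bits.
inline bool canUseMul24(std::uint64_t a, std::uint64_t b) {
  const bool operandsFit = a <= 0xFFFFFFu && b <= 0xFFFFFFu;
  // With both operands in 24 bits, a * b cannot overflow 64 bits.
  return operandsFit && (a * b) <= 0xFFFFFFFFull;
}
```

With both operands equal to 0xFFFFFF the full product is 0xFFFFFE000001, but the model returns only 0xFE000001, so a wider multiply must not be replaced in that case.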

Differential Revision: https://reviews.llvm.org/D111523
2021-10-14 09:00:19 +05:30
Tom Stellard 509fe20fbc docs: Document workaround for arcanist failures
Reviewed By: smeenai

Differential Revision: https://reviews.llvm.org/D110976
2021-10-14 03:25:36 +00:00
Ben Shi 481db13fec [RISCV] Optimize immediate materialisation with SLLI.UW
Use LUI+SLLI.UW to compose the upper bits instead of LUI+SLLI.
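
The difference between the two sequences can be modelled on plain integers. On RV64, LUI sign-extends its 32-bit result, while SLLI.UW zero-extends the low 32 bits of its source before shifting. The functions below are a simplified model of those instruction semantics with hypothetical names; reading this as the motivation for the patch is an interpretation, since the message does not spell it out.

```cpp
#include <cassert>
#include <cstdint>

// LUI on RV64: (imm20 << 12), sign-extended from 32 to 64 bits.
inline std::uint64_t lui(std::uint32_t imm20) {
  const std::uint32_t v = (imm20 & 0xFFFFFu) << 12;
  return static_cast<std::uint64_t>(
      static_cast<std::int64_t>(static_cast<std::int32_t>(v)));
}

// SLLI: plain 64-bit logical shift left.
inline std::uint64_t slli(std::uint64_t rs1, unsigned shamt) {
  assert(shamt < 64);
  return rs1 << shamt;
}

// SLLI.UW: zero-extend the low 32 bits of rs1, then shift left.
inline std::uint64_t slliUw(std::uint64_t rs1, unsigned shamt) {
  assert(shamt < 64);
  return (rs1 & 0xFFFFFFFFull) << shamt;
}
```

With `imm20 = 0x80000`, LUI produces 0xFFFFFFFF80000000; SLLI by 4 keeps the sign-extension bits, whereas SLLI.UW by 4 yields 0x800000000.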

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111705
2021-10-14 02:24:50 +00:00
Ben Shi c1d6ba54d3 [RISCV][test] Add more tests of immediate materialisation
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111704
2021-10-14 02:24:14 +00:00
Stella Laurenzo fe6d9937b3 [mlir] Ability to build CAPI dylibs from out of tree projects against installed LLVM.
* Incorporates a reworked version of D106419 (which I have closed but has comments on it).
* Extends the standalone example to include a minimal CAPI (for registering its dialect) and a test which, from out of tree, creates an aggregate dylib and links a little sample program against it. This will likely only work today in *static* MLIR builds (until the TypeID fiasco is finally put to bed). It should work on all platforms, though (including Windows - albeit I haven't tried this exact incarnation there).
* This is the biggest pre-requisite to being able to build out of tree MLIR Python-based projects from an installed MLIR/LLVM.
* I am rather nauseated by the CMake shenanigans I had to endure to get this working. The primary complexity, above and beyond the previous patch is because (with no reason given), it is impossible to export target properties that contain generator expressions... because, of course it isn't. In this case, the primary reason we use generator expressions on the individual embedded libraries is to support arbitrary ordering. Since that need doesn't apply to out of tree (which import everything via FindPackage at the outset), we fall back to a more imperative way of doing the same thing if we detect that the target was imported. Gross, but I don't expect it to need a lot of maintenance.
* There should be a relatively straight-forward path from here to rebase libMLIR.so on top of this facility and also make it include the CAPI.

Differential Revision: https://reviews.llvm.org/D111504
2021-10-13 18:45:55 -07:00
Lang Hames abdb82b237 [examples] Fix LLJITWithRemoteDebugging example after 4fcc0ac15e. 2021-10-13 18:19:53 -07:00
wlei 30ca33eab0 [llvm-profgen] Ignore the whole trace with the leading external branch
The first LBR entry can be an external branch, we should ignore the whole trace.

```
     7f7448e889e4 0x7f7448e889e4/0x7f7448e88826/P/-/-/1  0x7f7448e8899f/0x7f7448e889d8/P/-/-/4  ...
```

Reviewed By: wenlei, hoy

Differential Revision: https://reviews.llvm.org/D111749
2021-10-13 16:52:29 -07:00
wlei ab5d65e685 [llvm-profgen] Ignore stack samples before aggregation
With `ignore-stack-samples`, we can ignore the call stack before sample aggregation, which avoids some redundant computation.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D111577
2021-10-13 16:52:29 -07:00
Lang Hames 4fcc0ac15e [ORC] Use a Setup object for SimpleRemoteEPC construction.
SimpleRemoteEPC notionally allowed subclasses to override the
createMemoryManager and createMemoryAccess methods to use custom objects, but
could not actually be subclassed in practice (The construction process in
SimpleRemoteEPC::Create could not be re-used).

Instead of subclassing, this commit adds a SimpleRemoteEPC::Setup class that
can be used by clients to set up the memory manager and memory access members.
A default-constructed Setup object results in no change from previous behavior
(EPCGeneric* memory manager and memory access objects used by default).
2021-10-13 16:47:00 -07:00
Lang Hames 8d2736d9dd [ORC] Add a missing definition. 2021-10-13 16:47:00 -07:00
wren romano 5167c36ab4 [mlir][sparse] Misc code cleanup
Depends On D111763

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D111766
2021-10-13 16:39:29 -07:00
wren romano 63d4fc9483 [mlir][sparse] Factoring out helper functions for generating constants
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D111763
2021-10-13 16:19:55 -07:00
Nico Weber 8e184f3d2a [gn build] (manually) port 6c76d01011 2021-10-13 18:43:16 -04:00
Shoaib Meenai 6404f4b5af [InstCombine] Remove attributes after hoisting free above null check
If the parameter had been annotated as nonnull because of the null
check, we want to remove the attribute, since it may no longer apply and
could result in miscompiles if left. Similarly, we also want to remove
undef-implying attributes, since they may not apply anymore either.

Fixes PR52110.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D111515
2021-10-13 15:34:56 -07:00
Mircea Trofin 6c76d01011 [mlgo][aot] require the model is autogenerated for test determinism
The tests that exercise the 'release' mode, where the model is AOT-ed,
check the output has certain properties, to validate that, indeed, a
different policy from the default one was exercised. For determinism, we
can't reliably check that output for an arbitrary learned policy, since
it could be that policy happens to mimic the default one in that
particular case.

This patch adds a requirement that those tests run only when the model
is autogenerated (e.g. on build bots).

Differential Revision: https://reviews.llvm.org/D111747
2021-10-13 14:02:41 -07:00
Vitaly Buka 8383e49b53 [sanitizer] Cleanup benchmark 2021-10-13 13:58:28 -07:00
Philip Reames 47d10b25f8 [instcombine] PRE freeze to only potentially poison/undef operand of phi
This extends the foldOpIntoPhi code used when visiting a freeze user of a phi to allow any non-undef/poison operand as opposed to only non-undef/poison constants.  This lets us hoist a freeze in the increment of an IV into the preheader in many cases.

Differential Revision: https://reviews.llvm.org/D111744
2021-10-13 13:55:54 -07:00
Martin Storsjö 6fbc812883 [Support] [Path] Move function declarations to the right doxygen group in the header. NFC.
They were in the doxygen group Observers, while they are about
mutating paths.

Differential Revision: https://reviews.llvm.org/D111732
2021-10-13 22:55:14 +03:00
Martin Storsjö 2a4b1539e9 [Support] [Path] Use std::replace instead of an explicit comparison loop. NFC.
After 8fc7a907b9, this loop does
the same as a plain `std::replace`.
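
As an illustration of the kind of rewrite involved, the two functions below do the same work, one with an explicit loop and one with `std::replace`. The choice of characters (backslash to forward slash) and the function names are assumptions for the example, not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Explicit loop form: rewrite every backslash as a forward slash.
inline std::string replaceWithLoop(std::string path) {
  for (char &c : path)
    if (c == '\\')
      c = '/';
  return path;
}

// Same effect with the standard algorithm.
inline std::string replaceWithAlgorithm(std::string path) {
  std::replace(path.begin(), path.end(), '\\', '/');
  return path;
}
```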

Also clarify the comment about what this function does.

Differential Revision: https://reviews.llvm.org/D111730
2021-10-13 22:55:14 +03:00
Jeremy Drake d9b9a7f428 [clang][Tooling] Use Windows command lines on all Windows, except Cygwin
Previously it only used Windows command lines for MSVC triples, but this
was causing issues for windows-gnu.  In fact, everything 'native' Windows
(ie, not Cygwin) should use Windows command line parsing.

Reviewed By: mstorsjo

Differential Revision: https://reviews.llvm.org/D111195
2021-10-13 22:55:14 +03:00
Martin Storsjö a03e17d4d9 [libcxx] [test] Generalize the conditions for testing bitcasts between long double, double and int128
MSVC targets also have a 64 bit long double, as do MinGW targets on ARM.
This hasn't been noticed in CI because the MSVC configurations there run
with _LIBCPP_HAS_NO_INT128 defined.

This avoids assuming that either __int128_t or double is equal in size to
long double. i386 MinGW targets have sizeof(long double) == 10, which
doesn't match any of the tested types.

Differential Revision: https://reviews.llvm.org/D111671
2021-10-13 22:55:01 +03:00
Martin Storsjö b541845ea0 [clang] [Windows] Mark PIC as implicitly enabled for aarch64, just like for x86_64
This doesn't practically affect the code generation.

Differential Revision: https://reviews.llvm.org/D111707
2021-10-13 22:55:00 +03:00
Eric Schweitz bde89ac7f1 [fir] Add the DoLoopHelper
Add the DoLoopHelper, a set of helper functions
to create fir.do_loop operations.

This code was part of D111337 and was extracted in order to
make the patch easier to review.

This patch is part of the upstreaming effort from fir-dev branch.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D111713

Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>
2021-10-13 21:48:45 +02:00
Roman Lebedev a8a64eaafc [NFC][X86][LV] Autogenerate checklines in cost-model.ll to simplify further updates 2021-10-13 22:47:43 +03:00
Roman Lebedev cb41efb5f4 [NFC][Costmodel][X86] Fix broken `CHECK-NOT`'s in interleave costmodel tests 2021-10-13 22:44:57 +03:00