Commit Graph

378365 Commits

Author SHA1 Message Date
David Green 40f46cb0e4 [ARM] Add alignment checks for MVE VLDn
The MVE VLD2/4 and VST2/4 instructions require the pointer to be aligned
to at least the size of the element type. This adds a check for that
into the ARM lowerInterleavedStore and lowerInterleavedLoad functions,
not creating the intrinsics if they are invalid for the alignment of
the load/store.

Unfortunately this is one of those bug fixes that does affect some
useful codegen, as we were able to sometimes do some nice lowering of
q15 types. But they can cause problems with low aligned pointers.

Differential Revision: https://reviews.llvm.org/D95319
2021-01-28 13:10:08 +00:00
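
The alignment rule described in 40f46cb0e4 above can be pictured with a minimal sketch. The helper name `isLegalMVEInterleavedAccess` and its parameters are hypothetical and are not taken from the patch; the only thing it encodes is the stated requirement that the pointer be aligned to at least the element size.

```cpp
#include <cstdint>

// Hypothetical helper (not the code from the patch): an interleaved
// VLD2/4 or VST2/4 is only acceptable when the pointer alignment, in
// bytes, is at least the size of one element.
constexpr bool isLegalMVEInterleavedAccess(std::uint64_t alignBytes,
                                           unsigned elementBits) {
  return alignBytes >= elementBits / 8;
}

// 16-bit (q15) elements behind a 2-byte aligned pointer are fine.
static_assert(isLegalMVEInterleavedAccess(2, 16), "aligned q15 access");
// The same elements behind a 1-byte aligned pointer must be rejected.
static_assert(!isLegalMVEInterleavedAccess(1, 16), "under-aligned q15 access");
```

Under this rule a q15 interleaved load through a byte-aligned pointer falls back to generic lowering, which is the codegen regression the message mentions.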
Nicolas Vasilache 299cc5da6d [mlir][Linalg] Further improve codegen strategy and add a linalg.matmul_i8_i8_i32
This revision adds a layer of SFINAE to the composable codegen strategy so it does
not have to require statically defined ops but instead can also be used with OpInterfaces, Operation* and an op name string.

A linalg.matmul_i8_i8_i32 is added to the .tc spec to demonstrate how all this works end to end.

Differential Revision: https://reviews.llvm.org/D95600
2021-01-28 13:02:42 +00:00
Bradley Smith 42635856ed [AArch64][SVE] Allow accesses to SVE stack objects to use frame pointer
The layout of the stack frame for SVE means that using the frame pointer
rather than the stack pointer for an access to an SVE stack object
removes the need for an additional add to jump over the non-SVE objects.

Likewise the opposite is true for non-SVE stack objects.

This patch allows for the former to be done by having HasFP return true
in the presence of both SVE and non-SVE stack objects, and also fixes a
minor issue whereby the latter would not be done for certain offsets.
2021-01-28 12:39:57 +00:00
Simon Pilgrim 0805e40a94 AMDGPUPrintfRuntimeBinding - don't dereference a dyn_cast<> pointer. NFCI.
We dereference the dyn_cast<> in all paths - use cast<> to silence the clang static analyzer warning.
2021-01-28 12:38:44 +00:00
Shilei Tian c571b16834 [OpenMP] Disabled profiling in `libomp` by default to unblock link errors
A link error occurred when time profiling in libomp was enabled by default,
because `libomp` is assumed to be a C library, but `libLLVMSupport`, which
the profiling support depends on, is a C++ library. Currently the issue blocks
all OpenMP tests in Phabricator.

This patch adds a new CMake option `OPENMP_ENABLE_LIBOMP_PROFILING` to
enable/disable the feature. By default it is disabled. Note that once time
profiling is enabled for `libomp`, it becomes a C++ library.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95585
2021-01-28 07:24:32 -05:00
Simon Pilgrim 6663330bc8 [X86][AVX] canonicalizeLaneShuffleWithRepeatedOps - don't merge VPERMILPD ops with different low/high masks.
Unlike VPERMILPS, VPERMILPD can have non-repeating masks in each 128-bit subvector; we weren't accounting for this when folding vperm2f128(vpermilpd(x,c),vpermilpd(y,c)) -> vpermilpd(vperm2f128(x,y),c).

I'm intending to add support for this but wanted to get a minimal fix in first for merging into 12.xx.

Fixes PR48908
2021-01-28 12:11:31 +00:00
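
To see why the fold in 6663330bc8 above needs the low and high masks to match, here is a small model of the two shuffles on integer lanes. It is a sketch for illustration only: the type `V4` and the helpers `unfolded` and `folded` are made up for this example, the zeroing bits of VPERM2F128 are ignored, and none of it is code from the patch.

```cpp
// Four-element vector standing in for a 256-bit register of doubles.
struct V4 { int e[4]; };

constexpr bool same(const V4 &a, const V4 &b) {
  return a.e[0] == b.e[0] && a.e[1] == b.e[1] &&
         a.e[2] == b.e[2] && a.e[3] == b.e[3];
}

// VPERMILPD (immediate form): bit i of imm picks the low (0) or high (1)
// element of the 128-bit lane that holds result element i.
constexpr V4 vpermilpd(const V4 &s, unsigned imm) {
  return V4{{s.e[(imm >> 0) & 1], s.e[(imm >> 1) & 1],
             s.e[2 + ((imm >> 2) & 1)], s.e[2 + ((imm >> 3) & 1)]}};
}

// Lane selector: 0 = a.lo, 1 = a.hi, 2 = b.lo, 3 = b.hi.
constexpr int laneElt(const V4 &a, const V4 &b, unsigned sel, unsigned i) {
  return (sel & 2) ? b.e[(sel & 1) * 2 + i] : a.e[(sel & 1) * 2 + i];
}

// VPERM2F128: imm[1:0] selects the low result lane, imm[5:4] the high one.
constexpr V4 vperm2f128(const V4 &a, const V4 &b, unsigned imm) {
  return V4{{laneElt(a, b, imm & 3, 0), laneElt(a, b, imm & 3, 1),
             laneElt(a, b, (imm >> 4) & 3, 0),
             laneElt(a, b, (imm >> 4) & 3, 1)}};
}

constexpr V4 unfolded(const V4 &x, const V4 &y, unsigned c, unsigned p) {
  return vperm2f128(vpermilpd(x, c), vpermilpd(y, c), p);
}
constexpr V4 folded(const V4 &x, const V4 &y, unsigned c, unsigned p) {
  return vpermilpd(vperm2f128(x, y, p), c);
}

constexpr V4 kX{{0, 1, 2, 3}};
constexpr V4 kY{{4, 5, 6, 7}};

// Repeating mask (0b0101): both forms agree even when lanes are swapped.
static_assert(same(unfolded(kX, kY, 0x5, 0x01), folded(kX, kY, 0x5, 0x01)),
              "fold is safe for repeating masks");
// Non-repeating mask (0b0001): swapping lanes changes which mask applies.
static_assert(!same(unfolded(kX, kY, 0x1, 0x01), folded(kX, kY, 0x1, 0x01)),
              "fold is unsafe for non-repeating masks");
```

With the lane-swapping immediate `0x01`, the low-lane mask ends up applied to data that the original sequence had shuffled with the high-lane mask, so the two forms only agree when both masks are the same.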
Simon Pilgrim da8845fc3d [X86][AVX] Add PR48908 shuffle test case 2021-01-28 11:21:36 +00:00
Simon Pilgrim aa76cebab5 Fix "32-bit shift result used in 64-bit comparison" MSVC warning. NFCI. 2021-01-28 11:21:36 +00:00
Simon Pilgrim 0164d546d2 [Support] Add some missing namespace closure comments. NFCI.
Fixes some clang-tidy warnings.
2021-01-28 11:21:35 +00:00
Simon Pilgrim 7396f720f9 [DebugInfo] Remove some unused includes. NFCI.
Mainly removing a lot of <vector> includes from files that don't explicitly use std::vector
2021-01-28 11:21:35 +00:00
Sven van Haastregt 526c42e76c [OpenCL] Hide sampler-less read_image builtins before CL1.2
Ensure sampler-less image read functions are not available with
`-fdeclare-opencl-builtins` before OpenCL 1.2.
2021-01-28 11:14:19 +00:00
Roman Lebedev 6617529a1d [CodeGen][DwarfEHPrepare] Preserve Dominator Tree
Now that D94827 has flipped the switch, and SimplifyCFG is officially marked
as production-ready regarding Dominator Tree preservation,
we can update this user pass to also preserve Dominator Tree.

This is a geomean compile-time win of `-0.05%`..`-0.08%`.
https://llvm-compile-time-tracker.com/compare.php?from=51a25846c198cff00abad0936f975167357afa6f&to=082499aac236a5c141e50a9e77870d5be2de5f0b&stat=instructions

Differential Revision: https://reviews.llvm.org/D95548
2021-01-28 14:11:34 +03:00
Roman Lebedev 8cfa963463 [SimplifyCFG] If provided, preserve Dominator Tree
SimplifyCFG is a utility pass, and the fact that it does not
preserve DomTrees forces its users to somehow work around that,
likely by not preserving DomTrees themselves.

Indeed, the simplifycfg pass didn't know how to preserve the dominator tree.
It took me just under a month (starting with e113317958)
to rectify that; now it fully knows how to.
There are likely some problems with that still,
but I've dealt with everything I can spot so far.

I think we now can flip the switch.

Note that this is functionally an NFC change,
since this doesn't change the users to pass in the DomTree,
that is a separate question.

Reviewed By: kuhar, nikic

Differential Revision: https://reviews.llvm.org/D94827
2021-01-28 14:11:34 +03:00
Nicolas Vasilache d0c9fb1b8e [mlir][Linalg] Improve codegen strategy
This revision improves the usage of the codegen strategy by adding a few flags that
make it easier to control for the CLI.
Usage of ModuleOp is replaced by FuncOp as this created issues in multi-threaded mode.

A simple benchmarking capability is added for linalg.matmul as well as linalg.matmul_column_major.
This latter op is also added to linalg.

Now obsolete linalg integration tests that also take too long are deleted.

Correctness checks are still missing at this point.

Differential revision: https://reviews.llvm.org/D95531
2021-01-28 10:59:16 +00:00
KareemErgawy-TomTom 279e7ea63b [MLIR][LinAlg][Docs] Add missing example code and other small fixes.
Fixes a few small issues in the docs. It seems one of the examples was missing
the expected MLIR output due to a copy-paste typo.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D95599
2021-01-28 11:49:36 +01:00
David Green c1c1944e69 [ARM] Regenerate constant hoisting test. NFC 2021-01-28 10:37:16 +00:00
Tomas Matheson 9a2bbfae6c [NFC] Move scavenge-lr.mir From AArch64 to Thumb2 test directory. 2021-01-28 10:22:31 +00:00
Mirko Brkusanin 3c979ae9ec [AMDGPU][GlobalISel] Remove redundant cmp when copying constant to vcc
Differential Revision: https://reviews.llvm.org/D95540
2021-01-28 11:20:09 +01:00
Mirko Brkusanin 4b422708ba [AMDGPU][GlobalISel] Handle G_PTR_ADD when looking for constant offset
Look through G_PTRTOINT and G_PTR_ADD nodes when looking for constant
offset for buffer stores. This also helps with merging of these instructions
later on.

Differential Revision: https://reviews.llvm.org/D95242
2021-01-28 11:20:09 +01:00
Nemanja Ivanovic 54e570d94a [PowerPC] Do not emit XXSPLTI32DX for sub 64-bit constants
If the APInt returned by BuildVectorSDNode::isConstantSplat() is narrower than
64 bits, the result produced by XXSPLTI32DX is incorrect. The result returned
by the function appears to be incorrect and we'll investigate/fix it in a
follow-up commit. However, since this causes miscompiles, we must
temporarily disable emitting this instruction for such values.
2021-01-28 04:16:48 -06:00
Fraser Cormack fc2f27ccf3 [RISCV] Add support for RVV int<->fp & fp<->fp conversions
This patch adds support for the full range of vector int-to-float,
float-to-int, and float-to-float conversions on legal types.

Many conversions are supported natively in RVV so are lowered with
patterns. These include conversions between (element) types of the same
size, and those that are half/double the size of the input. When
conversions take place between types that are less than half or more
than double the size we must lower them using sequences of instructions
which go via intermediate types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D95447
2021-01-28 09:50:32 +00:00
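
The size rule stated in fc2f27ccf3 above (same width, half width, or double width converts natively; anything further apart goes through intermediate types) can be written as a one-line predicate. The name `isNativeRVVConversion` is hypothetical, and the sketch says nothing about which instruction sequences the patch actually emits for the non-native cases.

```cpp
// Hypothetical predicate restating the rule from the commit message:
// a conversion is handled by a single pattern when the destination
// element is the same size as, half of, or double the source element.
constexpr bool isNativeRVVConversion(unsigned srcBits, unsigned dstBits) {
  return dstBits == srcBits || dstBits * 2 == srcBits ||
         dstBits == srcBits * 2;
}

static_assert(isNativeRVVConversion(32, 32), "same size");
static_assert(isNativeRVVConversion(32, 64), "widening by 2x");
static_assert(isNativeRVVConversion(32, 16), "narrowing by 2x");
// An 8-bit to 64-bit conversion is more than double the size, so it
// would need a sequence that goes via intermediate types.
static_assert(!isNativeRVVConversion(8, 64), "needs intermediate types");
```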
Jan Svoboda 2393b03239 Revert "[clang][cli] Use variadic macros for parsing/generating"
This reverts commit 374862d7.

Some build bots are failing with:
clang/Driver/Options.inc(4315): warning C4003: not enough arguments for function-like macro invocation 'PARSE_OPTION_WITH_MARSHALLING'
clang/Driver/Options.inc(4315): warning C4003: not enough arguments for function-like macro invocation 'NO_PREFIX'
clang/Driver/Options.inc(4315): error C2059: syntax error: ')'
clang/Driver/Options.inc(4315): error C2143: syntax error: missing ';' before '{'
clang/Driver/Options.inc(4315): error C2059: syntax error: '='
2021-01-28 10:48:43 +01:00
Rahman Lavaee 3ca502a7d6 Use DataExtractor to decode SLEB128 in android_relas.
A simple refactoring patch which lets us use `DataExtractor::getSLEB128` rather than using a lambda function.

Differential Revision: https://reviews.llvm.org/D95158
2021-01-28 01:35:18 -08:00
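
For readers unfamiliar with the encoding mentioned in 3ca502a7d6 above, the following is a standalone sketch of SLEB128 decoding. It is not LLVM's `DataExtractor::getSLEB128` and does not reproduce the lambda that the patch removed; the `Decoded` struct and the function name `decodeSLEB128` are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Result of a decode: the value and how many bytes were consumed.
// A length of 0 signals that the input ended before the final byte.
struct Decoded {
  std::int64_t value;
  std::size_t length;
};

// Illustrative decoder: each byte carries 7 payload bits, the top bit
// marks continuation, and bit 6 of the last byte is the sign bit.
constexpr Decoded decodeSLEB128(const std::uint8_t *data, std::size_t size) {
  std::uint64_t result = 0;
  unsigned shift = 0;
  for (std::size_t i = 0; i < size; ++i) {
    const std::uint8_t byte = data[i];
    if (shift < 64)
      result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0) {
      // Sign-extend when the sign bit of the final byte is set.
      if (shift < 64 && (byte & 0x40))
        result |= ~std::uint64_t{0} << shift;
      return Decoded{static_cast<std::int64_t>(result), i + 1};
    }
  }
  return Decoded{0, 0};
}

constexpr std::uint8_t kMinus2[] = {0x7e};
constexpr std::uint8_t kMinus123456[] = {0xc0, 0xbb, 0x78};

static_assert(decodeSLEB128(kMinus2, 1).value == -2, "one-byte negative");
static_assert(decodeSLEB128(kMinus123456, 3).value == -123456,
              "three-byte negative");
```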
Jan Svoboda 374862d71c [clang][cli] Use variadic macros for parsing/generating
This patch makes all macros forwarding to `PARSE_OPTION_WITH_MARSHALLING` and `GENERATE_OPTION_WITH_MARSHALLING` variadic.

Since we will be splitting up all CompilerInvocation parts, this will allow us to avoid a lot of boilerplate code.

The local macros prefix forwarded arguments with local variables required by the main macros. The `{THIS,NO}_PREFIX` macros make it possible for forwarding macros in member functions (`parseSimpleArgs`, `generateCC1CommandLine`) to prefix keypaths with `this->`. (Some build bots seem to require that.)

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D95532
2021-01-28 10:35:02 +01:00
Tomas Matheson b9ed8ebe0e [ARM][RegisterScavenging] Don't consider LR liveout if it is not reloaded
https://bugs.llvm.org/show_bug.cgi?id=48232

When PrologEpilogInserter writes callee-saved registers to the stack, LR is not reloaded but is instead loaded directly into PC.
This was not taken into account when determining if each callee-saved register was liveout for the block.
When frame elimination inserts virtual registers, and the register scavenger tries to scavenge LR, it considers it liveout and tries to spill again.
However there is no emergency spill slot to use, and it fails with an error:

    fatal error: error in backend: Error while trying to spill LR from class GPR: Cannot scavenge register without an emergency spill slot!

This patch prevents any callee-saved registers which are not reloaded (including LR) from being marked liveout.
They are therefore available to scavenge without requiring an extra spill.
2021-01-28 09:22:55 +00:00
Tomas Matheson 01b9e613c2 [Clang][Codegen] Truncate initializers of union bitfield members
If an initial value is given for a bitfield that does not fit in the
bitfield, the value should be truncated. Constant folding for
expressions did not account for this truncation in the case of union
member functions, despite a warning being emitted. In some contexts,
evaluation of expressions was not enabled unless C++11, ROPI or RWPI
was enabled.

Differential Revision: https://reviews.llvm.org/D93101
2021-01-28 09:19:19 +00:00
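
The truncation that 01b9e613c2 above refers to can be illustrated for the unsigned case with a short sketch. The helper `truncateToBitfield` is hypothetical and is not Clang's constant-folding code; signed bit-fields would additionally need sign extension, which is left out here.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the low widthBits bits of an
// initializer, as happens when a value is stored into an unsigned
// bit-field that is too narrow for it.
constexpr std::uint64_t truncateToBitfield(std::uint64_t value,
                                           unsigned widthBits) {
  return widthBits >= 64 ? value
                         : value & ((std::uint64_t{1} << widthBits) - 1);
}

// 0x1ff does not fit in 4 bits, so only the low nibble survives.
static_assert(truncateToBitfield(0x1ff, 4) == 0xf, "truncated to 4 bits");
// A value that already fits is left unchanged.
static_assert(truncateToBitfield(5, 4) == 5, "value fits");
```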
Yang Fan fc4e8a3e8b [NFC][IR][AsmWriter] Fix Wreturn-type gcc warning
GCC warning:
```
/llvm-project/llvm/lib/IR/AsmWriter.cpp:3175:1: warning: control reaches end of non-void function [-Wreturn-type]
 3175 | }
      | ^
```
2021-01-28 16:42:30 +08:00
Yang Fan 8644eb024b [NFC][Transforms][Coroutines] Remove unused variable 2021-01-28 16:42:30 +08:00
Luo, Yuanke bf64918150 [X86][AMX] Prevent shape def being scheduled across ldtilecfg.
Differential Revision: https://reviews.llvm.org/D95582
2021-01-28 16:20:16 +08:00
Georgii Rymar 68195b15a3 [yaml2obj] - Allow empty SectionHeaderTable definitions.
Currently we don't allow the following definition:

```
Sections:
  - Type: SectionHeaderTable
  - Name: .foo
    Type: SHT_PROGBITS
```

We report an error: "SectionHeaderTable can't be empty. Use 'NoHeaders' key to drop the section header table".

It was implemented in this way earlier, when `SectionHeaderTable`
was a dedicated key outside of the `Sections` list, and we did not
allow selecting where the table is written.

Currently it makes sense to allow it, because a user might
want to place the default section header table at an arbitrary position,
e.g. before other sections. In this case it is inconvenient and error-prone
to require specifying all sections:

```
Sections:
  - Type: SectionHeaderTable
    Sections:
      - Name: .foo
      - Name: .strtab
      - Name: .shstrtab
  - Name: .foo
    Type: SHT_PROGBITS
```

This patch allows empty SectionHeaderTable definitions.

Differential revision: https://reviews.llvm.org/D95341
2021-01-28 10:51:52 +03:00
Piotr Sobczak fc8e741121 [AMDGPU] Avoid an illegal operand in si-shrink-instructions
Before the patch it was possible to trigger a constant bus
violation when folding immediates into a shrunk instruction.

The patch adds a check to enforce the legality of the new operand.

Differential Revision: https://reviews.llvm.org/D95527
2021-01-28 08:49:21 +01:00
Kazu Hirata 0da15ea581 [llvm] Use append_range (NFC) 2021-01-27 23:25:41 -08:00
Kazu Hirata f890fd5f91 [llvm] Use llvm::is_sorted (NFC) 2021-01-27 23:25:39 -08:00
Kazu Hirata f82b5a647e [DebugInfo] Forward-declare PDBFile (NFC)
NativeEnumInjectedSources.h needs PDBFile but relies on a
forward declaration of PDBFile in InjectedSourceStream.h.
This patch adds a forward declaration right in
NativeEnumInjectedSources.h.

While we are at it, this patch removes the one in
InjectedSourceStream.h, where it is unnecessary.
2021-01-27 23:25:38 -08:00
Ben Shi 50f1aa1db5 [AVR] Optimize 16-bit int shift
Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D90092
2021-01-28 15:10:11 +08:00
Hongtao Yu 7e99bddfea [CSSPGO] Support of CS profiles in extended binary format.
This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names.

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95547
2021-01-27 21:29:46 -08:00
Craig Topper 5d05cdf55c [RISCV] Copy isUnneededShiftMask from X86.
In d2927f786e, I added patterns
to remove (and X, 31) from sllw/srlw/sraw shift amounts.

There is code in SelectionDAGISel.cpp that knows to use
computeKnownBits to fill in bits of the mask that were removed
by SimplifyDemandedBits based on bits being known zero.

The non-W shift patterns use immbottomxlenset which allows the
mask to have more than log2(xlen) trailing ones, but doesn't
have a call to computeKnownBits to fill in bits of the mask that may
have been cleared by SimplifyDemandedBits.

This patch copies code from X86 to handle more than log2(xlen)
bottom bits set and uses computeKnownBits to fill in missing bits
before counting.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95422
2021-01-27 20:46:10 -08:00
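
The check described in 5d05cdf55c above can be sketched on plain integers. This is an assumption-laden model rather than the RISCV or X86 source: `andMask` stands for the constant of the `and` feeding the shift amount, and `knownZero` stands in for the known-zero bits that `computeKnownBits` would report for the other operand.

```cpp
#include <cstdint>

// Count consecutive set bits starting at bit 0.
constexpr unsigned countTrailingOnes(std::uint64_t v) {
  unsigned n = 0;
  while (v & 1) {
    ++n;
    v >>= 1;
  }
  return n;
}

// Illustrative model: the mask is unneeded if it keeps at least `width`
// low bits, either directly or once bits already known to be zero in
// the masked value are filled back into the mask.
constexpr bool isUnneededShiftMask(std::uint64_t andMask,
                                   std::uint64_t knownZero, unsigned width) {
  if (countTrailingOnes(andMask) >= width)
    return true;
  return countTrailingOnes(andMask | knownZero) >= width;
}

// A 64-bit shift only reads log2(64) = 6 bits of the amount.
static_assert(isUnneededShiftMask(63, 0, 6), "exact mask");
static_assert(isUnneededShiftMask(0xff, 0, 6), "more trailing ones than needed");
// Low two mask bits were cleared because the value is a multiple of 4.
static_assert(isUnneededShiftMask(0x3c, 0x3, 6), "known zeros fill the gap");
static_assert(!isUnneededShiftMask(0x3c, 0, 6), "mask really changes the amount");
```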
Fangrui Song b7d6324422 IntrinsicEmitter: Change IntrinsicsToAttributesMap from uint8_t[] to uint16_t[]
We need at least 252 UniqAttributes now, which will soon overflow.
Actually with downstream backends we can easily use up the last few values.
So bump to uint16_t.
2021-01-27 20:34:35 -08:00
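
The headroom argument in b7d6324422 above is simple arithmetic on the index type. The sketch below uses a hypothetical helper `remainingIds` and ignores any values the real table may reserve for special purposes.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: how many more IDs fit in IndexT once `used`
// of them are taken. Assumes used does not exceed the maximum.
template <typename IndexT>
constexpr std::size_t remainingIds(std::size_t used) {
  return std::numeric_limits<IndexT>::max() - used;
}

// With 252 entries in use, an 8-bit index has almost no room left,
// while a 16-bit index leaves tens of thousands of values.
static_assert(remainingIds<std::uint8_t>(252) == 3, "uint8_t headroom");
static_assert(remainingIds<std::uint16_t>(252) == 65283, "uint16_t headroom");
```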
Serge Pavlov 5c1cea6f40 [Support] Fix build for Haiku
This change fixes two issues with building LLVM on Haiku. The first issue is
that LLVM requires wait4(), which on Haiku is hidden behind the _BSD_SOURCE
feature flag when using the --std=c++14 flag. Additionally, the wait4()
function is only available in libbsd.so, so this is now a dependency.

The other fix is that Haiku does not have the (non-standard) rusage.maxrss
member, so by default the used memory info will be set to 0 on this platform.

Reviewed By: sepavloff

Differential Revision: https://reviews.llvm.org/D87920

Patch by Niels Sascha Reedijk.
2021-01-28 10:50:04 +07:00
Carl Ritson 2b9ed4fca6 [AMDGPU][NFC] Pre-commit test for D95509 2021-01-28 12:37:58 +09:00
Vyacheslav Zakharin 0fc90873b2 [libomptarget][NFC] Link plugins with threads support library due to std::call_once usage.
Differential Revision: https://reviews.llvm.org/D95572
2021-01-27 19:26:18 -08:00
Carl Ritson 8d8be87979 [AMDGPU][NFC] Generate llvm.amdgcn.set.inactive tests
This is a pre-commit for D95509.
2021-01-28 11:43:36 +09:00
David Blaikie dd7297e1bf DebugInfo: Fix bug in addr+offset exprloc to use DWARFv5 addrx op instead of DWARFv4 GNU extension 2021-01-27 18:39:44 -08:00
Atmn Patel 8a77056256 [OpenMP][Libomptarget] Fix conditional in CMake for remote plugin
The remote offloading plugin's CMakeLists was trying to build if its
flag was enabled even if it didn't find gRPC/protobuf. The conditional
was wrong, it's fixed by this.

Differential Revision: https://reviews.llvm.org/D95574
2021-01-27 21:28:25 -05:00
River Riddle 02bc4c95f0 [mlir][PassManager] Only reinitialize the pass manager if the context registry changes
This prevents needless reinitialization for clients that want to reuse a pass manager multiple times. A new `getRegistryHash` function is exposed by the context to give a rough indicator of when the context registry has changed.

Differential Revision: https://reviews.llvm.org/D95493
2021-01-27 17:41:51 -08:00
Sam McCall c3df9d58c7 [clangd] Parse Diagnostics block, and nest ClangTidy block under it.
(ClangTidy configuration block hasn't been in any release, so we should be OK
to move it around like this)

Differential Revision: https://reviews.llvm.org/D95362
2021-01-28 01:36:23 +01:00
Sam McCall 29472bb769 [clangd] Log warning when using legacy (theia) semantic highlighting.
The legacy protocol will be removed on trunk after the 12 branch cut,
and gone in clangd 13.

Differential Revision: https://reviews.llvm.org/D95031
2021-01-28 01:29:28 +01:00
Stanislav Mekhanoshin d91ee2f782 [AMDGPU] Do not reassign spilled registers
We cannot call LRM::unassign() if LRM::assign() was never called
before, these are symmetrical calls. There are two ways of
assigning a physical register to virtual, via LRM::assign() and
via VRM::assignVirt2Phys(). LRM::assign() will call the VRM to
assign the register and then update LiveIntervalUnion. Inline
spiller calls VRM directly and thus LiveIntervalUnion never gets
updated. A call to LRM::unassign() then asserts about inconsistent
liveness.

We have to note that not all callers of the InlineSpiller even
have LRM to pass, RegAllocPBQP does not have it, so we cannot
always pass LRM into the spiller.

The only way to get into that spiller LRE_DidCloneVirtReg() call
is from LiveRangeEdit::eliminateDeadDefs if we split an LI.

This patch refuses to reassign a LiveInterval created by a split
to work around the problem. In fact we cannot reassign a spill
anyway as all registers of the needed class are occupied and we
are spilling.

Fixes: SWDEV-267996

Differential Revision: https://reviews.llvm.org/D95489
2021-01-27 16:29:05 -08:00
Fangrui Song 6612c2bb68 [llvm-c] Move LLVMX86_AMXTypeKind & LLVMPoisonValueValueKind to the bottom to avoid value changes compared with LLVM<=11
Fixes PR48905
2021-01-27 16:28:04 -08:00
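
The reason 6612c2bb68 above moves the new enumerators to the bottom can be shown with toy enums. The names below are invented for the example and are not the actual `llvm-c` enumerators: inserting a new entry in the middle renumbers everything after it, while appending keeps existing values stable for clients built against the older header.

```cpp
// Toy stand-ins, not the real llvm-c enums.
enum OldKind { Old_Half, Old_Float, Old_Vector };

// New entry inserted in the middle: later enumerators shift by one.
enum InsertedKind { Ins_Half, Ins_Float, Ins_AMX, Ins_Vector };

// New entry appended at the bottom: existing enumerators keep their values.
enum AppendedKind { App_Half, App_Float, App_Vector, App_AMX };

static_assert(static_cast<int>(Ins_Vector) != static_cast<int>(Old_Vector),
              "inserting changes existing values");
static_assert(static_cast<int>(App_Vector) == static_cast<int>(Old_Vector),
              "appending preserves existing values");
```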
Richard Smith 727fc31a98 [cxx_status] Mark P0732R2 as only 'partial', not 'Clang 12', as some of
the changes were reverted.
2021-01-27 16:08:51 -08:00