Commit Graph

366725 Commits

Author SHA1 Message Date
Andrew Litteken 7e4c6fb854 [IRSim] Adding IR Instruction Mapper
This introduces the IRInstructionMapper, and the associated wrapper for
instructions, IRInstructionData, that maps IR level Instructions to
unsigned integers.

Mapping is done mainly by using the "isSameOperationAs" comparison
between two instructions.  If they return true, the opcode, result type,
and operand types of the instruction are used to hash the instruction
with an unsigned integer.  The mapper accepts instruction ranges, and
adds each resulting integer to a list, and each wrapped instruction to
a separate list.

At present, branches, phi nodes are not mapping and exception handling
is illegal.  Debug instructions are not considered.

The different mapping schemes are tested in
unittests/Analysis/IRSimilarityIdentifierTest.cpp

Recommit of: b04c1a9d31

Differential Revision: https://reviews.llvm.org/D86968
2020-09-17 14:06:16 -05:00
Cameron McInally a35c7f3076 [SVE][WIP] Implement lowering for fixed length VSELECT to Scalable
Map fixed length VSELECT to its Scalable equivalent.

Differential Revision: https://reviews.llvm.org/D85364
2020-09-17 14:02:57 -05:00
Reid Kleckner 1e5b7e91aa [PDB] Split TypeServerSource and extend type index map lifetime
Extending the lifetime of these type index mappings does increase memory
usage (+2% in my case), but it decouples type merging from symbol
merging. This is a pre-requisite for two changes that I have in mind:
- parallel type merging: speeds up slow type merging
- defered symbol merging: avoid heap allocating (relocating) all symbols

This eliminates CVIndexMap and moves its data into TpiSource. The maps
are also split into a SmallVector and ArrayRef component, so that the
ipiMap can alias the tpiMap for /Z7 object files, and so that both maps
can simply alias the PDB type server maps for /Zi files.

Splitting TypeServerSource establishes that all input types to be merged
can be identified with two 32-bit indices:
- The index of the TpiSource object
- The type index of the record
This is useful, because this information can be stored in a single
64-bit atomic word to enable concurrent hashtable insertion.

One last change is that now all object files with debugChunks get a
TpiSource, even if they have no type info. This avoids some null checks
and special cases.

Differential Revision: https://reviews.llvm.org/D87736
2020-09-17 11:53:10 -07:00
Amara Emerson 7d5b103483 [AArch64][GlobalISel] Widen G_EXTRACT_VECTOR_ELT element types if < 8b.
In order to not unnecessarily promote the source vector to greater than our
native vector size of 128b, I've added some cascading rules to widen based on
the number of elements.
2020-09-17 11:50:33 -07:00
Amara Emerson bea7749d03 [AArch64][GlobalISel] Make <8 x s16> and <16 x s8> legal for shifts. 2020-09-17 11:50:32 -07:00
Sanjay Patel 48a23bccf3 [VectorCombine] limit load+insert transform to one-use
As discussed in:
https://llvm.org/PR47558
...there are several potential fixes/follow-ups visible
in the test case, but this is the quickest and safest
fix of the perf regression.
2020-09-17 14:29:15 -04:00
Craig Topper 3783d3bc7b [X86] Don't match x87 register inline asm constraints unless the VT is floating point or its a clobber
The register class picked will be the RFP80 register class which has a f80 VT. The code in SelectionDAGBuilder that generates copies around inline assembly doesn't know how to handle an integer and floating point type of different bit widths.

The test case is derived from this https://godbolt.org/z/sEa659 which gcc accepts but clang crashes on. This patch just gives a more graceful error. I'm not sure if the single element struct case is special in gcc. Adding another field to the struct makes gcc reject it. If we want to support this correctly I think we need a change in the frontend to give us the true element type. Right now the frontend just realizes the constraint can take a memory argument so creates an integer type of the same size and bitcasts.

Differential Revision: https://reviews.llvm.org/D87485
2020-09-17 11:26:50 -07:00
Navdeep Kumar 0602e8f77f [MLIR][Affine] Add parametric tile size support for affine.for tiling
Add support to tile affine.for ops with parametric sizes (i.e., SSA
values). Currently supports hyper-rectangular loop nests with constant
lower bounds only. Move methods

  - moveLoopBody(*)
  - getTileableBands(*)
  - checkTilingLegality(*)
  - tilePerfectlyNested(*)
  - constructTiledIndexSetHyperRect(*)

to allow reuse with constant tile size API. Add a test pass -test-affine
-parametric-tile to test parametric tiling.

Differential Revision: https://reviews.llvm.org/D87353
2020-09-17 23:39:14 +05:30
Abhishek Varma 296e97ae8f [MLIR] Support for return values in Affine.For yield
Add support for return values in affine.for yield along the same lines
as scf.for and affine.parallel.

Signed-off-by: Abhishek Varma <abhishek.varma@polymagelabs.com>

Differential Revision: https://reviews.llvm.org/D87437
2020-09-17 23:34:59 +05:30
Yaxun (Sam) Liu 829d14ee0a Revert "[NFC] Refactor DiagnosticBuilder and PartialDiagnostic"
This reverts commit ee5519d323.
2020-09-17 13:56:09 -04:00
Yaxun (Sam) Liu 772bd8a7d9 Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device functions"
This reverts commit 7f1f89ec8d.

This reverts commit 40df06cdaf.
2020-09-17 13:55:31 -04:00
Sanjay Patel ddd9575d15 [VectorCombine] rearrange bailouts for load insert for efficiency; NFC 2020-09-17 13:50:37 -04:00
Sanjay Patel e06914b59b [VectorCombine] add test for multi-use load (PR47558); NFC 2020-09-17 13:50:37 -04:00
Jinsong Ji 50f1d4517a [PowerPC][AIX] Don't hardcode python invoke command line
We shouldn't assume python exists, we should let lit
to decide whether it is python or python3 and expand the path.
2020-09-17 17:47:41 +00:00
Adrian Prantl dd28254063 Add missing include 2020-09-17 10:46:03 -07:00
jerryyin fb18202836 [AMDGPU] Fix ROCm unit test memref initialization 2020-09-17 09:48:05 -07:00
Raul Tambre e09107ab80 [Sema] Introduce BuiltinAttr, per-declaration builtin-ness
Instead of relying on whether a certain identifier is a builtin, introduce BuiltinAttr to specify a declaration as having builtin semantics.

This fixes incompatible redeclarations of builtins, as reverting the identifier as being builtin due to one incompatible redeclaration would have broken rest of the builtin calls.
Mostly-compatible redeclarations of builtins also no longer have builtin semantics. They don't call the builtin nor inherit their attributes.
A long-standing FIXME regarding builtins inside a namespace enclosed in extern "C" not being recognized is also addressed.

Due to the more correct handling attributes for builtin functions are added in more places, resulting in more useful warnings.
Tests are updated to reflect that.

Intrinsics without an inline definition in intrin.h had `inline` and `static` removed as they had no effect and caused them to no longer be recognized as builtins otherwise.

A pthread_create() related test is XFAIL-ed, as it relied on it being recognized as a builtin based on its name.
The builtin declaration syntax is too restrictive and doesn't allow custom structs, function pointers, etc.
It seems to be the only case and fixing this would require reworking the current builtin syntax, so this seems acceptable.

Fixes PR45410.

Reviewed By: rsmith, yutsumi

Differential Revision: https://reviews.llvm.org/D77491
2020-09-17 19:28:57 +03:00
Matt Morehouse 50dd545b00 [DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87801
2020-09-17 09:23:49 -07:00
Eduardo Caldas 1e19165bd8 [SyntaxTree][Synthesis] Fix allocation in `createTree` for more general use
Prior to this change `createTree` could not create arbitrary syntax
trees. Now it dispatches to the constructor of the concrete syntax tree
according to the `NodeKind` passed as argument. This allows reuse inside
the Synthesis API.  # Please enter the commit message for your changes.
Lines starting

Differential Revision: https://reviews.llvm.org/D87820
2020-09-17 16:09:35 +00:00
Bogdan Graur 7d593d0d69 [amdgpu] Compilation fix for Release
Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D87838
2020-09-17 18:04:53 +02:00
Sanjay Patel c6ebe3fd00 [InstSimplify] add tests for FP constant miscompile; NFC (PR43907) 2020-09-17 12:04:39 -04:00
David Green 7f7993e0da [ARM] Expand distributing increments to also handle existing pre/post inc instructions.
This extends the distributing postinc code in load/store optimizer to
also handle the case where there is an existing pre/post inc instruction,
where subsequent instructions can be modified to use the adjusted
offset from the increment. This can save us having to keep the old
register live past the increment instruction.

Differential Revision: https://reviews.llvm.org/D83377
2020-09-17 16:58:35 +01:00
Amara Emerson 79b21fc187 [AArch64][GlobalISel] Fix bug in fewVectorElts action while legalizing oversize G_FPTRUNC vectors.
For <8 x s32> = fptrunc <8 x s64> the fewerElementsVector action tries to break
down the source vector into the final source vectors of <2 x s64> using unmerge.
This fixes a crash due to using the wrong number of elements for the breakdown
type.

Also add some legalizer tests for explicitly G_FPTRUNC which we didn't have.

Differential Revision: https://reviews.llvm.org/D87814
2020-09-17 08:56:26 -07:00
Hanhan Wang f16abe5f84 [mlir][Vector] Add a folder for vector.broadcast
Fold the operation if the source is a scalar constant or splat constant.

Update transform-patterns-matmul-to-vector.mlir because the broadcast ops are folded in the conversion.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D87703
2020-09-17 08:54:51 -07:00
Yaxun (Sam) Liu 7f1f89ec8d Fix build failure in clangd 2020-09-17 11:51:09 -04:00
Simon Pilgrim 2a56a0ba08 ModuloSchedule.cpp - remove unnecessary includes. NFCI.
Already included in ModuloSchedule.h
2020-09-17 16:47:48 +01:00
Matt Morehouse df017fd906 Revert "[DFSan] Add bcmp wrapper."
This reverts commit 559f919812 due to bot
failure.
2020-09-17 08:43:45 -07:00
Max Kazantsev 7688027f16 [Test] Add tests showing that IndVars cannot prove (X + 1 > X) 2020-09-17 22:37:43 +07:00
Valentin Clement f0e028f4b3 [flang][openacc] Lower clauses on loop construct to OpenACC dialect
Lower OpenACCLoopConstruct and most of the clauses to the OpenACC acc.loop operation in MLIR.
This patch refelcts what can be upstream from PR flang-compiler/f18-llvm-project#419

Reviewed By: SouraVX

Differential Revision: https://reviews.llvm.org/D87389
2020-09-17 11:34:43 -04:00
Valentin Clement 6d3cabd90e [mlir][openacc] Change operand type from index to AnyInteger in parallel op
This patch change the type of operands async, wait, numGangs, numWorkers and vectorLength from index
to AnyInteger to fit with acc.loop and the OpenACC specification.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D87712
2020-09-17 11:33:55 -04:00
David Green 72a4a478fe [ARM] Add more MVE postinc distribution tests. NFC 2020-09-17 16:33:03 +01:00
Yaxun (Sam) Liu 40df06cdaf [CUDA][HIP] Defer overloading resolution diagnostics for host device functions
In CUDA/HIP a function may become implicit host device function by
pragma or constexpr. A host device function is checked in both
host and device compilation. However it may be emitted only
on host or device side, therefore the diagnostics should be
deferred until it is known to be emitted.

Currently clang is only able to defer certain diagnostics. This causes
false alarms and limits the usefulness of host device functions.

This patch lets clang defer all overloading resolution diagnostics for host device functions.

An option -fgpu-defer-diag is added to control this behavior. By default
it is off.

It is NFC for other languages.

Differential Revision: https://reviews.llvm.org/D84364
2020-09-17 11:30:42 -04:00
Sanne Wouda d5fd3d9b90 [AArch64] Match pairwise add/fadd pattern
D75689 turns the faddp pattern into a shuffle with vector add.

Match this new pattern in target-specific DAG combine, rather than ISel,
because legalization (for v2f32) turns it into a bit of a mess.

- extended to cover f16, f32, f64 and i64
2020-09-17 16:27:01 +01:00
Sanne Wouda 3ee87a976d Precommit test updates 2020-09-17 16:27:01 +01:00
Matt Morehouse 559f919812 [DFSan] Add bcmp wrapper.
Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87801
2020-09-17 08:23:09 -07:00
Alexey Bataev d5ce8233bf [OpenMP 5.0] Fix user-defined mapper privatization in tasks
This patch fixes the problem that user-defined mapper array is not correctly privatized inside a task. This problem causes openmp/libomptarget/test/offloading/target_depend_nowait.cpp fails.

Differential Revision: https://reviews.llvm.org/D84470
2020-09-17 11:21:10 -04:00
Xun Li 5b533d6cde [Coroutine] Fix a bug where Coroutine incorrectly spills phi and invoke defs before CoroBegin
When a spill definition is before CoroBegin, we cannot spill it to the frame immediately after the definition. We have to spill it after the frame is ready.
The current implementation handles it properly for any other kinds of instructions except for PhINode and InvokeInst, which could also be defined before CoroBegin.
This patch fixes it by moving the CoroBegin dominance check earlier, so that it covers all cases.
Added a test.

Differential Revision: https://reviews.llvm.org/D87810
2020-09-17 08:13:07 -07:00
Louis Dionne a3c28ccd49 [libc++] Remove some workarounds for missing variadic templates
We don't support GCC in C++03 mode, and Clang provides variadic templates
even in C++03 mode. So there's effectively no supported compiler that
doesn't support variadic templates.

This effectively gets rid of all uses of _LIBCPP_HAS_NO_VARIADICS, but
some workarounds for the lack of variadics remain.
2020-09-17 11:05:39 -04:00
Michael Liao c3492a1aa1 [amdgpu] Lower SGPR-to-VGPR copy in the final phase of ISel.
- Need to lower COPY from SGPR to VGPR to a real instruction as the
  standard COPY is used where the source and destination are from the
  same register bank so that we potentially coalesc them together and
  save one COPY. Considering that, backend optimizations, such as CSE,
  won't handle them. However, the copy from SGPR to VGPR always needs
  materializing to a native instruction, it should be lowered into a
  real one before other backend optimizations.

Differential Revision: https://reviews.llvm.org/D87556
2020-09-17 11:04:17 -04:00
David Green 34b27b9441 [ARM] Sink splats to MVE intrinsics
The predicated MVE intrinsics are generated as, for example,
llvm.arm.mve.add.predicated(x, splat(y). p). We need to sink the splat
value back into the loop, like we do for other instructions, so we can
re-select qr variants.

Differential Revision: https://reviews.llvm.org/D87693
2020-09-17 16:00:51 +01:00
Kamil Rytarowski 7b2dd58eb0 [compiler-rt] [scudo] Fix typo in function attribute
Fixes the build after landing https://reviews.llvm.org/D87562
2020-09-17 16:57:30 +02:00
Stephan Herhut 5e0ded2689 [mlir][Standard] Canonicalize chains of tensor_cast operations
Adds a pattern that replaces a chain of two tensor_cast operations by a single tensor_cast operation if doing so will not remove constraints on the shapes.
2020-09-17 16:50:38 +02:00
Kamil Rytarowski e7de267910 [compiler-rt] [hwasan] Replace INLINE with inline
Fixes the build after landing D87562.
2020-09-17 16:46:32 +02:00
Kamil Rytarowski 72c5feeed8 [compiler-rt] [netbsd] Include <sys/dkbad.h>
Fixes build on NetBSD/sparc64.
2020-09-17 16:35:39 +02:00
alex-t 0efbb70b71 [AMDGPU] should expand ROTL i16 to shifts.
Instruction combining pass turns library rotl implementation to llvm.fshl.i16.
In the selection dag the intrinsic is turned to ISD::ROTL node that cannot be selected.
Need to expand it to shifts again.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D87618
2020-09-17 17:34:33 +03:00
Kamil Rytarowski 9339f68f21 [compiler-rt] [tsan] [netbsd] Catch unsupported LONG_JMP_SP_ENV_SLOT
Error out during build for unsupported CPU.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87602
2020-09-17 16:28:11 +02:00
Kamil Rytarowski 85e578f53a [compiler-rt] Replace INLINE with inline
This fixes the clash with BSD headers.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D87562
2020-09-17 16:24:20 +02:00
Simon Pilgrim 85ba2f1663 LiveDebugVariables.cpp - remove unnecessary Compiler.h include. NFCI.
Already included in LiveDebugVariables.h
2020-09-17 15:06:02 +01:00
Simon Pilgrim 46e59062a0 DwarfExpression.cpp - remove unnecessary includes. NFCI.
Already included in DwarfExpression.h
2020-09-17 15:06:02 +01:00
Simon Pilgrim d566771779 ValueList.cpp - remove unnecessary includes. NFCI.
Already included in ValueList.h
2020-09-17 15:06:01 +01:00