Commit Graph

340602 Commits

Author SHA1 Message Date
aartbik 459cf6e500 [mlir] [VectorOps] Lowering of vector.extract/insert_slices to LLVM IR
Summary: Uses progressive lowering to convert vector.extract_slices and vector_insert_slices to equivalent vector operations that can be subsequently lowered into LLVM.

Reviewers: nicolasvasilache, andydavis1, rriddle

Reviewed By: nicolasvasilache, rriddle

Subscribers: merge_guards_bot, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, nicolasvasilache, arpith-jacob, mgester, lucyrfox, liufengdb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72808
2020-01-27 10:35:48 -08:00
Stanislav Mekhanoshin 53eb0f8c07 [AMDGPU] Attempt to reschedule withou clustering
We want to have more load/store clustering but we also want
to maintain low register pressure which are oposit targets.
Allow scheduler to reschedule regions without mutations
applied if we hit a register limit.

Differential Revision: https://reviews.llvm.org/D73386
2020-01-27 10:27:16 -08:00
Matt Arsenault 97711228fd AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.load.format 2020-01-27 13:23:35 -05:00
Luke Drummond 482e890d1f [tablegen] Emit string literals instead of char arrays
This changes the generated (Instr|Asm|Reg|Regclass)Name tables from this
form:
    extern const char HexagonInstrNameData[] = {
      /* 0 */ 'G', '_', 'F', 'L', 'O', 'G', '1', '0', 0,
      /* 9 */ 'E', 'N', 'D', 'L', 'O', 'O', 'P', '0', 0,
      /* 18 */ 'V', '6', '_', 'v', 'd', 'd', '0', 0,
      /* 26 */ 'P', 'S', '_', 'v', 'd', 'd', '0', 0,
      [...]
    };

...to this:

    extern const char HexagonInstrNameData[] = {
      /* 0 */ "G_FLOG10\0"
      /* 9 */ "ENDLOOP0\0"
      /* 18 */ "V6_vdd0\0"
      /* 26 */ "PS_vdd0\0"
      [...]
    };

This should make debugging and exploration a lot easier for mortals,
while providing a significant compile-time reduction for common compilers.

To avoid issues with low implementation limits, this is disabled by
default for visual studio.

To force output one way or the other, pass
`--long-string-literals=<bool>` to `tablegen`

Reviewers: mstorsjo, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D73044

A variation of this patch was originally committed in ce23515f5a and
then reverted in e464b31c due to build failures.
2020-01-27 18:22:25 +00:00
Jonas Devlieghere 3ed88b052b [llvm][TextAPI/MachO] Support writing single macCatalyst platform
TAPI currently lacks a way to emit the macCatalyst platform. For TBD_V3
is does support zippered frameworks given that both macOS and
macCatalyst are part of the PlatformSet.

Differential revision: https://reviews.llvm.org/D73325
2020-01-27 10:21:06 -08:00
Matt Arsenault ce7ca2caf2 AMDGPU/GlobalISel: Select llvm.amdgcn.struct.buffer.load 2020-01-27 13:05:55 -05:00
Matt Arsenault 198624c39d AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load.format 2020-01-27 13:02:19 -05:00
Gabor Horvath c98d98ba9b [analyzer] Fix handle leak false positive when the handle dies too early
Differential Revision: https://reviews.llvm.org/D73151
2020-01-27 09:52:06 -08:00
Matt Arsenault fc90222a91 AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.load
Use intermediate instructions, unlike with buffer stores. This is
necessary because of the need to have an internal way to distinguish
between signed and unsigned extloads. This introduces some duplication
and near duplication with the buffer store selection path. The store
handling should maybe be moved into legalization to match and
eliminate the duplication.
2020-01-27 12:49:23 -05:00
Matt Arsenault e60d658260 AMDGPU/GlobalISel: Handle VOP3NoMods 2020-01-27 09:03:44 -08:00
Matt Arsenault d309b4ebe4 AMDGPU/GlobalISel: Add baseline tests for fma/fmad selection 2020-01-27 09:02:13 -08:00
Matt Arsenault 0968234590 AMDGPU/GlobalISel: Minor refactor of MUBUF complex patterns
This will make it easier to support the small variants in the complex
patterns for atomics.
2020-01-27 09:00:00 -08:00
Matt Arsenault bef27175c7 AMDGPU: Fix not using f16 fsin/fcos
I noticed this because this accidentally started working for
GlobalISel.
2020-01-27 08:59:59 -08:00
Jay Foad e37997cc0d [AMDGPU] Simplify test and extend to gfx9 and gfx10
Summary:
This is in preparation for adding more test cases for D69661 and other
bug fixes in the same area.

Reviewers: tpr, dstuttard, critson, nhaehnle, arsenm

Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D70708
2020-01-27 16:56:40 +00:00
Simon Pilgrim 2d5e281b0f [X86][AVX] Add a more aggressive SimplifyMultipleUseDemandedBits to simplify masked store masks.
Fixes a poor codegen issue noticed in PR11210.
2020-01-27 16:44:25 +00:00
Matt Arsenault a1d33ce73a AMDGPU/GlobalISel: Custom legalize v2s16 G_SHUFFLE_VECTOR
Try to keep simple v2s16 cases as-is. This will more naturally map to
how the VOP3P op_sel modifiers work compared to the expansion
involving bitcasts and bitshifts.

This could maybe try harder with wider source vector types, although
that could be handled with a pre-legalize combine.
2020-01-27 08:28:05 -08:00
Christian Sigg 97431831e5 Add pretty printers for llvm::PointerIntPair and llvm::PointerUnion.
Reviewers: aprantl, dblaikie, jdoerfert, nicolasvasilache

Reviewed By: dblaikie

Subscribers: jpienaar, dexonsmith, merge_guards_bot, llvm-commits

Tags: #llvm, #clang, #lldb, #openmp

Differential Revision: https://reviews.llvm.org/D72557
2020-01-27 17:23:59 +01:00
Nico Weber 68051c1224 Revert "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()"
This reverts commit 7a8b0b1595.
It seems to break exception handling on 32-bit Windows, see
https://crbug.com/1045650
2020-01-27 11:22:33 -05:00
Matt Arsenault 4e69df091d Revert "AMDGPU: Temporary drop s_mul_hi_i/u32 patterns"
This reverts commit fe23ed2c68.

It was never really clear this was responsible for the performance
regressions that caused this to be reverted. It's been a long time,
and we need to have scalar patterns for this to get GlobalISel
working.
2020-01-27 08:07:21 -08:00
David Goldman 60249c2c3b [clangd] Only re-open files if their flags changed
Reviewers: sammccall

Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D72647
2020-01-27 10:58:20 -05:00
Teresa Johnson 2f63d549f1 Restore "[LTO/WPD] Enable aggressive WPD under LTO option"
This restores 59733525d3 (D71913), along
with bot fix 19c76989bb.

The bot failure should be fixed by D73418, committed as
af954e441a.

I also added a fix for non-x86 bot failures by requiring x86 in new test
lld/test/ELF/lto/devirt_vcall_vis_public.ll.
2020-01-27 07:55:05 -08:00
Matt Arsenault bc3d900fa5 AMDGPU/GlobalISel: Fix not using global atomics on gfx9+
For some reason the flat/global atomics end up in the generated
matcher table in a different order from SelectionDAG. Use
AddedComplexity to prefer checking for global atomics first.
2020-01-27 07:42:42 -08:00
Whitney Tsang 2b335e9aae [LoopUnroll] Remove remapInstruction().
Summary:
LoopUnroll can reuse the RemapInstruction() in ValueMapper, or
remapInstructionsInBlocks() in CloneFunction, depending on the needs.
There is no need to have its own version in LoopUnroll.

By calling RemapInstruction() without TypeMapper or Materializer and
with Flags (RF_NoModuleLevelChanges | RF_IgnoreMissingLocals), it does
the same as remapInstruction(). remapInstructionsInBlocks() calls
RemapInstruction() exactly as described.

Looking at the history, I cannot find any obvious reason to have its own
version.
Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto,
foad, aprantl
Reviewed By: jdoerfert
Subscribers: hiraditya, zzheng, llvm-commits, prithayan, anhtuyen
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D73277
2020-01-27 15:42:13 +00:00
James Henderson c963b5fbd6 [test][llvm-dwarfdump] Add extra test case for invalid MD5 form
A subsequent patch will change how an invalid file name table is handled
to allow parsing to continue. This patch adds a test case that will
demonstrate a difference in behaviour with that change between invalid
file tables where the error is before the end of the stated prologue
length and where the error occurs after the stated length.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72157
2020-01-27 15:33:34 +00:00
James Henderson f1be770ff6 [DebugInfo] Make incorrect debug line extended opcode length non-fatal
It is possible to try to keep parsing a debug line program even when the
length of an extended opcode does not match what is expected for that
opcode. This patch changes what was previously a fatal error to be
non-fatal. The parser now continues by assuming the the claimed length
is correct, even if it means moving the offset backwards.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D72155
2020-01-27 15:32:41 +00:00
Matt Arsenault ac0b9b4ccf AMDPGPU/GlobalISel: Select more MUBUF global addressing modes
The handling of the high bits of the resource descriptor seem weird to
me, where the 3rd dword changes based on the instruction.
2020-01-27 07:28:36 -08:00
Martin Probst 02656f29ab clang-format: [JS] options for arrow functions.
Summary:
clang-format currently always wraps the body of non-empty arrow
functions:

    const x = () => {
      z();
    };

This change implements support for the `AllowShortLambdasOnASingleLine`
style options, controlling the indent style for arrow function bodies
that have one or fewer statements. SLS_All puts all on a single line,
SLS_Inline only arrow functions used in an inline position.

    const x = () => { z(); };

Multi-statement arrow functions continue to be wrapped. Function
expressions (`a = function() {}`) and function/method declarations are
unaffected as well.

Reviewers: krasimir

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73335
2020-01-27 16:27:25 +01:00
Alex Zinenko 84c3f05c8e [mlir] Harden error propagation in LLVM import
Summary:
LLVM importer to MLIR was implemented mostly as a prototype. As such, it did
not deal handle errors in a consistent way, reporting them out stderr in some
cases and continuing the execution in the error state until eventually
crashing. This is not desirable for a user-facing tool. Make sure errors are
returned from functions, consistently checked at call sites and propagated
further. Functions returning nullable IR values return nullptr to denote the
error state. Other functions return LogicalResult. LLVM importer in
mlir-translate should no longer crash on unsupported inputs.

The errors are reported without association with the source file (and therefore
cannot be checked using -verify-diagnostics). Attaching them to the actual
input file is left for future work.

Differential Revision: https://reviews.llvm.org/D72839
2020-01-27 16:15:11 +01:00
Alex Zinenko 07328944ef [mlir] LLVM import: handle constant data and array/vector aggregates
Summary:
Implement the handling of llvm::ConstantDataSequential and
llvm::ConstantAggregate for (nested) array and vector types when imporitng LLVM
IR to MLIR. In all cases, the result is a DenseElementsAttr that can be used in
either a `llvm.mlir.global` or a `llvm.mlir.constant`. Nested aggregates are
unpacked recursively until an element or a constant data is found. Nested
arrays with innermost scalar type are represented as DenseElementsAttr of
tensor type. Nested arrays with innermost vector type are represented as
DenseElementsAttr with (multidimensional) vector type.

Constant aggregates of struct type are not yet supported as the LLVM dialect
does not have a well-defined way of modeling struct-type constants.

Differential Revision: https://reviews.llvm.org/D72834
2020-01-27 16:15:11 +01:00
Matt Arsenault fdaad485e6 AMDGPU/GlobalISel: Initial selection of MUBUF addr64 load/store
Fixes the main reason for compile failures on SI, but doesn't really
try to use the addressing modes yet.
2020-01-27 07:13:56 -08:00
Simon Pilgrim d89180972b [X86][AVX] Add test case from PR11210
Shows failure to remove sign bit comparison when the result has multiple uses
2020-01-27 15:08:21 +00:00
Hans Wennborg 739b410f1f Add a warning, flags and pragmas to limit the number of pre-processor tokens in a translation unit
See
https://docs.google.com/document/d/1xMkTZMKx9llnMPgso0jrx3ankI4cv60xeZ0y4ksf4wc/preview
for background discussion.

This adds a warning, flags and pragmas to limit the number of
pre-processor tokens either at a certain point in a translation unit, or
overall.

The idea is that this would allow projects to limit the size of certain
widely included headers, or for translation units overall, as a way to
insert backstops for header bloat and prevent compile-time regressions.

Differential revision: https://reviews.llvm.org/D72703
2020-01-27 16:04:17 +01:00
Dominik Montada 9965b12fd1 Use pointer type size for offset constant when lowering load/stores 2020-01-27 06:55:32 -08:00
Matt Arsenault 2214bc81d0 AMDGPU: Allow i16 shader arguments
Not allowing this just creates unnecessary complications when writing
simple tests.
2020-01-27 06:55:32 -08:00
Jay Foad 1bf00219fc [AMDGPU] Handle multiple base operands in areMemAccessesTriviallyDisjoint
Summary:
This is in preparation for getMemOperandsWithOffset returning more base
operands.

Depends on D73455.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73456
2020-01-27 14:45:21 +00:00
Jay Foad 6461eadf8f [AMDGPU] Handle multiple base operands in shouldClusterMemOps
Summary:
This is in preparation for getMemOperandsWithOffset returning more base
operands.

Depends on D73454.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73455
2020-01-27 14:45:21 +00:00
Jay Foad fcf5254fa7 [AMDGPU] Handle frame index base operands in memOpsHaveSameBasePtr
Summary:
This is in preparation for getMemOperandsWithOffset returning more base
operands.

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, arphaman, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73454
2020-01-27 14:45:21 +00:00
vpykhtin 4332f1a4c8 [AMDGPU] Fix GCN regpressure trackers for INLINEASM instructions.
Differential revision: https://reviews.llvm.org/D73338
2020-01-27 17:25:25 +03:00
Teresa Johnson af954e441a [WPD] Emit vcall_visibility metadata for MicrosoftCXXABI
Summary:
The MicrosoftCXXABI uses a separate mechanism for emitting vtable
type metadata, and thus didn't pick up the change from D71907
to emit the vcall_visibility metadata under -fwhole-program-vtables.

I believe this is the cause of a Windows bot failure when I committed
follow on change D71913 that required a revert. The failure occurred
in a CFI test that was expecting to not abort because it expected a
devirtualization to occur, and without the necessary vcall_visibility
metadata we would not get devirtualization.

Note in the equivalent code in CodeGenModule::EmitVTableTypeMetadata
(used by the ItaniumCXXABI), we also emit the vcall_visibility metadata
when Virtual Function Elimination is enabled. Since I am not as familiar
with the details of that optimization, I have marked that as a TODO and
am only inserting under -fwhole-program-vtables.

Reviewers: evgeny777

Subscribers: Prazek, ostannard, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D73418
2020-01-27 06:22:24 -08:00
Matt Arsenault 2a160ba5b0 GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results
Only use shifts if the requested type exactly matches the source type,
and create sub-unmerges otherwise.
2020-01-27 06:18:26 -08:00
David Green 8a6b948eb5 [MVE] Fixup order of gather writeback intrinsic outputs
The MVE_VLDRWU32_qi_pre gather loads, like the other _pre/_post mve
loads returns the writeback as result 0, the value as result 1. The llvm
ir intrinsic seems to have this the other way around though, and so when
lowering from one to the other we need to switch the first two outputs.

I've also fixed up the types of _pre/_post on normal MVE loads. There we
were already getting the values the right way around, just not for the
types. I don't believe this was causing anything to go wrong, but it was
very confusing to read in the debug output.

Differential Revision: https://reviews.llvm.org/D73370
2020-01-27 14:08:06 +00:00
Matt Arsenault 06d9230fef GlobalISel: Translate vector GEPs 2020-01-27 05:35:05 -08:00
Russell Gallop 77e6bb3cba Re-land [Support] Extend TimeProfiler to support multiple threads
This makes TimeTraceProfilerInstance thread local. Added
timeTraceProfilerFinishThread() which moves the thread local instance to
a global vector of instances. timeTraceProfilerWrite() then writes
recorded data from all instances.

Threads are identified based on their thread ids. Totals are reported
with artificial thread ids higher than the real ones.

This fixes the previous version to work with __thread as well as
thread_local.

Differential Revision: https://reviews.llvm.org/D71059
2020-01-27 13:01:49 +00:00
Igor Kudrin 9a952fd462 [LLDB] Fix build failures after removing Version from DWARFExpression. 2020-01-27 19:33:34 +07:00
Igor Kudrin 8f3d47c54a [DWARF] Do not pass Version to DWARFExpression. NFCI.
The Version was used only to determine the size of an operand of
DW_OP_call_ref. The size was 4 for all versions apart from 2, but
the DW_OP_call_ref operation was introduced only in DWARF3. Thus,
the code may be simplified and using of Version may be eliminated.

Differential Revision: https://reviews.llvm.org/D73264
2020-01-27 19:08:46 +07:00
Igor Kudrin 548553eac7 [DWARF] Simplify DWARFExpression. NFC.
As DataExtractor already has a method to extract an unsigned value of
a specified size, there is no need to duplicate that.

Differential Revision: https://reviews.llvm.org/D73263
2020-01-27 19:08:46 +07:00
Krasimir Georgiev 36a8f7f6d8 [clang-format] Handle escaped " in C# string-literals
Reviewers: krasimir

Reviewed By: krasimir

Subscribers: klimek, MyDeveloperDay

Tags: #clang-format

Differential Revision: https://reviews.llvm.org/D73353
2020-01-27 12:57:20 +01:00
David Stenberg 13d4ef9ac0 Improvements to call site register worklist
Summary:
This fixes PR44118. For cases where we have a chain like this:

  R8 = R1 (entry value)
  R0 = R8
  call @foo R0

the code that emits call site entries using entry values would not
follow that chain, instead emitting a call site entry with R8 as
location rather than R0. Such a case was discovered when originally
adding dbgcall-site-orr-moves.mir. This patch fixes that issue. This is
done by changing the ForwardedRegWorklist set to a map in which the
worklist registers always map to the parameter registers that they
describe.

Another thing this patch fixes is that worklist registers now can
describe more than one parameter register at a time. Such a case
occurred in dbgcall-site-interpretation.mir, resulting in a call site
entry not being emitted for one of the parameters.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: vsk

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D73168
2020-01-27 12:41:42 +01:00
Stephen Kelly 0a57d14abf [ASTMatchers] Fix parent traversal with InitListExpr
Children of InitListExpr are traversed twice by RAV, so this code
populates a vector to represent the possibly-multiple parents (in
reality in this situation the parent is the same and is therefore
de-duplicated).
2020-01-27 11:19:59 +00:00
Sjoerd Meijer b567ff2fa0 [ARM][MVE] Tail-predication: support constant trip count
We had support for runtime trip count values, but not constants, and this adds
supports for that.

And added a minor optimisation while I was add it: don't invoke Cleanup when
there's nothing to clean up.

Differential Revision: https://reviews.llvm.org/D73198
2020-01-27 11:05:26 +00:00