Commit Graph

404829 Commits

Author SHA1 Message Date
Mehdi Amini 1585b13024 Revert "[MLIR][LLVM] Permit integer types in switch other than i32"
This reverts commit 94992670fc.
Build is broken with:

tools/mlir/include/mlir/Dialect/LLVMIR/LLVMOps.cpp.inc:23996:3: error: no matching function for call to 'printSwitchOpCases'
  printSwitchOpCases(_odsPrinter, *this, getValue().getType(), getCaseValuesAttr(), getCaseDestinations(), getCaseOperands(), getCaseOperands().getTypes());
  ^~~~~~~~~~~~~~~~~~
2021-11-16 05:59:12 +00:00
Serguei Katkov 0ecb12a27f [STATEPOINT] Force implicit-def for lr register.
STATEPOINT instruction behavior is similar to call instruction.
In aarch64 BL instruction implicitly define lr register, so
STATEPOINT instruction should do the same.
However STATEPOINT is a general pseudo instruction and I could not find
a way to override list of implicit defs for specific target.

So this patch post processes inserting STATEPOINT instruction by
adding implisit dead def for lr.

Reviewers: reames, loicottet, ostannard
Reviewed By: reames
Subscribers: danilaml, hiraditya, kristof.beyls, llvm-commits, yrouban
Differential Revision: https://reviews.llvm.org/D111114
2021-11-16 12:52:00 +07:00
William S. Moses 94992670fc [MLIR][LLVM] Permit integer types in switch other than i32
LLVM switchop currently only permits i32. Both LLVM IR and MLIR Standard switch permit other integer types leading to an illegal state when lowering an i8 switch from MLIR standard

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D113955
2021-11-16 00:46:25 -05:00
Kazu Hirata 7f00806a6a [llvm] Use make_early_inc_range (NFC) 2021-11-15 21:28:46 -08:00
LLVM GN Syncbot 582352d488 [gn build] Port dc84770d55 2021-11-16 05:11:06 +00:00
Amara Emerson dc84770d55 [GlobalISel] Add a store-merging optimization pass and enable for AArch64.
This is a first attempt at a constant value consecutive store merging pass,
a counterpart to the DAGCombiner's store merging optimization.

The high level goals of this pass:

* Have a simple and efficient algorithm. As close to linear time as we can get.
  Thus, prioritizing scalability of the algorithm over merging every corner case
  we can find. The DAGCombiner's store merging code has been the source of
  compile time and complexity issues in the past and I wanted to avoid that.
* Don't introduce any new data structures for ordering memory operations. In MIR,
  we don't have the concept of chains like we do in the DAG, and the instruction
  order is stricter than enforcing ordering with graph edges. Although I
  considered adding something similar, I couldn't justify the overhead.

The pass is current split into 3 main parts. The main store merging code focuses
on identifying candidate stores and managing the candidate group that's under
consideration for merging. Analyzing addressing of stores is a potentially
complex part and for now there's just a basic implementation to identify easy
cases. Finally, the other main bit of complexity is the alias analysis, which
tries to follow the same logic as the DAG's AA.

Currently this implementation only supports merging of constant stores. Stores
of arbitrary variables are technically possible with a very small change, but
the DAG chooses not to do this. Doing so here makes most code worse since
there's extra overhead in merging values into wider registers.

On AArch64 -Os, this optimization results in very minor savings on CTMark.

Differential Revision: https://reviews.llvm.org/D109131
2021-11-15 21:10:39 -08:00
Wenlei He f7976edc1e [llvm-profgen] Add switch to allow use of first loadable segment for calculating offset
Adding `-use-loadable-segment-as-base` to allow use of first loadable segment for calculating offset. By default first executable segment is used for calculating offset. The switch helps compatibility with unsymbolized profile generated from older tools.

Differential Revision: https://reviews.llvm.org/D113727
2021-11-15 19:00:27 -08:00
Greg Clayton dbd36e1e9f Add the stop count to "statistics dump" in each target's dictionary.
It is great to know how many times the target has stopped over its lifetime as each time the target stops, and possibly resumes without the user seeing it for things like shared library loading and signals that are not notified and auto continued, to help explain why a debug session might be slow. This is now included as "stopCount" inside each target JSON.

Differential Revision: https://reviews.llvm.org/D113810
2021-11-15 18:59:09 -08:00
Craig Topper 391b0ba603 [RISCV] Override TargetLowering::hasAndNot for Zbb.
Differential Revision: https://reviews.llvm.org/D113937
2021-11-15 18:44:07 -08:00
Craig Topper d90eeab0ed [RISCV] Add test cases to prepare for overring TargetLowering::hasAndNot. NFC
These test files are copied directly from AArch64. Some of the cases
may benefit from ANDN with the Zbb extension. Somes cases already
improve use ANDN.

selectcc-to-shiftand.ll also contains tests that test select->and
conversion even when a ANDN isn't needed. I think this improves our
coverage of these optimizations.

Differential Revision: https://reviews.llvm.org/D113935
2021-11-15 18:44:07 -08:00
Fabian Wolff b484fa8289 [X86] Fix crash with inline asm using wrong register name
Fixes PR#48678. `X86TargetLowering::getRegForInlineAsmConstraint()` can adjust the register class to match the type, e.g. change `VR128X` to `VR256X` if the type needs 256 bits. However, the function currently returns the unadjusted register and the adjusted register class, e.g. `xmm15` and `VR256X`, which then causes an assertion failure later because the register class does not contain that register. This patch fixes this behavior.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D113834
2021-11-16 10:38:12 +08:00
Matt Arsenault 659887b405 AMDGPU: Mark prolog/epilog SCC defs as dead
A future change will add SCC liveness checks. Since we are still
relying on forward register scavenging, add dead flags to avoid
spuriously detecting SCC as live.
2021-11-15 21:35:06 -05:00
Matt Arsenault e6bfbd7e0d AMDGPU: Regenerate test checks 2021-11-15 21:35:06 -05:00
Matthias Springer 17194ca96a [mlir][linalg][bufferize][NFC] Clean up tensor op bufferization
Differential Revision: https://reviews.llvm.org/D113730
2021-11-16 11:17:42 +09:00
Matheus Izvekov 21ed00bc1b
[clang] NFC: rename internal `IsPossiblyOpaquelyQualifiedType` overload
Rename `IsPossiblyOpaquelyQualifiedType` overload taking a Type*
as `IsPossiblyOpaquelyQualifiedTypeInternal` instead.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Differential Revision: https://reviews.llvm.org/D113954
2021-11-16 03:09:50 +01:00
Duncan P. N. Exon Smith d7a81359d7 DebugInfo: Make DWARFExpression::iterator::skipBytes() const, NFC
Given that DWARFExpression::iterator::skipBytes() doesn't change any
state (it returns a new `iterator`), it might as well be
`const`-qualified.
2021-11-15 17:32:25 -08:00
Duncan P. N. Exon Smith 79df41011b DebugInfo: const-qualify accessors of DWARFExpression::Operation
Add `const` to DWARFExpression::Operation's accessors and make
Operation::extract() private, since it's only used by the friend class
DWARFExpression::iterator.
2021-11-15 17:30:10 -08:00
Duncan P. N. Exon Smith b23ba295bd DebugInfo: Make DWARFExpression::iterator::operator++ return itself
Looks like an accident that `operator++` was returning `Operator&`
instead of `iterator&`. Update to match standard iterator behaviour.
2021-11-15 17:29:00 -08:00
Craig Topper 233def40f7 [DAGCombiner] Prevent unfoldMaskedMerge from creating an AND with two inverted inputs.
It's possible that the mask is already a NOT. At least if InstCombine
hasn't canonicalized the input. In that case we will form an ANDN with
X instead of with Y. So we don't need to worry about Y being a constant.

We might need to check that X isn't a constant instead, but we don't
have a test case for that yet.

This fixes a size regression found when trying to enable this combine
for RISCV in D113937.

Differential Revision: https://reviews.llvm.org/D113948
2021-11-15 17:15:51 -08:00
ZijunZhao d2b43605c9 add tsan shared lib
Change-Id: Ic83ff1ec86d6a7d61b07fa3df7e0cb2790b5ebc7
2021-11-16 00:42:30 +00:00
Zequan Wu b6d3535230 [LLDB][NativePDB] Fix local-variables.cpp failure on windows bots 2021-11-15 16:14:55 -08:00
Geoffrey Martin-Noble d4238fbf6a [Bazel] Enable layering_check for MLIR build
This feature checks that headers included by a file are provided by a
header exported by one of the direct dependencies of the build rule in
which it is contained. It ensures that appropriate layering (a goal of
the LLVM project) is preserved. So far, I'm only adding this to MLIR
because we've had it turned on internally since the beginning, so MLIR
is already layering clean. It would be nice to also enable it for LLVM,
but that requires some additional cleanup.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D113952
2021-11-15 15:53:22 -08:00
Mehrnoosh Heidarpour 62c51a72f9 [InstSimplify] Fold A|B | (A^B) --> A|B
This patch adds the following fold opportunity:
A|B | (A^B) --> A|B

that is reported here : https://bugs.llvm.org/show_bug.cgi?id=52479

https://alive2.llvm.org/ce/z/33-My-

Test cases with base results are added in D113860

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D113861
2021-11-15 18:55:04 -05:00
Ben Shi 4c3d916c4b [RISCV] Optimize immediate materialisation with SH*ADD
Use LUI+SH*ADD+ADDI to compose specific immediates.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D113568
2021-11-15 23:34:28 +00:00
Ben Shi 39256ed58c [RISCV][test] Add more tests of immediate materialisation
Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D113567
2021-11-15 23:34:27 +00:00
Aart Bik f66e5769d4 [mlir][sparse] first version of "truly" dynamic sparse tensors as outputs of kernels
This revision contains all "sparsification" ops and rewriting necessary to support sparse output tensors when the kernel has no reduction (viz. insertions occur in lexicographic order and are "injective"). This will be later generalized to allow reductions too. Also, this first revision only supports sparse 1-d tensors (viz. vectors) as output in the runtime support library. This will be generalized to n-d tensors shortly. But this way, the revision is kept to a manageable size.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D113705
2021-11-15 15:33:32 -08:00
natashaknk 381677dfbf [tosa][mlir] Refactor tosa.reshape lowering to linalg for dynamic cases.
Split tosa.reshape into three individual lowerings: collapse, expand and a
combination of both. Add simple dynamic shape support.

Reviewed By: rsuderman

Differential Revision: https://reviews.llvm.org/D113936
2021-11-15 15:31:37 -08:00
Stanislav Mekhanoshin 833cdb0a07 Revert "[InstSimplify] Fold A|B | (A^B) --> A|B"
This reverts commit 193c40e966.
2021-11-15 14:56:20 -08:00
James King 9809c6c61c Add `isInitCapture` and `forEachLambdaCapture` matchers.
This contributes follow-up work from https://reviews.llvm.org/D112491, which
allows for increased control over the matching of lambda captures. This also
updates the documentation for the `lambdaCapture` matcher.

Reviewed By: ymandel, aaron.ballman

Differential Revision: https://reviews.llvm.org/D113575
2021-11-15 22:55:28 +00:00
Arthur Eubanks 0b5051cede [llvm-reduce] Don't reuse SmallVector across calls to getAllMetadata()
The SmallVector is not cleared in calls to getAllMetadata().

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D113808
2021-11-15 14:53:48 -08:00
not-jenni cdb0623ad8 [mlir][tosa] Add tosa.mul by one canonicalization
Multiply by one can be removed during canonicalization. This optimizes away unneeded operations.

Differential Revision: https://reviews.llvm.org/D113807
2021-11-15 14:52:16 -08:00
Arthur Eubanks 19867de9e7 [NewPM] Only invalidate modified functions' analyses in CGSCC passes + turn on eagerly invalidate analyses
Previously, any change in any function in an SCC would cause all
analyses for all functions in the SCC to be invalidated. With this
change, we now manually invalidate analyses for functions we modify,
then let the pass manager know that all function analyses should be
preserved since we've already handled function analysis invalidation.

So far this only touches the inliner, argpromotion, function-attrs, and
updateCGAndAnalysisManager(), since they are the most used.

This is part of an effort to investigate running the function
simplification pipeline less on functions we visit multiple times in the
inliner pipeline.

However, this causes major memory regressions especially on larger IR.
To counteract this, turn on the option to eagerly invalidate function
analyses. This invalidates analyses on functions immediately after
they're processed in a module or scc to function adaptor for specific
parts of the pipeline.

Within an SCC, if a pass only modifies one function, other functions in
the SCC do not have their analyses invalidated, so in later function
passes in the SCC pass manager the analyses may still be cached. It is
only after the function passes that the eager invalidation takes effect.
For the default pipelines this makes sense because the inliner pipeline
runs the function simplification pipeline after all other SCC passes
(except CoroSplit which doesn't request any analyses).

Overall this has mostly positive effects on compile time and positive effects on memory usage.
https://llvm-compile-time-tracker.com/compare.php?from=7f627596977624730f9298a1b69883af1555765e&to=39e824e0d3ca8a517502f13032dfa67304841c90&stat=instructions
https://llvm-compile-time-tracker.com/compare.php?from=7f627596977624730f9298a1b69883af1555765e&to=39e824e0d3ca8a517502f13032dfa67304841c90&stat=max-rss

D113196 shows that we slightly regressed compile times in exchange for
some memory improvements when turning on eager invalidation.  D100917
shows that we slightly improved compile times in exchange for major
memory regressions in some cases when invalidating less in SCC passes.
Turning these on at the same time keeps the memory improvements while
keeping compile times neutral/slightly positive.

Reviewed By: asbirlea, nikic

Differential Revision: https://reviews.llvm.org/D113304
2021-11-15 14:44:53 -08:00
Matheus Izvekov c9e46219f3
[clang] retain type sugar in auto / template argument deduction
This implements the following changes:
* AutoType retains sugared deduced-as-type.
* Template argument deduction machinery analyses the sugared type all the way
down. It would previously lose the sugar on first recursion.
* Undeduced AutoType will be properly canonicalized, including the constraint
template arguments.
* Remove the decltype node created from the decltype(auto) deduction.

As a result, we start seeing sugared types in a lot more test cases,
including some which showed very unfriendly `type-parameter-*-*` types.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Reviewed By: rsmith, #libc, ldionne

Differential Revision: https://reviews.llvm.org/D110216
2021-11-15 23:07:45 +01:00
Steven Wu fcd07f8107 [JITLink] Fix splitBlock if there are symbols span across the boundary
Fix `splitBlock` so that it can handle the case when the block being
split has symbols span across the split boundary. This is an error
case in general but for EHFrame splitting on macho platforms, there is an
anonymous symbol that marks the entire block. Current implementation
will leave a symbol that is out of bound of the underlying block. Fix
the problem by dropping such symbols when the block is split.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D113912
2021-11-15 13:55:21 -08:00
Stanislav Mekhanoshin 193c40e966 [InstSimplify] Fold A|B | (A^B) --> A|B
This patch adds the following fold opportunity:
A|B | (A^B) --> A|B

that is reported here : https://bugs.llvm.org/show_bug.cgi?id=52479

https://alive2.llvm.org/ce/z/33-My-

Test cases with base results are added in D113860

(authored by MehrHeidar, committed by rampitec).

Differential Revision:  https://reviews.llvm.org/D113861
2021-11-15 13:49:20 -08:00
Jonas Paulsson 1c3ef9ef4a [SystemZ] Support symbolic displacements.
This patch adds support for symbolic displacements, e.g. like 'lg %r0,
sym(%r1)', which is done using relocations. This is needed to compile the
kernel without disabling the integrated assembler.

Review: Ulrich Weigand

Differential Revision: https://reviews.llvm.org/D113341
2021-11-15 16:46:31 -05:00
Nicolas Vasilache 0b17336f79 [mlir][Vector] Make vector.shape_cast based size-1 foldings opt-in and separate.
This is in prevision of dropping them altogether and using insert/extract based patterns.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D113928
2021-11-15 21:17:57 +00:00
Mircea Trofin 19e6b730ce [NFC][Regalloc] Factor types that would be used by the eviction advisor
This is in prepartion of pulling the eviction decision-making into an
analysis pass, which would then allow swapping that decision making
process.

RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153639.html

Differential Revision: https://reviews.llvm.org/D113929
2021-11-15 13:15:14 -08:00
Fangrui Song fee52fe0ad [X86] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off build. NFC 2021-11-15 13:10:47 -08:00
Krasimir Georgiev 3d1d8c767b [llvm] adapt DWARFExpression.h for 6b9b86db9d
No functional changes intended.

Updated the iterator class for 6b9b86db9d.

Differential Revision: https://reviews.llvm.org/D113934
2021-11-15 22:08:39 +01:00
Philip Reames 8f95e915cd [unroll-runtime] Relax two profitability limitations on multi-exit unrolling
This change is mostly about getting rid of some "uninteresting" cases in a follow on deeper heuristic change.  If anyone sees actually interesting code differences out of this, please let me know.  I'm not expecting this to have much impact at all.

Case 1 - With the single deoptimize non-latch exit, we can't have two exiting blocks sharing an exit block.  We can only hit this with a poorly documented debug flag.

Case 2 - Why should we treat epilog cases differently from prolog cases?  Or to say it differently, why should starting with a constant control whether a multiple exit loop gets unrolled?

Sorry for the lack of tests here.  These are both *exceedingly* narrow cases in practice, and after a while trying, I couldn't come up with a test which did anything "useful" as opposed to simply exercise a random combination of force flags.  Note that the legality cases for each are already exercised with force flags.
2021-11-15 13:00:14 -08:00
Snehasish Kumar bee8e203c6 [InstrProf][NFC] Fix a few typos in code comments. 2021-11-15 12:55:25 -08:00
Nicolas Vasilache b828506eca [mlir][Linalg] Add a DownscaleDepthwiseConv2DNhwcHwcOp decomposition pattern.
Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D113907
2021-11-15 20:48:16 +00:00
Nico Weber b4e50e5228 [asm] Make EmitMSInlineAsmStr and EmitGCCInlineAsmStr more alike
https://reviews.llvm.org/D71677 copied a bunch of code from
EmitGCCInlineAsmStr() to EmitMSInlineAsmStr() but made a few small
(likely unintentional) changes. This makes these pieces look the same.

No behavior change.

(Why are these functions two copies? No great reason as far as I can tell.
https://reviews.llvm.org/rG1778831a3d1d24ab6545635f63da4d9c5f8f0ac7 did the
split; we might want to undo them at some point. But PR23933 suggests
that a bigger change is planned for this file in the future, so keeping
this incremental for now.)

Differential Revision: https://reviews.llvm.org/D113924
2021-11-15 15:43:01 -05:00
Nico Weber 0be836b7dd [asm] Convert AsmPrinter::PrintSpecial() to StringRef
No behavior change.

Differential Revision: https://reviews.llvm.org/D113911
2021-11-15 15:38:27 -05:00
Nico Weber 833393e021 [asm] Correctly handle special names in variants
There's really no reason why anyone should use these special names in a variant.
I noticed this while reading the code: all other writes to OS are guarded by
this conditional, and the behavior with the check seems more correct, so
let's add the check.

Differential Revision: https://reviews.llvm.org/D113909
2021-11-15 15:37:09 -05:00
Lei Huang f50c6c1718 [PowerPC] Fix 32bit vector insert instructions for ISA3.1
The platform independent ISD::INSERT_VECTOR_ELT take a element index,
but vins* instructions take a byte index. Update 32bit td patterns for
vector insert to handle the element index accordingly.

Since vector insert for non constant index are supported in
ISA3.1, there is no need to use platform specific ISD node,
PPCISD::VECINSERT.  Update td pattern to directly use
ISD::INSERT_VECTOR_ELT instead.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D113802
2021-11-15 14:36:39 -06:00
Quinn Pham 1ca00ecfb8 [NFC][lld] Inclusive language: change master file to merged file
[NFC] As part of using inclusive language within the llvm project, this patch
replaces master with merged in these comments.

Reviewed By: #lld-macho, thakis

Differential Revision: https://reviews.llvm.org/D113903
2021-11-15 14:32:09 -06:00
Zequan Wu f17404a733 [LLDB][NativePDB] Fix image lookup by address
`image lookup -a ` doesn't work because the compilands list is always empty.
Create CU at given index if doesn't exit.

Differential Revision: https://reviews.llvm.org/D113821
2021-11-15 12:30:23 -08:00
Zahira Ammarguellat 95edd7f53e Making the code compliant to the documentation about Floating Point
support default values for C/C++. FPP-MODEL=PRECISE enables FFP-CONTRACT
(FMA is enabled).
2021-11-15 15:30:10 -05:00