Commit Graph

30469 Commits

Author SHA1 Message Date
dfukalov d066079728 [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.
Main reason is preparation to transform AliasResult to class that contains
offset for PartialAlias case.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98027
2021-04-09 12:54:22 +03:00
Sebastian Neubauer ba217b4655 [RegisterScavenging] Add asserts for better errors
These cases were failing before, but with cryptic asserts.
Add asserts in the RegScavenger that fail earlier with better
messages. NFC

Differential Revision: https://reviews.llvm.org/D100109
2021-04-09 11:23:36 +02:00
Serguei Katkov f6e3b4fe58 [GreedyRA ORE] Re-factor computeNumberOfSplillsReloads.
Replace if-else to if-continue usage.
This simplifies further extension of the collected stats.
2021-04-09 12:44:11 +07:00
Stephen Tozer 1b589172bd Revert "[DebugInfo] Correctly track SDNode dependencies for list debug values"
Reverted due to failure on the sanitizer-x86_64-linux-fast bot.

This reverts commit e10493eb50.
2021-04-08 17:55:45 +01:00
Stephen Tozer e10493eb50 [DebugInfo] Correctly track SDNode dependencies for list debug values
During SelectionDAG, we must track the SDNodes that each SDDbgValue depends on
to compute its value. These are ultimately derived from the location operands to
the SDDbgValue, but were stored in a separate vector prior to this patch. This
resulted in cases where one of the lists was updated incorrectly, resulting in
crashes during compilation. This patch fixes the issue by directly recomputing
the dependency list from the SDDbgOperands in getDependencies().

Differential Revision: https://reviews.llvm.org/D99423
2021-04-08 17:01:45 +01:00
Serguei Katkov a0e8738d45 [GreedyRA ORE] Add function level spill/reloads stats
Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames, thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100014
2021-04-08 16:55:52 +07:00
Serguei Katkov 6b64c662c7 [GreedyRA ORE] Extract computeNumberOfSplillsReloads to use in different places. NFC.
Extract one basic block handling to introduce stat computation for method scope.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames, thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100013
2021-04-08 14:40:45 +07:00
Serguei Katkov df25787797 [GreedyRA ORE] Extract stats in RAGreedyStats struct. NFC.
Combine all collected stats into separate struct RAGreedyStats
with add and report methods.

The motivation is to extend the number of statistics to capture and instead of
adding new parameters, just combine all of them into one structure.
Additionally I plan to use report from different places in future to report data
for function as well.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: thegameg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100012
2021-04-08 14:27:37 +07:00
Serguei Katkov 0a1c6637a1 [GreedyRA ORE] Compute ORE stats if extra analysis is enabled
To save compile time, avoid computation of stats if ORE will not emit it.
The motivation is to add more stats and compute it only if it will dumped.

Reviewers: reames, MatzeB, anemet, thegameg
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D100010
2021-04-08 14:24:18 +07:00
Esme-Yi 0c36da722a [Debug-Info] Use inlined strings in .dwinfo section by default for DBX.
Summary: Set the default DwarfInlinedStrings as inlined strings for DBX, due to DBX does not support .dwstr section for now.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D99933
2021-04-08 07:20:22 +00:00
Hongtao Yu 2a2720a2de [CSSPGO] Move pseudo probes to the beginning of a block to unblock SelectionDAG combine.
Pseudo probes, when scattered in a block, can be chained dependencies of other regular DAG nodes and block DAG combine optimizations. To fix this, scattered probes in a block are grouped and placed at the beginning of the block. This shouldn't affect the profile quality.

Test Plan:

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D100002
2021-04-07 22:45:35 -07:00
Arthur Eubanks 90af134473 Revert "[AsmPrinter] Delete dead takeDeletedSymbsForFunction()"
This reverts commit 9583a3f262.

This wasn't NFC as initially thought. Needed for D99707.
2021-04-07 11:40:44 -07:00
Craig Topper 67953311e2 [SelectionDAG] Teach SelectionDAG::FoldConstantArithmetic to handle SPLAT_VECTOR
This allows FoldConstantArithmetic to handle SPLAT_VECTOR in
addition to BUILD_VECTOR. This allows it to support scalable
vectors. I'm also allowing fixed length SPLAT_VECTOR which is
used by some targets, but I'm not familiar enough to write tests
for those targets.

I had to block this function from running on CONCAT_VECTORS to
avoid calling getNode for a CONCAT_VECTORS of 2 scalars.
This can happen because the 2 operand getNode calls this
function for any opcode. Previously we were protected because
CONCAT_VECTORs of BUILD_VECTOR is folded to a larger BUILD_VECTOR
before that call. But it's not always possible to fold a CONCAT_VECTORS
of SPLAT_VECTORs, and we don't even try.

This fixes PR49781 where DAG combine thought constant folding
should be possible, but FoldConstantArithmetic couldn't do it.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D99682
2021-04-07 10:03:33 -07:00
Yevgeny Rouban 3e738afae4 [Statepoint Lowering] Allow other than N byte sized types in deopt bundle
I do not see any bit-width restriction from the point of the
LLVM Lang Ref - Operand Bundles on the types of the deopt bundle
operands. Statepoint Lowering seems to be able to work with any
types.
This patch relaxes the two related assertions and adds a new test
for this change.

Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D100006
2021-04-07 17:48:31 +07:00
Nicolás Alvarez a1aada75f5 [docs] Fix doxygen comments wrongly attached to the llvm namespace
Looking at the Doxygen-generated documentation for the llvm namespace
currently shows all sorts of random comments from different parts of the
codebase. These are mostly caused by:

- File doc comments that aren't marked with \file, so they're attached to
  the next declaration, which is usually "namespace llvm {".
- Class doc comments placed before the namespace rather than before the
  class.
- Code comments before the namespace that (in my opinion) shouldn't be
  extracted by doxygen at all.

This commit fixes these comments. The generated doxygen documentation now
has proper docs for several classes and files, and the docs for the llvm
and llvm::detail namespaces are now empty.

Reviewed By: thakis, mizvekov

Differential Revision: https://reviews.llvm.org/D96736
2021-04-07 01:20:18 +02:00
Philip Reames 908215b346 Use AssumeInst in a few more places [nfc]
Follow up to a6d2a8d6f5.  These were found by simply grepping for "::assume", and are the subset of that result which looked cleaner to me using the isa/dyn_cast patterns.
2021-04-06 13:18:53 -07:00
Philip Reames fb41cae039 More precisely type code used for gc.relocate assertions [nfc] 2021-04-06 11:27:36 -07:00
Abhina Sreeskantharajan 82b3e28e83 [SystemZ][z/OS][Windows] Add new OF_TextWithCRLF flag and use this flag instead of OF_Text
Problem:
On SystemZ we need to open text files in text mode. On Windows, files opened in text mode adds a CRLF '\r\n' which may not be desirable.

Solution:
This patch adds two new flags

  - OF_CRLF which indicates that CRLF translation is used.
  - OF_TextWithCRLF = OF_Text | OF_CRLF indicates that the file is text and uses CRLF translation.

Developers should now use either the OF_Text or OF_TextWithCRLF for text files and OF_None for binary files. If the developer doesn't want carriage returns on Windows, they should use OF_Text, if they do want carriage returns on Windows, they should use OF_TextWithCRLF.

So this is the behaviour per platform with my patch:

z/OS:
OF_None: open in binary mode
OF_Text : open in text mode
OF_TextWithCRLF: open in text mode

Windows:
OF_None: open file with no carriage return
OF_Text: open file with no carriage return
OF_TextWithCRLF: open file with carriage return

The Major change is in llvm/lib/Support/Windows/Path.inc to only set text mode if the OF_CRLF is set.
```
  if (Flags & OF_CRLF)
    CrtOpenFlags |= _O_TEXT;
```

These following files are the ones that still use OF_Text which I left unchanged. I modified all these except raw_ostream.cpp in recent patches so I know these were previously in Binary mode on Windows.
./llvm/lib/Support/raw_ostream.cpp
./llvm/lib/TableGen/Main.cpp
./llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
./llvm/unittests/Support/Path.cpp
./clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
./clang/lib/Frontend/CompilerInstance.cpp
./clang/lib/Driver/Driver.cpp
./clang/lib/Driver/ToolChains/Clang.cpp

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D99426
2021-04-06 07:23:31 -04:00
Simon Pilgrim ddbb58736a [KnownBits] Rename KnownBits::computeForMul to KnownBits::mul. NFCI.
As promised in D98866
2021-04-06 10:11:41 +01:00
Serguei Katkov 0057ec8034 [Statepoint] Factor-out utility function to get non-foldable area of STATEPOINT like instructions. NFC
Reviewers: reames, dantrushin
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D99875
2021-04-06 11:44:37 +07:00
Stanislav Mekhanoshin 30b3aab329 Copy syncscope when expanding atomicrmw into cmpxchg loop
Fixes: SWDEV-280070

Differential Revision: https://reviews.llvm.org/D99902
2021-04-05 17:29:38 -07:00
Nikita Popov 665065821e [FastISel] Remove kill tracking
This is a followup to D98145: As far as I know, tracking of kill
flags in FastISel is just a compile-time optimization. However,
I'm not actually seeing any compile-time regression when removing
the tracking. This probably used to be more important in the past,
before FastRA was switched to allocate instructions in reverse
order, which means that it discovers kills as a matter of course.

As such, the kill tracking doesn't really seem to serve a purpose
anymore, and just adds additional complexity and potential for
errors. This patch removes it entirely. The primary changes are
dropping the hasTrivialKill() method and removing the kill
arguments from the emitFast methods. The rest is mechanical fixup.

Differential Revision: https://reviews.llvm.org/D98294
2021-04-03 15:50:13 +02:00
Simon Pilgrim 4ea5475a3f [KnownBits] Add KnownBits::haveNoCommonBitsSet helper. NFCI.
Include exhaustive test coverage.
2021-04-02 21:44:33 +01:00
Jun Ma 274ac9d40e [AArch64][SVE] Lowering sve.dot to DOT node
Differential Revision: https://reviews.llvm.org/D99699
2021-04-02 20:05:17 +08:00
Sander de Smalen 0f7bbbc481 Always emit error for wrong interfaces to scalable vectors, unless cmdline flag is passed.
In order to bring up scalable vector support in LLVM incrementally,
we introduced behaviour to emit a warning, instead of an error, when
asking the wrong question of a scalable vector, like asking for the
fixed number of elements.

This patch puts that behaviour under a flag. The default behaviour is
that the compiler will always error, which means that all LLVM unit
tests and regression tests will now fail when a code-path is taken that
still uses the wrong interface.

The behaviour to demote an error to a warning can be individually enabled
for tools that want to support experimental use of scalable vectors.
This patch enables that behaviour when driving compilation from Clang.
This means that for users who want to try out scalable-vector support,
fixed-width codegen support, or build user-code with scalable vector
intrinsics, Clang will not crash and burn when the compiler encounters
such a case.

This allows us to do away with the following pattern in many of the SVE tests:
  RUN: .... 2>%t
  RUN: cat %t | FileCheck --check-prefix=WARN
  WARN-NOT: warning: ...

The behaviour to emit warnings is only temporary and we expect this flag
to be removed in the future when scalable vector support is more stable.

This patch also has fixes the following tests:
 unittests:
   ScalableVectorMVTsTest.SizeQueries
   SelectionDAGAddressAnalysisTest.unknownSizeFrameObjects
   AArch64SelectionDAGTest.computeKnownBitsSVE_ZERO_EXTEND_VECTOR_INREG

 regression tests:
   Transforms/InstCombine/vscale_gep.ll

Reviewed By: paulwalker-arm, ctetreau

Differential Revision: https://reviews.llvm.org/D98856
2021-04-02 10:55:22 +01:00
Mircea Trofin ce61def529 [regalloc] Ensure Query::collectInterferringVregs is called before interval iteration
The main part of the patch is the change in RegAllocGreedy.cpp: Q.collectInterferringVregs()
needs to be called before iterating the interfering live ranges.

The rest of the patch offers support that is the case: instead of  clearing the query's
InterferingVRegs field, we invalidate it. The clearing happens when the live reg matrix
is invalidated (existing triggering mechanism).

Without the change in RegAllocGreedy.cpp, the compiler ices.

This patch should make it more easily discoverable by developers that
collectInterferringVregs needs to be called before iterating.

I will follow up with a subsequent patch to improve the usability and maintainability of Query.

Differential Revision: https://reviews.llvm.org/D98232
2021-04-01 08:33:28 -07:00
Simon Pilgrim 77d625f8d8 [DAG] MergeInnerShuffle with BinOps - sometimes accept undef mask elements
If the inner shuffle already contains undef elements, then accept them in the merged shuffle as well.

This helps some X86 HADD/SUB patterns where slow targets were ending up with HADD/SUB because the (un)merged shuffles were stuck either side of the ADD/SUB - meaning we ended up with a total cost much higher than the "2*shuffle+add" that a slow target usually expands a HADD/SUB to.
2021-04-01 14:33:00 +01:00
Simonas Kazlauskas 777a58e05b Support {S,U}REMEqFold before legalization
This allows these optimisations to apply to e.g. `urem i16` directly
before `urem` is promoted to i32 on architectures where i16 operations
are not intrinsically legal (such as on Aarch64). The legalization then
later can happen more directly and generated code gets a chance to avoid
wasting time on computing results in types wider than necessary, in the end.

Seems like mostly an improvement in terms of results at least as far as x86_64 and aarch64 are concerned, with a few regressions here and there. It also helps in preventing regressions in changes like {D87976}.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88785
2021-04-01 01:35:41 +03:00
Craig Topper 9e00b6660d [SelectionDAG] Remove unneeded vector resize from the end of FoldConstantArithmetic. NFC
There's an assert right before that makes sure the size already matches.
Earlier in this function's life, scalars and vectors shared more
code.
2021-03-31 12:33:10 -07:00
Yang Fan 0d7fd9f0d0
[GlobalISel] Fix Wint-in-bool-context warning (NFC)
GCC warning:
```
/llvm-project/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp: In member function ‘bool llvm::CombinerHelper::matchFunnelShiftToRotate(llvm::MachineInstr&)’:
/llvm-project/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp:3882:35: warning: ?: using integer constants in boolean context, the expression will always evaluate to ‘true’ [-Wint-in-bool-context]
 3882 |       Opc == TargetOpcode::G_FSHL ? TargetOpcode::G_ROTL : TargetOpcode::G_ROTR;
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
2021-03-31 09:59:43 +08:00
Amara Emerson a35c2c7942 [GlobalISel] Implement fewerElements legalization for vector reductions.
This patch adds 3 methods, one for power-of-2 vectors which use tree
reductions using vector ops, before a final reduction op. For non-pow-2
types it generates multiple narrow reductions and combines the values with
scalar ops.

Differential Revision: https://reviews.llvm.org/D97163
2021-03-30 11:19:21 -07:00
Amara Emerson 91887cd4ec [AArch64][GlobalISel] Combine funnel shifts to rotates.
Differential Revision: https://reviews.llvm.org/D99388
2021-03-30 11:00:36 -07:00
Sourabh Singh Tomar f13f050551 [DebugInfo] Support for signed constants inside DIExpression
Negative numbers are represented using DW_OP_consts along with signed representation
of the number as the argument.

Test case IR is generated using Fortran front-end.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D99273
2021-03-30 23:20:38 +05:30
Jessica Paquette 700431128e [GlobalISel][AArch64] Combine G_SEXT_INREG + right shift -> G_SBFX
Basically a port of isBitfieldExtractOpFromSExtInReg in AArch64ISelDAGToDAG.

This is only done post-legalization for now. Once the legalizer knows how to
decompose these back into shifts, this requirement can probably be removed.

Differential Revision: https://reviews.llvm.org/D99230
2021-03-30 10:14:30 -07:00
Amara Emerson f5e9be6fdb [GlobalISel] Implement lowering for G_ROTR and G_ROTL.
This is a straightforward port.

Differential Revision: https://reviews.llvm.org/D99449
2021-03-30 09:44:41 -07:00
Tomas Matheson a9968c0a33 [NFC][CodeGen] Tidy up TargetRegisterInfo stack realignment functions
Currently needsStackRealignment returns false if canRealignStack returns false.
This means that the behavior of needsStackRealignment does not correspond to
it's name and description; a function might need stack realignment, but if it
is not possible then this function returns false. Furthermore,
needsStackRealignment is not virtual and therefore some backends have made use
of canRealignStack to indicate whether a function needs stack realignment.

This patch attempts to clarify the situation by separating them and introducing
new names:

 - shouldRealignStack - true if there is any reason the stack should be
   realigned

 - canRealignStack - true if we are still able to realign the stack (e.g. we
   can still reserve/have reserved a frame pointer)

 - hasStackRealignment = shouldRealignStack && canRealignStack (not target
   customisable)

Targets can now override shouldRealignStack to indicate that stack realignment
is required.

This change will make it easier in a future change to handle the case where we
need to realign the stack but can't do so (for example when the register
allocator creates an aligned spill after the frame pointer has been
eliminated).

Differential Revision: https://reviews.llvm.org/D98716

Change-Id: Ib9a4d21728bf9d08a545b4365418d3ffe1af4d87
2021-03-30 17:31:39 +01:00
Alok Kumar Sharma 9fb0025f70 [DebugInfo] Upgrade DISubragne::count to accept DIExpression also
This is needed for Fortran assumed shape arrays whose dimensions are
defined as,
  - 'count' is taken from array descriptor passed as parameter by
    caller, access from descriptor is defined by type DIExpression.
  - 'lowerBound' is defined by callee.
The current alternate way represents using upperBound in place of
count, where upperBound is calculated in callee in a temp variable
using lowerBound and count

Representation with count (DIExpression) is not only clearer as
compared to upperBound (DIVariable) but it has another advantage that
variable count is accessed by being parameter has better chance of
survival at higher optimization level than upperBound being local
variable.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D99335
2021-03-30 09:16:55 +05:30
Rahman Lavaee 90c401cab6 [Propeller] Do not generate the BB address map for empty functions.
Empty functions (functions with no real code) are irrelevant for propeller optimizations and their addresses sometimes conflict with other functions which obfuscates the analysis.
This simple change skips the BB address map emission for such functions.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D99395
2021-03-29 20:15:01 -07:00
Roger Ferrer Ibanez 489ca73ac4 [PrologEpilogInserter][AMDGPU] Only adjust offset for emergency spill slots if the stack grows down
D89239 adjusts the stack offset of emergency spill slots for overaligned
stacks. However the adjustment is not valid for targets whose stack
grows up (such as AMDGPU).

This change makes the adjustment conditional only to those targets whose
stack grows down.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49686

Differential Revision: https://reviews.llvm.org/D99504
2021-03-29 17:26:58 +00:00
Bradley Smith 9745dce8c3 [SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps
This is currently performed in SelectionDAGLegalize, here we make it also
happen in LegalizeVectorOps, allowing a target to lower the SETCC condition
codes first in LegalizeVectorOps and then lower to a custom node afterwards,
without having to duplicate all of the SETCC condition legalization in the
target specific lowering.

As a result of this, fixed length floating point SETCC nodes can now be
properly lowered for SVE.

Differential Revision: https://reviews.llvm.org/D98939
2021-03-29 15:32:25 +01:00
Florian Hahn eb3d9f2eb6
[SelDag] Add isIntOrFPConstant helper function.
This patch adds a new isIntOrFPConstant  helper function to check if a
SDValue is a integer of FP constant. This pattern is used in various
places.

There also are places that incorrectly just check for integer constants,
e.g. D99384, so hopefully this helper will help people avoid that issue.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D99428
2021-03-28 12:48:58 +01:00
Amara Emerson 55533203d7 [GlobalISel] Add G_ROTR and G_ROTL opcodes for rotates.
Differential Revision: https://reviews.llvm.org/D99383
2021-03-25 17:23:30 -07:00
Jessica Paquette 23f657c165 [AArch64][GlobalISel] Emit bzero on Darwin
Darwin platforms for both AArch64 and X86 can provide optimized `bzero()`
routines. In this case, it may be preferable to use `bzero` in place of a
memset of 0.

This adds a G_BZERO generic opcode, similar to G_MEMSET et al. This opcode can
be generated by platforms which may want to use bzero.

To emit the G_BZERO, this adds a pre-legalize combine for AArch64. The
conditions for this are largely a port of the bzero case in
`AArch64SelectionDAGInfo::EmitTargetCodeForMemset`.

The only difference in comparison to the SelectionDAG code is that, when
compiling for minsize, this will fire for all memsets of 0. The original code
notes that it's not beneficial to do this for small memsets; however, using
bzero here will save a mov from wzr. For minsize, I think that it's preferable
to prioritise omitting the mov.

This also fixes a bug in the libcall legalization code which would delete
instructions which could not be legalized. It also adds a check to make sure
that we actually get a libcall name.

Code size improvements (Darwin):

- CTMark -Os: -0.0% geomean (-0.1% on pairlocalalign)
- CTMark -Oz: -0.2% geomean (-0.5% on bullet)

Differential Revision: https://reviews.llvm.org/D99358
2021-03-25 17:14:25 -07:00
Amara Emerson 0d2c4db637 [GlobalISel] Fix crash in RBS with a non-generic IMPLICIT_DEF.
This may occur when swifterror codegen in the translator generates these,
but we shouldn't try to handle them since they should have regclasses anyway.

rdar://75784009

Differential Revision: https://reviews.llvm.org/D99287
2021-03-24 23:08:51 -07:00
Sander de Smalen 55d18b3cc2 [TTI] Return a TypeSize from getRegisterBitWidth.
This patch changes the interface to take a RegisterKind, to indicate
whether the register bitwidth of a scalar register, fixed-width vector
register, or scalable vector register must be returned.

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D98874
2021-03-24 14:45:13 +00:00
Serguei Katkov 311d81ce97 [RegAlloc] Fix "ran out of regs" with uses in statepoint
Statepoint instruction is known to have a variable and big number of operands.
It is possible that Register Allocator will split live intervals in the way that all
physical registers are occupied by "zero-length" live intervals which are marked
as not-spillable.
While intervals are marked as not-spillable in the moment of creation when they are
really zero-length it is possible that in future as part of re-materialization there will
need for physical register between def and use of such tiny interval (the use is not
related to this interval at all).
As all physical registers are assigned to not-spillable intervals there is not avaialbe
registers and RA reports an error.

The idea of the fix is avoid marking tiny live intervals where there is a use in statepoint
instruction in var args section. Such interval may be perfectly spilled and folded to
operand of statepoint.

Reviewers: reames, dantrushin, qcolombet, dsanders, dmgreen
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D98766
2021-03-24 10:25:34 +07:00
serge-sans-paille e19884cd74 Introduce a generic operator to apply complex operations to BitVector
This avoids temporary and memcpy call when computing large expressions.

It's basically some kind of poor man's expression template, but it seems easier
to maintain to have a single generic `apply` call instead of the whole
expression template machinery here.

Differential Revision: https://reviews.llvm.org/D98176
2021-03-23 14:23:26 +01:00
Matt Arsenault b24436ac96 GlobalISel: Lower funnel shifts 2021-03-23 09:11:17 -04:00
David Sherwood 748ae5281d [IR][SVE] Add new llvm.experimental.stepvector intrinsic
This patch adds a new llvm.experimental.stepvector intrinsic,
which takes no arguments and returns a linear integer sequence of
values of the form <0, 1, ...>. It is primarily intended for
scalable vectors, although it will work for fixed width vectors
too. It is intended that later patches will make use of this
new intrinsic when vectorising induction variables, currently only
supported for fixed width. I've added a new CreateStepVector
method to the IRBuilder, which will generate a call to this
intrinsic for scalable vectors and fall back on creating a
ConstantVector for fixed width.

For scalable vectors this intrinsic is lowered to a new ISD node
called STEP_VECTOR, which takes a single constant integer argument
as the step. During lowering this argument is set to a value of 1.
The reason for this additional argument at the codegen level is
because in future patches we will introduce various generic DAG
combines such as

  mul step_vector(1), 2 -> step_vector(2)
  add step_vector(1), step_vector(1) -> step_vector(2)
  shl step_vector(1), 1 -> step_vector(2)
  etc.

that encourage a canonical format for all targets. This hopefully
means all other targets supporting scalable vectors can benefit
from this too.

I've added cost model tests for both fixed width and scalable
vectors:

  llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll
  llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll

as well as codegen lowering tests for fixed width and scalable
vectors:

  llvm/test/CodeGen/AArch64/neon-stepvector.ll
  llvm/test/CodeGen/AArch64/sve-stepvector.ll

See this thread for discussion of the intrinsic:
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html
2021-03-23 10:43:35 +00:00
Pushpinder Singh d0e5422eb8 [GlobalISel][AMDGPU] Lower G_UMULO/G_SMULO
Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D93963
2021-03-23 05:45:43 +00:00