Commit Graph

29602 Commits

Author SHA1 Message Date
Austin Kerbow 37d907899f [HazardRec] Allow inserting multiple wait-states simultaneously
If a target can encode multiple wait-states into a noop allow emitting such
instructions directly.

Reviewed By: rampitec, dmgreen

Differential Revision: https://reviews.llvm.org/D89753
2020-10-20 17:03:47 -07:00
Nicolai Hähnle c0cdd22c72 Introduce CfgTraits abstraction
The CfgTraits abstraction simplfies writing algorithms that are
generic over the type of CFG, and enables writing such algorithms
as regular non-template code that operates on opaque references
to CFG blocks and values.

Implementations of CfgTraits provide operations on the concrete
CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock *`.

CfgInterface is an abstract base class which provides operations
on opaque types CfgBlockRef and CfgValueRef. Those opaque types
encapsulate a `void *`, but the meaning depends on the concrete
CFG type. For example, MachineCfgTraits -- for use with MachineIR
in SSA form -- encodes a Register inside CfgValueRef. Converting
between concrete references and opaque/generic ones is done by
CfgTraits::{fromGeneric,toGeneric}. Convenience methods
CfgTraits::{un}wrap{Iterator,Range} are available as well.

Writing algorithms in terms of CfgInterface adds some overhead
(virtual method calls, plus in same cases it removes the
opportunity to inline iterators), but can be much more convenient
since generic algorithms can be written as non-templates.

This patch adds implementations of CfgTraits for all CFGs on
which dominator trees are calculated, so that the dominator
tree can be ported to this machinery. Only IrCfgTraits (LLVM IR)
and MachineCfgTraits (Machine IR in SSA form) are complete, the
other implementations are limited to the absolute minimum
required to make the upcoming dominator tree changes work.

v5:
- fix MachineCfgTraits::blockdef_iterator and allow it to iterate over
  the instructions in a bundle
- use MachineBasicBlock::printName

v6:
- implement predecessors/successors for all CfgTraits implementations
- fix error in unwrapRange
- rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming
  that is consistent with {wrap,unwrap}{Iterator,Range}
- use getVRegDef instead of getUniqueVRegDef

v7:
- std::forward fix in wrapping_iterator
- fix typos

v8:
- cleanup operators on CfgOpaqueType
- address other review comments

Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d

Differential Revision: https://reviews.llvm.org/D83088
2020-10-20 13:50:52 +02:00
Luqman Aden 51892a42da [COFF][ARM] Fix CodeView for Windows on 32bit ARM targets.
Create the LLVM / CodeView register mappings for the 32-bit ARM Window targets.

Reviewed By: compnerd

Differential Revision: https://reviews.llvm.org/D89622
2020-10-19 22:16:16 -07:00
Qiu Chaofan 1b2fe71ecf [DAGCombiner] Tighten reasscociation of visitFMA
From LangRef, FMF contract should not enable reassociating to form
arbitrary contractions. So it should not help rearrange nodes like
(fma (fmul x, c1), c2, y) into (fma x, c1*c2, y).

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D89527
2020-10-20 10:13:01 +08:00
Craig Topper edd0cb11bd [SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats
This enables these transforms for vectors:
(ctpop x) u< 2 -> (x & x-1) == 0
(ctpop x) u> 1 -> (x & x-1) != 0
(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)
(ctpop x) != 1 --> (x == 0) || ((x & x-1) != 0)

All enabled if CTPOP isn't Legal. This differs from the scalar
behavior where the first two are done unconditionally and the
last two are done if CTPOP isn't Legal or Custom. The Legal
check produced better results for vectors based on X86's
custom handling. Might be worth re-visiting scalars here.

I disabled the looking through truncate for vectors. The
code that creates new setcc can use the same result VT as the
original setcc even if we truncated the input. That may work
work for most scalars, but definitely wouldn't work for vectors
unless it was a vector of i1.

Fixes or at least improves PR47825

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D89346
2020-10-19 12:56:59 -07:00
Amy Kwan 6a946fd06f [DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead.
MULH is often expanded on targets.
This patch removes the isMulhCheaperThanMulShift hook and uses
isOperationLegalOrCustom instead.

Differential Revision: https://reviews.llvm.org/D80485
2020-10-19 12:23:04 -05:00
Mircea Trofin 225065b9a8 [NFC][MC] Type [MC]Register uses in MachineTraceMetrics
Differential Revision: https://reviews.llvm.org/D89710
2020-10-19 09:49:52 -07:00
David Sherwood 3945b69e81 [SVE][CodeGen] Replace more TypeSize comparison operators with their scalar equivalents
In certain places in llvm/lib/CodeGen we were relying upon the TypeSize
comparison operators when in fact the code was only ever expecting
either scalar values or fixed width vectors. This patch changes a few
functions that were always expecting to work on scalar or fixed width
types:

1. DAGCombiner::mergeTruncStores - deals with scalar integers only.
2. DAGCombiner::ReduceLoadWidth - not valid for vectors.
3. DAGCombiner::createBuildVecShuffle - should only be used for
   fixed width vectors.
4. SelectionDAGLegalize::ExpandFCOPYSIGN and
   SelectionDAGLegalize::getSignAsIntValue - only work on scalars.

Differential Revision: https://reviews.llvm.org/D88562
2020-10-19 08:38:50 +01:00
David Sherwood 35a531fb45 [SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents
In certain places in llvm/lib/CodeGen we were relying upon the TypeSize
comparison operators when in fact the code was only ever expecting
either scalar values or fixed width vectors. I've changed some of these
places to use the equivalent scalar operator.

Differential Revision: https://reviews.llvm.org/D88482
2020-10-19 08:30:31 +01:00
David Sherwood f693f915a0 [SVE][CodeGen] Replace uses of TypeSize comparison operators
In certain places in the code we can never end up in a situation where
we're mixing fixed width and scalable vector types. For example,
we can't have truncations and extends that change the lane count. Also,
in other places such as GenWidenVectorStores and GenWidenVectorLoads we
know from the behaviour of FindMemType that we can never choose a vector
type with a different scalable property.

In various places I have used EVT::bitsXY functions instead of
TypeSize::isKnownXY, where it probably makes sense to keep an assert
that scalable properties match.

Differential Revision: https://reviews.llvm.org/D88654
2020-10-19 08:08:41 +01:00
Alok Kumar Sharma 0538353b3b [DebugInfo] Support for DWARF operator DW_OP_over
LLVM rejects DWARF operator DW_OP_over. This DWARF operator is needed
for Flang to support assumed rank array.

  Summary:
Currently LLVM rejects DWARF operator DW_OP_over. Below error is
produced when llvm finds this operator.
[..]
invalid expression
!DIExpression(151, 20, 16, 48, 30, 35, 80, 34, 6)
warning: ignoring invalid debug info in over.ll
[..]
There were some parts missing in support of this operator, which are
now completed.

  Testing
-added a unit testcase
-check-debuginfo
-check-llvm

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D89208
2020-10-17 08:42:28 +05:30
Craig Topper 278bd06891 [TargetLowering] Extract simplifySetCCs ctpop into a separate function. NFCI
As requested in D89346. This allows us to add some early outs.

I reordered some checks a little bit to make the more common bail outs happen earlier. Like checking opcode before checking hasOneUse. And I moved the bit width check to make sure it was safe to look through a truncate to the spot where we look through truncates instead of after.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D89494
2020-10-16 19:47:56 -07:00
Jameson Nash 4242df1470 Revert "make the AsmPrinterHandler array public"
I messed up one of the tests.
2020-10-16 17:22:07 -04:00
Jameson Nash ac2def2d8d make the AsmPrinterHandler array public
This lets external consumers customize the output, similar to how
AssemblyAnnotationWriter lets the caller define callbacks when printing
IR. The array of handlers already existed, this just cleans up the code
so that it can be exposed publically.

Differential Revision: https://reviews.llvm.org/D74158
2020-10-16 16:27:31 -04:00
Amara Emerson 6042c25b0a [GlobalISel] Add translation support for vector reduction intrinsics.
In order to prevent the ExpandReductions pass from expanding some intrinsics
before they get to codegen, I had to add a -disable-expand-reductions flag
for testing purposes.

Differential Revision: https://reviews.llvm.org/D89028
2020-10-16 10:17:53 -07:00
Jay Foad 0c1381d795 [llc] Use -filetype=null to disable MIR printing
If you use -stop-after or similar options, llc will normally print MIR.
This patch checks for -filetype=null as a special case to disable MIR
printing. As the comment says, "The Null output is intended for use for
performance analysis ...", and I found this useful for timing a subset
of the passes that llc runs without the significant overhead of printing
MIR just to send it to /dev/null.

Differential Revision: https://reviews.llvm.org/D89476
2020-10-16 16:51:56 +01:00
Amara Emerson c2551c1f40 [GlobalISel] Remove scalar src from non-sequential fadd/fmul reductions.
It's probably better to split these into separate G_FADD/G_FMUL + G_VECREDUCE
operations in the translator rather than carrying the scalar around. The
majority of the time it'll get simplified away as the scalars are probably
identity values.

Differential Revision: https://reviews.llvm.org/D89150
2020-10-15 15:51:44 -07:00
Denis Antrushin 8f0ddd4a1a [Statepoints] Remove MI limit on number of tied operands.
After D87915 statepoint can have more than 15 tied operands.
Remove this restriction from statepoint lowering code.
2020-10-15 19:02:38 +07:00
Adrian Kuegel ead2aa7098 Fix unused variable warning when compiling with asserts disabled.
Differential Revision: https://reviews.llvm.org/D89454
2020-10-15 12:50:19 +02:00
Jeremy Morse c521e44def [DebugInstrRef] Support recording of instruction reference substitutions
Add a table recording "substitutions" between pairs of <instruction,
operand> numbers, from old pairs to new pairs. Post-isel optimizations are
able to record the outcome of an optimization in this way. For example, if
there were a divide instruction that generated the quotient and remainder,
and it were replaced by one that only generated the quotient:

  $rax, $rcx = DIV-AND-REMAINDER $rdx, $rsi, debug-instr-num 1
  DBG_INSTR_REF 1, 0
  DBG_INSTR_REF 1, 1

Became:

  $rax = DIV $rdx, $rsi, debug-instr-num 2
  DBG_INSTR_REF 1, 0
  DBG_INSTR_REF 1, 1

We could enter a substitution from <1, 0> to <2, 0>, and no substitution
for <1, 1> as it's no longer generated.

This approach means that if an instruction or value is deleted once we've
left SSA form, all variables that used the value implicitly become
"optimized out", something that isn't true of the current DBG_VALUE
approach.

Differential Revision: https://reviews.llvm.org/D85749
2020-10-15 11:30:14 +01:00
Denis Antrushin 8c2b69d53a [Statepoints] Unlimited tied operands.
Current limit on amount of tied operands (15) sometimes is too low
for statepoint. We may get couple dozens of gc pointer operands on
statepoint.
Review D87154 changed format of statepoint to list every gc pointer
only once, which makes it trivial to find tiedness relation between
statepoint operands: defs are mapped 1-1 to gc pointer operands passed
on registers.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D87915
2020-10-15 16:16:11 +07:00
Craig Topper 50c9f1e11d [TargetLowering] Replace Log2_32_Ceil with Log2_32 in SimplifySetCC ctpop combine.
This combine can look through (trunc (ctpop X)). When doing this
it tries to make sure the trunc doesn't lose any information
from the ctpop. It does this by checking that the truncated type
has more bits that Log2_32_Ceil of the ctpop type. The Ceil is
unnecessary and pessimizes non-power of 2 types.

For example, ctpop of i256 requires 9 bits to represent the max
value of 256. But ctpop of i255 only requires 8 bits to represent
the max result of 255. Log2_32_Ceil of 256 and 255 both return 8
while Log2_32 returns 8 for 256 and 7 for 255

The code with popcnt enabled is a regression for this test case,
but it does match what already happens with i256 truncated to i9.
Since power of 2 is more likely, I don't think it should block
this change.

Differential Revision: https://reviews.llvm.org/D89412
2020-10-15 01:05:07 -07:00
Snehasish Kumar 24bf6ff4e0 [llvm] Update default cutoff threshold for machine function splitter.
Based on internal testing at Google we found that setting the profile
summary cutoff threshold to 999950 yields the best results in terms of
itlb and icache metrics (as observed on Intel CPUs).

*default* = Split out code if no profile count available for block
*size-%*  = The fraction of bytes split out of .text and .text.hot
*itlb*    = Misses per kilo instructions (MPKI) for itlb
*icache*  = Misses per kilo instructions (MPKI) for L1 icache

Search1

| cutoff  | size-%  | itlb      | icache  |
|---------|---------|-----------|---------|
| default | 42.5861 | 0.0822151 | 2.46363 |
|  999999 | 44.9350 | 0.0767194 | 2.44416 |
|  999950 | 50.0660 |  0.075744 |  2.4091 |
|  999500 | 56.9158 |  0.082564 |  2.4188 |
|  995000 | 63.8625 | 0.0814927 | 2.42832 |
|  990000 | 71.7314 |  0.106906 | 2.57785 |

Search2

| cutoff  | size-% | itlb     | icache  |
|---------|--------|----------|---------|
| default | 2.8845 | 0.626712 | 4.73245 |
|  999999 | 3.3291 | 0.602309 | 4.70045 |
|  999950 | 3.8577 | 0.587842 | 4.71632 |
|  999500 | 4.4170 |  0.63577 | 4.68351 |
|  995000 | 5.1020 | 0.657969 | 4.82272 |
|  990000 | 5.7153 | 0.719122 | 5.39496 |

Differential Revision: https://reviews.llvm.org/D89085
2020-10-14 12:48:10 -07:00
Snehasish Kumar 77638a5343 [llvm] Set the default for -bbsections-cold-text-prefix to .text.split.
After using this for a while, we find that it is generally useful to
have it set to .text.split. by default, removing the need for an
additional -mllvm option.

Differential Revision: https://reviews.llvm.org/D88997
2020-10-14 12:16:36 -07:00
Guozhi Wei adfb541501 [MBP] Add whole chain to BlockFilterSet instead of individual BB
Currently we add individual BB to BlockFilterSet if its frequency satisfies

LoopFreq / Freq <= LoopToColdBlockRatio

LoopFreq is edge frequency from outside to loop header.
LoopToColdBlockRatio is a command line parameter.

It doesn't make sense since we always layout whole chain, not individual BBs.

It may also cause a tricky problem. Sometimes it is possible that the LoopFreq
of an inner loop is smaller than LoopFreq of outer loop. So a BB can be in
BlockFilterSet of inner loop, but not in BlockFilterSet of outer loop,
like .cold in the test case. So it is added to the chain of inner loop. When
work on the outer loop, .cold is not added to BlockFilterSet, so the edge to
successor .problem is not counted in UnscheduledPredecessors of .problem chain.
But other blocks in the inner loop are added BlockFilterSet, so the whole inner
loop chain can be layout, and markChainSuccessors is called to decrease
UnscheduledPredecessors of following chains. markChainSuccessors calls
markBlockSuccessors for every BB, even it is not in BlockFilterSet, like .cold,
so .problem chain's UnscheduledPredecessors is decreased, but this edge was not
counted on in fillWorkLists, so .problem chain's UnscheduledPredecessors
becomes 0 when it still has an unscheduled predecessor .pred! And it causes
problems in following various successor BB selection algorithms.

Differential Revision: https://reviews.llvm.org/D89088
2020-10-14 11:55:10 -07:00
jasonliu f85bcc21dd [AIX] Turn -fdata-sections on by default in Clang
Summary:

This patch does the following:
1. Make InitTargetOptionsFromCodeGenFlags() accepts Triple as a
 parameter, because some options' default value is triple dependant.
2. DataSections is turned on by default on AIX for llc.
3. Test cases change accordingly because of the default behaviour change.
4. Clang Driver passes in -fdata-sections by default on AIX.

Reviewed By: MaskRay, DiggerLin

Differential Revision: https://reviews.llvm.org/D88737
2020-10-14 15:58:31 +00:00
Mircea Trofin c8fcffe775 [NFC][MC] Use MCRegister in Machine{Sink|Pipeliner}.cpp
Differential Revision: https://reviews.llvm.org/D89328
2020-10-14 08:42:17 -07:00
Michael Liao ae40d2858e Fix an apparent typo. `assert()` must not contain side-effects. NFC. 2020-10-14 11:33:34 -04:00
Jeremy Morse c4e7857d4e [DebugInstrRef] Create DBG_INSTR_REFs in SelectionDAG
When given the -experimental-debug-variable-locations option (via -Xclang
or to llc), have SelectionDAG generate DBG_INSTR_REF instructions instead
of DBG_VALUE. For now, this only happens in a limited circumstance: when
the value referred to is not a PHI and is defined in the current block.
Other situations introduce interesting problems, addresed in later patches.

Practically, this patch hooks into InstrEmitter and if it can find a
defining instruction for a value, gives it an instruction number, and
points the DBG_INSTR_REF at that <instr, operand> pair.

Differential Revision: https://reviews.llvm.org/D85747
2020-10-14 14:24:08 +01:00
Jeremy Morse 2c5f3d54c5 [DebugInstrRef] Parse debug instruction-references from/to MIR
This patch defines the MIR format for debug instruction references: it's an
integer trailing an instruction, marked out by "debug-instr-number", much
like how "debug-location" identifies the DebugLoc metadata of an
instruction. The instruction number is stored directly in a MachineInstr.

Actually referring to an instruction comes in a later patch, but is done
using one of these instruction numbers.

I've added a round-trip test and two verifier checks: that we don't label
meta-instructions as generating values, and that there are no duplicates.

Differential Revision: https://reviews.llvm.org/D85746
2020-10-14 10:57:09 +01:00
Aditya Nandakumar ef3d17482f [GISel] Add combine for constant G_PTR_ADD offsets.
https://reviews.llvm.org/D88865

This adds a single combine for GlobalISel to fold:

ptradd (inttoptr C1) C2
Into:

C1 + C2
Additionally, a small test for AArch64 is added.

Patch by pnappa.
2020-10-13 17:26:12 -07:00
Andrew Paverd cfd8481da1 Reland [CFGuard] Add address-taken IAT tables and delay-load support
This patch adds support for creating Guard Address-Taken IAT Entry Tables (.giats$y sections) in object files, matching the behavior of MSVC. These contain lists of address-taken imported functions, which are used by the linker to create the final GIATS table.
Additionally, if any DLLs are delay-loaded, the linker must look through the .giats tables and add the respective load thunks of address-taken imports to the GFIDS table, as these are also valid call targets.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D87544
2020-10-13 13:20:52 -07:00
Mircea Trofin 08097fc6a9 [NFC][Regalloc] Use MCRegister in MachineCopyPropagation
Differential Revision: https://reviews.llvm.org/D89250
2020-10-13 09:05:08 -07:00
Mirko Brkusanin 52ba4fa6aa [GlobalISel] Avoid making G_PTR_ADD with nullptr
When the first operand is a null pointer we can avoid making a G_PTR_ADD and
make a G_INTTOPTR with the offset operand.
This helps us avoid making add with 0 later on for targets such as AMDGPU.

Differential Revision: https://reviews.llvm.org/D87140
2020-10-13 13:02:55 +02:00
Craig Topper 1687a8d83b [X86][SelectionDAG] Add SADDO_CARRY and SSUBO_CARRY to support multipart signed add/sub overflow legalization.
This passes existing X86 test but I'm not sure if it handles all type
legalization cases it needs to.

Alternative to D89200

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D89222
2020-10-12 23:18:29 -07:00
Konstantin Schwarz 7341123439 [GlobalISel][KnownBits] Early return on out of bound shift amounts
If the known shift amount is bigger than or equal to the bitwidth of the type of the value to be shifted,
the result is target dependent, so don't try to infer any bits.

This fixes a crash we've seen in one of our internal test suites.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D89232
2020-10-12 18:39:19 +02:00
Mircea Trofin 43d347995c [NFC][MC] Use MCRegister in LiveRangeMatrix
The change starts from LiveRangeMatrix and also checks the users of the
APIs are typed accordingly.

Differential Revision: https://reviews.llvm.org/D89145
2020-10-12 08:54:36 -07:00
Mircea Trofin 596a9f6b89 [NFC][Regalloc] Pass VirtRegMap by reference.
It's never null - the reason it's modeled as a pointer is because the
pass can't init it in its ctor. Passing by ref simplifies the code, too,
as the null checks were unnecessary complexity.

Differential Revision: https://reviews.llvm.org/D89171
2020-10-12 08:32:30 -07:00
Simon Pilgrim c252200e4d [DAG][ARM][MIPS][RISCV] Improve funnel shift promotion to use 'double shift' patterns
Based on a discussion on D88783, if we're promoting a funnel shift to a width at least twice the size as the original type, then we can use the 'double shift' patterns (shifting the concatenated sources).

Differential Revision: https://reviews.llvm.org/D89139
2020-10-12 14:11:02 +01:00
David Sherwood c5ba0d33cc [SVE] Make ElementCount and TypeSize use a new PolySize class
I have introduced a new template PolySize class, where the template
parameter determines the type of quantity, i.e. for an element
count this is just an unsigned value. The ElementCount class is
now just a simple derivation of PolySize<unsigned>, whereas TypeSize
is more complicated because it still needs to contain the uint64_t
cast operator, since there are still many places in the code that
rely upon this implicit cast. As such the class also still needs
some of it's own operators.

I've tried to minimise the amount of code in the base PolySize
class, which led to a couple of changes:

1. In some places we were relying on '==' operator comparisons
between ElementCounts and the scalar value 1. I didn't put this
operator in the new PolySize class, and thought it was actually
clearer to use the isScalar() function instead.
2. I removed the isByteSized function and replaced it with calls
to isKnownMultipleOf(8).

I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so
that it's more consistent with coefficientDivideBy.

Differential Revision: https://reviews.llvm.org/D88409
2020-10-12 08:23:38 +01:00
Fangrui Song cddb49bcc0 [SchedDAGInstrs] Delete redundant contains(). NFC 2020-10-11 20:58:30 -07:00
Krzysztof Parzyszek 61eaa2e14a [SDAG] Remember to set UndefElts in isSplatValue for SPLAT_VECTOR 2020-10-10 19:42:24 -05:00
David Green cb27006a94 [ARM] Attempt to make Tail predication / RDA more resilient to empty blocks
There are a number of places in RDA where we assume the block will not
be empty. This isn't necessarily true for tail predicated loops where we
have removed instructions. This attempt to make the pass more resilient
to empty blocks, not casting pointers to machine instructions where they
would be invalid.

The test contains a case that was previously failing, but recently been
hidden on trunk. It contains an empty block to begin with to show a
similar error.

Differential Revision: https://reviews.llvm.org/D88926
2020-10-10 14:50:25 +01:00
Alok Kumar Sharma 96bd4d34a2 [DebugInfo] Support for DWARF attribute DW_AT_rank
This patch adds support for DWARF attribute DW_AT_rank.

  Summary:
Fortran assumed rank arrays have dynamic rank. DWARF attribute
DW_AT_rank is needed to support that.

  Testing:
unit test cases added (hand-written)
check llvm
check debug-info

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D89141
2020-10-10 17:51:12 +05:30
Denis Antrushin 2b96dcebfa [Statepoints] Allow deopt GC pointer on VReg if gc-live bundle is empty.
Currently we allow passing pointers from deopt bundle on VReg only if
they were seen in list of gc-live pointers passed on VRegs.
This means that for the case of empty gc-live bundle we spill deopt
bundle's pointers. This change allows lowering deopt pointers to VRegs
in case of empty gc-live bundle. In case of non-empty gc-live bundle,
behavior does not change.

Reviewed By: skatkov

Differential Revision: https://reviews.llvm.org/D88999
2020-10-10 14:58:08 +07:00
Mircea Trofin c11c20fb00 [NFC][Regalloc] VirtRegAuxInfo::Hint does not need to be a field
It is only used in weightCalcHelper, and cleared upon its finishing its
job there.

The patch further cleans up style guide discrepancies, and simplifies
CopyHint by removing duplicate 'IsPhys' information (it's what the Reg
field would report).
2020-10-09 13:42:23 -07:00
Mircea Trofin 62e2ac6461 [NFC][Regalloc] Fix coding style in CalcSpillWeights 2020-10-09 12:22:12 -07:00
Scott Linder 4a98cf7867 [NFC] Reformat MILexer.cpp:getIdentifierKind
Reformat to avoid unrelated changes in diff of future patch.
Committed as obvious.
2020-10-09 15:21:24 +00:00
Esme-Yi e9fd8823ba [DAGCombiner] Add decomposition patterns for Mul-by-Imm.
Summary: This patch is derived from D87384.
In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns:
```
  mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M))
  mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M))
```
The conversion will be trigged if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`.
More over, the conversion benefits from an ILP improvement since the instructions are independent. A case with the sequence like following also gets benefit since a shift instruction is saved.

```
*res1 = a * 0x8800;
*res2 = a * 0x8080;
```

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D88201
2020-10-09 08:51:40 +00:00
Fangrui Song 2c4c2dc2d9 [MCRegister] Simplify isStackSlot & isPhysicalRegister and delete isPhysical. NFC 2020-10-08 22:08:33 -07:00