Commit Graph

24835 Commits

Author SHA1 Message Date
Sam Parker 597811e7a7 [DAGCombiner] Reduce load widths of shifted masks
During combining, ReduceLoadWdith is used to combine AND nodes that
mask loads into narrow loads. This patch allows the mask to be a
shifted constant. This results in a narrow load which is then left
shifted to compensate for the new offset.

Differential Revision: https://reviews.llvm.org/D50432

llvm-svn: 340261
2018-08-21 10:26:59 +00:00
Simon Pilgrim 72b324de4d [TargetLowering] Add BuildSDiv support for division by one or negone.
This reduces most of the sdiv stages (the MULHS, shifts etc.) to just zero/identity values and use the numerator scale factor to multiply by +1/-1.

llvm-svn: 340260
2018-08-21 10:20:36 +00:00
Bjorn Pettersson 880f291577 [RegisterCoalescer] Do not assert when trying to remat dead values
Summary:
RegisterCoalescer::reMaterializeTrivialDef used to assert that
the input register was live in. But as shown by the new
coalesce-dead-lanes.mir test case that seems to be a valid
scenario. We now return false instead of the assert, simply
avoiding to remat the dead def.

Normally a COPY of an undef value is eliminated by
eliminateUndefCopy(). Although we only do that when the
destination isn't a physical register. So the situation
above should be limited to the case when we copy an undef
value to a physical register.

Reviewers: kparzysz, wmi, tpr

Reviewed By: kparzysz

Subscribers: MatzeB, qcolombet, tpr, llvm-commits

Differential Revision: https://reviews.llvm.org/D50842

llvm-svn: 340255
2018-08-21 07:49:05 +00:00
Krzysztof Parzyszek cc3f630252 Consistently use MemoryLocation::UnknownSize to indicate unknown access size
1. Change the software pipeliner to use unknown size instead of dropping
   memory operands. It used to do it before, but MachineInstr::mayAlias
   did not handle it correctly.
2. Recognize UnknownSize in MachineInstr::mayAlias.
3. Print and parse UnknownSize in MIR.

Differential Revision: https://reviews.llvm.org/D50339

llvm-svn: 340208
2018-08-20 20:37:57 +00:00
Cameron McInally 94b9029be9 [FPEnv] Support constrained FREM intrinsic
Differential Revision: https://reviews.llvm.org/D50975

llvm-svn: 340201
2018-08-20 19:28:56 +00:00
Marcello Maggioni 5ca4128b45 [PSV] Update API to be able to use TargetCustom without UB.
getTargetCustom() requires values for "Kind" in the constructor
that are not in the PSVKind enum. Passing a value that is not inside
an enum as an argument to a constructor of the type of the enum is
UB. Changing to the underlying type of the enum would solve the UB

Differential Revision: https://reviews.llvm.org/D50909

llvm-svn: 340200
2018-08-20 19:23:45 +00:00
Aditya Nandakumar 2a08285cf3 Revert "Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics"
This reverts commit 7debc334e6421bb5251ef8f18e97166dfc7dd787.

I missed updating legalizer-info-validation.mir as I had assertions
turned off in my build and that specific test requires asserts. Fixed it
now.

llvm-svn: 340197
2018-08-20 18:43:19 +00:00
Simon Pilgrim 6ac905926f [TargetLowering] Disable BuildSDiv division by one or negone.
Fuzz tests have detected an issue, currently working on a fix.

llvm-svn: 340195
2018-08-20 18:23:54 +00:00
Reid Kleckner 918930adf9 Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting operations"
It causes LegalizerHelperTest.LowerBitCountingCTTZ1 to fail.

llvm-svn: 340186
2018-08-20 16:50:19 +00:00
Simon Pilgrim 1a00042270 [SelectionDAG] Reuse the Op's VT. NFCI.
llvm-svn: 340173
2018-08-20 13:44:03 +00:00
Simon Pilgrim 5b78c9d58d [SelectionDAG] Add partial sign-bit support to ComputeNumSignBits for BITCAST nodes
Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

Handle the case where the sign bit extends to only part of the small elements.

llvm-svn: 340169
2018-08-20 13:05:48 +00:00
Simon Pilgrim 5b936ec89e [SelectionDAG] Add basic demanded elements support to ComputeNumSignBits for BITCAST nodes
Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts.

The next step would be to support cases where the large elements aren't all sign bits, and determine the small element equivalent based on the demanded elements.

llvm-svn: 340143
2018-08-19 17:47:50 +00:00
Hsiangkai Wang 68c706ceb7 [DebugInfo] In FastISel, convert llvm.dbg.label to DBG_LABEL MI.
Convert llvm.dbg.label(!label_metadata) to DBG_LABEL !label_metadata.

Differential Revision: https://reviews.llvm.org/D50622

llvm-svn: 340122
2018-08-18 14:55:34 +00:00
Craig Topper cc5dbbf759 [DAGCombiner] Allow divide by constant optimization on opaque constants.
Summary:
I believe this restores the behavior we had before r339147.

Fixes PR38622.

Reviewers: RKSimon, chandlerc, spatel

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50936

llvm-svn: 340120
2018-08-18 05:52:42 +00:00
Aditya Nandakumar 59b2485ba2 [GISel]: Add Legalization/lowering code for bit counting operations
https://reviews.llvm.org/D48847#inline-448257

Ported legalization expansions for CTLZ/CTTZ from DAG to GISel.

Reviewed by rtereshin.

llvm-svn: 340111
2018-08-18 00:01:54 +00:00
Matt Arsenault 25e51540e1 DAG: Fix isKnownNeverNaN for basic non-sNaN cases
fadd/fsub/fmul need to worry about infinities as well
as fdiv.

llvm-svn: 340085
2018-08-17 21:19:22 +00:00
Hsiangkai Wang 2532ac880a [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)
There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

llvm-svn: 340039
2018-08-17 15:22:04 +00:00
Alex Bradbury 3291f9aa81 [AtomicExpandPass] Widen partword atomicrmw or/xor/and before tryExpandAtomicRMW
This patch performs a widening transformation of bitwise atomicrmw 
{or,xor,and} and applies it prior to tryExpandAtomicRMW. This operates 
similarly to convertCmpXchgToIntegerType. For these operations, the i8/i16 
atomicrmw can be implemented in terms of the 32-bit atomicrmw by appropriately 
manipulating the operands. There is no functional change for the handling of 
partword or/xor, but the transformation for partword 'and' is new.

The advantage of performing this transformation early is that the same 
code-path can be used regardless of the approach used to expand the atomicrmw 
(AtomicExpansionKind). i.e. the same logic is used for 
AtomicExpansionKind::CmpXchg and can also be used by the intrinsic-based 
expansion in D47882.

Differential Revision: https://reviews.llvm.org/D48129

llvm-svn: 340027
2018-08-17 14:03:37 +00:00
Simon Pilgrim 03e57521c0 [DAGCombiner] extractShiftForRotate - fix out of range shift issue
Don't just check for negative shift amounts.

Fixes OSS Fuzz #9935
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9935

llvm-svn: 340015
2018-08-17 12:25:18 +00:00
Simon Pilgrim 5113b48798 [DAGCombine] Improve (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) folding
Add support for cases where only some c1+c2 results exceed the max bitshift, clamping accordingly.

Differential Revision: https://reviews.llvm.org/D35722

llvm-svn: 340010
2018-08-17 10:52:49 +00:00
Simon Pilgrim 22d580f2ca Fix "control reaches end of non-void function" -Wreturn-type warning. NFCI.
llvm-svn: 340006
2018-08-17 09:47:52 +00:00
Chen Zheng e2d47dd1bb [MISC]Fix wrong usage of std::equal()
Differential Revision: https://reviews.llvm.org/D49958

llvm-svn: 340000
2018-08-17 07:51:01 +00:00
Chandler Carruth b898b86f49 Revert r339977: [GISel]: Add Opcodes for a few LLVM Intrinsics
This is breaking ~all the bots.

llvm-svn: 339982
2018-08-17 04:47:16 +00:00
Aditya Nandakumar 973a557338 [GISel]: Add Opcodes for a few LLVM Intrinsics
https://reviews.llvm.org/D50401

Add opcodes for llvm.intrinsic.trunc, round, and update the IRTranslator
for the same.

Reviewed by: dsanders.

llvm-svn: 339977
2018-08-17 01:41:56 +00:00
David Blaikie 0e03047e85 DebugInfo: Remove command line (& target-based) disabling of pubnames in favor of metadata
Now that Clang disables NVPTX pubnames via metadata there's no need for
this fallback to target detection in the backend.

llvm-svn: 339970
2018-08-16 23:57:15 +00:00
Chandler Carruth 75ca6be1c1 [x86/MIR] Implement support for pre- and post-instruction symbols, as
well as MIR parsing support for `MCSymbol` `MachineOperand`s.

The only real way to test pre- and post-instruction symbol support is to
use them in operands, so I ended up implementing that within the patch
as well. I can split out the operand support if folks really want but it
doesn't really seem worth it.

The functional implementation of pre- and post-instruction symbols is
now *completely trivial*. Two tiny bits of code in the (misnamed)
AsmPrinter. It should be completely target independent as well. We emit
these exactly the same way as we emit basic block labels. Most of the
code here is to give full dumping, MIR printing, and MIR parsing support
so that we can write useful tests.

The MIR parsing of MC symbol operands still isn't 100%, as it forces the
symbols to be non-temporary and non-local symbols with names. However,
those names often can encode most (if not all) of the special semantics
desired, and unnamed symbols seem especially annoying to serialize and
de-serialize. While this isn't perfect or full support, it seems plenty
to write tests that exercise usage of these kinds of operands.

The MIR support for pre-and post-instruction symbols was quite
straightforward. I chose to print them out in an as-if-operand syntax
similar to debug locations as this seemed the cleanest way and let me
use nice introducer tokens rather than inventing more magic punctuation
like we use for memoperands.

However, supporting MIR-based parsing of these symbols caused me to
change the design of the symbol support to allow setting arbitrary
symbols. Without this, I don't see any reasonable way to test things
with MIR.

Differential Revision: https://reviews.llvm.org/D50833

llvm-svn: 339962
2018-08-16 23:11:05 +00:00
Craig Topper 883ff69c93 [DAGCombiner] Don't reassociate operations that have the vector reduction flag set.
When nodes are reassociated the vector-reduction flag gets lost.

The test case is here is what would happen if you had a sum of absolute differences loop that started with a non-zero but contant sum and that loop was unrolled. The vectorizer will generate a constant vector for the initial value. And DAGCombiner reassociate tries to move it down the addition tree erasing the vector-reduction flag. Interestingly this moves constants the opposite direction of the reassociate IR pass.

I've chosen to just punt on the reassociate, but I suppose we could maybe preserve the flag if both nodes have it set.

Differential Revision: https://reviews.llvm.org/D50827

llvm-svn: 339946
2018-08-16 21:54:05 +00:00
Chandler Carruth c73c0307fe [MI] Change the array of `MachineMemOperand` pointers to be
a generically extensible collection of extra info attached to
a `MachineInstr`.

The primary change here is cleaning up the APIs used for setting and
manipulating the `MachineMemOperand` pointer arrays so chat we can
change how they are allocated.

Then we introduce an extra info object that using the trailing object
pattern to attach some number of MMOs but also other extra info. The
design of this is specifically so that this extra info has a fixed
necessary cost (the header tracking what extra info is included) and
everything else can be tail allocated. This pattern works especially
well with a `BumpPtrAllocator` which we use here.

I've also added the basic scaffolding for putting interesting pointers
into this, namely pre- and post-instruction symbols. These aren't used
anywhere yet, they're just there to ensure I've actually gotten the data
structure types correct. I'll flesh out support for these in
a subsequent patch (MIR dumping, parsing, the works).

Finally, I've included an optimization where we store any single pointer
inline in the `MachineInstr` to avoid the allocation overhead. This is
expected to be the overwhelmingly most common case and so should avoid
any memory usage growth due to slightly less clever / dense allocation
when dealing with >1 MMO. This did require several ergonomic
improvements to the `PointerSumType` to reasonably support the various
usage models.

This also has a side effect of freeing up 8 bits within the
`MachineInstr` which could be repurposed for something else.

The suggested direction here came largely from Hal Finkel. I hope it was
worth it. ;] It does hopefully clear a path for subsequent extensions
w/o nearly as much leg work. Lots of thanks to Reid and Justin for
careful reviews and ideas about how to do all of this.

Differential Revision: https://reviews.llvm.org/D50701

llvm-svn: 339940
2018-08-16 21:30:05 +00:00
David Blaikie 66cf14d06b DebugInfo: Add metadata support for disabling DWARF pub sections
In cases where the debugger load time is a worthwhile tradeoff (or less
costly - such as loading from a DWP instead of a variety of DWOs
(possibly over a high-latency/distributed filesystem)) against object
file size, it can be reasonable to disable pubnames and corresponding
gdb-index creation in the linker.

A backend-flag version of this was implemented for NVPTX in
D44385/r327994 - which was fine for NVPTX which wouldn't mix-and-match
CUs. Now that it's going to be a user-facing option (likely powered by
"-gno-pubnames", the same as GCC) it should be encoded in the
DICompileUnit so it can vary per-CU.

After this, likely the NVPTX support should be migrated to the metadata
& the previous flag implementation should be removed.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D50213

llvm-svn: 339939
2018-08-16 21:29:55 +00:00
Krzysztof Parzyszek 9af86a5e01 [MachineVerifier] Check if predecessor is jointly dominated by undefs
Each use of a value should be jointly dominated by the union of defs and
undefs. It can happen that it will only be jointly dominated by undefs,
and that is still legal. Make sure that the verifier is aware of that.

llvm-svn: 339924
2018-08-16 19:13:28 +00:00
Eli Friedman 73e8a784e6 [SelectionDAG] Improve the legalisation lowering of UMULO.
There is no way in the universe, that doing a full-width division in
software will be faster than doing overflowing multiplication in
software in the first place, especially given that this same full-width
multiplication needs to be done anyway.

This patch replaces the previous implementation with a direct lowering
into an overflowing multiplication algorithm based on half-width
operations.

Correctness of the algorithm was verified by exhaustively checking the
output of this algorithm for overflowing multiplication of 16 bit
integers against an obviously correct widening multiplication. Baring
any oversights introduced by porting the algorithm to DAG, confidence in
correctness of this algorithm is extremely high.

Following table shows the change in both t = runtime and s = space. The
change is expressed as a multiplier of original, so anything under 1 is
“better” and anything above 1 is worse.

+-------+-----------+-----------+-------------+-------------+
| Arch  | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s |
+-------+-----------+-----------+-------------+-------------+
|   X64 |     -     |     -     |    ~0.5     |    ~0.64    |
|  i686 |   ~0.5    |   ~0.6666 |    ~0.05    |    ~0.9     |
| armv7 |     -     |   ~0.75   |      -      |    ~1.4     |
+-------+-----------+-----------+-------------+-------------+

Performance numbers have been collected by running overflowing
multiplication in a loop under `perf` on two x86_64 (one Intel Haswell,
other AMD Ryzen) based machines. Size numbers have been collected by
looking at the size of function containing an overflowing multiply in
a loop.

All in all, it can be seen that both performance and size has improved
except in the case of armv7 where code size has regressed for 128-bit
multiply. u128*u128 overflowing multiply on 32-bit platforms seem to
benefit from this change a lot, taking only 5% of the time compared to
original algorithm to calculate the same thing.

The final benefit of this change is that LLVM is now capable of lowering
the overflowing unsigned multiply for integers of any bit-width as long
as the target is capable of lowering regular multiplication for the same
bit-width. Previously, 128-bit overflowing multiply was the widest
possible.

Patch by Simonas Kazlauskas!

Differential Revision: https://reviews.llvm.org/D50310

llvm-svn: 339922
2018-08-16 18:39:39 +00:00
Krzysztof Parzyszek 17143f6111 [RegisterCoalescer] Shrink to uses if needed after removeCopyByCommutingDef
llvm-svn: 339912
2018-08-16 18:02:59 +00:00
Simon Pilgrim 87d0039a45 [TargetLowering] Add support for non-uniform vectors to BuildSDIV
This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators.

This is the last patch necessary to close PR36545

Differential Revision: https://reviews.llvm.org/D50765

llvm-svn: 339908
2018-08-16 17:44:33 +00:00
Simon Pilgrim ede4905375 [TargetLowering] Refactor BuildSDIV in preparation for D50765. NFCI.
Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator.

llvm-svn: 339898
2018-08-16 16:54:06 +00:00
Guozhi Wei 8c17f9a77d [CodeGenPrepare] Add BothExtension type to PromotedInsts
This patch fixes PR38125.

Instruction extension types are recorded in PromotedInsts, it can be used later in function canGetThrough. If an instruction has two users with different extension types, it will be inserted into PromotedInsts two times in function promoteOperandForOther. The second one overwrites the first one, and the final extension type is wrong, later causes problem in canGetThrough.

This patch changes the simple bool extension type to 2-bit enum type, add a BothExtension type in addition to zero/sign extension. When an user sees BothExtension for an instruction, it actually knows nothing about how that instruction is extended.

Differential Revision: https://reviews.llvm.org/D49512

llvm-svn: 339822
2018-08-15 22:08:26 +00:00
Matt Arsenault 0f2c1cf429 DAG: Use getObjectOffset helper
llvm-svn: 339813
2018-08-15 21:03:44 +00:00
Matt Arsenault 22f01268fe DAG: Try to custom lower when promoting float operands
For some reason this wasn't done for floats like
integers.

llvm-svn: 339811
2018-08-15 20:34:54 +00:00
Krzysztof Parzyszek 3b097b4d3e [RegisterCoalescer] Ensure that both registers have subranges if one does
llvm-svn: 339792
2018-08-15 17:04:58 +00:00
Krzysztof Parzyszek 88d267d094 [RegisterCoalescer] Reset VNInfo def when copying segments over
llvm-svn: 339788
2018-08-15 16:21:53 +00:00
Krzysztof Parzyszek 46ce441df6 [RegAlloc] Check that subreg liveness tracking applies to given virtual reg
Subregister liveness applies selectively to register classes with certain
properties. Make sure that when it's enabled, it applies to a given virtual
register (in virtual register rewriter).

llvm-svn: 339784
2018-08-15 16:07:47 +00:00
Simon Pilgrim 4b2317ebfb [TargetLowering] Minor cleanup of TargetLowering::BuildSDIV. NFCI.
Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support.

llvm-svn: 339763
2018-08-15 11:11:05 +00:00
Simon Pilgrim a4ba43d3d3 [TargetLowering] Minor refactor to TargetLowering::BuildUDIV to merge scalar/vector magic value collection. NFCI.
Use the same ISD::matchUnaryPredicate pattern that was used in D50392.

llvm-svn: 339758
2018-08-15 10:11:13 +00:00
Simon Pilgrim e8a906ba47 [DagCombiner] Don't bother adding to the work list if TLI.BuildSDIVPow2 failed. NFCI.
Matches the code in BuildSDIV/BuildUDIV

llvm-svn: 339757
2018-08-15 10:02:54 +00:00
Simon Pilgrim a272fa9b0c [TargetLowering] Add support for non-uniform vectors to BuildExactSDIV
This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators.

Differential Revision: https://reviews.llvm.org/D50392

llvm-svn: 339756
2018-08-15 09:35:12 +00:00
Chandler Carruth 66654b72c9 [SDAG] Remove the reliance on MI's allocation strategy for
`MachineMemOperand` pointers attached to `MachineSDNodes` and instead
have the `SelectionDAG` fully manage the memory for this array.

Prior to this change, the memory management was deeply confusing here --
The way the MI was built relied on the `SelectionDAG` allocating memory
for these arrays of pointers using the `MachineFunction`'s allocator so
that the raw pointer to the array could be blindly copied into an
eventual `MachineInstr`. This creates a hard coupling between how
`MachineInstr`s allocate their array of `MachineMemOperand` pointers and
how the `MachineSDNode` does.

This change is motivated in large part by a change I am making to how
`MachineFunction` allocates these pointers, but it seems like a layering
improvement as well.

This would run the risk of increasing allocations overall, but I've
implemented an optimization that should avoid that by storing a single
`MachineMemOperand` pointer directly instead of allocating anything.
This is expected to be a net win because the vast majority of uses of
these only need a single pointer.

As a side-effect, this makes the API for updating a `MachineSDNode` and
a `MachineInstr` reasonably different which seems nice to avoid
unexpected coupling of these two layers. We can map between them, but we
shouldn't be *surprised* at where that occurs. =]

Differential Revision: https://reviews.llvm.org/D50680

llvm-svn: 339740
2018-08-14 23:30:32 +00:00
Cameron McInally 00b0658aae [FPEnv] Scalarize StrictFP vector operations
Add a helper function to scalarize constrained FP operations as needed.

Differential Revision: https://reviews.llvm.org/D50720

llvm-svn: 339735
2018-08-14 22:13:11 +00:00
Eli Friedman 0d12e90bf5 [ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly.
Intentionally excluding nodes from the DAGCombine worklist is likely to
lead to weird optimizations and infinite loops, so it's generally a bad
idea.

To avoid the infinite loops, fix DAGCombine to use the
isDesirableToCommuteWithShift target hook before performing the
transforms in question, and implement the target hook in the ARM backend
disable the transforms in question.

Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a
reduced testcase for that bug. But we should have sufficient test
coverage for PerformSHLSimplify given that we're not playing weird
tricks with the worklist. I can try to bugpoint it if necessary,
though.)

Differential Revision: https://reviews.llvm.org/D50667

llvm-svn: 339734
2018-08-14 22:10:25 +00:00
Adrian Prantl 55f4262999 [DebugInfoMetadata] Added DIFlags interface in DIBasicType.
Flags in DIBasicType will be used to pass attributes used in
DW_TAG_base_type, such as DW_AT_endianity.

Patch by Chirag Patel!

Differential Revision: https://reviews.llvm.org/D49610

llvm-svn: 339714
2018-08-14 19:35:34 +00:00
Bruno Cardoso Lopes f446282aad Revert "[DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)"
This reverts commit cb8c5e417d55141f3f079a8a876e786f44308336 / r339676.

This causing a test to fail in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/48406/

    LLVM :: DebugInfo/Generic/debug-label.ll

llvm-svn: 339700
2018-08-14 17:54:41 +00:00
Nirav Dave fbfe2ad9e0 [DAG] Avoid redundant chain transversal in store merge cycle check. NFCI.
Patch by Henric Karlsson.

llvm-svn: 339688
2018-08-14 16:20:43 +00:00