Commit Graph

179256 Commits

Author SHA1 Message Date
Shawn Landden fa91ab85d9 [SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger
llvm-svn: 361728
2019-05-26 13:55:52 +00:00
Shawn Landden 30111c786f [SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize
Rather than gating on "isSwitchDense" (resulting in necessesarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.

This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.

Be careful to not generate worse code, by introducing a
SubThreshold heuristic.

Instead of just sorting by signed, generalize the finding of the
best base.

And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32768 and 32769, of which
32769 would be interpreted as -32768, and now the code thinks
the table is size 65536.

(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)

And the reason test4 and test5 did not trigger was documented wrong:
it was because they were not considered sufficiently "dense".

Also, fix generation of invalid LLVM-IR: shl by bit-width.

llvm-svn: 361727
2019-05-26 13:55:14 +00:00
Shawn Landden 444eaaf1cc [SimpligyCFG] NFC, remove GCD that was only used for powers of two
and replace with an equilivent countTrailingZeros.

GCD is much more expensive than this, with repeated division.

This depends on D60823

llvm-svn: 361726
2019-05-26 13:54:04 +00:00
Shawn Landden 50c73a044f [SimplifyCFG] NFC, update Switch tests to HEAD so I can see if my changes change anything
Also add baseline tests to show effect of later patches.

llvm-svn: 361725
2019-05-26 13:52:41 +00:00
Shawn Landden b7cc093db2 [Support] make countLeadingZeros() and countTrailingZeros() return unsigned
This matches countLeadingOnes() and countTrailingOnes(), and
APInt's countLeadingZeros() and countTrailingZeros().

(as well as __builtin_clzll())

llvm-svn: 361724
2019-05-26 13:49:58 +00:00
Nikita Popov d0f13e618f [ValueTracking] Base computeOverflowForUnsignedMul() on ConstantRange code; NFCI
The implementation in ValueTracking and ConstantRange are equally
powerful, reuse the one in ConstantRange, which will make this easier
to extend.

llvm-svn: 361723
2019-05-26 13:22:01 +00:00
Nico Weber 7228b50802 gn build: Merge r361664
llvm-svn: 361722
2019-05-26 13:06:48 +00:00
Nikita Popov 39f2bebf41 [InstCombine] Refactor OptimizeOverflowCheck; NFCI
Extract method to compute overflow based on binop and signedness,
and then make the result handling code generic. This extends the
always-overflow handling to signed muls, but has currently no effect,
as we don't compute always overflow for them (thus NFC).

llvm-svn: 361721
2019-05-26 11:43:37 +00:00
Nikita Popov 352f598795 [InstCombine] Remove OverflowCheckFlavor; NFC
Instead pass binary op and signedness. The extra enum only makes
things more complicated in this case.

llvm-svn: 361720
2019-05-26 11:43:31 +00:00
David Green 0dbafe191e [ARM] Select fp16 fma
This adds a pattern for fma, similar to the float and double patterns.

Differential Revision: https://reviews.llvm.org/D62330

llvm-svn: 361719
2019-05-26 11:34:30 +00:00
David Green 21542cd6f4 [ARM] Select a number of fp16 rounding functions
This add patterns for fp16 round and ceil etc. Same as the float and double
patterns.

Differential Revision: https://reviews.llvm.org/D62326

llvm-svn: 361718
2019-05-26 11:13:00 +00:00
David Green c9f4b7d201 [ARM] Promote various fp16 math intrinsics
Promote a number of fp16 math intrinsics to float, so that the relevant float
math routines can be used. Copysign is expanded so as to be handled in-place.

Differential Revision: https://reviews.llvm.org/D62325

llvm-svn: 361717
2019-05-26 10:59:21 +00:00
Simon Pilgrim 58a8541dcc [X86][AVX] combineBitcastvxi1 - peek through bitops to determine size of original vector
We were only testing for direct SETCC results - this allows us to peek through AND/OR/XOR combinations of the comparison results as well.

There's a missing SEXT(PACKSS) fold that I need to investigate for v8i1 cases before I can enable it there as well.

llvm-svn: 361716
2019-05-26 10:54:23 +00:00
David Green 2881325b17 [ARM] Select fp16 fabs
This adds a pattern for the fabs intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62324

llvm-svn: 361715
2019-05-26 10:51:58 +00:00
David Green aeade651f3 [ARM] Select fp16 fsqrt
This adds a pattern for the sqrt intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62322

llvm-svn: 361714
2019-05-26 10:42:24 +00:00
David Green caf8a11b65 [ARM] Promote fp16 frem
Promote fp16 frem operations on ARM to floats so they call fmodf.

Differential Revision: https://reviews.llvm.org/D62321

llvm-svn: 361713
2019-05-26 10:30:22 +00:00
David Green 1c1e2ca022 [ARM] Add some base fullfp16 tests. NFC
llvm-svn: 361712
2019-05-26 10:06:40 +00:00
Fangrui Song 603ca511f9 [PowerPC] Add missing R_PPC_* relocation types
While people mostly care about 64-bit, some systems need basic lib32
support. The plan is to make lld (see PR40888) capable of linking some
applications (PR40888).

llvm-svn: 361711
2019-05-26 08:31:00 +00:00
David Bolvansky 0290a77aa8 [SimplifyCFG] Added condition assumption for unreachable blocks
Summary: PR41688

Reviewers: spatel, efriedma, craig.topper, hfinkel, reames

Reviewed By: hfinkel

Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61409

llvm-svn: 361707
2019-05-25 22:34:27 +00:00
Simon Pilgrim 40fa52b174 [X86] lowerBuildVectorToBitOp - support build_vector(shift()) -> shift(build_vector(),C)
Commonly occurs in sign-extension cases

llvm-svn: 361706
2019-05-25 18:02:17 +00:00
Robert Widmann b0fd12b689 [LLVM-C] Add Accessor for Mach-O Universal Binary Slices
Summary: Allow for retrieving an object file corresponding to an architecture-specific slice in a Mach-O universal binary file.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60378

llvm-svn: 361705
2019-05-25 16:47:27 +00:00
Nikita Popov d87eceda0e [X86] Combine fminnum/fmaxnum with non-nan operand to fmin/fmax
If we have a known non-nan operand, place it in the second operand
of fmin/fmax that is returned if either operand is nan.

Differential Revision: https://reviews.llvm.org/D62448

llvm-svn: 361704
2019-05-25 16:44:29 +00:00
Nikita Popov 6bb5041e94 [LVI][CVP] Add support for saturating add/sub
Adds support for the uadd.sat family of intrinsics in LVI, based on
ConstantRange methods from D60946.

Differential Revision: https://reviews.llvm.org/D62447

llvm-svn: 361703
2019-05-25 16:44:14 +00:00
Simon Pilgrim 34d5a74b03 [X86][SSE] vector-sext - cleanup prefix lists
Add X32-SSE common prefix to merge some checks

llvm-svn: 361702
2019-05-25 16:33:17 +00:00
Sanjay Patel 3f0905e46f [SelectionDAG] define binops as a superset of commutative binops
The test diffs show improved vector narrowing for integer min/max opcodes because
those were all absent from the list. I'm not sure if we can expose functional diffs
for all of the moved/added opcodes though.

It seems like we are missing an AVX512 opportunity to use 256-bit ops in place of
512-bit ops on some tests/targets, but I think that can be a follow-up.

Preliminary steps to make sure the callers are not misusing these queries:
rL361268
rL361547

Differential Revision: https://reviews.llvm.org/D62191

llvm-svn: 361701
2019-05-25 15:28:55 +00:00
Nikita Popov c9de92ee76 [X86] Add tests for min/maxnum with const operand; NFC
llvm-svn: 361700
2019-05-25 15:06:54 +00:00
Nikita Popov 3c7edb2de5 [LoopVectorize] Fix test by regenerating checks
llvm-svn: 361699
2019-05-25 14:33:30 +00:00
Nikita Popov 8b1fa07639 [CVP] Remove unnecessary checks for empty GNWR; NFC
The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.

To make this more obviously true, replace the conversative return
in makeGNWR with an assertion.

llvm-svn: 361698
2019-05-25 14:11:55 +00:00
David Bolvansky 2149811854 [NFC] Make tests more robust for new optimizations
llvm-svn: 361697
2019-05-25 14:10:20 +00:00
Sanjay Patel 91131b6500 [SelectionDAG] soften assertion when legalizing narrow vector FP ops
The test based on PR42010:
https://bugs.llvm.org/show_bug.cgi?id=42010
...may show an inaccuracy for PPC's target defs, but we should not
be so aggressive with an assert here. There's no telling what out-of-tree
targets look like.

llvm-svn: 361696
2019-05-25 13:48:07 +00:00
David Bolvansky bb76cf0f96 [NFC] Update test checks
llvm-svn: 361695
2019-05-25 13:11:22 +00:00
Nikita Popov 9a33dc9fb8 [CVP] Add tests for saturating add/sub ranges; NFC
llvm-svn: 361694
2019-05-25 09:53:51 +00:00
Nikita Popov 024b18aca7 [LVI][CVP] Calculate with.overflow result range
In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0)
as the range of op(%x, %y). This is mainly useful in conjunction with
D60650: If the result of the operation is extracted in a branch guarded
against overflow, then the value of %x will be appropriately constrained
and the result range of the operation will be calculated taking that
into account.

Differential Revision: https://reviews.llvm.org/D60656

llvm-svn: 361693
2019-05-25 09:53:45 +00:00
Nikita Popov 17367b0d89 [LVI] Extract helper for binary range calculations; NFC
llvm-svn: 361692
2019-05-25 09:53:37 +00:00
Craig Topper 46e5052b8e [X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. Support LEA64_32r properly.
INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags.

This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg.

One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think its important. Not sure which operand its supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input.

Differential Revision: https://reviews.llvm.org/D61472

llvm-svn: 361691
2019-05-25 06:17:47 +00:00
Craig Topper 4b08fcdeb1 [X86] Add zero idioms to the haswell, broadwell, and skylake schedule models. Add 256-bit fp xor to sandybridge zero idioms
This copies the Sandy Bridge zero idiom support to later CPUs. Adding the AVX2 and AVX512F/VL instructions as appropriate.

Differential Revision: https://reviews.llvm.org/D62360

llvm-svn: 361690
2019-05-25 04:47:49 +00:00
Craig Topper af6c9df163 [X86][llvm-mca] Add zero idiom tests for Intel CPUs. NFC
This pre-commits tests for D62360

llvm-svn: 361689
2019-05-25 04:47:42 +00:00
Peter Collingbourne 3b93737446 Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."
Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio

llvm-svn: 361688
2019-05-25 01:52:38 +00:00
Akira Hatanaka a846427ad0 Revert "[Analysis] Link library dependencies to Analysis plugins"
This reverts commit r361340. The following builder has been broken for
the past few days because of this commit:

http://green.lab.llvm.org/green/job/clang-stage2-cmake-RgSan/

Also revert r361399, which was committed to fix r361340.

llvm-svn: 361685
2019-05-25 00:50:03 +00:00
Nico Weber bab1d8edcf Rename clangToolingRefactor to clangToolingRefactoring for consistency with its directory
See "[cfe-dev] The name of clang/lib/Tooling/Refactoring".

Differential Revision: https://reviews.llvm.org/D62420

llvm-svn: 361684
2019-05-25 00:27:19 +00:00
David Blaikie a17564c2f1 llvm-dwarfdump: Don't error on mixed units using/not using str_offsets
This lead to errors when dumping binaries with v4 and v5 units linked
together (but could've also errored on v5 units that did/didn't use
str_offsets).

Also improves error handling and messages around invalid str_offsets
contributions.

llvm-svn: 361683
2019-05-25 00:07:22 +00:00
Jessica Paquette 97d668d70f [GlobalISel][AArch64] Make FP constraint checks consider possible use/def banks
In a few places in getInstrMapping, we check if use/def instructions for the
instruction we're mapping have floating point constraints.

We can improve this check and reduce the number of copies in GISel-compiled code
if we make a couple observations:

- For a def instruction, it only matters if the def instruction must always
  output a value stored on a FPR

- For a use instruction, it only matters if the use instruction must always
  only take in values stored in FPRs

This adds two new functions:

- onlyUsesFP
- onlyDefinesFP

Then we can use those when we're checking the uses/defs instead.

Without this patch, the load, unmerge, store, and select in the added test
would have unnecessary copies.

Differential Revision: https://reviews.llvm.org/D62426

llvm-svn: 361679
2019-05-24 23:08:45 +00:00
Jessica Paquette bede937b16 [GlobalISel][AArch64] NFC: Factor out HasFPConstraints into a proper function
Factor it out into a function, and replace places where we had the same check
with the new function.

Differential Revision: https://reviews.llvm.org/D62421

llvm-svn: 361677
2019-05-24 22:12:21 +00:00
Jonas Devlieghere 0da8160df3 [dwarfdump] Add flag to limit the number of parents DIEs
This adds `-parent-recurse-depth` which limits the number of parent DIEs
being dumped.

Differential revision: https://reviews.llvm.org/D62359

llvm-svn: 361671
2019-05-24 21:11:28 +00:00
Jason Liu 8e1d921bb3 Implement call lowering without parameters on AIX
Summary:dd
This patch implements call lowering for calls without parameters
on AIX as initial support.

Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma

Differential Revision: https://reviews.llvm.org/D61948

llvm-svn: 361669
2019-05-24 20:54:35 +00:00
Jessica Paquette 56503865ed [GlobalISel][AArch64] Improve register bank mappings for G_SELECT
The fcsel and csel instructions differ in only the register banks they work on.

So, they're entirely interchangeable otherwise.

With this in mind, this does two things:

- Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as
  the outputs.
- Teach it to choose the best register bank mapping based off the constraints
  of the inputs and outputs.

The "best" in this case means the one that requires the smallest number of
copies to properly emit a fcsel/csel.

For example, if the inputs are all already going to be on FPRs, we should
emit a fcsel, even if the output is a GPR. This costs one copy to produce the
result, but saves us from copying the inputs into GPRs.

Also update the regbank-select.mir to check that we end up with the right
select instruction.

Differential Revision: https://reviews.llvm.org/D62267

llvm-svn: 361665
2019-05-24 19:35:25 +00:00
Nick Desaulniers 33bc64202b [AArch64] check for INLINEASM_BR along w/ INLINEASM
Summary:
It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.

pr/41999

Reviewers: t.p.northover, peter.smith

Reviewed By: peter.smith

Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62402

llvm-svn: 361661
2019-05-24 19:00:13 +00:00
Nick Desaulniers 9f7bd71cf5 [ARM] additionally check for ARM::INLINEASM_BR w/ ARM::INLINEASM
Summary:
We were observing failures for arm32 allyesconfigs of the Linux kernel
with the asm goto Clang patch, where ldr's were being generated to
offsets too far away to encode in imm12.

It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.

pr/41999

Link: https://github.com/ClangBuiltLinux/linux/issues/490

Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover

Reviewed By: peter.smith

Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62400

llvm-svn: 361659
2019-05-24 18:58:21 +00:00
Matt Arsenault 3d59e388ca AMDGPU: Activate all lanes when spilling CSR VGPR for SGPR spills
If some lanes weren't active on entry to the function, this could
clobber their VGPR values.

llvm-svn: 361655
2019-05-24 18:18:51 +00:00
Matt Arsenault 0ff901fba0 AMDGPU: Boost inline threshold with addrspacecasted alloca arguments
This was skipping GetUnderlyingObject for nonprivate addresses, but an
alloca could also be found through an addrspacecast if it's flat.

llvm-svn: 361649
2019-05-24 16:52:35 +00:00