Commit Graph

191415 Commits

Craig Topper 4175d7e22e [X86] Custom isel floating point X86ISD::CMP on pre-CMOV targets. Eliminate ConvertCmpIfNecessary
If we don't have cmov, X87 compares write to FPSW and we need to
move the bits to EFLAGS to use as JCC/SETCC/CMOV conditions.

Previously this was done by calling ConvertCmpIfNecessary in
multiple places, which would emit the extra code for the FNSTSW,
shift, truncate, and SAHF instructions. Isel would then
select trunc+X86ISD::CMP to a FUCOM instruction that produces FPSW.

This patch centralizes all of the handling into a single custom
isel handler. This allows us to remove ConvertCmpIfNecessary and
a couple target specific ISD opcodes.

Differential Revision: https://reviews.llvm.org/D73863
2020-02-06 10:43:06 -08:00
Hiroshi Yamauchi 4ed205c816 [PGO][PGSO] Enable profile guided size optimization for non-cold code under instrumentation PGO.
Summary:
This enables it for large working set size cases only.

This does not enable it under sample PGO.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74073
2020-02-06 10:29:01 -08:00
Craig Topper 600f2e1c4d [X86] Remove SETB_C8r/SETB_C16r pseudo instructions. Use SETB_C32r and EXTRACT_SUBREG instead.
Only 32 and 64 bit SBB are dependency breaking instructions on some
CPUs. The 8 and 16 bit forms have to preserve upper bits of the GPR.

This patch removes the smaller forms and selects the wider form
instead. I had to do this with custom code as the tblgen generated
code glued the eflags copytoreg to the extract_subreg instead of
to the SETB pseudo.

Longer term I think we can remove X86ISD::SETCC_CARRY and use
(X86ISD::SBB zero, zero). We'll want to keep the pseudo and select
(X86ISD::SBB zero, zero) to either a MOV32r0+SBB for targets where
there is no dependency break and SETB_C32/SETB_C64 for targets
that have a dependency break. May want some way to avoid the MOV32r0
if the instruction that produced the carry flag happened to def a
register that we can use for the dependency.

I think the flag copy lowering should be using NEG instead of SUB to
handle SETB. That would avoid the MOV32r0 there. Or maybe it should
use an ADC with -1 to recreate the carry flag and keep the SETB?
That would avoid a MOVZX on the input of the SUB.

Differential Revision: https://reviews.llvm.org/D74024
2020-02-06 10:22:24 -08:00
Matt Arsenault 5a8c0f552b AMDGPU/GlobalISel: Avoid handling registers twice in waterfall loops
When multiple instructions are moved into a waterfall loop, it's
possible some of them re-use the same operands. Avoid creating
multiple sequences of readfirstlanes for them. None of the current
uses will hit this, but a future patch will.
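A generic sketch of the reuse idea (the helper names are hypothetical, not the actual AMDGPU GlobalISel code): cache the readfirstlane result per source register so a repeated operand gets one sequence instead of several.

```cpp
// Illustrative only: not the real waterfall-loop code.
#include <map>

using Register = unsigned;

// Hypothetical stand-in for emitting a V_READFIRSTLANE_B32 sequence.
static Register buildReadFirstLane(Register Src) { return Src + 1000; }

Register getOrCreateReadFirstLane(Register Src,
                                  std::map<Register, Register> &Cache) {
  auto It = Cache.find(Src);
  if (It != Cache.end())
    return It->second;                     // operand seen before: reuse it
  Register Copy = buildReadFirstLane(Src); // first use: emit the sequence
  Cache.emplace(Src, Copy);
  return Copy;
}
```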
2020-02-06 09:38:24 -08:00
Chris Bowler b373ec8ce7 [AIX] Implement caller arguments passed in stack memory.
This patch implements the caller side of placing function call arguments
in stack memory. This removes the current limitation where LLVM on AIX
will report fatal error when arguments can't be contained in registers.

There is a particular oddity that a float argument that passes in a
register and also in stack memory requires that the caller initialize
both. From what AIX "ABI" documentation I have, it's not clear that this
needs to be done; however, it is necessary for compatibility with the
AIX XL compiler, so I think it's best to implement it the same way.

Note a later patch will follow to address the callee side.

Differential Revision: https://reviews.llvm.org/D73209
2020-02-06 12:07:34 -05:00
Mikhail Maltsev 2694cc3dca [ARM][MVE] Add fixed point vector conversion intrinsics
Summary:
This patch implements the following Arm ACLE MVE intrinsics:
* vcvtq_n_*
* vcvtq_m_n_*
* vcvtq_x_n_*

and two corresponding LLVM IR intrinsics:
* int_arm_mve_vcvt_fix (vcvtq_n_*)
* int_arm_mve_vcvt_fix_predicated (vcvtq_m_n_*, vcvtq_x_n_*)
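
A hedged usage sketch of the intrinsics listed above (assuming an MVE-with-FP target and `arm_mve.h`; the fixed-point format in the comments is just an example):

```cpp
#include <arm_mve.h>

// Treat each lane of x as a fixed-point value with 3 fractional bits and
// convert to half precision: vcvtq_n_* divides by 2^3 during conversion.
float16x8_t fixed_to_float(int16x8_t x) {
  return vcvtq_n_f16_s16(x, 3);
}

// Predicated (_m) form: lanes whose predicate bit is clear take the value
// from `inactive` instead of being converted.
float16x8_t fixed_to_float_masked(float16x8_t inactive, int16x8_t x,
                                  mve_pred16_t p) {
  return vcvtq_m_n_f16_s16(inactive, x, 3, p);
}
```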

Reviewers: simon_tatham, ostannard, MarkMurrayARM, dmgreen

Reviewed By: MarkMurrayARM

Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74134
2020-02-06 16:49:45 +00:00
Petr Hosek 8707c246bc Revert "[CMake] Passthrough CMAKE_SYSTEM_NAME to default builtin and runtimes target"
This reverts commit 491a4a7ac9 as it
broke the runtimes build on Darwin.
2020-02-06 08:24:08 -08:00
Sjoerd Meijer f70109f70c [doc] typo in optimisation remark example
Fix typo in the vectorisation optimisation remarks example:

  -Rpass-missed=loop-vectorized
=>
  -Rpass-missed=loop-vectorize
2020-02-06 14:55:18 +00:00
Jeremy Morse 6531a78ac4 Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field"
This reverts commit ed29dbaafa.

I'm backing out D68945, which as the discussion for D73526 shows, doesn't
seem to handle the -O0 path through the codegen backend correctly. I'll
reland the patch when a fix is worked out, apologies for all the churn.
The two parent commits are part of this revert too.

Conflicts:
	llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
	llvm/test/DebugInfo/X86/dbg-addr-dse.ll

SelectionDAGBuilder conflict is due to a nearby change in e39e2b4a79
that's technically unrelated. dbg-addr-dse.ll conflicted because
41206b61e3 (legitimately) changes the order of two lines.

There are further modifications to dbg-value-func-arg.ll: it landed after
the patch being reverted, and I've converted indirection to be represented
by the isIndirect field rather than DW_OP_deref.
2020-02-06 14:41:40 +00:00
Jeremy Morse ece761427f Revert "[DebugInfo][DAG] Distinguish different kinds of location indirection"
This reverts commit 3137fe4d23.

I'm backing out D68945, which this patch is a follow up for. It'll be
re-landed when D68945 is fixed.

The changes to dbg-value-func-arg.ll occur because our handling of certain
kinds of location now mixes up indirection that happens at different points
in a DIExpression. While this is a regression, it's a return to the prior
behaviour while a better patch is sought.
2020-02-06 14:41:40 +00:00
Jeremy Morse ed5998d21e Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location"
This reverts commit 2d3174c4df.

The overall solution for this problem is reverting D68945, which wasn't
handling the -O0 path through the codegen backend correctly. See:
discussion in D73526.
2020-02-06 14:41:39 +00:00
Sjoerd Meijer 20a1d03d77 [ARM] peephole-bitcast test change. NFC.
This test case was XFAIL'ed because the peepholer was missing an optimisation.
But the peepholer is now able to handle this case, so enable this test. I will
close the corresponding and very old PR11364.
2020-02-06 14:36:48 +00:00
Sjoerd Meijer 93b0536fd2 [RDA] getInstFromId: find instructions. NFC.
To find the instruction in the block for a given ID, the code performed a
count and then a lookup in the map, which is almost the same operation and
thus did double the work.
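
An illustration of the double-lookup pattern described above and the single-lookup replacement (container and types are illustrative, not the actual RDA code):

```cpp
#include <map>

struct MachineInstrStub {};

// Before: count() followed by operator[] walks the map twice.
MachineInstrStub *getBefore(std::map<int, MachineInstrStub *> &M, int ID) {
  if (M.count(ID))
    return M[ID];
  return nullptr;
}

// After: a single find() does the same work once.
MachineInstrStub *getAfter(std::map<int, MachineInstrStub *> &M, int ID) {
  auto It = M.find(ID);
  return It != M.end() ? It->second : nullptr;
}
```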

Differential Revision: https://reviews.llvm.org/D73866
2020-02-06 14:13:31 +00:00
Sam Parker 0a8cae10fe [ReachingDefs] Make isSafeToMove more strict.
Test that we're not moving the instruction through instructions with
side-effects.

Differential Revision: https://reviews.llvm.org/D74058
2020-02-06 14:06:08 +00:00
Clement Courbet 89a66474b6 [llvm-exegesis] Document `repetition-mode`.
Reviewers: gchatelet

Subscribers: tschuett, mstojanovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74114
2020-02-06 13:42:12 +01:00
Russell Gallop e7cb374433 [LLD][ELF] Add time-trace to ELF LLD
This adds some LLD-specific scopes and picks up optimisation scopes
via LTO/ThinLTO. Makes use of TimeProfiler multi-thread support added in
77e6bb3c.

Differential Revision: https://reviews.llvm.org/D71060
2020-02-06 12:14:13 +00:00
Hans Wennborg abe01e17f6 Revert "[llvm-exegesis] Improve error reporting" and follow-up.
It broke e.g. all tests under tools/llvm-exegesis/X86/ when libpfm is
not available; see the comment on D74085.

This reverts commit b3576f60eb and
141915963b.
2020-02-06 12:53:16 +01:00
Hans Wennborg 4c330be678 Try to fix ilist.h after 529e6f8791 2020-02-06 12:33:44 +01:00
Hans Wennborg 1b3d1661bb StringRef.h: __builtin_strlen seems to exist in VS 2017 MSVC 19.16 or later
This is a follow-up to ff837aa63c, as discussed on the llvm-commits
thread for that one.
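
A hedged sketch of the kind of use this enables (illustrative, not the actual StringRef.h code): __builtin_strlen lets a string-view-style constructor compute the length of a literal at compile time instead of calling strlen() at run time.

```cpp
#include <cstddef>

class LiteralRef {
  const char *Data;
  size_t Length;

public:
  constexpr LiteralRef(const char *Str)
      : Data(Str), Length(__builtin_strlen(Str)) {}
  constexpr size_t size() const { return Length; }
};

// With GCC/Clang (and, per this commit, MSVC 19.16+) the length can be
// evaluated in a constant expression.
static_assert(LiteralRef("hello").size() == 5, "length known at compile time");
```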
2020-02-06 12:33:44 +01:00
Miloš Stojanović 141915963b [llvm-exegesis] Improve error reporting in Target.cpp
Followup to D74085.
Replace the use of `report_fatal_error()` with returning the error to
`llvm-exegesis.cpp` and handling it there.

Differential Revision: https://reviews.llvm.org/D74113
2020-02-06 12:26:08 +01:00
Miloš Stojanović b3576f60eb [llvm-exegesis] Improve error reporting
Fix inconsistencies in error reporting created by mixing
`report_fatal_error()` and `ExitOnErr()`, and add additional information
to the error message to make it more user friendly. Minimize the use of
`report_fatal_error()` because it's meant for use in very rare cases and
results in error messages with low information density.

Summary of the new design:

 * For command line argument errors output `llvm-exegesis: <error_message>`,
   which is consistent with the error output format emitted by the backend
   which checks correctness of the command line arguments.
 * For other errors the format `llvm-exegesis error: <error_message>` is used.
 ** If the error occurred during file access, `<error_message>` will consist
    of two parts: `'<file_name>': <rest_of_the_error_message>`

Differential Revision: https://reviews.llvm.org/D74085
2020-02-06 12:26:08 +01:00
Simon Pilgrim 529e6f8791 [ADT] Fix iplist_impl - use after move warnings (PR43943)
As detailed on PR43943, we're seeing static analyzer use-after-move warnings in the iplist_impl move constructor/operator, as they call std::move on both the TraitsT and IntrusiveListT base classes.

As suggested by @dexonsmith this patch casts the moved value to the base classes to silence the warnings.
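
A minimal sketch of the pattern (illustrative, not the exact iplist_impl code): explicitly casting the moved-from object to each base class makes it clear to the analyzer that each base subobject is moved exactly once.

```cpp
#include <utility>

template <class TraitsT, class IntrusiveListT>
class iplist_like : public TraitsT, public IntrusiveListT {
public:
  iplist_like(iplist_like &&X)
      : TraitsT(std::move(static_cast<TraitsT &>(X))),
        IntrusiveListT(std::move(static_cast<IntrusiveListT &>(X))) {}
};
```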

Differential Revision: https://reviews.llvm.org/D74062
2020-02-06 11:22:21 +00:00
Denis Antrushin 99a6e405ed [IRCE] Use SCEVExpander to modify loop bound
The IRCE pass checks that it can calculate loop bounds by checking
SCEV availability at loop entry. However, it is possible that the loop
bound SCEV is loop-invariant while the instruction used to compute it
resides within the loop. In such a case, adjusting the loop bound in the
preheader using IRBuilder leads to malformed SSA.
Use SCEVExpander instead to generate proper instructions.

Reviewed-by: mkazantsev
Differential Revision: https://reviews.llvm.org/D73496
2020-02-06 12:44:43 +03:00
Fangrui Song 819e755a26 [llvm-readobj][test] Fix test after yaml2obj change (D74034) 2020-02-06 01:22:10 -08:00
Diogo Sampaio 8ba2b62810 [ARM] Fix non-deterministic behaviour
Summary:
The ARM Type Promotion pass does not clear
the container that records whether a variable
has been visited, so it misses optimization
opportunities when, by bad luck, two llvm::Values
from different functions are allocated at
the same memory address.
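
A simplified sketch of this bug class (types and names are illustrative, not the actual pass): pointer-keyed "visited" state kept across functions can wrongly match a new Value that reuses a freed Value's address.

```cpp
#include <set>

struct Value {};

struct TypePromotionLikePass {
  std::set<const Value *> Visited; // persists across runOnFunction calls

  bool runOnFunction(/* Function &F */) {
    Visited.clear(); // the fix: reset per-function state before visiting
    // ... walk the function, adding visited Values to the set ...
    return false;
  }
};
```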

Also fixes a comment and uses the existing
method to pop and obtain the last element
of the worklist.

Reviewers: samparker

Reviewed By: samparker

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73970
2020-02-06 09:21:13 +00:00
Miloš Stojanović b093b66370 [NFC] Fix error handling documentation
The default Error constructor can't be used since rL286561.
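
An illustration of the post-rL286561 pattern the documentation should show (a hedged sketch, not the documentation's exact example): success values come from Error::success(), and failures from a helper such as createStringError().

```cpp
#include "llvm/Support/Error.h"

llvm::Error mayFail(bool Fail) {
  if (Fail)
    return llvm::createStringError(llvm::inconvertibleErrorCode(),
                                   "something went wrong");
  return llvm::Error::success(); // `return llvm::Error();` no longer compiles
}
```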

Differential Revision: https://reviews.llvm.org/D74069
2020-02-06 10:20:00 +01:00
Fangrui Song a29a9a34f4 [yaml2obj] Refactor command line parsing
* Hide unrelated options.
* Add "OVERVIEW: " to yaml2obj -h/--help.
* Place options under a yaml2obj category.
* Disallow -docnum. Currently -docnum is the only yaml2obj specific long option that is affected.
* Specify `cl::init("-")` and `cl::Prefix` for OutputFilename. The
  latter allows `-ofile` (see the sketch below).
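
A hedged sketch of the option style described above (not the exact yaml2obj code): cl::Prefix lets the value be glued to the flag, as in `-ofile`, and cl::init("-") defaults the output to stdout.

```cpp
#include "llvm/Support/CommandLine.h"
#include <string>

static llvm::cl::OptionCategory Yaml2ObjCat("yaml2obj options");

static llvm::cl::opt<std::string>
    OutputFilename("o", llvm::cl::desc("Output filename"),
                   llvm::cl::value_desc("filename"), llvm::cl::init("-"),
                   llvm::cl::Prefix, llvm::cl::cat(Yaml2ObjCat));
```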

Reviewed By: grimar, jhenderson

Differential Revision: https://reviews.llvm.org/D73982
2020-02-06 01:11:58 -08:00
Georgii Rymar fd0abcbfc1 [yaml2obj] - Change NameIndex to StName for Symbol.
This is consistent with the approach we use for the Section struct.

Differential revision: https://reviews.llvm.org/D74034
2020-02-06 12:04:19 +03:00
Hans Wennborg 67905fc13e Fix some typos in ArrayRef.h 2020-02-06 09:34:29 +01:00
Teresa Johnson 25aa2eef99 Revert "[WPD/LowerTypeTests] Delay lowering/removal of type tests until after ICP"
This reverts commit 748bb5a0f1.

Due to Chromium CFI+ThinLTO test crashes reported on patch.
2020-02-05 19:27:32 -08:00
Petr Hosek 00b3d49d3a [CMake] Link against ZLIB::ZLIB
This is the imported target that find_package(ZLIB) defines.
2020-02-05 18:06:13 -08:00
Huihui Zhang 5389ca7a1f [ConstantFold][NFC] Move scalable vector unit tests under vscale.ll 2020-02-05 16:03:51 -08:00
Huihui Zhang 801857c59e [ConstantFold][SVE] Fix constant folding for bitcast.
Do not iterate on scalable vector type in BitCastConstantVector.
Continuation work of D70985, D71147.

Support for folding bitcast into splat value is kept in D74095, as
it depends on D71637.

Differential Revision: https://reviews.llvm.org/D71389
2020-02-05 15:39:57 -08:00
Matt Arsenault 7464e8d6ad GlobalISel: Remove check for illegal MIR
The verifier will catch this.
2020-02-05 18:37:17 -05:00
Jessica Paquette a373841407 [AArch64][GlobalISel] Emit TBNZ with G_BRCOND where the condition is SLT
When we have a G_ICMP which checks SLT, and the comparison is against 0, we
can emit a TBNZ instead of a CBZ: a signed value is less than 0 exactly when
its sign bit is set, so testing that single bit is enough.

This lets us fold things into the branch, which can provide some code size
savings.

This is similar to the case in `AArch64TargetLowering::LowerBR_CC`.

https://reviews.llvm.org/D74090
2020-02-05 15:23:54 -08:00
Jessica Paquette bab993451e [AArch64][GlobalISel][NFC] Factor out TB(N)Z emission code into its own function
Factor it out into `emitTestBit` and add some asserts to the new function.

This will be useful for implementing TB(N)Z emission for SLT/SGT compares.

Differential Revision: https://reviews.llvm.org/D74080
2020-02-05 15:15:44 -08:00
Jessica Paquette 7212f65784 [AArch64][GlobalISel] Fold G_LSHR into test bit calculation
Add support for walking through G_LSHR in `getTestBitReg`. Equivalent to the
code in `getTestBitOperand` in AArch64ISelLowering.

```
(tbz (lshr x, c), b) -> (tbz x, b+c) when b + c is < # bits in x
```

Differential Revision: https://reviews.llvm.org/D74077
2020-02-05 15:14:12 -08:00
Jonas Paulsson 96ea377ea4 [PHIElimination] Compile time optimization for huge functions.
This is a compile-time optimization for PHIElimination (splitting of critical
edges), which was reported at https://bugs.llvm.org/show_bug.cgi?id=44249. As
discussed there, the way to remedy the slowdowns with huge functions is to
pre-compute the live-in registers for each MBB in an efficient way in
PHIElimination.cpp and then pass that information along to
LiveVariables::addNewBlock().

In all the huge test programs where this slowdown has been noticeable, it has
disappeared entirely with this patch.

Review: Björn Pettersson, Quentin Colombet.

Differential Revision: https://reviews.llvm.org/D73152
2020-02-05 18:10:03 -05:00
Matt Arsenault 89b7091c28 AMDGPU: Make LDS_DIRECT an artificial register 2020-02-05 17:47:22 -05:00
Matt Arsenault 9087ef0765 GlobalISel: Allow CSE of G_IMPLICIT_DEF
The legalizer produces a lot of these, and they make reading legalized
MIR annoying. For some reason, this does seem to sometimes introduce
copies of implicit def, which is dumb.
2020-02-05 17:47:21 -05:00
David Blaikie a4b590dd39 DebugInfo: Stabilize DW_OP_convert tests so they don't depend on register allocation, etc 2020-02-05 14:28:03 -08:00
Jonas Paulsson 4a3760d2ba [SystemZ] Improve handling of inline asm constraints.
The "{=v0}" constraint did not result in the expected error message in the
absence of the vector facility, because 'v0' matches as a string into the
AnyRegBitRegClass in common code.

This patch adds checks for vector support in case of "{v" and soft-float in
case of "{f" to remedy this.

Review: Ulrich Weigand.
2020-02-05 17:04:16 -05:00
Juneyoung Lee 5687acf431 [MemCpyOpt] Simplify find*Alignment 2020-02-06 06:42:07 +09:00
Craig Topper c6bdd8e731 [X86] Improve the gather scheduler models for SkylakeClient and SkylakeServer
The load ports need a cycle for each potentially loaded element just like Haswell and Skylake. Unlike Haswell and Broadwell, the number of uops does not scale with the number of elements. Instead the load uops run for multiple cycles.

I've taken the latency number from uops.info. The port binding for the non-load uops is taken from the original IACA data I have.

Differential Revision: https://reviews.llvm.org/D74000
2020-02-05 13:26:47 -08:00
Matt Arsenault baafe82b07 AMDGPU/GlobalISel: Remove bitcast legality hack 2020-02-05 16:24:24 -05:00
Juneyoung Lee ad9ae6ee2b MemCpyOpt cannot use ABI alignment even if it was not given
Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388, where an ABI alignment was incorrectly assigned to memset when no explicit alignment was given.

Reviewers: gchatelet, lenary, nikic

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74083
2020-02-06 06:21:55 +09:00
Hans Wennborg 6c4a8bc0a9 Make llvm::crc32() work also for input sizes larger than 32 bits.
The problem was noticed by the Chrome OS toolchain folks
(crbug.com/1048445) because llvm-objcopy --add-gnu-debuglink would
insert the wrong checksum when processing a binary larger than 4 GB.
That use case regressed in 1e1e3ba252 when we started using
llvm::crc32() in more places.
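
A generic sketch of the chunking idea (assuming a CRC update primitive with a 32-bit length parameter, such as zlib's crc32(); this is not necessarily how the patch itself is written): feed the buffer in pieces no larger than 32 bits' worth of bytes so inputs over 4 GB are handled correctly.

```cpp
#include <zlib.h>
#include <cstdint>
#include <cstddef>

uint32_t crc32_large(const unsigned char *Data, size_t Size) {
  uLong CRC = ::crc32(0L, nullptr, 0); // initial CRC value
  while (Size > 0) {
    uInt Chunk = Size > UINT32_MAX ? static_cast<uInt>(UINT32_MAX)
                                   : static_cast<uInt>(Size);
    CRC = ::crc32(CRC, Data, Chunk);   // update over at most ~4 GB at a time
    Data += Chunk;
    Size -= Chunk;
  }
  return static_cast<uint32_t>(CRC);
}
```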

Differential revision: https://reviews.llvm.org/D74039
2020-02-05 21:32:11 +01:00
Matt Arsenault 364326ce66 AMDGPU/GlobalISel: Add mem operand to s.buffer.load intrinsic
Really the intrinsic definition is wrong, but work around this
here. The DAG lowering introduces an MMO. We have to introduce a new
operation to avoid the verifier complaining about the missing mayLoad.
2020-02-05 15:04:42 -05:00
Sanjay Patel 0a389c81cd [x86] use getSplatIndex() in lowerShuffleAsBroadcast()
The old code was doing an N^2 search for splat index.

Differential Revision: https://reviews.llvm.org/D74064
2020-02-05 14:55:02 -05:00
Sanjay Patel 686a038ed8 [Analysis] add query to get splat value from array of ints
I was debug stepping through an x86 shuffle lowering and
noticed we were doing an N^2 search for splat index. I
didn't find the equivalent functionality anywhere else in
LLVM, so here's a helper that takes an array of int and
returns a splatted index while ignoring undefs (any
negative value).
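
A rough sketch of such a helper under those assumptions (illustrative, not the actual signature added by this patch): a single linear pass that ignores undef (negative) entries and returns -1 when the mask is not a splat.

```cpp
#include <vector>

int getSplatIndexSketch(const std::vector<int> &Mask) {
  int Splat = -1;
  for (int M : Mask) {
    if (M < 0)
      continue;               // undef: does not constrain the splat
    if (Splat == -1)
      Splat = M;              // first defined element sets the candidate
    else if (M != Splat)
      return -1;              // two different defined elements: not a splat
  }
  return Splat;
}
```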

This might also be used inside existing
ShuffleVectorInst/ShuffleVectorSDNode functions and/or
help with D72467.

Differential Revision: https://reviews.llvm.org/D74064
2020-02-05 14:55:02 -05:00