Commit Graph

47730 Commits

Author SHA1 Message Date
Matthias Braun cc603ee3d5 TargetLibraryInfo: Stop guessing wchar_t size
Usually the frontend communicates the size of wchar_t via metadata and
we can optimize wcslen (and possibly other calls in the future). In
cases without the wchar_size metadata we would previously try to guess
the correct size based on the target triple; however this is fragile to
keep up to date and may miss users manually changing the size via flags.
Better be safe and stop guessing and optimizing if the frontend didn't
communicate the size.

Differential Revision: https://reviews.llvm.org/D38106

llvm-svn: 314185
2017-09-26 02:36:57 +00:00
Dylan McKay f2c83670f7 [AVR] Fix the build after setting alignment to 1 in r314179
Changing all types to be byte-aligned broke a small number of tests.

llvm-svn: 314183
2017-09-26 02:07:54 +00:00
Vedant Kumar feb3f5272f [llvm-cov] Warn if -show-functions is used without query files
llvm-cov's report mode does not print any output when -show-functions is
specified and no source files are specified. This can be surprising, so
the tool should at least print out an error message when this happens.

rdar://problem/34636859

llvm-svn: 314175
2017-09-25 23:10:03 +00:00
Eli Friedman edee9999c4 Revert r312724 ("[ARM] Remove redundant vcvt patterns.").
It leads to some improvements, but also a regression for the simple
case, so it's not clearly a good idea.

test/CodeGen/ARM/vcvt.ll now has test coverage to show the difference.

Ultimately, the right solution is probably to custom-lower fp-to-int
conversions, to something like ARMISD::VCVT_F32_S32 plus a bitcast.
It's hard to do the right thing when the implicit bitcast isn't visible
to DAG transforms.

llvm-svn: 314169
2017-09-25 22:07:33 +00:00
Saleem Abdulrasool 2e0d72311b X86: remove R12 from CSR on Windows x64 SwiftCC
R12 is used for the SwiftError parameter.  It is no longer a CSR as it
is used for transfer the SwiftError, and the caller must preserve it if
they need to.

llvm-svn: 314165
2017-09-25 22:00:17 +00:00
Eli Friedman 48853741ad [ARM] Fix tests for vcvt+store to return void.
This is what I meant to do in r314161; I didn't realize I'd messed up
because the generated assembly is currently identical.

llvm-svn: 314163
2017-09-25 21:55:27 +00:00
Eli Friedman 7961112df9 [ARM] Add tests for vcvt followed by store.
llvm-svn: 314161
2017-09-25 21:37:52 +00:00
Eli Friedman 7404fad205 [ARM] Regenerate vcvt test checks.
llvm-svn: 314160
2017-09-25 21:34:29 +00:00
Craig Topper 5124a14d9c [X86] Don't select anyext GR32->GR64 to SUBREG_TO_REG. Use INSERT_SUBREG instead.
As far as I know SUBREG_TO_REG is stating that the upper bits are 0. But if we are just converting the GR32 with no checks, then we have no reason to say the upper bits are 0.

I don't really know how to test this today since I can't find anything that looks that closely at SUBREG_TO_REG. The test changes here seems to be some perturbance of register allocation.

Differential Revision: https://reviews.llvm.org/D38001

llvm-svn: 314152
2017-09-25 21:14:59 +00:00
Craig Topper d830f276c1 [X86] Make all the NOREX CodeGenOnly instructions into postRA pseudos like the NOREX version of TEST.
llvm-svn: 314151
2017-09-25 21:14:55 +00:00
Sanjay Patel ecb175608f [InstCombine] remove extract-of-select vector transform (2nd try)
The 1st attempt at this:
https://reviews.llvm.org/rL314117
was reverted at:
https://reviews.llvm.org/rL314118

because of bot fails for clang tests that were checking optimized IR. That should be fixed with:
https://reviews.llvm.org/rL314144
...so try again. 

Original commit message:

The transform to convert an extract-of-a-select-of-vectors was added at:
https://reviews.llvm.org/rL194013

And a question about the validity of this transform was raised in the review:
https://reviews.llvm.org/D1539:
...but not answered AFAICT>

Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with
the original commit, but they are not regressing even after we remove the transform in this patch.

The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do
those transforms as canonicalizations.

The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301:
https://bugs.llvm.org/show_bug.cgi?id=33301
...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see.

Differential Revision: https://reviews.llvm.org/D38006

llvm-svn: 314147
2017-09-25 20:30:53 +00:00
Justin Lebar d31d5e6aa2 Revert "[NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins.", rL314135.
Causing assertion failures on macos:

> Assertion failed: (Num < NumOperands && "Invalid child # of SDNode!"),
> function getOperand, file
> /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/include/llvm/CodeGen/SelectionDAGNodes.h,
> line 835.

http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/42739/testReport/LLVM/CodeGen_NVPTX/surf_read_cuda_ll/

llvm-svn: 314142
2017-09-25 19:41:56 +00:00
Konstantin Belochapka 741099bc0f [X86] [ASM INTEL SYNTAX] fix for incorrect assembler code generation when x86-asm-syntax=intel (PR34617).
Fix for incorrect code generation when x86-asm-syntax=intel.
Differential Revision: https://reviews.llvm.org/D37945

llvm-svn: 314140
2017-09-25 19:26:48 +00:00
Craig Topper 5bc10ede53 [SelectionDAG] Teach simplifyDemandedBits to handle shifts by constant splat vectors
This teach simplifyDemandedBits to handle constant splat vector shifts.

This required changing some uses of getZExtValue to getLimitedValue since we can't rely on legalization using getShiftAmountTy for the shift amount.

I believe there may have been a bug in the ((X << C1) >>u ShAmt) handling where we didn't check if the inner shift was too large. I've fixed that here.

I had to add new patterns to ARM because the zext/sext the patterns were trying to look for got turned into an any_extend with this patch. Happy to split that out too, but not sure how to test without this change.

Differential Revision: https://reviews.llvm.org/D37665

llvm-svn: 314139
2017-09-25 19:26:08 +00:00
Alexey Bataev b3aec7a636 [SLP] Add a test for PR32086, NFC.
llvm-svn: 314137
2017-09-25 19:12:59 +00:00
Artem Belevich 9941ee9529 [NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins.
Differential Revision: https://reviews.llvm.org/D38191

llvm-svn: 314135
2017-09-25 18:53:57 +00:00
Hongbin Zheng f0093e45c4 [SimplifyIndvar] Replace the srem used by IV if we can prove both of its operands are non-negative
Since now SCEV can handle 'urem', an 'urem' is a better canonical form than an 'srem' because it has well-defined behavior

This is a follow up of D34598

Differential Revision: https://reviews.llvm.org/D38072

llvm-svn: 314125
2017-09-25 17:39:40 +00:00
Arnold Schwaighofer ae4de58a5b ARM: Use the proper swifterror CSR list on platforms other than darwin
Noticed by inspection

llvm-svn: 314121
2017-09-25 17:19:50 +00:00
Sanjay Patel aa7f750bec revert r314117 because there are bogus clang tests that depend on the optimizer
llvm-svn: 314118
2017-09-25 17:00:04 +00:00
Sanjay Patel 9639897d77 [InstCombine] remove extract-of-select vector transform
The transform to convert an extract-of-a-select-of-vectors was added at:
rL194013

And a question about the validity of this transform was raised in the review:
https://reviews.llvm.org/D1539:
...but not answered AFAICT>

Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with
the original commit, but they are not regressing even after we remove the transform in this patch.

The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do
those transforms as canonicalizations.

The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301:
https://bugs.llvm.org/show_bug.cgi?id=33301
...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see.

Differential Revision: https://reviews.llvm.org/D38006

llvm-svn: 314117
2017-09-25 16:41:34 +00:00
Reid Kleckner 8898cd8dcf [DebugInfo] Sort the SDDbgValue list before assuming it is in IR order
Summary:
This code iterates the 'Orders' vector in parallel with the DbgValue
list, emitting all DBG_VALUEs that occurred between the last IR order
insertion point and the next insertion point. This assumes the
SDDbgValue list is sorted in IR order, which it usually is. However, it
is not sorted when a node with a debug value is replaced with another
one. When this happens, TransferDbgValues is called, and the new value
is added to the end of the list.

The problem can be solved by stably sorting the list by IR order.

Reviewers: aprantl, Ka-Ka

Reviewed By: aprantl

Subscribers: MatzeB, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D38197

llvm-svn: 314114
2017-09-25 16:14:53 +00:00
Michael Zuckerman 4a97df01c4 [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF8 stride 4):
This patch expands the support of lowerInterleavedStore to 8x8i stride 4.

LLVM creates suboptimal shuffle code-gen for AVX2.
In overall, this patch is a specific fix for the pattern (Strid=4 VF=8) and we plan to include more patterns in the future.

The patch goal is to optimize the following sequence:
At the end of the computation, we have xmm2, xmm0, xmm12 and xmm3 holding
each 8 chars:

c0, c1, , c7
m0, m1, , m7
y0, y1, , y7
k0, k1, ., k7

And these need to be transposed/interleaved and stored like so:

c0 m0 y0 k0 c1 m1 y1 k1 c2 m2 y2 k2 c3 m3 y3 k3 ....

Reviewers
DavidKreitzer
Farhana
zvi
igorb
guyblank
RKSimon
Ayal

Differential Revision: https://reviews.llvm.org/D36058

Change-Id: I3cc5c2ca5d6318901c192a4428493b99ef424c32
llvm-svn: 314109
2017-09-25 14:50:38 +00:00
Nemanja Ivanovic f7bc9ce378 [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLT/SETGT
As mentioned in https://reviews.llvm.org/D33718, this simply adds another
pattern to the compare elimination sequence and is committed without a
differential review.

llvm-svn: 314106
2017-09-25 14:05:46 +00:00
Chad Rosier 71070856e6 [AArch64] Add basic support for Qualcomm's Saphira CPU.
llvm-svn: 314105
2017-09-25 14:05:00 +00:00
Alexey Bataev ccce7afee8 [SLP] Support for horizontal min/max reduction.
Summary:
SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions.
Patch fixes PR26956.

Reviewers: spatel, mkuper, hfinkel, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27846

llvm-svn: 314101
2017-09-25 13:34:59 +00:00
Craig Topper 47e14ead54 [X86] Make IFMA instructions during isel so we can fold broadcast loads.
This required changing the ISD opcode for these instructions to have the commutable operands first and the addend last. This way tablegen can autogenerate the additional patterns for us.

llvm-svn: 314083
2017-09-24 19:30:55 +00:00
Craig Topper 4ffd90c504 [X86] Add tests to show missed opportunities to fold broadcast loads into IFMA instructions when the load is on operand1 of the instrinsic.
We need to enable commuting during isel to catch this since the load folding tables can't handle broadcasts.

llvm-svn: 314082
2017-09-24 19:30:54 +00:00
Craig Topper 23f1830748 [X86] Add IFMA instructions to the load folding tables and make them commutable for the multiply operands.
llvm-svn: 314080
2017-09-24 17:28:14 +00:00
Simon Pilgrim e1335b1c75 [X86][SSE] Add more tests for shuffle combining with extracted vector elements (PR22415)
llvm-svn: 314077
2017-09-24 13:45:49 +00:00
Simon Pilgrim a705db9a9e [X86][SSE] Add support for extending bool vectors bitcasted from scalars
This patch acts as a reverse to combineBitcastvxi1 - bitcasting a scalar integer to a boolean vector and extending it 'in place' to the requested legal type.

Currently this doesn't handle AVX512 at all - but the current mask register approach is lacking for some cases.

Differential Revision: https://reviews.llvm.org/D35320

llvm-svn: 314076
2017-09-24 13:42:31 +00:00
Nemanja Ivanovic f894ce35d0 [PowerPC] Eliminate compares - add i64 sext/zext handling for SETLE/SETGE
As mentioned in https://reviews.llvm.org/D33718, this simply adds another
pattern to the compare elimination sequence and is committed without a
differential review.

llvm-svn: 314073
2017-09-24 05:48:11 +00:00
Craig Topper eb5c411218 [AVX-512] Add pattern for selecting masked version of v8i32/v8f32 compare instructions when VLX isn't available.
We use a v16i32/v16f32 compare instead and truncate the result. We already did this for the unmasked version, but were missing the version with 'and'.

llvm-svn: 314072
2017-09-24 05:24:52 +00:00
Davide Italiano 2122119150 [Verifier] Stop accepting broken DIGlobalVariable(s).
The code wasn't yelling at the user when there's a reference
from a DIGlobalVariableExpression. Thanks to Adrian for the
reduced testcase. Fixes PR34672.

llvm-svn: 314069
2017-09-24 01:06:35 +00:00
Simon Pilgrim 026727f861 [X86] Regenerate i64 to v2f32 bitcast test
llvm-svn: 314068
2017-09-23 19:18:29 +00:00
Sanjay Patel fa8bad8a0f [x86] reduce 64-bit mask constant to 32-bits by right shifting
This is a follow-up from D38181 (r314023). We have to put 64-bit
constants into a register using a separate instruction, so we
should try harder to avoid that.

From what I see, we're not likely to encounter this pattern in the 
DAG because the upstream setcc combines from this don't (usually?) 
produce this pattern. If we fix that, then this will become more 
relevant. Since the cost of handling this case is just loosening 
the predicate of the existing fold, we might as well do it now.

llvm-svn: 314064
2017-09-23 14:32:07 +00:00
Sanjay Patel 5ca9f7a0cb [x86] add an add+shift test for follow-up suggestion from D38181; NFC
llvm-svn: 314063
2017-09-23 14:24:07 +00:00
Nemanja Ivanovic 35db4f956a [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULT/SETUGT
As mentioned in https://reviews.llvm.org/D33718, this simply adds another
pattern to the compare elimination sequence and is committed without a
differential revision.

llvm-svn: 314062
2017-09-23 12:53:03 +00:00
Nemanja Ivanovic c4980799ab [PowerPC] Eliminate compares - add i32 sext/zext handling for SETULE/SETUGE
As mentioned in https://reviews.llvm.org/D33718, this simply adds another
pattern to the compare elimination sequence and is committed without a
differential revision.

llvm-svn: 314060
2017-09-23 09:50:12 +00:00
Nemanja Ivanovic 41c4a109d8 [PowerPC] Eliminate compares - add i32 sext/zext handling for SETLT/SETGT
As mentioned in https://reviews.llvm.org/D33718, this simply adds another
pattern to the compare elimination sequence and is committed without a
differential revision.

llvm-svn: 314055
2017-09-23 04:41:34 +00:00
Konstantin Belochapka 3477711ec7 [X86] [MC] fixed non optimal encoding of instruction memory operand (PR24038).
Fixed suboptimal encoding of instruction memory operand when assembler is used to select 32 bit fixup rather than 8 bit immediate for encoding memory offset value.
Differential Revision: https://reviews.llvm.org/D38117

llvm-svn: 314044
2017-09-22 23:37:48 +00:00
Craig Topper ea927baee2 [InstCombine] Teach foldICmpUsingKnownBits to simplify SLE/SGE/ULE/UGE to equality comparisons when the min/max ranges intersect in a single value.
This is the inverse of what we do for SGT/SLT/UGT/ULT.

llvm-svn: 314032
2017-09-22 21:47:22 +00:00
Craig Topper 73a998908f [InstCombine] Add test cases for known bits simplifications for comparisons that don't depend on constant RHS. NFC
This shows some missing simplifications for sge/sle/uge/ule relative to their non-equality counterparts.

llvm-svn: 314031
2017-09-22 21:47:21 +00:00
Craig Topper 615729b305 [InstCombine] Remove a FIXME from a test that was fixed in r314025.
llvm-svn: 314030
2017-09-22 21:47:20 +00:00
Sanjay Patel ac76201d4e [x86] remove over-specified platform from test config
llvm-svn: 314027
2017-09-22 21:07:13 +00:00
Craig Topper 3f364aa908 [InstCombine] Add constant splat handling to one of the ICMP_SLT/SGT cases in foldICmpUsingKnownBits.
llvm-svn: 314025
2017-09-22 19:54:15 +00:00
Sanjay Patel 3339954fa3 [x86] swap order of srl (and X, C1), C2 when it saves size
The (non-)obvious win comes from saving 3 bytes by using the 0x83 'and' opcode variant instead of 0x81. 
There are also better improvements based on known-bits that allow us to eliminate the mask entirely.

As noted, this could be extended. There are potentially other wins from always shifting first, but doing
that reveals a tangle of problems in other pattern matching. We do this transform generically in 
instcombine, but we often have icmp IR that doesn't match that pattern, so we must account for this
in the backend.

Differential Revision: https://reviews.llvm.org/D38181

llvm-svn: 314023
2017-09-22 19:37:21 +00:00
Rafael Espindola 0bd982b79f llvm-ar: align the first archive member consistently.
Before we were aligning the member after the symbol table to 4 but
other members to 8.

llvm-svn: 314010
2017-09-22 18:36:00 +00:00
Tim Shen cee7536188 [XRay] support conditional return on PPC.
Summary: Conditional returns were not taken into consideration at all. Implement them by turning them into jumps and normal returns. This means there is a slightly higher performance penalty for conditional returns, but this is the best we can do, and it still disturbs little of the rest.

Reviewers: dberris, echristo

Subscribers: sanjoy, nemanjai, hiraditya, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D38102

llvm-svn: 314005
2017-09-22 18:30:02 +00:00
Guozhi Wei bce228ca42 [TargetTransformInfo] Handle intrinsic call in getInstructionLatency()
Usually an intrinsic is a simple target instruction, it should have a small latency. A real function call has much larger latency. So handle the intrinsic call in function getInstructionLatency().

Differential Revision: https://reviews.llvm.org/D38104

llvm-svn: 314003
2017-09-22 18:25:53 +00:00
Rafael Espindola d5d77372d4 llvm-ar: Don't add an unnecessary alignment in gnu mode.
This is mostly for getting stricter testing in preparation for future
changes.

llvm-svn: 314000
2017-09-22 18:16:13 +00:00