Commit Graph

Heejin Ahn 4934f76b58 [WebAssembly] Add WebAssemblyLateEHPrepare pass
Summary:
Add WebAssemblyLateEHPrepare pass that does several small jobs for
exception handling. This runs before CFGSort, and is different from the
WasmEHPrepare pass that runs before ISel, even though the names are
similar.

Reviewers: dschuff, majnemer

Subscribers: sbc100, jgravelle-google, sunfish, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D46803

llvm-svn: 335438
2018-06-25 01:07:11 +00:00
Craig Topper 4331d6218d [X86] Remove the changes to combineScalarToVector made in r335037.
They appear to be untested other than the test case for pr37879.ll, and I believe we should be using SimplifyDemandedElts here to handle these cases.

llvm-svn: 335436
2018-06-25 00:21:53 +00:00
Sanjay Patel 962ee178fa [DAGCombiner] eliminate setcc bool math when input is low-bit of some value
This patch has the same motivating example as D48466:
define void @foo(i64 %x, i32 %c.0282.in, i32 %d.0280, i32* %ptr0, i32* %ptr1) {
    %c.0282 = and i32 %c.0282.in, 268435455
    %a16 = lshr i64 32508, %x
    %a17 = and i64 %a16, 1
    %tobool = icmp eq i64 %a17, 0
    %. = select i1 %tobool, i32 1, i32 2
    %.286 = select i1 %tobool, i32 27, i32 26
    %shr97 = lshr i32 %c.0282, %.
    %shl98 = shl i32 %c.0282.in, %.286
    %or99 = or i32 %shr97, %shl98
    %shr100 = lshr i32 %d.0280, %.
    %shl101 = shl i32 %d.0280, %.286
    %or102 = or i32 %shr100, %shl101
    store i32 %or99, i32* %ptr0
    store i32 %or102, i32* %ptr1
    ret void
}

...but I'm trying to kill the setcc bool math sooner rather than later.

By matching a larger pattern that includes both the low-bit mask and the trailing add/sub, 
we can create a universally good fold because we always eliminate the condition code 
intermediate value.

Here are Alive proofs for these (currently instcombine folds the 'add' variants, but 
misses the 'sub' patterns):
https://rise4fun.com/Alive/Gsyp

Name: sub of zext cmp mask
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %z = zext i1 %c to i32
  %r = sub i32 C1, %z
  =>
  %optional_cast = zext i8 %a to i32
  %r = add i32 %optional_cast, C1-1

Name: add of zext cmp mask
  %a = and i32 %x, 1
  %c = icmp eq i32 %a, 0
  %z = zext i1 %c to i8
  %r = add i8 %z, C1
  =>
  %optional_cast = trunc i32 %a to i8
  %r = sub i8 C1+1, %optional_cast

All of the tests look like improvements or neutral to me. But it is possible that x86 
test+set+bitop is better than what we now show here. I suspect we could do better by 
adding another fold for the 'sub' variants.

We start with select-of-constant in IR in the larger motivating test, so that's why I 
included tests with selects. Proofs for those variants:
https://rise4fun.com/Alive/Bx1

Name: true const is bigger
Pre: C2 == (C1 + 1)
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %r = select i1 %c, i64 C2, i64 C1
  =>
  %z = zext i8 %a to i64
  %r = sub i64 C2, %z

Name: false const is bigger
Pre: C2 == (C1 + 1)
  %a = and i8 %x, 1
  %c = icmp eq i8 %a, 0
  %r = select i1 %c, i64 C1, i64 C2
  =>
  %z = zext i8 %a to i64
  %r = add i64 C1, %z
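
As an illustrative sketch, here is the 'false const is bigger' form applied to the 
first select in the motivating example above (C1 = 1, C2 = 2); the cast becomes a 
trunc because %a17 is i64 while the select result is i32, and the name %a17.tr is 
made up:

  %a17 = and i64 %a16, 1
  %tobool = icmp eq i64 %a17, 0
  %. = select i1 %tobool, i32 1, i32 2
  =>
  %a17.tr = trunc i64 %a17 to i32
  %. = add i32 1, %a17.tr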

Differential Revision: https://reviews.llvm.org/D48466

llvm-svn: 335433
2018-06-24 14:37:30 +00:00
Jonas Devlieghere fb54074112 [llvm-mt] Use WithColor for printing errors.
Use the WithColor helper from support to print errors.

llvm-svn: 335416
2018-06-23 16:49:07 +00:00
Craig Topper d8d64a56b5 [X86] Make %eiz usage in 64-bit mode force a 0x67 address size prefix. Fix some test CHECK lines.
llvm-svn: 335414
2018-06-23 06:15:04 +00:00
Craig Topper 2545529034 [X86] Teach disassembler to use %eip instead of %rip when 0x67 prefix is used on a rip-relative address.
llvm-svn: 335413
2018-06-23 06:03:48 +00:00
Craig Topper 68d64e3859 [X86][AsmParser] Improve base/index register checks.
-Ensure EIP isn't used with an index register.
-Ensure EIP isn't used as an index register.
-Ensure base register isn't a vector register.
-Ensure eiz/riz usage matches the size of their base register.

llvm-svn: 335412
2018-06-23 05:53:00 +00:00
Stanislav Mekhanoshin d8c9374797 Fix invariant fdiv hoisting in LICM
FDiv is replaced with multiplication by reciprocal and invariant
reciprocal is hoisted out of the loop, while multiplication remains
even if invariant.

Switch the check from requiring all operands to be invariant to requiring
only an invariant denominator, which fixes the issue.
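
A minimal sketch of the case the new check targets (hypothetical IR, not from the
patch's tests): only the denominator %d is loop-invariant, so the reciprocal 1.0/%d
can be hoisted to the preheader and the division inside the loop becomes a multiply:

define void @scale(float* %a, i64 %n, float %d) {
entry:
    br label %loop

loop:
    %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
    %p = getelementptr inbounds float, float* %a, i64 %i
    %x = load float, float* %p
    %q = fdiv fast float %x, %d          ; %x varies per iteration, %d does not
    store float %q, float* %p
    %i.next = add i64 %i, 1
    %cmp = icmp ult i64 %i.next, %n
    br i1 %cmp, label %loop, label %exit

exit:
    ret void
}

With the reciprocal allowed by the fast-math flags, LICM can hoist
%recip = fdiv fast float 1.0, %d into the preheader and rewrite %q as
fmul fast float %x, %recip.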

Differential Revision: https://reviews.llvm.org/D48447

llvm-svn: 335411
2018-06-23 04:01:28 +00:00
Reid Kleckner f5890e4e43 [IR] Split Intrinsics.inc into enums and implementations
Implements PR34259

Intrinsics.h is a very popular header. Most LLVM TUs care about things
like dbg_value, but they don't care how they are implemented. After I
split these out, IntrinsicImpl.inc is 1.7 MB, so this saves each LLVM TU
from scanning 1.7 MB of source that gets pre-processed away.

It also means we can modify intrinsic properties without triggering a
full rebuild, but that's probably less of a win.

I think the next best thing to do would be to split out the target
intrinsics into their own header. Very, very few TUs care about
target-specific intrinsics. It's very hard to split up the target
independent intrinsics like llvm.expect, assume, and dbg.value, though.

llvm-svn: 335407
2018-06-23 02:02:38 +00:00
Fangrui Song 4ef42a83f9 [ELF] Change isSectionData to exclude SHF_EXECINSTR
Summary:
This affects what sections are displayed as "DATA" in llvm-objdump.
The other user llvm-size is unaffected.

Before, a "TEXT" section is also "DATA", which seems weird.
The sh_flags condition matches that of bfd's SEC_DATA but the sh_type
condition uses (== SHF_PROGBITS) instead of bfd's (!= SHT_NOBITS).
bfd's SEC_DATA is not appealing as so many sections will be shown as DATA.

Reviewers: jyknight, Bigcheese

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48472

llvm-svn: 335405
2018-06-23 00:15:33 +00:00
Reid Kleckner 330f65b3e8 [RuntimeDyld] Implement the ELF PIC large code model relocations
Prerequisite for https://reviews.llvm.org/D47211 which improves our ELF
large PIC codegen.

llvm-svn: 335402
2018-06-22 23:53:22 +00:00
Eli Friedman 203eaaf5ba [LoopReroll] Rewrite induction variable rewriting.
This gets rid of a bunch of weird special cases; instead, just use SCEV
rewriting for everything.  In addition to being simpler, this fixes a
bug where we would use the wrong stride in certain edge cases.

The one bit I'm not quite sure about is the trip count handling,
specifically the FIXME about overflow.  In general, I think we need to
widen the exit condition, but that's probably not profitable if the new
type isn't legal, so we probably need a check somewhere.  That said, I
don't think I'm making the existing problem any worse.

As a followup to this, a bunch of IV-related code in root-finding could
be cleaned up; with SCEV-based rewriting, there isn't any reason to
assume a loop will have exactly one or two PHI nodes.
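
For context, a hypothetical sketch of the kind of input the pass handles: a loop
whose body was manually unrolled by two, which LoopReroll rewrites back into a
single loop of stride one; this commit changes how the induction variable of such
loops is rewritten using SCEV:

define void @zero_pairs(i32* %p, i32 %n) {
entry:
    br label %loop

loop:
    %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
    %g0 = getelementptr inbounds i32, i32* %p, i32 %i
    store i32 0, i32* %g0
    %i.1 = add i32 %i, 1
    %g1 = getelementptr inbounds i32, i32* %p, i32 %i.1
    store i32 0, i32* %g1
    %i.next = add i32 %i, 2
    %cmp = icmp slt i32 %i.next, %n
    br i1 %cmp, label %loop, label %exit

exit:
    ret void
}

After rerolling, the loop stores one element per iteration with the induction
variable stepping by one, and the exit condition has to be rewritten for the new
trip count, which is where the overflow concern above comes in.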

Differential Revision: https://reviews.llvm.org/D45191

llvm-svn: 335400
2018-06-22 22:58:55 +00:00
Craig Topper 10e2f73793 [X86][AsmParser] Keep track of whether an explicit scale was specified while parsing an address in Intel syntax. Use it for improved error checking.
This allows us to check these:
-16-bit addressing doesn't support a scale, so we should error if we find one there.
-Multiplying ESP/RSP by a scale, even if the scale is 1, should be an error because ESP/RSP can't be an index.

llvm-svn: 335398
2018-06-22 22:28:39 +00:00
Sanjay Patel 80b85a46db [x86] add more tests for bit hacking opportunities with setcc; NFC
These are cases where the input and output are the same size, which were missed in rL335391.

llvm-svn: 335396
2018-06-22 22:07:26 +00:00
Sanjay Patel 0fe8ea568b [PowerPC] add more tests for bit hacking opportunities with setcc; NFC
These are cases where the input and output are the same size, which were missed in rL335390.

llvm-svn: 335395
2018-06-22 22:06:33 +00:00
Craig Topper 1d707539e4 [X86][AsmParser] In Intel syntax make sure we support ESP/RSP being the second register in memory expressions like [EAX+ESP].
By default, the second register gets assigned to the index register slot. But ESP can't be an index register so we need to swap it with the other register.

There's still a slight bug in that we allow [EAX+ESP*1]. The presence of the multiply, even though it's by 1, should force ESP into the index register slot and trigger an error, but it doesn't currently.

llvm-svn: 335394
2018-06-22 21:57:24 +00:00
Sanjay Patel 705cde3ac8 [x86] add tests for bit hacking opportunities with setcc; NFC
We likely gave up on folding some select-of-constants patterns in 
IR with rL331486, and we need to recover those in the DAG.

The tests without select are based on our current DAGCombiner 
optimizations for select-of-constants.

llvm-svn: 335391
2018-06-22 21:16:54 +00:00
Sanjay Patel 6e505e4388 [PowerPC] add tests for bit hacking opportunities with setcc; NFC
We likely gave up on folding some select-of-constants patterns in 
IR with rL331486, and we need to recover those in the DAG.

The tests without select are based on our current DAGCombiner 
optimizations for select-of-constants.

llvm-svn: 335390
2018-06-22 21:16:29 +00:00
Craig Topper a55cc4a2e9 [X86] Add test cases showing missed select simplification for MCU when icmp is in a slightly different form.
These test cases show that the "(select (and (x , 0x1) == 0), y, (z ^ y) ) -> (-(and (x , 0x1)) & z ) ^ y" fold doesn't work if the select condition is changed to (and (x, 0x1) != 1).
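
A sketch of that fold written as IR (hypothetical value names), with the two cases
noted in comments:

  %a = and i32 %x, 1
  %c = icmp eq i32 %a, 0
  %xor = xor i32 %z, %y
  %r = select i1 %c, i32 %y, i32 %xor
  =>
  %a = and i32 %x, 1
  %neg = sub i32 0, %a            ; 0 when the low bit is clear, -1 when it is set
  %m = and i32 %neg, %z           ; 0 or %z
  %r = xor i32 %m, %y             ; %y or %z ^ %y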

llvm-svn: 335389
2018-06-22 21:09:31 +00:00
Aditya Nandakumar e2a7f31064 [GISel]: Add G_ADDRSPACE_CAST Opcode
Added IRTranslator support for addrspacecast.
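
A minimal IR example of the instruction that now gets translated (a sketch; the
address spaces are arbitrary). The IRTranslator lowers the addrspacecast below to
the new G_ADDRSPACE_CAST opcode:

define i8 addrspace(1)* @cast(i8* %p) {
    %q = addrspacecast i8* %p to i8 addrspace(1)*
    ret i8 addrspace(1)* %q
}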

https://reviews.llvm.org/D48469

reviewed by: volkan

llvm-svn: 335388
2018-06-22 20:58:51 +00:00
Craig Topper 9bc2c059c3 [X86] Don't accept (%si,%bp) 16-bit address expressions.
The second register is the index register and should only be %si or %di if used with a base register. And in that case the base register should be %bp or %bx.

This makes us compatible with gas.

We do still need to support both orders with Intel syntax, which uses [bp+si] and [si+bp].

llvm-svn: 335384
2018-06-22 20:20:38 +00:00
Craig Topper c26c62e0e5 [X86][AsmParser] Allow (%bp,%si) and (%bp,%di) to be encoded without using a zero displacement.
(%bp) can't be encoded without a displacement, because that encoding is instead used for a displacement alone, so a 1-byte displacement of 0 must be used. But if there is an index register, we can encode without a displacement.

llvm-svn: 335379
2018-06-22 19:42:21 +00:00
Simon Pilgrim 938dbe664b [X86][SSE] Add sdiv by (nonuniform) minus one tests (PR37119)
Test cases from D45806

llvm-svn: 335376
2018-06-22 18:31:57 +00:00
Craig Topper cd18bb523c [X86][AsmParser] Check for invalid 16-bit base register in Intel syntax.
llvm-svn: 335373
2018-06-22 17:50:40 +00:00
Craig Topper 22d1db122a [X86] Don't allow ESP/RSP to be used as an index register in assembly.
Fixes PR37892

llvm-svn: 335370
2018-06-22 17:15:58 +00:00
Easwaran Raman f997233890 [X86] Add a test to show missed opportunity to generate vfnmadd
llvm-svn: 335367
2018-06-22 17:01:13 +00:00
Simon Pilgrim 9d3ef8ee2b [SLPVectorizer] Support alternate opcodes in tryToVectorizeList
Enable tryToVectorizeList to support InstructionsState alternate opcode patterns at a root (build vector etc.) as well as further down the vectorization tree.

NOTE: This patch reduces some of the debug reporting if there are opcode mismatches - I can try to add it back if it proves a problem. But it could get rather messy trying to provide equivalent verbose debug strings via getSameOpcode etc.

Differential Revision: https://reviews.llvm.org/D48488

llvm-svn: 335364
2018-06-22 16:37:34 +00:00
Paul Robinson 9b9e25d34c Fix test again, try to keep all targets happy
llvm-svn: 335356
2018-06-22 15:19:45 +00:00
Paul Robinson 6c43488030 Fix test, nop is not always 1 byte
llvm-svn: 335353
2018-06-22 15:07:26 +00:00
Paul Robinson 11539b0969 [DWARFv5] Allow ".loc 0" to refer to the root file.
DWARF v5 explicitly represents file #0 in the line table.  Prior
versions did not, so ".loc 0" is still an error in those cases.

Differential Revision: https://reviews.llvm.org/D48452

llvm-svn: 335350
2018-06-22 14:16:11 +00:00
Simon Pilgrim 1e564504bb [SLPVectorizer] Relax alternate opcodes to accept any BinaryOperator pair
SLP currently only accepts (F)Add/(F)Sub as alternate counterpart ops that can be merged into an alternate shuffle.

This patch relaxes this to accept any pair of BinaryOperator opcodes instead, assuming the target's cost model accepts the vectorization+shuffle.
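
As an illustrative sketch (hypothetical IR, not one of the committed tests), an
add/mul build vector of the shape this relaxation now lets SLP consider, provided
the cost model finds a vector add + vector mul + shuffle cheaper than the scalars:

define <4 x i32> @alt_add_mul(<4 x i32> %a, <4 x i32> %b) {
    %a0 = extractelement <4 x i32> %a, i32 0
    %a1 = extractelement <4 x i32> %a, i32 1
    %a2 = extractelement <4 x i32> %a, i32 2
    %a3 = extractelement <4 x i32> %a, i32 3
    %b0 = extractelement <4 x i32> %b, i32 0
    %b1 = extractelement <4 x i32> %b, i32 1
    %b2 = extractelement <4 x i32> %b, i32 2
    %b3 = extractelement <4 x i32> %b, i32 3
    %c0 = add i32 %a0, %b0          ; even lanes: add
    %c1 = mul i32 %a1, %b1          ; odd lanes: mul
    %c2 = add i32 %a2, %b2
    %c3 = mul i32 %a3, %b3
    %r0 = insertelement <4 x i32> undef, i32 %c0, i32 0
    %r1 = insertelement <4 x i32> %r0, i32 %c1, i32 1
    %r2 = insertelement <4 x i32> %r1, i32 %c2, i32 2
    %r3 = insertelement <4 x i32> %r2, i32 %c3, i32 3
    ret <4 x i32> %r3
}

Before this change, only an add/sub (or fadd/fsub) pairing would have been
recognized as an alternate pattern here.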

Differential Revision: https://reviews.llvm.org/D48477

llvm-svn: 335349
2018-06-22 14:04:06 +00:00
Simon Pilgrim 229a781214 [SLPVectorizer][X86] Add alternate opcode tests for simple build vector cases
llvm-svn: 335348
2018-06-22 13:53:58 +00:00
Sanjay Patel c2b37f73f5 [InstCombine] add shuffle+binops test from PR37806; NFC
This one shows another pattern that we'll need to match
in some cases, but the current ordering of folds allows
us to match this as 2 binops before simplification takes
place.

llvm-svn: 335347
2018-06-22 13:44:42 +00:00
Sanjay Patel cfd9da038c [InstCombine] add tests for shuffle-with-different-binops; NFC
llvm-svn: 335345
2018-06-22 13:19:25 +00:00
Simon Pilgrim 234a6f6842 [X86] Regenerate tests to include fma comments
Noticed in the review of D48467

llvm-svn: 335342
2018-06-22 12:41:48 +00:00
George Rimar dcf59c5480 Recommit r335332 "[MC] - Add .stack_size sections into groups and link them with .text"
With compilation fix.

Original commit message:

D39788 added a '.stack-size' section containing metadata on function stack sizes
to output ELF files behind the new -stack-size-section flag.

This change does the following two things on top:

1) Imagine the case where the -ffunction-sections flag is given and there are text sections in COMDATs. 
    The patch adds the '.stack-size' section into the corresponding COMDAT group, so that the linker will be able to
    eliminate it quickly when resolving the COMDATs.
2) The patch sets the SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text.
   With that, the linker will be able to apply -gc-sections to dead stack-size sections.

Differential revision: https://reviews.llvm.org/D46874

llvm-svn: 335336
2018-06-22 10:53:47 +00:00
George Rimar 6d448da1be Revert r335332 "[MC] - Add .stack_size sections into groups and link them with .text"
It broke bots.

http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/12891
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/9443
http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-buildserver/builds/25551

llvm-svn: 335333
2018-06-22 10:27:33 +00:00
George Rimar e14485a0c6 [MC] - Add .stack_size sections into groups and link them with .text
D39788 added a '.stack-size' section containing metadata on function stack sizes
to output ELF files behind the new -stack-size-section flag.

This change does the following two things on top:

1) Imagine the case where the -ffunction-sections flag is given and there are text sections in COMDATs. 
    The patch adds the '.stack-size' section into the corresponding COMDAT group, so that the linker will be able to
    eliminate it quickly when resolving the COMDATs.
2) The patch sets the SHF_LINK_ORDER flag and links '.stack-size' with the corresponding .text.
   With that, the linker will be able to apply -gc-sections to dead stack-size sections.

Differential revision: https://reviews.llvm.org/D46874

llvm-svn: 335332
2018-06-22 10:10:53 +00:00
Sjoerd Meijer 1043dffbd3 Recommit of r335326, with the test fixed that I missed.
llvm-svn: 335331
2018-06-22 10:03:03 +00:00
Simon Pilgrim 9c8f9374b5 [CostModel][AArch64] Add some initial costs for SK_Select and SK_PermuteSingleSrc
AArch64 was only setting costs for SK_Transpose, which meant that many of the simpler shuffles (e.g. SK_Select and SK_PermuteSingleSrc for larger vector elements) were being severely overestimated by the default shuffle expansion.

This patch adds costs to help improve SLP performance and avoid a regression in reductions introduced by D48174.

I'm not very knowledgeable about AArch64 shuffle lowering so I've kept the extra costs to a minimum - someone who knows this code can add extra costs which should improve vectorization a lot more.

Differential Revision: https://reviews.llvm.org/D48172

llvm-svn: 335329
2018-06-22 09:45:31 +00:00
Sjoerd Meijer 7ee5b090de Reverting r335326 while I look at the test failure
llvm-svn: 335328
2018-06-22 09:17:08 +00:00
Eugene Leviant 6d711ca168 Revert r335324 due to a buildbot failure
llvm-svn: 335327
2018-06-22 08:57:01 +00:00
Sjoerd Meijer 8d2f1565b7 [ARM] ARMv6m and v8m.baseline strict align
This sets the target feature FeatureStrictAlign for Armv6-m and Armv8-m.baseline,
because they have no support for unaligned accesses.
It looks like we always pass target feature "+strict-align" from
Clang, so this is not a user facing problem, but querying the subtarget
(in e.g. llc) for unaligned access support is incorrect.

Differential Revision: https://reviews.llvm.org/D48437

llvm-svn: 335326
2018-06-22 08:48:13 +00:00
Matt Arsenault 3f8e7a3dbc AMDGPU: Add patterns for i32/i64 local atomic load/store
Not sure why the 32/64 split is needed in the atomic_load/atomic_store
hierarchies. The regular PatFrags do this, but we don't do it in the
existing handling for global.

llvm-svn: 335325
2018-06-22 08:39:52 +00:00
Eugene Leviant ea19c9473c [Evaluator] Improve evaluation of call instruction
Differential revision: https://reviews.llvm.org/D46584

llvm-svn: 335324
2018-06-22 08:29:36 +00:00
Mikhail Dvoretckii 0963562083 [X86] Changing the check for valid inputs in combineScalarToVector
Changing the logic of scalar mask folding to check for valid input types rather
than against invalid ones, making it more robust and fixing PR37879.

Differential Revision: https://reviews.llvm.org/D48366

llvm-svn: 335323
2018-06-22 08:28:05 +00:00
Chandler Carruth aa5f4d2e23 Revert r335306 (and r335314) - the Call Graph Profile pass.
This is the first pass in the main pipeline to use the legacy PM's
ability to run function analyses "on demand". Unfortunately, it turns
out there are bugs in that somewhat-hacky approach. At the very least,
it leaks memory and doesn't support -debug-pass=Structure. Unclear if
there are larger issues or not, but this should get the sanitizer bots
back to green by fixing the memory leaks.

llvm-svn: 335320
2018-06-22 05:33:57 +00:00
Tom Stellard 26fac0f8e1 AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D48196

llvm-svn: 335318
2018-06-22 02:54:57 +00:00
Chandler Carruth fe70b29cf7 [LegacyPM] Fix PR37888 by teaching the legacy loop pass manager how to
clear out deleted loops from the current queue beyond just the current
loop.

This is important because SimpleLoopUnswitch will now enqueue the same
loop to be re-processed. When it does this with the legacy PM, we don't
have a way of canceling the rest of the pipeline and so we can end up
deleting the loop before we reprocess it. =/

This change also makes it easy to support deleting other loops in the
queue to process, although I don't have any use cases for that.

Differential Revision: https://reviews.llvm.org/D48470

llvm-svn: 335317
2018-06-22 02:43:41 +00:00
Tom Stellard 9a6535718e AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48195

llvm-svn: 335316
2018-06-22 02:34:29 +00:00